Procedural Generation#

Procedural generation is used in a variety of ways within games, most typically for terrain, as found in games like Minecraft and Valheim. Before this project, I was playing Space Engineers, which features premade voxel planets that can be destroyed. So how hard could it be to procedurally generate entire planets?

I looked into how procedural generation has been done before, and there’s a fantastic video by Henrik Kniberg that goes over how generation works in Minecraft ( link ). I used a lot of the techniques from this video in my own generation, with much of the effort spent figuring out how to apply them to spherical planets.

This implementation uses marching cubes for generating meshes, and I found this video by Sebastian Lague ( link ) incredibly helpful in learning how to do this, and some of the implementation of my version is based on it.

I will state immediately that this is not the most optimal implementation, and there are most likely plenty of improvements that could be made. This project was more about figuring out how to implement it well than about spending massive amounts of effort optimising it. That’s not to say there is no optimisation at all, otherwise it would be impossible to play and showcase.

Why DOTS?#

I’ve been spending a lot of time learning DOTS. While I have done procedural generation previously, I did it with GameObjects and MonoBehaviours, and although this project could easily have been done the same way, I wanted to learn how DOTS could be used in a project like this.

Generation#

For my noise generation, I use multiple noise maps called ‘Continentalness’, ‘Erosion’, ‘Peaks and Valleys’, ‘Caves’, and ‘Tunnels’. The first three are not used directly as the height of the terrain; similar to the Minecraft generation, each one is used to sample a spline that maps the noise value to a configured height.

An example of such a spline can be seen in Henrik Kniberg’s video; I never implemented a nice view for my splines in the inspector, but they are functionally the same.

Due to this, the code for sampling each of these values to place within a 3D texture is relatively simple.

#pragma kernel CSMain
#include "/Includes/Noise.compute"

struct NoiseSettings {
	uint seed;
	uint octaves;
	float persistence;
	float lacunarity;
	float scale;
};

// Output
RWTexture3D<float> result;

// Input
StructuredBuffer<NoiseSettings> noiseMaps;	// 0 continentalness, 1 erosion, 2 peaks and valleys, 3 caves, 4 tunnels
StructuredBuffer<float2> continentalnessSpline;
StructuredBuffer<float2> erosionSpline;
StructuredBuffer<float2> pvSpline;
StructuredBuffer<int> edits;

float3 chunkLocalPosition;
float3 chunkLocalCenterPosition;
float planetSurfaceRadius;
float3 planetCenter;

// Linear Spline
float eval(StructuredBuffer<float2> spline, uint count, float x) {
	int i = getSegment(spline, count, x);
	float2 p0 = spline[i];
	float2 p1 = spline[i + 1];

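	// Normalise x into 0..1 across the segment; x - p0.x alone is only correct for unit-width segments
	float t = saturate((x - p0.x) / max(p1.x - p0.x, 1e-6));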

	return lerp(p0.y, p1.y, t);
}

float sample(NoiseSettings settings, float3 coord) {
	float total = 0.0;
	float frequency = 1.0;
	float amplitude = 1.0;
	float maxValue = 0.0;
    
	float3 scaledPoint = coord * settings.scale;
	for (uint i = 0; i < settings.octaves; ++i) {
		total += simplex3d(settings.seed, scaledPoint * frequency) * amplitude;
		maxValue += amplitude;
		amplitude *= settings.persistence;
		frequency *= settings.lacunarity;
	}

	return total / maxValue;
}


[numthreads(8,8,8)]
void CSMain (uint3 id : SV_DispatchThreadID) {
	float3 p = chunkLocalPosition + ((float3)id * voxelSize);
	float centerDist = length(planetCenter - p);

	float height = (planetSurfaceRadius + GetHeight(p));
	float v = clamp(centerDist - height, -1.0, 1.0);

	float caves = sample(noiseMaps[3], p);
	float caveMask = step(caves, 0.45);

	float tunnels = sample(noiseMaps[4], p);
	float tunnelsMask = step(tunnels, 0.7) * (1.0 - step(0.7001, tunnels));

	result[id] = lerp(
		-1.0,
		v * caveMask * tunnelsMask,
		centerDist / (planetSurfaceRadius * 0.35)	// planetSurfaceRadius * 0.35 is the planet's 'core' radius, inside which no caves/tunnels appear
	);
}
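As a CPU-side reference for the linear spline lookup that `eval` performs in the shader above, here is a minimal sketch in plain C# (no Unity types). `getSegment` is not shown in this post, so the `GetSegment` below is only an assumption about what it does, and `LinearSpline` is a hypothetical name used for illustration.

```csharp
using System;

// Standalone sketch. Control points are (x, y) pairs sorted by ascending x,
// where x is the noise value and y is the height it maps to.
public static class LinearSpline {
	// Index i of the segment [points[i], points[i + 1]] that contains x,
	// clamped so that i + 1 is always a valid index.
	public static int GetSegment((float x, float y)[] points, float x) {
		int last = points.Length - 2;
		for (int i = 0; i < last; i++) {
			if (x < points[i + 1].x)
				return i;
		}
		return last;
	}

	public static float Eval((float x, float y)[] points, float x) {
		int i = GetSegment(points, x);
		var p0 = points[i];
		var p1 = points[i + 1];

		// Normalise x into 0..1 across the segment before interpolating.
		float width = MathF.Max(p1.x - p0.x, 1e-6f);
		float t = Math.Clamp((x - p0.x) / width, 0f, 1f);
		return p0.y + (p1.y - p0.y) * t;
	}
}
```

With the control points (-1, 0), (0, 10) and (1, 50), evaluating at 0.5 lands halfway along the second segment and gives a height of 30. Values outside the first or last control point are clamped to the end heights.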

I don’t pass five noise textures to the compute shader. Instead, I pass the settings needed to generate them, so I don’t need a separate compute shader that generates each texture and then hands it over, which would waste performance. Caching the textures would most likely improve performance if a chunk needed to be regenerated, but it would mean managing every one of them and making sure I don’t use up too much memory holding onto hundreds of textures.

I include a separate compute shader that contains the functions for actually generating simplex noise, as I was more interested in generating the terrain than in writing my own, likely less performant, simplex implementation. I used Stegu’s implementation of simplex noise ( link ), ported it to HLSL, and added seeding to the permute function.
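The octave loop in the shader's `sample` function is independent of the noise function underneath it. The sketch below restates it in plain C# so it can be tested on the CPU. The `Noise3D` delegate is a hypothetical stand-in for `simplex3d`, and `FractalNoise` is an illustrative name, not part of the project.

```csharp
using System;

public static class FractalNoise {
	// Hypothetical stand-in for the shader's simplex3d(seed, p); any function
	// returning values in -1..1 works for the purposes of this sketch.
	public delegate float Noise3D(uint seed, float x, float y, float z);

	public static float Sample(Noise3D noise, uint seed, int octaves, float persistence,
			float lacunarity, float scale, float x, float y, float z) {
		float total = 0f, maxValue = 0f;
		float frequency = 1f, amplitude = 1f;

		for (int i = 0; i < octaves; i++) {
			float f = scale * frequency;
			total += noise(seed, x * f, y * f, z * f) * amplitude;
			maxValue += amplitude;      // largest magnitude this octave could add
			amplitude *= persistence;   // each octave contributes less...
			frequency *= lacunarity;    // ...at a finer level of detail
		}

		// Dividing by the summed amplitudes keeps the result in the noise's own range.
		return maxValue > 0f ? total / maxValue : 0f;
	}
}
```

Because of the final division, adding octaves changes the detail of the result but not its overall range, which is what lets the same spline configuration be reused when the octave count changes.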

As shown in the images above, it generates a variety of terrain as well as caves and tunnels under the surface. One of the issues with the implementation is that it generates terrain underneath the surface that would never be visible unless the player digs downwards. This wastes performance, but I never figured out a solid solution to it.

Additionally, the terrain is untextured. Texturing could be done via a shader that samples textures and applies them to the surface, using the normals of the terrain to distinguish between a rock face and a flat surface that should receive a grass texture. This would be the easiest implementation, but it comes with drawbacks: when the player terraforms the planet and places a specific material (dirt, stone, etc.), the shader would have to know what was placed and apply the correct texture.

While I never implemented this, I would likely have the noise generation return a texture of integers and use each one as the ID of the voxel’s material (air, dirt, stone, etc.). This would require reworking a large portion of the code, though.
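To make the normal-based idea a little more concrete, here is a rough sketch of how a surface point could be classified by slope on a spherical planet. None of this exists in the project: the `VoxelMaterial` ids, the `FromSlope` helper and the 35 degree threshold are all assumptions made for illustration, written in plain C# with `System.Numerics` rather than Unity types.

```csharp
using System;
using System.Numerics;

public enum VoxelMaterial : byte { Air = 0, Grass = 1, Rock = 2 }  // hypothetical ids

public static class SurfaceMaterial {
	// On a spherical planet, "up" is the direction from the planet centre
	// to the surface point rather than a fixed world axis.
	public static VoxelMaterial FromSlope(Vector3 position, Vector3 normal, Vector3 planetCenter,
			float maxGrassSlopeDegrees = 35f) {
		Vector3 up = Vector3.Normalize(position - planetCenter);
		float cosSlope = Vector3.Dot(Vector3.Normalize(normal), up);
		float cosLimit = MathF.Cos(maxGrassSlopeDegrees * MathF.PI / 180f);

		// A normal close to "up" means flat ground; anything steeper is a rock face.
		return cosSlope >= cosLimit ? VoxelMaterial.Grass : VoxelMaterial.Rock;
	}
}
```

A shader version would do the same dot product per fragment. Storing the resulting id per voxel instead is what would let player-placed materials override the slope rule.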

Marching Cubes#

As stated earlier, my marching cubes implementation is based on the one made by Sebastian Lague. I would recommend watching his videos for a better explanation of how it works, which also keeps this post a fair bit shorter.

That being said, the implementation has been modified to work with this project and still requires some additional work, such as fixing the normals along the edges of the chunks.

One of the changes is the use of an atomic counter, which doesn’t actually affect the generation but is to stop the game stalling when a mesh is being generated.

#pragma kernel CSMain
#include "/Includes/MarchTables.compute"

struct Vertex {
	float3 position;
	float3 normal;
	int2 id;
};

struct Triangle {
	Vertex c;
	Vertex b;
	Vertex a;
};

// Inputs
Texture3D<float> values;
uint textureSize;
float isoLevel;
float voxelSize;
float3 chunkLocalPosition;

// Outputs
AppendStructuredBuffer<Triangle> triangles;
RWStructuredBuffer<uint> counter;

float sample(int3 coord) {
	// Clamp to the last valid texel (textureSize - 1), not textureSize
	coord = clamp(coord, 0, (int)textureSize - 1);
	return values[coord];
}

float3 coordToWorld(int3 coord) {
	return chunkLocalPosition + ((float3)coord * voxelSize);
}

float3 calculateNormal(int3 coord) {
	return normalize(float3(
		sample(coord + int3(1,0,0)) - sample(coord - int3(1,0,0)),
		sample(coord + int3(0,1,0)) - sample(coord - int3(0,1,0)),
		sample(coord + int3(0,0,1)) - sample(coord - int3(0,0,1))
	) + 1e-6);
}

int indexFromCoord(int3 coord) {
	return coord.z * textureSize * textureSize + coord.y * textureSize + coord.x;
}

Vertex createVertex(int3 coordA, int3 coordB) {
	float3 posA = coordToWorld(coordA);
	float3 posB = coordToWorld(coordB);
	float valueA = sample(coordA);
	float valueB = sample(coordB);

	// Interpolate corner points based on inputted value
	float t = (isoLevel - valueA) / (valueB - valueA);
	float3 position = posA + t * (posB - posA);

	float3 normalA = calculateNormal(coordA);
	float3 normalB = calculateNormal(coordB);
	float3 normal = normalize(normalA + t * (normalB - normalA));

	int indexA = indexFromCoord(coordA);
	int indexB = indexFromCoord(coordB);

	Vertex vertex;
	vertex.position = position;
	vertex.normal = normal;
	vertex.id = int2(min(indexA, indexB), max(indexA, indexB));

	return vertex;
}

[numthreads(8,8,8)]
void CSMain (uint3 id : SV_DispatchThreadID) {
	uint numVoxelsPerAxis = textureSize - 1;
	if (id.x >= numVoxelsPerAxis || id.y >= numVoxelsPerAxis || id.z >= numVoxelsPerAxis) {
		InterlockedAdd(counter[0], 1);
		return;
	}

	int3 cornerCoords[] = {
		id + int3(0, 0, 0),
		id + int3(1, 0, 0),
		id + int3(1, 0, 1),
		id + int3(0, 0, 1),
		id + int3(0, 1, 0),
		id + int3(1, 1, 0),
		id + int3(1, 1, 1),
		id + int3(0, 1, 1)
	};

	int cubeConfiguration = 0;
	for (int i = 0; i < 8; i++) {
		// Each corner is a 0 or 1 with 0 being air and 1 being blocked
		if (sample(cornerCoords[i]) < isoLevel)
			cubeConfiguration |= (1 << i);
	}

	int edgeIndices[] = triangulation[cubeConfiguration];

	for (i = 0; i < 16; i += 3) {
		// If edge index is -1 configuration is complete
		if (edgeIndices[i] == -1)
			break;

		// Get indices of the two corner points that define the edge
		int edgeIndexA = edgeIndices[i];
		int a0 = cornerIndexAFromEdge[edgeIndexA];
		int a1 = cornerIndexBFromEdge[edgeIndexA];

		int edgeIndexB = edgeIndices[i + 1];
		int b0 = cornerIndexAFromEdge[edgeIndexB];
		int b1 = cornerIndexBFromEdge[edgeIndexB];

		int edgeIndexC = edgeIndices[i + 2];
		int c0 = cornerIndexAFromEdge[edgeIndexC];
		int c1 = cornerIndexBFromEdge[edgeIndexC];

		Vertex vertexA = createVertex(cornerCoords[a0], cornerCoords[a1]);
		Vertex vertexB = createVertex(cornerCoords[b0], cornerCoords[b1]);
		Vertex vertexC = createVertex(cornerCoords[c0], cornerCoords[c1]);

		Triangle tri;
		tri.a = vertexC;
		tri.b = vertexB;
		tri.c = vertexA;
		triangles.Append(tri);
	}

	InterlockedAdd(counter[0], 1);
}
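The two core steps of the kernel above, building the cube configuration and placing a vertex along an edge, can be checked on the CPU with a small sketch like the one below. It is plain C# using `System.Numerics`, not the project's code, and the flat-edge fallback in `InterpolateEdge` is my own addition rather than something the shader does.

```csharp
using System;
using System.Numerics;

public static class MarchingCubesSketch {
	// Bit i of the result is set when corner i is below the iso level,
	// giving one of 256 configurations to look up in the triangulation table.
	public static int CubeConfiguration(float[] cornerValues, float isoLevel) {
		int configuration = 0;
		for (int i = 0; i < 8; i++) {
			if (cornerValues[i] < isoLevel)
				configuration |= 1 << i;
		}
		return configuration;
	}

	// Point along the edge A-B where the density crosses the iso level.
	public static Vector3 InterpolateEdge(Vector3 posA, float valueA, Vector3 posB, float valueB, float isoLevel) {
		float delta = valueB - valueA;
		// Guard against a flat edge; the midpoint is an arbitrary fallback choice.
		float t = MathF.Abs(delta) < 1e-6f ? 0.5f : (isoLevel - valueA) / delta;
		return posA + (posB - posA) * t;
	}
}
```

For example, a cube where only corner 0 is below the iso level produces configuration 1, and an edge running from a value of -1 to a value of 1 with an iso level of 0 places the vertex at its midpoint.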

The Manager#

To actually call these compute shaders and generate the meshes, I have to go through a MonoBehaviour called the Compute Manager. Unfortunately, as of right now, I’m not aware of any way to dispatch compute shaders from an ISystem, which would make this a pure DOTS project, so the manager works as the middleman between the DOTS chunk data and the mesh generation.

The manager uses an update queue that holds the data needed to generate each chunk’s mesh. Tasks are processed one at a time, ordered by distance from the player, so the closer, more important chunks are processed first. I also only work out which LOD a chunk should use just before it is generated: the player could be moving, so the LOD that was requested initially may no longer be the correct one.

private struct UpdateMeshTask {
	public Entity chunkEntity;
	public Chunk chunk;
	public GridConfig config;
	public float3 location;
	public bool forceUpdate;
}

private async Awaitable OnEnable() {
	playerTransform = Camera.main.transform;
	cancelChunkProcessing?.Dispose();
	cancelChunkProcessing = new CancellationTokenSource();

	while (true) {
		while (updateQueue.Count == 0) {
			await Awaitable.NextFrameAsync(cancelChunkProcessing.Token);
			if (cancelChunkProcessing.IsCancellationRequested)
				return;
		}

		UpdateMeshTask task = GetNextChunkTask();

		EntityManager entityManager = World.DefaultGameObjectInjectionWorld.EntityManager;
		if (!entityManager.Exists(task.chunkEntity))
			continue;

		// As this may run seconds later and the player may have moved, it is possible an update is no longer needed
		int lod = Helper.GetLod(task.config, math.distance(Helper.GetChunkLocalPosition(task.chunk), (float3)playerTransform.position));
		if (task.chunk.lodLevel == lod && !task.forceUpdate) {
			entityManager.RemoveComponent<ChunkUpdateInProgressTag>(task.chunkEntity);
			continue;
		}

		if (lod == 3) {
			await UpdateChunk(task, new Mesh(), lod);
			continue;
		}

		Mesh mesh = await GenerateMeshAsync(task.config, task.chunk, lod);
		await UpdateChunk(task, mesh, lod);
	}
}
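`GetNextChunkTask()` and `Helper.GetLod()` are not shown in this post. As a rough idea of what "closest chunk first" and "LOD from distance" could look like, here is a standalone sketch in plain C#. The names, signatures and threshold array are all hypothetical and only stand in for the real helpers.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class ChunkQueueSketch {
	// Removes and returns the queued position closest to the player.
	// A linear scan keeps the sketch short; it re-evaluates distances on every
	// call, so the order follows the player as they move.
	public static Vector3 TakeNearest(List<Vector3> queue, Vector3 playerPosition) {
		if (queue.Count == 0)
			throw new InvalidOperationException("Queue is empty.");

		int best = 0;
		float bestDistSq = Vector3.DistanceSquared(queue[0], playerPosition);
		for (int i = 1; i < queue.Count; i++) {
			float distSq = Vector3.DistanceSquared(queue[i], playerPosition);
			if (distSq < bestDistSq) {
				best = i;
				bestDistSq = distSq;
			}
		}

		Vector3 nearest = queue[best];
		// Swap-remove: the order inside the list does not matter for a scan.
		queue[best] = queue[queue.Count - 1];
		queue.RemoveAt(queue.Count - 1);
		return nearest;
	}

	// Maps a distance to an LOD index using ascending distance thresholds.
	// Anything beyond the last threshold gets the index after the last entry.
	public static int GetLod(float distance, float[] lodDistances) {
		for (int lod = 0; lod < lodDistances.Length; lod++) {
			if (distance <= lodDistances[lod])
				return lod;
		}
		return lodDistances.Length;
	}
}
```

With three thresholds, everything past the last one maps to LOD 3, which would line up with the loop above treating LOD 3 as "empty mesh", though that mapping is a guess on my part. A priority queue would scale better than the scan, but its priorities go stale as soon as the player moves.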

‘GenerateMeshAsync()’ is where the actual mesh is processed and generated, and where the compute shaders are dispatched from. If you have worked with compute shaders before, most of this will seem familiar. Note that not all of the data is set here: a fair chunk of it only varies between planets, so I set that data once separately and only update it when a different planet is being generated, which provides a small performance improvement.

private void DispatchNoise(RenderTexture result, ComputeBuffer editsBuffer, bool isEditsRootLeaf, GridConfig config, float voxelSize, int chunkSize, float[] chunkLocalPosition, float[] chunkLocalCenterPosition) {
	noiseShader.SetTexture(0, "result", result);
	noiseShader.SetFloat("voxelSize", voxelSize);
	noiseShader.SetFloats("chunkLocalPosition", chunkLocalPosition);
	noiseShader.SetFloats("chunkLocalCenterPosition", chunkLocalCenterPosition);
	noiseShader.SetBool("isEditsRootLeaf", isEditsRootLeaf);
	noiseShader.SetBuffer(0, "edits", editsBuffer);
	noiseShader.SetFloat("chunkExtents", Consts.WORLD_CHUNK_EXTENTS_WITH_PADDING);

	noiseDispatchData.SetDispatchData(noiseShader, config);

	ComputeHelper.Dispatch(noiseShader, chunkSize, chunkSize, chunkSize, 0);
}

private void DispatchMarch(RenderTexture input, ComputeBuffer triangleBuffer, GridConfig config, float voxelSize, int chunkSize, float[] chunkLocalPosition) {
	marchingShader.SetTexture(0, "values", input);
	marchingShader.SetInt("textureSize", chunkSize);
	marchingShader.SetFloat("isoLevel", config.isoLevel);
	marchingShader.SetFloat("voxelSize", voxelSize);
	marchingShader.SetFloats("chunkLocalPosition", chunkLocalPosition);

	triangleBuffer.SetCounterValue(0);
	marchingShader.SetBuffer(0, "triangles", triangleBuffer);

	// March counter buffer is already set to the shader just need to refresh counter
	marchCounter.SetData(new uint[] { 0 });

	int numVoxelsPerAxis = chunkSize - 1;
	ComputeHelper.Dispatch(marchingShader, numVoxelsPerAxis, numVoxelsPerAxis, numVoxelsPerAxis, 0);
}

private async Awaitable<Vertex[]> ProcessMeshDataAsync(GridConfig config, Chunk chunk, int chunkSize) {
	float voxelSize = GetVoxelSize(chunkSize);
	float[] chunkLocalPosition = ComputeHelper.ConvertFloat3ToArray(Helper.GetChunkLocalPosition(chunk));

	RenderTexture result = CreateTexture(chunkSize);
	using ComputeBuffer editsBuffer = GetEditsBuffer(chunk);
	DispatchNoise(result, editsBuffer, chunk.edits.rootIsLeaf, config, voxelSize, chunkSize, chunkLocalPosition, ComputeHelper.ConvertFloat3ToArray(Helper.GetChunkLocalCenterPosition(chunk)));

	using ComputeBuffer triangleBuffer = new ComputeBuffer(GetMaxVertexCount(chunkSize), ComputeHelper.GetStride<Vertex>(), ComputeBufferType.Append);
	DispatchMarch(result, triangleBuffer, config, voxelSize, chunkSize, chunkLocalPosition);

	AsyncGPUReadbackRequest request = await AsyncGPUReadback.RequestAsync(marchCounter);
	result.Release();

	if (!this || request.hasError)
		return null;

	Vertex[] data = GetVertexData(triangleBuffer);
	return data;
}
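`ComputeHelper.Dispatch` is not shown here. With `[numthreads(8,8,8)]` kernels, I assume it rounds the requested size up to whole thread groups, which is why the marching kernel has to reject (and still count) threads that fall outside the chunk. The arithmetic is sketched below in plain C#, with `DispatchSketch` as a hypothetical name.

```csharp
using System;

public static class DispatchSketch {
	// Number of thread groups needed along one axis so that
	// groups * threadsPerGroup covers every item (ceiling division).
	public static int GroupCount(int items, int threadsPerGroup) {
		return (items + threadsPerGroup - 1) / threadsPerGroup;
	}

	// Total threads launched for a cubic dispatch, including those
	// that fall outside the requested range.
	public static long ThreadsLaunched(int itemsPerAxis, int threadsPerGroup) {
		long perAxis = (long)GroupCount(itemsPerAxis, threadsPerGroup) * threadsPerGroup;
		return perAxis * perAxis * perAxis;
	}
}
```

For a chunk with 33 voxels per axis, five groups of eight are needed per axis, so 40 threads run along each axis and the last seven do no work.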

Collision#

Collision is unfortunately a very important feature, and it took a fair amount of tweaking to get decent. I’m not 100% happy with the implementation, but it seemed to be the easiest method: I regenerate the chunk’s mesh at a fixed resolution, lower than that of the rendered mesh, and build the collider from it.

PhysicsCollider chunkCollider = entityManager.GetComponentData<PhysicsCollider>(task.chunkEntity);
if (mesh.vertexCount > 0 && (task.forceUpdate || (lod == 0 && !chunkCollider.IsValid))) {
	using PhysicsMeshData data = await GeneratePhysicsMeshAsync(task.config, task.chunk);

	// In case the entity is destroyed during collision generation
	if (!this || !entityManager.Exists(task.chunkEntity))
		return;

	if (data.vertices.IsCreated && data.vertices.Length > 0) {
		CollisionFilter filter = new CollisionFilter {
			BelongsTo = Consts.DEFAULT_LAYER_BITMASK,
			CollidesWith = Consts.DEFAULT_LAYER_BITMASK | Consts.PLAYER_LAYER_BITMASK,
			GroupIndex = 0
		};

		if (chunkCollider.IsValid)
			chunkCollider.Value.Dispose();

		BlobAssetReference<Collider> collider = MeshCollider.Create(data.vertices, data.triangles, filter, Unity.Physics.Material.Default);
		entityManager.SetComponentData(task.chunkEntity, new PhysicsCollider { Value = collider });
	}
}

While this does work, it impacts performance a lot, and I feel like some heavy optimisation could be done. I also wish the process of generating collision were simpler for entities, as I have to recreate the entire collider even though most of its data would be the same.
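`MeshCollider.Create` takes indexed vertices and triangles, while the marching cubes pass outputs a flat list of three vertices per triangle. The conversion is not shown in this post. One possible approach, assuming the `id` pair stored on each vertex identifies the voxel edge it sits on, is to weld vertices that share an id. The sketch below is plain C# with hypothetical names, not the project's `GeneratePhysicsMeshAsync`.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class MeshWeldSketch {
	// Input: flat triangle soup (3 vertices per triangle) where each vertex carries
	// the pair of voxel indices of the edge it sits on. Vertices with the same pair
	// are treated as the same point and shared between triangles.
	public static (List<Vector3> vertices, List<int> triangles) Weld(
			IReadOnlyList<(Vector3 position, (int a, int b) id)> soup) {
		var vertices = new List<Vector3>();
		var triangles = new List<int>(soup.Count);
		var lookup = new Dictionary<(int, int), int>();

		foreach (var vertex in soup) {
			if (!lookup.TryGetValue(vertex.id, out int index)) {
				index = vertices.Count;
				vertices.Add(vertex.position);
				lookup.Add(vertex.id, index);
			}
			triangles.Add(index);
		}
		return (vertices, triangles);
	}
}
```

Welding reduces the vertex count that the collider has to store, which may help a little with the cost of rebuilding it, though I have not measured that.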

Terraforming#

Actually applying a change to the terrain is where I got stuck for a long time, especially doing it efficiently. One much simpler option would be to have render textures that the player ‘paints’ on and pass those to the compute shaders. The issue is that having a render texture for each chunk is terrible if there are edits to a lot of chunks, and it would not work well with my chunk component, as a render texture is managed data. So my system uses a Burst-compatible octree implementation.

[BurstCompile]
public struct Octree<T> : IDisposable where T : unmanaged {
	public Bounds bounds;

	public NativeArray<OctreeNode<T>> nodes;

	public int nodeCount { get; private set; }
	public bool rootIsLeaf { get; private set; }

	public Octree(Bounds bounds, int nodeCapacity) {
		this.bounds = bounds;
		this.nodeCount = 0;
		rootIsLeaf = false;

		nodes = new NativeArray<OctreeNode<T>>(nodeCapacity, Allocator.Persistent, NativeArrayOptions.ClearMemory);
	}

	public void Dispose() {
		nodes.Dispose();
	}

	// Octree insert implementation that goes down to a specific bounds size
	public bool Insert(float3 point, T v) {
		Bounds workingBounds = bounds;

		// Bounds check
		if (!workingBounds.Contains(point))
			return false;

		// Degenerate case, no nodes yet.
		if (nodeCount == 0) {
			rootIsLeaf = true;
			nodes[nodeCount++] = new OctreeNode<T>(default);
			return Insert(point, v);
		}

		// Prepare initial iteration
		int nodeIndex = 0;
		int parentNodeIndex = -1;
		int parentLocalNodeOffset = 0;
		bool isLeafNode = rootIsLeaf;

		for (int j = 0; j < 256; ++j) {
			OctreeNode<T> node = nodes[nodeIndex];
			bool3 comp = point >= ((float3)workingBounds.center);
			int localNodeOffset = math.bitmask(new bool4(comp, false));

			if (isLeafNode) {
				// Already ideal size just overwrite the value rather than splitting further
				if (j >= Consts.OCTREE_MAX_DEPTH) {
					nodes[nodeIndex] = new OctreeNode<T>(v);
					return true;
				}

				// We're a leaf node, so we need to create some children (split)
				int newNodeIndex = nodeCount + localNodeOffset;

				// Bounds check
				if (nodeCount + 8 >= nodes.Length)
					return false; // throw error?

				// Update our node
				for (int i = 0; i < 8; ++i)
					nodes[nodeCount + i] = new OctreeNode<T>(default);

				nodes[nodeIndex] = new OctreeNode<T>(0xFF, nodeCount);

				// Update parent node
				if (nodeIndex == 0) {
					rootIsLeaf = false;
				} else if (parentNodeIndex >= 0) {
					OctreeNode<T> parentNode = nodes[parentNodeIndex];
					parentNode.LeafMask &= ~(1 << parentLocalNodeOffset);
					nodes[parentNodeIndex] = parentNode;
				}

				nodeCount += 8;

				// Preserve parent information
				parentNodeIndex = nodeIndex;
				parentLocalNodeOffset = localNodeOffset;

				// Prepare next iteration
				nodeIndex = newNodeIndex;
				isLeafNode = true;

				workingBounds.SetMinMax(
					math.select(workingBounds.min, workingBounds.center, comp),
					math.select(workingBounds.center, workingBounds.max, comp)
				);
			} else {
				// We're not a leaf node, so we should insert into our children

				// Preserve parent information
				parentNodeIndex = nodeIndex;
				parentLocalNodeOffset = localNodeOffset;

				// Prepare next iteration
				nodeIndex = node.NodeIndex + localNodeOffset;
				isLeafNode = (node.LeafMask & (1 << localNodeOffset)) != 0;

				workingBounds.SetMinMax(
					math.select(workingBounds.min, workingBounds.center, comp),
					math.select(workingBounds.center, workingBounds.max, comp)
				);
			}
		}

		Debug.Log("Bailed");
		return false;
	}

	public T Get(float3 point) {
		Bounds workingBounds = bounds;

		// Bounds check
		if (!workingBounds.Contains(point))
			return default;

		if (nodeCount == 0)
			return default;

		int nodeIndex = 0;
		bool isLeafNode = rootIsLeaf;

		for (int j = 0; j < 256; ++j) {
			OctreeNode<T> node = nodes[nodeIndex];
			bool3 comp = point >= ((float3)workingBounds.center);
			int localNodeOffset = math.bitmask(new bool4(comp, false));

			if (isLeafNode) {
				return node.Value;
			} else {
				nodeIndex = node.NodeIndex + localNodeOffset;
				isLeafNode = (node.LeafMask & (1 << localNodeOffset)) != 0;

				workingBounds.SetMinMax(
					math.select(workingBounds.min, workingBounds.center, comp),
					math.select(workingBounds.center, workingBounds.max, comp)
				);
			}
		}

		Debug.Log("Bailed");
		return default;
	}

#if UNITY_EDITOR
	public T GetWithGizmoDraw(float3 point) {
		Bounds workingBounds = bounds;

		// Bounds check
		if (!workingBounds.Contains(point))
			return default;

		if (nodeCount == 0)
			return default;

		int nodeIndex = 0;
		bool isLeafNode = rootIsLeaf;

		for (int j = 0; j < 256; ++j) {
			
			OctreeNode<T> node = nodes[nodeIndex];
			bool3 comp = point >= ((float3)workingBounds.center);
			int localNodeOffset = math.bitmask(new bool4(comp, false));

			Gizmos.color = isLeafNode ? Color.red : Color.green;
			Gizmos.DrawWireCube(workingBounds.center, workingBounds.size);

			if (isLeafNode) {
				return node.Value;
			} else {
				nodeIndex = node.NodeIndex + localNodeOffset;
				isLeafNode = (node.LeafMask & (1 << localNodeOffset)) != 0;

				workingBounds.SetMinMax(
					math.select(workingBounds.min, workingBounds.center, comp),
					math.select(workingBounds.center, workingBounds.max, comp)
				);
			}
		}

		return default;
	}

	public T GetWithDebugDraw(float3 point) {
		Bounds workingBounds = bounds;

		// Bounds check
		if (!workingBounds.Contains(point))
			return default;

		if (nodeCount == 0)
			return default;

		int nodeIndex = 0;
		bool isLeafNode = rootIsLeaf;

		for (int j = 0; j < 256; ++j) {
			OctreeNode<T> node = nodes[nodeIndex];
			bool3 comp = point >= ((float3)workingBounds.center);
			int localNodeOffset = math.bitmask(new bool4(comp, false));

			Helper.DrawWireCube(workingBounds.center, workingBounds.size, isLeafNode ? Color.red : Color.green);

			if (isLeafNode) {
				return node.Value;
			} else {
				nodeIndex = node.NodeIndex + localNodeOffset;
				isLeafNode = (node.LeafMask & (1 << localNodeOffset)) != 0;

				workingBounds.SetMinMax(
					math.select(workingBounds.min, workingBounds.center, comp),
					math.select(workingBounds.center, workingBounds.max, comp)
				);
			}
		}

		Debug.Log("Bailed");
		return default;
	}
#endif
}

[BurstCompile]
public struct OctreeNode<T> where T : unmanaged {
	public const int LEAF_MASK_MASK = ((1 << 8) - 1) << 24;
	public const int NODE_INDEX_MASK = (1 << 24) - 1;

	public int LeafMask {
		get => (int)((uint)data >> 24);
		set => data = (data & ~LEAF_MASK_MASK) | (value << 24);
	}

	public int NodeIndex {
		get => data & NODE_INDEX_MASK;
		private set => data = (data & ~NODE_INDEX_MASK) | (value & NODE_INDEX_MASK);
	}

	public T Value {
		[MethodImpl(MethodImplOptions.AggressiveInlining)]
		get => Unsafe.As<int, T>(ref data);
		[MethodImpl(MethodImplOptions.AggressiveInlining)]
		private set => Unsafe.As<int, T>(ref data) = value;
	}

	private int data;

	public OctreeNode(int leafMask, int nodeIndex) {
		data = 0;
		LeafMask = leafMask;
		NodeIndex = nodeIndex;
	}

	public OctreeNode(T value) {
		data = 0;
		Value = value;
	}

	public bool AreEqual(T value) {
		int comp = 0;
		Unsafe.As<int, T>(ref comp) = value;
		return data == comp;
	}
}

This allows me to store and edit the data in as little memory as possible, which helps a lot with performance and allows me to pass it straight to the compute shader. The cost is that it is much harder to implement, and even in its current state there is a possible error that can occur after a large number of edits on the same chunk. I can’t take complete credit for this implementation, as the final version was done with the assistance of a good friend of mine, Craig Chapman ( link ), who helped massively on the implementation shown above.
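The compactness comes from every node being a single 32-bit integer. For a branch node, the layout used by `OctreeNode<T>` above can be restated as the following standalone sketch (plain C#, with `OctreeNodeBits` as a hypothetical helper name), along with the octant calculation that `math.bitmask` performs during traversal.

```csharp
using System;

public static class OctreeNodeBits {
	const int NodeIndexBits = 24;
	const int NodeIndexMask = (1 << NodeIndexBits) - 1;

	// Branch node layout: top 8 bits flag which children are leaves,
	// low 24 bits hold the index of the first of 8 consecutive children.
	public static int Pack(int leafMask, int firstChildIndex) {
		return ((leafMask & 0xFF) << NodeIndexBits) | (firstChildIndex & NodeIndexMask);
	}

	// Unsigned shift so a set top bit does not sign-extend into the mask.
	public static int LeafMask(int node) => (int)((uint)node >> NodeIndexBits);
	public static int FirstChildIndex(int node) => node & NodeIndexMask;

	// Octant of a point relative to a centre: bit 0 = +x, bit 1 = +y, bit 2 = +z.
	public static int ChildOffset(float px, float py, float pz, float cx, float cy, float cz) {
		return (px >= cx ? 1 : 0) | (py >= cy ? 2 : 0) | (pz >= cz ? 4 : 0);
	}
}
```

A 24-bit child index caps a tree at 16,777,216 nodes, far more than a single chunk should need. A leaf node reuses the same 32 bits for its value, which is why the node itself cannot tell whether it is a leaf and the parent's mask has to.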

To actually apply the edits in the noise generation compute shader, I have to traverse the octree data passed in through the ‘edits’ buffer within the shader itself and add the stored value to the final value generated.

// Previous declarations same as before

float chunkExtents;
float voxelSize;
bool isEditsRootLeaf;

int getLeafMask(int node) {
	return node >> 24;
}

int getNodeIndex(int node) {
	return node & ((1 << 24) - 1);
}

float sampleEdits(float3 p) {
	if (edits.Length == 0)
		return 0;

	float3 boundsMin = chunkLocalCenterPosition - chunkExtents;
	float3 boundsMax = chunkLocalCenterPosition + chunkExtents;
	float3 boundsCenter = chunkLocalCenterPosition;

	if (any(p > boundsMax) || any(p < boundsMin))
		return 0;

	// initial setup
	uint nodeIndex = 0;
	bool isLeafNode = isEditsRootLeaf;

	[loop]
	for (int i = 0; i < 256; ++i) {
		int node = edits[nodeIndex];
		bool3 comp = p >= boundsCenter;
		uint localNodeOffset = (comp.z << 2) | (comp.y << 1) | comp.x;

		if (isLeafNode)
			return asfloat(node);

		isLeafNode = (getLeafMask(node) & (1 << localNodeOffset)) != 0;
		nodeIndex = getNodeIndex(node) + localNodeOffset;

		if (nodeIndex >= edits.Length)
			return 0;

		boundsMin =	comp ? boundsCenter : boundsMin;
		boundsMax =	comp ? boundsMax : boundsCenter;
		boundsCenter = boundsMin + ((boundsMax - boundsMin) * 0.5);
	}
	
	return 0;
}

// Noise sampling functionality, same as before

[numthreads(8,8,8)]
void CSMain (uint3 id : SV_DispatchThreadID) {
	// sampling same as before
	
	result[id] = lerp(
		-1.0,
		v * caveMask * tunnelsMask,
		centerDist / (planetSurfaceRadius * 0.35)	// planetSurfaceRadius * 0.35 is the planet's 'core' radius, inside which no caves/tunnels appear
	) + sampleEdits(p);
}
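Debugging a traversal like `sampleEdits` on the GPU is awkward, so a CPU reference can be handy for comparing results. The sketch below mirrors the shader logic in plain C# over an `int[]`. It assumes leaf nodes hold the raw bits of a float (matching `asfloat(node)`), and `EditsTraversalSketch` is a hypothetical name, not part of the project.

```csharp
using System;
using System.Numerics;

public static class EditsTraversalSketch {
	public static float SampleEdits(int[] nodes, bool rootIsLeaf, Vector3 center, float extents, Vector3 p) {
		if (nodes.Length == 0)
			return 0f;

		Vector3 min = center - new Vector3(extents);
		Vector3 max = center + new Vector3(extents);
		if (p.X < min.X || p.Y < min.Y || p.Z < min.Z || p.X > max.X || p.Y > max.Y || p.Z > max.Z)
			return 0f;

		int nodeIndex = 0;
		bool isLeaf = rootIsLeaf;

		for (int depth = 0; depth < 256; depth++) {
			int node = nodes[nodeIndex];
			if (isLeaf)
				return BitConverter.Int32BitsToSingle(node);   // leaf payload is a raw float

			bool px = p.X >= center.X, py = p.Y >= center.Y, pz = p.Z >= center.Z;
			int offset = (px ? 1 : 0) | (py ? 2 : 0) | (pz ? 4 : 0);

			// Top 8 bits: which children are leaves. Low 24 bits: first child index.
			isLeaf = (((uint)node >> 24) & (1u << offset)) != 0;
			nodeIndex = (node & 0xFFFFFF) + offset;
			if (nodeIndex >= nodes.Length)
				return 0f;

			// Shrink the bounds to the chosen octant.
			min = new Vector3(px ? center.X : min.X, py ? center.Y : min.Y, pz ? center.Z : min.Z);
			max = new Vector3(px ? max.X : center.X, py ? max.Y : center.Y, pz ? max.Z : center.Z);
			center = (min + max) * 0.5f;
		}
		return 0f;
	}
}
```

Feeding the same node array and sample points to this function and to the shader should give identical values, which makes it easier to tell whether a visual glitch comes from the octree contents or from the traversal.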

To complete the picture from the player’s side: an edit is simply a raycast onto the mesh, with the hit positions passed to the octree. I was experimenting with multiple positions to allow for a larger ‘brush’, but it seems to cause issues at chunk edges, allowing the player to occasionally see through the floor. I could alternatively decrease the octree’s max depth, which would give a wider ‘brush’ but would no longer allow finer edits.

[BurstCompile]
public void Terraform(ref SystemState state, LocalToWorld camLTW, bool subtract) {
	PhysicsWorld physicsWorld = SystemAPI.GetSingleton<PhysicsWorldSingleton>().PhysicsWorld;

	// This filter isn't functioning as expected
	CollisionFilter filter = new CollisionFilter {
		BelongsTo = Consts.ALL_LAYER_BITMASK,
		CollidesWith = Consts.DEFAULT_LAYER_BITMASK,
		GroupIndex = 0
	};

	RaycastInput raycastInput = new RaycastInput {
		Start = camLTW.Position,
		End = camLTW.Position + math.forward(camLTW.Rotation) * 10f,
		Filter = filter
	};

	if (physicsWorld.CastRay(raycastInput, out RaycastHit raycastHit)) {
		EntityCommandBuffer ecb = new EntityCommandBuffer(Allocator.Temp);
		NativeList<DistanceHit> overlapHits = new NativeList<DistanceHit>(Allocator.Temp);
		if (physicsWorld.OverlapSphere(raycastHit.Position, 1f, ref overlapHits, filter, QueryInteraction.IgnoreTriggers)) {
			float3[] positions = {
				raycastHit.Position,
				raycastHit.Position + (math.up() * 0.5f),
				raycastHit.Position + (math.down() * 0.5f),
				raycastHit.Position + (math.left() * 0.5f),
				raycastHit.Position + (math.right() * 0.5f),
			};
			foreach (DistanceHit hit in overlapHits) {
				if (SystemAPI.HasComponent<Chunk>(hit.Entity)) {
					RefRW<Chunk> chunk = SystemAPI.GetComponentRW<Chunk>(hit.Entity);
					bool change = false;
					foreach (float3 pos in positions)
						change = chunk.ValueRW.edits.Insert(pos, subtract ? Consts.REMOVED_VOXEL_VALUE : Consts.ADDED_VOXEL_VALUE) ? true : change;

					if (change) {
						UpdateChunkRequest req = new UpdateChunkRequest { forceColliderUpdate = true };
						if (SystemAPI.HasComponent<UpdateChunkRequest>(hit.Entity)) {
							ecb.SetComponent(hit.Entity, req);
						} else {
							ecb.AddComponent(hit.Entity, req);
						}
					}
				}
			}
		}
		overlapHits.Dispose();
		ecb.Playback(state.EntityManager);
	}
}
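The five hard-coded positions above form a small cross around the hit point. If I were to revisit the brush, one option might be to generate the sample points from a radius and spacing instead, as in the standalone sketch below (plain C#, hypothetical names). It does not address the chunk-edge issue mentioned earlier; it only shows how the positions could be produced.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class BrushSketch {
	// Returns grid points within 'radius' of 'center', spaced 'spacing' apart.
	// Ideally 'spacing' matches the size of an octree leaf so every leaf inside
	// the brush receives exactly one insert.
	public static List<Vector3> SpherePoints(Vector3 center, float radius, float spacing) {
		if (spacing <= 0f)
			throw new ArgumentOutOfRangeException(nameof(spacing));

		var points = new List<Vector3>();
		int steps = (int)MathF.Floor(radius / spacing);
		float radiusSq = radius * radius;

		for (int z = -steps; z <= steps; z++)
		for (int y = -steps; y <= steps; y++)
		for (int x = -steps; x <= steps; x++) {
			Vector3 offset = new Vector3(x, y, z) * spacing;
			if (offset.LengthSquared() <= radiusSq)
				points.Add(center + offset);
		}
		return points;
	}
}
```

With a radius and spacing of 0.5 this yields seven points: the centre plus one step along each axis in both directions, which is the 3D version of the cross used above.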

Conclusion#

I’m very happy with what I was able to accomplish in this project. I may return to work on it further, or rework it to allow for multiple materials that can be placed, as well as fix the insertion error that can occur after a lot of changes. I do also feel like it’s reaching the point where it requires further optimisation: in its current state it’s ideal for a showcase, but if it were used within an actual game, the performance would need to improve greatly.
