For the past few weeks, I decided to look into optimisation techniques used in game development, or more specifically, how compute shaders are used in Unity. Although compute shaders on their own can be quite difficult to learn and understand without any background in other shader programming or HLSL. So I thought the best way to teach myself was to also learn another interesting algorithm called "Marching Cubes", which I could then translate into a compute shader.

The following links helped me understand how compute shaders work, and the Marching Cubes algorithm:

Compute Shaders in Unity: __https://polycoding.net/compute-shaders-unity__

GPU Ray Tracing in Unity: __http://three-eyed-games.com/2018/05/03/gpu-ray-tracing-in-unity-part-1/__

Marching Cubes in Unity Part 1: __https://polycoding.net/marching-cubes/part-1/__

Marching Cubes in Unity Part 2: __https://polycoding.net/marching-cubes/part-2/__

### Research on Compute Shaders

Before I could start on creating the marching cubes algorithm, I first needed to understand how compute shaders work, and why they are faster than writing everything in C# on the CPU. Following along with the first link helped me a lot while doing this.

##### What is a Compute Shader

Compute shaders are best used when we need to calculate many things at once. Things like pathfinding, procedurally generating things, or even ray tracing. Doing this on the CPU is possible, but would take a long time to calculate, because it executes each line of code in order, one at a time. Doing the same thing in a compute shader is much faster because they allow us to execute multiple lines of code side by side.

When running a compute shader on the GPU, it creates something called a **workgroup**. A workgroup is basically a smaller computer that runs along side with many other workgroups, and are completely independent of eachother. Because of this, they are not guaranteed to be executed in a specific order.

#### A Single Workgroup

Workgroups are run in 3D space, therefore have 3 dimensions working along side eachother.

Each workgroup is then divided into threads which are also run in 3D space.

##### Writing my First Compute Shaders

After following the tutorial in the first link, I was able to create my first compute shader that calculated a random position for each cube in a grid, then colour it based on its height. (White for a Y position of 1, and Black for a Y position of -1). It calculated these every 0.5 seconds.

I also followed a much more intricate tutorial that showed me how to ray trace in compute shaders. After following along with the second link, I was able to create some smooth spheres that mirror each other. I also learnt how ray tracing works.

Now that I had written some compute shaders, and have a basic understanding of how they work, I then moved onto learning how to write the marching cubes algorithm.

### Marching Cubes

The marching cubes algorithm is a way of creating a surface based on some data given by a cube. The most common use for marching cubes in game development is to create unique terrain with caves and overhangs, as well as allowing the terrain to be modified in real time.

##### How Marching Cubes Works

The first thing to look at is a single cube, and how we get data from this cube and turn it into mesh data.

Each cube has 8 vertices and 12 edges. And each of these have a specific index, which is seen in the photos above. Each vertex has a value, and when it is below a certain threshold called the "Isosurface", it counts as __below__ the surface, and if its above the threshold, it will count as __ above__ the surface.

For an example, lets say that vertex 0 is below the threshold, while all the other vertices are above. Using this data, we now create a triangle in front of the vertex by connecting the 3 edges around it. In this case it would be edges 0, 3 and 8.

This gives us the following results...

There are a total of 256 combinations, but only 15 unique cases, which can be seen in the image below...

Using the data from a single cube, we search through a triangulation table to see which of the 256 cases has the correct triangle placement for the cube. We then march through every single cube in the world, then generate a mesh using the triangles that have been generated.

This triangulation table is an array of arrays consisting of floats. For example, the first of the 256 items would be the following...

`{0, 8, 3, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}`

With this, we take each set of 3 digits, which gives us the information needed to create the triangles. In this particular case, the first three digits tell us we need to connect edges 0, 8 and 3. The rest of the vertices are considered above ground, so we fill them in as -1.

##### Marching Cubes in Unity

After researching the algorithm, it was time to implement it in Unity, though just in C# for now. The first step was to create a 3D grid of points that have a certain noise value. I simply created a small struct called "GridPoint" which contains a Vector3 position and a float value. I then use a nested for loop to loop through the X, Y and Z coordinates and place a grid point. To create some noise, I used Unity's inbuilt noise functionality which allowed me to create some 3D Perlin Noise.

```
public struct GridPoint
{
public float value;
public Vector3 position;
}
```

```
for (int x = 0; x < gridSize; x++)
{
for (int y = 0; y < gridSize; y++)
{
for (int z = 0, i = 0; z < gridSize; z++, i++)
{
GridPoint point = new GridPoint();
point.position = new Vector3(transform.position.x + x, transform.position.y + y, transform.position.z + z);
float val = Noise.PerlinNoise3D(((float)x / gridSize) * noiseScale, ((float)y / gridSize) * noiseScale, ((float)z / gridSize) * noiseScale) * noiseAmplitude;
point.value = val;
gridPoints.Add(point);
}
}
}
```

We also use OnDrawGizmos() to draw all our points, using the Color.Lerp() function to show black if we are below ground, and white if we are above ground. After doing so, I got a 16 x 16 grid with some 3D Perlin Noise values.

The next step was to implement the actual algorithm using these noise values. After downloading the triangulation table from the internet, (because there was no way I was going to figure out those values on my own), I needed a way to get the correct index. Online tutorials suggest using the bitwise OR operation is the best way to do this.

I started off with a single int called "cubeindex", and initialise it to 0. We then check the value at a single vertex and check if its value is below the isosurface threshold. If this is true, we use bitwise OR with a value of 1. We repeat this step for the rest of the vertices, but doubling the value each time.

Vertex 0 |= 1;

Vertex 1 |= 2;

Vertex 2 |= 4;

Vertex 3 |= 8;

Vertex 4 |= 18;

etc...

The code should look something like this...

```
int GetCubeIndex(float[] corners)
{
//Set the index for the cube by using bitwise OR operator
int cubeIndex = 0;
if (corners[0] < isoSurface) cubeIndex |= 1;
if (corners[1] < isoSurface) cubeIndex |= 2;
if (corners[2] < isoSurface) cubeIndex |= 4;
if (corners[3] < isoSurface) cubeIndex |= 8;
if (corners[4] < isoSurface) cubeIndex |= 16;
if (corners[5] < isoSurface) cubeIndex |= 32;
if (corners[6] < isoSurface) cubeIndex |= 64;
if (corners[7] < isoSurface) cubeIndex |= 128;
return cubeIndex;
}
```

The "corners" array is obviously an array of floats that holds the isosurface value of each corner of the current cube. The returned number will be the correct index that we need to get the correct triangulation results from the table. I then just got these values and put them in a temporary array...

`int edges[] = triTable[cubeIndex];`

If we are looking at the example I used before, since only vertex 0 is underground, we would get a cube index of 1. We the search the table for the array at index 1 which gives us the following array...

`{0, 8, 3, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1}`

Like I mentioned before, we need to go through each set of three digits. In this case, the first set happens to be** 0, 8 and 3**. With these 3 digits (which represent the edges), we need to find what vertices each edge lies between. We can find this by using another useful function I copied from online.

```
static const int edgeConnections[12][2] = {
{0,1}, {1,2}, {2,3}, {3,0},
{4,5}, {5,6}, {6,7}, {7,4},
{0,4}, {1,5}, {2,6}, {3,7}
};
```

This is an array of 12 items, with each item containing 2 ints. How this table works is quite simple. Each item represents an edge (12 edges), and the ints in each item tell us what vertices each edge lies between.

For example:

Edge 0 lies between V0 and V1.

Edge 1 lies between V1 and V2

Edge 7 lies between V7 and V4

etc...

For our particular case, we need to find the vertices for edges 0, 8 and 3. So they would be...

E0 = V0 and V1

E8 = V0 and V4

E3 = V3 and V0

Here is a visualisation...

Here is the code...

```
for (int i = 0; edges[i] != -1; i += 3)
{
int edgeA0 = MarchingCubeTable.EdgeConnections[edges[i]][0];
int edgeA1 = MarchingCubeTable.EdgeConnections[edges[i]][1];
int edgeB0 = MarchingCubeTable.EdgeConnections[edges[i + 1]][0];
int edgeB1 = MarchingCubeTable.EdgeConnections[edges[i + 1]][1];
int edgeC0 = MarchingCubeTable.EdgeConnections[edges[i + 2]][0];
int edgeC1 = MarchingCubeTable.EdgeConnections[edges[i + 2]][1];
}
```

We keep looping through each set of three digits until we hit a number that is equal to -1, which counts as blank space. In our case, we stop right after we get the first three digits, because the rest of them are blank.

As part of the final step, we need to know the physical coordinates of our vertices. This can easily be done by using the following function...

```
static const float3 cornerOffsets[8] = {
float3(0, 0, 1),
float3(1, 0, 1),
float3(1, 0, 0),
float3(0, 0, 0),
float3(0, 1, 1),
float3(1, 1, 1),
float3(1, 1, 0),
float3(0, 1, 0)
};
```

From here, we can either interpolate between the edges to get a smoother look, or leave it as it is to get a cooler blocky look, similar to minecraft. To interpolate, I used the following formula...

```
float3 Interp(float3 edgeVertex1, float valueAtVertex1, float3 edgeVertex2, float valueAtVertex2)
{
return (edgeVertex1 + (_IsoLevel - valueAtVertex1) * (edgeVertex2 - edgeVertex1) / (valueAtVertex2 - valueAtVertex1));
}
```

Other wise, for the blocky look, just add the vertices, then divide by 2 to get the mid point...

```
float3 CubeInterp(float3 indexA, float3 indexB, uint3 id)
{
return (indexA + indexB) / 2 + id;
}
```

##### Marching Cubes in a Compute Shader

The next step was to convert everything I had learnt about compute shaders and marching cubes, and combine them. The HLSL language isn't too hard to understand, the hardest part was transferring everything from Unity C# into HLSL and making them both communicate with each other.

Below is the entire compute shader...

```
#pragma kernel MarchCubes
#include "MarchingTable.hlsl"
static const uint numThreads = 8;
uint _ChunkSize;
float _IsoLevel;
RWStructuredBuffer<float> _Weights;
struct Triangle
{
float3 a, b, c;
};
AppendStructuredBuffer<Triangle> _Triangles;
//Interpolate between 2 edges
float3 Interp(float3 edgeVertex1, float valueAtVertex1, float3 edgeVertex2, float valueAtVertex2)
{
return (edgeVertex1 + (_IsoLevel - valueAtVertex1) * (edgeVertex2 - edgeVertex1) / (valueAtVertex2 - valueAtVertex1));
}
float3 CubeInterp(float3 indexA, float3 indexB)
{
return (indexA + indexB) / 2;
}
//Find a 3D point from a 1D array
static int IndexFromCoord(int x, int y, int z)
{
return x + _ChunkSize * (y + _ChunkSize * z);
}
[numthreads(numThreads,numThreads,numThreads)]
void MarchCubes (uint3 id : SV_DispatchThreadID)
{
//Return if we are outside the bounds of the array
if (id.x >= _ChunkSize - 1 || id.y >= _ChunkSize - 1 || id.z >= _ChunkSize - 1)
{
return;
}
//Get the current cube
float cubeValues[8] = {
_Weights[IndexFromCoord(id.x, id.y, id.z + 1)],
_Weights[IndexFromCoord(id.x + 1, id.y, id.z + 1)],
_Weights[IndexFromCoord(id.x + 1, id.y, id.z)],
_Weights[IndexFromCoord(id.x, id.y, id.z)],
_Weights[IndexFromCoord(id.x, id.y + 1, id.z + 1)],
_Weights[IndexFromCoord(id.x + 1, id.y + 1, id.z + 1)],
_Weights[IndexFromCoord(id.x + 1, id.y + 1, id.z)],
_Weights[IndexFromCoord(id.x, id.y + 1, id.z)]
};
//Find the index at which the cube will be filled
int cubeIndex = 0;
if (cubeValues[0] < _IsoLevel) cubeIndex |= 1;
if (cubeValues[1] < _IsoLevel) cubeIndex |= 2;
if (cubeValues[2] < _IsoLevel) cubeIndex |= 4;
if (cubeValues[3] < _IsoLevel) cubeIndex |= 8;
if (cubeValues[4] < _IsoLevel) cubeIndex |= 16;
if (cubeValues[5] < _IsoLevel) cubeIndex |= 32;
if (cubeValues[6] < _IsoLevel) cubeIndex |= 64;
if (cubeValues[7] < _IsoLevel) cubeIndex |= 128;
//Get the array at the index of the triangulation table
int edges[] = triTable[cubeIndex];
//Set the edges for each triangle
for (int i = 0; edges[i] != -1; i += 3)
{
//First edge between vertex A0 and A1
int A0 = edgeConnections[edges[i]][0];
int A1 = edgeConnections[edges[i]][1];
//Second edge between vertex B0 and B1
int B0 = edgeConnections[edges[i + 1]][0];
int B1 = edgeConnections[edges[i + 1]][1];
//Third edge between vertex C0 and C1
int C0 = edgeConnections[edges[i + 2]][0];
int C1 = edgeConnections[edges[i + 2]][1];
//New triangle with interpolated edges
Triangle tri;
tri.a = CubeInterp(cornerOffsets[A0], cornerOffsets[A1]) + id;
tri.b = CubeInterp(cornerOffsets[B0], cornerOffsets[B1]) + id;
tri.c = CubeInterp(cornerOffsets[C0], cornerOffsets[C1]) + id;
//Add the triangle to the list
_Triangles.Append(tri);
}
}
```

Below is the C# script that communicates with the compute shader...

```
public class Chunk : MonoBehaviour
{
ComputeBuffer _trianglesBuffer;
ComputeBuffer _trianglesCountBuffer;
ComputeBuffer _weightsBuffer;
[SerializeField] ComputeShader marchingShader;
[SerializeField] NoiseGenerator noiseGenerator;
[SerializeField] MeshFilter meshFilter;
public NoiseGenerator Noise { get { return noiseGenerator; } }
struct Triangle
{
public Vector3 a;
public Vector3 b;
public Vector3 c;
//Vector3 is 3 floats, therefore 3 vectors = float * 3 * 3
public static int SizeOf => sizeof(float) * 3 * 3;
}
float[] _weights;
private void Awake()
{
CreateBuffers();
}
private void OnDestroy()
{
ReleaseBuffers();
}
//Create buffers
void CreateBuffers()
{
_trianglesBuffer = new ComputeBuffer(5 * (GridMetrics.PointsPerChunk * GridMetrics.PointsPerChunk * GridMetrics.PointsPerChunk), Triangle.SizeOf, ComputeBufferType.Append);
_trianglesCountBuffer = new ComputeBuffer(1, sizeof(int), ComputeBufferType.Raw);
_weightsBuffer = new ComputeBuffer(GridMetrics.PointsPerChunk * GridMetrics.PointsPerChunk * GridMetrics.PointsPerChunk, sizeof(float));
}
//Release buffers
void ReleaseBuffers()
{
_trianglesBuffer.Release();
_trianglesCountBuffer.Release();
_weightsBuffer.Release();
}
public void CreateChunk()
{
//Clear the weights array if its not empty
if (_weights != null)
Array.Clear(_weights, 0, _weights.Length);
//Set the weights values to some generated noise
_weights = noiseGenerator.GetNoise();
//Destroy the shared mesh
if (meshFilter.sharedMesh)
Destroy(meshFilter.sharedMesh);
//Construct the mesh based on the generated noise
meshFilter.sharedMesh = ConstructMesh();
}
Mesh ConstructMesh()
{
//Set buffers
marchingShader.SetBuffer(0, "_Triangles", _trianglesBuffer);
marchingShader.SetBuffer(0, "_Weights", _weightsBuffer);
//Set variables in shader
marchingShader.SetInt("_ChunkSize", GridMetrics.PointsPerChunk);
marchingShader.SetFloat("_IsoLevel", 0.5f);
//Get the data from the weights buffer and clear triangles buffer
_weightsBuffer.SetData(_weights);
_trianglesBuffer.SetCounterValue(0);
//Dispatch kernal 0
marchingShader.Dispatch(0,
GridMetrics.PointsPerChunk / GridMetrics.NumThreads, //X Threads
GridMetrics.PointsPerChunk / GridMetrics.NumThreads, //Y Threads
GridMetrics.PointsPerChunk / GridMetrics.NumThreads); //Z Threads
Triangle[] triangles = new Triangle[ReadTriangleCount()];
_trianglesBuffer.GetData(triangles);
return CreateMeshFromTriangles(triangles);
}
int ReadTriangleCount()
{
int[] triCount = { 0 };
ComputeBuffer.CopyCount(_trianglesBuffer, _trianglesCountBuffer, 0);
_trianglesCountBuffer.GetData(triCount);
return triCount[0];
}
Mesh CreateMeshFromTriangles(Triangle[] triangles)
{
//Create new vertices and trianlges array
Vector3[] verts = new Vector3[triangles.Length * 3];
int[] tris = new int[triangles.Length * 3];
//Set the values
for (int i = 0; i < triangles.Length; i++)
{
int startIndex = i * 3;
verts[startIndex] = triangles[i].a;
verts[startIndex + 1] = triangles[i].b;
verts[startIndex + 2] = triangles[i].c;
tris[startIndex] = startIndex;
tris[startIndex + 1] = startIndex + 1;
tris[startIndex + 2] = startIndex + 2;
}
//Create new mesh and set the verts and triangles
Mesh mesh = new Mesh();
mesh.vertices = verts;
mesh.triangles = tris;
mesh.RecalculateNormals();
return mesh;
}
private void OnDrawGizmos()
{
if (_weights == null || _weights.Length == 0) return;
//Draw weighted points
for (int x = 0, i = 0; x < GridMetrics.PointsPerChunk; x++)
{
for (int y = 0; y < GridMetrics.PointsPerChunk; y++)
{
for (int z = 0; z < GridMetrics.PointsPerChunk; z++, i++)
{
float noiseValue = _weights[i];
Gizmos.color = Color.Lerp(Color.black, Color.white, noiseValue);
Gizmos.DrawWireCube(new Vector3(x, y, z), new Vector3(0.1f, 0.1f, 0.1f));
}
}
}
}
}
```

A visualisation of the algorithm...