Boids and Jobs
Boids#
Boids are an algorithm developed by Craig Reynolds in 1986 to simulate the natural flocking behaviour of birds, schools of fish, or other group dynamics. This is done by having each boid within a system follow three rules:
- Separation: Avoid crowding neighbours (steer to avoid collisions).
- Alignment: Steer toward the average heading of nearby boids (match direction).
- Cohesion: Move toward the average position of nearby boids (stay close to the group).
Boids within my project#
In my case, I’m using boids to allow for ships to fly towards and around a target asteroid to mine it and while in this, I don’t require worrying about collision, as I’m fine with the ships crossing each other. I still prefer that not be the case, and some effort be made to space them out.
In the video above, I have 8000 ships flying around destroying asteroids while running at 25fps (Disclaimer: This is being run on a Ryzen 7 9800X3D). Which, while this isn’t perfect, is more than enough for this project, at least for now.
Why use Jobs#
Originally, I implemented this using an ISystem OnUpdate loop and iterating over all the boids in the world via a foreach loop, which works fine for up to 100 ships and maybe 200-400 at a stretch. But the framerate will tank severely the more that are added, and while the implementation is not perfect and there is room for improvement, it would need dramatic changes if I wanted far more than 100+ ships.
[BurstCompile]
public void OnUpdate(ref SystemState state) {
float deltaTime = SystemAPI.Time.DeltaTime;
NativeArray<Entity> boids = query.ToEntityArray(Allocator.Temp);
for (int i = 0; i < boids.Length; ++i) {
RefRW<Boid> currentBoid = SystemAPI.GetComponentRW<Boid>(boids[i]);
RefRW<LocalTransform> currentBoidTransform = SystemAPI.GetComponentRW<LocalTransform>(boids[i]);
float2 seperation = float2.zero;
float2 alignment = float2.zero;
float2 cohesion = float2.zero;
int seperationCount = 0;
int alignmentCount = 0;
int cohesionCount = 0;
for (int j = 0; j < boids.Length; ++j) {
if (i == j)
continue;
RefRO<Boid> otherBoid = SystemAPI.GetComponentRO<Boid>(boids[j]);
RefRO<LocalTransform> otherBoidTransform = SystemAPI.GetComponentRO<LocalTransform>(boids[j]);
if (otherBoidTransform.ValueRO.Position.Equals(currentBoidTransform.ValueRO.Position))
currentBoidTransform.ValueRW.Position.y += 0.1f;
float2 dir = currentBoidTransform.ValueRO.Position.xy - otherBoidTransform.ValueRO.Position.xy;
float dist = math.length(dir);
if (dist < currentBoid.ValueRO.seperationRadius) {
seperation += dir / dist;
seperationCount++;
}
if (dist < currentBoid.ValueRO.alignmentRadius) {
alignment += otherBoid.ValueRO.velocity;
alignmentCount++;
}
if (dist < currentBoid.ValueRO.cohesionRadius) {
cohesion += otherBoidTransform.ValueRO.Position.xy;
cohesionCount++;
}
}
if (seperationCount > 0)
seperation = (seperation / seperationCount) * currentBoid.ValueRO.seperationWeight;
if (alignmentCount > 0)
alignment = (alignment / alignmentCount) * currentBoid.ValueRO.alignmentWeight;
if (cohesionCount > 0)
cohesion = ((cohesion / cohesionCount) - currentBoidTransform.ValueRO.Position.xy) * currentBoid.ValueRO.cohesionWeight;
float2 targetSteering = math.normalize(currentBoid.ValueRO.target - currentBoidTransform.ValueRO.Position.xy);
float2 newVel = currentBoid.ValueRO.velocity + seperation + alignment + cohesion + targetSteering;
currentBoid.ValueRW.velocity = math.lerp(currentBoid.ValueRO.velocity, math.normalize(newVel) * math.clamp(math.length(newVel), 0, currentBoid.ValueRO.maxSpeed), math.min(currentBoid.ValueRO.turnRateDegrees * deltaTime, 1.0f));
currentBoidTransform.ValueRW.Position.xy += currentBoid.ValueRO.velocity * deltaTime;
if (math.length(currentBoid.ValueRO.velocity) > 0.1f) {
currentBoidTransform.ValueRW.Rotation = quaternion.RotateZ(math.atan2(currentBoid.ValueRO.velocity.y, currentBoid.ValueRO.velocity.x));
}
}
boids.Dispose();
}
So the process of converting this to Jobs so it can be run in parallel needs to be done. For this implementation, I used the IJobEntity, which is similar to the foreach query from the previous implementation, with the entities being used being queried and the code within being run on each one of them, but rather than it all being done on the main thread and one after another, the entities are split into chunks and are then spread over additional threads.
The implementation would require tweaks, though, as we are now doing all these in parallel, I need to be careful about writing to and changing entities as each boid needs to know the velocity and position of each other. If I want to stay true to the previous implementation, I would need to use EntityCommandBuffer to queue the changes and then set the components on each boid. Which would work, but this causes structural changes. These happen whenever a component and/or entity is created or destroyed; these are expensive operations, so we would want to update the existing component to avoid this. So I created snapshots of the current state before the job is run with the positions and the state of all the boid components. This means I have a read-only array of all the data the boids would require, without having to worry about threads fighting each other and inevitably crashing the simulation and the game.
[BurstCompile]
private partial struct BoidJob : IJobEntity {
[ReadOnly] public float deltaTime;
[ReadOnly] public NativeArray<Entity> boidEntities;
[ReadOnly] public NativeArray<Boid> boidSnapshot;
[ReadOnly] public NativeArray<float2> boidPositionSnapshot;
[BurstCompile]
public void Execute(Entity currentBoidEntity, RefRW<LocalTransform> currentBoidTransform, RefRW<Boid> currentBoid) {
float2 seperation = float2.zero;
float2 alignment = float2.zero;
float2 cohesion = float2.zero;
int seperationCount = 0;
int alignmentCount = 0;
int cohesionCount = 0;
for (int j = 0; j < boidEntities.Length; ++j) {
if (currentBoidEntity == boidEntities[j])
continue;
float2 otherBoidPositon = boidPositionSnapshot[j];
if (otherBoidPositon.Equals(currentBoidTransform.ValueRO.Position.xy)) {
Random random = new Random((uint)currentBoidEntity.Index);
currentBoidTransform.ValueRW.Position.xy += random.NextFloat2(0f, 0.1f);
}
float2 dir = currentBoidTransform.ValueRO.Position.xy - otherBoidPositon;
float dist = math.length(dir);
if (dist < currentBoid.ValueRO.seperationRadius) {
seperation += dir / dist;
seperationCount++;
}
if (dist < currentBoid.ValueRO.alignmentRadius) {
alignment += boidSnapshot[j].velocity;
alignmentCount++;
}
if (dist < currentBoid.ValueRO.cohesionRadius) {
cohesion += otherBoidPositon;
cohesionCount++;
}
}
if (seperationCount > 0)
seperation = (seperation / seperationCount) * currentBoid.ValueRO.seperationWeight;
if (alignmentCount > 0)
alignment = (alignment / alignmentCount) * currentBoid.ValueRO.alignmentWeight;
if (cohesionCount > 0)
cohesion = ((cohesion / cohesionCount) - currentBoidTransform.ValueRO.Position.xy) * currentBoid.ValueRO.cohesionWeight;
float2 targetSteering = math.normalize(currentBoid.ValueRO.target - currentBoidTransform.ValueRO.Position.xy);
float2 newVel = currentBoid.ValueRO.velocity + seperation + alignment + cohesion + targetSteering;
currentBoid.ValueRW.velocity = math.lerp(currentBoid.ValueRO.velocity, math.normalize(newVel) * math.clamp(math.length(newVel), 0, currentBoid.ValueRO.maxSpeed), math.min(currentBoid.ValueRO.turnRateDegrees * deltaTime, 1.0f));
currentBoidTransform.ValueRW.Position.xy = currentBoidTransform.ValueRO.Position.xy + (currentBoid.ValueRO.velocity * deltaTime);
if (math.length(currentBoid.ValueRO.velocity) > 0.1f) {
currentBoidTransform.ValueRW.Rotation = quaternion.RotateZ(math.atan2(currentBoid.ValueRO.velocity.y, currentBoid.ValueRO.velocity.x));
}
}
}
Other optimisations#
While, im happy with the performance for this project for now using the job system it does not mean further improvements couldnt be made for example a spatial grid could be used to reduce the amount of boids each boid would have to check to compare themselves to and only have to iterate over boids that most likely will have an impact on the boid. Another common option would be to use a compute shader instead of Jobs which would mean the GPU would process all the boid updates which would enevitably improve performance dramatically again even with the added overhead of moving data between the CPU and the GPU each frame. And finally an improvement could be made to the components themselves that I use here specifically the Boid component as I could attempt to minimise this by moving the target and boid settings that are typically shared between multiple ships to a seperate component that all of them can read from.