Experiences and Benchmarks of Applying WebAssembly
I am currently implementing a physics engine using canvas and TypeScript. Rigid body collisions caused no performance problems, but I ran into performance issues and frame drops when implementing fluid collisions.
For the fluid implementation, I referred to the 2005 paper "Particle-based Viscoelastic Fluid Simulation" and studied YouTube tutorials (https://www.youtube.com/@pixel_physics).
To represent fluid, I am using particles as described in the paper. Particles represent the fluid in the form of small cells, and flow is expressed by implementing pressure, density, viscosity, and elasticity in these cells.
In my engine, when the number of particles exceeded 800, the CPU usage increased to over 90%, and frame drops also occurred.
My goal is to build a physics engine in TypeScript and release a simple game with it, so performance improvements are necessary to express more complex fluids.
The bottleneck was caused by double density relaxation.
Double density relaxation expresses the flow of fluid by computing, for each particle, the density of the particles surrounding it and then making those particles repel or attract each other accordingly.
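As a rough illustration of the idea, here is a minimal sketch of double density relaxation following the structure of the algorithm in the paper. It is not the code from my engine: the names `Vec2`, `RelaxationParams`, and `doubleDensityRelaxation` and all parameter values are assumptions, and it loops over every pair of particles with no neighbor search, so it shows the O(n^2) worst case directly.

```typescript
interface Vec2 { x: number; y: number; }

// Hypothetical parameter names; the values are tuning knobs.
interface RelaxationParams {
  h: number;           // interaction radius
  restDensity: number; // density the fluid tries to reach
  k: number;           // stiffness for the regular pressure
  kNear: number;       // stiffness for the near pressure
}

// Brute-force sketch: every particle checks every other particle.
function doubleDensityRelaxation(positions: Vec2[], p: RelaxationParams, dt: number): void {
  const n = positions.length;
  for (let i = 0; i < n; i++) {
    const pi = positions[i];
    let density = 0;
    let nearDensity = 0;
    // Pass 1: accumulate density and near density around particle i.
    for (let j = 0; j < n; j++) {
      if (j === i) continue;
      const q = Math.hypot(positions[j].x - pi.x, positions[j].y - pi.y) / p.h;
      if (q < 1) {
        const w = 1 - q;
        density += w * w;
        nearDensity += w * w * w;
      }
    }
    const pressure = p.k * (density - p.restDensity);
    const nearPressure = p.kNear * nearDensity;
    // Pass 2: push or pull each neighbor, and move particle i the opposite way.
    let dx = 0;
    let dy = 0;
    for (let j = 0; j < n; j++) {
      if (j === i) continue;
      const pj = positions[j];
      const rx = pj.x - pi.x;
      const ry = pj.y - pi.y;
      const r = Math.hypot(rx, ry);
      const q = r / p.h;
      if (q < 1 && r > 0) {
        const w = 1 - q;
        const magnitude = dt * dt * (pressure * w + nearPressure * w * w);
        const halfX = (rx / r) * magnitude * 0.5;
        const halfY = (ry / r) * magnitude * 0.5;
        pj.x += halfX;
        pj.y += halfY;
        dx -= halfX;
        dy -= halfY;
      }
    }
    pi.x += dx;
    pi.y += dy;
  }
}
```

With a positive pressure (local density above the rest density) the displacement pushes the pair apart, and with a negative pressure it pulls them together. Each displacement is split evenly between the two particles.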
- The algorithm pairs each reference particle with its nearby particles, so in the worst case it has a time complexity of O(n^2).
- To maintain 60 frames per second, those n^2 operations must run 60 times per second. For 800 particles, that is up to 640,000 pair checks per frame, or about 38 million per second.
- I had already optimized the neighbor search with a hash grid, but it seems that this alone was insufficient.
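For readers unfamiliar with the hash grid mentioned above, the following is a minimal sketch of the general technique, not my engine's implementation. The class name `HashGrid`, the string cell keys, and the choice of a cell size equal to the interaction radius are all assumptions made for illustration.

```typescript
interface Point { x: number; y: number; }

// Uniform grid stored in a Map, keyed by integer cell coordinates.
class HashGrid {
  private cells: Map<string, number[]>;
  private cellSize: number;

  constructor(cellSize: number) {
    this.cells = new Map<string, number[]>();
    this.cellSize = cellSize;
  }

  private key(cx: number, cy: number): string {
    return `${cx},${cy}`;
  }

  // Re-bucket every point; called once per frame before neighbor queries.
  rebuild(points: Point[]): void {
    this.cells.clear();
    points.forEach((p, index) => {
      const k = this.key(Math.floor(p.x / this.cellSize), Math.floor(p.y / this.cellSize));
      const bucket = this.cells.get(k);
      if (bucket) {
        bucket.push(index);
      } else {
        this.cells.set(k, [index]);
      }
    });
  }

  // Indices of all points in the 3x3 block of cells around p.
  // Callers still need an exact distance check on each candidate.
  candidates(p: Point): number[] {
    const cx = Math.floor(p.x / this.cellSize);
    const cy = Math.floor(p.y / this.cellSize);
    const result: number[] = [];
    for (let dx = -1; dx <= 1; dx++) {
      for (let dy = -1; dy <= 1; dy++) {
        const bucket = this.cells.get(this.key(cx + dx, cy + dy));
        if (bucket) result.push(...bucket);
      }
    }
    return result;
  }
}
```

When the cell size is at least the interaction radius, the 3x3 block contains every particle within range. If many particles crowd into the same few cells, as fluid under pressure tends to do, the candidate lists grow and the cost drifts back toward the n^2 worst case.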
Therefore, I applied WebAssembly, which I had been interested in, and checked the performance difference.
I chose Rust to write the WebAssembly module. Rust is fast at runtime and has strict memory management, so I thought it had the right characteristics for a performance-sensitive physics engine.
Additionally, tools such as wasm-bindgen, wasm-pack, and wasm-opt support WebAssembly, making it easy to set up the environment. (Personally, I had previously set up an environment using C++ and Emscripten, and I found the Rust setup significantly easier.)
I moved all computations for fluid collisions and position calculations to WebAssembly, and I handle canvas control and browser events in JavaScript.
- I had quite a hard time understanding the basic principles of Rust, and since my goal wasn't just to learn the language, I implemented solutions by searching for answers whenever I encountered problems.
- Rust's ownership system is undoubtedly creative.
class Particle {
position: Vector;
prevPosition: Vector;
velocity: Vector;
color: string;
constructor(position: Vector, color: string) {
this.position = position;
this.prevPosition = position;
this.velocity = new Vector({ x: 0, y: 0 });
this.color = color;
}
}
#[wasm_bindgen(getter_with_clone)]
#[repr(C)]
#[derive(Clone)]
pub struct Particle {
pub id: f64,
pub position: Vector,
pub prev_position: Vector,
pub velocity: Vector,
}
#[wasm_bindgen]
impl Particle {
#[wasm_bindgen(constructor)]
pub fn new(id: f64, position: Vector) -> Particle {
Particle {
id,
position: position.clone(),
prev_position: position.clone(),
velocity: Vector::new(0.0, 0.0),
}
}
}
- The wasm binary files built through wasm-pack can be imported from pkg/{pkg-name}.js.
- Functions and classes annotated with #[wasm_bindgen] can be loaded and executed.
- I built the engine itself in wasm and drive the simulation by calling its update function on each frame.
import {
Vector as rustVector,
Universe,
} from '/rust-module/pkg/rust_module';
this.universe = new Universe(); // load Engine
this.universe.update(deltaTime);
pub fn update(&mut self, delta_time: f64) { // calculate frame events
self.apply_gravity();
self.predict_positions(&delta_time);
self.neighbor_search();
self.double_density_relaxation(&delta_time);
self.world_boundary();
self.compute_next_velocity(&delta_time);
}
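The order of these calls follows the prediction-relaxation scheme from the paper: velocities only produce a predicted position, the relaxation and boundary steps correct that position, and the next velocity is derived from how far the particle actually moved. Here is a minimal TypeScript sketch of that ordering. The `Body` shape, the `stepFrame` name, and the single floor boundary are assumptions, the y axis points down as on a canvas, and the neighbor search and relaxation steps are left as a comment.

```typescript
interface Body {
  x: number; y: number;         // current position
  prevX: number; prevY: number; // position at the start of the frame
  vx: number; vy: number;       // velocity
}

// One frame of the prediction-relaxation loop (relaxation itself omitted).
function stepFrame(bodies: Body[], dt: number, gravity: number, floorY: number): void {
  for (const b of bodies) {
    // apply_gravity: gravity only changes velocity.
    b.vy += gravity * dt;
    // predict_positions: remember where we were, then advance.
    b.prevX = b.x;
    b.prevY = b.y;
    b.x += b.vx * dt;
    b.y += b.vy * dt;
  }
  // neighbor_search and double_density_relaxation would adjust positions here.
  for (const b of bodies) {
    // world_boundary: clamp to a hypothetical floor.
    if (b.y > floorY) b.y = floorY;
    // compute_next_velocity: velocity is the displacement that survived corrections.
    b.vx = (b.x - b.prevX) / dt;
    b.vy = (b.y - b.prevY) / dt;
  }
}
```

Because the velocity is recomputed from positions at the end, any correction applied to a position (a boundary clamp or a relaxation displacement) automatically shows up in the next frame's velocity.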
- The execution results are stored in the linear memory space of WebAssembly, separated from the garbage-collected heap of JavaScript.
- To use these results on the canvas, data must be fetched through memory address access.
import init from '/rust-module/pkg/rust_module';
init().then((wasm) => {
registry.memory = wasm.memory; // keep a reference to the loaded wasm memory
});
const particlesPtr = this.universe.particles(); // address of the particle data in wasm memory
const cells = new Float64Array(registry.memory.buffer, particlesPtr, {particlesLength} * {particleMemorySize}); // view the particle data as float64 values
for (let i = 0; i < {particlesLength} * {particleMemorySize}; i += 7) {
// i is the offset of the current particle's first value (7 values per particle)
// cells[i]; // particle id
// cells[i + 1]; // position X
// cells[i + 2]; // position Y
// cells[i + 3]; // prevPosition X
// cells[i + 4]; // prevPosition Y
// cells[i + 5]; // velocity X
// cells[i + 6]; // velocity Y
this.drawUtils.fillCircle(new Vector({ x: cells[i + 1], y: cells[i + 2] }), 5, 'blue');
// canvas drawing function:
// draws a circle centered on the particle's position
}
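To make the memory layout easier to experiment with outside the browser, here is a small self-contained sketch that reads the same 7-value layout from a plain `ArrayBuffer` standing in for the wasm memory. The names `FLOATS_PER_PARTICLE`, `ParticleSnapshot`, and `readParticles` are assumptions for illustration and are not part of the generated bindings.

```typescript
// Each particle is stored as 7 consecutive float64 values:
// id, position x/y, prevPosition x/y, velocity x/y.
const FLOATS_PER_PARTICLE = 7;

interface ParticleSnapshot { id: number; x: number; y: number; }

// Reads `count` particles starting at `byteOffset` in `buffer`.
function readParticles(buffer: ArrayBuffer, byteOffset: number, count: number): ParticleSnapshot[] {
  // The third argument is a length in elements (float64 values), not bytes.
  const cells = new Float64Array(buffer, byteOffset, count * FLOATS_PER_PARTICLE);
  const out: ParticleSnapshot[] = [];
  for (let p = 0; p < count; p++) {
    const base = p * FLOATS_PER_PARTICLE;
    out.push({ id: cells[base], x: cells[base + 1], y: cells[base + 2] });
  }
  return out;
}
```

Two details are worth keeping in mind. The byte offset passed to `Float64Array` must be a multiple of 8, otherwise the constructor throws a `RangeError`. Also, if the wasm memory grows, views created over the old `memory.buffer` become detached, so recreating the view every frame, as in the loop above, is the safer default.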
Frame drops no longer occur with 800 particles, and CPU usage has also significantly improved.
The difference between the TypeScript and WebAssembly versions is not visually apparent. However, a clear performance difference can be observed through Chrome's CPU usage.
The gap became more evident with 1600 particles: the more particles there are, the more noticeable the performance difference becomes.
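Chrome's CPU usage gives an overall picture. For a number that can be compared directly between the two implementations, one option is to time the update call itself. The helper below is a hypothetical sketch (the name `averageStepMs` is an assumption) that uses the standard `performance.now()` timer. It measures only the simulation step, not canvas drawing.

```typescript
// Runs `step` for `frames` iterations and returns the average time per call in milliseconds.
function averageStepMs(step: (dt: number) => void, frames: number, dt: number): number {
  if (frames <= 0) throw new RangeError("frames must be positive");
  const start = performance.now();
  for (let i = 0; i < frames; i++) {
    step(dt);
  }
  return (performance.now() - start) / frames;
}
```

For example, passing `(dt) => universe.update(dt)` for the WebAssembly engine and the equivalent TypeScript update function, with the same particle count, would give two averages to compare against the roughly 16.7 ms budget of a 60 fps frame.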