WebGPU Bench

How fast is your GPU in the browser?

Real WebGPU compute benchmarks. No install, no account. Just click Run.

Benchmarks
📊 Rastrigin

Standard optimization benchmark, embarrassingly parallel (POP=4096, DIM=2000)

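For reference, the objective this card evaluates can be sketched on the CPU in a few lines (the benchmark itself runs it as a WebGPU compute shader; the function name here is illustrative, not the site's API):

```typescript
// CPU reference implementation of the Rastrigin function, the objective
// the GPU benchmark evaluates once per candidate (POP=4096 candidates,
// DIM=2000 dimensions each). Function name is illustrative.
function rastrigin(x: number[]): number {
  const A = 10;
  return (
    A * x.length +
    x.reduce((sum, xi) => sum + xi * xi - A * Math.cos(2 * Math.PI * xi), 0)
  );
}

// Global minimum is 0 at the origin, which makes GPU results easy to verify:
console.log(rastrigin(new Array(2000).fill(0))); // 0
```

Each candidate's evaluation is independent of every other's, which is what makes the workload embarrassingly parallel.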
🌌 N-Body Simulation

Gravitational physics, 512 bodies, 200 timesteps fused (SEQUENTIAL)

🎯 Acrobot-v1

Standard Gym RL, double pendulum, 500 steps with RK4 physics (SEQUENTIAL)

⛰️ MountainCar-v0

Standard Gym RL, 200 timesteps, linear policy (SEQUENTIAL)

⚖️ CartPole-v1

Standard Gym RL, inverted pendulum, 500 steps, 4→8→2 NN policy (SEQUENTIAL)

🎲 Monte Carlo Pi

Classic parallel estimation, 100K samples per worker (PARALLEL)

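The per-worker estimate is the classic dartboard method: sample points in the unit square and count those inside the quarter circle. A CPU sketch of the loop each GPU invocation runs (names are illustrative, not the benchmark's code):

```typescript
// Monte Carlo pi: the fraction of uniform random points (x, y) in [0,1)^2
// with x^2 + y^2 <= 1 approaches pi/4 as the sample count grows.
function estimatePi(samples: number): number {
  let inside = 0;
  for (let i = 0; i < samples; i++) {
    const x = Math.random();
    const y = Math.random();
    if (x * x + y * y <= 1) inside++;
  }
  return (4 * inside) / samples;
}

console.log(estimatePi(100_000)); // ≈ 3.14, typically within ~0.01
```

Because every sample is independent, the GPU version simply gives each worker its own 100K-sample loop and sums the counts, with no coordination between workers.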

When you click Run, your GPU model and benchmark results are saved anonymously. No personal information is collected. Privacy policy

Research

The science behind the benchmarks

Cross-vendor median speedups, fused vs unfused on the same device: 71× on Apple Silicon, 56× on NVIDIA, 20× on phones (92 devices, 7 vendors). Controlled comparisons in the paper: 159× vs PyTorch MPS on an M2 Pro, and 720× vs CUDA on a T4.
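A toy CPU stand-in for what "fused vs unfused" means here, under the assumption described in the paper's title (single-kernel fusion of sequential evaluation): the unfused style pays one GPU dispatch, and a host↔device round trip, per timestep, while the fused style runs the entire rollout inside one kernel. The `step` function below is placeholder dynamics, not the real physics:

```typescript
// Dispatch-counting sketch: same result, very different dispatch counts.
type State = number;
const step = (s: State): State => s * 0.99 + 1; // placeholder physics step

function unfusedRollout(steps: number): { state: State; dispatches: number } {
  let s: State = 0;
  let dispatches = 0;
  for (let t = 0; t < steps; t++) {
    s = step(s); // in the unfused style: one kernel dispatch per timestep
    dispatches++;
  }
  return { state: s, dispatches };
}

function fusedRollout(steps: number): { state: State; dispatches: number } {
  // in the fused style the whole loop lives inside a single "kernel",
  // so the host issues exactly one dispatch for the full rollout
  const kernel = (n: number): State => {
    let s: State = 0;
    for (let t = 0; t < n; t++) s = step(s);
    return s;
  };
  return { state: kernel(steps), dispatches: 1 };
}
```

For a 200-step rollout the two produce identical states, but the unfused path issues 200 dispatches where the fused path issues one; on a real GPU that per-dispatch overhead is what the measured speedups come from.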

Gunaydin, A.B. (2026)

Single-Kernel Fusion for Sequential Fitness Evaluation via WebGPU Compute Shaders

doi:10.5281/zenodo.19331833

Every result is public

We don't cherry-pick. Every benchmark run from every device is published — GPU name, score, browser, OS, timestamp. No data is hidden. Verify any claim yourself.

Browse all results →
Companion projects

The research line and the end-to-end projects that build on it.

Theory · Research line

kernelfusion.dev

Two published preprints, one npm SDK, and results from 92 unique devices across 7 GPU vendors. The theory that all the applied projects build on.

71× Apple Silicon median
56× NVIDIA median
92 unique devices
Read the research →