Phi-3-mini on raw WebGPU
Runs entirely in your browser on 10 hand-written WGSL kernels. No server, no API calls, no data leaves this tab.
~2 GB weights · ~40 t/s on M2 Pro · f16 · WebGPU
Phi-3-mini

10 WGSL kernels · paged KV · WebGPU
Chat with Phi-3-mini
q4f16_1 · 3.8 B · 4 K context · ~40 tok/s · M2 Pro · On-device only
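As a rough sanity check on the ~2 GB weights figure, the sketch below estimates the footprint from the parameter count and the quantization format. It assumes q4f16_1 stores 4-bit weights plus one f16 scale per group of 32 weights (the group size is an assumption, not stated on this page) and that all 3.8 B parameters are quantized. The function name `quantizedWeightBytes` is illustrative only.

```typescript
// Estimate quantized weight size in bytes.
// Assumption: every parameter takes `bitsPerWeight` bits, plus one
// `scaleBits`-bit scale shared by each group of `groupSize` weights.
function quantizedWeightBytes(
  params: number,
  bitsPerWeight: number,
  scaleBits: number,
  groupSize: number
): number {
  const effectiveBits = bitsPerWeight + scaleBits / groupSize; // 4 + 16/32 = 4.5
  return (params * effectiveBits) / 8;
}

// 3.8 B parameters comes from the page; group size 32 is assumed.
const weightBytes = quantizedWeightBytes(3.8e9, 4, 16, 32);
const weightGB = weightBytes / 1e9; // roughly 2.1 GB
```

Under these assumptions the estimate lands at about 2.1 GB, which is in line with the rounded ~2 GB shown above.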
Running locally on your GPU through 10 hand-written WGSL kernels. Nothing leaves this tab: prompts, tokens, and KV cache all stay in your browser.
Zero TVM · 10 WGSL kernels · 228 dispatches/token
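To put 228 dispatches/token in perspective, the sketch below works out the average time available per dispatch at the quoted ~40 tok/s. This is only an average under the assumption that dispatches account for the whole per-token time; individual kernels will differ widely, and the function name is hypothetical.

```typescript
// Average time budget per GPU dispatch, in microseconds, at a given decode rate.
function microsecondsPerDispatch(
  tokensPerSecond: number,
  dispatchesPerToken: number
): number {
  const dispatchesPerSecond = tokensPerSecond * dispatchesPerToken;
  return 1e6 / dispatchesPerSecond;
}

// Figures quoted on the page: ~40 tok/s and 228 dispatches/token.
const dispatchesPerSecond = 40 * 228; // 9120
const budgetUs = microsecondsPerDispatch(40, 228); // about 110 microseconds
```

At these figures the GPU handles about 9,120 dispatches per second, leaving roughly 110 µs on average for each one.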