A peek into a possible future of Python in the browser
My Python code was too slow, so I made it faster with Python. For some definition of “Python”.
I spent last week in the gorgeous Aosta Valley with Antonio Cuni from Anaconda (middle) and Hood Chatham from Cloudflare (left). I was the only one without a Ph.D. degree, but they were both kind enough to act as if they didn’t notice. We managed to get SPy to run in the Web browser, unbreak Pyodide on iOS, as well as make a SPy-accelerated demo with Pyodide. But let’s start at the beginning.
Looking back
Eleven years ago Gary Bernhardt gave an influential talk at PyCon US called The Birth & Death of JavaScript. The Web platform is shaping up to be just as powerful as Gary’s vision in 2014. WebAssembly exceptions and garbage collection are generally available, and JavaScript Promise Integration is close as well. I feel like we’re on the verge of true programming language democracy on the Web.
In 2019 I gave a keynote at PyConDE and PyLondinium I called “Python 2020+” where I shared my ideas about where I think Python as a language should be going. Of course 2020 ended up quite different 😅 The main premise of the talk was that “you should contribute to Python by inventing a new kind of Python”. I argued that it’s not only okay for new Python-like languages to appear, but it’s in fact necessary to sensibly compete on mobile and on the Web. I gave MicroPython as an example of a “Python-like enough” variant of the language that lets Python developers succeed in programming microcontrollers. MicroPython is great because despite a large set of documented differences with CPython, Python developers can successfully leverage their existing knowledge in an entirely new environment. Better yet, for some people MicroPython is their first experience writing Python altogether!
I argued something similar is necessary for the Web. At the time, Pyodide was the new kid on the block. It’s a port of CPython running on WebAssembly. It uses Emscripten as its compiler toolchain, which provides a compatibility layer between the browser and POSIX APIs that CPython calls internally. That port includes a number of the most popular scientific libraries, allowing Python users to be immediately productive.
I dismissed Pyodide in my talk as producing a necessarily subpar experience in terms of performance compared to JavaScript. After all, with Pyodide you are running CPython on top of WebAssembly’s stack-based VM. So a stack-based VM on top of a stack-based VM. You can never match JavaScript’s performance that way. I wanted a Python compiler for the Web, something that is “Python-like enough”, but compiles user code directly to run on the WebAssembly VM.
Python 2025+
This January I put a lot of effort into evaluating both Pyodide and MicroPython as packaged by PyScript in the context of generative art. This is a good benchmark of how useful those runtimes are since effective animation needs to run at the refresh rate of your display, which in my case is 60Hz for the external monitor and 120Hz for my MacBook and iPhone. 120Hz also allows for more convincing audio/video synchronization when you’re doing realtime audio visualization. But 120Hz means you only get 8.33 milliseconds for calculations between frames. Neither Pyodide nor MicroPython is fast enough for comfort in that kind of use case, so you need to be creative with pre-computing things or moving things to compute on the GPU instead. Or to give up on native rendering in 4K.
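The frame budget above is just the reciprocal of the refresh rate. A minimal sketch of that arithmetic (the helper name frame_budget_ms is mine, purely for illustration):

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Milliseconds available for all per-frame work at a given refresh rate."""
    return 1000.0 / refresh_hz


# Compare the two displays mentioned above.
for hz in (60, 120):
    print(f"{hz} Hz -> {frame_budget_ms(hz):.2f} ms per frame")
# 60 Hz -> 16.67 ms per frame
# 120 Hz -> 8.33 ms per frame
```

Everything the animation loop does, including any Python ↔ JavaScript calls, has to fit inside that window.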
In all fairness, it’s incredible that Pyodide lets you use NumPy in the browser on a “real” CPython interpreter that works just like you expect. It works surprisingly well. And Pyodide is itself increasingly leveraging the advanced features of the WebAssembly VM, with wasm-gc introduced not long ago and wasm-eh coming very soon. All this promises better performance and even smoother support of regular Python programs with no changes. The Pyodide project upstreams a lot to CPython to ensure it runs well on Emscripten, with the platform’s official Tier 3 support restored for Python 3.14.
MicroPython also seems like a good compromise between capability and the amount of code you need to download before running anything. YMMV on how Pythonic it feels for you in practice; I mentioned some surprises in my Genuary article. But for smaller programs it’s clearly capable and downloads near-instantly.
And yet, I don’t think I was wrong in 2019. I hit the performance wall during my Genuary journey many times, and I felt like the things I wanted to achieve shouldn’t be too much to ask. Some had to do with the Python ↔︎ JavaScript FFI speed, or with the proliferation of network requests to load everything you need, but most problems were clearly “I wish I could crunch those numbers faster”.
Enter SPy
Let’s get one thing out of the way. SPy is a research project in its early stages at the moment. You should not attempt to use it yet, unless you plan to contribute to it, and even then you have to come with the right mindset. This software is incomplete both implementation-wise and design-wise. I’m writing this article mostly to be able to point at it in 2035 and say “I was there”. And to point at my 2019 talks and say “look, I wasn’t so wrong after all!”.
With all this in mind, SPy looks very attractive to me already. It’s a language designed to be friendly to Python users, but is not attempting to be Python-compatible. It can’t be, because with SPy, user code can be fully compiled to native binaries or WebAssembly.
Roses are red, violets are @blue
The big idea is that you have Python-like source code files that mix code executing at compile-time with code executing at runtime. The former is called blue code and the latter red code. Unlike C macros, blue code is fully interpreted and can do a lot. Unlike C++ templates, blue code looks mostly like your regular Python code.
If you ever thought “wouldn’t it be cool if this module-level computation could be pre-compiled?” then blue code is just that. You can have “compile-time decorators”, “compile-time type dispatch”, “compile-time code generation”, and so on. This moves us a bit closer to the “zero-cost abstractions” world of Rust.
SPy code can be executed in a fully interpreted mode where even blue code is interpreted on the fly as it would be in Python. This is where you can debug things when you’re still experimenting with a script. Then, with a process called “red-shifting” you pre-compute all that’s blue. You can view the red-shifted code, seeing what its code generation produced. You can still interpret red-shifted programs to debug them.
Finally, you can compile red-shifted code to C. That allows it to produce a native binary or WebAssembly bytecode. The C code is also a pretty informative read if you want to know how a high-level Python-style construct translates to the resulting native computation. Most importantly, the C code generation is how SPy manages to immediately integrate with existing systems using pre-existing C FFI. This is an explicit goal of the project. That’s how we were able to make my generative art demo work.
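To make the blue/red split a little more concrete, here is a plain-Python analogy. It is not SPy syntax, and the names (make_octave_sum, octave_sum) are made up for this sketch. The outer function stands in for blue code: it runs once and bakes constants into the inner function. The inner function stands in for red code: it holds only the work that has to happen at runtime. In SPy that specialization would be done by the compiler during red-shifting; a closure merely imitates the shape of the idea.

```python
# Plain-Python analogy only; this is NOT SPy syntax.

def make_octave_sum(octaves: int):
    """'Blue' step: runs once, before the hot loop.

    Pre-computes (frequency, amplitude) pairs and the normalization factor
    so the returned function does not have to.
    """
    params = tuple((2.0 ** i, 0.5 ** i) for i in range(octaves))
    total_amplitude = sum(amp for _, amp in params)

    def octave_sum(noise, x: float, y: float) -> float:
        """'Red' step: only the arithmetic that depends on runtime inputs."""
        value = 0.0
        for freq, amp in params:
            value += amp * noise(x * freq, y * freq)
        return value / total_amplitude

    return octave_sum


# Specialize once for three octaves, then call it many times.
three_octaves = make_octave_sum(3)
flat = three_octaves(lambda x, y: 1.0, 0.3, 0.7)  # constant "noise" stays constant
```

The analogy stops at the shape: CPython still interprets the inner function, whereas red-shifted SPy code can go on to be compiled to C.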
Marching squares
For the first end-user project in SPy, I decided to convert an existing Genuary entry I made with PyScript that draws an endless abstract topographic map. The map is computed by taking three octaves of Perlin noise, and using the marching squares algorithm on it to draw contours of a few “elevation levels”.
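For readers who haven’t met marching squares before, here is a minimal sketch of its first step in ordinary Python: classifying each grid cell by which of its four corners sit at or above an elevation level. The bit ordering and the function names are assumptions chosen for illustration; the actual demo code may be organized differently.

```python
def cell_case(tl: float, tr: float, br: float, bl: float, level: float) -> int:
    """Return the 4-bit marching squares case (0..15) for one grid cell.

    Each corner at or above `level` sets one bit. The ordering used here
    (top-left=8, top-right=4, bottom-right=2, bottom-left=1) is one common
    convention, not necessarily the one used in the demo.
    """
    case = 0
    if tl >= level:
        case |= 8
    if tr >= level:
        case |= 4
    if br >= level:
        case |= 2
    if bl >= level:
        case |= 1
    return case


def case_grid(field: list[list[float]], level: float) -> list[list[int]]:
    """Classify every cell of a 2D scalar field (for example, sampled noise)."""
    rows = len(field) - 1
    cols = len(field[0]) - 1
    return [
        [
            cell_case(field[r][c], field[r][c + 1],
                      field[r + 1][c + 1], field[r + 1][c], level)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A single "peak" in the middle of a 3x3 field touches all four cells.
peak = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
print(case_grid(peak, 0.5))  # [[2, 1], [4, 8]]
```

Cases 0 and 15 mean no contour crosses the cell; every other case maps to one or two line segments, which is what gets drawn for each elevation level.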
This computation was too much for pure Python in either Pyodide or MicroPython to happen inside the animation loop, so in the original project I pre-computed the map area in a Web worker. That still meant you had to wait several seconds for it to complete, but thanks to the Web worker it all appeared incrementally. And as soon as the pre-computation was done, the rest of the animation could easily hit 120 FPS on my Mac and 60 FPS on my phone. Could be worse.
The SPy version of the project ditches the Web worker as the computation is over 100X faster. You’re literally waiting longer for the background audio file to load. The result looks exactly the same, which was an important metric for us. It’s still a Pyodide program, only now it’s importing a wheel built with setuptools and cffi to do all the perlins and march all the squares. All those calls look like normal Python function calls on the Python side.
Look below, your browser is running SPy-powered code right now!
You can clone the PyScript.com project and play with it yourself. By setting USE_SPY to False in main.py you’re back to the pure Python version. Now pre-computing is counted in seconds! You can see this in action without having to edit anything by running v1 of the same project.
The code that ends up being perlin-0.0.0-cp312-cp312-pyodide_2024_0_wasm32.whl lives here; you can find the marching_squares function and see how you feel about the code. In my opinion it’s already serviceable, but in the future it will likely look even more friendly when we get advanced features like for-loops and accessing struct array items by value.
Bring your own wings
Again, the SPy project is in early stages, so working on my demo I had to add bitwise operator support to the SPy compiler, and to include an array implementation in my user script. I made silly mistakes converting my Python for-loops to while-loops. It really felt like I needed to bring my own wings to fly. But I’m fully aware that this is to be expected at this point. In fact, if I’m not mistaken, my demo might actually be the first non-trivial demo of the language in existence. Kind of a nice milestone for a workation week in the Alps!
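The for-to-while conversion is mechanical, which is exactly why it is easy to get wrong. A small illustration in ordinary Python (not SPy; the function names are made up for this sketch):

```python
def sum_of_squares_for(values: list[float]) -> float:
    """The version you would normally write in Python."""
    total = 0.0
    for i in range(len(values)):
        total += values[i] * values[i]
    return total


def sum_of_squares_while(values: list[float]) -> float:
    """The same loop written without `for`."""
    total = 0.0
    i = 0
    n = len(values)
    while i < n:
        total += values[i] * values[i]
        i += 1  # easy to forget; without it the loop never terminates
    return total


print(sum_of_squares_for([1.0, 2.0, 3.0]))    # 14.0
print(sum_of_squares_while([1.0, 2.0, 3.0]))  # 14.0
```

The classic slips are a missing increment, an off-by-one in the loop condition, and a continue placed before the increment so that the counter never advances.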
The workflow shows a lot of promise thanks to the ability to run interpreted code and to compile your code to a native Mac or Linux program. That allowed me to iterate on it without involving the Web browser until I knew the numbers produced are alright.
In the end, everything connected just enough to execute in the PyScript environment and I’m very happy about that. The performance proves this sort of thing has real value.
Why not Cython then?
Nothing wrong with using Cython today! But remember our dream: at some point it might just be possible to skip CPython in the Web browser entirely. Only your end-user Python-like code remains, running natively on WebAssembly at speeds in the same order of magnitude as Rust.
We’re not close to this point and, if we’re frank, I’m somewhat concerned this project is currently being bootstrapped by just one person at Anaconda. However, I’m hopeful that with enough value demonstrated, the company will invest more resources in this moonshot project. The 2019 me would 100% approve.