Python and the need for speed

Python is doing very well in the world of programming languages. Its popularity has been growing steadily over the past decades, as shown by the TIOBE and PYPL indexes. This should come as no surprise, since Python has a lot going for it: it’s easy to learn and it has a beautiful syntax, an excellent standard library, a great package ecosystem covering many domains and a friendly community. However, one thing that Python is not is fast.

Python is Not Fast Enough

While it is true that many applications can be sped up massively using NumPy, Numba and Cython, there is a class of applications that cannot take this route. One example is my own project, rinohtype. It has no single, obvious bottleneck, such as a tight number-crunching loop, that these tools could speed up. “But haven’t you heard about PyPy?”, you say. I have, and alas, it runs rinohtype slower than CPython! PyPy cannot speed up all classes of applications either. And especially for batch-type applications such as rinohtype, the significant JIT warmup time might even negate any speedup.

I am not alone in believing that Python’s speed is an issue. Countless efforts have been made to run Python applications at a faster pace: PyPy, Pyston, Pyjion, Nuitka, Pythran, FAT Python, RegisterVM, wpython, HotPy, unladen-swallow, Psyco, Shed Skin, and more. Pyston is a JIT-based Python implementation originally supported by Dropbox. Unfortunately, Dropbox has stopped sponsoring its development and has been moving its performance-critical code over to Go. Google similarly seems to be moving away from Python. These are clear indications that Python is losing ground to newer programming languages that offer features similar to Python’s along with better performance. There’s no more denying that Python isn’t fast enough.

A Different Approach

We should face the facts and admit we are fighting a losing battle. Yet another Python compiler or JIT engine will not bring the hoped-for performance boost. As demonstrated by the many failed attempts, speedy execution of (all) Python code is a near-impossible challenge. It’s time to look at more drastic, but technically much simpler ways to solve Python’s performance problem.

JavaScript (V8, SpiderMonkey) and Lua (LuaJIT) are often cited as examples of dynamic languages with very fast JIT implementations. These languages are arguably much simpler than Python, which undoubtedly influences the complexity and performance of a JIT compiler. The reason for Python’s limited performance therefore perhaps lies in the nature of the language itself; some of its features simply make it ill-suited for fast execution using a JIT. One example is that practically everything in Python can be modified at run time, including builtins, module globals and class attributes, which kills many opportunities for optimization.
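To make the mutability problem concrete, here is a minimal sketch (the function and variable names are made up for illustration). It shows that a builtin such as len can be rebound at run time, so a compiler cannot simply assume that a call to len() reaches the builtin:

```python
import builtins


def total_length(items):
    # A JIT would like to inline the builtin len() here, but the name
    # "len" is looked up again on every call and may have been rebound.
    return sum(len(item) for item in items)


words = ["spam", "eggs"]
before = total_length(words)

original_len = builtins.len
builtins.len = lambda obj: 0  # rebinding a builtin is perfectly legal
try:
    after = total_length(words)
finally:
    builtins.len = original_len  # restore the real builtin

print(before, after)
```

The same function returns different results before and after the rebinding, without its own code having changed. An optimizer therefore has to guard every such lookup or be ready to throw away its compiled code.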

It should be possible to define a subset of the Python language, uninspiredly dubbed “TurboPython”, that excludes those features that stand in the way of high-performance JIT execution (or compilation). Avoiding some of these features coincides with good design practice anyway, so it doesn’t necessarily have to be all bad. While the fact that everything in Python is mutable is useful for mocking during testing, it is generally considered extremely bad style to override builtin functions, constants or class methods in production code. Other candidates for banishment from TurboPython include eval and exec.
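As a rough illustration of what enforcing such a subset could look like, the sketch below uses the standard ast module to flag two of the constructs mentioned above: calls to eval or exec, and assignments that rebind a builtin name. The function name find_violations and the exact rule set are assumptions made for this example; TurboPython has no specification.

```python
import ast
import builtins

BANNED_CALLS = {"eval", "exec"}     # candidates named in the text
BUILTIN_NAMES = set(dir(builtins))  # names that must not be rebound


def find_violations(source):
    """Return sorted (line, message) pairs for constructs this sketch rejects."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BANNED_CALLS):
            violations.append((node.lineno, f"call to {node.func.id}()"))
        elif (isinstance(node, ast.Name)
                and isinstance(node.ctx, ast.Store)
                and node.id in BUILTIN_NAMES):
            violations.append((node.lineno, f"rebinding builtin {node.id!r}"))
    return sorted(violations)


sample = "len = lambda obj: 0\nresult = eval('1 + 1')\n"
for line, message in find_violations(sample):
    print(f"line {line}: {message}")
```

A purely static check like this only catches the obvious cases. Rebinding through builtins.len = ... or setattr() would need additional rules or run-time guards.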

Since type annotations are now an official Python feature, I would go as far as to say that these should be required in TurboPython code, since the presence of type information allows the JIT compiler to be much simpler and more performant (and reduces warmup time as well). In fact, even without a JIT, a simple TurboPython interpreter might be many times faster than CPython.
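Checking for the presence of annotations is straightforward with the standard inspect module. The sketch below is only an illustration of the requirement; is_fully_annotated and the two sample functions are hypothetical names, not part of any existing tool.

```python
import inspect


def is_fully_annotated(func):
    """True if every parameter and the return value carry an annotation."""
    signature = inspect.signature(func)
    parameters_ok = all(
        parameter.annotation is not inspect.Parameter.empty
        for parameter in signature.parameters.values()
    )
    return_ok = signature.return_annotation is not inspect.Signature.empty
    return parameters_ok and return_ok


def scale(value: float, factor: float) -> float:
    return value * factor


def scale_untyped(value, factor):
    return value * factor


print(is_fully_annotated(scale))
print(is_fully_annotated(scale_untyped))
```

Note that this only verifies that annotations exist. Whether they are accurate is a separate question that a TurboPython implementation would have to enforce or trust.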

TurboPython should of course not replace Python as we know it. It needs to exist alongside Python, to be used when the extra performance is required. Since TurboPython is a subset of Python, it will also run on regular Python interpreters, albeit slower. Ideally, a single class, module or package could be written in TurboPython and benefit from the extra speed while the rest of the application runs at its usual pace. Python 3.6 introduced a frame evaluation API (PEP 523) that allows plugging a JIT into CPython, which might be used to implement something like this.
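One way to picture this coexistence is a marker decorator. In the sketch below, turbo is a purely hypothetical name: on a regular interpreter it only tags the function and leaves its behavior unchanged, whereas a TurboPython-aware runtime could look for the tag and compile the function.

```python
def turbo(func):
    """Hypothetical marker: tag func as TurboPython code, change nothing else."""
    func.__turbo__ = True
    return func


@turbo
def dot(xs: list, ys: list) -> float:
    # Restricted, fully annotated code that a fast runtime could compile.
    return sum(x * y for x, y in zip(xs, ys))


def norm(xs):
    # Ordinary Python code calling into the TurboPython part.
    return dot(xs, xs) ** 0.5


print(norm([3.0, 4.0]))
```

Because the decorator is a no-op as far as behavior goes, the program still runs unmodified on CPython, just without any speedup.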

I lack the experience in interpreter, JIT and compiler design to assess whether the suggestions made above are in fact feasible and capable of resulting in a significant speed boost. I’ll leave that to others. I do know, however, that the current approaches are not cutting it and that alternatives need to be explored.

Comments on comp.lang.python, Lobste.rs and Reddit.
