Welcome
Welcome to "Rust-Python interoperability"!
This course will teach you how to call Rust code from Python, packaged as a native extension module.
We assume you are familiar with both Rust and Python, but we don't assume any prior interoperability knowledge. We will provide a brief explanation and references whenever we rely on advanced language features in either language.
Methodology
This course is based on the "learn by doing" principle.
You'll build up your knowledge in small, manageable steps. It has been designed to be interactive and hands-on.
Mainmatter developed this course
to be delivered in a classroom setting, over a whole day: each attendee advances
through the lessons at their own pace, with an experienced instructor providing
guidance, answering questions and diving deeper into the topics as needed.
If you're interested in attending one of our training sessions, or if you'd like to
bring this course to your company, please get in touch.
You can also follow the course on your own, but we recommend you find a friend or a mentor to help you along the way should you get stuck.
You can also find solutions to all exercises in the `solutions` branch of the GitHub repository.
Prerequisites
To follow this course, you must install a Rust toolchain and `uv`.
If Rust is already installed on your machine, make sure to update it to the latest version:
# If you installed Rust using `rustup`, the recommended way,
# you can update to the latest stable toolchain with:
rustup update stable
These commands should successfully run on your machine:
cargo --version
uv --version
Don't start the course until you have these tools installed and working.
Structure
On the left side of the screen, you can see that the course is divided into sections.
To verify your understanding, each section is paired with an exercise that you need to solve.
You can find the exercises in the companion GitHub repository.
Before starting the course, make sure to clone the repository to your local machine:
# If you have an SSH key set up with GitHub
git clone git@github.com:mainmatter/rust-python-interoperability.git
# Otherwise, use the HTTPS URL:
#
# git clone https://github.com/mainmatter/rust-python-interoperability.git
We recommend you work on a branch, so you can easily track your progress and pull updates from the main repository if needed:
cd rust-python-interoperability
git checkout -b my-solutions
All exercises are located in the `exercises` folder.
Each exercise is structured as a Rust package.
The package contains the exercise itself, instructions on what to do (in `src/lib.rs`), and a test suite to automatically verify your solution.
`wr`, the workshop runner
To verify your solutions, we've provided a tool that will guide you through the course.
It is the `wr` CLI (short for "workshop runner").
Install it with:
Install it with:
cargo install --locked workshop-runner
In a new terminal, navigate back to the top-level folder of the repository.
Run the `wr` command to start the course:
wr
`wr` will verify the solution to the current exercise.
Don't move on to the next section until you've solved the exercise for the current one.
We recommend committing your solutions to Git as you progress through the course, so you can easily track your progress and "restart" from a known point if needed.
Enjoy the course!
Author
This course was written by Luca Palmieri, Principal Engineering Consultant at Mainmatter.
Luca has been working with Rust since 2018, initially at TrueLayer and then at AWS.
Luca is the author of "Zero to Production in Rust", the go-to resource for learning how to build backend applications in Rust, and "100 Exercises to Learn Rust", a learn-by-doing introduction to Rust itself.
He is also the author and maintainer of a variety of open-source Rust projects, including `cargo-chef`, Pavex and `wiremock`.
Exercise
The exercise for this section is located in 01_intro/00_welcome
Anatomy of a Python extension
Don't jump ahead!
Complete the exercise for the previous section before you start this one.
It's located in `exercises/01_intro/00_welcome`, in the course's GitHub repository.
Use `wr` to start the course and verify your solutions.
To invoke Rust code from Python we need to create a Python extension module.
Rust, just like C and C++, compiles to native code. For this reason, extension modules written in Rust are often called native extensions. Throughout this course we'll use the terms Python extension, Python extension module and native extension interchangeably.
maturin
We'll use `maturin` to build, package and publish Python extensions written in Rust. Let's install it:
uv tool install "maturin>=1.8"
Tools installed via `uv` should be available in your path. Run:
uv tool update-shell
to make sure that's the case.
Exercise structure
All exercises in this course will follow the same structure:
- an extension module written in Rust, in the root of the exercise directory
- a Python package that invokes the functionality provided by the extension, in the `sample` subdirectory

The extension module will usually be tested from Python, in the `sample/tests` subdirectory.
You will have to modify the Rust code in the extension module to make the tests pass.
Extension structure
Let's explore the structure of the extension module for this section.
01_setup
├── sample
├── src
│ └── lib.rs
├── Cargo.toml
└── pyproject.toml
Cargo.toml
The manifest file, `Cargo.toml`, looks like this:
[package]
name = "setup"
version = "0.1.0"
edition = "2021"
[lib]
name = "setup"
crate-type = ["cdylib"]
[dependencies]
pyo3 = "0.23.0"
Two things stand out in this file compared to a regular Rust project:
- The `crate-type` attribute is set to `cdylib`.
- The `pyo3` crate is included as a dependency.
Let's cover these two points in more detail.
Linking
Static linking
By default, Rust libraries are compiled as static libraries.
All dependencies are linked into the final executable at compile-time, making the executable self-contained1.
That's great for distributing applications, but it's not ideal for Python extensions.
To perform static linking, the extension module would have to be compiled alongside the Python interpreter.
Furthermore, you'd have to distribute the modified interpreter to all your users.
At the ecosystem level, this process would scale poorly: most users rely on several unrelated extensions at once, so every single project would have to compile its own bespoke Python interpreter.
Dynamic linking
To avoid this scenario, Python extensions are packaged as dynamic libraries.
The Python interpreter can load these libraries at runtime, without having to be recompiled.
Instead of distributing a modified Python interpreter to all users, you must now distribute
the extension module as a standalone file.
Rust supports dynamic linking, and it provides two different flavors of dynamic libraries: `dylib` and `cdylib`.
`dylib`s are Rust-flavored dynamic libraries, geared towards Rust-to-Rust dynamic linking.
`cdylib`s, on the other hand, are dynamic libraries that export a C-compatible interface (C dynamic libraries).
You need a common dialect to get two different languages to communicate with each other. They both need to speak it and understand it.
That bridge, today, is C's ABI (Application Binary Interface).
That's why, for Python extensions, you must use the `cdylib` crate type:
[lib]
crate-type = ["cdylib"]
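To make the C-ABI idea concrete, here's a minimal sketch (independent of pyo3) of a Rust function exported with a C-compatible interface, the kind of symbol a `cdylib` exposes:

```rust
// `#[no_mangle]` keeps the symbol name `add` intact in the compiled
// artifact; `extern "C"` makes the function use the C calling convention,
// so any language that speaks the C ABI can call it.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The function is still callable as plain Rust.
    assert_eq!(add(2, 3), 5);
}
```

In a real extension you never write these `extern "C"` shims by hand: that's exactly the boilerplate `pyo3` generates for you.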
pyo3
It's not enough to expose a C-compatible interface. You must also comply with the Python C API, the interface Python uses to interact with C extensions.
Doing this manually is error-prone and tedious.
That's where the `pyo3` crate comes in: it provides a safe and idiomatic way to write Python extensions in Rust, abstracting away the low-level details.
In `lib.rs`, you can see it in action:
use pyo3::prelude::*;

#[pyfunction]
fn it_works() -> bool {
    todo!()
}

/// A Python module implemented in Rust.
#[pymodule]
fn setup(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(it_works, m)?)?;
    Ok(())
}
We're using `pyo3` to define a Python function, named `it_works`, that returns a boolean.
The function is then exposed to Python at the top-level of our extension module, named `setup`.
That same function is then invoked from Python, inside `sample/tests/test_sample.py`:
from setup import it_works

def test_works():
    assert it_works()
We'll cover the details of `#[pyfunction]` and `#[pymodule]` in the next section, no worries.
pyproject.toml
Before we move on, let's take a look at `pyproject.toml`, the Python "manifest" of the extension module:
[build-system]
requires = ["maturin>=1.8,<2.0"]
build-backend = "maturin"
[project]
name = "setup"
# [...]
requires-python = ">=3.13"
[tool.maturin]
features = ["pyo3/extension-module"]
It specifies the build system, the extension name and version, the required Python version, and the features to enable when building the extension module.
This is what `uv` looks at when building the extension module, before delegating the build process to `maturin`, which in turn invokes `cargo` to compile the Rust code.
What do I need to do?
A lot has to go right behind the scenes to make a Python extension work.
That's why the exercise for this section is fairly boring—we want to verify
that you can build and test a Python extension module without issues.
Things will get a lot more interesting over the coming sections, I promise!
Footnotes
This is true up to an extent. In most cases, some dependencies are still dynamically linked, e.g. libc on most Unix systems. Nonetheless, the final executable is self-contained in the sense that it doesn't rely on the presence of the Rust standard library or any other Rust crate on the user's system.
Exercise
The exercise for this section is located in 01_intro/01_setup
Modules
In Python, just like in Rust, your code is organized into modules.
Your entire extension is a module!
That module is defined using `pyo3`'s `#[pymodule]` procedural macro, as you've seen in the previous section:
#[pymodule]
fn setup(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // [...]
}
`setup` becomes the entry point for the Python interpreter to load your extension.
Naming matters
The name of the annotated function is important: there must be at least one module with a name that matches the name of
the dynamic library artifact that Python will try to load.
This is the name of the library target specified in your `Cargo.toml` file:
[lib]
name = "name_of_your_rust_library"
If you don't have a `[lib]` section, it defaults to the name of your package, specified in the `[package]` section.
If the module name and the library name don't match, Python will raise an error when trying to import the module:
ImportError: dynamic module does not define module export function (PyInit_name_of_your_module)
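For example, given the (hypothetical) library target below, Python will look for a `PyInit_my_extension` symbol in the compiled artifact, so the `#[pymodule]` function must be named `my_extension` as well:

```toml
[lib]
# The compiled artifact will be importable as `my_extension`,
# so the `#[pymodule]` function must carry the same name.
name = "my_extension"
crate-type = ["cdylib"]
```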
The `name` argument
You can also specify the name of the module explicitly using the `name` argument, rather than relying on the name of the annotated function:
#[pymodule]
#[pyo3(name = "setup")]
fn random_name(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // [...]
}
Mysterious types?
You might be wondering: what's up with `&Bound<'_, PyModule>`? What about `PyResult`?
Don't worry, we'll cover these types in due time later in the course.
Go with the flow for now!
Exercise
The exercise for this section is located in 01_intro/02_modules
Functions
Empty modules are not that useful: let's add some functions to our extension!
As you've seen in the "Setup" section, `pyo3` provides another procedural macro to define functions that can be called from Python: `#[pyfunction]`.
Back then we used it to define the `it_works` function:
use pyo3::prelude::*;

// 👇 A Python function defined in Rust
#[pyfunction]
fn it_works() -> bool {
    true
}
Unlike modules, functions aren't exposed to Python automatically; you must attach them to a module using the `wrap_pyfunction!` macro:
#[pymodule]
fn setup(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // 👇 Expose the function to Python
    m.add_function(wrap_pyfunction!(it_works, m)?)?;
    Ok(())
}
Exercise
The exercise for this section is located in 01_intro/03_functions
Arguments
`no_op`, the function you added to solve the previous exercise, is very simple:
use pyo3::prelude::*;

#[pyfunction]
fn no_op() {
    // Do nothing
}
Let's take it up a notch: what if you want to pass a value from Python to Rust?
The `FromPyObject` trait
`#[pyfunction]`s can take arguments, just like regular Rust functions.
But there's a catch: it must be possible to build those arguments from Python objects.
The contract is encoded in the `FromPyObject` trait, defined in `pyo3`:
pub trait FromPyObject<'py>: Sized {
    fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self>;
}
We won't go into the details of `FromPyObject`'s definition just yet: it would require an in-depth discussion of Python's Global Interpreter Lock (GIL) and the way `pyo3` models it in Rust. We'll get to it in the next section.
For the time being, let's focus on what the trait unlocks for us: the ability to convert
Python objects into Rust types.
Available implementations
`pyo3` provides implementations of `FromPyObject` for a large number of types—e.g. `i32`, `f64`, `String`, `Vec`, etc.
You can find an exhaustive list in `pyo3`'s guide, under the "Rust" table column.
Conversion cost
Going from a Python object to a Rust type is not free—e.g. the in-memory representation of a Python list doesn't match the in-memory representation of a Rust `Vec`.
The conversion introduces a (usually small) overhead that you'll have to incur every time you invoke
your Rust function from Python. It's a good trade-off if you end up performing enough
computational work in Rust to amortize the conversion cost.
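You can get a feel for why the conversion costs something from pure Python alone: every element of a Python list is a full heap-allocated object, so building a Rust `Vec<u64>` from it means unboxing and copying each element into a contiguous buffer. A quick sketch, independent of pyo3:

```python
import sys

nums = list(range(1_000))

# Each list element is a boxed `int` object with its own header and
# refcount, far bigger than the 8 bytes a `u64` occupies in a Rust Vec.
per_element = sys.getsizeof(nums[0])
print(per_element)

# Converting the whole list to a Vec<u64> has to touch every one of
# these objects, which is why the overhead grows with the input size.
```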
Python-native types
In `pyo3`'s documentation you can see a column of "Python-native" types.
Don't try to use them to solve the exercise for this section: we'll cover them in the next one.
Exercise
The exercise for this section is located in 01_intro/04_arguments
Global Interpreter Lock (GIL)
If you go back to `pyo3`'s documentation on arguments, you'll find a table column listing so-called "Python-native" types.
What are they, and why would you use them?
Python-native types
There is overhead in converting a Python object into a Rust-native type.
That overhead might dominate the cost of invoking your Rust function if the function itself isn't doing much
computational work. In those cases, it can be desirable to work directly with Python's in-memory representation of the object.
That's where the `Py*` types come in: they give you direct access to Python objects, with minimal overhead1.
Out of the entire family of `Py*` types, `PyAny` deserves a special mention.
It's the most general Python-native type in `pyo3`: it stands for an arbitrary Python object.
You can use it whenever you don't know the exact type of the object you're working with, or you don't care about it.
`Py*` types don't implement `FromPyObject`
Let's try to rewrite the solution of the previous exercise using `PyList` rather than `Vec<u64>`:
use pyo3::prelude::*;
use pyo3::types::PyList;

#[pyfunction]
fn print_number_list(list: &PyList) {
    todo!()
}
If you try to compile this code, you'll get an error:
error[E0277]: the trait bound `&PyList: PyFunctionArgument<'_, '_>` is not satisfied
--> src/lib.rs:7:28
|
7 | fn print_number_list(list: &PyList) {
| ^
| the trait `PyClass` is not implemented for `&PyList`,
| which is required by `&PyList: PyFunctionArgument<'_, '_>`
|
= help: the following other types implement trait `PyFunctionArgument<'a, 'py>`:
&'a pyo3::Bound<'py, T>
Option<&'a pyo3::Bound<'py, T>>
= note: required for `&PyList` to implement `FromPyObject<'_>`
= note: required for `&PyList` to implement `FromPyObjectBound<'_, '_>`
= note: required for `&PyList` to implement `PyFunctionArgument<'_, '_>`
The error message is a bit cryptic because it mentions a number of private `pyo3` traits (`PyFunctionArgument` and `FromPyObjectBound`), but the gist of it is that `&PyList` doesn't implement `FromPyObject`. That's true for all `Py*` types.
Confusing, isn't it? How is it possible that Python-native types, which require no conversion, don't implement the trait that allows you to convert Python objects into Rust types?
It's time to have that talk, the one about Python's Global Interpreter Lock (GIL).
Global Interpreter Lock (GIL)
Out of the box, Python's2 data structures are not thread-safe. To prevent data races, there is a global mutual exclusion lock that allows only one thread to execute Python bytecode at a time—i.e. the so-called Global Interpreter Lock (GIL).
It is forbidden to interact with Python objects without holding the GIL.
That's why `pyo3` doesn't implement `FromPyObject` for `Py*` types: it would allow you to interact with Python objects without necessarily holding the GIL, a recipe for disaster.
Python<'py>
`pyo3` uses a combination of lifetimes and smart pointers to ensure that you're interacting with Python objects in a safe way.
`Python<'py>` is the cornerstone of the entire system: it's a token type that guarantees that you're holding the GIL.
All APIs that require you to hold the GIL will, either directly or indirectly, require you to provide a `Python<'py>` token as proof.
`pyo3` will automatically acquire the GIL behind the scenes whenever you invoke a Rust function from Python. In fact, you can ask for a `Python<'py>` token as an argument to your Rust function, and `pyo3` will provide it for you—it has no (additional) cost.
use pyo3::prelude::*;

// There is no runtime difference between invoking the two functions
// below from Python.
// The first one is just more explicit about the fact that it requires
// the caller to acquire the GIL ahead of time.
#[pyfunction]
fn print_number_list(_py: Python<'_>, list: Vec<u64>) {
    todo!()
}

#[pyfunction]
fn print_number_list2(list: Vec<u64>) {
    todo!()
}
`'py`, the lifetime parameter of `Python<'py>`, is used to represent how long the GIL is going to be held.
Bound<'py>
You won't be interacting with `Python<'py>` directly most of the time.
Instead, you'll use the `Bound<'py, T>` type, a smart pointer that encapsulates a reference to a Python object, ensuring that you're holding the GIL when you're interacting with it.
Using `Bound<'py, T>` we can finally start using the `Py*` types as function arguments:
use pyo3::prelude::*;
use pyo3::types::PyList;

#[pyfunction]
fn print_number_list(list: Bound<'_, PyList>) {
    todo!()
}
`Bound` ensures that we're holding the GIL when interacting with the list instance that has been passed to us as a function argument.
FromPyObject
We can now go back to the definition of the `FromPyObject` trait:
pub trait FromPyObject<'py>: Sized {
    fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self>;
}
`extract_bound` takes a `&Bound<'py, PyAny>` as argument, rather than a bare `&PyAny`, to ensure that we're holding the GIL when we're interacting with the Python object during the conversion.
References
- `FromPyObject`
- `Python<'py>`
- Global Interpreter Lock
- Official guidance on Python-native vs Rust-native types
Footnotes
`pyo3` still needs to ensure that the Python object you're working with is of the expected type. It'll therefore perform an `isinstance` check before handing you the object—e.g. checking that an object is indeed a list before giving you a `PyList` argument. The only exception to this rule is `PyAny`, which can represent an arbitrary Python object.
CPython is the reference implementation of Python, written in C. It's the most widely used Python interpreter and what most people refer to when they say "Python".
Exercise
The exercise for this section is located in 01_intro/05_gil
Output values
We've gone deep into the weeds of how `pyo3` handles arguments to your `#[pyfunction]`s.
Let's now move our focus to output values: how do you return something from your Rust functions to Python?
IntoPyObject
Guess what? There's a trait for that too!
`IntoPyObject` is the counterpart of `FromPyObject`. It converts Rust values into Python objects:
pub trait IntoPyObject<'py>: Sized {
    type Target;
    type Output: BoundObject<'py, Self::Target>;
    type Error: Into<PyErr>;

    fn into_pyobject(self, py: Python<'py>) -> Result<Self::Output, Self::Error>;
}
The output type of your `#[pyfunction]` must implement `IntoPyObject`.
IntoPyObject::into_pyobject
`IntoPyObject::into_pyobject` expects two arguments:
- `self`: the Rust value you want to convert into a Python object.
- `Python<'py>`: a GIL token that you can use to create new Python objects.

The conversion can fail, so the method returns a `Result`.
The output type itself is more complex, so let's break it down using an example.
The output type itself is more complex, so let's break it down using an example.
Case study: a newtype
Let's look at a simple example: a newtype that wraps a `u64`.
We want it to be represented as a "plain" integer in Python.
use std::convert::Infallible;
use pyo3::prelude::*;
use pyo3::types::PyInt;

struct MyType {
    value: u64,
}

impl<'py> IntoPyObject<'py> for MyType {
    /// `Target` is the **concrete** Python type we want to use
    /// to represent our Rust value.
    /// The underlying Rust type is a `u64`, so we'll convert it to a `PyInt`,
    /// a Python integer.
    type Target = PyInt;

    /// `Output`, instead, is a **wrapper** around the concrete type.
    /// It captures the ownership relationship between the Python object
    /// and the Python runtime.
    /// In this case, we're using a `Bound` smart pointer to a `PyInt`.
    /// The `'py` lifetime ensures that the Python object is owned
    /// by the Python runtime.
    type Output = Bound<'py, PyInt>;

    /// Since the conversion can fail, we need to specify an error type.
    /// We can't fail to convert a `u64` into a Python integer,
    /// so we'll use `Infallible` as the error type.
    type Error = Infallible;

    fn into_pyobject(self, py: Python<'py>) -> Result<Self::Output, Self::Error> {
        // `u64` already implements `IntoPyObject`, so we delegate
        // to its implementation to do the actual conversion.
        self.value.into_pyobject(py)
    }
}
The `Output` associated type
Let's focus on the `Output` associated type for a moment.
In almost all cases, you'll be setting `Output` to `Bound<'py, Self::Target>`1. You're creating a new Python object and its lifetime is tied to the Python runtime.
In a few cases, you might be able to rely on `Borrowed<'a, 'py, Self::Target>` instead.
It's slightly faster2, but it's limited to scenarios where you are borrowing from an existing Python object—fairly rare for an `IntoPyObject` implementation.
There are no other options for `Output`: `Output` must implement the `BoundObject` trait, the trait is sealed, and those two types are the only implementors within `pyo3`.
If it helps, think of `Output` as an enum with two variants: `Bound` and `Borrowed`.
Provided implementations
`pyo3` provides out-of-the-box implementations of `IntoPyObject` for many Rust types, as well as for all `Py*` types.
Check out its documentation for an exhaustive list.
The actual syntax is a bit more complex: `type Output = Bound<'py, <Self as IntoPyObject<'py>>::Target>;`.
We've simplified it for clarity.
In addition to its documentation, you may find this issue useful to understand the trade-offs between `&Bound` and `Borrowed`.
Exercise
The exercise for this section is located in 01_intro/06_output
Exceptions
Python and Rust have different error handling mechanisms.
In Python, you raise exceptions to signal that something went wrong.
In Rust, errors are normal values that you return from functions, usually via the `Result` type.
`pyo3` provides `PyResult<T>` to help you bridge the gap between these two worlds.
PyResult<T>
`PyResult<T>` is the type you'll return whenever your `#[pyfunction]` can fail.
It is a type alias for `Result<T, PyErr>`, where `PyErr` is `pyo3`'s representation of a Python exception.
`pyo3` will automatically raise a Python exception whenever a `#[pyfunction]` returns an `Err(PyErr)` value:
use pyo3::prelude::*;
use pyo3::types::PyAny;

#[pyfunction]
fn print_if_number(item: Bound<'_, PyAny>) -> PyResult<()> {
    let number = item.extract::<u64>()?;
    println!("{}", number);
    Ok(())
}
In the example above, `extract::<u64>()?` returns a `PyResult<u64>`.
If the object is not an unsigned integer, `extract` will return an error, which will be propagated up to the caller via the `?` operator. On the Python side, this error will be raised as a Python exception by `pyo3`.
Built-in exception types
You should be intentional about the types of exceptions you raise. What kind of error are you signaling? What is the caller expected to catch?
All built-in Python exceptions are available in `pyo3::exceptions`—e.g. `pyo3::exceptions::PyValueError` for a `ValueError`. You can use their `new_err` method to create an instance.
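On the Python side, an exception created via `new_err` is caught like any other exception of that type. A pure-Python sketch of what callers write; the `withdraw` body is a hypothetical stand-in for a Rust `#[pyfunction]` that returns `Err(PyValueError::new_err(...))`:

```python
def withdraw(balance, amount):
    # Stand-in for a Rust function returning
    # Err(PyValueError::new_err("insufficient funds")).
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount

print(withdraw(100, 30))  # 70

try:
    withdraw(50, 100)
except ValueError as exc:
    print(f"caught: {exc}")  # caught: insufficient funds
```

Picking `ValueError` here (rather than, say, a bare `Exception`) lets callers catch exactly the failure they expect.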
Panics
Rust provides another mechanism for handling "unrecoverable" errors: panics. What happens if you panic in a `#[pyfunction]`?
`pyo3` will catch the panic and raise a `pyo3_runtime.PanicException` to the Python caller. You've probably seen this behaviour at play when solving the exercises associated with the previous sections.
Exercise
The exercise for this section is located in 01_intro/07_exceptions
Wrapping up
We've covered most of `pyo3`'s key concepts in this chapter.
Before moving on, let's go through one last exercise to consolidate what we've learned.
You'll have minimal guidance this time—just the exercise description and the tests to guide you.
Exercise
The exercise for this section is located in 01_intro/08_outro
Classes
We've covered Python functions written in Rust, but what about classes?
Defining a class
You can use the `#[pyclass]` attribute to define a new Python class in Rust. Here's an example:
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    balance: i32,
}
It defines a new Python class called `Wallet` with a single field, `balance`.
Registering a class
Just like with `#[pyfunction]`s, you must explicitly register your class with a module to make it visible to users of your extension.
Continuing with the example above, you'd register the `Wallet` class like this:
#[pymodule]
fn my_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_class::<Wallet>()?;
    Ok(())
}
IntoPyObject
Rust types that have been annotated with `#[pyclass]` automatically implement the `IntoPyObject` trait, thus allowing you to return them from your `#[pyfunction]`s.
For example, you can define a function that creates a new `Wallet` instance:
#[pyfunction]
fn new_wallet(balance: i32) -> Wallet {
    Wallet { balance }
}
It'll compile just fine, handing over a new `Wallet` instance to the Python caller.
Attributes
By default, the fields of your `#[pyclass]`-annotated structs aren't accessible to Python callers.
Going back to our `Wallet` example—if you try to access the `balance` field from Python, you'll get an error:
wallet = new_wallet(0)
> assert wallet.balance == 0
E AttributeError: 'builtins.Wallet' object has no attribute 'balance'
tests/test_sample.py:8: AttributeError
The same error would occur even if you made `balance` a public field.
To make the field accessible to Python, you must add a getter.
This can be done using the `#[pyo3(get)]` attribute:
#[pyclass]
struct Wallet {
    #[pyo3(get)]
    balance: i32,
}
Now, the `balance` field is accessible from Python:
def test_wallet():
    wallet = new_wallet(0)
    assert wallet.balance == 0
If you want to allow Python callers to modify the field, you can add a setter using the `#[pyo3(set)]` attribute:
#[pyclass]
struct Wallet {
    // Both getter and setter
    #[pyo3(get, set)]
    balance: i32,
}
Exercise
The exercise for this section is located in 02_classes/00_pyclass
Constructors
In the previous section (and its exercise) we relied on a `#[pyfunction]` as the constructor for the `#[pyclass]` we defined. Without `new_wallet`, we wouldn't have been able to create new `Wallet` instances from Python.
Let's now explore how to define a constructor directly within the `#[pyclass]` itself.
Defining a constructor
You can add a constructor to your `#[pyclass]` using the `#[new]` attribute on a method. Here's an example:
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }
}
A Rust method annotated with `#[new]` is equivalent to the `__new__` method in Python. At the moment there is no way to define the `__init__` method in Rust.
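If the `__new__`/`__init__` split is hazy, here's a pure-Python refresher (unrelated to pyo3): `__new__` creates the instance, while `__init__` only initializes an instance that already exists. `#[new]` corresponds to the creation step.

```python
class Wallet:
    def __new__(cls, balance):
        # Creates and returns the instance; this is the step
        # that pyo3's #[new] maps onto.
        instance = super().__new__(cls)
        return instance

    def __init__(self, balance):
        # Runs afterwards, on the instance returned by __new__.
        self.balance = balance

wallet = Wallet(100)
print(wallet.balance)  # 100
```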
The `impl` block containing the constructor must also be annotated with the `#[pymethods]` attribute for `#[new]` to work as expected.
Signature
Everything we learned about arguments in the context of `#[pyfunction]`s applies to constructors as well.
In terms of output type, you can return `Self` if the constructor is infallible, or `PyResult<Self>` if it can fail.
Exercise
The exercise for this section is located in 02_classes/01_constructors
Methods
The `#[pymethods]` attribute is not limited to constructors. You can use it to attach any number of methods to your `#[pyclass]`:
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    fn deposit(&mut self, amount: i32) {
        self.balance += amount;
    }

    fn withdraw(&mut self, amount: i32) {
        self.balance -= amount;
    }
}
All methods within an `impl` block annotated with `#[pymethods]` are automatically exposed to Python as methods on your `#[pyclass]`1. The `deposit` and `withdraw` methods in the example above can be called from Python like this:
wallet = Wallet(0)
wallet.deposit(100)
wallet.withdraw(50)
assert wallet.balance == 50
multiple-pymethods
You can't annotate multiple `impl` blocks with `#[pymethods]` for the same class, due to a limitation in Rust's metaprogramming capabilities.
There is a way to work around this issue using some linker dark magic, via the `multiple-pymethods` feature flag, but it comes with a penalty in terms of compile times, as well as limited cross-platform support.
Check out `pyo3`'s documentation for more details.
Footnotes
All methods in a `#[pymethods]` block are exposed, even if they are private!
Exercise
The exercise for this section is located in 02_classes/02_methods
Custom setters and getters
In a previous section, we learned how to attach the default getter and setter to a field in a `#[pyclass]`:
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}
This is convenient, but it's not always desirable!
Let's introduce an additional constraint to our `Wallet` struct: the balance should never go below a pre-determined overdraft threshold.
We'd start by enforcing this constraint in the constructor method:
use pyo3::prelude::*;
use pyo3::exceptions::PyValueError;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

const OVERDRAFT_LIMIT: i32 = -100;

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> PyResult<Self> {
        if balance < OVERDRAFT_LIMIT {
            return Err(PyValueError::new_err(
                "Balance cannot be below overdraft limit",
            ));
        }
        Ok(Wallet { balance })
    }
}
`Wallet::new` ensures that a newly-created `Wallet` upholds the overdraft constraint. But the default setter can be easily used to circumvent the limit:
wallet = Wallet(0)
wallet.balance = -200 # This should not be allowed, but it is!
`#[setter]` and `#[getter]`
We can override the default getter and setter by defining custom methods for them.
Here's how we can implement a custom setter for the `balance` field via the `#[setter]` attribute:
use pyo3::prelude::*;
use pyo3::exceptions::PyValueError;

#[pyclass]
struct Wallet {
    // We keep using the default getter, no issues there
    #[pyo3(get)]
    balance: i32,
}

const OVERDRAFT_LIMIT: i32 = -100;

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> PyResult<Self> {
        Wallet::check_balance(balance)?;
        Ok(Wallet { balance })
    }

    #[setter]
    fn set_balance(&mut self, value: i32) -> PyResult<()> {
        Wallet::check_balance(value)?;
        self.balance = value;
        Ok(())
    }
}

impl Wallet {
    // We put this method in a separate `impl` block to avoid exposing it to Python
    fn check_balance(balance: i32) -> PyResult<()> {
        if balance < OVERDRAFT_LIMIT {
            return Err(PyValueError::new_err(
                "Balance cannot be below overdraft limit",
            ));
        }
        Ok(())
    }
}
Every time the balance
field is set in Python, Wallet::set_balance
will be called:
wallet = Wallet(0)
wallet.balance = -200 # Now raises a `ValueError`
The field is associated with its setter using a conventional naming strategy for the setter method: set_<field_name>.
You can also explicitly specify the field name in the #[setter]
attribute, like this: #[setter(balance)].
Custom getters are defined in a similar way using the #[getter]
attribute. The naming convention for
getter methods is <field_name>, but you can also specify the field name explicitly in the attribute—e.g. #[getter(balance)].
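From the Python side, the custom setter behaves like a validated property. Here's a rough pure-Python analogue of the behavior (the real `Wallet` comes from the compiled extension module; this sketch only mirrors what you'd observe):

```python
# Pure-Python analogue of the Rust `Wallet` with a validating setter.
OVERDRAFT_LIMIT = -100

class Wallet:
    def __init__(self, balance: int):
        self.balance = balance  # routed through the setter below

    @property
    def balance(self) -> int:
        return self._balance

    @balance.setter
    def balance(self, value: int):
        if value < OVERDRAFT_LIMIT:
            raise ValueError("Balance cannot be below overdraft limit")
        self._balance = value

wallet = Wallet(0)
wallet.balance = -50  # fine, above the overdraft limit
try:
    wallet.balance = -200
except ValueError as e:
    print(e)  # Balance cannot be below overdraft limit
```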
Exercise
The exercise for this section is located in 02_classes/03_setters
Static methods
All the class methods we've seen so far have been instance methods—i.e. they take an instance of the class
as one of their arguments.
Python supports static methods as well. These methods don't take an instance of the class as an argument,
but they are "attached" to the class itself.
The same concept exists in Rust:
pub struct Wallet {
    balance: i32,
}

impl Wallet {
    pub fn default() -> Self {
        Wallet { balance: 0 }
    }
}
Wallet::default is a static method since it doesn't take self, or a reference to self, as an argument.
You might then expect the following to define a Python static method on the Wallet
class:
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    fn default() -> Self {
        Wallet { balance: 0 }
    }
}
However, this code will not compile.
To define a static method in Python, you need to explicitly mark it with the #[staticmethod]
attribute:
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    // Notice the `#[staticmethod]` attribute here!
    #[staticmethod]
    fn default() -> Self {
        Wallet { balance: 0 }
    }
}
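Once exposed, `default` is callable on the class itself, no instance required. A pure-Python stand-in for the resulting behavior (this is not the compiled extension class, just an illustration):

```python
class Wallet:
    def __init__(self, balance: int):
        self.balance = balance

    # Analogue of the `#[staticmethod]`-annotated Rust method
    @staticmethod
    def default() -> "Wallet":
        return Wallet(0)

w = Wallet.default()  # called on the class, no instance needed
print(w.balance)  # 0
```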
Class methods
Python also supports class methods. These methods take the class itself as an argument, rather than an instance of the class.
In Rust, you define a class method by marking it with the #[classmethod] attribute and taking cls: &Bound<'_, PyType> as its first argument:
use pyo3::prelude::*;
use pyo3::types::PyType;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    // Notice the `cls` argument here!
    #[classmethod]
    fn from_str(_cls: &Bound<'_, PyType>, balance: &str) -> PyResult<Self> {
        let balance = balance.parse::<i32>()?;
        Ok(Wallet { balance })
    }
}
Since you can directly refer to the class in a Rust static method (i.e. the Self
type), you won't find yourself
using class methods as often as you would in Python.
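For comparison, the Python-native equivalent of `from_str` is a classmethod that receives the class as its first argument (again a sketch, not the generated binding):

```python
class Wallet:
    def __init__(self, balance: int):
        self.balance = balance

    @classmethod
    def from_str(cls, balance: str) -> "Wallet":
        # `cls` is the class itself, so subclasses get
        # instances of the right type back
        return cls(int(balance))

w = Wallet.from_str("42")
print(w.balance)  # 42
```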
Exercise
The exercise for this section is located in 02_classes/04_static_methods
Inheritance
Python, unlike Rust, supports inheritance.
Each class in Python can inherit attributes and methods from a parent class.
class Parent:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")

# Declare `Child` as a subclass of `Parent`
class Child(Parent):
    def __init__(self, name, age):
        # Call the parent class's constructor
        super().__init__(name)
        self.age = age

child = Child("Alice", 7)
# `Child` inherits the `greet` method from `Parent`, so we can call it
child.greet()  # Prints "Hello, Alice!"
pyo3 and inheritance
pyo3
supports inheritance as well, via additional attributes on the #[pyclass]
macro.
To understand how it works, let's try to translate the Python example above to Rust. We'll start with defining
the base class, Parent
:
use pyo3::prelude::*;

#[pyclass(subclass)]
struct Parent {
    name: String,
}

#[pymethods]
impl Parent {
    #[new]
    fn new(name: String) -> Self {
        Parent { name }
    }

    fn greet(&self) {
        println!("Hello, {}!", self.name);
    }
}
You can spot one new attribute in the #[pyclass]
macro: subclass
. This attribute tells pyo3
that this class
can be subclassed, and it should generate the necessary machinery to support inheritance.
Now let's define the Child
class, which inherits from Parent
:
#[pyclass(extends=Parent)]
struct Child {
    age: u8,
}
We're using the extends
attribute to specify that Child
is a subclass of Parent
.
Things get a bit more complicated when it comes to the constructor:
#[pymethods]
impl Child {
    #[new]
    fn new(name: String, age: u8) -> PyClassInitializer<Self> {
        let parent = Parent::new(name);
        let child = Self { age };
        PyClassInitializer::from(parent).add_subclass(child)
    }
}
Whenever you initialize a subclass, you need to make sure that the parent class is initialized first.
We start by calling Parent::new
to create an instance of the parent class. We then initialize Child
, via Self { age }
.
We then use PyClassInitializer
to return both the parent and child instances together.
Even though Child
doesn't have a greet
method on the Rust side, you'll be able to call it from Python since the
generated Child
class inherits it from Parent
.
Nested inheritance
PyClassInitializer
can be used to build arbitrarily deep inheritance hierarchies.
For example, if Child
had its own subclass, you could call add_subclass
again to add yet another subclass to the chain.
#[pyclass(extends=Child)]
struct Grandchild {
    hobby: String,
}

#[pymethods]
impl Grandchild {
    #[new]
    fn new(name: String, age: u8, hobby: String) -> PyClassInitializer<Self> {
        let child = Child::new(name, age);
        let grandchild = Self { hobby };
        PyClassInitializer::from(child).add_subclass(grandchild)
    }
}
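On the Python side, the generated hierarchy behaves like ordinary subclassing: a `Grandchild` instance is also a `Child` and a `Parent`, and it inherits `greet`. A pure-Python analogue of the resulting class relationships (the names mirror the Rust example; the point is the shape of the chain, not the implementation):

```python
class Parent:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")

class Child(Parent):
    def __init__(self, name, age):
        super().__init__(name)
        self.age = age

class Grandchild(Child):
    def __init__(self, name, age, hobby):
        super().__init__(name, age)
        self.hobby = hobby

g = Grandchild("Alice", 7, "chess")
assert isinstance(g, Child) and isinstance(g, Parent)
g.greet()  # inherited all the way from Parent: "Hello, Alice!"
```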
Limitations
pyo3
supports two kinds of superclasses:
- A Python class defined in Rust, via #[pyclass]
- A Python built-in class, like PyDict or PyList
It currently doesn't support using a custom Python class as the parent class for a class defined in Rust.
Exercise
The exercise for this section is located in 02_classes/05_inheritance
Parent class
Let's go back to our example from the previous section:
use pyo3::prelude::*;

#[pyclass(subclass)]
struct Parent {
    name: String,
}

#[pymethods]
impl Parent {
    #[new]
    fn new(name: String) -> Self {
        // [...]
    }

    fn greet(&self) {
        println!("Hello, {}!", self.name);
    }
}

#[pyclass(extends=Parent)]
struct Child {
    age: u8,
}

#[pymethods]
impl Child {
    #[new]
    fn new(name: String, age: u8) -> PyClassInitializer<Self> {
        // [...]
    }
}
Child.greet
is not defined, therefore it falls back to the Parent.greet
method.
What if we wanted to override it in Child
?
Overriding methods
On the surface, it's simple: just define a method with the same name in the subclass.
#[pymethods]
impl Child {
    #[new]
    fn new(name: String, age: u8) -> PyClassInitializer<Self> {
        // [...]
    }

    fn greet(&self) {
        println!("Hi, I'm {} and I'm {} years old!", self.name, self.age);
    }
}
There's an issue though: self.name
won't work because the Rust struct for Child
doesn't have a name
field.
At the same time, the Python Child
class does, because it inherits it from Parent
.
How do we fix this?
as_super to the rescue
We need a way, in Rust, to access the fields and methods of the parent class from the child class.
This can be done using another one of pyo3
's smart pointers: PyRef
.
#[pymethods]
impl Child {
    // [...]
    fn greet(self_: PyRef<'_, Self>) {
        todo!()
    }
}
PyRef
represents an immutable reference to the Python object.
It allows us, in particular, to call the as_super
method,
which returns a reference to the parent class.
#[pymethods]
impl Child {
    // [...]
    fn greet(self_: PyRef<'_, Self>) {
        // This is now a reference to a `Parent` instance!
        let parent = self_.as_super();
        println!("Hi, I'm {} and I'm {} years old!", parent.name, self_.age);
    }
}
Now we can access the name
field from the parent class, and the age
field from the child class.
PyRef
and PyRefMut
PyRef
is for immutable references, but what if we need to modify the parent class?
In that case, we can use PyRefMut
, which is a mutable reference.
Exercise
The exercise for this section is located in 02_classes/06_parent
Wrapping up
There's a ton of little details and options when it comes to writing Python classes in Rust.
We've covered the key concepts and most common use cases, but make sure to check out
the official pyo3
documentation whenever you need more information about
a specific feature (e.g. how do I declare a class to be frozen? How do I make my class iterable?).
Let's take a moment to reflect on what we've learned so far with one last exercise.
Exercise
The exercise for this section is located in 02_classes/07_outro
Concurrency
All our code so far has been designed for sequential execution, on both the Python and Rust side. It's time to spice things up a bit and explore concurrency1!
We won't dive straight into Rust this time.
We'll start by solving a few parallel processing problems in Python, to get a feel for Python's capabilities and limitations.
Once we have a good grasp of what's possible there, we'll port our solutions over to Rust.
Multiprocessing
If you've ever tried to write parallel code in Python, you've probably come across the multiprocessing
module.
Before we dive into the details, let's take a step back and review the terminology we'll be using.
Processes
A process is an instance of a running program.
The precise anatomy of a process depends on the underlying operating system (e.g. Windows or Linux).
Some characteristics are common across most operating systems, though. In particular, a process typically consists of:
- The program's code
- Its memory space, allocated by the operating system
- A set of resources (file handles, sockets, etc.)
+------------------------+
| Memory |
| |
| +--------------------+ |
| | Process A Space | | <-- Each process has a separate memory space.
| +--------------------+ |
| |
| +--------------------+ |
| | Process B Space | |
| | | |
| +--------------------+ |
| |
| +--------------------+ |
| | Process C Space | |
| +--------------------+ |
+------------------------+
There can be multiple processes running the same program, each with its own memory space and resources, fully
isolated from one another.
The operating system's scheduler is in charge of deciding which process to run at any given time, partitioning CPU time
among them to maximize throughput and/or responsiveness.
The multiprocessing module
Python's multiprocessing
module allows us to spawn new processes, each running its own Python interpreter.
A process is created by invoking the Process
constructor with a target function to execute as well as
any arguments that function might need.
The process is launched by calling its start
method, and we can wait for it to finish by calling join
.
If we want to communicate between processes, we can use Queue
objects, which are shared between processes.
These queues try to abstract away the complexities of inter-process communication, allowing us to pass messages
between our processes in a relatively straightforward manner.
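Putting those pieces together, a minimal `multiprocessing` round trip looks like this (the `square` worker and its argument are illustrative, not part of the exercise):

```python
from multiprocessing import Process, Queue

def square(n: int, results: Queue) -> None:
    # Runs in the child process, in its own interpreter
    results.put(n * n)

if __name__ == "__main__":
    results = Queue()
    p = Process(target=square, args=(7, results))
    p.start()  # launch the child process
    p.join()   # wait for it to finish
    print(results.get())  # 49
```

The `if __name__ == "__main__":` guard matters on platforms where new processes are spawned rather than forked: the child re-imports the module, and the guard prevents it from recursively spawning more processes.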
References:
We'll limit our exploration to threads and processes, without venturing into the realm of async
/await
.
Exercise
The exercise for this section is located in 03_concurrency/00_introduction
Threads
The overhead of multiprocessing
Let's have a look at the solution for the previous exercise:
from multiprocessing import Process, Queue

def word_count(text: str, n_processes: int) -> int:
    result_queue = Queue()
    processes = []
    for chunk in split_into_chunks(text, n_processes):
        p = Process(target=word_count_task, args=(chunk, result_queue))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    results = [result_queue.get() for _ in range(len(processes))]
    return sum(results)
Let's focus, in particular, on process creation:
p = Process(target=word_count_task, args=(chunk, result_queue))
The parent process (the one executing word_count
) doesn't share memory with the child process (the one
spawned via p.start()
). As a result, the child process can't access chunk
or result_queue
directly.
Instead, it needs to be provided a deep copy of these objects1.
That's not a major issue if the data is small, but it can become a problem on larger datasets.
For example, if we're working with 8 GB of text, we'll end up with at least 16 GB of memory usage: 8 GB for the
parent process and 8 GB split among the child processes. Not ideal!
We could try to circumvent this issue2, but that's not always possible nor easy to do.
A more straightforward solution is to use threads instead of processes.
Threads
A thread is an execution context within a process.
Threads share the same memory space and resources as the process that spawned them, thus allowing them to communicate
and share data with one another more easily than processes can.
+------------------------+
| Memory |
| |
| +--------------------+ |
| | Process A Space | | <-- Each process has its own memory space.
| | +-------------+ | | Threads share the same memory space
| | | Thread 1 | | | of the process that spawned them.
| | | Thread 2 | | |
| | | Thread 3 | | |
| | +-------------+ | |
| +--------------------+ |
| |
| +--------------------+ |
| | Process B Space | |
| | +-------------+ | |
| | | Thread 1 | | |
| | | Thread 2 | | |
| | +-------------+ | |
| +--------------------+ |
+------------------------+
Threads, just like processes, are operating system constructs.
The operating system's scheduler is in charge of deciding which thread to run at any given time, partitioning CPU time
among them.
The threading module
Python's threading
module provides a high-level interface for working with threads.
The API of the Thread
class, in particular, mirrors what you already know from the Process
class:
- A thread is created by calling the Thread constructor and passing it a target function to execute, as well as any arguments that function might need.
- The thread is launched by calling its start method, and we can wait for it to finish by calling join.
- If we want to communicate between threads, we can use Queue objects from the queue module, which are shared between threads.
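The thread-based version of the same round trip is nearly identical, with no inter-process serialization involved (again, the worker is illustrative):

```python
from queue import Queue
from threading import Thread

def square(n: int, results: Queue) -> None:
    # Runs in the same process, sharing memory with the caller
    results.put(n * n)

results = Queue()
t = Thread(target=square, args=(7, results))
t.start()
t.join()
print(results.get())  # 49
```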
References:
To be more precise, the multiprocessing
module uses the pickle
module to serialize the objects
that must be passed as arguments to the child process.
The serialized data is then sent to the child process, as a byte stream, over an operating system pipe.
On the other side of the pipe, the child process deserializes the byte stream back into Python objects using pickle
and passes them to the target function.
This whole system has higher overhead than a "simple" deep copy.
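You can observe that round trip directly with pickle: every argument crosses the process boundary as a byte stream and comes out the other side as a copy (the sample payload is illustrative):

```python
import pickle

# Arguments cross the process boundary as a pickled byte stream.
args = (["some", "text", "to", "count"],)
payload = pickle.dumps(args)      # serialized, written to the pipe
restored = pickle.loads(payload)  # deserialized in the child process
assert restored == args
assert restored[0] is not args[0]  # a copy, not shared memory
```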
Common workarounds include memory-mapped files and shared-memory objects, but these can be quite difficult to work with. They also suffer from portability issues, as they rely on OS-specific features.
Exercise
The exercise for this section is located in 03_concurrency/01_python_threads
The GIL problem
Concurrent, yes, but not parallel
On the surface, our thread-based solution addresses all the issues we identified in the multiprocessing
module:
from threading import Thread
from queue import Queue

def word_count(text: str, n_threads: int) -> int:
    result_queue = Queue()
    threads = []
    for chunk in split_into_chunks(text, n_threads):
        t = Thread(target=word_count_task, args=(chunk, result_queue))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    results = [result_queue.get() for _ in range(len(threads))]
    return sum(results)
When a thread is created, we are no longer cloning the text chunk nor incurring the overhead of inter-process communication:
t = Thread(target=word_count_task, args=(chunk, result_queue))
Since the spawned threads share the same memory space as the parent thread, they can access the chunk
and result_queue
directly.
Nonetheless, there's a major issue with this code: it won't actually use multiple CPU cores.
It will run sequentially, even if we pass n_threads > 1
and multiple CPU cores are available.
Python concurrency
You guessed it: the infamous Global Interpreter Lock (GIL) is to blame. As we discussed in the GIL chapter, Python's GIL prevents multiple threads from executing Python code simultaneously1.
As a result, thread-based parallelism has historically seen limited use in Python, as it doesn't provide the performance benefits one might expect from a multithreaded application.
That's why the multiprocessing
module is so popular: it allows Python developers to bypass the GIL.
Each process has its own Python interpreter, and thus its own GIL. The operating system schedules these processes
independently, allowing them to run in parallel on multicore CPUs.
But, as we've seen, multiprocessing comes with its own set of challenges.
Native extensions
There's a third way to achieve parallelism in Python: native extensions.
We must be holding the GIL when we invoke a Rust function from Python, but
pure Rust threads are not affected by the GIL, as long as they don't need to interact with Python objects.
Let's rewrite again our word_count
function, this time in Rust!
This is the current state of Python's concurrency model. There are some exciting changes on the horizon, though!
CPython
's free-threading mode is an experimental feature
that aims to remove the GIL entirely.
It would allow multiple threads to execute Python code simultaneously, without forcing developers to rely on multiprocessing.
We won't cover the new free-threading mode in this course, but it's worth keeping an eye on it as it matures out of the experimental phase.
Exercise
The exercise for this section is located in 03_concurrency/02_gil
Releasing the GIL
What happens to our Python code when it calls a Rust function?
It waits for the Rust function to return:
Time -->
+------------+--------------------+------------+--------------------+
Python: | Execute | Call Rust Function | Idle | Resume Execution |
+------------+--------------------+------------+--------------------+
│ ▲
▼ │
+------------+--------------------+------------+--------------------+
Rust: | Idle | Idle | Execute | Return to Python |
+------------+--------------------+------------+--------------------+
The schema doesn't change even if the Rust function is multithreaded:
Time -->
+------------+--------------------+-------------------+--------------------+
Python: | Execute | Call Rust Function | Idle | Resume Execution |
+------------+--------------------+-------------------+--------------------+
│ ▲
▼ │
+------------+--------------------+-------------------+--------------------+
Rust: | Idle | Idle | Execute Thread 1 | Return to Python |
| | | Execute Thread 2 | |
+------------+--------------------+-------------------+--------------------+
It begs the question: can we have Python and Rust code running concurrently?
Yes! The focus point, once again, is the GIL.
Python access must be serialized
The GIL's job is to serialize all interactions with Python objects.
On the pyo3
side, this is modeled by the Python<'py>
token: you can only get an instance of Python<'py>
if you're holding
the GIL. Going further, you can only interact with Python objects via smart pointers like Bound<'py, T> or Borrowed<'a, 'py, T>, which internally hold a Python<'py> instance.
There's no way around it: any interaction with Python objects must be serialized. But, here's the kicker: not all Rust code needs to
interact with Python objects!
Python::allow_threads
For example, consider a Rust function that calculates the nth Fibonacci number:
#[pyfunction]
fn fibonacci(n: u64) -> u64 {
    let mut a = 0;
    let mut b = 1;
    for _ in 0..n {
        let tmp = a;
        a = b;
        b = tmp + b;
    }
    a
}
There's no Python object in sight! We're just offloading a computation to Rust.
In principle, we could spawn a thread to run this function while the main thread continues executing Python code:
from threading import Thread

def other_work():
    print("I'm doing other work!")

t = Thread(target=fibonacci, args=(10,))
t.start()
other_work()
t.join()
As it stands, other_work
and fibonacci
will not be run in parallel: our fibonacci
routine is still holding the GIL, even though
it doesn't need it.
We can fix it by explicitly releasing the GIL:
#[pyfunction]
fn fibonacci(py: Python<'_>, n: u64) -> u64 {
    py.allow_threads(|| {
        let mut a = 0;
        let mut b = 1;
        for _ in 0..n {
            let tmp = a;
            a = b;
            b = tmp + b;
        }
        a
    })
}
Python::allow_threads
releases the GIL while executing the closure passed to it.
This frees up the Python interpreter to run other Python code, such as the other_work
function in our example, while the Rust
thread is busy calculating the nth Fibonacci number.
Using the same line diagram as before, we have the following:
Time -->
+------------+--------------------+-------------------+--------------------+
Python: | Execute | Call Rust Function | other_work() | t.join() |
+------------+--------------------+-------------------+--------------------+
│ ▲
▼ │
+------------+--------------------+-------------------+--------------------+
Rust: | Idle | Idle | fibonacci(n) | Return to Python |
+------------+--------------------+-------------------+--------------------+
▲
│
Python and Rust code
running concurrently here
Ungil
Python::allow_threads
is only sound if the closure doesn't interact with Python objects.
If that's not the case, we end up with undefined behavior: Rust code touching Python objects while the Python interpreter is running other Python code, under the (now broken) assumption that the GIL is protecting those objects from concurrent access. A recipe for disaster!
It'd be ideal to rely on the type system to enforce this constraint for us at compile-time, in true Rust fashion—"if it compiles, it's
safe."
pyo3
tries to follow this principle with the Ungil
marker trait:
only types that are safe to access without the GIL can implement Ungil
. The trait is then used to constrain the arguments of
Python::allow_threads
:
pub fn allow_threads<T, F>(self, f: F) -> T
where
    F: Ungil + FnOnce() -> T,
    T: Ungil,
{
    // ...
}
Unfortunately, Ungil
is not perfect.
On stable Rust, it leans on the Send
trait, but that allows for some
unsafe interactions with Python objects. The tracking is more precise on nightly
Rust1,
but it doesn't catch every possible misuse of Python::allow_threads
.
My recommendation: if you're using Python::allow_threads
, trigger an additional run of your CI pipeline using the nightly
Rust compiler
to catch more issues. On top of that, review your code carefully.
See the nightly
feature flag exposed by pyo3
.
Exercise
The exercise for this section is located in 03_concurrency/03_releasing_the_gil
Minimize GIL locking
All our examples so far fall into two categories:
- The Rust function holds the GIL for the entire duration of its execution.
- The Rust function doesn't hold the GIL at all, going straight into
Python::allow_threads
mode.
Real-world applications are often more nuanced, though.
You'll need to hold the GIL for some operations (e.g. passing data back to Python), but you're able to release it
for others (e.g. long-running computations).
The goal is to minimize the time spent holding the GIL to the bare minimum, thus maximizing the potential parallelism of your application.
Strategy 1: isolate the GIL-free section
Let's look at an example: we're given a list of numbers and we need to modify it in place, replacing each number with the result of an expensive computation that uses no Python objects.
To minimize GIL locking, we create a Rust vector from the Python list, release the GIL while performing the computation, and then re-acquire the GIL to update the Python list in place:
use pyo3::prelude::*;
use pyo3::types::PyList;

#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>,
) -> PyResult<()> {
    // Holding the GIL
    let v: Vec<i32> = numbers.extract()?;
    let updated_v: Vec<_> = python.allow_threads(|| {
        v.iter().map(|&n| expensive_computation(n)).collect()
    });
    // Back to holding the GIL
    for (i, &n) in updated_v.iter().enumerate() {
        numbers.set_item(i, n)?;
    }
    Ok(())
}

fn expensive_computation(n: i32) -> i32 {
    // Some heavy number crunching
    // [...]
}
Strategy 2: manually re-acquire the GIL inside the closure
In the example above, we've created a whole new vector to decouple the GIL-free section from the GIL-holding one. If the input data is large, this can be a significant overhead.
Let's explore a different approach: we won't create a new pure-Rust vector. Instead, we will re-acquire the GIL inside the closure—we'll hold it to access each list element and, after the computation is done, update it in place. Nothing more.
Assuming you know nothing about Ungil
, the naive solution might look like this:
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>,
) -> PyResult<()> {
    python.allow_threads(|| -> PyResult<()> {
        let n_numbers = numbers.len();
        for i in 0..n_numbers {
            let n = numbers.get_item(i)?.extract::<i64>()?;
            let result = expensive_computation(n);
            numbers.set_item(i, result)?;
        }
        Ok(())
    })
}
It won't compile, though. We're using a GIL-bound object (numbers
) in a GIL-free section (inside python.allow_threads
).
We need to unbind it first.
Py<T>
and Bound<'py, T>
Using Bound<'py, T>::unbind
we get a Py<T>
object back. It has no 'py
lifetime, it's no longer bound to the GIL.
We can try to use it in the GIL-free section:
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>,
) -> PyResult<()> {
    let numbers = numbers.unbind();
    python.allow_threads(|| -> PyResult<()> {
        let n_numbers = numbers.len();
        for i in 0..n_numbers {
            let n = numbers.get_item(i)?.extract::<i64>()?;
            let result = expensive_computation(n);
            numbers.set_item(i, result)?;
        }
        Ok(())
    })
}
But it won't compile either. numbers.len()
, numbers.get_item(i)
, and numbers.set_item(i, result)
all require the GIL.
Py<T>
is just a pointer to a Python object, it won't allow us to access it if we're not holding the GIL.
We need to re-bind it using a Python<'py>
token, thus getting a Bound<'py, PyList>
back.
How do we get a Python<'py>
token inside the closure, though? Using Python::with_gil
: it's the opposite
of Python::allow_threads
, it makes sure to acquire the GIL before executing the closure and release it afterwards.
The closure is given a Python
token as argument, which we can use to re-bind the PyList
object:
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>,
) -> PyResult<()> {
    let n_numbers = numbers.len();
    let numbers_ref = numbers.unbind();
    // Release the GIL
    python.allow_threads(|| -> PyResult<()> {
        for i in 0..n_numbers {
            // Acquire the GIL again, to access the
            // i-th element of the list
            let n = Python::with_gil(|inner_py| {
                numbers_ref
                    .bind(inner_py)
                    .get_item(i)?
                    .extract::<i64>()
            })?;
            // Run the computation without holding the GIL
            let result = expensive_computation(n);
            // Re-acquire the GIL to update the list in place
            Python::with_gil(|inner_py| {
                numbers_ref.bind(inner_py).set_item(i, result)
            })?;
        }
        Ok(())
    })
}
Be mindful of concurrency
The GIL is there for a reason: to protect Python objects from concurrent access.
Whenever you release the GIL, you're allowing other threads to run and potentially modify the
Python objects you're working with.
In the examples above, another Python thread could modify the numbers
list while we're computing the result.
E.g. it could remove an element, causing the index i
to be out of bounds.
This is a common issue in multi-threaded programming, and it's up to you to handle it.
Consider using synchronization primitives like Lock
to serialize access to the Python objects you're working with.
In other words, move towards fine-grained locking rather than the lock-the-world approach
you get with the GIL.
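As a sketch of what fine-grained locking can look like on the Python side (the shared list and worker function are illustrative):

```python
from threading import Thread, Lock

numbers = list(range(8))
lock = Lock()

def update(i: int) -> None:
    # Do the expensive part without holding the lock...
    value = i * i  # stand-in for a long-running computation
    # ...and hold it only while touching the shared list.
    with lock:
        numbers[i] = value

threads = [Thread(target=update, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(numbers)  # [0, 1, 4, 9, 16, 25, 36, 49]
```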
References
Exercise
The exercise for this section is located in 03_concurrency/04_minimize_gil_locking
Immutable types
Concurrency introduces many new classes of bugs that are not present in single-threaded programs.
Data races are one of the most common: two threads try to access the same memory location at the same time, and at least one of them
is writing to it. What should happen?
In most programming languages, the behavior is undefined: the program could crash, or it could produce incorrect results.
Data races can't happen in a single-threaded program, because only one thread can access the memory at a time. That's where the GIL comes in: since it serializes the execution of code that accesses Python objects, it prevents all kinds of data races (albeit with a significant performance cost).
There's another way to prevent data races though: by making sure that the data is immutable. There's no need for synchronization if the data can't change!
Built-in immutable types
Python has many immutable types—e.g. int
, float
, str
.
Whenever you modify them, you're actually creating a new object, not changing the existing one.
a = 1
b = a
a += 1
assert a == 2
# a is a new object,
# b is still 1
assert b == 1
Since they're immutable, they're considered thread-safe: you can access them from multiple threads without worrying about data races and synchronization.
Frozen dataclasses
You can define your own immutable types in Python using dataclasses
and the frozen
attribute.
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
# This will raise a `FrozenInstanceError` exception
p.x = 3
The frozen
attribute makes the class immutable: you can't modify its attributes after creation.
This goes beyond modifying the values of the existing attributes. You are also forbidden from
adding new attributes, e.g.:
# This will raise a `FrozenInstanceError` exception
# But would work if `frozen=False` or for a "normal"
# class without the `@dataclass` decorator
p.z = 3
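A side benefit worth knowing: because `frozen=True` (together with the default `eq=True`) makes instances immutable, `dataclasses` also generates a `__hash__` for them, so frozen instances can be used as dictionary keys or set members:

```python
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
try:
    p.x = 3
except FrozenInstanceError:
    pass  # mutation is rejected, as expected

# Frozen (and therefore hashable) instances work as dict keys
labels = {Point(0, 0): "origin"}
print(labels[Point(0, 0)])  # origin
```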
In Rust
Let's see how we can define a similar immutable type in Rust.
use pyo3::prelude::*;

#[pyclass(frozen)]
struct Point {
    x: i32,
    y: i32,
}
The above is not enough to get all the niceties of Python's dataclasses
, but
it's sufficient to make the class immutable.
If a pyclass
is marked as frozen
, pyo3
will allow us to access its fields without
holding the GIL—i.e. via Py<T>
instead of Bound<'py, T>:
#[pyfunction]
fn print_point<'py>(python: Python<'py>, point: Bound<'py, Point>) {
    let point: Py<Point> = point.unbind();
    python.allow_threads(|| {
        // We can now access the fields of the Point struct
        // even though we are not holding the GIL
        let point: &Point = point.get();
        println!("({}, {})", point.x, point.y);
    });
}
This wouldn't compile if Point
wasn't marked as frozen
, thanks to Py<T>::get
's signature:
impl<T> Py<T>
where
    T: PyClass,
{
    pub fn get(&self) -> &T
    where
        // `Frozen = True` is where the magic happens!
        T: PyClass<Frozen = True> + Sync,
    {
        /* ... */
    }
}
Summary
Immutable types significantly simplify GIL jugglery in pyo3
. If it fits the constraints of the problem you're solving,
consider using them to make your code easier to reason about (and potentially faster!).
Exercise
The exercise for this section is located in 03_concurrency/05_immutable_types
Wrapping up
You should now have a strong theoretical foundation to reason about your options whenever you have a Python problem that could benefit from a concurrent solution.
That's only the beginning, though. To truly master concurrent programming, you need to practice it!
This wasn't the right venue to cover all the possible concurrency patterns you can express with Rust.
Luckily enough, there's plenty of resources available to help you on that side! We recommend, in particular,
the "Threads" chapter of our own
Rust course. It follows the same hands-on approach we've been using in this book, and it's a great way to
get more practice with Rust's concurrency primitives.
Take the final exercise as a capstone project: you'll have to design a non-trivial algorithm and piece together various concurrency primitives to implement a solution that's both correct and efficient. Don't shy away from the challenge: embrace it, it's the best way to learn!
Beyond performance
A note, in closing: writing correct concurrent code is tricky.
We highlighted Rust, in this chapter, as a way to circumvent the limitations of Python's GIL and ultimately
improve the performance of your code. But that's only half of the story.
Rust's type system and ownership model make it much easier to write concurrent code that's correct, too.
As you delve deeper into the world of concurrent programming, you'll come to appreciate the real value of Rust's
Send
and Sync
traits, which we've only briefly touched upon when discussing data races.
As the saying goes: people come to Rust for the performance, but they stay for its safety and correctness guarantees.
Exercise
The exercise for this section is located in 03_concurrency/06_outro