Welcome

Welcome to "Rust-Python interoperability"!
This course will teach you how to call Rust code from Python, packaged as a native extension module.

We assume you are familiar with both Rust and Python, but we don't assume any prior interoperability knowledge. We will provide a brief explanation and references whenever we rely on advanced features of either language.

Methodology

This course is based on the "learn by doing" principle.
It has been designed to be interactive and hands-on: you'll build up your knowledge in small, manageable steps.

Mainmatter developed this course to be delivered in a classroom setting, over a whole day: each attendee advances through the lessons at their own pace, with an experienced instructor providing guidance, answering questions and diving deeper into the topics as needed.
If you're interested in attending one of our training sessions, or if you'd like to bring this course to your company, please get in touch.

You can also follow the course on your own, but we recommend you find a friend or a mentor to help you along the way should you get stuck. You can also find solutions to all exercises in the solutions branch of the GitHub repository.

Prerequisites

To follow this course, you must install:

  • Rust
  • uv, a Python package manager

If Rust is already installed on your machine, make sure to update it to the latest version:

# If you installed Rust using `rustup`, the recommended way,
# you can update to the latest stable toolchain with:
rustup update stable

These commands should successfully run on your machine:

cargo --version
uv --version

Don't start the course until you have these tools installed and working.

Structure

On the left side of the screen, you can see that the course is divided into sections.
To verify your understanding, each section is paired with an exercise that you need to solve.

You can find the exercises in the companion GitHub repository.
Before starting the course, make sure to clone the repository to your local machine:

# If you have an SSH key set up with GitHub
git clone git@github.com:mainmatter/rust-python-interoperability.git
# Otherwise, use the HTTPS URL:
#
#   git clone https://github.com/mainmatter/rust-python-interoperability.git

We recommend you work on a branch, so you can easily track your progress and pull updates from the main repository if needed:

cd rust-python-interoperability
git checkout -b my-solutions

All exercises are located in the exercises folder. Each exercise is structured as a Rust package. The package contains the exercise itself, instructions on what to do (in src/lib.rs), and a test suite to automatically verify your solution.

wr, the workshop runner

To verify your solutions, we've provided a tool that will guide you through the course. It is the wr CLI (short for "workshop runner"). Install it with:

cargo install --locked workshop-runner

In a new terminal, navigate back to the top-level folder of the repository. Run the wr command to start the course:

wr

wr will verify the solution to the current exercise.
Don't move on to the next section until you've solved the exercise for the current one.

We recommend committing your solutions to Git as you progress through the course, so you can easily track your progress and "restart" from a known point if needed.

Enjoy the course!

Author

This course was written by Luca Palmieri, Principal Engineering Consultant at Mainmatter.
Luca has been working with Rust since 2018, initially at TrueLayer and then at AWS.
Luca is the author of "Zero to Production in Rust", the go-to resource for learning how to build backend applications in Rust, and "100 Exercises to Learn Rust", a learn-by-doing introduction to Rust itself.
He is also the author and maintainer of a variety of open-source Rust projects, including cargo-chef, Pavex and wiremock.

Exercise

The exercise for this section is located in 01_intro/00_welcome

Anatomy of a Python extension

Don't jump ahead!
Complete the exercise for the previous section before you start this one.
It's located in exercises/01_intro/00_welcome, in the course's GitHub repository.
Use wr to start the course and verify your solutions.

To invoke Rust code from Python we need to create a Python extension module.

Rust, just like C and C++, compiles to native code. For this reason, extension modules written in Rust are often called native extensions. Throughout this course we'll use the terms Python extension, Python extension module and native extension interchangeably.

maturin

We'll use maturin to build, package and publish Python extensions written in Rust. Let's install it:

uv tool install "maturin>=1.8"

Tools installed via uv should be available in your PATH. Run:

uv tool update-shell

to make sure that's the case.

Exercise structure

All exercises in this course will follow the same structure:

  • an extension module written in Rust, in the root of the exercise directory
  • a Python package that invokes the functionality provided by the extension, in the sample subdirectory

The extension module will usually be tested from Python, in the sample/tests subdirectory. You will have to modify the Rust code in the extension module to make the tests pass.

Extension structure

Let's explore the structure of the extension module for this section.

01_setup
├── sample
├── src
│   └── lib.rs
├── Cargo.toml
└── pyproject.toml

Cargo.toml

The manifest file, Cargo.toml, looks like this:

[package]
name = "setup"
version = "0.1.0"
edition = "2021"

[lib]
name = "setup"
crate-type = ["cdylib"]

[dependencies]
pyo3 = "0.23.0"

Two things stand out in this file compared to a regular Rust project:

  • The crate-type attribute is set to cdylib.
  • The pyo3 crate is included as a dependency.

Let's cover these two points in more detail.

Linking

Static linking

By default, Rust libraries are compiled as static libraries.
All dependencies are linked into the final executable at compile-time, making the executable self-contained [1].

That's great for distributing applications, but it's not ideal for Python extensions.
To perform static linking, the extension module would have to be compiled alongside the Python interpreter. Furthermore, you'd have to distribute the modified interpreter to all your users.
At the ecosystem level, this process would scale poorly, as each user needs to leverage several unrelated extensions at once. Every single project would have to compile its own bespoke Python interpreter.

Dynamic linking

To avoid this scenario, Python extensions are packaged as dynamic libraries.
The Python interpreter can load these libraries at runtime, without having to be recompiled. Instead of distributing a modified Python interpreter to all users, you must now distribute the extension module as a standalone file.

Rust supports dynamic linking, and it provides two different flavors of dynamic libraries: dylib and cdylib. A dylib is a Rust-flavored dynamic library, geared towards Rust-to-Rust dynamic linking. A cdylib, on the other hand, is a dynamic library that exports a C-compatible interface (C dynamic library).

You need a common dialect to get two different languages to communicate with each other. They both need to speak it and understand it.
That common dialect, today, is C's ABI (Application Binary Interface).

That's why, for Python extensions, you must use the cdylib crate type:

[lib]
crate-type = ["cdylib"]
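
To make "a C-compatible interface" a little more concrete, here is a minimal sketch of a Rust function that uses the C calling convention. The name add_u32 is made up for this illustration and is not part of the course's exercises. You won't write this kind of code by hand in this course: pyo3 generates the equivalent glue for you.

```rust
/// A function that uses the C calling convention (`extern "C"`).
/// `add_u32` is a hypothetical name, used only for this sketch.
///
/// To export it from a `cdylib` under a predictable symbol name you would
/// also disable name mangling with `#[no_mangle]` (spelled
/// `#[unsafe(no_mangle)]` in the 2024 edition). The attribute is left out
/// here so that the snippet compiles on any edition.
pub extern "C" fn add_u32(a: u32, b: u32) -> u32 {
    // A panic must not unwind across an `extern "C"` boundary, so we use
    // wrapping arithmetic instead of `+`, which panics on overflow in
    // debug builds.
    a.wrapping_add(b)
}
```

From Rust, an extern "C" function can still be called like any other function, e.g. add_u32(2, 3).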

pyo3

It's not enough to expose a C-compatible interface. You must also comply with the Python C API, the interface Python uses to interact with C extensions.

Doing this manually is error-prone and tedious. That's where the pyo3 crate comes in: it provides a safe and idiomatic way to write Python extensions in Rust, abstracting away the low-level details.

In lib.rs, you can see it in action:

use pyo3::prelude::*;

#[pyfunction]
fn it_works() -> bool {
    todo!()
}

/// A Python module implemented in Rust.
#[pymodule]
fn setup(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(it_works, m)?)?;
    Ok(())
}

We're using pyo3 to define a Python function, named it_works, that returns a boolean. The function is then exposed to Python at the top-level of our extension module, named setup.

That same function is then invoked from Python, inside sample/tests/test_sample.py:

from setup import it_works

def test_works(): 
    assert it_works()

We'll cover the details of #[pyfunction] and #[pymodule] in the next section, no worries.

pyproject.toml

Before we move on, let's take a look at pyproject.toml, the Python "manifest" of the extension module:

[build-system]
requires = ["maturin>=1.8,<2.0"]
build-backend = "maturin"

[project]
name = "setup"
# [...]
requires-python = ">=3.13"

[tool.maturin]
features = ["pyo3/extension-module"]

It specifies the build system, the extension name and version, the required Python version, and the features to enable when building the extension module. This is what uv looks at when building the extension module, before delegating the build process to maturin, which in turn invokes cargo to compile the Rust code.

What do I need to do?

A lot has to go right behind the scenes to make a Python extension work.
That's why the exercise for this section is fairly boring—we want to verify that you can build and test a Python extension module without issues.

Things will get a lot more interesting over the coming sections, I promise!

Footnotes

[1] This is true to an extent. In most cases, some dependencies are still dynamically linked, e.g. libc on most Unix systems. Nonetheless, the final executable is self-contained in the sense that it doesn't rely on the presence of the Rust standard library or any other Rust crate on the user's system.

Exercise

The exercise for this section is located in 01_intro/01_setup

Modules

In Python, just like in Rust, your code is organized into modules.
Your entire extension is a module!

That module is defined using pyo3's #[pymodule] procedural macro, as you've seen in the previous section:

#[pymodule]
fn setup(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // [...]
}

setup becomes the entry point for the Python interpreter to load your extension.

Naming matters

The name of the annotated function is important: there must be at least one module with a name that matches the name of the dynamic library artifact that Python will try to load. This is the name of the library target specified in your Cargo.toml file:

[lib]
name = "name_of_your_rust_library"

If you don't have a [lib] section, it defaults to the name of your package, specified in the [package] section.

If the module name and the library name don't match, Python will raise an error when trying to import the module:

ImportError: dynamic module does not define 
    module export function (PyInit_name_of_your_module)

The name argument

You can also specify the name of the module explicitly using the name argument, rather than relying on the name of the annotated function:

#[pymodule]
#[pyo3(name = "setup")]
fn random_name(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // [...]
}

Mysterious types?

You might be wondering: what's up with &Bound<'_, PyModule>? What about PyResult?
Don't worry, we'll cover these types in due time later in the course. Go with the flow for now!

Exercise

The exercise for this section is located in 01_intro/02_modules

Functions

Empty modules are not that useful: let's add some functions to our extension!
As you've seen in the "Setup" section, pyo3 provides another procedural macro to define functions that can be called from Python: #[pyfunction].

Back then we used it to define the it_works function:

use pyo3::prelude::*;

// 👇 A Python function defined in Rust
#[pyfunction]
fn it_works() -> bool {
    true
}

Unlike modules, functions aren't exposed to Python automatically; you must attach them to a module using the wrap_pyfunction! macro:

#[pymodule]
fn setup(m: &Bound<'_, PyModule>) -> PyResult<()> {
    // 👇 Expose the function to Python
    m.add_function(wrap_pyfunction!(it_works, m)?)?;
    Ok(())
}

Exercise

The exercise for this section is located in 01_intro/03_functions

Arguments

no_op, the function you added to solve the previous exercise, is very simple:

use pyo3::prelude::*;

#[pyfunction]
fn no_op() {
    // Do nothing
}

Let's take it up a notch: what if you want to pass a value from Python to Rust?

The FromPyObject trait

#[pyfunction] functions can take arguments, just like regular Rust functions.
But there's a catch: it must be possible to build those arguments from Python objects.

The contract is encoded in the FromPyObject trait, defined in pyo3:

pub trait FromPyObject<'py>: Sized {
    fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self>;
}

We won't go into the details of FromPyObject's definition just yet: it would require an in-depth discussion of Python's Global Interpreter Lock (GIL) and the way pyo3 models it in Rust. We'll get to it in the next section.
For the time being, let's focus on what the trait unlocks for us: the ability to convert Python objects into Rust types.

Available implementations

pyo3 provides implementations of FromPyObject for a large number of types—e.g. i32, f64, String, Vec, etc. You can find an exhaustive list in pyo3's guide, under the "Rust" table column.

Conversion cost

Going from a Python object to a Rust type is not free—e.g. the in-memory representation of a Python list doesn't match the in-memory representation of a Rust Vec.
The conversion introduces a (usually small) overhead that you'll have to incur every time you invoke your Rust function from Python. It's a good trade-off if you end up performing enough computational work in Rust to amortize the conversion cost.
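
To get a feel for where the cost comes from, here is a toy model in plain Rust. It does not use pyo3, and ToyObject is not how pyo3 or CPython represent objects: it is an assumption made for this sketch, mimicking a list whose items can be of any type. The point is that the conversion has to visit, type-check and copy every single element.

```rust
/// Toy stand-in for a Python object. This is NOT how pyo3 or CPython
/// represent objects; it only mimics "a list can hold anything".
#[derive(Debug)]
enum ToyObject {
    Int(u64),
    Str(String),
}

/// Toy counterpart of extracting a `Vec<u64>` from a Python list:
/// every element is type-checked and copied into a freshly allocated `Vec`.
fn extract_u64_list(list: &[ToyObject]) -> Result<Vec<u64>, String> {
    let mut out = Vec::with_capacity(list.len());
    for (index, item) in list.iter().enumerate() {
        match item {
            ToyObject::Int(n) => out.push(*n),
            other => {
                return Err(format!("item {} is not an integer: {:?}", index, other));
            }
        }
    }
    Ok(out)
}
```

The work is proportional to the length of the list, and it is repeated on every call.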

Python-native types

In pyo3's documentation you can see a column of "Python-native" types.
Don't try to use them to solve the exercise for this section: we'll cover them in the next one.

Exercise

The exercise for this section is located in 01_intro/04_arguments

Global Interpreter Lock (GIL)

If you go back to pyo3's documentation on arguments, you'll find a table column listing so-called "Python-native" types. What are they, and why would you use them?

Python-native types

There is overhead in converting a Python object into a Rust-native type.
That overhead might dominate the cost of invoking your Rust function if the function itself isn't doing much computational work. In those cases, it can be desirable to work directly using Python's in-memory representation of the object. That's where the Py* types come in: they give you direct access to Python objects, with minimal overhead [1].

Out of the entire family of Py* types, PyAny deserves a special mention. It's the most general Python-native type in pyo3: it stands for an arbitrary Python object. You can use it whenever you don't know the exact type of the object you're working with, or you don't care about it.
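
As a loose analogy in plain Rust (no pyo3 involved), you can think of PyAny as playing a role similar to the standard library's dyn Any: a value whose concrete type is only known at runtime. Getting a typed view of it requires a runtime type check, but no copy of the underlying data. The function name below is hypothetical, and the analogy only goes so far.

```rust
use std::any::Any;

/// Rough analogy for "check the type, then hand over a typed view":
/// returns the length of `obj` if it is a `Vec<u64>`, `None` otherwise.
/// (Hypothetical helper, written only for this sketch.)
fn len_if_u64_list(obj: &dyn Any) -> Option<usize> {
    // `downcast_ref` performs a runtime type check, without copying the data.
    obj.downcast_ref::<Vec<u64>>().map(|list| list.len())
}
```

Passing a Vec<u64> yields its length, while passing anything else (a string, for example) yields None.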

Py* types don't implement FromPyObject

Let's try to rewrite the solution of the previous exercise using PyList rather than Vec<u64>:

use pyo3::prelude::*;
use pyo3::types::PyList;

#[pyfunction]
fn print_number_list(list: &PyList) {
    todo!()
}

If you try to compile this code, you'll get an error:

error[E0277]: the trait bound `&PyList: PyFunctionArgument<'_, '_>` is not satisfied
   --> src/lib.rs:7:28
    |
7   | fn print_number_list(list: &PyList) {
    |                            ^ 
    |        the trait `PyClass` is not implemented for `&PyList`, 
    |        which is required by `&PyList: PyFunctionArgument<'_, '_>`
    |
    = help: the following other types implement trait `PyFunctionArgument<'a, 'py>`:
              &'a pyo3::Bound<'py, T>
              Option<&'a pyo3::Bound<'py, T>>
    = note: required for `&PyList` to implement `FromPyObject<'_>`
    = note: required for `&PyList` to implement `FromPyObjectBound<'_, '_>`
    = note: required for `&PyList` to implement `PyFunctionArgument<'_, '_>`

The error message is a bit cryptic because it mentions a number of private pyo3 traits (PyFunctionArgument and FromPyObjectBound), but the gist of it is that &PyList doesn't implement FromPyObject. That's true for all Py* types.

Confusing, isn't it? How is it possible that Python-native types, which require no conversion, don't implement the trait that allows you to convert Python objects into Rust types?

It's time to have that talk, the one about Python's Global Interpreter Lock (GIL).

Global Interpreter Lock (GIL)

Out of the box, Python's [2] data structures are not thread-safe. To prevent data races, there is a global mutual exclusion lock that allows only one thread to execute Python bytecode at a time: the so-called Global Interpreter Lock (GIL).

It is forbidden to interact with Python objects without holding the GIL.

That's why pyo3 doesn't implement FromPyObject for Py* types: it would allow you to interact with Python objects without you necessarily holding the GIL, a recipe for disaster.

Python<'py>

pyo3 uses a combination of lifetimes and smart pointers to ensure that you're interacting with Python objects in a safe way.

Python<'py> is the cornerstone of the entire system: it's a token type that guarantees that you're holding the GIL. All APIs that require you to hold the GIL will, either directly or indirectly, require you to provide a Python<'py> token as proof.

pyo3 will automatically acquire the GIL behind the scenes whenever you invoke a Rust function from Python. In fact, you can ask for a Python<'py> token as an argument to your Rust function, and pyo3 will provide it for you—it has no (additional) cost.

use pyo3::prelude::*;
// There is no runtime difference between invoking the two functions
// below from Python.
// The first one is just more explicit about the fact that it requires
// the caller to acquire the GIL ahead of time.

#[pyfunction]
fn print_number_list(_py: Python<'_>, list: Vec<u64>) {
    todo!()
}

#[pyfunction]
fn print_number_list2(list: Vec<u64>) {
    todo!()
}

'py, the lifetime parameter of Python<'py>, is used to represent how long the GIL is going to be held.

Bound<'py>

You won't be interacting with Python<'py> directly most of the time.
Instead, you'll use the Bound<'py, T> type, a smart pointer that encapsulates a reference to a Python object, ensuring that you're holding the GIL when you're interacting with it.

Using Bound<'py, T> we can finally start using the Py* types as function arguments:

use pyo3::prelude::*;
use pyo3::types::PyList;

#[pyfunction]
fn print_number_list(list: Bound<'_, PyList>) {
    todo!()
}

Bound ensures that we're holding the GIL when interacting with the list instance that has been passed to us as function argument.
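
The two ideas, a token that proves the lock is held and a smart pointer tied to that token, can be modelled in plain Rust. The sketch below is a toy: ToyGil, Token and ToyBound are names invented for this illustration, the lock is an ordinary (non re-entrant) Mutex, and none of it reflects pyo3's actual implementation. It only shows how a lifetime can make "you may touch this value only while the lock is held" a compile-time rule.

```rust
use std::marker::PhantomData;
use std::sync::Mutex;

/// Toy stand-in for the interpreter lock.
struct ToyGil {
    lock: Mutex<()>,
}

/// Toy counterpart of `Python<'py>`: a zero-sized, copyable proof that
/// the lock is held for the lifetime `'py`.
#[derive(Clone, Copy)]
struct Token<'py> {
    _held: PhantomData<&'py ()>,
}

/// Toy counterpart of `Bound<'py, T>`: a reference to a value, paired
/// with the proof that the lock is held.
struct ToyBound<'py, T> {
    value: &'py T,
    _token: Token<'py>,
}

impl<'py, T> ToyBound<'py, T> {
    /// You can't build a `ToyBound` without presenting a token.
    fn new(token: Token<'py>, value: &'py T) -> Self {
        ToyBound { value, _token: token }
    }

    fn get(&self) -> &'py T {
        self.value
    }
}

impl ToyGil {
    fn new() -> Self {
        ToyGil { lock: Mutex::new(()) }
    }

    /// Acquires the lock, runs `f` with a token, then releases the lock.
    /// `f` must work for every possible `'py`, so neither the token nor
    /// anything tied to it can escape the closure.
    fn with_gil<F, R>(&self, f: F) -> R
    where
        F: for<'py> FnOnce(Token<'py>) -> R,
    {
        let _guard = self.lock.lock().unwrap();
        f(Token { _held: PhantomData })
    }
}
```

A caller would write gil.with_gil(|token| { ... }) and build ToyBound values inside the closure. Trying to return one of them from the closure is rejected by the compiler.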

FromPyObject

We can now go back to the definition of the FromPyObject trait:

pub trait FromPyObject<'py>: Sized {
    fn extract_bound(ob: &Bound<'py, PyAny>) -> PyResult<Self>;
}

extract_bound takes a &Bound<'py, PyAny> as argument, rather than a bare &PyAny, to ensure that we're holding the GIL when we're interacting with the Python object during the conversion.

Footnotes

[1] pyo3 still needs to ensure that the Python object you're working with is of the expected type. It'll therefore perform an isinstance check before handing you the object, e.g. checking that an object is indeed a list before giving you a PyList argument. The only exception to this rule is PyAny, which can represent an arbitrary Python object.

[2] CPython is the reference implementation of Python, written in C. It's the most widely used Python interpreter and what most people refer to when they say "Python".

Exercise

The exercise for this section is located in 01_intro/05_gil

Output values

We've gone deep into the weeds of how pyo3 handles arguments to your #[pyfunction]s. Let's now move our focus to output values: how do you return something from your Rust functions to Python?

IntoPyObject

Guess what? There's a trait for that too!
IntoPyObject is the counterpart of FromPyObject. It converts Rust values into Python objects:

pub trait IntoPyObject<'py>: Sized {
    type Target;
    type Output: BoundObject<'py, Self::Target>;
    type Error: Into<PyErr>;

    fn into_pyobject(self, py: Python<'py>) -> Result<Self::Output, Self::Error>;
}

The output type of your #[pyfunction] must implement IntoPyObject.

IntoPyObject::into_pyobject

IntoPyObject::into_pyobject expects two arguments:

  • self: the Rust value you want to convert into a Python object.
  • Python<'py>: a GIL token that you can use to create new Python objects.

The conversion can fail, so the method returns a Result.
The output type itself is more complex, so let's break it down using an example.

Case study: a newtype

Let's look at a simple example: a newtype that wraps a u64. We want it to be represented as a "plain" integer in Python.

use std::convert::Infallible;
use pyo3::prelude::*;
use pyo3::types::PyInt;

struct MyType {
    value: u64,
}

impl<'py> IntoPyObject<'py> for MyType {
    /// `Target` is the **concrete** Python type we want to use
    /// to represent our Rust value.
    /// The underlying Rust type is a `u64`, so we'll convert it to a `PyInt`,
    /// a Python integer.
    type Target = PyInt;
    /// `Output`, instead, is a **wrapper** around the concrete type.
    /// It captures the ownership relationship between the Python object
    /// and the Python runtime.
    /// In this case, we're using a `Bound` smart pointer to a `PyInt`.
    /// The `'py` lifetime ensures that the Python object is owned
    /// by the Python runtime.
    type Output = Bound<'py, PyInt>;
    /// Since the conversion can fail, we need to specify an error type.
    /// We can't fail to convert a `u64` into a Python integer,
    /// so we'll use `Infallible` as the error type.
    type Error = Infallible;

    fn into_pyobject(self, py: Python<'py>) -> Result<Self::Output, Self::Error> {
        // `u64` already implements `IntoPyObject`, so we delegate
        // to its implementation to do the actual conversion.
        self.value.into_pyobject(py)
    }
}

The Output associated type

Let's focus on the Output associated type for a moment.
In almost all cases, you'll be setting Output to Bound<'py, Self::Target> [1]. You're creating a new Python object and its lifetime is tied to the Python runtime.

In a few cases, you might be able to rely on Borrowed<'a, 'py, Self::Target> instead. It's slightly faster [2], but it's limited to scenarios where you are borrowing from an existing Python object, which is fairly rare for an IntoPyObject implementation.

There are no other options for Output, since Output must implement the BoundObject trait, the trait is sealed and those two types are the only implementors within pyo3.
If it helps, think of Output as an enum with two variants: Bound and Borrowed.
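
If you haven't met sealed traits before, here is the general pattern in plain Rust. All names are invented for this sketch (ToyBoundObject, ToyBound, ToyBorrowed) and the code doesn't mirror pyo3's actual definitions. It only shows why nobody outside the defining crate can add a third implementor.

```rust
mod sealed {
    /// The trait is public, but the module it lives in is private:
    /// other crates can't name it, so they can't implement it.
    pub trait Sealed {}
}

/// Toy counterpart of `BoundObject`. Implementing it requires
/// implementing `sealed::Sealed` first, which only this crate can do.
pub trait ToyBoundObject: sealed::Sealed {
    fn kind(&self) -> &'static str;
}

pub struct ToyBound;
pub struct ToyBorrowed;

impl sealed::Sealed for ToyBound {}
impl sealed::Sealed for ToyBorrowed {}

impl ToyBoundObject for ToyBound {
    fn kind(&self) -> &'static str {
        "bound"
    }
}

impl ToyBoundObject for ToyBorrowed {
    fn kind(&self) -> &'static str {
        "borrowed"
    }
}
```

The set of implementors is closed, which is why it behaves much like an enum with two variants.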

Provided implementations

pyo3 provides out-of-the-box implementations of IntoPyObject for many Rust types, as well as for all Py* types. Check out its documentation for an exhaustive list.

[1] The actual syntax is a bit more complex: type Output = Bound<'py, <Self as IntoPyObject<'py>>::Target>;. We've simplified it for clarity.

[2] In addition to its documentation, you may find this issue useful to understand the trade-offs between &Bound and Borrowed.

Exercise

The exercise for this section is located in 01_intro/06_output

Exceptions

Python and Rust have different error handling mechanisms.
In Python, you raise exceptions to signal that something went wrong.
In Rust, errors are normal values that you return from functions, usually via the Result type.

pyo3 provides PyResult<T> to help you bridge the gap between these two worlds.

PyResult<T>

PyResult<T> is the type you'll return whenever your #[pyfunction] can fail.
It is a type alias for Result<T, PyErr>, where PyErr is pyo3's representation of a Python exception. pyo3 will automatically raise a Python exception whenever a #[pyfunction] returns an Err(PyErr) value:

use pyo3::prelude::*;
use pyo3::types::PyAny;

#[pyfunction]
fn print_if_number(item: Bound<'_, PyAny>) -> PyResult<()> {
    let number = item.extract::<u64>()?;
    println!("{}", number);
    Ok(())
}

In the example above, extract::<u64>()? returns a PyResult<u64>.
If the object is not an unsigned integer, extract will return an error, which will be propagated up to the caller via the ? operator. On the Python side, this error will be raised as a Python exception by pyo3.

Built-in exception types

You should be intentional about the types of exceptions you raise. What kind of error are you signaling? What is the caller expected to catch?

All built-in Python exceptions are available in pyo3::exceptions—e.g. pyo3::exceptions::PyValueError for a ValueError. You can use their new_err method to create an instance.

Panics

Rust provides another mechanism for handling "unrecoverable" errors: panics. What happens if you panic in a #[pyfunction]?
pyo3 will catch the panic and raise a pyo3_runtime.PanicException to the Python caller. You've probably seen this behaviour at play when solving the exercises associated with the previous sections.
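
pyo3 takes care of all this for you. If you're curious about the general technique of stopping a panic at a boundary and turning it into an ordinary error value, the standard library offers std::panic::catch_unwind. The sketch below is not pyo3's code, and call_guarded is a hypothetical helper written for this illustration.

```rust
use std::panic;

/// Runs `f` and converts a panic into an `Err` carrying the panic message.
/// (Hypothetical helper, written only for this sketch.)
fn call_guarded<T>(f: impl FnOnce() -> T + panic::UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually a `&str` or a `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}
```

The panic message is still printed to stderr by the default panic hook. Only the unwinding is stopped, and the caller gets back a regular Result.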

Exercise

The exercise for this section is located in 01_intro/07_exceptions

Wrapping up

We've covered most of pyo3's key concepts in this chapter.
Before moving on, let's go through one last exercise to consolidate what we've learned. You'll have minimal guidance this time—just the exercise description and the tests to guide you.

Exercise

The exercise for this section is located in 01_intro/08_outro

Classes

We've covered Python functions written in Rust, but what about classes?

Defining a class

You can use the #[pyclass] attribute to define a new Python class in Rust. Here's an example:

use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    balance: i32,
}

It defines a new Python class called Wallet with a single field, balance.

Registering a class

Just like with #[pyfunction]s, you must explicitly register your class with a module to make it visible to users of your extension.
Continuing with the example above, you'd register the Wallet class like this:

#[pymodule]
fn my_module(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_class::<Wallet>()?;
    Ok(())
}

IntoPyObject

Rust types that have been annotated with #[pyclass] automatically implement the IntoPyObject trait, thus allowing you to return them from your #[pyfunction]s.

For example, you can define a function that creates a new Wallet instance:

#[pyfunction]
fn new_wallet(balance: i32) -> Wallet {
    Wallet { balance }
}

It'll compile just fine, handing over a new Wallet instance to the Python caller.

Attributes

By default, the fields of your #[pyclass]-annotated structs aren't accessible to Python callers.
Going back to our Wallet example—if you try to access the balance field from Python, you'll get an error:

        wallet = new_wallet(0)
>       assert wallet.balance == 0
E       AttributeError: 'builtins.Wallet' object has no attribute 'balance'

tests/test_sample.py:8: AttributeError

The same error would occur even if you made balance a public field.

To make the field accessible to Python, you must add a getter.
This can be done using the #[pyo3(get)] attribute:

#[pyclass]
struct Wallet {
    #[pyo3(get)]
    balance: i32,
}

Now, the balance field is accessible from Python:

def test_wallet():
    wallet = new_wallet(0)
    assert wallet.balance == 0

If you want to allow Python callers to modify the field, you can add a setter using the #[pyo3(set)] attribute:

#[pyclass]
struct Wallet {
    // Both getter and setter
    #[pyo3(get, set)]
    balance: i32,
}

Exercise

The exercise for this section is located in 02_classes/00_pyclass

Constructors

In the previous section (and its exercise) we relied on a #[pyfunction] as the constructor for the #[pyclass] we defined. Without new_wallet, we wouldn't have been able to create new Wallet instances from Python.
Let's now explore how to define a constructor directly within the #[pyclass] itself.

Defining a constructor

You can add a constructor to your #[pyclass] using the #[new] attribute on a method. Here's an example:

use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }
}

A Rust method annotated with #[new] is equivalent to the __new__ method in Python. At the moment there is no way to define the __init__ method in Rust.
The impl block containing the constructor must also be annotated with the #[pymethods] attribute for #[new] to work as expected.

Signature

Everything we learned about arguments in the context of #[pyfunction]s applies to constructors as well.
In terms of output type, you can return Self if the constructor is infallible, or PyResult<Self> if it can fail.

Exercise

The exercise for this section is located in 02_classes/01_constructors

Methods

The #[pymethods] attribute is not limited to constructors. You can use it to attach any number of methods to your #[pyclass]:

use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    fn deposit(&mut self, amount: i32) {
        self.balance += amount;
    }

    fn withdraw(&mut self, amount: i32) {
        self.balance -= amount;
    }
}

All methods within an impl block annotated with #[pymethods] are automatically exposed to Python as methods on your #[pyclass] [1]. The deposit and withdraw methods in the example above can be called from Python like this:

wallet = Wallet(0)
wallet.deposit(100)
wallet.withdraw(50)
assert wallet.balance == 50

multiple-pymethods

You can't annotate multiple impl blocks with #[pymethods] for the same class, due to a limitation in Rust's metaprogramming capabilities.
There is a way to work around this issue using some linker dark magic, via the multiple-pymethods feature flag, but it comes with a penalty in terms of compile times as well as limited cross-platform support. Check out pyo3's documentation for more details.

Footnotes

[1] All methods in a #[pymethods] block are exposed, even if they are private!

Exercise

The exercise for this section is located in 02_classes/02_methods

Custom setters and getters

In a previous section, we learned how to attach the default getter and setter to a field in a #[pyclass]:

use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

This is convenient, but it's not always desirable!
Let's introduce an additional constraint to our Wallet struct: the balance should never go below a pre-determined overdraft threshold. We'd start by enforcing this constraint in the constructor method:

use pyo3::prelude::*;
use pyo3::exceptions::PyValueError;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

const OVERDRAFT_LIMIT: i32 = -100;

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> PyResult<Self> {
        if balance < OVERDRAFT_LIMIT {
            return Err(PyValueError::new_err("Balance cannot be below overdraft limit"));
        }
        Ok(Wallet { balance })
    }
}

Wallet::new ensures that a newly-created Wallet upholds the overdraft constraint. But the default setter can be easily used to circumvent the limit:

wallet = Wallet(0)
wallet.balance = -200 # This should not be allowed, but it is!

#[setter] and #[getter]

We can override the default getter and setter by defining custom methods for them.
Here's how we can implement a custom setter for the balance field via the #[setter] attribute:

#![allow(unused)]
fn main() {
use pyo3::prelude::*;
use pyo3::exceptions::PyValueError;

#[pyclass]
struct Wallet {
    // We keep using the default getter, no issues there
    #[pyo3(get)]
    balance: i32,
}

const OVERDRAFT_LIMIT: i32 = -100;

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> PyResult<Self> {
        Wallet::check_balance(balance)?;
        Ok(Wallet { balance })
    }

    #[setter]
    fn set_balance(&mut self, value: i32) -> PyResult<()> {
        // Reject the new value if it violates the overdraft constraint
        Wallet::check_balance(value)?;
        self.balance = value;
        Ok(())
    }
}

impl Wallet {
    // We put this method in a separate `impl` block to avoid exposing it to Python
    fn check_balance(balance: i32) -> PyResult<()> {
        if balance < OVERDRAFT_LIMIT {
            return Err(PyValueError::new_err("Balance cannot be below overdraft limit"));
        }
        Ok(())
    }
}
}

Every time the balance field is set in Python, Wallet::set_balance will be called:

wallet = Wallet(0)
wallet.balance = -200  # Now raises a `ValueError`

The field is associated with its setter using a conventional naming strategy for the setter method: set_<field_name>. You can also explicitly specify the field name in the #[setter] attribute, like this: #[setter(balance)].

Custom getters are defined in a similar way using the #[getter] attribute. The getter method can be named either <field_name> or get_<field_name> (the get_ prefix is stripped), but you can also specify the field name explicitly in the attribute, e.g. #[getter(balance)].
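For comparison, here is a pure-Python sketch of the behaviour we get from the custom setter. It is only a stand-in for the Rust-backed class, written with Python's built-in property mechanism. The `_balance` attribute is an implementation detail of this sketch and does not exist in the pyo3 version.

```python
OVERDRAFT_LIMIT = -100


class Wallet:
    """Pure-Python stand-in for the Rust-backed `Wallet` class."""

    def __init__(self, balance: int) -> None:
        # Route the initial value through the setter so it gets validated too
        self.balance = balance

    @property
    def balance(self) -> int:
        # Plays the role of the default getter, `#[pyo3(get)]`
        return self._balance

    @balance.setter
    def balance(self, value: int) -> None:
        # Plays the role of `Wallet::set_balance`
        if value < OVERDRAFT_LIMIT:
            raise ValueError("Balance cannot be below overdraft limit")
        self._balance = value


wallet = Wallet(0)
wallet.balance = -50  # Allowed: still above the overdraft limit
try:
    wallet.balance = -200  # Rejected by the setter
except ValueError as e:
    error_message = str(e)
```

In both versions the caller keeps using plain attribute syntax, while the validation logic runs behind the scenes.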

Exercise

The exercise for this section is located in 02_classes/03_setters

Static methods

All the class methods we've seen so far have been instance methods—i.e. they take an instance of the class as one of their arguments.
Python supports static methods as well. These methods don't take an instance of the class as an argument, but they are "attached" to the class itself.

The same concept exists in Rust:

#![allow(unused)]
fn main() {
pub struct Wallet {
    balance: i32,
}

impl Wallet {
    pub fn default() -> Self {
        Wallet { balance: 0 }
    }
}
}

Wallet::default is a static method since it doesn't take self or references to self as arguments.
You might then expect the following to define a Python static method on the Wallet class:

#![allow(unused)]
fn main() {
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    fn default() -> Self {
        Wallet { balance: 0 }
    }
}
}

However, this code will not compile.
To define a static method in Python, you need to explicitly mark it with the #[staticmethod] attribute:

#![allow(unused)]
fn main() {
use pyo3::prelude::*;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }
 
    // Notice the `#[staticmethod]` attribute here!
    #[staticmethod]
    fn default() -> Self {
        Wallet { balance: 0 }
    }
}
}

Class methods

Python also supports class methods. These methods take the class itself as an argument, rather than an instance of the class.
In Rust, you can define class methods by annotating them with #[classmethod] and taking cls: &Bound<'_, PyType> as the first argument:

#![allow(unused)]
fn main() {
use pyo3::prelude::*;
use pyo3::types::PyType;

#[pyclass]
struct Wallet {
    #[pyo3(get, set)]
    balance: i32,
}

#[pymethods]
impl Wallet {
    #[new]
    fn new(balance: i32) -> Self {
        Wallet { balance }
    }

    // Notice the `cls` argument here!
    #[classmethod]
    fn from_str(_cls: &Bound<'_, PyType>, balance: &str) -> PyResult<Self> {
        let balance = balance.parse::<i32>()?;
        Ok(Wallet { balance })
    }
}
}

Since you can directly refer to the class in a Rust static method (i.e. the Self type), you won't find yourself using class methods as often as you would in Python.
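As a refresher on the Python side of the story, the sketch below shows the difference between the two kinds of methods in pure Python. `SavingsWallet` is a hypothetical subclass introduced only for illustration. Note that the Rust `from_str` above ignores its `_cls` argument, so it would always build a `Wallet`, unlike the Python class method in this sketch.

```python
class Wallet:
    def __init__(self, balance: int) -> None:
        self.balance = balance

    @staticmethod
    def default() -> "Wallet":
        # No `self`, no `cls`: the class is referenced by name
        return Wallet(0)

    @classmethod
    def from_str(cls, balance: str) -> "Wallet":
        # `cls` is the class the method was called on,
        # which may be a subclass of `Wallet`
        return cls(int(balance))


# Hypothetical subclass, used to show what `cls` refers to
class SavingsWallet(Wallet):
    pass


assert Wallet.default().balance == 0
assert Wallet.from_str("42").balance == 42
# The class method builds an instance of the subclass...
assert type(SavingsWallet.from_str("42")) is SavingsWallet
# ...while the static method always builds a plain `Wallet`
assert type(SavingsWallet.default()) is Wallet
```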

Exercise

The exercise for this section is located in 02_classes/04_static_methods

Inheritance

Python, unlike Rust, supports inheritance.
Each class in Python can inherit attributes and methods from a parent class.

class Parent:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")
        
# Declare `Child` as a subclass of `Parent`
class Child(Parent):
    def __init__(self, name, age):
        # Call the parent class's constructor
        super().__init__(name)
        self.age = age
        
child = Child("Alice", 7)
# `Child` inherits the `greet` method from `Parent`, so we can call it
child.greet() # Prints "Hello, Alice!"

pyo3 and inheritance

pyo3 supports inheritance as well, via additional attributes on the #[pyclass] macro.
To understand how it works, let's try to translate the Python example above to Rust. We'll start with defining the base class, Parent:

#![allow(unused)]
fn main() {
use pyo3::prelude::*;

#[pyclass(subclass)]
struct Parent {
    name: String,
}

#[pymethods]
impl Parent {
    #[new]
    fn new(name: String) -> Self {
        Parent { name }
    }

    fn greet(&self) {
        println!("Hello, {}!", self.name);
    }
}
}

You can spot one new attribute in the #[pyclass] macro: subclass. This attribute tells pyo3 that this class can be subclassed, and it should generate the necessary machinery to support inheritance.

Now let's define the Child class, which inherits from Parent:

#![allow(unused)]
fn main() {
#[pyclass(extends=Parent)]
struct Child {
    age: u8,
}
}

We're using the extends attribute to specify that Child is a subclass of Parent.
Things get a bit more complicated when it comes to the constructor:

#![allow(unused)]
fn main() {
#[pymethods]
impl Child {
    #[new]
    fn new(name: String, age: u8) -> PyClassInitializer<Self> {
        let parent = Parent::new(name);
        let child = Self { age };
        PyClassInitializer::from(parent).add_subclass(child)
    }
}
}

Whenever you initialize a subclass, you need to make sure that the parent class is initialized first.
We start by calling Parent::new to create an instance of the parent class. We then initialize Child, via Self { age }. Finally, we use PyClassInitializer to return both the parent and child instances together.

Even though Child doesn't have a greet method on the Rust side, you'll be able to call it from Python since the generated Child class inherits it from Parent.

Nested inheritance

PyClassInitializer can be used to build arbitrarily deep inheritance hierarchies. For example, if Child had its own subclass, you could call add_subclass again to add yet another subclass to the chain.

#![allow(unused)]
fn main() {
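// Assumes `Child` is itself declared as subclassable,
// i.e. `#[pyclass(extends=Parent, subclass)]`
#[pyclass(extends=Child)]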
struct Grandchild {
    hobby: String,
}

#[pymethods]
impl Grandchild {
    #[new]
    fn new(name: String, age: u8, hobby: String) -> PyClassInitializer<Self> {
        let child = Child::new(name, age);
        let grandchild = Self { hobby };
        PyClassInitializer::from(child).add_subclass(grandchild)
    }
}
}

Limitations

pyo3 supports two kinds of superclasses:

  • A Python class defined in Rust, via #[pyclass]
  • A Python built-in class, like PyDict or PyList

It currently doesn't support using a custom Python class as the parent class for a class defined in Rust.

Exercise

The exercise for this section is located in 02_classes/05_inheritance

Parent class

Let's go back to our example from the previous section:

#![allow(unused)]
fn main() {
use pyo3::prelude::*;

#[pyclass(subclass)]
struct Parent {
    name: String,
}

#[pymethods]
impl Parent {
    #[new]
    fn new(name: String) -> Self {
        // [...]
    }

    fn greet(&self) {
        println!("Hello, {}!", self.name);
    }
}

#[pyclass(extends=Parent)]
struct Child {
    age: u8,
}

#[pymethods]
impl Child {
    #[new]
    fn new(name: String, age: u8) -> PyClassInitializer<Self> {
        // [...]
    }
}
}

Child.greet is not defined, therefore it falls back to the Parent.greet method.
What if we wanted to override it in Child?

Overriding methods

On the surface, it's simple: just define a method with the same name in the subclass.

#![allow(unused)]
fn main() {
#[pymethods]
impl Child {
    #[new]
    fn new(name: String, age: u8) -> PyClassInitializer<Self> {
        // [...]
    }
    
    fn greet(&self) {
        println!("Hi, I'm {} and I'm {} years old!", self.name, self.age);
    }
}
}

There's an issue though: self.name won't work because the Rust struct for Child doesn't have a name field. At the same time, the Python Child class does, because it inherits it from Parent.

How do we fix this?

as_super to the rescue

We need a way, in Rust, to access the fields and methods of the parent class from the child class.
This can be done using another one of pyo3's smart pointers: PyRef.

#![allow(unused)]
fn main() {
#[pymethods]
impl Child {
    // [...]
    
    fn greet(self_: PyRef<'_, Self>) {
        todo!()
    }
}
}

PyRef represents an immutable reference to the Python object.
It allows us, in particular, to call the as_super method, which returns a reference to the parent class.

#![allow(unused)]
fn main() {
#[pymethods]
impl Child {
    // [...]
    
    fn greet(self_: PyRef<'_, Self>) {
        // This is now a reference to a `Parent` instance!
        let parent = self_.as_super();
        println!("Hi, I'm {} and I'm {} years old!", parent.name, self_.age);
    }
}
}

Now we can access the name field from the parent class, and the age field from the child class.

PyRef and PyRefMut

PyRef is for immutable references, but what if we need to modify the parent class?
In that case, we can use PyRefMut, which is a mutable reference.

Exercise

The exercise for this section is located in 02_classes/06_parent

Wrapping up

There's a ton of little details and options when it comes to writing Python classes in Rust. We've covered the key concepts and most common use cases, but make sure to check out the official pyo3 documentation whenever you need more information about a specific feature (e.g. how do I declare a class to be frozen? How do I make my class iterable?).

Let's take a moment to reflect on what we've learned so far with one last exercise.

Exercise

The exercise for this section is located in 02_classes/07_outro

Concurrency

All our code so far has been designed for sequential execution, on both the Python and Rust side. It's time to spice things up a bit and explore concurrency1!

We won't dive straight into Rust this time.
We'll start by solving a few parallel processing problems in Python, to get a feel for Python's capabilities and limitations. Once we have a good grasp of what's possible there, we'll port our solutions over to Rust.

Multiprocessing

If you've ever tried to write parallel code in Python, you've probably come across the multiprocessing module. Before we dive into the details, let's take a step back and review the terminology we'll be using.

Processes

A process is an instance of a running program.
The precise anatomy of a process depends on the underlying operating system (e.g. Windows or Linux). Some characteristics are common across most operating systems, though. In particular, a process typically consists of:

  • The program's code
  • Its memory space, allocated by the operating system
  • A set of resources (file handles, sockets, etc.)

+------------------------+
|        Memory          |
|                        |
| +--------------------+ |
| |  Process A Space   | |  <-- Each process has a separate memory space.
| +--------------------+ |
|                        |
| +--------------------+ |
| |  Process B Space   | |
| |                    | |
| +--------------------+ |
|                        |
| +--------------------+ |
| |  Process C Space   | |
| +--------------------+ |
+------------------------+

There can be multiple processes running the same program, each with its own memory space and resources, fully isolated from one another.
The operating system's scheduler is in charge of deciding which process to run at any given time, partitioning CPU time among them to maximize throughput and/or responsiveness.

The multiprocessing module

Python's multiprocessing module allows us to spawn new processes, each running its own Python interpreter.

A process is created by invoking the Process constructor with a target function to execute as well as any arguments that function might need. The process is launched by calling its start method, and we can wait for it to finish by calling join.

If we want to communicate between processes, we can use Queue objects, which are shared between processes. These queues try to abstract away the complexities of inter-process communication, allowing us to pass messages between our processes in a relatively straightforward manner.
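The three building blocks described above (Process, start/join and Queue) can be sketched as follows. The task is deliberately trivial, and the names `square`, `square_task` and `parallel_squares` are made up for this illustration.

```python
from multiprocessing import Process, Queue


def square(n: int) -> int:
    return n * n


def square_task(n: int, results: Queue) -> None:
    # Runs in the child process: send the result back to the parent
    results.put((n, square(n)))


def parallel_squares(numbers: list[int]) -> dict[int, int]:
    results = Queue()
    processes = []
    for n in numbers:
        # One child process per input number
        p = Process(target=square_task, args=(n, results))
        p.start()
        processes.append(p)
    # Collect one result per child, then wait for the children to exit
    squares = dict(results.get() for _ in processes)
    for p in processes:
        p.join()
    return squares
```

When calling `parallel_squares` from a script, it's a good idea to do so under an `if __name__ == "__main__":` guard: depending on the platform, child processes may re-import the main module, and the guard keeps them from spawning children of their own.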

References:

1

We'll limit our exploration to threads and processes, without venturing into the realm of async/await.

Exercise

The exercise for this section is located in 03_concurrency/00_introduction

Threads

The overhead of multiprocessing

Let's have a look at the solution for the previous exercise:

from multiprocessing import Process, Queue

def word_count(text: str, n_processes: int) -> int:
    result_queue = Queue()
    processes = []
    for chunk in split_into_chunks(text, n_processes):
        p = Process(target=word_count_task, args=(chunk, result_queue))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    results = [result_queue.get() for _ in range(len(processes))]
    return sum(results)

Let's focus, in particular, on process creation:

p = Process(target=word_count_task, args=(chunk, result_queue))

The parent process (the one executing word_count) doesn't share memory with the child process (the one spawned via p.start()). As a result, the child process can't access chunk or result_queue directly. Instead, it needs to be provided a deep copy of these objects1.
That's not a major issue if the data is small, but it can become a problem on larger datasets.
For example, if we're working with 8 GB of text, we'll end up with at least 16 GB of memory usage: 8 GB for the parent process and 8 GB split among the child processes. Not ideal!
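We can observe the copy without spawning any process. The sketch below assumes the arguments travel via pickle, as described in the first footnote of this section, and simply performs the same round-trip in a single process.

```python
import pickle

chunk = ["some", "words", "to", "count"]

# What happens to each argument, minus the pipe:
# serialize on the parent side, deserialize on the child side
received = pickle.loads(pickle.dumps(chunk))

# Same contents, but a distinct object with its own memory
same_contents = received == chunk
same_object = received is chunk

# Mutating the copy leaves the original untouched
received.append("extra")
```

Here `same_contents` is true while `same_object` is false: the "child" works on its own copy of the data, which is where the extra memory goes.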

We could try to circumvent this issue2, but that's not always possible nor easy to do.
A more straightforward solution is to use threads instead of processes.

Threads

A thread is an execution context within a process.
Threads share the same memory space and resources as the process that spawned them, thus allowing them to communicate and share data with one another more easily than processes can.

+------------------------+
|        Memory          |
|                        |
| +--------------------+ |
| |  Process A Space   | |  <-- Each process has its own memory space.
| |  +-------------+   | |      Threads share the same memory space
| |  | Thread 1    |   | |      of the process that spawned them.
| |  | Thread 2    |   | |
| |  | Thread 3    |   | |
| |  +-------------+   | |
| +--------------------+ |
|                        |
| +--------------------+ |
| |  Process B Space   | |
| |  +-------------+   | |
| |  | Thread 1    |   | |
| |  | Thread 2    |   | |
| |  +-------------+   | |
| +--------------------+ |
+------------------------+

Threads, just like processes, are operating system constructs.
The operating system's scheduler is in charge of deciding which thread to run at any given time, partitioning CPU time among them.

The threading module

Python's threading module provides a high-level interface for working with threads.
The API of the Thread class, in particular, mirrors what you already know from the Process class:

  • A thread is created by calling the Thread constructor and passing it a target function to execute as well as any arguments that function might need.
  • The thread is launched by calling its start method, and we can wait for it to finish by calling join.
  • If we want to communicate between threads, we can use Queue objects, from the queue module, which are shared between threads.
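The three points above can be sketched as follows. The task (summing the lengths of a few words) and the names `length_task` and `total_length` are made up for this illustration.

```python
from queue import Queue
from threading import Thread


def length_task(word: str, results: Queue) -> None:
    # Runs in the spawned thread; `results` is the very same
    # object the parent thread created, no copy involved
    results.put(len(word))


def total_length(words: list[str]) -> int:
    results = Queue()
    threads = []
    for word in words:
        # One thread per word
        t = Thread(target=length_task, args=(word, results))
        t.start()
        threads.append(t)
    # Wait for all threads to finish before collecting the results
    for t in threads:
        t.join()
    return sum(results.get() for _ in threads)


total = total_length(["rust", "python", "gil"])
```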

References:

1

To be more precise, the multiprocessing module uses the pickle module to serialize the objects that must be passed as arguments to the child process. The serialized data is then sent to the child process, as a byte stream, over an operating system pipe. On the other side of the pipe, the child process deserializes the byte stream back into Python objects using pickle and passes them to the target function.
This whole system has a higher overhead than a "simple" deep copy.

2

Common workarounds include memory-mapped files and shared-memory objects, but these can be quite difficult to work with. They also suffer from portability issues, as they rely on OS-specific features.

Exercise

The exercise for this section is located in 03_concurrency/01_python_threads

The GIL problem

Concurrent, yes, but not parallel

On the surface, our thread-based solution addresses all the issues we identified in the multiprocessing module:

from threading import Thread
from queue import Queue

def word_count(text: str, n_threads: int) -> int:
    result_queue = Queue()
    threads = []

    for chunk in split_into_chunks(text, n_threads):
        t = Thread(target=word_count_task, args=(chunk, result_queue))
        t.start()
        threads.append(t)

    for t in threads:
        t.join()

    results = [result_queue.get() for _ in range(len(threads))]
    return sum(results)

When a thread is created, we are no longer cloning the text chunk nor incurring the overhead of inter-process communication:

t = Thread(target=word_count_task, args=(chunk, result_queue))

Since the spawned threads share the same memory space as the parent thread, they can access the chunk and result_queue directly.

Nonetheless, there's a major issue with this code: it won't actually use multiple CPU cores.
It will run sequentially, even if we pass n_threads > 1 and multiple CPU cores are available.

Python concurrency

You guessed it: the infamous Global Interpreter Lock (GIL) is to blame. As we discussed in the GIL chapter, Python's GIL prevents multiple threads from executing Python code simultaneously1.

As a result, thread-based parallelism has historically seen limited use in Python, as it doesn't provide the performance benefits one might expect from a multithreaded application.

That's why the multiprocessing module is so popular: it allows Python developers to bypass the GIL. Each process has its own Python interpreter, and thus its own GIL. The operating system schedules these processes independently, allowing them to run in parallel on multicore CPUs.

But, as we've seen, multiprocessing comes with its own set of challenges.

Native extensions

There's a third way to achieve parallelism in Python: native extensions.
We must be holding the GIL when we invoke a Rust function from Python, but pure Rust threads are not affected by the GIL, as long as they don't need to interact with Python objects.

Let's rewrite our word_count function once more, this time in Rust!

1

This is the current state of Python's concurrency model. There are some exciting changes on the horizon, though! CPython's free-threading mode is an experimental feature that aims to remove the GIL entirely. It would allow multiple threads to execute Python code simultaneously, without forcing developers to rely on multiprocessing. We won't cover the new free-threading mode in this course, but it's worth keeping an eye on it as it matures out of the experimental phase.

Exercise

The exercise for this section is located in 03_concurrency/02_gil

Releasing the GIL

What happens to our Python code when it calls a Rust function?
It waits for the Rust function to return:

 Time -->

          +------------+--------------------+------------+--------------------+
 Python:  |  Execute   | Call Rust Function |    Idle    |  Resume Execution  |
          +------------+--------------------+------------+--------------------+
                                 │                                ▲
                                 ▼                                │
          +------------+--------------------+------------+--------------------+
 Rust:    |    Idle    |       Idle         |  Execute   |  Return to Python  |
          +------------+--------------------+------------+--------------------+

The schema doesn't change even if the Rust function is multithreaded:

 Time -->

          +------------+--------------------+-------------------+--------------------+
 Python:  |  Execute   | Call Rust Function |       Idle        |  Resume Execution  |
          +------------+--------------------+-------------------+--------------------+
                                 │                                        ▲
                                 ▼                                        │
          +------------+--------------------+-------------------+--------------------+
 Rust:    |    Idle    |       Idle         | Execute Thread 1  |  Return to Python  |
          |            |                    | Execute Thread 2  |                    |
          +------------+--------------------+-------------------+--------------------+

This raises the question: can we have Python and Rust code running concurrently?
Yes! The focal point, once again, is the GIL.

Python access must be serialized

The GIL's job is to serialize all interactions with Python objects.
On the pyo3 side, this is modeled by the Python<'py> token: you can only get an instance of Python<'py> if you're holding the GIL. Going further, you can only interact with Python objects via smart pointers like Bound<'py, T> or Borrowed<'a, 'py, T>, which internally hold a Python<'py> instance.
There's no way around it: any interaction with Python objects must be serialized. But, here's the kicker: not all Rust code needs to interact with Python objects!

Python::allow_threads

For example, consider a Rust function that calculates the nth Fibonacci number:

#![allow(unused)]
fn main() {
#[pyfunction]
fn fibonacci(n: u64) -> u64 {
    let mut a = 0;
    let mut b = 1;
    for _ in 0..n {
        let tmp = a;
        a = b;
        b = tmp + b;
    }
    a
}
}

There's no Python object in sight! We're just offloading a computation to Rust.
In principle, we could spawn a thread to run this function while the main thread continues executing Python code:

from threading import Thread

def other_work():
    print("I'm doing other work!")

t = Thread(target=fibonacci, args=(10,))
t.start()
other_work()
t.join()

As it stands, other_work and fibonacci will not be run in parallel: our fibonacci routine is still holding the GIL, even though it doesn't need it.
We can fix it by explicitly releasing the GIL:

#![allow(unused)]
fn main() {
#[pyfunction]
fn fibonacci(py: Python<'_>, n: u64) -> u64 {
    py.allow_threads(|| {
        let mut a = 0;
        let mut b = 1;
        for _ in 0..n {
            let tmp = a;
            a = b;
            b = tmp + b;
        }
        a
    })
}
}

Python::allow_threads releases the GIL while executing the closure passed to it.
This frees up the Python interpreter to run other Python code, such as the other_work function in our example, while the Rust thread is busy calculating the nth Fibonacci number.

Using the same line diagram as before, we have the following:

 Time -->

          +------------+--------------------+-------------------+--------------------+
 Python:  |  Execute   | Call Rust Function |    other_work()   |      t.join()      |
          +------------+--------------------+-------------------+--------------------+
                                 │                                        ▲
                                 ▼                                        │
          +------------+--------------------+-------------------+--------------------+
 Rust:    |    Idle    |       Idle         |    fibonacci(n)   |  Return to Python  |
          +------------+--------------------+-------------------+--------------------+
                                                     ▲
                                                     │
                                            Python and Rust code
                                          running concurrently here
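We can't run the Rust extension here, but the effect on the Python side can be approximated with a stand-in. In the sketch below, `time.sleep` (a C function that releases the GIL while it waits) plays the role of the GIL-releasing `fibonacci`. The `events` list and the function names are made up for this illustration.

```python
import time
from threading import Thread

events = []


def gil_releasing_call():
    # Stand-in for the Rust `fibonacci` wrapped in `allow_threads`:
    # `time.sleep` releases the GIL while it waits
    time.sleep(0.2)
    events.append("fibonacci done")


def other_work():
    events.append("other work done")


t = Thread(target=gil_releasing_call)
t.start()
other_work()  # Runs while the other thread is still "computing"
t.join()
```

The main thread records its event first, even though the other thread was started earlier: it was free to make progress while the stand-in call was busy.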

Ungil

Python::allow_threads is only sound if the closure doesn't interact with Python objects.
If that's not the case, we end up with undefined behavior: Rust code touches Python objects while the Python interpreter runs other Python code under the assumption that, thanks to the GIL, nothing else is happening to those objects. A recipe for disaster!

It'd be ideal to rely on the type system to enforce this constraint for us at compile-time, in true Rust fashion—"if it compiles, it's safe."
pyo3 tries to follow this principle with the Ungil marker trait: only types that are safe to access without the GIL can implement Ungil. The trait is then used to constrain the arguments of Python::allow_threads:

#![allow(unused)]
fn main() {
pub fn allow_threads<T, F>(self, f: F) -> T
where
    F: Ungil + FnOnce() -> T,
    T: Ungil,
{
    // ...
}
}

Unfortunately, Ungil is not perfect. On stable Rust, it leans on the Send trait, but that allows for some unsafe interactions with Python objects. The tracking is more precise on nightly Rust1, but it doesn't catch every possible misuse of Python::allow_threads.

My recommendation: if you're using Python::allow_threads, trigger an additional run of your CI pipeline using the nightly Rust compiler to catch more issues. On top of that, review your code carefully.

Exercise

The exercise for this section is located in 03_concurrency/03_releasing_the_gil

Minimize GIL locking

All our examples so far fall into two categories:

  • The Rust function holds the GIL for the entire duration of its execution.
  • The Rust function doesn't hold the GIL at all, going straight into Python::allow_threads mode.

Real-world applications are often more nuanced, though.
You'll need to hold the GIL for some operations (e.g. passing data back to Python), but you're able to release it for others (e.g. long-running computations).

The goal is to minimize the time spent holding the GIL to the bare minimum, thus maximizing the potential parallelism of your application.

Strategy 1: isolate the GIL-free section

Let's look at an example: we're given a list of numbers and we need to modify it in place, replacing each number with the result of an expensive computation that uses no Python objects.

To minimize GIL locking, we create a Rust vector from the Python list, release the GIL while we perform the computation, and then re-acquire the GIL to update the Python list in place:

#![allow(unused)]
fn main() {
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>
) -> PyResult<()> {
    // Holding the GIL
    let v: Vec<i32> = numbers.extract()?;
    let updated_v: Vec<_> = python.allow_threads(|| {
        v.iter().map(|&n| expensive_computation(n)).collect()
    });
    // Back to holding the GIL
    for (i, &n) in updated_v.iter().enumerate() {
        numbers.set_item(i, n)?;
    }
    Ok(())
}

fn expensive_computation(n: i32) -> i32 {
    // Some heavy number crunching
    // [...]
}
}

Strategy 2: manually re-acquire the GIL inside the closure

In the example above, we've created a whole new vector to decouple the GIL-free section from the GIL-holding one. If the input data is large, this can be a significant overhead.

Let's explore a different approach: we won't create a new pure-Rust vector. Instead, we will re-acquire the GIL inside the closure—we'll hold it to access each list element and, after the computation is done, update it in place. Nothing more.

Assuming you know nothing about Ungil, the naive solution might look like this:

#![allow(unused)]
fn main() {
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>
) -> PyResult<()> {
    python.allow_threads(|| -> PyResult<()> {
        let n_numbers = numbers.len();
        for i in 0..n_numbers {
            let n = numbers.get_item(i)?.extract::<i32>()?;
            let result = expensive_computation(n);
            numbers.set_item(i, result)?;
        }
        Ok(())
    })
}
}

It won't compile, though. We're using a GIL-bound object (numbers) in a GIL-free section (inside python.allow_threads). We need to unbind it first.

Py<T> and Bound<'py, T>

Using Bound<'py, T>::unbind we get a Py<T> object back. It has no 'py lifetime: it's no longer bound to the GIL. We can try to use it in the GIL-free section:

#![allow(unused)]
fn main() {
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>
) -> PyResult<()> {
    let numbers = numbers.unbind();
    python.allow_threads(|| -> PyResult<()> {
        let n_numbers = numbers.len();
        for i in 0..n_numbers {
            let n = numbers.get_item(i)?.extract::<i32>()?;
            let result = expensive_computation(n);
            numbers.set_item(i, result)?;
        }
        Ok(())
    })
}
}

But it won't compile either. numbers.len(), numbers.get_item(i), and numbers.set_item(i, result) all require the GIL. Py<T> is just a pointer to a Python object: it won't let us access the object unless we're holding the GIL.

We need to re-bind it using a Python<'py> token, thus getting a Bound<'py, PyList> back. How do we get a Python<'py> token inside the closure, though? With Python::with_gil. It's the opposite of Python::allow_threads: it acquires the GIL before executing the closure and releases it afterwards. The closure is given a Python token as argument, which we can use to re-bind the PyList object:

#![allow(unused)]
fn main() {
#[pyfunction]
fn update_in_place<'py>(
    python: Python<'py>,
    numbers: Bound<'py, PyList>
) -> PyResult<()> {
    let n_numbers = numbers.len();
    let numbers_ref = numbers.unbind();
    // Release the GIL
    python.allow_threads(|| -> PyResult<()> {
        for i in 0..n_numbers {
            // Acquire the GIL again, to access the
            // i-th element of the list
            let n = Python::with_gil(|inner_py| {
                numbers_ref
                    .bind(inner_py)
                    .get_item(i)?
                    .extract::<i32>()
            })?;
            // Run the computation without holding the GIL
            let result = expensive_computation(n);
            // Re-acquire the GIL to update the list in place
            Python::with_gil(|inner_py| {
                numbers_ref.bind(inner_py).set_item(i, result)
            })?;
        }
        Ok(())
    })
}
}

Be mindful of concurrency

The GIL is there for a reason: to protect Python objects from concurrent access.
Whenever you release the GIL, you're allowing other threads to run and potentially modify the Python objects you're working with.

In the examples above, another Python thread could modify the numbers list while we're computing the result. E.g. it could remove an element, causing the index i to be out of bounds.

This is a common issue in multi-threaded programming, and it's up to you to handle it.
Consider using synchronization primitives like Lock to serialize access to the Python objects you're working with. In other words, move towards fine-grained locking rather than the lock-the-world approach you get with the GIL.
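Here's a minimal sketch of that idea on the Python side, using `threading.Lock`. One thread shrinks a shared list while others update its last element. The names `numbers_lock`, `pop_all` and `double_last` are made up for this illustration.

```python
from threading import Lock, Thread

numbers = list(range(1000))
numbers_lock = Lock()


def pop_all():
    # Removes elements one at a time; each removal holds the lock
    while True:
        with numbers_lock:
            if not numbers:
                return
            numbers.pop()


def double_last():
    # The emptiness check and the indexed write happen under the same
    # lock, so the list can't shrink in between
    with numbers_lock:
        if numbers:
            numbers[len(numbers) - 1] *= 2


threads = [Thread(target=pop_all)] + [Thread(target=double_last) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock only helps if every thread that touches the list acquires it first. That includes any Rust code that re-binds the list after releasing the GIL.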

Exercise

The exercise for this section is located in 03_concurrency/04_minimize_gil_locking

Immutable types

Concurrency introduces many new classes of bugs that are not present in single-threaded programs. Data races are one of the most common: two threads access the same memory location at the same time, without synchronization, and at least one of them is writing to it. What should happen?
In many programming languages the behavior is undefined: the program could crash, or it could silently produce incorrect results.

Data races can't happen in a single-threaded program, because only one thread can access the memory at a time. That's where the GIL comes in: since it serializes the execution of code that accesses Python objects, it prevents data races on those objects (albeit with a significant performance cost).

There's another way to prevent data races though: by making sure that the data is immutable. There's no need for synchronization if the data can't change!

Built-in immutable types

Python has many immutable types, e.g. int, float, str.
Whenever you appear to modify one of them, you're actually creating a new object, not changing the existing one.

a = 1
b = a
a += 1

assert a == 2
# a is a new object,
# b is still 1
assert b == 1

Since they're immutable, they're considered thread-safe: you can access them from multiple threads without worrying about data races and synchronization.
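
Here's a small sketch of that property, using names we made up for illustration (GREETING, PRIMES, describe). Several threads read the same immutable objects without any lock, and each "modification" produces a new object that is local to the thread:

```python
from concurrent.futures import ThreadPoolExecutor

# Immutable objects, shared by every worker without any lock
GREETING = "hello"
PRIMES = (2, 3, 5, 7, 11)


def describe(worker_id):
    # Concatenation builds a brand new string, owned by this thread.
    # GREETING itself is never changed.
    message = GREETING + f" from worker {worker_id}"
    return message, sum(PRIMES)


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(describe, range(4)))
```

One caveat: a tuple is immutable, but the objects it contains may not be. A tuple of lists is only as thread-safe as the lists inside it.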

Frozen dataclasses

You can define your own immutable types in Python using dataclasses and the frozen parameter.

from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
# This will raise a `FrozenInstanceError` exception
p.x = 3

Setting frozen=True makes instances of the class immutable: you can't reassign their attributes after creation. This goes beyond modifying the values of the existing attributes. You are also forbidden from adding new attributes, e.g.:

# This will raise a `FrozenInstanceError` exception
# But would work if `frozen=False` or for a "normal"
# class without the `@dataclass` decorator
p.z = 3
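
If you need a "modified" point, the usual approach is to build a new instance rather than mutate the existing one. The sketch below redefines Point so that it runs on its own, catches the exception, and uses dataclasses.replace from the standard library to derive a new instance. The variable names (error, moved) are ours:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)

# Assignment is rejected, and `p` is left untouched
try:
    p.x = 3
    error = None
except FrozenInstanceError as e:
    error = e

# `replace` creates a new instance, copying the fields we don't override
moved = replace(p, x=3)
```

The original p still holds (1, 2) after this runs, while moved is a separate object with x set to 3.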

In Rust

Let's see how we can define a similar immutable type in Rust.

#![allow(unused)]
fn main() {
use pyo3::prelude::*;

#[pyclass(frozen)]
struct Point {
    x: i32,
    y: i32,
}
}

The above is not enough to get all the niceties of Python's dataclasses, but it's sufficient to make the class immutable.
If a pyclass is marked as frozen, pyo3 will allow us to access its fields without holding the GIL, i.e. via Py<T> instead of Bound<'py, T>.

#![allow(unused)]
fn main() {
#[pyfunction]
fn print_point<'py>(python: Python<'py>, point: Bound<'py, Point>) {
    let point: Py<Point> = point.unbind();
    python.allow_threads(|| {
        // We can now access the fields of the Point struct
        // even though we are not holding the GIL
        let point: &Point = point.get();
        println!("({}, {})", point.x, point.y);
    });
}
}

This wouldn't compile if Point weren't marked as frozen, thanks to the bounds in the signature of Py<T>::get:

#![allow(unused)]
fn main() {
impl<T> Py<T>
where
    T: PyClass,
{
    pub fn get(&self) -> &T
    where
        // `Frozen = True` is where the magic happens!
        T: PyClass<Frozen = True> + Sync,
    { /* ... */ }
}
}

Summary

Immutable types significantly simplify GIL juggling in pyo3. If they fit the constraints of the problem you're solving, consider using them to make your code easier to reason about (and potentially faster!).

Exercise

The exercise for this section is located in 03_concurrency/05_immutable_types

Wrapping up

You should now have a strong theoretical foundation to reason about your options whenever you have a Python problem that could benefit from a concurrent solution.

That's only the beginning, though. To truly master concurrent programming, you need to practice it!
This wasn't the right venue to cover all the possible concurrency patterns you can express with Rust. Luckily enough, there are plenty of resources available to help you on that side! We recommend, in particular, the "Threads" chapter of our own Rust course. It follows the same hands-on approach we've been using in this book, and it's a great way to get more practice with Rust's concurrency primitives.

Take the final exercise as a capstone project: you'll have to design a non-trivial algorithm and piece together various concurrency primitives to implement a solution that's both correct and efficient. Don't shy away from the challenge: embrace it, it's the best way to learn!

Beyond performance

A note, in closing: writing correct concurrent code is tricky.
We highlighted Rust, in this chapter, as a way to circumvent the limitations of Python's GIL and ultimately improve the performance of your code. But that's only half of the story.
Rust's type system and ownership model make it much easier to write concurrent code that's correct, too. As you delve deeper into the world of concurrent programming, you'll come to appreciate the real value of Rust's Send and Sync traits, which we've only briefly touched upon when discussing data races.

As the saying goes: people come to Rust for the performance, but they stay for its safety and correctness guarantees.

Exercise

The exercise for this section is located in 03_concurrency/06_outro