Weekly Report 2021, July 26 - August 1

This week I tried to drop the open pull request count below 1,400. This was a grueling task and I barely succeeded.

I managed to almost double the rate of closed issues and PRs compared to the average of my first two weeks. I closed 27 issues and 94 PRs, authored 2 PRs of my own, and reviewed 8 PRs. This isn’t a pace I can keep up every week, but this week I stubbornly wanted to bring the open PR count down into the 1,300s. At the time of writing on Saturday morning, it is barely there.

In all likelihood, by the time you read this it will have jumped back up into the 1,400s. It’s still 30+ fewer than we had on Monday, so I’m happy with the progress. Obviously, I’m not the only person reviewing and closing PRs, so I don’t intend to take either the whole credit or the whole responsibility for this. But it goes to show that we could easily use more people doing what I’m doing now full time.

Highlights

This week was the last week before Python 3.10 goes into the “release candidate” phase. Pablo will be releasing RC1 on Monday. This is a big deal because ideally we’d like release candidates to be identical to the final release. That means we will now become very conservative in terms of what code changes are allowed. In particular, the Python Developer’s Guide has this to say about the RC stage:

A branch preparing for an RC release can only have bugfixes applied that have been reviewed by other core developers. Generally, these issues must be severe enough (e.g. crashes) that they deserve fixing before the final release. All other issues should be deferred to the next development cycle, since stability is the strongest concern at this point.

While the goal is to have no code changes between a RC and a final release, there may be a need for final documentation or test fixes. Any such proposed changes should be discussed first with the release manager.

You cannot skip the peer review during an RC, no matter how small! Even if it is a simple copy-and-paste change, everything requires peer review from a core developer.

Naturally then, there’s been some frantic last-minute bugfixing activity, and this sometimes leads to subtle issues like…

Reference Leaks

CPython’s main memory management is through reference counting. As a Python user you rarely have to think about it, but when writing C code with the C API, we have to manually put Py_INCREF and Py_DECREF where appropriate [1]. A failure to do so correctly results in either:

  • crashes when references were insufficiently increased (those are often annoying to debug as the crash often happens on the next use of the same PyMalloc-allocated memory which might be quite some time later); or
  • reference leaks when references were insufficiently decreased.

The second in particular causes Python to balloon in memory usage, which at best uses your resources inefficiently, and at worst leads to MemoryErrors or even your operating system killing the process for asking for more memory than is available.
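From the Python side, the bookkeeping that Py_INCREF and Py_DECREF do by hand in C can be observed with sys.getrefcount(). The sketch below is only an illustration (the function name refcount_steps is made up for this post). It reports changes relative to a baseline, because the absolute number includes temporary references and is an implementation detail.

```python
import sys


def refcount_steps():
    """Return refcount changes of one object relative to its starting count."""
    obj = object()
    base = sys.getrefcount(obj)          # absolute value is an implementation detail

    alias = obj                          # one more name bound: +1 reference
    with_alias = sys.getrefcount(obj) - base

    container = [obj, obj]               # the list stores two more references
    with_list = sys.getrefcount(obj) - base

    del alias, container                 # dropping them gives the references back
    after_cleanup = sys.getrefcount(obj) - base

    return with_alias, with_list, after_cleanup


if __name__ == "__main__":
    print(refcount_steps())
```

In C, forgetting the equivalent of the final del is what produces a reference leak, and releasing a reference you never owned is what produces the crash.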

We have a way to deal with this! The regression test suite can be run with the -R argument, which catches reference leaks. How? By running a given test multiple times and comparing the reference counts [2]. How many times? Well, by default it’s 9 repetitions. Yes, 9. You see, first off the test needs to be “warmed up”, i.e. run a few times to stabilize the refcounts. This is because of global state, caches, and so on. By default we run the test five times to warm it up. Then, we run the test an additional four times, checking each time for reference leak deltas.

This is very effective in catching issues but is unfortunately very slow, which is why we can’t do it on every pull request automatically. In fact, we can’t even do it after every commit since the rate of commits is higher than the processing power available on our buildbots. Instead, we run refleak workers periodically.

An example failure looks like this:

I was recently reminded that we can opt into running tests on buildbots by putting a 🔨 test-with-buildbots label on a given pull request. I started using it this week with good success.

I have one more interesting thing I discovered related to reference leaks but I’ll leave that for next week when I’m done investigating. [3]

Undefined Behavior

I always found it kinda weird that C compilers, and in fact even C standards, explicitly list code that while allowed, results in “undefined behavior” which is a euphemism for “this might sometimes crash”.

This week we dealt with a few interesting cases, like complex_pow or list_concat. It turns out Clang is getting increasingly thorough at finding and complaining about undefined behavior. You can opt into this when building CPython by using:

$ ./configure --with-undefined-behavior-sanitizer

What I find extremely interesting is how those issues are usually found not at compile time but rather when running the program. Clang compiles a small diagnostic runtime into your program that allows it to check for undefined behavior being exercised. Example output can be found in the issues I linked to above.

In fact, we were so impressed by this that Pablo added a stable UBSAN buildbot that is Clang-based to our fleet to ensure we don’t regress in this manner in the future.

When optional modules aren’t

When you build CPython, some modules might require additional libraries on the system that you don’t have. This is normal; in fact, it’s impossible to build Python with every module because some are OS-specific. For example, on my Mac I will never be able to build ossaudiodev or spwd.

However, sometimes an optional module silently fails to build on a buildbot where you expect it should be able to build just fine. That means we’re blind to bugs in this module because tests for it will be skipped and that build will appear happily green.

So, this week Pablo added a way for us to loudly crash the build if an expected module failed to build. Hilariously, as soon as it landed on the 3.10 branch it found a problem: we were skipping _tkinter builds on macOS on this branch due to an invalid configure file.
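To make the idea concrete, here is a minimal sketch of a “fail loudly” check. It is not the mechanism Pablo added (that one lives in the build itself); it only models the principle by trying to import a list of modules a worker is expected to have. The function names and the expectation list are hypothetical.

```python
import importlib


def missing_modules(expected):
    """Return the names in expected that cannot be imported in this interpreter."""
    missing = []
    for name in expected:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def check_expected_modules(expected):
    """Abort loudly instead of letting the affected tests be skipped quietly."""
    missing = missing_modules(expected)
    if missing:
        raise SystemExit("expected modules failed to build: " + ", ".join(missing))


if __name__ == "__main__":
    # Hypothetical per-worker expectation list; json is only a placeholder here.
    check_expected_modules(["json"])
```

The important design choice is that the expectation is explicit per worker: a missing ossaudiodev on a Mac is fine, a missing _tkinter on a worker that should have it is a red build.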

Plans for next week

I won’t be break-neck racing against the PR rate anymore but will continue working on the visibility of what is going on. We need a more methodical way of tracking pull request activity. Providing this is in fact written into my contract and I already started on it. I intend to present something here in the first week of August.

Pablo also suggested a new kind of buildbot worker that would provide “quick builds” that only run the tests relevant to the code changes. Those builds would then be quick enough to run reference leak tests on open PRs. For this to work, though, we will have to map C files to the test files in Lib/test/ that exercise those C files. I’ll be looking into that next week.

Detailed Report

Monday

Issues:

PRs:

Tuesday

I spoke about my favorite new features of Python 3.10 at Forkwell’s Global Dev Study #5 targeted at Japanese Python users.

PRs:

Wednesday

Issues:

PRs:

Thursday

Issues:

PRs:

Friday

Issues:

PRs:


  1. This is why I highly recommend using Cython instead in your user code. It even allows you to automatically maintain C-level memory with the help of the very useful cymem library. 

  2. It does it with sys.gettotalrefcount() after the fact, so it requires a pydebug build of the interpreter.

  3. Spoiler alert: if you’re on a Mac, try to build CPython with pydebug, then run ./python.exe -m test -R: test_logging and see what happens. 
