PEP 563 and PEP 649

As the author of PEP 563, I can summarize my position as follows: had PEP 563 never been implemented, it would be easy to accept PEP 649. However, in the current situation it’s not that clear because I find it very important to allow writing a single codebase that works on Python 3.7 - 3.11. If we can secure this behavior, I’m +1 to accepting PEP 649. If not, I have an alternative idea.

First things first: string annotations are valid annotations

PEP 563 is often criticized for making the contents of __annotations__ strings. The idea of a string annotation doesn’t come from PEP 563 though, it comes right from PEP 484 where the forward reference problem was acknowledged. So, in this sense, string annotations might appear in user code and libraries processing annotations have to deal with them anyway. This is why typing.get_type_hints() was introduced in the first place, long before PEP 563. AFAICT this functionality is not going away.
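For illustration, here’s what such a PEP 484 forward reference looks like in practice (the HTTPClient name is made up for this example):

from typing import get_type_hints

def fetch(client: "HTTPClient") -> None:  # PEP 484 forward reference
    ...

class HTTPClient:
    ...

# The raw attribute holds the string...
print(fetch.__annotations__)   # {'client': 'HTTPClient', 'return': None}
# ...while get_type_hints() evaluates it to the actual class:
print(get_type_hints(fetch))   # {'client': <class 'HTTPClient'>, 'return': <class 'NoneType'>}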

What was PEP 563 trying to do?

The primary goal of PEP 563 was to allow forward references to work seamlessly. The secondary goals, not really spelled out explicitly in the PEP, were to decrease annotations’ runtime cost, and to allow use of more flexible annotation syntax.

The first of those secondary goals was largely achieved: string annotations are much cheaper memory-wise than full-blown Python objects. For many simple annotations we also get to exploit Python’s string interning to further decrease memory usage; after all, many annotations in a single Python process are alike. Note that string interning currently skips strings with square brackets in them; we could change that to further improve annotation performance. In any case, it’s important to acknowledge that at the time there was community pushback against adopting type annotations based on their perceived negative performance impact. This might be less visible four years later, but it was a big consideration at the time, especially among large commercial Python users.

The second of the secondary goals, to enable new forms of annotation syntax free from runtime constraints, was naive, incomplete, and in time became unnecessary anyway.

It was naive because we quickly realized that important pieces of typing will always remain in the runtime part of the language: type aliases are essentially runtime assignments, type casting is a function call, using generic types as bases of new classes puts annotation syntax inside a class definition, and so on.
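A minimal sketch of those runtime pieces (names are illustrative; the builtin generics need Python 3.9+):

from typing import Generic, TypeVar, cast

T = TypeVar("T")

# A type alias is an ordinary runtime assignment...
Vector = list[tuple[float, float]]

# ...type casting is an ordinary function call...
raw: object = 42
number = cast(int, raw)

# ...and generic base classes put typing syntax right inside
# a class definition, executed at runtime:
class Stack(Generic[T]):
    ...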

It was incomplete because from __future__ import annotations unparses code back to a string from an AST. So you never really could, for example, put a question mark after a type to denote that it’s optional. The annotation still needed to be syntactically valid Python; it just didn’t have to be executable.

And it became unnecessary because in time PEP 585 and PEP 604 implemented the most important quality-of-life improvements to typing syntax: using generics with builtin types (i.e. list[str] instead of typing.List[str]), and allowing compact notation for type unions (i.e. str | int | None instead of Optional[Union[str, int]]). Those not only solved the most frequently raised usability issues, but also worked in the runtime contexts mentioned above, making them true adult solutions to the problem.
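For example, on Python 3.10+ both spellings are first-class runtime objects, not just annotation syntax:

# PEP 585: builtin collections are subscriptable at runtime (3.9+):
Pairs = list[tuple[str, int]]

# PEP 604: unions built with | are runtime objects (3.10+)...
MaybeText = str | bytes | None

# ...that even work with isinstance():
assert isinstance(b"data", str | bytes)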

Who cares about PEP 563 today?

Users of forward references, people who care about the memory footprint of their Python processes, and people who want Python 3.10 syntax in annotations but are still stuck with Python 3.7 in production.

Forward references

The simplest case of a forward reference is this:

class A:
    ...

    @classmethod
    def from_b(cls, b: B) -> A:  # NameError at definition time: B doesn't exist yet
        ...

    def convert_to_b(self) -> B:
        ...

class B:
    ...

Depending on the case, it might not be possible to reorder the code in a way that fixes the ordering. Forward references aren’t that prevalent in user code but they do happen regularly, and can at times be problematic, especially in the often missed case of self-reference:

class ImmutableGadget:
    def mutate(self, thingamabob: Thingamabob) -> ImmutableGadget:
        return ImmutableGadget(
            whatchamacallit=self.whatchamacallit,
            wieheister=self.wieheister,
            ...
            thingamabob=thingamabob,
        )

Some such problems can be avoided by using higher order types:

from typing import TypeVar

T = TypeVar("T")

class ImmutableGadget:
    def mutate(self: T, thingamabob: Thingamabob) -> T:
        return self.__class__(
            ...
        )

As you can see, this is a little tricky and doesn’t mean exactly the same thing. The second form might be preferable (now we support subclasses correctly!) but it’s more brittle: we need to remember to use self.__class__ instead of hardcoding the class name, the subclass constructor might break LSP, and we need to use a type variable.

Accessing names unavailable at runtime

There is another case of a “forward reference” that is more insidious than bad ordering of two classes: import cycles. To solve that problem, a common pattern in typing code is to guard the import with if TYPE_CHECKING, like this:

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from bad_import_cycle import NameWeNeed

def some_function(arg: list[NameWeNeed]) -> None:
    ...

Without the __future__ import, you’d need to use a string annotation manually since the import isn’t done at runtime.
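In other words, a sketch of the same module without the future import would look like this:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from bad_import_cycle import NameWeNeed

# the forward reference has to be a manual string so it's never
# evaluated at runtime:
def some_function(arg: "list[NameWeNeed]") -> None:
    ...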

The if TYPE_CHECKING block is also popular for guarding imports that are optional or time-consuming, as well as for ensuring that certain names aren’t exported from the module (which is especially useful for locally-defined Protocols, TypedDicts, NewTypes, type aliases, and so on).

Memory impact and import performance

I know from tests at Instagram and later at EdgeDB that the __future__ import saves memory and import time. You can see it for yourself: I made a synthetic benchmark that tests 1,000 realistic dataclasses and 200,000 annotated functions (otherwise empty) split across 1,000 modules in 10 packages. On my 2018 15” MacBook Pro the results are as follows:

  • import time: 1.79s ±0.02s without the future, 1.16s ±0.02s with the future
  • RSS memory usage: 143.62 MB without the future, 93.29 MB with the future

Of course, the example is drastic in the sense that there is little code besides the annotations. That’s what we’re benchmarking though, right? Just be sure to understand that in real-world applications annotations take up a much smaller percentage of the process memory. It’s often still above 1% for fully annotated applications, though.

Use PEP 585 and PEP 604 features in Python 3.7+

I said before that usage of type hints goes beyond type annotations, and as such the __future__ import doesn’t universally enable the new syntax. Yes, that’s true in general.

In practice, with a little care, the __future__ import does allow using piped unions and lowercase generic builtin types just fine. And in fact users in the wild are doing just that. Realistically, it will be a good while before everybody can upgrade to Python 3.10.
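As a sketch, this module parses and runs fine on Python 3.7 because the annotations are never evaluated:

from __future__ import annotations

# 3.10-only union syntax and 3.9-only builtin generics; since the
# annotations stay strings, a 3.7 interpreter never executes them:
def parse(data: bytes | str) -> list[dict[str, int]]:
    ...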

Combined with if TYPE_CHECKING:, the same trick enables other new typing features for Python 3.7 users as well, like the Final qualifier added in Python 3.8.
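A small sketch of the Final case, runnable on Python 3.7:

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # only type checkers evaluate this; typing.Final doesn't exist in 3.7
    from typing import Final

MAX_RETRIES: Final = 5  # fine at runtime: the annotation stays a string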

The future that was supposed to be

Had PEP 563 become the default in 3.10 as originally planned, the three features described above would be available out of the box, and users of 3.7+ would be able to ramp up to them by using the __future__ import.

That future has a disadvantage: it forces usage of typing.get_type_hints() to evaluate annotations at runtime, since the __annotations__ attribute on functions and classes then always stores strings. Without PEP 563 that’s still required if you want to be thorough, though, because, as I mentioned before, string annotations can be used by the user regardless of any __future__ import, as defined by PEP 484.
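For instance, a manual string annotation survives into __annotations__ as-is, future import or not, so a thorough consumer has to call typing.get_type_hints() anyway:

import typing

class Config:
    retries: "int"  # a manual string annotation, valid per PEP 484

print(Config.__annotations__)         # {'retries': 'int'}
print(typing.get_type_hints(Config))  # {'retries': <class 'int'>}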

There is one more disadvantage of typing.get_type_hints(): it might be unable to resolve some names in some particular situations. Those situations can be summed up into three groups:

  • if an annotation string contains PEP 585 or PEP 604 syntax in Python versions lower than 3.10, evaluating it at runtime will fail;
  • if an annotation string depends on names defined in an if TYPE_CHECKING: block of code, evaluating it at runtime will fail;
  • if an annotation string depends on locally defined names that are not globally reachable when get_type_hints() is called, evaluating it at runtime will fail.

PEP 649 won’t help with the first two evaluation issues, and can only help with a subset of cases of the last issue.

Where PEP 649 helps

When a class is defined in a local scope, typing.get_type_hints() won’t be able to find it without manual manipulation of its globalns and localns, which nobody does since it’s tricky for the user, especially if typing.get_type_hints() is called on behalf of a library that the end user is using. Example:

from __future__ import annotations

import typing
from dataclasses import dataclass

def make_new_type() -> type[B]:
    @dataclass
    class Point:
        x: float
        y: float
        z: float

    @dataclass
    class B:
        shape: set[Point]

    return B

ShapeHolder = make_new_type()
typing.get_type_hints(ShapeHolder)  # crashes because it doesn't find `Point`

Similarly, when your entire code lives in a function, it won’t work either:

from __future__ import annotations

import typing

def main():
    from pydantic import BaseModel, PositiveInt

    class TestModel(BaseModel):
        foo: PositiveInt

    # crashes because `PositiveInt` is imported locally
    typing.get_type_hints(TestModel)

main()

While those two examples are valid Python, to be honest with you, I was surprised they were given such weight. And yet, they are basically the reason you’re reading my long post now.

Where PEP 649 doesn’t help

On top of the two cases mentioned above (if TYPE_CHECKING, using features from future Python versions in annotation strings), there is one other somewhat common case that won’t be solved by PEP 649: trying to evaluate a self-referential type from a class decorator:

from __future__ import annotations

from typing import TypeVar, get_type_hints

T = TypeVar("T")

def class_decorator(cls: T) -> T:
    annotations = get_type_hints(cls)  # NameError: name 'Node' is not defined
    print(annotations)
    return cls


@class_decorator
class Node:
    parent: Node | None
Running this code will crash even though the Node class is defined at module level. Why? Because the class decorator receives the class object before the name Node is bound in the module namespace. Under PEP 649, this will crash just the same. A good example of such a class_decorator is @dataclass, which is why it currently works hard to avoid resolving annotations.

And you might think: “surely we can change Python to assign the name first?” That would be complicated and backwards incompatible. But even if we could, or if we taught get_type_hints() to inject cls.__name__ into whatever globalns is used, it wouldn’t solve the more general problem here: the class decorator is executed right at the point where the class is defined. So if we have two of them, no trick will help:

@class_decorator
class A:
    parent: A | B | None


@class_decorator
class B:
    parent: A | B | None

Such a case simply cannot be supported at runtime.

Why are you talking about typing.get_type_hints() anyway?

In 3.10 Larry Hastings added inspect.get_annotations as a way to solve some common pitfalls with accessing the __annotations__ attribute directly.

get_annotations is able to evaluate string annotations, just like get_type_hints, but the latter goes beyond that (see the example after this list):

  • it also resolves ForwardRef objects to actual types;
  • it strips PEP 593 Annotated[] metadata unless instructed to keep it;
  • it standardizes types (like changing None into NoneType);
  • and possibly a few other details I’m forgetting now.
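Here’s a small sketch of a couple of those differences (the function is made up; inspect.get_annotations requires Python 3.10+):

import inspect
import typing
from typing import Annotated

def move(x: Annotated[int, "pixels"], dy: "int" = 0) -> None: ...

# inspect.get_annotations can evaluate strings but otherwise returns
# annotations as written:
print(inspect.get_annotations(move, eval_str=True))
# {'x': typing.Annotated[int, 'pixels'], 'dy': <class 'int'>, 'return': None}

# typing.get_type_hints additionally strips Annotated[] metadata and
# standardizes None into NoneType:
print(typing.get_type_hints(move))
# {'x': <class 'int'>, 'dy': <class 'int'>, 'return': <class 'NoneType'>}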

In particular, resolving ForwardRef means that get_type_hints also supports this less popular form of forward reference where the string is inside a generic: List["Point"].
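For example (classes made up):

from typing import List, get_type_hints

class Polygon:
    points: List["Point"]  # the forward reference hides inside a generic

class Point:
    x: float
    y: float

# get_type_hints resolves the nested ForwardRef('Point') too:
print(get_type_hints(Polygon))  # {'points': typing.List[__main__.Point]}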

The case for purity

Looking at PEP 649 in isolation, it provides a more flexible and elegant solution to the problem. It avoids unparsing annotations from AST form back into strings, which is AFAICT unheard of in the world of programming language implementations. It avoids the proliferation of strings in annotations, which means that in many happy cases the user might be blissfully unaware of them. With PEP 563 they’re user-visible, which is suboptimal. Clearly, as long as the PEP can be similarly performant, it represents better engineering.

Given the above, let’s suppose that PEP 563 has to go and PEP 649 is the new standard. I don’t feel that the case is truly strong, but that’s subjective. What isn’t subjective though is how we go about this transition. I’m afraid it’s very easy here to do something that will put our users in a miserable state for years to come.

A terrible future

The current spelling of PEP 649 proposes a new __future__ import: from __future__ import co_annotations. Using it in a file would automatically make that file incompatible with Python 3.10 and older. Worse yet, the old __future__ import (from __future__ import annotations) would now be deprecated, throwing DeprecationWarnings at people running Python with -Wd (or at pytest time, and so on).

So what is a library author supposed to do? There is no way to write code that bridges both worlds, meaning the author now has to return to using string literals for forward references and types from if TYPE_CHECKING blocks. Some libraries have already started to advocate for that, which I think is a sad state of affairs.

I even see suggestions of using ForwardRef as annotations, which is at least explicit. But at the same time it’s wordy, still uses a string internally, and still won’t give you the type you want in all the cases I mentioned above.

But the saddest thing about it is that with the old __future__ import dead, libraries still won’t be able to use any Python 3.10 typing features until they can drop support for Python 3.9 and older (so October 2025 at the earliest).

A better possible future

Another suggestion I’m seeing is to make PEP 649’s behavior the default in Python 3.11 without the need for a new __future__ import, and to deprecate the current one. This is better because we can at least write code that is compatible between 3.7 and 3.11. That code still shouldn’t use the deprecated from __future__ import annotations though, as it would stop working in 3.13. So what’s the point?

Now, if the library author is fine with DeprecationWarnings, they could keep using from __future__ import annotations until Python 3.13 is released (October 2023). But that is still two years short of Python 3.9’s EOL. So I would advise marking the old future-import as “pending deprecation” until 3.13, and only fully deprecating it then, for two releases, so it would be removed in Python 3.14 (𝛑) in October 2025, when Python 3.9 goes EOL.

But displaying deprecation warnings to library users is bound to cause the authors grief. Users will be unhappy to see them during pytest runs and they will inevitably report that. So maybe we should reconsider displaying the warning altogether and instead focus on linting?

A total alternative

I have to admit that I don’t quite understand the rallying cry that froze PEP 563’s finalization in Python 3.10. It was based primarily on cases with which PEP 649 doesn’t help anyway, and the ones that are indeed handled by PEP 649 are, as you’ve seen above, quite convoluted. In fact, one of the quoted reasons for stopping PEP 563 was that it doesn’t play nice with Pydantic, while its own documentation claims otherwise.

Since the problem that caused user backlash was that string annotations using locally-scoped names can’t be resolved, we can address that one problem specifically while keeping PEP 563 alive. For this rare case, Python could borrow PEP 649’s idea of binding cells for names referenced in annotations of non-module-level functions and classes, so that inspect.get_annotations() and/or typing.get_type_hints() can use those cells to resolve the names correctly.

To be more specific, consider this example:

from __future__ import annotations

def f1():
    local1 = "str"
    def f2():
        local2 = "str2"
        def f3(arg: local1) -> local2:
            local1, local2  # <-- this needs to happen automatically
        return f3
    return f2()

print(f1().__closure__)  # <-- today this will print None without
                         #     putting `local1, local2` in the body

What I’m suggesting is that, for non-module-level functions and classes, the annotation unparser would also store the names encountered in those annotations in a set glued to the given object. The compiler would then consult that set as well when constructing cells. Finally, a dict from name to cell would be constructed that could be used as localns inside get_type_hints(). It wouldn’t even need to be explicitly passed by the user; get_type_hints() would find it itself. This cost would only need to be paid for non-module-level annotations, which I believe are rare.
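To sketch how consumption could look, assuming a hypothetical __annotation_cells__ attribute holding that name-to-cell mapping (this attribute doesn’t exist today; it’s the proposal’s invention):

import typing

def resolve_hints(obj):
    # __annotation_cells__ is hypothetical: the name-to-cell mapping the
    # compiler would attach to non-module-level functions and classes.
    cells = getattr(obj, "__annotation_cells__", {})
    localns = {name: cell.cell_contents for name, cell in cells.items()}
    return typing.get_type_hints(obj, localns=localns)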

This approach, while less pure than PEP 649, would entirely avoid the need for complex deprecations, which I believe would provide a better end-user experience. It would solve what made people unhappy while keeping what makes PEP 563 effective: its availability from Python 3.7 and its runtime efficiency.

#Python