Some of you read my previous post on typing.Protocols and probably wondered: “what about zope.interface?” I’ve advocated strongly for it in the past, but now that we have Mypy and Protocols, is it simply a relic of an earlier time? Can we entirely replace it with Protocol?
Let’s have a look.
Typing in 2 dimensions
In the previous post I discussed structural versus nominal typing. In Mypy’s type system, most classes are checked nominally whereas Protocol is checked structurally. However, there’s another way that Protocol is distinct from a normal class: normal classes are concrete types, and Protocols are abstract.
Abstract types:
- cannot be instantiated: every instance of an abstract type is an instance of some concrete sub-type, and
- do not include (complete) implementation logic.
Concrete types:
- can be instantiated: they are complete descriptions of a type, and
- must include all their own implementation logic.
Protocols and Interfaces are both abstract, but Interfaces are nominal. The highest-level distinction between the two is that when you have a problem that requires an abstract type, but nominal checking is preferable to structural, Interfaces are a better solution.
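To make the abstract/concrete split concrete, here is a small sketch using only the standard library (Python 3.8+ is assumed). The names Greeter, EnglishGreeter and welcome are invented for illustration.

```python
from typing import Protocol


class Greeter(Protocol):
    """Abstract type: describes a shape, carries no real implementation."""

    def greet(self) -> str:
        ...


class EnglishGreeter:
    """Concrete type: never mentions Greeter, but matches its shape."""

    def greet(self) -> str:
        return "hello"


def welcome(greeter: Greeter) -> str:
    # Mypy accepts any object whose structure matches Greeter.
    return greeter.greet()


try:
    Greeter()  # abstract types cannot be instantiated directly
except TypeError:
    protocol_is_instantiable = False
else:
    protocol_is_instantiable = True

greeting = welcome(EnglishGreeter())
```

Here EnglishGreeter is accepted purely because of its structure, which is the behavior a nominal abstract type would not give you.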
Python’s built-in Abstract Base Classes are technically abstract-and-nominal as well, but they’re in a strange halfway space: they’re formally “abstract” because they can’t be instantiated, but they’re partially concrete in that they can contain any amount of implementation logic themselves. As a result, making an object which is a subtype of multiple ABCs drags in all the usual problems of the conflicting namespaces within multiple inheritance.
Theoretically, there’s a way to treat ABCs as purely abstract, which is to use ABCMeta.register, but as of this writing (March 2021) it doesn’t work with Mypy, so within the context of “static typing in Python” we presently have to ignore it.
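For reference, this is roughly what the register approach looks like at runtime. It is only a sketch with made-up names (Greeter, EnglishGreeter), and it shows runtime behavior only, not anything a static checker will see.

```python
from abc import ABC, abstractmethod


class Greeter(ABC):
    @abstractmethod
    def greet(self) -> str:
        ...


class EnglishGreeter:  # note: does not inherit from Greeter
    def greet(self) -> str:
        return "hello"


# Nominal opt-in without inheritance: no implementation is pulled in.
Greeter.register(EnglishGreeter)

is_instance = isinstance(EnglishGreeter(), Greeter)
inherits = Greeter in EnglishGreeter.__mro__
```

The registered class passes isinstance checks while its method resolution order stays untouched, which is what makes this "purely abstract" in principle.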
Practicalities
The first major advantage that Protocol has is that since it is now built in to Python itself, there’s no reason not to use it. When Protocol didn’t even exist, regardless of all the advantages of adding explicit abstract types to your project with zope.interface, it did still have the small down-side of requiring a new dependency, with all the minor headaches that might imply.
Beyond the theoretical distinctions, there’s a question of how well tooling supports zope.interface. There are some clear gaps: there is not a ton of great built-in IDE support for zope.interface, and less-sophisticated linters will sometimes still complain that methods declared on Interfaces don’t take self as their first argument. Indeed, Mypy itself does this by default, although more on that in a moment. Less mainstream performance-focused type-checkers like Pyre and Pyright don’t support zope.interface either, although their lack of support for zope.interface is just a part of a broader problem of their lack of extensibility; they also can’t support SQLAlchemy or the Django ORM without special-casing in the tools themselves.
But what about Mypy itself? If we have to discount ABCMeta.register due to practical tooling deficiencies, even though it provides a built-in way to declare a nominal-but-abstract type in principle, we need to be able to use zope.interface within Mypy as well for a fair comparison with Protocol. Can we?
Luckily, yes! Thanks to Shoobx, there’s a fairly actively maintained Mypy plugin that supports zope.interface which you can use to statically check your Interfaces.
However, this plugin does have a few key limitations as of this writing (again, March 2021), which make its safety guarantees a bit lower-quality than Protocol’s.
The net result of this is that Protocols have the “home-field advantage” in most cases; out of the box, they’ll work more smoothly with your existing editor / linter setup, and as long as your project supports Python 3.6+, at worst (if you can’t use Python 3.8, where Protocol is built in to typing) you have to take a type-check-time dependency on the typing_extensions package, whereas with zope.interface you’ll need both the run-time dependency of zope.interface itself and the Mypy plugin at type-checking time.
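One common way to spell that fallback is a version check around the import. This is a sketch, and Closeable is just a placeholder Protocol to show that the rest of the code doesn’t care where Protocol came from.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    # Assumes the typing_extensions backport is installed on older Pythons.
    from typing_extensions import Protocol


class Closeable(Protocol):
    def close(self) -> None:
        ...
```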
So in a situation where both are roughly equivalent, Protocol tends to win by default. There are undeniably big areas where Interfaces and Protocols overlap, and in plenty of them, using Protocol is a fine idea. But there are still some clear places that zope.interface shines.
First, let’s look at a case which Interfaces handle more gracefully than Protocols: opting out of matching a simple shape, where the shape doesn’t fully describe its own meaning.
Where Interfaces work best: hidden and complex meanings
The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.
Alan Perlis, “Epigrams in Programming”, Epigram 34.
The place where structural typing has the biggest advantage is when the type system is expressive enough to fully encode the meaning of the desired behavior within the structure of the type itself. Consider a Protocol which describes an object that can add some integers together:
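A minimal sketch of such a Protocol follows. The names IntegerAdder, SimpleAdder and total are assumptions chosen for illustration.

```python
from typing import Protocol


class IntegerAdder(Protocol):
    def add_integers(self, addend1: int, addend2: int) -> int:
        ...


class SimpleAdder:
    # Matches IntegerAdder structurally; no import or declaration needed.
    def add_integers(self, addend1: int, addend2: int) -> int:
        return addend1 + addend2


def total(adder: IntegerAdder) -> int:
    return adder.add_integers(2, 3)


result = total(SimpleAdder())
```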
It’s fairly unambiguous what adherents to this Protocol should do, and anyone implementing such a thing should be able to clearly tell that the method is supposed to add a couple of integers together; there’s nothing hidden about the structure of the integers, no constraints the type system won’t let us specify. It would be quite surprising if anything that didn’t have the intended behavior would match this Protocol.
At the other end of the spectrum, we might have a plugin Interface that has a lot of hidden structure. For this example, we have an Interface called IPlugin containing a method with an easy-to-conflict-with name (“name”) overloaded with very specific constraints on its return type: the string must contain the dotted-path name of a Python object in an importable module (like, for example, "os.path.join").
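Such an Interface might be declared roughly like this. It is a sketch that assumes zope.interface is installed, and the docstring wording is mine.

```python
from zope.interface import Interface


class IPlugin(Interface):
    # Interface methods are declared without "self": they describe what
    # callers see on a provider, not how a class implements it.
    def name() -> str:
        """
        Return the dotted-path name of a Python object in an importable
        module, for example "os.path.join".
        """
```

Notice that the constraint on the string lives only in the docstring; nothing in the signature can express it.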
With Protocols, you can work around these limitations by manually making the Protocol harder to match: adding elements to the structure that embed names relevant to its semantics, thereby making the type behave more as if it were nominally typed.
You could make the method’s name long and ugly instead (plugin_name_to_load, let’s say) or add unused additional attributes (yep_i_am_a_plugin: Literal[True]) in order to reduce the risk of accidental matches, but these workarounds look hacky, and they have to be manually namespaced. If you want to mark it as having semantics associated with your specific plugin system, you have to embed the name of that system in your attributes themselves; here we’re just saying “plugin”, but if we want to be truly careful, we have to embed the whole name of our project in there.
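Here is a sketch of what those workarounds could look like in practice. JoinPlugin, Person and load_target are hypothetical names; only plugin_name_to_load and yep_i_am_a_plugin come from the text above.

```python
from typing import Literal, Protocol


class Plugin(Protocol):
    # Marker attribute whose only job is to make accidental matches unlikely.
    yep_i_am_a_plugin: Literal[True]

    def plugin_name_to_load(self) -> str:
        ...


class JoinPlugin:
    yep_i_am_a_plugin: Literal[True] = True

    def plugin_name_to_load(self) -> str:
        return "os.path.join"


class Person:
    # Would have matched a bare "name() -> str" shape by accident.
    def name(self) -> str:
        return "Alice"


def load_target(plugin: Plugin) -> str:
    return plugin.plugin_name_to_load()


target = load_target(JoinPlugin())
# load_target(Person()) should be rejected by Mypy: Person has neither
# the marker attribute nor the long method name.
```

It works, but every implementer now carries boilerplate that exists purely to fake nominal typing.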
With Interfaces, the maintainer of each implementation must explicitly opt in, by choosing whether to specify that they are an @implementer(IPlugin). Since they had to import IPlugin from somewhere, this annotation carries with it a specific, namespaced declaration of semantic intent: “I know what the Interface IPlugin means, and I promise that I can provide it”.
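A small sketch of the opt-in, assuming zope.interface is installed. JoinPlugin and Person are invented; IPlugin is redeclared so the snippet runs on its own.

```python
from zope.interface import Interface, implementer


class IPlugin(Interface):
    def name() -> str:
        """Return the dotted-path name of the object to load."""


@implementer(IPlugin)
class JoinPlugin:
    # Explicit promise: "I know what IPlugin means."
    def name(self) -> str:
        return "os.path.join"


class Person:
    # Same method name and signature, entirely different meaning.
    def name(self) -> str:
        return "Alice"


plugin_opted_in = IPlugin.providedBy(JoinPlugin())
person_opted_in = IPlugin.providedBy(Person())
```

Person has exactly the right shape, yet it is not treated as a plugin because nobody declared that it is one.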
This is the most salient distinction between Protocols and Interfaces: if you have strong reasons to want adherents to the abstract type to opt in, you want an Interface; if you want them to match automatically, you want a Protocol.
Runtime support
Interfaces also provide a more nuanced set of runtime checks.
You can say that an object directlyProvides an interface, allowing for some level of (at least runtime) type safety, and ask if IPlugin is .providedBy some object.
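As a rough illustration (assuming zope.interface is installed; Loader and the two instances are made up), a declaration can be attached to one specific object rather than to its class:

```python
from zope.interface import Interface, directlyProvides


class IPlugin(Interface):
    def name() -> str:
        """Return the dotted-path name of the object to load."""


class Loader:
    def __init__(self, target: str) -> None:
        self._target = target

    def name(self) -> str:
        return self._target


trusted = Loader("os.path.join")
untrusted = Loader("not a dotted path")

# Only this one instance is declared to provide IPlugin; its class is not.
directlyProvides(trusted, IPlugin)

trusted_provides = IPlugin.providedBy(trusted)
untrusted_provides = IPlugin.providedBy(untrusted)
class_implements = IPlugin.implementedBy(Loader)
```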
You can do most of this with Protocol, but it’s awkward. The @runtime_checkable decorator allows your Protocol to make isinstance(x, MyProtocol) work like IMyInterface.providedBy(x), but:
- you’re still missing directlyProvides; the runtime checking is all by type, not by the individual properties of the instance;
- it’s not the default, so if you’re not the one defining the Protocol, there’s no guarantee you’ll be able to use it.
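The second point is easy to demonstrate with the standard library alone. In this sketch, Checked, Unchecked and Person are placeholder names.

```python
from typing import Protocol, runtime_checkable


class Unchecked(Protocol):
    def name(self) -> str:
        ...


@runtime_checkable
class Checked(Protocol):
    def name(self) -> str:
        ...


class Person:
    def name(self) -> str:
        return "Alice"


person = Person()

# Opted in via the decorator: isinstance() looks for a matching method.
matches_checked = isinstance(person, Checked)

try:
    isinstance(person, Unchecked)
except TypeError:
    # Without @runtime_checkable, isinstance() refuses to run at all.
    unchecked_supported = False
else:
    unchecked_supported = True
```

Note that the runtime check only confirms that a method called name exists; Person matches even though it never claimed to be anything in particular.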
With Interfaces, there’s also no mandatory relationship between the implementer (i.e. the type whose instances fit the specified shape) and the provider (the specific object which can fit the specified shape). This means you get features like classProvides and moduleProvides “for free”.
Interfaces work particularly well for communication between frameworks and application code. For example, let’s say you’re evolving the meaning of an Interface implemented by applications over time (EventHandler, EventHandler2, EventHandler3), where the versions have similarly named and typed methods, but subtly different expectations on their lifecycle or when precisely the methods will be called. A framework facing this problem can use a series of Interfaces, check at runtime to see which of these the application implements, and be secure in the knowledge that the application has intentionally adopted the new interface, and doesn’t just happen to have a matching method name against an older version.
Finally, zope.interface gives you adaptation and adapter registries, which can be a useful mechanism for doing things like templating, like a much more powerful version of singledispatch from the standard library.
Adapter registries are nuanced, complex tools and unfortunately an example that captures the full utility of their power would itself be commensurately complex. However, the core of adaptation is the idea that if you have an arbitrary object x, and you want a provider of the interface IY, you can do the following:
    IY(x)
This performs a multi-stage check:
- If x already provides IY (either via implementer, provider, directlyProvides, classProvides, or moduleProvides), it’s simply returned; so you don’t need to special-case the case where you’ve already got what you want.
- If x has a __conform__(interface) method, it’ll be called with IY as the interface, and if __conform__ returns anything non-None that result will be returned from the call to IY.
- Each globally-registered function in zope.interface’s adapter_hooks will be invoked to find a function that can transform x into an IY provider. Twisted has its own global registry in this list, which is what registerAdapter manipulates.
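The three routes above can be exercised in a few lines. This is only a sketch, assuming zope.interface is installed; NativeY, WrappedY, Conformer and int_hook are hypothetical names, and the hook is a toy stand-in for a real registry.

```python
from zope.interface import Interface, implementer
from zope.interface.interface import adapter_hooks


class IY(Interface):
    def describe() -> str:
        """Return a short description."""


@implementer(IY)
class NativeY:
    def describe(self) -> str:
        return "native"


@implementer(IY)
class WrappedY:
    def __init__(self, original: object) -> None:
        self.original = original

    def describe(self) -> str:
        return "wrapped " + type(self.original).__name__


class Conformer:
    def __conform__(self, interface):
        # Return a provider for interfaces we know about, else None.
        return WrappedY(self) if interface is IY else None


def int_hook(interface, obj):
    # A hypothetical global hook: adapt ints to IY, decline everything else.
    if interface is IY and isinstance(obj, int):
        return WrappedY(obj)
    return None


adapter_hooks.append(int_hook)

native = NativeY()
same_object = IY(native) is native            # already a provider
from_conform = IY(Conformer()).describe()     # via __conform__
from_hook = IY(7).describe()                  # via adapter_hooks
fallback = IY("no adapter for str", None)     # second argument is a default

adapter_hooks.remove(int_hook)  # leave the global list as we found it
```

Without the second argument, a failed adaptation raises TypeError rather than returning a default.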
But from the perspective of the caller, you can just say “I want an IY”.
With Protocols, you can emulate this with functools.singledispatch by making a function which returns your Protocol type and registers various types to do conversion. The place that adapter registries have an advantage is their central nature and consistent idiom for converting to the target type; you can use adaptation for any Interface in the same way, and any type can participate in adaptation in the ways listed above via flexible mechanisms depending on where it makes sense to put your implementation, whereas any singledispatch function to convert to a protocol needs to be bespoke per-protocol.
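A sketch of that emulation, using only the standard library. Renderable, TextNode and to_renderable are illustrative names, not part of any real API.

```python
from functools import singledispatch
from typing import Protocol


class Renderable(Protocol):
    def render(self) -> str:
        ...


class TextNode:
    def __init__(self, text: str) -> None:
        self.text = text

    def render(self) -> str:
        return self.text


@singledispatch
def to_renderable(obj: object) -> Renderable:
    # Fallback for types nobody has registered a conversion for.
    raise TypeError(f"cannot convert {type(obj).__name__} to Renderable")


@to_renderable.register(str)
def _from_str(obj: str) -> Renderable:
    return TextNode(obj)


@to_renderable.register(int)
def _from_int(obj: int) -> Renderable:
    return TextNode(str(obj))


rendered_text = to_renderable("hello").render()
rendered_number = to_renderable(42).render()
```

The conversion function is specific to Renderable; a second Protocol would need its own to_something function with its own registrations.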
Describing and restricting existing shapes
There are still several scenarios where Protocol’s semantics apply more cleanly.
Unlike Interfaces, Protocols can describe the types of things that already exist. To see when that’s an advantage, consider a sprawling application that uses tons of libraries and manipulates 3D spatial data points.
There’s a convention among these disparate libraries where they all represent a “point” as an object with .x, .y, and .z attributes which are all floats. This is a natural enough shape, given the domain, that lots of your libraries just fit it by accident. You want to write functions that can work with data output by any of these libraries as long as it plausibly looks like your own concept of a Point:
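A sketch of such a Point, along with a stand-in for a library type that fits it by accident. LibraryVertex and distance_from_origin are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Protocol


class Point(Protocol):
    x: float
    y: float
    z: float


@dataclass
class LibraryVertex:
    # Stand-in for a third-party type that has never heard of Point.
    x: float
    y: float
    z: float


def distance_from_origin(point: Point) -> float:
    return math.sqrt(point.x ** 2 + point.y ** 2 + point.z ** 2)


distance = distance_from_origin(LibraryVertex(1.0, 2.0, 2.0))
```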
In this case, the thing defining the Protocol is your application; the thing implementing the Protocol is your collection of libraries. Since the libraries don’t and can’t know about the application (the dependency arrow points the other way), they can’t reference the Protocol to note that they implement it.
Using Protocol, you can also restrict an existing type to preserve future flexibility.
For example, let’s say we’re implementing a “mailbox” type pattern, where some systems deliver messages and other systems retrieve them later. To avoid mix-ups, the system that sends the messages shouldn’t retrieve them and vice versa - receivers only receive, and senders only send. With Protocols, we can describe this without having any new custom concrete types, like so:
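One way to write those two roles down is shown here. Treat it as a sketch; the type variable names and docstrings are my own choices.

```python
from typing import Protocol, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)
T_co = TypeVar("T_co", covariant=True)


class Sender(Protocol[T_contra]):
    def add(self, item: T_contra) -> None:
        """Put an item in the mailbox."""


class Receiver(Protocol[T_co]):
    def pop(self) -> T_co:
        """Take an item out of the mailbox."""
```

The method names add and pop are chosen deliberately so that an ordinary set already has both.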
All of that code is just telling Mypy our intentions; there’s no behavior here yet.
The actual implementation is even shorter:
Literally no code of our own - set already does the job we described. And how do we use this?
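Putting the pieces together, a self-contained sketch might look like this. The Sender and Receiver definitions are repeated so the snippet runs on its own; mailbox, send and receive are illustrative, and the mailbox is fixed to int to keep the annotations simple.

```python
from typing import Protocol, Set, Tuple, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)
T_co = TypeVar("T_co", covariant=True)


class Sender(Protocol[T_contra]):
    def add(self, item: T_contra) -> None:
        """Put an item in the mailbox."""


class Receiver(Protocol[T_co]):
    def pop(self) -> T_co:
        """Take an item out of the mailbox."""


def mailbox() -> Tuple[Sender[int], Receiver[int]]:
    # The "implementation": one plain set, handed out under two roles.
    shared: Set[int] = set()
    return shared, shared


def send(sender: Sender[int]) -> None:
    sender.add(3)
    # Calling sender.pop() here should be flagged by Mypy, because the
    # Sender role does not expose pop.


def receive(receiver: Receiver[int]) -> int:
    return receiver.pop()


sender, receiver = mailbox()
send(sender)
received = receive(receiver)
```

At runtime both names refer to the same set; the separation of roles exists only in the type checker.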
For its initial implementation, this system requires nothing beyond types available in the standard library; just a set. However, by treating their parameter as a Sender and a Receiver respectively rather than a Set, send and receive prevent themselves from using any functionality from the set passed in aside from the one method that their respective roles are supposed to “see”. As a result, Mypy will now tell us if any code which receives the sender object tries to remove objects.
This allows us to use existing data structures in libraries without the usual attendant problem of advertising to all clients that every tiny implementation detail of those existing structures is an intended part of the public interface. Python has always tried to make these sorts of distinctions by leaving certain things undocumented or saying narratively which things you should rely on, but it’s always hit-or-miss (usually miss) whether library consumers will see those admonitions or not; by making it a feature of the programming environment, Mypy makes it harder to ignore.
Conclusions
In modern Python code, when you have an abstract collection of behavior, you should probably consider using a Protocol to describe it by default. However, Interface is also staying up to date with modern Python tooling with Mypy support, and it can be worthwhile for more sophisticated consumers that want support for nominal typing, or that want to draw on its rich adaptation and component registration feature-set.