Sunday, June 28, 2020

Pythonicity: Mutable defaults

Contrarian view on mutable default arguments.

The use of mutable defaults is probably the most infamous Python gotcha. Default values are evaluated once, at function definition time, so any mutation of them persists across calls. Many articles on this topic even use the same append example.

In [1]:
def append_to(element, to=[]):
    to.append(element)
    return to

append_to(0)
Out[1]:
[0]
In [2]:
append_to(1)
Out[2]:
[0, 1]
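
The shared state can be observed directly, since the default list is stored on the function object itself. A minimal sketch, repeating the same `append_to` definition so the block runs on its own:

```python
def append_to(element, to=[]):
    to.append(element)
    return to

# The default list is created once, when the def statement executes,
# and is stored on the function object.
default = append_to.__defaults__[0]

first = append_to(0)
second = append_to(1)

# Both calls returned the very same list object held by the function.
print(first is second is default)  # True
print(default)                     # [0, 1]
```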

And the solution is invariably to use None instead, and convert as needed.

In [3]:
def append_to(element, to=None):
    if to is None:
        to = []
    to.append(element)
    return to

There is another solution to the problem of mutating a default value: don't do that. More specifically, the problem isn't using mutables as defaults; the problem is actually mutating them.

If the input from the caller is being mutated, then the caller doesn't need it returned because the caller already has a reference. This distinction is explicitly encouraged in Python, e.g., list.sort vs. sorted. But it follows that if the input doesn't need to be returned, then there's no point in the input being optional. How would the caller know the difference?
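
For reference, the built-in convention looks like this: the mutating method returns None, while the non-mutating function returns a new object.

```python
numbers = [3, 1, 2]

# sorted() leaves its input alone and returns a new list.
ordered = sorted(numbers)
print(numbers, ordered)  # [3, 1, 2] [1, 2, 3]

# list.sort() mutates in place and returns None, because the
# caller already holds a reference to the list.
result = numbers.sort()
print(numbers, result)   # [1, 2, 3] None
```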

Examples like the fluent append seem contrived because they are. No one actually wants a function named append to take one argument. The realistic fix would be:

In [4]:
def append_to(element, to):
    to.append(element)
    return to

Sure, there are rare occasions where a parameter is mutable but optional, such as a recursive algorithm that's implicitly passing around its own cache. But this author would wager that given any real-world code that's been bitten by this gotcha, there is:

  • a ~90% chance the function would have a related bug if defaults were evaluated at call time
  • a ~95% chance the function has a poor interface
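
As an illustration of the rare legitimate case mentioned above, here is a sketch of a recursive function that threads its own cache through the recursion. The function name `count_paths` and the lattice-path problem are hypothetical, chosen only for the example.

```python
def count_paths(rows, cols, cache=None):
    """Count lattice paths across a rows x cols grid (hypothetical example)."""
    if cache is None:
        cache = {}  # fresh cache for each top-level call
    if rows == 0 or cols == 0:
        return 1
    key = (rows, cols)
    if key not in cache:
        # The recursive calls share the cache; outside callers never pass it.
        cache[key] = (count_paths(rows - 1, cols, cache)
                      + count_paths(rows, cols - 1, cache))
    return cache[key]

print(count_paths(2, 2))  # 6
```

Here the parameter really is both mutated and optional, so the None idiom earns its keep.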

What harm does this advice do? Well, it's caused an over-reaction resulting in using None as the only default, even for immutables. It's so prevalent that it appears many beginners believe using None is the one and only way of making an argument optional.

Besides immutable types, there are also cases where mutation is irrelevant. Consider the following example adapted from a popular project.

In [5]:
from typing import List

def __init__(self, alist: List = None):
    self.alist = [] if alist is None else list(alist)

Notice that the correctness of this code relies on the member list being newly created in either case. What could possibly go wrong with:

In [6]:
def __init__(self, alist: List = []):
    self.alist = list(alist)

Or better yet, why not support the more liberal interface?

In [7]:
from typing import Iterable

def __init__(self, alist: Iterable = ()):
    self.alist = list(alist)
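
To check that nothing is shared, the method can be dropped into a class. A minimal sketch, where the class name `Container` is an assumption since the original project isn't named:

```python
from typing import Iterable

class Container:  # hypothetical class name
    def __init__(self, alist: Iterable = ()):
        # list() always builds a new list, so the default is never mutated.
        self.alist = list(alist)

a = Container()
b = Container()
a.alist.append(1)
print(a.alist, b.alist)  # [1] []

# The caller's own list is copied, not aliased.
source = [1, 2]
c = Container(source)
c.alist.append(3)
print(source)  # [1, 2]

# Any iterable is accepted, not just lists.
print(Container(range(3)).alist)  # [0, 1, 2]
```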

The point is that there are at least 4 solutions to this problem:

  1. use mutable defaults, but don't mutate them
  2. use immutable substitute defaults with a compatible interface
  3. use None for mutables, and matching types for immutables
  4. use None for all defaults

Only #1 is even remotely controversial, yet somehow the status quo has landed on #4. Besides being needlessly verbose, it has another pitfall. Python doesn't natively support detecting whether an argument was actually passed; a sentinel default is required for that. The implementation detail leaks through the interface, indicating to the caller that None is an acceptable argument to pass explicitly, as if the type hint were Optional[List], even though that's not the intention. Factor in **kwargs, which clearly doesn't want data padded with nulls, and actual code breakage can result.
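
For readers unfamiliar with the sentinel pattern, here is a minimal sketch. The names `_MISSING` and `describe` are hypothetical; the point is only that a private object can distinguish "not passed" from an explicit None.

```python
_MISSING = object()  # hypothetical module-private sentinel

def describe(value=_MISSING):
    # Identity comparison: no caller-supplied value can be this object
    # unless the caller deliberately imports the sentinel.
    if value is _MISSING:
        return "no argument"
    return f"got {value!r}"

print(describe())      # no argument
print(describe(None))  # got None
```

With a plain None default, the two calls above would be indistinguishable.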

Presumably the disdain for option #1 is because it might encourage the gotcha. But it's disingenuous to just let that go unsaid. The implementer is responsible for writing correct code, and the caller sees the right interface. The speculation is that beginners will read code which uses mutable defaults but doesn't mutate them, and follow the former pattern but not the latter.

As a community, let's at least push towards option #3. Using empty strings and zeros as defaults is all upside.
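
As a rough comparison of options #4 and #3 for immutable defaults, consider the following sketch. Both function names are hypothetical.

```python
def total_none(values, start=None, label=None):
    # Option #4: every default is None and must be converted by hand.
    if start is None:
        start = 0
    if label is None:
        label = ""
    return f"{label}{sum(values, start)}"

def total(values, start=0, label=""):
    # Option #3: the defaults are the real values, visible in the signature.
    return f"{label}{sum(values, start)}"

print(total([1, 2]))              # 3
print(total([1, 2], 10, "sum="))  # sum=13
print(total([1, 2]) == total_none([1, 2]))  # True
```

The second signature documents itself, and it does not suggest that None is a meaningful value to pass.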


