Monday, August 16, 2021

Python Morsels: Using the walrus operator

Transcript

Let's talk about Python's walrus operator.

An assignment followed by a conditional check

We have a function called get_quantity that accepts a string argument representing a number and a unit (either kilograms or grams):

import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')


def get_quantity(string):
    match = UNITS_RE.search(string)
    if match:
        return (int(match.group('quantity')), match.group('units'))
    return int(string)

When we call this function with a number and a unit, it returns a tuple with 2 items: the number (converted to an integer) and the unit.

>>> get_quantity('4 kg')
(4, 'kg')
>>> get_quantity('4 g')
(4, 'g')

If we give this function a string representing just a number (no units), it will give us that number back converted to an integer:

>>> get_quantity('4')
4

This get_quantity function assumes that whatever we give to it is either the pattern (number and unit) or just an integer.
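To see what that assumption means in practice, here is a small sketch of what happens when the string is neither form. The describe helper and the '4 lb' input are made up for illustration; they are not part of the original code.

```python
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')


def get_quantity(string):
    match = UNITS_RE.search(string)
    if match:
        return (int(match.group('quantity')), match.group('units'))
    return int(string)


def describe(string):
    """Hypothetical helper: report what get_quantity does with a string."""
    try:
        return get_quantity(string)
    except ValueError:
        # int() could not parse the string, so neither form applied
        return 'ValueError'


print(describe('4 kg'))  # (4, 'kg')
print(describe('4'))     # 4
print(describe('4 lb'))  # ValueError
```

The pattern doesn't match '4 lb', so the function falls through to int(string), which raises a ValueError.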

We're doing this using regular expressions, which are a form of pattern matching:

import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

We're not going to get into regular expressions right now.

Instead we're going to focus on these two lines of code (from get_quantity):

    match = UNITS_RE.search(string)
    if match:

On the first line, the match variable stores either a match object or None.

Then if match asks the question "did we get something that is truthy?"

A match object is truthy and None is falsey, so we're basically checking whether we got a match object to work with.
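We can check that truthiness directly. This is just a quick sketch; the matched and unmatched variable names are ours, chosen for illustration.

```python
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

matched = UNITS_RE.search('4 kg')  # a match object
unmatched = UNITS_RE.search('4')   # None, because there are no units

print(bool(matched))      # True
print(unmatched is None)  # True
print(bool(unmatched))    # False
```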

Those two lines above (the assignment to match and the conditional check based on match) can actually be combined into one line of code.

Embedding an assignment into another line with assignment expressions

We can take these two lines of code:

    match = UNITS_RE.search(string)
    if match:

And combine them into one line of code using an assignment expression (new in Python 3.8):

    if match := UNITS_RE.search(string):

Before we had an assignment statement and a condition (that we were checking in our if statement). Now we have both in one line of code.

import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')


def get_quantity(string):
    if match := UNITS_RE.search(string):
        return (int(match.group('quantity')), match.group('units'))
    return int(string)

We're using the walrus operator, which is the thing that powers assignment expressions.

Assignment expressions allow us to embed an assignment statement inside of another line of code. They use the walrus operator (:=):

    if match := UNITS_RE.search(string):

That's different from a plain assignment statement (=), which has to be on a line all on its own:

    match = UNITS_RE.search(string)
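Here's a rough sketch of that difference. The count variable and the compiles helper are hypothetical names used only for this illustration. An assignment expression produces a value, so it can sit inside another expression, while a plain = inside an if condition isn't valid Python syntax at all.

```python
# An assignment expression produces a value, so it can be used
# inside a larger expression (here, as an argument to print).
print(count := 3)  # prints 3 and also assigns 3 to count
print(count + 1)   # prints 4


def compiles(source):
    """Hypothetical helper: return True if source is valid Python syntax."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return False
    return True


print(compiles('if n := 5: pass'))  # True (on Python 3.8+)
print(compiles('if n = 5: pass'))   # False
```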

The := is called the walrus operator because it looks kind of like a walrus on its side: the colon looks sort of like eyes and the equal sign looks kind of like tusks.

Checking to see if we got a match object when using regular expressions in Python is a very common use of the walrus operator.

Another use case for the walrus operator

Another common use case for the walrus operator is in a while loop.

Specifically it's common to see a walrus operator used in a while loop that repeatedly:

  1. Stores a value based on an expression
  2. Checks a condition based on that value

With the walrus operator we can perform both of those actions at the same time.

We have a function called compute_md5:

import hashlib


def compute_md5(filename):
    md5 = hashlib.md5()
    with open(filename, mode="rb") as f:
        while chunk := f.read(8192):
            md5.update(chunk)
    return md5.hexdigest()

This function takes a file name and gives us back the MD5 checksum of that file:

>>> compute_md5('units.py')
'b6a5563be535cb94a44d8aea5f9b0f8c'

We might use a function like this if we were trying to check for duplicate files or verify that a large file downloaded accurately.

We're not going to focus on the details of this function though. What we care about is what the walrus operator is doing in this compute_md5 function, and what the alternative to the walrus operator would be here.

We're repeatedly reading eight kilobytes (8192 bytes) into the chunk variable:

        while chunk := f.read(8192):
            md5.update(chunk)

The alternative to this is to assign to the chunk variable before our loop, check the value of chunk in our loop condition, and also assign to chunk at the end of each loop iteration:

        chunk = f.read(8192)
        while chunk:
            md5.update(chunk)
            chunk = f.read(8192)

I would argue that using an assignment expression makes this code more readable than the alternative, because we've taken what was three lines of code and turned them into just one line.

In each iteration of our loop we're grabbing a chunk, assigning that chunk to the chunk variable, and checking its truthiness (to see whether we've reached the end of the file). And we're doing all of this in just one line of code:

        while chunk := f.read(8192):
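If we want to convince ourselves that the two versions of this loop behave the same way, we could compare them on a throwaway file. This is only a sketch: the compute_md5_without_walrus name, the sample data, and the temporary file are all assumptions made for the comparison.

```python
import hashlib
import os
import tempfile


def compute_md5(filename):
    md5 = hashlib.md5()
    with open(filename, mode="rb") as f:
        while chunk := f.read(8192):
            md5.update(chunk)
    return md5.hexdigest()


def compute_md5_without_walrus(filename):
    """Hypothetical name for the version without an assignment expression."""
    md5 = hashlib.md5()
    with open(filename, mode="rb") as f:
        chunk = f.read(8192)
        while chunk:
            md5.update(chunk)
            chunk = f.read(8192)
    return md5.hexdigest()


# Write a file bigger than one chunk so each loop runs several times.
data = b"walrus" * 5000  # 30,000 bytes
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

try:
    same = (
        compute_md5(path)
        == compute_md5_without_walrus(path)
        == hashlib.md5(data).hexdigest()
    )
finally:
    os.remove(path)  # clean up the throwaway file

print(same)  # True
```

Both loops stop when f.read returns an empty bytes object, which is falsey, so they hash exactly the same chunks.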

Summary

Assignment expressions use the walrus operator (:=).

Assignment expressions are a way of taking an assignment statement and embedding it in another line of code. I don't recommend using them unless they make your code more readable.


