Transcript
Let's talk about Python's walrus operator.
An assignment followed by a conditional check
We have a function called get_quantity that accepts a string argument representing a number and a unit (either kilograms or grams):
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

def get_quantity(string):
    match = UNITS_RE.search(string)
    if match:
        return (int(match.group('quantity')), match.group('units'))
    return int(string)
When we call this function it returns a tuple with 2 items: the number (converted to an integer) and the unit.
>>> get_quantity('4 kg')
(4, 'kg')
>>> get_quantity('4 g')
(4, 'g')
If we give this function a string representing just a number (no units), it will give us that number back converted to an integer:
>>> get_quantity('4')
4
This get_quantity function assumes that whatever we give it either matches the pattern (a number and a unit) or is just an integer.
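To see what that assumption means in practice, here is a small sketch that feeds get_quantity a few inputs, including ones that fit neither case. The describe helper is hypothetical (it isn't part of the original code) and exists only to turn the exception into a printable value.

```python
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

def get_quantity(string):
    match = UNITS_RE.search(string)
    if match:
        return (int(match.group('quantity')), match.group('units'))
    return int(string)

def describe(string):
    """Hypothetical helper: show what get_quantity does with an input."""
    try:
        return get_quantity(string)
    except ValueError:
        # int() raises ValueError when the string isn't a plain integer
        return 'ValueError'

print(describe('4 kg'))   # (4, 'kg')
print(describe('4'))      # 4
print(describe('4 lbs'))  # ValueError
```

Anything that isn't a number with kg or g, or a bare integer, falls through to int(string) and raises a ValueError.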
We're doing this using regular expressions, which are a form of pattern matching:
import re
UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')
We're not going to get into regular expressions right now.
Instead we're going to focus on these two lines of code (from get_quantity):
match = UNITS_RE.search(string)
if match:
On the first line, the match variable stores either a match object or None.
Then if match asks the question "did we get something that is truthy?"
A match object is truthy and None is falsey, so we're basically checking whether we got a match object to work with.
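As a quick illustration of that truthiness check, here is a minimal sketch using the same UNITS_RE pattern. The variable names found and missing are made up for this example.

```python
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

found = UNITS_RE.search('4 kg')  # a match object
missing = UNITS_RE.search('4')   # None, because there's no unit

print(bool(found))      # True
print(bool(missing))    # False
print(missing is None)  # True
```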
Those two lines above (the assignment to match and the conditional check based on match) can actually be combined into one line of code.
Embedding an assignment into another line with assignment expressions
We can take these two lines of code:
match = UNITS_RE.search(string)
if match:
And combine them into one line of code using an assignment expression (new in Python 3.8):
if match := UNITS_RE.search(string):
Before we had an assignment statement and a condition (that we were checking in our if statement). Now we have both in one line of code.
import re

UNITS_RE = re.compile(r'^(?P<quantity>\d+)\s*(?P<units>kg|g)$')

def get_quantity(string):
    if match := UNITS_RE.search(string):
        return (int(match.group('quantity')), match.group('units'))
    return int(string)
We're using the walrus operator, which is the thing that powers assignment expressions.
Assignment expressions allow us to embed an assignment inside of another line of code. They use the walrus operator (:=):
if match := UNITS_RE.search(string):
Which is different from a plain assignment statement (=) because an assignment statement has to be on a line all on its own:
match = UNITS_RE.search(string)
The := is called the walrus operator because it looks kind of like a walrus on its side: the colon looks sort of like eyes and the equal sign looks kind of like tusks.
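To make the difference between the two forms concrete, here is a small sketch. The numbers list and the names total and message are invented for illustration. It shows that := both assigns and evaluates to a value, while a plain = inside a condition is rejected by Python before the code ever runs.

```python
numbers = [3, 8, 1]

# The walrus form assigns to total and also evaluates to that value,
# so it can be part of a larger expression.
if (total := sum(numbers)) > 10:
    message = f"large total: {total}"
else:
    message = f"small total: {total}"

print(message)  # large total: 12

# The statement form can't be embedded in a condition.
# Compiling it (without running it) raises a SyntaxError.
try:
    compile("if total = sum(numbers):\n    pass", "<example>", "exec")
    statement_in_condition_ok = True
except SyntaxError:
    statement_in_condition_ok = False

print(statement_in_condition_ok)  # False
```

Note the parentheses around the assignment expression: without them, total would be assigned the result of the whole comparison instead of the sum.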
Checking to see if we got a match object when using regular expressions in Python is a very common use of the walrus operator.
A use case for the walrus operator
Another common use case for the walrus operator is in a while loop.
Specifically it's common to see a walrus operator used in a while loop that repeatedly:
- Stores a value based on an expression
- Checks a condition based on that value
With the walrus operator we can perform both of those actions at the same time.
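Here is a minimal sketch of that store-then-check pattern, separate from the example that follows. It reads lines until it hits a blank line or the end of the input. The io.StringIO object stands in for a real file, and its contents are made up.

```python
import io

# Stand-in for a real file (the contents are invented for this example)
lines = io.StringIO("first\nsecond\n\nafter blank\n")

collected = []
# Store the stripped line in line, and keep looping only while it's non-empty
while line := lines.readline().strip():
    collected.append(line)

print(collected)  # ['first', 'second']
```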
We have a function called compute_md5:
import hashlib

def compute_md5(filename):
    md5 = hashlib.md5()
    with open(filename, mode="rb") as f:
        while chunk := f.read(8192):
            md5.update(chunk)
    return md5.hexdigest()
This function takes a file name and gives us back the MD5 checksum of that file:
>>> compute_md5('units.py')
'b6a5563be535cb94a44d8aea5f9b0f8c'
We might use a function like this if we were trying to check for duplicate files or verify that a large file downloaded accurately.
We're not going to focus on the details of this function though. What we care about is what the walrus operator is doing in this compute_md5 function, and what the alternative to the walrus operator would be here.
We're repeatedly reading eight kilobytes (8192 bytes) into the chunk variable:
while chunk := f.read(8192):
    md5.update(chunk)
The alternative to this is to assign to the chunk variable before our loop, check the value of chunk in our loop condition, and also assign to chunk at the end of each loop iteration:
chunk = f.read(8192)
while chunk:
    md5.update(chunk)
    chunk = f.read(8192)
I would argue that using an assignment expression makes this code more readable than the alternative because we've taken what was three lines of code and turned it into just one line.
In each iteration of our loop we're grabbing a chunk, checking its truthiness (to see whether we've reached the end of the file) and assigning that chunk to the chunk variable. And we're doing all of this in just one line of code:
while chunk := f.read(8192):
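To check that the two loop styles really do the same work, here is a sketch that runs both over the same bytes and compares the results. The function names md5_with_walrus and md5_without_walrus, the chunk_size parameter, and the sample data are all assumptions for this example. It uses io.BytesIO in place of a real file so it runs anywhere.

```python
import hashlib
import io

def md5_with_walrus(f, chunk_size=8192):
    md5 = hashlib.md5()
    # Read and check in one step
    while chunk := f.read(chunk_size):
        md5.update(chunk)
    return md5.hexdigest()

def md5_without_walrus(f, chunk_size=8192):
    md5 = hashlib.md5()
    chunk = f.read(chunk_size)      # read before the loop
    while chunk:
        md5.update(chunk)
        chunk = f.read(chunk_size)  # read again at the end of each iteration
    return md5.hexdigest()

data = b"walrus" * 5000  # 30,000 bytes, so several chunks get read

walrus_digest = md5_with_walrus(io.BytesIO(data))
plain_digest = md5_without_walrus(io.BytesIO(data))
print(walrus_digest == plain_digest)  # True
```

Both versions stop when f.read returns an empty bytes object, which is falsey and signals the end of the file.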
Summary
Assignment expressions use the walrus operator (:=).
Assignment expressions are a way of taking an assignment statement and embedding it in another line of code. I don't recommend using them unless they make your code more readable.
from Planet Python