Transcript
Let's talk about the map and filter functions in Python, and why I don't usually recommend using them (related: I also don't recommend lambda expressions).
The map function transforms each item
The map function accepts a function and an iterable. Here we're passing a square function and a numbers list to map:
>>> def square(n):
...     return n**2
...
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> squared_numbers = map(square, numbers)
The map function returns a lazy iterable:
>>> squared_numbers
<map object at 0x7f241e1f47f0>
As we loop over this map object (squared_numbers), the map object will loop over the given iterable (numbers) and call the given function (square) on each item in the iterable, giving us back the return value of that function call:
>>> list(squared_numbers)
[4, 1, 9, 16, 49, 121, 324]
In this case, our map object is squaring each of the numbers in the given numbers iterable.
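One consequence of that laziness is worth a quick sketch: a map object can only be looped over once. This reuses the square function and numbers list from above; the names first_pass and second_pass are just for illustration.

```python
def square(n):
    return n**2

numbers = [2, 1, 3, 4, 7, 11, 18]
squared_numbers = map(square, numbers)

first_pass = list(squared_numbers)   # looping consumes the map object
second_pass = list(squared_numbers)  # nothing is left for a second loop

print(first_pass)   # [4, 1, 9, 16, 49, 121, 324]
print(second_pass)  # []
```

If you need the results more than once, turn the map object into a list first and loop over that list instead.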
You can think of map as doing a transformation operation. The map function:
- Takes an iterable
- Takes an operation to perform on each item in the iterable
- Performs the given operation on each item as we loop over it
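The last point in that list ("as we loop over it") can be seen with a small sketch. Here square records each call in a calls list (a name made up for this example) so we can check when the function actually runs.

```python
calls = []

def square(n):
    calls.append(n)  # record each call so we can see when it happens
    return n**2

numbers = [2, 1, 3]
squared_numbers = map(square, numbers)

calls_before_looping = list(calls)  # []: making the map object calls nothing

first = next(squared_numbers)       # ask the map object for just one item
calls_after_one_item = list(calls)  # [2]: only the first item was squared

print(calls_before_looping, first, calls_after_one_item)  # [] 4 [2]
```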
The filter function filters items down
The filter function also accepts a function and an iterable. Here we're passing an is_odd function and a numbers list to filter:
>>> def is_odd(n):
...     return n % 2 == 1
...
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> odd_numbers = filter(is_odd, numbers)
Like map, the filter function gives us back a lazy iterable:
>>> odd_numbers
<filter object at 0x7fbf13c1d7c0>
As we loop over this filter object (odd_numbers), the filter object will loop over the given iterable (numbers), and call the given function (is_odd) on each item in it. However, it doesn't give us back the return value of that function call; instead it uses that function call to determine whether that item should be included in the resulting lazy iterable:
>>> list(odd_numbers)
[1, 3, 7, 11]
In this case, we're only getting odd numbers, because the filter function will only include items where True (or a truthy value) is returned when that item is passed to the given function (is_odd in our case).
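To illustrate the "or a truthy value" part, here's a minimal sketch. The remainder function is a hypothetical helper that returns 1 or 0 rather than True or False, and filter treats it the same way as is_odd.

```python
def is_odd(n):
    return n % 2 == 1

def remainder(n):
    # Hypothetical helper: returns 1 (truthy) for odd numbers, 0 (falsy) for even
    return n % 2

numbers = [2, 1, 3, 4, 7, 11, 18]

strict = list(filter(is_odd, numbers))     # function returns True/False
truthy = list(filter(remainder, numbers))  # function returns 1/0

print(strict)  # [1, 3, 7, 11]
print(truthy)  # [1, 3, 7, 11]
```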
map and filter are equivalent to writing a generator expression
- The map function takes each item in a given iterable and includes all of them in a new lazy iterable, transforming each item along the way
- The filter function doesn't transform the items, but it selectively picks out which items it should include in the new lazy iterable
The reason I don't usually recommend using map and filter is that they can each be summed up in just one line of Python code.
The map function is nearly equivalent to this generator expression:
def map(function, iterable):
    return (function(x) for x in iterable)
There's a little bit more to the map function than this, but for most use cases map is essentially the same as a generator expression that loops over an iterable and calls a function on every item in that iterable (to transform each item).
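The transcript doesn't spell out what that "little bit more" is. One difference I'm sure of is that the built-in map accepts more than one iterable and passes one item from each to the function, stopping at the shortest. A small sketch (the bases and exponents lists are made up for this example):

```python
bases = [2, 3, 4]
exponents = [3, 2, 1]

# map calls pow(2, 3), pow(3, 2), pow(4, 1)
powers = list(map(pow, bases, exponents))

# The generator-style equivalent needs zip to pair the items up
same_powers = [pow(b, e) for b, e in zip(bases, exponents)]

print(powers)       # [8, 9, 4]
print(same_powers)  # [8, 9, 4]
```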
The filter function is essentially the same as this generator expression:
def filter(function, iterable):
    return (x for x in iterable if function(x))
This generator expression loops over an iterable and calls a function on each item in the conditional part of the generator expression to determine whether the items should be included in the new lazy iterable.
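As a rough check of those two one-liners, here they are under the names my_map and my_filter (renamed in this sketch only so they don't shadow the built-ins), compared against the real map and filter on the same inputs.

```python
def my_map(function, iterable):
    return (function(x) for x in iterable)

def my_filter(function, iterable):
    return (x for x in iterable if function(x))

def square(n):
    return n**2

def is_odd(n):
    return n % 2 == 1

numbers = [2, 1, 3, 4, 7, 11, 18]

mapped_matches = list(my_map(square, numbers)) == list(map(square, numbers))
filtered_matches = list(my_filter(is_odd, numbers)) == list(filter(is_odd, numbers))

print(mapped_matches, filtered_matches)  # True True
```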
Nested map and filter calls vs generator expressions
We have square and is_odd functions here:
>>> def square(n):
...     return n**2
...
>>> def is_odd(n):
...     return n % 2 == 1
...
And we have a list, numbers:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
We could use the map and filter functions to take numbers and square all of the odd numbers (that is, only including odd numbers and squaring each included number).
We could pass is_odd and numbers to the filter function and then take the filter object we get back (which is a lazy iterable) and pass it to map along with the square function:
>>> numbers = [2, 1, 3, 4, 7, 11, 18]
>>> map(square, filter(is_odd, numbers))
<map object at 0x7ff0b70ef1c0>
This makes a lazy iterable which will include squares of all of the odd numbers in our list.
As we loop over the lazy map object we get back, we'll see that it includes the square of all the odd numbers from our original list:
>>> list(map(square, filter(is_odd, numbers)))
[1, 9, 49, 121]
We could accomplish this same task using a generator expression, like this:
>>> (square(n) for n in numbers if is_odd(n))
<generator object <genexpr> at 0x7ff0b710aba0>
Though if we wanted to get a list instead of a lazy iterable, we could write it as a list comprehension instead:
>>> [square(n) for n in numbers if is_odd(n)]
[1, 9, 49, 121]
I find this list comprehension (or generator expression) version a lot more readable than the equivalent map and filter version of the same code:
>>> list(map(square, filter(is_odd, numbers)))
[1, 9, 49, 121]
The map and filter version is a little bit inside-out looking: we pass a function (square) to map along with a filter object which has a function (is_odd) and an iterable (numbers) passed to it.
Whereas the list comprehension version looks more like the English sentence I might say in order to describe the operation we're performing:
>>> [square(n) for n in numbers if is_odd(n)]
[1, 9, 49, 121]
In fact, with the generator expression or list comprehension, you don't even need extra functions to call (unlike with map and filter). You can write out the operations (n**2 and if n % 2 == 1) right inside the first part and last part of a list comprehension (or generator expression):
>>> [n**2 for n in numbers if n % 2 == 1]
[1, 9, 49, 121]
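For comparison, here's a sketch of what skipping the named functions would look like with map and filter. As far as I know the only way to inline the operations there is with lambda expressions, which gives the same result as the comprehension but is harder to read.

```python
numbers = [2, 1, 3, 4, 7, 11, 18]

# map and filter need a function for each operation, so we end up with two lambdas
with_lambdas = list(map(lambda n: n**2, filter(lambda n: n % 2 == 1, numbers)))

# The comprehension states both operations directly
with_comprehension = [n**2 for n in numbers if n % 2 == 1]

print(with_lambdas)        # [1, 9, 49, 121]
print(with_comprehension)  # [1, 9, 49, 121]
```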
In fact, I think of the first part of a generator expression as the mapping part, and the last part of a generator expression as the filtering part, because they serve the same purpose as the built-in map and filter functions.
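A short sketch of that idea, using list comprehensions so the results are visible right away: the first one only maps, the second only filters, and the third does both. The variable names are just for illustration.

```python
numbers = [2, 1, 3, 4, 7, 11, 18]

mapped = [n**2 for n in numbers]               # mapping part only
filtered = [n for n in numbers if n % 2 == 1]  # filtering part only
both = [n**2 for n in numbers if n % 2 == 1]   # mapping and filtering together

print(mapped)    # [4, 1, 9, 16, 49, 121, 324]
print(filtered)  # [1, 3, 7, 11]
print(both)      # [1, 9, 49, 121]
```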
Summary
The map function performs a transformation on each item in an iterable, returning a lazy iterable back. The filter function filters down items in an iterable, returning a lazy iterable back.
Instead of map and filter, I tend to prefer generator expressions. The first part of a generator expression is the mapping part, and the last optional part of a generator expression (the condition) is the filtering part.