In this post, let’s discuss the basics of generators in Python: what benefits generators offer, and how we use yield to implement them.
Iterables
First, we must understand what iterables are before we can understand generators, because a generator is, in essence, also an iterator.
Most collection data structures in Python are iterables. For example, we can create a list and iterate over its items one by one:
lst = [1, 2, 3]
for i in lst:
    print(i)
# 1
# 2
# 3
lst = [x+x for x in range(3)]
for x in lst:
    print(x)
# 0
# 2
# 4
We can also iterate over the characters in a string:
string = "cat"
for c in string:
    print(c)
# c
# a
# t
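To make the link between iterables and iterators concrete, here is a minimal sketch of roughly what a for loop does for us, written out by hand with the built-in iter and next functions. The names it and collected are chosen only for this illustration.

```python
# Roughly what "for item in lst" does behind the scenes.
lst = [1, 2, 3]
it = iter(lst)            # ask the iterable for an iterator

collected = []
while True:
    try:
        item = next(it)   # fetch the next value from the iterator
    except StopIteration:
        break             # raised once the iterator is exhausted
    collected.append(item)

print(collected)
# [1, 2, 3]
```

The same next call is what we will use on generators later in the post.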
The limitation of iterables
The limitation of collections such as lists is that all of their values must be stored in memory at once, which costs too much memory in some scenarios. A typical scenario is reading lines from a file:
def file_reader(file_path):
    # Reads the whole file into memory and returns a list of lines
    with open(file_path) as fp:
        return fp.read().split("\n")
Think about what happens if we read a large file, such as one that is 6 GB in size.
We need to hold all the lines in memory when loading the content of the file.
In most cases, though, we only want to iterate line by line to process the data, and this is why generators were introduced.
Generator
A generator is also an iterator, but its key feature is lazy evaluation. Lazy evaluation is a classic concept in computer science and has been adopted by many programming languages, such as Haskell. The core idea of lazy evaluation is call-by-need. Lazy evaluation can reduce the memory footprint, since values are created only when they are needed.
A generator is used to iterate on demand. We do not calculate and store all the values at once, but generate them on the fly as we iterate.
There are two ways to create a generator: a generator expression and a generator function.
A generator expression is similar to a list comprehension, except that we use () instead of []. As with any iterator, we use the next function to get the next item:
g1 = (x*x for x in range(10))
print(type(g1))
print(next(g1))
print(next(g1))
# <class 'generator'>
# 0
# 1
The difference here is that we don’t compute all the values when creating the generator. x*x is calculated only as we iterate over the generator.
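As a rough illustration of the memory difference, the sketch below compares the size of a list comprehension with that of the equivalent generator expression using sys.getsizeof. Note the assumptions: the exact numbers are specific to CPython, and getsizeof measures only the container object itself, not the integers it refers to, so only the comparison is shown rather than concrete sizes.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]   # all values are built up front
squares_gen = (x * x for x in range(n))    # nothing is computed yet

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n
print(list_size > gen_size)
# True

# Both produce the same values; the generator computes them during sum().
print(sum(squares_gen) == sum(squares_list))
# True
```

One trade-off to keep in mind: after the call to sum, squares_gen is exhausted and cannot be iterated again, while the list can be reused as often as we like.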
Yield
Another way to create a generator is to write a generator function. When a function contains the yield keyword, calling it returns a generator instead of running the function body right away.
def fib(cnt):
    # Yield the first cnt Fibonacci numbers, one at a time
    n, a, b = 0, 0, 1
    while n < cnt:
        yield a
        a, b = b, a + b
        n = n + 1

g = fib(10)
for i in range(10):
    print(next(g), end=" ")
# 0 1 1 2 3 5 8 13 21 34
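To see how execution pauses at each yield and resumes on the next request, here is a small sketch that records what the function body does. The countdown function and the events list are hypothetical names used only for this demonstration.

```python
events = []

def countdown(start):
    # Record each step so we can see when the body actually runs.
    events.append("started")
    while start > 0:
        events.append(f"yielding {start}")
        yield start          # execution pauses here until the next request
        start -= 1
    events.append("finished")

g = countdown(2)
print(events)
# []

first = next(g)              # runs the body up to the first yield
print(first, events)
# 2 ['started', 'yielding 2']

rest = list(g)               # resumes after the yield and runs to the end
print(rest, events)
# [1] ['started', 'yielding 2', 'yielding 1', 'finished']
```

Calling countdown(2) runs none of the body; the work happens only when a value is requested.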
Let’s use yield to rewrite the file reading program above:
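def file_reader(file_path):
    # Yield one line at a time; the file is closed when iteration ends
    with open(file_path, "r") as fp:
        for row in fp:
            yield row

for row in file_reader('./demo.txt'):
    # Each row already ends with a newline, so don't add another
    print(row, end="")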
In this way, we don’t load all the content into memory; each line is read only when the loop asks for it.
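Generators can also be chained, so that each line flows through several processing steps without ever building a full list in between. The sketch below assumes a small temporary file in place of ./demo.txt, and non_empty is a hypothetical helper written only for this example.

```python
import os
import tempfile

def file_reader(file_path):
    # Yield one line at a time instead of loading the whole file.
    with open(file_path, "r") as fp:
        for row in fp:
            yield row

def non_empty(rows):
    # Hypothetical helper: strip whitespace and skip blank lines, lazily.
    for row in rows:
        stripped = row.strip()
        if stripped:
            yield stripped

# Create a small demo file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\n\nbeta\ngamma\n")
    demo_path = tmp.name

# list() is used only to show the result; a for loop would stay lazy.
lines = list(non_empty(file_reader(demo_path)))
os.remove(demo_path)
print(lines)
# ['alpha', 'beta', 'gamma']
```

With a real multi-gigabyte file we would loop over non_empty(file_reader(path)) directly, so only one line is held in memory at a time.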
The post Python: Generator and Yield appeared first on Coder's Cat.
from Planet Python