Thursday, March 26, 2020

"Coder's Cat": Python: Generator and Yield

In this post, let’s discuss some basics of generators in Python: what benefit a generator offers, and how we use yield to implement one.

Iterables

First, we must understand what iterables are before we can understand generators, because a generator is essentially an iterator.

Most collection data structures in Python are iterables. For example, we can create a list and iterate over it element by element:

lst = [1, 2, 3]
for i in lst:
    print(i)

# 1
# 2
# 3

lst = [x+x for x in range(3)]
for x in lst:
    print(x)
# 0
# 2
# 4

We can also iterate over the characters in a string:

string = "cat"
for c in string:
    print(c)

# c
# a
# t
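
What a for loop does with an iterable can be sketched with the built-in iter and next functions. This is a simplified illustration rather than the exact interpreter behavior, and the names it and chars are only used for this example:

```python
# A for loop first asks the iterable for an iterator, then calls next()
# on that iterator until StopIteration is raised.
string = "cat"
it = iter(string)

chars = []
while True:
    try:
        chars.append(next(it))
    except StopIteration:
        # The iterator is exhausted, so the loop ends.
        break

print(chars)
# ['c', 'a', 't']
```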

The limitation of iterables

The limitation of collection iterables such as lists is that all the values must be stored in memory, which costs too much memory in some scenarios. A typical scenario is reading lines from a file:

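def file_reader(file_path):
    # Read the whole file into memory, then split it into lines.
    # The with statement closes the file when we are done.
    with open(file_path) as fp:
        return fp.read().split("\n")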

Think about what will happen if we read a large file, such as one with a size of 6 GB.

We need to save all the lines in memory when loading content from the file.

Actually, in most cases, we only want to iterate line by line to do data processing, which is why generators were introduced.

Generator

A generator is also an iterator, but its key feature is lazy evaluation. Lazy evaluation is a classic concept in computer science, adopted by many programming languages such as Haskell. The core idea of lazy evaluation is call-by-need. Lazy evaluation can reduce the memory footprint, since values are created only when they are needed.

A generator is used to iterate on demand. We do not calculate and store all the values at once, but generate them on the fly as we iterate.

We have two ways to create a generator: a generator expression and a generator function.

A generator expression is similar to a list comprehension, except that we use () instead of []. As with any iterator, we use the next function to get the next item:

g1 = (x*x for x in range(10))
print(type(g1))
print(next(g1))
print(next(g1))

# <class 'generator'>
# 0
# 1

The difference here is that we don’t compute all the values when creating the generator. x*x is calculated only as we iterate over the generator.
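
A rough way to see the effect on memory is to compare the size of a list with the size of an equivalent generator expression. This is only a sketch: sys.getsizeof reports the size of the container object itself (not the integers it refers to), the exact numbers depend on the Python version, and the variable names are chosen just for this example:

```python
import sys

# A list comprehension builds and stores every value up front.
squares_list = [x * x for x in range(1000000)]
# A generator expression only keeps the state needed to produce the next value.
squares_gen = (x * x for x in range(1000000))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(list_size > gen_size)
# True
```

The generator stays small no matter how large the range is, because no values are produced until we ask for them.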

Yield

Another way to create a generator is to write a generator function. When a function body contains the yield keyword, calling that function returns a generator:

def fib(cnt):
    n, a, b = 0, 0, 1
    while n < cnt:
        yield a
        a, b = b, a + b
        n = n + 1

g = fib(10)
for i in range(10):
    # next(g) resumes fib until it reaches the next yield
    print(next(g), end=" ")

# 0 1 1 2 3 5 8 13 21 34
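
To make the pause-and-resume behavior of yield more visible, here is a small sketch. The countdown function and the events list are hypothetical names used only for this illustration:

```python
events = []

def countdown(n):
    # Nothing in this body runs until next() is called for the first time.
    events.append("started")
    while n > 0:
        yield n  # pause here and hand n to the caller
        n -= 1
    events.append("finished")

g = countdown(2)
print(events)
# []

first = next(g)
print(first, events)
# 2 ['started']

second = next(g)
print(second)
# 1

try:
    next(g)
    exhausted = False
except StopIteration:
    # Raised once the function body runs to its end.
    exhausted = True

print(exhausted, events)
# True ['started', 'finished']
```

A for loop handles StopIteration for us, which is why we usually iterate over a generator directly instead of calling next by hand.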

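Let’s use yield to rewrite the file reading program above:

def file_reader(file_path):
    # Yield one line at a time; the with statement closes the file
    # once iteration finishes.
    with open(file_path, "r") as fp:
        for row in fp:
            yield row

for row in file_reader('./demo.txt'):
    # Each row already ends with a newline, so don't add another one.
    print(row, end="")

In this way, we don’t load all the content into memory. Instead, each line is read only when the loop asks for it.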
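
Because the generator produces rows on demand, it can be passed straight into other code that consumes one item at a time. The sketch below assumes a hypothetical helper, count_non_empty, and writes a small temporary file so that it can run on its own:

```python
import os
import tempfile

def file_reader(file_path):
    # Yield one line at a time instead of reading the whole file.
    with open(file_path, "r") as fp:
        for row in fp:
            yield row

def count_non_empty(rows):
    # Hypothetical helper: consumes rows one by one, so only the
    # current line needs to be held in memory.
    return sum(1 for row in rows if row.strip())

# Create a small temporary file to stand in for a large input file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\n\nsecond\nthird\n")
    path = tmp.name

result = count_non_empty(file_reader(path))
os.remove(path)

print(result)
# 3
```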

    The post Python: Generator and Yield appeared first on Coder's Cat.


