Generator

Definition

PEP 255 -- Simple Generators

A generator is a function that returns a generator iterator. It looks like a normal function except that it contains yield expressions for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function.

A generator is simply a function which returns an object on which you can call next, such that for every call it returns some value, until it raises a StopIteration exception, signaling that all values have been generated. Such an object is called an iterator.
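
The protocol described above can be sketched by driving a generator by hand. The countdown function here is a hypothetical example, not part of the original text; it only illustrates next(), StopIteration, and the optional default argument of next().

```python
def countdown(n):
    """Hypothetical example generator: yields n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

it = countdown(3)

# A generator iterator is its own iterator: iter(it) returns the same object.
same_object = iter(it) is it

collected = []
while True:
    try:
        collected.append(next(it))
    except StopIteration:
        # Raised once the generator body has finished.
        break

# next() accepts a default that is returned instead of raising StopIteration.
after_exhaustion = next(it, None)

print(same_object, collected, after_exhaustion)
```

A for-loop performs the same try/except handling internally, which is why StopIteration is rarely seen in everyday code.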

Normal functions return a single value using return. In Python, however, there is an alternative, called yield. Using yield anywhere in a function makes it a generator.

Motivation

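Generators produce values lazily, one at a time, so a large or even unbounded sequence never has to be held in memory all at once. This can reduce memory use substantially and, when the consumer stops early or values are expensive to compute, execution time as well.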
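
As a rough illustration of the memory argument, the sketch below compares the size of a fully built list with the size of a generator object producing the same values. The squares function and the value of n are assumptions chosen for the example; sys.getsizeof reports only the size of the container object itself, so the real gap is larger than shown.

```python
import sys

def squares(n):
    """Hypothetical example generator: yields x*x for x in range(n)."""
    for x in range(n):
        yield x * x

n = 100_000

squares_list = [x * x for x in range(n)]  # all values stored at once
squares_gen = squares(n)                  # values produced on demand

list_size = sys.getsizeof(squares_list)   # grows with n
gen_size = sys.getsizeof(squares_gen)     # small and independent of n

print(list_size > gen_size)
print(sum(squares_gen) == sum(squares_list))
```

Both approaches produce the same sum; only the amount of data held in memory at any moment differs.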

How to create a generator in Python

def simple_gen(n):
    print(f'Call 1: n={n}')
    yield n
    print(f'Call 2: n={n}')
    yield n + 1
    print(f'Call 3: n={n}')
    yield n + 2

g = simple_gen(5)

print(next(g))
print(next(g))
print(next(g))
print(next(g))
Output:
Call 1: n=5
5
Call 2: n=5
6
Call 3: n=5
7
Traceback (most recent call last):
  File "<input>", line 13, in <module>
StopIteration

def infinite_gen():
    n = 0
    while True:
        yield n
        n += 1

g = infinite_gen()

print(next(g))
print(next(g))
Output:
0
1

Observe that calling a generator function creates a generator object but does not run its code. Only calls to next() execute (part of) the code. Execution pauses when a yield statement is reached, and the yielded value is returned to the caller. The next call to next() resumes execution from the state in which the generator was left after that yield. This is a fundamental difference from regular functions, which always start execution at the top and discard their local state when they return.
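
One way to watch this pause-and-resume behaviour is the standard library function inspect.getgeneratorstate. The two_steps generator below is a hypothetical example introduced only for this sketch.

```python
import inspect

def two_steps():
    """Hypothetical example generator with two yield points."""
    yield 'first'
    yield 'second'

g = two_steps()
states = [inspect.getgeneratorstate(g)]      # created, body not started yet

next(g)
states.append(inspect.getgeneratorstate(g))  # paused at the first yield

next(g)
next(g, None)                                # body finishes; the default avoids StopIteration
states.append(inspect.getgeneratorstate(g))  # finished

print(states)  # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```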

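The following snippet illustrates three key points about generators: calling the function only creates the generator object, iterating over it runs the code, and an exhausted generator produces nothing more: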

def numbers():
    for i in range(10):
        yield i

gen = numbers() # This line only returns a generator object, it does not run the code defined inside numbers.

for i in gen: # We iterate over the generator and the values are printed.
    print(i)

# The generator is now exhausted.

for i in gen: # So this for block does not print anything.
    print(i)
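
Because an exhausted generator cannot be rewound, iterating a second time requires calling the generator function again. A minimal sketch, using a shortened version of numbers() from above:

```python
def numbers():
    for i in range(3):
        yield i

gen = numbers()
first_pass = list(gen)        # consumes the generator
second_pass = list(gen)       # nothing left to yield
fresh_pass = list(numbers())  # a new call creates a new, independent generator

print(first_pass, second_pass, fresh_pass)  # [0, 1, 2] [] [0, 1, 2]
```

If the values are needed more than once, storing them in a list is an alternative, at the cost of the memory the generator was saving.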

Typing

Generator

A generator can be annotated by the generic type Generator[YieldType, SendType, ReturnType]. For example:

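from typing import Generator

def echo_round() -> Generator[int, float, str]:
    sent = yield 0                # yields int; values passed to send() arrive in `sent`
    while sent >= 0:
        sent = yield round(sent)
    return 'Done'                 # the str return value travels on StopIteration.value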
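
To make the three type parameters concrete, the sketch below drives echo_round by hand: values come out through yield (YieldType), go in through send() (SendType), and the return value is carried by StopIteration (ReturnType). The particular numbers sent are arbitrary choices for the example.

```python
from typing import Generator

def echo_round() -> Generator[int, float, str]:
    sent = yield 0
    while sent >= 0:
        sent = yield round(sent)
    return 'Done'

gen = echo_round()

first = next(gen)        # runs to the first yield
rounded = gen.send(2.7)  # resumes with sent = 2.7, yields round(2.7)

try:
    gen.send(-1.0)       # a negative value ends the loop, so the function returns
except StopIteration as stop:
    result = stop.value  # the generator's return value

print(first, rounded, result)  # 0 3 Done
```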

If your generator will only yield values, set the SendType and ReturnType to None:

def infinite_stream(start: int) -> Generator[int, None, None]:
    while True:
        yield start
        start += 1

AsyncGenerator

An async generator can be annotated by the generic type AsyncGenerator[YieldType, SendType]. For example:

from typing import AsyncGenerator

async def echo_round() -> AsyncGenerator[int, float]:
    sent = yield 0
    while sent >= 0.0:
        # round() is synchronous, so no await is needed here
        sent = yield round(sent)

If your generator will only yield values, set the SendType to None:

async def infinite_stream(start: int) -> AsyncGenerator[int, None]:
    while True:
        yield start
        # increment() is assumed to be a coroutine function defined elsewhere
        start = await increment(start)
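
A minimal sketch of how such an async generator might be consumed with async for. The increment coroutine is not defined in the text, so the version here is a hypothetical stand-in, and take() is a helper introduced only for this example.

```python
import asyncio
from typing import AsyncGenerator

async def increment(value: int) -> int:
    """Hypothetical stand-in for the undefined increment() coroutine."""
    await asyncio.sleep(0)  # give control back to the event loop once
    return value + 1

async def infinite_stream(start: int) -> AsyncGenerator[int, None]:
    while True:
        yield start
        start = await increment(start)

async def take(n: int) -> list:
    """Collect the first n values, then close the stream explicitly."""
    collected = []
    stream = infinite_stream(10)
    try:
        async for value in stream:
            collected.append(value)
            if len(collected) == n:
                break
    finally:
        await stream.aclose()
    return collected

result = asyncio.run(take(3))
print(result)  # [10, 11, 12]
```

Closing the stream with aclose() after breaking out of the loop is a design choice here, so that cleanup does not depend on garbage collection.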

Generator Expressions

PEP 289 -- Generator Expressions

Definition. A generator expression is an expression that returns an iterator. It looks like a normal expression followed by a for clause defining a loop variable, range, and an optional if clause.
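
As a sketch of this definition, the generator expression below is compared with a generator function that yields the same values. The even_squares function and the chosen range are assumptions made for the example.

```python
import types

def even_squares(limit):
    """Hypothetical generator function equivalent to the expression below."""
    for x in range(limit):
        if x % 2 == 0:
            yield x * x

# expression, for clause, optional if clause
gen_expr = (x * x for x in range(10) if x % 2 == 0)

is_generator = isinstance(gen_expr, types.GeneratorType)
from_expr = list(gen_expr)
from_func = list(even_squares(10))

print(is_generator, from_expr == from_func)  # True True
```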

Motivation. Many of the use cases do not need to have a full list created in memory. Instead, they only need to iterate over the elements one at a time.

Examples. For instance, the following summation code will build a full list of squares in memory, iterate over those values, and, when the reference is no longer needed, delete the list:

sum([x*x for x in range(10)])

Memory is conserved by using a generator expression instead:

sum(x*x for x in range(10))

Similar benefits are conferred on constructors for container objects:

s = set(word for line in page for word in line.split())
d = dict((k, func(k)) for k in keylist)
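
The two constructor calls above can be run with sample data. The values of page, keylist, and func below are assumptions chosen for illustration; the original text does not define them.

```python
# Hypothetical sample inputs
page = ["the quick brown fox", "the lazy dog"]
keylist = ["a", "bb", "ccc"]
func = len

s = set(word for line in page for word in line.split())
d = dict((k, func(k)) for k in keylist)

# Set and dict comprehensions give the same results
s_comp = {word for line in page for word in line.split()}
d_comp = {k: func(k) for k in keylist}

print(sorted(s))
print(d)
```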

Generator expressions are especially useful with functions like sum(), min(), and max() that reduce an iterable input to a single value:

max(len(line) for line in file if line.strip())
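
A runnable sketch of the line above, using io.StringIO as a stand-in for an open file (the sample text is an assumption). Note that len(line) includes the trailing newline, and that max() raises ValueError on an empty input unless a default is supplied.

```python
import io

# Hypothetical file contents: two blank or whitespace-only lines are skipped
file = io.StringIO("short\n\n   \na much longer line\nmid length\n")
longest = max(len(line) for line in file if line.strip())

# With an extra argument, the generator expression needs its own parentheses
empty = io.StringIO("\n\n")
longest_or_zero = max((len(line) for line in empty if line.strip()), default=0)

print(longest, longest_or_zero)  # 19 0
```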