Learning about iterators and generators in Python

Python is a cool language! I might be biased, since it’s what I used to learn programming some years ago. It wasn’t until recently, though, that I started learning about some more advanced features of the language. One such feature is iterators and generators.

An iterator is, in the words of the official documentation, an object representing a stream of data. An iterator is something you can call next on to get the next item. This is what happens in a for loop from one iteration to the next: the loop calls next behind the scenes to fetch each item.

Typical things to loop over with for include lists and strings. These are not iterators, but iterables. Again in the words of the documentation, an iterable is an object capable of returning its members one at a time. An iterable is not necessarily an iterator. For example (here, # marks the output we get):

items = [1, 2, 3]
item = next(items)
# TypeError: 'list' object is not an iterator

However, if you call iter on an iterable, you get an iterator:

items = [1, 2, 3, 4, 5]       # iterable
iterator_items = iter(items)  # iterator
item = next(iterator_items)
print(item)                   # 1
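
To convince myself of the difference, I tried the small sketch below. It suggests that every call to iter on a list hands back a fresh, independent iterator, and that an iterator only remembers what it has left to give. The variable names are just for illustration.

```python
items = [1, 2, 3]          # iterable
first = iter(items)        # one iterator over the list
second = iter(items)       # another one, independent of first

print(next(first))         # 1
print(next(first))         # 2
print(next(second))        # 1
print(list(first))         # [3]  (only what first has left)
print(list(items))         # [1, 2, 3]  (the list itself is unchanged)
```

So the list can be looped over as many times as we like, while each iterator is used up as we go.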

In addition, we have something called generators. A generator is a function which returns an iterator. It looks like a regular function, except it uses yield instead of return. Unfortunately, the term generator is used to mean both the generator function and the generator object.
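
One way to keep the two meanings apart is to ask Python directly. The sketch below uses the standard inspect module. The function name count_to_two is made up for this example.

```python
import inspect

def count_to_two():          # generator function (hypothetical example)
    yield 1
    yield 2

gen = count_to_two()         # generator object

print(inspect.isgeneratorfunction(count_to_two))  # True
print(inspect.isgenerator(gen))                   # True
print(iter(gen) is gen)                           # True, it is its own iterator
print(list(gen))                                  # [1, 2]
```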

When a generator function is called, a generator object is created, without executing anything in the function. The first time next is called on that object, the generator starts executing and runs until it reaches the first yield expression, then hands back the value given there. The next time next is called, the generator picks up where it left off and runs until the next yield expression. And so on. This is an example of lazy evaluation.

Let’s look at an example of a generator.

def a_generator():
    print("before 0")
    yield 0
    print("before hi")
    yield "hi"
    print("last")
    
iterator = a_generator()
item = next(iterator)
# before 0
print(item)                   # 0
item = next(iterator)
# before hi
print(item)                   # hi
next(iterator)
# last
# StopIteration error

We see that we called next one time too many. A for loop avoids this, since it stops quietly when it hits StopIteration.

iterator = a_generator()
for item in iterator:
    print(item)
# before 0
# 0
# before hi
# hi
# last
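
For the curious, here is my rough sketch of what a for loop does with iter, next and StopIteration behind the scenes. The helper name manual_for is made up for this post, and the real loop lives inside the interpreter, so treat this as an approximation rather than the actual implementation.

```python
def manual_for(iterable):
    """Visit items the way a for loop would and collect them (illustrative helper)."""
    collected = []
    iterator = iter(iterable)      # ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # fetch the next item
        except StopIteration:      # raised once the iterator is exhausted
            break
        collected.append(item)     # stands in for the loop body
    return collected

print(manual_for("abc"))           # ['a', 'b', 'c']
```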

Finally, an important thing lazy evaluation gives us is the ability to create infinite iterators! Since each value is only produced when someone asks for it, the stream never has to end.
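
Here is a minimal sketch of an infinite generator. The name naturals is my own, and I use itertools.islice from the standard library to take only the first few values, since looping over the whole thing would never finish.

```python
from itertools import islice

def naturals():
    """Yield 0, 1, 2, ... forever (illustrative name)."""
    n = 0
    while True:
        yield n
        n += 1

first_five = list(islice(naturals(), 5))
print(first_five)                  # [0, 1, 2, 3, 4]
```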

The iteration framework, and perhaps generators in particular, are powerful tools, and can really help with abstraction, among other things. Ned Batchelder gave a really good talk on this at PyCon 2013. I recommend you check it out!