Demystifying Generator Functions

Generator Functions are a feature of modern JavaScript that can seem like some mysterious force of nature. You may have seen them used for a variety of things; one of the first times I saw them in use was with redux-saga, and after working to figure out just what the heck they were actually doing, I wanted to share that so they no longer remain some mysterious feature.

Generator Functions can be quite the confusing thing to jump into. They add a few new tokens to the JavaScript language and can look quite strange to the untrained eye. I know the first time I saw them they were pretty odd, and for the most part I dismissed them as truly usable tools in a real application. After all, if some part of your codebase uses non-standard tools or techniques, then bringing someone in to help you modify the code is that much more difficult, as they now have a hurdle to overcome before they can comprehend the behavior of the application as a whole. This post will try to purge the cloudy bits surrounding generator functions, clarify how you can work with them, and show what their behavior lets you leverage them for. Feel free to follow along in your (modern) browser's JavaScript console – I tested all the code samples I provide in Chrome's DevTools and they work as is, without any modifications or polyfills.

In order for our story about Generator Functions to start, we must first detour back in time and talk about iterating. Iteration is a basic principle of software design: the act of performing a task repeatedly. For the most part, iteration is achieved via some kind of looping mechanism – explicit loops, recursion, or functional methods designed to wrap the looping behavior. In "classical" JavaScript one could accomplish loops with two variations of the standard for construct.

var list = [1, 2, 3];

// standard form with initializer, test, increment sections
var i, len, cur;
for (i = 0, len = list.length; i < len; ++i) {
  cur = list[i];
  console.log(cur);
}

// property iteration
var key, hasOwn = Object.prototype.hasOwnProperty;
for (key in list) {
  if (hasOwn.call(list, key)) {
    cur = list[key];
    console.log(cur);
  }
}

The former is the standard list iteration one might write to traverse an ordered list (Array). The latter is designed more for object iteration and works because arrays are nothing more than objects with "numeric" keys under the hood. The problem with the latter is that property ordering is not guaranteed, so while the former will print the values in the Array in the order you expect, the latter may not. Later versions of core JavaScript provide iteration functions like forEach and map that operate on lists in some of the most common ways.

[1, 2, 3].forEach(item => console.log(item));

const squares = [1, 2, 3].map(x => x * x);

Next up, JavaScript introduced an Iterator protocol. This protocol is pretty simple and defines a way to traverse some "list"-like structure. If you are not familiar with the concept of iterators then please review the pattern definition, but the long and the short of it is that iterators allow you to traverse some "thing" without worrying about how the "thing" is implemented. Whether you get an iterator from an Array, an Object, a binary tree implementation, or whatever – your code that takes an iterator can traverse it and work with the values without worrying about how or what is giving you each value.

JavaScript iterators happen to follow one of the simplest patterns: they have a next function that returns an object with value and done keys – that is it. So that means you can build iterators yourself. Right now. Without anything incredibly fancy.

const range = (from, to) => {
  let current = from;
  return {
    next() {
      if (current > to) {
        return { done: true };
      }

      const result = { value: current, done: false };
      current += 1;
      
      return result;
    },
  };
};

const i = range(1, 10);
const list = [];
let result = i.next();
while (!result.done) {
  list.push(result.value);
  result = i.next();
}
console.log(list); // => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

So that is pretty simple: implementing the iterator contract just works. To complement the addition of iterators, and to simplify writing the for construct, JavaScript now has a for ... of looping mechanism. This new loop works only on objects that are iterable. Out of the box, things like Array and Map are such objects, which means our earlier for ... in attempt can be rewritten:

const list = [1, 2, 3];
for (const val of list) {
  console.log(val);
}

Much cleaner. How might you make your own objects iterable? Well, we have to turn to yet another new feature of JavaScript: Symbols. Symbols are a topic of their own, so I will not be going into detail about them here, but be aware there is one symbol we really care about right now, and that is Symbol.iterator. This symbol is what for ... of expects to find – and it wants its value to be a function that returns the iterator. You can read more here. So we can make our range iterator above work within a for ... of with some slight modifications:

const range = (from, to) => {
  let current = from;
  return {
    next() {
      if (current > to) {
        return { done: true };
      }

      const result = { value: current, done: false };
      current += 1;
      
      return result;
    },
    [Symbol.iterator]() {
      return this;
    },
  };
};

for (const i of range(1, 10)) {
  console.log(i);
}
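And for ... of is not the only construct that understands Symbol.iterator – spread syntax and Array.from consume any iterable too. A quick sketch, repeating the range definition so it runs on its own:

```javascript
// The same iterable range object as above.
const range = (from, to) => {
  let current = from;
  return {
    next() {
      if (current > to) {
        return { done: true };
      }
      const result = { value: current, done: false };
      current += 1;
      return result;
    },
    [Symbol.iterator]() {
      return this;
    },
  };
};

// Anything that consumes iterables can now traverse it.
console.log([...range(1, 5)]);        // => [1, 2, 3, 4, 5]
console.log(Array.from(range(6, 8))); // => [6, 7, 8]
```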

Okay. I promised we would talk about Generator Functions, and so far you have learned about iterators. You may realize that iterators play a role in generator functions, or you may just be confused and wondering what the heck I am going on about. If you fall in the first camp, congratulations – you are correct. If you fall in the second camp, do not worry. This is the reason I started writing this post to begin with.

Starting simple, we will take a look at generator functions. The first thing to note is that you do not use function but instead function* (notice the added asterisk). This is the defining aspect of generator functions – leave out the asterisk and you are not writing a generator function! So we start extremely simple.

function* one() {
  return 1;
}

That is it. You have defined your very first generator function. Now call it.

> one()
<- one {<suspended>}

Okay, what the heck? We defined the function to return the number 1, but it returned something weird (if you are using Firefox/Safari/other you may see a different representation; this output is from Chrome). What we got from the call to one is a Generator, which, believe it or not, is an Iterator! Calling a generator function does not execute any of its body; it instead creates a new generator. This generator can be iterated to completion. Give it a try!

> one().next()
<- {value: 1, done: true}

In this example all we did was return a value, and you may notice that done is set to true. If you are adventurous like I am, you may have tossed this into a for ... of only to find you do not get anything. That is because we used return, which marks the iterator as done after one call to next – and regardless of whether value is set, if done is true then for ... of is finished. So how can we "return" without using return? That is where yield comes into play. It works very similarly to return, but only inside generator functions. When you call next, all the code up to the next yield is run. Let us take our range iterator from above and write it as a generator function – we will use yield to "return" each next value in the range.

function* rangeFn(from, to) {
  let current = from;
  while (current <= to) {
    yield current;
    current += 1;
  }
}

const newList = [];
for (const i of rangeFn(5, 9)) {
  newList.push(i);
}
console.log(newList); // => [5, 6, 7, 8, 9]

I do not know how you feel, but in my opinion this makes a lot more sense than manually building iterators – once you get over the hurdle of the new syntax like function* and yield. Try it yourself too; do not just take the language constructs' word that it works like an iterator.

> it = rangeFn(2, 8);
<- rangeFn {<suspended>}
> it.next()
<- {value: 2, done: false}
> it.next()
<- {value: 3, done: false}
> it.next()
<- {value: 4, done: false}

Nothing magical is happening, necessarily. We just have a specialized iterator built to work over the body of a function. We do not need to worry about the specifics of what the JavaScript VM is doing to enable these features, but if you really are that interested, by all means dig in. Looking into how the regenerator library implements the behavior of generator functions in classic JavaScript might also be of some interest.
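To convince yourself that a generator really is just an ordinary iterator all the way to the end, exhaust one. Once the function body finishes, next reports done: true (with no value), and every call after that keeps reporting it:

```javascript
function* rangeFn(from, to) {
  let current = from;
  while (current <= to) {
    yield current;
    current += 1;
  }
}

const it = rangeFn(7, 8);
console.log(it.next()); // => {value: 7, done: false}
console.log(it.next()); // => {value: 8, done: false}
// The while loop has exited, so the function body returns.
console.log(it.next()); // => {value: undefined, done: true}
// Calling next again is safe; it stays done forever.
console.log(it.next()); // => {value: undefined, done: true}
```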

Shifting focus back to generator functions, we have only just started to scratch the surface. Notice that our range generator function exits on its own when we hit our limit. But if we stop iterating before that limit, the iterator reference (assuming we are in real code and not the console) can simply be cleaned up by a garbage collector without ever completing. That may get you thinking: "Well, what about infinite sequences?"

function* allIntegers() {
  let a = 1;
  while (true) {
    yield a;
    a += 1;
  }
}

Give that a whirl. It just keeps going! So now we can see that generator functions can be powerful tools. This one could serve as an auto-incrementing ID generator – not something you would want where IDs must be specific or stable, but for identifying ephemeral data during a user session it should be plenty sufficient.

// inside simpleId.js
function* allIntegers() {
  let a = 1;
  while (true) {
    yield a;
    a += 1;
  }
}

const idIterator = allIntegers();

const nextId = () => idIterator.next().value;

export default nextId;
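Anywhere you import nextId from, each call advances the same underlying generator. Here is the same idea flattened into plain script form (module syntax dropped so it runs directly in a console):

```javascript
// Same as the simpleId module, minus the export.
function* allIntegers() {
  let a = 1;
  while (true) {
    yield a;
    a += 1;
  }
}

// One shared generator backs every call to nextId.
const idIterator = allIntegers();
const nextId = () => idIterator.next().value;

console.log(nextId()); // => 1
console.log(nextId()); // => 2
console.log(nextId()); // => 3
```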

Boom. What other kinds of extremely common sequences can be rewritten as infinite generators?

// infinite fibonacci sequence
function* fib() {
  let a = 1;
  let b = 1;
  while (true) {
    yield a;
    const c = b;
    b = a + b;
    a = c;
  }
}

// infamous fizz buzz
function* fizzBuzz() {
  let current = 1;
  while (true) {
    if (current % 5 === 0 && current % 3 === 0) {
      yield "FizzBuzz";
    } else if (current % 3 === 0) {
      yield "Fizz";
    } else if (current % 5 === 0) {
      yield "Buzz";
    } else {
      yield current;
    }
    current += 1;
  }
}
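Pulling "just the first n values" out of infinite sequences like these is common enough to deserve a helper. Note that take is not built into JavaScript – this is just a sketch of my own, with fib repeated so the snippet stands alone:

```javascript
// A hypothetical helper: yields at most n values from any iterable.
function* take(n, iterable) {
  let remaining = n;
  for (const value of iterable) {
    if (remaining <= 0) {
      return;
    }
    yield value;
    remaining -= 1;
  }
}

// The infinite Fibonacci generator from above.
function* fib() {
  let a = 1;
  let b = 1;
  while (true) {
    yield a;
    [a, b] = [b, a + b];
  }
}

console.log([...take(7, fib())]); // => [1, 1, 2, 3, 5, 8, 13]
```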

Both of these implementations let you pull as many values from their respective sequences as you desire. Need to print FizzBuzz from 1 to 100? Just iterate 100 times calling next()! Need the first 50 Fibonacci numbers? Just iterate 50 times! Need to start either sequence over? Just call the generator function again! I hope you are already starting to see that generator functions can be extremely powerful and useful in the right situation. But we still have more pieces of the puzzle to discuss, the first being that next behaves in a very odd way with generator functions.

So, say you want to build a generator function that adds two numbers. Before I write this example, let me be clear: this would most likely never be a useful generator function, but it is probably the simplest example that demonstrates this behavior. We just dive right in.

function* add() {
  const a = yield 'give me a';
  const b = yield 'give me b';
  yield a + b;
}

So what the heck? I did not take a and b as parameters; instead I assign them inside the function. And what are they defined as? The result of yielding a string. Well, what could that possibly be? Let us examine what this is doing!

> i = add()
<- add {<suspended>}
> i.next()
<- {value: 'give me a', done: false}
> i.next()
<- {value: 'give me b', done: false}
> i.next()
<- {value: NaN, done: false}

So, it runs fine, but the adding obviously does not work. If you are curious why the final value is NaN, just try undefined + undefined in the console and see what you get. We know it is undefined + undefined and not null + null because null + null === 0 (null coerces to the number 0 in addition). Go figure, right? Okay, so how are we supposed to make this function add two numbers together, you might ask – simple.

> i = add()
<- add {<suspended>}
> i.next()
<- {value: 'give me a', done: false}
> i.next(5)
<- {value: 'give me b', done: false}
> i.next(6)
<- {value: 11, done: false}

Ah hah! So the value passed to next is carried back into the generator and becomes the result of the yield expression. That can be pretty useful. I am going to leave this topic as is for the time being – in the future I will be writing more in depth about redux-saga and building a dumbed-down version of its runtime to aid in understanding it. That post will cover much more detailed usage of this specific aspect of generator functions.
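To see why this matters, here is a toy driver in the spirit of what a saga runtime does: it looks at each yielded value, feeds an answer back in, and remembers the last yielded value. Both run and its answers lookup table are my own invention for illustration – redux-saga exposes nothing of the sort:

```javascript
function* add() {
  const a = yield 'give me a';
  const b = yield 'give me b';
  yield a + b;
}

// A made-up driver: answers each yielded "request" from a lookup
// table and returns the last value the generator yielded.
const run = (gen, answers) => {
  let step = gen.next();
  let last;
  while (!step.done) {
    last = step.value;
    step = gen.next(answers[step.value]);
  }
  return last;
};

console.log(run(add(), { 'give me a': 5, 'give me b': 6 })); // => 11
```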

Only one more thing to talk about! Generator functions are not that scary!

The final puzzle piece is yield*. It is very similar to yield, except that yield* operates on generators/iterables and runs them as part of the calling generator function, so to speak. Here is an example. Say we have a very dumb generator function:

function* dumbGenerator() {
  yield 1;
  yield 2;
  yield 3;
}

That is as simple as it gets. We know by now that our expected results are 1, 2 and then 3 before the iterator is done. Now, we build the next part of that sequence as a new generator function:

function* nextDumbGenerator() {
  // reuse first generator
  yield dumbGenerator();
  yield 4;
  yield 5;
  yield 6;
}

Okay, so what I want is for calling nextDumbGenerator to give us the numbers 1 through 6. Let us see if that is the case.

> i = nextDumbGenerator()
<- nextDumbGenerator {<suspended>}
> i.next()
<- {value: dumbGenerator, done: false}
> i.next()
<- {value: 4, done: false}

Now wait a minute – that is not what we wanted. If we stick with this implementation, the consumer has to test whether it got an iterator/generator back and handle that case specially, alongside the case where the value is a number. That is definitely doable, but far from ideal. If you have followed this far, you probably already know this is where yield* comes in to save the day. Let us just try it. Alter our nextDumbGenerator just a tad:

function* nextDumbGenerator() {
  // reuse first generator
  yield* dumbGenerator();
  yield 4;
  yield 5;
  yield 6;
}

And now, we iterate.

> i = nextDumbGenerator()
<- nextDumbGenerator {<suspended>}
> i.next()
<- {value: 1, done: false}
> i.next()
<- {value: 2, done: false}
> i.next()
<- {value: 3, done: false}
> i.next()
<- {value: 4, done: false}
> i.next()
<- {value: 5, done: false}
> i.next()
<- {value: 6, done: false}

Hey now! That works! I have successfully composed generator functions to build a longer sequence out of two smaller ones. This is super useful – you can factor out common aspects of your use cases, like network requests, and use yield* to replay their yielded results as part of a single generator. You are not leaking the details of how they are implemented; the public generator functions simply yield all their values as expected.
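One last detail worth knowing: yield* is not limited to generators – it delegates to any iterable, arrays and strings included:

```javascript
function* mixed() {
  yield* [1, 2]; // delegates to an array
  yield* "ab";   // delegates to a string, one character at a time
  yield 3;
}

console.log([...mixed()]); // => [1, 2, 'a', 'b', 3]
```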

Well, I am not really sure how to segue to an ending here, but by now I hope my words have helped unmask this piece of modern JavaScript and make it yet another tool you can add to your tool belt as you build your modern JavaScript applications. As I mentioned earlier when talking about handling the result of yield calls, I plan to write a "Learn redux-saga by building a worse version" post explaining how you can take generator functions and make them handle side effects for you in a simple and safe manner. First things first, though: I have to actually write the worse version before I can teach how it works, so it may be some time in the making. I hope you all come back and learn some more about generator functions then too!