Introduction

On Monday, March 24, 2003 in ruby-talk, Gavin Sinclair writes: "I’d like someone, hopefully including Julian and Jim (Weirich), to demonstrate how to iterate over two lists in parallel. Extra marks for setting a realistic example problem and solving it!"

How can I refuse an offer like that!

Here’s what’s ahead … We will start with the Same Fringe problem. Then we will implement the solution three times (once by converting everything to arrays, once with a hand-written external iterator, and once with a generator). We will finish up with a discussion of how the generator implementation works (using continuations) in Ruby.

Sound good?

The Same Fringe Problem

Lets imagine that we have a binary tree filled with sorted data. This kind of tree is a common data structure used to implement hash-like tables where the keys are stored in sorted order.

These binary trees have the property that depth-first visit of the leaf nodes will visit the leaves in sorted order (we are ignoring interior nodes for this example). However, two trees containing the same leaf nodes may be structured differently (due to the order in which nodes are inserted into the tree).

We want to write some code that will determine if two binary trees have the same fringe (i.e. leaf) nodes.

Example

Consider the following trees …

   tree1          tree2       tree3
     o              o           o
    / \            / \         / \
   A   o          o   C       o   C
      / \        / \         / \
     B   C      A   B       A   X

Trees 1 and 2 have the same fringe leaves, although they are internally structured differently. Tree 3 does not have the same fringe as either tree 1 or 2, due to the leaf node X.

The Code

So lets take a look at the code to handle this problem.

Class Node

We will use a Node class to represent the interior nodes of our binary tree (leaf nodes will simply be strings). A node has an accessor for its left and right children.

    class Node
      attr_accessor :left, :right

      def initialize(left, right)
        @left, @right = left, right
      end
    end

Iterating through a binary tree is pretty easy. You just recursively visit the left children and then the right children. Since we are ignoring the interior nodes, we don’t have to worry about pre-order vs post-order.

We use implement the iteration in term of each and include Enumerable, making our trees respond to the standard Ruby internal iterator protocol.

    class Node
      include Enumerable

      def each(&block)
        handle_child(self.left, &block)
        handle_child(self.right, &block)
      end

      def handle_child(value, &block)
        case value
        when Node
          value.left.each(&block)
          value.right.each(&block)
        else
          yield(value)
        end
      end
    end

External Iterators

Internal iterators are difficult to use in solving the same-fringe problem. The reason is that we want to walk both trees, one element at a time, so we can compare the elements. Because internal iterators encapsulate the iteration algorithm, it is difficult to change the algorithm to handle two trees.

Fortunately it is easy to solve the same-fringe problem with external iterators. In the following code, assume an iterator provides a method named get that returns the next available item from the iterator. When the iterator is done, get will return nil.

Same Fringe Function Using External Iterators

Here is the same_fringe function written with external iterators. The function takes two iterators as arguments, one iterator for each of the two trees being compared. Same_fringe will only return true if each element from both trees are identical all the way to the end of the list. Since the iterators just return a sequence of leaf nodes, we are ignoring the tree structure during the comparison.

    def same_fringe(it1, it2)
      while v1 = it1.get
        v2 = it2.get
        return false if v1 != v2
      end
      return v1 == it2.get
    end

Notice that same_fringe doesn’t care what kind of iterator it is given, as long as the iterator conforms to the get specification we gave. We will write several iterators and pass them to same_fringe.

Support Code

Before we write some iterators, here is some support code that we will use in the demonstration.

show will show the contents of a tree using the given iterator. show is a good example of using our external iterator.

    def show(msg, it)
      print msg
      while v = it.get
        print " ", v
      end
      puts
    end

show_sf will run the same_fringe function with the given iterators and show the results.

    def show_sf(msg, expect, it1, it2)
      result = same_fringe(it1, it2)
      puts "Same Fringe, #{msg}, should be #{expect}, was #{result}"
      fail "Unexected Result!" if expect != result
    end

Finally, demonstrate will create some trees and display them. Demonstrate expects a block that will return an iterator of the appropriate type.

    def demonstrate(msg)
      tree1 = Node.new("A", Node.new("B", "C"))
      tree2 = Node.new(Node.new("A", "B"), "C")
      tree3 = Node.new(Node.new("A", "X"), "C")

      puts "Using #{msg} ..."
      show("Tree1 = ", yield(tree1))
      show("Tree2 = ", yield(tree2))
      show("Tree3 = ", yield(tree3))
      show_sf("tree1 vs tree2", true,
        yield(tree1),
        yield(tree2))
      show_sf("tree1 vs tree3", false,
        yield(tree1),
        yield(tree3))
      puts
    end

Array Iterator

Our first external iterator will be one that walks through an array. An array iterator is trivial to implement (as you can see). Also, it is easy to convert a tree to an array since the Node objects support the Enumerable protocol. We just need to call to_a and we have an array of leaf nodes.

Here is the array iterator…

    class ArrayIterator
      def initialize(collection)
        @array = collection.to_a.dup
      end
      def get
        @array.shift
      end
    end

We use our demonstrate function to show that the array iterater does do its job.

    demonstrate("Array Iterator") { |tree| ArrayIterator.new(tree) }

Output for Array Iterator

    Using Array Iterator ...
    Tree1 =  A B C
    Tree2 =  A B C
    Tree3 =  A X C
    Same Fringe, tree1 vs tree2, should be true, was true
    Same Fringe, tree1 vs tree3, should be false, was false

Tree Iterator

Although trivial to write, the array iterator suffers from a potential problem. The tree must be converted to an array before the same_fringe function can even begin to look at leaf nodes. If the tree is large, then this can be problematic.

Can we create an external iterator that examines the tree "in place"? Certainly! Here is the code.

    class TreeIterator
      def initialize(node)
        @tree = node
        @stack = []
      end

      # Get the next leaf node.
      def get
        if @tree
          t = @tree
          @tree = nil
          left_most(t)
        elsif ! @stack.empty?
          node = @stack.pop
          left_most(node.right)
        else
          nil
        end
      end

      # Find the left-most leaf from +node+.
      def left_most(node)
        while Node === node
          @stack.push(node)
          node = node.left
        end
        node
      end
    end

And again we use the demonstrate method to show that the TreeIterator works.

    demonstrate("Tree Iterator") { |tree| TreeIterator.new(tree) }

Output for Tree Iterator

    Using Tree Iterator ...
    Tree1 =  A B C
    Tree2 =  A B C
    Tree3 =  A X C
    Same Fringe, tree1 vs tree2, should be true, was true
    Same Fringe, tree1 vs tree3, should be false, was false

Generators

The tree iterator works great, but it was a big pain to code up. Did you notice the explicit stack that was used to remember the nodes that had not yet been processed? It works, but the logic is much more obscure than the internal iterator logic.

Wouldn‘t it be nice if it were possible to write an external iterator reusing the logic from the internal iterator.

Fortunately, it is possible to do exactly that using generators. A generator is simply a resumable function. After a generator returns a value, the next time it is called it starts executing immediately after the last return, remembering the values of all the local variables in the function.

Python generators use the keyword yield to designate returning a value and the resumption point in a generator. Since Ruby already uses yield for blocks, we will use a generate method to return generated values.

Here is a simple example that generates the squares of the numbers 0 through 3. Notice that the iteration logic goes into a block given to the generator constructor. The generator object returned from the constructor can be used as an iterator, just like our Tree and Array iterators.

    $ irb --simple-prompt
    >> require 'generator'
    => true
    >> it = Generator.new { |g| 4.times { |i| g.generate(i**2) } }
    => #<Generator:0x401f24e8 @resume=#<Continuation:0x401f24ac>>
    >> it.get
    => 0
    >> it.get
    => 1
    >> it.get
    => 4
    >> it.get
    => 9
    >> it.get
    => nil
    >> exit
    $

We will look at the guts of the generator in a moment. In the meantime, lets use a generator to solve the same-fringe problem.

First we load the generator code.

    require 'generator'

Then we define a function that creates our generator for us. Since we already have an internal iterator that works, we will simply use that in the iteration logic of our generator.

    def tree_generator(tree)
      Generator.new do |generator|
        tree.each { |value| generator.generate(value) }
      end
    end

And we show that the generator works.

    demonstrate("Generator") { |tree| tree_generator(tree) }

Output for Tree Iterator

    Using Generator ...
    Tree1 =  A B C
    Tree2 =  A B C
    Tree3 =  A X C
    Same Fringe, tree1 vs tree2, should be true, was true
    Same Fringe, tree1 vs tree3, should be false, was false

Writing the Generator Code

So how are generators written in Ruby? … Very Carefully!

No, seriously! We need to use continuations.

Continuations

The key to writing generators is using continuations. A continuation is a callable object (i.e. you can send a :call message to a continuation) that encodes the return point of a function. Continuations are created by given the callcc (or "Call with Current Continuation") method a block. callcc passes its own continuation (the "current" continuation) to the block as a parameter. When the continuation is called, that particular instance of callcc will return immediately with the value passed in the call argument list.

Here’s a short example …

   x = callcc do |continuation|
     puts "In CALLCC"
     continuation.call("HI")
     puts "This is never printed"
   end
   puts "X = #{x}"

Will print …

   In CALLCC
   X = HI

In this example, callcc looks like a version of setjump/longjump in C where the continuation plays the role of the jump buffer. There is one big difference. In C, you must never call longjump outside the scope of the setjump call. A continuation is allowed to be called anywhere! Even outside the scope of the callcc method.

We can use this fact to create resumable functions. The following function will return once (returning a 1). We then resume the function and it returns a second time, assigning 2 to x.

    def resumable
      callcc do |continuation|
        $resume_point = continuation
        return 1
      end
      return 2
    end

    x = resumable
    puts "Again, X = #{x}"
    $resume_point.call if x != 2

Will print …

    Again, X = 1
    Again, X = 2

Wow! This gives use the tools we need to create resumable functions. The key is to capture a continuation that will allow us to resume a function at the point we need.

The Generator Class

The key to our generator design is tracking the two different continuations we need. The @resume_generator continuation will be called from the main code (via the get function) and will resume into the generator. The @resume_mainline continuation will be used by the generator to return to the mainline code.

Here’s the start of our generator class.

    class Generator

The initialize function is carefully designed to initialize the @resume_generator instance variable. We use callcc to capture the continuation we want, and then return early from initialize. The rest of initialize will be run the first time we resume the generator.

      def initialize
        callcc do |continuation|
          @resume_generator = continuation
          return
        end
        yield(self)
        while true
          generate(nil)
        end
      end

The get method is called from the mainline code to resume the generator (to eventually return the next value). Our tasks are to …

Capture the mainline continuation.

Resume the generator.

    def get
      callcc do |continuation|
        @resume_mainline = continuation
        @resume_generator.call
      end
    end

Finally we have the generate method. It is called from the generator to return values to the mainline code. Our tasks are to …

Capture the continuation for the generator.

Resume the mainline code, while returning the next generated value.

    def generate(value)
      callcc do |continuation|
        @resume_generator = continuation
        @resume_mainline.call(value)
      end
    end
  end

Introduction

The Same Fringe Problem

Example

The Code

Class Node

External Iterators

Same Fringe Function Using External Iterators

Support Code

Array Iterator

Output for Array Iterator

Tree Iterator

Output for Tree Iterator

Generators

Output for Tree Iterator

Writing the Generator Code

Continuations

The Generator Class

Further Reading