|
How can I refuse an offer like that!
Here’s what’s ahead … We will start with the Same Fringe problem. Then we will implement the solution three times (once by converting everything to arrays, once with a hand-written external iterator, and once with a generator). We will finish up with a discussion of how the generator implementation works (using continuations) in Ruby.
Sound good?
Lets imagine that we have a binary tree filled with sorted data. This kind of tree is a common data structure used to implement hash-like tables where the keys are stored in sorted order.
These binary trees have the property that depth-first visit of the leaf nodes will visit the leaves in sorted order (we are ignoring interior nodes for this example). However, two trees containing the same leaf nodes may be structured differently (due to the order in which nodes are inserted into the tree).
We want to write some code that will determine if two binary trees have the same fringe (i.e. leaf) nodes.
Consider the following trees …
tree1 tree2 tree3 o o o / \ / \ / \ A o o C o C / \ / \ / \ B C A B A X
Trees 1 and 2 have the same fringe leaves, although they are internally structured differently. Tree 3 does not have the same fringe as either tree 1 or 2, due to the leaf node X.
So lets take a look at the code to handle this problem.
We will use a Node class to represent the interior nodes of our binary tree (leaf nodes will simply be strings). A node has an accessor for its left and right children.
class Node attr_accessor :left, :right def initialize(left, right) @left, @right = left, right end end
Iterating through a binary tree is pretty easy. You just recursively visit the left children and then the right children. Since we are ignoring the interior nodes, we don’t have to worry about pre-order vs post-order.
We use implement the iteration in term of each and include Enumerable, making our trees respond to the standard Ruby internal iterator protocol.
class Node include Enumerable def each(&block) handle_child(self.left, &block) handle_child(self.right, &block) end
def handle_child(value, &block) case value when Node value.left.each(&block) value.right.each(&block) else yield(value) end end end
Internal iterators are difficult to use in solving the same-fringe problem. The reason is that we want to walk both trees, one element at a time, so we can compare the elements. Because internal iterators encapsulate the iteration algorithm, it is difficult to change the algorithm to handle two trees.
Fortunately it is easy to solve the same-fringe problem with external iterators. In the following code, assume an iterator provides a method named get that returns the next available item from the iterator. When the iterator is done, get will return nil.
Here is the same_fringe function written with external iterators. The function takes two iterators as arguments, one iterator for each of the two trees being compared. Same_fringe will only return true if each element from both trees are identical all the way to the end of the list. Since the iterators just return a sequence of leaf nodes, we are ignoring the tree structure during the comparison.
def same_fringe(it1, it2) while v1 = it1.get v2 = it2.get return false if v1 != v2 end return v1 == it2.get end
Notice that same_fringe doesn’t care what kind of iterator it is given, as long as the iterator conforms to the get specification we gave. We will write several iterators and pass them to same_fringe.
Before we write some iterators, here is some support code that we will use in the demonstration.
show will show the contents of a tree using the given iterator. show is a good example of using our external iterator.
def show(msg, it) print msg while v = it.get print " ", v end puts end
show_sf will run the same_fringe function with the given iterators and show the results.
def show_sf(msg, expect, it1, it2) result = same_fringe(it1, it2) puts "Same Fringe, #{msg}, should be #{expect}, was #{result}" fail "Unexected Result!" if expect != result end
Finally, demonstrate will create some trees and display them. Demonstrate expects a block that will return an iterator of the appropriate type.
def demonstrate(msg) tree1 = Node.new("A", Node.new("B", "C")) tree2 = Node.new(Node.new("A", "B"), "C") tree3 = Node.new(Node.new("A", "X"), "C") puts "Using #{msg} ..." show("Tree1 = ", yield(tree1)) show("Tree2 = ", yield(tree2)) show("Tree3 = ", yield(tree3)) show_sf("tree1 vs tree2", true, yield(tree1), yield(tree2)) show_sf("tree1 vs tree3", false, yield(tree1), yield(tree3)) puts end
Our first external iterator will be one that walks through an array. An array iterator is trivial to implement (as you can see). Also, it is easy to convert a tree to an array since the Node objects support the Enumerable protocol. We just need to call to_a and we have an array of leaf nodes.
Here is the array iterator…
class ArrayIterator def initialize(collection) @array = collection.to_a.dup end def get @array.shift end end
We use our demonstrate function to show that the array iterater does do its job.
demonstrate("Array Iterator") { |tree| ArrayIterator.new(tree) }
Using Array Iterator ... Tree1 = A B C Tree2 = A B C Tree3 = A X C Same Fringe, tree1 vs tree2, should be true, was true Same Fringe, tree1 vs tree3, should be false, was false
Although trivial to write, the array iterator suffers from a potential problem. The tree must be converted to an array before the same_fringe function can even begin to look at leaf nodes. If the tree is large, then this can be problematic.
Can we create an external iterator that examines the tree "in place"? Certainly! Here is the code.
class TreeIterator def initialize(node) @tree = node @stack = [] end # Get the next leaf node. def get if @tree t = @tree @tree = nil left_most(t) elsif ! @stack.empty? node = @stack.pop left_most(node.right) else nil end end
# Find the left-most leaf from +node+. def left_most(node) while Node === node @stack.push(node) node = node.left end node end end
And again we use the demonstrate method to show that the TreeIterator works.
demonstrate("Tree Iterator") { |tree| TreeIterator.new(tree) }
Using Tree Iterator ... Tree1 = A B C Tree2 = A B C Tree3 = A X C Same Fringe, tree1 vs tree2, should be true, was true Same Fringe, tree1 vs tree3, should be false, was false
The tree iterator works great, but it was a big pain to code up. Did you notice the explicit stack that was used to remember the nodes that had not yet been processed? It works, but the logic is much more obscure than the internal iterator logic.
Wouldn‘t it be nice if it were possible to write an external iterator reusing the logic from the internal iterator.
Fortunately, it is possible to do exactly that using generators. A generator is simply a resumable function. After a generator returns a value, the next time it is called it starts executing immediately after the last return, remembering the values of all the local variables in the function.
Python generators use the keyword yield to designate returning a value and the resumption point in a generator. Since Ruby already uses yield for blocks, we will use a generate method to return generated values.
Here is a simple example that generates the squares of the numbers 0 through 3. Notice that the iteration logic goes into a block given to the generator constructor. The generator object returned from the constructor can be used as an iterator, just like our Tree and Array iterators.
$ irb --simple-prompt >> require 'generator' => true >> it = Generator.new { |g| 4.times { |i| g.generate(i**2) } } => #<Generator:0x401f24e8 @resume=#<Continuation:0x401f24ac>> >> it.get => 0 >> it.get => 1 >> it.get => 4 >> it.get => 9 >> it.get => nil >> exit $
We will look at the guts of the generator in a moment. In the meantime, lets use a generator to solve the same-fringe problem.
First we load the generator code.
require 'generator'
Then we define a function that creates our generator for us. Since we already have an internal iterator that works, we will simply use that in the iteration logic of our generator.
def tree_generator(tree) Generator.new do |generator| tree.each { |value| generator.generate(value) } end end
And we show that the generator works.
demonstrate("Generator") { |tree| tree_generator(tree) }
Using Generator ... Tree1 = A B C Tree2 = A B C Tree3 = A X C Same Fringe, tree1 vs tree2, should be true, was true Same Fringe, tree1 vs tree3, should be false, was false
So how are generators written in Ruby? … Very Carefully!
No, seriously! We need to use continuations.
The key to writing generators is using continuations. A continuation is a callable object (i.e. you can send a :call message to a continuation) that encodes the return point of a function. Continuations are created by given the callcc (or "Call with Current Continuation") method a block. callcc passes its own continuation (the "current" continuation) to the block as a parameter. When the continuation is called, that particular instance of callcc will return immediately with the value passed in the call argument list.
Here’s a short example …
x = callcc do |continuation| puts "In CALLCC" continuation.call("HI") puts "This is never printed" end puts "X = #{x}"
Will print …
In CALLCC X = HI
In this example, callcc looks like a version of setjump/longjump in C where the continuation plays the role of the jump buffer. There is one big difference. In C, you must never call longjump outside the scope of the setjump call. A continuation is allowed to be called anywhere! Even outside the scope of the callcc method.
We can use this fact to create resumable functions. The following function will return once (returning a 1). We then resume the function and it returns a second time, assigning 2 to x.
def resumable callcc do |continuation| $resume_point = continuation return 1 end return 2 end x = resumable puts "Again, X = #{x}" $resume_point.call if x != 2
Will print …
Again, X = 1 Again, X = 2
Wow! This gives use the tools we need to create resumable functions. The key is to capture a continuation that will allow us to resume a function at the point we need.
The key to our generator design is tracking the two different continuations we need. The @resume_generator continuation will be called from the main code (via the get function) and will resume into the generator. The @resume_mainline continuation will be used by the generator to return to the mainline code.
Here’s the start of our generator class.
class Generator
The initialize function is carefully designed to initialize the @resume_generator instance variable. We use callcc to capture the continuation we want, and then return early from initialize. The rest of initialize will be run the first time we resume the generator.
def initialize callcc do |continuation| @resume_generator = continuation return end yield(self) while true generate(nil) end end
The get method is called from the mainline code to resume the generator (to eventually return the next value). Our tasks are to …
def get callcc do |continuation| @resume_mainline = continuation @resume_generator.call end end
Finally we have the generate method. It is called from the generator to return values to the mainline code. Our tasks are to …
def generate(value) callcc do |continuation| @resume_generator = continuation @resume_mainline.call(value) end end end
Search for "continuations" on Google for even more references.
Copyright 2003 by Jim Weirich. Some rights reserved (see details)