Normally you want your calculations to go as fast as possible. But not
this time. Here we will look at how to slow them down. Slow them waaay
down.
Time for Some Fun
If you are following the Ruby mailing list, you will see we just made a new
release of RubyGems. For the past week, most of my spare time has gone into
making that happen. Now its time to get back into some other projects that
I’m woefully behind on …
But before we do, it’s time to have a little fun with Ruby.
A Language Based SpreadSheet
Patrick Logan wonders
aloud why spread sheets haven’t made it into programming
languages as first class objects. He then goes on to speculate what it
might look like in Smalltalk. And along the way, he bumps into some
interesting concepts about evaluating expressions.
Translating Patrick’s Smalltalk code into Ruby turns out to be quite
straight forward …
ss = SpreadSheet.new
ss[1,1].value = 5
ss[1,2].value = ss[1,1] * 6
ss[1,3].value = ss[1,2] * 7
puts ss[1,3].value # => 210
ss is our spread sheet object. Indexing it with row and column
indicies yields cells, which can hold formulas. The first cell gets a five.
The second cell (at [1,,2]) gets 6 times that (30), and the final cell gets
7 times that (210).
Straight forward arithmetic, right?
Wrong!
Remember that this is a spread sheet. Changing the values in cells should
effect cells that derive from it. In other words, if we change the value of
ss[1,1] and then ask for the value of ss[1,3], we should get a new value.
ss[1,1].value = 10
puts ss[1,3].value # Should now print 420!
The naive implementation of just storing values in arrays as we calculate
them just isn’t going to hack it.
Ummm … Why Don’t We Use Lambdas?
My first thought was that we would want to just wrap the expressions in a
lambda. Putting them in a lambda has the effect of deferring the evaluation
of the expression until we really want it. And we can reevaluate the
expression as needed as its dependents change.
The result would look like this …
ss = SpreadSheet.new
ss[1,1].value = 5
ss[1,2].value = lambda { ss[1,1] * 6 }
ss[1,3].value = lambda { ss[1,2] * 7 }
puts ss[1,3].value # => 210
In fact, a commenter to Patrick’s blog suggested the same thing.
Patrick responds.
-
- The problem then is the programmer has to know where to put the blocks
to delay computation and where to force the revaluations. Instead of
blocks, though, my intention is to use some new object, a Formula, say, and
to make those objects implicit as much as possible.
Deferring Expressions in Ruby
How would this work out in Ruby? In Ruby, all computation is accomplished
by sending messages. To defer a computation, just capture its message and
replay it later at your convenience. We will start with a formula object
that turns any message sent to it into a deferred object.
class Formula < Builder::BlankSlate
def method_missing(sym, *args, &block)
Deferred.new(self, sym, args, block)
end
end
Since you generally want every message sent to a formula object to
be recorded, I based Formula on the BlankSlate class I’ve
mentioned earlier. Deferred is also fairly straight forward to write. The
initialize method just records the receiver of a message, the message name,
arguments and any blocks.
class Deferred < Formula
def initialize(target, operation, args, block)
@target = target
@operation = operation
@args = args
@block = block
end
# ...
We derive Deferred from Formula because we want operations against a
deferred object to also be deferred, and Formula handles that perfectly.
Now, when we want the value of a deferred operation, we need to ask it
nicely. We will use a method named formula_value for that purpose.
def formula_value
@target.formula_value.send(@operation, *eval_args, &@block)
end
# ...
To replay the deferred operation, we need to get the target (reciever) of
the message. Since the target was a formula, we need to ask for its formula
value. We then send it the original operation (message name) and a list of
arguments. Just one subtle note about the arguments. They too might (or
might not) be formula objects, in which case we need to evaluate them
before passing them to send. eval_args just creates a new
list of formula values from the original argument list.
def eval_args
@args.collect { |a| a.formula_value }
end
end # of class Deferred
Formulas in Action
Suppose I had a Formula object, and tried to add 1 to it. What would I get
…
result = some_formula + 1
result is a new deferred object, capturing the sending of +1 to
the formula. To get the real value, we ask the result for its
formula_value:
result.formula_value
which recursively gets the formula value of some_formula and sends
it a +1 message.
Something is Missing
Great. We see how to get deferred evaluations if we have a formula object,
but where does that original formula object come from?
There are several options. One is to create a Formula wrapper around a
normal Ruby value.
class Const < Formula
attr_reader :formula_value
def initialize(value)
@formula_value = value
end
end
Now we can say:
result = Const.new(5) + 3 # Returns a deferred formula object
and later ask:
result.formula_value # Returns 8
Mirror, Mirror
Another question: I can see how formula+1 works. The + message is sent to a
formula object and gets handled specially. But what if we write:
1 + formula
Won’t 1 get confused because it doesn’t know how to
handle a formula object?
Well, yes and no. The + operation in Integer will get confused, but it has
a backup procedure to follow when it doesn’t know how to add itself
to an arbitrary object: It asks the object how to do it.
Integer + will call coerce on the Formula object and expect
Formula to figure out how to do the addition. Coerce should return a two
element list whose elements can be added together.
Here is coerce for Formula:
class Formula
def coerce(other)
[Const.new(other), self]
end
end
Formula just wraps the original number in a Formula wrapper and then lets
the wrapper handle deferring the addition.
The SpreadSheet Cells
The Cells of the spreadsheet are Formula objects too. The only catch here
is that they can hold different Formula objects over time. value
should calculate the value of a cell (and will in turn call
formula_value to do so). value=(val) should set the
internal formula to whatever the user specifies.
Here is Cell in all its glory.
class Cell < Formula
def initialize
@formula = 0
end
def value
@formula.formula_value
end
def value=(new_value)
@formula = new_value
end
def formula_value
@formula.formula_value
end
end
Making a Cell object a formula is pretty key to the whole deferred
calculation design. If a cell contains only normal Ruby objects, then its
value can be calculated immediately. However, we want to defer calculations
that depend upon other cells, and since adding a cell object to any
expression will automatically defer it, we get the effect we want
automatically.
Hey! Wait a Minute!
The astute observer will note that I initialize @formula in Cell
to 0, but later in the code send a formula_value message to
@formula. Won’t this cause problem, sending formula message
to a non-formula object?
Well, yes it would. But it simplifies the code greatly if we just pretend
all objects are formulas and can respond to formula_value.
Non-formula objects just respond by returning themselves.
This is easy to accomplish.
module Kernel
def formula_value
self
end
end
(And this is why I chose a rather awkward name for fetching a
formula’s value. Since all objects will implement this
method, I wanted it to avoid potential name conflicts).
The SpreadSheet
The final piece of the puzzle is the actual spread sheet object. Its only
real function is to act as a lookup container for the Cell objects. My
implementation uses a cheesy "convert two integers to a string
key" technique. It works ok, but it bothers me.
class SpreadSheet
def initialize
@hash = Hash.new
end
def [](r,c)
@hash["#{r}@#{c}"] ||= Cell.new
end
end
That’s it. The guts of a spread sheet in under 100 lines of code. You
can see the complete
listing on my wiki.
Other possibilities
This technique of deferring calculations has some really interesting
possibilities. Imagine that instead of replaying deferred calculations, we
examined the deferred expression and performed some operation on them. Like
what? Well, maybe optimizing the expression, maybe analyzing the variable
references, maybe generating SQL from them. Remember the Criteria
library … it uses this technique to translate ruby code into SQL
conditions.
Limitations
Be careful with this technique. Although quite powerful, there are a couple
of places that can cause problems. In particular, the && and ||
operators in Ruby are difficult to capture in this way.
comments
|