RubySource: Getting to Grips with Blocks

When making the switch to Ruby from a PHP background, one of the grayest areas for me was blocks. Blocks are everywhere in Ruby, and without a good understanding of them your learning will stall at the “novice” stage. When you first start out with Ruby you will inevitably use blocks exclusively for iteration.

Hello Blocks

3.times {|n| puts "Printed #{n} times" }

The code above must be the “hello world” of blocks, and from that single line we get a lot of what Ruby is all about. We have a number (a Fixnum, or simply an Integer since Ruby 2.4) with a method (times) that is accepting a block to do something useful. In case you missed the important bit: “a method that is accepting a block”. For the moment, just blindly accept that a block is a piece of code we can pass around, that it can only be defined after a method call, that it can accept parameters, and that it has two syntax styles. We have blocks defined in curly braces as above, or defined within do and end. It’s standard practice to use the curly braces if the contents of the block fit on a single line (although Avdi Grimm does make a good argument for choosing between them based on what the block is doing). Regardless, the code above would be just as valid written like so:

3.times do |n|
  puts "Printed #{n} times"
end

In PHP we are not used to blocks at all. In fact, it’s fair to say that for something so trivial it would be far better to just use our trusty for(each) loop. But far too many times I have come across the following.

for($i = 0; $i < count($values); $i++)

Now there is a multitude of problems in this single line of code (it is being used in production near you now). For starters, count() in a for loop condition is practically a mortal sin: it is evaluated every time the loop runs. But even that is not my biggest gripe. Take the variable $i. We know it means index, but we haven’t changed scope, so $i is a valid local in whatever function we are in. Why resort to using these obscure names? I blame C myself; $i is used in almost every textbook example of looping constructs.

Looking past scope, semantics, or even implementing this as a foreach loop, we still have what seems to be a foreign looping construct meddling with our data. I used to write that kind of thing day in, day out. But these days, after spending some real time in Ruby, it just doesn’t feel right to use loop constructs such as for. By placing the business in a block it’s cleaner and we get a closure for free (the block’s local variables only being in scope within the block).

PHP Closures

In PHP 5.3 closures were introduced. It’s made the kind of operation we expect from Ruby’s times possible.

class MyInt {
  protected $value = 0;
  public function __construct($value) {
    $this->value = $value;
  }
  public function times($pBlock) {
    for($i = 0; $i < $this->value; $i++) {
      $pBlock($i);
    }
  }
}
$echo_statement = function($value) {
  echo $value;
};
$val = new MyInt(3);
$val->times($echo_statement);

So we basically created a new class that will act as a Ruby Fixnum and gave it a times function (with the mandatory horrible $i variable) that doesn’t really care what it does; it only knows to do it a set number of times. It’s pretty powerful stuff, but look at the code hoops we have to jump through to get something as concise as a Ruby block.

Spilling the Beans on Methods

If we were to create our own implementation of times in Ruby it may well look something like:

class Fixnum
  def times
    # Yield 0, 1, ... self - 1, matching the built-in times
    for i in (0...self)
      yield i
    end
  end
end
5.times { |n| puts "Im Hi #{n}" }

Above we re-opened the Fixnum class and decided we don’t like how the times method is implemented. (On Ruby 2.4 and later Fixnum has been folded into Integer, so that is the class you would re-open.) So we wrote a for loop that iterates over a range from 0 up to, but not including, the number itself. The most interesting line in the above is yield i. The yield keyword is where normal method execution pauses and the block passed to the method temporarily takes control.

Hold on a minute. Just where did we pass said block? Well, every method accepts a block (whether you have defined it yourself or not). So we don’t even have to declare the block as a parameter for our times method. It knows a block may be there, and we choose to let it run using yield.
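To make that hand-off concrete, here is a minimal sketch that avoids reopening a core class. The method names repeat_times and ignores_block are made up for illustration and are not part of Ruby; the point is only that neither method declares a block parameter, yet both can be called with a block.

```ruby
# Hypothetical helper (not part of Ruby): yields 0 up to count - 1.
# No block parameter is declared; yield finds the caller's block implicitly.
def repeat_times(count)
  i = 0
  while i < count
    yield i   # control jumps into the caller's block, then comes back here
    i += 1
  end
  count
end

# Hypothetical method that never calls yield.
# Passing it a block is still legal; the block is simply never run.
def ignores_block
  :done
end

seen = []
returned = repeat_times(3) { |n| seen << n }
puts seen.inspect      # prints [0, 1, 2]
puts returned          # prints 3

puts ignores_block { raise "this block is never executed" }   # prints done
```

Whether the block runs at all is entirely up to the method body, which is why ignores_block can safely be handed a block that would raise.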

So what happens if we forget to pass a block? Well, the answer would be a LocalJumpError: no block given. However, we can give this times method a bit more flexibility using block_given?.

class Fixnum
  def times
    if block_given?
      # Yield 0, 1, ... self - 1, matching the built-in times
      for i in (0...self)
        yield i
      end
    else
      # No block: hand back an enumerator that will call times later
      to_enum(:times)
    end
  end
end
5.times { |n| puts "Im Hi #{n}" }
Im Hi 0
Im Hi 1
Im Hi 2
Im Hi 3
Im Hi 4
=> 0...5
5.times
=> #<Enumerable::Enumerator:0xb737acb4>

We added in a bit of a curve ball here by returning a new Enumerable::Enumerator object if no block was given (from Ruby 1.9 onwards the class is simply called Enumerator). This was mainly to emulate the real implementation of times, which returns an Enumerator object when called without a block, letting us do the likes of:

enumerator = 5.times
=> #<Enumerable::Enumerator:0xb737acb4>
enumerator.max
=> 4
enumerator.min
=> 0
enumerator.minmax
=> [0, 4]
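The same two behaviours can be sketched without touching a core class. The Stepper class and its method names below are hypothetical, chosen only for this example, and the sketch assumes Ruby 1.9 or later, where the enumerator class is called Enumerator. It shows one method that guards with block_given? and one that does not.

```ruby
# Hypothetical class used only to illustrate block_given? and LocalJumpError.
class Stepper
  def initialize(limit)
    @limit = limit
  end

  # With a block: yields 0 up to limit - 1.
  # Without a block: returns an Enumerator that will call each_step later.
  def each_step
    return to_enum(:each_step) unless block_given?

    (0...@limit).each { |i| yield i }
    @limit
  end

  # Yields without checking for a block first.
  def careless
    yield
  end
end

stepper = Stepper.new(5)
steps = stepper.each_step        # no block, so we get an Enumerator back
puts steps.max                   # prints 4
puts steps.minmax.inspect        # prints [0, 4]

begin
  stepper.careless               # no block given, so yield has nowhere to go
rescue LocalJumpError => e
  puts "LocalJumpError: #{e.message}"
end
```

Returning an enumerator from the guarded method means callers get the whole Enumerable toolbox (max, min, minmax and friends) without the method having to implement any of it.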

Block Scope

Blocks are closures: they take a chunk of code and its local environment (variables etc.) and store it for execution later. So it’s best we talk about what effect blocks have on the surrounding environment. In agreement with what we naturally expect, variables defined within a block are thrown away after the block is executed. But there are some unexpected behaviors, depending on which version of Ruby we are using, when defining blocks:

n = 10
y = 0
[1,2,3].each do |n|
  x = n
  y = y + n
end
y.inspect
# => "6"
n.inspect
# => "3" -- In 1.9.X this will be "10"
defined? x
# => nil

As expected, the variable x has lost scope outwith the block. However, we can see that n, which was defined before the block invocation, has been modified by the block’s actions. I don’t know about you, but I feel this behavior is unnatural. I have heard of cases where this is actually a “feature”, but I have yet to come across those situations. Finally y, which was defined before the block and modified within it, keeps its modification outside the block.

I mention this behaviour with block parameters (n being modified) for those of you playing with Ruby 1.8.x. In Ruby 1.9.x the parameter n is left unchanged by the block. I prefer this behaviour and suggest you also work with the latest and greatest version of Ruby.
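For anyone on Ruby 1.9 or later, the rules can be checked with a small sketch like the one below. Wrapping the experiment in a method called scope_demo is my own choice for illustration; it just keeps the variables contained and hands the three results back in one array.

```ruby
# Hypothetical wrapper method; assumes Ruby 1.9 or later.
def scope_demo
  n = 10   # outer variable that shares its name with the block parameter
  y = 0    # outer variable that the block modifies

  [1, 2, 3].each do |n|
    x = n    # x exists only inside the block
    y += n   # y comes from the enclosing scope, so the change persists
  end

  # n is untouched, y has accumulated 1 + 2 + 3, and x is unknown out here
  [n, y, defined?(x)]
end

p scope_demo   # prints [10, 6, nil]
```

The block parameter n shadows the outer n instead of overwriting it, which is the 1.9 behaviour described above.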

Wrapping Up

A block is a piece of code that is put away for execution later (go on, just say “closure”)
All methods in Ruby accept a block by default and we can choose to call the block using yield
Blocks run in their own enclosed environment
Variables local to a block are thrown away after execution
Variables defined before the block definition are wrapped within the block’s scope

It’s safe to say we have made a good dent in the block learning path. Next time we will look at the magic of lambda and Proc, and of course explain the difference, before looking at some practical uses for blocks.