danlucraft /blog

Using Meta-Programming for Performance in Ruby

November 2007 • Daniel Lucraft

Normally we use meta-programming in Ruby for our own convenience as developers, and we swallow the speed hit it gives us as a reasonable trade-off. The way Rak is implemented turns this on its head.

Rak compiles its line matching code on the fly, to speed up searching. Its not as horrendous as it sounds. Here’s a very simplified version of the line matching code without the optimization:

def line_match(filename, regex, line)
  if options[:invert_match]
    unless line =~ regex
      if options[:print_filename]
        puts filename
      end
      puts line
    end
  else
    if line =~ regex
      if options[:print_filename]
        puts filename
      end
      puts line
    end
  end
end

We notice that the options hash never changes once the searching has begun. But we are doing an awful lot of work checking the values in the hash for every single line we are matching against. So we replace it with this:

def compile_line_match(filename, regex, line)
  code = []
  code << %{def line_match(filename, regex, line) }
              if options[:invert_match]
  code << %{    unless line =~ regex              }
                  if options[:print_filename]
  code << %{        puts filename                 }
                  end     
  code << %{      puts line                       }
  code << %{    end                               }
              else 
  code << %{    if line =~ regex                  }
                  if options[:print_filename]      
  code << %{        puts filename                 }
                  end        
  code << %{      puts line                       }
  code << %{    end                               }
              end     
  code << %{end                                   }
  module_eval code.join("\\n")
end  

I’ve preserved the indenting so you can see the logic of the method. Here all the option tests have been factored out and will only be done once, at startup. Assuming we are not inverting the match and are printing filenames, the final compiled method looks like this:

def line_match(filename, regex, line)
  if line =~ regex
    puts filename
    puts line
  end
end

Much smaller. This method is 140 lines long in the full version, so its a much more impressive win there. All told this sped up Rak by about two times when I tested it searching on my Phd repository (3500 files).

There’s more work to do optimizing Rak. It’s still not as fast as Ack, so you might want to use that unless you are a Ruby fanboy like me :). But I have some more ideas….

Update 20/04/08: thankyou to redditer tomel for pointing out a WTF in the above code :).

blog comments powered by Disqus