Using Meta-Programming for Performance in Ruby

16 November 2007 ruby and rak x comments

Normally we use meta-programming in Ruby for our own convenience as developers, and we swallow the speed hit it gives us as a reasonable trade-off. The way Rak is implemented turns this on its head.

Rak compiles its line matching code on the fly, to speed up searching. Its not as horrendous as it sounds. Here’s a very simplified version of the line matching code without the optimization:

  def line_match(filename, regex, line)
    if options[:invert_match]
      unless line =~ regex
        if options[:print_filename]
          puts filename
        end
        puts line
      end
    else
      if line =~ regex
        if options[:print_filename]
          puts filename
        end
        puts line
      end
    end
  end

We notice that the options hash never changes once the searching has begun. But we are doing an awful lot of work checking the values in the hash for every single line we are matching against. So we replace it with this:

  def compile_line_match(filename, regex, line)
    code = []
  
    code << %{def line_match(filename, regex, line) }
                if options[:invert_match]
    code << %{    unless line =~ regex              }
                    if options[:print_filename]
    code << %{        puts filename                 }
                    end     
    code << %{      puts line                        }
    code << %{    end                               }
                else 
    code << %{    if line =~ regex                   }
                    if options[:print_filename]      
    code << %{        puts filename                 }
                    end        
    code << %{      puts line                       }
    code << %{    end                          }
                end     
    code << %{end                              }
    module_eval code.join("\\n")
  end  

I’ve preserved the indenting so you can see the logic of the method. Here all the option tests have been factored out and will only be done once, at startup. Assuming we are not inverting the match and are printing filenames, the final compiled method looks like this:

  def line_match(filename, regex, line)
    if line =~ regex
      puts filename
      puts line
    end
  end

Much smaller. This method is 140 lines long in the full version, so its a much more impressive win there. All told this sped up Rak by about two times when I tested it searching on my Phd repository (3500 files).

There’s more work to do optimizing Rak. It’s still not as fast as Ack, so you might want to use that unless you are a Ruby fanboy like me :). But I have some more ideas….

Update 20/04/08: thankyou to redditer tomel for pointing out a WTF in the above code :).

blog comments powered by Disqus
.