I have a slow Ruby script that calculates the Page Rank of a large set of blogs. Let’s try rewriting it in D!
D is a language designed to be the successor to C++. It shares C-like syntax and compiles to native binaries, but adds niceties such as dynamic arrays, a ‘foreach’ statement and garbage collection.
The script must load 4,121 blogs from the (test) MySQL database, along with 41,405 links. Then there is some preprocessing to ensure we know each blog’s incoming and outgoing links, and then we spend roughly 40 iterations recalculating the Page Ranks until they settle down. Afterwards the results must all be written back out to the database.
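For anyone who wants to see the shape of that loop, here is a minimal Ruby sketch of the preprocessing and iteration steps. It is not the actual script: the method name `page_rank`, the damping factor of 0.85, the convergence tolerance and the treatment of blogs with no outgoing links are all assumptions made for illustration, and the database loading and saving are left out entirely.

```ruby
# Hypothetical sketch, not the original script. The damping factor, the
# tolerance and the dangling-blog handling are assumptions.
DAMPING = 0.85

# blog_ids: array of blog ids.
# links:    array of [from_id, to_id] pairs; every id is assumed to be in blog_ids.
def page_rank(blog_ids, links, damping: DAMPING, max_iterations: 40, tolerance: 1e-6)
  # Preprocessing: record each blog's incoming links and count its outgoing links.
  incoming  = Hash.new { |hash, key| hash[key] = [] }
  out_count = Hash.new(0)
  links.uniq.each do |from, to|
    next if from == to # ignore self-links
    incoming[to] << from
    out_count[from] += 1
  end

  n = blog_ids.size.to_f
  ranks = blog_ids.to_h { |id| [id, 1.0 / n] } # start with equal ranks

  max_iterations.times do
    # Rank held by blogs with no outgoing links is shared evenly among all blogs.
    dangling = blog_ids.sum(0.0) { |id| out_count[id].zero? ? ranks[id] : 0.0 }

    new_ranks = blog_ids.to_h do |id|
      inbound = incoming[id].sum(0.0) { |src| ranks[src] / out_count[src] }
      [id, (1.0 - damping) / n + damping * (inbound + dangling / n)]
    end

    # Stop early once the ranks have settled down.
    change = blog_ids.sum(0.0) { |id| (new_ranks[id] - ranks[id]).abs }
    ranks = new_ranks
    break if change < tolerance
  end

  ranks
end
```

Calling `page_rank([1, 2, 3], [[1, 2], [3, 2]])` returns a hash of ranks that sum to 1, with blog 2 ranked highest because both other blogs link to it.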
This was my first ever use of D, even for learning, but it still only took about an hour to get working. I haven’t used a language with static typing for about 10 years, so that was a bit of a shock. But it’s all coming back to me now…
Here are the relevant statistics:
| |lines of code|times (seconds)|average time (seconds)|
|---|---|---|---|
|Ruby script|95|23.8, 23.9, 23.7|23.8|
|D program|128|1.7, 2.8, 1.4|2.0|
That speed increase is excellent, and should make a real difference when run on the master database (which already has more than 10,000 blogs). Welcome as it is, though, the speedup is only what I was hoping for. What’s really impressive is the line and word count comparison. I had expected the D program to be significantly more verbose (remember: D is quite like C), but there are only about 35% more lines (128 versus 95) and (even better) only 14% more words. That’s a great combination of speed and expressiveness.
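To put a number on the speed increase, the averages and the speedup can be recomputed from the three timed runs of each program. This is only a quick check on the figures in the table; the variable and method names are made up for the example. It works out to roughly a twelvefold speedup.

```ruby
# Timings copied from the results table, in seconds.
ruby_times = [23.8, 23.9, 23.7]
d_times    = [1.7, 2.8, 1.4]

# Arithmetic mean of a list of timings.
def average(times)
  times.sum / times.size
end

ruby_avg = average(ruby_times)
d_avg    = average(d_times)
speedup  = ruby_avg / d_avg

puts format("Ruby: %.1f s, D: %.1f s, speedup: about %.0fx", ruby_avg, d_avg, speedup)
```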
Please note that neither the Ruby nor the D program is particularly well optimized. I’m sure I could improve the Ruby execution time with some work, and I could definitely improve the D time. Since it was my first D program I’m absolutely positive that I’ve done some things completely arse backwards.
Next, I must not let myself get tempted into rewriting all of Says Who? in D. That would be stupid. Although think how wonderful and fast it would be …… No! Snap out of it! Anyway, one thing that D lacks right now is a wide range of libraries, so rewriting all of SW? wouldn’t even be possible. For instance, I can’t find any HTML or RSS parsers, which are kind of essential to absolutely everything SW? does.
But I am going to try to use D in some targeted locations. Page Rank and similar types of analysis seem like a good bet. SW? will be doing a lot more of this in the near future, so having a performant analyser is going to be verrry useful.