Wednesday, April 29, 2009

Rails, Scalability and Sexism

It's been a long time since I posted here, for a number of reasons:

  1. Change of focus from Ruby programming to Linux capacity planning, especially in the realm of I/O subsystems. You'll find more about that at http://linuxcapacityplanning.com.
  2. Research in social media, starting with the Portland, Oregon Twitter / tech / eMarketing community and branching out into the R programming language, data visualization and some cutting-edge text mining algorithms. You'll find more about that at http://borasky-research.net and http://groups.google.com/group/pdx-visualization.
  3. Becoming a member of the recession. You can find out about that at http://www.linkedin.com/in/edborasky.
  4. Volunteering at Open Source Bridge. You can find out about that at http://opensourcebridge.org/

But Ruby and Rails didn't fall totally off my radar screen after FOSCON 2008. I still write Ruby scripts, most recently to collect Twitter data. I go to the Portland Ruby Brigade meetings and probably know more Ruby and Rails developers in Portland than I do Linux developers. Two events occurred recently in the Ruby / Rails blogosphere that captured enough of my attention despite all of that to warrant a blog post.

So ... scalability. A few weeks ago, Alex Payne, one of the engineers at Twitter, posted a blog entry on why Twitter was moving some of its functionality from Ruby / Rails to Scala. It can be found here, and some of his previous posts on related subjects can be found here and here. Essentially, the reasons for the move fall into two broad categories: scalability and maintainability of the large and growing code base that powers Twitter.

I can't contribute to the maintainability debate in any meaningful way. The last time I was involved in maintenance of a large code base was when I was a FORTRAN programmer working on large scientific application codes. As an aside, though, you'd be surprised at how much of the source in many of those codes goes to implementing domain-specific languages. But scalability is something I know a lot about.

So let's talk first about language interpreter performance. Thanks to some really spectacular programming by Isaac Gouy of the Alioth Shootout, here's a boxplot of benchmark times for GCC, Java, Scala, Ruby 1.9, jRuby and Ruby 1.8.6 relative to the fastest times. The plot is log scaled and lower is better / faster.


Yes, Scala is a lot faster than Ruby, even jRuby, which runs on the same Java Virtual Machine as Scala does. We are talking an order of magnitude -- two hemibels.
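For anyone who wants the arithmetic behind "relative to the fastest" and "two hemibels", here is a minimal Ruby sketch. The timings are made-up numbers chosen for illustration, not the Shootout's data, and the `hemibels` helper is my own name for the conversion. A hemibel is half a bel, a factor of the square root of 10 (about 3.16), so two hemibels is a factor of 10.

```ruby
# Hypothetical timings in seconds; illustrative only, not real benchmark data.
timings = {
  "gcc"   => 1.0,
  "scala" => 2.0,
  "ruby"  => 20.0
}

fastest = timings.values.min

# Ratio of each time to the fastest one: 1.0 means "as fast as the best".
relative = timings.transform_values { |seconds| seconds / fastest }

# Convert a ratio to hemibels: two hemibels per factor of 10.
def hemibels(ratio)
  2 * Math.log10(ratio)
end

# Print each entry's ratio and its size on the log scale.
relative.each do |name, ratio|
  puts format("%-6s %6.1fx  %4.1f hemibels", name, ratio, hemibels(ratio))
end

# Gap between two entries, expressed on the same log scale.
gap = hemibels(relative["ruby"] / relative["scala"])
```

With these invented numbers the Ruby entry is ten times slower than the Scala one, which is the "order of magnitude" gap described above.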

Now let's talk about overall Rails performance and scalability. I've only done some basic profiling, which you can see described here. The real heavy lifting and expertise on scaling Rails applications can be found here. The net of it is that Rails is heavily constrained by the underlying Ruby interpreter speed.

Yes, you can tune a Rails application for responsiveness or throughput, and you can throw processors and memory at a tuned Rails application to meet service level objectives. But until the bottlenecks are alleviated in the Ruby interpreters, applications written in languages like Scala, Java and C are going to have a significant performance advantage.

Finally, let's talk about the Ruby / Rails community response to Alex Payne. Alex and the other Twitter engineers did their homework. They measured performance. They tested. And they implemented what worked best for them.

But some Ruby / Rails bloggers chose to descend to name-calling and attacks on what was a business and engineering decision based on that testing. From my point of view, such behavior is unacceptable. Tony Arcieri and Obie Fernandez have made major contributions to Ruby and Rails, and I expected better of them.

Moving on to sexism, there are plenty of resources on the web and in social media about this issue. I'll spare you the links. If you're here, you've probably seen them. A lot has been written since the presentation at the Golden Gate Ruby Conference that triggered the current round of discussion.

I emphasize "current" because this isn't the first time that the issue of sexism has come up in discussions about Rails. Again, a web search will find plenty of discussion from years gone by. The issue first entered my consciousness at the 2006 Ruby Conference in Denver, where I noticed that out of a couple hundred Ruby / Rails programmers in attendance, fewer than half a dozen were women.

So let me add my voice to the chorus of condemnation that is being directed at Matt Aimonetti. The behavior he displayed:

  • Developing a sexist presentation for the Golden Gate Ruby Conference,
  • Actually presenting it, and
  • Defending his behavior and attacking critics of his behavior in public and in private
is unacceptable.

Sunday, July 20, 2008

FOSCON 2008!!

What: FOSCON is a free, fun gathering of Ruby fans held during an evening of O'Reilly's OSCON conference with cool presentations, food, discussions, and a live coding competition.
Who: Anyone interested in Ruby, whether you're just curious or a seasoned pro.
Where: CubeSpace, Portland, Oregon near the Oregon Convention Center (directions).
When: Wednesday, July 23, 2008 from 6pm-9pm.
Why: The Portland Ruby Brigade user group wants to share the joy of Ruby with you.


I'll be there, talking about my second-favorite programming language *and* my favorite operating system! But don't just come to hear me talk (or sing, if they let me). We have:

  • IronRuby: John Lam
  • Selectricity and RubyVote: Benjamin Mako Hill
  • Ruby performance: Brian Ford
  • Musical glasses: Gregory Borenstein
  • Ruby++!…?: Markus Roberts on defining custom operators
  • Five minutes with Selenium: Ian Dees, author of "Scripted GUI Testing with Ruby"
  • Ruby culture: Audrey Eschright, Reid Beels and Igal Koshevoy of Calagator
  • Ruby server automation: Igal Koshevoy on AutomateIt
  • Ruby on Rails profiling: M. Edward Borasky on applying Linux OProfile
  • Ruby web development: John Labovitz on Gossamer, a microframework to spin websites out of distributed, lightweight, ephemeral resources
But wait! There's more! (I know, I said you would never hear me say that, but hey, this is typing.) :-)

  • Live coding competition: Ruby on Rails, PHP Symfony, PHP Drupal, and GemStone/S Smalltalk Seaside
That's right -- you'll actually get to watch web applications built while you wait!

Did I mention that it was free?
--
"A mathematician is a machine for turning coffee into theorems." --
Alfréd Rényi via Paul Erdős

Friday, July 11, 2008

FOSCON 2**2 Is Afoot!

Yes, there will be a FOSCON 4! Stay tuned at http://pdxfoscon.org/start. I'll introduce you to my new best friend. :)

Monday, July 7, 2008

oprofile is my new best friend

For those of you who haven't been following the Ruby mailing lists, I've spent a fair amount of time over the long weekend getting up to speed on "oprofile" as a tool to dig into Ruby performance. There haven't been any great surprises beyond what I learned last year with "gcov" and "gprof" (https://rubyforge.org/docman/view.php/977/2705/Slides.pdf), but what is new is how much you can learn with oprofile.

All of you on Windows, MacOS and Solaris can stop reading now. oprofile is only available on Linux. The clever thing about oprofile is that you can profile your entire workload. Ruby, Rails, Apache, the database and the kernel -- oprofile tracks them all. And you don't have to recompile anything with special flags unless you want to get down to the exact line of code. You can even profile optimized code!

There's one more thing oprofile gives you -- down-to-the-hardware analysis. If you want it, you can get an annotated assembly language listing and learn where your code is generating cache misses, page faults, pipeline stalls, branch prediction failures -- anything that can bottleneck your code.

I'm planning to post a more detailed how-to for using oprofile to find Ruby/Rails application bottlenecks, but for now, there are some sample profiles I made over the weekend for MRI, KRI and Rubinius on RubyForge. They're all in http://cougar.rubyforge.org/svn/trunk/PTR2/. From there, look in "*_benchmark_oprof_reports". The benchmark in question is the Ruby Benchmark Suite, started by Antonio Cangiano and maintained on GitHub at http://github.com/acangiano/ruby-benchmark-suite/tree/master.

Stay tuned! To paraphrase Mark Twain, "Everybody talks about Ruby performance, but we're doing something about it!"

Thursday, July 3, 2008

Chaos, Part 2

  1. Igal Koshevoy is probably the only Rubyist who doesn't have a blog. :-)
  2. It looks like there is going to be a "continuous integration" site set up to watch the Ruby source, compile it, run test suites and post results, etc. Can you say, "We eat our own dog food?"

YES!! Watch this space!!

Sunday, June 22, 2008

Ruby, Rails and Life on the Edge of Chaos

There's a concept in systems theory called, in the popular vernacular, "life on the edge of chaos". I'm sure a Google search for that phrase will turn up plenty of speculation and perhaps even some valid math. :-)

Ruby seems to me to be very close to life on the edge of chaos at the moment. It started a few weeks ago when some of the implementations other than the mainstream Ruby (aka MRI) started running Rails. If my count is correct, in addition to MRI, the John Lam/Microsoft IronRuby and Rubinius now run Rails, and Sun's jRuby has been running Rails for quite a while. I don't remember if Ruby 1.9 runs Rails or not -- perhaps someone can fill me in on that in a comment.

Another indication of life on the edge of chaos is the fact that the main Ruby implementation teams have started having regular IRC meetings to make sure there is some kind of communication about the syntax and semantics of the language, test suites, benchmark suites, and other trappings. There is a Ruby benchmark suite being built (http://antoniocangiano.com/2008/06/01/help-me-create-the-ruby-benchmark-suite/). I've contributed my MatrixBenchmark to it, and will probably be doing some of the statistical work as well.

Then there's Ruby 1.8.7. I haven't had a chance to experiment with it yet, but my understanding is that some of the syntax and semantics of 1.9 have been backported to the 1.8 MRI implementation. I think there are also some performance patches.

Then there's Maglev (http://headius.blogspot.com/2008/06/maglev.html). Then there's the recent discovery of a security issue that affects all known MRI and KRI versions (http://www.rubyinside.com/june-2008-ruby-security-vulnerabilities-927.html) and the subsequent discovery that the patched versions segfault under some circumstances.

Finally, there is the continued brouhaha about whether Twitter scales or not. And if it doesn't, is it Rails' fault? I don't know enough about either Twitter's architecture or Rails to be able to join the debate, and given the fact that I'm a performance engineer for a living, it's unlikely anybody will tell me enough to even make an informed guess without an NDA. :-)

So ... if this is indeed, as I think it is, "life on the edge of chaos", what does that mean? Well, for openers, let's look at the "chaos" part. A lot of people are doing a lot of work that in the end will be discarded. That's how this chaos thing works. I was very glad to hear that both IronRuby and Rubinius now run Rails, because if they couldn't do that, there was no hope that they'd survive. None. Zero. Zip. Nada. Now at least there's a chance that the work of those two teams won't be discarded.

And if the claims of the Maglev team are true, I think the Ruby 1.9 and Ruby 2.0 efforts are in great peril. Dave Thomas is still hacking away at Pickaxe 3, and I think there are probably a couple more Ruby 1.9 books in the works, but I just don't see the "industrial strength" support building for that branch of the Ruby evolutionary tree.

So what about the "life" part? Well, I think you will end up seeing Rails becoming a language itself, with its own "virtual machine", independent of Ruby. Sure, it has a Ruby core, but it doesn't use all of Ruby. There's no reason on Earth why someone couldn't take the internal domain-specific language(s) in Rails and turn them into external domain-specific languages -- build compilers and run-times to run those languages an order of magnitude faster by eliminating the MRI/jRuby/Rubinius/IronRuby layer. The only question in my mind is whether such a project will be open-source or not.
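To make the internal / external distinction concrete, here is a toy sketch in Ruby. None of it is Rails code: `Routes`, `draw_routes` and `parse_routes` are hypothetical names invented for illustration. The first half is an internal DSL, which is just Ruby methods and blocks run by the Ruby interpreter. The second half expresses the same information as plain text handled by a dedicated parser, which is where a separate compiler and run-time could take over.

```ruby
# Internal DSL: ordinary Ruby methods and blocks, evaluated by the interpreter.
class Routes
  attr_reader :table

  def initialize
    @table = {}
  end

  # Record that a GET request for +path+ goes to the handler named by +to+.
  def get(path, to:)
    @table[path] = to
  end
end

# Run the block with a fresh Routes object as self and return its table.
def draw_routes(&block)
  routes = Routes.new
  routes.instance_eval(&block)
  routes.table
end

internal = draw_routes do
  get "/posts", to: "posts#index"
end

# External DSL: the same information as plain text, read by a small parser
# that never hands the text to the Ruby interpreter as code.
def parse_routes(text)
  text.each_line.with_object({}) do |line, table|
    verb, path, target = line.split
    table[path] = target if verb == "get"
  end
end

external = parse_routes("get /posts posts#index\n")
```

Both halves end up with the same routing table. Only the external form is free of the host language, so in principle it could be compiled by something that isn't a Ruby interpreter at all.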

Saturday, December 1, 2007

Give 1 Get 1


One Laptop Per Child: Give 1 Get 1