We take performance very seriously, but we can't locate and fix every perf regression on our own. If you can give us metrics and repro code, we can work to fix the issues.
Thanks Aaron. I imagine I'll probably spend Thanksgiving weekend digging into what's newly slow in our 3.2.15 app.
The trick with pinpointing a cause is that when I look at the data NR provides, every partial rendered and every action called is slower than it had been previously. Combined with the immensity of our codebase, it's going to be time-consuming to put a finger on a single repro-able cause. But we'll do our best to isolate and report back.
FWIW, have you guys ever considered creating a standard performance test app? Google results around Rails performance changes over time are sparse, so it would be useful to compare Rails versions on an apples-to-apples basis by watching how the test app's performance changed from version to version. If we're lucky, it might even catch some of the performance regressions that have previously slipped into dot releases. Even if not, it would at least demonstrate that performance is an important consideration in Rails' evolution. I'd sleep better knowing that.
Maybe take a popular mid-sized framework like Spree and see how it performs on 3.0, 3.1, 3.2, etc.?
If you send me a flamegraph using ?pp=flamegraph_embed I can help you figure out why it got slow.
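(For anyone who hasn't seen that parameter: it comes from rack-mini-profiler. A minimal setup sketch, assuming a standard Rails app and that the gem names haven't changed:)

    # Gemfile - rack-mini-profiler provides the ?pp=... query hooks, and the
    # flamegraph gem supplies the sampling renderer it uses for flame graphs
    gem 'rack-mini-profiler'
    gem 'flamegraph'

    # then append ?pp=flamegraph (or ?pp=flamegraph_embed) to any URL and a
    # flame graph of that request replaces the normal response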
I am actively working on getting a long-term perf benchmark using the Discourse bench. A server is already provisioned by Ninefold. See my talk (towards the end): http://www.youtube.com/watch?v=LWyEWUD-ztQ
We have a setup that is fairly similar to yours (large Rails 3.2.15 app, Passenger, New Relic) and we recently experienced the same as you did - just in the opposite direction!
What happened is that average response time fell from 415ms to 255ms, and across the board everything was faster. Last Friday I upgraded Ruby on our servers from 1.9.3-p327 to 1.9.3-p448, and that triggered the speedup.
Ruby was installed using rbenv/ruby-build and my guess is that the new version was compiled using compiler optimization flags whereas the old one for some reason was not.
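For anyone who wants to check this on their own boxes, the flags are recorded in RbConfig (standard library, so this should work on any MRI):

    # prints the CFLAGS the running Ruby was compiled with, e.g. "-O3 ..."
    require 'rbconfig'
    puts RbConfig::CONFIG['CFLAGS']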
I'm not suggesting that you upgrade to 2.0 or anything drastic, but does this mean you don't install the security updates? Patchlevel 448 included important OpenSSL fixes, and these minor upgrades bring no functional changes or deprecations.
2x is a pretty serious regression. I'm really not sure how that could be overlooked. Doesn't Rails run performance tests as part of its regression suite?
Ruby has some great tools to isolate and fix performance issues. In particular, ruby-prof and kcachegrind (or qcachegrind) are tremendously valuable resources for tracking down and eliminating performance bottlenecks.
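A minimal sketch of that workflow, assuming ruby-prof's classic API (the print-to-an-IO form; newer releases may differ) and a placeholder workload:

    require 'ruby-prof'

    # profile the code you suspect is slow
    result = RubyProf.profile do
      1_000.times { [*1..100].map(&:to_s).join(',') }  # placeholder workload
    end

    # emit callgrind-format output that kcachegrind/qcachegrind can open
    File.open('callgrind.out.app', 'w') do |f|
      RubyProf::CallTreePrinter.new(result).print(f)
    end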
Thanks cheald. Will take a look at those. My problem with using the performance tracking tools in the past has always been that I get output like "100k string allocations," "3500 ActiveRecord objects instantiated," "6000 ActiveSupport methods executed," etc. With so much complexity in Rails, it can be maddeningly difficult to pick a culprit from amongst the 1000 papercuts that tend to slow an app down.
Even if we just arbitrarily said "let's focus on whichever method is cumulatively taking the most time," we have no reference point for whether Rails could make that method faster or not. The time taken by a method or instantiation is just some number, and whether that number is "good enough" is ultimately a judgement call, and I have little basis for comparison.
I hope I'm wrong and there are some big obvious things to optimize, but when I've gone down these rabbit holes in the past, my experience has been that there's usually so much data and so many things contributing to the slowdown that it's hard to find high-impact fixes. Especially when it comes to a framework as multilayered and complex as Rails.
If you want to email me a gzipped callgrind or two, I'd be happy to take a look and see if I can make any suggestions. You can reach me at cheald @ gmail.
I've upgraded several Rails apps from ~>2.3.0 through to 4.0. I saw performance regressions around Rails 3, which subsequently disappeared when upgrading to 3.2 and 4.0 - the latter being much faster in general.
I have to wonder what you're doing that's giving you an average of 480ms? Have you stripped out middleware you're not using? Have you looked into exactly where the time is going? There are so many places to start, it's hard to offer much concrete advice.
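One concrete starting point, though: rake middleware prints the stack, and each unused entry is one line to remove. A sketch - the two classes named here are only examples, check your own stack first:

    # config/application.rb - drop middleware your app never exercises
    config.middleware.delete Rack::Sendfile
    config.middleware.delete Rack::MethodOverride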
I just wish someone would find it in their economic interest to do for Ruby what V8 did for Javascript. Doesn't seem like there is any fundamental reason for Ruby to run slower - it's just a matter of putting the (substantial) resources in to make it happen. JRuby might be the solution to this though, if they can catch up with MRI's functionality and stay on par.
> Doesn't seem like there is any fundamental reason for Ruby to run slower.
I think that all the metaprogramming magic that Ruby allows is at odds with speed. That is not to say that it's impossible to make it fast, but it's probably tricky.
> I think that all the metaprogramming magic that Ruby allows is at odds with speed.
I don't think so, Java and C# also have metaprogramming (reflection, class loading, bytecode manipulation) and they are still plenty fast.
Ruby's liability is that it's dynamically typed, and this puts a hard limit on how fast it can be (and also on how toolable it can be, but that's a separate discussion).
It's not just the actual metaprogramming, i.e. define_method is slow. It's also the runtime implications of metaprogramming being possible, i.e. needing to dispatch each and every method call dynamically just in case someone called define_method. Java can turn every method call into a function pointer at compile time.
Metaprogramming is code using define_method (optionally closing over local variables), instance_eval, and stuff of that nature. Even with bytecode manipulation they would be very difficult to emulate in Java/C++ and would be quite slow.
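To make that concrete, here's the sort of thing meant by metaprogramming: a method defined at runtime that closes over a local variable, which no ahead-of-time compiler can bind statically.

    class Greeter
      greeting = "hello"                # a plain local variable...
      define_method(:greet) do |name|
        "#{greeting}, #{name}"          # ...captured by the runtime-defined method
      end
    end

    Greeter.new.greet("world")  # => "hello, world"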
That said, JavaScript is not far off of Ruby's monkeying capabilities and yet pretty darned fast.
I mean, metaprogramming support is kind of a spectrum, and neither Java nor C# supports it to the same degree as Ruby, for one thing. And for another, C# and Java have large corporations behind them that try very hard to make the runtimes fast, so it's not a fair comparison.
I said tricky, not impossible. But I'll bite, even though I'm not sure I'm really qualified to talk about these languages, for one thing; and for another, I think these sorts of comparisons are pretty pointless to begin with. Also, I'm not sure where exactly you're going with your comment.
Lisp + Scheme: I mean, if nothing else, people have been trying to make them fast for more than half a century now. Plus, AFAIK most Lisp metaprogramming code is generated at compile time through macros, not at runtime.
Dylan: I don't really know anything about Dylan but it seems to have come out of Apple and CMU, so there's that. Plus it seems to be just a Lisp dialect.
That doesn't seem right. Can you give an example? I use lots of js metaprogramming, but it mostly involves wrapping and passing functions, mixing in methods and dynamic getters etc. I don't think I've ever actually had to use eval, or anything eval-like.
OTOH, grep for instance_eval and class_eval in rails source.
Right, you should never ever use eval, but I don't think a lot of the things commonly done in Ruby could be achieved in JS without eval. And what you are talking about is not exactly meta-programming.
> OTOH, grep for instance_eval and class_eval in rails source.
I mean what's your point. Yes, I realize that both are used profusely, which is one of the things that this discussion is about.
> And what you are talking about is not exactly meta-programming.
I didn't describe metaprogramming, just how I usually see common metaprogramming "magic" implemented in JavaScript. The only reason you'd really need eval in JS metaprogramming is if, for some semantic reason, it was nicer to accept a string of JavaScript somewhere. AFAIK eval doesn't "get" you anything in JavaScript (it doesn't "unlock" private closure scope or anything, and context injection is available using #call or #apply), and I almost never see it used, unless I'm forgetting something (totally possible, I've been up all night).
> I mean what's your point. Yes, I realize that both are used profusely
My point is I don't see eval used profusely in javascript, which contradicts your assertion in a).
Edit: nvm, I stand corrected: (function(){ var x = 1; eval('console.log(x)'); })() prints 1. Neat. But I still rarely see this used; the callback-plus-context-passing approach is much more common.
For a while Phusion offered Ruby Enterprise Edition[1], which improved on the Ruby 1.8.7 interpreter with changes such as:
> * A “copy-on-write friendly” garbage collector, capable of reducing Ruby on Rails applications’ memory usage by 33% on average.
> * The tcmalloc memory allocator, which lowers overall memory usage and boosts memory allocation speed.
> * The ability to performance tune the garbage collector.
> * The MBARI patch set, for improved garbage collection efficiency.
> * Various analysis and debugging features.
They've since dropped support of it because:
> * A copy-on-write patch has recently been checked into Ruby 2.0.
> * Many of the patches in Ruby Enterprise Edition are simply not necessary in 1.9.
So it sounds like many of the improvements offered in Ruby Enterprise Edition 1.8.7 have been rolled into Ruby 2.0. Do you have any suggestions of what might be included in a Ruby Enterprise Edition 2.0?
There will be no Ruby Enterprise Edition 2.0. REE has been discontinued.
The original goal of REE was to incorporate our copy-on-write friendliness patches into Ruby, and to distribute it in a user-friendly form. Since then, its goal has been extended to include other useful patches as well. We've discontinued Ruby Enterprise Edition for the following reasons:
1. Ruby 2.0 incorporates (or obsoletes) all these improvements.
2. Ruby 1.8 is no longer supported even by its upstream authors.
3. We have limited resources and wanted to focus on Phusion Passenger. Since the discontinuation of REE we've made tremendous improvements in Passenger.
This presumes the request time here is down to CPU time. Odds are that some underlying algorithmic or I/O change in Rails slowed things down proportionally, regardless of how fast the interpreter is.
> Twitter was in a nice position to better Ruby in terms of performance, though they chickened out and escaped to Scala.
Actually, they escaped to the JVM, and if their job reqs are any indication, they hire massively for Java engineer positions and hardly at all for Scala ones.
I wouldn't call that chickening out, more common sense.
Rails is great for prototypes and toy apps, but once you start needing scale, it's simply not up to the task.
I'm curious too how much of each of these is still built on Rails, and how much has been tweaked, tuned and rewritten to be 'inspired' by Rails, but no longer Rails.
Good question. I would say it varies, but they're most likely not changing things dramatically; at least that's the impression you get from watching talks from both companies[1].
I've browsed this discussion and I must be missing something.
You seem to imply that Ruby's/Rails' speed matters?
Usually the DB is the limiting factor in most web applications; few should need to execute more than a few milliseconds of (even) Ruby per page load. That is why people use scripting languages, after all.
Shouldn't this slowdown point to the ORM in this version? I'd look at the access patterns of normal requests under the new Rails version and the old. How has the SQL changed?
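Comparing the generated SQL across versions is straightforward from a console on each app; ActiveRecord's logger is the standard hook (the Post query below is just a stand-in for whatever your hot path touches):

    # in a rails console on each version: echo every SQL statement, then
    # exercise a typical request path and diff the two outputs by eye
    ActiveRecord::Base.logger = Logger.new(STDOUT)
    Post.where(published: true).limit(10).to_a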
Sure, Twitter is different since that probably is processor bound.
(Since RoR is a popular framework with lots of eyes on it, the cause shouldn't be anything simple like web server configuration, locking, etc.)
I guess if you wanted Scala programmers (for the long term, not just a 3-month contract) you would hire Java programmers, because a) there are a lot more of them, so the talent pool is larger, b) teaching a Java guy/gal Scala isn't going to be difficult, and c) they may be cheaper to get (a pure guess)?
That's a bit of an oversimplification. Here is a more in-depth explanation, in the form of a video of a talk called 'Twitter: From Ruby on Rails to the JVM'[1] from O'Reilly Con, by Twitter engineer Raffi Krikorian, who leads the Application Services group.
I didn't watch this video, but as far as I recall from back then, the guys at Twitter were complaining about the slowness of Rails and MRI: Rails was not asynchronous and MRI was CPU-bound. So instead of fixing the root cause by applying V8-style tricks to Ruby, they chose the more "pragmatic" route: Netty, which is asynchronous, and Scala, which is way faster.
I don't accuse them of not saving the world, but Ruby and Rails lost a great opportunity to dominate the web community. If Rails were as performant as Node.js, I doubt lots of companies would change gears.
> I don't accuse them of not saving the world, but Ruby and Rails lost a great opportunity to dominate the web community.
They didn't lose this opportunity because of Twitter, they lost it because the fact that Ruby is dynamically typed puts a hard limit on how fast it can be. Twitter tried very hard to make it scale before making the decision to switch to Java, and they just couldn't do it.
The task was just impossible, switching to Java was the only reasonable decision given their constraints.
JavaScript is also a dynamically typed scripting language, which Google managed to make at least an order of magnitude faster with V8. It's likely the same tricks could be applied to Ruby.
Google has a full-time team of some of the best compiler writers in the world wholly dedicated to tuning Javascript, and they have spent literally years to get it to perform how it does today, and it's still far from competing with the JVM, and will likely never be able to do basic things such as multithreading and proper type safety. Twitter was absolutely right in doing what they did.
OTOH, they don't use any top-secret innovations, just decades-old techniques accessible to everyone.
And language benchmarks don't tell the whole story. In Java, idiomatic, typical code gets optimized very well, whereas in JavaScript such code easily runs at least an order of magnitude slower than its potential. Try running some idiomatic JS through https://github.com/petkaantonov/nodeperf/ and watch it explode :)
So even if Javascript can be shown in some benchmark to close on the JVM, the benchmarks are ignoring the fact that you cannot write Javascript carelessly to get anywhere near those speeds unlike with Java.
LuaJIT shows dynamically typed languages can be fast. And even plain Lua is enormously faster than Ruby. Complexity costs, not necessarily being dynamic.
I think organization type plays a role here too. RoR is an excellent fit for startups where speed of iteration is paramount; but once an organization settles on a business model and the functionality matures a bit, the code stops changing so rapidly and rapid development stops mattering as much. At the same time, performance becomes increasingly critical due to the costs of scale. Hence a switch to something like Scala makes perfect sense.
They switched to the right tool for the job, with great success. They've contributed heaps of open source code to the community. (https://github.com/twitter)
But they didn't magically make Rails fast, so screw them?
Twitter has probably helped this community more than any other technology company. They built Bootstrap to help start all of your companies, and open-sourced Mesos for when you get big. If the Ruby VM had been salvageable, they would not have rewritten that monolithic app into services. A tuned JVM is a great compromise between performance and development flexibility.
One of the great problems plaguing MRI is the lack of real concurrency: a global interpreter lock plus a single-threaded, single-generation, stop-the-world garbage collector. That is many years behind the state of the art as far as runtimes go.
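The GIL half of that is easy to demonstrate: CPU-bound work gets no speedup from threads on MRI. A self-contained sketch:

    require 'benchmark'

    work = -> { 5_000_000.times { |i| i * i } }

    serial   = Benchmark.realtime { 2.times { work.call } }
    threaded = Benchmark.realtime { 2.times.map { Thread.new(&work) }.each(&:join) }

    # on MRI the two timings come out roughly equal (the GIL serializes the
    # threads); on JRuby the threaded run is close to half the serial time
    puts serial, threaded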
Dynamic webpages consisting of several widgets are quite amenable to parallelization by rendering subviews separately and just assembling them in the layout in a final step. With parallelism slow single-threaded performance can still yield reasonable response-times if you can throw multiple cores at the problem.
But even if you have parallelism, a slow single-threaded GC would halt all of those threads, again affecting your response times.
I don't know about V8, but Mozilla's JS engine suffers from similar issues. They're currently working on a generational GC, and it promises to provide quite a speed-up. But the JavaScript runtime, just like Ruby's, was not designed with parallelism in mind; that's why we're seeing web workers, which essentially spawn an isolated JavaScript environment. It's simply hard to tack parallelism on as an afterthought.
Similarly, one of Ruby 2.1's major performance boosts stems from its generational GC.
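On 2.1 you can watch the generational behaviour directly; minor (young-generation) collections should vastly outnumber full major ones:

    # Ruby 2.1+: GC.stat exposes per-generation collection counts
    stats = GC.stat
    puts "minor GCs: #{stats[:minor_gc_count]}, major GCs: #{stats[:major_gc_count]}"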
There was a good talk by Charles Nutter at Baruco this year about JRuby. It looks to be pretty much there, and from the benchmarks performance looks a lot better than MRI.
Last time I looked into it, JRuby itself was faster than MRI, especially if you take multithreading into account. But Rails-on-JRuby was a different story, because some of the metaprogramming things Rails does are terrible for JIT compilers.
These are what Charles ominously called the "dark matter" inside Rails.
I've upgraded several apps (I've been on the Rails train since 1.2), and each time there have been slight, and more often major, performance gains. Yes, some of these apps were big. (Not Twitter big, but big enough. And no, they didn't need anything too crazy hardware-wise.)
Almost all Rails apps that perform poorly suffer from architectural flaws or oddities in the code. A lot of the time, Rails takes a beating when the problem boils down to slow, ill-formed SQL queries. One time I had a customer who wanted me to rewrite their platform because "Rails was so slow", when in fact they did heavy image processing and uploads to cloud storage inside the browser request. I'm not saying the OP does any of this, but since I don't know any details about their codebase or infrastructure, I'm making general remarks that may or may not apply.
Ruby code can be written in so many ways, I guess that is its blessing and its curse. There are bad ways to do things, and there are good ways. With great freedom comes great responsibility.
I would encourage OP to push on and upgrade to Rails 4.0.1. The gap between Rails 3 and Rails 4 isn't that huge at all. I'm sure you'll be pleased.
Rails is not for building big apps. It works better for small, even quick and dirty apps. When you want to build something that is big or that will evolve into big, choose a better language and don't rely on a framework to save you. In fact even big apps are better broken up into many separate functional apps integrated by things like an RDBMS or a message queue broker or a NoSQL cluster.
And you really need to think hard about caching. There are many layers at which you can cache stuff and when you get into caching parts of pages, there is a whole architectural design issue around how to divide things up.
In solving these kinds of problems, Rails and its overly simplified ActiveRecord pattern just don't give you much wiggle room.
I've been thinking a lot of the same thoughts about Django. I'm tempted -- but far too lazy -- to write a decision tree for using Django (which would probably also apply to Rails). I really love your comment: "Don't rely on a framework to save you."
Django is great for consultants/contractors who need to crank out work fairly quickly that may be quite similar to other jobs they've got on the docket. It's great for new programmers or programmers who aren't familiar with the OS-level interactions between a web server and a programming language. It's great for apps with very low complexity (current and projected).
Once you start getting out of these use cases though, IMO, you are asking for trouble getting too invested in the Django (or Rails) ecosystem. It's worth it to learn about how the web server interfaces with the language runtime, so that you can capitalize on microframeworks. When you buy into a framework, you're buying into a huge set of opinions about really important things that have been made based on being flexible enough for about anyone.
I love Django for what it has enabled me to do in my life, and have learned so, so much from getting into its guts to solve problems. That said it would take a lot for me as a salaried, in-house software developer to start a new project using (Django | Rails).
It's not that Rails is that fast (it isn't); they simply cache every damned piece upfront. When your requests barely ever touch Rails but are satisfied from the cache, you can use nearly anything as your backend stack.
This is exactly why I started to use Rails recently. If it's pretty easy to cache everything, and you get the amazing productivity boosts Rails provides, then there's really no downside.
My first Rails app is approaching 4k lines of code, and responses cached with fragment caching usually render in 5-8ms on a micro EC2 instance (aka really bad hardware). I'm using MRI with Rails 4.0.1 and I haven't done any tweaking: just basic caching that was trivial to implement, plus running Rails in production mode, which is just setting an ENV variable.
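For the curious, the fragment caching in question is just the standard cache view helper; something like this, where the product record and template paths are placeholders:

    <%# app/views/products/show.html.erb - Rails 4 digests the template,
        so the record alone is enough for the cache key %>
    <% cache product do %>
      <%= render partial: "products/summary", locals: { product: product } %>
    <% end %>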
You'll encounter the downsides when you have to do some serious number crunching or analytical queries, or when you have highly dynamic views that aren't cacheable.
But that's not Rails' fault; we all know Ruby itself isn't the fastest language, and the "bloat" of Rails just compounds that a little.
As long as you have enough cheap hardware to throw at all these problems, everything should be fine, though.
I'm not too afraid of number crunching, because if I do need some analytical report crunched I'll just chuck it in a Sidekiq worker and use the whenever gem to set up a cron job.
I've already set up Sidekiq to handle e-mails, and whenever is generating a new sitemap once a day.
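In case it helps anyone, the worker side of that is tiny. A sketch using Sidekiq's standard Worker mixin - the class name and job body here are made up:

    class ReportCruncher
      include Sidekiq::Worker

      def perform(report_id)
        # the heavy analytical crunching runs here, off the request cycle;
        # look up the report by report_id and do whatever your app needs
      end
    end

    # enqueue from a controller action or a whenever-scheduled cron task
    ReportCruncher.perform_async(42)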
And tbh I don't think I'll ever go too crazy with highly custom analytical queries either, because Google Analytics is quite strong with custom event trackers, and tools like New Relic are excellent for system health and performance metrics.
As for highly dynamic views that can't be cached, I'll worry about them when the time comes. Maybe there's a way to set up Varnish or nginx with SSIs to deal with that; it's something I've never researched in depth, but I have to assume it's a solved problem at this point, given a little elbow grease and technical knowledge.
The original article seemed to be saying "Rails shouldn't get slower with every release" but almost everybody seems to be replying as if it said "Please help me make my Rails application go faster because I'm an idiot" ... did I just misread?
Thanks for this, it's a pretty useful comparison chart.
I don't know why the Rails community gets their panties in a bunch every time someone brings up the poor performance of Rails. It is a slow, bloated framework that requires you to throw hardware at your scaling problem instead of writing decent code. In the end, there is only so much hardware you can buy before your app stops scaling and you need a complete rewrite, the way Twitter did.
Ruby is slow, and Rails is slow, although both are much better than they used to be.
But this isn't bloat. Rails is a large framework that provides moderately sensible defaults. The tradeoff of that is there's a lot of stuff in there which might not be used by every app. This, and the overall flexibility of the framework, can cause performance to suffer, and combined with Ruby's relatively lacklustre performance, that can cause big scaling problems.
Twitter is a poor example though, given the somewhat unique problem space it's tackling. But - building a standard e-commerce or service site? No problem. Getting bigger? Add some hardware. Still not enough? Start customising the framework. Because it's a toolset, not a one-size-fits-all solution. And if you've got outsize scaling requirements, you are going to have to get your hands dirty.
You can easily get your Apdex back up without blaming Ruby. There is nothing inherent to Ruby that should stop you from doing that.
More likely than not, you have a couple of nasty requests that should be handled by a background task, plus a few places that could use caching.
Additionally, from your users' perspective, ~300ms is already pretty good; I wouldn't worry about that too much. You use Ruby on Rails because it's speed on speed for rolling out features. Focus on a better user experience with pretty good performance and you guys will be alright.
> Maybe a partnership with New Relic could help the Rails team to see the real world impact of their decisions on the actual apps
Last time I used New Relic, they gave you the option of sending anonymised statistics to the Rails core team.
The stats given in support of this claim are very weak as well. An overall average is pretty useless for determining the cause; the app could be faster overall, with a handful of high outliers skewing the average for some reason. We need more detailed information.
On a side-note, it took me 3 tries before I got the site and not an HTTP 503 error which is rather ironic given the situation.
While I'm new to the Rails scene and can't speak to the 2.x -> 3.0 performance change, what surprises me is that the OP is upgrading because of newer gem support.
Is this purely a Rails upgrade, or is the OP upgrading other gems at the same time?
You're using New Relic: use it, and figure out exactly where things are getting slower. Trace the entire stack. You can pinpoint exactly where those pages' responses got longer, and then suggest to the Rails community what's slow for your application.
The platform is only as good as the community makes it. Maybe it's missing your case's optimizations because you haven't given specific feedback?
New Relic isn't going to tell you that route URL generation has gotten 2x slower, or that calling pluralize 100 times is causing the slowness. All you'll see is that a template takes a while to render.
I'm inheriting a Rails app and I'm thinking of switching it over to Node.js for performance reasons. Most of the code currently implemented is pretty basic, and it's all Rails, but we're already having performance problems and we're still in beta. I've used Rails before, but I'm no guru, and I wanted to know if someone could point me to some resources for tweaking Rails performance?
Definitely stick with Rails; the framework will give you the training wheels you probably need. Switch to Node.js and you'll probably end up with a bowl of noodle-code soup.
Yeah, I've really been liking what I've seen from Rails so far. It seems like there is a lot of support for a lot of functions/features that you just generally need which is awesome.
I realize your comment is snark, but the lesson I take away is: build your app in Rails, then, if you become a multi-billion-dollar IPO-level success, rewrite...