How about this: understand what the hell you're doing.
For 99% of the web, 531 requests a second is beyond what you need. Not everything on the web needs to respond to 7000+ requests a second. In fact most of it doesn't. Heck, most of it doesn't need to respond faster than 1 second. In my experience, most of the poor performance in Ruby web applications is caused by trying to shove too many operations into the same process and making all data "first class". This is probably in part because Rails for years didn't include a built-in jobs system.
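For concreteness, here's a minimal sketch of what "getting work out of the request process" can look like with ActiveJob, which Rails has bundled since 4.2. The Report model, job, and controller names are hypothetical placeholders, not anything from the article:

```ruby
# A minimal sketch of moving slow work off the web request with ActiveJob
# (bundled with Rails since 4.2; subclass ActiveJob::Base on older versions).
# Report, report_params, and generate_pdf! are hypothetical placeholders.
class ReportGenerationJob < ApplicationJob
  queue_as :default

  def perform(report_id)
    report = Report.find(report_id)
    report.generate_pdf!              # the slow part, now off the request path
  end
end

class ReportsController < ApplicationController
  def create
    report = Report.create!(report_params)
    ReportGenerationJob.perform_later(report.id)   # enqueue and return immediately
    head :accepted
  end
end
```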
The only thing "web-scale" means is that something serves HTTP requests through the web in a way that people find useful. If so, then any language that can create sockets is potentially web-scale, including COBOL. Beyond that requirement, there's no such thing as "web scale". All web companies like Facebook, Amazon, Twitter, Apple, etc., operate at different scales and adjust to their own needs accordingly.
"web-scale" is a fairly established term (overused sure, but established). Arguing that most sites are not web-scale is beside the point.
As someone who's spent a lot of time scaling up Ruby servers (I like to think I know what I'm doing), I can say it's not ideal for high-scale applications, for a few reasons:
1) It's hard to scale: just look at the thousands of Stack Overflow posts on MRI garbage collection tuning, Unicorn forking, Puma clusters and RoR performance to see that people struggle with it long before any huge scale. Other languages/frameworks are often 30x faster out of the box and cause fewer headaches when scaling up (we've been using Go lately). Ruby is easier to write and I love it for that, but it's much harder to scale. It takes more effort in the long run for high-scale apps. (A sketch of the kind of config tuning I mean follows at the end of this comment.)
2) The community tends to design for smaller scales. The Ruby community is awesome, but as a result of focusing on lower-traffic apps and single-threaded/single-process web servers, a lot of the gems aren't designed for high throughput or concurrency. For example, ActiveRecord's connection pooling is often painful. I also found a pretty critical concurrency issue in the very popular delayed_job gem a few years back which had gone unnoticed for years.
3) MRI has performance issues: the GIL, a slow GC, and slow threading make it a slow runtime. It's fine for a lot of use cases, but try serving images or video, or doing heavy processing, and you'll quickly hit its limits.
You absolutely can make Ruby work at huge scale, but it's much harder. I wouldn't recommend it if you hope/plan to require high-scale.
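For what it's worth, the tuning mentioned above usually starts with something like the following: a clustered Puma config sized against the ActiveRecord connection pool. This is only a minimal sketch; the worker/thread counts and environment variable names are illustrative, not recommendations.

```ruby
# config/puma.rb -- a minimal sketch of a clustered Puma setup.
# Worker and thread counts are illustrative only.
workers Integer(ENV.fetch("WEB_CONCURRENCY", 4))         # forked worker processes
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count                     # min/max threads per worker

preload_app!                                             # load the app before forking (copy-on-write friendly)

on_worker_boot do
  # Each forked worker needs its own ActiveRecord connections; the pool size
  # in config/database.yml should match RAILS_MAX_THREADS.
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```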
1) All it really changes is the ratio of application servers to DB servers. If you're running $40k of servers a month it might be worth it. Otherwise, probably not. At the time Twitter was struggling with scale they were also still running on a single MySQL server.
2) Rails has been thread safe since 2008. Ruby/Unicorn was serving some of the busiest sites on the internet in 2010. Go only went stable in 2012.
Connection pooling in other frameworks is usually even worse. Rails is the worst web framework, except for everything else.
delayed_job wasn't designed to be high-throughput or thread-safe, but Resque and then Sidekiq certainly were. (A minimal Sidekiq worker sketch follows at the end of this comment.)
3) The GIL also massively simplifies the Ruby internals for the core dev. Some platforms don't have performant hardware mutexes. Even dealing with OSX, Linux, Windows and *BSD is a nightmare.
The GC is actually really fast in Ruby 2.1+
Images shouldn't be served directly through Rails, even if dynamically generated. You're asking for a world of pain doing that.
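To make the Sidekiq point concrete: a worker is just a plain Ruby class whose perform method Sidekiq runs concurrently on a thread pool inside a single process. This is a hedged sketch; the class name, queue, and Photo model are hypothetical.

```ruby
# A minimal Sidekiq worker sketch. Sidekiq runs many of these concurrently
# on threads inside one process, so perform has to be thread-safe.
# ThumbnailWorker, the "images" queue, and Photo are hypothetical.
require "sidekiq"

class ThumbnailWorker
  include Sidekiq::Worker
  sidekiq_options queue: "images", retry: 3

  def perform(photo_id)
    # Stick to per-call state; any shared mutable state would need a lock.
    photo = Photo.find(photo_id)
    photo.generate_thumbnail!
  end
end

# Enqueue from anywhere in the app: ThumbnailWorker.perform_async(photo.id)
```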
I previously worked on a high-traffic mapping application where we generated the map tiles on the fly in a Rails app and served the images directly. We had no scaling issues caused by Ruby/Rails, but we did have plenty of CPU load from ImageMagick!
1) That ratio can be 40:1+ for a lot of use cases. That adds up in costs and if you have any other limitations (DB connections) it can be a total deal breaker.
2) I said many gems, not Rails. Connection pooling in other frameworks is worlds better (Go's database/sql package comes to mind). We're talking about now, not 2012.
3) Sure, the GIL is convenient when you're thinking about code, but the point is that it makes high scale hard. The 2.1 GC is only fast by Ruby standards. "Images shouldn't be served through Rails" < exactly my point: other languages handle this with ease. If you're building a high-scale app, you can choose one language + framework, or RoR AND another once things get big/hard/tough.
IMO, RoR (and Ruby in general) makes easy things easier, but hard things harder. A good tradeoff for some projects, but not big or high-scale ones.
1) I honestly have not seen a ratio that high for a web service. Most of the hot paths in serving a web request through Ruby are actually written in C, so on the same machine I can only serve about 2x as many simple JSON requests through Gin as through Sinatra. Once you add in waiting on IO for a DB call, that gap narrows until it basically disappears.
What blows it out to 40x? Some sort of live image/sound manipulation?
If it were an inherent problem with Ruby, you wouldn't see such fantastic performance from nginx_mruby.
2) Sure, the Go connection pool is nice, but the Go ORMs are horrible as they try and work around the warts of the language. Everything's a trade-off, really. You give up terseness and flexibility for performance.
3) The GIL is definitely not convenient when thinking about code. Threads in Ruby still require you to write thread safe code. The GIL exists to make the _Ruby VM_ simpler. Otherwise you end up in the situation of Lua, where you have to re-initialize the VM in every thread and essentially have sub-interpreters, or you build something incredibly complex like the JVM.
The 2.1+ GC usually spends less than 1% of time in collection. That's fast by any language standard. The next step for Ruby will be a method JIT. There are two promising method JITs in development using LLVM as a backend.
Images shouldn't be served through a dynamic web app regardless of your language. They should be served directly via a web server to a cache/CDN, even if you're generating them on the fly like dynamically composed map tiles. People pan away. They pan back. You don't want to regenerate the tile.
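Here's a hedged sketch of the "generate once, let the cache serve repeats" approach described above, using standard Rails HTTP caching helpers. The controller, TileRenderer, and z/x/y parameters are hypothetical; the point is just that a long-lived, public Cache-Control header lets a CDN or reverse proxy absorb the repeat pans.

```ruby
# Sketch of serving dynamically generated map tiles with cache headers so a
# CDN or reverse proxy handles repeat requests. TileRenderer and the z/x/y
# route params are hypothetical.
class TilesController < ApplicationController
  def show
    # Long-lived, publicly cacheable response: the app only pays the image
    # generation cost on a cache miss.
    expires_in 1.day, public: true

    tile = TileRenderer.render(params[:z], params[:x], params[:y])
    send_data tile, type: "image/png", disposition: "inline"
  end
end
```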
>The 2.1+ GC usually spends less than 1% of time in collection. That's fast by any language standard. The next step for Ruby will be a method JIT. There are two promising method JITs in development using LLVM as a backend.
I have read this a number of times. One method JIT has been shown so far; what is the other one?
Did you read the article past the benchmark numbers? That's more-or-less exactly the point the author made by the end. To wit:
> The majority of web applications handle far less than 1000 requests per second. I’d go as far as to say that most web application developers are employed by a company whose entire webapp does far less than 1000 requests/second. Most of them do less than 1000 requests/minute.
> When your application is big enough to be doing 20,000 RPM, you will have anywhere from a half-dozen to even fifty engineers, depending on your application’s domain. A single software engineer costs a company at least $10,000/month in employee benefits and salary. So we’re choosing our frameworks based on saving one-third of an engineer per month? And if that framework caused your development cycles to slow down by even one third of a mythical man-month, you’ve increased your costs, not decreased them. Choosing a web framework based on server costs is clearly a sucker’s game.
> What does web-scale even mean?
It's an old-ish meme, from this video making fun of MongoDB. [1]
Yup. If Ruby is good enough for GitHub, SoundCloud, and many, many other large-scale companies, Ruby is good enough for anyone. Knowing what the hell you're doing is half the battle. I've heard places convert from a Ruby stack to Java, and a lot of it was just to hide the fact that the programmers didn't know what the hell they were doing.
> I've heard places convert from a Ruby stack to Java
Twitter.
And for good reason.
The problems that Github and Soundcloud have aren't related to their web front ends. Bandwidth, storage and a host of other issues are probably more front and center.
Twitter has these same problems, but for a different set of reasons. Their web front ends are doing a hell of a lot more work relative to the amount of data coming and going. How big is a tweet vs. a repo vs. a song?
Yes, they are all on the web, but these three companies could not be more different.
> I've heard places convert from a Ruby stack to Java, and a lot of it was just to hide the fact that the programmers didn't know what the hell they were doing.
Why go from Ruby to Java? Because you want to invest in an ecosystem that is far more robust and far richer. You don't want a pile of Ruby programmers off in a corner doing their own thing with a bunch of containers while you HAVE to hire Java guys to deal with the large-scale problems that organizations of a certain size face.
At the time Twitter made this choice, it was the ONLY choice to make. They didn't have the list of good options we have today (Go, Scala, Haskell). The reality is that they would STILL probably choose Java, because it is still easier to find talent and it still has a richer ecosystem.
Let's make a comparison here... PHP vs. Ruby, and let's talk about the devs. I am hard-pressed to find a decent PHP dev who won't throw PHP under the bus (or out the window) when they get outside what it is good for. I am hard-pressed to find a Ruby dev who doesn't fully believe that the solution is always more Ruby. There are plenty of good Ruby devs (great ones, in fact) who swear that Ruby is the only way!
Article is obnoxiously long and borderline idiotic. Dude spends way too long talking about way too much stuff, and in scanning the words all I saw was fluff. TL;DR, nonsense, typical Medium blog