Yup. We have gotten into the habit of leaving a lot of potential performance on the floor in the interest of productivity/accessibility. What always amazes me is when I have to work with a person who only speaks Python or only speaks JS and is completely unaware of the actual performance potential of a system. I think a lot of people just accept the performance they get as normal even if they are doing things that take 1000x (or worse) the time and/or space than it could (even without heroic work).
I think it's even stronger than a habit. When you're exposed to the typical "performance" of the web and apps for a decade or so, you may have forgotten about raw performance entirely. Young people may have never experienced it at all.
I once owned a small business server with a Xeon processor, Linux installed. Just for kicks I wrote a C program that would loop over many thousands of files, read their content, sort in memory, dump into a single output file.
I ran the program, and practically as soon as I ran it, it was done. I kept upping the scope and load, but it seemed I could throw anything at it and the response time was zero, or at least perceived as zero.
Meanwhile, it's 2022 and we can't even have a text editor place a character on screen without noticeable lag.
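For the curious, the shape of the small program described above is roughly this (the original was C on bare metal; this is a hypothetical Python sketch of the same idea, with made-up function and path names):

```python
from pathlib import Path

def merge_sorted(input_dir: str, output_file: str) -> int:
    """Read every file under input_dir, sort all lines in memory,
    and dump them into a single output file. Returns the line count."""
    lines = []
    for path in Path(input_dir).iterdir():
        if path.is_file():
            lines.extend(path.read_text().splitlines())
    lines.sort()
    Path(output_file).write_text("\n".join(lines) + "\n")
    return len(lines)
```

Even this interpreted version flies through thousands of small files; the C original simply has no interpreter or VM startup in the way at all.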
Shit performance is even ingrained in our culture. If you ran a web shop with a "submit order" button that, when clicked, instantly said "thanks for your order", people would call you. They'd wonder if the order got through.
Shit performance is what happens when every response to optimizations or overhead is immediately answered with "premature optimization is the root of all evil."
Or the always fun "profile it!" or "the runtime will optimize it" when discussing new language features and systems.
So often performance isn't just ignored, it's actively preached against. Don't question how that new runtime feature performs today or even dare to ask. No no no, go all in on whatever and hope the JIT fairy is real and fixes it. Even though it never is and never does.
There's a place for all the current tech, of course. Developer productivity can be more important at times. But the tradeoffs should be far better known, and there should be more rough optimization guides than there are.
I think the issue isn't even individual developers, it's indeed the runtime itself. Anything you build on top of it is laggy.
Take my simple example of reading a file, processing it in memory, writing output. A process that should be instant in almost any case.
An implementation of such a process that is commonly used in the front-end world is CSS compilation, where an SCSS file (which is 90% CSS) is compiled into normal CSS output. The computation is pretty simple: it's all in-memory, plus some reshuffling of values.
In terms of what is actually happening (if we take the shortest path to solve the problem), this process should be instant. Not only that, it could probably handle a thousand such files per second.
Instead, just a handful of files takes multiple seconds. Possibly a thousand times slower than the shortest path. Because that process is a node package with dependencies 17 layers deep running an interpreted language. Worse, the main package requires a Ruby runtime (no longer true for this example, but it was), which then loads a gem and then finally is ready to discover alien life, or...do simple string manipulation.
To appreciate the absurdity of this level of waste, I'd compare it to bringing the full force of the US army in order to kill a mosquito.
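To put a rough number on the "should be instant" claim: even in interpreted Python, a toy stand-in for the variable-substitution part of SCSS compilation chews through a thousand synthetic files in a fraction of a second (hypothetical sketch; real Sass does far more than this, but it is still just string and tree work):

```python
import time

# A thousand synthetic "files": SCSS-ish text with one variable to resolve.
files = [f"$color: #00{i % 10};\n.box {{ color: $color; }}\n" * 50
         for i in range(1000)]

def compile_one(src: str) -> str:
    """Toy stand-in for SCSS compilation: collect $variables, then
    substitute them into the remaining lines."""
    env = {}
    out = []
    for line in src.splitlines():
        if line.startswith("$"):
            name, _, value = line.partition(":")
            env[name.strip()] = value.strip().rstrip(";")
        else:
            for name, value in env.items():
                line = line.replace(name, value)
            out.append(line)
    return "\n".join(out)

start = time.perf_counter()
results = [compile_one(f) for f in files]
print(f"{len(results)} files in {time.perf_counter() - start:.3f}s")
```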
It's in end user apps too, and spreading. Desktop apps like Spotify, parts of Photoshop, parts of Office365...all rewritten in Electron, React, etc.
I can understand the perspective of the lonesome developer needing productivity. What I cannot understand is that the core layers are so poor. It means that millions of developers are building millions of apps on this poor foundation. It's a planetary level of waste.
Hmm, I recently built a site on Zola, and rebuilding the whole blog (including a theme with around 10 files of Sass) takes a few dozen milliseconds, and around 1 second on a 15-year-old Core 2 Duo. But then again this is compiled Rust calling into libsass, which (despite Rust's dependency auditing nightmare) compiles to low-overhead executables. And apparently libsass is now deprecated in favor of Dart Sass, which relies on JS or the Dart VM.
In my experience, one of the most common causes of slowness is IO when there should be none. I’ve managed to speed up some computations at my company by over 1000x by batching IO and keeping the main computational pathways IO-free.
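A hypothetical sketch of that kind of change, using an in-memory SQLite database as a stand-in for whatever backing store is involved (all names here are made up): the slow version does one query per item inside the loop, the fast version batches a single query up front and keeps the summation loop IO-free.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (item_id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO prices VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(1000)])

def total_slow(item_ids):
    # One round trip per item: IO inside the hot loop.
    total = 0.0
    for item_id in item_ids:
        row = conn.execute("SELECT price FROM prices WHERE item_id = ?",
                           (item_id,)).fetchone()
        total += row[0]
    return total

def total_fast(item_ids):
    # One batched round trip up front; the main loop is IO-free.
    placeholders = ",".join("?" * len(item_ids))
    prices = dict(conn.execute(
        f"SELECT item_id, price FROM prices WHERE item_id IN ({placeholders})",
        item_ids).fetchall())
    return sum(prices[i] for i in item_ids)
```

With a real network database the per-item round trips dominate even harder, which is where the 1000x-style wins come from.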
The Java and Python runtimes, which have much better test coverage and higher correctness standards than most enterprise applications, shipped a broken sort method for decades because it was a few percent faster. Never mind that for some inputs the returned value wouldn't actually be sorted.
As an industry we're not qualified to even start caring about performance when our record on correctness is so abysmal. If you have a bug then your worst-case runtime is infinity, and so far almost all nontrivial programs have bugs.
Wouldn't "profile it!" be the exact opposite of ignoring performance wins? It tells you which optimizations will noticeably improve your performance and which are theoretical gains that make no difference to realistic workloads.
It's a dismissive answer. It'd be like if someone asked "why does 0.2f + 0.1f print 0.30000000001?" and getting back an answer of "use a debugger!" It's not strictly wrong, the debugger would provide you with the data on what's happening. But it doesn't actually answer the question or provide commentary on why.
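(For the record, that float question has a short concrete answer; here it is with Python doubles, the commenter's `0.2f + 0.1f` being the single-precision version of the same effect:)

```python
import math

# 0.1 and 0.2 have no exact binary representation; the stored values
# are slightly off, and their sum rounds to a double just above 0.3.
print(0.1 + 0.2)             # 0.30000000000000004
print(0.1 + 0.2 == 0.3)      # False

# The usual fix: compare with a tolerance rather than ==.
print(math.isclose(0.1 + 0.2, 0.3))  # True
```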
Similarly, the "profile it!" answer is often used when the person answering doesn't actually know themselves, and is just shutting down the discussion without meaningfully contributing. And it doesn't provide any commentary on why something performs like it does or if the cost is reasonable.
Well, performance is rarely the most important thing nowadays. What's preached against is not performance, but a performance-first attitude.
I agree it would be nice to value performance a bit more, but not at all costs, and depending on the use case and context of the application not necessarily as the priority over security, maintainability, velocity, reliability, etc.
> What's preached against is not performance, but a performance-first attitude.
That's what's preached against in theory. But in practice any performance discussion is immediately met with that answer. The standing recommendation is build it fully ignorant of all things performance, and then hope you can somehow profile and fix it later. But you probably can't, because your architecture and APIs are fundamentally wrong now. Or you've been so pervasively infested with slow patterns you can't reasonably dig out of it after the fact. Like, say, if you went all in on Java Streams and stopped using for loops entirely, which is something I've seen more than a few times. Or another example would be if you actually listen to all the lint warnings yelling at you to use List<T> everywhere instead of the concrete type. That pattern doesn't meaningfully improve flexibility, but it does cost you performance everywhere.
> What's preached against is not performance, but a performance-first attitude.
No, I can tell you this same record has been stuck on repeat since at least the mid 1990's. People want to shut down conversations or assign homework because it gets them out of having to think. Not because they're stupid (though occasionally...) but because you're harshing their buzz, taking away from something that's fun to think about.
This is a tangent but there are other, arguably better ways to give the user confidence the order took place in your example. You could show the line items to them again with some indicators of completion, or show the order added to an excerpt of their order history, where perhaps they can tap/click to view line items. Something like that is a bit more convincing than just thank-you text even with the delay, IMO, though it may be tougher to pull off design-wise.
In my SAAS app, we have a few artificial delays to ensure all "background tasks" that pop up a progress dialog take at least long enough to show that to the user.
I once did a progress dialog for a Windows app. It had a delay so it wouldn't even show until 0.5 seconds had elapsed - if the operation completed in that time, you never saw the pop-up. Once the dialog appeared it would stay on screen for at least a second so you wouldn't get freaked out by a sudden flash, even if the operation completed immediately after the 0.5 second delay.
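That two-threshold trick can be sketched in a few lines (hypothetical Python rather than the original Windows code; `show_dialog` and `hide_dialog` are stand-ins for the real UI calls):

```python
import threading
import time

SHOW_DELAY = 0.5   # seconds: no dialog at all for operations shorter than this
MIN_VISIBLE = 1.0  # seconds: once shown, keep the dialog up at least this long

def run_with_progress(task, show_dialog, hide_dialog):
    """Run task() on a worker thread, popping a progress dialog only
    for slow operations and never flashing it for a split second."""
    worker = threading.Thread(target=task)
    worker.start()
    worker.join(timeout=SHOW_DELAY)
    if not worker.is_alive():
        return  # finished within SHOW_DELAY: the user never sees a dialog
    show_dialog()
    shown_at = time.monotonic()
    worker.join()  # wait for the actual work to finish
    remaining = MIN_VISIBLE - (time.monotonic() - shown_at)
    if remaining > 0:
        time.sleep(remaining)  # pad so the dialog doesn't just flash
    hide_dialog()
```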
In my experience, this wouldn't be needed if the rest of the app ran at native speed. There would already be a natural delay that would be noticed by the user.
> I think a lot of people just accept the performance they get as normal even if they are doing things that take 1000x (or worse) the time and/or space than it could (even without heroic work).
Habit is a very powerful force.
Performance is somewhat abstract, as in "just throw more CPUs at it" / it works for me (on my top of the line PC). But people will happily keep on using unergonomic tools just because they've always done so.
I work for a shop that's mainly Windows (but I'm a Linux guy). I won't even get into how annoying the OS is and how unnecessary, since we're mostly using web apps through Chrome. But pretty much all my colleagues have no issue with using VNC for remote administration of computers.
It's so painful, it hurts to see them do it. And for some reason, they absolutely refuse to use RDP (I'm talking about local connections, over a controlled network). And they don't particularly need to see what the user in front of the computer is seeing, they just need to see that some random app starts or something.
I won't even get into Windows Remote Management and controlling those systems from the comfort of their local terminal with 0 lag.
But for some reason, "we've always done it this way" is stronger than the inconvenience through which they have to suffer every day.
Part of the problem is we use unintentionally vague terms like "performance." What does that mean? Bandwidth? Reliability? Scalability? Something we can fix later right? That's what all executives and—frankly—most engineers hear.
I only ever talk about "latency." Latency is time—you can't get latency back once you've spent it.
It's the downside to choosing boring tech. It costs believable dollars to migrate and unbelievable dollars to keep course. There is a happy medium, I believe, that is better than "pissing away the competitive edge."
> only speaks Python or only speaks JS and is completely unaware of the actual performance potential of a system
If you stick to only doing arithmetic and avoid making lots of small objects, javascript engines are pretty fast (really!). The tricky part with doing performance-sensitive work in JS is that it’s hard to reason about the intricacies of JITs and differences between implementations and sometimes subtle mistakes will dramatically bonk performance, but it’s not impossible to be fast.
People building giant towers of indirection and never bothering to profile them is what slows the code down, not running in JS per se.
JS, like other high-level languages, offers convenient features that encourage authors to focus on code clarity and concision by building abstractions out of abstractions out of abstractions, whereas performance is best with simple for loops working over pre-allocated arrays.
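A toy illustration of that gap, in Python rather than JS (the same effect shows up in both; exact timings vary by machine and runtime): summing a million values through a tower of per-element objects and chained generators versus one flat loop over a pre-allocated array.

```python
import time
from array import array

data = array("d", range(1_000_000))  # pre-allocated, unboxed doubles

# Abstraction-heavy version: wrap every element in an object and chain
# generators -- the "abstractions out of abstractions" style.
class Box:
    __slots__ = ("value",)
    def __init__(self, value):
        self.value = value

def boxed_sum(xs):
    boxes = (Box(x) for x in xs)
    positives = (b for b in boxes if b.value >= 0)
    return sum(b.value for b in positives)

# Plain version: one simple loop, no per-element allocation.
def plain_sum(xs):
    total = 0.0
    for x in xs:
        if x >= 0:
            total += x
    return total

for fn in (boxed_sum, plain_sum):
    start = time.perf_counter()
    result = fn(data)
    print(f"{fn.__name__}: {result:.0f} in {time.perf_counter() - start:.3f}s")
```

Both compute the same number; the boxed version just allocates a million short-lived objects to do it.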
Agreed that switching to lower level languages give the potential of many orders of magnitude. But the thing that was most enlightening was that removing pandas made a 9900% increase in speed without even a change to language. 20 minutes down to 12 seconds is a very big deal, and I still don't have to remember how to manage pointers.
I think that should be emphasized. The rest of the optimizations were entirely unneeded and added complexity to the code base. The next guy to work on this needs to be a C++ dev, but the requirements only asked for 500ms, which was more than met by the first fix. What is the payoff of this added performance, and at what cost?
I don’t believe orders of magnitude are achievable in general. Even Python, which is perhaps the slowest mainstream language, clocks in at around 10x the runtime of C.
Sure, there will be some specialized programs where, by manually keeping the working set small and cache-friendly, you can achieve big improvements, but most mainstream managed languages have very good performance. The slowdown is caused by the frameworks and whatnot, not the language itself.
That’s a tiny-as-hell microbenchmark though, where Rust was likely even able to vectorize. The difference won’t be as drastic for larger (more meaningful) applications.
Python programs often need to use a lot of optimized C and C++ libraries to get anywhere near reasonable performance. I would be shocked if a webserver written in Python was only 10x slower than one written in C (or Go or Rust for that matter).
It's interesting to me that two of the top three comments right now are talking about gaining performance benefits by switching from Python to C when the actual article in the link claims he gained a speedup by pulling things out of pandas, which is written in C, and using normal Python list operations.
I would like to see all of the actual code he omitted, because I am skeptical how that would happen. It's been a while since I've used pandas for anything, but it should be pretty fast. The only thing I can think is he was maybe trying to run an apply on a column where the function was something doing Python string processing, or possibly the groupby is on something that isn't a categorical variable and needs to be converted on the fly.
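The `.apply`-with-a-Python-function shape I'm guessing at looks roughly like this (hypothetical column names; `fast_upper` is the kind of plain-list rewrite the article describes):

```python
import pandas as pd

df = pd.DataFrame({"name": [f" user_{i} " for i in range(10_000)]})

# Slow shape: .apply calls back into a Python function once per row,
# so pandas' C internals buy nothing and you pay Series overhead on top.
def slow_upper(df):
    return df["name"].apply(lambda s: s.strip().upper()).tolist()

# The plain-list rewrite: same Python-level work, minus the per-element
# pandas machinery. (Vectorized df["name"].str.upper() is another option.)
def fast_upper(df):
    return [s.strip().upper() for s in df["name"].tolist()]
```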
> the actual article in the link claims he gained a speedup by pulling things out of pandas, which is written in C, and using normal Python list operations.
Well, he claims he did three things:
(1) avoid repeating a shared step every time the aggregate function was called,
(2) unspecified algorithmic optimizations,
(3) use Python lists instead of pandas dataframes.
(1) is a win that doesn't have anything to do with pandas vs Python list ops, and (2) skips over any detail but appears to be the meat of the change. Visually, it looks like most of the things the pandas code tries to do just aren't done in the revised code (it's hard to tell because some of it is hidden behind a function whose purpose and implementation are not provided). It's not at all clear that the move out of pandas was necessary or particularly relevant.
While I would certainly welcome awareness when it comes to performance it's not always useful to make something 1000x faster if it takes even as little as 25% longer to develop. Taking an extra day to make something take 1s instead of an hour is just not always worth it.
Though I will never understand webpages that use more code than you'd reasonably need to implement a performant lisp compiler and build the webpage in that (not that I'm saying that's what they should have done, I just don't understand how they use more code)
Developers are genuinely bad at watching themselves work. I've had any number of conversations with people who are being slowed down by things and just don't see it. If you take the roadblock away, a lot of them will start to notice, but most won't notice when it comes back, so recruiting people to help you keep things working is a challenge, and guard dogging things can be a significant time suck.
The thing I usually end up saying in situations like this is that, if the application doesn't 'work' for the developers, then soon it won't work for anybody.
For the average user, speeding up some things by 10 seconds will affect their lives more than you think, but it's not going to be the secret to happiness. But for some of these same workflows, the developers are running them over, and over, and over in a day, and cutting a few seconds off each iteration can add up quick. I've fixed build issues that saved team members 45 minutes a day. That sounds nice, but not earth shattering, until you look at the work flow and see 45 minutes is the difference between 4 attempts at fixing a hard problem in one day versus 5. That's not just time that's stress. "I have one more try at this and then I'm done for the day, having accomplished nothing."
The XKCD math on whether you should implement a time saving tool is off by at least an order of magnitude for most real world problems, because it doesn't account for team dynamics. It's written for and about people who don't stop and ask for help. The sooner a person finds their own solution to a problem I'm knowledgeable in, the lower the likelihood that I will get preempted. The three minutes it takes to help them costs me half an hour. Even with tricks to salvage a silver lining from such interactions, that's still expensive.
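A back-of-envelope version of that math, with hypothetical numbers (a build fix that saves each of five developers 45 minutes per working day):

```python
# Hypothetical numbers for illustration only.
developers = 5
minutes_saved_per_day = 45
working_days_per_year = 230

hours_saved = developers * minutes_saved_per_day * working_days_per_year / 60
print(f"{hours_saved:.1f} hours per year")  # 862.5 hours

# The classic chart counts only one person's wall-clock time; team
# effects (interruptions, context switches, morale) scale it further.
```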
You also have some mental thresholds that multiply this effect even more. The difference between a 5 minute build and a 30 minute build isn’t just 25 mins. It’s the difference between “I will only run this over lunch break” vs “I will run this while fetching coffee”. Add many other thresholds, like short enough to still stare at the progress bar vs alt-tab into Facebook, lose attention, and waste another 10 mins there; slow enough to only run overnight; etc.
Then there is the death by a thousand paper cuts effect. For smaller tasks like updating status in Jira, if this takes 30 seconds of clicking and waiting (far from a hypothetical scenario btw!), I’m simply going to say fuck it and not do it at all.
Agreed. But I’ll add another phenomenon here. A five minute build takes ten minutes, because once you start something else you’ve estimated will take five minutes, you quickly discover that it takes ten, or you forget that you were doing that other thing. So taking four minutes off of a build actually takes 8 minutes off of the expected round trip time.
And that’s not even counting the “what if it fails the first time” tax which can double it again. Especially if it fails 30 seconds in and you don’t check until the end of the expected time. That four minutes can go to fifteen minutes on a really bad day, and that bad day might be a production issue or just trying to get out the door for your anniversary dinner. These are the situations when the light bulb goes on for people.
It depends on how often you need to do the thing and how long it takes to do it. There’s an XKCD that’s just a chart of that.
Sadly any concept of performance seems to completely go out the window for most programmers once it leaves their hands; taking 2-3x longer to write a performant app in a compiled language would save a ton of time and cycles on users’ machines but Programmer Time Is Expensive, let’s just shit out an Electron app, who cares that it’s five orders of magnitude larger and slower.
That knowledge is often not required to earn a living, so it's not surprising to me at all. My only realistic advice for people lamenting the common lack of this knowledge is to teach it (so you feel like you're making a difference) or put yourself among people with similar interests. Making performance knowledge a requirement to earn a paycheck these days is going to take a hell of a lot of change.