One of my pet peeves is "The Useless Use of cat Award". Someone awarded it to me as a teenager in the late 90s and I've been sore ever since.
Yup, it's often a waste of resources to run an extra 'cat'. It really demonstrates that you don't have the usage of the command receiving the output completely memorized. You know, the thousand or so commands you might be piping it into.
But, if you're doing a 'useless' use of cat, you're probably just doing it in an interactive session. You're not writing a script. (Or maybe you are, but even still, I bet that script isn't running thousands of times per second. And if it is, ok, time to question it).
So you're wasting a few clock cycles. The computer is doing a few billion of these per second. The time you spend explaining to someone why their 'useless' use of cat is wrong is greater than the total time their lifetime usage of cat was ever going to waste.
There's a set of people who correct the same three pairs of homophones that get used incorrectly, but don't know what the word 'homophone' means. (Har har, they're/their/there). I always lump the people who are so quick to chew someone out for using cat in with that same batch. What if I just want to use cat because it makes my command easier to edit? I can press up, jump to the front of the line, and change it real quick.
I do “useless” use of cat quite often because, in my brain, the pipeline naturally starts with “given this file”, so it makes the pipeline more consistent, e.g. `cat f | a | b | c` rather than `a < f | b | c`, where one must start with the first “function” rather than with the data. I see starting with `cat` as analogous to the `->` threading macro in Clojure, the `|>` pipe in Elixir, and the `&` reverse application operator in Haskell. If bash permitted putting the filename first, I’d stop using `cat`; alas, it does not.
> One of my pet peeves is "The Useless Use of cat Award". Someone awarded it to me as a teenager in the late 90s and I've been sore ever since.
Wear it as a badge of honor! It marks you as a person who puts clarity, convenience and simplicity before raw performance. I can't think of a single case when that bit of performance matters.
Needless to say, I'm happily using cat (uselessly) myself and have no plans to convert.
> It marks you as a person who puts clarity, convenience and simplicity before raw performance.
This. As noted, even in scripts it usually makes more sense since the result is a pipeline that's easier to read, annotate and modify.
Case in point:
    cat file.txt \
      | sed '1s/^\xEF\xBB\xBF//' `# Strip UTF-8 BOM at the beginning of the file` \
      | ...
Specifying a file name would only make the "black-magic-line" of `sed` more complicated while also making it more complicated to modify the pipeline itself. Now, if I want to skip that step to test something, I don't have to figure out if/how the next command takes an input file (or, ironically, replace `sed` with `cat`).
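To make the "skip that step" point concrete, here is a minimal sketch. The temp file and its contents are made up for illustration, and the `\xEF` escapes assume GNU sed (other seds may not understand them). Deleting the `sed` stage leaves the rest of the pipeline untouched:

```shell
#!/bin/sh
# Hypothetical demo file: a UTF-8 BOM (EF BB BF) followed by two lines of text.
demo=$(mktemp)
trap 'rm -f "$demo"' EXIT
printf '\357\273\277first line\nsecond line\n' > "$demo"

# Full pipeline: read the file, strip the BOM, count the remaining bytes.
# LC_ALL=C makes sed treat the input as plain bytes (assumes GNU sed).
with_strip=$(cat "$demo" | LC_ALL=C sed '1s/^\xEF\xBB\xBF//' | wc -c)

# Same pipeline with the sed stage deleted; no other stage had to change.
without_strip=$(cat "$demo" | wc -c)

echo "bytes with BOM stripped: $with_strip"
echo "bytes with BOM kept:     $without_strip"
```

The two counts should differ by the three BOM bytes.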
> It really demonstrates that you don't have the usage of the command receiving the output completely memorized.
No, it demonstrates that you don't have redirection memorized, and don't know that you can place it anywhere in the command line, including on the left.
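For anyone who hasn't seen it, a small sketch of what "on the left" means. The temp file is made up for illustration; all three forms feed the same bytes to `sort`:

```shell
#!/bin/sh
# Hypothetical three-line input file.
f=$(mktemp)
trap 'rm -f "$f"' EXIT
printf 'b\na\nc\n' > "$f"

# 1. The cat form: data first, via an extra process.
r1=$(cat "$f" | sort | head -n 1)

# 2. Redirection after the command, the usual placement.
r2=$(sort < "$f" | head -n 1)

# 3. Redirection before the command: data first, no extra process.
r3=$(< "$f" sort | head -n 1)

echo "$r1 $r2 $r3"   # → a a a
```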
> So you're wasting a few clock cycles
Keystrokes too:
    cat x | cmd
    < x cmd
It's also possible that cmd may detect that its standard input is connected to a real file, and take advantage of being able to lseek() the file descriptor. For instance, say that x is an archive and cmd is an extractor. If cmd needs to skip to some indicated offset to get to the desired payload material, it may be able to do it efficiently with an lseek, whereas under cat, it has to read all the bytes and discard them to get to the desired offset.
I still prefer to have cat there because it is interchangeable with other output-producing commands and it can handle globs. In an interactive session I iterate on the last command many times. If I decide to filter stuff, I can just replace cat with grep; if I decide to pull from a directory of files, I can add a glob; if the input is compressed, it turns into zgrep or zcat, etc. With redirects I'd have to change the structure of the pipeline, which wastes mental effort. IMO.
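A rough way to see the difference from the receiving side. `describe_stdin` is a made-up helper, not a real tool, and the sketch assumes a system where `/dev/stdin` can be tested (bash special-cases it; Linux exposes it in the filesystem):

```shell
#!/bin/sh
# Hypothetical helper: report what kind of object standard input is.
describe_stdin() {
    if [ -f /dev/stdin ]; then
        echo "regular file"   # seekable, so lseek() on fd 0 can work
    elif [ -p /dev/stdin ]; then
        echo "pipe"           # not seekable; bytes must be read in order
    else
        echo "other"
    fi
}

# Made-up demo file.
demo=$(mktemp)
trap 'rm -f "$demo"' EXIT
printf 'payload\n' > "$demo"

describe_stdin < "$demo"        # → regular file
cat "$demo" | describe_stdin    # → pipe
```

A program that wants to jump to an offset can only do so cheaply in the first case.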
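A sketch of that kind of iteration, using two made-up log files. Only the first stage changes between steps; everything after the first `|` stays the same. (`grep -h` suppresses file name prefixes; it is widely supported but worth checking on unusual systems.)

```shell
#!/bin/sh
# Hypothetical log directory with two small files.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'ok\nERROR disk\n' > "$dir/a.log"
printf 'ERROR net\nok\n'  > "$dir/b.log"

# Step 1: start from a single file.
one=$(cat "$dir/a.log" | grep ERROR | sort | head -n 1)

# Step 2: widen to every log by adding a glob to cat.
all=$(cat "$dir"/*.log | grep ERROR | sort | head -n 1)

# Step 3: swap cat for grep to pre-filter; the tail of the pipeline is unchanged.
net=$(grep -h net "$dir"/*.log | grep ERROR | sort | head -n 1)

echo "$one / $all / $net"   # → ERROR disk / ERROR disk / ERROR net
```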
> the time [used to] explain the 'useless[ness]' .. of cat to someone .. is greater than the total time that their lifetime usage of cat was going to waste
If you look for situations like this they are surprisingly common.
I don't often (err, ever…) reply without reading further but this time I must, because: I've never heard this turn of phrase "useless use of cat" and it turned my brain upside-down for a moment, because: "interactive" is precisely how I learn and do and I suppose it was a nice reminder that sometimes really big (read: useless) things are actually kinda small (useful) and vice versa.
Well, you waste an entire fork and exec, so I believe you are underestimating the time by a few orders of magnitude. Also, it's almost always grep following the cat, so it's not much to memorize.
But it's well worth wasting a process to have a nice pipeline where each command does a single thing so you can easily reason about them.
It's a lot more than the few extra cycles to spin up the process - it's also an extra copy of all the data. Usually that's also not much, but occasionally it's everything, as the consuming program can seek in a file, but not in a pipe, so might otherwise only need a tiny bit of the data.
It totally makes sense. After all the Electron apps/Docker containers/statically linked utilities/microservices running on my machine, needlessly running cat might be the straw that breaks the camel's back.
It is not called "useless UUoC comment" (UUUoCC) without justification. ;)
From a personal taste perspective, I'm not a fan of either. Having a floating "<" at the start of a line just isn't my cup of tea. Not dealing with explicit stdin/stdout just makes my code easier to read. And especially considering the post's advice is about reading logs, a lot of the post is very likely built around outage resolution. That's not the time I want to be thinking "oh yeah, `tr` is special and I need to be explicit". Nah, just use cat as a practice. And no, I'm not going to write `grep blah < filename` as a practice just because commands like `tr` are weird.
But honestly, if it's such a big deal to have a cat process floating around, there are probably other things you should be concerned about. "Adds extra load to the server" points to other problems. If perf matters, CPU shielding should be used. Or if that's not an option, then sure, there's some room for trifling around, but if you're at a point where you're already running a series of pipes, a single cat command is beans compared to the greps and seds that come after it.
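For context on the `tr` point above, a small sketch with a made-up file. `tr` takes no file name operand and only reads standard input, while `grep` accepts either; the cat form looks the same for both:

```shell
#!/bin/sh
# Hypothetical one-line log file.
f=$(mktemp)
trap 'rm -f "$f"' EXIT
printf 'error: disk full\n' > "$f"

# tr has no file operand, so it needs a pipe or a redirect.
up_cat=$(cat "$f" | tr 'a-z' 'A-Z')
up_redirect=$(tr 'a-z' 'A-Z' < "$f")

# grep takes a file name directly, but the cat form works identically.
n_cat=$(cat "$f" | grep -c error)
n_file=$(grep -c error "$f")

echo "$up_cat"             # → ERROR: DISK FULL
echo "$n_cat $n_file"      # → 1 1
```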
Sorry. I did say, it is a pet peeve.