It's always seemed strange that an application called TextEdit is actually more than a text editor. I strongly believe that content-type autodetection, much less HTML rendering(!), most certainly does not belong in a text editor.
Here's an interesting quirk in Windows: There are two APIs to execute external programs, CreateProcess and ShellExecute. CreateProcess is the older of the two and only runs executables. ShellExecute opens the target with whatever app is associated with the extension.
When they shoehorned the ShellExecute behavior into cmd.exe, they basically just said "if (!CreateProcess(foo)) {ShellExecute (foo)}"
As a result, if you take "foo.exe" and rename it "foo.txt" then try to run it like "C:\>foo.txt" from the command line, it will run as an executable instead of opening in Notepad like you would expect. Do the same with a real text file (that doesn't start with "MZ") and it opens in Notepad.
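To make the fallback concrete, here is a toy model of that "try CreateProcess, else ShellExecute" logic. It is only a sketch: it does not call any Windows API, the "MZ" check is a stand-in for whatever CreateProcess actually validates, and the names `looks_like_pe`, `cmd_style_launch` and the `associations` table are made up for illustration.

```python
import os


def looks_like_pe(data: bytes) -> bool:
    """Stand-in for 'CreateProcess succeeds': the data starts with 'MZ'."""
    return data[:2] == b"MZ"


def cmd_style_launch(filename: str, data: bytes, associations: dict) -> str:
    """Model the fallback: try to execute first, consult the extension second."""
    # Step 1 (CreateProcess in the comment above): the extension is never consulted.
    if looks_like_pe(data):
        return f"run {filename} as executable"
    # Step 2 (ShellExecute in the comment above): look up the handler by extension.
    ext = os.path.splitext(filename)[1].lower()
    handler = associations.get(ext)
    if handler is None:
        return f"no handler for {filename}"
    return f"open {filename} with {handler}"


if __name__ == "__main__":
    assoc = {".txt": "notepad"}  # hypothetical association table
    print(cmd_style_launch("foo.txt", b"MZ\x90\x00", assoc))
    print(cmd_style_launch("notes.txt", b"hello world", assoc))
```

The point of the sketch is the ordering: because the content check runs first, renaming the file changes nothing for a real executable.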
This is a frustrating behavior on Windows, not because it's possible, but because it's the default. I vastly prefer the way KDE behaves: whatever program is the default for that file type attempts to open it, and you can easily change what the default is.
It's frustrating when I instinctively change a file extension on Windows so I can do some other operation with it (say changing a configuration file to .txt to edit it) and Windows still doesn't know what to do with it.
I'm not averse to the behavior, I just wish I could control when it happens.
It's a rich text editor by default. Rich text is still text.
Opening HTML files and converting them to rich text certainly does belong as a valid feature for a rich text editor. It'll open and convert Word files too, which is super useful.
The content-type autodetection, however, I agree was a bad idea. Still, this vulnerability presumably existed with an .html file opened in TextEdit.
I assume the content-type autodetection exists because of how downloading files occasionally appends a .txt extension (I think this is when the content type is text/plain). Postel’s law gets applied with the result of macOS attempting to make up for misconfigured servers.
If the file extension is .txt, I always expect it to be opened as plain text. The file extension is, rightly or wrongly[0], the metadata declaring the file type — nobody would consider it reasonable for an .exe to remain executable if the extension is changed to .txt, after all.
One might, possibly, still argue about the text encoding of a .txt file (I’m old enough to remember Unicode being a new fancy alternative to ASCII), but that’s about it.
> If the file extension is .txt, I always expect it to be opened as plain text. The file extension is, rightly or wrongly[0], the metadata declaring the file type — nobody would consider it reasonable for an .exe to remain executable if the extension is changed to .txt, after all.
That statement is quite wrong and shows a good dose of ignorance. To start off in UNIX systems the extension means nothing regarding whether a file is an executable or not. All it takes is a +x flag and a file format (header, magic number) that can be executed.
Also, file extensions mean nothing. In fact, a popular and very basic trick to fool clueless users into running malware (and one which any anti-malware tool checks for) is to sneak in executables with a different extension, because the extension only means something to clueless users.
And a .txt file extension means nothing at all. The only things that matter are the file's content and its file permissions.
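As a rough illustration of the "+x flag plus magic number" point, the sketch below inspects a file's permission bits and its first bytes and never looks at the name. It only recognizes two formats (ELF and `#!` scripts), which is a simplification; the function name `describe_executability` is made up for this example.

```python
import os
import stat


def describe_executability(path: str) -> str:
    """Report the execute bit and the leading magic bytes; the extension is ignored."""
    mode = os.stat(path).st_mode
    # Any of the user/group/other execute bits counts as "+x" here.
    has_x = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"\x7fELF"):
        kind = "ELF binary"
    elif head.startswith(b"#!"):
        kind = "script with interpreter line"
    else:
        kind = "unrecognized format"
    flag = "executable bit set" if has_x else "no executable bit"
    return f"{flag}, {kind}"
```

Calling it on a shell script named `notes.txt` with mode 755 reports an executable script, which is exactly the situation being described.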
TextEdit dates back to NeXTStep, so it was probably originally written in the late 1980s. I'm guessing it didn't render HTML originally, but it always had RTF capability. Not that it's an excuse in 2021, but very few applications from that era would be considered "safe" today.
Edit.app is the original NeXTSTEP text editor from 1989. It supported plain text and rich text files. Famously, the first web browser was based on the rich text capabilities built into NeXTSTEP.
TextEdit.app is the OpenStep rewrite of Edit.app and dates to the mid 1990s. It was likely one of the first OpenStep apps. It supported the same rich text files as the original Edit.app.
Apple bought NeXT, OpenStep became Cocoa, TextEdit was ported to Java, and then back to garbage collected Objective-C, then ARC Objective-C, (then Swift, probably).
Along the way it picked up features for reading/writing/editing HTML and Microsoft Word documents.
Apple used to publish the source code for TextEdit as part of their Xcode sample code, but they stopped a few years ago.
Java was supposed to be the primary programming language for OS X. That's why they renamed OpenStep to Cocoa (Java and Cocoa go great together).
But AppKit was still pure Objective-C, and bridging between AppKit's Obj-C APIs and the Java language presented problems. 3rd-party developers (eventually) preferred to write directly in Objective-C, and Apple dropped the Java bridge some years later.
NeXT used Display PostScript for the display manager. If you opened an email that had PostScript commands, the mail agent would happily, automatically, execute them.
A favorite payload sent around the computer lab would smear all pixels downward to "melt" whatever was rendered on your display.
Note that there weren't that many interesting things to exfiltrate back then, so this wasn't a terrible default: there wasn't (any!) online commerce, online banking was rare, and even passwords were never echoed to the terminal.
You don't need a password to be echoed to exfiltrate it. You just need the key codes. Not sure about NeXTStep, but regular old X let you sniff keys really easily.
Some systems (specifically, earlier versions of SGI IRIX) shipped with X authorization disabled by default. This is the equivalent of "xhost +". You could sniff a box as soon as it was plugged into the network, including capturing login session credentials, all terminal commands, and anything else. When they su'd to root, yes, you'd capture the root password.
In those days (mid 90's) almost nobody was running firewalls. At least, nobody in these parts. Putting your "office on the Internet" meant raw, unfiltered IP.
According to you. I appreciate that TextEdit is a rich editor. I can use vim or countless other apps for plain text. Few do what TextEdit does with its simplicity.
Neither is the opinion that "This problem exists because someone wrote a tool that should only do one thing (really well) but instead made it do five different things."
You can make security bugs in simple tools - this security bug is not purely a function of the number of target use-cases.
Nor do you have any rational basis for asserting that the given app "should only do one [thing]".
>>Anyone with networked filesystems, I should imagine?
You're either missing the point made by GP or being disingenuous. Please keep in mind that you need to explicitly mount an NFS share before you're able to open anything on it, and mounting an NFS share not only requires explicit authorization but also only provides access to a specific file system, mounted at a specific point, under specific permissions.
Accessing the whole internet through file:// without being prompted for permissions or consent or even awareness is an entirely different thing. For starters, the access is not explicit nor subjected to conditions.
Rigidly interpreting documents depending on their file extension is worse than trying to figure out the type of a document before interpreting it. File extensions are a brittle and primitive system that does not fix any security issue.
Optimizing for "simple" for the sake of robustness is exactly backward.
> visible and understandable
False. Something is neither visible nor understandable if it's misleading - which file extensions are. There are absolutely no guarantees that a file extension will match file contents, and that assumption can cause security risks - like in this article.
An actually good alternative is to encode file type as metadata, instead of inside the file contents or file-name, and then configure viewers to display it. That, while not "simple", is also visible and understandable to the user, while simultaneously being safe.
> There are absolutely no guarantees that a file extension will match file contents, and that assumption can cause security risks
Only in software that ignores the extension.
> An actually good alternative is to encode file type as metadata, instead of inside the file contents or file-name, and then configure viewers to display it. That, while not "simple", is also visible and understandable to the user, while simultaneously being safe.
Metadata can be just as wrong as a file extension, and is generally far less visible.
The problem is that the text editor ignored the extension of the txt file. That's what led to unsafe behaviour - the user thought the file was fine to open because the extension was txt, and improving users is not practical.
The exact same thing would happen with metadata - indeed file extensions are just a form of metadata - if the metadata says this is a text file but the application ignores it, we would have the exact same issue.
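A minimal sketch of what "respecting the extension" could look like in an opener is below. This is not TextEdit's actual logic; `choose_render_mode`, `sniff_is_html` and the `trust_extension` switch are hypothetical names, and the HTML sniffing is deliberately crude.

```python
import os

# Extensions that are always shown as plain text when the extension is trusted.
PLAIN_TEXT_EXTENSIONS = {".txt"}
HTML_EXTENSIONS = {".html", ".htm"}


def sniff_is_html(data: bytes) -> bool:
    """Crude content sniffing: does the start of the file look like HTML?"""
    head = data[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def choose_render_mode(filename: str, data: bytes, trust_extension: bool = True) -> str:
    """Return 'plain' or 'html' for a file about to be opened."""
    ext = os.path.splitext(filename)[1].lower()
    # When the extension is trusted, .txt wins and the content is never sniffed.
    if trust_extension and ext in PLAIN_TEXT_EXTENSIONS:
        return "plain"
    if ext in HTML_EXTENSIONS or sniff_is_html(data):
        return "html"
    return "plain"
```

With `trust_extension=True`, a `.txt` file containing `<html>...` comes back as `plain`; with it set to `False`, the same file is classified as `html`, which is the behaviour being criticized. The same switch would apply if the declared type came from some other metadata store instead of the filename.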
> They are also trivial to get wrong, can be mangled when the files are moved around, and are easy to use as an attack vector.
On the contrary, they're the only kind of metadata that doesn't get mangled when files are moved around, and they're far less of an attack vector than other approaches. Of course you can set the wrong file type, but no approach avoids that problem.
If you don't want text editors to do non text-editing stuff, then people need to stop saying we should build development environments around text editors. "An IDE is just a text editor with bells and whistles", people say. Well if that's the case, it's not surprising if people "only ship the one text editor".
But not with file extensions of .txt. They should only do bells and whistles if the extension warrants some bells. .md, sure syntax highlight me. But opening .txt and treating it as html, that seems strange.
Well, on Unix file extensions are a convention and don't have any strict semantic meaning. Maybe this doesn't make sense in a world where most people do think in terms of file extensions (thanks to the popularity of Windows) but it shouldn't be surprising that non-Windows programs might not special-case file extensions.
(Though in fairness, text editors do usually have special casing for file extensions and these days tools like ls will colour filenames based on the extension.)
This is only a Unix vs Windows thing in terms of the application launcher and how it is implemented. File extensions are semantically meaningful for many unix tools, most notably gcc.
I don't know anyone who says that an IDE is just a text editor with bells and whistles. Visual Studio Code is a text editor with more bells and/or whistles than a choo-choo train, but that doesn't make it any more of an IDE than nano and termux.
It compiles and debugs, that seems like an IDE to me.
VS Code is cool and all, but it definitely is a lot more manual and laborious than VS. The tooling and automation in VS is missed if you’re used to it.
Yesterday grep didn't work because it 'autodetected' that the target file was a binary. So I 1) cursed whoever made this non-backward-compatible change and 2) used man to find the '-a' option.
Hah, looks like I’ve been needlessly typing quite a few extra keystrokes, as I’ve always done --binary-files=text. I should have looked at the man page more closely.
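For anyone wondering what the autodetection amounts to, here is a simplified model. It assumes "contains a NUL byte" as the binary test, which is only one part of what real grep implementations look at, and `simple_grep` with its `force_text` flag is a made-up stand-in for the '-a' behaviour, not grep's code.

```python
def looks_binary(data: bytes) -> bool:
    """Simplified assumption: any NUL byte marks the data as binary."""
    return b"\x00" in data


def simple_grep(data: bytes, needle: bytes, force_text: bool = False) -> list:
    """Return matching lines, or a one-line notice when the data looks binary."""
    if looks_binary(data) and not force_text:
        # Without force_text, only report that something matched.
        return ["Binary file matches"] if needle in data else []
    # With force_text (the '-a' analogue), treat every line as text.
    return [
        line.decode("utf-8", errors="replace")
        for line in data.split(b"\n")
        if needle in line
    ]
```

So a log file with one stray NUL byte stops printing matching lines until the text mode is forced, which matches the surprise described above.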
It can do some sketchy things and rewrite your terminal in weird confusing ways, but afaik most of the out-and-out malicious escape sequences have been patched out at least a decade ago.
Any vulnerability in the escape sequence handling of the terminal emulator, and conceivably, depending on what sequences your terminal supports, access to facilities like local file generation or clipboard contents. There have been a number of issues with escape sequences injected into things people might copy and paste from a web page, or in git commit histories, that have done nefarious things.
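If you want to look at untrusted text without handing its escape sequences to the terminal, one defensive option is to strip them first. The sketch below removes CSI and OSC sequences and then any leftover control characters other than newline and tab. The patterns are an approximation of the common sequence shapes, not a complete terminal grammar, and `sanitize_for_terminal` is a name invented for this example.

```python
import re

# CSI: ESC [ parameter bytes, intermediate bytes, one final byte.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# OSC: ESC ] ... terminated by BEL or by ESC \ (string terminator).
OSC_RE = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")
# Remaining control characters, keeping tab (0x09) and newline (0x0a).
CONTROL_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def sanitize_for_terminal(text: str) -> str:
    """Strip escape sequences and stray control characters before printing."""
    text = OSC_RE.sub("", text)
    text = CSI_RE.sub("", text)
    return CONTROL_RE.sub("", text)
```

Note that this also drops carriage returns, which is intentional here: a bare carriage return can be used to overwrite what was already printed on the line.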
I think this captures my sentiment on the matter as well. Applications today want to be a Swiss Army knife and do just about every job... and do it poorly. I do expect that level of complexity from RStudio, but probably not from Notepad. I would probably kinda accept it in Notepad++.
So, real question: is TextEdit a default text editor on a Mac?