
Yes, it sounds like he was surprised.

That's why he wrote "as expected...".


I only need three programs to deal with the anti-ASCII, pro-complexity JSON/XML crowd: tr, sed, and lex.

All the effort these Javascripters expend putting data into JSON just gets undone by my custom UNIX-style filters; then I can actually work with the text.
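The kind of filter being described might look like this: a one-line sed script that flattens a JSON-lines stream back into plain, space-separated text. The field names and log layout here are invented for illustration, not taken from the thread.

```shell
# A hypothetical JSON-lines log, flattened to plain text with sed alone.
# The "ts"/"level"/"msg" fields are assumptions made up for this sketch.
printf '%s\n' '{"ts":"2010-01-02","level":"warn","msg":"disk full"}' |
  sed -E 's/.*"ts":"([^"]*)".*"level":"([^"]*)".*"msg":"([^"]*)".*/\1 \2 \3/'
```

This prints `2010-01-02 warn disk full`, which grep, cut, and awk can then chew on directly. (It also quietly demonstrates the counterargument: the regex breaks the moment a message contains an escaped quote, which is exactly the fragility the replies below point at.)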

Are they making life easier? For whom? It seems like it's just more work for everybody, translating text back and forth between myriad formats.

But what can you do?


Your “plain text” probably has some implicit structure. XML, JSON, and Protocol Buffers just make that structure explicit.

Dropping to plain text only to run sed or grep is a classic case of “if you have a hammer….” XML has a myriad of tools that do make your life easier — you just need to learn to embrace them.


It all starts as a stream. That is the "universal format".

sed was designed to edit streams. A stream can be transformed via stream editing into any text format, for any downstream consumer. It's line-based; that's the only limitation.

lex can handle multiline "records". There is nothing you cannot do with lex, but only if you know how to use it. It's usually faster than any scripting language. Worth learning to use? Your choice. But it is what it is. It works. It, or some clone of it, was used to build the compiler that someone used to compile the shared library you're using as part of your special solution for the format of the month.
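A full lex specification won't fit in a comment, but the multiline-record idea can be shown with the same stream-editing toolbox: sed's N command pulls the next line into the pattern space, so a two-line record can be joined and treated as one unit. The record layout here is made up for the sketch.

```shell
# Joining two-line "records" into single lines with sed's N command.
# lex handles arbitrary multiline records more generally; this shows
# the stream-editing flavor of the same idea on an invented log format.
printf 'user: alice\nhost: web1\nuser: bob\nhost: web2\n' |
  sed 'N;s/\n/ /'
```

Each pair of lines comes out as one line (`user: alice host: web1`, then `user: bob host: web2`), ready for line-oriented tools downstream.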

If you produce a stream as JSON, that's great. But now we're limited to consumers that understand JSON.

If you know your consumer wants JSON, then sure, use some specialised library. But that's not what this guy is suggesting. He wants everything in JSON.

Well, not every consumer wants JSON.

This is a case of "I learned [X]. Please everyone use [X]."

None of us want to have to learn every language and every application.

Now consider if [X] is UNIX. For better or worse, it's the foundation on which most stuff talked about here runs. Perhaps it seems crude, it lacks sophistication in the eyes of a younger generation. It's a "hammer". But what can you build without a "hammer"?

In his case, [X] is Javascript. What's the foundation for Javascript? A "web browser".

Perhaps some people think nothing is possible without a web browser that can run Javascript.

It's a very narrow view.


Your plain text is less portable than any structured format. You’re creating ad-hoc parsers to process your ad-hoc format. There has to be some implicit structure to this text, otherwise you wouldn’t be able to use lex.

All it does is tie your format to the specific implementation of your parser, including all the bugs in your custom stack. Your logs are now .doc files, just in plain text.

> It all starts as a stream. That is the "universal format".

False. It starts as a data structure in the memory of the producing entity. The most direct or lightweight format would be a raw memory dump of the process. That would be impractical, so the choice is between a generic, portable structured data format and an ad-hoc serialization format.

Here’s your pipeline:

Structured data (Producer) → Plain text → Structured data (Consumer)

It’s like creating JPEGs of your logs and then running OCR to get the structured data back. That would be insane, right? But that’s exactly the analogy; your pipeline is just a little lighter.

Now consider the alternative:

Structured data (Producer) → Portable structured data (.xml) → Structured data (Consumer)

The data in a structured, portable and uniform format like XML can be leveraged to offer rich and powerful tools, like XPath/XQuery/XSLT, all the while remaining agnostic to the specific data domain.
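As a concrete taste of those domain-agnostic tools: an XPath query can pull exactly the entries you want without any custom parsing. This sketch uses xmllint (from libxml2); its availability on your system is an assumption, and the log schema is invented for the example.

```shell
# Querying structured logs with XPath instead of grep.
# xmllint ships with libxml2; the <logs>/<entry> schema is made up here.
cat > /tmp/logs.xml <<'EOF'
<logs>
  <entry level="warn"><msg>disk full</msg></entry>
  <entry level="info"><msg>started</msg></entry>
</logs>
EOF
xmllint --xpath '//entry[@level="warn"]/msg/text()' /tmp/logs.xml
```

This prints `disk full`. The query keeps working unchanged if fields are reordered, attributes are added, or messages contain characters that would confuse a regex, which is the portability point being made.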

It’s just the logical thing to do.


It sounds like your [X] is sed and lex.


Paul Graham: "$50m companies innovate. Mine did. We basically invented the web app. We were doing complex stuff in LISP when everyone else was doing CGI scripts. And, quite frankly, $50m is no small thing. "

Of course, no one uses CGI now. It's all complex stuff done in LISP. Innovation is amazing, isn't it?


http://lib.store.yahoo.net/lib/paulgraham/bbnexcerpts.txt

That's more advanced stuff than most web apps or frameworks do today. If Viaweb had been done now instead of 15 years ago, they probably would have spun off lots of open-source projects.


Doubtful. Yahoo rewrote Viaweb in C or C++, didn't they?

There's a reason why so many programmers still use CGI scripts. And why they depend on a host of scripting languages that rely on external libraries written by others. Perhaps it's because they just don't grasp what Paul Graham is describing. They can't comprehend languages like Lisp or Forth or developing incrementally from an interpreter prompt. And they don't need to. "Web 2.0" is an easy sell, no matter how crappy the "web apps" are, no matter what library-dependent, inflexible language they are written in. "Market forces" inhibit the few folks who do understand Lisp and Forth from spending more serious time with those languages.

The web browser as a UI. Brilliant. Programmers and end-users (a group to which programmers themselves belong) are getting smarter every day.

Anyway, you're right. It is definitely more advanced. Let's keep that in mind as we look at what's coming down the pipe henceforward, marvelling at what some will portray as "innovation".

