
Explain to me why humans have to be the ones to write proper syntax and escaping, instead of their editors. Do you do this for other formats, like RTF? What is this obsession with hand-coding complicated formats :)


> Do you do this for other formats, like RTF?

No, but I do it for formats like HTML, CSS, JSON. My editor can assist me, but I don't need it. To a lesser degree, the same is true of Java, though I do admit to leaning much more heavily on my IDE for that.

> What is this obsession with hand-coding complicated formats :)

Well, part of it is that they're not all that complicated unless you're doing something fancy. Another part is almost certainly our (we as in developers) pride in being able to use nothing but vi installed on a decade-old operating system to get things done.


Why? What benefit do you get from coding it by hand? The only one I can think of is being able to type some esoteric incantations in a textbox.

Downsides include:

XSS and malicious injection - users should be writing Markdown, or be given an actual contentEditable HTML editor instead of a textarea, like in Gmail.

Syntax errors. How many times have you typed some JSON by hand and realized you forgot to balance some braces or add a comma, or remove a comma at the end of an object?

Complexity. Not that complicated, really? Basic C, HTML and CSS are not complicated. Advanced stuff is complicated. Forgetting a brace or a semicolon torpedoes the whole document. SQL is not complicated. Does that mean you want to write SQL by hand for production code?
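The JSON slips mentioned above (a trailing comma, an unbalanced brace) are easy to reproduce. A minimal sketch using Python's standard json module; the helper name try_parse and the sample strings are made up for illustration:

```python
import json


def try_parse(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        # lineno/colno point at where the parser gave up, which is not
        # always where the human made the mistake.
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"


good = '{"a": 1, "b": [1, 2]}'
trailing_comma = '{"a": 1, "b": [1, 2],}'   # comma left at the end of an object
unbalanced = '{"a": 1, "b": [1, 2}'          # list never closed

for sample in (good, trailing_comma, unbalanced):
    value, error = try_parse(sample)
    print(sample, "->", value if error is None else error)
```

Strict JSON rejects both broken samples outright, which is the "one typo torpedoes the whole document" behaviour in miniature.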

Why don't you install a simple extension to vi like an HTML format adapter? I'm a developer too. But let me tell you what it sounds like when you say you want to use nothing but vi: that's like saying you want to be able to build robots using nothing but a hammer and some nails, and that nothing requiring further abstraction or advancement should be built unless you can do the same with a hammer.

And also keep in mind that not everyone is a developer. The fact that you like to keep esoteric syntax rules in your head for HTML, XML, CSS, JavaScript, C++ and so on doesn't mean EVERYONE should have to. When it comes to CSV, it's not even a real standard. You have to keep in your head all the cross-platform quirks, like \r\n garbage, similar to how web developers need to keep in their heads all the browser quirks and workarounds.

All this... for some pride thing of being able to write text based stuff by hand.

I brought up RTF for a reason. No sane person should want to type .rtf or .docx by hand. So why HTML?


If you're calling CSV complicated, I really want to see your example of a simple format.


CSV is complicated in the same way the DOM and JavaScript are complicated:

1) there are so many differences between browsers that you have to keep them all in mind when asking people to send you csv files, or generating them, etc. For example: \n vs \r vs \r\n, and escaping them.

2) You have to keep in your head escape rules and exceptions, and balancing quotes and other delimiters.

3) The whole thing doesn't look human readable or easily navigable for a document of any serious complexity.

And what's the upside? If more people just used Excel or another spreadsheet program to edit these files, you won't face ANY of these issues. They would eventually converge on a standard format, like they did with HTML.

Disclaimer: I wrote a CSV parser


> there are so many differences between browsers that you have to keep them all in mind

Compatibility horrors from people violating the standard will appear no matter what format you use. That's not fair to blame on the format.

> You have to keep in your head escape rules and exceptions, and balancing quotes and other delimiters.

CSV itself has newlines, commas, and quotation marks as its special characters. That's extremely minimal. The only extra thing to keep in your head is "is this field quoted or not".

What set of escapes and delimiters could be simpler? Would you rather reserve certain characters, and abandon the idea of holding "just text" values?
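To make the "three special characters" point concrete, here is a small sketch with Python's standard csv module, assuming its default dialect (comma delimiter, double-quote quoting, quotes doubled inside quoted fields). The sample rows are invented for illustration:

```python
import csv
import io

# One field contains a comma and quotes, another contains a newline.
rows = [
    ["id", "note"],
    ["1", 'She said "hi", then left'],
    ["2", "line one\nline two"],
]

# Write: the writer decides which fields need quoting.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
encoded = buf.getvalue()
print(encoded)

# Read it back: the arbitrary text values survive the round trip.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))
assert decoded == rows
```

Only the fields that contain a comma, a quote, or a line break get wrapped in quotes; everything else is written as plain text.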

> The whole thing doesn't look human readable or easily navigable for a document of any serious complexity.

> And what's the upside? If more people just used Excel or another spreadsheet program to edit these files, you won't face ANY of these issues. They would eventually converge on a standard format, like they did with HTML.

This sounds like you're arguing for a more complex format! I'm confused.

So again, what is a format that you call simple?


> What set of escapes and delimiters could be simpler? Would you rather reserve certain characters, and abandon the idea of holding "just text" values?

There's a lot of design room here.

https://www.lua.org/pil/2.4.html http://prog21.dadgum.com/172.html

I can no longer remember where I saw it (thought it was Lua), but I heard the idea to use =[[ ... ]]= and =[[[ ... ]]]= as wrappers (a bit like the qq operator from Perl). They can be nested and don't interfere, so =[[[ abc =[[ ]]= ]]]= is a legitimate string.
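A rough sketch of how such level-matched wrappers could be read, written in Python. This implements the syntax exactly as described in the comment above, not any real language's grammar, and read_wrapped is a hypothetical helper name:

```python
def read_wrapped(text, start=0):
    """Read one =[[ ... ]]= style string beginning at text[start].

    Returns (content, index just past the closing delimiter).
    """
    if text[start:start + 1] != "=":
        raise ValueError("expected '=' at the start of a wrapper")
    i = start + 1
    while i < len(text) and text[i] == "[":
        i += 1
    level = i - (start + 1)  # number of '[' in the opener
    if level < 2:
        raise ValueError("expected at least two '[' after '='")
    # The closer must use the same number of brackets as the opener.
    closer = "]" * level + "="
    end = text.find(closer, i)
    if end == -1:
        raise ValueError("unterminated string")
    return text[i:end], end + len(closer)


content, stop = read_wrapped("=[[[ abc =[[ ]]= ]]]=")
print(repr(content))
```

No escaping is needed because the writer picks a bracket count whose closer does not occur in the content. One caveat of this naive matching: a two-bracket string cannot contain `]]]=`, since `]]=` is a substring of it.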


Except when the compatibility horror comes from Excel violating the standard, but only if you happen to be located in the wrong hemisphere (hello semicolons). So, even if everyone just used Excel, you would still face this issue. Now what?
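A sketch of what the semicolon problem looks like to a parser, using Python's csv module. The sample text is invented and assumes an export from a locale where the comma is the decimal separator, so the semicolon is used between fields:

```python
import csv
import io


def parse(text, delimiter):
    """Parse CSV text with an explicitly chosen field delimiter."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# Hypothetical export: ';' between fields, ',' inside the number.
export = "name;price\r\nwidget;1,50\r\n"

as_semicolon = parse(export, ";")
as_comma = parse(export, ",")

print(as_semicolon)  # what the sender meant
print(as_comma)      # what a comma-only reader sees
```

Neither parse raises an error. The comma-only reader just returns differently shaped rows, so the mismatch tends to surface later as bad data rather than as a parse failure.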


I am not arguing for another format. I am arguing for using a program to edit these files, to avoid syntax errors and other crap that arises when people do manual stuff that doesn't need to be done manually. And sure, the format can stand to be slightly more complex; who cares, if you're not editing it by hand?

ASCII text format is simple.

Anything where you have arbitrarily complex structure, why not use a program to edit it? What is the downside of using the right tool for the job? Your text editor is a program. Why tunnel through text and manually edit stuff?


Does any system (modern or not) actually use `\r` for anything? Because I'm not aware of it being used alone. I'm not even sure why software today still differentiates between the two instead of treating any of `\r`, `\n`, `\r\n` as a single line break. I can maybe see it being useful, e.g. in a word processor, to differentiate a manual line break from a paragraph break, but that's not a plain text format; in the vast majority of cases, treating them as the same character in parsers shouldn't* cause any issues.

* shouldn't ≠ doesn't
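A minimal sketch of "treat any of the three as one line break" in Python. The function name split_lines is made up. A regular expression is used rather than str.splitlines(), because splitlines() also splits on other characters such as form feed and U+2028:

```python
import re

# Try \r\n first so a Windows line ending counts as one break, not two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text on \\r\\n, \\r, or \\n, treating each as a single break."""
    return _LINE_BREAK.split(text)


mixed = "unix\nwindows\r\nold mac\rlast"
print(split_lines(mixed))
```

This is fine for plain text. For CSV specifically it is where the footnote applies: a line break inside a quoted field is data, so splitting on line breaks before handling the quotes would cut that field in two.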


IIRC, Mac systems used to use `\r` for line endings. They don't any more and instead use `\n`.



