
If you make argument 2) could you explain how writing a parser is more security critical than any other code that has a (direct or indirect) interaction with the network? At least recursive descent parsers are close to trivial. I usually start by writing a "next_byte" function and then "next_token". You'll have to look very hard to find any pointer code there. It's close to impossible to get this wrong and I don't see how the fact that it's a parser would make it any more dangerous.
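As a rough illustration of the "next_byte" then "next_token" layering described above, here is a minimal sketch in C. The `lexer` struct, the token kinds, the extra `peek_byte` helper, and the toy grammar (unsigned integers plus single-byte punctuation) are all assumptions made for the example, not something taken from the comment.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical lexer state: a buffer with an explicit length and a cursor. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;
} lexer;

enum token_kind { TOK_EOF, TOK_NUMBER, TOK_PUNCT, TOK_ERROR };

typedef struct {
    enum token_kind kind;
    unsigned long value; /* numeric value, or the punctuation byte */
} token;

/* The only function that indexes the buffer. Returns -1 at end of input. */
static int peek_byte(const lexer *lx)
{
    assert(lx->pos <= lx->len);
    return lx->pos < lx->len ? lx->buf[lx->pos] : -1;
}

/* Returns the current byte and advances, or -1 at end of input. */
static int next_byte(lexer *lx)
{
    int c = peek_byte(lx);
    if (c >= 0)
        lx->pos++;
    return c;
}

/* Built only on peek_byte/next_byte; no pointer arithmetic at this level. */
static token next_token(lexer *lx)
{
    token t = { TOK_EOF, 0 };
    int c;

    /* Skip whitespace. */
    while ((c = peek_byte(lx)) == ' ' || c == '\t' || c == '\n')
        next_byte(lx);

    if (c < 0)
        return t;

    if (c >= '0' && c <= '9') {
        t.kind = TOK_NUMBER;
        while ((c = peek_byte(lx)) >= '0' && c <= '9') {
            unsigned long digit = (unsigned long)(c - '0');
            /* Reject values that would not fit instead of wrapping. */
            if (t.value > (ULONG_MAX - digit) / 10) {
                t.kind = TOK_ERROR;
                return t;
            }
            t.value = t.value * 10 + digit;
            next_byte(lx);
        }
        return t;
    }

    /* Toy grammar: any other byte is a one-byte punctuation token. */
    t.kind = TOK_PUNCT;
    t.value = (unsigned long)next_byte(lx);
    return t;
}
```

In this sketch, a recursive descent parser would only ever call `next_token`; the single bounds check lives in `peek_byte`.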



Well, if you're dealing with a struct, the compiler will provide type safety if, say, you try to access a field that doesn't exist. You don't get the same safeguards when dealing with raw bytes. Admittedly, in C you can also run into these hazards with arrays and strings, which is why I suggest using non-standard array and string types that actually store the length, if you insist on using C.
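A minimal sketch of what such a length-carrying string type might look like in C. The names `str_view`, `sv_from`, `sv_at`, and `sv_sub` are hypothetical, and this is just one possible shape, not a reference to any particular library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string type that carries its length next to the pointer. */
typedef struct {
    const char *data;
    size_t len;
} str_view;

/* Wraps a NUL-terminated string; the length is computed once, here. */
static str_view sv_from(const char *cstr)
{
    str_view s;
    assert(cstr != NULL);
    s.data = cstr;
    s.len = strlen(cstr);
    return s;
}

/* Checked read: stores the byte in *out and returns true,
   or returns false when i is out of range. */
static bool sv_at(str_view s, size_t i, char *out)
{
    if (i >= s.len)
        return false;
    *out = s.data[i];
    return true;
}

/* Checked sub-view. The comparison is against s.len - start,
   so the check itself has no addition that could wrap. */
static bool sv_sub(str_view s, size_t start, size_t count, str_view *out)
{
    if (start > s.len || count > s.len - start)
        return false;
    out->data = s.data + start;
    out->len = count;
    return true;
}
```

Callers that go through `sv_at` and `sv_sub` get a failure result on a bad index rather than an out-of-bounds read.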


When a C program is factored well, there needn't be all that much access by pointer + index. I'm not saying it can't be frequent in certain kinds of code, but for many things it's easy to put a simple abstraction in place (an API consisting of a few functions) that you have to get right once and can then reuse dozens of times.

Plain pointer access in high-level code (say when parsing a particular syntactic element by hand in a recursive descent parser) is a violation of the principle of separation of concerns IMO.

In any case I still don't see what's special about parsers. Most vulnerabilities I suspect to be in the higher levels, like validating parsed numbers and references, for a trivial example. In general, those are checks that are likely to be implemented much closer at the core of the application.
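To make the "validating parsed references" case concrete, here is a small sketch in C of that kind of higher-level check. The `object_table` type and `resolve_ref` function are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application-level data: a table of named objects. */
typedef struct {
    const char *name;
} object;

typedef struct {
    const object *objects;
    size_t count;
} object_table;

/* idx is a number that came out of the parser, so it is attacker-controlled
   even though it was parsed correctly. Returns NULL for an invalid reference
   instead of indexing out of range. */
static const object *resolve_ref(const object_table *t, long idx)
{
    assert(t != NULL);
    if (idx < 0 || (unsigned long)idx >= t->count)
        return NULL;
    return &t->objects[idx];
}
```

Note that the check sits with the application data, not in the tokenizer: the parser can be entirely correct and this validation can still be missing.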


> Most vulnerabilities I suspect to be in the higher levels, like validating parsed numbers and references, for a trivial example. In general, those are checks that are likely to be implemented much closer at the core of the application.

What I see (especially in libraries like OpenSSL) is that the core logic often receives a lot of scrutiny and testing, so it is silly mistakes with offsets and bounds checks that make up the majority of bugs.
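For illustration, this is the general shape of the offset and bounds check being talked about, sketched in C. The record layout (one length byte followed by that many payload bytes) and the `read_record` function are assumptions made up for the example; they are not taken from OpenSSL or any real protocol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout: one length byte, then that many payload bytes.
   Copies the payload found at msg[offset] into out and returns true,
   or returns false if the record does not fit in the message or in out. */
static bool read_record(const unsigned char *msg, size_t msg_len, size_t offset,
                        unsigned char *out, size_t out_cap, size_t *out_len)
{
    size_t claimed;

    assert(msg != NULL && out != NULL && out_len != NULL);

    /* The length byte itself must be inside the message. */
    if (offset >= msg_len)
        return false;
    claimed = msg[offset];

    /* Trust the real buffer length, not the length the sender claims.
       offset < msg_len holds here, so the subtraction cannot underflow.
       Leaving this check out would be a read past the end of msg. */
    if (claimed > msg_len - offset - 1)
        return false;

    /* The destination needs its own check. */
    if (claimed > out_cap)
        return false;

    memcpy(out, msg + offset + 1, claimed);
    *out_len = claimed;
    return true;
}
```

Each of the three checks is a one-liner, which is part of why they are easy to forget or get off by one.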

It’s also worth considering the severity of different kinds of bug. A bug in high-level logic might allow an attacker to do something they shouldn’t be able to do, but it doesn’t give them code execution.

The worst bit is, an attacker can often gain code execution through a part of the code that otherwise wouldn’t be security critical (where a logic mistake would be low impact). So writing code in a language that allows for these vulnerabilities greatly increases your attack surface.


> It's close to impossible to get this wrong and I don't see how the fact that it's a parser would make it any more dangerous.

I can answer that one. The parser is more dangerous because a parser, essentially by definition, takes untrusted input.

Nothing the parser does is any more dangerous than the rest of the code; it's all about the parser's position in the data flow.



