JSON extra uses orjson instead of ujson (2019) (github.com/pydantic)
69 points by arvindh-manian 9 months ago | 16 comments



Ran into this PR today. Thought Samuel Colvin's response to the migration request was prescient, especially considering what we later saw with the XZ Utils backdoor.

I've no clue if there actually were/are any problems with orjson, but I admire this kind of dedication to security, especially years ago.


When you consider how large many of these open-source ecosystems are and the sheer number of contributors who go completely unnoticed, I think it’s pretty much guaranteed that there’s some software we’re all using or relying on that’s been compromised and the vulnerability has gone completely unnoticed.

As an example, think about how many packages get downloaded from npm every single hour without even a single thought by the person downloading it, or in most cases, without even their direct knowledge—what are the odds that they’ve downloaded malware onto their systems?

I remember watching a TypeScript-related livestream on YouTube and the guy installed the wrong npm package (due to a typo), which had a postinstall script. He pretty quickly realized it was the wrong package, but only after it downloaded and ran a script on his system. It turned out the package was harmless, but it’s just so easy to harm a large number of users nowadays if you can just find the right target, which is why package maintainers are in such a dangerous position, in my opinion.

Sadly, I think the XZ Utils backdoor is just the one that got noticed.


It wasn't "prescient" at all because in the five years since this discussion nothing happened with orjson.

Casting aspersions on someone based on a five-year-old discussion, with no evidence whatsoever, by referring to a completely unrelated incident is ... not brilliant.

This is the HN variant of "Twitter outrage" over some innocent five-year-old tweet.


I’d be willing to bet that Samuel was perceived as a jerk by some for even implying that this contributor was a bit suspicious, yet it was the most honorable position to take as a maintainer so many folks are relying on and trusting, both directly and indirectly. Job well done.


Hard to argue with his logic, especially re the fact that pydantic is used very widely by large organizations. Some degree of dependency visibility (i.e. non-binary releases, >1 contributor, publicly attributable maintainer) is a good thing.


I'm really surprised ijl got angry that his mail was quoted; it looks innocent enough to me.

For reference it's been edited out here: https://github.com/pydantic/pydantic/issues/589

But GitHub shows edit history, so the edit is meaningless for privacy. Here's the original mail (yes, I'm blatantly ignoring his request to not publish this, I'm just this evil.)

    I've looked into replacing ujson in pydantic with orjson
    (https://github.com/ijl/orjson). In this implementation, the same JSON
    library is used for everything, and JSON outputs bytes without
    whitespace (as it's faster and JSON is a serialization format). If
    orjson is installed, it won't affect pydantic's benchmark for
    validation, but can be expected to improve whole-program performance.

    It's a large change with breaking changes to JSON methods, however, so
    rather than opening a pull request now, could you take a look and see if
    that's consistent and acceptable to the project?

    https://github.com/ijl/pydantic/commit/7c08f41edd340614d7c58888f025665dbc71d0e3

    That passes tests, but that's all. I'll clean it up or modify if the
    idea's acceptable.

    Thanks.


I've had people get angry at me for "quoting emails", even when it's small quotes from completely innocent non-private stuff like this. I guess it's a matter of principle for some *shrug*


I’ve also been wary of orjson, considering ijl is anonymous and the only one authoring commits. Any idea whether security folks are checking repository artifacts and verifying builds for projects like this?


Has anyone done an analysis of it? I've used orjson in all my Python projects for years.


There was an orjson CVE recently, CVE-2024-27454, based on one person’s analysis: https://monicz.dev/CVE-2024-27454

That is not necessarily the result of a systematic review of course.


I've never heard of either of these and just always used the built-in json library. Is there a great reason to add more technical debt to my projects?


It's a drop-in replacement that's orders of magnitude faster. Barring security/bug concerns, in my opinion there's no reason not to use it.
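One caveat on "drop-in": as the quoted mail above notes, orjson outputs bytes without whitespace, whereas the built-in json.dumps returns a str. A rough sketch of how a project might keep orjson optional and fall back to the standard library follows. The helper names dumps_bytes and loads_any are made up for illustration, and this is not how pydantic does it.

```python
import json

try:
    import orjson  # third-party and optional; assumed to be installed separately
except ImportError:
    orjson = None


def dumps_bytes(obj):
    """Hypothetical helper: serialize obj to compact UTF-8 JSON bytes."""
    if orjson is not None:
        # orjson.dumps already returns compact bytes.
        return orjson.dumps(obj)
    # Fallback: match the compact, bytes-returning behaviour with the stdlib.
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def loads_any(data):
    """Hypothetical helper: parse JSON given as str or bytes."""
    if orjson is not None:
        return orjson.loads(data)
    return json.loads(data)


payload = {"name": "pydantic", "tags": ["json", "fast"], "n": 1}
encoded = dumps_bytes(payload)
decoded = loads_any(encoded)
```

Callers that expect a str would still need encoded.decode("utf-8"), which is the kind of breaking change to the JSON methods that the mail mentions.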


I definitely appreciate this degree of rigour.


(2019)


Just edited the title to reflect that


Can someone ELI5 why this is news? I'm just not following...



