Google Protocol Buffers - Open Sourced (google-opensource.blogspot.com)
57 points by enomar on July 7, 2008 | 24 comments



It's great to see Google has a brain about this. XML is sooo hyped up, and there are so many fan boys that it's generally not worth getting into a discussion about its merits lest you start a religious war. It's cool that Google looks beyond the Dilbert-like mission statement of XML and recognizes its failures as well as its strengths, not to mention the suitability to purpose for their needs.


Agreed.

When I first saw Gmail without folders, I was like wtf!idiots. Then I used it a bit and I was like wtf!sweet.

Same thing with Google Maps. I heard about them using Javascript and I was like wtf!idiots. And then I finally got to see it and I was like wtf!whoa.


Similar to Facebook Thrift (http://developers.facebook.com/thrift/)?


Thrift was inspired by Protocol Buffers. Ex-Googlers who went to Facebook reimplemented PB. In fact, Facebook was explicitly hiring people with PB experience.


And by inspired, you mean "a complete rip-off of".

I talked with some of the guys who did Thrift, and while they didn't steal code, the concept and config files are close to identical.


Identical config files aside, the concept itself is rather simple and has been around for ages in the form of ASN.1 and its encodings. It is also routinely used in custom network protocols, e.g. IPC or RPC ones. Anyone outside of the XML fanboy club is aware of these things :-)

File format wise, it's hardly rocket science either. When developers re-implement something, it's only natural to recycle an established implementation element. Changing it just for the sake of being different from another vendor is frankly quite dumb.

So I'd be careful with bold "rip-off" statements. In the end "Everything new is a well-forgotten old".


Well, I don't believe calling it a rip-off was necessarily criticism. As they say: mediocrity borrows, genius steals.

Also, I must plead ignorance. I've never worked with ASN.1 or these other formats. From my cursory examination, ASN.1 seems to be far more complex.


this is a complete fabrication. facebook's thrift was inspired by pillar, an rpc library written in ml by former cto adam d'angelo.


I understand why Facebook might want to claim this in public, but don't call people fabricators, okay? You never know when someone might just have conclusive evidence that proves you wrong.

Don't get me wrong, I admire Facebook for making Thrift public, and it was a great thing for the internet. It was probably foolish of me to focus on the lineage of the thing.


Is this something like modern day ASN.1?


Yep, in the encoding part it's quite similar to PER - http://en.wikipedia.org/wiki/Packed_Encoding_Rules


Neat!

There's an article in the 1997 Game Developers Conference Proceedings on a similar technique that Naughty Dog used in their Crash Bandicoot tool-chain.

Given a binary file format, it's nice to be able to specify a reader or writer code-generator with a declarative syntax that looks just like the file format spec.
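To illustrate the idea of a declarative spec driving both reader and writer, here is a minimal Python sketch. It is not the Naughty Dog technique or Protocol Buffers itself, and it interprets the spec at runtime rather than generating code. The `HEADER_SPEC` layout and the `make_codec` helper are hypothetical names invented for this example.

```python
import struct

# Hypothetical file header, written the way a format spec would list it:
# (field name, struct format code)
HEADER_SPEC = [
    ("magic", "4s"),         # 4 raw bytes
    ("version", "H"),        # unsigned 16-bit integer
    ("vertex_count", "I"),   # unsigned 32-bit integer
]


def make_codec(spec, byte_order="<"):
    """Build a (read, write) pair of functions from a declarative field list."""
    layout = struct.Struct(byte_order + "".join(code for _, code in spec))
    names = [name for name, _ in spec]

    def read(buf):
        # Unpack the fixed-size record and label each value with its field name.
        return dict(zip(names, layout.unpack_from(buf)))

    def write(record):
        # Pack the values in the order the spec declares them.
        return layout.pack(*(record[name] for name in names))

    return read, write


read_header, write_header = make_codec(HEADER_SPEC)
raw = write_header({"magic": b"MESH", "version": 2, "vertex_count": 1024})
header = read_header(raw)
```

Because the reader and writer are both derived from the same list, they cannot drift apart when a field is added or reordered.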


I am sure the scale of their applications demands optimizing as much as possible. I use YAML for this purpose and find it very flexible and easy to use. It is also more readable than XML.


Never trust an IDL whose performance is benchmarked against XML.


The fact that Google said this is much faster than XML does not mean that this IDL is slower than other IDLs. In fact, given the way it is implemented, I see no reason why it should be slow. If you want to find a drawback, it's the fact that the IDL is actually less expressive. Whether that matters or not in real life is another matter completely.


I'm not saying it's slow. I'm saying, "an order of magnitude faster than XML" isn't saying much. They've chosen the wrong benchmark. How does it compare to Pickle, Marshal, XDR, DCE, and Thrift?

For that matter, how does it compare to JSON?

I'm not really making a point about how fast Protocol Buffers are. I'm making a point about how the article was written.


Pickle, Marshal, and JSON are all ways of describing the kind of data structures used by Perl, Python, Javascript, and so on. They are way more flexible than Protocol Buffers -- you can have arbitrarily nested structures of arbitrary length. The format of the messages is basically self-describing. By which I mean that if they want to put an array there, they just put a '[' in and away we go.

I think XDR describes fixed-format structures. I don't know anything about DCE.

As described above, Thrift is a clone of Protocol Buffers.

Thrift and PB are a lot like XDR or, god forbid, CORBA, but with a few twists. They define message templates for how to parse and emit relatively simple binary structures, which are then compiled into RPC clients and servers. PBs are cross-platform, at least by Google standards: they work with both C++ and Java very well. So you get the speed and convenience of making RPC calls with binary data structures in a mixed C++/Java environment.

But the best thing about PBs is that the message structure can evolve. If a message suddenly has a new field the receiver can't use, the receiver doesn't panic or read in garbage, unlike many other binary formats. So you can upgrade clients and servers in a gradual manner, without any downtime. This is what makes PBs especially suited to cloud computing.
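The skipping behaviour described here can be sketched in a few lines of Python. This is a simplified illustration based on the published Protocol Buffers wire format, not Google's implementation. The function names and the `known` parameter are made up for the example, and the deprecated group wire types are rejected rather than handled.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def parse_known_fields(buf, known):
    """Return {field_number: raw_value} for fields in `known`, skipping the rest."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        # The wire type alone says how many bytes to consume, so a field
        # the receiver has never heard of is stepped over cleanly.
        if field_number in known:
            fields[field_number] = value
    return fields


# Field 1 = 150 (varint), field 2 = "testing", field 3 = 1 (added by a newer sender).
message = b"\x08\x96\x01" + b"\x12\x07testing" + b"\x18\x01"
old_receiver = parse_known_fields(message, known={1, 2})
new_receiver = parse_known_fields(message, known={1, 2, 3})
```

The old receiver ends up with fields 1 and 2 only, and never misreads the bytes belonging to field 3.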


You know, forwards compatibility is a feature of a great many binary protocols, including DNS, RADIUS, SNMP, and DHCP (itself a forward-compatible hack on BOOTP). So, PB is just another TLV encoding?


It seems like it. Before today I only had a vague idea how ASN.1 worked, and I never really delved into the wire format of PBs either. But yes, it seems to write a tag then a serialization of whatever the value is. Lengths are only required for strings because the format doesn't define any other type with variable length.
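For anyone who wants to see the tag-then-value layout concretely, here is a small Python sketch of the encoding side, following the published Protocol Buffers wire format. The helper names are invented for this example. Note that in the published format the length-delimited wire type covers strings, raw bytes, and nested messages, and integers are variable-length too (varints), but they mark their own end with a continuation bit instead of a length prefix.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint, low bits first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def encode_key(field_number, wire_type):
    """The tag: field number and wire type packed into a single varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number, value):
    # Wire type 0: tag followed directly by the varint value, no length.
    return encode_key(field_number, 0) + encode_varint(value)


def encode_string_field(field_number, text):
    # Wire type 2: tag, then byte length as a varint, then the bytes.
    data = text.encode("utf-8")
    return encode_key(field_number, 2) + encode_varint(len(data)) + data


int_bytes = encode_int_field(1, 150)
str_bytes = encode_string_field(2, "testing")
```

Here `int_bytes` is the three bytes `08 96 01` and `str_bytes` is `12 07` followed by the seven ASCII bytes of "testing".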


ASN.1/BER (you really mean BER) is the example most protocol dorks bring up when they argue against IDLs and structured protocols. It's incredibly complicated (though not as hard to implement as CORBA/IIOP).


indeed. i have used ASN.1 for a bunch of SNMP and GGSN accounting stuff, it is a pita. xdr on the other hand, is a breeze. also, xdr does support variable length fields.


hey signa11, are you a software developer for cellphone networking equipment?


yes.


kool, I'm an SGSN/GGSN engineer by trade, but I don't do the programming aspect.



