ZeroRPC

kodablah · on June 22, 2012

It would be nice if the protocol were published. I understand it's just msgpack and zeromq, but exception handling and message IDs escape me until I dig into the source (or find the wire format in the PyCon notes). I could definitely see myself (or someone else) creating a compatible, low-level C implementation of both a server and a client.

m0th87 · on June 22, 2012

We'll be working on that today.

jcromartie · on June 22, 2012

I had a bunch of questions about why it had "Zero" in the name, and why should I use this instead of just building RPC on ØMQ (zeromq). Then I went to the Github page.

You should mention ØMQ on the front page!

capkutay · on June 23, 2012

Currently using 0MQ to set up our cluster... it handles concurrency, and the guide has plenty of thorough examples (not to mention it's humorous) in every mainstream language (C, C++, Python, Java, Ruby, Haskell, etc.).

m0th87 · on June 22, 2012

Good point, we added it to the copy.

sausagefeet · on June 22, 2012

The documentation is pretty sparse, a few questions:

- How does it handle concurrency? Is each procedure run in a Python thread? Process? Gevent thread?

- How does it handle faults and recovery?

- Do you differentiate errors on the RPC layer from those on the procedure layer? For example, what if the RPC call I do does an RPC call that times out?

On a philosophical note, I'm not sure more RPC layers are what the world needs. The transparency a library like this gives you is really a big lie unless you are willing to accept that any call can now be an RPC one. Otherwise it greatly affects the correctness of your program (method M can now fail with some network failure).

That being said, good work!

gabrielgrant · on June 22, 2012

ZeroRPC uses Gevent for concurrency

Faults are handled differently depending on whether they are ZeroRPC-layer errors or application errors. To maintain the integrity of the connection itself, there is a heartbeat system independent of any given request. There is also an optional timeout that can be set for a given call's response. Application-level errors are propagated as "RemoteError" exceptions in the python interface. In order to collect more info about remote errors, there is also support for ZeroRPC[0] in Raven[1], the Sentry[2] client (Disqus' error logging system). In any failure case, both sides of the connection are notified and given an opportunity to clean up after themselves.

As to the philosophical concerns, I do agree on some level: RPC in general (and ZeroRPC in particular) are powerful weapons that need to be treated with care. That being said, there are a number of cases that are greatly simplified by the higher-level abstractions and more-robust error handling ZeroRPC provides.

------

[0]: http://raven.readthedocs.org/en/latest/config/zerorpc.html

[1]: https://github.com/dcramer/raven

[2]: https://github.com/dcramer/sentry

sausagefeet · on June 22, 2012

Thanks for the response!

> That being said, there are a number of cases that are greatly simplified by the higher-level abstractions and more-robust error handling ZeroRPC provides.

IMO, c.call("hello", "foo", "bar") would be better than c.hello("foo", "bar"). I think the latter gives too much of an illusion of a local call.

bombela · on June 22, 2012

You can use the alternative syntax: c('hello', 'foo', 'bar')

Usually, "c" will be called "logger_service" or "metrics_service", thus reminding you that you are talking to a remote service, but it's obviously just a convention...

Yes, I know, people don't like conventions, and we had cases where this illusion of a local call led us to do bad shit, this is true.

I think it's the trade-off of any abstraction. More power at your fingertips means easier use for good... or bad.
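As a rough illustration of the two call styles discussed above, the following toy proxy supports both the explicit form and the attribute form. `RemoteProxy` and `fake_send` are hypothetical names for this sketch; it does not show how ZeroRPC's client is implemented, only that the attribute form can be plain sugar over the explicit one.

```python
class RemoteProxy:
    """Toy client that forwards calls to `send`, a hypothetical transport function."""

    def __init__(self, send):
        self._send = send

    def __call__(self, method, *args):
        # Explicit form: the method name is just data, so the call reads as a message.
        return self._send(method, args)

    def __getattr__(self, method):
        # Attribute form: proxy.hello(...) is sugar for proxy('hello', ...).
        if method.startswith("_"):
            raise AttributeError(method)
        return lambda *args: self(method, *args)


def fake_send(method, args):
    # Stand-in for the network round trip; it only echoes what would be sent.
    return "{}({})".format(method, ", ".join(args))


logger_service = RemoteProxy(fake_send)
print(logger_service("hello", "foo", "bar"))
print(logger_service.hello("foo", "bar"))
```

Both lines go through the same `__call__` path, so the choice between them is purely about how loudly the call site advertises the network boundary.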

shykes · on June 22, 2012

Thanks for the detailed criticism and the kind words.

All procedures are run in their own gevent thread.

Yes, we differentiate errors on the RPC layer - there is a builtin heartbeat system so timeouts can be detected even when the procedure returns an infinite stream (think logs or system metrics). There is no built-in retry mechanism.

On the philosophy question: we discourage "pretending" that a call is local. That led to the demise of the original RPC and, as you point out, leads to weak systems. It remains the job of the developer to be aware of the network boundary and design accordingly.
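The heartbeat point can be sketched as follows: because liveness is tracked separately from results, a stream that never ends can still be declared dead. This is a simplified model with made-up names (`HeartbeatMonitor`, `consume`) and injected timestamps; a real implementation would check on a timer rather than only when an event arrives.

```python
class LostRemote(Exception):
    """Stand-in for the error raised when the peer goes silent."""


class HeartbeatMonitor:
    def __init__(self, interval, max_missed=2):
        # Tolerate up to `max_missed` heartbeat intervals of silence.
        self.allowed_gap = interval * max_missed
        self.last_beat = None

    def beat(self, now):
        self.last_beat = now

    def check(self, now):
        if self.last_beat is not None and now - self.last_beat > self.allowed_gap:
            raise LostRemote("no heartbeat for {}s".format(now - self.last_beat))


def consume(events, monitor):
    """events: iterable of (timestamp, kind, payload), kind is 'heartbeat' or 'data'."""
    received = []
    for now, kind, payload in events:
        monitor.check(now)  # liveness is judged independently of stream items
        if kind == "heartbeat":
            monitor.beat(now)
        else:
            received.append(payload)
    return received


healthy = [(0, "heartbeat", None), (1, "data", "a"), (5, "heartbeat", None), (6, "data", "b")]
stalled = healthy + [(20, "data", "c")]

print(consume(healthy, HeartbeatMonitor(interval=5)))
try:
    consume(stalled, HeartbeatMonitor(interval=5))
except LostRemote as exc:
    print("lost remote:", exc)
```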

sausagefeet · on June 22, 2012

> we discourage "pretending" that a call is local

Interesting, because it seems the opposite to me. The example client code makes an RPC call look exactly like a regular Python call. If you believe (which I do) that the syntax of something implies semantics of that something, this would mean you are trying to tell people an RPC call is the same as a local call.

> we differentiate errors on the RPC layer

What does an error look like if my RPC call times out? What does it look like if my RPC does an RPC call that times out?

The website says: "Built-in heartbeats and timeouts detect and recover from failed requests.", what does the 'and recover' part mean?

m0th87 · on June 22, 2012

Syntactically ZeroRPC calls look mostly the same as local calls, but semantically we don't hide erroneous states like you might expect a typically opaque RPC layer to do.

If an RPC call times out, you get a special exception that you can check against. Same with heartbeat failures.
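A minimal sketch of that behaviour, assuming a table of in-flight requests keyed by message id: when the deadline has passed, the caller gets a distinct exception and the pending entry is dropped. `PendingCalls` and its methods are hypothetical, and `TimeoutExpired` is a local stand-in class, so none of this should be read as ZeroRPC's actual internals.

```python
class TimeoutExpired(Exception):
    """Stand-in for the special exception raised when a call times out."""


class PendingCalls:
    """Tracks in-flight requests by message id (hypothetical bookkeeping)."""

    def __init__(self):
        self._deadlines = {}  # message id -> absolute deadline

    def register(self, message_id, now, timeout):
        self._deadlines[message_id] = now + timeout

    def check(self, message_id, now):
        """Simulates the caller checking on a request at time `now`."""
        deadline = self._deadlines[message_id]
        if now > deadline:
            # Clean up the pending entry, then surface the failure to the caller.
            del self._deadlines[message_id]
            raise TimeoutExpired("call {} exceeded its deadline".format(message_id))
        return deadline - now  # seconds left before the call times out

    def in_flight(self):
        return sorted(self._deadlines)


pending = PendingCalls()
pending.register("req-1", now=0, timeout=5)
pending.register("req-2", now=0, timeout=30)

try:
    pending.check("req-1", now=10)
    outcome = "still waiting"
except TimeoutExpired:
    outcome = "timed out"  # what happens next (retry, fail over, give up) is up to the caller

print(outcome, pending.in_flight())
```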

"Recover" might be bad copy; what we mean is that ZeroRPC cleans up pending results. It is up to you to figure out what to do to fully recover. You have to develop with this in mind, but the trade-off is that failures are not hidden from you, and you have full flexibility in deciding what to do in failed states.

sausagefeet · on June 22, 2012

My point is that syntax implies semantics. c.hello(..) implies a local call and the semantics that go with it. But the semantics are actually different, which is bad.

shykes · on June 22, 2012

One cool feature the doc doesn't mention: you can expose a python module or class from the command-line:

  $ zerorpc --server --bind tcp://:4242 os

  $ zerorpc tcp://:4242 chdir /tmp
  $ zerorpc tcp://:4242 getcwd
  /tmp

m0th87 · on June 22, 2012

We've submitted the individual implementations on HN in the past few months. Here are the discussions.

Python: http://news.ycombinator.com/item?id=3761954

Node.js: http://news.ycombinator.com/item?id=4137576

densh · on June 22, 2012

It's such a grief that there is no ruby support yet. It would help a lot with our current project where we have a site written in ruby/rails and backend services in python. We currently use Redis as a broker between the two. At the same time we've been looking at zeromq as a possible and simpler alternative.

Can developers share some info about their current roadmap? What languages will be supported next?

lonestar · on June 22, 2012

Check out JSON-RPC. Jimson is a good JSON-RPC client for Ruby. http://github.com/chriskite/jimson

m0th87 · on June 22, 2012

We did python and node.js first because we use them a lot internally. There's no firm roadmap for future implementations, but we're hoping once the protocol is well-documented, implementations can start growing organically.

shykes · on June 22, 2012

Note that a Redis transport is something we're considering. Also check out stack.io: http://github.com/dotcloud/stack.io

brunoqc · on June 22, 2012

Would a redis transport replace zeromq and what would be the advantages?

shykes · on June 22, 2012

No, it would be an additional option. ZeroMQ will always be supported - we use it enormously at dotCloud. Another transport we plan on supporting is websocket.

Each transport has its advantages. Zmq is incredibly efficient, very flexible and leaves you in full control of your network topology, but it requires a firewalled trusted network, and doesn't provide ready-to-use broker or naming facilities.

Redis can be used as a very efficient broker and discovery component, and can be more easily included in your web stack, but it limits your flexibility and becomes a potential spof.

The good news is, once zerorpc supports multiple transports you can start with one and swap it out for another without changing your code.
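The idea of swapping transports without touching calling code can be sketched like this. Everything here is an assumption made for illustration: the `Client` and transport classes are hypothetical, the in-process queue merely stands in for a broker, and JSON stands in for msgpack so the sketch needs only the standard library.

```python
import json
import queue


class DirectTransport:
    """Stand-in for a direct socket: hands the request straight to the handler."""

    def __init__(self, handler):
        self._handler = handler

    def request(self, payload):
        return self._handler(payload)


class BrokeredTransport:
    """Stand-in for a brokered transport: requests pass through a queue first."""

    def __init__(self, handler):
        self._handler = handler
        self._broker = queue.Queue()

    def request(self, payload):
        self._broker.put(payload)
        return self._handler(self._broker.get())


class Client:
    """Calling code depends only on transport.request(), not on the transport type."""

    def __init__(self, transport):
        self._transport = transport

    def call(self, method, *args):
        reply = self._transport.request(json.dumps([method, list(args)]))
        return json.loads(reply)


def handler(payload):
    # Toy server side: decodes the request and supports a single 'add' method.
    method, args = json.loads(payload)
    return json.dumps(sum(args) if method == "add" else None)


for transport in (DirectTransport(handler), BrokeredTransport(handler)):
    print(type(transport).__name__, Client(transport).call("add", 1, 2))
```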

cshenk · on June 22, 2012

I see the project you rely on for serialization (msgpack) has its own take on RPC:

http://msgpack.wordpress.com/category/messagepack-rpc/ http://wiki.msgpack.org/display/MSGPACK/Design+of+RPC

How does zeroRPC differ from that in performance, perspective and status? Did you know of msgpack-rpc when you started zeroRPC?

shykes · on June 23, 2012

We started writing zerorpc in 2010. We did test msgpack-rpc at the time, but it was barely a proof-of-concept.

A few key differences: zerorpc supports zeromq (including pub-sub and push-pull topologies), streaming responses, and has a bunch of great instrumentation built on top of it.

m0th87 · on June 23, 2012

Somewhat of a side-note, but interestingly enough, MessagePack's author cited ZeroRPC as a good example of MessagePack's use case: https://gist.github.com/2908191

druiid · on June 22, 2012

One of my clients has been looking at implementing back-end communications between processes, including transaction handoffs... using RPC calls. We had been looking at using dnode, but I've noticed zeroRPC before as well. Is there anyone out there with experience using both or programming for both at least that has any input?

bombela · on June 22, 2012

I did not play with dnode long enough to judge it. On the other hand, I can give some details about zerorpc that might help you to determine if it will be a better fit or not.

zerorpc supports:

heartbeat: on remote loss, any pending action is canceled (either on server or client side), and your application is notified properly. You can disable the heartbeat. You can also require that a client wait for a server to come to life before checking for a heartbeat.

timeout: you can specify a timeout for how long a request should take (independently of any heartbeat).

streaming: one call, and a stream as a result. It serves two purposes: (1) transferring data sets that would not fit in memory, as well as reducing the transfer latency for big data; (2) a push/pull stream to get events whenever they come. The client effectively "subscribes".

Note that it's up to you to decide what to do when a client can't consume the stream fast enough, so you can do push/pull style (server blocks for client), pub/sub style (server discards messages), punish style (server shits on the client ;)).
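The first two policies for a slow consumer can be sketched with a bounded buffer from the standard library. The function names are hypothetical and this says nothing about how zerorpc buffers streams internally; it only shows the difference between blocking the producer and discarding messages.

```python
import queue


def offer_blocking(buffer, item, timeout=None):
    """Push/pull style: the producer waits until the consumer frees a slot."""
    buffer.put(item, timeout=timeout)  # raises queue.Full if the timeout elapses


def offer_dropping(buffer, item):
    """Pub/sub style: if the consumer is behind, discard the message."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False


# A consumer that never reads, behind a buffer with room for two messages.
buffer = queue.Queue(maxsize=2)
delivered = [offer_dropping(buffer, n) for n in range(4)]
print(delivered)

try:
    offer_blocking(buffer, 99, timeout=0.01)
    blocked = False
except queue.Full:
    blocked = True  # the producer was held back because nothing was consumed
print(blocked)
```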

-- fx

cmelbye · on June 22, 2012

Can't wait to see a spec on the protocol, I'm excited to work on a Ruby implementation.

DanielRibeiro · on June 22, 2012

Interesting that the python/nodejs comparisons are not equal: the python ones are all synchronous, and the node.js are all asynchronous.

It would be nice to show how to do asynchronous calls in python, and synchronous in node.js

shykes · on June 22, 2012

Both implementations use asynchronous io. The difference is simply the API they expose - each implementation simply uses the best api available. With Node the dominant programming model is callback-based so that's what is used. Python has the advantage of a strong and well-supported coroutine library (gevent) so python developers aren't used to putting up with callback spaghetti.

tl;dr zerorpc is agnostic and each implementation uses the best tool for the job.
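To illustrate how one underlying operation can be exposed through either API shape, here is a small sketch. It uses a standard-library thread pool purely as a stand-in for gevent greenlets or Node's event loop, and `slow_hello`, `hello_blocking` and `hello_callback` are made-up names, not zerorpc functions.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_hello(name):
    # Stand-in for a remote call.
    return "Hello, " + name


executor = ThreadPoolExecutor(max_workers=2)


def hello_blocking(name):
    """Reads like a plain call; only the calling task waits for the result."""
    return executor.submit(slow_hello, name).result()


def hello_callback(name, callback):
    """Node style: returns immediately and reports (error, result) later."""
    def on_done(future):
        error = future.exception()
        callback(error, None if error else future.result())

    future = executor.submit(slow_hello, name)
    future.add_done_callback(on_done)
    return future


blocking_result = hello_blocking("RPC")

callback_results = []
hello_callback("RPC", lambda error, result: callback_results.append((error, result)))

# Wait for the worker threads so the callback has run before printing.
executor.shutdown(wait=True)
print(blocking_result)
print(callback_results)
```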

sausagefeet · on June 22, 2012

They are using gevent so I imagine each call is actually concurrent.

al_james · on June 22, 2012

Looks very convenient. Are there any plans for other language support (in particular on the client side)? Ruby and PHP client libraries would be useful.

shykes · on June 22, 2012

We definitely have plans for it, and multiple transports as well beyond zmq. We're looking for contributors to help :)

al_james · on June 22, 2012

Cool. When the spec is published, I may have time to contribute a PHP client.

eclark · on June 22, 2012

Any plans for a JVM version?