Hitch – A Scalable TLS Proxy by Varnish (github.com/varnish)
89 points by kolev on June 9, 2015 | 41 comments



From reading over the blog post, it seems hitch is a forked and patched version of stud. Nice and all, but it's difficult to see what advantage it could have over using haproxy for termination.

From skimming the github page the only thing that stands out is shared memory-based SSL contexts and UDP peer communication across processes/machines. Not sure, though, if this is something haproxy can also do? Never had a need for that level of performance so have never looked.


Interesting to see people recommending haproxy for ssl termination now. HAProxy has only supported that since June 2014. Before that, you saw people using projects like stud with it, and others recommending just using nginx in front.


True, but if they were like me, they were using those as stopgaps until SSL landed in haproxy 1.5 stable. Yes, stable support arrived 'only' a year ago, but that was after years of development work :) The first dev release with basic SSL support was back in September 2012.


We use stud for TLS termination at work, because we can put it in front of all of our various HTTP servers and get consistent TLS support. Almost all of the services have the same stud config, just different certs. Stud is very amenable to running in a jail too (so we've limited the damage of the next OpenSSL vulnerability): the config references one cert path, and we just put in the right cert for the machine when we assemble the jail. The only other usual difference is the number of localhost IPs to use, so the script detects that as well.

Haproxy would probably work fine too, but seems a bit overkill for running a local termination proxy.


haproxy is basically identical for your use case. I guess whether one views it as overkill or not is a matter of taste.


Yes, haproxy can do the same job, but it can also do a whole bunch more that we wouldn't be using. It's overkill in my opinion to have load balancing, status checking, request inspection, etc. available when all I need is to listen on port 443, strip TLS, add a proxy header, and send to localhost. (There's nothing wrong with haproxy, and I would consider it if I needed the other features.)


Advantages are that it is faster, and that it is a small and simple program that does a single thing well.


From reading the launch posts it seems the real advantage is that it allows Varnish Software to bring SSL under the same roof and offer commercial support for it, i.e. more of a business advantage than a technical one?


"By default, hitch has an overhead of ~200KB per connection"

Ouch. This default means 16GB only lets you handle 80k connections. I would hardly call this "scalable". In 2015 you find blog posts left and right showing you how to reach 1 million connections on a single machine with language X or tool Y or framework Z. Maybe the developers should change this default.

https://news.ycombinator.com/item?id=3028741


Preallocating memory is usually an optimization for throughput.

Servers can have up to 1TB of RAM without becoming overpriced.

But starving 80k users due to low buffering will be expensive in the long run. Way more expensive than RAM ;-)


"Preallocating memory is usually an opimization for throughput."

Still, 200kB is excessive. A program with buffers no larger than 10kB can _easily_ saturate a 1 Gbit/s NIC. Hitch is designed to handle many concurrent connections, so even if it handled a paltry 10 connections it could easily saturate a 10 Gbit/s NIC with 10kB buffers. If not, then there is a design flaw somewhere.
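Back-of-the-envelope arithmetic behind that claim (my numbers, not anything measured on hitch):

    1 Gbit/s ≈ 125 MB/s of payload
    125 MB/s ÷ 10 kB per buffer ≈ 12,500 buffer refills per second, total

A refill rate like that is trivial for an event loop; buffer size mostly trades memory against syscall count, it doesn't cap throughput.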

"Servers can have up to 1TB of ram without becoming overpriced"

This is irrelevant. If a Hitch version had a default overhead of 10kB per connection, it could in theory scale to 20x as many connections as this version of Hitch, for a given amount of RAM (no matter the amount). Maximizing the use you get out of a given amount of hardware resources should be your priority when writing scalable software.


What do you think the CPU usage would be with 10kB buffer sizes? And since we're throwing numbers out in the air, why stop at 10kB? If we reduce it to 1kB, that should give us MUCH MORE connections!!11

Let me ask a leading question: how much of this do you think is openssl overhead?

Please consider optimising for a real usage scenario, not some fantasy benchmarking setup.


I am not picking my numbers randomly. On an x86/x86-64 Linux kernel, one socket (one connection) will use at least one 4kB physical memory page. So if userland also allocates one or two 4kB pages for its own needs, you need at minimum 8 to 12kB per connection. That's why I quoted ~10kB.

The minimum theoretical memory usage is 4kB per connection: 1 page in kernel space, and nothing in userland (e.g. you use zero-copy to transfer data between sockets or to/from file descriptors).
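For that "nothing in userland" case, here is a minimal sketch of zero-copy forwarding between two already-connected sockets on Linux, using splice(2) through an intermediary pipe (relay_zero_copy and the 64kB chunk size are made up for illustration; TLS termination itself can't do this, since it has to read the bytes to decrypt them):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* Relay bytes from sock_in to sock_out without copying them through
     * userland: socket -> pipe -> socket, all inside the kernel. */
    static int relay_zero_copy(int sock_in, int sock_out)
    {
        int p[2];
        if (pipe(p) == -1)
            return -1;
        for (;;) {
            ssize_t n = splice(sock_in, NULL, p[1], NULL,
                               65536, SPLICE_F_MOVE | SPLICE_F_MORE);
            if (n <= 0)
                break;                              /* EOF or error */
            while (n > 0) {
                ssize_t m = splice(p[0], NULL, sock_out, NULL,
                                   (size_t)n, SPLICE_F_MOVE | SPLICE_F_MORE);
                if (m <= 0)
                    goto out;
                n -= m;
            }
        }
    out:
        close(p[0]);
        close(p[1]);
        return 0;
    }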

At Google, our SSL/TLS overhead per connection is 10kB: https://www.imperialviolet.org/2010/06/25/overclocking-ssl.h...


Thanks for the data point.


Are those blog posts referring to setups with TLS? Comparing plaintext HTTP to TLS is comparing apples and oranges.


If you want to handle 1M connections, you can tune this. It will probably be the easiest thing to tune of many. Note that 1M connections terminated by stud/hitch is actually 3M sockets: 1M inbound to stud, 1M initiated by stud and 1M terminated by your underlying server. That's a lot of connections on localhost (on the plus side, loopback is a whole /8).
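Rough numbers on why the /8 matters (my arithmetic, not from the hitch docs): every stud->backend connection needs a unique (source IP, source port, destination IP, destination port) tuple, and there are at most 65,535 source ports per source/destination pair.

    1,000,000 connections ÷ ~64k ports per (src IP, backend IP:port) pair ≈ 16
    with the default Linux ephemeral range (~28k ports) it's closer to ~36

So the stud->backend leg needs a few dozen loopback source addresses (or extra backend ports), which 127.0.0.0/8 provides in abundance.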


If you're splitting 10 Gbit/s across 80,000 users, that leaves 125 kbps per user. Split it across 1 million users, and it leaves 10 kbps per user. Sure, you could have more than 10 Gbit/s of bandwidth from a single server to the Internet in theory -- but at that point I don't think sticking to 16GB of RAM makes much difference.


Millions of connections usually means websocket connections or HTTP keep-alive connections. In those cases there's not much traffic over those connections. Imagine a game server, for example. Latency is more important than bandwidth, and 10 kbps is enough for many tasks.


I wonder how much of a hit typical websocket use cases would take from swapping to SSD? For games I'd think one might prefer just using connectionless UDP, though?


Websockets are for browser clients; there's not much choice there, unfortunately.


This is TLS, NOT TCP/HTTP.

Secure sockets have a lot more overhead than plain TCP sockets, and on top of that there's all of the overhead that a proxy has per connection.


One big overhead in SSL can be the zlib compression buffers. Setting SSL_OP_NO_COMPRESSION can help quite a bit.


That's true, and you shouldn't support TLS compression anyway (to resolve attacks like CRIME).

You should also set SSL_MODE_RELEASE_BUFFERS to reclaim memory from idle SSL connections.
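A minimal sketch of both of those knobs in plain OpenSSL (make_server_ctx and the paths are made up for illustration, error handling is trimmed, and this is not hitch's actual code):

    #include <openssl/ssl.h>

    /* Assumes SSL_library_init() has already been called (OpenSSL 1.0.x). */
    SSL_CTX *make_server_ctx(const char *cert_path, const char *key_path)
    {
        SSL_CTX *ctx = SSL_CTX_new(SSLv23_server_method());
        if (ctx == NULL)
            return NULL;

        /* No TLS compression: closes the CRIME hole and avoids the zlib
         * buffers that would otherwise be allocated per connection. */
        SSL_CTX_set_options(ctx, SSL_OP_NO_COMPRESSION);

        /* Let OpenSSL free the read/write buffers of idle connections
         * instead of holding them for the connection's lifetime. */
        SSL_CTX_set_mode(ctx, SSL_MODE_RELEASE_BUFFERS);

        SSL_CTX_use_certificate_chain_file(ctx, cert_path);
        SSL_CTX_use_PrivateKey_file(ctx, key_path, SSL_FILETYPE_PEM);
        return ctx;
    }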


Yes, doing hard crypto for all users has costs. Welcome to the real world. :-)



After reading the source, I realized that I have no idea what modern, idiomatic C looks like. There are tons of uses of #define, some even in the middle of typedefs. I'm pretty sure a whole queue data structure is defined in macros in vqueue.h. Is this normal?


From vqueue.h header:

Copyright (c) 1991, 1993 The Regents of the University of California. All rights reserved.

It's very much based on this:

http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/sys/queue.h?anno...

Not very modern, in other words.


C does not have generics or templates so macroses are the only way to define data structures and algorithms over arbitrary types without repetition. The Linux kernel uses macroses to define some of its data structures. Many other projects do the same.

Whether that is normal or not is up to the developer. It's possible to use as small a subset of C++ as the developer wants, e.g. only templates. But staying in the C realm might be better for portability.
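A toy illustration of the macros-as-templates idea (DEFINE_LIST is made up here and has nothing to do with hitch's actual macros): one macro stamps out a typed node struct and a push function per element type.

    #include <stdio.h>
    #include <stdlib.h>

    /* "Instantiate" a singly linked list for an arbitrary element type. */
    #define DEFINE_LIST(name, type)                                      \
        struct name { type value; struct name *next; };                  \
        static struct name *name##_push(struct name *head, type v) {     \
            struct name *n = malloc(sizeof *n);                          \
            n->value = v;                                                \
            n->next = head;                                              \
            return n;                                                    \
        }

    DEFINE_LIST(intlist, int)            /* a list of ints */
    DEFINE_LIST(strlist, const char *)   /* a list of strings */

    int main(void)
    {
        struct intlist *xs = intlist_push(NULL, 42);
        struct strlist *ss = strlist_push(NULL, "hello");
        printf("%d %s\n", xs->value, ss->value);
        return 0;
    }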


> C does not have generics or templates so macroses

When I read this, I couldn't help but think of "macroses" as pronounced like "neuroses", which seems appropriate.


Pretty normal for the BSD kernels. A lot of the very basic data structures are defined as macros. For instance, here's queue.h: https://svnweb.freebsd.org/base/head/sys/sys/queue.h?revisio...


While not 100% necessary, header macros to implement lists, queues (and even trees, see: http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/sys/tree.h?...) are common in C.

They first appeared for kernel usage, where the fact that they expand to inline code inside functions avoids creating extra stack frames and helps optimization.

They do have the advantage of not relying on casting everything to "void *" or resorting to callbacks for walking (see TAILQ_FOREACH, for instance).
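For the curious, a minimal self-contained example of the queue.h style under discussion (struct conn is just for illustration; glibc and the BSDs both ship <sys/queue.h>). Both the head and the iteration are fully typed, with no void * in sight:

    #include <sys/queue.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct conn {
        int fd;
        TAILQ_ENTRY(conn) list;          /* embedded next/prev pointers */
    };
    TAILQ_HEAD(connhead, conn);          /* declares the typed head */

    int main(void)
    {
        struct connhead head = TAILQ_HEAD_INITIALIZER(head);
        for (int i = 0; i < 3; i++) {
            struct conn *c = malloc(sizeof *c);
            c->fd = i;
            TAILQ_INSERT_TAIL(&head, c, list);
        }
        struct conn *c;
        TAILQ_FOREACH(c, &head, list)    /* expands to an ordinary for loop */
            printf("fd=%d\n", c->fd);
        return 0;
    }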


I'm already using pound for SSL termination http://www.apsis.ch/pound/. Does Hitch provide any advantages over pound?


The one I'm seeing is that this might work nicer with wildcard certificates and SNI.

Unless it's changed since I last attempted it, Pound only supported a single wildcard cert when it came to SNI, whereas looking at the hitch code suggests it might play nicely with multiple wildcard certificates.

Edit: To clarify, I don't think Pound technically supported any wildcard certificates; the wildcard cert had to be the default to work.
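On the hitch side, if I'm reading its docs right (hedge: I haven't tested this, and the hostnames/paths below are made up), you can list multiple pem-file entries and it will pick the certificate by SNI, with the first one acting as the default:

    frontend = "[*]:443"
    backend  = "[127.0.0.1]:8080"
    pem-file = "/etc/hitch/default.example.com.pem"    # default certificate
    pem-file = "/etc/hitch/wildcard.example.net.pem"   # selected via SNI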


What is the advantage over just using nginx to terminate SSL?


Presumably the "do one thing and one thing well" principle (assuming hitch actually does it well). Your attack surface is reduced by orders of magnitude if you're worried about future OpenSSL vulnerabilities.


Wonder why LibreSSL wasn't used... :/


The reason is pretty simple: LibreSSL isn't available/packaged on the distributions we care about, and we don't have the will, money or knowledge to do it ourselves. (with my VS hat on)

We're positive about merging any code changes necessary to get it running with LibreSSL, though.


I hadn't realized Stud had gone unmaintained. Great to see Varnish taking over the project!


Yeah, it's worrisome that people still use the bumptech version unpatched. The last commit to the official stud GitHub repo is from 2012: https://github.com/bumptech/stud/commits/master (although there are many forks).


For those that don't know, varnish was always a bit reluctant to take on the challenges of TLS termination. They traditionally had more of a 'do one thing well' policy. Here's a good description of their reasoning around that:

https://www.varnish-cache.org/docs/trunk/phk/ssl.html


The problem with putting forward a 'do one thing well' rationale is that it treats TLS as a separate problem from serving HTTP. It simply is not, and even in 2006 the writing was on the wall: HTTPS will be the standard web transport protocol within the next few years, and plain HTTP will cease to be a viable option for production.

This has pros and cons, but besides the current CA situation I think it's pretty clearly better than what we have today. That's not really the point though; it's going to happen, regardless of flaws.

Using software like Varnish that is intentionally HTTP-only will always be possible, but it introduces architectural and operational handicaps. It may not matter for a lot of use cases, but at large scale you are going to pay for the architectural choice to separate these functional units into multiple processes (or even boxes).

As much as I appreciate some of what Varnish can do, the no-SSL stance and associated mindset really puts me off of it.



