What caused that outage
74 points by pg on March 12, 2009 | hide | past | favorite | 62 comments
HN was down today for around 2 hours. Sorry about that.

The News server currently crashes a couple times a day when it runs out of memory. All the comments and stories no longer fit in the 2 GB we can get on a 32 bit machine. We'd been planning to upgrade to a new 64 bit server. In the meantime it was arguably a rather slow form of GC.

Unfortunately the process somehow got wedged in the middle of segfaulting. We're not sure why and will probably never know. But that meant the process that usually notices when News is wedged and restarts it was unable to kill it.

Fortunately rebooting the machine solved the problem. But now we'll presumably be switching to that new, bigger server sooner rather than later.

As far as I can tell it was a coincidence that this happened today. It doesn't seem to have been caused either by the increased traffic, or the excessive number of posts about Erlang.




Bummer. Maybe you should consider rewriting in Erlang... :)


That's what happens when you make something in a new and unproven language.

And I think that it deserves tremendous respect that you have done so. You could have hacked HN in python (or lisp?) in a weekend with the knowledge that it would scale, yet you decided to go with Arc. Very bold. And a great way to battletest a new language.

It's great to see people eat their own dogfood.


If I'm not mistaken, "to battletest a new language" was the purpose of building this.


Keep in mind that mzscheme is battle tested. Arc is "just" a bunch of macros on top of PLT Scheme; what gives HN its [under]performing character is that foundational JIT compiler and 3m GC.


Arc compiles into MzScheme, but it's not implemented as macros. You can see that from the source.


In general, 3m is performing very well, much better than the conservative collector plt used to use by default (the boehm gc). As for the jit: many of the benefits of the jit are irrelevant in Arc, since it doesn't use mzscheme modules; this is in addition to using an old mzscheme version, from when the jit was rather new -- many improvements have been made since then.


I doubt Arc is the cause of any problems. I think this data model would be problematic in any language. You can't just let your processes die when you are low on memory, you need to evict unused pages from the cache.


Well, in a language with existing web frameworks it would be convenient to use one of them and store your data with SQL. In Arc you would have to write your own SQL wrapper. I suspect if pg was not using Arc he would not be keeping everything in memory.

That said, the lisp world could use a good web framework, so perhaps this will inspire some lispers to create one.


I suspect if pg was not using Arc he would not be keeping everything in memory.

I don't want to speak for pg, but I highly doubt this. He chose the architecture he did for good reason. Looking things up in memory is much faster than looking them up in the database. The only issue is that the site is too big to fit entirely into memory now, so he needs to write something that cleanly manages the memory space. (BerkeleyDB calls this a pager; data is stored in pages that can be in memory or disk.)

With that in place, the site should perform fine.
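The pager idea mentioned above can be sketched in miniature. This is a hypothetical illustration of the technique, not BerkeleyDB's actual implementation: a fixed pool of page slots with least-recently-used eviction.

```c
#include <assert.h>

/* Hypothetical miniature pager: a fixed pool of page slots with
 * least-recently-used eviction, in the spirit of BerkeleyDB's pager.
 * Page contents are omitted; a real pager would read them from disk. */
#define NSLOTS 3

struct slot {
    int page_no;        /* which on-disk page this slot holds (-1 = empty) */
    unsigned long used; /* logical clock of last access */
};

static struct slot pool[NSLOTS];
static unsigned long clock_tick = 0;

void pager_init(void) {
    for (int i = 0; i < NSLOTS; i++) {
        pool[i].page_no = -1;
        pool[i].used = 0;
    }
}

/* Return the slot index holding page_no, evicting the LRU slot on a miss. */
int pager_get(int page_no) {
    int victim = 0;
    for (int i = 0; i < NSLOTS; i++) {
        if (pool[i].page_no == page_no) {   /* cache hit */
            pool[i].used = ++clock_tick;
            return i;
        }
        if (pool[i].used < pool[victim].used)
            victim = i;                     /* track least recently used */
    }
    /* Miss: evict the LRU slot and "load" the requested page. */
    pool[victim].page_no = page_no;
    pool[victim].used = ++clock_tick;
    return victim;
}
```

The point of the design is that the rest of the code only ever asks for a page; whether it was already in memory or had to be faulted in from disk is the pager's business.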


I would be keeping everything in memory (using lazy loading) regardless. It makes life simpler.


You wouldn't use SQL? Something like django or rails is pretty handy if you are already using those languages, as annoying as it can be.


SQL databases are a very poor fit for sites like news.yc. Things like threaded comments, for example, are hard to model efficiently. If you use an object database (or roll your own in-memory thing), though, it is very easy to model.

Relational databases are certainly useful in some situations, but there are usually better options for websites. You should consider learning about other options before telling everyone to use Rails. Rails is not the only way to do things, and it's rarely the best way.


Actually, most of the websites I write don't use SQL.

I don't think your "poor fit" comment is true. You don't need to model the threaded nature of comments at all in your database. Since there's a relatively small number of comments per post, and most access to comments is probably show-me-all-comments-for-this-submission, you usually just need a single foreign key, each comment to its original grandparent submission.

In this specific context the relevant advantage of SQL over rolling your own is that there is less distinction between keeping data in memory and on disk. Otherwise, you can certainly do anything with memory + SQL that you can with memory + disk.


Sure, you can store it in a relational database, but you have to map your data to some structure that it doesn't have. That's what we call a hack.

If you use an object database, you can store your data exactly as it is represented in memory. That is much cleaner, IMO.

(Relational databases are like programming languages. Just as you can use any programming language for any task, you can hack any data you want into the relational model. But that's not always the best way. Sometimes a key/value store, or an object store, or a document store is a better model. Using a better model means you need to write less code, which means your app will have fewer bugs.)


Wow, had no idea this was written in Arc. +1 to what mixmax said. It's the only way to keep pushing the envelope.


Boldness drives innovation


Speaking of bugs, pages don't load for me when I'm logged into my real account (without the 2). The front page cuts off after "1.</td><td><center>" and comments pages cut off right after the first sub-table opens. The leaders and submit pages load fine.

I thought it was just a sign that I should get back to work, but maybe my account got put into some sort of inconsistent state?


Thank God. I was worried you had pushed the big red noprocrast.


The server goes down, and when it comes back up the front page is filled with stories about Erlang. Coincidence?


Sign of the Second Coming?


Something that could be done for now is to write a piece of mzscheme code that marshals the data into (utf-8-encoded) byte strings. Assuming that most of the 2 GB is made up of strings, and that these strings are mostly ASCII, this should reduce the consumption by close to a factor of 4.

(I can imagine an interface that is transparent at the Arc level, where all strings are just passed to the backend and retrieved from it, and the backend converts them to and from byte strings. Later on it could change to use a FS or a DB or whatever.)
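The factor-of-4 estimate is easy to sanity-check. Assuming the Scheme runtime stores strings as 4-byte code points (an assumption about this MzScheme build, not a confirmed detail), ASCII text packed into UTF-8 byte strings takes a quarter of the space:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Back-of-the-envelope check of the factor-of-4 claim: a Scheme string
 * stored as 32-bit code points versus the same ASCII text packed into a
 * UTF-8 byte string. The 4-bytes-per-character figure is an assumption. */
size_t as_codepoints(const char *ascii) {
    return strlen(ascii) * sizeof(uint32_t);   /* one 32-bit slot per char */
}

size_t as_utf8_bytes(const char *ascii) {
    return strlen(ascii);                      /* ASCII is 1 byte/char in UTF-8 */
}
```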


Yeah, it's not optimal to let your processes die from running out of memory because maybe some other non-server process actually requested the last bit of memory, and then who knows what sort of state your machine is in. How about making your restarter job notice when the machine is very close to being out of memory and preemptively kill the server then?


Well there's only one good thing for the ultimate in scalability -- arc in the cloud.

Which I believe should be called "lightning", right?


I thought 32-bit addressing gave 4GB of addresses…is there some sort of flag that's taking one bit? Not trying to be a smartass, just curious about the discrepancy.


There ends up being only 2GB of heap, which is where all the stories and comments live.


Does it really make sense for them to live in the heap rather than in the disk buffers?

I presume that you're not hitting a melt-down from CPU time, so I'd assume you'd get better performance by letting the system handle paging in data from the disk and figuring out which stuff needs to be in memory (buffers) and lives out on disk.

Also one thing to watch out for when you make the 64-bit jump is that (internally pointer-heavy) applications in dynamic languages tend to use significantly more memory on 64-bit platforms. davidw talked about this some here:

http://journal.dedasys.com/2008/11/24/slicehost-vs-linode


It does give 4 GB of addressable space, but many things are memory mapped by the OS, and so on a 32-bit system you end up with anywhere from 2.8 to 3.5 GB of addressable memory.


So by 2GB he meant usable, not addressable? I understand that addressable RAM != 4GB due to graphics card memory, BIOS, etc. but 2GB is way less than you could get on a 32-bit server. I just wanted to understand if there's something I'm missing.


2 GB is all you're going to get for a single process anyway; that's some kind of limit, if I recall correctly. This site is a single process.


The OS sometimes reserves quite a chunk. For example, Windows XP really only has 2.25GB available. Putting more RAM in is pretty useless unless you switch to 64-bit.


Can't this be changed, you know, by hacking the registry or something?!


No. The Windows and Linux kernels are hardwired to reserve a huge chunk of virtual addresses for kernel memory space. Windows can be toggled between reserving 2 GB and 1 GB (the latter of which Linux does by default). I assume this is so that a system memory address can be identified by testing a couple bits, and the minimum size is presumably limited by the chunk of addresses eaten up by memory-mapped devices like video cards.

Here's a page with some more details: http://news.ycombinator.com/item?id=452005


There's a boot-time argument that will move the barrier to 3G instead of 2G.


IIRC, Linux uses the high-bit to distinguish between userspace and kernel space.


Dynamic languages end up using more memory, because extra information has to be stored about each item (i.e. type, tc info, etc.).


Down-voters, he is also right.

It's common for dynamic languages to embed typing information in pointers as an optimization. For example, CLISP uses at least 2 bits to distinguish between common types. That way fixnum numbers can be recognized and added without slow memory accesses.

The result is that you get fewer bits for the address, and hence less addressable memory.


Actually, no. These tag bits are usually stored in the lowest bits, which are zero for all pointers (you would be mad not to align your data structures to the four- or eight-byte boundaries your hardware uses for memory access). So you get the full width for pointers, but reduced width for your fixnums, because you have to set one of the least significant bits of the machine word to one to distinguish it from a pointer. That you still can't use the full 4 GB of a 32-bit address space is due to the fact that the OS needs some address space for itself; the details vary from OS to OS, and depend on what the runtime of your language does with the addresses the OS allows it to use. So being able to use more than 2 GB on a 32-bit architecture should not be taken for granted.
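The low-bit tagging scheme described above can be sketched in C. This is a generic illustration, not CLISP's or MzScheme's actual representation; it assumes heap objects are at least 4-byte aligned, so real pointers always have a clear low bit:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of low-bit tagging. Assumption: all heap objects are at least
 * 4-byte aligned, so a genuine pointer always has its low bit clear.
 * A fixnum is stored shifted left one bit with the low bit set, which
 * costs one bit of integer range but leaves pointers full-width. */
typedef uintptr_t value;

value box_fixnum(intptr_t n)   { return ((uintptr_t)n << 1) | 1; }
intptr_t unbox_fixnum(value v) { return (intptr_t)(v >> 1); }
int is_fixnum(value v)         { return (int)(v & 1); }

value box_pointer(void *p)     { return (uintptr_t)p; } /* low bit already 0 */
void *unbox_pointer(value v)   { return (void *)v; }
```

Because the tag lives in bits that alignment guarantees are zero for pointers, the full address width stays available; only fixnums lose range, exactly as the comment above describes.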


While that's true for some things, that doesn't account for the memory overhead of a copying garbage collector.


Not to mention that many languages written in C use unions to describe their primitive -type- [ed: object]. The result is that the minimum number of bytes for storing an integer, for instance, is the minimum number of bytes that can store a value of the largest type.

For example, if you have a string type which keeps track of its length, then you might need 8 bytes: 4 for the pointer to the string of chars, 4 for the integer to keep count.

Here's a better example from tinyscheme:

    struct cell {
      unsigned int _flag;
      union {
        struct {
          char   *_svalue;
          int   _length;
        } _string;
        num _number;
        port *_port;
        foreign_func _ff;
        struct {
          struct cell *_car;
          struct cell *_cdr;
        } _cons;
      } _object;
    };
At a minimum, each cell takes up the size of _flag plus max(sizeof(_string), sizeof(num), sizeof(port), sizeof(_cons), sizeof(foreign_func)). And num is defined as follows:

    typedef struct num {
       char is_fixnum;
       union {
          long ivalue;
          double rvalue;
       } value;
    } num;


eeek. that should have said GC info (stupid iPod touch keyboard)


Judging by the number of Erlang stories on the front page (19 out of 25 currently) I would say that's your problem right there...(wholly in jest, half in earnest)


Why was the site using a 32-bit environment to begin with? Opterons have been cheap for years now, and all new Xeons are capable of 64-bit operation.


What chip put out in the last 2 years or so couldn't run a 64-bit system? I'm pretty sure most of the AMD chips could... Edit: By chip I mean a desktop CPU; e.g. Atoms don't count.


Atom 330 is also 64-bit.


IDK if it's relevant, but 64-bit can seriously suck also. If you use a lot of pointers, you just doubled the memory you need for each pointer.

In my experience, 64-bit isn't a good way to save memory.


Hey, so I'd be happy to donate 16 GB worth of Xen instances to the project, just to say I did, if you need mirrors. My boxes are 32 GB RAM/8 core, and the CPU is proportional, so if you want 4x4 GB instances over separate servers, I can do that. (Well, you will have to wait a bit for me to put up the 4th server.)


pg, I'd love to read a post about how you've dealt with writing HN to work in one process, in 2GB of RAM, it sounds quite novel! Did you have to make manual indexes? How do you arrange the files on the file system? How do you handle voting and concurrency/file locking?


You aren't running this in the standard language/caching/db setup, then?

You're storing EVERYTHING in memory? Isn't this kind of.. well, stupid, frankly.


While the comment above may not have the best manners, it does bring up an interesting point. I certainly wouldn't mind hearing more about HN's setup.


If you're really curious and have the time, you can download the source: http://arclanguage.org/install


It'd be interesting to read a write up on the concepts used (such as the in memory database) and the server restart mechanism. It'd be hard to tell from the source for non Lisp programmers.


An in-memory database is pretty simple - it's just a bunch of hash tables.

A server restart mechanism is also pretty simple. Write a cron job that activates every few seconds, does a ps aux and greps for the server name, and starts the server if it's not there. That won't catch server hangs, only deaths -- and a hang is exactly the problem that happened here.
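The "bunch of hash tables" above can be sketched as a single chained table mapping item ids to records. This is a toy illustration; the field names are made up, not HN's actual schema:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy in-memory "database": one chained hash table mapping item ids to
 * records, in the spirit of keeping every story and comment in RAM.
 * Field names here are illustrative, not HN's actual schema. */
#define NBUCKETS 64

struct item {
    int id;
    char title[64];
    struct item *next;   /* chain for hash collisions */
};

static struct item *buckets[NBUCKETS];

static unsigned bucket_of(int id) { return (unsigned)id % NBUCKETS; }

void db_put(int id, const char *title) {
    struct item *it = malloc(sizeof *it);
    it->id = id;
    strncpy(it->title, title, sizeof it->title - 1);
    it->title[sizeof it->title - 1] = '\0';
    it->next = buckets[bucket_of(id)];   /* prepend to the chain */
    buckets[bucket_of(id)] = it;
}

struct item *db_get(int id) {
    for (struct item *it = buckets[bucket_of(id)]; it; it = it->next)
        if (it->id == id)
            return it;
    return NULL;
}
```

Every lookup is a hash plus a short chain walk in RAM, which is why this style is so much faster than going to disk for each query.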


Thanks, I understood it uses hash tables - I've done the same myself, but never on such a large scale. This post sums it up: http://news.ycombinator.com/item?id=513259


Dude, frack no. Having everything in a database means you are tied to the speed of disk, instead of the speed of ram. The biggest issue with being purely ram resident is handling multi-threaded updates across your ram resident dataset.

You could always use Erlang...


"Dude, frack no. Having everything in a database means you are tied to the speed of disk, instead of the speed of ram. "

Databases are not tied to a storage medium. There is no reason why you can't run a DB (key-value or even a full RDBMS) in RAM.


If everything is purely RAM resident, wouldn't you lose it all when the server gets wedged?


No. You journal the changes to disk. You therefore only need disks fast enough to keep up with the journal. If you get into a situation where your disks can't keep up with the journal, your site is probably big and popular enough that you can afford to hire DBAs to go from there.
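The journaling approach can be sketched in a few lines. A minimal write-ahead log, assuming for illustration that the state is just an integer score: every change is appended to the log before being applied, and replaying the log rebuilds the state after a crash.

```c
#include <assert.h>
#include <stdio.h>

/* Minimal write-ahead journal sketch. The "state" (a single integer
 * score) is illustrative; the filename is made up. Real journals also
 * fsync and periodically checkpoint so replay doesn't grow unbounded. */
static const char *JOURNAL = "journal.log";

/* Record a change durably before applying it to the in-memory state. */
void journal_append(int delta) {
    FILE *f = fopen(JOURNAL, "a");
    if (!f) return;
    fprintf(f, "%d\n", delta);
    fclose(f);
}

/* After a crash, rebuild the in-memory state by replaying the log. */
int journal_replay(void) {
    FILE *f = fopen(JOURNAL, "r");
    int state = 0, delta;
    if (!f) return 0;                       /* no journal: empty state */
    while (fscanf(f, "%d", &delta) == 1)
        state += delta;
    fclose(f);
    return state;
}
```

The disks only have to keep up with the sequential appends, not with random reads, which is what makes this cheap for a site this size.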


That's why you use intelligent caching.


For a site this small, there's absolutely no reason the entire database shouldn't be cached in RAM. 32GB of RAM costs about $800. That buys you plenty of time to not have to worry about caching, and instead gives you more time to work on interesting features. For a single-person operation (or even a few people), you have to spend your time wisely.


But of course, any sane database will cache as much in RAM as possible anyway. But instead of dying when you run out of memory, it just evicts the least recently used page.


add +10 mhz to the heap


Check out god -- it can restart mongrels when they reach a certain memory size:

http://god.rubyforge.org/

(if that's the issue)



