
I run three separate indices of about 10-20mn documents each, but I'm fairly far off any sort of limit (RAM- and disk-wise I'm at maybe 40%).

I'm confident 100mn is doable with the current code, maybe 0.5bn if I did some additional space optimization. There's some low-hanging fruit that seems very promising: sorted integers are highly compressible, and right now I'm not exploiting that at all.
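To give a rough idea of what I mean (a sketch, not my actual code; the function names are made up): sorted doc IDs can be delta-encoded, with each gap packed as a variable-length integer.

    # Rough sketch of the idea: sorted doc IDs are delta-encoded, then each
    # gap is written as a varint, 7 bits per byte with a continuation bit.

    def encode_doclist(doc_ids):
        """Delta-encode a sorted list of doc IDs and pack the gaps as varints."""
        out = bytearray()
        prev = 0
        for doc_id in doc_ids:
            gap = doc_id - prev
            prev = doc_id
            while gap >= 0x80:
                out.append((gap & 0x7F) | 0x80)
                gap >>= 7
            out.append(gap)
        return bytes(out)

    def decode_doclist(data):
        """Inverse of encode_doclist: read varint gaps and rebuild doc IDs."""
        doc_ids = []
        value, shift, prev = 0, 0, 0
        for byte in data:
            value |= (byte & 0x7F) << shift
            if byte & 0x80:
                shift += 7
            else:
                prev += value
                doc_ids.append(prev)
                value, shift = 0, 0
        return doc_ids

Dense ID ranges mostly collapse into one-byte gaps, which is where the big savings would come from.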

Yes, doclist compression is a must: higher intersection throughput and less bandwidth stress. Are you loading your doclists from persistent storage? What is your current max rps?
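For anyone reading along, by intersection I mean the AND of two terms' doclists; the baseline is a linear merge over sorted doc IDs, something like this (illustrative sketch only):

    # Illustrative sketch of doclist intersection: a linear merge over two
    # sorted doc ID lists.

    def intersect(a, b):
        """Return doc IDs present in both sorted lists a and b."""
        result = []
        i = j = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                result.append(a[i])
                i += 1
                j += 1
            elif a[i] < b[j]:
                i += 1
            else:
                j += 1
        return result

Smaller compressed doclists mean less data streamed per query, which is where the throughput win shows up.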


I'm loading the data off a memory-mapped SSD; trivial queries will probably be answered entirely from memory, although disk-read performance doesn't seem terrible either.
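Roughly what the memory-mapped setup looks like (a sketch; the file name and the offset/length values are placeholders, not my real layout):

    # Sketch of memory-mapped access; "doclists.bin" and the (offset, length)
    # values are placeholders, not the real schema.
    import mmap

    with open("doclists.bin", "rb") as f:
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        # Slicing touches pages on demand; hot ranges stay in the OS page
        # cache, so repeated lookups are served from RAM rather than the SSD.
        offset, length = 0, 4096  # would come from the term dictionary
        raw = buf[offset:offset + length]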

> What is your current max rps?

It depends on the complexity of the request, and repeated retrievals are cached, so I'm not even sure there is a good answer to this.
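The caching itself is nothing fancy; conceptually it's along these lines (hypothetical sketch, not my actual code):

    # Hypothetical sketch of result caching; run_index_lookup is a stand-in
    # for the real retrieval path.
    from functools import lru_cache

    def run_index_lookup(query: str) -> list:
        return []  # placeholder for the actual index lookup

    @lru_cache(maxsize=100_000)
    def retrieve(query: str) -> tuple:
        # Repeated queries with the same string hit the cache and skip the
        # index entirely, which is why a single "max rps" number is fuzzy.
        return tuple(run_index_lookup(query))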
