Hacker News new | past | comments | ask | show | jobs | submit login

While we're all here talking about databases I've got a question for the DBA/well versed software engineers.

What database or database design strategies to be used for an analytics dashboard that can be queries real-time similar to Google Analytics?

The first real scaling issue I've come across (as a junior dev working on a side project) is a table with 38 million rows and rapidly growing.

Short of just adding indexes and more RAM to the problem, is there another way to design such a table to be effectively queried for Counts and Group Bys over a date range?

Iimagine things can't really be pre-comuted too easily due to the different filters?




I currently run analytic queries on a page view dataset around ~600 million rows on postgres. You can start by looking into partitioning by date. Also take a look at your IO on your queries. IO was the biggest bottleneck for me. Make sure you have hardware that is not limiting it.


I'm not sure how much flexibility you have, but have you taken a look at Prometheus (for time series data) + Grafana (for dashboards)

I'm certainly not a DB expert, but this is where a similar need led me a few months ago.


You want an OLAP database, something like Clickhouse.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: