While we're all here talking about databases I've got a question for the DBA/well versed software engineers.
What database or database design strategies should be used for an analytics dashboard that can be queried in real time, similar to Google Analytics?
The first real scaling issue I've come across (as a junior dev working on a side project) is a table with 38 million rows and rapidly growing.
Short of just adding indexes and more RAM to the problem, is there another way to design such a table to be effectively queried for Counts and Group Bys over a date range?
I imagine things can't really be pre-computed too easily due to the different filters?
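For concreteness, a typical dashboard query here looks something like the sketch below (table and column names are made up):

    -- Counts grouped by day over a date range, with one of many
    -- possible per-widget filters applied.
    SELECT date_trunc('day', viewed_at) AS day,
           count(*) AS views
    FROM page_views
    WHERE viewed_at >= '2023-01-01'
      AND viewed_at <  '2023-02-01'
      AND country = 'US'  -- filters vary per dashboard widget
    GROUP BY 1
    ORDER BY 1;

It's queries like this, over arbitrary ranges and filter combinations, that are getting slow as the table grows.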
I currently run analytic queries on a page view dataset of ~600 million rows on Postgres. You can start by looking into partitioning by date. Also take a look at the IO on your queries; IO was the biggest bottleneck for me. Make sure your hardware isn't what's limiting it.
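If it helps, declarative range partitioning (Postgres 10+) looks roughly like this; the table and column names are just illustrative, not from my actual setup:

    -- Parent table, partitioned by range on the timestamp column.
    CREATE TABLE page_views (
        viewed_at timestamptz NOT NULL,
        country   text,
        path      text
    ) PARTITION BY RANGE (viewed_at);

    -- One partition per month. The planner prunes partitions that fall
    -- outside the queried date range, so a one-month dashboard query
    -- only scans one partition instead of the whole table.
    CREATE TABLE page_views_2023_01 PARTITION OF page_views
        FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');
    CREATE TABLE page_views_2023_02 PARTITION OF page_views
        FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');

To see where the IO goes, run the query under EXPLAIN (ANALYZE, BUFFERS); the "shared hit" vs "read" counts per plan node show how much is being pulled from disk rather than cache.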