I am probably one of the junior data scientists you are referring to. But I genuinely want to know the reason for using anything other than gradient-boosted trees for classification on structured data.



It depends on your goal and the nature of the problem. If you need to explain something in the simplest terms, maybe use regularized logistic regression. If you need to make sensitive decisions, maybe a tree would be best because you have a clear sense of the variance in your answers at each node.
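To make the "variance at each node" point concrete, here is a rough sketch of a single split (a one-level tree) on binary labels. The data, the `income` feature, the threshold of 40, and the helper names are all made up for illustration; the per-node variance shown is just the Bernoulli variance p(1 - p) of the labels that land in that node, which is one reasonable reading of the comment, not the only one.

```python
def leaf_summary(labels):
    """Return (n, positive_rate, variance) for the binary labels in one node."""
    n = len(labels)
    p = sum(labels) / n
    # Bernoulli variance of the labels in this node: 0 means a pure node.
    return n, p, p * (1 - p)


def split_on_threshold(rows, feature, threshold):
    """Partition (features, label) rows into left (<= threshold) and right (>)."""
    left = [y for x, y in rows if x[feature] <= threshold]
    right = [y for x, y in rows if x[feature] > threshold]
    return left, right


# Toy, invented data: (features, label). Not from any real dataset.
rows = [
    ({"income": 20}, 0), ({"income": 25}, 0),
    ({"income": 30}, 0), ({"income": 35}, 1),
    ({"income": 60}, 1), ({"income": 70}, 1),
    ({"income": 80}, 1), ({"income": 90}, 1),
]

left, right = split_on_threshold(rows, "income", 40)
for name, labels in (("income <= 40", left), ("income > 40", right)):
    n, p, var = leaf_summary(labels)
    print(f"{name}: n={n}, positive rate={p:.2f}, variance={var:.4f}")
```

Each node reports how many cases it holds, what fraction were positive, and how mixed it is, so anyone reviewing a decision can see how much the answer at that node should be trusted.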

There’s nothing wrong with random forest. It’s a perfectly good model. But when it is someone’s only tool, it implies they don’t know much about the rest of the toolkit, or about how that one particular tool works.

I rarely use anything but linear models, trees and forests fwiw.


Is there a place that tells you: if you have this type of data and want this kind of answer, here's the best algorithm (and why)?




