Thursday, June 20, 2013

Is Simple Model + Small Data the only way to go?

A question that bothers me:

1. simple model + small data: OK, if you have a good understanding of the generative process.
2. complex model + small data: will most likely fail due to overfitting, unless the noise is so small that the system is almost deterministic.
3. simple model + big data: computing the parameter estimates is challenging, although sometimes tractable using techniques like stochastic gradient descent (SGD), but frequentist hypothesis testing almost always fails, because with that much data even a slight oversimplification of the problem shows up as a significant misfit.
4. complex model + big data: computing the parameter estimates is impossible.
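
One way to read the hypothesis-testing worry in point 3 is that, with enough data, a test will reject a simple model even when it is only slightly wrong. Here is a minimal sketch of that reading, under made-up assumptions: the simple model says the mean is exactly 0, while the data actually come from a normal distribution with mean 0.01 and standard deviation 1. The function name `z_test_p_value`, the constant `TRUE_MEAN`, and the sample sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def z_test_p_value(sample, mu0=0.0):
    """Two-sided p-value of a z-test for H0: mean == mu0 (large-sample approximation)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # unbiased sample variance
    z = (mean - mu0) / math.sqrt(var / n)
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_MEAN = 0.01  # assumed tiny departure from the simple model "mean = 0"

small = [rng.gauss(TRUE_MEAN, 1.0) for _ in range(100)]
big = [rng.gauss(TRUE_MEAN, 1.0) for _ in range(1_000_000)]

p_small = z_test_p_value(small)
p_big = z_test_p_value(big)
print(f"n = 100:       p = {p_small:.3g}")
print(f"n = 1,000,000: p = {p_big:.3g}")
```

With a million points the expected z-statistic is around 10, so the test should reject "mean = 0" decisively even though a shift of 0.01 is practically negligible. With 100 points the same shift is usually invisible to the test.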

Conclusion: a good statistician should only work with 1. simple model + small data!?
