A question that bothers me:
1. simple model + small data: OK, if you have good understanding on the generative process.
2. complex model + small data: most likely fail due to overfitting, unless the noise is very very small so that the system is almost deterministic.
3. simple model + big data: calculation of parameter estimate is challenging although sometimes tractable using techniques like SGD, but frequentist hypothesis testing almost always fails because you might be oversimplifying the problem.
4. complex model + big data: calculation of parameter estimate is impossible.
Conclusion: a good statistician should only work with 1. simple model + small data!?
1. simple model + small data: OK, if you have good understanding on the generative process.
2. complex model + small data: most likely fail due to overfitting, unless the noise is very very small so that the system is almost deterministic.
3. simple model + big data: calculation of parameter estimate is challenging although sometimes tractable using techniques like SGD, but frequentist hypothesis testing almost always fails because you might be oversimplifying the problem.
4. complex model + big data: calculation of parameter estimate is impossible.
Conclusion: a good statistician should only work with 1. simple model + small data!?
No comments:
Post a Comment