A question that bothers me:

1. simple model + small data: OK, provided you have a good understanding of the generative process.

2. complex model + small data: most likely fails due to overfitting, unless the noise is so small that the system is almost deterministic.

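3. simple model + big data: computing the parameter estimates is challenging, although sometimes tractable with techniques such as stochastic gradient descent (SGD). Frequentist hypothesis testing, however, almost always fails: a simple model is probably an oversimplification, and with enough data even a tiny misspecification shows up as statistically significant.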

4. complex model + big data: estimating the parameters is computationally infeasible.
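
To make the contrast between cases 1 and 2 concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the true process is taken to be `y = 2x + noise`, the "simple model" is a straight-line least-squares fit, and the "complex model" is a 1-nearest-neighbour predictor that memorises the training set. The helper names (`fit_line`, `fit_1nn`, `compare`) are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Simple model: ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def fit_1nn(xs, ys):
    """Complex model (illustrative): 1-nearest-neighbour, memorises the data."""
    pts = list(zip(xs, ys))
    return lambda x: min(pts, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared prediction error of `model` on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sample(rng, n, noise):
    """Draw n points from the assumed process y = 2x + Gaussian noise."""
    xs = [rng.random() for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def compare(n_train=10, n_test=200, noise=1.0, trials=200, seed=0):
    """Average train/test MSE of both models over repeated small samples."""
    rng = random.Random(seed)
    totals = {"line_train": 0.0, "line_test": 0.0,
              "nn_train": 0.0, "nn_test": 0.0}
    for _ in range(trials):
        xtr, ytr = sample(rng, n_train, noise)
        xte, yte = sample(rng, n_test, noise)
        for name, fit in (("line", fit_line), ("nn", fit_1nn)):
            model = fit(xtr, ytr)
            totals[name + "_train"] += mse(model, xtr, ytr)
            totals[name + "_test"] += mse(model, xte, yte)
    return {k: v / trials for k, v in totals.items()}

if __name__ == "__main__":
    for key, value in compare().items():
        print(f"{key:>10}: {value:.3f}")
```

The memorising model reproduces its ten training points exactly, so its training error is zero, yet it carries the training noise into every prediction on fresh data. The `noise` and `n_train` arguments can be varied to explore how the gap between the two models changes.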

Conclusion: a good statistician should only work with 1. simple model + small data!?
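
Case 3 can be sketched the same way. The example below is only an illustration under stated assumptions: the "simple model" is the null hypothesis that the mean is exactly 0, the data actually have mean 0.01 (a deliberately tiny misspecification), and the standard deviation is treated as known and equal to 1. The estimate is computed in one pass by SGD on the squared loss with step size `1/t`, which coincides with the running sample mean. The function names are hypothetical.

```python
import math
import random

def sgd_mean(stream):
    """One-pass SGD on the loss 0.5 * (x - mu)**2 with step size 1/t.

    With this step size the iterate equals the running sample mean.
    """
    mu, t = 0.0, 0
    for x in stream:
        t += 1
        mu += (x - mu) / t  # gradient step: mu -= (1/t) * (mu - x)
    return mu, t

def z_test(mu_hat, n, sigma=1.0):
    """z statistic and two-sided p-value for H0: mean == 0 (sigma assumed known)."""
    z = mu_hat * math.sqrt(n) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def run(n, true_mean=0.01, seed=0):
    """Stream n Gaussian draws, estimate the mean by SGD, then test H0."""
    rng = random.Random(seed)
    stream = (rng.gauss(true_mean, 1.0) for _ in range(n))
    mu_hat, count = sgd_mean(stream)
    return z_test(mu_hat, count)

if __name__ == "__main__":
    for n in (100, 1_000_000):
        z, p = run(n)
        print(f"n={n:>9,d}  z={z:6.2f}  p={p:.3g}")
```

The estimation step is cheap even for a million observations, which matches the "sometimes tractable" part of case 3. The expected z statistic grows like `0.01 * sqrt(n)`, so at a million observations a deviation of 0.01 from the null is expected to be rejected decisively even though it may be practically irrelevant.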
