Data Mining Bias
Data mining bias, or p-hacking, occurs when researchers test so many hypotheses on the same dataset that some appear statistically significant by chance alone. At a 5% significance level, roughly one in twenty tests of a nonexistent effect will pass, so a large enough search is almost certain to turn up spurious "winners". In automated trading this is a major risk, because a computer can iterate through millions of combinations of indicators and parameters until it finds a curve-fitted strategy.
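To make the effect concrete, the sketch below simulates it under simple assumptions: every "strategy" is pure Gaussian noise with a true mean return of zero, and significance is judged by a plain t-statistic against the usual 1.96 cutoff. The function names, the 1,000 strategies, the 250 trading days, and the 1% daily volatility are illustrative choices, not taken from any real system.

```python
import math
import random
import statistics


def t_stat(returns):
    """t-statistic of the mean return against a null of zero."""
    mean = statistics.fmean(returns)
    sd = statistics.stdev(returns)
    return mean / (sd / math.sqrt(len(returns)))


def simulate_noise_strategies(n_strategies, n_days, seed=0):
    """Build strategies with no real edge; return their t-stats, highest first."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_strategies):
        # Daily returns drawn from noise: the true expected return is zero.
        daily = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        stats.append(t_stat(daily))
    return sorted(stats, reverse=True)


t_stats = simulate_noise_strategies(n_strategies=1000, n_days=250)

# Count how many worthless strategies pass a naive two-sided 5% test.
n_significant = sum(1 for t in t_stats if abs(t) > 1.96)

print(f"best t-stat among noise strategies: {t_stats[0]:.2f}")
print(f"'significant' at the 5% level: {n_significant} of {len(t_stats)}")
```

Because none of the simulated strategies has any edge, every one that passes the test is a false positive, and the best of them is exactly the kind of result a brute-force search would report.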
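This bias creates the illusion of a robust trading system, which then tends to fail on new, unseen data because the patterns it exploits were noise in the historical sample. To guard against it, researchers should hold out a separate test dataset that plays no part in selecting the strategy, and apply multiple-comparison corrections, such as the Bonferroni adjustment, that account for the number of hypotheses tried.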
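Both safeguards can be sketched in a few lines. The example below is a minimal illustration, assuming pure-noise candidate strategies, a simple half-and-half split of the history, and a normal approximation for the critical values; the names `bonferroni_critical_z`, `candidates`, and the sizes used are hypothetical. The best candidate is chosen on the first half only and then re-measured on the untouched second half, and the Bonferroni threshold divides the 5% significance level by the number of strategies tried.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def t_stat(returns):
    """t-statistic of the mean return against a null of zero."""
    return fmean(returns) / (stdev(returns) / math.sqrt(len(returns)))


def bonferroni_critical_z(alpha, n_tests):
    """Two-sided critical value after splitting alpha across n_tests.

    Uses the normal approximation, which is an assumption of this sketch.
    """
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_tests))


rng = random.Random(1)
n_strategies, n_days = 500, 500
split = n_days // 2  # first half for selection, second half held out

# Hypothetical candidates: noise returns with a true mean of zero.
candidates = [
    [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    for _ in range(n_strategies)
]

# Select the winner using in-sample data only.
best = max(candidates, key=lambda r: t_stat(r[:split]))
in_sample_t = t_stat(best[:split])
out_of_sample_t = t_stat(best[split:])  # data the selection never saw

naive_z = bonferroni_critical_z(0.05, 1)
corrected_z = bonferroni_critical_z(0.05, n_strategies)

print(f"in-sample t-stat of the winner:  {in_sample_t:.2f}")
print(f"out-of-sample t-stat of winner:  {out_of_sample_t:.2f}")
print(f"naive 5% threshold:              {naive_z:.2f}")
print(f"Bonferroni threshold ({n_strategies} tests): {corrected_z:.2f}")
```

In runs of this kind the in-sample winner usually clears the naive threshold, while its out-of-sample figure tends to be unremarkable and the corrected threshold is much harder to reach. Bonferroni is conservative; it is used here only because it is the simplest correction to state.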
It is a common trap for those who rely heavily on backtesting without a theoretical rationale for why a strategy should work. Data mining bias is the enemy of genuine discovery.
Awareness and a strict methodology are the most reliable ways to keep it from polluting the research process.