In-Sample Data
In-sample data is the specific historical dataset used to develop, train, and optimize a trading strategy. During this phase, the algorithm is exposed to the data in order to find the parameters that maximize a performance metric such as the Sharpe ratio or net profit.
Because the model is directly adjusted based on this information, the performance results are inherently biased toward these specific observations. In-sample data is where curve-fitting (overfitting) typically occurs, especially when the model has many free parameters relative to the number of observations.
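The upward bias in in-sample results can be illustrated with a small Python sketch. It assumes synthetic zero-mean returns and purely random long/short strategies (both hypothetical, not drawn from any real dataset), so no strategy has a genuine edge, yet picking the best of many trials still tends to produce a flattering in-sample figure.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio with the risk-free rate assumed to be zero."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


rng = random.Random(42)

# Assumption: zero-mean noise returns, so no strategy has a real edge.
market = [rng.gauss(0, 0.01) for _ in range(250)]


def random_strategy_sharpe():
    # Each hypothetical "strategy" is a random sequence of long (+1) / short (-1) positions.
    positions = [rng.choice((-1, 1)) for _ in market]
    return sharpe([p * r for p, r in zip(positions, market)])


trial_sharpes = [random_strategy_sharpe() for _ in range(200)]
best_sharpe = max(trial_sharpes)
average_sharpe = statistics.mean(trial_sharpes)

print(f"average in-sample Sharpe: {average_sharpe:.3f}")
print(f"best in-sample Sharpe:    {best_sharpe:.3f}")
```

The gap between the best and the average trial comes entirely from selection on the same data, which is the sense in which in-sample results are biased.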
To check that the model has learned a genuine relationship rather than noise, in-sample data must be kept separate from the out-of-sample set, which is held back and never used during optimization. Understanding the limitations of in-sample performance is fundamental to avoiding the trap of believing past results guarantee future success.
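A minimal Python sketch of this separation follows. It assumes a chronological 70/30 split, synthetic random-walk prices, and a hypothetical moving-average rule; the function names (split_in_out, strategy_returns, optimize_lookback) are illustrative and not taken from any library.

```python
import math
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (risk-free rate assumed zero); 0.0 if undefined."""
    if len(returns) < 2:
        return 0.0
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def strategy_returns(prices, lookback):
    """Hold long for the next period when the current price is above the
    mean of the previous `lookback` prices; stay flat otherwise."""
    out = []
    for t in range(lookback, len(prices) - 1):
        window = prices[t - lookback:t]
        signal = 1 if prices[t] > sum(window) / lookback else 0
        out.append(signal * (prices[t + 1] / prices[t] - 1))
    return out


def split_in_out(prices, in_fraction=0.7):
    """Chronological split: the earliest part is in-sample, the rest out-of-sample."""
    cut = int(len(prices) * in_fraction)
    return prices[:cut], prices[cut:]


def optimize_lookback(in_sample, candidates):
    """Choose the lookback with the highest Sharpe ratio on in-sample data only."""
    return max(candidates, key=lambda lb: sharpe(strategy_returns(in_sample, lb)))


# Assumption: synthetic random-walk prices, for illustration only.
rng = random.Random(0)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] * math.exp(rng.gauss(0, 0.01)))

in_sample, out_sample = split_in_out(prices)
candidates = [5, 10, 20, 50]

best = optimize_lookback(in_sample, candidates)          # tuned on in-sample data
in_sharpe = sharpe(strategy_returns(in_sample, best))    # biased by the tuning
out_sharpe = sharpe(strategy_returns(out_sample, best))  # untouched data

print(f"best lookback: {best}")
print(f"in-sample Sharpe: {in_sharpe:.3f}, out-of-sample Sharpe: {out_sharpe:.3f}")
```

The out-of-sample figure is only informative if it is not fed back into further tuning; once it influences parameter choices, that data has effectively become in-sample.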
This data serves as the foundation for the initial strategy hypothesis.