Which fitness function should we use when building a strategy? I will answer that question by building thousands of strategies and testing them on out-of-sample data. Let's see what we find.
The question about which fitness function to use was inspired by the YouTube video "Best Fitness Function In Genetic Algorithm" by StatOasis. In summary, StatOasis determined that Annual Return, CAGR, and average profit per trade were the top three performing fitness functions.
I thought those were exciting results and wondered how to perform a similar study. That's what I'm sharing with you today.
I decided to use Build Alpha to help me with this process. Why? Build Alpha has many fitness functions to test, and it's an application designed to build strategies. So, it's a perfect fit for this type of market study.
The Fitness Functions
Build Alpha has 20 fitness functions available, listed below.
- Net Profit & Loss
- Drawdown
- Net Profit vs Drawdown
- Ratio Win vs Loss
- Profit Factor
- Sharpe
- Sortino
- Avg Trade
- Win Percentage
- CPC
- CorrCoef
- CAGR
- T-Test
- SQM
- SQN
- E-Ratio
- K-Ratio
- ExpectancyScore
- PerfectProfitPercentage
- PerfectProfitCorrelation
I won't get into what each of these means at this point. It's not that important until we finish the test!
To test each of these fitness functions, my thought was to have Build Alpha construct tens of thousands of strategies using a single fitness function. Then I would forward test those strategies on the out-of-sample data and keep the top 500, leaving the best 500 strategies for a given market.
Before I dig into my testing, there were some assumptions to be made.
Short Term Strategies Only!
First, all the tests I ran are on daily charts of the futures markets. I did this because the futures markets are all I actively trade, and using daily bars also keeps this process moving. You can imagine how intraday bars would mean more data and a much slower study.
Next, I'm most interested in shorter-term strategies, so I instructed Build Alpha to only build strategies with a maximum of a three-day hold.
Keep these assumptions in mind as we move forward.
What To Track?
Once I started testing, it became apparent I could track two more variables. Sure, why not make this test more complicated!
Build Alpha has a setting to change where the out-of-sample data is located. You can have the out-of-sample data be at the end of your historical data, which is more traditional. Or, it can be placed at the beginning of your historical data.
So this added another dimension to the test. Should the out-of-sample be at the beginning or end of our historical data? An interesting question, don't you think? Let's call this the Out-of-Sample Placement. It's traditional to put the out-of-sample data segment after the in-sample (After). But we can test putting the out-of-sample before the in-sample (Before).
I also decided to go one step further by separating long and short trades. I like this idea as I often view trading the long side vs the short side of any market as two distinct opportunities. We'll call this Direction. So, this will be another dimension to the test.
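To make the two placements concrete, here is a minimal sketch of how the split could look in code. The function name `split_in_out_of_sample`, the 20% out-of-sample fraction, and the one-entry-per-year data are all assumptions for illustration. Build Alpha handles this internally, and the post does not state what fraction was actually used.

```python
from typing import List, Sequence, Tuple


def split_in_out_of_sample(
    bars: Sequence, oos_fraction: float, placement: str
) -> Tuple[List, List]:
    """Return (in_sample, out_of_sample) for the chosen OOS placement."""
    if not 0.0 < oos_fraction < 1.0:
        raise ValueError("oos_fraction must be between 0 and 1")
    n_oos = int(len(bars) * oos_fraction)
    if placement == "After":
        # Traditional layout: in-sample first, out-of-sample at the end.
        cut = len(bars) - n_oos
        return list(bars[:cut]), list(bars[cut:])
    if placement == "Before":
        # Reversed layout: out-of-sample at the start, in-sample after it.
        return list(bars[n_oos:]), list(bars[:n_oos])
    raise ValueError("placement must be 'After' or 'Before'")


# Hypothetical data: one entry per year instead of real daily bars.
years = list(range(2007, 2022))
is_after, oos_after = split_in_out_of_sample(years, 0.2, "After")
is_before, oos_before = split_in_out_of_sample(years, 0.2, "Before")
print("After :", oos_after)
print("Before:", oos_before)
```

With "After", the strategies are built on the older data and validated on the most recent years. With "Before", they are built on the most recent data and validated on the oldest years.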
Here is what we'll be testing.
- Fitness Function
- Out-of-sample placement
- Direction
Each market will have two variables to track outside of our fitness function. Those variables are Direction (Long/Short) and OOS placement (Before or After the in-sample). Say we're testing the ES (E-mini S&P). Below are the four combinations of those two variables we'll be testing.
- Combination ES - Long - OOS/IS
- Combination ES - Long - IS/OOS
- Combination ES - Short - OOS/IS
- Combination ES - Short - IS/OOS
That's four combinations for each market. Remember, each of these four combinations has to be repeated for each fitness function, and there are 20 fitness functions to test! Multiplying 20 fitness functions by our 4 combinations of variables means we'll generate 80 sets of data per market. That's a lot of data!
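The arithmetic can be checked by enumerating the runs. This is only a sketch. The names `runs_for_market` and `MARKETS` are mine, the fitness function names are copied from the list earlier in this post, and the market symbols are the ones listed later in the post.

```python
from itertools import product

# The 20 fitness function names as listed earlier in this post.
FITNESS_FUNCTIONS = [
    "Net Profit & Loss", "Drawdown", "Net Profit vs Drawdown",
    "Ratio Win vs Loss", "Profit Factor", "Sharpe", "Sortino",
    "Avg Trade", "Win Percentage", "CPC", "CorrCoef", "CAGR", "T-Test",
    "SQM", "SQN", "E-Ratio", "K-Ratio", "ExpectancyScore",
    "PerfectProfitPercentage", "PerfectProfitCorrelation",
]
DIRECTIONS = ["Long", "Short"]
OOS_PLACEMENTS = ["Before", "After"]
# The markets tested so far, as listed later in this post.
MARKETS = ["ES", "NQ", "YM", "RT", "NG", "US", "CL",
           "LH", "LC", "S", "C", "GC", "EC"]


def runs_for_market(market):
    """Every (market, fitness function, direction, OOS placement) run."""
    return [
        (market, fitness, direction, placement)
        for fitness, direction, placement in product(
            FITNESS_FUNCTIONS, DIRECTIONS, OOS_PLACEMENTS
        )
    ]


es_runs = runs_for_market("ES")
all_runs = [run for market in MARKETS for run in runs_for_market(market)]
print(len(es_runs))   # 20 fitness functions x 2 directions x 2 placements
print(len(all_runs))  # the same 80 runs repeated for each of the 13 markets
```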
Building Strategies With Build Alpha
I'll configure Build Alpha to generate strategies based on our tracking variables. Build Alpha will use a genetic algorithm to search the space for possible strategies. Its search will employ over 4,675 technical indicators, breadth indicators, price patterns, and seasonality factors to develop strategies over the in-sample data.
The historical data will be from 2007-2021. Once the search has been completed, Build Alpha will save the best strategies based upon the given fitness function. The number of systems will be capped at 500, and some searches on some markets will produce fewer final strategies.
Next, I moved the top strategies to the out-of-sample data and saved their performance. I then summarized these 500 strategies into a single row within a spreadsheet. This row contained the fitness function, the out-of-sample location, the trade direction, and some performance metrics. Here is an example of this spreadsheet.
You can see each row is a summary of a Build Alpha run for a given market. In this case, we're looking at the ES market. You can also see that we're tracking the fitness function, market Direction, and the OOS placement for each market. The final spreadsheet in this study has 1,040 rows of data.
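As a rough sketch of what one summary row involves, the code below collapses a list of out-of-sample strategy results into a single record. The class names, field names, and the choice to average each strategy's net profit to drawdown ratio are assumptions on my part for illustration. The actual spreadsheet was produced from Build Alpha's output, and the three sample results are made-up numbers.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class StrategyResult:
    net_profit: float    # out-of-sample net profit in dollars
    max_drawdown: float  # out-of-sample max drawdown in dollars (positive)


@dataclass
class SummaryRow:
    market: str
    fitness_function: str
    direction: str
    oos_placement: str
    strategy_count: int
    avg_net_profit: float
    avg_profit_vs_drawdown: float


def summarize_run(market: str, fitness_function: str, direction: str,
                  oos_placement: str,
                  results: List[StrategyResult]) -> SummaryRow:
    """Collapse one run (up to 500 strategies) into a single row."""
    profits = [r.net_profit for r in results]
    # Assumption: skip strategies with no drawdown to avoid dividing by zero.
    ratios = [r.net_profit / r.max_drawdown
              for r in results if r.max_drawdown > 0]
    return SummaryRow(
        market=market,
        fitness_function=fitness_function,
        direction=direction,
        oos_placement=oos_placement,
        strategy_count=len(results),
        avg_net_profit=mean(profits),
        avg_profit_vs_drawdown=mean(ratios) if ratios else 0.0,
    )


# Made-up results for illustration only, not data from this study.
sample = [StrategyResult(1000.0, 500.0),
          StrategyResult(-200.0, 400.0),
          StrategyResult(600.0, 300.0)]
row = summarize_run("ES", "CAGR", "Long", "After", sample)
print(row)
```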
Which Markets To Test
Below are the markets I've tested so far. I'm not finished with my study, and I'm still collecting data: ES, NQ, YM, RT, NG, US, CL, LH, LC, S, C, GC, EC
What is the best fitness function?
So, which fitness function is best? We can organize our summary spreadsheet by the highest average net profit or average net profit per drawdown. First, look at the top fitness function based upon the highest average net profit.
Top 5 By Average Net Profit
Fitness Function | Average Net Profit |
---|---|
CorrCoef | $5,711 |
CAGR | $4,956 |
Perfect Profit Percentage | $4,794 |
Perfect Profit Correlation | $3,453 |
Expectancy Score | $3,284 |
Top 5 By Average Net Profit vs Drawdown
Fitness Function | Average Net Profit vs. Drawdown |
---|---|
CAGR | .97 |
CPC | .85 |
CorrCoef | .82 |
Perfect Profit Percentage | .80 |
Sharpe | .80 |
Is this the end-all to fitness functions? Of course not. This is far from a perfect test, but it should provide a good starting point. We can see from the tests so far that CAGR, CorrCoef, and Perfect Profit Percentage seem to do well, as they rank high on both sorting methods.
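For readers who want to reproduce this kind of ranking outside of a spreadsheet, here is a minimal sketch that groups summary rows by one column and sorts by the average of a metric. The function name `rank_by_average`, the dictionary keys, and the four rows are hypothetical placeholders, not numbers from this study.

```python
from collections import defaultdict
from statistics import mean


def rank_by_average(rows, group_key, metric):
    """Average `metric` per `group_key` value, highest average first."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_key]].append(row[metric])
    averages = {key: mean(values) for key, values in grouped.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder rows for illustration only, not results from this study.
rows = [
    {"market": "ES", "fitness": "CAGR", "direction": "Long",
     "oos": "After", "avg_net_profit": 4000.0},
    {"market": "ES", "fitness": "Sharpe", "direction": "Long",
     "oos": "After", "avg_net_profit": 1000.0},
    {"market": "NQ", "fitness": "CAGR", "direction": "Short",
     "oos": "Before", "avg_net_profit": 2000.0},
    {"market": "NQ", "fitness": "Sharpe", "direction": "Short",
     "oos": "Before", "avg_net_profit": 3000.0},
]

ranking = rank_by_average(rows, "fitness", "avg_net_profit")
for name, average in ranking:
    print(f"{name}: ${average:,.0f}")
```

The same function works for the other two dimensions by passing `"direction"` or `"oos"` as the `group_key`.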
Below are the definitions of these fitness functions found in Build Alpha.
CAGR: Compound Annual Growth Rate. Simply defined as (((End/Begin)^(1/years)) - 1) * 100.
CorrCoef: Correlation Coefficient measures how linear an equity curve is. A choppy equity curve will have a low score. If the equity curve progresses with a slope close to 1, the value shall be close to 1.
Perfect Profit Percentage: This assumes perfect forward information. Find the absolute BEST equity curve, then score strategies based upon this ideal (unattainable) equity curve.
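The first two definitions are easy to sketch in code. The CAGR function follows the formula quoted above. The correlation function is an assumption: it computes the Pearson correlation between the equity curve and the bar index, which is one common way to measure how linear an equity curve is, but Build Alpha's exact calculation is not given here. Both function names and the sample curves are hypothetical.

```python
from math import sqrt
from typing import Sequence


def cagr_percent(begin_equity: float, end_equity: float,
                 years: float) -> float:
    """CAGR as a percentage: (((End/Begin)^(1/years)) - 1) * 100."""
    if begin_equity <= 0 or years <= 0:
        raise ValueError("begin_equity and years must be positive")
    return ((end_equity / begin_equity) ** (1.0 / years) - 1.0) * 100.0


def equity_curve_corrcoef(equity: Sequence[float]) -> float:
    """Pearson correlation between equity values and their bar index.

    Assumption: this is one plausible reading of "how linear an equity
    curve is", not necessarily Build Alpha's implementation.
    """
    n = len(equity)
    if n < 2:
        raise ValueError("need at least two equity points")
    mean_x = (n - 1) / 2
    mean_y = sum(equity) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(equity))
    var_x = sum((x - mean_x) ** 2 for x in range(n))
    var_y = sum((y - mean_y) ** 2 for y in equity)
    if var_y == 0:
        return 0.0  # a flat curve has no trend to correlate with
    return cov / sqrt(var_x * var_y)


# Made-up equity curves for illustration only.
smooth = [100.0, 110.0, 120.0, 130.0]
choppy = [100.0, 130.0, 90.0, 120.0, 95.0]
print(round(cagr_percent(100_000, 200_000, 5), 2))  # doubling in 5 years
print(round(equity_curve_corrcoef(smooth), 3))
print(round(equity_curve_corrcoef(choppy), 3))
```

A steadily rising curve scores close to 1, while the choppy curve scores much lower, which matches the intent of the CorrCoef definition.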
What is the Best Direction?
Here is a summary across all the markets tested. The Long side wins.
Direction | Average Net Profit | Average Net Profit vs. Drawdown |
---|---|---|
Long Only | $3,611 | .80 |
Short Only | $707 | .66 |
Overall, it was easier to build long strategies. I have a feeling that the stock index markets biased these results. Those markets have a strong upward bias. Thus, it would be necessary to look at each market individually.
What is the best out-of-sample placement?
OOS Placement | Average Net Profit | Average Net Profit vs. Drawdown |
---|---|---|
After | $2,521 | .79 |
Before | $1,744 | .66 |
These values are similar. Once again, I think it would be necessary to look at each market individually to get anything really meaningful.
Looking At Each Market Individually
Because I have all this information within a spreadsheet, I can use pivot tables to quickly organize this just about any way I wish. Above, we looked at all markets together, but I can view each market individually.
For example, for the Nasdaq market, I can see the best settings are:
- Go long
- Use fitness function Perfect Profit Percentage
- Put the OOS segment after the in-sample.
Next, looking at the Natural Gas market, I can see the best settings are:
- Go short
- Use fitness function Perfect Profit Percentage
- Put the OOS segment before the in-sample.
As you can see, each market can be different.
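The per-market view I get from a pivot table can also be sketched in a few lines of code. The function name `best_setting_per_market`, the dictionary keys, and every number below are hypothetical placeholders. Only the winning combinations for NQ and NG mirror what I described above.

```python
def best_setting_per_market(rows, metric):
    """For each market, keep the summary row with the highest `metric`."""
    best = {}
    for row in rows:
        current = best.get(row["market"])
        if current is None or row[metric] > current[metric]:
            best[row["market"]] = row
    return best


# Placeholder numbers for illustration only, not results from this study.
rows = [
    {"market": "NQ", "fitness": "PerfectProfitPercentage",
     "direction": "Long", "oos": "After", "avg_net_profit": 900.0},
    {"market": "NQ", "fitness": "Sharpe",
     "direction": "Short", "oos": "Before", "avg_net_profit": 100.0},
    {"market": "NG", "fitness": "PerfectProfitPercentage",
     "direction": "Short", "oos": "Before", "avg_net_profit": 700.0},
    {"market": "NG", "fitness": "CAGR",
     "direction": "Long", "oos": "After", "avg_net_profit": 200.0},
]

best = best_setting_per_market(rows, "avg_net_profit")
for market, row in best.items():
    print(market, row["direction"], row["fitness"], row["oos"])
```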
I've now collected this information and can use it as a starting point for building strategies. Again, is this the final word on which fitness function, market direction, or out-of-sample placement to use? Nope. This is just something I've been working on as a guide to help provide a starting point.
As you can imagine, this research has taken a lot of time to put together, and I used two great tools to help automate collecting and organizing the data. Both helped me keep my sanity. First was Microsoft's Power Automate, which I used to automate Build Alpha to generate the results. Then I used Power Query (part of Excel) to transform and merge all the data into a single usable spreadsheet. I highly recommend you review these tools to see if you can incorporate them into your research.
Interesting. I've seen a similar study conducted on the StrategyQuantX blog. I have run similar studies myself. I found no real correlation between IS and OOS results based on fitness function.
However, I did find another factor that showed some value in predicting IS and OOS results. This factor is fairly obvious but difficult to make use of, in practice.
Hi Curtis. Thanks for the comment. IS/OOS placement does seem to be important in some markets. Not sure if that means anything or if it will be helpful in practice. Will be watching for that. Another thing I would like to test is different fitness functions during various regimes, such as bull/bear or high/low volatility. Care to share your X factor?
I think the way I did it was a bit different than how you tested. I compared the top 5 ranked IS strategies with the top 5 ranked OS strategies using Build Alpha. Basically, I was looking for cases where the top 5 ranked IS were also the top 5 ranked OS. The only factor that seemed to produce a better correlation of IS to OS was shorter OS times. Time was the factor. The beginning OS periods performed better, while longer periods tended to degrade quickly. The correlation was not strong but noticeable. It is not easy to make use of that knowledge by itself. However, it can be observed and, with some additional work, might prove useful.
Thanks for sharing that. I, too, am interested in seeing how well in-sample vs. out-of-sample holds up. Another item to add to the ever-growing to-do list.
Hi Jeff, thanks for this post. Really helpful.
Hi Andrea. You're welcome.
Hi Jeff, Thanks for sharing your results. I’m interested in your comment about automating Build Alpha with Power Automate. Could you share any links or resources to learn how to do that, please? So far, I understand that with applications/software outside the Microsoft ecosystem (i.e. Build Alpha) a REST API and coding skills are necessary. Is that the way, or is it simpler than that?
Hi Raul. Unfortunately, I did not have any substantial resources. It was a lot of trial and error on my part. I did heavily use these two YouTube videos. No API needed. It's a graphical interface for building an automated workflow.
https://youtu.be/dDO4Y4aDYXw
https://youtu.be/L4BuUzccLpo