jomega
STS(v8.0) AKPC.002
The Horizon Effect Can Cause Incorrect Evaluations
A continuation of the discussion started here:
jomega's Blog • Strategic Test Suite (STS): The EPD file's best and alternate best moves. • lichess.org
This is another interesting position in which Stockfish 14 failed to find the best move in the 3-second (depth 24) test. Stockfish does play the intended best move ...f5 on move 2, and that is a main idea, but by then White will have transferred his Queen Knight via e4 to f2. See the study comments.
STS category 8 is "Advancement of f/g/h Pawns". I assume AKPC abbreviates "Advancement of Kingside Pawns", though I don't know what the "C" stands for.
The Human Perspective
In the initial position, White is a Knight up for 3 pawns. White is threatening 2.Nf4. Black has connected passed pawns (the d-pawn and c-pawn), pressure on the e-file, future pressure on the long diagonal leading to a1, the possibility of the lever ...a5 with pressure by the Rook on the b-file, and control of central squares. This is more than enough compensation for Black. The move 1...f5 would increase the scope of the Bishop and pin the White Queen Knight. It would attack the d4-square and so prepare for ...d4. The advancement of the connected passed pawns is a key idea.
Stockfish's Perspective
Looking at the top 5 variations Stockfish gives, and the best moves and alternate best moves in the EPD, we find that at a depth of 23, they all have about the same score from Stockfish. At a depth of 38, ...d4 and ...f5 are about one point better. The move ...d4 also involves ...f5. With 1...f5, Stockfish is seeing an exchange of Queens in that line. However, the move 1...d4 has quirks starting at Black's 4th move.
When I was analyzing the position after 4.Qg3 in the main line, Stockfish was set to display the top 3 variations. None of those had 4...g5, and I was wondering why not. Usually, Stockfish will correct me immediately with a variation showing why my alternatives are not better. This time, however, Stockfish gave 4.h4 at -0.7 and had worse scores for the other moves. Yet on making 4.h4, Stockfish immediately said -2.1! This is the horizon effect. Whatever bad thing was about to happen to White was just one half-move over the horizon at the previous move. Following the Stockfish moves from there, we end up with -3.0/23 in the main line.
Stockfish did not pick 1...d4 because of that main line, though; that would be 49 half-moves from the initial position. Stockfish picked 1...d4 over 1...f5 because, at depth 23 from the start, that move scored 0.2 better.
What this shows is that you have to look at Stockfish's variations with a critical eye. Stockfish can overlook good moves, and suggest moves that are not good.
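The jump from -0.7 to -2.1 described above can be reproduced in miniature. The following is a toy sketch, not Stockfish's actual search: a hand-built tree with made-up static scores (chosen only to mirror the numbers in the text) and a plain fixed-depth minimax. The names Node and search, and the move labels other than h4, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Static score from White's point of view, in pawns (negative favors Black).
    static_eval: float
    # Maps a move label to the position reached after that move.
    children: dict = field(default_factory=dict)


def search(node, depth, white_to_move):
    """Plain fixed-depth minimax: at depth 0 it trusts the static score."""
    if depth == 0 or not node.children:
        return node.static_eval
    scores = [
        search(child, depth - 1, not white_to_move)
        for child in node.children.values()
    ]
    return max(scores) if white_to_move else min(scores)


# The bad news for White sits three half-moves below the root.
loss = Node(-2.1)
quiet = Node(-0.7, {"forced": loss})
after_h4 = Node(-0.7, {"reply": quiet})
other = Node(-1.0)
root = Node(-0.7, {"h4": after_h4, "other": other})

# Searching 2 half-moves from the root stops just short of the loss.
before = search(root, 2, True)
# After h4 is played, the same 2 half-moves now reach it.
after = search(after_h4, 2, False)

print(before, after)
```

With the same search depth, the score for h4 changes only because the move has been made and the horizon has shifted one half-move further. Real engines use extensions and pruning that make this far less crude, but the principle is the same.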
EPD Changes Needed
The scores in the EPD should be f5=10, d4=9, a5=6. I find the other moves entirely against the spirit of the position's intent, and they are not superior moves either. The main variation after 1...a5 ends with clear compensation for Black for being the exchange down. Although it is a different idea from the one intended, I think it should be an alternate best move.
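As a small illustration of how the proposed scores would be used, here is a sketch that parses a score list written as "move=points, move=points, ..." and looks up the points for an engine's chosen move. It assumes the scores are stored as one comma-separated string in the EPD record; the function names parse_scores and points_for are hypothetical, and this is not the STS-Rating code.

```python
def parse_scores(text: str) -> dict[str, int]:
    """Turn 'f5=10, d4=9, a5=6' into {'f5': 10, 'd4': 9, 'a5': 6}."""
    scores = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the LAST '=' so a promotion such as 'e8=Q=10' still parses.
        move, _, points = item.rpartition("=")
        scores[move] = int(points)
    return scores


def points_for(text: str, move: str) -> int:
    """Points awarded for a move; moves not listed score 0 (an assumption)."""
    return parse_scores(text).get(move, 0)


proposed = "f5=10, d4=9, a5=6"
print(parse_scores(proposed))
print(points_for(proposed, "d4"))
```

Under these scores, an engine that answers 1...d4 would still earn 9 of the 10 points for this position, and any unlisted move would earn none.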
Links
- The Strategic Test Suite (STS) home page.
https://sites.google.com/site/strategictestsuite/
- The STS-rating code.
https://github.com/fsmosca/STS-Rating