jomega
Strategic Test Suite (STS): STS (v4.0) Square Vacancy.067
One of the STS positions that Stockfish missed. A continuation of the discussion started here:
jomega's Blog • Strategic Test Suite (STS): The EPD file's best and alternate best moves. • lichess.org
Another interesting position for which Stockfish 14 failed to find the best move in the 3 second (depth 16) test is 'Square Vacancy.067'.
The Human Perspective
White is threatening to take the Black d-pawn and draw. Black would rather have the Queens off the board since his own King is exposed. Hence, of the two ways to defend the d-pawn, 1...Qg3 and 1...Qe7, a human would prefer the first. Besides guarding the d-pawn, 1...Qg3 threatens 2...Qg1+ 3.Kb2 Qd4+ and the Queens come off.
Stockfish's Perspective
When Stockfish is allowed to search deeper, the evaluations of these two moves get closer, with Stockfish still preferring 1...Qe7. So long as Stockfish sees no forced repetition of moves, or worse, the exposure of the Black King does not affect its evaluation. Stockfish is not programmed to give the best human move or recommendation; it is programmed and tuned to maximize its own Elo. So, as I say in the study, this comes down to whether you want the engine to play the best move for the engine or to suggest the best move for a human! Both 1...Qe7 and 1...Qg3 win.
However, the alternate "best moves" in the EPD, 1...Re8 and 1...Qc3, are both suspect. I think the former should be removed as an alternate, and the latter should have its "score" greatly reduced.
1...Re8 drops the d3-pawn and looks very much like a draw.
1...Qc3 gives White some counter-play, and in the most likely variation we end up in a difficult Queen and pawn endgame.
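For anyone who wants to try such a revision on their own copy of the EPD file, here is a minimal Python sketch. It assumes the per-move points are stored in a c0 field of the form "move=points, move=points, ..."; check this against your copy of the file. The helper names parse_move_scores and revise_scores are hypothetical, and the record below is a placeholder: the position is elided and the point values are made up for illustration, not taken from the real EPD entry.

```python
import re


def parse_move_scores(epd_line):
    """Return {move: points} from a c0 "move=points, ..." field (assumed layout)."""
    match = re.search(r'c0\s+"([^"]*)"', epd_line)
    if match is None:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        # rpartition keeps promotion moves such as e8=Q intact
        move, _, points = item.strip().rpartition("=")
        if move and points:
            scores[move] = int(points)
    return scores


def revise_scores(scores, remove=(), reduce_to=None):
    """Drop the moves in `remove` and cap the moves in `reduce_to` at the given points."""
    revised = {m: p for m, p in scores.items() if m not in remove}
    for move, points in (reduce_to or {}).items():
        if move in revised:
            revised[move] = min(revised[move], points)
    return revised


# Placeholder record: position elided, point values are illustrative only.
example = '<position> bm Qg3; c0 "Qg3=10, Qe7=9, Re8=8, Qc3=7";'

scores = parse_move_scores(example)
# Remove 1...Re8 as an alternate and greatly reduce the points for 1...Qc3.
revised = revise_scores(scores, remove={"Re8"}, reduce_to={"Qc3": 2})
print(revised)
```

The cap of 2 points for 1...Qc3 is an arbitrary choice for the sketch; how far to reduce it is a judgment call.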
Links
- The Strategic Test Suite (STS) home page.
https://sites.google.com/site/strategictestsuite/
- The STS-rating code.
https://github.com/fsmosca/STS-Rating