I think that the depth at which candidate moves emerge should lead the discussion of using engines to calculate brilliancy.
Example: you run SF with MultiPV=3 and you get 3 candidate moves. Let's use letters rather than actual chess moves.
So you get at depth 1:
- A +2
- B +1.5
- C -2
but at depth 2:
- A +1
- D +0.8
- C 0
and when you get to depth 30:
- C +2
- A +1
- Z -1
Clearly that means that move A was considered good for a while (and still is), B was considered good and then discarded, D was briefly considered, C switched from a losing move to the best move, and Z only surfaces at depth 30.
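If it helps, here is a minimal sketch of how these per-depth candidate evaluations could be collected, assuming a local Stockfish binary and the python-chess library; `ENGINE_PATH` and the helper name are my own placeholders, not anything from an existing tool.

```python
import chess
import chess.engine

ENGINE_PATH = "stockfish"  # hypothetical path to a local Stockfish binary

def candidate_evals_by_depth(fen, max_depth=30, pv=3):
    """Collect the top-`pv` candidate moves and their evals at each depth."""
    history = {}  # depth -> list of (uci_move, eval_in_pawns)
    board = chess.Board(fen)
    engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
    try:
        with engine.analysis(board, chess.engine.Limit(depth=max_depth),
                             multipv=pv) as analysis:
            for info in analysis:
                # The engine also emits bookkeeping lines; keep only those
                # carrying a principal variation and a score.
                if "pv" not in info or "score" not in info:
                    continue
                depth = info.get("depth", 0)
                move = info["pv"][0].uci()
                # Relative score, in pawns, from the side to move.
                pawns = info["score"].relative.score(mate_score=10000) / 100
                history.setdefault(depth, []).append((move, pawns))
    finally:
        engine.quit()
    return history
```

Each entry of the returned dictionary corresponds to one depth snapshot as in the example above, e.g. `history[30]` would hold the C/A/Z triple.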
I believe that, in a position created by a previous move you are analysing, top candidate moves that only become effective at greater depth make that previous move interesting.
Based on your definition of "only move", to which I do not particularly subscribe, this move would have to be considered losing at lower depth in order to be interesting.
However, if multiple top engine moves share this characteristic, either the position is objectively losing no matter what you do because of a fundamental positional flaw, or you have a "super brilliant" move on your hands, where anything the opponent tries has a deep refutation - like those hard chess puzzles.
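As a rough illustration of that split, one could run a heuristic like the following over the history collected above; the thresholds and the "late bloomer" notion are assumptions of mine, not an established method.

```python
def classify_deep_resources(history, good=0.5, bad=-0.5):
    """Rough heuristic over the per-depth history collected above.

    `good`/`bad` are arbitrary thresholds in pawns. A "late bloomer" is a
    move that looked losing at the shallowest depth (or was not even a
    candidate there) but is winning at the deepest one, like move C above.
    """
    shallow = dict(history[min(history)])  # shallowest snapshot
    deep = dict(history[max(history)])     # deepest snapshot
    late_bloomers = [m for m, s in deep.items()
                     if s >= good and shallow.get(m, bad) <= bad]
    if late_bloomers:
        return f"possible deep resource(s): {late_bloomers}"
    if all(s <= bad for s in deep.values()):
        return "likely a fundamental positional flaw"
    return "nothing unusual in this depth range"
```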
Yet I've briefly tested this approach and things are not that clear cut. For example, a move might appear winning at low depth, then be refuted by a deeper response, then turn winning again thanks to an even deeper counter-response. In fact, once you think about it, this kind of move might feel even more "brilliant", as it plays with one's emotions.
I was thinking at one point of trying to catalogue this as "shapes", where an underscore means a low eval, a dash a medium eval, and a "superscore" (my made-up term for an overline, a top horizontal line) a high eval, at a fixed length of 3 to 5 characters. That gives a maximum of 3^5 = 243 possible options, but is it representative? Especially for the last case...
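To make that concrete, here is a minimal sketch of such a shape encoding, with arbitrary eval thresholds of my own choosing; the last case (winning, refuted, winning again) comes out as an overline-underscore-overline pattern.

```python
def eval_shape(evals, length=5, low=-0.5, high=0.5):
    """Encode an eval trajectory (in pawns, shallow to deep) as a "shape".

    '_' = low eval, '-' = medium, '‾' (overline) = high. The trajectory is
    downsampled to `length` symbols; thresholds are arbitrary assumptions.
    """
    step = max(1, len(evals) // length)
    sampled = evals[::step][:length]

    def symbol(e):
        if e <= low:
            return "_"
        if e >= high:
            return "‾"
        return "-"

    return "".join(symbol(e) for e in sampled)

# The "plays with your emotions" case: winning, refuted, winning again.
print(eval_shape([1.8, -0.7, -0.6, 1.2, 2.0]))  # -> '‾__‾‾'
```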
So I don't have a clear solution, but I do feel that the evolution of candidate-move evaluations with engine depth should play a role. I don't know what that role is, because I don't want to use active engine eval in LT, but I feel it might be important.
Therefore, if you ever plan a data-driven statistical model approach, take this metric into consideration, perhaps.