Kevin Pacey wrote on Fri, Sep 16, 2016 10:50 PM EDT:

In looking up the latest poll results online (on Wikipedia) for the US election, I noticed a reference to margin of error, and also noticed that it was naturally bigger for smaller sample sizes. See the following link:

https://en.wikipedia.org/wiki/Margin_of_error

It can be seen that, at the 95% confidence level, a sample size of just 96 has a margin of error of 10%, while a sample size of 384 has a margin of error of only 5%. It struck me that for a study of piece values using computers, it might be vital to have a fairly large sample of games in which identical engines play each other, in order to be reasonably confident of any conclusions drawn about the relative values of different pieces. Perhaps the minimum ought to be a sample size of 384 games. In concluding such a study, it might be noted how the margin of error affects the estimate of a piece's value, if it is at all significant (e.g. "plus or minus 0.125 pawns" [however that might be calculated] could be stated alongside a piece's value, once that value has been calculated from the win/loss percentages for that piece).
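As a rough check on the figures quoted above, here is a minimal Python sketch of the usual worst-case margin of error formula at 95% confidence. The function name `margin_of_error` is made up for illustration, and the sketch assumes each game is a simple win/loss outcome with a 50% expected score; draws would, if anything, make the true margin somewhat smaller.

```python
import math

def margin_of_error(n_games, p=0.5, z=1.96):
    """Worst-case margin of error for a proportion estimated from n_games.

    Assumptions (not from the original comment): p = 0.5 gives the largest
    possible margin, and z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n_games)

# Sample sizes mentioned in the discussion: 96, 384, and 1,000 games.
for n in (96, 384, 1000):
    print(f"{n:5d} games: +/- {margin_of_error(n) * 100:.1f}%")
```

Because the margin shrinks only with the square root of the sample size, halving it from 10% to 5% takes four times as many games (96 to 384).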

I recall that GM Larry Kaufman, in his study finding that in chess a knight is worth exactly as much as a bishop, used a huge number of games (1,000,000+?) between skilled humans to draw his conclusion with a high degree of statistical confidence. This might have been a flawed study all the same; it seems from chess books that most human chess authorities agree that a knight is a little worse than a bishop on average. My own guess is that human vs. human games wouldn't necessarily produce the same statistical result as an engine vs. identical engine study with a similarly huge number of games, all starting from an opening-stage setup where a single bishop is pitted against a single knight. At the least, that's because different people value bishops & knights (and the circumstances under which they can be exchanged equitably) slightly differently. This affects their decisions, and in turn the results of the individual games counted in a study, in a more chaotic way than with engines. That's not to mention all-too-human blunders or lesser mistakes, although these might tend to even out more than discrepancies caused by different players valuing minor pieces differently. I should note, though, that I own one 1998 middlegame book that is quite content to quote human vs. human database statistics showing results in favour of 2 bishops over 2 knights (or knight + bishop) a big majority of the time, under varying conditions of otherwise even material, much as Kaufman found.

P.S.: In digging back through old Comments, I see that H.G. (if no one else) has basically taken into account much (if not all) of what I posted above, and has run computer studies with a minimum of 1,000 games in at least some cases, e.g. Amazon vs. Q + N (I don't know the sample size in the case of B vs. N), when calculating piece values via piece vs. piece(s) battles. Even assuming the engine and methodology used are strong, I still can't square some of the results of computer studies with my intuition, to my bewilderment. A personal anecdote that's possibly amusing: at one stage when musing about margin of error in regard to piece value estimates, I thought for a second that if a knight (as a piece of lower or equal value to a bishop) were set to 3.0 and the margin were 10%, then the margin of error for a study (of 96 games) comparing it to a bishop might be 3.0 x 0.1 = 0.3 pawns. In similar fashion, I thought that if an archbishop were set to 8.0 with a margin of 10%, then the margin of error for a study (of 96 games) comparing it to a queen might be 8.0 x 0.1 = 0.8 pawns. I soon saw no justification for tying margin of error to the assigned numerical value of a piece, and realized it must be incorrect math. :)

Another way to try to convert margin of error from a raw percentage into a fraction of a pawn could first involve considering the numerical value of a minimum decisive advantage (i.e. an engine should win 100% of all games in a study with this much advantage). In chess, that's about 1.333 pawns according to the old book Point Count Chess; if we accept that value (for the sake of argument), then a margin of error of 10% (i.e. for a study with 96 games) could be converted to 1.333 x 0.1 = plus or minus 0.133 pawns' worth of margin of error. This may be just more incorrect math, but oddly enough I don't see how to easily refute it at the moment, at least with my feeble/rusty math skills.
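The arithmetic in the paragraph above can be written out as a short Python sketch. It only reproduces the tentative conversion described there (margin of error multiplied by the decisive advantage), which is not an established method. The name `margin_in_pawns` is hypothetical, the 1.333 figure is the one attributed to Point Count Chess above, and the 95% confidence level with a worst-case 50% score is an assumption.

```python
import math

# Minimum decisive advantage in pawns, as quoted above from Point Count Chess.
DECISIVE_ADVANTAGE_PAWNS = 1.333

def margin_in_pawns(n_games, decisive=DECISIVE_ADVANTAGE_PAWNS, z=1.96):
    """Scale the worst-case margin of error by the decisive advantage.

    Assumption: the margin (as a fraction of games) converts linearly into
    pawns, exactly as in the tentative calculation described in the text.
    """
    margin = z * math.sqrt(0.25 / n_games)  # worst case, p = 0.5
    return decisive * margin

for n in (96, 384, 1000):
    print(f"{n:5d} games: +/- {margin_in_pawns(n):.3f} pawns")
```

Under these assumptions, a 96-game study comes out at roughly plus or minus 0.133 pawns, matching the figure above, and larger samples shrink that figure in proportion to the square root of the number of games.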

