The point is that with enough money, cajoling, and threatening (I assure you I have neither the power, the wealth, nor the desire to make this happen), I could convince twenty people (10 of them rated 800 and 10 rated 1000) to "play" a round in a tournament I've set up and post an identical score. It would be utterly futile and tell us nothing at all about the rating system, but it is possible.
Even without this, someone could design a course on which it is probable that competitive players will post the exact same score, say, 18 holes of 10 feet each. Again, stupid, but possible, i.e. not impossible.
Further, in a real-world tournament, on a proper course with competitive players, I do not expect ever to see this occur in my lifetime, nor before the heat death of the universe, but it's not impossible.
Also, your understanding of 99.999% is wrong. Regardless of all the real-world possibilities for error, that percentage leaves open the possibility of an alternative result. In a practical sense you should feel supremely confident that you are the father, but it is not a certainty. By DEFINITION.
I suggest you don't say 'by definition' and ignore the definition.
So, let's start with your "suggestion."
From Webster's:
definition (n.): a statement expressing the essential nature of something;
root word,
define (v., tr.): to determine or identify the essential qualities or meaning of.
And for complete clarification --
essential (adj.): of, relating to, or constituting essence; INHERENT
and
essence (n.): the permanent as contrasted with the accidental element of being; OR the individual, real, or ultimate nature of a thing, especially as opposed to its existence.
So it seems to me that the definition of something has to do with what is real -- not what is theoretical or hypothetical. No need to lecture me on the definition of "definition." Neither my education nor my knowledge of English is in question here.
Recall that I've never stated that you couldn't plug those numbers into the formula and have it "spit out" a result. I've stated that those results, the ones NOT IN THE REAL WORLD, would not be valid or reliable statistics. Hence, those in hypothesy (sic) have no meaning whatsoever. When I said certain things were impossible, I was speaking in the context of this thread -- drawing conclusions about the PDGA ratings system. To me that only includes valid, reliable conclusions.
Now, one at a time.
The part in RED -- renders your experiment unreliable: you've manipulated the subjects of the experiment.
The part in PURPLE -- is manipulating the conditions or environment of the experiment. That makes it not valid.
The part in LIME GREEN seems to echo what I've been saying: it cannot happen in reality. Like the definitions of "definition" above! (Sans your qualifier.)
The part in BLUE (again, sans your qualifier) is about the mathematics of the calculation, NOT the reality of the outcome. There exists no other possibility, in reality, other than YOU ARE THE FATHER. The fact that it says 99.999% instead of 100.000% is about how those results are calculated, and that only.
Just for fun I figured we could consider a set of Bernoulli trials.
Assume:
All holes played by each player are independent (a big assumption)
Each player's score is independent of all others (another big assumption)
There is an 18-hole course where the following are true (probably an OK assumption, but I bet there's good data somewhere to be used instead):
- 800 rated players score par 20% of the time (and mostly worse the rest)
- 1000 rated players score par 20% of the time (and mostly better the rest)
The tournament consists of one round.
We only count the 'same score' as all players scoring par every hole (another big assumption)
Then, using a Binomial distribution with 360 (20 x 18) trials and success probability 20%, the probability of all trials being successful is 2.35 x 10^(-252), or 1 in 4.26 x 10^251. For a sense of scale, the number of particles in the observable universe has been estimated at 3.28 x 10^80. So if every particle in the observable universe had its own observable universe, and all the particles in those universes had their own observable universes, and each of those particles ran 12 billion of these tournaments, we should expect one of them to produce this result. So, definitely possible.
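As a sanity check on the arithmetic, here's a minimal sketch under the same assumptions (20 players, 18 holes, an assumed 20% chance of par on every hole, all trials independent):

```python
# All-successes case of a Binomial(n, p): probability is simply p**n.
N_PLAYERS = 20
HOLES = 18
P_PAR = 0.20  # assumed per-hole probability of par for every player

trials = N_PLAYERS * HOLES          # 360 independent Bernoulli trials
p_all_par = P_PAR ** trials         # P(all 360 trials succeed)

print(f"P(everyone pars every hole) = {p_all_par:.3e}")  # ~2.35e-252
print(f"odds: 1 in {1 / p_all_par:.3e}")                 # ~4.26e+251
```

Multiplying the odds back out: (3.28 x 10^80)^3 universes-of-universes of particles, each running 1.2 x 10^10 tournaments, gives about 4.2 x 10^251 tournaments, so roughly one expected occurrence.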
Since that is absolutely rife with assumptions, none of which are consistent with how the formula was developed, I won't even try to address them all. Suffice it to say that all the plugging into a computer in the world cannot make that scenario occur. Why should this ratings system, developed for real-life application, have to be consistent with a purely hypothetical, not-real-life occurrence? I say it shouldn't. There is no such thing as a REAL course where better players and worse players (BOTH GROUPS) will score par 20% of the time. That's not possible. In fact, I think this is where you missed what I meant when I said "by definition." I was talking about players -- the better players (i.e., those rated 1000) and the worse players (i.e., those rated 800). Perhaps you disagree that the 1000-rated players are better players than the 800-rated players, or that 800-rated players are worse players than 1000-rated players; if so, you are incorrect. And I know the definition of "better" and "worse."
Of course not. My point about it being mathematically possible was to point out the flaw/bug/anomaly (whatever you want to call it) that exists in the ratings system: as long as divisions are rated separately, identical scores can yield different round ratings, and will always favor the division with higher-rated players. As for inflation, bubbling up, whatever you want to call it, the concept is easy to grasp: as long as the higher-rated players are always rated among themselves (DGPT events, for example), combined with a ratings system that knows no limits, well... the rich will get richer.
I don't see that as a flaw in the system. They are better players. And that's true for anyone who does the same. It works for everybody.
It just doesn't follow that the ratings would necessarily inflate. They might, but I don't see any guaranteed mechanism by which they would. Identical scores yielding different round ratings doesn't necessarily favor the division with higher-rated players. I'm sure that stat could be pulled: if we look at round ratings for fields with higher-rated players and round ratings for fields with lower-rated players, do identical scores on identical courses average higher when posted by the higher-rated field? (Averaging over enough separate events should minimize the effect of course conditions.)
I can do this analysis quite easily if there's a good source for the data and it's easy to confirm that identical courses are being played. Does anyone know how/where it's available in an easy-to-consume form? (The format on the PDGA website isn't ideal: I don't think it's easy to identify identical course layouts, and writing a script to scrape it just doesn't seem appealing right now.)
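Once the data exists, the comparison itself is simple. Here's a sketch of the shape of it; the course names, field labels, scores, and ratings below are all made up for illustration, since the real data source is exactly what's in question:

```python
from statistics import mean

# Hypothetical records: (course, layout, field, score, round_rating),
# where "field" marks whether the division was higher- or lower-rated.
rounds = [
    ("Maple Hill", "Gold", "high", 54, 1012),
    ("Maple Hill", "Gold", "low",  54,  998),
    ("Maple Hill", "Gold", "high", 58,  975),
    ("Maple Hill", "Gold", "low",  58,  970),
]

# For each (course, layout, score) posted by both fields, take the
# rating gap: mean high-field rating minus mean low-field rating.
gaps = []
keys = {(c, l, s) for c, l, f, s, r in rounds}
for key in keys:
    high = [r for c, l, f, s, r in rounds if (c, l, s) == key and f == "high"]
    low = [r for c, l, f, s, r in rounds if (c, l, s) == key and f == "low"]
    if high and low:
        gaps.append(mean(high) - mean(low))

print(f"average rating gap for identical scores: {mean(gaps):+.1f}")
```

A consistently positive gap over many events would support the inflation claim; a gap near zero would not.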
Since I do not know the formula I cannot say, but I'd guess there's an asymptote out there somewhere. That's purely a guess, though.