#1  
03-01-2013, 08:40 PM
jeverett
Eagle Member
 
Join Date: Sep 2009
Location: Eugene, OR
Years Playing: 5.5
Courses Played: 22
Throwing Style: LHBH
Posts: 990
When is a 1020 round not a 1020 round?

Preface: this post will contain statistics, and may be best-directed at cgkdisc.

No, my rhetorical question is not actually rhetorical. I believe that the PDGA "rating interval compression formulas" are incorrect, and as such PDGA player round ratings (and subsequently player ratings) are not as accurate as they could be. Based on the data I've been able to collect so far, the result of these formulas is that almost no recorded round at any event that rates above or below 1000 is being rated correctly. The discrepancy is typically very small, but may be much more pronounced on very short or very difficult courses.

Note: I am referencing the 2013 Gentlemen's Club Challenge for much of the following. Unfortunately, a ratings update has happened since the following calculations were run, making comparison with the original data more difficult. The results of that event can be found here:

http://www.pdga.com/tournament_results/99721

For any who are interested, here is a quick breakdown of how PDGA round ratings are calculated:

Step 1. Determine who really is a propagator. Unfortunately, this part is annoying to do accurately with the available data. For example, players without active PDGA memberships may still be propagators for an event, but their initial ratings cannot be determined externally. Also, in order to be considered a propagator, a player must have played a sufficient number of previous rated rounds (i.e. they need more than just a rating), and this is very time-consuming (and sometimes impossible) to determine externally.

Step 2. Using all propagators, determine the best-fit linear equation relating initial player rating to round score. For example, here is a plot of round 1 of the 2013 Gentlemen's Club Challenge (GCC):

[scatter plot: propagator initial ratings vs. round 1 scores, with the best-fit line]

The linear equation that best fits the above data is:

Round_Score = -0.12196922 * Initial_Rating + 182.4463478
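
For anyone who wants to tinker with this, here is a minimal sketch of the Step 2 fit in Python/numpy. The propagator ratings and scores below are made-up placeholder values (not the actual GCC data), so treat it purely as an illustration of the method:

Code:
import numpy as np

# Made-up propagator data, purely for illustration.
initial_ratings = np.array([1012, 998, 975, 1030, 951, 987], dtype=float)
round_scores    = np.array([  59,  61,  63,   56,  65,  62], dtype=float)

# Least-squares fit: Round_Score = slope * Initial_Rating + intercept
slope, intercept = np.polyfit(initial_ratings, round_scores, 1)
print(f"Round_Score = {slope} * Initial_Rating + {intercept}")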

Step 3. Determine the SSA for the round, using the best-fit linear equation above; the SSA is the score that equation predicts for a player with an initial rating of 1000. As an example, for round 1 of the GCC, the SSA is approximately 60.47712771.
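
Evaluating the quoted round 1 coefficients at a rating of 1000 reproduces that value (to within rounding of the published coefficients):

Code:
# SSA = the best-fit line's predicted score for a 1000-rated player,
# using the round 1 coefficients quoted above.
slope, intercept = -0.12196922, 182.4463478
ssa = slope * 1000 + intercept
print(ssa)  # ~60.4771278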

Step 4. Determine round ratings, using the following 'compression formulas':

For SSAs above 50.3289725:
Rating_Increment = -0.225067 * SSA + 21.3858

For SSAs below 50.3289725:
Rating_Increment = -0.487095 * SSA + 34.5734

Note: these formulas were derived from PDGA event results, and are not the precise formulas used by the PDGA. Unfortunately, due to rounding (of round ratings to the nearest whole number, and possibly also rounding of SSA values), it is impossible to determine the exact linear formulas used. They are, however, accurate to within 0.01% of the actual PDGA linear formulas, and in the case of the GCC they matched the 'unofficial' round ratings for each player in the round (subject to a small amount of rounding error). The rating increment for round 1, for example, was 7.774394297; i.e. each throw above or below the SSA decreased or increased a player's round rating by 7.774394297 points.
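
To put Step 4 in code form, here is a minimal sketch using my reverse-engineered formulas above (again, these approximate the PDGA's actual formulas, they are not the official ones):

Code:
def rating_increment(ssa):
    """Rating points per throw, as a piecewise-linear function of SSA."""
    if ssa > 50.3289725:
        return -0.225067 * ssa + 21.3858
    return -0.487095 * ssa + 34.5734

def round_rating(score, ssa):
    """1000 at the SSA; each throw above/below it subtracts/adds one increment."""
    return 1000 + (ssa - score) * rating_increment(ssa)

ssa = 60.47712771                 # GCC round 1 SSA from Step 3
print(rating_increment(ssa))      # ~7.7744 rating points per throw
print(round_rating(58.0, ssa))    # a 58 comes out around 1019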

Step 5. Compare the round rating of each propagator with their initial rating, throw out any propagator whose round rating was not within 50 points of their initial rating, and recalculate steps 2-4. Here is a plot of GCC round 1 with these propagators removed:

[scatter plot: remaining propagator initial ratings vs. round 1 scores, with the new best-fit line]

The linear equation that best fits the above data is:

Round_Score = -0.114947347 * Initial_Rating + 175.3579791

This produces a modified SSA of 60.41063194 and a new rating increment (using the same compression formula as above) of 7.789360301.
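
Here is a rough sketch of how I understand the Step 5 pass, again with made-up propagator data (only the 50-point cutoff and the formulas themselves come from the steps above):

Code:
import numpy as np

def rating_increment(ssa):
    return -0.225067 * ssa + 21.3858 if ssa > 50.3289725 else -0.487095 * ssa + 34.5734

def fit(ratings, scores):
    # Least-squares line; SSA is its predicted score at a rating of 1000.
    slope, intercept = np.polyfit(ratings, scores, 1)
    return slope, intercept, slope * 1000 + intercept

# Made-up propagators; the last one is a deliberate outlier that gets cut.
ratings = np.array([1012, 998, 975, 1030, 951, 987, 1005], dtype=float)
scores  = np.array([  59,  61,  63,   56,  65,  62,   70], dtype=float)

slope, intercept, ssa = fit(ratings, scores)
round_ratings = 1000 + (ssa - scores) * rating_increment(ssa)

keep = np.abs(round_ratings - ratings) <= 50     # the 50-point cut from Step 5
slope2, intercept2, ssa2 = fit(ratings[keep], scores[keep])
print(ssa, ssa2)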

So.. what's wrong with any of this? Nothing.. yet. But what happens when you compare the linear equation that was used to compute the SSA with the rating interval the compression formula produces for that SSA? *If* it is valid to use a linear equation to model player rating vs. round score (and from the two plots above, plus countless other plots I've made from PDGA round data, it does appear to be valid), should it not also be valid to use that same linear equation to determine the (average) number of rating points per throw? But this is not what the PDGA system does. For example, when you compare the rating interval used for each round of the GCC against the observed rating points per throw implied by the propagators' initial ratings and round scores, you get the following plot:

[plot: rating interval used for each GCC round vs. the observed rating points per throw from propagator initial ratings and round scores]

The PDGA Interval line above is of course this linear formula:

Rating_Increment = -0.225067 * SSA + 21.3858

But as you can observe, this in no way matches the observed rating interval based on player initial rating vs. round score, and the round data I've collected to date further supports this trend: the relationship between initial player rating, round score, and round rating increment cannot be accurately predicted using the PDGA linear 'compression formula'. The effect is that round ratings do not reflect the effect of player rating on round scores as accurately as they could. In other words, if a 1020-rated player shoots an "average" round, by the PDGA compression formulas that round is not rated 1020. In fact, the only round rating not subject to this kind of induced error is, of course, a round rated exactly 1000.

What am I suggesting? I am suggesting that the PDGA instead use the same linear equation used to determine the SSA for a round to determine the per-throw rating increment, i.e. 1/|slope| rating points per throw. For example, for round 1 of the GCC, this would mean a round currently rated 1020 (technically a round score of 57.8430269985) would instead be rated 1022.3372266 (rounded to 1022). Yes, this difference is very small..
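
To make the difference concrete, here is the same round rated both ways, using the second-pass slope, SSA, and increment for round 1 quoted above (the 'proposed' version simply uses the reciprocal of the fitted slope as the rating points per throw):

Code:
slope = -0.114947347             # fitted change in round score per rating point
ssa, pdga_increment = 60.41063194, 7.789360301

score = 57.8430269985            # the score the current method rates 1020

current_rating  = 1000 + (ssa - score) * pdga_increment   # -> ~1020.0
proposed_rating = 1000 + (ssa - score) / abs(slope)       # -> ~1022.34
print(current_rating, proposed_rating)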

So why does it really matter? The goal here is correlation.. or more specifically the correlation between (initial) player rating and round/event score, i.e. how well the initial ratings of all players predicted how they scored at an event. I don't know if anyone has seen the PDGA report on correlation coefficients for their events (it was published last year), but increasing the correlation coefficient for Majors appears to be a goal of theirs. This can of course also be addressed with course design, but a very simple way to improve the correlation coefficient could be to change how round ratings themselves are calculated. I don't, however, have any method of proving that this change will work.. and that's where I'd *love* some help (maybe even from Chuck). To test it, the 'real' PDGA method of determining SSA and rating increments per throw would need to be used, with the precise number and ratings of propagators for real events, and then we'd need to test the impact both round rating methods would have, first on player ratings and then on the correlation coefficient of how well future event rounds are predicted by those player ratings.
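
For reference, the correlation I keep talking about is just the Pearson correlation coefficient between (initial) player ratings and scores, i.e. something like this (made-up numbers again; a strongly negative r is what you want, since better players shoot lower):

Code:
import numpy as np

# Made-up field of players: initial ratings and total event scores.
initial_ratings = np.array([1012, 998, 975, 1030, 951, 987], dtype=float)
event_scores    = np.array([ 172, 180, 188,  168, 195, 184], dtype=float)

r = np.corrcoef(initial_ratings, event_scores)[0, 1]
print(r)   # close to -1 for this made-up data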

Ok.. wall of text over.. thoughts?
  #2  
03-01-2013, 08:52 PM
BionicRib
Double Eagle Member
 
Join Date: Jun 2011
Years Playing: 14.6
Courses Played: 189
Throwing Style: RHBH
Posts: 1,706
Do you perhaps have a link to the correlation coefficients reported by the PDGA? Or are they only sent to the TD's?
  #3  
03-01-2013, 08:53 PM
Dave242
* Ace Member *
 
Join Date: Aug 2007
Location: Lake Forest, IL
Years Playing: 20.5
Courses Played: 368
Throwing Style: LHBH
Posts: 4,377
Quote:
Originally Posted by jeverett View Post
the SSA is approximately 60.47712771
Your use of "approximate" is less accurate than PDGA ratings.

I guess it is important to get this down to the nearest few billionths of a throw!

  #4  
03-01-2013, 08:58 PM
jeverett
Eagle Member
 
Join Date: Sep 2009
Location: Eugene, OR
Years Playing: 5.5
Courses Played: 22
Throwing Style: LHBH
Posts: 990
Quote:
Originally Posted by BionicRib View Post
Do you perhaps have a link to the correlation coefficients reported by the PDGA? Or are they only sent to the TD's?
What I have is the PDGA document (more of a white paper) on the topic. It is available online here:

http://www.pdga.com/files/documents/...er_Courses.pdf
  #5  
03-01-2013, 10:09 PM
BionicRib
Double Eagle Member
 
Join Date: Jun 2011
Years Playing: 14.6
Courses Played: 189
Throwing Style: RHBH
Posts: 1,706
I asked Chuck a similar question a few days ago. I do see what you are getting at, and after reading that file you linked me to.......I would say that there are two issues that will take time to help with these correlation coefficients. 1. As you mentioned.....course design...."luck golf/tweener holes/bad design". Having clearer definitions that are practiced by "all" designers across the country is a work in progress. 2. IMO....and I'm sure Chuck can clear this up better than I can, but is there really enough data to work with? I personally don't think so. If disc golf were as popular as baseball or golf, for that matter, the numbers would be more "fine tuned" because you would have more players. More players equals more data. You have a fifty-fifty chance of getting heads or tails when flipping a coin. If you flip it 100 times, I bet you never get 50/50 on the dot.......(more like 70/30 or 60/40). If you flip it a million times you will get closer and closer to that "50/50" on paper. Just my thoughts.
  #6  
03-01-2013, 10:27 PM
jeverett
Eagle Member
 
Join Date: Sep 2009
Location: Eugene, OR
Years Playing: 5.5
Courses Played: 22
Throwing Style: LHBH
Posts: 990
Quote:
Originally Posted by BionicRib View Post
I asked Chuck a similar question a few days ago. I do see what you are getting at, and after reading that file you linked me to.......I would say that there are two issues that will take time to help with these correlation coefficients. 1. As you mentioned.....course design...."luck golf/tweener holes/bad design". Having clearer definitions that are practiced by "all" designers across the country is a work in progress. 2. IMO....and I'm sure Chuck can clear this up better than I can, but is there really enough data to work with? I personally don't think so. If disc golf were as popular as baseball or golf, for that matter, the numbers would be more "fine tuned" because you would have more players. More players equals more data. You have a fifty-fifty chance of getting heads or tails when flipping a coin. If you flip it 100 times, I bet you never get 50/50 on the dot.......(more like 70/30 or 60/40). If you flip it a million times you will get closer and closer to that "50/50" on paper. Just my thoughts.
Hi BionicRib,

Oh definitely, due to sample size problems, the inherent random nature of disc golf (somewhat mitigated by course and equipment design), and physical/muscular limitations on just how 'controllable' disc golf is, period, getting a 100% correlation coefficient between player rating and event score is going to be impossible.. not to mention not ideal anyway.

However, my hope is that with one slight adjustment to how player round ratings (and by extension player ratings) are calculated, we can very, very slightly increase the correlation coefficient between (initial) player rating and event score. I don't really know what could be expected in terms of improvement from this one change.. probably less than 1%.. but as I said, I don't have the ability to determine this, since I don't have access to the *exact* rating methodology that the PDGA uses.
  #7  
03-01-2013, 11:38 PM
jenb
* Ace Member *
 
Join Date: Feb 2011
Location: DFW TX USA
Years Playing: 9.4
Courses Played: 82
Throwing Style: RHBH
Posts: 3,364
Of course, one answer is that it doesn't really matter as long as it's applied consistently to everyone.

When someone gives that answer, I want you to look them square in the eye and say, "Momma said knock you out."
  #8  
03-02-2013, 03:33 PM
Steve West
Double Eagle Member
 
Join Date: Dec 2009
Years Playing: 40.5
Courses Played: 190
Posts: 1,385
Why assume "linear"?
  #9  
03-02-2013, 04:02 PM
Hampstead
Double Eagle Member
 
Join Date: Jan 2012
Location: Spaceship Earth
Years Playing: 24.6
Courses Played: 60
Throwing Style: RHFH
Posts: 1,344
I just want to say that I love playing, but the OP made my head swim.
Too smart for me.
Definitely not hating, I'm impressed with all the info.
 

  #10  
03-02-2013, 06:42 PM
DGstatistician
Newbie
 
Join Date: Jan 2013
Location: Here?
Posts: 7
Quote:
Originally Posted by jeverett View Post
Hi BionicRib,

Oh definitely, due to sample size problems, the inherent random nature of disc golf (somewhat mitigated by course and equipment design), and physical/muscular limitations on just how 'controllable' disc golf is, period, getting a 100% correlation coefficient between player rating and event score is going to be impossible.. not to mention not ideal anyway.

However, my hope is that with one slight adjustment to how player round ratings (and by extension player ratings) are calculated, we can very, very slightly increase the correlation coefficient between (initial) player rating and event score. I don't really know what could be expected in terms of improvement from this one change.. probably less than 1%.. but as I said, I don't have the ability to determine this, since I don't have access to the *exact* rating methodology that the PDGA uses.
I will preface this by saying that my teaching schedule currently has me too busy to look through the data and work out exact models. In May, between the Spring and Summer semesters, I will have plenty of time to look into the data and actually test some of my theories.

You make a very valid point that the binary model may not be as accurate as a linear model (this is a classic cost-of-simplicity debate). However, I believe a linear model is also not accurate; it is likely that we should be looking at a quadratic of some sort. Although I am not sure what model I would specifically fit, the difference between harder and easier courses is likely not linear.
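
Just to illustrate the kind of comparison I have in mind (with made-up numbers, nothing fit to real event data yet), comparing a degree-1 and a degree-2 fit is straightforward:

Code:
import numpy as np

# Made-up propagator data, purely to show the comparison.
ratings = np.array([1012, 998, 975, 1030, 951, 987], dtype=float)
scores  = np.array([  59,  61,  63,   56,  65,  62], dtype=float)

for degree in (1, 2):
    coeffs = np.polyfit(ratings, scores, degree)
    sse = np.sum((scores - np.polyval(coeffs, ratings)) ** 2)
    print(degree, coeffs, sse)   # sum of squared residuals for each model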

I also plan on looking into the lag that is used for ratings, specifically the use and value of rounds played more than 6 months previously.