 Originally Posted by DGstatistician I will preface this by saying that my teaching schedule currently has me too busy to look through data and figure out exact models. In May, between the Spring and Summer semesters, I will have plenty of time to look into the data and actually test some of my theories. You make a very valid point that the binary model may not be as accurate as a linear model. (This is a classic cost-of-simplicity debate.) However, I believe the linear model is also not accurate. It is likely that we should be looking at a quadratic of some sort. Although I am not sure what model I would specifically fit, the difference between harder and easier courses is likely not linear. I also plan on looking into the lag that is used for ratings, specifically the use and value of rounds played more than 6 months ago.
Hi DGstatistician,

Yes, I agree with you that the assumption of a linear relationship between initial player rating and round score is not necessarily 100% valid, and some of the data I've collected does somewhat contradict that assumption, particularly on very long/difficult courses or courses with lots of forced distance/carry holes, i.e. layouts where lower-skill players actually have to throw different shots than players who can out-drive them. If you look at the plot of the Memorial, for example, and look at the distribution above 1020, it's questionable whether the same linear trend holds there as across the rest of the skill range.

I agree, though, that finding an appropriate (possibly quadratic) model for all events/rounds that ultimately increases the correlation coefficient between initial player rating and round score is going to be some work. :P This is actually what BionicRib was discussing with Chuck, so he's definitely been thinking along these lines, too.
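
As a rough sketch of how the linear-vs-quadratic question could be checked, the snippet below fits both models to a handful of (initial rating, round score) pairs and compares R^2 (which, for the straight-line fit, is just the squared correlation coefficient). The numbers in ROUNDS are made up for illustration and are not real PDGA data, and the helper names (solve, polyfit, r_squared) are my own, not anything the Ratings Team uses.

```python
# Hypothetical (initial player rating, round score) pairs.
# Made-up numbers for illustration only, not real PDGA data.
ROUNDS = [
    (850, 68), (870, 66), (890, 65), (910, 63), (930, 61),
    (950, 60), (970, 58), (990, 56), (1010, 53), (1030, 50), (1050, 46),
]


def solve(a, b):
    """Solve the square system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (constant term first) via the normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)


def r_squared(xs, ys, coeffs):
    """Fraction of the variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Center and rescale ratings so the normal equations stay well conditioned.
xs = [(rating - 950) / 100 for rating, _ in ROUNDS]
ys = [float(score) for _, score in ROUNDS]

r2_linear = r_squared(xs, ys, polyfit(xs, ys, 1))
r2_quadratic = r_squared(xs, ys, polyfit(xs, ys, 2))
print(f"linear R^2    = {r2_linear:.4f}")
print(f"quadratic R^2 = {r2_quadratic:.4f}")
```

One caveat: because the quadratic model contains the straight line as a special case, its R^2 on the same data can never come out lower. A fair comparison would need held-out rounds, or some penalty for the extra parameter, before concluding the quadratic is really the better fit.
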
Thinking a bit more on the topic of finding a more accurate fit than linear to represent player rating vs. round score, I feel like it's coming down to two (maybe three) variables:

Putting - as a player's rating increases, their ability to make putts absolutely increases, but what is the relationship between the two? Is it linear? Does it have some kind of diminishing returns? Has anyone ever attempted to test this relationship, e.g. by having a wide range of rated players each attempt a large number of (static distance) putts and recording their percentages? Does anyone who's run putting contests happen to have any data on this topic? I'd love to see some.

Drive distance - as a player's rating increases, their drive distance definitely increases, but again I'm uncertain what the relationship between player rating and drive distance is (not to mention we may also need to deal with drive accuracy in here too). The effect of drive distance on round score is most definitely not just linear. First, depending on the hole length, there's going to be a point at which a higher-rating/skill/distance player has the possibility of shaving a whole throw off their score, a point that a lower-rating/skill/distance player simply cannot reach. Imagine a course of all 480 ft. holes: a 1020+ rated player could theoretically be in birdie range on every hole, yet a player rated in the low 900s is best-case getting par on every one of those. Plus, as overall drive distance increases, overall putting distance decreases, i.e. drive distance is affecting round score twice. If we were to combine drive distance with 'accurate' distance, we could theoretically measure how player rating relates to it, e.g. pick a circle radius and a necessary accuracy percentage (e.g. 90%), and see at what maximum distance players can hit that circle 90% of the time vs. their rating.
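
To make the "shaving a whole throw" point concrete, here's a toy model of the 480 ft. hole example. It assumes a player throws full-distance drives until they're within putting range and then makes one putt. The function name, the 30 ft. putting range, and the drive distances are all hypothetical numbers picked for illustration, not measurements.

```python
def best_case_score(hole_length_ft, max_drive_ft, putt_range_ft=30):
    """Fewest throws in a toy model: full drives until within putting range, then one putt.

    All distances are whole feet; putt_range_ft is an assumed value, not a measured one.
    """
    remaining = max(hole_length_ft - putt_range_ft, 0)
    approach_throws = -(-remaining // max_drive_ft)  # integer ceiling division
    return approach_throws + 1


# Best-case score on a 480 ft. hole for a range of (hypothetical) drive distances.
for drive in (300, 350, 400, 450, 500):
    print(drive, best_case_score(480, drive))
```

In this toy model the best-case score stays flat as drive distance grows and then drops by a full throw once the player can reach putting range in one drive. That's a step, not a line, which is the kind of behavior a single linear fit across the whole rating range would smooth over.
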

Any thoughts so far? Anyone attempted to estimate/determine any of these relationships?

 Originally Posted by jeverett x
if it's easy, will you distinguish the FPO scatter from MPO?

the PDGA Ratings Team only assessed MPO in the article, "Correlation for Better Courses," so i'm missing out on all the fun !!!!!!!