Disc Golf Course Regression (What variables affect course rating?)

ElementZ

Hi everyone,

For my time series and regression class this semester, I did a regression based on 125 or so courses. I pulled rating, year designed, number of holes, distance to next course, multiple tees, tee type, number of players, and whether the baskets were DISCatchers or not for each course. Feel free to use my data!

Report/Findings

Data
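For anyone who wants to poke at the spreadsheet, here is a minimal sketch (Python with pandas and statsmodels, not the actual class-project code) of the kind of regression described above. The file name and column names are hypothetical placeholders.

```python
# Minimal sketch of a rating regression like the one described above.
# File and column names are hypothetical; adjust to the posted spreadsheet.
import pandas as pd
import statsmodels.formula.api as smf

courses = pd.read_csv("dgcr_courses.csv")

# Course rating regressed on the predictors listed in the post.
model = smf.ols(
    "rating ~ year_designed + num_holes + dist_to_next_course"
    " + multiple_tees + C(tee_type) + num_players + discatcher",
    data=courses,
).fit()

print(model.summary())  # coefficients, t-stats, R-squared
```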

Let me know what you think. :)
 
I think I need to go back to school.

One thought on the idea that the average course rating is not improved by year designed: As disc golf grows in popularity we're building great courses, but also courses on elementary school grounds (almost unheard of 10 years ago) and more beginner courses in places that never had any at all. This might dilute the averages.
 
I would agree with David. Eagle Ridge in Clayton, GA, is the first course in this area, and we designed it to be friendly to brand-new players, knowing that was the best way to grow the sport in that community. It would never get more than one or one and a half stars.
 
Great idea! I wonder what else one might be able to get out of the DGCR data...

Some slight nitpicking: is it common practice to use classical ANOVA for that sort of data? Mathematically, I think it might be better to use non-parametric methods like rank correlation (and bootstrapping for inference) instead of linear correlation, since most of your data is ordinal and/or likely non-normal. This is coming from an amateur statistician, though, so maybe I'm being pedantic (your fit looks quite nice anyway).
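For what it's worth, here is a rough sketch of the non-parametric route being suggested: Spearman rank correlation with a percentile bootstrap for the confidence interval. The variable names are just placeholders.

```python
# Spearman rank correlation with a percentile-bootstrap confidence interval.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(42)

def spearman_bootstrap_ci(x, y, n_boot=5000, alpha=0.05):
    """Spearman's rho plus a (1 - alpha) percentile bootstrap CI."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    rhos = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample courses with replacement
        rhos[i], _ = spearmanr(x[idx], y[idx])
    rho, _ = spearmanr(x, y)
    lo, hi = np.percentile(rhos, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return rho, (lo, hi)

# Example with placeholder columns:
# rho, ci = spearman_bootstrap_ci(courses["num_holes"], courses["rating"])
```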
 
Pretty cool project, Element Z. Thanks for posting it. It's a little window into what drives the ratings here.

I agree with your observation that the type of basket on the course seems to have no effect on its rating. I've always thought the design matters more than the basket type.

If you do another study, I think the one variable that carries a lot of ratings weight here is tee signs and/or navigation. Don't know exactly how it might be quantified, but it's my theory that good tee signs & navigation add a significant fraction of a point to a course's rating.
 
That nitpicking post above is an awesome first post to have on these forums. :hfive:
 
Scenery probably matters, but how could you possibly account for it?

I think a good DGCR upgrade would be a red warning light on threads where statisticians are posting. I respect and admire you guys. But when I open such a link while dead-tired after running a tournament, there are only two possibilities: either I'll get a massive headache trying to understand it, or I'll get a massive headache failing to understand it.
 
I'm not sure where it's been indicated that DISCatcher is the #1 brand. I'm thinking DGA as a brand (though not any specific model) might still be #1. It appears that Multiple Tees, Concrete Tees, and to a lesser extent Number of Holes are such predominant factors in the correlation, accounting for 0.69 of the variance, that you could ignore the other two factors, which account for only 0.04.

I would think Course Length per Hole could have been an important variable to check. Length is the predominant variable in determining the course SSA rating and may influence DGCR course ratings.

Interesting study and I'm glad to see statistics being used for analysis.
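If anyone wants to check that variance-explained argument, or try Course Length per Hole, a quick sketch along these lines would do it. Column names are hypothetical, and this is not claiming to be how the original analysis was run.

```python
# Compare R-squared with and without the weak predictors, and with a
# length-per-hole variable added. Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

courses = pd.read_csv("dgcr_courses.csv")
courses["length_per_hole"] = courses["total_length_ft"] / courses["num_holes"]

strong = smf.ols("rating ~ multiple_tees + concrete_tees + num_holes",
                 data=courses).fit()
full = smf.ols("rating ~ multiple_tees + concrete_tees + num_holes"
               " + discatcher + num_players", data=courses).fit()
with_length = smf.ols("rating ~ multiple_tees + concrete_tees + num_holes"
                      " + length_per_hole", data=courses).fit()

print(strong.rsquared, full.rsquared, with_length.rsquared)
```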
 
Accounting for scenery wouldn't be perfect, but looking at how the terrain and foliage match up with trends in ratings might give you some idea. Generally, a hilly wooded course will be more scenic than a flat, open one.
 
I think it would be interesting to see the data from a study on how each brand of basket catches putts. We got new DGA baskets and new Chainstars on courses here this past year. It seems to me that the gauge of chain they use is lighter than on DISCatchers, and because of that I see more cut-throughs and spit-outs on them. It would be interesting to see the results of a putting machine sending multiple "same" putts at each of these baskets (right side, left side, straight on, nose up, nose down, soft, firm, fast, etc.) to provide data on how well each basket catches a particular kind of putt.
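If a putting-machine test like that were ever run, the analysis could be as simple as a chi-square test on catch counts per basket model. The numbers below are made up purely to show the shape of the calculation.

```python
# Hypothetical putting-machine results: catches vs. misses per basket model.
import numpy as np
from scipy.stats import chi2_contingency

counts = np.array([
    [88, 12],   # DISCatcher: caught, missed (made-up numbers)
    [83, 17],   # Chainstar
    [85, 15],   # Mach III
])
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # a small p would suggest the baskets differ
```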
 
Is there an analysis on why a person who aims at a basket 20 feet away misses but when aims at a skinny tree 250 feet down the fairway they hit it dead on?
 
That would be really cool data to see, especially if you could include some of the smaller basket manufacturers like Gateway, Arroyo, Spiderweb etc.
 
Good point. I'll bet if you had accurate numbers for total elevation change, there'd be a significant effect on rating. I've played great courses that were level, but big hills can boost even a dinky 9-holer.
 
I wonder if the new 14 chain DISCatchers catch better than the 12 chain. Just from watching a little bit at Worlds, it seemed like the extra metal in the 14 chains was rejecting a few putts that might have been caught on the 12 chains. I agree an Iron Byron putting machine would be interesting to test putts. But unless the PDGA and manufacturers thought it was a good idea to establish a "putt catching" spec, I'm not sure it will happen.
 
We just installed the new DISCatchers on Eagle Creek in Clayton, GA. It's a short 9-hole course. I've played it about a dozen times, and I think soft, high putts don't stick as well. Otherwise, they catch well. The sound of a big putt or ace run is a bit deeper.
 
Nice study. I did a similar study a while back. Less rigorous, probably.

Between brands of baskets, I could find no real difference. However, there was a noticeable difference between courses that have any brand of manufactured basket vs. object courses or homemade baskets.

Because I was only doing it for fun, not a grade, I translated my results into the rating a course would have if it differed from the average course only by having each of these features (ignoring how weak the signal may have been).

Manufactured Baskets = 2.94
Active = 2.88
27 Holes = 2.79
Rubber Tees = 2.71
Pay to Play = 2.65
Concrete Tees = 2.65
Seasonal = 2.63
Multiple Tees = 2.62
Temporary = 2.52
Free = 2.52
18 Holes = 2.51
Private = 2.49
Permanent = 2.47
Public = 2.44
Other Targets = 2.43
Asphalt Tees = 2.37
Single Tees = 2.34
Home Made Baskets = 2.26
Practice Area = 2.25
Other Tees = 2.24
Not Active = 2.14
9 Holes = 1.81
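For anyone curious, one simple way to build a table like the one above from the DGCR data is just the mean rating per feature flag. This is a sketch with hypothetical column names, and not necessarily how the numbers above were computed.

```python
# Mean rating among courses that have each (hypothetical) boolean feature.
import pandas as pd

courses = pd.read_csv("dgcr_courses.csv")
feature_flags = ["manufactured_baskets", "pay_to_play", "concrete_tees",
                 "multiple_tees", "private", "seasonal"]   # placeholders

avg_by_feature = {
    flag: courses.loc[courses[flag] == 1, "rating"].mean()
    for flag in feature_flags
}
for flag, avg in sorted(avg_by_feature.items(), key=lambda kv: -kv[1]):
    print(f"{flag:22s} {avg:.2f}")
```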
 
Pedant Alert!!!

Got here late, but I'm grinding out a biometrics degree these days, so I had to chime in! A thread on disc golf and stats made my frickin' day!

With so many legitimate (and a few not-so-legitimate) predictor variables having been suggested here, might I suggest automatic variable selection techniques? R statistical software (one example) handles processes like forward selection, backward elimination, and stepwise regression, freeing you to input data for however many x-variables you can get your hands on. Punch them all into your dataframe, type a few easy lines of code, and BOOM, you have 5-6 suggested "best" models under varying fit criteria. :thmbup: Plus you avoid issues of over- and under-specification, which can render the model biased and inconsistent in other contexts.
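The paragraph above describes an R workflow (e.g. step()). A rough Python analogue, assuming placeholder column names, is a greedy forward selection by AIC:

```python
# Greedy forward selection by AIC using statsmodels; a rough stand-in for
# R's automatic selection tools. Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

def forward_select_aic(df, response, candidates):
    """Add the candidate that lowers AIC the most; stop when nothing helps."""
    chosen = []
    best_aic = smf.ols(f"{response} ~ 1", data=df).fit().aic
    remaining = list(candidates)
    while remaining:
        scored = [(smf.ols(f"{response} ~ " + " + ".join(chosen + [c]),
                           data=df).fit().aic, c) for c in remaining]
        aic, best_c = min(scored)
        if aic >= best_aic:
            break
        best_aic = aic
        chosen.append(best_c)
        remaining.remove(best_c)
    return chosen, best_aic

# courses = pd.read_csv("dgcr_courses.csv")
# print(forward_select_aic(courses, "rating",
#                          ["year_designed", "num_holes", "multiple_tees",
#                           "concrete_tees", "discatcher", "num_players"]))
```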

Also, as Wolfobert mentioned earlier, a model with linear parameter estimates might not be your best bet. I would think a non-linear mixed-effects model would not only fit the non-normal data better but also explain the extra variation that stems from regional differences in course ratings. My 2¢.
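To make the regional-variation idea concrete, here is a sketch of a mixed model with a random intercept per region (linear rather than non-linear, just to keep it short). The "region" grouping and the other columns are hypothetical.

```python
# Mixed-effects model: fixed effects for course features, random intercept
# per region, to soak up regional differences in how raters score courses.
import pandas as pd
import statsmodels.formula.api as smf

courses = pd.read_csv("dgcr_courses.csv")
mixed = smf.mixedlm(
    "rating ~ num_holes + multiple_tees + concrete_tees",
    data=courses,
    groups=courses["region"],   # hypothetical region/state column
).fit()
print(mixed.summary())
```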
 
Hey! The data I used for my project is open to the public, so if you'd like to play with it, go for it! That type of statistical technique was more advanced than what we learned and played with, but I'd love to see if your model is semi-close to mine. Thanks for the input, even if it was 2 months late. :thmbup:
