Are ratings worthless?

Cgkdisc · Apr 20, 2022

How do you both objectively and subjectively evaluate whether a -18 on a true par 54 SSA course is better than a -18 on a true par 72 SSA course, and for that matter, a -18 on any true par SSA course in between? The reality in ball golf and disc golf is regardless of whether an individual hole is a legit par 3, 4 or 5, you can only realistically save 1 stroke on the hole assuming it's designed well to provide a realistic opportunity for birdie for the player skill level it's designed for.

Here's where the conflict between math and perception occurs. Saving a stroke on a par 3 saves 1/3 and is 33% better than par. Saving a stroke on a par 4 saves 1/4 and is just 25% better than par. Saving a stroke on a par 5 is 1/5 and is just 20% better than par. So, saving 18 strokes on a par 54 with all par 3 holes is mathematically/objectively "more impressive" than saving 18 strokes on a par 72 with all par 4 holes or mix of par 3s, 4s, and 5s.
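The arithmetic in that paragraph can be checked with a short sketch. It assumes "better than par" means strokes saved divided by par, as the paragraph uses it; the helper name `fraction_saved` is made up for illustration and is not a PDGA formula.

```python
from fractions import Fraction


def fraction_saved(strokes_saved: int, par: int) -> Fraction:
    """Strokes saved as a fraction of par (illustrative helper only)."""
    return Fraction(strokes_saved, par)


# One stroke saved on a single hole of each par.
per_hole = {par: fraction_saved(1, par) for par in (3, 4, 5)}

# The same -18 on a par 54 course and on a par 72 course.
round_54 = fraction_saved(18, 54)
round_72 = fraction_saved(18, 72)

for par, frac in per_hole.items():
    print(f"birdie on a par {par}: {float(frac):.0%} better than par")
print(f"-18 on par 54: {float(round_54):.1%} better than par")
print(f"-18 on par 72: {float(round_72):.1%} better than par")
```

The round totals reduce to the same fractions as a single birdie on a par 3 and a par 4, which is the "objective" gap the post describes.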

However, from a subjective perspective, I think most feel that saving a stroke on a par 4 or 5 is more impressive than saving one on a par 3, presumably because the player is making more total shots cumulatively on a course with par 4s and 5s. But taking a closer look, is saving a stroke on a par 4 objectively easier or more difficult than on a par 3? Consider that on a disc golf par 3, you pretty much need to park your drive to get your birdie. On a par 4, you "only" need your second throw to be as accurate as the equivalent tee shot on your par 3 birdie. In other words, on par 4s where your distance capability doesn't require two full power shots for that hole distance, you can get away with a less than stellar drive and still have a chance at birdie with your next shot and/or putt that will likely look more impressive to yourself and viewers than parking your drive on a routine par 3. Regardless, it appears a birdie on a generic par 4 may technically be easier or more probable than the generic par 3 birdie but it will both feel and appear more impressive, especially cumulatively on a higher total par course.

Ratings handle the math with a consistent objective process and evaluation over the range of course pars in our sport but the numbers don't mesh as well with the subjective player and spectator "feels" evaluation when comparing extreme round ratings from courses with a wide gap between true pars. Thus, another method for comparison such as probability of each round occurring seems desirable as a future addition to our stats to better bridge this objective/subjective gap. In the short term, bucketing Best Ever round ratings into the same 6-shot SSA ranges as has been done for many years at least brings apples to apples together as long as players know about it and the PDGA continues to update those tables.

DiscJunkie · Apr 20, 2022

Disc Golf has a range of course difficulties that is "staggering" to someone who grew up playing ball golf.

In ball golf, if you tell someone that you shot in the seventies or that you broke par, everyone understands that you've done very well. Obviously still not exact, but a general understanding that held true for the VAST majority of ball golf courses around, including every BG course within a hundred miles of my home.

In DG, those terms are functionally meaningless.
I've played a ho-hum 18 hole round at Fewell Park in Rock Hill, SC at -17, and then was ecstatic to break 90 at the USDGC course on Spectator Day the next day. Both DG courses within 5 miles of each other.

The rating system that we have may not be everything to everyone, but even with its imperfections, we need it in DG to compare skill and scoring.

Steve West · Apr 20, 2022

biscoe said:
Depends on the question "simplified for who"? A methodology that allows for comparison across course SSA ranges may be more complex to calculate but would in all likelihood be "simpler" for people to use in making judgements.

It's a service, so it should be simpler for the members to use. Of necessity, that means the inner workings will be more complicated - to the point where only a few people on the planet will understand it. That's just how products and services work.

However, if the resulting ratings always make intuitive sense, people will stop worrying about what's inside. Everyone isn't demanding to see the millions of lines of code inside the UDisc app.

smoothplayer · Apr 20, 2022

DavidSauls said:
You can keep saying that, but you can't make it true.

The ratings are accurate, within the margin of error of a small sample size. They accurately describe how the player performed against the rest of the field, taking into account the quality of the field, and the number of throws required by the course for that field.

But again not accurate course to course which really matters. If McBeth played all his rated golf at 1132 town he might be 1070 rated. So the math simply breaks down if players do not travel and play a lot of different courses.

It's OK to admit the system is flawed.

biscoe · Apr 20, 2022

Steve West said:
It's a service, so it should be simpler for the members to use. Of necessity, that means the inner workings will be more complicated - to the point where only a few people on the planet will understand it. That's just how products and services work.

However, if the resulting ratings always make intuitive sense, people will stop worrying about what's inside. Everyone isn't demanding to see the millions of lines of code inside the UDisc app.

Agree 100% with the bolded.

Steve West · Apr 20, 2022

Cgkdisc said:
...But taking a closer look, is saving a stroke on a par 4 objectively easier or more difficult than on a par 3?...

The way standard par is set, birdies on pars 3, 4, and 5 are equally impressive. It takes better throws to get a birdie on a par 3, but you need to make more good throws in a row to get birdie on par 5. It turns out that these offset well, and the percent of birdies averages about 20% for pars 3, 4, and 5.

However, the differences between individual holes within each par can be huge. There are many holes where no birdies happen, and on the easiest (relative to par) holes there can be as many as 58% birdies on par 3s, 45% birdies on par 4s, and 35% birdies on par 5s.

So, the impressiveness is far more dependent on whether the holes are easy par x or hard par x than on what x is.
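As a rough illustration of why the hole matters more than its par, here is a sketch that turns per-hole birdie rates into the chance of birdieing every hole in a round. It assumes holes are independent and that one rate applies to all 18 holes, neither of which holds in practice, and the rates are the field-wide percentages quoted in the post rather than any individual player's odds. The function name is hypothetical.

```python
import math


def all_birdie_probability(birdie_rates):
    """Chance of birdieing every hole, assuming holes are independent (a simplification)."""
    return math.prod(birdie_rates)


HOLES = 18

# Rates quoted in the post. Applying one rate to all 18 holes is an
# assumption made purely for illustration.
scenarios = {
    "average hole, any par (20%)": 0.20,
    "easiest par 3s (58%)": 0.58,
    "easiest par 4s (45%)": 0.45,
    "easiest par 5s (35%)": 0.35,
}

results = {
    name: all_birdie_probability([rate] * HOLES)
    for name, rate in scenarios.items()
}

for name, prob in results.items():
    print(f"{name}: {prob:.3e}")
```

Under these assumptions the spread between an easy par 3 and an average hole is many orders of magnitude larger than the spread between the easiest holes of different pars.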

Cgkdisc · Apr 20, 2022

smoothplayer said:
But again not accurate course to course which really matters. If McBeth played all his rated golf at 1132 town he might be 1070 rated. So the math simply breaks down if players do not travel and play a lot of different courses.

It's OK to admit the system is flawed.

Sorry but it is consistent across courses from SSA 42 to 72+. You're confusing a single round rating snapshot with a player rating, which is a moving average of many snapshots. Players with a stable 900 rating have been shown to average 900 on courses in both the under-48 SSA range and the 66+ SSA range. Their highest and lowest rated rounds are more likely to occur on lower SSA courses, but the resulting average will still come close to 900.

See the attached table where 4500 rated players with ratings in the five ranges shown threw rated rounds on courses in the five SSA ranges we've tracked since 2005. You can see that in most cells, the players have been able to average their rating across a wide range of course SSAs. The point being that combining round ratings from a wide range of course SSAs appears to be as valid to calculate a pretty accurate player rating as combining round ratings only from courses in a narrow range. BTW, this particular test in the mid-2000s was a milestone for Roger, me and PDGA Admin gaining confidence that the system appeared reliable across a wide range of player skills and course SSAs.

It shouldn't be surprising that the lower rated players struggle a bit more on the 66+ SSA courses, but at that time, there were only a few courses like this with any data. Today, there are few players in those lower rating ranges who even play courses in the 66+ range. But players with 900s ratings are the ones taking a few more OB penalties relative to their ratings which has boosted round ratings of players over 1000 who shoot relatively fewer OBs.

On a course like Fountain, Paul had, and would continue to have, the same odds in any given round of shooting 80 points below his rating as of shooting 80 points above it, as he did with the 1132. In fact, Paul shot 17 strokes worse in R3 than in R4 on W.R. Jackson in the Champions Cup. His R3 round rating was 70 points below his player rating, with each stroke there worth 6.4 rating points. If he shot that many strokes below his rating on Fountain, that round might have been rated around 940.
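The conversion in that last paragraph can be sketched as follows. The 70-point gap and the 6.4 points per stroke come from the post. The player rating and the points-per-stroke value for the lower-SSA course are placeholder assumptions chosen only to show the mechanics; they are not PDGA figures, and both function names are hypothetical.

```python
def strokes_from_points(rating_points: float, points_per_stroke: float) -> float:
    """Convert a rating-point gap into strokes on a given course."""
    return rating_points / points_per_stroke


def projected_round_rating(player_rating: float, strokes_below: float,
                           points_per_stroke: float) -> float:
    """Round rating for a player shooting `strokes_below` worse than their rating."""
    return player_rating - strokes_below * points_per_stroke


# From the post: 70 points below rating at 6.4 points per stroke.
strokes_below = strokes_from_points(70, 6.4)  # about 10.9 strokes

# ASSUMPTIONS, not from the post: a round-number player rating and a larger
# points-per-stroke value standing in for a lower-SSA course.
ASSUMED_PLAYER_RATING = 1050
ASSUMED_POINTS_PER_STROKE_LOW_SSA = 10.0

projected = projected_round_rating(
    ASSUMED_PLAYER_RATING, strokes_below, ASSUMED_POINTS_PER_STROKE_LOW_SSA
)
print(f"strokes below rating: {strokes_below:.1f}")
print(f"projected round rating on the lower-SSA course: {projected:.0f}")
```

The point of the sketch is only that the same number of strokes translates into a larger rating swing when each stroke is worth more points, which is why extreme round ratings in both directions cluster on lower-SSA courses.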

Flick Maniac · Apr 20, 2022

biscoe said:
That would be my pick- most improbable round from the actual scoring distributions.

FWIW my "gut" is that the round at Jackson and the round at Maple Hill are both better than the round at Toboggan which is better than the one at Waco which is better than the one at Memorial.

I found this. Perhaps they could do the same math on the W.R. Jackson and Maple Hill rounds.

https://discgolf.ultiworld.com/2019/03/20/flight-numbers-waco-glo-question/

Flick Maniac · Apr 20, 2022

Edit: there's also a link in the above to a UDisc article about WACO vs DGLO. Also a good read and worth linking separately.
https://udisc.com/blog/post/which-was-perfecter-mcbeths-18-at-waco-or-glo

smoothplayer · Apr 20, 2022

Cgkdisc said:
Sorry but it is consistent across courses from SSA 42 to 72+. You're confusing a single round rating snapshot with a player rating, which is a moving average of many snapshots. Players with a stable 900 rating have been shown to average 900 on courses in both the under-48 SSA range and the 66+ SSA range. Their highest and lowest rated rounds are more likely to occur on lower SSA courses, but the resulting average will still come close to 900.

See the attached table where 4500 rated players with ratings in the five ranges shown threw rated rounds on courses in the five SSA ranges we've tracked since 2005. You can see that in most cells, the players have been able to average their rating across a wide range of course SSAs. The point being that combining round ratings from a wide range of course SSAs appears to be as valid to calculate a pretty accurate player rating as combining round ratings only from courses in a narrow range. BTW, this particular test in the mid-2000s was a milestone for Roger, me and PDGA Admin gaining confidence that the system appeared reliable across a wide range of player skills and course SSAs.

It shouldn't be surprising that the lower rated players struggle a bit more on the 66+ SSA courses, but at that time, there were only a few courses like this with any data. Today, there are few players in those lower rating ranges who even play courses in the 66+ range. But players with 900s ratings are the ones taking a few more OB penalties relative to their ratings which has boosted round ratings of players over 1000 who shoot relatively fewer OBs.

On a course like Fountain, Paul had, and would continue to have, the same odds in any given round of shooting 80 points below his rating as of shooting 80 points above it, as he did with the 1132. In fact, Paul shot 17 strokes worse in R3 than in R4 on W.R. Jackson in the Champions Cup. His R3 round rating was 70 points below his player rating, with each stroke there worth 6.4 rating points. If he shot that many strokes below his rating on Fountain, that round might have been rated around 940.

Ratings are inflated by OB though. Top players don't go OB nearly as much as the 990-1020 rated players, who have train wrecks. So I have no doubt that course matters. For example, shooting an 1132 would be impossible at WR Jackson. I doubt Paul ever shot 990 at Fountain Hills or Vista either.

OB is skewing the stats.

ru4por · Apr 20, 2022

smoothplayer said:
Ratings are inflated by OB though. Top players don't go OB nearly as much as the 990-1020 rated players and have train wrecks. So I have no doubt that course matters. For example to shoot a 1132 would be impossible at WR Jackson. I doubt Paul ever shot 990 at Fountain Hills or Vista either.

OB is skewing the stats.

Maybe trees are skewing the stats? Or maybe large baskets?

Steve West · Apr 20, 2022

smoothplayer said:
...

OB is skewing the stats.

Or...

Since using too much OB as a design crutch is the norm, perhaps it's the OB-free courses that are skewing the stats the other way.

That seems to be what the "this round should have been rated higher" crowd is saying.

smoothplayer · Apr 20, 2022

Steve West said:
Or...

Since using too much OB as a design crutch is the norm, perhaps it's the OB-free courses that are skewing the stats the other way.

That seems to be what the "this round should have been rated higher" crowd is saying.

I would say that the 1132 was way too high myself and the 1108 from earlier this year.

Cgkdisc · Apr 20, 2022

smoothplayer said:
Ratings are inflated by OB though. Top players don't go OB nearly as much as the 990-1020 rated players and have train wrecks. So I have no doubt that course matters. For example to shoot a 1132 would be impossible at WR Jackson. I doubt Paul ever shot 990 at Fountain Hills or Vista either.

OB is skewing the stats.

The potential to shoot 1100+ at Jackson versus Fountain has to do with the significant difference in course SSA range. OB effect is there but minimal in comparison.

I've discussed the problem of excessive OB with the player ratings calculations when round ratings from excessive OB courses are combined with rounds from traditional courses with minimal OB. Stats from these two types of courses should be calculated and combined separately in the same way a player's shooting stats in basketball games where the 3-pointer is allowed are tracked separately from those shooting stats when all field goals were worth just 2 points. Excessive OB courses introduced the "2-stroke throw" when normally each throw counts one. It's essentially two different games with different scoring rules.

DavidSauls · Apr 20, 2022

Yeah, we could all have 2 different ratings -- our Excessive-OB rating, and our non-Excessive OB rating. Maybe short course and long course ratings, too. Or Open and Woods ratings. Then we could pigeonhole the courses so the same player qualifies for Recreational division on one type of course, but must play Advanced on another. At least the ratings system would have clarity.

Cgkdisc · Apr 20, 2022

DavidSauls said:
Yeah, we could all have 2 different ratings -- our Excessive-OB rating, and our non-Excessive OB rating. Maybe short course and long course ratings, too. Or Open and Woods ratings. Then we could pigeonhole the courses so the same player qualifies for Recreational division on one type of course, but must play Advanced on another. At least the ratings system would have clarity.

League and one-rounder ratings separated from 2-rd+ tournament round ratings.

biscoe · Apr 20, 2022

Cgkdisc said:
League and one-rounder ratings separated from 2-rd+ tournament round ratings.

Why should one round event ratings be separated? League ratings I understand since they shouldn't exist to begin with...

DavidSauls · Apr 20, 2022

Kidding aside, wouldn't shorter/lower-SSA courses have more volatile ratings, just from something similar to small sample size? Fewer total throws, more value for each throw, easier to ride a hot/lucky streak to a higher round rating? I always feel like I can have a hot round on a short course, but on a longer course, there are more opportunities to derail. And vice versa; the longer course gives me more chances to overcome a few blowups.

Cgkdisc · Apr 20, 2022

biscoe said:
Why should one round event ratings be separated? League ratings I understand since they shouldn't exist to begin with...

One-round events and leagues have one or more of these flaws in relation to events with 2 or more rounds:

Contenders in each division are not playing in the same groups with other contenders in the final round since it's the only round
TD more likely to group friends versus randomize cards.
Playing groups many times include players from different divisions with even less incentive to call rules that don't affect them
Handicap scoring is allowed to determine round ranks in the league where players may not be striving for their best score, but then their actual scores are submitted for ratings.
Flex start playing groups are not always randomly created and many times those friends are competing in different divisions.
In flex start rounds, players in the same division are playing at different times and in course conditions throughout the day to where those with flexible schedules can choose to play when it isn't raining, hotter, windy, or snowing.

biscoe · Apr 20, 2022

Cgkdisc said:
One-round events and leagues have one or more of these flaws in relation to events with 2 or more rounds:

Contenders in each division are not playing in the same groups with other contenders in the final round since it's the only round

By this logic the first round of an event would never be rated.

Cgkdisc said:
TD more likely to group friends versus randomize cards.
Playing groups many times include players from different divisions with even less incentive to call rules that don't affect them

Mixed groups occur to some degree or another in many, if not most, events; the player's responsibility to the rules is unchanged.

Cgkdisc said:
In flex start rounds, players in the same division are playing at different times and in course conditions throughout the day to where those with flexible schedules can choose to play when it isn't raining, hotter, windy, or snowing.

TD's have the option to note changes in playing conditions in the TD report.
