Ok, we’ve been over this before, but we will keep going over it until people listen up.
Assigning a numerical “score” to any product review is misleading at best, and downright bullshit most of the time.
In their review of the Amazon Fire Phone (no link because: The Verge), The Verge gave the device a score of 5.9. They then go on to show the breakdown of that score over eight categories.
If you average the scores of those eight categories you get 6.875, which is decidedly not 5.9 (the overall score they advertise for the device). Now, The Verge does note that they reserve the right to adjust that score if it doesn’t fit their overall assessment of the device. That is odd, because the numbers in those eight categories are part of the assessment and should accurately reflect the written review, but maybe that is too logical for this level of ‘journalism’.
So instead of accurately assessing the individual categories, in a manner consistent with their written review, The Verge just adjusts the overall score down or up to match (one assumes) the written words.
This is kind of like passing your exam with a C, and the professor giving you a D because he didn’t like your attitude. That’s actually how stupid this numerical scoring system is when you adjust the overall score at random so you, presumably, don’t look so inconsistent.
Let’s take another approach to this numerical scoring system, and since I have not used the Fire Phone I will take The Verge category ratings as gospel and go from there. Let’s take a weighted approach and break up the categories like so:
- Hardware
  - Design
  - Display
  - Performance
  - Camera
  - Battery Life
- Software & Ecosystem
- Call Quality (Reception)
So I have taken eight categories and made them into three categories with sub-categories. Now, using my experience with phones let’s weight the value each of those categories has as a percentage of 100.
- Hardware: 30%
  - Design: 20%
  - Display: 5%
  - Performance: 30%
  - Camera: 40%
  - Battery Life: 5%
- Software & Ecosystem: 50%
- Call Quality: 20%
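To see what those nested percentages mean for the final number, here is a minimal Python sketch that flattens them into one effective share per category. The names `WEIGHTS` and `effective_weights` are illustrative, and the sketch assumes the five hardware percentages are shares of Hardware's 30% rather than of the whole score.

```python
# Illustrative sketch: the weights listed above, written as fractions of 1.
# Assumption: the five hardware percentages are shares of Hardware's 30%.
WEIGHTS = {
    "Hardware": (0.30, {
        "Design": 0.20,
        "Display": 0.05,
        "Performance": 0.30,
        "Camera": 0.40,
        "Battery Life": 0.05,
    }),
    "Software & Ecosystem": (0.50, None),
    "Call Quality": (0.20, None),
}


def effective_weights(weights):
    """Flatten nested weights into one share of the overall score per leaf category."""
    flat = {}
    for name, (share, subs) in weights.items():
        if subs is None:
            # No sub-categories: the category keeps its full share.
            flat[name] = share
        else:
            # Each sub-category gets its slice of the parent's share.
            for sub_name, sub_share in subs.items():
                flat[sub_name] = share * sub_share
    return flat


flat = effective_weights(WEIGHTS)
for name, share in flat.items():
    print(f"{name}: {share:.1%}")
```

Read this way, the camera alone is worth 12% of the overall score (40% of 30%), while display and battery life are worth 1.5% each.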
Essentially, I am saying that Call Quality alone is pretty important (this is a phone), but the software is the most important aspect of the entire device — hardware is also important, but mostly that is because people rely on the camera so heavily.
Now, you have every right to disagree with my weightings, but let’s just look at how my weights change the overall rating, so I can get to my main point here.
Taking my weightings into account, but using the scores The Verge assigned to those categories, we get an overall rating of 6.4. That is higher than what The Verge adjusted to, but lower than their straight average, and it is based only on my weights, not theirs.
Here’s what’s stupid: The Verge doesn’t tell you their weights, so you, the reader, have no clue if you agree with their values on the weights or not. Say you think my weights on software and hardware should be flipped: then you get an overall rating of 6.9, so knowing how something is weighted is very important. At least with me showing you the weights you know whether or not to agree. That’s not the case on The Verge, where the weights appear discretionary, which is just not good.
Let’s go back to my original weights as listed above, and apply them to the ratings for the Apple iPhone 5s from The Verge. Using my weights you get a 9.1, whereas The Verge originally scored it 8.8. So, what does this really mean?
There are three things this really means:
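The effect of flipping weights can be sketched in a few lines of Python. This is a rough illustration, not The Verge's method: the function name `weighted_score` is made up here, the category scores are hypothetical placeholders (not The Verge's numbers), and it assumes Software & Ecosystem arrives as a single combined score.

```python
def weighted_score(scores, top_weights, hardware_weights):
    """Weight the hardware sub-scores into one Hardware score, then weight the three top-level categories."""
    hardware = sum(scores[name] * w for name, w in hardware_weights.items())
    top_scores = {
        "Hardware": hardware,
        "Software & Ecosystem": scores["Software & Ecosystem"],
        "Call Quality": scores["Call Quality"],
    }
    return sum(top_scores[name] * w for name, w in top_weights.items())


# Sub-weights inside Hardware, as listed in this post.
HARDWARE_WEIGHTS = {
    "Design": 0.20,
    "Display": 0.05,
    "Performance": 0.30,
    "Camera": 0.40,
    "Battery Life": 0.05,
}

# The weights from this post, and the same weights with hardware and software flipped.
MY_WEIGHTS = {"Hardware": 0.30, "Software & Ecosystem": 0.50, "Call Quality": 0.20}
FLIPPED_WEIGHTS = {"Hardware": 0.50, "Software & Ecosystem": 0.30, "Call Quality": 0.20}

# HYPOTHETICAL scores for illustration only. These are NOT The Verge's ratings.
example_scores = {
    "Design": 9.0,
    "Display": 9.0,
    "Performance": 9.0,
    "Camera": 9.0,
    "Battery Life": 9.0,
    "Software & Ecosystem": 5.0,
    "Call Quality": 7.0,
}

mine = weighted_score(example_scores, MY_WEIGHTS, HARDWARE_WEIGHTS)
flipped = weighted_score(example_scores, FLIPPED_WEIGHTS, HARDWARE_WEIGHTS)
print(f"my weights: {mine:.1f}, flipped weights: {flipped:.1f}")
```

Same category scores, different overall number, and the only thing that changed is a weighting the reader never gets to see.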
- You cannot assign an overall weighted score to a product review unless you spell out your weighting, and have the same person assign category values to every product you review. Because what I assign to design will be different than what anyone else would assign to that category. The alternative to using just one person, is to use only quantifiable categories, but that eliminates some important ones like ‘build’ and ‘design’.
- You cannot compare the overall scores from device to device unless you know the weighting and you know that those weights are the same on every reviewed device in that category. Because what if the iPhone was weighted more towards hardware, but the Fire more towards software? I’ll tell you what: it makes the overall scores pointless.
- The ratings of the Fire Phone are clearly bullshit.
Let’s explore the third point.
Here’s the most telling statement of the review: “But it’s not a very good smartphone.”
Ok, so it’s not a very good phone, and the overall weighted score should reflect that (I would argue the average score should reflect that as well, but what do you expect from The Verge), especially since the review is mostly one long reason for not buying the phone. The Verge gave the phone a weighted 5.9, and by the site’s own metrics that means:
- A rating of 5 is: “Just okay.”
- A rating of 6 is: “Good. There are issues, but also redeeming qualities.”
Which is where this gets comical. A phone that the review just rated as “not a very good smartphone” gets a score that says: “better than ok, almost, just one tenth away, from being good.” Now I guess that could mean “not very good”, but on a site like The Verge that statement is akin to me saying: this is a piece of shit.
And using their own numbers, but my weighting, that same phone gets a 6.4, which means it is better than good. Even the unweighted average of 6.875 is a better-than-good rating.
The Verge essentially doesn’t like the phone very much, but if you view the ratings alone you would be right in assuming that this phone is actually good.
Now we could dive through all the reviews and make this same analysis, but why bother? It’s clear the numerical scores have little to no meaning and cannot be reasonably compared with one another, which makes these scores completely and utterly pointless.
I don’t like The Verge, but I hate this type of numerical score even more. And you may be thinking: “But Ben, as long as you read the review you will be fine.” Which would be true, but misleading as a generalized statement because it assumes most readers read the review, and I highly doubt that. More likely most readers skim the review, watch the video, and then view the ratings, which means they have no clue how good the phone actually is. And every one of those readers will be completely misled as they compare those scores amongst like devices.
This is shoddy journalism, and shoddy reviewing.