
Are benchmarks a fair method of device comparison?

Status
Not open for further replies.

EarlyMon

The PearlyMon
The point remains that benchmarks are not trustworthy.

Allowing that HTC still games them - you don't know by how much. Only HTC does.

And even if you knew, they're just about as useless as they were last year and the year before.

Unless you're running games, they bear little resemblance to the user experience.

If you need bragging rights, there's no harm in that fun. I'm sure we all do it on some level with some hobby.

But it's no service to suggest to comparison shoppers that benchmarks provide any rational information for comparison purposes.
 
But it's no service to suggest to comparison shoppers that benchmarks provide any rational information for comparison purposes.

But benchmarks do provide useful information. The developers behind popular benchmarks are making an effort to make sure the results are legitimate, either by releasing updates to stop benchmark detection or by testing and removing offending products.

Based on the results we have, the only offender left among the top four manufacturers is HTC.

Also, given that the HTC One M8's score drops by around 11000 points in Antutu X, that does give you some idea of how much HTC are "playing" the benchmarks.
 
Other than bragging rights, what exactly is gained?

Any top end phone is more than capable in the performance sector

They are tools to help users determine if an upgrade is worthwhile or not. They also let users know if their devices are working correctly by comparing to users with the same device.

Technical information and results from benchmarks can also be used to determine if you have a fake device or not.

I have to say cheating is usually not difficult to spot when you have many devices with the same SoC available. There is usually a margin of error, and anything beyond that should be considered suspect.
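
As a rough illustration of that margin-of-error check, here is a minimal Python sketch. The device names, the scores, and the 10% tolerance are all made up for the example; they are not taken from any real benchmark database, and a real check would need a tolerance chosen from actual run-to-run variation.

```python
from statistics import median

def flag_suspect_scores(scores, tolerance=0.10):
    """Return devices scoring more than `tolerance` above the group median.

    `scores` maps a device name to its benchmark score. All devices are
    assumed to share the same SoC, so large upward deviations are suspect.
    """
    baseline = median(scores.values())
    return sorted(
        name for name, score in scores.items()
        if (score - baseline) / baseline > tolerance
    )

# Hypothetical scores for four devices built on one SoC (illustrative only).
example_scores = {
    "device_a": 30100,
    "device_b": 29800,
    "device_c": 30500,
    "device_d": 36900,
}

suspects = flag_suspect_scores(example_scores)
print(suspects)
```

The median is used as the baseline so that one inflated score does not drag the reference point upward the way a mean would.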
 
But benchmarks do provide useful information. The developers behind popular benchmarks are making an effort to make sure the results are legitimate, either by releasing updates to stop benchmark detection or by testing and removing offending products.

Define legitimate.

For one of our rom releases, we spent nearly two months tuning and testing more control parameters than you've heard of.

We did not release the one with the highest benchmark - we released the one that blew people away because it was so fast.

And then listened to our users complain about low benchmarks.

And I quote that for different roms and different phones than the one I worked on, having spent inside time with other teams.

You may want benchmarks to matter, you may think that they do, and you may believe that you have proof that they do.

But they just don't.

How fast will the M8 run TouchWiz? How fast will the S5 run Sense? How fast is the S5 when matching the M8 playing music really loud? How quickly can you complete AnTuTu after a battery swap on the M8?

Benchmarks should - should - be able to help you choose your best investment, all things being equal.

But they're not equal.

Most people carrying an S4 last year had Toshiba storage, most carrying the M7 had Samsung storage.

What's the difference this year?

And isn't there a screen difference appealing to different markets?

The list goes on and on before you can get to benchmarks - and when you do get there, it won't matter.

Because benchmarks now are proudly stopping all of the benchmark gamers - except themselves.

They both use the same SoC. The Samsung is clocked higher in Europe and the US.

Speaking of bare metal, it's going to run faster and have the *potential* to run faster.

No benchmarks required, there's your answer.

And with both at top speed, the S5 processor will eat power faster. And not have the big speakers. And have a removable battery. And not run Sense.

Don't even start me on the annual camera war. Just wake me when it's over.

So to sum up, you say benchmarks are legitimate.

I say, define legitimate.

And if you say because something about speed and performance, then I say get back to me after you've done development. ;) :)
 
They are tools to help users determine if an upgrade is worthwhile or not. They also let users know if their devices are working correctly by comparing to users with the same device.

Technical information and results from benchmarks can also be used to determine if you have a fake device or not.

I have to say cheating is usually not difficult to spot when you have many devices with the same SoC available. There is usually a margin of error, and anything beyond that should be considered suspect.

While running completely different operating system stacks, as modified not only by the OEMs vs each other and Google, but also by user preferences for configuration. :)
 
They are tools to help users determine if an upgrade is worthwhile or not

I'd argue you shouldn't base a $700 purchase on a benchmark. There's way more important things to consider, such as real world use.

they also let users know if their devices are working correctly by comparing to users with the same device.
There's a whole lot of variability in scores between users of the same device. If anything, I'd say people have unneeded worry about their phone if it doesn't score as high as some other guy on the internet. Not every CPU is equivalent, even between the same phone types.

Technical information and results from benchmarks can also be used to determine if you have a fake device or not.
It may be one part of a line of evidence, but I'm not sure that's worth putting much weight into benchmark results from a brand new device.

I suppose I just think people put way too much stock/meaning into benchmarks.

Any top end phone of the day is going to be more than capable of multitasking/playing games/ whatever you'd need to do. What's in a number?
 
We did not release the one with the highest benchmark - we released the one that blew people away because it was so fast.

What benchmarks were users complaining about? I will say that not all benchmarks are equal, some should have been retired years ago like Quadrant.
 
What benchmarks were users complaining about? I will say that not all benchmarks are equal, some should have been retired years ago like Quadrant.

I agree completely.

Users complain about all of their favorite benchmarks.

I ran all I could find. And that's a lot. The only one that makes sense is CF Bench. And you have to know what you're looking at and how the context will affect your predictions. Not many do.

I found AnTuTu especially unreliable since late 2011, early 2012.

Take the challenge. Don't compare the top new phones today. Compare what you have with the scores of an older model and ask if it's true. Assuming that you can get past the mark where they changed their scoring system - which - a lot of people upgrading can't.

You and I both have a lot of user experience with a lot of phones.

We have different opinions and may never convince the other.

Your opinion is that benchmarks are good and keep getting better.

Mine is that they're a waste of time and there's no improvement in sight.

Ever talk to someone whose phone is running slower a year later, things lagging that never lagged before?

But their benchmark scores are still right up there?

Common sense will tell you what a benchmark score can not.

There's no more to my opinion than what I've already said. :)
 
I'd argue you shouldn't base a $700 purchase on a benchmark. There's way more important things to consider, such as real world use.

I'd feel a lot more comfortable basing a purchase on a benchmark I trust than basing it on the opinions of others, which are far less reliable. ;)

There's a whole lot of variability in scores between users of the same device. If anything, I'd say people have unneeded worry about their phone if it doesn't score as high as some other guy on the internet. Not every CPU is equivalent, even between the same phone types.

No there isn't, variability on same devices comes from users running older or customized roms and of course user error.

Most users have no idea what thermal throttling is, they run a game and some benchmarks even while charging and then wonder why the scores are low.

I suppose I just think people put way too much stock/meaning into benchmarks.

Any top end phone of the day is going to be more than capable of multitasking/playing games/ whatever you'd need to do. What's in a number?

Again, the only alternative is to go on raw specs, the opinions of users, or the BS performance estimates the manufacturers provide.

Opinions about device performance or interface/game smoothness are unreliable.

I ran all I could find. And that's a lot. The only one that makes sense is CF Bench. And you have to know what you're looking at and how the context will affect your predictions. Not many do.

I found AnTuTu especially unreliable since late 2011, early 2012.


We have different opinions and may never convince the other.

Your opinion is that benchmarks are good and keep getting better.

CF-Bench is fairly reliable, haven't installed it for over 6 months though.

The most consistent benchmarks I find are 3DMark, which also comes with a physics test for CPU performance, and GFXBench, which I always find consistent regardless of the roms I'm using.

Antutu, I agree, isn't that reliable. Just installing Qualcomm's optimized Dalvik & Bionic and a new kernel boosts the score on my Nexus 7 by 10000 points, which shouldn't really happen. That said, with the release of Antutu X it's useful for spotting devices that cheat, so it still has its uses.

I don't believe all benchmarks are good or that they necessarily get better over time. Some get better over time, some just become outdated and some were never useful to start with. ;)
 
...
The most consistent benchmarks I find are 3DMark, which also comes with a physics test for CPU performance...

I've worked in semiconductor research and development for a *lot* of years.

Physics testing evaluates hot carriers, negative bias temperature instability, electromigration, mobile ions, plasma discharge, oxide breakdown and a host of others.

External test hardware is required.

3DMark -

"The Physics test is a pure CPU performance benchmark using lightweight rendering techniques to minimise any GPU impact on scoring. The test contains a multi-threaded simulation of a large number of rigid bodies, some connected with joints, colliding using the Bullet Open Source Physics Library."

Gamer's definition of physics seems to equal something about performing some calculations with the CPU rather than the GPU. Coincidentally, the calculations are for the end use of predicting motion of objects in a game using Newton's Laws.

Excellent marketing, calling that a physics test. Sounds really scientific. And innovative.

It's not any of those things but I can see from that one paragraph how you were taken in.

Go back to my first response. I already allowed that some benchmarks *may* be useful to gamers. No need to beat that horse, I conceded it going in.

But benchmarks end right there for most everyone else, the rest is mindshare. Adding the word physics doesn't change anything. ;) :)

If you want to respond where it matters, answer the central questions -

Have you ever talked to anyone whose phone has acquired lag over time?

Many of us have.

And noticed that the benchmarks, regardless of which ones, don't change when that happens?

If you're not aware of that in real world use of phones, no worries, many of us are.

Same user, same apps, same phone, same benchmarks, and nothing is explained or predicted.

Therefore doubtful that they can tell you how two different phones will perform.

Except for playing some games. In some cases.

By the way, this is hilarious -

http://phandroid.com/2014/03/31/htc...k-cheating-into-performance-boosting-feature/
 
Excellent marketing, calling that a physics test. Sounds really scientific. And innovative. It's not any of those things but I can see from that one paragraph how you were taken in.

Go back to my first response. I already allowed that some benchmarks *may* be useful to gamers. No need to beat that horse, I conceded it going in.

Regardless of what they call the CPU test it's very consistent and never varies much with minor software tweaks, changes in rom or kernel like Quadrant and Antutu, so for me that makes it a good benchmark regardless of if you approve or not.

Have you ever talked to anyone whose phone has acquired lag over time? Many of us have.

And noticed that the benchmarks, regardless of which ones, don't change when that happens?

That's because the performance of the phone hasn't changed, it's usually only the perceived performance compared to other devices which makes the older device appear slow and unresponsive.

It's not that complicated. Older devices that don't have 4.3 or better don't have the benefits of trim, and the lack of it can also cause noticeable slowdown.
 
I'd feel a lot more comfortable basing a purchase on a benchmark I trust than basing it on the opinions of others, which are far less reliable. ;)
That's why I mentioned real world everyday use.

Go into a store and use the phone. Get a feel with it, test out everything that's important for you. Take any opinions from reviewers or the people trying to sell the phone with a grain of salt.

After all, it's not their phone, it's going to be yours.

No there isn't, variability on same devices comes from users running older or customized roms and of course user error.
Benchmarks can be wildly variable on even the same device. Jimmy and Johnny's galaxy s5's are also going to vary from each other.


Most users have no idea what thermal throttling is, they run a game and some benchmarks even while charging and then wonder why the scores are low.
Exactly the point I brought up earlier. ;)

I'd wager benchmarks worry and frustrate users who compare their scores to others they see on the internet more often than they settle questions about whether a device is legit.
 
Regardless of what they call the CPU test it's very consistent and never varies much with minor software tweaks, changes in rom or kernel like Quadrant and Antutu, so for me that makes it a good benchmark regardless of if you approve or not.

In that case, a "benchmark" that did nothing other than read your build prop and assign your phone a predetermined score based on model would be a fantastic benchmark.

The score doesn't mean anything, but it's super consistent. ;)
 
In that case, a "benchmark" that did nothing other than read your build prop and assign your phone a predetermined score based on model would be a fantastic benchmark.

The score doesn't mean anything, but it's super consistent. ;)

Don't be ridiculous, then it wouldn't be able to measure performance differences between devices running at different clock speeds.

That's why I mentioned real world everyday use.

Go into a store and use the phone. Get a feel with it, test out everything that's important for you. Take any opinions from reviewers or the people trying to sell the phone with a grain of salt.

After all, it's not their phone, it's going to be yours.

The launcher itself could cause lag, and a user going on feel alone could end up buying another phone with the exact same hardware instead of just installing another launcher. Great advice if you're looking to help people waste money.

Benchmarks can be wildly variable on even the same device. Jimmy and Johnny's galaxy s5's are also going to vary from each other.

No they won't, not unless they're using different versions of the Galaxy S5. ;)

Guys, we will never agree here, so the sooner we end this the better for the thread. Users can decide for themselves.
 
It's not that complicated. Older devices that don't have 4.3 or better don't have the benefits of trim, and the lack of it can also cause noticeable slowdown.
Trim is not required by all Android devices, even if the Nexus did have that bug. The idea that they all do is a myth.

It's a kernel option during building - and every OEM replaces the kernel.

My phone has never had the problem solved by trim, going back to ICS at release.

Forcing trim with the root app on a phone that doesn't need it can and often has permanently bricked the phone.

Trim, just like benchmarks varying on the same model phone being true, is all about what can be shown and proven, not what you think you know, not what you read in the blogosphere, and not what you want to believe. ;)

That's because the performance of the phone hasn't changed, it's usually only the perceived performance compared to other devices which makes the older device appear slow and unresponsive.

Wrong. :D This is a known problem for a lot of users. Not perceived, not unchanged.

Changed. Worse. Performance degradation.

That you're unfamiliar with it really sums up why you don't get how the rest of us won't accept all the tomfoolery of benchmarking claims to help people pick out phones.

That you're making up an answer because you don't understand the issue is something that's just - wrong. :D

That your preferred benchmarks are insensitive to rom and kernel changes - when people love the obvious performance improvements when they change those things - just said all there is to say about benchmarking.

Thank you for proving my point - benchmarks tell you nothing about everyday phone use and are of no use comparing roms within a model, much less comparing models. Could not have said it better myself. :)

Regardless, you changed the conditions of my question before answering. (ninja'd, subject over for me :o)

Have a great day! :)
 
EarlyMon said:
Wrong. This is a known problem for a lot of users. Not perceived, not unchanged.

Changed. Worse. Performance degradation.

That you're unfamiliar with it really sums up why you don't get how the rest of us won't accept all the tomfoolery of benchmarking claims to help people pick out phones.

That you're making up an answer because you don't understand the issue is something that's just - wrong.

I've heard of them slowing down due to background apps building up over time but that's easily fixed with a wipe/reset and restoring user files.

I've never experienced what you're talking about, and I've gone through twelve Android devices over the last five years. That's not including devices I've repaired over the years either.

Why haven't I read about this? links? :bebored:

That your preferred benchmarks are insensitive to rom and kernel changes - when people love the obvious performance improvements when they change those things - just said all there is to say about benchmarking.

No, that's not what I said. I'll try again, because either you don't understand what I meant or you're just refusing to, I'm not sure.

Benchmarks that show huge score increases from minor kernel or rom tweaks are not reliable benchmarks, these tweaks usually make no noticeable difference to real world usage so this should be reflected in the end result.

Benchmarks like Antutu and Quadrant where the score can increase 10,000 points from minor tweaks provide very little useful information.

A benchmark that gives reliable scores based on the hardware is more reliable, obviously the benchmark result will be affected by clockspeeds. When I overclock or underclock my Nexus 7 (CPU & GPU) I can see a clear increase or decrease in performance across all the tests in 3dmark.

:)
 
From what I've seen of benchmarking personally, specifically Antutu, it's mainly used for promotion: "Hey, our tablet has a higher number than our competitors!" Very often Antutu is used for deception with KIRFs and cheapos. It's pre-installed as a root system app and is doctored.
 
Tbh I DO sometimes use benchmarks as a quick way to compare kernels, tweaks, etc against each other on the one device. IE: not comparing my device to another one.
I think it can be useful if you somehow have a way to standardise it, like, I reboot then let the device "rest" for 5mins then do the test.
Benchmarks don't really come close to determining how a setup "feels" in real use though and they certainly shouldn't be used to compare one device to another imo :beer:
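
A minimal Python sketch of that kind of standardised same-device comparison follows. It assumes each setup was run several times under the same routine (reboot, rest, test), and it only treats two setups as different when the gap between their mean scores is larger than the run-to-run noise. The kernel names, the scores, and the "two standard deviations" rule of thumb are assumptions for illustration, not measurements.

```python
from statistics import mean, stdev

def summarize(runs):
    """Return (mean, sample standard deviation) for repeated scores."""
    return mean(runs), stdev(runs)

def difference_exceeds_noise(runs_a, runs_b, factor=2.0):
    """True if the mean scores differ by more than `factor` times the
    larger of the two run-to-run standard deviations."""
    mean_a, sd_a = summarize(runs_a)
    mean_b, sd_b = summarize(runs_b)
    return abs(mean_a - mean_b) > factor * max(sd_a, sd_b)

# Hypothetical scores: five runs per kernel, each after a reboot and a rest.
kernel_a = [1510, 1495, 1502, 1508, 1499]
kernel_b = [1504, 1512, 1490, 1501, 1509]
kernel_c = [1620, 1611, 1630, 1618, 1625]

print(difference_exceeds_noise(kernel_a, kernel_b))
print(difference_exceeds_noise(kernel_a, kernel_c))
```

With numbers like these, a gap between two kernels that is smaller than the scatter of repeated runs would not count as a difference at all, which is the point of standardising the routine first.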
 
I've regarded them as worthless for a long time. For example, Gingerbread ROMs always gave lower benchmark scores than Froyo on the Desire, but were faster in any real usage. That's same device, different OS versions, so I can't imagine comparisons between different devices are more meaningful. And this time last year we were simultaneously reading record-breaking benchmarks and reports of UI lag in S4 reviews, which is a simple demo that the benchmark doesn't tell you the important stuff.

I've recently been doing some kernel testing (one of the devs for my device wanted to blind test a few variants), and we've found almost an inverse relation between benchmark scores and real world smoothness and responsiveness.

So I just don't read "benchmark" sections of reviews any more.
 
Tbh I DO sometimes use benchmarks as a quick way to compare kernels, tweaks, etc against each other on the one device. IE: not comparing my device to another one.
I think it can be useful if you somehow have a way to standardise it, like, I reboot then let the device "rest" for 5mins then do the test.
Benchmarks don't really come close to determining how a setup "feels" in real use though and they certainly shouldn't be used to compare one device to another imo :beer:

How a setup feels has little to do with the performance of the hardware once you reach a certain point, that's mostly down to software these days.

But how it feels and how it performs are not the same, you can make a Snapdragon 400 based device feel smooth with stock Android but we know the underlying performance is not comparable to high end devices, you will notice that difference when launching apps or games.

The Galaxy S4 for example uses touchwiz which I think everyone will agree isn't the smoothest experience, changing over to a Google Edition rom may make the interface feel much smoother but the actual performance of the device and the performance you will receive in apps and games you download is unchanged, it's this performance you should measure with benchmarks, not how a device feels.
 
How a setup feels has little to do with the performance of the hardware once you reach a certain point, that's mostly down to software these days.

But how it feels and how it performs are not the same, you can make a Snapdragon 400 based device feel smooth with stock Android but we know the underlying performance is not comparable to high end devices, you will notice that difference when launching apps or games.

The Galaxy S4 for example uses touchwiz which I think everyone will agree isn't the smoothest experience, changing over to a Google Edition rom may make the interface feel much smoother but the actual performance of the device and the performance you will receive in apps and games you download is unchanged, it's this performance you should measure with benchmarks, not how a device feels.

But if you switch from a TouchWiz ROM to an AOSP ROM you'll see a marked decrease in benchmark scores, while actual fluidity and lag are much better.
 
Name the benchmarks, remember them, don't use them again. :)

What a great idea.

In 2011, when challenged, you had this to say about Quadrant -

Benchmarks are useful and worth looking at, it's just that some are not, for whatever reason. Quadrant needs updating badly and has done for over a year. I think the developer is just cashing in on the popularity. I wouldn't pay for it as it's pretty crap, but a lot of folk have.

In 2012, when someone asked what his Quadrant score meant when his Samsung scored highly, the story changed.

Only had my S2 for around a week, so still getting used to it.
But noticed this benchmark app and tried it out.
Here are my results:
Screenshot_Galaxy S2 | Flickr - Photo Sharing!
I presume this is a decent result, but what does it actually mean???

What does it mean? it means Samsung make great hardware. :p

As far as I know the Galaxy S II is still the fastest Android phone available and Samsungs Exynos SoC does very well in benchmarks even almost a year since it was released.

I suppose it means you have nothing to worry about with regards the apps and games performing well on your device in the near future.

What did AnTuTu tell us was the best device in 2013? Was it trustworthy then?

According to you, in the SGS4 thread called "antutu benchmark test scores", it was very trustworthy.

That's a normal score for the i9505, why would you be disappointed?

I score the same more or less, as does every other i9505 user. ;)

Of course, that was just before Samsung got caught lying - with AnTuTu scores.

Now, in 2014, you are once again the benchmark expert, 3DMark is the thing, and according to you, you've been consistent all along.

And you have.

You have consistently defended to the bitter end that Samsung is the best, benchmarks prove it, even if you have to contradict yourself and even if you have to defy logic in your arguments.

This thread, like so many others, exists only because someone said that they were considering an HTC and you derailed the discussion with more of your so-called proof.

There's really only one thing that we do know.

Whenever a benchmark says Samsung is the best, you support it.

Whenever the benchmark is discredited in the course of someone choosing their new phone, you come up with another benchmark and a lot of mumbo jumbo until there is only one possible outcome -
Samsung wins.

You do this every year.

And we know one more thing.

Samsung pays young people to astroturf in forums.

I think that you've given Samsung all of the free benchmark-based advertising we need to see for this year.

See you in 2015 - this thread is now closed.
 