Today we will compare the Radeon HD 7970 GHz Edition and the Radeon RX 570. Both are based on different versions of GCN, but they share a number of things in common. To start, both have 2048 stream processors, 32 ROPs, and 128 TMUs.
But this is where most of the similarities end. Of course there are some obvious differences, such as core clock frequencies, memory buffer size, and bandwidth, and there's a lot more as we go deeper.
We'll get more into that shortly. The purpose of today's post is to pit these GCN-based legends against each other to see how well they stack up, compared clock for clock as well as at stock frequencies. This isn't a new idea; similar tests have been conducted over the years.
We're in the middle of 2019, and today's triple-A games are pretty demanding, so some settings will have to be altered a bit to ensure there's a fair fight. We'll discuss more on this shortly, but first let's take a look at some architectural differences between these two versions of GCN.
For a little bit of history, let's first go back to 2012, when perhaps the best-aging GPU of all time was released to the masses. This GPU, known as the HD 7970, was codenamed Tahiti, and it was PC gamers' first taste of the GCN microarchitecture.
GCN was designed with a number of improvements over its predecessor. AMD wanted a GPU that offered increased flexibility and efficiency, all while offering reduced complexity. GCN was also designed with a huge focus on compute, along with optimization for heterogeneous computing.
While this design was a great building block for GCN, it did have some issues. One of them was geometry throughput and tessellation performance: since tessellation and geometry operations happen early in the pipeline, they can significantly bog down everything else if the GPU spends too many clock cycles pushing a limited amount of geometry.
As time moved on, each generation of GCN implemented a number of improvements. This slide doesn't go over all of them, but it does a good job of hitting a lot of the important ones. Going from GCN 1.0 to 1.1, we gained up to eight asynchronous compute engines and four geometry engines for increased geometry and tessellation performance.
Other features such as Mantle support, TrueAudio, FreeSync, and XDMA all came along for the ride as well. GCN 1.1 to 1.2 introduced delta color compression and again improved tessellation performance, among other things.
Unfortunately, I don't have a 2048 stream processor Tonga-based card, so let's turn our attention over to GCN 1.3, known to all as Polaris. Besides the obvious benefits of moving to a 14 nm FinFET process, Polaris 10 saw a number of improvements over previous generations of GCN.
Let's dive in and take a look at some of the improvements GCN 1.3 brought to the table. Since inception, GCN has had some issues keeping all the shaders in a CU busy, which has led to some performance inefficiencies. AMD made a handful of changes in GCN 1.3 to improve overall shader efficiency, and these changes helped contribute to up to 15% more performance per compute unit.
GCN 1.3 also made additional changes to AMD's delta color compression, which brought around a 16% improvement over GCN 1.2, as shown here. Capacity improvements were also made to the L2 cache, as Polaris received a total of 2 MB.
Along with these improvements, we also saw the introduction of the primitive discard accelerator. This is a filtering mechanism used to efficiently discard primitives. It works by detecting small or thin triangles that don't contribute much to the scene and discarding them prior to rasterization, which frees up the geometry engines to rasterize triangles that do contribute to the scene.
All these differences and features I mentioned here are really just skimming the surface of the overall changes for each generation.
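To make the primitive discard idea above a little more concrete, here is a minimal Python sketch of one way small or thin triangles could be filtered before rasterization. This is an illustration under stated assumptions, not AMD's actual hardware logic: the function names, the rule that a pixel is sampled at its center, and the bounding-box test are all hypothetical simplifications.

```python
import math

def signed_area(tri):
    """Signed area of a screen-space triangle given as three (x, y) tuples."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def _spans_pixel_center(lo, hi):
    """True if the interval [lo, hi] contains some value of the form i + 0.5."""
    return math.floor(hi - 0.5) >= math.ceil(lo - 0.5)

def covers_no_sample(tri):
    """Conservative test: the bounding box misses every pixel center.

    Assumption: one sample per pixel, located at (i + 0.5, j + 0.5).
    """
    xs = [x for x, _ in tri]
    ys = [y for _, y in tri]
    return not (_spans_pixel_center(min(xs), max(xs))
                and _spans_pixel_center(min(ys), max(ys)))

def should_discard(tri):
    """Discard degenerate (zero-area) triangles and ones that hit no sample."""
    return signed_area(tri) == 0 or covers_no_sample(tri)

if __name__ == "__main__":
    tiny = [(0.6, 0.6), (0.9, 0.6), (0.6, 0.9)]      # sits between pixel centers
    sliver = [(0.0, 0.6), (10.0, 0.6), (10.0, 0.9)]  # long but too thin
    big = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]     # clearly visible
    for name, tri in (("tiny", tiny), ("sliver", sliver), ("big", big)):
        print(name, "discard" if should_discard(tri) else "keep")
```

The point of the sketch is only that the check is cheap compared with rasterizing the triangle, which is why dropping these primitives early frees up time for the ones that actually land on screen.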
In order to make today's testing fair, I had to make sure that core clock speeds are exactly the same, along with the memory bandwidth. So I decided to drop the core clock of the RX 570 to 1050 MHz in order to match the GHz Edition. As for the memory, I downclocked the memory on the GHz Edition from 1500 MHz to 1167 MHz, which produces nearly the same exact memory bandwidth as the RX 570.
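As a quick sanity check on those memory clocks, the arithmetic can be sketched in Python. The bus widths (384-bit for the HD 7970 GHz Edition, 256-bit for the RX 570) and the RX 570's 1750 MHz memory clock are not stated in this post; they are assumed from the cards' public reference specifications, and the helper name is made up for illustration. GDDR5 moves four bits per pin per memory clock.

```python
GDDR5_TRANSFERS_PER_CLOCK = 4  # GDDR5 is quad-pumped relative to the memory clock

def gddr5_bandwidth_gbs(mem_clock_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s for a GDDR5 card (hypothetical helper)."""
    bits_per_second_millions = mem_clock_mhz * GDDR5_TRANSFERS_PER_CLOCK * bus_width_bits
    return bits_per_second_millions / 8 / 1000  # bits -> bytes, MB/s -> GB/s

if __name__ == "__main__":
    # Assumed reference specs: 384-bit bus on the HD 7970 GHz Edition,
    # 256-bit bus and 1750 MHz memory clock on the RX 570.
    print(f"HD 7970 GHz Edition, stock 1500 MHz: {gddr5_bandwidth_gbs(1500, 384):.1f} GB/s")
    print(f"HD 7970 GHz Edition, at 1167 MHz:    {gddr5_bandwidth_gbs(1167, 384):.1f} GB/s")
    print(f"RX 570, stock 1750 MHz:              {gddr5_bandwidth_gbs(1750, 256):.1f} GB/s")
```

Under those assumptions the downclocked GHz Edition lands at roughly 224 GB/s, essentially the same as the RX 570, down from 288 GB/s at stock.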
Now, as I mentioned earlier, memory compression techniques were introduced with GCN 1.2, and they did carry over into GCN 1.3, so the RX 570 does have a bit of an upper hand here. Along with per-clock comparisons, I'm also going to run each card at its stock clock speeds so we can get an understanding of how both cards perform out of the box.
All testing is at 1080p resolution. I selected nine games; some of these titles are three to four years old, but the majority of them are current titles. It's also important for me to show a complete API presence, which means DirectX 11, DirectX 12, and Vulkan games are represented.
As I mentioned earlier, I did not dial up all the graphics settings to the max for every game tested, as I wanted to ensure that I was not hitting the VRAM limit of the HD 7970 GHz Edition. Before recording my results, I tested each game individually to ensure the lack of VRAM was not a constraint.
Here's a quick look at the usual test system, along with the OS and drivers. My hope is to replace this system with a faster one very soon. With the GPUs tested today, the system is not going to hold back either card, but it's a bit of a pig when it comes to power consumption.
First game up is Rainbow Six Siege, and here we're using the built-in benchmark along with the high preset and 100% render scale. Looking at per-clock results, we can see the RX 570 is 15% faster, and at stock clock speeds the RX 570 is 28% faster. This performance gap is what I would expect when comparing these two cards in a DirectX 11 game from this time period.
A similar story continues with our next game, Far Cry New Dawn. I'm using the built-in benchmark along with the high preset plus AA and motion blur. Comparing per clock, the RX 570 is only 10% faster than the GHz Edition, and out-of-the-box clocks show the RX 570 ahead by about 22%.
These are pretty similar results to Rainbow Six Siege. Let's now look at Assassin's Creed Odyssey, where again I'm using the built-in benchmark along with the medium preset. With both cards tested at the same clock speed, the RX 570 is 25% faster, and comparing stock clock speeds, the RX 570 is an impressive 42% faster.
Polaris is looking very impressive in this title, but you haven't seen anything yet. Let's move on from DirectX 11 testing and look at DirectX 12 performance. Here we have Deus Ex: Mankind Divided, and I'm testing using the high preset. Comparing at the same clock speeds shows the RX 570 is a whopping 41% faster than the GHz Edition. Out-of-the-box clock speeds show the 570 53% faster, and as you can see, the RX 570 with DirectX 12 delivers a large performance advantage over Tahiti. I also tested the HD 7970 using DirectX 11, and it ended up performing worse than with DirectX 12. The GHz Edition did show some gains using DirectX 12, but they were not very sizable.
Now on to a new title, Tom Clancy's The Division 2. Again we're using DirectX 12, with the medium preset. Per clock, the RX 570 is a massive 48% faster, and comparing stock clock speeds we see the 570 pull ahead by 67%. This is another title where I tested both DirectX 11 and 12.
The HD 7970 performed nearly the same in both. I did find this a bit strange, since this game does support async compute under DirectX 12 and the HD 7970 does support async compute, but I didn't see any evidence of it working here. Let's move on to Shadow of the Tomb Raider. Again I selected DirectX 12 along with the medium preset. Looking at per-clock performance, the RX 570 is 26% faster, and comparing stock clock speeds, the 570 ends up 39% faster.
Again I tested both DirectX 11 and 12, and the HD 7970's performance was the same. Now on to Strange Brigade. In this game you can choose either DirectX 12 or Vulkan, and I spent a fair amount of time testing both. It does appear that DirectX 12 performance on the HD 7970 is slightly better versus Vulkan, while with the RX 570 the performance difference between the two is negligible. That said, I stuck with DirectX 12 for the capture, and I used the high preset along with async compute. Testing these cards per clock showed the 570 was a colossal 50% faster, and comparing both cards at stock clock speeds, the 570 is now 65% faster. We're going to dig more into why towards the end of the video.
Now it's on to Battlefield V. I stuck with the single-player campaign to get our numbers, as they're easily repeatable. I used DirectX 12 and selected the high preset for our capture. Comparing the two at the same clock speeds, we can see the 570 is 30% faster.
Looking at stock clock speeds, we can see the 570 is now 42% faster. Again, I tested the HD 7970 in both DirectX 11 and 12, and performance between the two was a wash. The last game I tested today was World War Z. Again we used the built-in benchmark, along with the ultra preset.
Clock for clock, the RX 570 is 42% faster, and comparing out-of-the-box clock speeds, the 570 ends up being 55% faster. That's quite the gap. I had the choice of either DirectX 11 or Vulkan in this title, so I thought, hey, Vulkan should perform best on both cards, right? Well, no, it doesn't. Oddly enough, using DirectX 11 the HD 7970 performs 6 FPS better on average versus Vulkan. That's definitely a head-scratcher. As you would expect, the RX 570 performs quite a bit better using Vulkan over DirectX 11.
All right, let's change things up and start looking at some synthetic testing. Starting things off is 3DMark.
I'm only comparing graphics scores, so please keep that in mind when you're looking at these results. Comparing per clock, we can see the RX 570 is 40% faster than the GHz Edition, and at stock clock speeds the 570 ends up being 60% faster. Time Spy employs a pure DirectX 12 engine. It uses features like async compute and multi-threading, so it's no surprise the 570 is laying the smackdown in this one. Let's jump over to another synthetic benchmark, Unigine Superposition.
...and many more. All of these contribute to additional performance and efficiency, but there's one thing we have not talked about yet, and that is driver neglect. Let's face it: GCN 1.0 came to the market in early 2012. You can't expect it to be the priority for AMD's driver teams and developers; the prime targets for AMD GPUs are going to be Polaris, Vega, and now Navi. In addition, it could be that we're seeing the use of certain instructions that may have a higher instruction cost on older GCN 1.0.
This isn't to say the overall performance of the HD 7970 is terrible; I'm just surprised how much faster the RX 570 is in today's games using modern APIs. As much as I hate to say it, it's very likely the HD 7970 is starting to become Woody from Toy Story 4.
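For anyone who wants the per-game numbers in one place, here is a small Python sketch that tabulates the percentages quoted above and averages them. The figures are the ones from this post. The API labels follow the text, with World War Z labeled with both options because both were tried, and the simple arithmetic mean is just one convenient way to summarize; it is not a separately measured result.

```python
from statistics import mean

# (game, API, per-clock % lead, stock-clock % lead)
# Leads are for the RX 570 over the HD 7970 GHz Edition, as quoted in the post.
RESULTS = [
    ("Rainbow Six Siege",          "DX11",        15, 28),
    ("Far Cry New Dawn",           "DX11",        10, 22),
    ("Assassin's Creed Odyssey",   "DX11",        25, 42),
    ("Deus Ex: Mankind Divided",   "DX12",        41, 53),
    ("The Division 2",             "DX12",        48, 67),
    ("Shadow of the Tomb Raider",  "DX12",        26, 39),
    ("Strange Brigade",            "DX12",        50, 65),
    ("Battlefield V",              "DX12",        30, 42),
    ("World War Z",                "DX11/Vulkan", 42, 55),
]

PER_CLOCK, STOCK = 2, 3  # column indexes into each RESULTS row

def average_lead(rows, column):
    """Arithmetic mean of one percentage column."""
    return mean(row[column] for row in rows)

def by_api(rows, api):
    """Rows whose API label matches exactly."""
    return [row for row in rows if row[1] == api]

if __name__ == "__main__":
    for game, api, per_clock, stock in RESULTS:
        print(f"{game:<28}{api:<13}{per_clock:>3}% per clock {stock:>3}% stock")
    print(f"Average per clock: {average_lead(RESULTS, PER_CLOCK):.1f}%")
    print(f"Average at stock:  {average_lead(RESULTS, STOCK):.1f}%")
    for api in ("DX11", "DX12"):
        print(f"{api} per-clock average: {average_lead(by_api(RESULTS, api), PER_CLOCK):.1f}%")
```

Grouping by API makes the pattern in the results easier to see: the DirectX 11 titles show a much smaller per-clock gap than the DirectX 12 titles.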