The release of the alpha/beta version of Ashes of the Singularity sparked discussions and debates among many technology and GPU enthusiasts. The reason was the surprising benchmark results published all over the web. Surprising, at least, for those who did not already know what is going on inside the Radeon and GeForce architectures and drivers.
In short, the results showed huge performance improvements for AMD GPUs across the whole performance range, including low and mid-range GPUs like the Radeon 250X and 260X, but also high-performance GPUs like the Radeon 290X/390X and the flagship Fury X. We are talking about improvements as high as 60-80% in common scenarios(!). In contrast, results for Nvidia GeForce GPUs showed little improvement at best. The end result is that the much older Radeon R9 290X rivals the much newer and much-praised GTX 980, which practically makes the GTX 980 cost-ineffective.
Let’s stop for a minute and mention the DX12 / Vulkan / Mantle APIs. DirectX 12 is Microsoft’s new graphics API, and it was preceded by AMD’s Mantle. The free, open-standard Vulkan API has the same goals and is part of this new wave of APIs. AMD started with its Mantle API and released the first version at the beginning of 2014, accompanied by a Mantle version of Battlefield 4, which showed a significant performance improvement (after lots of bugs were fixed).
The idea behind the DX12/Vulkan/Mantle APIs is to allow much more direct access to the GPU’s functionality, removing a lot of the mediating code of previous-generation APIs like DX11 and OpenGL. Console APIs have done this for quite some time, allowing developers to squeeze a lot more performance out of the same hardware. I’ll try to summarize the DX12 advantages, though these are not direct advantages, only consequences of that more direct access:
- Considerably lower API overhead. APIs like DX11 carried a very high API overhead, caused by the considerable processing required to use them for creating and manipulating objects. Reducing that overhead frees CPU power that can now be used for other needs, like AI, for example.
- Less driver overhead thanks to significantly lower driver complexity, as in AMD’s case. Again, there is less mediation between the 3D application and the GPU.
- Considerably better parallel programming and design of 3D applications, since you can access the GPU from multiple threads instead of effectively only one thread (and so one core) as in DX11/OpenGL and earlier versions. Accessing the GPU from a single core overloaded that core and bottlenecked performance. Removing this limitation makes multicore CPUs much more significant for gaming.
- The better parallelism allows much, much better asynchronous computing by giving access to functionality that wasn’t really available before, with DX11. More specifically, we are talking about AMD’s GCN architecture, which (we now know) has very good parallel abilities. I’ll describe it a little more in another post to keep this one focused, but in general, the latest AMD GCN architecture has ACEs (Asynchronous Compute Engines), which are a kind of queue that can prepare commands ahead of time. The result, for AMD GCN GPUs, is better utilization of a GPU that is usually underutilized under DX11 (something like Hyperthreading), and also exposure of parallel, asynchronous processing of commands. I know this is not clear from my description alone; I don’t know it all, and this description is just an overview. I’ll write in more detail in another post. In the meantime, you can read that and that (Mahigan did a great job there) and you’ll know more than I know, really. You can also check AMD’s Southern Islands architecture page (published by AMD).
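To make the multithreading point a bit more concrete, here is a minimal sketch of the general pattern: several threads each record their own command list, and the lists are then submitted in order from one place. This is plain Python, not the real DX12 or Vulkan API. The names `record_command_list` and `build_frame` and the `draw(...)` strings are made up for illustration, and Python threads will not give a real speedup here, so treat it only as a picture of the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(meshes):
    """Record one chunk of draw calls into its own command list (hypothetical)."""
    return [f"draw({mesh})" for mesh in meshes]


def build_frame(meshes, workers):
    """Split the scene into chunks, record each chunk on its own thread,
    then submit all command lists in their original order."""
    chunk = max(1, -(-len(meshes) // workers))  # ceiling division, at least 1
    chunks = [meshes[i:i + chunk] for i in range(0, len(meshes), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the input chunks
        command_lists = list(pool.map(record_command_list, chunks))
    # Single submission point: flatten the lists into one ordered stream
    return [cmd for command_list in command_lists for cmd in command_list]


meshes = [f"mesh{i}" for i in range(8)]
frame = build_frame(meshes, workers=4)
```

Recording with four workers produces the same ordered command stream as recording with one. Under the older model, all of that recording work would land on a single thread.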
The last part, about asynchronous computing, seems to be AMD’s ace in the hole: its GCN architecture has been prepared for this moment since the first GCN generation years ago, while NV GPUs seem to be considerably lacking in comparison. No wonder AMD pushed Mantle forward. The annoying part is, again, the lack of information and the loads of crappy marketing from all companies.
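Here is a toy model of why overlapping compute with graphics could help. It is my own simplification, not AMD’s description of how ACEs work. I assume the graphics pass leaves some fraction of the GPU idle, and that compute work can be slotted into that idle capacity for free. The function names and all the numbers are hypothetical.

```python
def frame_time_serial(graphics_ms, compute_ms):
    """Compute work runs only after the graphics work has finished."""
    return graphics_ms + compute_ms


def frame_time_async(graphics_ms, compute_ms, idle_fraction):
    """Compute work fills the idle share of the graphics pass (assumed model).

    idle_fraction is the assumed share of GPU capacity that the graphics
    pass leaves unused; only the compute work that does not fit into that
    share adds to the frame time.
    """
    hidden_ms = min(compute_ms, graphics_ms * idle_fraction)
    return graphics_ms + (compute_ms - hidden_ms)


# Hypothetical workload: 12 ms of graphics, 4 ms of compute, 25% idle capacity
serial_ms = frame_time_serial(12.0, 4.0)
async_ms = frame_time_async(12.0, 4.0, idle_fraction=0.25)
```

With these made-up numbers, 3 ms of the 4 ms compute job hides inside the graphics pass. A GPU with no idle capacity to fill, or no way to fill it, would see no gain in this model.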
Let’s go to the benchmarks. More are available in the source links:
PC Perspective’s benchmarks first, comparing Intel and AMD CPUs and AMD and NV GPUs. You’ll see that with the AMD Radeon R9 390, the improvement is much higher than with the Nvidia GTX 980. The GTX 980 even sees lower performance in DX12 mode. One reason is probably that NV’s drivers are much better in DX11 (which was more or less known), with much lower API overhead, so there is less left to gain.
We also see how much multiple cores, beyond 4 cores, really help in this benchmark. Moreover, AMD’s CPUs still lag behind considerably. Yes, a lot of CPU power is freed up, but think about it: the freed CPU power on an AMD CPU is worth less than the freed power on the fast Intel cores.
Next, a benchmark from benchmark.pl (link). This one is interesting because it includes many lower-performance GPUs, which are probably far more relevant for many people.
The same trend we’ve seen before continues here. For example, the Radeon R7 260X and R7 370 perform around 50% faster in DX12 mode. That’s a lot. The R7 270X becomes ~65% faster and ends up only 2 FPS short of the newer GTX 960 (a ~6-8% advantage for the GTX 960), while in DX11 mode the same GTX 960 is 65% faster than the R7 270X.
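To show how these percentages relate to each other, here is a small worked example. The FPS values are hypothetical, picked only so that the ratios come out close to the ones quoted above. They are not the numbers measured by benchmark.pl, and `percent_faster` is just a helper I made up.

```python
def percent_faster(a_fps, b_fps):
    """Return how much faster a is than b, in percent."""
    return (a_fps / b_fps - 1.0) * 100.0


# Hypothetical FPS values (NOT measured results), chosen to match the ratios
r7_270x_dx11 = 20.0
r7_270x_dx12 = 33.0
gtx_960_dx11 = 33.0
gtx_960_dx12 = 35.0

radeon_gain = percent_faster(r7_270x_dx12, r7_270x_dx11)    # DX12 vs DX11
geforce_lead_dx11 = percent_faster(gtx_960_dx11, r7_270x_dx11)
geforce_lead_dx12 = percent_faster(gtx_960_dx12, r7_270x_dx12)

print(f"R7 270X gain from DX12: {radeon_gain:.0f}%")
print(f"GTX 960 lead in DX11:   {geforce_lead_dx11:.0f}%")
print(f"GTX 960 lead in DX12:   {geforce_lead_dx12:.0f}%")
```

The point is that a 65% lead can shrink to a single-digit lead when one card gains about 65% and the other barely moves.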
The bottom line is that we don’t fully understand what’s going on, except that Nvidia has some problem (but still has powerful GPUs) and that AMD’s performance until now was way(!) suboptimal (API overhead and underutilization). We’ll wait for more info, as we have no other choice, but it seems that AMD has a serious advantage over Nvidia in the asynchronous computing department, which seems to be the future (why wasn’t it like that before? That is the real question).
Just some final words: people are still trying to figure out what’s going on, because the companies themselves don’t really help. There will be no relief until a good, truly open source GPU company arises.