ATI Radeon HD 2900 XT: Calling a Spade a Spade
by Derek Wilson on May 14, 2007 12:04 PM EST- Posted in
- GPUs
R600 Overview
From a very high level, we have the same capabilities we saw in the G80, where each step in the pipeline runs on the same hardware. There are a lot of similarities when stepping way back, as the same goals need to be accomplished: data comes into the GPU, gets set up for processing, shader code runs on the data, and the result either heads back up for another pass through the shaders or moves on to be rendered out to the framebuffer.
The obvious points are that R600 is a unified architecture that supports DX10. The set of requirements for DX10 is very firm this time around, so we won't see any variations in feature support on a basic level. AMD and NVIDIA are free to go beyond the DX10 spec, but these features might not be exposed through the Microsoft API without a little tweaking. AMD includes one such feature, a tessellator unit, which we'll talk about more later. For now, let's take a look at the overall layout of R600.
Our first look shows a huge amount of stream processing power: 320 SPs all told. These are a little different than NVIDIA's SPs, and over the next few pages we'll talk about why. Rather than a small number of SPs spread across eight groups, our block diagram shows R600 has a high number of SPs in each of four groups. Each of these four groups is connected to its own texture unit, while they share a connection to shader export hardware and a local read/write cache.
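As a quick sanity check on that count, the block diagram's arrangement works out as a minimal sketch (the five-wide breakdown of each shader unit is something we'll cover in the shader pages that follow):

```python
# R600's 320 stream processors, as arranged in the block diagram:
# four SIMD groups, each containing 16 shader units, each unit five ALUs wide.
simd_groups = 4
units_per_group = 16
alus_per_unit = 5  # five-way arrangement, discussed later

total_sps = simd_groups * units_per_group * alus_per_unit
print(total_sps)  # 320
```

Contrast this with G80's layout of a smaller number of scalar SPs spread across eight clusters.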
All of this is built on an 80nm TSMC process and uses in the neighborhood of 720 million transistors. All other R6xx parts will be built on a 65nm process with many fewer transistors, making them much smaller and more power efficient. Core clock speed is on the order of 740MHz for R600, with memory running at 825MHz.
Memory is clocked lower this time around but delivers higher bandwidth, as R600 implements a 512-bit memory bus. While we're speaking about memory, AMD has revised its Ring Bus architecture for this round, which we'll delve into later. Unfortunately we won't be able to really compare it to NVIDIA's implementation, as NVIDIA won't go into any detail with us on internal memory buses.
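To put the bus width in perspective, here's the peak bandwidth the numbers above imply, assuming the 825MHz figure is the base memory clock with two transfers per clock (DDR):

```python
# Peak theoretical memory bandwidth for R600's 512-bit bus at 825MHz DDR.
bus_width_bits = 512
mem_clock_hz = 825e6
transfers_per_clock = 2  # double data rate (assumption: 825MHz is the base clock)

bandwidth_gb_s = bus_width_bits / 8 * mem_clock_hz * transfers_per_clock / 1e9
print(f"{bandwidth_gb_s:.1f} GB/s")  # 105.6 GB/s
```

That figure is well ahead of anything a 384-bit or 256-bit bus can manage at comparable memory clocks.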
And speaking of things NVIDIA won't go into detail on, AMD was good enough to share very low level details, including information on cache sizes and shader hardware implementation. We will be very happy to spend time talking about this, and hopefully AMD will inspire NVIDIA to start opening up a little more and going deeper into their underlying architecture.
To hit the other hot points, R600 has some rather interesting features that are unique to it. Aside from including a tessellation unit, AMD has also included an audio processor on the hardware. This will accept audio streams and send them out over the DVI port through a special converter, integrating audio with a video stream over HDMI. This is unique, as current HDMI converters only carry video. AMD also included a programmable AA resolve feature that allows its driver team to create new ways of filtering subsample data.
R600 also features an independent DMA engine that can handle moving and managing all memory to and from the GPU, whether it's over the PCIe bus or local memory channels. This combined with huge amounts of memory bandwidth should really assist applications that require large amounts of data. With DX10 supporting up to 8k x 8k textures, we are very interested in seeing these limits pushed in future games.
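To give a sense of the scale of those DX10 limits, a single maximum-size texture is a sizable chunk of a card's framebuffer; this is a rough sketch assuming an uncompressed 32-bit (RGBA8) format:

```python
# Footprint of a single DX10 maximum-size texture at 8k x 8k.
width = height = 8192
bytes_per_texel = 4  # assumption: uncompressed 8-bit-per-channel RGBA

size_mb = width * height * bytes_per_texel / (1024 ** 2)
print(size_mb)  # 256.0 MB, before mipmaps or compression are considered
```

A handful of such textures would saturate even a 512MB card, which is exactly why the combination of a fast DMA engine and abundant memory bandwidth matters here.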
That's enough of a general description to whet your appetite: let's dig down under the surface and find out what makes this thing tick.
86 Comments
mostlyprudent - Monday, May 14, 2007 - link
Frankly, neither the NVIDIA nor the AMD part at this price point is all that impressive an upgrade from the prior generations. We keep hearing that we will have to wait for DX10 titles to know the real performance of these cards, but I suspect that by the time DX10 titles are on the shelves we will have at least product line refreshes by both companies. Does anyone else feel like the graphics card industry is jerking our chains?
johnsonx - Monday, May 14, 2007 - link
It seems pretty obvious that AMD needs a Radeon HD2900Pro to fill in the gap between the 2900XT and 2600XT. Use R600 silicon, give it 256MB of RAM on a 256-bit memory bus. Lower the clocks 15% so that power consumption will be lower, and so that chips that don't bin at full XT speeds can be used. Price it at $250-$300. It would own the upper-midrange segment over the 8600GTS, and eat into the 8800GTS 320's lunch as well.
GlassHouse69 - Monday, May 14, 2007 - link
If I know this, and YOU know this... wouldn't AnandTech? I see money under the table or utter stupidity at work at Anand. I mean, I know that the .01+ version does a lot better in benches, and at higher res with AA/AF on it sometimes gets BETTER framerates than at lower res with no AA/AF. This is a driver thing. If I know this, and you know this, Anand must. I would rather admit to being corrupt than that stupid.
GlassHouse69 - Monday, May 14, 2007 - link
wrong section. dt is doing that today it seems to a few people
xfiver - Monday, May 14, 2007 - link
Hi, thank you for a really in-depth review. While reading other earlier reviews I remember a site using Catalyst 8.38 and reporting performance improvements of up to 14% over 8.37. Looking forward to AnandTech's view on this.
xfiver - Monday, May 14, 2007 - link
My apologies it was VR zone and 8.36 to 8.37 (not 8.38)
Gary Key - Tuesday, May 15, 2007 - link
I have worked extensively with four 8.37 releases and now the 8.38 release for the upcoming P35 release article. The 8.37.4.2 alpha driver had the top performance in SM3.0 heavy apps but was not very stable with numerous games, especially under Vista. The released 8.37.4.3 driver on AMD's website is the most stable driver to date and has decent performance but nothing near the alpha 8.37 or beta 8.38. The 8.38s offer great benchmark performance in the 3DMarks, several games, and a couple of DX10 benchmarks from AMD.
However, the 8.38s more or less broke CrossFire, OpenGL, and video acceleration in Vista depending upon the app and IQ is not always perfect. While there is a great deal of promise in their performance and we see the potential, they are still Beta drivers that have a long ways to go in certain areas before their final release date of 5/23 (internal target).
That said, would you rather see impressive results in 3DMarks or have someone tell you the truth about the development progress, or lack of it, with the drivers? As much as I would like to see this card's performance improve immediately, it is what it is at this time with the released drivers. AMD/ATI will improve the performance of the card with better drivers, but until they are released our only choice is to go with what they sent. We said the same thing about NVIDIA's early driver issues with the G80, so there are not any fanboys or people taking money under the table around here. You can put all the lipstick on a pig you want, but in the end, you still have a pig. ;-)
Anand Lal Shimpi - Monday, May 14, 2007 - link
There's nothing sinister going on, ATI gave us 8.37 to test with and told us to use it. We got 8.38 today and are currently testing it for a follow-up.

Take care,
Anand
GlassHouse69 - Monday, May 14, 2007 - link
wow dood. you replied!
Yes, I have been wondering about the ethics of your group here for about a year now. I felt there was some sorta slick leaning and masking thing going on. Nice to see there is not.
Thanks for the 1000's of articles and tests!
-Mr. Glass