Applications of GF100’s Compute Hardware

Last but certainly not least are the changes to gaming afforded by the improved compute/shader hardware. NVIDIA believes that by announcing the compute abilities of the GF100 so far ahead of its gaming abilities, it has given potential customers the wrong idea about NVIDIA’s direction. Certainly the company is increasing its focus on the GPGPU market, but as it is trying its hardest to point out, most of that compute hardware has a use in gaming too.

Much of this is straightforward: the compute hardware is what processes the pixel and vertex shader commands, so the additional CUDA cores in the GF100 give it much more shader power than the GT200. We also have DirectCompute, which can use the compute hardware to quickly do some things that couldn’t be done quickly via shader code, such as Screen Space Ambient Occlusion in games like Battleforge, or to take an NVIDIA example, the depth-of-field effect in Metro 2033.

Perhaps the single biggest improvement for gaming that comes from NVIDIA’s changes to the compute hardware is the benefit afforded to compute-like tasks in games. PhysX plays a big part here, as along with DirectCompute it’s going to be one of the biggest uses of compute abilities when it comes to gaming.

NVIDIA is heavily promoting the idea that GF100’s concurrent kernels and fast context switching abilities are going to be of significant benefit here. With concurrent kernels, different PhysX simulations can start without waiting for other SMs to complete the previous simulation. With fast context switching, the GPU can switch from rendering to PhysX and back again while wasting less time on the context switch itself. The result is that there’s going to be less overhead in using the compute abilities of GF100 during gaming, be it for PhysX, Bullet Physics, or DirectCompute.
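To make the concurrent kernels argument concrete, here is a toy model rather than a description of how GF100's scheduler actually works. It assumes a GPU with 16 SMs and four hypothetical PhysX kernels, each of which can only fill some of those SMs for a made-up number of milliseconds. The kernel names, SM counts, and durations are all illustrative assumptions. The serial case makes each kernel wait for the previous one to finish; the concurrent case starts a kernel as soon as enough SMs are free.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    sms: int         # SMs the kernel can occupy (hypothetical)
    duration: float  # run time in milliseconds (hypothetical)


def serial_time(kernels):
    """Each kernel waits for the previous one, regardless of idle SMs."""
    return sum(k.duration for k in kernels)


def concurrent_time(kernels, total_sms):
    """Launch kernels in order, starting each as soon as enough SMs are free."""
    free = total_sms
    now = 0.0
    running = []  # min-heap of (finish_time, sms_held)
    for k in kernels:
        if k.sms > total_sms:
            raise ValueError(f"{k.name} needs more SMs than the GPU has")
        # Advance time until earlier kernels have released enough SMs.
        while free < k.sms:
            finish, sms = heapq.heappop(running)
            now = max(now, finish)
            free += sms
        heapq.heappush(running, (now + k.duration, k.sms))
        free -= k.sms
    # The batch is done when the last running kernel finishes.
    return max([now] + [finish for finish, _ in running])


# Illustrative workload: none of these numbers come from NVIDIA.
workload = [
    Kernel("cloth", sms=4, duration=2.0),
    Kernel("fluid", sms=8, duration=3.0),
    Kernel("particles", sms=4, duration=1.0),
    Kernel("rigid_bodies", sms=8, duration=2.0),
]

serial_ms = serial_time(workload)
concurrent_ms = concurrent_time(workload, total_sms=16)
print(f"serial: {serial_ms} ms, concurrent: {concurrent_ms} ms")
```

In this sketch the gain comes entirely from small kernels no longer leaving SMs idle. A kernel that already fills the whole GPU would see no benefit, and the model ignores context switch cost altogether.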

NVIDIA is big on pushing specific examples here in order to entice developers into using these abilities, and a number of demo programs will be released along with GF100 cards to showcase them. Most interesting among these is a ray tracing demo. Ray tracing is something even G80 could do (albeit slowly), but we find this an interesting way for NVIDIA to go, since promoting ray tracing puts them in direct competition with Intel, which has been showing off ray tracing demos running on CPUs for years. Ray tracing nullifies NVIDIA’s experience in rasterization, so promoting its use is one of the riskier things they can do in the long term.


NVIDIA's car ray tracing demo

At any rate, the demo program they are showing off is a hybrid that uses both rasterization and ray tracing to render a car. As we already know from the original Fermi introduction, GF100 is supposed to be much faster than GT200 at ray tracing, thanks in large part to the L1 cache architecture of GF100. In the demo we saw, a GF100 card running next to a GT200 card performed roughly 3x as well. This specific demo still runs at less than a frame per second (0.63 on the GF100 card), so it’s by no means true real-time ray tracing, but it’s getting faster all the time. Lower quality ray tracing would certainly be doable in real time.
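For a sense of scale, the figures above can be turned into frame times with a little arithmetic. This is a back-of-the-envelope sketch: the 0.63 fps figure and the "roughly 3x" ratio come from the demo, while the 30 fps real-time target is our own assumption, and the GT200 number is only as precise as "roughly 3x" allows.

```python
def frame_time_seconds(fps):
    """Seconds spent rendering a single frame at a given frame rate."""
    return 1.0 / fps


gf100_fps = 0.63       # measured in the demo
approx_speedup = 3.0   # "roughly 3x" over GT200, so treat as approximate
realtime_fps = 30.0    # assumed real-time target, not an NVIDIA figure

# Implied GT200 frame rate, derived from the approximate ratio.
gt200_fps = gf100_fps / approx_speedup

# How much faster GF100 would need to be to hit the assumed target.
needed_speedup = realtime_fps / gf100_fps

print(f"GF100: {frame_time_seconds(gf100_fps):.2f} s per frame")
print(f"GT200: about {frame_time_seconds(gt200_fps):.2f} s per frame")
print(f"Speedup needed for {realtime_fps:.0f} fps: about {needed_speedup:.1f}x")
```

In other words, at this quality level the demo would need to run dozens of times faster to reach an assumed 30 fps target, which is why lower quality settings are the realistic route to real-time ray tracing on this hardware.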


Dark Void's turbulence in action

NVIDIA is also showing off several other demos of compute for gaming, including a PhysX fluid simulation, the new PhysX APEX turbulence effect on Dark Void, and an AI path finding simulation that we did not have a chance to see. Ultimately PhysX is still NVIDIA’s bigger carrot for consumers, while the rest of this is to entice developers to make use of the compute hardware through whatever means they’d like (PhysX, OpenCL, DirectCompute). Outside of PhysX, heavy use of the GPU compute abilities is still going to be some time off.

115 Comments

  • dentatus - Monday, January 18, 2010 - link

    Absolutely. Really, the GT200/RV700 generation of DX10 cards was inarguably 'won' (i.e most profitable) for AMD/ATI by cards like the HD4850. But the overall performance crown (i.e highest in-generation performance) was won off the back of the GTX295 for nvidia.

    But I agree with chizow that nvidia has ultimately been "winning" (the performance crown) each generation since the G80.
  • chizow - Monday, January 18, 2010 - link

    Not sure how you can claim AMD "inarguably" won DX10 with 4850 using profits as a metric. How many times did AMD turn a profit since RV770 launched? Zero. They've posted 12 straight quarters of losses last time I checked. Nvidia otoh has turned a profit in many of those quarters and most recently Q3 09 despite not having the fastest GPU on the market.

    Also, the fundamental problem people don't seem to understand with regard to AMD and Nvidia die size and product distribution is that they overlap completely different market segments. Again, this simply serves as a referendum in the differences in their business models. You may also notice these differences are pretty similar to what AMD sees from Intel on the CPU side of things....

    Nvidia GT200 die go into all high-end and mainstream parts like GTX 295, 285, 275, 260 that sell for much higher prices. AMD RV770 die went into 4870, 4850, and 4830. The latter two parts were competing with Nvidia's much cheaper and smaller G92 and G96 parts. You can clearly see that the comparison between die/wafer sizes isn't a valid one.

    AMD has learned from this btw, and this time around it looks like they're using different die for their top tier parts (Cypress) and their lower tier parts (Redwood, Cedar) so that they don't have to sell their high-end die at mainstream prices.
  • Stas - Tuesday, January 19, 2010 - link

    [quote]Not sure how you can claim AMD "inarguably" won DX10 with 4850 using profits as a metric. How many times did AMD turn a profit since RV770 launched? Zero. They've posted 12 straight quarters of losses last time I checked. Nvidia otoh has turned a profit in many of those quarters and most recently Q3 09 despite not having the fastest GPU on the market. [/quote]
    AMD also makes CPUs... they also lost market due to Intel's high end domination... they lost money on ATI... If it wasn't for success of the HD4000 series, AMD would've been in deep shit. Just think before you post.
  • Calin - Tuesday, January 19, 2010 - link

    Hard to make a profit paying the rates of a 5 billion credit - but if you want to take it this way (total profits), why wouldn't we take total income?
    AMD/ATI:
    PERIOD ENDING     26-Sep-09   27-Jun-09   28-Mar-09   27-Dec-08
    Total Revenue     1,396,000   1,184,000   1,177,000   1,227,000
    Cost of Revenue     811,000     743,000     666,000   1,112,000
    Gross Profit        585,000     441,000     511,000     115,000

    NVidia:
    PERIOD ENDING     25-Oct-09   26-Jul-09   26-Apr-09   25-Jan-09
    Total Revenue       903,206     776,520     664,231     481,140
    Cost of Revenue     511,423     619,797     474,535     339,474
    Gross Profit        391,783     156,723     189,696     141,666

    Not looking so good for the "winner of the generation", though. As for the die size and product distribution, all I'm looking at is the retail video card offer, and every price bracket I choose has both NVidia and AMD in it.
  • knutjb - Wednesday, January 20, 2010 - link

    You missed my point. I wasn't talking about AMD as a whole I was talking about ATI as a division within AMD. If a company bleeds that much and still survives some part of the company must be making some money and that is the ATI division. ATI is making money. Your macro numbers mean zip.

    The model ATI is using is putting out competitive cards from a company, AMD, that is bleeding badly. What generation card is easier to sell the new and improved one with more features, useful or not, or the last generation chip?
  • beck2448 - Tuesday, January 19, 2010 - link

    Those numbers are ludicrous. AMD hasn't made a profit in years. ATI's revenue is about 30% of Nvidia's.
  • knutjb - Monday, January 18, 2010 - link

    ATI is what has been floating AMD with its profits. ATI has decided to make smaller incremental developmental steps that lower end production costs.

    Nvidia takes a long time to create a monolithic monster that required massive amounts of capital to develop. They will not recoup this investment off gamers alone because most don't have that much cash to put one of those cards in their machines. It is needed for marketing so they can push lower level cards implying superiority, real or not, they are a heavy marketing company. This chip is directed at their GPU server market and that is where they hope to make their money hoping it can do both really well.

    ATI on the other hand, by making smaller steps but at a higher cycle of product development, has focused on the performance/mainstream market. With lower development costs they can turn out new cards that pay back development costs quicker, allowing them to put that capital back into new products. Look at the 4890 and 4870. They both share similar architecture but the 4890 is a more refined chip. It was a product that allowed ATI to keep Nvidia reacting to ATI's products.

    Nvidia's marketing requires them to have the fastest card on the market. ATI isn't trying to keep the absolute performance crown but hold onto the price/performance crown. Every time they put out a slightly faster card it forces Nvidia to respond. Nvidia receives lower profits from having to drop card prices. I don't think this chip will be able to function on the 8800 model because AMD/ATI is now on stronger financial footing than they have been in the past couple years and Nvidia being late to market is helping ATI line their pockets with cash. The 5000 series is just marginally better, but is better than Nvidia's current offerings.

    Will Nvidia release just a single high end card or several tiers of cards to compete across the board? I don't think one card will really help the bottom line over the longer term.
  • StormyParis - Monday, January 18, 2010 - link

    I'm not sure what "winning" means, nor, really what a generation is.

    you can win on highest performance, highest marketshare, highest profit, best engineering...

    a generation may also be a DirectX iteration, a chip release cycle (in which case, each manufacturer has its own), a fiscal year...

    Anyhoo, I don't really care, as long as i'm regularly getting better, cheaper cards. I'll happily switch back to nVidia
  • chizow - Monday, January 18, 2010 - link

    I clearly defined what I considered a generation, historically the rest of the metrics measured over time (market share, mind share, profits, value-add features, game support) tend to follow suit.

    For someone like you that doesn't care about who's winning a generation it should be simple enough, buy whatever is best that suits your price:performance requirements when you're ready to buy.

    For those who want to make an informed decision once every 12-16 months per generation to avoid those niggling uncertainties and any potential buyer's remorse, they would certainly want to consider both IHV's offerings before making that decision.
  • Ahmed0 - Monday, January 18, 2010 - link

    How can you "win" if your product isn't intended for a meaningful number of customers? I'm sure ATi could pull out the biggest, most expensive, hottest and fastest card in the world as well, but there's a reason why they don't.

    Really, the performance crown isn't anything special. The title goes from hand to hand all the time.
