The GPU Advances: ATI's Stream Processing & Folding@Home
by Ryan Smith on September 30, 2006 8:00 PM EST- Posted in
- GPUs
Enter the GPU
Modern GPUs such as the R580 core powering ATI's X19xx series have upwards of 48 pixel shading units, designed to do exactly what the Folding@Home team requires. With help from ATI, the Folding@Home team has created a version of their client that can utilize ATI's X19xx GPUs with very impressive results. While we do not have the client in our hands quite yet, as it will not be released until Monday, the Folding@Home team is saying that the GPU-accelerated client is 20 to 40 times faster than their clients just using the CPU. Once we have the client in our hands, we'll put this to the test, but even a fraction of this number would represent a massive speedup.
With this kind of speedup, the Folding@Home research group is looking to finally be able to run simulations involving longer folding periods and more complex proteins that they couldn't run before, allowing them to research new proteins that were previously inaccessible. This implementation also allows them to finally do some research on their own, without requiring the entire world's help, by building a cluster of (relatively) cheap video cards to do research, something they've never been able to do before.
Unfortunately for home users, for the time being, the number of those who can help out by donating their GPU resources is rather limited. The first beta client to be released on Monday only works on ATI GPUs, and even then only works on single X19xx cards. The research group has indicated that they are hoping to expand this to CrossFire-enabled platforms soon, along with less-powerful ATI cards.
The situation for NVIDIA users however isn't as rosy, as while the research group would like to expand this to use the latest GeForce cards, their current attempts at implementing GPU-accelerated processing on those cards has shown that NVIDIA's cards are too slow compared to ATI's to be used. Whether this is due to a subtle architectural difference between the two, or if it's a result of ATI's greater emphasis on pixel shading with this generation of cards as compared to NVIDIA we're not sure, but Folding@Home won't be coming to NVIDIA cards as long as the research group can't solve the performance problem.
Conclusion
The Folding@Home project is the first of what ATI is hoping will be many projects and applications, both academic and commercial, that will be able to tap the power of GPUs. Given the results showcased by the Folding@Home project, the impact on the applications that would work well on a GPU could be huge. In the future we hope to be testing technologies such as GPU-accelerated physics processing for which both ATI and NVIDIA have promised support, and other yet to be announced applications that utilize stream processing techniques.
It's been a longer wait than we were hoping for, but we're finally seeing the power of the GPU unleashed as was promised so long ago, starting with Folding@Home. As GPUs continue to grow in abilities and power, it should come as no surprise that ATI, NVIDIA, and their CPU-producing counterparts are looking at how to better connect GPUs and other such coprocessors to the CPU in order to further enable this kind of processing and boost its performance. As we see AMD's Torrenza technology and Intel's competing Geneseo technology implemented in computer designs, we'll no doubt see more applications make use of the GPU, in what could be one of the biggest-single performance improvements in years. The GPU is not just for graphics any more.
As for our readers interested in trying out the Folding@Home research group's efforts in GPU acceleration and contributing towards understanding and finding a cure for Alzheimer's, the first GPU beta client is scheduled to be released on Monday. For more information on Folding@Home or how to use the client once it does come out, our Team AnandTech members over in our Distributed Computing forum will be more than happy to give a helping hand.
Modern GPUs such as the R580 core powering ATI's X19xx series have upwards of 48 pixel shading units, designed to do exactly what the Folding@Home team requires. With help from ATI, the Folding@Home team has created a version of their client that can utilize ATI's X19xx GPUs with very impressive results. While we do not have the client in our hands quite yet, as it will not be released until Monday, the Folding@Home team is saying that the GPU-accelerated client is 20 to 40 times faster than their clients just using the CPU. Once we have the client in our hands, we'll put this to the test, but even a fraction of this number would represent a massive speedup.
Click to enlarge |
With this kind of speedup, the Folding@Home research group is looking to finally be able to run simulations involving longer folding periods and more complex proteins that they couldn't run before, allowing them to research new proteins that were previously inaccessible. This implementation also allows them to finally do some research on their own, without requiring the entire world's help, by building a cluster of (relatively) cheap video cards to do research, something they've never been able to do before.
Unfortunately for home users, for the time being, the number of those who can help out by donating their GPU resources is rather limited. The first beta client to be released on Monday only works on ATI GPUs, and even then only works on single X19xx cards. The research group has indicated that they are hoping to expand this to CrossFire-enabled platforms soon, along with less-powerful ATI cards.
The situation for NVIDIA users however isn't as rosy, as while the research group would like to expand this to use the latest GeForce cards, their current attempts at implementing GPU-accelerated processing on those cards has shown that NVIDIA's cards are too slow compared to ATI's to be used. Whether this is due to a subtle architectural difference between the two, or if it's a result of ATI's greater emphasis on pixel shading with this generation of cards as compared to NVIDIA we're not sure, but Folding@Home won't be coming to NVIDIA cards as long as the research group can't solve the performance problem.
Conclusion
The Folding@Home project is the first of what ATI is hoping will be many projects and applications, both academic and commercial, that will be able to tap the power of GPUs. Given the results showcased by the Folding@Home project, the impact on the applications that would work well on a GPU could be huge. In the future we hope to be testing technologies such as GPU-accelerated physics processing for which both ATI and NVIDIA have promised support, and other yet to be announced applications that utilize stream processing techniques.
It's been a longer wait than we were hoping for, but we're finally seeing the power of the GPU unleashed as was promised so long ago, starting with Folding@Home. As GPUs continue to grow in abilities and power, it should come as no surprise that ATI, NVIDIA, and their CPU-producing counterparts are looking at how to better connect GPUs and other such coprocessors to the CPU in order to further enable this kind of processing and boost its performance. As we see AMD's Torrenza technology and Intel's competing Geneseo technology implemented in computer designs, we'll no doubt see more applications make use of the GPU, in what could be one of the biggest-single performance improvements in years. The GPU is not just for graphics any more.
As for our readers interested in trying out the Folding@Home research group's efforts in GPU acceleration and contributing towards understanding and finding a cure for Alzheimer's, the first GPU beta client is scheduled to be released on Monday. For more information on Folding@Home or how to use the client once it does come out, our Team AnandTech members over in our Distributed Computing forum will be more than happy to give a helping hand.
43 Comments
View All Comments
photoguy99 - Sunday, October 1, 2006 - link
Basic research is like that - it may takes a lot of years to benefit from it.Look at Einstein, his work was fundamental research but the benefits are still being realized 100 years later.
So even if they have major breakthroughs they may be at such a foundational level that the actual cure for Alz. comes 25 years later.
Nature of the beast.
JarredWalton - Sunday, October 1, 2006 - link
http://folding.stanford.edu/results.html">Published ResultsCurrent research includes:
http://folding.stanford.edu/FAQ-diseases.html#AD">Alzheimer's, Cancer, Huntington's Disease, Osteogenesis Imperfecta, Parkinson's Disease, Ribosome and antibiotics.
And of course, there's always http://folding.stanford.edu/faq.html">the Folding@Home FAQ.
Do they know in advance that all of the issues are related to protein folding? No, but I'd assume they have good cause to suspect it. The problem is that it takes time; breakthrough results might not materialize soon, next year, or even for 5-10 year. Should research halt just because the task is difficult? Personally, I think FAH has a far greater chance of impacting the world during my lifetime than SETI@Home.
Cheers!
Baked - Sunday, October 1, 2006 - link
I wonder if a X1600 card will work. I've tried both the graphics and command line version of F@H on my new system but both had problem connecting to F@H server. Hopefully this new F@H version will work.JarredWalton - Sunday, October 1, 2006 - link
Next step is to extend to X1800 and probably from there to X1600. Beyond that, the G70 chips are probably the next up would be my guess.smitty3268 - Sunday, October 1, 2006 - link
I assume the new client uses the cpu + gpu, and not just the gpu? Also, it would be nice to have some sort of explanation for the poor nvidia performance in the next article. Is it just their architecture, or has Folding@Home been getting assistance from ATI and not NVidia?This doesn't make much sense:
The Folding@Home design is quite obviously a massivly parallel design, as shown by the fact that hundreds of thousands of computers are all working on the same problem. Therefore, doubling the amount of cores would double the amount of work being done and this seems to be happening faster than the old incremental speed bumps.
Otherwise, it was a good article.
z3R0C00L - Monday, October 2, 2006 - link
It's Simple.. F@H uses Dynamic Branching Calculations. nVIDIA GPU's are technologically inferior to ATi VPU's when it comes to shading performance and branching performance.As such.. nVIDIA's highly mighty GeForce 7950GX2 would perform much like an ATi Radeon x1600XT. In other words.. too slow.
tygrus - Monday, October 9, 2006 - link
The Nvidia FP hardware is fast enough but the overall design doesn't fit well with the software(task) design of F@H. For other tasks the Nvidia GPU's may be very fast. The next Nvidia GPU & API will hopefully be better.The CPU handles the data transformation and setup before sending to GPU for the accelerated portion. Then the CPU overseas the return of data from the GPU. The CPU also looks after the log, text console, disk read/write, internet upload&download, and other system ovreheads.
More information is available from ???
i just found a really great article which covers the public release: http://techreport.com/onearticle.x/10907">http://techreport.com/onearticle.x/10907
quote:
--------------------------------------------------------------------------------
Talk w/Vijay Pande
ATI is currently 8X faster than Nvidia. Nvidia has our code, running it internally, hope we can close the gap. But even 4X difference is large, and ATI is getting faster all of the time.
Lot of work goes into qualifying GPUs internally so they can run.
Making apps like this run on a GPU requires a lot of development work. Currently, science is best served by using ATI chips. Nv may come in future.
---------
The CPU has to poll the GPU to find out if it finished a block and needs help (data from GPU->CPU etc). This takes a context switch and CPU time to wait for the reply (ns wait not fixed number of cycles). Any acutual work is done in the remaining time slice or more. The faster the GPU, the more it demands uses the CPU. Slow the CPU in half and you may be slowing GPU by upto half.
Ryan Smith - Sunday, October 1, 2006 - link
The new client "uses" the CPU like all applications do, but the core is GPU-based, so it won't be pushing the CPU like it does on the CPU-only client, I don't know to what level that means however.As for the Nvidia stuff, we only know what the Folding team tells us. They made it clear that the Nvidia cards do not show the massive gains that ATI's cards do when they try to implement their GPU code on Nvidia's cards. Folding@Home has been getting assistance from Nvidia, but they also made it clear that this is something they can do without help, so the problem is in the design of the G7x in executing their code.
As for the core stuff, this is something the Folding team explicitly brought up with us. The analogy they used is trying to bring together 2000 grad students to write a PhD thesis in 1 day, it doesn't scale like that. They can add cores to a certain point, but the returns are diminishing versus faster methods of processing. This is directly a problem for Folding@Home, which is why they are putting efforts in to stream processing, which can offer the gains they need.
smitty3268 - Sunday, October 1, 2006 - link
Does the G7x have as much support for 32bit floats as ATI does? It seems like I read somewhere that one of the two had moved to 32 bit exclusively while the other was still much faster at 16/24 bit fp. Could that be why they aren't seeing the same performance from NVidia?Clauzii - Monday, October 2, 2006 - link
Probably that, and the fact that the big ATI models contain 48 shaders - pretty beefes the calculations up!