OpenCL 1.0: The Road to Pervasive GPU Computing
by Derek Wilson on December 31, 2008 6:40 PM EST
Posted in GPUs
Open, Closed, Proprietary ... Sorting out the Confusion
Over the past few months, we've seen plenty of confusion over the direction NVIDIA and AMD are taking with respect to GPU computing. This isn't helped by either AMD or NVIDIA, both of which tend to tout the advantages of their own approach and the disadvantages of the other guy's take on it.
AMD and its supporters tend to claim that NVIDIA's CUDA is not optimal because it is not an open standard, and that AMD supports openness because its solution (Brook+) is open source. But Brook+ isn't an open standard either: it is derived from Brook, which was developed at Stanford University, and it hasn't been standardized. While the source for the Brook+ compiler is available, it would take a large investment to retool it for NVIDIA hardware. Even then, you'd need to build different versions of a program for AMD and NVIDIA platforms. The original Brook is a different story, as it generated OpenGL code to do the GPGPU work, but modifying it to generate CAL code makes it neither interoperable nor particularly open or standard, at least as those terms are used when talking about languages, APIs and interoperability.
NVIDIA isn't much better, though. They tend to act as if anything AMD does is an attempt to copy them and amounts to nothing, because CUDA for C is the gold standard for GPU computing and AMD doesn't have it. That just isn't the case. In fact, AMD started demonstrating concerted efforts to advance GPU computing before we saw anything from NVIDIA, and in much more interesting ways.
With R580, AMD (then ATI) actually published part of their ISA and called the initiative CTM (for Close to Metal). Before we had a beta version of CUDA, we had Folding@home GPU accelerated on R520 and R580. Beyond that, CUDA for C has done really well in the HPC (high performance computing) space, but it hasn't caught on in the consumer space. Neither AMD nor NVIDIA has a viable consumer oriented solution for GPU computing.
So NVIDIA has the HPC market with CUDA and has gotten some universities to start teaching data parallel programming using CUDA for C. AMD could make an investment in the CUDA for C language and create their own compiler (nothing is stopping them). But then you would still have the same interoperability problem you would have if NVIDIA implemented Brook+. If NVIDIA or AMD wants to make its solution work with the other guy's hardware, it would need to write a wrapper to translate CAL to PTX or PTX to CAL. Or we could go in a different direction and work on building an industry standard virtual ISA for data parallel architectures. But I doubt that effort would ever take off.
So the bottom line is that AMD and NVIDIA each support both a proprietary solution (Brook+ and CUDA for C, respectively) and an open standard (OpenCL). There are further differences between Brook+ and CUDA, but the important part is that these proprietary solutions are never going to produce one binary that runs on both AMD and NVIDIA hardware, both because of the approach used and because AMD and NVIDIA aren't going to work together closely enough to make something like that happen, at least not in the foreseeable future.
OpenCL, on the other hand, offers developers the ability to write an application once, compile it once, and expect it to run on all major GPU hardware, something that could never happen with either CUDA or Brook+.
37 Comments
yyrkoon - Saturday, January 3, 2009 - link
Apparently I *am* more knowledgeable than some here. How you can twist the context of comments to your misguided reasoning ( that I favor Microsoft ) is beyond me. Do I prefer Windows to OSX ? Yes. Why? Because maybe Microsoft is not perfect, but at least they do not force unwanted hardware on me to use their software.

Windows is the only real gaming OS. Period. And I suppose my comment about Cross platform applications, and other good strong possible uses in a *NIX environment fell on deaf ears too( uses for OpenCL ).
There is nothing wrong with OSX, it is after all based on BSD. However I will not over pay for hardware *just* to use it either. There are too many free operating systems that are just as good. If I need Windows application compatibility, I will just run Windows. Apple offers me *nothing* I have to have.
Now, who here is truly blind ?
melgross - Saturday, January 3, 2009 - link
You just want to think you are.

You have gaming on the brain. I guess you must BE a gamer as that's all they think about anyway.
Penti - Saturday, January 3, 2009 - link
Really who cares about the gaming? This isn't a physics framework or engine.

It can be used in games, but this isn't really about a discussion on Apple gaming. That's not really why it can "speak" to each other.
Apple has a lot of professional applications that today use the open standard OpenGL, like photo editing, video editing, VFX and others (scientific apps etc) on their platform, for not only graphics but for gpgpu, from not only themselves but from vendors such as Adobe and Avid. Most of the apps also use OpenGL for acceleration in Windows too. Besides that, OpenCL will be available for handheld devices such as mobile phones. Even though Microsoft does software for phones you won't see DX11 or GPGPU there. Not that I'm an Apple fanboy, but I can see why Apple builds on what's already around and extends OpenGL and free standards. They can't rely on closed standards, most of their apps (other vendors for OS X) are to some degree cross platform as they should be. CUDA is already available on the Mac too. But you can't expect them to run DX. This isn't about Apple as an OEM either. It's about software (Microsoft does hardware too). It's engineered to fit a wider picture and a wider array of devices including Windows, there isn't anything bad about that. There isn't anything bad about getting consumer and professional apps a boost in using GPGPU. It's certainly what some ISVs want. There's more than gaming in the world. Microsoft are free to do whatever and nobody has said that they aren't best on games, but people are also free to criticize and complain about Microsoft, just as they are about Apple, and there certainly is a lot to criticize both about. Apple for certain can't just be catering to itself, not when they and their software vendors want something else. Microsoft essentially can, as most are already deeply invested in Microsoft tech and soft. That doesn't mean Windows users can't benefit from the Apple developed OpenCL. There certainly are Windows only apps that will use it. Even non OpenGL ones. It's not only a cross platform library.
Atechie - Friday, January 2, 2009 - link
Drop the Apple-preaching, it's uninteresting as Apple is neither HPC nor the mainstay platform for CUDA/Brook+/OpenCL.

.oO(I swear, Apple-jocks are like religious zealots, they can stop pushing their religion down everybody else's throat...interested or not.)
melgross - Saturday, January 3, 2009 - link
Yeah, just like people like you who do the opposite?

Why mention the company who did all the work, as long as it's Apple? Right? That makes people fanboys if we think a proper mention should be made?
Shadowself - Friday, January 2, 2009 - link
So anyone says anything positive about Apple and immediately that equates to being an Apple zealot? It appears more likely that your personal bias is showing.

It is absolutely true that Apple's Mac has NEVER been a gamer's platform -- and it probably never will be. Additionally, Apple has never fully supported (or even properly supported, IMHO) any development other than their core groups (K-12, Undergraduate to some extent, graphics and motion picture artist communities, and publishing). Thus Apple supports low to mid range graphics cards and very high end 3D cards -- but absolutely nothing for the moderate to high end gamer.
However, Apple did do the vast majority of OpenCL before submitting it to become an open standard. Apple wants to expand its role in the graphics and motion picture communities. The only way to do this was to do something like OpenCL. Additionally, Apple knew that a completely closed set of APIs was not going to gain any traction. Thus they submitted it as an open standard and gave up control of it.
Not mentioning that Apple did the majority of OpenCL is wrong. For anyone to claim Apple did this altruistically is wrong. To bash Apple for coming up with something that has become a cross platform standard that can utilize both AMD and nVidia cards as well as a host of other hardware is wrong.
yyrkoon - Thursday, January 1, 2009 - link
I never said it wasn't true. Let us just say that I am less than inspired to even bother looking. OpenGL is very low on my personal list of priorities, and I could care less what Apple does( unless perhaps if someday they compete head to head with Microsoft ).

Still, no matter how much I like or dislike OpenCL, chances are pretty good that on Windows platforms, it is going to be rendered( pun? ) moot. Maybe it will make the next greatest XGL even more powerful, so all those people who like to play with their application windows in linux can spend all day every day bragging/ making youtube videos about how their desktop UI can do *this*, and *that* while remaining even less productive than before ; )
Yes, the above is sarcasm to some extent, but it is also true to an extent as well. OpenCL will help those who prefer an alternative to Windows do similar things without having to own Windows. Scientists who want to use GPGPU(s) to crunch some serious numbers, etc. What it will not do however is make the majority of gamers out there happy. *Unless* the majority of game developers start using OpenGL/CL on the Windows platform( Which is very unlikely ). Certain cross platform applications however could benefit, sure.
Penti - Friday, January 2, 2009 - link
So OpenCL and OpenGL are bad because they're cross platform and open standards? If you look at who's involved you see companies like ARM and embedded computing companies, they can't really use anything like DX11. This isn't just for games but GPGPU in general.

It's not like there aren't apps using OpenGL on Windows either. But it's rather about a broader spectrum than owning or not owning Windows. It's for a wider category of devices than DX11 is. You won't have DX11 cellphones. But you will have OpenCL on the next gen Sony and Nintendo consoles, handhelds, settopboxes etc. In HPC too, there will be libraries/frameworks to help you out.
Of course there are professional apps such as photo-editing, video-editing and encoding, VFX, CAD / GIS, math and other engineering software that could benefit widely from OpenCL. And a lot of them are cross-platform. Or at least would need OpenCL on, for example, the Mac, where they might have many customers.
kevinkreiser - Wednesday, December 31, 2008 - link
a while back i published a paper that involved performing an iterative deconvolution on the GPU. the point of the paper was that we could do it in real-time and use it on videos with arbitrary spatially varying blur kernels.

anyway the largest overhead was copying the render target (single iteration of the algorithm) to initialize the next iteration. if dx11 and opencl allow the gpu and cpu to work with the same memory, without the need to copy between the two, this will speed up gpgpu apps tremendously.
has407 - Monday, January 12, 2009 - link
OpenCL itself is neutral; it provides both explicit copy and map functions, in both synchronous and asynchronous forms. Obviously what works best will depend on platform capabilities and run-time intelligence (e.g., copy/map optimizations based on platform capabilities and program behavior).

However, that still doesn't necessarily allow for a large mapped/shared memory between the CPU and GPU. That and its efficacy is going to be implementation dependent and OpenCL has simply defined a model that should be portable and useful, even if suboptimal on a given implementation--but if you know enough about the implementation, gives you sufficient optimization choices.
That requires some constraints on the memory model, in particular the consistency/correctness of various memory regions with respect to computational elements at different points and times, and especially with respect to mapped memory (NB: sec 5.2.8.1 of the spec).