More Details on Elemental's GPU Accelerated H.264 Encoder
by Anand Lal Shimpi on June 23, 2008 10:52 PM EST - Posted in GPUs
Tucked away in our NVIDIA GT200 review was a bit of gold. Elemental Technologies has been developing, in CUDA, a GPU-accelerated H.264 video transcoder.
If you've ever tried ripping a Blu-ray movie you'll know that a raw rip of just one audio and one video stream can easily run 20 - 30GB or more. I've been doing a lot of this lately for my HTPC and even without 8-channel audio tracks, my ripped movies are still huge (Casino Royale was around 27GB for the 1080p video track and 5.1-channel English audio track). On a massive screen, you'll want to preserve every last bit of information, but on most displays you could actually stand to compress the video quite a bit.
Using the H.264 codec (for example via x264, the open-source H.264 encoder), it's very easy to preserve video quality but reduce file size down to the 8 - 15GB range - the problem is that it requires a great deal of processing power to do so. Transcoding from an H.264 encoded Blu-ray to a lower bitrate H.264/x264 file can often take several hours, if not over a day for a very high quality re-encode, even on a fast dual or quad-core system.
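To put those file sizes in perspective, here is a rough sketch that converts a file size into an average bitrate. The 144-minute runtime is an assumption (it is not stated above), GB is taken as 10^9 bytes, and the function name is made up for illustration. The result lumps video and audio together, so treat it as a ballpark only.

```python
# Rough average bitrate for a rip of a given size.
# Assumptions (not from the article): 144-minute runtime, 1 GB = 10**9 bytes.

def average_bitrate_mbps(size_gb, runtime_minutes):
    """Average bitrate in megabits per second for the whole file (video + audio)."""
    total_bits = size_gb * 1e9 * 8
    total_seconds = runtime_minutes * 60
    return total_bits / total_seconds / 1e6

ASSUMED_RUNTIME_MIN = 144  # hypothetical runtime used for illustration

for size_gb in (27, 15, 8):
    mbps = average_bitrate_mbps(size_gb, ASSUMED_RUNTIME_MIN)
    print(f"{size_gb} GB over {ASSUMED_RUNTIME_MIN} min -> {mbps:.1f} Mbps average")
```

Under those assumptions the 27GB rip averages about 25 Mbps, while the 8 - 15GB targets land at roughly 7 to 14 Mbps.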
Right now transcoding Blu-ray movies isn't exactly at the top of everyone's list, but using H.264/x264 you can significantly reduce file sizes on any video. x264 is the new DivX and its usefulness extends far beyond just ripping HD movies. Needless to say, its use isn't going to increase unless encoding using the codec gets faster.
Elemental Technologies has been working on a technology it calls RapiHD, a GPU-accelerated H.264 video encoder. The consumer implementation of RapiHD is a software application called BadaBOOM (yes, that's what it's actually called, there's even a video).
RapiHD and thus BadaBOOM are both CUDA applications, meaning they are written in C and compiled to run on NVIDIA's GPUs. They won't work without a CUDA-enabled GPU (GeForce 8xxx, 9xxx or GTX 280/260) and they won't work on AMD/ATI hardware.
Elemental allowed NVIDIA to use a very early beta of BadaBOOM in its GT200 launch, which meant we got access to the beta. We could only transcode up to 2 minutes of video and we weren't given access to any options; all we could do was choose a vague output format and run the encode.
BadaBOOM uses its own H.264 encoder that Elemental developed, and it won't run without GPU acceleration, so we were forced to compare it to the open-source x264 running on the CPU in our tests. We used AutoMKV and played with its presets to vary quality. Even with the awkward comparison, the advantage of GPU-accelerated H.264 encoding was obvious:
Those numbers are compared to an Intel Core 2 Extreme QX9770, the fastest quad-core CPU available today. In the worst case scenario, the GTX 280 is around 40% faster than encoding on Intel's fastest CPU alone. In the best case scenario, however, the GTX 280 can complete the encoding task in 1/10th the time. We're not sure where a true apples-to-apples comparison would end up, but somewhere between those two extremes is probably a good guesstimate.
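"Percent faster" and "fraction of the time" are easy to mix up, so here is a small sketch of how the two relate. It assumes "X% faster" means X% higher throughput, which is only one reading of the phrase, and the function names are made up for illustration.

```python
# Converting between "takes this fraction of the baseline time" and
# "is this many percent faster", assuming "faster" refers to throughput.

def percent_faster_from_time_ratio(time_ratio):
    """Throughput gain in percent when a job takes `time_ratio` of the baseline time."""
    return (1.0 / time_ratio - 1.0) * 100.0

def time_ratio_from_percent_faster(percent_faster):
    """Fraction of the baseline time needed when throughput is `percent_faster` higher."""
    return 1.0 / (1.0 + percent_faster / 100.0)

best_case = percent_faster_from_time_ratio(1 / 10)   # "1/10th the time"
worst_case = time_ratio_from_percent_faster(40)      # "around 40% faster"

print(f"1/10th the time -> {best_case:.0f}% faster (10x the throughput)")
print(f"40% faster      -> {worst_case:.1%} of the baseline time")
```

Read this way, finishing in 1/10th the time is 900% faster, while 40% faster means finishing in about 71.4% of the time.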
Given the level of performance we saw with the GeForce GTX 280, we scheduled a meeting with Elemental's CEO, Sam Blackman, to learn more about BadaBOOM, as his application has the potential to truly revolutionize video encoding performance for the masses.
50 Comments
lucapicca - Tuesday, June 24, 2008 - link
I wonder if all this buzz about GPU programming is really a sane idea... First of all, from a video coding point of view, one would not transrate a video (I mean... same resolution, same GOP structure) and perform motion estimation again from scratch.
And this is what GPUs might really be good at.
Lastly, I'm not that impressed at the speed numbers.
Is the performance/power ratio favourable to GPUs (in this application)?
Is the transcoding done entirely on the GPU?
Because... if 75% of the time is spent in communication/synchronization between CPU and GPU, I think that the future of computation is not in GPUs... and perhaps some sort of less powerful DSPs integrated in the CPU might really do the job better (see Cell).
After all, it's just a matter of communication speed:
sometimes sending a job to a remote CPU is not really worth it.
Any opinion?
JonnyDough - Wednesday, June 25, 2008 - link
I was thinking that myself as I read this article. The CPU simply isn't designed with this in mind, and if it was more specialized it could probably outperform the GPU. I think what you're suggesting is the merging of the GPU and the CPU...and it's my understanding that that merger is now finally underway. Once we hit 32nm and smaller, and begin to utilize more power saving features we'll see laptops REALLY begin to take off. Gaming on a 3 day battery powered laptop here we come. Hopefully.
Pjotr - Tuesday, June 24, 2008 - link
[quote]In the worst case scenario, the GTX 280 is around 40% faster than encoding on Intel's fastest CPU alone.[/quote]
Why can you never learn simple maths? If something completes in 8 seconds over something that completes in 14 seconds, it's 75% faster. (If it had run in 7 seconds over 14 seconds, it's obvious it's 100% faster not 50%, isn't it?)
strikeback03 - Tuesday, June 24, 2008 - link
It is not so much math as semantics. The 6 seconds between the NVIDIA number and the fastest 9770 is about a 40% time savings (6/14=0.4285...), which could be thought of as 40% faster.
Pjotr - Wednesday, June 25, 2008 - link
To be done in 40% less time (42.9% in this case though) you are done in 60% of the time. To be done in 60% of the time, you must process 1/60% = 1.67x faster = 67% faster. To be done in 57.1% of the time (8 seconds of 14) you must process 75% faster.
If something is running 40% faster as stated in the article, it should be done in 71.4% of the time (use 28.6% less time). The article is incorrect in claiming processing is done 40% faster, it should say 75% faster OR done in 40% less time.
JonnyDough - Wednesday, June 25, 2008 - link
The two of you are correct. 40% faster is wrong.
DigitalFreak - Tuesday, June 24, 2008 - link
Why can YOU never learn simple Englishs
INNAM - Tuesday, June 24, 2008 - link
and to make it clear the VOB file can hold H.264 but i think it has to be under 4GB and no DTS... Also the can play H.264 as well.
INNAM - Tuesday, June 24, 2008 - link
as a dvd/blu-ray ripper myself i suggest other formats besides MKV. The fact is that MKV can only be played on the computer. This makes it painful to then convert it once more to VOB (PS3 playback) or WMV/AVI (X360 playback). Don't get me wrong, if BadaBOOM wants to make it they NEED to add MKV support because if you're going to watch it via PC/HDMI it has Chapter support all the way to VC-1 and DTS.
oh and i know having a built-in AC3 or DTS encode plugin costs money but you could leave an option for the hardcore user to allow their own plugins. That would make it oh so good!
shiggz - Tuesday, June 24, 2008 - link
Also shame on ATI. Years ago i spent $380 to buy an x1900 to help speed up video encodes (as they promised) and they never came through on GPU acceleration! Even all these years later it's still software accel. They just dropped the program. That was the last expensive ATI card i bought.
However, with the 4850's total 800 stream processor count, if these could work the same as NVIDIA's, as mentioned, ATI could potentially blow them away. Or am i missing something about the 4850 SP count that would make it not directly proportional to NVIDIA's?