Digging Deeper: Galloping Horses Example
Rather than pull out a bunch of math and traditional timing diagrams, we've decided to put together a more straightforward presentation. The diagrams we will use show the frames of an actual animation that would be generated over time, as well as what would be seen on the monitor for each method. Hopefully this will help illustrate the quantitative and qualitative differences between the approaches.
Our example is a fabricated one (based on an animation courtesy of Wikipedia) of a "game" rendering a horse galloping across the screen. The basics of this timeline are that our game is capable of rendering at 5 times our refresh rate (it can render 5 different frames before a new one gets swapped to the front buffer). Such a perfectly consistent frame rate is not realistic, as in practice some frames will take longer than others. We cut down on these and other variables for simplicity's sake. We'll talk about timing and lag in more detail based on a 60Hz refresh rate and 300 FPS performance, but we didn't want to clutter the diagram too much with times and labels. Obviously this is a theoretical example, but it does a good job of showing the idea of what is happening.
First up, we'll look at double buffering without vsync. In this case, the buffers are swapped as soon as the game is done drawing a frame. This immediately preempts what is being sent to the display at the time. Here's what it looks like in this case:
Good performance but with quality issues.
The timeline is labeled 0 to 15, and for those keeping count, each step is 3 and 1/3 milliseconds. The timeline for each buffer has a picture on it in the 3.3 ms interval during which a frame is completed, corresponding to the position of the horse and rider at that time in real time. The large pictures at the bottom of the image represent the image displayed at each vertical refresh on the monitor. The only images we actually see are the frames that get sent to the display. The benefit of all the other frames is to minimize input lag in this case.
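For those who do want the numbers behind the timeline, here is a small sketch of the arithmetic in Python. It assumes the first vertical refresh lines up with step 0 of the timeline; the variable names are ours and purely illustrative.

```python
from fractions import Fraction

REFRESH_HZ = 60    # monitor refresh rate used in the article
RENDER_FPS = 300   # rate at which the example "game" finishes frames

# Durations in milliseconds, kept exact with Fraction.
refresh_ms = Fraction(1000, REFRESH_HZ)   # 16 2/3 ms between refreshes
step_ms = Fraction(1000, RENDER_FPS)      # 3 1/3 ms per timeline step

# How many frames the game finishes during one refresh interval.
frames_per_refresh = refresh_ms / step_ms

# Timeline steps 0..15 that coincide with a vertical refresh
# (assumption: the first refresh happens exactly at step 0).
refresh_steps = [s for s in range(16) if (s * step_ms) % refresh_ms == 0]
```

Under these assumptions `frames_per_refresh` comes out to 5 and `refresh_steps` to steps 0, 5, 10 and 15, so the 0 to 15 timeline covers three full refresh intervals.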
We can certainly see, in this extreme case, what bad tearing could look like. For this quick and dirty example, I chose only to composite three frames of animation, but there could be more or fewer tears in reality. The number of different frames drawn to the screen corresponds to the length of time it takes for the graphics hardware to send the frame to the monitor. This will happen in less time than the entire interval between refreshes, but I'm not well versed enough in monitor technology to know how long that is. I sort of threw my dart at about half the interval being spent sending the frame for the purposes of this illustration (and thus parts of three completed frames are displayed). If I had to guess, I think I overestimated the time it takes to send a frame to the display.
For the above, the framerate reported by FRAPS would be 300 FPS, but the actual number of full images that get flashed up on the screen can never exceed the refresh rate (in this example, 60 frames every second). The latency between when a frame is finished rendering and when it starts to appear on screen (this is input latency) is less than 3.3ms.
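To make the "three frames" figure concrete, here is a rough Python sketch of which rendered frames end up in a single torn refresh. It is an idealized model, not a description of real display hardware: it assumes frame k finishes at exactly k/300 of a second, every finished frame is swapped in immediately, and sending the image to the monitor takes a fixed fraction of the refresh interval (one half, matching the guess above). The function name and parameters are hypothetical.

```python
from fractions import Fraction

def frames_in_scanout(refresh_index, render_fps=300, refresh_hz=60,
                      scanout_fraction=Fraction(1, 2)):
    """Return the indices of rendered frames visible in one torn refresh.

    Assumptions (illustrative only): frame k finishes at k / render_fps
    seconds, swaps happen the instant a frame is done, and scanout lasts
    scanout_fraction of the refresh interval.
    """
    frame_time = Fraction(1, render_fps)
    start = Fraction(refresh_index, refresh_hz)       # scanout begins
    end = start + scanout_fraction / refresh_hz       # scanout ends

    # Newest frame already finished when scanout begins.
    first = start // frame_time
    visible = [int(first)]

    # Each frame finishing before scanout ends preempts the previous one,
    # which produces one more tear on screen.
    k = first + 1
    while k * frame_time < end:
        visible.append(int(k))
        k += 1
    return visible

frames = frames_in_scanout(1)   # frames shown during the second refresh
tears = len(frames) - 1         # boundaries between different frames
```

With the half-interval guess this gives three frames (two tears) per refresh. Passing a smaller `scanout_fraction`, such as `Fraction(1, 4)`, gives fewer, which is the direction the estimate would move if sending a frame really takes less time than guessed.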
When we turn on vsync, the tearing goes away, but our real performance goes down and input latency goes up. Here's what we see.
Good quality, but bad performance and input lag.
If we consider each of these diagrams to be systems rendering the exact same thing starting at the exact same time, we can see how far "behind" this rendering is. There is none of the tearing that was evident in our first example, but we pay for that with outdated information. In addition, both the actual framerate and the reported framerate are 60 FPS. The computer ends up doing a lot less work, of course, but it is at the expense of realized performance, despite the fact that we cannot actually see more than the 60 images the monitor displays every second.
Here, the price we pay for eliminating tearing is an increase in latency from a maximum of 3.3ms to a maximum of 13.3ms. More generally, with vsync on a 60Hz monitor, the maximum latency between when a frame is finished rendering and when it is displayed is a full 1/60 of a second (16.67ms), but the effective latency that can be incurred will be higher. Since no more drawing can happen after the next frame to be displayed is finished until it is swapped to the front buffer, the real effect of latency when using vsync will be more than a full vertical refresh when rendering takes longer than one refresh to complete.
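The 13.3ms figure falls out of simple arithmetic, sketched below in Python. The model is an assumption on our part: rendering starts at a vertical refresh (right after a swap), and once the frame is done the game stalls until the next refresh. The function name is hypothetical.

```python
import math
from fractions import Fraction

REFRESH_MS = Fraction(1000, 60)   # 16 2/3 ms at 60Hz

def vsync_double_buffer(render_ms, refresh_ms=REFRESH_MS):
    """Return (wait_ms, start_to_display_ms) for one frame.

    Assumes rendering starts exactly at a vertical refresh and the
    finished frame waits in the back buffer until the next refresh.
    """
    render_ms = Fraction(render_ms)
    # Number of whole refresh intervals that pass before the frame
    # can be swapped to the front buffer.
    refreshes_needed = math.ceil(render_ms / refresh_ms)
    shown_at = refreshes_needed * refresh_ms
    return shown_at - render_ms, shown_at

# Our example: a frame that takes 3 1/3 ms to render.
wait_fast, total_fast = vsync_double_buffer(Fraction(10, 3))

# A slower frame that takes 20 ms, i.e. longer than one refresh.
wait_slow, total_slow = vsync_double_buffer(20)
```

For the 3 1/3 ms frame the wait is 13 1/3 ms. For the 20 ms frame the time from the start of rendering to display is 33 1/3 ms, two full refreshes, which is the "more than a full vertical refresh" case described above.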
Moving on to triple buffering, we can see how it combines the advantages of the two double buffering approaches.
The best of both worlds.
And here we are. We are back down to a maximum of 3.3ms of input latency, but with no tearing. Our actual performance is back up to 300 FPS, but this may not be reported correctly by a frame counter that only monitors front buffer flips. Again, only 60 frames actually get pasted up to the monitor every second, but in this case, those 60 frames are the most recent frames fully rendered before the next refresh.
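As a rough sketch of why the latency stays under 3.3ms, the Python below picks the newest completed frame at each refresh and reports how old it is. It assumes the game renders continuously at a perfectly steady 300 FPS and that frame k finishes at `phase_ms + k * frame_ms`; the `phase_ms` offset and the function name are our own illustrative additions.

```python
from fractions import Fraction

def triple_buffer_frame(refresh_index, render_fps=300, refresh_hz=60,
                        phase_ms=Fraction(0)):
    """Return (frame_index, age_ms) of the frame shown at one refresh.

    Assumptions (illustrative only): the game never stalls, frame k
    finishes at phase_ms + k * frame_ms, and at each refresh the newest
    completed frame is swapped to the front buffer.
    """
    frame_ms = Fraction(1000, render_fps)
    refresh_time = refresh_index * Fraction(1000, refresh_hz)

    # Index of the newest frame finished at or before this refresh.
    newest = (refresh_time - phase_ms) // frame_ms
    finished_at = phase_ms + newest * frame_ms
    return int(newest), refresh_time - finished_at

# Frame timing offset by 1 ms from the refresh, checked at refresh 1.
frame, age = triple_buffer_frame(1, phase_ms=Fraction(1))
```

Whatever offset is chosen, the age of the displayed frame is always less than one frame time (3 1/3 ms here), because a newer frame would otherwise have finished in the meantime.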
While there may be parts of the frames in double buffering without vsync that are "newer" than corresponding parts of the triple buffered frame, the price that is paid for that is potential visual corruption. The real kicker is that, if you don't actually see tearing in the double buffered case, then those partial updates are not different enough from the previous frame(s) to have really mattered visually anyway. In other words, only when you see the tear are you really getting any useful new information. But how useful is that new information if it only comes with tearing?
184 Comments
DerekWilson - Friday, June 26, 2009
In-game options for DX games are what we need to rely on right now, as there is no control panel option in a driver for this.
It is possible to force triple buffering in some DX games through other means, but what is needed is game and driver developer pressure to get this feature into every game.
ukbrainstew - Sunday, June 28, 2009
You'd be surprised at the number of games D3DOverrider works with; I find compatibility is easily over 90%.
That developers don't include the option is really rather frustrating, though I just thank the PC community for coming up with a very good workaround, as they invariably tend to do.
Another setting that I'd like to become standard is the ability to choose a framerate cap. Plenty of engines allow it (though it's often locked away) and it can work wonders for increasing the smoothness and playability of games on older hardware.
Even sub-$100 parts could maintain a damn near constant 30fps in most games at 720p resolution, but they may very well struggle trying to hit 60fps, often resulting in wild variances. Would it not be to the benefit of Nvidia and AMD's marketing if they could produce a driver level setting that caps games at half your refresh rate? A setting like that would suddenly make their budget parts capable of maintaining a steady framerate in so many more games, thus making them much more attractive products.
Anyway, thanks a lot for the article, triple buffering has been something of particular interest to me as I really can't bear a game with any appreciable amount of tearing and I'd really rather not suffer increased input lag and as much as a 50% reduction in my framerate when a simple setting can do away with it all in one fell swoop.
Could I suggest a mention of D3DOverrider in the article? Surely giving readers advice on how to benefit from triple buffering in more games would be a worthy addition and something many may be craving now that they're armed with knowledge of its inherent benefits.
erple2 - Sunday, June 28, 2009
I think the reduction in frame rate is to an integer fraction of the maximum frame rate - you'll get 1 (refresh rate of monitor), 1/2, 1/3, 1/4, 1/5, 1/6 etc. and nothing in between with vsync on.
I've noticed that in games that allow me to show the frame rate, I get exactly 60 FPS (I have an LCD monitor), 30 FPS, 20 FPS, 15 FPS, 12 FPS, or 10 FPS (and so on) and nothing in between. But that's the way vsync operates with double buffering.
With triple buffering, I can get more or less any FPS rate lower than 60.
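The stepping described in this comment can be sketched in a few lines of Python. This is a simplified model, assuming every frame takes the same time to render and that with double buffering and vsync each frame occupies a whole number of refresh intervals; the function name is hypothetical.

```python
import math

def vsync_fps(unsynced_fps, refresh_hz=60):
    """Displayed frame rate with double buffering and vsync on.

    Assumes every frame takes exactly 1 / unsynced_fps seconds to render,
    so each frame is held for a whole number of refresh intervals.
    """
    refreshes_per_frame = math.ceil(refresh_hz / unsynced_fps)
    return refresh_hz / refreshes_per_frame

# A game that could manage 300, 45 or 25 FPS without vsync.
rates = [vsync_fps(fps) for fps in (300, 45, 25)]
```

Under this model the three cases come out at 60, 30 and 20 FPS, so a game that just misses 60 FPS without vsync drops all the way to 30 with it.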
fiveday - Friday, June 26, 2009
Yes. D3DOverrider is a utility that (as the name implies) overrides certain D3D calls and forces a few of its own settings. Specifically, it can force Triple Buffering and VSync (on or off) in any Direct3D application. It comes with RivaTuner, but is a separate app - you won't find it in RT's settings, but in its installed folder as a standalone program.
So yeah - download the latest RivaTuner (which you don't even have to run, tho it's useful!) and use D3DOverrider to force triple buffering in Direct3D.
This saved my experience with Dead Space... and I've been singing its praises ever since.
toyota - Friday, June 26, 2009
I have to go ahead and laugh at the people that turn triple buffering on from the standard control panel and claim they see a difference. That setting has NO effect on DX games and is for OpenGL only. Of course you have to use a third party app like RivaTuner or nHancer to actually force triple buffering on. It's nice that some games like L4D have it built right into the game options though, so that it is very convenient to enable from within the game without any third party crap to fool with.
leexgx - Saturday, June 27, 2009
Why is there not a Pole option for triple buffer with Vsync on and off, or is the pole option meant for Vsync on with triple? If so, it's not what most of us would have picked.
I always run the games, if I can, with triple but no Vsync, as the lag is too much.
Vsync on has always made input lag, be it 3 buffers or 2.
Hrel - Monday, June 29, 2009
Pole is supposed to be poll, the way you mean. Confusing the way it is.
The0ne - Friday, June 26, 2009
For now, I just check the games to make sure there's an option for it. If not then I don't bother trying to find a way around it. Derek has it right, developers have to see the benefits and implement it if video card manufacturers or MS don't implement it.
DerekWilson - Friday, June 26, 2009
This is the reason the last line of our article is focused on the developers.
They definitely, like Valve, need to start including triple buffering in in-game options.
And it would be nice if NVIDIA and AMD could build something into the driver to make it work for everything. They put a lot of time into making AA work in most games, why not do the same for triple buffering?
GourdFreeMan - Friday, June 26, 2009
I was under the impression that if you set VSync to "Force On" and Triple Buffering to "On" in the nVIDIA control panel under the "Global Settings" tab you effectively force triple buffering on for all applications, except those specifically excluded by their individual profiles. Is this not the case? This option has been available for years... admittedly I have never attempted to capture frames to verify that triple buffering is actually occurring.
I don't see why this shouldn't work universally for applications -- as far as the application knows the only thing that has changed is the size of the pool of available graphics memory.