Virtual reality: John Carmack’s battle for 20 millisecond latency

The Oculus Rift looks to have finally cracked the problems that beset early attempts at virtual reality headsets. It’s fitted with a hi-res display panel split between the two eyes, internal sensors that track head motion accurately and quickly, and it sits in a sub-$1000 price range. But the biggest issue the developers have had to overcome is latency.

As John Carmack puts it, “The latency between the physical movement of a user’s head and updated photons from a head mounted display reaching their eyes is one of the most critical factors in providing a high quality experience.”

You can find out why and how they’ve tried to overcome this problem below.

Carmack has detailed in a blog post how a high latency has a player “using their head as a controller” whereas a low latency has a player accept “that their head is naturally moving around in a stable virtual world.” The sweet spot, where this delay between action and update becomes imperceptible, is “below approximately 20 milliseconds”.

Easier said than done. Trimming the fat on a computer’s response time has presented a host of problems for Carmack and Palmer Luckey’s team developing the Oculus Rift.

Even collecting data to measure latency has been difficult. Having the headset and the computer it’s running on measure their own latency as they run actively slows both machines, increasing the very delay they’re trying to measure. The method the team had to adopt was recording “high speed video that simultaneously captures the initiating physical motion and the eventual display update” and then “single stepping the video and counting the number of video frames between the two events”. It’s a little like the way film crews use a clapperboard to mark a point at which to line up video and sound.
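The frame-counting arithmetic is simple enough to sketch in a few lines of Python. The function names, the 240 frames-per-second camera and the count of 24 frames are illustrative assumptions, not figures from Carmack’s post:

```python
def latency_ms(frames_between_events: int, camera_fps: float) -> float:
    """Convert a count of high-speed video frames into milliseconds.

    frames_between_events: frames counted between the first visible head
    movement and the first visible display update.
    camera_fps: capture rate of the high-speed camera (assumed value).
    """
    if camera_fps <= 0:
        raise ValueError("camera_fps must be positive")
    return frames_between_events * 1000.0 / camera_fps


def resolution_ms(camera_fps: float) -> float:
    """Smallest latency difference the camera can resolve: one frame."""
    return 1000.0 / camera_fps


# Hypothetical reading: 24 frames counted on a 240 fps recording.
print(latency_ms(24, 240.0))            # 100.0
print(round(resolution_ms(240.0), 2))   # 4.17
```

The second function shows why the video has to be high speed: the measurement can never be finer than one camera frame, so a slower camera would struggle to tell a 20 millisecond delay from a 40 millisecond one.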

The typical latencies the team were discovering were “over 100 milliseconds”. The weird, and frankly cool, thing about the human brain is that latency doesn’t stop players from understanding a game. As Carmack points out, “many thousands of people enjoyed playing early network games, even with 400+ milliseconds of latency between pressing a key and seeing a response on screen”. It’s just that the brain won’t be fooled into seeing the image in front of it as reality: it never forgets it’s looking at a representation until the image updates almost as fast as our eyes convey their view of the world to the brain.

Sensors aren’t a problem for latency if they’re connected to the computer via USB, as “it is possible to receive 1000 updates a second” through the connection. Displays, however, pose two problems. First, lots of manufacturers have their displays run algorithms over the data they receive to optimise it for the screen, which creates a delay. The displays do this automatically, even when the data is already optimised, so the delay can’t be skipped. Second, displays tend to start feeding the data in at the top of the screen, meaning the bottom of the screen is around 16 milliseconds behind the top on a 60Hz display. “[O]n a head mounted display it can cause the world to appear to shear left and right, or ‘waggle’ as the head is rotated, because the source image was generated for an instant in time, but different parts are presented at different times.” This is also why Carmack and Luckey are trying to run with V-sync where possible: it means the computer has the whole image ready to display before it shows it to the player.
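To see where the 16 millisecond figure comes from, and why it produces a shear, here is a rough Python sketch. It assumes a 60Hz panel that scans its rows top to bottom at a constant rate and ignores blanking intervals; the 800-row panel and the head speed of 100 degrees per second are made-up example values:

```python
def scanout_delay_ms(row: int, total_rows: int, refresh_hz: float = 60.0) -> float:
    """Approximate how long after the top row a given row is displayed.

    Assumes rows are scanned top to bottom at a constant rate and
    ignores blanking intervals, which is a simplification.
    """
    if not 0 <= row < total_rows:
        raise ValueError("row out of range")
    frame_time_ms = 1000.0 / refresh_hz
    return frame_time_ms * row / total_rows


def apparent_skew_deg(head_speed_deg_per_s: float, delay_ms: float) -> float:
    """Angle the head turns during the delay, i.e. how far that row lags."""
    return head_speed_deg_per_s * delay_ms / 1000.0


# Hypothetical 800-row panel: delay of the last row relative to the first.
bottom = scanout_delay_ms(799, 800)
print(round(bottom, 1))                            # 16.6
# Lag of the bottom row during a 100 degrees-per-second head turn.
print(round(apparent_skew_deg(100.0, bottom), 2))  # 1.66
```

Under those assumptions the bottom of the picture trails the top by more than a degree and a half during a moderate head turn, which is the sideways shear the quote describes.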

The ways they’ve gone about reducing latency are impressive. For a start they’re trying to avoid the graphics card’s attempts to boost performance. According to Carmack, “[t]he drive to win frame rate benchmark wars has led driver writers to aggressively buffer drawing commands, and there have even been cases where drivers ignored explicit calls to glFinish() in the name of improved ‘performance’.” So the card holds work back, increasing latency, in order to post a higher frame rate. To counteract this, a VR headset aiming for low latency needs to provide the card with information that is already set to be pumped onto the screen.

That’s the theory at least, to “have sufficient work ready to send to the GPU immediately after the synchronization [between the sensors on the VR headset and the computer] is performed” so there is minimal time where the GPU is waiting for information to display on the screen.

There are two techniques that Carmack has discovered to improve this, ‘View Bypass’ and ‘Time Warping’.

View Bypass is a system by which the renderer (the program that turns the data from the headset’s sensors and the game code into the image we see on the screens) is able to update on the fly with new data from the headset. Say the renderer has received all the data it needs to start making an image, and the player moves their head before the image is displayed, making the original image out of date. The renderer can take the new information from the sensors and make a more relevant image by warping the shape of the old image, shifting it slightly to the side, or adding motion blur. This keeps latency low and the image more in line with what the player should currently be seeing.
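The first half of that idea, the renderer swapping in a fresher sensor reading than the one it was handed, can be sketched in Python. The `View` structure and the `bypass_view` function are hypothetical names invented for illustration, not anything taken from Carmack’s code:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class View:
    """View parameters for one frame (hypothetical structure)."""
    yaw_deg: float
    pitch_deg: float
    sampled_at_ms: float


def bypass_view(game_view: View, latest_sensor: View) -> View:
    """Return the view the renderer should actually draw with.

    If the sensor sample is newer than the one the game code used,
    the renderer swaps in the newer orientation; otherwise it keeps
    the game's view unchanged.
    """
    if latest_sensor.sampled_at_ms > game_view.sampled_at_ms:
        return replace(
            game_view,
            yaw_deg=latest_sensor.yaw_deg,
            pitch_deg=latest_sensor.pitch_deg,
            sampled_at_ms=latest_sensor.sampled_at_ms,
        )
    return game_view


# The game sampled the headset at t=0 ms; the player has since turned 3 degrees.
from_game = View(yaw_deg=10.0, pitch_deg=0.0, sampled_at_ms=0.0)
from_sensor = View(yaw_deg=13.0, pitch_deg=0.0, sampled_at_ms=12.0)
print(bypass_view(from_game, from_sensor).yaw_deg)  # 13.0
```

In this sketch the game logic still runs on the older reading; only the view used for drawing is refreshed.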

Time Warping is another technique that shaves vital milliseconds from the rendering process. “If you had perfect knowledge of how long the rendering of a frame would take, some additional amount of latency could be saved by late frame scheduling the entire rendering task”, Carmack says. Of course, there is no way to know how long a frame is going to take to render from moment to moment. However, Carmack spotted that “a post processing task on the rendered image can be counted on to complete in a fairly predictable amount of time, and can be late scheduled more easily.” So, Time Warping in VR is where you render as much of the image as possible without running the post-processing task. Then, you take the latest information from the sensors and run a View Bypass, updating the image to fit with what the player is supposed to be seeing, before running the post-processing pass on the image and displaying it to the player.
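As a very rough illustration of the warp step, the Python sketch below shifts an already-rendered image sideways to account for a small change in yaw since rendering began. It assumes a simple pinhole-camera model and handles only a horizontal shift; a real implementation would reproject the whole image and deal with the headset’s lens distortion. The function names and the tiny eight-pixel “image” are invented for the example:

```python
import math


def yaw_shift_px(render_yaw_deg, latest_yaw_deg, image_width_px, horizontal_fov_deg):
    """Horizontal pixel shift approximating a small change in yaw.

    Pinhole-camera assumption: the focal length in pixels is
    (width / 2) / tan(fov / 2). Positive yaw means the head turned
    right, so the picture content has to move left by this many pixels.
    """
    focal_px = (image_width_px / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)
    delta = math.radians(latest_yaw_deg - render_yaw_deg)
    return round(focal_px * math.tan(delta))


def warp_image(image, shift_left_px, fill=0):
    """Shift every row left by shift_left_px (negative shifts right).

    Pixels that slide in from beyond the rendered area are set to `fill`,
    because the original render has no data for them.
    """
    width = len(image[0])
    warped = []
    for row in image:
        new_row = [fill] * width
        for x, value in enumerate(row):
            nx = x - shift_left_px
            if 0 <= nx < width:
                new_row[nx] = value
        warped.append(new_row)
    return warped


# A one-row, eight-pixel "frame" rendered at yaw 0 with a 90 degree field of view.
frame = [[1, 2, 3, 4, 5, 6, 7, 8]]
shift = yaw_shift_px(0.0, 14.0, image_width_px=8, horizontal_fov_deg=90.0)
print(shift)                     # 1
print(warp_image(frame, shift))  # [[2, 3, 4, 5, 6, 7, 8, 0]]
```

The loop does the same small amount of work whatever the scene contains, which is the property the quote relies on: a step with a predictable cost can be scheduled as late as possible before the image goes to the display.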

Carmack points out that, while developed for use in VR, the View Bypass and Time Warping methods could be used to reduce the latency in game streaming technology, making sure that the image we are sent by the cloud processor is as up-to-date as possible.

Definitely check out the far more in-depth explanation of these processes over on Carmack’s blog.

Cheers, Soulskill over on Slashdot.