Note: This page covers 3D motion blur optimizations, which I
implemented in several stages, mostly back in 2000-2001. Subsequently
(2006-2007), I've spent a lot of time working on a 2D motion blur
solution that provides a much faster, more controllable, and often
aesthetically superior alternative to our 3D motion blur. You can find
more on that under the topic Pixmotor on my publications page.

Motion blur is a basic requirement for any high-quality renderer. Because a physically correct implementation dramatically increases the number of shading samples by spreading them over the additional dimension of time, a motion-blurred render can take many times longer to complete than the equivalent static render. Not surprisingly, it is often desirable to compromise the physical correctness of a motion blur algorithm, in a controllable way, to favor speed over accuracy.

The Challenge

Because the renders in our studio have been getting more and more detailed over the years, particularly with the increased use of hair rendering in projects like Cats & Dogs and Scooby Doo, the number of triangles in an average frame has steadily increased as the triangles have gotten smaller and smaller. A typical fur render with several million hair primitives contains about 10-20 million triangles, most of them not much bigger than a pixel. It was clear that we needed to improve the speed of precisely such renders: ones with a large number of small triangles.

Our renderer's implementation of motion blur is physically very accurate: for instance, it accurately portrays curved trajectories and captures highlights that appear in the middle of a motion blur streak (something that many other architectures, including prman's, fail to do). Even so, we were willing to sacrifice some of that accuracy for speed.

The Solution: Triangle Undersampling

Our approach is twofold. First, by applying a shading sample at only one moment, caching the resulting color, and applying it over the entire motion blur trajectory, we can dramatically reduce the number of shading calculations needed. Second, we further reduce the number of shading samples by shading only certain points of a triangle and interpolating (a la Gouraud) among the shaded points, based on the requested sample's barycentric coordinates.

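The first half of the approach (shade at one moment, then reuse the color along the whole trajectory) can be illustrated with a small sketch. This is not the renderer's actual code: the class name `ShadeOnceCache`, the `shade(tri_id, time)` callback, and the choice of a mid-shutter `shade_time` of 0.5 are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Color = Tuple[float, float, float]


class ShadeOnceCache:
    """Hypothetical cache: one shaded color per triangle, reused for every time sample."""

    def __init__(self, shade: Callable[[int, float], Color], shade_time: float = 0.5):
        self._shade = shade            # expensive shader: (triangle id, time) -> color
        self._shade_time = shade_time  # the single moment at which we shade (assumed)
        self._colors: Dict[int, Color] = {}
        self.shader_calls = 0          # counts real shader invocations

    def color(self, tri_id: int, time: float) -> Color:
        # The requested time is deliberately ignored for shading: the color
        # computed at shade_time is applied over the whole blur trajectory.
        if tri_id not in self._colors:
            self.shader_calls += 1
            self._colors[tri_id] = self._shade(tri_id, self._shade_time)
        return self._colors[tri_id]
```

With 16 time samples per triangle, a cache like this would invoke the shader once per triangle instead of 16 times, which is where the savings come from when shading dominates render time.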
The reason this solution is effective is that most of the rendering time is spent on shading computations. In fact, the greater the number of lights, shadows, maps, and other such elements, the more expensive the shading calculations become, and the bigger a speed boost undersampling provides.

Our undersampling scheme applies 1, 3, 6, 15, 45, 153, etc. shading samples to each triangle, depending on how many 4x subdivisions we wish to apply. We cache the color value at each sample point and provide an efficient method for interpolating among the cached values given a subsequent shading point on the triangle.

For each visible triangle in the current bucket, we precompute a set of shaded samples, as described above, and store them in a cache that is referenced by all the overlapping triangles that make up the motion blur streak. Any shading requests for these triangles are directed to the shading cache.

The trick is then to interpolate rapidly among these samples given the requested sampling coordinates. We handle this by precomputing a formula for each level of subdivision, which takes the barycentric coordinates relative to the original triangle and identifies which subtriangle the sample is in and what its coordinates are relative to that subtriangle. The cost of this interpolation is minimal; it is dwarfed by the cost of constructing the non-shaded geometry of the triangles along the motion blur trajectory. Reducing the latter would therefore be the next logical step in future motion blur optimization.

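As a rough illustration of the subtriangle lookup, the sketch below builds a cache of shaded grid vertices and interpolates within the subtriangle that contains a requested point. It is one possible formulation, not necessarily the precomputed per-level formula used in the renderer. The function names, the dictionary keyed by integer grid coordinates `(i, j)`, and the requirement that the input point lies inside the triangle are all assumptions. After `level` 4x subdivisions each edge has `n = 2**level` segments, giving `(n + 1)(n + 2) / 2` grid vertices, which reproduces the 3, 6, 15, 45, 153 part of the series; the leading 1 is presumably a single flat-shaded sample.

```python
from typing import Callable, Dict, Tuple

Color = Tuple[float, float, float]
Bary = Tuple[float, float, float]
Cache = Dict[Tuple[int, int], Color]


def sample_count(level: int) -> int:
    """Number of grid vertices after `level` 4x subdivisions of a triangle."""
    n = 2 ** level  # segments per edge
    return (n + 1) * (n + 2) // 2


def build_cache(level: int, shade: Callable[[Bary], Color]) -> Cache:
    """Shade every grid vertex once; keys are integer grid coordinates (i, j)."""
    n = 2 ** level
    cache: Cache = {}
    for i in range(n + 1):
        for j in range(n + 1 - i):
            u, v = i / n, j / n
            cache[(i, j)] = shade((u, v, 1.0 - u - v))
    return cache


def interpolate(cache: Cache, level: int, u: float, v: float) -> Color:
    """Interpolate cached colors at barycentric (u, v, 1-u-v), assumed inside the triangle."""
    n = 2 ** level
    a, b = u * n, v * n
    # Grid cell containing the point, clamped so every corner stays in the cache.
    i0 = min(int(a), n - 1)
    j0 = min(int(b), n - 1 - i0)
    fa, fb = a - i0, b - j0
    if fa + fb <= 1.0 or i0 + j0 == n - 1:
        # "Upright" subtriangle; local coordinates are (1-fa-fb, fa, fb).
        corners = ((i0, j0), (i0 + 1, j0), (i0, j0 + 1))
        weights = (1.0 - fa - fb, fa, fb)
    else:
        # "Inverted" subtriangle sharing the cell's diagonal edge.
        corners = ((i0 + 1, j0 + 1), (i0, j0 + 1), (i0 + 1, j0))
        weights = (fa + fb - 1.0, 1.0 - fa, 1.0 - fb)
    return tuple(
        sum(w * cache[c][ch] for w, c in zip(weights, corners)) for ch in range(3)
    )
```

A quick sanity check for a scheme like this is to shade with a function that is linear in the barycentric coordinates: the interpolated result should then match the exact value at every point, regardless of subdivision level.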

Results

The undersampling technique described above was used with great success on Cats & Dogs, where it brought some render times down from over 30 hours into the teens. Although the color sample cache did increase memory usage somewhat, this could be minimized by reducing the screen bucket size, without any significant speed penalty.

Future Work

Future optimizations in motion blur will likely focus on the
generation of triangles along the spline-based motion trajectory of
each vertex. Further speed increases could be achieved by limiting
the instantiation of overlapping static triangles and precomputing
their trajectories into a small number of linear segments that could
be rapidly evaluated.
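
To make the linear-segment idea concrete, here is a minimal sketch of how a curved vertex trajectory might be reduced to a handful of linear segments. Since this describes future work, nothing here reflects an existing implementation: the function names, the uniform spacing of knots over a normalized shutter interval [0, 1], and the `position(t)` callback are all assumptions.

```python
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


def precompute_segments(position: Callable[[float], Vec3], segments: int) -> List[Vec3]:
    """Sample the (possibly spline-based) trajectory at segments + 1 uniform knots."""
    return [position(k / segments) for k in range(segments + 1)]


def evaluate(knots: List[Vec3], t: float) -> Vec3:
    """Piecewise-linear position at normalized shutter time t in [0, 1]."""
    segments = len(knots) - 1
    x = min(max(t, 0.0), 1.0) * segments
    k = min(int(x), segments - 1)  # segment index, clamped at the last segment
    f = x - k                      # fractional position within the segment
    p, q = knots[k], knots[k + 1]
    return tuple(pc + f * (qc - pc) for pc, qc in zip(p, q))
```

The expensive trajectory evaluation then happens only at the knots; every later lookup is a single linear interpolation, at the cost of some deviation from the true curve between knots.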