Increasingly we are seeing applications where traditional 1-in-N decimation is not sufficient. Four of them in particular stand out.
First, High Definition (HD) digital television often delivers a doubled frame rate, e.g., 59.94 fps for ATSC. A progressive film with a native frame rate of 24 fps can be converted to HD rates by duplicating frames in a pattern. For example to get from 24 fps to 60 fps, we can use this pattern:
A A A B B C C C D D E E E F F ...
We can call this a 3232... frame duplication pattern. Another example is 25 fps to 60 fps conversion, which uses a 3232232322... pattern. Converting from 60 fps to the broadcast rate of 59.94 fps can be achieved by either slowing down the video and adjusting the audio to match it, or by excluding one of the duplicates every 1000 frames. In either case, our final stream has many duplicates and the traditional 1-in-N decimation approach can no longer be directly applied. Sometimes, successive 1-in-N decimations can be applied, but this is cumbersome. And while Decimate(mode=2) is helpful with multiple duplicates, it is not always successful in that regard.
Second, anime is often rendered at 12 fps and converted to 29.97 fps by duplicating frames in a manner similar to that described above. The result is the same: a stream with multiple duplicates that is not reliably addressed by 1-in-N decimation.
Third, many silent films are transferred to DVD by adding duplicates in unusual patterns, because the original frame rates are not 24 fps. It is not unusual to see clips requiring strange decimations such as 20 in 43. Sometimes these strange decimations can be attained, or nearly attained, through repeated application of Decimate() using different cycles, but that is a cumbersome approach that cannot always attain the exact decimation ratios required. FDecimate() allows specification of arbitrary frame rates and is therefore useful in recovering the original cranking rates of these silent movies.
Fourth, sometimes clips are rendered at 120 fps to properly present hybrids of film and video. if we wish to return to normal frame rates, we need to remove multiple duplicates.
The FDecimate() filter helps in these situations. An earlier filter, MultiDecimate(), also attempted to help but it is a two-pass implementation that requires an additional external program, and is therefore cumbersome. It also could produce audio desynchronization if not used very carefully. I consider FDecimate() to be a superior approach.
Please note that, while FDecimate() can be used for traditional 1-in-N decimation, it is probably preferable to stay with Decimate() for those applications, because it does not require the setting of a threshold for duplicate detection when used in mode=0.
This version supports YUY2 and YV12 for Avisynth 2.5.
If you are wondering how FDecimate() differs from the Avisynth internal filter ChangeFPS(), the answer is that FDecimate() can preferentially deliver unique frames, avoiding duplicates where possible.
Determining the Target Frame Rate If you just want to achieve a known target frame rate, you can just go ahead and set the 'rate' parameter to that rate. For example, you may know that the source originated as 24 fps film. But often we have a clip with an irregular pattern of duplicates and we do not know what the target frame rate should be. This is often the case when processing old silent movies.
The proper frame rate can be determined using the following procedure. Find a section of the clip that is about ten seconds long and contains constant motion. Step though the video without decimation and record the pattern of duplicates. For example, I might see this:
new picture followed by a duplicate of it new picture new picture followed by a duplicate of it new picture new picture followed by a duplicate of it new picture ...
I could record this in shorthand as: 212121... Now we need to determine the ratio of the number of unique pictures to the number of total frames. For a repeating 212121 pattern we would have a ratio of 2/3. You may not see a repeating pattern, but it doesn't matter. Just count the number of unique pictures in the ten second period and divide it by the total number of frames in that period.
Next, we multiply the ratio determined above by the frame rate of the undecimated clip. This gives the target frame rate to be used if all the duplicates are to be removed. You may obtain a strange frame rate that is not a standard one. If your target display device is the computer monitor, you may choose to use that strange rate, because the computer can display at that rate. But if your target display is a TV, you'll want to round the rate to the nearest standard rate. See the next section, "Why FDecimate() Cannot Be Perfect, and What We Can Do About It", for some further suggestions about selecting the target frame rate.
Determining the Threshold for Duplicate Detection If your clip has no duplicates and you simply want to decimate it to a slower frame rate, then this step does not apply and you can set the 'threshold' parameter to zero (threshold=0). If your clip has duplicates, you want to preferentially remove them, and so FDecimate() needs a way to determine which are the duplicates.
The process is as follows. Apply the FDecimate() filter with metrics enabled:
FDecimate(metrics=true)
This will display the difference metric for each frame (decimation is not performed when metrics=true). Inspect the metric for the duplicate frames. You need to find a number such that the metric for all the duplicates is below it, while the metric for the new pictures is above it. That number is the threshold that should be set for the 'threshold' parameter. It is usually about 1.0-2.5, but it may differ depending upon how noisy your clip is. It's important to get a good value for this threshold, so do it carefully, inspecting several sections of your clip.
Producing the Decimated Output Once the target frame rate and duplicate threshold have been determined, it is easy to produce the final decimated output. Assuming our target frame rate is 20.8 fps, and our duplicate threshold is 2.1, we would use this FDecimate() call:
FDecimate(rate=20.8,threshold=2.1)
The idea that the "perfectionists" propose is to simply discard all duplicate frames. This process is successful, i.e., audio/video sync is retained, if the duplicate pattern is consistent. But if the pattern is inconsistent, and often it is not, then sync is compromised. To see why, consider this duplicate pattern (actually observed in an HD clip):
...3232323232321222212232323232...
Most of the time, the clip is 323232..., which implies a 24 fps rate for the clip with the duplicates removed (the source frame rate is 59.94 fps). But in the middle of the pattern, the number of duplicates drops for a while, which implies a faster base frame rate. If we just drop all duplicates and retain the unique frames, we will be playing this middle section too slow at 24 fps. Be aware that the audio is synchronized to the pattern prior to decimation, so playing slow for a while will lead to audio/video desync. In fact, the pattern above will throw off the sync by about 200 milliseconds, which is massive. A few hits like this in a row can throw off the audio by handfuls of seconds!
(Another problem with the idea of just throwing out all duplicates is that clips often have static sections where there is no motion. We don't want to throw them away! It's conceivable to detect them and try to spare them but it is an additional complication. The serious problem is the synchronization issue described above.)
Thus we can see that in the presence of variations of the underlying frame rate due to irregular patterns of duplicates, we will destroy audio/video sync if we try to use the "perfect" approach. All we can really do is to enforce the specified rate, while trying to prefer delivery of unique frames to duplicates, where possible. FDecimate() uses this strategy.
If we have such a clip, in which there are random sections where the number of duplicates is reduced, we may skip over good frames when we decimate. This causes perceivable jerks in the output. What can we do about this? We can take advantage of the fact that extra duplicates are less noticable than omitted frames. If we set the target frame rate higher, we will retain more frames from the source, thereby reducing the chances of skipping frames. Of course, this will retain more duplicates, too, but that is less objectionable. For the example HD clip above, I was able to set a frame rate of 30 fps instead of 24 fps and thereby achieve reasonable results, while still outputting a standard frame rate.
For completeness I add this extra note. It is possible that the rate variations noted above could be bipolar, i.e, some reduce the underlying rate and some increase it. Here is an example:
32323232222222323232323333333332323232323...
Theoretically we could try to decimate perfectly (as described above) and keep track of the resulting audio/video desync. As long as the desync remained within acceptable limits, we could output "perfect" results. If the desync moved outside the limits, we'd have to skip or duplicate a frame. The problem with this approach is that in practice we never see such clips. What we see is clips with an underlying base rate randomly punctured by an increase of that rate (i.e., short periods where there are not as many duplicates as expected by the pattern).
I thus believe that there is no perfect solution for clips with random base rate variations. However, I remain receptive to any new thinking that might improve things. Please feel free to contact me if you have any ideas for improving FDecimate().
FDecimate(parameter_list)
Here is an example:
FDecimate(rate=23.976,threshold=0.8,show=true)
rate (float, default 23.976) This parameter sets the desired output frame rate. Frames will be removed from the video to achieve this frame rate while keeping audio and video in sync.
threshold (float, default 1.0) This parameter sets the threshold difference metric for duplicate detection. If the difference metric between two frames exceeds this threshold, the two frames are considered to be different frames. i.e., not duplicates. Refer to the "How to Use FDecimate()" section above for an explanation of how to set this threshold properly.
metrics (true/false, default false) This parameter is used to determine the proper threshold to use for duplicate detection. When it is set to true, no decimation occurs and the difference metric for each frame is shown overlaid on the video and in the DebugView output. Refer to the "How to Use FDecimate()" section above for an explanation of how to use this parameter to set the threshold properly.
show (true/false, default false) This parameter enables information to be displayed on the frame. It also displays the software version.
debug (true/false, default false) This parameter enables information to be printed via OutputDebugString(). A utility called DebugView is available for catching these strings. The information displayed is the same as shown by the show option above.
If you do not like the defaults as documented above, you can set your own standard defaults. To override the defaults, create a default file in the Avisynth plugins directory. For example, to set the default threshold=2.0 for FDecimate(), make a file called FDecimate.def and put this line in it:
threshold=2.0
You can list as many parameter assignments as you like, one per line. Those not specified assume the default values documented above. Of course, you can always override the defaults in your scripts when you invoke the functions. NOTE: The lines in the defaults file must not contain any spaces or tabs.
Copyright (C) 2004, Donald A. Graft, All Rights Reserved.
For updates and other filters/tools, visit my web site:
http://neuron2.net/
$Date: 2004/08/13 21:57:25 $