[XviD-devel] SAD of last visited neighbour positions

Michael Militzer xvid-devel@xvid.org
Mon, 21 Oct 2002 15:43:33 +0200


Hi,

> I just looked into ME again to add fields for saving the SAD
> values of previously visited positions (for faster halfpel/qpel).
>
> The more I started patching, the more I feel like it's a pain in the
> a** to add support for this.

> Whole ME is not structured to keep track of all these values (SAD and
> position). It can of course be done, but do you have any idea if it's
> really worth doing it?

I've measured a speed increase similar to what Michael N. described. However,
I recalculated all neighbouring SADs before the fast refinement (simply
because our ME does not store them), so I only saved 4 interpolation steps.
Even this limited implementation was already faster than image-based
interpolation (the test was also performed without inter4v or any other
fancy options). But since we already get four good MVs for inter4v mode
without any further 8x8 halfpel refinement, block-based interpolation will
surely be faster for "normal" ME (quality level <= 5).
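
The refinement rule itself is not spelled out in this thread. As a rough
sketch of one common variant (an assumption, not necessarily what was
measured above): if the SADs of the four integer-pel neighbours of the best
position are known, only the three halfpel positions in the more promising
quadrant are interpolated and checked instead of all eight. The type and
function names below are hypothetical and not taken from the XviD source.

```c
#include <stdint.h>

/* Hypothetical type, not part of XviD: an offset in halfpel units. */
typedef struct { int dx, dy; } HalfpelOffset;

/*
 * Pick the halfpel candidates worth checking around the best integer-pel
 * position, given the SADs of its left/right/up/down integer neighbours.
 * The smaller neighbour SAD on each axis decides the direction; ties go
 * to left/up. Writes three offsets to out[] and returns their count.
 */
static int pick_halfpel_candidates(uint32_t sad_left, uint32_t sad_right,
                                   uint32_t sad_up, uint32_t sad_down,
                                   HalfpelOffset out[3])
{
    int sx = (sad_left <= sad_right) ? -1 : 1;   /* horizontal direction */
    int sy = (sad_up <= sad_down) ? -1 : 1;      /* vertical direction   */

    out[0] = (HalfpelOffset){ sx, 0 };    /* horizontal halfpel */
    out[1] = (HalfpelOffset){ 0, sy };    /* vertical halfpel   */
    out[2] = (HalfpelOffset){ sx, sy };   /* diagonal halfpel   */
    return 3;
}
```

With this rule only three block interpolations are needed per macroblock,
but it only pays off if the four neighbour SADs are available for free.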

The whole thing might become even more interesting for qpel: Currently we're
using image-based interpolation for the halfpel positions with a 6-tap
low-pass filter (as proposed by the JVT specs). In my tests this seemed to
be a pretty good approximation (at least for ME). However, since I have made
several changes to the interpolation (MC) code since then, I feel I should
redo these tests. Maybe the correct way (block-based interpolation using the
8-tap low-pass filter for calculating the halfpel positions during the ME
step) will give better results now. If so, a fast block-based refinement
will give a substantial speed improvement here, since the needed
interpolated pixels are much harder to calculate in qpel than in halfpel
mode...
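
To make the difference between the two filters concrete, here is a minimal
one-dimensional sketch. It assumes the usual coefficient sets,
(1, -5, 20, 20, -5, 1)/32 for the JVT 6-tap filter and
(-1, 3, -6, 20, 20, -6, 3, -1)/32 for the MPEG-4 8-tap filter, and it
deliberately leaves out rounding control and the mirroring at block edges,
so it only applies to interior samples. The function names are hypothetical.

```c
#include <stdint.h>

/* Add the rounding offset, divide by 32 and clamp to the 8-bit range. */
static uint8_t round_shift_clip(int v)
{
    v += 16;
    if (v < 0)
        return 0;
    v >>= 5;
    return (uint8_t)(v > 255 ? 255 : v);
}

/*
 * Halfpel sample between p[0] and p[1] using the 6-tap filter.
 * The caller must guarantee that p[-2] .. p[3] are readable.
 */
static uint8_t halfpel_6tap(const uint8_t *p)
{
    int v = (p[-2] + p[3]) - 5 * (p[-1] + p[2]) + 20 * (p[0] + p[1]);
    return round_shift_clip(v);
}

/*
 * Halfpel sample between p[0] and p[1] using the 8-tap filter.
 * The caller must guarantee that p[-3] .. p[4] are readable.
 */
static uint8_t halfpel_8tap(const uint8_t *p)
{
    int v = -(p[-3] + p[4]) + 3 * (p[-2] + p[3])
            - 6 * (p[-1] + p[2]) + 20 * (p[0] + p[1]);
    return round_shift_clip(v);
}
```

On flat areas and clean edges both filters return the same value; they
differ where the outer taps see detail, which is where the 6-tap
approximation can steer ME to a slightly different position than the MC
filter would.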

Btw: Christoph, what's the main problem you're currently facing? From my
understanding, it should be sufficient to modify the DiamondSearch functions
to store the SADs of already visited positions, right? Or is the
modification much more complex? (Please excuse my ignorance in this area...)
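
Roughly what I have in mind, as a simplified sketch only: a small table
indexed by the motion vector offset remembers every SAD the search has
computed, so a revisited position costs nothing and the neighbours of the
final position are available to the refinement step. Everything here
(SadCache, sad_cached, the fixed RANGE, the small-diamond loop, toy_sad) is
hypothetical and much simpler than the real DiamondSearch functions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RANGE       16                 /* assumed search range in pixels */
#define DIM         (2 * RANGE + 1)
#define SAD_UNKNOWN UINT32_MAX         /* marks a position not yet visited */

/* Callback computing the SAD for a candidate offset (dx, dy). */
typedef uint32_t (*SadFunc)(int dx, int dy, void *ctx);

typedef struct {
    uint32_t sad[DIM][DIM];            /* indexed [dy + RANGE][dx + RANGE] */
    int evaluations;                   /* number of real SAD computations  */
} SadCache;

static void sad_cache_reset(SadCache *c)
{
    memset(c->sad, 0xFF, sizeof c->sad);   /* all entries = SAD_UNKNOWN */
    c->evaluations = 0;
}

/* Return the stored SAD for (dx, dy), computing it on first visit only. */
static uint32_t sad_cached(SadCache *c, int dx, int dy, SadFunc f, void *ctx)
{
    if (dx < -RANGE || dx > RANGE || dy < -RANGE || dy > RANGE)
        return SAD_UNKNOWN;            /* outside the window: never chosen */
    uint32_t *slot = &c->sad[dy + RANGE][dx + RANGE];
    if (*slot == SAD_UNKNOWN) {
        *slot = f(dx, dy, ctx);
        c->evaluations++;
    }
    return *slot;
}

/* Small-diamond search starting at (*best_dx, *best_dy). */
static void diamond_search(SadCache *c, SadFunc f, void *ctx,
                           int *best_dx, int *best_dy)
{
    static const int step[4][2] = { {-1, 0}, {1, 0}, {0, -1}, {0, 1} };
    int cx = *best_dx, cy = *best_dy;
    uint32_t best = sad_cached(c, cx, cy, f, ctx);

    for (;;) {
        int nx = cx, ny = cy;
        for (int i = 0; i < 4; i++) {
            int x = cx + step[i][0], y = cy + step[i][1];
            uint32_t s = sad_cached(c, x, y, f, ctx);
            if (s < best) { best = s; nx = x; ny = y; }
        }
        if (nx == cx && ny == cy)
            break;                     /* centre is the local minimum */
        cx = nx; cy = ny;
    }
    *best_dx = cx;
    *best_dy = cy;
}

/* Stand-in cost function with its minimum at (3, -2), for illustration. */
static uint32_t toy_sad(int dx, int dy, void *ctx)
{
    (void)ctx;
    return (uint32_t)(10 * (abs(dx - 3) + abs(dy + 2)));
}
```

When the loop stops, the four integer neighbours of the final position are
guaranteed to be in the table, because the last iteration has just looked at
all of them. The cost is one table reset per macroblock.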

bye,
Michael