[XviD-devel] Subpelrefine_Fast

Michael Militzer michael at xvid.org
Fri Sep 12 17:06:32 CEST 2003


Hi,

Quoting Radek Czyz <syskin at ihug.com.au>:

> Hi everyone.
> 
> I've been looking at new Subpelrefine_Fast() and I have some
> questions. Isibaar, I hope you'll find time to answer them :)
> 
> First - do you think the idea will be useful for halfpel-refinement?
> It looks like halfpel-ready, but you only use it for qpel.
> The way I see it, the total number of checks will actually increase
> for halfpel... Am I wrong?

I don't expect it to be useful for halfpel. Since the halfpel planes are
precalculated we won't save any interpolations (and that's what is costly
for quarterpel refinement). So normal halfpel should be faster but I added
the if(quarterpel) statements to make it work in halfpel mode also, so it
can be used at least for testing purposes...
 
> Second - a bug? You seem to check 8 halfpel positions first, to find
> the "second best" vector among them. Makes sense but...
>     second_best = *data->currentMV;
> As long as data->qpel_precision is set (and it is), CheckCandidates
> will not change data->currentMV, but data->currentQMV instead.

Hm, if that's really the case then it's wrong. As I already mentioned after
GomGom found a bug, my actual version of fast_refine (which I used for my
tests) was qpel specific. I just added these if(quarterpel) to make it work
with halfpel also (but I didn't test what I committed -> shame on me but I
simply wanted to finish this stuff because I'm busy with other things as
well...)

So as a fix, when qpel_precision is 1, we have to set it to zero, then look
for the second best (perform these 8 checks), and after this switch qpel
precision to 1 again.
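
As a rough sketch of that fix (not the committed code): the types and helper
names below (SearchData, check_candidate, second_best_halfpel, demo_cost) are
simplified, hypothetical stand-ins, and the real XviD structures and
CheckCandidate signatures differ. Resetting the running best cost before the
8 checks is also an assumption of this sketch, made so that the best
neighbour is recorded even when it is worse than the centre.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical, simplified search state; not XviD's real SearchData. */
typedef struct { int x, y; } VECTOR;

typedef struct {
    int qpel_precision;                   /* 1: candidates are qpel vectors */
    VECTOR currentMV;                     /* best halfpel vector so far     */
    VECTOR currentQMV;                    /* best qpel vector so far        */
    int best_cost;                        /* cost of the current best       */
    int (*cost)(int x, int y, int qpel);  /* hypothetical cost callback     */
} SearchData;

/* Writes to currentQMV or currentMV depending on qpel_precision,
   which is exactly the behaviour that caused the bug. */
static void check_candidate(SearchData *d, int x, int y)
{
    const int c = d->cost(x, y, d->qpel_precision);
    if (c < d->best_cost) {
        VECTOR *dst = d->qpel_precision ? &d->currentQMV : &d->currentMV;
        d->best_cost = c;
        dst->x = x;
        dst->y = y;
    }
}

/* Check the 8 halfpel neighbours of currentMV with qpel switched off,
   return the best of them, then restore the previous state. */
static VECTOR second_best_halfpel(SearchData *d)
{
    static const int dx[8] = {-1, 0, 1, -1, 1, -1, 0, 1};
    static const int dy[8] = {-1, -1, -1, 0, 0, 1, 1, 1};
    VECTOR centre, second_best;
    int saved_precision, saved_cost, i;

    assert(d != NULL);
    centre = d->currentMV;
    saved_precision = d->qpel_precision;
    saved_cost = d->best_cost;

    d->qpel_precision = 0;      /* make the checks update currentMV */
    d->best_cost = INT_MAX;     /* assumption: rank neighbours only */
    for (i = 0; i < 8; i++)
        check_candidate(d, centre.x + dx[i], centre.y + dy[i]);
    second_best = d->currentMV;

    d->currentMV = centre;      /* restore state for the qpel stage */
    d->best_cost = saved_cost;
    d->qpel_precision = saved_precision;
    return second_best;
}

/* Toy cost function for experimenting; purely illustrative. */
static int demo_cost(int x, int y, int qpel)
{
    (void)qpel;
    return abs(x - 5) + 2 * abs(y - 2);
}
```

With demo_cost and a centre of (2,2), the neighbour (3,2) comes back as
second best, while currentMV, currentQMV and qpel_precision are left as they
were before the call.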

BTW: The check (at least part of it) of the 8 neighbours to find the second
best could already be done during regular halfpel refinement. There isn't
much to gain from that for qpel (because halfpel checks are pretty cheap
compared to qpel checks), but if you want to use the fast refinement for
halfpel, it might be worth trying to find the second best neighbour already
during the ordinary full-pel search...
 
> As a result, it just can't work correctly - or I missed something ;)
> 
> 
> The _fast idea is very good, and will be even more useful for VHQ
> refinement and b-frame refinement. To implement them, I plan to change
> it a bit. There will only be one CheckCandidate, but the information
> about "second best" etc will only be stored in memory if
> qpel_precision is active. Should work as expected, and not be too
> slow.

I'd suggest you try SATD as the comparison metric for refinement. I'd say it
should also give good results (close to R-D) while being much faster.
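
For reference, a minimal sketch of SATD on a single 4x4 block: the residual
is passed through a 4x4 Hadamard transform and the absolute coefficients are
summed. The function name satd4x4 is hypothetical, the result is left
unnormalised, and this is a generic illustration rather than XviD's own
routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Unnormalised 4x4 SATD between two 8-bit blocks sharing one stride.
   Generic illustration; the name and layout are assumptions. */
static int satd4x4(const uint8_t *cur, const uint8_t *ref, int stride)
{
    int d[16], t[16];
    int i, j, sum = 0;

    assert(cur != NULL && ref != NULL);

    /* residual */
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            d[4 * i + j] = cur[i * stride + j] - ref[i * stride + j];

    /* horizontal Hadamard butterflies, one row at a time */
    for (i = 0; i < 4; i++) {
        const int a0 = d[4 * i + 0] + d[4 * i + 1];
        const int a1 = d[4 * i + 0] - d[4 * i + 1];
        const int a2 = d[4 * i + 2] + d[4 * i + 3];
        const int a3 = d[4 * i + 2] - d[4 * i + 3];
        t[4 * i + 0] = a0 + a2;
        t[4 * i + 1] = a1 + a3;
        t[4 * i + 2] = a0 - a2;
        t[4 * i + 3] = a1 - a3;
    }

    /* vertical butterflies, accumulating absolute coefficients */
    for (j = 0; j < 4; j++) {
        const int b0 = t[j] + t[4 + j];
        const int b1 = t[j] - t[4 + j];
        const int b2 = t[8 + j] + t[12 + j];
        const int b3 = t[8 + j] - t[12 + j];
        sum += abs(b0 + b2) + abs(b1 + b3) + abs(b0 - b2) + abs(b1 - b3);
    }
    return sum;
}
```

A 16x16 comparison would sum this over the sixteen 4x4 sub-blocks. It costs
more than plain SAD per candidate, but far less than a full transform,
quantisation and bit count as in R-D.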

Regarding b-frames: b-frames + qpel is currently extremely slow - we should
try to do something about this. Several things come to mind: we could try
to do mode decision already after halfpel refinement and only perform qpel
refinement for the one mode chosen by mode decision. It should be checked
whether the direct mode delta search is currently performed in qpel
precision (I don't remember offhand) - if so, this is super-slow. We should
then round down to integer or halfpel and refine afterwards. Another idea
about mode decision would be to introduce some kind of early stopping: most
of the time, direct mode is used and we search for it first. If the match
found is good, we could simply stop checking any other modes. Also:
interpolated mode is only rarely used, so maybe we should try to find some
heuristic for when to perform search_interpolate and when not.
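
The early-stopping idea could look roughly like this. Everything here is a
hypothetical sketch: the mode enum, the search callback, choose_b_mode and
the threshold parameter are made up for illustration, and what counts as a
"good" direct match would have to be tuned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical b-frame mode list; direct mode is searched first. */
enum { MODE_DIRECT, MODE_FORWARD, MODE_BACKWARD, MODE_INTERPOLATE, MODE_COUNT };

/* Hypothetical callback: runs the search for one mode, returns its cost. */
typedef int (*mode_search_fn)(int mode, void *ctx);

/* Search direct mode first; if its cost is at or below the threshold,
   skip all other modes. Otherwise search the rest and keep the cheapest.
   *searches_run reports how many mode searches were actually performed. */
static int choose_b_mode(mode_search_fn search, void *ctx,
                         int early_stop_threshold, int *searches_run)
{
    int best_mode = MODE_DIRECT;
    int best_cost, mode;

    assert(search != NULL && searches_run != NULL);

    best_cost = search(MODE_DIRECT, ctx);
    *searches_run = 1;
    if (best_cost <= early_stop_threshold)
        return best_mode;               /* good enough: stop here */

    for (mode = MODE_DIRECT + 1; mode < MODE_COUNT; mode++) {
        const int cost = search(mode, ctx);
        (*searches_run)++;
        if (cost < best_cost) {
            best_cost = cost;
            best_mode = mode;
        }
    }
    return best_mode;
}

/* Toy callback for experimenting: ctx is a table of per-mode costs. */
static int demo_search(int mode, void *ctx)
{
    return ((const int *)ctx)[mode];
}
```

A heuristic for skipping search_interpolate would slot into the loop in the
same way, as an extra condition before calling the search for that mode.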

But surely, it would be great if we had an R-D optimized b-frame mode
decision - just to have an idea of what the optimal solution looks like. I
really think that there is something to gain from R-D mode decision in
b-frames (maybe even more than with p-frames).

> I'll combine it with halfpixel refinement to save some extra halfpixel
> checkcandidates (from 3 to 8).

yep, that's what I meant above.
 
> BTW I kinda fear about the complexity of CheckCandidates... On
> average, CheckCandidate16 takes about 94 cpu cycles. That doesn't
> include functions it calls (like sad16v). Do you think it's slow?

that's why I created a new CheckCandidate function - I didn't want to change
or slow down the normal behaviour. After all, checking for a second best
match slows down CheckCandidate (and adds some new ifs).
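
To illustrate where the extra cost comes from, here is a minimal sketch of
tracking the two best candidates. The names (BestTwo, best_two_init,
check_candidate_2best) are hypothetical and the real function does more work
per candidate; the point is only the additional branch and stores.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef struct { int x, y; } VECTOR;

/* Hypothetical tracker for the best and second-best candidates. */
typedef struct {
    VECTOR best, second;
    int best_sad, second_sad;
} BestTwo;

static void best_two_init(BestTwo *t)
{
    assert(t != NULL);
    t->best.x = t->best.y = 0;
    t->second.x = t->second.y = 0;
    t->best_sad = INT_MAX;
    t->second_sad = INT_MAX;
}

/* Compared with a plain "keep the minimum" check, this needs one more
   comparison on the non-improving path and more stores when the best
   changes. */
static void check_candidate_2best(BestTwo *t, int x, int y, int sad)
{
    assert(t != NULL);
    if (sad < t->best_sad) {
        t->second = t->best;            /* old best becomes second best */
        t->second_sad = t->best_sad;
        t->best.x = x;
        t->best.y = y;
        t->best_sad = sad;
    } else if (sad < t->second_sad) {
        t->second.x = x;
        t->second.y = y;
        t->second_sad = sad;
    }
}
```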

bye,
Michael

