[XviD-devel] Re: mrSAD

Christoph Lampert xvid-devel@xvid.org
Mon, 8 Jul 2002 09:58:16 +0200 (CEST)


On Mon, 8 Jul 2002, peter ross wrote:
> ive reordered the c loops to extract more performance (+~25%). the mmx 
> version is ~3.5 times faster :-)

Hi Pete, or anyone else knowledgeable,

for B-frame ME (which is my next task), an MMX version of sad16bi()
from sad.c would be nice. The C code, sad16bi_c, is already there.

Also, for syskin's ME and for my rewrite of SEARCH8(), which is much too
slow: would it be possible to have a vector-valued SAD16 that also
returns the four SAD8 values?

The functionality should be:

int sad16v_c(ptr,ptr, int* sad8)
{
  sad8[0] = SAD8(topleftblock);
  sad8[1] = SAD8(toprightblock);
  sad8[2] = SAD8(bottomleftblock);
  sad8[3] = SAD8(bottomrightblock);

  return (sad8[0]+sad8[1]+sad8[2]+sad8[3]);
}

However, I guess it would be better to extract the SAD8 values from the
intermediate sums of SAD16 than to call SAD8 four times.
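
To illustrate what I mean by taking the SAD8 values from the intermediate
sums, here is a rough plain-C sketch. The name sad16v_sketch, the stride
parameter and the uint32_t types are only my assumptions for this example,
not the existing sad.c interface, and an MMX version would of course be
organised differently:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch: a single pass over a 16x16 block that keeps one
 * accumulator per 8x8 quadrant. Name and signature are assumptions.
 * sad8[0] = top left, sad8[1] = top right,
 * sad8[2] = bottom left, sad8[3] = bottom right. */
uint32_t sad16v_sketch(const uint8_t *cur, const uint8_t *ref,
                       uint32_t stride, uint32_t sad8[4])
{
    uint32_t x, y;

    sad8[0] = sad8[1] = sad8[2] = sad8[3] = 0;

    for (y = 0; y < 16; y++) {
        /* rows 0..7 feed sad8[0] and sad8[1], rows 8..15 feed sad8[2] and sad8[3] */
        uint32_t *acc = sad8 + ((y >> 3) << 1);

        for (x = 0; x < 8; x++) {
            acc[0] += (uint32_t)abs((int)cur[x] - (int)ref[x]);         /* left half  */
            acc[1] += (uint32_t)abs((int)cur[x + 8] - (int)ref[x + 8]); /* right half */
        }
        cur += stride;
        ref += stride;
    }

    /* the 16x16 SAD is simply the sum of the four quadrant SADs */
    return sad8[0] + sad8[1] + sad8[2] + sad8[3];
}
```

This way every pixel is read only once, and the caller gets the SAD16 as
the return value and the four SAD8 values in the array.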

Christoph 

P.S. Pete: what do I have to do to activate DIRECT MODE? Simply set the
flag in the MACROBLOCK structure during ME, or is more work needed?

-- 
Christoph Lampert
Beringstr. 6, Room 14 Tel. (0228) 73-2948 | [...] generated and requires
Office hours: none, but usually around    | no signature. AZ 27B-6