[XviD-devel] More MMX improvements => funky cbp!
Skal
skal at planet-d.net
Wed Aug 3 12:48:37 CEST 2005
Hello Carlo!
On Fri, 2005-07-29 at 17:59, carlo.bramix wrote:
> Hello,
> thanks a lot for your replies.
> Unfortunately, I had assumed those routines were speed-critical because they had been rewritten in ASM.
> I will try to do improvements on other parts of the codec.
Actually, I've tried applying your patch,
and I see some binary differences in
the output (using xvid_encraw.c with forced
use of the MMX cpu).
Could you cross-check that your function is OK?
If it is not, then xvid_bench.c should be
enhanced to remove this false positive.
But it might just be me that messed the test up.
Anyway, do you feel like exercising a little,
just for the sport of it? Yes?
'coz I've had a look at your ASM code, and
the final bit-by-bit computation of the CBP
could be sped up a little, IMHO.
Mind you, we're only talking about a few %
speed-up on a few % of the cpu use here, but that's
just for the challenge (summer is sooo boring ;)
Here it goes:
cbp computation (for the luma part) is in fact
a scalar product:
cbp_y = 1.a + 2.b + 4.c + 8.d,
where a, b, c, and d are boolean values (one per
luma block) deduced from or'ing all the 8x8 (luma)
DCT coeffs (with the exception of the DC), and
'pcmpgtw'ing the result against zero.
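
For reference, the scalar product above can be sketched in plain C. This is only an illustration under assumptions: the helper names block_has_ac and cbp_luma_ref are made up, the four luma blocks are assumed to sit back to back as 4 x 64 int16 coeffs with the DC first in each block, and the bit order simply follows the formula in this mail (it is not taken from the XviD sources).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if any AC coefficient of one 8x8 block is
   non-zero. coeff[0] is assumed to be the DC term and is skipped. */
static int block_has_ac(const int16_t *coeff)
{
    int acc = 0;
    for (int i = 1; i < 64; i++)
        acc |= coeff[i];      /* the OR is non-zero iff some coeff is */
    return acc != 0;
}

/* Hypothetical reference: cbp_y = 1.a + 2.b + 4.c + 8.d, where
   a, b, c, d are the flags of blocks 0, 1, 2, 3 (assumed layout:
   4 consecutive blocks of 64 coefficients each). */
static unsigned cbp_luma_ref(const int16_t *coeffs)
{
    unsigned cbp = 0;
    assert(coeffs != 0);
    for (int k = 0; k < 4; k++)
        cbp += (unsigned)block_has_ac(coeffs + 64 * k) << k;
    return cbp;
}
```

A bit-by-bit version like this is what the multiply trick is meant to shorten.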
Now, you can easily compute this scalar product
with a good ol' 32-bit mult as:
cbp_y = ( 0xdcba * 0x1248 ) >> 24
where each "digit" stands for a whole byte:
0x1248 is really the constant 0x01020408, and
0xdcba is the 32-bit integer resulting from
packing (packssdw/wb) the four bools as
0xdcba = (d<<24) | (c<<16) | (b<<8) | (a)
This works because no overflow occurs in any of the
individual terms (each byte column sums to at most
8+4+2+1 = 15). Just write out the actual mult
(like in school) to see it:
          0x  d   c   b   a
        * 0x  1   2   4   8
  ----------------------------
+             8d  8c  8b  8a
+         4d  4c  4b  4a
+     2d  2c  2b  2a
+ 1d  1c  1b  1a
  ----------------------------
= ............^^
and look at the sum in the fourth column.
(yes, multiplication really is a convolution).
Shifting this column to LSBits (with >>24), you get
the cbp_y result with very few instructions.
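
A minimal C sketch of the trick, for anyone who wants to check it. The function names are made up for illustration, and it assumes the four flags have already been reduced to 0 or 1 (a raw pcmpgtw/packsswb result holds 0xFF per byte, so it would need an AND with 0x01010101 first). The multiplier is written out as the byte-wide constant 0x01020408.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 0/1 flags, one per byte: (d<<24) | (c<<16) | (b<<8) | a. */
static uint32_t pack_flags(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a <= 1 && b <= 1 && c <= 1 && d <= 1);
    return ((uint32_t)d << 24) | ((uint32_t)c << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;
}

/* One 32-bit multiply gathers a + 2b + 4c + 8d in the top byte;
   the lower byte columns never exceed 14, so nothing carries into it. */
static unsigned cbp_luma_mul(uint32_t dcba)
{
    return (unsigned)((uint32_t)(dcba * UINT32_C(0x01020408)) >> 24);
}

/* Returns 1 if the multiply matches the plain scalar product
   for all 16 combinations of the four flags, 0 otherwise. */
static int cbp_mul_matches_reference(void)
{
    for (unsigned v = 0; v < 16; v++) {
        unsigned a = v & 1, b = (v >> 1) & 1;
        unsigned c = (v >> 2) & 1, d = (v >> 3) & 1;
        if (cbp_luma_mul(pack_flags(a, b, c, d)) != a + 2*b + 4*c + 8*d)
            return 0;
    }
    return 1;
}
```

Since there are only 16 possible inputs, the exhaustive check in cbp_mul_matches_reference is cheap enough to keep around as a self-test.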
haf phun,
-Skal
(this mult trick is used in the GMC code, btw)