[XviD-devel] Hadamard Transform
skal
xvid-devel@xvid.org
06 Sep 2002 13:59:06 +0200
On Fri, 2002-09-06 at 13:35, Christoph Lampert wrote:
> On 6 Sep 2002, skal wrote:
> > >         1 -= 2           2 += 2             2 += 1
> > > (a, b) -----> (a-b, b) -----> (a-b, 2*b) -----> (a-b, b+a)
> > >
> > > I don't know if this is faster (extra operation?), but it
> > > might help not needing a 9th register.
> >
> > Indeed, it could apply here, but that's not a good
> > trick when precision is at stake. The *2 operation
> > will spoil (precious) 1bit of precision. Ok, for
> > Hadamard, it's useless, but for DCT, e.g, where room
> > for fractional bits is very tight, it can ruin
> > the scheme. It should not be tried first, actually.
> > Only on the late late stage of optimization, if it
> > leads to better re-ordering...
>
> Hm, would it? For DCT, maybe, if you multiply with something before
> adding...
> But here, since you calculate b+a anyway, and a-b, too, you need several
> extra bits anyway if you want to keep precision. Not?
>
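For reference, the three-step sequence quoted at the top can be sketched in C as below. This is only an illustration: butterfly_inplace is a made-up name, and plain ints are used so that nothing overflows here.

```c
/* Hypothetical helper (name is illustrative only): in-place butterfly
   following the quoted sequence, using no temporary besides a and b. */
static void butterfly_inplace(int *a, int *b)
{
    *a -= *b;   /* (a, b)     -> (a-b, b)   */
    *b += *b;   /* (a-b, b)   -> (a-b, 2*b) */
    *b += *a;   /* (a-b, 2*b) -> (a-b, a+b) */
}
```

Called with a=1, b=126 it leaves a=-125 and b=127, same as computing a-b and a+b directly, as long as the intermediate 2*b fits in the type.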
Take an example: a=1, b=126, in 8-bit signed precision.
This happens often in DCT, when 'b' is the DC coeff
and 'a' a high-frequency one...
You have:
method 1) a+b = 127, a-b = -125. Ok, no overflow.
method 2) a-b = -125, 2*b = 252 -> ouch! overflow. That's -4 in
signed char, or 127 if you misused saturation. (a-b)+2b = -125-4 = -129
will be ok (wraps to 127) if no saturation occurs.
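To check these numbers, here is a minimal C sketch that emulates 8-bit signed arithmetic on plain ints. The helpers wrap8, sat8 and method2 are hypothetical names made up for this example, not taken from any library or from the DCT code.

```c
/* Wrap a wide value to 8-bit two's complement (modulo 256). */
static int wrap8(int v)
{
    unsigned u = (unsigned)v & 0xFFu;       /* keep the low 8 bits */
    return u >= 128u ? (int)u - 256 : (int)u;
}

/* Clamp a wide value to the signed 8-bit range [-128, 127]. */
static int sat8(int v)
{
    if (v > 127)  return 127;
    if (v < -128) return -128;
    return v;
}

/* Method 2: a-b, then 2*b, then (a-b)+2*b, forcing every intermediate
   result back into 8 bits with 'fit' (either wrap8 or sat8). */
static void method2(int a, int b, int (*fit)(int), int *diff, int *sum)
{
    int d = fit(a - b);     /* a-b */
    int t = fit(2 * b);     /* 2*b: this is where the overflow happens */
    *diff = d;
    *sum  = fit(d + t);     /* should equal a+b */
}
```

With a=1, b=126, method2 with wrap8 gives diff=-125 and sum=127 (the two overflows cancel out), whereas with sat8 it gives diff=-125 and sum=2, which is plain wrong.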
Worse would be the example a=1, b=254, with unsigned, where
the sign bit is definitely trashed, without remission, when taking 2*b.
Ok, all this can be sorted out by playing with signed/unsigned
additions, two's complement, saturation, etc., but when it
gets to be a real mess (I mean: look at *that* mess:
http://skal.planet-d.net/coding/skl_dct_AAN.cpp
I think I even used the sign-bit along with pseudo-unsigned mults to
gain 1 additional bit of precision... don't remember exactly...)
this is the kind of overflow that might stay unnoticed
until you see pink blocks flying around :)
> P.S. Another link: Hadamard is intended to become an approximation of DCT.
> The reasons you can see here:
More seriously, whenever you need the real ASM 1D/2D code
(not the C-like one I sent before), just tell me...
I'll take time to finish it for good...
bye,
Skal