[XviD-devel] Changes to get_pmv2

Wed Sep 22 16:48:29 CEST 2004

On Wed, 22 Sep 2004, Andrew Voznytsa wrote:
> I don't know how you did it exactly, but a few suggestions how to get at 
> least ~30-40% speedup for 2 CPUs:
> 1) create threads on encoder init stage
> 2) in I/P VOP case enable VideoPackets, create N (N <= number of cpus) 
> VideoPackets and encode each VideoPacket in his own thread. Under encode 
> I mean everything: ME, DCT, bit packing. On post encode stage you'll 
> have to join N bitstream buffers (not a problem, they'll be byte aligned).
> 3) in case if B VOPs are present too, try to encode each B VOP (without 
> VideoPackets) in his own thread (as Radek Chyz proposed).

Yes, I know, this would be more successful than my approach. 
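
To make the suggested scheme concrete, here is a minimal sketch of it, 
not XviD code. The frame is modelled as a list of macroblock rows, and 
encode_packet() is a hypothetical stand-in for ME + DCT + bit packing of 
one VideoPacket; the marker byte it writes is made up. The pool is meant 
to be created once at encoder init and reused for every frame. 

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(mb_rows, n_packets):
    """Split mb_rows macroblock rows into n_packets contiguous ranges."""
    base, extra = divmod(mb_rows, n_packets)
    ranges, start = [], 0
    for i in range(n_packets):
        count = base + (1 if i < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


def encode_packet(frame, row_range):
    """Hypothetical stand-in for encoding one packet (ME, DCT, bit packing).

    Returns a bytes object, i.e. a buffer that is byte aligned by
    construction. The 0xFF marker is invented for this sketch and is not
    the real MPEG-4 resync marker.
    """
    first, last = row_range
    header = bytes([0xFF, first & 0xFF])
    payload = bytes(sum(row) & 0xFF for row in frame[first:last])
    return header + payload


def encode_frame(frame, pool, n_packets):
    """Encode each packet in its own worker, then join the buffers in order."""
    ranges = split_rows(len(frame), n_packets)
    # map() yields results in submission order, so the packets end up in
    # the bitstream in the right sequence whichever worker finishes first.
    buffers = pool.map(lambda r: encode_packet(frame, r), ranges)
    return b"".join(buffers)
```

Usage would be along the lines of 
`pool = ThreadPoolExecutor(max_workers=2)` at init, then 
`encode_frame(frame, pool, 2)` per frame. This only shows the structure: 
Python threads will not give a real speedup for CPU-bound work, which a 
C encoder with native threads would. 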

What I was referring to was an early test (in 2001) that did only ME in 
parallel, splitting the image into N (e.g. N=2) vertical stripes and 
working on each independently (but with the correct predictors, so some 
sync was needed where the stripes meet). 
Still, it turned out that there was simply too much overhead involved for 
it to be of any use, and that's one of the reasons why I have been 
proposing coarse-grain parallelism (as you suggest) instead of splitting 
individual routines. 
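
To show where the sync in the stripe scheme comes from, here is a rough 
sketch. It is an illustration, not the 2001 test code. It assumes the 
predictor is the median of the left, top and top-right neighbours, uses 
scalar "vectors" for brevity, and search() is a hypothetical stand-in for 
the real block search. The first block of a row needs the left neighbour 
from the previous stripe, and the last block needs the top-right 
neighbour from the next stripe, so each stripe has to wait at both edges. 

```python
import threading


def predictor(mv, x, y):
    """Median of left, top and top-right neighbours; missing ones count as 0."""
    width = len(mv[0])
    left = mv[y][x - 1] if x > 0 else 0
    top = mv[y - 1][x] if y > 0 else 0
    top_right = mv[y - 1][x + 1] if y > 0 and x + 1 < width else 0
    return sorted((left, top, top_right))[1]


def search(cost, mv, x, y):
    """Hypothetical stand-in for the block search: predictor plus an offset."""
    return predictor(mv, x, y) + cost[y][x]


def run_stripe(cost, mv, k, bounds, first_done, last_done):
    """Process stripe k in raster order, waiting at the stripe borders."""
    x0, x1 = bounds[k], bounds[k + 1]
    for y in range(len(cost)):
        for x in range(x0, x1):
            if x == x0 and k > 0:
                last_done[k - 1][y].wait()       # left neighbour, stripe k-1
            if x == x1 - 1 and y > 0 and k + 1 < len(first_done):
                first_done[k + 1][y - 1].wait()  # top-right, stripe k+1
            mv[y][x] = search(cost, mv, x, y)
            if x == x0:
                first_done[k][y].set()
            if x == x1 - 1:
                last_done[k][y].set()


def estimate_parallel(cost, n_stripes):
    """Run one thread per vertical stripe; returns the motion vector field."""
    rows, width = len(cost), len(cost[0])
    if width < n_stripes:
        raise ValueError("need at least one block column per stripe")
    mv = [[0] * width for _ in range(rows)]
    bounds = [width * k // n_stripes for k in range(n_stripes + 1)]
    first_done = [[threading.Event() for _ in range(rows)]
                  for _ in range(n_stripes)]
    last_done = [[threading.Event() for _ in range(rows)]
                 for _ in range(n_stripes)]
    threads = [threading.Thread(target=run_stripe,
                                args=(cost, mv, k, bounds, first_done, last_done))
               for k in range(n_stripes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mv


def estimate_serial(cost):
    """Reference result: the same search over the whole frame in raster order."""
    mv = [[0] * len(cost[0]) for _ in cost]
    for y in range(len(cost)):
        for x in range(len(cost[0])):
            mv[y][x] = search(cost, mv, x, y)
    return mv
```

With these dependencies, stripe k cannot start a row before stripe k-1 
has finished the same row, so the stripes run staggered by about one row 
and hit two sync points per row, which is the kind of overhead meant 
above. 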

gruel