[XviD-devel] multithread rework

Con Kolivas kernel at kolivas.org
Wed Apr 23 11:23:22 CEST 2008


On Wed, 23 Apr 2008 17:37:09 iibot wrote:
> Con Kolivas wrote:
> >> Con Kolivas wrote:
> >
> > Converting it to strict locking has the effect of performing at
> > virtually the same rate of frame encoding, but with much less CPU
> > usage. What this means is that no CPU is wasted whatsoever by burning
> > in the yield() loopback. The reason we don't get any more encoding
> > speed is that as I said previously, it's not CPU limiting that
> > prevents further speed gains; it's actually the fact that each
> > successive thread depends entirely on the progress of the previous
> > thread before it can make any forward progress. This is what I was
> > worried about earlier when I said it didn't sound particularly
> > scalable.
> >
> > Anyway, for the moment, I don't have any magic bullets on how to speed
> > up the encoding. There simply isn't enough work for each thread to do
> > before it blocks waiting on the other thread. Unless there was another
> > way of parallelising the workloads such that each thread has more work
> > to do before blocking, I think you're pretty close to the ceiling of
> > the scalability of this approach. I'll look to see if there's anything
> > else that can be done, but I'm not optimistic about the chances.
> >
> > It was fun playing though :-)
>
> That's good.
>
> Actually, now that it is not burning cycles, there are a few knobs to
> tune to extract more useful work from the CPUs.
>
> If possible, you could try changing when a thread starts working again
> by making it use larger sets of macroblocks, i.e. only start when
> the row above has advanced by k blocks (and not 1 as in the current
> code). This alone will make things worse, but together with more threads
> (at least 2x the number of cores) I guess there will be enough
> additional work for the CPUs to do.
>
> Anyway it's a small (and I hope worthwhile) change to try and it just
> *could* work :)

Unfortunately that's worse.

We have in the case of 8 threads:

thread 1 XXXXXXXX
thread 2 XXXXXXX
thread 3 XXXXXX
thread 4 XXXXX
thread 5 XXXX
thread 6 XXX
thread 7 XX
thread 8 X

Basically, each thread can't make progress until the thread above it makes
progress. Enlarging the X size doesn't really help: you need to keep the CPUs
idle for as short a time as possible, so smaller appears better.
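
To put rough numbers on the staircase above, here is a toy model in Python.
It is not the XviD code, only a sketch under stated assumptions: every
macroblock costs one time unit, rows are handed to threads round-robin, each
thread has its own CPU, and a row may encode block c only once the row above
has finished `lag` blocks past column c (clamped at the end of the row). The
name wavefront_makespan and its parameters are made up for this illustration.

```python
def wavefront_makespan(rows, width, threads, lag):
    """Finish time of the last macroblock under the toy model.

    rows, width: frame size in macroblocks (hypothetical values).
    threads:     row r is owned by thread r % threads.
    lag:         how many blocks the row above must be ahead (the "X size").
    """
    finish = [[0] * width for _ in range(rows)]
    for r in range(rows):
        # The owning thread is busy until it finishes its previous row.
        t = finish[r - threads][width - 1] if r >= threads else 0
        for c in range(width):
            if r > 0:
                # Block in the row above that must be done first,
                # clamped so it never runs past the end of the row.
                dep = finish[r - 1][min(c + lag, width) - 1]
                t = max(t, dep)
            t += 1  # assumed cost: one time unit per macroblock
            finish[r][c] = t
    return finish[rows - 1][width - 1]


# 8 rows of 45 macroblocks, one thread per row, as in the diagram.
print(wavefront_makespan(8, 45, 8, lag=1))  # 52
print(wavefront_makespan(8, 45, 8, lag=4))  # 73
```

In this model each row starts `lag` time units after the row above, so the
startup staircase costs (rows - 1) * lag on top of the width of one row.
Raising the lag only stretches the period during which the lower threads sit
idle, which is consistent with smaller appearing better.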

-- 
-ck