[XviD-devel] multithread rework
Radek Czyz
radoslaw at syskin.cjb.net
Tue Apr 22 15:44:41 CEST 2008
Hello, I'm the author of this code (don't laugh ;P) so I should be able
to help.
First, this code doesn't actually divide the picture into slices. The
output is identical regardless of the number of threads; this was my
design goal and it's the reason the code works the way it does. I'm not
particularly attached to this design goal though, it was more of a "can
it be done" experiment.
In order to encode a macroblock, a thread needs the macroblock to its
top/right, in the row above, to be encoded already.
So, two things happen:
* each thread keeps a count of "how many macroblocks are encoded in
this row" in its "complete_count_self" memory location.
* each thread encodes no more blocks than "complete_count_above" allows.
This is again a memory location, updated by whichever thread works on
the macroblock row above it.
Thread initialization code ensures that the threads for row N and row
N+1 share the same pointer in order to communicate. The single memory
location is updated by the thread for row N and read by the thread for
row N+1.
All synchronization happens with this set of memory locations, nothing
more is needed.
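
To make the scheme above concrete, here is a minimal sketch of it. It is
not the actual XviD code: the frame size, the one-thread-per-row layout,
the names row_ctx, encode_row and run_wavefront, and the use of C11
atomics for the counters are all assumptions made for illustration. It
also assumes "top/right" means the block one column to the right in the
row above.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

enum { MB_ROWS = 4, MB_COLS = 8 };  /* hypothetical frame size, in macroblocks */

typedef struct {
    atomic_int *complete_count_self;   /* written only by this row's thread */
    atomic_int *complete_count_above;  /* written by the row above; NULL for row 0 */
} row_ctx;

static void *encode_row(void *arg)
{
    row_ctx *ctx = arg;
    for (int x = 0; x < MB_COLS; x++) {
        if (ctx->complete_count_above != NULL) {
            /* Assumed dependency: blocks 0..x+1 of the row above must be
               done (top-right), clamped at the end of the row. */
            int needed = (x + 2 < MB_COLS) ? x + 2 : MB_COLS;
            while (atomic_load(ctx->complete_count_above) < needed)
                sched_yield();  /* the row above is not far enough yet */
        }
        /* A real encoder would encode macroblock x of this row here. */
        atomic_store(ctx->complete_count_self, x + 1);
    }
    return NULL;
}

/* Runs one thread per row; returns the total number of blocks marked done. */
int run_wavefront(void)
{
    atomic_int counts[MB_ROWS];
    row_ctx ctx[MB_ROWS];
    pthread_t tid[MB_ROWS];
    int total = 0;

    /* Rows r-1 and r share one counter: "self" for r-1, "above" for r. */
    for (int r = 0; r < MB_ROWS; r++) {
        atomic_init(&counts[r], 0);
        ctx[r].complete_count_self = &counts[r];
        ctx[r].complete_count_above = (r > 0) ? &counts[r - 1] : NULL;
    }
    for (int r = 0; r < MB_ROWS; r++) {
        int rc = pthread_create(&tid[r], NULL, encode_row, &ctx[r]);
        assert(rc == 0);
        (void)rc;
    }
    for (int r = 0; r < MB_ROWS; r++) {
        pthread_join(tid[r], NULL);
        total += atomic_load(&counts[r]);
    }
    return total;
}
```

Each counter has exactly one writer and one reader, which is why no lock
is needed in this sketch: a stale read only makes the lower row wait a
little longer.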
yield() simply means that the current thread can't encode a block yet
because the thread above it hasn't encoded far enough (or perhaps we're
reading an older value because of caches, in which case there's still
no harm done).
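
For comparison, a blocking wait on the same counter could look roughly
like the sketch below. This is only one possible alternative to the
yield loop, not what the current code does; the row_progress type and
its functions are hypothetical names, and it assumes POSIX threads.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical replacement for one shared "complete count" location. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  advanced;        /* signalled whenever the count grows */
    int             complete_count;  /* blocks finished in the upper row */
} row_progress;

void row_progress_init(row_progress *p)
{
    int rc = pthread_mutex_init(&p->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&p->advanced, NULL);
    assert(rc == 0);
    (void)rc;
    p->complete_count = 0;
}

/* Called by the thread for row N after it finishes a macroblock. */
void row_progress_publish(row_progress *p, int count)
{
    pthread_mutex_lock(&p->lock);
    p->complete_count = count;
    pthread_cond_signal(&p->advanced);  /* only row N+1 ever waits here */
    pthread_mutex_unlock(&p->lock);
}

/* Called by the thread for row N+1; sleeps instead of yielding. */
void row_progress_wait(row_progress *p, int needed)
{
    pthread_mutex_lock(&p->lock);
    while (p->complete_count < needed)
        pthread_cond_wait(&p->advanced, &p->lock);
    pthread_mutex_unlock(&p->lock);
}
```

The trade-off is a mutex lock/unlock per macroblock in exchange for not
spinning; whether that is faster in practice would have to be measured.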
Hopefully this helps
Radek
Con Kolivas wrote:
> Hi all
>
> I see not a lot of activity has been going on for a while, but I noticed there
> was multithread code in the cvs repo. I downloaded it and gave it a shot and
> found it did speed up maybe 40% but that's not a great improvement for a 4
> core machine. So I had a look at the code to see if I could help.
>
> Now within the code I see each P frame is sliced up and encoded by different
> threads. What I also found was the sched_yield was used as a locking
> primitive which unfortunately is less than ideal. Turning the sched_yield
> into a noop makes it spin on wait burning cpu cycles doing nothing and
> actually slows the encoding down a lot. I'd like to help improve this
> threading code with some proper locking in the hope I could speed it up.
> However, I was hoping someone could tell me exactly what it is that the
> sched_yield is giving time to; i.e. by yielding, what is it waiting on to
> happen.
>
> I could poke around indefinitely and I'll probably eventually find it but at
> the moment it's too obscure for me to decipher from this:
>
> if (current_mb >= max_mbs) {
>     /* current workload is zero */
>     x--;
>     sched_yield();
>     continue;
> }
>
> Hopefully someone can enlighten me because I'd love to see if I could help
> make this multithreaded code scale better.
>
> Regards.