[XviD-devel] Dev-API-4 : Possibility to mark a defined area of
the encoded picture as 'BLACK=0' , to encode BlackBars with absolute minimum
of bitrate ?
Christian HJ Wiesner
chris at matroska.org
Tue Jul 29 12:10:19 CEST 2003
Hi GomGom,
thanks for your reply. Don't forget to visit us from time to time on
irc.corecodec.com , #matroska , we could even found an anime-free #xvid
channel there ;) .... LOL
Edouard Gomez wrote:
>Syskin has better skills in ME than I, but afaik, as soon as a SAD
>comparison is zero, the ME returns and that's all done. This implies not
>so much CPU time for the borders (unless they are not 16px sized in
>height, one MB line will take a bit more of CPU, but it would be the
>case with "bb" hints anyway)
>So i don't think it's worth overbloating user capabilities for gaining a
>few cycles. Let's wait what our expert in ME says, c'mon syskin.
>
>
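The early exit GomGom describes can be sketched roughly as follows. This is only an illustration in Python, not XviD's actual ME code: the function names, the tiny search radius, and the plain full search are all assumptions made for the example. The point is that for a block lying fully inside a black bar, the very first candidate (the zero vector) already gives SAD == 0, so the search stops after one comparison.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, x, y, size):
    """Cut a size x size block out of a frame given as a list of rows."""
    return [row[x:x + size] for row in frame[y:y + size]]


def search(cur, ref, x, y, size=16, radius=2):
    """Hypothetical full search around (x, y) with an early exit on SAD == 0.

    Returns ((best_sad, dx, dy), number_of_candidates_checked).
    """
    target = get_block(cur, x, y, size)
    height, width = len(ref), len(ref[0])
    # Try the zero vector first, then every other offset inside the radius.
    candidates = [(0, 0)] + [(dx, dy)
                             for dy in range(-radius, radius + 1)
                             for dx in range(-radius, radius + 1)
                             if (dx, dy) != (0, 0)]
    best = None
    checked = 0
    for dx, dy in candidates:
        rx, ry = x + dx, y + dy
        # Skip candidates that would reach outside the reference frame.
        if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
            continue
        checked += 1
        cost = sad(target, get_block(ref, rx, ry, size))
        if best is None or cost < best[0]:
            best = (cost, dx, dy)
        if cost == 0:
            break  # perfect match: no point in looking any further
    return best, checked


# An all-black 32 x 32 area: the zero vector matches immediately.
black = [[0] * 32 for _ in range(32)]
print(search(black, black, 0, 0))
```
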
sysKin replied to me in great detail on IRC, and unfortunately what he
told me was not at all encouraging, as it seems MPEG4 is not really
suited to carrying black bars in the picture. I can't repeat all of
his explanations, as I naturally understood only a small part of them ;-),
but basically it seems that MPEG4 Motion Estimation has a hard time
detecting motion if there is an edge, or even two of them, in the
picture. sysKin mentioned that MPEG4 in theory offers the possibility to
'partition' a picture, but this is neither implemented in XviD nor
planned, nor does anybody know anything about it.
>Concerning the resolutions, i've heard that DVD could use a newer video
>codec in near future. If it's MPEG4, you should stick to their
>recommendation, no ?
>
Well, from sysKin's explanation I learned that my considerations may be
completely wrong. If it's true that MPEG4 can't handle black bars
very well, it makes perfect sense that any hardware decoder HAS to
offer much better resizing capabilities than the old MPEG1/2 decoder
chips had, to be able to handle various different resolutions, and also
interlaced material ( that was the main reason why DVD/SVCD/VCD would only
support a couple of resolutions and aspect ratios at the beginning ). I
still feel that it's a good idea to leave the normal 1:1 Pixel Aspect
Ratio scenario behind. My latest test encodings have shown that a 560 x
352 anamorphic encoding ( Starwars EP2, 136 mins, VHQ 4, 2 b frames,
dx50 vop, h.263 quant ) looked much better to my eyes than a
'normal' 696 x 288 encoding with square pixels, while the number of pixels
is almost the same for both ( 197120 vs. 200448 ). No idea if the codec has a
'preference' for a picture that is more 'square', or if it's just the
visual perception of a higher vertical resolution; sysKin or you guys
may answer this better I guess.
Unlike MP4, the matroska container allows basically any float to
describe the output AR, but it's recommended to describe it as 2
integers, one being the preferred width and the other the height of the
output picture. For the example given above, I would set
w/h = 832/352
in mkvmerge. The exact width would be 352 x 2.35 = 827.2 , so this leaves
an aspect error of about 0.6 %. A software player supporting matroska
would resize the picture to 832 x 352 on playback, thus the full vertical
resolution would be preserved while the horizontal res is stretched
accordingly.
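The arithmetic behind the 832/352 pair can be written down as a small Python sketch. The helper name `display_width` and the choice of rounding the width to the nearest multiple of 16 are assumptions for illustration; the text above only gives the resulting numbers.

```python
from fractions import Fraction


def display_width(height, aspect, mod=16):
    """Return (exact width, width rounded to a multiple of mod, error in %).

    Rounding to the nearest multiple of `mod` is an assumption made here,
    chosen because it reproduces the 832 used in the example.
    """
    exact = height * aspect
    rounded = mod * round(exact / mod)
    error_pct = 100.0 * (rounded - exact) / exact
    return exact, rounded, error_pct


exact, rounded, error_pct = display_width(352, 2.35)
print(f"exact {exact:.1f}, rounded {rounded}, error {error_pct:.2f} %")

# The same output AR as a reduced pair of integers.
print(Fraction(832, 352))
```
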
DirectShow players with full matroska support, like TCMP with the
matroska CDL ( it can read the w/h from the track header *BEFORE*
calling the graph ), will of course use DirectShow and thus the hardware
overlay for that, so it's mostly neutral in terms of CPU power. All other
players can do it using the latest ffdshow-alpha ( activate 'use overlay' in
the config window ). In this case it will be ffdshow's internal
resizing filter that first stretches the picture to the bigger width,
before DirectShow expands it to fullscreen using hardware, so this
route is not completely free of additional CPU load, but I guess it's
almost negligible. In principle
this means that the encoding resolution can be chosen completely freely,
as the output resolution will *ALWAYS* be determined by the w/h as set
in the track header, and it's the job of the player/decoder filter to
make sure this output AR is achieved, no matter what the input
resolution is.
Now, for a hardware player this is of course more difficult to support,
and for this very reason we would like to try to standardize certain
picture resolutions and output resolutions for those users who believe
that matroska may be supported on standalone units one day (
actually, I am in email contact with 2 Asian companies about this right
now, but to be honest, I guess they are more polite than really
interested ;-) ). In the example above, with a 560 x 352 encoded res, on
a PC this could be output as
  560 x 420 =  4 : 3  ( vertical stretch +20% )
  656 x 352 = 16 : 9  ( horizontal stretch +17% )
  832 x 352 = 21 : 9  ( horizontal stretch +48% )

or for fullscreen from the hardware output on normal TVs this would be

PAL, 768 x 576
  768 x 576 =  4 : 3  ( horizontal stretch +37%, vertical stretch +64% )
  768 x 412 = 16 : 9  ( horizontal stretch +37%, vertical stretch +17%,
                        resizer to add black borders on top/bottom )
  768 x 324 = 21 : 9  ( horizontal stretch +37%, vertical stretch -8%,
                        resizer to add ..... )

NTSC, 640 x 480
  640 x 480 =  4 : 3  ( horizontal stretch +14%, vertical stretch +36% )
  640 x 342 = 16 : 9  ( horizontal stretch +14%, vertical stretch -3%,
                        resizer to add black borders on top/bottom )
  640 x 272 = 21 : 9  ( horizontal stretch +14%, vertical stretch -23%,
                        resizer to add ..... )
For another example, if the user decides to encode his movie in
640 x 576 ( another res from the old list ), here are the equivalent
numbers, first for the PC
   768 x 576 =  4 : 3  ( horizontal stretch +20% )
  1024 x 576 = 16 : 9  ( horizontal stretch +60% )
  1344 x 576 = 21 : 9  ( horizontal stretch +110% )

PAL, 768 x 576
  768 x 576 =  4 : 3  ( horizontal stretch +20%, vertical stretch +0% )
  768 x 412 = 16 : 9  ( horizontal stretch +20%, vertical stretch -28%,
                        resizer to add black borders on top/bottom )
  768 x 324 = 21 : 9  ( horizontal stretch +20%, vertical stretch -44%,
                        resizer to add ..... )

NTSC, 640 x 480
  640 x 480 =  4 : 3  ( horizontal stretch +0%, vertical stretch -17% )
  640 x 342 = 16 : 9  ( horizontal stretch +0%, vertical stretch -40%,
                        resizer to add black borders on top/bottom )
  640 x 272 = 21 : 9  ( horizontal stretch +0%, vertical stretch -53%,
                        resizer to add ..... )
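The percentages in the two examples are simply output size divided by encoded size, per axis. A minimal Python sketch of that calculation follows; `stretch_percent` is a made-up helper name, and it rounds to the nearest whole percent, so a few results can differ by one from the hand-computed figures above.

```python
def stretch_percent(encoded, output):
    """Return (horizontal, vertical) stretch in whole percent.

    Positive values mean the picture is enlarged along that axis,
    negative values mean it is shrunk.
    """
    enc_w, enc_h = encoded
    out_w, out_h = output
    horizontal = round(100 * (out_w / enc_w - 1))
    vertical = round(100 * (out_h / enc_h - 1))
    return horizontal, vertical


# First example: 560 x 352 encoded, shown fullscreen 4:3 on PAL.
print(stretch_percent((560, 352), (768, 576)))

# Second example: 640 x 576 encoded, shown fullscreen 4:3 on NTSC.
print(stretch_percent((640, 576), (640, 480)))
```
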
As you can see from the 2 examples above, there is no regularity in the
numbers; very often the resizer would have to stretch the
picture by irregular amounts ( in % ), or even to shrink it ( bad ). Now,
preferably our list of recommended encoding resolutions should be made
such that there is a certain mathematical relationship between them, in
order to allow using a limited number of resizing algorithms, if this is
possible, and still meet at least a MOD 8 criterion, preferably MOD
16. As this is a new situation now ( adding black borders would have
made the whole thing much much easier to handle ;-) ), I will have to
sit down and try to find a number of different possible resolutions that
will make the job for the hardware resizers as simple as possible. Of
course, it would help a lot if somebody with a background in resizing
algos ( trbarry ? ) would give me some hints first about what makes a
resizing job easy or not for a hardware device ( hint, hint :-) ). I
will inform you guys if I manage to come up with something sensible .....
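As a starting point for that search, here is a rough Python sketch that lists MOD 16 widths whose scale factor to a given output width reduces to a fraction of small integers. Treating "small-integer ratio" as the measure of an easy resize is purely an assumption (it is exactly the open question put to the resizing experts above), and `simple_scale_widths` and `max_term` are hypothetical names.

```python
from fractions import Fraction


def simple_scale_widths(target_width, mod=16, max_term=8):
    """List (width, scale factor) pairs for encoded widths up to target_width.

    A width qualifies if it is a multiple of `mod` and the factor
    target_width / width reduces to a fraction whose numerator and
    denominator are both <= max_term (an assumed notion of "simple").
    """
    result = []
    for width in range(mod, target_width + 1, mod):
        ratio = Fraction(target_width, width)
        if ratio.numerator <= max_term and ratio.denominator <= max_term:
            result.append((width, ratio))
    return result


# Candidate encoded widths for a 768 pixel wide PAL output.
for width, ratio in simple_scale_widths(768):
    print(width, ratio)
```

With these assumptions 640 qualifies for a 768 wide output (factor 6/5), while 560 does not (factor 48/35), which matches the irregular percentages seen in the first example.
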
Sorry for the long email and the inevitable, imminent matroska
advertising in it.
Regards
Christian