[XviD-devel] Mode decision

skal skal at planet-d.net
Tue Mar 25 10:37:39 CET 2003


	Gruel,

On Mon, 2003-03-24 at 19:55, Christoph Lampert wrote:
> On 24 Mar 2003, skal wrote:
> > > What you can't see, is how much a misclassification of INTER->INTER4V
> > > or vice versa would cost in extra bits. 
> > > And you also cannot see that there are hardly any real INTER4V blocks
> > > with sad16 below 600. So maybe this would be a way to speed things up, 
> > > simply not doing inter4v for those.
> > 
> > 	Even with a rather crude sad16-vs-sad8 final criterion for
> > 	the INTER/INTER4V decision, I've tried to guess whether a 16x16
> > 	INTER block is a worthwhile candidate for a 4V sub-search (which
> > 	takes time). I've tried some criteria based on local gradient
> > 	and divergence (is the MV-field torn apart?), but they are not
> > 	convincing. Actually, it seems that 4V works best at the image's
> > 	segmentation boundaries. So far, so good: I ended up with a
> > 	sad-based criterion that is a good hint of whether going for a
> > 	sub-search might be rewarding: after a regular 16x16 search,
> > 	I re-use the best MV found so far to evaluate the sad8 of each of
> > 	the four 8x8 sub-blocks. If the maximum of these sad8 values is
> > 	greater than a fraction of the sad16 for the full block
> > 	(in practice 60% is a good compromise), then I go for a
> > 	refined search...
> > 
> > 	Any opinion?
> 
> Yes, it might be possible to check earlier if INTER4V is promising
> (INTER is already very good => no, SAD8 very different => yes).
> I had plans to use the current search to also track the best positions and
> SADs so far for the 8x8 blocks during the search for 16x16.  But my results
> were not convincing.
> 
> If yours is better, great! Do you have results? 

	Here are some results for two sequences I think are
	typical: the first one (squares) is made of moving squares
	overlapping each other (might resemble an "anime"),
	and the other is made of 400 frames taken from the
	"Hollow man" trailer. The squares sequence was generated
	using the attached small program (use: warp_gen -t 2 -n 400).

# %probing | file size | FPS

# moving squares (~anime?), 400 frames
0	  5863498    240.964
10	  5809766    235.294
20	  5691258    232.558
30	  5556203    224.719
40	  5376767    218.579
50	  5118033    209.424
60	  4884111    199.005
70	  4797765    188.679
80	  4796947    186.916
90	  4796455    184.332
100	  4794623    186.047


# hollow-man 640x352, 400 frames w/ scene changes
0	  3708908    86.022
10	  3708911    85.837
20	  3709495    85.470
30	  3709517    85.653
40	  3711307    85.288
50	  3715019    84.926
60	  3738086    81.466
70	  3840476    70.796
80	  3826657    65.789
90	  3826657    66.007
100	  3826657    66.007

	Here are also the figures for the "parkrun" sequence:
# parkrun 1280x720, 33 frames
0	  10332336    10.476
10	  10332244    10.377
20	  10332354    10.061
30	  10331736    10.217
40	  10324032    10.185
50	  10319714    10.000
60	  10308481    10.123
70	  10293848    8.824
80	  10307127    7.820
90	  10307127    7.838
100	  10307127    7.857


	x="% probing" is the threshold used to decide whether
	to go into the 4v search: if max{sad8} > (1-x)*sad16
	=> try 4v mode.
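
	To make the test above concrete, here is a minimal sketch
	in C. It is not the actual XviD code: the function name
	worth_4v_search, its parameters and the integer scaling by
	100 are assumptions made only for illustration. sad8[] is
	taken to hold the four 8x8 SADs evaluated at the best 16x16
	MV, and probing_pct is x expressed in percent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not XviD code): returns 1 if the macroblock
 * should get a refined 4v search, 0 otherwise.
 *   sad8[]      - SADs of the four 8x8 sub-blocks at the best 16x16 MV
 *   sad16       - SAD of the full 16x16 block at that same MV
 *   probing_pct - x, the "% probing" threshold, in percent (0..100) */
static int worth_4v_search(const uint32_t sad8[4], uint32_t sad16,
                           unsigned probing_pct)
{
    uint32_t max8 = sad8[0];
    int i;

    assert(probing_pct <= 100);

    /* max{sad8} over the four sub-blocks */
    for (i = 1; i < 4; i++)
        if (sad8[i] > max8)
            max8 = sad8[i];

    /* max{sad8} > (1-x)*sad16, both sides multiplied by 100 so that
     * the comparison stays in integer arithmetic */
    return (uint64_t)max8 * 100u > (uint64_t)(100u - probing_pct) * sad16;
}
```

	With sad8 = {100, 100, 100, 700} and sad16 = 1000, this
	sketch skips the 4v search at x=20% and goes for it at
	x=40%. Note that if the four sad8 values add up to sad16,
	the largest one is at least sad16/4, so any x above 75%
	probes every block with a non-zero sad16.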

	The curves exhibit two behaviors: the "squares" sequence
	is a very good candidate for 4v, but one has to
	probe enough blocks (50-60%). However, too much
	probing is time-consuming for little improvement.
	In contrast, the Hollow-man sequence is a very
	bad candidate for 4v, no matter what. In fact,
	the file size with full 4v-testing is *bigger*, because
	of the crudeness of the sad16-vs-sad8 criterion you
	were talking about.

	=> I think the good compromise is around 50-60% probing.
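
	To put numbers on that compromise, the table rows can be
	compared against the 0% row. The small C helper below is
	only an illustration (the name pct_change is made up here);
	it returns the change of a value relative to a baseline,
	in percent.

```c
#include <assert.h>

/* Hypothetical helper: percentage change of 'value' relative to
 * 'baseline'. A negative result means 'value' is smaller. */
static double pct_change(double baseline, double value)
{
    assert(baseline != 0.0);
    return 100.0 * (value - baseline) / baseline;
}
```

	Fed with the figures above, 60% probing on the squares
	sequence gives about -16.7% file size for about -17.4% FPS,
	while on Hollow-man it gives about +0.8% file size for
	about -5.3% FPS.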

	Admittedly, more test sequences are needed...

	bye,
			Skal

(PS: the same kind of results seem to apply to interlaced field prediction...)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: file_size.gif
Type: image/gif
Size: 3344 bytes
Desc: not available
Url : http://edu.bnhof.de/pipermail/xvid-devel/attachments/20030325/1624e50e/file_size.gif
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fps.gif
Type: image/gif
Size: 3316 bytes
Desc: not available
Url : http://edu.bnhof.de/pipermail/xvid-devel/attachments/20030325/1624e50e/fps.gif
-------------- next part --------------
A non-text attachment was scrubbed...
Name: warp_gen.c
Type: text/x-c
Size: 9385 bytes
Desc: not available
Url : http://edu.bnhof.de/pipermail/xvid-devel/attachments/20030325/1624e50e/warp_gen.bin

