Motion vector coordinate system?

Posted: Mon Feb 04, 2019 3:31 pm
by mirrorman
Hi. I am using the example code extract_mvs.c to obtain the motion vectors for a particular frame (frame 500) of an H.264-encoded video file. The video sequence has an overlaid frame counter to help verify what is going on.

To keep things simple at this stage I have previously used the ffmpeg tool with options -g 0 and -bf 0 to re-encode the original file without B-frames or I-frames (except for the very first frame). I have also used ffmpeg to extract two still frames (frames 499 and 500) as .bmp files. They both look great.

Then, using OpenCV, I can plot the extracted motion vectors as source and destination rectangles and arrows on the still images. They look believable and generally confined to the moving areas of the image.

I then call cvSetImageROI on frame 499 for each motion vector, using its source X/Y and w/h (assuming the X/Y coordinates refer to the top-left corner, with Y increasing down the screen), and do the same on frame 500 using the destination X/Y and the same w/h. I then copy the contents from frame 499 to frame 500 and reset the ROI on both images.
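To make the copy step concrete, here is a minimal sketch of the same operation on raw 8-bit single-channel buffers, without OpenCV. The BlockMove struct and copy_block function are hypothetical names made up for illustration; they are not part of extract_mvs.c or any library. The sketch assumes the X/Y values are top-left corners and that the row stride equals the frame width. It skips any block whose rectangle falls partly outside the frame.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one motion vector, named after the post's wording
 * (source X/Y, destination X/Y, w/h). Not a library type. */
typedef struct {
    int src_x, src_y;   /* assumed top-left of the block in the reference frame */
    int dst_x, dst_y;   /* assumed top-left of the block in the current frame */
    int w, h;           /* block size in pixels */
} BlockMove;

/* Return 1 if the w x h rectangle at (x, y) lies fully inside the frame. */
static int rect_inside(int x, int y, int w, int h, int width, int height)
{
    return x >= 0 && y >= 0 && x + w <= width && y + h <= height;
}

/* Copy one block from ref (e.g. frame 499) into cur (e.g. frame 500).
 * Both buffers are 8-bit, single channel, with stride == width.
 * Returns 1 if the block was copied, 0 if it was skipped. */
static int copy_block(const uint8_t *ref, uint8_t *cur,
                      int width, int height, const BlockMove *mv)
{
    assert(ref != NULL && cur != NULL && mv != NULL);

    if (mv->w <= 0 || mv->h <= 0)
        return 0;
    if (!rect_inside(mv->src_x, mv->src_y, mv->w, mv->h, width, height))
        return 0;
    if (!rect_inside(mv->dst_x, mv->dst_y, mv->w, mv->h, width, height))
        return 0;

    /* Copy row by row, since a block is not contiguous in memory. */
    for (int row = 0; row < mv->h; row++) {
        const uint8_t *s = ref + (size_t)(mv->src_y + row) * width + mv->src_x;
        uint8_t *d = cur + (size_t)(mv->dst_y + row) * width + mv->dst_x;
        memcpy(d, s, (size_t)mv->w);
    }
    return 1;
}
```

Skipping out-of-frame blocks is a simplification chosen for this sketch; a real decoder handles references that extend past the frame edge rather than dropping them.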

I had hoped that I would see no difference between the original frame 500 image and the one with the overlaid blocks, having effectively applied the same kind of motion compensation as the MPEG decoder (give or take the odd sub-pixel cleverness, de-blocking etc. that the decoder may do). But no, it's a mess.

It is generally correct-ish, but there are some egregiously bad macroblocks that seem to be moved to/from the wrong locations. Things seem to improve slightly if I offset each block's source and destination location by half the block width and height i.e. assume the coordinates refer to the centre of the block not the top left, but it's still wrong. I have also tried 'fishing' for correct source and destination offsets without luck.
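The half-block offset experiment can be written down as a small helper. This is only a sketch of the arithmetic under the assumption being tested, namely that the reported X/Y is the centre of the block; the Point type and centre_to_top_left function are hypothetical names for illustration. Integer division is used, so odd block sizes round the offset down.

```c
#include <assert.h>

/* Hypothetical 2D integer point, for illustration only. */
typedef struct {
    int x, y;
} Point;

/* Convert an assumed block-centre coordinate into the top-left corner of a
 * w x h block. The result may be negative when the block overlaps the frame
 * edge, so callers still need their own bounds check. */
static Point centre_to_top_left(int cx, int cy, int w, int h)
{
    assert(w > 0 && h > 0);
    Point p = { cx - w / 2, cy - h / 2 };
    return p;
}
```

For example, a 16x16 block whose centre is reported as (24, 8) would get a top-left corner of (16, 0), which is the value to pass as the ROI origin under this assumption.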

Yesterday, when playing with a normal P/B-encoded video, I occasionally got it to look almost indistinguishable from the FFmpeg-decoded frame, but then on another frame it would fail again (and I think I was referring to the correct frame, i.e. the previous/next P-frame, not an adjacent B-frame). Hence my attempt to simplify things with a P-frame-only re-encoding.

From my description, is there an obvious gotcha? If not, I can provide more material for you to go on. (many thanks if you could look at it for me, of course)

FFmpeg is an outstanding tool, so many thanks for providing it.