Encoded Test Video Set for Bird-of-Flock Real-Time Video Object Detection
Object-based bit allocation can significantly improve the perceptual
quality of extremely compressed video. However, real-time video object
detection in large-format, high-fidelity video is computationally
daunting. Most such algorithms begin with extensive use of classical bit
analysis and thus remain computationally heavy. Based on some recent
results in human visual perception, in this paper we present an
experimental visual region tracking algorithm designed specifically for
perceptual stream coding.
This algorithm exploits the cue order observed in human visual
perception to achieve very high computation speed as well as tracking
efficiency. Rather than starting from the pixel level, or using any
pixel-level processing at all, it employs high-level motion cue and
block shape cue analysis to identify, in the motion vector set, the
signatures of the various relative movements between the object of
interest, the scene background and the camera, and from there it
identifies objects. It then uses predictive cues based on Kalman filters
to track the regions. The result is a fast yet highly effective
perceptual region tracking algorithm that can operate at stream rate and
track regions of perceptually significant objects despite camera
movements such as zoom, panning and translation. We have implemented
this algorithm in a live H.263/MPEG-2 perceptual transcoder. In this
paper we share the performance of this implementation. This fast,
object-aware, video-rate transcoder is particularly suitable for live
streaming and can convert a regular stream into a perceptually coded
video stream.
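The general idea of finding objects in the motion vector set and then tracking them predictively can be illustrated with a small sketch. It is not the algorithm evaluated with these clips, whose details are not given here. It assumes that camera motion can be approximated by the per-axis median motion vector (a simple stand-in for the motion and block shape cue analysis), and it uses a textbook constant-velocity Kalman filter per axis. The names `global_motion`, `moving_blocks`, `centroid` and `AxisKalman`, the threshold, and the noise parameters are all hypothetical.

```python
from statistics import median


def global_motion(mv_field):
    """Estimate camera-induced motion as the per-axis median motion vector.

    mv_field maps (row, col) macroblock positions to (dx, dy) vectors.
    Assumption: background macroblocks outnumber object macroblocks.
    """
    gx = median(dx for dx, _ in mv_field.values())
    gy = median(dy for _, dy in mv_field.values())
    return gx, gy


def moving_blocks(mv_field, threshold=2.0):
    """Return macroblocks whose motion differs from the global motion."""
    gx, gy = global_motion(mv_field)
    return sorted(pos for pos, (dx, dy) in mv_field.items()
                  if abs(dx - gx) + abs(dy - gy) > threshold)


def centroid(blocks):
    """Mean (row, col) of a non-empty list of macroblock positions."""
    rows = [r for r, _ in blocks]
    cols = [c for _, c in blocks]
    return sum(rows) / len(rows), sum(cols) / len(cols)


class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate (position, velocity)."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.p, self.v = float(pos), 0.0      # state estimate
        self.P = [[1.0, 0.0], [0.0, 1.0]]     # state covariance
        self.q, self.r = q, r                 # process / measurement noise

    def predict(self, dt=1.0):
        """Advance the state by dt frames and return the predicted position."""
        P = self.P
        self.p += self.v * dt
        p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.p

    def update(self, z):
        """Correct the prediction with a measured position z."""
        P = self.P
        s = P[0][0] + self.r                  # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s     # Kalman gain
        y = z - self.p                        # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
```

In use, each frame's motion vector field would be passed to `moving_blocks`, the centroid of the result fed to one `AxisKalman` per axis via `update`, and `predict` called to anticipate the region's position in the next frame. Working only on motion vectors keeps the cost proportional to the number of macroblocks rather than the number of pixels.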
This report contains the experiment clips used in testing the
performance of this system. The videos are MPEG-2 (ISO/IEC 13818-2)
streams. The details of the experiments are in the main publications.
*The technical details of the algorithms are not included here.
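For readers who want to sanity-check a downloaded clip before playing it, the following sketch looks for the MPEG-2 video sequence header and reads the coded frame size from it. It assumes the clip contains a video sequence header start code (`00 00 01 B3`) followed by the 12-bit horizontal and vertical size values, as defined for MPEG video sequence headers. The function name `read_frame_size` is hypothetical, and the sketch ignores the size extension bits that the sequence extension can add for very large frames.

```python
SEQUENCE_HEADER_CODE = b"\x00\x00\x01\xb3"  # MPEG video sequence header start code


def read_frame_size(data: bytes):
    """Return (width, height) from the first sequence header, or None.

    Only the low 12 bits of each dimension are read; the size extension
    bits carried in the sequence extension are ignored in this sketch.
    """
    i = data.find(SEQUENCE_HEADER_CODE)
    if i < 0 or len(data) < i + 7:
        return None
    b = data[i + 4:i + 7]                     # 3 bytes = 12 + 12 bits
    width = (b[0] << 4) | (b[1] >> 4)
    height = ((b[1] & 0x0F) << 8) | b[2]
    return width, height
```

A typical call would read the first few kilobytes of a clip, for example `read_frame_size(open("Toycar.mv2", "rb").read(4096))`, and treat a `None` result as a sign that the file is not in the expected format.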
Tracking Efficiency Test Set:
Video Clip | Content Description | Clip File |
Toycar | Fixed camera, still background, one object tracked, well-textured, fast-moving, heavy shadow | Toycar.mv2 |
Mycar-in_parking_lot | Fixed camera, some movements in background, one object tracked, poor-textured, slow-moving | Mycar_in_parkinglot.mv2 |
Two_tractors | Fixed camera, some movements in background, two objects tracked, well-textured, slow-moving, partial occlusion | Two_tractors.mv2 |
Walking_people | Fixed camera, still background, three objects tracked, well-textured, deformable shape, slow-moving, illumination change | Walking_people.mv2 |
Tractor_with_moving_camera | Fast smooth camera movement, well-textured background, one object tracked, well-textured, slow-moving, partial occlusion | Tractor_with_moving_camera.mv2 |
Plane | Smallest object spanning only a few macroblocks, very shaky camera movement, fast-moving background, one object tracked, poor-textured, irregular-moving | Plane.mv2 |
Mower | Smooth camera movement, well-textured background, one object tracked, well-textured, slow-moving, occlusion | Mower.mv2 |
Shaking_camera | Irregular camera movement, poor-textured background, one object tracked, poor-textured, fast-moving | Shaking_camera.mv2 |
Perceptually Encoded Video Test Set:
Test Description | Sample Clip Name |
Detection of One Object | A_one_object_detection.mv2 |
Detection of Two Objects | B_two_object_detection.mv2 |
Video with Perceptual Encoding Applied (showing detection) | C_two_object_quality.mv2 |
Video after Perceptual Encoding Applied (invisible) | D_two_object_quality_noline.mv2 |
Stream Transcoding Ratio (1/1) with detection | F1_worker2_1_1.mv2 |
Stream Transcoding Ratio (1/1) (invisible) | F2_worker2_1_1_noline.mv2 |
Stream Transcoding Ratio (4/1) with detection | G1_workers2_5_1.mv2 |
Stream Transcoding Ratio (4/1) (invisible) | G2_workers2_5_1_noline.mv2 |