Focus: Querying Large Video Datasets with Low Latency and Low Cost


Kevin Hsieh, Carnegie Mellon University; Ganesh Ananthanarayanan and Peter Bodik, Microsoft; Shivaram Venkataraman, Microsoft / UW-Madison; Paramvir Bahl and Matthai Philipose, Microsoft; Phillip B. Gibbons, Carnegie Mellon University; Onur Mutlu, ETH Zurich


Large volumes of videos are continuously recorded from cameras deployed for traffic control and surveillance with the goal of answering “after the fact” queries: identify video frames with objects of certain classes (cars, bags) from many days of recorded video. Current systems for processing such queries on large video datasets incur either high cost at video ingest time or high latency at query time. We present Focus, a system providing both low-cost and low-latency querying on large video datasets. Focus’s architecture flexibly and effectively divides the query processing work between ingest time and query time. At ingest time (on live videos), Focus uses cheap convolutional network classifiers (CNNs) to construct an approximate index of all possible object classes in each frame (to handle queries for any class in the future). At query time, Focus leverages this approximate index to provide low latency, but compensates for the lower accuracy of the cheap CNNs through the judicious use of an expensive CNN. Experiments on commercial video streams show that Focus is 48× (up to 92×) cheaper than using expensive CNNs for ingestion, and provides 125× (up to 607×) lower query latency than a state-of-the-art video querying system (NoScope).

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

@inproceedings {222587,
author = {Kevin Hsieh and Ganesh Ananthanarayanan and Peter Bodik and Shivaram Venkataraman and Paramvir Bahl and Matthai Philipose and Phillip B. Gibbons and Onur Mutlu},
title = {Focus: Querying Large Video Datasets with Low Latency and Low Cost},
booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
year = {2018},
isbn = {978-1-939133-08-3},
address = {Carlsbad, CA},
pages = {269--286},
url = {},
publisher = {USENIX Association},
month = oct

Presentation Audio