Distributed Image Search in Camera Sensor Networks


Overview

Recent years have seen a massive proliferation of networked camera sensor applications, such as cellphone cameras, surveillance cameras, low-power imagers for environmental sensing, lifelog devices (e.g. SenseCam), traffic cameras, digicams, etc. These applications generate a tremendous volume of image data, while transmitting images from battery-powered devices is very expensive. Meanwhile, privacy concerns make a centralized infrastructure less attractive.

Project Goal: Design a general purpose distributed image storage and search engine across heterogeneous camera sensor sources.



Challenges:
  1. Efficient image representations to enable high-accuracy, fast image search.
  2. Efficient local storage and search across diverse storage media (disk, flash), and platforms with diverse resource constraints.
  3. Distributed search across diverse image datasets while minimizing communication and processing.
  4. Privacy-aware search to provide users full control over images that they seek to share.

Our study on distributed image search covers the following three aspects: 1) compact image representation, 2) energy-efficient search on flash storage centric sensor nodes, and 3) distributed image search in sensor networks.

Compact image representation

We use visual words (visterms) as compact features for image representation. Computing visterms involves two steps: 1) SIFT features are extracted from raw QVGA images; 2) hierarchical clustering is used to convert the SIFT features to visterms.
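
A minimal sketch of step 1, using OpenCV's SIFT implementation purely for illustration (the file name and the use of OpenCV are assumptions, not the on-node implementation):

    import cv2

    # Load a captured QVGA (320x240) frame; "frame.jpg" is a placeholder name.
    img = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)
    # descriptors is an N x 128 array: one 128-dimensional SIFT feature per keypoint.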

In the vocabulary tree, the leaf clusters represent the SIFT features of database images, so each 128-dimensional SIFT feature is reduced to a cluster id (typically 4 bytes). We call this cluster id a visual word, or visterm. In distributed image search, we use visterms as the distinguishing features to represent images.
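
The mapping from a feature to its visterm amounts to a greedy descent of the vocabulary tree. The sketch below assumes a simple in-memory node layout; Node, quantize, and the field names are illustrative, not the system's actual data structures:

    import numpy as np

    class Node:
        def __init__(self, center, children=None, leaf_id=None):
            self.center = center          # 128-dim cluster center from hierarchical clustering
            self.children = children or []
            self.leaf_id = leaf_id        # cluster id (the visterm), set on leaves only

    def quantize(root, feature):
        """Map a 128-dim SIFT feature to its visterm (leaf cluster id)."""
        node = root
        while node.children:
            # Descend into the child whose cluster center is nearest to the feature.
            node = min(node.children,
                       key=lambda c: np.linalg.norm(feature - c.center))
        return node.leaf_id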

Energy-efficient search on flash storage centric sensor nodes

We use the following two key techniques to conduct energy-efficient search on sensor nodes.

Buffered Vocabulary Tree
The vocabulary tree is the core data structure that maps SIFT features to visterms. Since the vocabulary tree is a large data structure (usually several MB), it must be maintained on flash, and only a subset of it can be loaded into memory when converting SIFT features to visterms. We therefore developed the Buffered Vocabulary Tree to optimize this conversion. The key idea is to partition the large vocabulary tree into sub-trees and buffer SIFT features for batch processing, which reduces the number of times the same sub-tree must be loaded, as sketched below.
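
A minimal sketch of the batching idea, reusing the quantize function above; the in-memory top levels, the dict standing in for flash, and all names are assumptions for illustration:

    from collections import defaultdict

    def buffered_quantize(top_tree, flash_subtrees, buffer):
        """Batch-convert buffered SIFT features to visterms, reading each
        flash-resident sub-tree at most once per batch.

        top_tree:       in-memory upper levels of the vocabulary tree, whose
                        leaves carry sub-tree ids instead of visterm ids
        flash_subtrees: dict mapping sub-tree id -> sub-tree root (stands in
                        for loading a sub-tree from flash)
        buffer:         list of 128-dim SIFT features awaiting conversion
        """
        groups = defaultdict(list)
        for feature in buffer:
            # An in-memory walk of the top levels picks the target sub-tree.
            groups[quantize(top_tree, feature)].append(feature)
        visterms = []
        for subtree_id, feats in groups.items():
            subtree = flash_subtrees[subtree_id]   # models a single flash read
            visterms.extend(quantize(subtree, f) for f in feats)
        return visterms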

Inverted File Based Scoring and Matching

An inverted file maps a visterm to the set of images in the local database that contain that visterm. As the number of captured images increases, the inverted file grows beyond the memory limit. We therefore store the inverted file in an append-only, log-like structure on flash: the longest document lists in the inverted file are kept in this log-like storage, so that updates become sequential appends and flash write time stays as low as possible.
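
A toy sketch of this layout: short posting lists stay in RAM, and long lists spill to an append-only log. The spill threshold, record format, and class name are assumptions, not the actual on-node format:

    class InvertedFile:
        """Short posting lists live in RAM; long lists spill to an
        append-only log file, so flash sees only sequential writes."""

        def __init__(self, log_path, spill_threshold=64):
            self.mem = {}                    # visterm -> list of image ids
            self.log = open(log_path, "ab")  # log-like flash storage
            self.spill_threshold = spill_threshold

        def add(self, visterm, image_id):
            lst = self.mem.setdefault(visterm, [])
            lst.append(image_id)
            if len(lst) >= self.spill_threshold:
                # Append the long list as one sequential record; no in-place
                # updates, which is what keeps flash writes cheap.
                ids = b",".join(b"%d" % i for i in lst)
                self.log.write(b"%d:%s\n" % (visterm, ids))
                self.mem[visterm] = []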

Distributed image search in sensor networks

Global matching further refines the local search results to obtain a globally optimal ranking. The proxy gathers the top-k locally ranked results from each sensor and merges them into a global top-k list. The sensors then only need to send back the images that appear in the global ranking, as sketched below.
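
A minimal sketch of the proxy-side merge, assuming each sensor reports (score, sensor_id, image_id) tuples; the tuple layout and names are illustrative:

    import heapq

    def merge_topk(local_results, k):
        """Merge per-sensor top-k lists into the global top-k.

        local_results: list of per-sensor result lists, each a list of
                       (score, sensor_id, image_id) tuples
        """
        all_results = (r for sensor in local_results for r in sensor)
        # Only the images in this global list are fetched from the sensors.
        return heapq.nlargest(k, all_results)

    # Example: two sensors each report their local top results.
    node_a = [(0.92, "A", 17), (0.60, "A", 3)]
    node_b = [(0.88, "B", 5), (0.41, "B", 9)]
    print(merge_topk([node_a, node_b], k=2))
    # -> [(0.92, 'A', 17), (0.88, 'B', 5)]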


Testbed

Our testbed consists of 6 iMote2 nodes, each equipped with an OmniVision camera and a 1 GB SD extension card. The OmniVision camera provides color VGA (640x480) image acquisition. The iMote2 nodes form a multi-hop network in our evaluation.

Publications

Tingxin Yan, Deepak Ganesan, and R. Manmatha. Distributed Image Search in Camera Sensor Networks. In Proceedings of the 6th ACM Conference on Embedded Networked Sensor Systems (SenSys 2008), Raleigh, NC, November 2008. [pdf]

People 

Faculty members:
Deepak Ganesan
R. Manmatha

PhD Student:
Tingxin Yan

Funding

This work was supported by NSF grants: CNS-0626873, CNS-0546177 and CNS-052072.
©2008 Sensors Lab, Computer Science Department, University of Massachusetts Amherst