Robust image analysis with sparse representation on quantized visual features

Bing Kun Bao, Guangyu Zhu, Jialie Shen, Shuicheng Yan

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Recent techniques based on sparse representation (SR) have demonstrated promising performance in high-level visual recognition, exemplified by the highly accurate face recognition under occlusion and other sparse corruptions. Most research in this area has focused on classification algorithms using raw image pixels, and very few have been proposed to utilize the quantized visual features, such as the popular bag-of-words feature abstraction. In such cases, besides the inherent quantization errors, ambiguity associated with visual word assignment and misdetection of feature points, due to factors such as visual occlusions and noises, constitutes the major cause of dense corruptions of the quantized representation. The dense corruptions can jeopardize the decision process by distorting the patterns of the sparse reconstruction coefficients. In this paper, we aim to eliminate the corruptions and achieve robust image analysis with SR. Toward this goal, we introduce two transfer processes (ambiguity transfer and mis-detection transfer) to account for the two major sources of corruption as discussed. By reasonably assuming the rarity of the two kinds of distortion processes, we augment the original SR-based reconstruction objective with ℓ₀-norm regularization on the transfer terms to encourage sparsity and, hence, discourage dense distortion/transfer. Computationally, we relax the nonconvex ℓ₀-norm optimization into a convex ℓ₁-norm optimization problem, and employ the accelerated proximal gradient method to optimize the convergence provable updating procedure. Extensive experiments on four benchmark datasets, Caltech-101, Caltech-256, Corel-5k, and CMU pose, illumination, and expression, manifest the necessity of removing the quantization corruptions and the various advantages of the proposed framework.
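
The abstract names two computational ingredients: an ℓ₁ relaxation of the sparsity penalty and an accelerated proximal gradient solver. The following is a minimal sketch of that generic combination, not the authors' formulation: it solves plain ℓ₁-regularized least squares and leaves out the ambiguity-transfer and mis-detection-transfer terms, whose exact form the abstract does not give. The names `D` (dictionary), `y` (feature vector), `lam` (penalty weight) and all helper functions are assumptions made for illustration, and the step size uses the squared Frobenius norm as a simple, conservative bound on the Lipschitz constant.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return [math.copysign(max(abs(a) - tau, 0.0), a) for a in v]


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def lipschitz_bound(D):
    """Squared Frobenius norm of D, an upper bound on the largest eigenvalue of D^T D."""
    return sum(a * a for row in D for a in row)


def apg_l1(D, y, lam, n_iter=200):
    """Approximately minimize 0.5 * ||y - D x||_2^2 + lam * ||x||_1.

    Accelerated proximal gradient: a gradient step on the smooth term at an
    extrapolated point z, soft-thresholding, then a momentum update.
    """
    Dt = transpose(D)
    L = lipschitz_bound(D)  # step size is 1 / L
    x = [0.0] * len(D[0])
    z = x[:]
    t = 1.0
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(D, z), y)]
        grad = matvec(Dt, residual)
        x_new = soft_threshold([zi - g / L for zi, g in zip(z, grad)], lam / L)
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# Toy problem (hypothetical data): the second coefficient is shrunk to zero.
coefficients = apg_l1([[2.0, 0.0], [0.0, 1.0]], [4.0, 0.5], lam=1.0)
```

In this toy problem the minimizer can be worked out by hand per coordinate, so `coefficients` should be close to `[1.75, 0.0]`. The zero in the second slot shows how the ℓ₁ penalty discards a weak component rather than merely shrinking it.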

Original language: English
Article number: 6310054
Pages (from-to): 860-871
Number of pages: 12
Journal: IEEE Transactions on Image Processing
Volume: 22
Issue number: 3
DOIs: 10.1109/TIP.2012.2219543
Publication status: Published - Mar 2013
Externally published: Yes

Fingerprint

Image analysis
Benchmarking
Gradient methods
Face recognition
Lighting
Noise
Pixels
Research
Experiments
Facial Recognition
Datasets

Cite this

Bao, Bing Kun; Zhu, Guangyu; Shen, Jialie; Yan, Shuicheng. / Robust image analysis with sparse representation on quantized visual features. In: IEEE Transactions on Image Processing. 2013; Vol. 22, No. 3. pp. 860-871.
@article{569341f2604f4653beb330ef591ae2cf,
title = "Robust image analysis with sparse representation on quantized visual features",
abstract = "Recent techniques based on sparse representation (SR) have demonstrated promising performance in high-level visual recognition, exemplified by the highly accurate face recognition under occlusion and other sparse corruptions. Most research in this area has focused on classification algorithms using raw image pixels, and very few have been proposed to utilize the quantized visual features, such as the popular bag-of-words feature abstraction. In such cases, besides the inherent quantization errors, ambiguity associated with visual word assignment and misdetection of feature points, due to factors such as visual occlusions and noises, constitutes the major cause of dense corruptions of the quantized representation. The dense corruptions can jeopardize the decision process by distorting the patterns of the sparse reconstruction coefficients. In this paper, we aim to eliminate the corruptions and achieve robust image analysis with SR. Toward this goal, we introduce two transfer processes (ambiguity transfer and mis-detection transfer) to account for the two major sources of corruption as discussed. By reasonably assuming the rarity of the two kinds of distortion processes, we augment the original SR-based reconstruction objective with $\ell_0$-norm regularization on the transfer terms to encourage sparsity and, hence, discourage dense distortion/transfer. Computationally, we relax the nonconvex $\ell_0$-norm optimization into a convex $\ell_1$-norm optimization problem, and employ the accelerated proximal gradient method to optimize the convergence provable updating procedure. Extensive experiments on four benchmark datasets, Caltech-101, Caltech-256, Corel-5k, and CMU pose, illumination, and expression, manifest the necessity of removing the quantization corruptions and the various advantages of the proposed framework.",
keywords = "Image classification, quantized visual feature, sparse representation",
author = "Bao, {Bing Kun} and Guangyu Zhu and Jialie Shen and Shuicheng Yan",
year = "2013",
month = "3",
doi = "10.1109/TIP.2012.2219543",
language = "English",
volume = "22",
pages = "860--871",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
number = "3",

}

Robust image analysis with sparse representation on quantized visual features. / Bao, Bing Kun; Zhu, Guangyu; Shen, Jialie; Yan, Shuicheng.

In: IEEE Transactions on Image Processing, Vol. 22, No. 3, 6310054, 03.2013, p. 860-871.

TY - JOUR

T1 - Robust image analysis with sparse representation on quantized visual features

AU - Bao, Bing Kun

AU - Zhu, Guangyu

AU - Shen, Jialie

AU - Yan, Shuicheng

PY - 2013/3

Y1 - 2013/3

N2 - Recent techniques based on sparse representation (SR) have demonstrated promising performance in high-level visual recognition, exemplified by the highly accurate face recognition under occlusion and other sparse corruptions. Most research in this area has focused on classification algorithms using raw image pixels, and very few have been proposed to utilize the quantized visual features, such as the popular bag-of-words feature abstraction. In such cases, besides the inherent quantization errors, ambiguity associated with visual word assignment and misdetection of feature points, due to factors such as visual occlusions and noises, constitutes the major cause of dense corruptions of the quantized representation. The dense corruptions can jeopardize the decision process by distorting the patterns of the sparse reconstruction coefficients. In this paper, we aim to eliminate the corruptions and achieve robust image analysis with SR. Toward this goal, we introduce two transfer processes (ambiguity transfer and mis-detection transfer) to account for the two major sources of corruption as discussed. By reasonably assuming the rarity of the two kinds of distortion processes, we augment the original SR-based reconstruction objective with ℓ₀-norm regularization on the transfer terms to encourage sparsity and, hence, discourage dense distortion/transfer. Computationally, we relax the nonconvex ℓ₀-norm optimization into a convex ℓ₁-norm optimization problem, and employ the accelerated proximal gradient method to optimize the convergence provable updating procedure. Extensive experiments on four benchmark datasets, Caltech-101, Caltech-256, Corel-5k, and CMU pose, illumination, and expression, manifest the necessity of removing the quantization corruptions and the various advantages of the proposed framework.

AB - Recent techniques based on sparse representation (SR) have demonstrated promising performance in high-level visual recognition, exemplified by the highly accurate face recognition under occlusion and other sparse corruptions. Most research in this area has focused on classification algorithms using raw image pixels, and very few have been proposed to utilize the quantized visual features, such as the popular bag-of-words feature abstraction. In such cases, besides the inherent quantization errors, ambiguity associated with visual word assignment and misdetection of feature points, due to factors such as visual occlusions and noises, constitutes the major cause of dense corruptions of the quantized representation. The dense corruptions can jeopardize the decision process by distorting the patterns of the sparse reconstruction coefficients. In this paper, we aim to eliminate the corruptions and achieve robust image analysis with SR. Toward this goal, we introduce two transfer processes (ambiguity transfer and mis-detection transfer) to account for the two major sources of corruption as discussed. By reasonably assuming the rarity of the two kinds of distortion processes, we augment the original SR-based reconstruction objective with ℓ₀-norm regularization on the transfer terms to encourage sparsity and, hence, discourage dense distortion/transfer. Computationally, we relax the nonconvex ℓ₀-norm optimization into a convex ℓ₁-norm optimization problem, and employ the accelerated proximal gradient method to optimize the convergence provable updating procedure. Extensive experiments on four benchmark datasets, Caltech-101, Caltech-256, Corel-5k, and CMU pose, illumination, and expression, manifest the necessity of removing the quantization corruptions and the various advantages of the proposed framework.

KW - Image classification

KW - quantized visual feature

KW - sparse representation

UR - http://www.scopus.com/inward/record.url?scp=84873312902&partnerID=8YFLogxK

U2 - 10.1109/TIP.2012.2219543

DO - 10.1109/TIP.2012.2219543

M3 - Article

VL - 22

SP - 860

EP - 871

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

IS - 3

M1 - 6310054

ER -