CONVOLUTIONAL NEURAL NETWORKS FOR FACE …eprints.utem.edu.my/14813/1/Convolutional Neural Networks...

CONVOLUTIONAL NEURAL NETWORKS FOR FACE RECOGNITION AND

FINGER-VEIN BIOMETRIC IDENTIFICATION

SYAFEEZA AHMAD RADZI

UNIVERSITI TEKNOLOGI MALAYSIA

© Universiti Teknikal Malaysia Melaka

PSZ 19:16 (Pind. 1/07)

UNIVERSITI TEKNOLOGI MALAYSIA

DECLARATION OF THESIS / UNDERGRADUATE PROJECT PAPER AND COPYRIGHT

Author’s full name : SYAFEEZA BINTI AHMAD RADZI

Date of birth : 24TH MARCH 1981

Title : CONVOLUTIONAL NEURAL NETWORKS FOR FACE RECOGNITION AND FINGER-VEIN BIOMETRIC IDENTIFICATION

Academic Session : 2013/2014/2

I declare that this thesis is classified as :

√

CONFIDENTIAL (Contains confidential information under the Official Secret Act 1972)*

RESTRICTED (Contains restricted information as specified by the organisation where research was done)*

OPEN ACCESS I agree that my thesis to be published as online open access (full text)

I acknowledged that Universiti Teknologi Malaysia reserves the right as follows :

1. The thesis is the property of Universiti Teknologi Malaysia.

2. The Library of Universiti Teknologi Malaysia has the right to make copies for the purpose of research only.

3. The Library has the right to make copies of the thesis for academic exchange.

Certified by :

SIGNATURE SIGNATURE OF SUPERVISOR

810324-71-5024 (NEW IC NO. /PASSPORT NO.) MOHAMED KHALIL BIN MOHD HANI (NAME OF SUPERVISOR)

Date : 2 JULY 2014 Date : 2 JULY 2014

NOTES : * If the thesis is CONFIDENTIAL or RESTRICTED, please attach with the letter from the organisation with period and reasons for confidentiality or restriction.

“I hereby declare that I have read this thesis and in my

opinion this thesis is sufficient in terms of scope and quality for the

award of the degree of Doctor of Philosophy (Electrical Engineering)”

Signature :

Name : Prof. Dr. Mohamed Khalil Mohd. Hani

Date : July 2, 2014

PART A - Confirmation of Collaboration*

It is hereby certified that this thesis research project was carried out through collaboration between __________ and __________

Certified by:

Signature :

Name :

Position (Official stamp) :

Date :

* If the preparation of the thesis/project involved collaboration.

PART B - For the Use of the Office of the School of Graduate Studies

This thesis has been examined and approved by:

Name and Address of External Examiner :

Prof. Dr. Mohd Alauddin Mohd Ali
Department of Electrical, Electronic & Systems Engineering,
Faculty of Engineering and Built Environment,
Universiti Kebangsaan Malaysia,
43600 Bangi, Selangor.

Name and Address of Internal Examiner :

Prof. Dr. Jasmy bin Yunus
Dean
Fakulti Biosains dan Kejuruteraan Perubatan,
UTM Johor Bahru.

Certified by the Deputy Registrar at the School of Graduate Studies:

Signature : Date :

Name : ASRAM BIN SULAIMAN @ SAIM


CONVOLUTIONAL NEURAL NETWORKS FOR FACE RECOGNITION AND

FINGER-VEIN BIOMETRIC IDENTIFICATION


A thesis submitted in fulfilment of the

requirements for the award of the degree of

Doctor of Philosophy (Electrical Engineering)

Faculty of Electrical Engineering

Universiti Teknologi Malaysia

JULY 2014



I declare that this thesis entitled "Convolutional Neural Networks for Face Recognition

and Finger-vein Biometric Identification" is the result of my own research except as

cited in the references. The thesis has not been accepted for any degree and is not

concurrently submitted in candidature of any other degree.

Signature

Name

Date : July 2, 2014



Dedicated to my beloved parents, husband and daughter.


ACKNOWLEDGEMENT

I am very blessed with the completion of this thesis. With this opportunity, I would like to express my gratitude to several people who have been involved directly or indirectly in the completion of this thesis.

Firstly, I would like to express my sincere appreciation to my supervisor, Professor Dr. Mohamed Khalil bin Mohd Hani, for the useful critiques, guidance and advice during the journey. I have gained valuable knowledge during his supervision. It was also a privilege to work closely with the members of VeCAD Lab, namely Rabia, Vishnu, Lee Yee Hui, Sia Chen Wei, Moganesh and Liew Shan Sung. I am very fortunate to have had these people with me during the ups and downs of conducting this research. I am grateful for their constructive ideas and their concern.

Most importantly, I would like to thank my family, especially my parents, husband and daughter. Thank you for the continuous encouragement, patience and boundless support in all my endeavours.

Last but not least, I would like to acknowledge Universiti Teknikal Malaysia Melaka (UTeM) for giving me the opportunity to pursue my study up to this level.


ABSTRACT

The Convolutional Neural Network (CNN), a variant of the Multilayer Perceptron (MLP), has shown promise in solving complex recognition problems, particularly in visual pattern recognition. However, the classical LeNet-5 CNN model, on which most solutions are based, is highly compute-intensive. This CNN also suffers from long training time, due to its large number of layers, which ranges from six to eight. In this research, a CNN model with reduced complexity is proposed for application in face recognition and finger-vein biometric identification. A simpler architecture is obtained by fusing convolutional and subsampling layers into one layer, in conjunction with a partial connection scheme applied between the first two layers in the network. As a result, the total number of layers is reduced to four. The number of feature maps at each layer is optimized according to the type of image database being processed. Consequently, the numbers of network parameters (including neurons, trainable parameters and connections) are significantly reduced, essentially increasing the generalization ability of the network. The Stochastic Diagonal Levenberg-Marquardt (SDLM) backpropagation algorithm is modified and applied in the training of the proposed network. With this learning algorithm, the convergence rate is accelerated such that the proposed CNN converges within 15 epochs. For face recognition, the proposed CNN achieves recognition rates of 100.00% and 99.50% for the AT&T and AR Purdue face databases respectively. Recognition time on the AT&T database is less than 0.003 seconds. These results outperform previous existing works. In addition, when compared with the other CNN-based face recognizers, the proposed CNN model has the least number of network parameters, hence better generalization ability. A training scheme is also proposed to recognize new categories without full CNN training. In this research, a novel CNN solution for the finger-vein biometric identification problem is also proposed. To the best of the author's knowledge, there is no previous work reported in the literature that applied CNN for finger-vein recognition. The proposed method is efficient in that simple preprocessing algorithms are deployed. The CNN design is adapted on a finger-vein database, which is developed in-house and contains 81 subjects. A recognition accuracy of 99.38% is achieved, which is similar to the results of state-of-the-art work. In conclusion, the success of the research in solving face recognition and finger-vein biometric identification problems proves the feasibility of the proposed CNN model in any pattern recognition system.


ABSTRAK

The Convolutional Neural Network (CNN), a variant of the Multilayer Perceptron (MLP), has shown capability in complex recognition tasks, particularly in visual pattern recognition. However, the classical CNN architecture, LeNet-5, which is the basis of most solutions, is highly compute-intensive. This CNN also faces excessively long training time because its number of layers ranges from six to eight. In this study, a CNN model with reduced complexity is proposed for application in face recognition and finger-vein biometric identification. A simpler architecture is obtained by fusing the convolution and subsampling layers into one layer, combined with a partial connection scheme between the first two layers in the network. As a result, the total number of layers is reduced to four. The number of feature maps at each layer is optimized according to the type of database used. Consequently, the number of network parameters (including neurons, trainable parameters and connections) is markedly reduced, leading in particular to better generalization ability. The Stochastic Diagonal Levenberg-Marquardt (SDLM) algorithm is modified and applied in training the proposed network. With this algorithm, the convergence rate is accelerated so that the network converges within 15 epochs. For face recognition, the proposed CNN achieves recognition rates of 100.00% and 99.50% for the AT&T and AR Purdue databases respectively. Recognition time for AT&T is less than 0.003 seconds. The results obtained surpass previous work. Furthermore, when compared with other CNN designs, the proposed CNN architecture has the fewest network parameters and also has better generalization ability. A training scheme is also proposed to recognize new categories without requiring full CNN training. In this study, a CNN solution for the finger-vein biometric identification problem is also proposed. To the best of available knowledge, no previous work has been reported that applies CNN to finger-vein biometric identification. The proposed method is effective because simple preprocessing algorithms are used. The CNN design is adapted to a finger-vein database that was produced in-house and contains 81 people. A recognition accuracy of 99.38% is achieved, which is nearly the same as the results obtained by recent work. In conclusion, this study has succeeded in solving the face recognition and finger-vein biometric identification problems, proving that the proposed CNN can be implemented in any pattern recognition system.


TABLE OF CONTENTS

CHAPTER TITLE

DECLARATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

1 INTRODUCTION
    1.1 Overview of Pattern Recognition
    1.2 Neural Networks in Pattern Recognition Problems
    1.3 Convolutional Neural Network
    1.4 Biometric Pattern Recognition
        1.4.1 Summary of Existing Face Recognizers
        1.4.2 Summary of Work on Finger-vein Pattern Recognition
    1.5 Problem Statement
    1.6 Research Objectives
    1.7 Scope of Work
    1.8 Research Contributions
    1.9 Thesis Organization

2 LITERATURE REVIEW
    2.1 Basic CNN Architectures
    2.2 Previous Work applying CNN in Face Recognition
    2.3 Related Previous Work on Finger-vein Biometric Identification
    2.4 Other State-of-the-Art CNN Architectures
        2.4.1 Siamese CNN
        2.4.2 Space Displacement Neural Networks
        2.4.3 Shunting Inhibitory Convolutional Neural Networks
        2.4.4 Sparse Convolutional Neural Networks
        2.4.5 Multiscale Convolutional Neural Networks
        2.4.6 Multiple Circular Path Convolutional Neural Networks
        2.4.7 Multi-column Deep Neural Networks
        2.4.8 3D Convolutional Neural Networks
        2.4.9 Recurrent Convolutional Neural Networks
    2.5 Summary

3 CONVOLUTIONAL NEURAL NETWORK - ALGORITHMS AND ARCHITECTURE
    3.1 Multilayer Perceptron Neural Network (MLPNN)
        3.1.1 Activation Functions
        3.1.2 Weight Initialization Methods
        3.1.3 Normalization Method
    3.2 Neural Network Training
        3.2.1 Gradient Descent Backpropagation Algorithm
        3.2.2 Resilient Backpropagation
    3.3 LeNet-5 Convolutional Neural Networks
        3.3.1 Algorithms in LeNet-5
        3.3.2 Convolutional Neural Network Training
    3.4 Summary

4 PROPOSED CONVOLUTIONAL NEURAL NETWORK: MODELLING, ARCHITECTURE AND ALGORITHM
    4.1 Methodology
        4.1.1 Cross-validation Technique to optimize Generalization Ability
        4.1.2 Techniques to Improve Convergence Rate
        4.1.3 Winner-Takes-All Rule
    4.2 Proposed CNN Model
        4.2.1 Introduction
        4.2.2 CNN with Fused Convolution/Subsampling
        4.2.3 Connection Schemes
    4.3 Proposed Training Algorithm
        4.3.1 Stochastic Diagonal Levenberg-Marquardt (SDLM)
        4.3.2 Proposed Modifications to SDLM Algorithm
    4.4 Proposed Training Scheme to recognize New Categories without Full CNN Training
    4.5 Generalizing the CNN Architecture for New Problem Domain
    4.6 Summary

5 EXPERIMENTAL WORK AND RESULTS FOR CNN IN FACE RECOGNITION
    5.1 Performance Evaluation Procedures
    5.2 Data Preparation
        5.2.1 Face databases
    5.3 Experimental Work and Results of the Proposed Baseline CNN Model
        5.3.1 Results of Cross-validation
        5.3.2 Best Connection Scheme
        5.3.3 The Most Suitable Normalization and Weight Initialization Methods
        5.3.4 The Optimum Input Image Size
        5.3.5 Benchmarking Results
            5.3.5.1 Benchmarking of Accuracy
            5.3.5.2 Benchmarking of Execution and Convergence Time
            5.3.5.3 Analysis and Benchmarking of Model Complexity
            5.3.5.4 Comparison between the proposed CNN Model and LeNet-5
    5.4 Experimental Work and Results of Proposed CNN Model
        5.4.1 Results of Cross-validation
        5.4.2 Best Connection Scheme
        5.4.3 The Most Suitable Normalization and Weight Initialization Methods
    5.5 Experimental Results of CNN Model Targeted for Face Images of the "Extreme Challenge" Type
        5.5.1 Results of Cross-validation
        5.5.2 Best Connection Scheme
        5.5.3 The Most Suitable Normalization and Weight Initialization Methods
    5.6 Analysis of the Proposed Modification on SDLM Algorithm
        5.6.1 Comparison between the Proposed Modification on SDLM Algorithm and Standard Backpropagation
    5.7 Analysis of Training Scheme to recognize New Categories without Full CNN Training
    5.8 Conclusion

6 EXPERIMENTAL WORK AND RESULTS FOR CNN IN FINGER-VEIN BIOMETRIC IDENTIFICATION
    6.1 Generation of Finger-vein Database for Biometric Identification
    6.2 Performance Evaluation of Finger-vein Biometric Identification
        6.2.1 VeCAD-UTM database
        6.2.2 Results of Cross-validation
        6.2.3 Best Connection Scheme
        6.2.4 The Most Suitable Normalization and Weight Initialization Methods
    6.3 Conclusion

7 CONCLUSION
    7.1 Concluding Remarks
    7.2 Suggestions for Future Work

REFERENCES
Appendix A

LIST OF TABLES

TABLE NO. TITLE

1.1 Characteristic comparison of biometric application [1]
1.2 History of some biometric development [2]
1.3 Previous work on face recognition based on CNN
1.4 Previous work on face recognition applying non-CNN methods
1.5 Accuracy achieved by Computational Intelligence (CI) approach for finger-vein identification system
1.6 Detail information and accuracy achieved by conventional approach for finger-vein identification system
2.1 Face Recognition System using CNN
2.2 Conventional methods of finger-vein identification
2.3 Algorithms applied by CI approach for finger-vein identification system
2.4 Accuracy achieved by CI-based for finger-vein identification
2.5 Related previous work on finger-vein biometric system and its recognition speed
3.1 Input/output relation and symbol for each activation function
3.2 Each row indicates which feature maps in S2 are combined by the particular feature maps of C3 [3]
4.1 Variant type of connections between C1 and C2 layer
4.2 Connection scheme between C1 and C2 layer for AT&T database
4.3 Variant type of connections between C1 and C2 layer
4.4 Connection scheme between C1 and C2 layer for AR Purdue database
4.5 The values of ηg as given in [3]
5.1 Pose variations of FERET database
5.2 Performance of the model tested for AT&T database
5.3 Variant type of connections between C1 and C2 layer
5.4 Connection scheme between C1 and C2 layer for AT&T database
5.5 Parameter Setup for AT&T database
5.6 Accuracy for different combinations of weight initialization algorithm and normalization methods for AT&T database
5.7 Accuracy of each input size for AT&T database
5.8 Benchmarking of accuracy of other CNN face recognizers on AT&T database
5.9 Benchmarking with non-CNN approach
5.10 Benchmarking of execution time and convergence learning time against other face recognizers
5.11 Benchmarking of model complexity against previous work that apply CNN (not restricted to face recognition problems)
5.12 Comparison between fused architecture and LeNet-5
5.13 Performance of the model tested for AR database
5.14 Variant type of connections between C1 and C2 layer
5.15 Connection scheme between C1 and C2 layer for AR Purdue database
5.16 Parameter Setup for AR Purdue database
5.17 Accuracy for different combinations of weight initialization algorithm and normalization methods for AR Purdue database
5.18 Parameter setup for each input size for AR Purdue database
5.19 Accuracy of each input size for AR Purdue database
5.20 Benchmarking of face identification system using AR Purdue database with existing works
5.21 Performance of the model tested for FERET database
5.22 Variant type of connections between C1 and C2 layer for FERET database
5.23 Connection scheme between C1 and C2 layer for FERET database
5.24 Parameter Setup for FERET database
5.25 Accuracy for different combinations of weight initialization algorithm and normalization methods for FERET database
5.26 Parameter setup for each input size for FERET database
5.27 Accuracy of each input size for FERET database
5.28 Benchmarking of face identification system using FERET database with existing works
5.29 Experimental setup for standard Backpropagation (BP)
5.30 Result of BP training using combination of different initial learning rate and learning rate fading
5.31 Result of BP training using combination of different initial learning rate and learning rate fading
5.32 Result of accepting new subject(s) from JAFFE database into the system
6.1 Performance of the model tested for VeCAD-UTM finger-vein database
6.2 Variant type of connections between C1 and C2 layer
6.3 Connection scheme between C1 and C2 layer for VeCAD-UTM database
6.4 Parameter Setup for Case A and Case B
6.5 Accuracy for different combinations of weight initialization algorithm and normalization methods for Case A VeCAD-UTM finger-vein database
6.6 Accuracy for different combinations of weight initialization algorithm and normalization methods for Case B VeCAD-UTM finger-vein database
6.7 Accuracy of each input size for VeCAD-UTM database
6.8 Accuracy achieved by different number of subjects
6.9 Benchmarking of accuracy achieved by the proposed method and other CI methods for finger-vein identification system
6.10 Benchmarking of accuracy achieved by the proposed method and conventional methods for finger-vein identification system
6.11 Benchmarking on recognition speed for finger-vein biometric system

LIST OF FIGURES

FIGURE NO. TITLE

1.1 Typical pattern recognition flow
1.2 Example of CNN Architecture
1.3 Taxonomy of biometric methods
1.4 Challenges faced by biometric face recognition system
1.5 Inappropriate lighting in finger-vein samples of VeCAD-UTM finger-vein database
2.1 Handwriting zip code application using CNN [4]
2.2 LeNet-5 CNN architecture for handwriting recognition
2.3 CNN architecture by Y. Simard et al. [5] with fused convolution/subsampling
2.4 CNN architecture for fingerprint verification process [6]
2.5 Example of genuine and impostor pairs
2.6 Illustration of SCNN framework
2.7 A sample of ZIP code image and its possible segmentation
2.8 INSEG and OUTSEG segmentation for word recognition
2.9 Three schemes of SICoNNets. (a) Full-Connection. (b) Toeplitz-Connection. (c) Binary-connection [7]
2.10 Shunting neuron model [7]
2.11 Sketch of the SPCNN connectivity [8].
2.12 Multiscale Convolutional Neural Networks (MSCNN) architecture for traffic sign recognition [9].
2.13 Three types of network paths connecting the input and the hidden layers in the Multiple Circular Path Convolutional Neural Network (MCPCNN) (a) Full-connection path (b) Self-correlation path (c) Neighbourhood-correlation path [10].
2.14 Multi-column Deep Neural Networks (MCDNN) and its basic building block (a) Deep Neural Network (DNN) architecture (b) The dataset is preprocessed (P) before the training starts; during training all original or pre-processed images are randomly distorted before each epoch (D). (c) MCDNN architecture: the input image is pre-processed by n different preprocessors P_0, ..., P_{n-1} and final predictions are obtained by averaging individual predictions of each DNN [11].
2.15 Comparison between Two-Dimensional (2D) and Three-Dimensional (3D) convolutions
2.16 3D CNN architecture for human action recognition
2.17 Recurrent Convolutional Neural Networks (RCNN) architecture for dialog act classification
3.1 (a) Biological neuron (b) Modeling in NN
3.2 (a) Training phase (b) Test phase
3.3 An example of a LeNet-5 CNN architecture for handwriting recognition [3]
3.4 Process in convolutional layer of CNN
3.5 Process in subsampling layer of CNN
3.6 LeNet-5 CNN architecture
4.1 Prediction of early stopping rule through cross-validation technique [12].
4.2 CNN architecture by Y. Simard et al. [5] with fused convolution/subsampling
4.3 The proposed four-layer CNN architecture
4.4 (a) convolution and subsampling of LeNet-5 CNN architecture and (b) fused convolution/subsampling in proposed CNN
4.5 Conceptual view of generalizing the feature extraction layer to other database in the system
5.1 Four different types of weight initialization methods
5.2 Sample of images from AT&T database
5.3 Sample of images from AR Purdue database
5.4 Preprocessing stages for AR Purdue database
5.5 Sample of images from FERET database
5.6 Preprocessing stages for FERET database
5.7 Validation error rates in cross-validation of the six tested CNN models for AT&T
5.8 The proposed architecture for AT&T database (referred here as the 5-14-60 model)
5.9 Feature maps at each layer after training for AT&T database
5.10 Validation error rates in cross-validation of the seven tested CNN models for AR database
5.11 Misclassified samples for AR database
5.12 The proposed architecture for AR Purdue database (referred here as the 15-45-130 model)
5.13 Feature maps at each layer after training for AR database
5.14 Validation error rates in cross-validation of the six tested CNN models for FERET
5.15 The proposed architecture for FERET database (referred here as the 15-47-130 model)
5.16 Feature maps at each layer after training for FERET database
5.17 Error rate on the training dataset
5.18 Error rate on the test dataset
5.19 Classification accuracy performance. (Note: Case A is for the proposed learning with repeating scheme and Case B is the original SDLM)
6.1 Non-ideal finger-vein samples of SDUMLA-HMT finger-vein database
6.2 Sample of images from VeCAD-UTM database
6.3 Example of vertical cut
6.4 Example of horizontal cut
6.5 Image samples for 5 subjects (a) Case A and (b) Case B
6.6 Preprocessing stages and segmentation process for Case A
6.7 Preprocessing stages for Case B
6.8 Validation error rates in cross-validation of the six tested CNN models for VeCAD-UTM finger-vein database
6.9 Training error for Case A and Case B
6.10 Test error for Case A and Case B
6.11 Rate of accuracy during training for Case A and Case B
6.12 The proposed model for VeCAD-UTM database (referred here as the 5-13-50 model)
6.13 Output images at each layer after training for Case B

LIST OF ABBREVIATIONS

AI – Artificial Intelligence

ANN – Artificial Neural Network

BP – Backpropagation

CI – Computational Intelligence

CNNs – Convolutional Neural Networks

CMC – Cumulative Match Characteristic

DNN – Deep Neural Network

EBGM – Elastic Bunch Graph Matching

EER – Equal Error Rate

FERET – Face Recognition Technology

FAR – False Acceptance Rate

FRR – False Rejection Rate

FPGA – Field Programmable Gate Array

GA – Genetic Algorithm

GPU – Graphics Processing Unit

HD – Hausdorff Distance

HMM – Hidden Markov Model

LED – Light-Emitting Diode

LDA – Linear Discriminant Analysis

MATLAB – Matrix Laboratory

MSE – Mean Square Error

MHD – Modified Hausdorff Distance

MCDNN – Multi-column Deep Neural Networks

MLP – Multilayer Perceptron

MCPCNN – Multiple Circular Path Convolutional Neural Network

MSCNN – Multiscale Convolutional Neural Networks

NIR – Near Infrared

NNs – Neural Networks


ORL – Olivetti Research Laboratory

1D – One-Dimensional

OS – Operating System

PC – Personal Computer

PIN – Personal Identification Number

PCA – Principal Component Analysis

PNN – Probabilistic Neural Network

RAM – Random Access Memory

ROC – Receiver Operating Characteristic

RCNN – Recurrent Convolutional Neural Networks

RNN – Recurrent Neural Networks

ROI – Region of Interest

RBF – Radial Basis Function

RF – Receptive Field

RPROP – Resilient Backpropagation

SOM – Self-Organizing Map

SICoNNets – Shunting Inhibitory Convolutional Neural Networks

SCNN – Siamese Convolutional Neural Network

SDNN – Space Displacement Neural Network

SPCNN – Sparse Convolutional Neural Network

SDLM – Stochastic Diagonal Levenberg-Marquardt

SVM – Support Vector Machine

3D – Three-Dimensional

TDNN – Time Delay Neural Networks

2D – Two-Dimensional

US – United States

UTM – Universiti Teknologi Malaysia

CHAPTER 1

INTRODUCTION

1.1 Overview of Pattern Recognition

Pattern recognition has been an active area of research for the past half century. The basic approach in pattern recognition is to transform raw images through a series of image processing algorithms before applying the final stage of classification. Examples of applications of pattern recognition include speech recognition, handwriting recognition and object recognition. Figure 1.1 shows a common pattern recognition flow. The choice of sensors, preprocessing techniques and decision-making techniques depends on the characteristics of the problem domain.

[Figure 1.1 block labels: Input sample, Data acquisition, Preprocessing, Segmentation, Dimensionality reduction & Feature extraction, Classification, Result]

Figure 1.1: Typical pattern recognition flow
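The flow in Figure 1.1 can be illustrated with a small, self-contained Python sketch. It is only a toy under stated assumptions: the stage order (preprocessing, segmentation, feature extraction, classification), the function names, the row-mean features and the nearest-centroid classifier are all hypothetical stand-ins chosen for illustration, and none of them is the method used in this thesis.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def preprocess(img: Image) -> Image:
    """Enhancement stand-in: stretch pixel values linearly to [0, 1]."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in img]


def segment(img: Image, top: int, left: int, size: int) -> Image:
    """Segmentation stand-in: crop a square region of interest (ROI)."""
    return [row[left:left + size] for row in img[top:top + size]]


def extract_features(img: Image) -> List[float]:
    """Dimensionality-reduction stand-in: one mean value per image row."""
    return [sum(row) / len(row) for row in img]


def classify(features: List[float], centroids: Dict[str, List[float]]) -> str:
    """Classification stand-in: nearest centroid by squared Euclidean distance."""
    def dist(c: List[float]) -> float:
        return sum((f - x) ** 2 for f, x in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def recognize(raw: Image, centroids: Dict[str, List[float]],
              roi: Tuple[int, int, int] = (0, 0, 2)) -> str:
    """Chain the stages: raw input -> preprocess -> segment -> features -> label."""
    return classify(extract_features(segment(preprocess(raw), *roi)), centroids)


# Hypothetical 3x3 input and two hypothetical class centroids.
raw = [[0, 0, 9], [10, 10, 9], [9, 9, 9]]
centroids = {"dark_top": [0.0, 1.0], "bright_top": [1.0, 0.0]}
print(recognize(raw, centroids))  # dark_top
```

Each stage is a plain function, so swapping one stage (for example, a different classifier) leaves the rest of the chain unchanged, which mirrors how the choice of techniques in each block depends on the problem domain.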

The first stage of a pattern recognition system is data acquisition. In this stage, raw data or images are collected from sensors or capture devices. Image preprocessing is then performed, where transformations such as image enhancement, image restoration, compression and morphological processing are applied [13]. Image enhancement highlights certain features of interest in an image by applying contrast transformation, Region of Interest (ROI) processing and noise filtering. The image is then restored to improve its appearance. This involves recovering the original image that has been degraded, by using a priori knowledge of the degradation phenomenon and applying the inverse process. Image restoration applies deblurring and noise reduction algorithms. Compression is then performed to remove redundant data in the image. Typically, the image is converted to an image file format such as the JPEG (Joint Photographic Experts Group) image compression standard. The morphological processing stage involves techniques to extract
