Development of an Optimal Extracted Feature Classification Scheme in Voice Recognition System Using Dynamic Cuckoo Search Algorithm.

ABSTRACT

This research work is aimed at the development of an optimal extracted feature classification scheme in a voice recognition system using a dynamic cuckoo search algorithm.

The scheme minimizes mismatch errors in the recognition process and increases recognition accuracy. A standard voice dataset was obtained from the English Language Speech Database for Speaker Recognition (ELSDSR) of the Technical University of Denmark (DTU) and processed, and key features of the voice data were extracted.

A dynamic Cuckoo Search Algorithm (dCSA) was developed to optimally classify the feature vectors extracted from the speech signals in the ELSDSR voice data for the voice recognition system.
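To make the idea concrete, the following is a minimal sketch of a cuckoo search with Lévy flights and an inertia weight. It is not the dCSA developed in this work: the linearly decreasing weight, the place where it scales the step, the random re-initialization of abandoned nests, and all names and parameter values (`cuckoo_search`, `w_max`, `w_min`, `pa`, `alpha`, `beta`) are assumptions chosen for illustration, and the sphere function stands in for the classification objective.

```python
import math
import random


def levy_step(beta, rng):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero
    return u / abs(v) ** (1 / beta)


def sphere(x):
    """Stand-in objective (sphere test function); minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def cuckoo_search(objective, dim, bounds, n_nests=15, pa=0.25, alpha=0.01,
                  beta=1.5, iterations=200, w_max=0.9, w_min=0.4, seed=0):
    """Minimize `objective`; all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(x):
        return [min(max(xi, lo), hi) for xi in x]

    def random_nest():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(n) for n in nests]
    best_i = min(range(n_nests), key=fitness.__getitem__)
    best, best_f = nests[best_i][:], fitness[best_i]

    for t in range(iterations):
        # Assumed inertia weight: decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(iterations - 1, 1)
        for i in range(n_nests):
            # New solution by Levy flight, scaled by distance to the best nest.
            candidate = clip([
                xi + w * alpha * levy_step(beta, rng) * (xi - bi)
                for xi, bi in zip(nests[i], best)
            ])
            f = objective(candidate)
            j = rng.randrange(n_nests)  # compare with a randomly chosen nest
            if f < fitness[j]:
                nests[j], fitness[j] = candidate, f
        # Abandon a fraction pa of the worst nests and rebuild them at random.
        order = sorted(range(n_nests), key=fitness.__getitem__, reverse=True)
        for i in order[:int(pa * n_nests)]:
            nests[i] = random_nest()
            fitness[i] = objective(nests[i])
        best_i = min(range(n_nests), key=fitness.__getitem__)
        if fitness[best_i] < best_f:
            best, best_f = nests[best_i][:], fitness[best_i]
    return best, best_f


if __name__ == "__main__":
    solution, value = cuckoo_search(sphere, dim=2, bounds=(-5.0, 5.0))
    print(solution, value)
```

In a classification setting the objective would instead score how well a candidate solution matches the extracted feature vectors; that mapping is specific to the full work and is not reproduced here.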

The performance of the developed Voice Recognition System (VRS) with the dCSA-based scheme was compared with that of the standard CSA-based scheme using accuracy as the performance metric.

The dCSA-based classification scheme achieved a recognition accuracy of 93.18% in the VRS, compared with 90% for the standard CSA-based classification scheme. The simulation was carried out using MATLAB 2013b.
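As a small worked example of the comparison, the sketch below computes accuracy as the percentage of correct recognitions and derives the gap between the two reported figures. The function names and the trial counts passed to `accuracy_percent` are hypothetical; the abstract reports only the two percentages, and the exact improvement formula used in the full work is not shown here.

```python
def accuracy_percent(correct, total):
    """Recognition accuracy as the percentage of correctly recognized trials."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def improvement(new_acc, old_acc):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = new_acc - old_acc
    relative = 100.0 * absolute / old_acc
    return absolute, relative


# Hypothetical counts, only to show how the metric is formed.
print(accuracy_percent(9, 10))  # 90.0

# The two accuracies reported in the abstract.
absolute, relative = improvement(93.18, 90.0)
print(f"{absolute:.2f} percentage points, {relative:.2f}% relative")
# 3.18 percentage points, 3.53% relative
```

Reporting both the percentage-point gap and the relative gain avoids ambiguity about which kind of improvement is meant.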

TABLE OF CONTENTS

DECLARATION
CERTIFICATION
DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES
LIST OF ABBREVIATIONS
CHAPTER ONE: INTRODUCTION
1.1 Background of the Research
1.2 Motivation
1.3 Significance of Research
1.4 Statement of Problem
1.5 Aim and Objectives
1.6 Methodology
1.7 Dissertation Organization
CHAPTER TWO: LITERATURE REVIEW
2.1 Introduction
2.2 Review of Fundamental Concepts
2.2.1 A voice
2.2.2 Speech production
2.2.3 Voice recognition system
2.2.3.1 Categories of voice recognition system
2.2.3.2 Speaker recognition
2.2.3.3 Processes in speaker recognition system
2.2.3.4 Speech signal acquisition process in ELSDSR voice database
2.2.3.5 Speech processing
2.2.4 Speech feature extraction
2.2.5 Classification and feature matching
2.2.6 Cuckoo bird and its breeding behavior
2.2.7 Lévy flight behavior
2.2.8 Cuckoo search algorithm (CSA)
2.2.9 Inertia weight factor
2.2.10 CSA based classification
2.2.11 Matching technique
2.2.12 Decision theory
2.2.13 Optimization test functions
2.3 Review of Similar Works
2.3.1 Review of works based on voice recognition system
2.3.2 Research works on the cuckoo search algorithm modification
CHAPTER THREE: MATERIALS AND METHODS
3.1 Introduction
3.2 Development of Speakers' Database
3.2.1 Obtaining standard voice dataset from ELSDSR of DTU
3.2.2 Recording environment
3.2.3 Recording equipment
3.2.4 Extraction of voice features
3.2.5 Training of speakers' extracted features
3.3 Development of dynamic Cuckoo Search Algorithm (dCSA)
3.3.1 Initialization of dCSA parameters
3.3.2 Introduction of inertia weight factor
3.3.3 Generation of new solution by Lévy flight and updating cuckoo position
3.3.4 Evaluation and comparison of solutions
3.3.5 Replacement of worst solutions
3.4 Performance Evaluation of the Algorithms (CSA and dCSA)
3.4.1 Visualization of the optimization test function
3.4.1.1 Ackley function
3.4.1.2 De Jong function
3.4.1.3 Easom function
3.4.1.4 Griewangk function
3.4.1.5 Michalewicz function
3.4.1.6 Rastrigin function
3.4.1.7 Rosenbrock function
3.4.1.8 Schwefel function
3.4.1.9 Shubert function
3.4.1.10 Sphere function
3.4.2 Percentage improvement
3.5 Application of dCSA into Voice Recognition System (VRS)
3.5.1 Testing of speakers for recognition
3.6 Validation of Performance of CSA and dCSA Scheme in VRS
3.6.1 Accuracy
CHAPTER FOUR: RESULTS AND DISCUSSION
4.1 Introduction
4.2 Speech Signal Representation and Analysis
CHAPTER FIVE: SUMMARY, CONCLUSION, AND RECOMMENDATIONS
5.1 Summary
5.2 Conclusion
5.3 Significant Contribution
5.4 Recommendation for Further Work
REFERENCES

INTRODUCTION

1.1 Background of the Research

Voice denotes sound produced in a person's larynx and articulated through the mouth, as speech or song, while speech refers to the ability to express thoughts and feelings by articulate sounds (Das & Nahar, 2016).

Voice is used to express certain opinions or interests using specific words. These words are used for communication among individuals, which is the bridge that lays the foundation for improved human relationships (Amarasinghe & Wimalaratne, 2017).

In addition to human-human interaction, the spoken word is now extended through technological mediation such as telephony, movies, radio, television, computers, and the Internet, and is reflected in human-machine interaction as well.

This gives rise to research topics such as speech recognition, speaker identification, and voice recognition (Huang et al., 2001). Research into voice recognition began in the early 1960s (Juang & Rabiner, 2005).

Voice recognition is a binary classification problem in which a person's identity is verified based on his or her voice (Zhang et al., 2017). It has a wide range of application areas and plays a crucial role in forensics, security, and biometric authentication for verifying or detecting the voice of a speaker from a group of speakers (Das & Nahar, 2016).

The human voice generally carries much information, such as the gender, emotion, and identity of the speaker. The objective of voice recognition is to decide which speaker is present based on the individual's utterance.

REFERENCES

Aggarwal, C. C., & Reddy, C. K. (2014). Data clustering: Algorithms and applications. Chapman & Hall/CRC.
Amarasinghe, A., & Wimalaratne, P. (2017). An assistive technology framework for communication with hearing impaired persons. GSTF Journal on Computing (JoC), 5(2).
Atal, B. S. (1976). Automatic recognition of speakers from their voices. Proceedings of the IEEE, 64(4), 460-475.
Bansal, D., Turk, N., & Mendiratta, S. (2015). Automatic speech recognition by cuckoo search optimization based artificial neural network classifier. Paper presented at the 2015 International Conference on Soft Computing Techniques and Implementations (ICSCTI).
Bansal, J. C., Singh, P., Saraswat, M., Verma, A., Jadon, S. S., & Abraham, A. (2011). Inertia weight strategies in particle swarm optimization. Paper presented at the 2011 Third World Congress on Nature and Biologically Inspired Computing (NaBIC).
Barthelemy, P., Bertolotti, J., & Wiersma, D. S. (2008). A Lévy flight for light. Nature, 453(7194), 495-498.
Bhalla, A. V., Khaparkar, S., & Bhalla, M. R. (2012). Performance improvement of speaker recognition system. International Journal of Advanced Research in Computer Science and Software Engineering, 2(3).
Brown, C. T., Liebovitch, L.
