Title: Speech Signal Processing
Code: ZRE
Ac.Year: 2017/2018
Term: Summer
Curriculums:
Programme   Field  Year  Duty
IT-MGR-1H   MGH    -     Recommended
IT-MSC-2    MBI    -     Compulsory-Elective - group S
IT-MSC-2    MBS    -     Elective
IT-MSC-2    MGM    1st   Compulsory
IT-MSC-2    MIN    -     Compulsory-Elective - group C
IT-MSC-2    MIS    -     Elective
IT-MSC-2    MMI    -     Compulsory-Elective - group S
IT-MSC-2    MMM    -     Elective
IT-MSC-2    MPV    -     Compulsory-Elective - group G
IT-MSC-2    MSK    2nd   Compulsory-Elective - group B
Language of Instruction: Czech
Public info: http://www.fit.vutbr.cz/study/courses/ZRE/public/
Credits: 5
Completion: examination (written)
Type of instruction:
Hour/sem  Lectures  Sem. Exercises  Lab. exercises  Comp. exercises  Other
Hours:    26        2               0               12               12
          Examination  Tests  Exercises  Laboratories  Other
Points:   51           14     0          6             29
Guarantor: Černocký Jan, doc. Dr. Ing., DCGM
Lecturer: Černocký Jan, doc. Dr. Ing., DCGM
  Grézl František, Ing., Ph.D., DCGM
  Malenovský Vladimír, Ing., Ph.D., DCGM
  Szőke Igor, Ing., Ph.D., DCGM
Instructor: Mošner Ladislav, Ing., DCGM
  Skácel Miroslav, Ing., DCGM
  Žmolíková Kateřina, Ing., DCGM
Faculty: Faculty of Information Technology BUT
Department: Department of Computer Graphics and Multimedia FIT BUT
Follow-ups:
Speech Processing Systems (SRE), DCGM
Schedule:
Day  Lesson              Week        Room   Start  End    Lect.Gr.  St.G.   EndG.
Wed  exercise            2018-04-25  A112   12:00  13:50  1MIT      12 MGM  12 MGM
Wed  exercise            2018-04-25  A112   12:00  13:50  2MIT      12 MGM  12 MGM
Wed  exam - 2nd resit    2018-06-06  A112   13:00  14:50  1MIT
Wed  exam - 2nd resit    2018-06-06  A112   13:00  14:50  2MIT
Thu  exam - 1st resit    2018-05-31  A113   13:00  14:50  1MIT
Thu  exam - 1st resit    2018-05-31  A113   13:00  14:50  2MIT
Thu  exam - regular term 2018-05-17  D0206  16:00  18:50  1MIT
Thu  exam - regular term 2018-05-17  D0206  16:00  18:50  2MIT
 
Learning objectives:
  To provide students with knowledge of the basic characteristics of the speech signal in relation to human speech production and hearing. To describe the basic algorithms of speech analysis common to many applications. To give an overview of applications (recognition, synthesis, coding) and to cover practical aspects of implementing speech algorithms.
Description:
  Applications of speech processing, digital processing of speech signals, production and perception of speech, introduction to phonetics, pre-processing and basic parameters of speech, linear-predictive model, cepstrum, fundamental frequency estimation, coding - time domain and vocoders, recognition - DTW and HMM, synthesis. Software and libraries for speech processing.
Learning outcomes and competences:
  Students will become familiar with the basic characteristics of the speech signal in relation to human speech production and hearing. They will understand the basic algorithms of speech analysis common to many applications. They will gain an overview of applications (recognition, synthesis, coding) and learn about practical aspects of implementing speech algorithms. Students will be able to design a simple speech processing system (a speech activity detector, a recognizer of a limited number of isolated words) and integrate it into application programs.
Syllabus of lectures:
 
  • Introduction, applications of speech processing, fields relevant to speech processing, informational content of speech.
  • Digital processing of speech signals.
  • Speech production and perception, basic notions from psycho-acoustics, applications in speech processing. 
  • Introduction to phonetics, international standards for phoneme notation.
  • Pre-processing and basic parameters of speech.
  • Linear-predictive model, spectrum using LP, applications of LP. 
  • Cepstral analysis, Mel-frequency cepstrum.
  • Determination of fundamental frequency.
  • Speech coding.
  • Speech recognition - dynamic programming (DTW), hidden Markov models (HMM).
  • Speech synthesis.
  • Software and libraries for speech processing.
Syllabus of numerical exercises:
 
  • Parameterization, DTW, HMM.
  • Presentation of projects.
Syllabus of computer exercises:
 
    Matlab is used in all labs except the last one.
  • Frames, windows, spectrum, pre-processing.
  • Linear prediction (LPC).
  • Fundamental frequency estimation.
  • Coding.
  • Recognition - dynamic time warping (DTW).
  • Recognition - hidden Markov models (Hidden Markov Model Toolkit - HTK).
Fundamental literature:
 
  • Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0
  • Gold, B., Morgan, N.: Speech and Audio Signal Processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7
  • Krčmová, N.: Fonetika a fonologie: zvuková stavba současné češtiny. Masarykova univerzita, Brno, 1990, ISBN 80-210-0137-2.
  • Rabiner, L., Juang, B.H.: Fundamentals of Speech Recognition, Signal Processing, Prentice Hall, Englewood Cliffs, NJ, 1993, ISBN 0-13-015157-2
Study literature:
 
  • Gold, B., Morgan, N.: Speech and Audio Signal Processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7
Controlled instruction:
  Attendance at any form of instruction is not compulsory. An absence (and the resulting loss of points) can be compensated in one of the following ways:
  1. attending another laboratory group working on the same task.
  2. showing a summary of results to the tutor at the next lab.
  3. sending the tutor a short report (summarizing the results of the missed lab and answering the questions from the assignment) within 14 days of the missed lab.
Progress assessment:
  
  • mid-term test 14 pts
  • projects 29 pts
  • presentation of results in computer labs 6 pts
 
