Title: Processor Architecture
Code: ACH
Ac. Year: 2017/2018
Term: Winter
Curriculums:
  Programme    Branch    Year    Duty
  IT-MSC-2     MBI       -       Elective
  IT-MSC-2     MBS       -       Compulsory-Elective - group C
  IT-MSC-2     MGM       2nd     Elective
  IT-MSC-2     MIN       -       Elective
  IT-MSC-2     MIS       -       Elective
  IT-MSC-2     MMI       -       Compulsory-Elective - group C
  IT-MSC-2     MMM       -       Elective
  IT-MSC-2     MPV       2nd     Compulsory
  IT-MSC-2     MSK       -       Elective
  IT-MSC-2     MSK       2nd     Compulsory-Elective - group C
Language: Czech
Credits: 5
Completion: accreditation+exam (written)
Type of instruction:
  Hours/sem: Lectures 26, Sem. exercises 0, Lab. exercises 0, Comp. exercises 10, Other 16
Points:
  Examination 60, Tests 10, Exercises 0, Laboratories 0, Other 30
Guarantee: Jaroš Jiří, Ing., Ph.D., DCSY
Lecturer: Jaroš Jiří, Ing., Ph.D., DCSY
Instructor: Bordovský Gabriel, Ing., DCSY
            Jaroš Jiří, Ing., Ph.D., DCSY
            Kadlubiak Kristián, Ing., DCSY
Faculty: Faculty of Information Technology BUT
Department: Department of Computer Systems FIT BUT
Follow-ups:
Design of External Adapters and Embedded Systems (NAV), DCSY
Graphic and Multimedia Processors (GMU), DCSY
Substitute for:
Advanced Computer Architecture (ARP), DCSY
Schedule:
  Day   Lesson      Week       Room    Start   End     Lect.Gr.   St.G.     EndG.
  Tue   comp. lab   lectures   O204    08:00   09:50
  Tue   comp. lab   lectures   O204    10:00   11:50
  Fri   lecture     lectures   D0207   10:00   11:50   1MIT       xx        xx
  Fri   lecture     lectures   D0207   10:00   11:50   2MIT       17 MPV    17 MPV
 
Learning objectives:
  To familiarize students with the architecture of the newest processors exploiting instruction-level, thread-level and data-level parallelism. To clarify the role of the compiler and its cooperation with the CPU. To enable students to orient themselves on the processor market and to evaluate and compare various CPUs. Further, to familiarize students with the architecture of graphics processors and their use for the acceleration of numerical calculations (GPGPU), and with low-power techniques in processors for mobile applications.
Description:
  The course covers the architecture of general-purpose as well as special-purpose processors. Instruction-level parallelism (ILP) is studied on scalar, superscalar and VLIW processors. Processors with thread-level parallelism (TLP) are discussed next. Data-level parallelism is illustrated on SIMD streaming instructions and on graphics processors (SIMT). Parallelization of numerical calculations for the GPU using CUDA is also covered, as are techniques used in low-power processors.
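  The SIMT programming model and the host-device memory transfers mentioned above can be pictured with a minimal CUDA sketch such as the one below; the kernel and variable names (vecAdd, hA, dA, N) are illustrative assumptions, not code taken from the course materials.

    // Minimal SIMT sketch: each GPU thread computes one element of c = a + b.
    #include <cuda_runtime.h>
    #include <cstdio>
    #include <cstdlib>

    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n)                                      // guard the last, partially filled block
            c[i] = a[i] + b[i];
    }

    int main()
    {
        const int N = 1 << 20;
        const size_t bytes = N * sizeof(float);

        // Host data.
        float *hA = (float *) malloc(bytes);
        float *hB = (float *) malloc(bytes);
        float *hC = (float *) malloc(bytes);
        for (int i = 0; i < N; i++) { hA[i] = 1.0f; hB[i] = 2.0f; }

        // Device buffers and host-to-device transfers.
        float *dA, *dB, *dC;
        cudaMalloc((void **) &dA, bytes);
        cudaMalloc((void **) &dB, bytes);
        cudaMalloc((void **) &dC, bytes);
        cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

        // One thread per element, 256 threads per block.
        vecAdd<<<(N + 255) / 256, 256>>>(dA, dB, dC, N);

        // Copy the result back and spot-check it.
        cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
        printf("c[0] = %.1f (expected 3.0)\n", hC[0]);

        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        free(hA); free(hB); free(hC);
        return 0;
    }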
Knowledge and skills required for the course:
  Von Neumann computer architecture, memory hierarchy, programming in assembly language, and the tasks and functions of a compiler.
Learning outcomes and competences:
  An overview of processor microarchitectures and their future trends; the ability to compare processors and, using suitable tools, to simulate the influence of changes in their architecture; familiarity with processor performance measurement. Knowledge of the architecture and hardware support of parallel computation on graphics processors can be directly applied to accelerate intensive calculations.
Syllabus of lectures:
 
  1. Scalar processors. Pipelined instruction processing and compiler assistance.
  2. Superscalar CPU. Dynamic instruction scheduling, branch prediction.
  3. Advanced superscalar processing techniques: register renaming, data flow through memory hierarchy.
  4. Optimization of instruction and data fetching. Examples of superscalar CPUs.
  5. Multi-threaded processors.
  6. Data parallelism. SIMD extensions and vectorization.
  7. Architecture of graphics processing units, SIMT programming model.
  8. CUDA programming language, thread and memory model.
  9. Synchronisation and reduction on GPU, design and tuning of GPU codes (a minimal reduction sketch follows this list).
  10. Stream processing, multi-GPU systems, GPU libraries.
  11. Architecture of many core systems (MIC, Xeon Phi) and their programming.
  12. VLIW processors. SW pipelining, predication, binary translation.
  13. Low power processors.
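  The synchronisation and reduction topic of lecture 9 might look roughly like the sketch below: a block-wise sum reduction in shared memory, with __syncthreads() as the intra-block barrier. The kernel name blockSum and the fixed block size of 256 are assumptions for illustration, not the course's reference solution.

    // Block-wise sum reduction in shared memory; the host finishes the last step.
    #include <cuda_runtime.h>
    #include <cstdio>
    #include <cstdlib>

    __global__ void blockSum(const float *in, float *partial, int n)
    {
        __shared__ float cache[256];                 // one slot per thread in the block

        int tid = threadIdx.x;
        int i   = blockIdx.x * blockDim.x + threadIdx.x;

        cache[tid] = (i < n) ? in[i] : 0.0f;         // load one element, or 0 past the end
        __syncthreads();                             // wait until the whole block has loaded

        // Tree reduction: halve the number of active threads each step.
        for (int stride = blockDim.x / 2; stride > 0; stride >>= 1)
        {
            if (tid < stride)
                cache[tid] += cache[tid + stride];
            __syncthreads();                         // barrier before the next step
        }

        if (tid == 0)                                // thread 0 writes the block's partial sum
            partial[blockIdx.x] = cache[0];
    }

    int main()
    {
        const int N = 1 << 20, BLOCK = 256;
        const int GRID = (N + BLOCK - 1) / BLOCK;

        float *hIn      = (float *) malloc(N * sizeof(float));
        float *hPartial = (float *) malloc(GRID * sizeof(float));
        for (int i = 0; i < N; i++) hIn[i] = 1.0f;

        float *dIn, *dPartial;
        cudaMalloc((void **) &dIn, N * sizeof(float));
        cudaMalloc((void **) &dPartial, GRID * sizeof(float));
        cudaMemcpy(dIn, hIn, N * sizeof(float), cudaMemcpyHostToDevice);

        blockSum<<<GRID, BLOCK>>>(dIn, dPartial, N);

        cudaMemcpy(hPartial, dPartial, GRID * sizeof(float), cudaMemcpyDeviceToHost);

        // Finish the reduction of the per-block partial sums on the host.
        double sum = 0.0;
        for (int i = 0; i < GRID; i++) sum += hPartial[i];
        printf("sum = %.0f (expected %d)\n", sum, N);

        cudaFree(dIn); cudaFree(dPartial);
        free(hIn); free(hPartial);
        return 0;
    }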
Syllabus of computer exercises:
 
  1. Performance measurement for sequential codes.
  2. Vectorisation using OpenMP 4.0.
  3. CUDA: Memory transfers, simple kernels.
  4. CUDA: Shared memory.
  5. CUDA: Texture and constant memory, reduction operation.
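  As a rough illustration of the constant-memory part of exercise 5, the sketch below evaluates a polynomial whose coefficients are broadcast to all threads from __constant__ memory; the names polyEval, coeffs and DEGREE are hypothetical and not taken from the exercise assignments.

    // Constant memory sketch: all threads read the same coefficients,
    // which the constant cache broadcasts efficiently.
    #include <cuda_runtime.h>
    #include <cstdio>
    #include <cstdlib>

    #define DEGREE 4

    __constant__ float coeffs[DEGREE + 1];           // cached, read-only device memory

    __global__ void polyEval(const float *x, float *y, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        // Horner's scheme over the constant-memory coefficients.
        float acc = coeffs[DEGREE];
        for (int d = DEGREE - 1; d >= 0; d--)
            acc = acc * x[i] + coeffs[d];
        y[i] = acc;
    }

    int main()
    {
        const int N = 1 << 20;
        const float hCoeffs[DEGREE + 1] = {1.f, 2.f, 3.f, 4.f, 5.f};

        // Copy the coefficients into constant memory once, before the launch.
        cudaMemcpyToSymbol(coeffs, hCoeffs, sizeof(hCoeffs));

        float *hX = (float *) malloc(N * sizeof(float));
        float *hY = (float *) malloc(N * sizeof(float));
        for (int i = 0; i < N; i++) hX[i] = 1.0f;

        float *dX, *dY;
        cudaMalloc((void **) &dX, N * sizeof(float));
        cudaMalloc((void **) &dY, N * sizeof(float));
        cudaMemcpy(dX, hX, N * sizeof(float), cudaMemcpyHostToDevice);

        polyEval<<<(N + 255) / 256, 256>>>(dX, dY, N);

        cudaMemcpy(hY, dY, N * sizeof(float), cudaMemcpyDeviceToHost);
        printf("y[0] = %.1f (expected 15.0)\n", hY[0]);   // 1+2+3+4+5 at x = 1

        cudaFree(dX); cudaFree(dY);
        free(hX); free(hY);
        return 0;
    }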
Syllabus - others, projects and individual work of students:
 
  • Performance evaluation and code optimization using OpenMP 4.0
  • Acceleration of computational job using CUDA 8.0 

 

Fundamental literature:
 
  • Baer, J.L.: Microprocessor Architecture. Cambridge University Press, 2010, 367 p., ISBN 978-0-521-76992-1
  • Hennessy, J.L., Patterson, D.A.: Computer Architecture: A Quantitative Approach. 5th edition, Morgan Kaufmann Publishers, 2012, 493 p., ISBN 978-0-12-383872-8
  • Kirk, D., Hwu, W.: Programming Massively Parallel Processors: A Hands-on Approach. Elsevier, 2010, 256 p., ISBN 978-0-12-381472-2
  • Jeffers, J., Reinders, J.: Intel Xeon Phi Coprocessor High Performance Programming. Morgan Kaufmann, 2013, 432 p., ISBN 978-0-124-10414-3
Study literature:
 
Progress assessment:
  Assessment of two projects (13 hours in total) and a midterm examination.
Exam prerequisites:
  To obtain at least 20 of the 40 points awarded for the projects and the midterm examination.