Foundations of multicore programming: algorithm and verification

Introduction

The advent of multicore era of modern processors calls for a paradigmatic shift from sequential programming to concurrent programming. Traditionally the vocabulary of concurrent programming has been largely that of shared memory with locks (e.g. semaphores, critical regions, monitors, etc.). Whether it comprises a good vocabulary for multicore programming, i.e. in the respects of efficiency, scalability and abstraction, is still controversial. The quest for a better vocabulary (that scores well in all three respects) is well underway in both academia and industries.

In this course we will concentrate on the efficiency and scalability respects of multicore programming and introduce an alternative programming style to traditional shared memory with locks. It is called `low-lock or low-synchronisation programming style'. The programming style takes into account more lower level features of multicore processors, e.g. hierarchies of caches, and will be immediately useful for multicore programming in the depth of system software (operating systems, compilers and middleware), where factors like scalability, deadlock freedom and performance are more highly regarded than the ease of design and implementation.

Low-synchronisation programming improves in efficiency and scalability by relaxing ordering constraints on memory operations and by reducing or removing the synchroniszation overhead of locking. They are also known under various names like weak memory model, fine-grained locking, lock-free synchroniszation, etc.

Low-synchronisation programming can most efficiently exploit the raw power of modern multiprocessors, even though it has been criticised as being too tricky to master by mainstream programmers. Low-synchronisation programming is rich in ingenious algorithms, e.g. algorithms for implementing atomic registers or concurrent data structures, but under-developed in formal theory. In this school, we will also show how the theory developed for CSP can be usefully applied to low-synchronisation algorithms, helping programmers to understand, analyze and verify their algorithms.

Synopsis (preliminary)

1) Lecture: Introduction to concurrent programming (shared memory, mutual exclusion, race-freedom)
2) Lecture: Racy programs and atomicity (safe, regular and atomic registers, precedes order, as-if order, linearisability, sequential consistency, quiescent consistency, weak memory consistency)
3) Lecture: Java memory model (weak program order, local time, atomicty and regularity, memory barriers, fences and volatile variables)
4) Lecture: Algorithm for the construction of atomic registers (1) (SRSW algorithms, MRSW algorithms, MRMW algorithms)
5) Lecture: Algorithm for the construction of atomic registers (2) (MRMW algorithms, wait-free timestamping systems)
6) Lecture: The power of synchronisation primitives (consensus numbers)
7) Lecture: Spin locks and contentions
8) Lecture: Linearisable data structure (fine-grain locking, lock-free synchronisation)
9) Lecture: Algorithms for concurrent data structures (optimistic synchronisation, lazy synchronisation, elimination and the ABA problem, linked list, queue and stack)
10) Exercise session
11) Lab session: CSP/FDR2 tutorial
12) Lab session: Verification of low-synchronisation algorithms using CSP/FDR2

Textbooks

`The Art of Multiprocessor Programming' by Maurice Herlihy and Nir Shavit, 2008
Other material: Slides and course notes

Lecturer:Xu Wang

Dr. Xu Wang is currently a research fellow at the International Institute for Software Technology, United Nations University. His research interests lie in the broad field of semantic and verification of concurrent systems, where he specialises in developing automatic verification methods and state space reduction techniques for CSP/FDR. Before joining UNU-IIST as an assistant research fellow in 2006, he had being working as research staff at HKUST, Oxford University and Birmingham University since 2000. He holds PhD, M.Eng. and B.Sc. degrees in Computer Science from Chinese Academy of Sciences and Xi'an Jiaotong University. He would like to work with motivated students with interests in applying formal methods to solve practical problems in concurrent systems.


All enquires should be addressed to Chang Su chongqingschool@iist.unu.edu