Fix the following issue:
CPU1                                CPU2                              CPU3
T1 sets cfs_quota
starts hrtimer cfs_bandwidth
'period_timer'
T1 is migrated to CPU3
                                    T2(worker thread) initiates
                                    offlining of CPU1
Hotplug operation starts
...
'period_timer' expires and is
re-enqueued on CPU1
...
take_cpu_down()
CPU1 shuts down and does not
handle timers anymore. They have
to be migrated in the post dead
hotplug steps by the control task.

                                    T2(worker thread) runs the
                                    post dead offline operation
                                                                      T1 holds lockA
                                                                      T1 is scheduled out
                                                                      //throttled by CFS bandwidth control
                                                                      T1 waits for 'period_timer' to expire
                                    T2(worker thread) waits for lockA
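T1 waits there forever if it is scheduled out before T2 can execute the hrtimer offline callback hrtimers_dead_cpu(): 'period_timer' stays queued on the dead CPU1 and never fires, so T1 is never unthrottled and never releases lockA. Thus T2 waits for lockA forever.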
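
The ordering problem above can be sketched as a small user-space toy model. This is not kernel code: struct model, run_offline() and the migrate_early flag are hypothetical names used only for illustration, and the "early" case is merely a reading of the patch title (timers are pushed away before the outgoing CPU stops handling them).

```c
#include <stdbool.h>

/* Toy model of the scenario; every name here is hypothetical. */
struct model {
	bool timer_on_dead_cpu;	/* 'period_timer' still queued on CPU1 */
	bool t1_throttled;	/* T1 holds lockA and is throttled */
};

/* Returns true if the offline sequence completes, false on deadlock. */
static bool run_offline(bool migrate_early)
{
	struct model m = { .timer_on_dead_cpu = true, .t1_throttled = true };

	/* CPU1 goes down; with the early variant its timers move now. */
	if (migrate_early)
		m.timer_on_dead_cpu = false;

	/* A timer on a live CPU can fire, unthrottle T1, and T1 drops lockA. */
	if (!m.timer_on_dead_cpu)
		m.t1_throttled = false;

	/* T2's post dead operation needs lockA, which throttled T1 holds. */
	if (m.t1_throttled)
		return false;	/* T2 blocks; late timer migration is never reached */

	/* Late migration point (post dead step), reached only without deadlock. */
	m.timer_on_dead_cpu = false;
	return true;
}
```

In this model run_offline(false) reports the deadlock described above, while run_offline(true) completes, because the timer is already off CPU1 by the time T2 needs lockA.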

Thomas Gleixner (1):
  hrtimers: Push pending hrtimers away from outgoing CPU earlier

Yu Liao (1):
  cpu/hotplug: fix kabi breakage in enum cpuhp_state

 include/linux/hrtimer.h |  4 ++--
 include/linux/smp.h     |  1 +
 kernel/cpu.c            | 17 +++++++++++++++--
 kernel/smp.c            |  8 ++++++++
 kernel/time/hrtimer.c   | 33 ++++++++++++---------------------
 5 files changed, 38 insertions(+), 25 deletions(-)