From: Thomas Gleixner <tglx@linutronix.de>
mainline inclusion
from mainline-5.11-rc1
commit 372acbbaa80940189593f9d69c7c069955f24f7a
category: feature
feature: Deep isolation
bugzilla: https://gitee.com/openeuler/kernel/issues/I4N00D
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
No point in doing calculations.
    tick_next_period = last_jiffies_update + tick_period
Just check whether now is before tick_next_period to figure out whether jiffies need an update.
Add a comment why the intentional data race in the quick check is safe or not so safe in a 32bit corner case and why we don't worry about it.
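To make the equivalence concrete, here is a minimal user-space sketch of the invariant the patch relies on. It is not the tick-sched code itself: the 4 ms tick value is an assumed example (HZ=250), and plain int64_t variables stand in for the kernel's ktime_t machinery.

#include <stdint.h>
#include <stdio.h>

/* ktime_t is a signed 64-bit nanosecond count in the kernel */
typedef int64_t ktime_t;

#define TICK_PERIOD 4000000LL		/* assumed 4 ms tick (HZ=250) */

static ktime_t last_jiffies_update;
static ktime_t tick_next_period;	/* last_jiffies_update + TICK_PERIOD */

/* Old quick check: compute a delta on every call */
static int needs_update_old(ktime_t now)
{
	return (now - last_jiffies_update) >= TICK_PERIOD;
}

/* New quick check: compare against the precomputed boundary */
static int needs_update_new(ktime_t now)
{
	return !(now < tick_next_period);
}

int main(void)
{
	last_jiffies_update = 1000000000LL;
	tick_next_period = last_jiffies_update + TICK_PERIOD;

	/* The two checks agree on both sides of the boundary */
	for (ktime_t now = tick_next_period - 2; now <= tick_next_period + 2; now++)
		printf("now=%lld old=%d new=%d\n", (long long)now,
		       needs_update_old(now), needs_update_new(now));
	return 0;
}

Both checks return the same answer for every now; the new form merely avoids the subtraction in the common early-return case.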
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201117132006.337366695@linutronix.de
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: Chao Liu <liuchao173@huawei.com>
Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 kernel/time/tick-sched.c | 46 ++++++++++++++++++++++++++++------------
 1 file changed, 33 insertions(+), 13 deletions(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index e8d351b7f9b0..82a56b8bcd60 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -57,11 +57,29 @@ static void tick_do_update_jiffies64(ktime_t now)
 	ktime_t delta;
 
 	/*
-	 * Do a quick check without holding jiffies_lock:
-	 * The READ_ONCE() pairs with two updates done later in this function.
+	 * Do a quick check without holding jiffies_lock. The READ_ONCE()
+	 * pairs with the update done later in this function.
+	 *
+	 * This is also an intentional data race which is even safe on
+	 * 32bit in theory. If there is a concurrent update then the check
+	 * might give a random answer. It does not matter because if it
+	 * returns then the concurrent update is already taking care, if it
+	 * falls through then it will pointlessly contend on jiffies_lock.
+	 *
+	 * Though there is one nasty case on 32bit due to store tearing of
+	 * the 64bit value. If the first 32bit store makes the quick check
+	 * return on all other CPUs and the writing CPU context gets
+	 * delayed to complete the second store (scheduled out on virt)
+	 * then jiffies can become stale for up to ~2^32 nanoseconds
+	 * without noticing. After that point all CPUs will wait for
+	 * jiffies lock.
+	 *
+	 * OTOH, this is not any different than the situation with NOHZ=off
+	 * where one CPU is responsible for updating jiffies and
+	 * timekeeping. If that CPU goes out for lunch then all other CPUs
+	 * will operate on stale jiffies until it decides to come back.
 	 */
-	delta = ktime_sub(now, READ_ONCE(last_jiffies_update));
-	if (delta < tick_period)
+	if (ktime_before(now, READ_ONCE(tick_next_period)))
 		return;
 
 	/* Reevaluate with jiffies_lock held */
@@ -72,9 +90,8 @@ static void tick_do_update_jiffies64(ktime_t now)
 	if (delta >= tick_period) {
 
 		delta = ktime_sub(delta, tick_period);
-		/* Pairs with the lockless read in this function. */
-		WRITE_ONCE(last_jiffies_update,
-			   ktime_add(last_jiffies_update, tick_period));
+		last_jiffies_update = ktime_add(last_jiffies_update,
+						tick_period);
 
 		/* Slow path for long timeouts */
 		if (unlikely(delta >= tick_period)) {
@@ -82,15 +99,18 @@ static void tick_do_update_jiffies64(ktime_t now)
 
 			ticks = ktime_divns(delta, incr);
 
-			/* Pairs with the lockless read in this function. */
-			WRITE_ONCE(last_jiffies_update,
-				   ktime_add_ns(last_jiffies_update,
-						incr * ticks));
+			last_jiffies_update = ktime_add_ns(last_jiffies_update,
+							   incr * ticks);
 		}
 		do_timer(++ticks);
 
-		/* Keep the tick_next_period variable up to date */
-		tick_next_period = ktime_add(last_jiffies_update, tick_period);
+		/*
+		 * Keep the tick_next_period variable up to date.
+		 * WRITE_ONCE() pairs with the READ_ONCE() in the lockless
+		 * quick check above.
+		 */
+		WRITE_ONCE(tick_next_period,
+			   ktime_add(last_jiffies_update, tick_period));
 	} else {
 		write_seqcount_end(&jiffies_seq);
 		raw_spin_unlock(&jiffies_lock);
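As an aside, the store-tearing corner case described in the new comment can be reproduced in user space. The sketch below is not kernel code: volatile stands in for READ_ONCE()/WRITE_ONCE(), which on a 32-bit target still compile a 64-bit access into two 32-bit accesses. On a 64-bit build the store is typically a single instruction and no tearing should be observed; build with e.g. gcc -m32 -pthread to see torn values (assuming an x86 toolchain with 32-bit support).

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* A plain 64-bit variable, playing the role of tick_next_period */
static volatile uint64_t shared;

static void *writer(void *arg)
{
	/* Alternate between two patterns whose 32-bit halves differ */
	for (;;) {
		shared = 0x0000000000000000ULL;
		shared = 0xffffffffffffffffULL;
	}
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, writer, NULL);

	for (unsigned long i = 0; i < 100000000UL; i++) {
		uint64_t v = shared;

		/* Halves disagree: we read between the two 32-bit stores */
		if ((uint32_t)(v >> 32) != (uint32_t)v) {
			printf("torn read: %016llx\n", (unsigned long long)v);
			return 1;
		}
	}
	printf("no torn read observed (64-bit build?)\n");
	return 0;
}

A reader that lands between the two 32-bit stores sees a value that is neither the old nor the new tick_next_period, which is exactly the window in which jiffies can look up to ~2^32 ns (about 4.3 s) stale before all CPUs fall back to contending on jiffies_lock.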