From: Yunsheng Lin linyunsheng@huawei.com
stable inclusion from stable-5.10.50 commit e7c3ae47978f97f528d95b0c86de51896e78d9f0 bugzilla: 174522 https://gitee.com/openeuler/kernel/issues/I4DNFY
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
[ Upstream commit 89837eb4b2463c556a123437f242d6c2bc62ce81 ]
The spin_trylock() was assumed to contain the implicit barrier needed to ensure the correct ordering between STATE_MISSED setting/clearing and STATE_MISSED checking in commit a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc").
But it turns out that spin_trylock() only has load-acquire semantic, for strongly-ordered system(like x86), the compiler barrier implicitly contained in spin_trylock() seems enough to ensure the correct ordering. But for weakly-orderly system (like arm64), the store-release semantic is needed to ensure the correct ordering as clear_bit() and test_bit() is store operation, see queued_spin_lock().
So add the explicit barrier to ensure the correct ordering for the above case.
Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc") Signed-off-by: Yunsheng Lin linyunsheng@huawei.com Acked-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Chen Jun chenjun102@huawei.com Acked-by: Weilong Chen chenweilong@huawei.com Signed-off-by: Chen Jun chenjun102@huawei.com --- include/net/sch_generic.h | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 4dd2c9e34976..f8631ad3c868 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -163,6 +163,12 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc) if (spin_trylock(&qdisc->seqlock)) goto nolock_empty;
+ /* Paired with smp_mb__after_atomic() to make sure + * STATE_MISSED checking is synchronized with clearing + * in pfifo_fast_dequeue(). + */ + smp_mb__before_atomic(); + /* If the MISSED flag is set, it means other thread has * set the MISSED flag before second spin_trylock(), so * we can return false here to avoid multi cpus doing @@ -180,6 +186,12 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc) */ set_bit(__QDISC_STATE_MISSED, &qdisc->state);
+ /* spin_trylock() only has load-acquire semantic, so use + * smp_mb__after_atomic() to ensure STATE_MISSED is set + * before doing the second spin_trylock(). + */ + smp_mb__after_atomic(); + /* Retry again in case other CPU may not see the new flag * after it releases the lock at the end of qdisc_run_end(). */