From: Bob Pearson rpearsonhpe@gmail.com
stable inclusion from stable-v6.6.33 commit 21b4c6d4d89030fd4657a8e7c8110fd941049794 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IA72Y8 CVE: CVE-2024-38544
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
---------------------------
[ Upstream commit 2b23b6097303ed0ba5f4bc036a1c07b6027af5c6 ]
In rxe_comp_queue_pkt() an incoming response packet skb is enqueued to the resp_pkts queue and then a decision is made whether to run the completer task inline or schedule it. Finally the skb is dereferenced to bump a 'hw' performance counter. This is wrong because if the completer task is already running in a separate thread it may have already processed the skb and freed it which can cause a seg fault. This has been observed infrequently in testing at high scale.
This patch fixes this by changing the order of enqueuing the packet until after the counter is accessed.
Link: https://lore.kernel.org/r/20240329145513.35381-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson rpearsonhpe@gmail.com Fixes: 0b1e5b99a48b ("IB/rxe: Add port protocol stats") Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Liu Jian liujian56@huawei.com --- drivers/infiniband/sw/rxe/rxe_comp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c index d0bdc2d8adc8..acd2172bf092 100644 --- a/drivers/infiniband/sw/rxe/rxe_comp.c +++ b/drivers/infiniband/sw/rxe/rxe_comp.c @@ -131,12 +131,12 @@ void rxe_comp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb) { int must_sched;
- skb_queue_tail(&qp->resp_pkts, skb); - - must_sched = skb_queue_len(&qp->resp_pkts) > 1; + must_sched = skb_queue_len(&qp->resp_pkts) > 0; if (must_sched != 0) rxe_counter_inc(SKB_TO_PKT(skb)->rxe, RXE_CNT_COMPLETER_SCHED);
+ skb_queue_tail(&qp->resp_pkts, skb); + if (must_sched) rxe_sched_task(&qp->comp.task); else