From: Yuanzheng Song songyuanzheng@huawei.com
mainline inclusion from mainline-v5.16-rc1 commit 7e6ec49c18988f1b8dab0677271dafde5f8d9a43 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6AW65 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
When reading memcg->socket_pressure in mem_cgroup_under_socket_pressure() and writing memcg->socket_pressure in vmpressure() at the same time, the following data-race occurs:
BUG: KCSAN: data-race in __sk_mem_reduce_allocated / vmpressure
write to 0xffff8881286f4938 of 8 bytes by task 24550 on cpu 3: vmpressure+0x218/0x230 mm/vmpressure.c:307 shrink_node_memcgs+0x2b9/0x410 mm/vmscan.c:2658 shrink_node+0x9d2/0x11d0 mm/vmscan.c:2769 shrink_zones+0x29f/0x470 mm/vmscan.c:2972 do_try_to_free_pages+0x193/0x6e0 mm/vmscan.c:3027 try_to_free_mem_cgroup_pages+0x1c0/0x3f0 mm/vmscan.c:3345 reclaim_high mm/memcontrol.c:2440 [inline] mem_cgroup_handle_over_high+0x18b/0x4d0 mm/memcontrol.c:2624 tracehook_notify_resume include/linux/tracehook.h:197 [inline] exit_to_user_mode_loop kernel/entry/common.c:164 [inline] exit_to_user_mode_prepare+0x110/0x170 kernel/entry/common.c:191 syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266 ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:289
read to 0xffff8881286f4938 of 8 bytes by interrupt on cpu 1: mem_cgroup_under_socket_pressure include/linux/memcontrol.h:1483 [inline] sk_under_memory_pressure include/net/sock.h:1314 [inline] __sk_mem_reduce_allocated+0x1d2/0x270 net/core/sock.c:2696 __sk_mem_reclaim+0x44/0x50 net/core/sock.c:2711 sk_mem_reclaim include/net/sock.h:1490 [inline] ...... net_rx_action+0x17a/0x480 net/core/dev.c:6864 __do_softirq+0x12c/0x2af kernel/softirq.c:298 run_ksoftirqd+0x13/0x20 kernel/softirq.c:653 smpboot_thread_fn+0x33f/0x510 kernel/smpboot.c:165 kthread+0x1fc/0x220 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Fix it by using READ_ONCE() and WRITE_ONCE() to read and write memcg->socket_pressure.
Link: https://lkml.kernel.org/r/20211025082843.671690-1-songyuanzheng@huawei.com Signed-off-by: Yuanzheng Song songyuanzheng@huawei.com Reviewed-by: Muchun Song songmuchun@bytedance.com Cc: Shakeel Butt shakeelb@google.com Cc: Roman Gushchin guro@fb.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@suse.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Alex Shi alexs@kernel.org Cc: Wei Yang richard.weiyang@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Cai Xinchen caixinchen1@huawei.com Reviewed-by: Wang Weiyang wangweiyang2@huawei.com Signed-off-by: Jialin Zhang zhangjialin11@huawei.com --- include/linux/memcontrol.h | 2 +- mm/vmpressure.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index abb2476bd282..89e4f01c5710 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1921,7 +1921,7 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg) if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure) return true; do { - if (time_before(jiffies, memcg->socket_pressure)) + if (time_before(jiffies, READ_ONCE(memcg->socket_pressure))) return true; } while ((memcg = parent_mem_cgroup(memcg))); return false; diff --git a/mm/vmpressure.c b/mm/vmpressure.c index d69019fc3789..c6a7eca666f6 100644 --- a/mm/vmpressure.c +++ b/mm/vmpressure.c @@ -304,7 +304,7 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree, * asserted for a second in which subsequent * pressure events can occur. */ - memcg->socket_pressure = jiffies + HZ; + WRITE_ONCE(memcg->socket_pressure, jiffies + HZ); } } }