[PATCH openEuler-1.0-LTS] memcg: fix a UAF problem in drain_all_stock()

6 Sep 2023

hulk inclusion
category: bugfix
bugzilla: 189183, https://gitee.com/openeuler/kernel/issues/I7Z1ZU
CVE: NA

----------------------------------------

The following panic on a Red Hat 7.5 kernel was reported by UVP:

CPU: 28 PID: 56610 Comm: kworker/u160:6 Kdump: loaded Tainted: G           OE K----V-------   3.10.0-862.14.1.6_152.x86_64 #1
Hardware name: ZTE R5300 G4/R5300G4, BIOS 03.20.0200_8717837 06/07/2021
Workqueue: events_aync_free recharge_parent
task: ffff97fc84cc0fe0 ti: ffff97d3840c8000 task.ti: ffff97d3840c8000
RIP: 0010:[<ffffffffa0d1ea7d>]  [<ffffffffa0d1ea7d>] cgroup_is_descendant+0x1d/0x40
RSP: 0018:ffff97d3840cbd10  EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffffffa1943ba0 RCX: 0000000000000007
RDX: ffff97fd12043800 RSI: ffff982d3faa5c00 RDI: 3930343331356364
RBP: ffff97d3840cbd10 R08: ffffffffa1943ba0 R09: 0000000000000007
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000016100
R13: ffff97fe809d6100 R14: 0000000000000007 R15: 0000000000000007
FS:  0000000000000000(0000) GS:ffff982e7be00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005633394b1c10 CR3: 00000059d900e000 CR4: 00000000003627e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 [<ffffffffa0e1840c>] __mem_cgroup_same_or_subtree+0x2c/0x40
 [<ffffffffc0937291>] drain_all_stock+0xc1/0xe30 [klp_HP0038]
 [<ffffffffa0e185f1>] mem_cgroup_reparent_charges+0x51/0x3b0
 [<ffffffffa0cce718>] ? finish_task_switch+0xf8/0x170
 [<ffffffffa0e18b14>] recharge_parent+0x54/0x80
 [<ffffffffa0cb7a32>] process_one_work+0x182/0x450
 [<ffffffffa0cb8996>] worker_thread+0x126/0x3c0
 [<ffffffffa0cb8870>] ? manage_workers.isra.24+0x2a0/0x2a0
 [<ffffffffa0cbfab1>] kthread+0xd1/0xe0
 [<ffffffffa0cbf9e0>] ? insert_kthread_work+0x40/0x40
 [<ffffffffa133b5dd>] ret_from_fork_nospec_begin+0x7/0x21
 [<ffffffffa0cbf9e0>] ? insert_kthread_work+0x40/0x40

When stock->nr_pages has dropped to 0, drain_all_stock() skips
drain_local_stock() for that CPU, so a memcg that is about to be freed
stays referenced by stock->cached after it has been freed. A later
drain_all_stock() can then dereference the stale pointer, which is a
use-after-free (UAF).

The same problem exists on 4.19 as well, as confirmed by successful
reproduction. It causes a panic with a similar call trace, and is
triggered as follows:

stock->cached = mB
CPU2                            CPU3                         CPU4
consume_stock
 local_irq_save
 stock->nr_pages -= xxx -> 0
                                drain_all_stock
                                 rcu_read_lock()
                                 memcg = cpu2's stock->cached
                                 cpu2's stock->nr_pages == 0
                                 rcu_read_unlock()
                                 (skip)

====================================== (mB freed) =======================================

                                                             drain_all_stock(mD)
                                                              rcu_read_lock()
                                                              memcg = cpu2's stock->cached
                                                              (interrupted)
refill_stock(mC)
 local_irq_save
 drain_stock(mB)
 stock->cached = mC
 stock->nr_pages += xxx (> 0)
                                                              stock->nr_pages > 0
                                                              mem_cgroup_is_descendant(memcg, root_memcg) [UAF]
                                                              rcu_read_unlock()

Fix this problem by removing `stock->nr_pages` from the preconditions of
`flush = true` in drain_all_stock(), so that the stock is drained, and
stock->cached is cleared, even if its nr_pages is 0.

Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
---
 mm/memcontrol.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
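
Note (not for the commit log): the change to the flush condition described
above can be illustrated with a small userspace model. This is only a
sketch under stated assumptions: struct memcg_model, struct stock_model,
model_is_descendant() and empty_stock_is_flushed() are hypothetical
stand-ins invented for illustration, not kernel definitions, and the
ancestry walk is a simplification of mem_cgroup_is_descendant(). It models
only the flush decision, not the RCU or IRQ interleaving.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's struct definitions. */
struct memcg_model {
	struct memcg_model *parent;	/* NULL for the top of the hierarchy */
};

struct stock_model {
	struct memcg_model *cached;	/* models stock->cached */
	unsigned int nr_pages;		/* models stock->nr_pages */
};

/* Simplified ancestry walk standing in for mem_cgroup_is_descendant(). */
static bool model_is_descendant(const struct memcg_model *memcg,
				const struct memcg_model *root)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root)
			return true;
	return false;
}

/* Flush condition before the patch: an empty stock is never drained. */
static bool should_flush_old(const struct stock_model *stock,
			     const struct memcg_model *root)
{
	return stock->cached && stock->nr_pages &&
	       model_is_descendant(stock->cached, root);
}

/* Flush condition after the patch: nr_pages no longer gates the drain. */
static bool should_flush_new(const struct stock_model *stock,
			     const struct memcg_model *root)
{
	return stock->cached && model_is_descendant(stock->cached, root);
}

/*
 * Scenario from the commit message: the stock caches a child memcg, but
 * its page count has already been consumed down to 0.
 */
static bool empty_stock_is_flushed(bool patched)
{
	struct memcg_model root = { .parent = NULL };
	struct memcg_model child = { .parent = &root };
	struct stock_model stock = { .cached = &child, .nr_pages = 0 };

	assert(model_is_descendant(&child, &root));
	return patched ? should_flush_new(&stock, &root)
		       : should_flush_old(&stock, &root);
}
```

In this model, empty_stock_is_flushed(false) reports that the old
condition leaves the empty stock alone, with its cached pointer still set,
while empty_stock_is_flushed(true) reports that the patched condition
selects it for draining.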

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1b11bc13e1aa..032bb52cd2ed 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2226,8 +2226,7 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 
 		rcu_read_lock();
 		memcg = stock->cached;
-		if (memcg && stock->nr_pages &&
-		    mem_cgroup_is_descendant(memcg, root_memcg))
+		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			flush = true;
 		rcu_read_unlock();
 
-- 
2.25.1