hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IA6J1H
CVE: NA
----------------------------------------
A kernel panic occurs with low probability when testing with smart_grid.
The log is shown below:
[65160.746953] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[65160.746990] Unable to handle kernel NULL pointer dereference at virtual address 000000000000000
[65160.756974] Mem abort info:
[65160.766849] Mem abort info:
[65160.770660]   ESR = 0x96000004
[65160.770663]   EC = 0x25: DABT (current EL), IL = 32 bits
[65160.774374]   ESR = 0x96000004
[65160.774377]   EC = 0x25: DABT (current EL), IL = 32 bits
[65160.778428]   SET = 0, FnV = 0
[65160.778430]   EA = 0, S1PTW = 0
[65160.784728]   SET = 0, FnV = 0
[65160.784731]   EA = 0, S1PTW = 0
[65160.786018] Detected VIPT I-cache on CPU104
[65160.786070] GICv3: CPU104: found redistributor 3a0000 region 104: 0x00002000aa300000
[65160.786240] CPU104: Booted secondary processor 0x00003a0000 [0x481fd010]
[65160.788696] Data abort info:
[65160.788699]   ISV = 0, ISS = 0x00000004
[65160.794911] Data abort info:
[65160.794913]   ISV = 0, ISS = 0x00000004
[65160.798965]   CM = 0, WnR = 0
[65160.798967] user pgtable: 4k pages, 48-bit VAs, pgdp=00000030059b2000
[65160.803102]   CM = 0, WnR = 0
[65160.803104] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020ab07c000
[65160.807066] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
[65160.811113] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
[65160.816199] Internal error: Oops: 0000000096000004 [#1] SMP
[65160.832723] Modules linked in:
[65161.006841] CPU: 39 PID: 195931 Comm: (hrottler) Kdump: loaded Not tainted 5.10.0 #51
[65161.016660] Hardware name: Huawei TaiShan 200 (Model 5280)/BC82AMDD, BIOS 1.08 12/14/2019
[65161.026146] pstate: 80400089 (Nzcv daIf +PAN -UAO -TCO BTYPE=--)
[65161.033199] pc : set_task_select_cpus+0x8c/0x3d0
[65161.038865] lr : select_task_rq_fair+0x1c8/0x5cc
[65161.044528] sp : ffff800172673ba0
[65161.048888] x29: ffff800172673ba0 x28: ffff00303f22cd00
[65161.055237] x27: 0000000000000027 x26: 0000000000000027
[65161.061583] x25: 0000aaaafd32b180 x24: 0000000000000002
[65161.067925] x23: ffff00303f22cd00 x22: 0000000000000000
[65161.074264] x21: ffff800172673cc4 x20: ffff00303f22cd00
[65161.080603] x19: ffff00303f22d7f4 x18: 0000000000000000
[65161.086938] x17: 0000000000000000 x16: 0000000000000000
[65161.093266] x15: 0000aaaafd365130 x14: 0000000000000000
[65161.099584] x13: 0000000000000000 x12: 0000000000000000
[65161.105892] x11: 0000000000000000 x10: 0000000000000000
[65161.112194] x9 : ffff800010129e80 x8 : 0000000000000000
[65161.118489] x7 : ffff00303f22cd00 x6 : 0000000000000001
[65161.124776] x5 : 0000000000000000 x4 : ffff8000118f5008
[65161.131058] x3 : 0000000000000000 x2 : 0000000000000002
[65161.137331] x1 : ffff800172673cc4 x0 : 0000000000000000
[65161.143598] Call trace:
[65161.147005]  set_task_select_cpus+0x8c/0x3d0
[65161.152225]  select_task_rq_fair+0x1c8/0x5cc
[65161.157439]  sched_exec+0x94/0x1bc
[65161.161782]  bprm_execve.part.0+0x60/0x164
[65161.166813]  bprm_execve+0x78/0xc0
[65161.171143]  do_execveat_common+0x1c4/0x250
[65161.176244]  __arm64_sys_execve+0x48/0x70
[65161.181167]  invoke_syscall+0x50/0x130
[65161.185824]  el0_svc_common.constprop.0+0x158/0x180
[65161.191601]  do_el0_svc+0x34/0xe0
[65161.195816]  el0_svc+0x20/0x30
[65161.199773]  el0_sync_handler+0xb8/0xc0
[65161.204502]  el0_sync+0x1e8/0x200
[65161.208712] Code: d50323bf d65f03c0 f941ac00 f941c400 (f9400400)
The panic is triggered by dereferencing task_group(current)->auto_affinity
while it is NULL.
This can happen in the following scenario:

	CPU0				CPU1
	rmdir cgroup
	  free auto_affinity
					try to wake up
					  select_task_rq_fair
					    auto_affinity(NULL) dereference
					      panic
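Nothing serializes removing a task from a cgroup against a concurrent
task wakeup, so auto_affinity may already have been freed and be NULL
when the wakeup path reads it. Check for a NULL auto_affinity in
task_prefer_cpus() and fall back to p->prefer_cpus in that case.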
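
The fallback can be illustrated with a minimal userspace sketch. It is not the kernel code: the struct layouts, the smart_grid_enabled flag, the dynamic_cpus field, and the task_prefer_cpus_sketch() name are hypothetical stand-ins, and only the NULL check mirrors this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct cpumask { unsigned long bits; };

struct auto_affinity { int mode; };

struct task_group {
	/* Assumed to be NULL once the cgroup has been torn down. */
	struct auto_affinity *auto_affinity;
};

struct task {
	struct task_group *tg;
	struct cpumask *prefer_cpus;	/* static preference, always valid */
	struct cpumask *dynamic_cpus;	/* hypothetical smart_grid result */
};

/* Hypothetical replacement for smart_grid_used(). */
static int smart_grid_enabled = 1;

static struct cpumask *task_prefer_cpus_sketch(struct task *p)
{
	assert(p != NULL && p->tg != NULL);

	/*
	 * Same shape as the patched check: use the static mask when the
	 * feature is off or auto_affinity is already gone.
	 */
	if (!smart_grid_enabled || !p->tg->auto_affinity)
		return p->prefer_cpus;

	/* What the real function does past this point is not modelled. */
	return p->dynamic_cpus;
}
```

With auto_affinity set to NULL the sketch returns prefer_cpus instead of touching the freed structure, which is the behaviour the patch relies on.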
Fixes: 90ef693102cc ("sched: Fix possible deadlock in tg_set_dynamic_affinity_mode")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 273f6844bc2a..a6145cc1426d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5988,7 +5988,8 @@ static void smart_grid_usage_dec(void)

 static inline struct cpumask *task_prefer_cpus(struct task_struct *p)
 {
-	if (!smart_grid_used())
+	if (!smart_grid_used() ||
+	    !task_group(p)->auto_affinity)
 		return p->prefer_cpus;

 	if (task_group(p)->auto_affinity->mode == 0)