hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/IA4VMT
--------------------------------
We found an AA deadlock problem as shown belowed:
cgroup_destroy_wq TaskB WatchDog system_wq
... css_killed_work_fn: P(cgroup_mutex) ... ... __lockup_detector_reconfigure: P(cpu_hotplug_lock.read) ... ... percpu_down_write: P(cpu_hotplug_lock.write) ... cgroup_bpf_release: P(cgroup_mutex) smp_call_on_cpu: Wait system_wq
cpuset_css_offline: P(cpu_hotplug_lock.read)
WatchDog is waiting for system_wq, who is waiting for cgroup_mutex, to finish the jobs, but the owner of the cgroup_mutex is waiting for cpu_hotplug_lock. This problem caused by commit 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself") puts cgroup_bpf release work into system_wq. As cgroup_bpf is a member of cgroup, it is reasonable to put cgroup bpf release work into cgroup_destroy_wq, which is only used for cgroup's release work, and the problem is solved.
Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself") Signed-off-by: Chen Ridong chenridong@huawei.com --- kernel/bpf/cgroup.c | 2 +- kernel/cgroup/cgroup-internal.h | 1 + kernel/cgroup/cgroup.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index 41ddba63082d..049beb37f260 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -160,7 +160,7 @@ static void cgroup_bpf_release_fn(struct percpu_ref *ref) struct cgroup *cgrp = container_of(ref, struct cgroup, bpf.refcnt);
INIT_WORK(&cgrp->bpf.release_work, cgroup_bpf_release); - queue_work(system_wq, &cgrp->bpf.release_work); + queue_work(cgroup_destroy_wq, &cgrp->bpf.release_work); }
/* Get underlying bpf_prog of bpf_prog_list entry, regardless if it's through diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index 470f196b7f74..096cee0e111a 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -14,6 +14,7 @@ extern spinlock_t trace_cgroup_path_lock; extern char trace_cgroup_path[TRACE_CGROUP_PATH_LEN]; extern bool cgroup_debug; extern void __init enable_debug_cgroup(void); +extern struct workqueue_struct *cgroup_destroy_wq;
/* * cgroup_path() takes a spin lock. It is good practice not to take diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index c35068730aab..2182fdb413bd 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -117,7 +117,7 @@ DEFINE_PERCPU_RWSEM(cgroup_threadgroup_rwsem); * destruction work items don't end up filling up max_active of system_wq * which may lead to deadlock. */ -static struct workqueue_struct *cgroup_destroy_wq; +struct workqueue_struct *cgroup_destroy_wq;
/* generate an array of cgroup subsystem pointers */ #define SUBSYS(_x) [_x ## _cgrp_id] = &_x ## _cgrp_subsys,