Backport two bugfixes for the memcg_swap_qos feature.
Liu Shixin (2):
  memcg: fix incorrect value of sysctl_memcg_swap_qos_stat
  mm/swapfile: fix infinite loop in get_swap_pages after set memory.swapfile

 mm/memcontrol.c | 28 +++++++++++++++++++---------
 mm/swapfile.c   |  8 +++-----
 2 files changed, 22 insertions(+), 14 deletions(-)
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9QRYN
CVE: NA
--------------------------------
The function sysctl_memcg_swap_qos_handler() does not handle the error case correctly and ignores concurrency. To fix it, do two things:

1. Reset sysctl_memcg_swap_qos_stat to its old value in the error case.
2. Add a mutex to serialize the handler.
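The two points above can be sketched in userspace. This is only a model under stated assumptions, not the kernel code: a pthread mutex stands in for the kernel mutex, the `requested` argument stands in for the value that proc_dointvec_minmax() has already stored, and the names MODEL_STAT_*, model_qos_stat and model_set_qos_stat() are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the MEMCG_SWAP_STAT_* states. */
enum { MODEL_STAT_DISABLE = 0, MODEL_STAT_ALL = 1, MODEL_STAT_OTHER = 2 };

static int model_qos_stat = MODEL_STAT_DISABLE;
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

/* Model of the handler's write path. */
static int model_set_qos_stat(int requested)
{
	int old, ret = 0;

	/* Point 2: the snapshot and the update happen under one lock. */
	pthread_mutex_lock(&model_lock);
	old = model_qos_stat;

	/* The parser has already overwritten the variable at this point. */
	model_qos_stat = requested;

	if (old == requested)
		goto unlock;
	if (requested == MODEL_STAT_DISABLE)
		goto unlock;		/* disabling is always allowed */

	if (old != MODEL_STAT_DISABLE) {
		/* Point 1: roll back so the visible value matches reality. */
		model_qos_stat = old;
		ret = -EINVAL;
	}

unlock:
	pthread_mutex_unlock(&model_lock);
	return ret;
}
```

Without the rollback, a rejected switch between two enabled states would leave the variable reporting a state that was never applied.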
Fixes: e147c1c34af1 ("memcg/swap: add ability to disable memcg swap")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
---
 mm/memcontrol.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ff64d2c36749..fff8b9322521 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4415,21 +4415,21 @@ static int sysctl_memcg_swap_qos_handler(struct ctl_table *table, int write,
 		void __user *buffer, size_t *length, loff_t *ppos)
 {
 	int ret;
-	int qos_stat_old = sysctl_memcg_swap_qos_stat;
+	int qos_stat_old;
 	int swap_type;
+	static DEFINE_MUTEX(sysctl_mutex);
 
+	mutex_lock(&sysctl_mutex);
+	qos_stat_old = sysctl_memcg_swap_qos_stat;
 	ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
-	if (ret)
-		return ret;
-
-	if (write) {
+	if (write && !ret) {
 		if (qos_stat_old == sysctl_memcg_swap_qos_stat)
-			return 0;
+			goto unlock;
 
 		switch (sysctl_memcg_swap_qos_stat) {
 		case MEMCG_SWAP_STAT_DISABLE:
 			static_branch_disable(&memcg_swap_qos_key);
-			return 0;
+			goto unlock;
 		case MEMCG_SWAP_STAT_ALL:
 			swap_type = SWAP_TYPE_ALL;
 			break;
@@ -4438,16 +4438,26 @@ static int sysctl_memcg_swap_qos_handler(struct ctl_table *table, int write,
 			break;
 		}
 
+		/*
+		 * Enable the feature when it is in disabled state.
+		 * If it is already in enabled state, don't allowed
+		 * to switch it to other state directly since it is
+		 * dangerous that will impact all memory cgroups.
+		 */
 		if (qos_stat_old == MEMCG_SWAP_STAT_DISABLE) {
 			memcg_swap_qos_reset(swap_type);
 			static_branch_enable(&memcg_swap_qos_key);
 			enable_swap_slots_cache_max();
 		} else {
-			return -EINVAL;
+			sysctl_memcg_swap_qos_stat = qos_stat_old;
+			ret = -EINVAL;
 		}
 	}
 
-	return 0;
+unlock:
+	mutex_unlock(&sysctl_mutex);
+
+	return ret;
 }
 #endif
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I9QRYN
CVE: NA
--------------------------------
By default, get_swap_pages() selects the swap device by priority. If two or more devices share the same priority, plist_requeue() rotates their positions in the avail_lists. When memory.swapfile is set in a memory cgroup and the matched swap device has a lower priority than those equal-priority devices, the loop keeps cycling among the equal-priority devices and never reaches the specified one.

Fix the infinite loop by skipping unmatched swap devices before plist_requeue().
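The rotation described above can be reproduced with a small userspace model. This is a sketch under assumptions, not the kernel implementation: an int array stands in for the plist, the check `devs[cur].type != want` stands in for should_skip_swap_type(), and model_requeue(), model_find_dev() and MAX_STEPS are hypothetical names. MAX_STEPS only exists so that the model terminates where the real loop would not.

```c
#include <stdbool.h>

#define NDEV 3
#define MAX_STEPS 100	/* bound for the model only */

struct model_dev { int type; int prio; };

/* Two equal high-priority devices ahead of one low-priority device. */
static const struct model_dev devs[NDEV] = { {0, 10}, {1, 10}, {2, 5} };

static int pos_of(const int *order, int d)
{
	for (int i = 0; i < NDEV; i++)
		if (order[i] == d)
			return i;
	return -1;
}

/* Move device d behind its same-priority siblings, like plist_requeue(). */
static void model_requeue(int *order, int d)
{
	int i = pos_of(order, d);

	while (i + 1 < NDEV && devs[order[i + 1]].prio == devs[d].prio) {
		order[i] = order[i + 1];
		order[i + 1] = d;
		i++;
	}
}

/*
 * Walk the list as plist_for_each_entry_safe() does: the successor is
 * sampled before the loop body runs. Returns the number of devices
 * visited when the device of type 'want' is selected, or -1 if none is
 * selected within MAX_STEPS or the list is exhausted.
 */
static int model_find_dev(int want, bool skip_before_requeue)
{
	int order[NDEV] = { 0, 1, 2 };
	int cur = order[0];

	for (int steps = 1; steps <= MAX_STEPS; steps++) {
		int p = pos_of(order, cur);
		int next = (p + 1 < NDEV) ? order[p + 1] : -1;
		bool skip = devs[cur].type != want;

		if (skip_before_requeue && skip)
			goto nextsi;	/* patched order: list untouched */

		model_requeue(order, cur);
		if (!skip)
			return steps;
nextsi:
		if (next < 0)
			return -1;
		cur = next;
	}
	return -1;
}
```

With the old order (requeue first, then skip), the two priority-10 devices keep swapping places and the walk never reaches the device of type 2. With the patched order the list is left untouched for skipped devices, so the walk reaches it on the third visit.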
Fixes: c08dff4db9ac ("mm/swapfile: introduce per-memcg swapfile control")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
---
 mm/swapfile.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ddb50283f2f1..744e5c8bd66b 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1171,15 +1171,13 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_order,
 start_over:
 	node = numa_node_id();
 	plist_for_each_entry_safe(si, next, &swap_avail_heads[node], avail_lists[node]) {
+		if (should_skip_swap_type(si->type, type))
+			goto nextsi;
+
 		/* requeue si to after same-priority siblings */
 		plist_requeue(&si->avail_lists[node], &swap_avail_heads[node]);
 		spin_unlock(&swap_avail_lock);
 		spin_lock(&si->lock);
-		if (should_skip_swap_type(si->type, type)) {
-			spin_unlock(&si->lock);
-			spin_lock(&swap_avail_lock);
-			goto nextsi;
-		}
 		if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
 			spin_lock(&swap_avail_lock);
 			if (plist_node_empty(&si->avail_lists[node])) {
Feedback:
The patch(es) which you have sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/7593
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/J...