From: Michal Koutný mkoutny@suse.com
commit a6f23d14ec7d7d02220ad8bb2774be3322b9aeec upstream.
When workload runs in cgroups that aren't directly below root cgroup and their parent specifies reclaim protection, it may end up ineffective.
The reason is that propagate_protected_usage() is not called in all hierarchy up. All the protected usage is incorrectly accumulated in the workload's parent. This means that siblings_low_usage is overestimated and effective protection underestimated. Even though it is transitional phenomenon (uncharge path does correct propagation and fixes the wrong children_low_usage), it can undermine the intended protection unexpectedly.
We have noticed this problem while seeing a swap out in a descendant of a protected memcg (intermediate node) while the parent was conveniently under its protection limit and the memory pressure was external to that hierarchy. Michal has pinpointed this down to the wrong siblings_low_usage which led to the unwanted reclaim.
The fix is simply updating children_low_usage in respective ancestors also in the charging path.
Fixes: 230671533d64 ("mm: memory.low hierarchical behavior") Signed-off-by: Michal Koutný mkoutny@suse.com Signed-off-by: Michal Hocko mhocko@suse.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Michal Hocko mhocko@suse.com Acked-by: Roman Gushchin guro@fb.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Tejun Heo tj@kernel.org Cc: stable@vger.kernel.org [4.18+] Link: http://lkml.kernel.org/r/20200803153231.15477-1-mhocko@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_counter.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/page_counter.c b/mm/page_counter.c index de31470655f66..147ff99187b81 100644 --- a/mm/page_counter.c +++ b/mm/page_counter.c @@ -77,7 +77,7 @@ void page_counter_charge(struct page_counter *counter, unsigned long nr_pages) long new;
new = atomic_long_add_return(nr_pages, &c->usage); - propagate_protected_usage(counter, new); + propagate_protected_usage(c, new); /* * This is indeed racy, but we can live with some * inaccuracy in the watermark. @@ -121,7 +121,7 @@ bool page_counter_try_charge(struct page_counter *counter, new = atomic_long_add_return(nr_pages, &c->usage); if (new > c->max) { atomic_long_sub(nr_pages, &c->usage); - propagate_protected_usage(counter, new); + propagate_protected_usage(c, new); /* * This is racy, but we can live with some * inaccuracy in the failcnt. @@ -130,7 +130,7 @@ bool page_counter_try_charge(struct page_counter *counter, *fail = c; goto failed; } - propagate_protected_usage(counter, new); + propagate_protected_usage(c, new); /* * Just like with failcnt, we can live with some * inaccuracy in the watermark.