This patchset improves the performance of accounted kernel memory allocations by ~30%, as measured by a micro-benchmark [1]. The benchmark is very straightforward: 1M kmalloc() allocations of 64 bytes each.
Below are the results with kernel memory accounting disabled, with the original code, and with this patchset applied.
|             | Kmem disabled | Original | Patched | Delta  |
|-------------+---------------+----------+---------+--------|
| User cgroup |         29764 |    84548 |   59078 | -30.0% |
| Root cgroup |         29742 |    48342 |   31501 | -34.8% |
As we can see, the patchset removes the majority of the overhead when there is no actual accounting (a task belongs to the root memory cgroup) and almost halves the accounting overhead otherwise.
The main idea is to get rid of unnecessary memcg to objcg conversions and to switch to a scope-based protection of objcgs, which eliminates extra operations on objcg reference counters under the RCU read lock. More details are provided in the individual commit descriptions.
Roman Gushchin (7):
  mm: kmem: optimize get_obj_cgroup_from_current()
  mm: kmem: add direct objcg pointer to task_struct
  mm: kmem: make memcg keep a reference to the original objcg
  mm: kmem: scoped objcg protection
  percpu: scoped objcg protection
  mm: kmem: reimplement get_obj_cgroup_from_current()
  mm: kmem: properly initialize local objcg variable in current_obj_cgroup()
 include/linux/memcontrol.h |  28 +++++-
 include/linux/sched.h      |   4 +
 include/linux/sched/mm.h   |   4 +
 mm/memcontrol.c            | 187 +++++++++++++++++++++++++++++++------
 mm/percpu.c                |   8 +-
 mm/slab.h                  |  15 +--
 6 files changed, 204 insertions(+), 42 deletions(-)