--- tools/probeCgroup:
This patch introduces a new tool, "probeCgroup", that dynamically monitors per-process memory usage within cgroups. It attaches kprobes and kretprobes to the memory cgroup charge and uncharge paths, tracks the pages charged to and uncharged from each process in a cgroup, and reports detailed per-process memory usage statistics.
The key features of the tool include:
1. Dynamic insertion of kprobes and kretprobes at critical points in the cgroup subsystem.
2. Tracking of memory allocation and deallocation events for each process by recording page addresses in a per-process hash table.
3. Real-time statistics on memory usage at the process level, exposed through /proc/cgroup_memory_usage_per_process.
4. Statistics on the memory usage of processes selected as OOM victims, captured at the time of the OOM event.
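The bookkeeping behind feature 2 can be pictured with a small user-space sketch. It is not the module's code: the names proc_pages, track_charge, track_uncharge and usage_kb are hypothetical, locking is omitted, and a 4 KiB page size is assumed (the module likewise multiplies its page count by 4 when printing KB). The bucket count of 1023 mirrors the value passed to HashMap_create() in probeCgroup.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_BUCKETS 1023	/* same bucket count the patch passes to HashMap_create() */
#define PAGE_KB 4	/* assumption: 4 KiB pages */

struct page_node {
	unsigned long addr;
	struct page_node *next;
};

/* Hypothetical per-process record: set of charged page addresses plus a page count. */
struct proc_pages {
	struct page_node *buckets[NR_BUCKETS];
	long nr_pages;
};

/* Return the link that points at the node for addr, or the tail link if absent. */
static struct page_node **find_slot(struct proc_pages *p, unsigned long addr)
{
	struct page_node **slot = &p->buckets[addr % NR_BUCKETS];

	while (*slot && (*slot)->addr != addr)
		slot = &(*slot)->next;
	return slot;
}

/* Charge path: record the page and count it only if it was not tracked yet. */
static bool track_charge(struct proc_pages *p, unsigned long addr, long nr)
{
	struct page_node **slot = find_slot(p, addr);
	struct page_node *node;

	if (*slot)
		return false;	/* already tracked, do not count twice */
	node = malloc(sizeof(*node));
	if (!node)
		return false;
	node->addr = addr;
	node->next = NULL;
	*slot = node;
	p->nr_pages += nr;
	return true;
}

/* Uncharge path: drop the page only if this process is the one tracking it. */
static bool track_uncharge(struct proc_pages *p, unsigned long addr, long nr)
{
	struct page_node **slot = find_slot(p, addr);
	struct page_node *node = *slot;

	if (!node)
		return false;
	*slot = node->next;
	free(node);
	p->nr_pages -= nr;
	return true;
}

/* Usage as reported to the user, in KB. */
static long usage_kb(const struct proc_pages *p)
{
	return p->nr_pages * PAGE_KB;
}
```

Because an uncharge event carries only the page address, the lookup on release is what ties the page back to the process that was charged for it. That is why the address set is kept per process rather than as a bare counter.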
Signed-off-by: Taoxy2004 <221870066@smail.nju.edu.cn>
---
 tools/probeCgroup/Makefile                   |   7 +
 tools/probeCgroup/README.md                  |  29 +
 tools/probeCgroup/probeCgroup.c              | 612 ++++++++++++++++++
 tools/probeCgroup/probeCgroup.h              | 415 ++++++++++++
 tools/probeCgroup/run.sh                     |   8 +
 tools/probeCgroup/scripts/script1.sh         |  10 +
 tools/probeCgroup/scripts/script2.sh         |  14 +
 tools/probeCgroup/scripts/script3.sh         |  11 +
 .../testcases/1_load_unload_test.py          |  24 +
 .../testcases/2_multiple_process_test.py     |  48 ++
 .../testcases/3_multiple_cgroup_test.py      |  55 ++
 tools/probeCgroup/testcases/4_oom_test.py    |  52 ++
 .../testcases/5_multiple_threads_test.py     |  45 ++
 tools/probeCgroup/testcases/cgroup_utils.py  | 115 ++++
 tools/probeCgroup/testcases/mem-allocate.c   |  35 +
 .../testcases/multiple-thread-mem-allocate.c |  60 ++
 tools/probeCgroup/testcases/run.py           |  32 +
 .../testcases/simple-mem-allocate.c          |  27 +
 18 files changed, 1599 insertions(+)
 create mode 100644 tools/probeCgroup/Makefile
 create mode 100644 tools/probeCgroup/README.md
 create mode 100644 tools/probeCgroup/probeCgroup.c
 create mode 100644 tools/probeCgroup/probeCgroup.h
 create mode 100755 tools/probeCgroup/run.sh
 create mode 100755 tools/probeCgroup/scripts/script1.sh
 create mode 100755 tools/probeCgroup/scripts/script2.sh
 create mode 100755 tools/probeCgroup/scripts/script3.sh
 create mode 100755 tools/probeCgroup/testcases/1_load_unload_test.py
 create mode 100755 tools/probeCgroup/testcases/2_multiple_process_test.py
 create mode 100755 tools/probeCgroup/testcases/3_multiple_cgroup_test.py
 create mode 100755 tools/probeCgroup/testcases/4_oom_test.py
 create mode 100755 tools/probeCgroup/testcases/5_multiple_threads_test.py
 create mode 100644 tools/probeCgroup/testcases/cgroup_utils.py
 create mode 100644 tools/probeCgroup/testcases/mem-allocate.c
 create mode 100644 tools/probeCgroup/testcases/multiple-thread-mem-allocate.c
 create mode 100755 tools/probeCgroup/testcases/run.py
 create mode 100644 tools/probeCgroup/testcases/simple-mem-allocate.c
diff --git a/tools/probeCgroup/Makefile b/tools/probeCgroup/Makefile
new file mode 100644
index 000000000000..606c951e5487
--- /dev/null
+++ b/tools/probeCgroup/Makefile
@@ -0,0 +1,7 @@
+obj-m := probeCgroup.o
+CROSS_COMPILE = ''
+KDIR := /lib/modules/$(shell uname -r)/build
+all:
+	make -C $(KDIR) M=$(PWD) modules
+clean:
+	rm -f *.ko *.o *.mod *.mod.o *.mod.c .*.cmd *.symvers module*
diff --git a/tools/probeCgroup/README.md b/tools/probeCgroup/README.md
new file mode 100644
index 000000000000..ff0b6fc21228
--- /dev/null
+++ b/tools/probeCgroup/README.md
@@ -0,0 +1,29 @@
+# probeCgroup
+
+#### Description
+probeCgroup is a process-level cgroup memory monitoring tool based on dynamic tracing (kprobe/kretprobe) technology. By inserting kprobes and kretprobes at the entry and exit points of relevant cgroup functions, this tool can track the memory usage of individual processes within each cgroup in real time.
+
+#### Software Architecture
+1. Dynamic Tracing : Insert kprobes and kretprobes at critical points in cgroup functions to capture memory allocation and release events.
+2. Hash Table Recording : Record the addresses of pages currently used by each process in a hash table, so that when a page is released, the process it belongs to can be identified.
+3. Real-Time Statistics : Provide real-time statistics showing the memory usage of individual processes within each cgroup.
+
+#### Instruction
+1. Compile and Load the Module
+   a. In the 'probeCgroup' directory, run the 'make' command to compile the module.
+   b. Load the module: 'insmod probeCgroup.ko'.
+   c. View memory statistics: 'cat /proc/cgroup_memory_usage_per_process'.
+   If an OOM (Out of Memory) event occurs in a cgroup, you can see "oom:" followed by the process that experienced the OOM and its memory usage at the time.
+
+2. Automate OOM Scenario
+   In the 'probeCgroup' directory, run './run.sh'.
This script will automatically set up an OOM scenario and output the content of '/proc/cgroup_memory_usage_per_process' after execution. + +3. Perform More Tests + a. After compiling the module, in the 'testcases' directory, run './run.py'. + b. This script will perform various tests, including: + - Loading and unloading the module + - Each cgroup containing multiple processes + - Creating multiple cgroups + - OOM scenarios + - Multithreading + c. The tests will take approximately one minute to complete. diff --git a/tools/probeCgroup/probeCgroup.c b/tools/probeCgroup/probeCgroup.c new file mode 100644 index 000000000000..9883cb1e082d --- /dev/null +++ b/tools/probeCgroup/probeCgroup.c @@ -0,0 +1,612 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * probeCgroup.c - A tool used to get memory usage for each process in a cgroup + * + * Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + */ + +#include "probeCgroup.h" + +// kretprobe at mem_cgroup_charge +struct charge_data { + struct cgroup *cgrp; + struct mem_cgroup *memcg; + struct task_struct *task; + unsigned long addr; +}; + +static int mem_cgroup_charge_entry_handler(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + struct charge_data *data; + struct folio *page; + struct mm_struct *mm; + struct mem_cgroup *memcg; + struct cgroup_subsys_state css; + struct cgroup *cgrp; + + if (!current->mm) + return 1; + page = (struct folio *)regs->di; + mm = (struct mm_struct *)regs->si; + if (mm == NULL || page == NULL) + return -1; + memcg = get_mem_cgroup_from_mm(mm); + if (memcg != NULL) { + css = memcg->css; + cgrp = css.cgroup; + + data = (struct charge_data *)ri->data; + data->memcg = memcg; + data->addr = (unsigned long)page; + data->task = current; + data->cgrp = cgrp; + } + return 0; +} + +NOKPROBE_SYMBOL(mem_cgroup_charge_entry_handler); + +static int mem_cgroup_charge_ret_handler(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + unsigned long retval = regs_return_value(regs); + struct 
charge_data *data = (struct charge_data *)ri->data; + int id; + struct cgroup_info *cgrp_info; + struct task_info *tsk_info; + + if (data->memcg != NULL && retval == 0) { + id = ((data->memcg)->css).id; + + spin_lock(&lock); + cgrp_info = find_cgroup_info(id); + if (cgrp_info == NULL) { + cgrp_info = create_cgroup_info(data->cgrp, data->memcg); + if (cgrp_info == NULL) { + spin_unlock(&lock); + return -1; + } + add_cgroup_info(cgrp_info); + } + spin_unlock(&lock); + + read_lock(&cgrp_info->cgrp_lock); + tsk_info = find_task_info(cgrp_info, data->task->tgid); + read_unlock(&cgrp_info->cgrp_lock); + + // for some cases, task->comm changes over time + if (tsk_info != NULL + && strcmp(data->task->comm, tsk_info->comm) != 0) { + strscpy(tsk_info->comm, data->task->comm, + sizeof(tsk_info->comm)); + } + + if (tsk_info == NULL) { + tsk_info = create_task_info(data->task); + if (tsk_info == NULL) + return -1; + add_task_to_cgroup_info(cgrp_info, tsk_info); + } + + if (HashMap_insert(tsk_info->pages, data->addr)) { + //update counter + spin_lock(&(tsk_info->cnt_lock)); + tsk_info->count += + folio_nr_pages((struct folio *)data->addr); + spin_unlock(&(tsk_info->cnt_lock)); + } + } + + return 0; + +} + +NOKPROBE_SYMBOL(mem_cgroup_charge_ret_handler); + +static struct kretprobe mem_cgroup_charge_kretprobe = { + .handler = mem_cgroup_charge_ret_handler, + .entry_handler = mem_cgroup_charge_entry_handler, + .data_size = sizeof(struct charge_data), + .maxactive = 20, +}; + +static int mem_cgroup_charge_kretprobe_init(void) +{ + int ret; + + mem_cgroup_charge_kretprobe.kp.symbol_name = "__mem_cgroup_charge"; + ret = register_kretprobe(&mem_cgroup_charge_kretprobe); + if (ret < 0) { + pr_err("register_kretprobe failed, returned %d\n", ret); + return ret; + } + pr_info("Planted return probe at %s: %p\n", + mem_cgroup_charge_kretprobe.kp.symbol_name, + mem_cgroup_charge_kretprobe.kp.addr); + return 0; +} + +static void mem_cgroup_charge_kretprobe_exit(void) +{ + 
unregister_kretprobe(&mem_cgroup_charge_kretprobe); + pr_info("kretprobe at %p unregistered\n", + mem_cgroup_charge_kretprobe.kp.addr); + + /* nmissed > 0 suggests that maxactive was set too low. */ + pr_info("Missed probing %d instances of %s\n", + mem_cgroup_charge_kretprobe.nmissed, + mem_cgroup_charge_kretprobe.kp.symbol_name); +} + +// kretprobe at uncharge_folio + +struct uncharge_data { + struct cgroup *cgrp; + struct mem_cgroup *memcg; + unsigned long addr; + bool isKmem; + int nr_pages; +}; + +static int uncharge_folio_entry_handler(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + struct uncharge_data *data; + struct folio *page; + struct mem_cgroup *memcg = NULL; + struct cgroup_subsys_state css; + struct cgroup *cgrp; + struct obj_cgroup *objcg; + int nr_pages = 0; + + data = (struct uncharge_data *)ri->data; + page = (struct folio *)regs->di; + if (page == NULL) { + data->memcg = NULL; + return -1; + } + if (page->memcg_data & MEMCG_DATA_KMEM) { // if the page belongs to kmem + if (!folio_test_large(page)) + nr_pages = 1; + else + nr_pages = page->_folio_nr_pages; + // nr_pages = thp_nr_pages(page); + objcg = __folio_objcg(page); + if (objcg != NULL) + memcg = objcg->memcg; + data->isKmem = true; + data->nr_pages = nr_pages; + } else { + memcg = __folio_memcg(page); + data->isKmem = false; + } + + if (memcg != NULL) { + css = memcg->css; + cgrp = css.cgroup; + + data->memcg = memcg; + data->addr = (unsigned long)page; + data->cgrp = cgrp; + } + return 0; +} + +NOKPROBE_SYMBOL(uncharge_folio_entry_handler); + +static int uncharge_folio_ret_handler(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + struct uncharge_data *data = (struct uncharge_data *)ri->data; + int id; + struct cgroup_info *cgrp_info; + int ret = -1; + + if (data->memcg != NULL) { + id = ((data->memcg)->css).id; + cgrp_info = find_cgroup_info(id); + if (cgrp_info == NULL) + return -1; + if (data->isKmem) + ret = -1; + else + ret = 
remove_page_from_cgroup_info(data->addr, cgrp_info); + } + + return ret; +} + +NOKPROBE_SYMBOL(uncharge_folio_ret_handler); + +static struct kretprobe uncharge_folio_kretprobe = { + .handler = uncharge_folio_ret_handler, + .entry_handler = uncharge_folio_entry_handler, + .data_size = sizeof(struct uncharge_data), + .maxactive = 20, +}; + +static int uncharge_folio_kretprobe_init(void) +{ + int ret; + + uncharge_folio_kretprobe.kp.symbol_name = "uncharge_folio"; + ret = register_kretprobe(&uncharge_folio_kretprobe); + if (ret < 0) { + pr_err("register_kretprobe failed, returned %d\n", ret); + return ret; + } + pr_info("Planted return probe at %s: %p\n", + uncharge_folio_kretprobe.kp.symbol_name, + uncharge_folio_kretprobe.kp.addr); + return 0; +} + +static void uncharge_folio_kretprobe_exit(void) +{ + unregister_kretprobe(&uncharge_folio_kretprobe); + pr_info("kretprobe at %p unregistered\n", + uncharge_folio_kretprobe.kp.addr); + + /* nmissed > 0 suggests that maxactive was set too low. 
*/ + pr_info("Missed probing %d instances of %s\n", + uncharge_folio_kretprobe.nmissed, + uncharge_folio_kretprobe.kp.symbol_name); +} + +//kprobe at do_exit +static struct kprobe do_exit_kprobe; +static int do_exit_kprobe_pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + struct task_struct *cur = current; + int tgid = cur->tgid; + struct mm_struct *mm = cur->mm; + struct mem_cgroup *memcg = get_mem_cgroup_from_mm(mm); + struct cgroup_subsys_state css; + struct cgroup *cgrp; + struct cgroup_info *cgrp_info; + struct task_info *tsk_info; + int id; + + if (memcg != NULL) { + css = memcg->css; + cgrp = css.cgroup; + id = (memcg->css).id; + cgrp_info = find_cgroup_info(id); + if (cgrp_info != NULL) { + write_lock(&cgrp_info->cgrp_lock); + tsk_info = find_task_info(cgrp_info, tgid); + if (tsk_info != NULL) { + list_del(&tsk_info->list); + write_unlock(&cgrp_info->cgrp_lock); + remove_task_from_cgroup_info(cgrp_info, + tsk_info); + } else { + write_unlock(&cgrp_info->cgrp_lock); + } + return 0; + } + } + return 0; +} + +static void do_exit_kprobe_post_handler(struct kprobe *p, + struct pt_regs *regs, + unsigned long flags) +{ + +} + +static int do_exit_kprobe_init(void) +{ + do_exit_kprobe.pre_handler = do_exit_kprobe_pre_handler; + do_exit_kprobe.post_handler = do_exit_kprobe_post_handler; + do_exit_kprobe.symbol_name = "do_exit"; + if (register_kprobe(&do_exit_kprobe)) { + pr_alert("register_kprobe on do_exit failed!\n"); + return -EINVAL; + } + return 0; +} + +static void do_exit_kprobe_exit(void) +{ + unregister_kprobe(&do_exit_kprobe); +} + +//kprobe at mark_oom_victim +static struct kprobe mark_oom_victim_kprobe; + +static int mark_oom_victim_kprobe_pre_handler(struct kprobe *p, + struct pt_regs *regs) +{ + struct task_struct *victim; + int tgid; + struct mm_struct *mm; + struct mem_cgroup *memcg; + struct cgroup_subsys_state css; + struct cgroup *cgrp; + struct cgroup_info *cgrp_info; + struct task_info *tsk_info; + int id; + struct task_info *oom_info; + + 
victim = (struct task_struct *)regs->di; + tgid = victim->tgid; + mm = victim->mm; + memcg = get_mem_cgroup_from_mm(mm); + if (memcg != NULL) { + css = memcg->css; + cgrp = css.cgroup; + id = (memcg->css).id; + cgrp_info = find_cgroup_info(id); + if (cgrp_info != NULL) { + read_lock(&cgrp_info->cgrp_lock); + tsk_info = find_task_info(cgrp_info, tgid); + read_unlock(&cgrp_info->cgrp_lock); + if (tsk_info != NULL) { + oom_info = create_oom_task_info(tsk_info); + if (oom_info != NULL) { + add_oom_task_to_cgroup_info(cgrp_info, + oom_info); + } + return 0; + } + } + } + return 0; +} + +static void mark_oom_victim_kprobe_post_handler(struct kprobe *p, + struct pt_regs *regs, + unsigned long flags) +{ + +} + +static int mark_oom_victim_kprobe_init(void) +{ + mark_oom_victim_kprobe.pre_handler = mark_oom_victim_kprobe_pre_handler; + mark_oom_victim_kprobe.post_handler = + mark_oom_victim_kprobe_post_handler; + mark_oom_victim_kprobe.symbol_name = "mark_oom_victim"; + if (register_kprobe(&mark_oom_victim_kprobe)) { + pr_alert("register_kprobe on mark_oom_victim failed!\n"); + return -EINVAL; + } + return 0; +} + +static void mark_oom_victim_kprobe_exit(void) +{ + unregister_kprobe(&mark_oom_victim_kprobe); +} + +//kretprobe at cgroup_destroy_locked +struct destroy_data { + struct cgroup *cgrp; +}; + +static int cgroup_destroy_locked_entry_handler(struct kretprobe_instance + *ri, struct pt_regs *regs) +{ + struct destroy_data *data; + + data = (struct destroy_data *)ri->data; + data->cgrp = (struct cgroup *)regs->di; + return 0; +} + +NOKPROBE_SYMBOL(cgroup_destroy_locked_entry_handler); + +static int cgroup_destroy_locked_ret_handler(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + struct destroy_data *data = (struct destroy_data *)ri->data; + struct cgroup *cgrp = data->cgrp; + struct cgroup_info *cgrp_info = NULL; + unsigned long retval = regs_return_value(regs); + + if (!cgrp) + return -1; + if (retval != 0) + return -1; + list_for_each_entry(cgrp_info, 
&all_cgroup_info, list) { + if (cgrp_info->cgrp == cgrp) { + spin_lock(&lock); + list_del(&cgrp_info->list); + spin_unlock(&lock); + destroy_cgroup_info(cgrp_info); + return 0; + } + } + return -1; +} + +NOKPROBE_SYMBOL(cgroup_destroy_locked_ret_handler); + +static struct kretprobe cgroup_destroy_locked_kretprobe = { + .handler = cgroup_destroy_locked_ret_handler, + .entry_handler = cgroup_destroy_locked_entry_handler, + .data_size = sizeof(struct destroy_data), + .maxactive = 20, +}; + +static int cgroup_destroy_locked_kretprobe_init(void) +{ + int ret; + + cgroup_destroy_locked_kretprobe.kp.symbol_name = + "cgroup_destroy_locked"; + ret = register_kretprobe(&cgroup_destroy_locked_kretprobe); + if (ret < 0) { + pr_err("register_kretprobe failed, returned %d\n", ret); + return ret; + } + pr_info("Planted return probe at %s: %p\n", + cgroup_destroy_locked_kretprobe.kp.symbol_name, + cgroup_destroy_locked_kretprobe.kp.addr); + return 0; +} + +static void cgroup_destroy_locked_kretprobe_exit(void) +{ + unregister_kretprobe(&cgroup_destroy_locked_kretprobe); + pr_info("kretprobe at %p unregistered\n", + cgroup_destroy_locked_kretprobe.kp.addr); + + /* nmissed > 0 suggests that maxactive was set too low. 
*/ + pr_info("Missed probing %d instances of %s\n", + cgroup_destroy_locked_kretprobe.nmissed, + cgroup_destroy_locked_kretprobe.kp.symbol_name); +} + +// print the tasks in order of their memory usage +static void print_sorted_tasks_list(struct cgroup_info *cgrp_info, + int type, struct seq_file *m) +{ + struct list_head *cur, *insert_pos; + struct task_info *task, *insert_task; + struct list_head new_list = LIST_HEAD_INIT(new_list); + struct list_head *old_list; + struct task_info *new_task, *next_task; + + if (type == 0) { + if (cgrp_info == NULL) + return; + read_lock(&cgrp_info->cgrp_lock); + old_list = &cgrp_info->tasks_list; + } else { + if (cgrp_info == NULL) + return; + old_list = &cgrp_info->oom_list; + } + + list_for_each_entry_safe(task, insert_task, old_list, list) { + new_task = kmalloc(sizeof(struct task_info), GFP_ATOMIC); + if (!new_task) + return; + new_task->tgid = task->tgid; + strscpy(new_task->comm, task->comm, sizeof(new_task->comm)); + new_task->count = task->count; + new_task->pages = NULL; + INIT_LIST_HEAD(&new_task->list); + + //insertion sort + cur = &new_list; + insert_pos = cur->next; + while (insert_pos != &new_list) { + next_task = + list_entry(insert_pos, struct task_info, list); + if (new_task->count >= next_task->count) + break; + cur = insert_pos; + insert_pos = insert_pos->next; + } + + (&new_task->list)->prev = insert_pos->prev; + (insert_pos->prev)->next = (&new_task->list); + (&new_task->list)->next = insert_pos; + insert_pos->prev = (&new_task->list); + } + if (type == 0) + read_unlock(&cgrp_info->cgrp_lock); + + //print + if (type == 1 && (&new_list) != new_list.next) { + seq_puts(m, "oom:\n"); + seq_printf(m, "%10s %20s %20s\n", "pid", "command", + "memory usage (KB)"); + } + if (type == 0) + seq_printf(m, "%10s %20s %20s\n", "pid", "command", + "memory usage (KB)"); + list_for_each_entry_safe(task, insert_task, &new_list, list) { + seq_printf(m, "%10d %20s %20d\n", task->tgid, task->comm, + (task->count) * 4); + } + + 
list_for_each_entry_safe(task, insert_task, &new_list, list) { + list_del(&task->list); + kfree(task); + } +} + +static struct proc_dir_entry *cgroup_info_read; +#define procfs_file_read "cgroup_memory_usage_per_process" + +void seq_print_tasks(struct cgroup_info *cgroup_info, struct seq_file *m) +{ + if (!cgroup_info) + return; + + print_sorted_tasks_list(cgroup_info, 0, m); +} + +void seq_print_oom_tasks(struct cgroup_info *cgroup_info, struct seq_file *m) +{ + if (!cgroup_info) + return; + + print_sorted_tasks_list(cgroup_info, 1, m); +} + +void seq_print_cgroups(struct seq_file *m) +{ + struct cgroup_info *cgrp, *pos; + + spin_lock(&lock); + list_for_each_entry_safe(cgrp, pos, &all_cgroup_info, list) { + seq_printf(m, "cgroup name : %s\n", cgrp->name); + seq_print_tasks(cgrp, m); + seq_print_oom_tasks(cgrp, m); + seq_puts(m, "\n"); + } + spin_unlock(&lock); +} + +static int memory_usage_show(struct seq_file *m, void *v) +{ + seq_print_cgroups(m); + return 0; +} + +static int __init global_init(void) +{ + int ret = 0; + + cgroup_info_read = + proc_create_single(procfs_file_read, 0, NULL, memory_usage_show); + if (!cgroup_info_read) + return -ENOMEM; + ret = mem_cgroup_charge_kretprobe_init(); + uncharge_folio_kretprobe_init(); + do_exit_kprobe_init(); + mark_oom_victim_kprobe_init(); + cgroup_destroy_locked_kretprobe_init(); + + return ret; +} + +static void __exit global_exit(void) +{ + struct cgroup_info *cgrp_info, *pos; + + mem_cgroup_charge_kretprobe_exit(); + uncharge_folio_kretprobe_exit(); + do_exit_kprobe_exit(); + mark_oom_victim_kprobe_exit(); + cgroup_destroy_locked_kretprobe_exit(); + + remove_proc_entry(procfs_file_read, NULL); + + //release all memory use + list_for_each_entry_safe(cgrp_info, pos, &all_cgroup_info, list) { + list_del(&cgrp_info->list); + destroy_cgroup_info(cgrp_info); + } +} + +module_init(global_init) +module_exit(global_exit) +MODULE_LICENSE("GPL"); diff --git a/tools/probeCgroup/probeCgroup.h b/tools/probeCgroup/probeCgroup.h 
new file mode 100644 index 000000000000..953a6e0aca31 --- /dev/null +++ b/tools/probeCgroup/probeCgroup.h @@ -0,0 +1,415 @@ +/* SPDX-License-Identifier: GPL-2.0*/ +/* + * probeCgroup.h + * + * Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/kprobes.h> +#include <linux/ktime.h> +#include <linux/limits.h> +#include <linux/sched.h> +#include <linux/mm_types.h> +#include <linux/memcontrol.h> +#include <linux/cgroup-defs.h> +#include <linux/kernfs.h> +#include <linux/string.h> +#include <linux/list.h> +#include <linux/oom.h> +#include <linux/fs.h> +#include <linux/proc_fs.h> +#include <linux/huge_mm.h> +#include <linux/page-flags.h> +#include <linux/spinlock.h> +#include <linux/rwlock.h> + +static spinlock_t lock; // global lock for the list of cgroup_info + +struct HashNode { + unsigned long addr; + struct HashNode *next; +}; + +struct HashNode *HashNode_create(unsigned long addr) +{ + struct HashNode *node = NULL; + + node = kzalloc(sizeof(struct HashNode), GFP_ATOMIC); + if (node == NULL) + return NULL; + node->addr = addr; + node->next = NULL; + return node; +} + +struct HashBucket { + struct HashNode *head; + spinlock_t bkt_lock; +}; + +void HashBucket_init(struct HashBucket *bkt) +{ + bkt->head = NULL; + spin_lock_init(&bkt->bkt_lock); +} + +bool HashBucket_insert(struct HashBucket *bkt, unsigned long addr) +{ + struct HashNode *new_node; + struct HashNode *node; + struct HashNode *prev; + bool ret = true; + + if (bkt == NULL) + return false; + + prev = NULL; + new_node = NULL; + new_node = HashNode_create(addr); + + spin_lock(&bkt->bkt_lock); + node = bkt->head; + while (node != NULL && node->addr != addr) { + prev = node; + node = node->next; + } + if (node == NULL) { + if (new_node == NULL) { + pr_info("not enough memory for HashNode\n"); + spin_unlock(&bkt->bkt_lock); + return false; + } + if (bkt->head == NULL) + bkt->head = new_node; + else + prev->next = new_node; + 
spin_unlock(&bkt->bkt_lock); + ret = true; + } else { + spin_unlock(&bkt->bkt_lock); + kfree(new_node); + ret = false; + } + + return ret; +} + +bool HashBucket_erase(struct HashBucket *bkt, unsigned long addr) +{ + struct HashNode *node; + struct HashNode *prev; + bool ret = true; + + if (bkt == NULL) + return false; + + spin_lock(&bkt->bkt_lock); + node = bkt->head; + prev = NULL; + while (node != NULL && node->addr != addr) { + prev = node; + node = node->next; + } + if (node == NULL) { + spin_unlock(&bkt->bkt_lock); + ret = false; + } else { + if (bkt->head == node) + bkt->head = node->next; + else + prev->next = node->next; + kfree(node); + spin_unlock(&bkt->bkt_lock); + ret = true; + } + + return ret; +} + +void HashBucket_clear(struct HashBucket *bkt) +{ + struct HashNode *node; + struct HashNode *prev; + + if (bkt == NULL) + return; + + spin_lock(&bkt->bkt_lock); + node = bkt->head; + prev = NULL; + bkt->head = NULL; + while (node != NULL) { + prev = node; + node = node->next; + kfree(prev); + } + spin_unlock(&bkt->bkt_lock); +} + +struct HashMap { + unsigned long size; + struct HashBucket *HashTable; +}; + +unsigned long hash_func(unsigned long addr, unsigned long size) +{ + return addr % size; +} + +struct HashMap *HashMap_create(unsigned long size) +{ + struct HashMap *hm = NULL; + struct HashBucket *ht = NULL; + int i = 0; + + hm = kmalloc(sizeof(struct HashMap), GFP_ATOMIC); + if (hm == NULL) + return NULL; + ht = kmalloc((size * sizeof(struct HashBucket)), GFP_ATOMIC); + if (ht == NULL) { + kfree(hm); + return NULL; + } + for (i = 0; i < size; i++) + HashBucket_init(&(ht[i])); + + hm->size = size; + hm->HashTable = ht; + return hm; +} + +bool HashMap_insert(struct HashMap *hm, unsigned long addr) +{ + unsigned long index; + + if (hm == NULL) + return false; + index = hash_func(addr, hm->size); + if (hm->HashTable == NULL) + return false; + return HashBucket_insert(&(hm->HashTable[index]), addr); +} + +bool HashMap_erase(struct HashMap *hm, unsigned 
long addr) +{ + unsigned long index; + + if (hm == NULL) + return false; + index = hash_func(addr, hm->size); + if (hm->HashTable == NULL) + return false; + return HashBucket_erase(&(hm->HashTable[index]), addr); +} + +void HashMap_clear(struct HashMap *hm) +{ + unsigned long size; + struct HashBucket *ht; + int i; + + if (hm == NULL) + return; + size = hm->size; + ht = hm->HashTable; + if (ht == NULL) + return; + hm->HashTable = NULL; + for (i = 0; i < size; i++) + HashBucket_clear(&(ht[i])); + + kfree(ht); + kfree(hm); +} + +//struct that save the information for each task +struct task_info { + int tgid; + char comm[TASK_COMM_LEN]; + int count; // number of pages + struct HashMap *pages; + struct list_head list; + spinlock_t cnt_lock; +}; + +// struct that save the information for each cgroup +struct cgroup_info { + struct cgroup *cgrp; + struct mem_cgroup *memcg; + int id; + char name[64]; + struct list_head list; + struct list_head tasks_list; + struct list_head oom_list; + rwlock_t cgrp_lock; + unsigned int cached_bytes; +}; + +static LIST_HEAD(all_cgroup_info); // a list that linked all the cgroup_info struct + +static struct task_info *create_task_info(struct task_struct *cur_task) +{ + struct task_info *tsk_info = + kmalloc(sizeof(struct task_info), GFP_ATOMIC); + if (!tsk_info) + return NULL; + + // initialization + tsk_info->tgid = cur_task->tgid; + strscpy(tsk_info->comm, cur_task->comm, sizeof(tsk_info->comm)); + tsk_info->count = 0; + tsk_info->pages = NULL; + tsk_info->pages = HashMap_create(1023); + INIT_LIST_HEAD(&tsk_info->list); + spin_lock_init(&(tsk_info->cnt_lock)); + + return tsk_info; +} + +static int +add_task_to_cgroup_info(struct cgroup_info *cgrp, struct task_info *task) +{ + if (!cgrp || !task) + return -EINVAL; + + write_lock(&cgrp->cgrp_lock); + list_add_tail(&task->list, &cgrp->tasks_list); + write_unlock(&cgrp->cgrp_lock); + return 0; +} + +static int +remove_task_from_cgroup_info(struct cgroup_info *cgrp, struct task_info *task) +{ 
+ if (cgrp == NULL || task == NULL) + return -EINVAL; + + HashMap_clear(task->pages); + // kfree(task->pages); + kfree(task); + return 0; +} + +static struct task_info *find_task_info(struct cgroup_info *cgrp, int tgid) +{ + struct task_info *tsk_info, *pos; + + list_for_each_entry_safe(tsk_info, pos, &cgrp->tasks_list, list) { + if (tsk_info->tgid == tgid) + return tsk_info; + } + return NULL; +} + +static int +remove_page_from_cgroup_info(unsigned long addr, struct cgroup_info *cgrp) +{ + struct task_info *tsk_info, *pos; + + read_lock(&cgrp->cgrp_lock); + list_for_each_entry_safe(tsk_info, pos, &cgrp->tasks_list, list) { + if (HashMap_erase(tsk_info->pages, addr)) { + spin_lock(&(tsk_info->cnt_lock)); + tsk_info->count -= folio_nr_pages((struct folio *)addr); + spin_unlock(&(tsk_info->cnt_lock)); + read_unlock(&cgrp->cgrp_lock); + return 0; + } + } + read_unlock(&cgrp->cgrp_lock); + return -1; +} + +static struct cgroup_info *create_cgroup_info(struct cgroup *cgrp, + struct mem_cgroup *memcg) +{ + struct cgroup_info *cgrp_info = + kmalloc(sizeof(struct cgroup_info), GFP_ATOMIC); + struct kernfs_node *kn; + + if (!cgrp_info) + return NULL; + + cgrp_info->cgrp = cgrp; + cgrp_info->memcg = memcg; + cgrp_info->id = (memcg->css).id; + kn = cgrp->kn; + strscpy(cgrp_info->name, kn->name, sizeof(cgrp_info->name)); + INIT_LIST_HEAD(&cgrp_info->list); + INIT_LIST_HEAD(&cgrp_info->tasks_list); + INIT_LIST_HEAD(&cgrp_info->oom_list); + rwlock_init(&(cgrp_info->cgrp_lock)); + cgrp_info->cached_bytes = 0; + + return cgrp_info; +} + +static void destroy_cgroup_info(struct cgroup_info *cgrp_info) +{ + struct task_info *task, *tmp; + + if (!cgrp_info) + return; + + write_lock(&cgrp_info->cgrp_lock); + list_for_each_entry_safe(task, tmp, &cgrp_info->tasks_list, list) { + list_del(&task->list); + remove_task_from_cgroup_info(cgrp_info, task); + } + write_unlock(&cgrp_info->cgrp_lock); + list_for_each_entry_safe(task, tmp, &cgrp_info->oom_list, list) { + list_del(&task->list); + 
remove_task_from_cgroup_info(cgrp_info, task); + } + + kfree(cgrp_info); +} + +static int add_cgroup_info(struct cgroup_info *cgrp_info) +{ + if (!cgrp_info) + return -EINVAL; + + list_add_tail(&cgrp_info->list, &all_cgroup_info); + return 0; +} + +static struct cgroup_info *find_cgroup_info(int id) +{ + struct cgroup_info *cgrp_info = NULL; + + list_for_each_entry(cgrp_info, &all_cgroup_info, list) { + if (cgrp_info->id == id) + return cgrp_info; + } + + return NULL; +} + +static struct task_info *create_oom_task_info(struct task_info *tsk_info) +{ + struct task_info *oom_tsk_info = + kmalloc(sizeof(struct task_info), GFP_ATOMIC); + if (!oom_tsk_info) + return NULL; + + oom_tsk_info->tgid = tsk_info->tgid; + strscpy(oom_tsk_info->comm, tsk_info->comm, sizeof(oom_tsk_info->comm)); + oom_tsk_info->count = tsk_info->count; + oom_tsk_info->pages = NULL; + INIT_LIST_HEAD(&oom_tsk_info->list); + + return oom_tsk_info; +} + +static int +add_oom_task_to_cgroup_info(struct cgroup_info *cgrp, + struct task_info *oom_task) +{ + if (!cgrp || !oom_task) + return -EINVAL; + list_add_tail(&oom_task->list, &cgrp->oom_list); + return 0; +} diff --git a/tools/probeCgroup/run.sh b/tools/probeCgroup/run.sh new file mode 100755 index 000000000000..7e1ffefe66d1 --- /dev/null +++ b/tools/probeCgroup/run.sh @@ -0,0 +1,8 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +cd scripts +./script1.sh +./script2.sh +./script3.sh diff --git a/tools/probeCgroup/scripts/script1.sh b/tools/probeCgroup/scripts/script1.sh new file mode 100755 index 000000000000..539e8258afb9 --- /dev/null +++ b/tools/probeCgroup/scripts/script1.sh @@ -0,0 +1,10 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +cd .. 
+make +insmod probeCgroup.ko + +cd testcases +gcc simple-mem-allocate.c -o simple-mem-allocate diff --git a/tools/probeCgroup/scripts/script2.sh b/tools/probeCgroup/scripts/script2.sh new file mode 100755 index 000000000000..2ad515cfb912 --- /dev/null +++ b/tools/probeCgroup/scripts/script2.sh @@ -0,0 +1,14 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +current_dir=$(pwd) +cd /sys/fs/cgroup/memory +mkdir test +cd test +sh -c "echo $$ >> cgroup.procs" +sh -c "echo 5M > memory.limit_in_bytes" +sh -c "echo 0 > memory.swappiness" +cd "$current_dir" +cd ../testcases +./simple-mem-allocate diff --git a/tools/probeCgroup/scripts/script3.sh b/tools/probeCgroup/scripts/script3.sh new file mode 100755 index 000000000000..127eb45de5c9 --- /dev/null +++ b/tools/probeCgroup/scripts/script3.sh @@ -0,0 +1,11 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +cd /proc +cat cgroup_memory_usage_per_process + +cat /sys/fs/cgroup/memory/test/cgroup.procs > /sys/fs/cgroup/memory/cgroup.procs +rmdir /sys/fs/cgroup/memory/test +# cat cgroup_memory_usage_per_process +rmmod probeCgroup diff --git a/tools/probeCgroup/testcases/1_load_unload_test.py b/tools/probeCgroup/testcases/1_load_unload_test.py new file mode 100755 index 000000000000..5389a14a1dac --- /dev/null +++ b/tools/probeCgroup/testcases/1_load_unload_test.py @@ -0,0 +1,24 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +import os +import subprocess +import time + +def test_module_load_unload(): + try: + subprocess.check_call(['insmod', '../probeCgroup.ko']) + time.sleep(1) + print('loading module successfully!') + subprocess.check_call(['rmmod', 'probeCgroup']) + output = subprocess.check_output(['lsmod']) + assert b'probeCgroup' not in output + print('unloading module successfully!') + except subprocess.CalledProcessError as e: + 
print('Load unload test failed. A module command (insmod, rmmod, or lsmod) failed.') + except AssertionError as e: + print('Load unload test failed. Cannot remove module.') + +if __name__ == '__main__': + test_module_load_unload() \ No newline at end of file diff --git a/tools/probeCgroup/testcases/2_multiple_process_test.py b/tools/probeCgroup/testcases/2_multiple_process_test.py new file mode 100755 index 000000000000..d88c7f7f2952 --- /dev/null +++ b/tools/probeCgroup/testcases/2_multiple_process_test.py @@ -0,0 +1,48 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +from cgroup_utils import create_cgroup, add_process_to_cgroup, get_process_memory_usage, remove_cgroup, check_memory_usage, cleanup, check_kmem_usage +import os +import subprocess +import time + +def test_multiple_process(num_procs): + subprocess.check_call(['insmod', '../probeCgroup.ko']) + time.sleep(1) + + cgroup_name = 'test' + cgroup_path = create_cgroup(cgroup_name) + + processes = [] + pids = [] + + for i in range(num_procs): + process = subprocess.Popen(['./mem-allocate']) + pid = process.pid + add_process_to_cgroup(cgroup_path, pid) + processes.append(process) + pids.append(pid) + + time.sleep(0.1) + try: + count = 0 + for i in range(2000): + count += check_memory_usage(cgroup_name, pids, False) + time.sleep(0.01) + assert count <= 50, f"Memory read by probeCgroup is not accurate: {count} of 2000 samples exceeded the 10% tolerance" + + remove_cgroup(cgroup_path, pids) + check_memory_usage(cgroup_name, pids, True) + cleanup(processes) + subprocess.check_call(['rmmod', 'probeCgroup']) + + print('pass multiple process test!') + except AssertionError as e: + print(f"Assertion failed: {e}") + remove_cgroup(cgroup_path, pids) + cleanup(processes) + subprocess.check_call(['rmmod', 'probeCgroup']) + +if __name__ == '__main__': + test_multiple_process(3) \ No newline at end of file diff --git a/tools/probeCgroup/testcases/3_multiple_cgroup_test.py b/tools/probeCgroup/testcases/3_multiple_cgroup_test.py new file mode 
100755 index 000000000000..592a716df877 --- /dev/null +++ b/tools/probeCgroup/testcases/3_multiple_cgroup_test.py @@ -0,0 +1,55 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +from cgroup_utils import create_cgroup, add_process_to_cgroup, get_process_memory_usage, remove_cgroup, check_memory_usage, cleanup +import os +import subprocess +import time + +def test_multiple_cgroup(num_procs, num_cgroups): + subprocess.check_call(['insmod', '../probeCgroup.ko']) + time.sleep(1) + + cgroups = [] + processes = {} + pids = {} + for i in range(num_cgroups): + cgroup_name = f'test_{i}' + cgroup_path = create_cgroup(cgroup_name) + cgroups.append((cgroup_name, cgroup_path)) + + for j in range(num_procs): + process = subprocess.Popen(['./mem-allocate']) + pid = process.pid + add_process_to_cgroup(cgroup_path, pid) + if cgroup_path not in processes: + processes[cgroup_path] = [] + processes[cgroup_path].append(process) + if cgroup_path not in pids: + pids[cgroup_path] = [] + pids[cgroup_path].append(pid) + + time.sleep(0.1) + try: + for i in range (100): + for cgroup_name, cgroup_path in cgroups: + check_memory_usage(cgroup_name, pids[cgroup_path], False) + time.sleep(0.01) + + for cgroup_name, cgroup_path in cgroups: + remove_cgroup(cgroup_path, pids[cgroup_path]) + check_memory_usage(cgroup_name, pids[cgroup_path], True) + cleanup(processes[cgroup_path]) + subprocess.check_call(['rmmod', 'probeCgroup']) + + print('pass multiple cgroup test!') + except AssertionError as e: + print(f"Assertion failed: {e}") + for cgroup_name, cgroup_path in cgroups: + remove_cgroup(cgroup_path, pids[cgroup_path]) + cleanup(processes[cgroup_path]) + subprocess.check_call(['rmmod', 'probeCgroup']) + +if __name__ == '__main__': + test_multiple_cgroup(2,2) \ No newline at end of file diff --git a/tools/probeCgroup/testcases/4_oom_test.py b/tools/probeCgroup/testcases/4_oom_test.py new file mode 100755 index 
000000000000..128a258c56f5 --- /dev/null +++ b/tools/probeCgroup/testcases/4_oom_test.py @@ -0,0 +1,52 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +from cgroup_utils import create_cgroup, add_process_to_cgroup, get_process_memory_usage, remove_cgroup, check_memory_usage, cleanup, get_oom_process_memory_usage +import os +import subprocess +import time + +def test_oom(num_procs): + subprocess.check_call(['insmod', '../probeCgroup.ko']) + time.sleep(1) + + cgroup_name = 'test' + cgroup_path = create_cgroup(cgroup_name) + + with open(f"/sys/fs/cgroup/memory/{cgroup_name}/memory.limit_in_bytes", 'w') as limit_file: + limit_file.write("5M") + with open(f"/sys/fs/cgroup/memory/{cgroup_name}/memory.swappiness", 'w') as swap_file: + swap_file.write("0") + + processes = [] + pids = [] + for i in range(num_procs): + process = subprocess.Popen(['./simple-mem-allocate']) + pid = process.pid + add_process_to_cgroup(cgroup_path, pid) + processes.append(process) + pids.append(pid) + + time.sleep(6) + + try: + for pid in pids: + memory_usage = get_oom_process_memory_usage(pid, cgroup_name) + assert memory_usage is not None, f"Memory usage(oom) not found for PID {pid}" + assert memory_usage > 0, f"Memory usage should be greater than zero for PID {pid}" + + remove_cgroup(cgroup_path, pids) + check_memory_usage(cgroup_name, pids, True) + cleanup(processes) + subprocess.check_call(['rmmod', 'probeCgroup']) + + print('pass oom test!') + except AssertionError as e: + print(f"Assertion failed: {e}") + remove_cgroup(cgroup_path, pids) + cleanup(processes) + subprocess.check_call(['rmmod', 'probeCgroup']) + +if __name__ == '__main__': + test_oom(1) \ No newline at end of file diff --git a/tools/probeCgroup/testcases/5_multiple_threads_test.py b/tools/probeCgroup/testcases/5_multiple_threads_test.py new file mode 100755 index 000000000000..7e1b86dabe48 --- /dev/null +++ 
b/tools/probeCgroup/testcases/5_multiple_threads_test.py @@ -0,0 +1,45 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +from cgroup_utils import create_cgroup, add_process_to_cgroup, get_process_memory_usage, remove_cgroup, check_memory_usage, cleanup, check_kmem_usage +import os +import subprocess +import time + +def test_multiple_thread(num_procs): + subprocess.check_call(['insmod', '../probeCgroup.ko']) + time.sleep(1) + + cgroup_name = 'test' + cgroup_path = create_cgroup(cgroup_name) + + processes = [] + pids = [] + for i in range(num_procs): + process = subprocess.Popen(['./multiple-thread-mem-allocate']) + pid = process.pid + add_process_to_cgroup(cgroup_path, pid) + processes.append(process) + pids.append(pid) + + time.sleep(1) + try: + for i in range (200): + check_memory_usage(cgroup_name, pids, False) + time.sleep(0.01) + + remove_cgroup(cgroup_path, pids) + check_memory_usage(cgroup_name, pids, True) + cleanup(processes) + subprocess.check_call(['rmmod', 'probeCgroup']) + + print('pass multiple threads test!') + except AssertionError as e: + print(f"Assertion failed: {e}") + remove_cgroup(cgroup_path, pids) + cleanup(processes) + subprocess.check_call(['rmmod', 'probeCgroup']) + +if __name__ == '__main__': + test_multiple_thread(5) \ No newline at end of file diff --git a/tools/probeCgroup/testcases/cgroup_utils.py b/tools/probeCgroup/testcases/cgroup_utils.py new file mode 100644 index 000000000000..f70c68c1f188 --- /dev/null +++ b/tools/probeCgroup/testcases/cgroup_utils.py @@ -0,0 +1,115 @@ +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +import os +import subprocess +import time + +def create_cgroup(cgroup_name): + cgroup_path = f'/sys/fs/cgroup/memory/{cgroup_name}' + + try: + os.makedirs(cgroup_path) + except FileExistsError: + pass + + return cgroup_path + +def add_process_to_cgroup(cgroup_path, pid): + with 
open(os.path.join(cgroup_path, 'cgroup.procs'), 'w') as procs_file: + procs_file.write(str(pid)) + +def get_process_memory_usage(pid, cgroup_name): + cur_name = '' + with open('/proc/cgroup_memory_usage_per_process', 'r') as file: + for line in file: + parts = line.strip().split() + if len(parts) >= 4 and parts[0] == 'cgroup': + cur_name = parts[3] + if len(parts) >= 3 and parts[0] != 'cgroup' and cur_name == cgroup_name and parts[0] != 'pid' and int(parts[0]) == pid: + return int(parts[2]) + return None + +def get_process_kmem_usage(pid, cgroup_name): + cur_name = '' + with open('/proc/cgroup_memory_usage_per_process', 'r') as file: + for line in file: + parts = line.strip().split() + if len(parts) >= 4 and parts[0] == 'cgroup': + cur_name = parts[3] + if len(parts) >= 4 and parts[0] != 'cgroup' and cur_name == cgroup_name and parts[0] != 'pid' and int(parts[0]) == pid: + return int(parts[3]) + return None + +def remove_cgroup(cgroup_path, pids): + for pid in pids: + with open('/sys/fs/cgroup/memory/cgroup.procs', 'w') as backup_file: + backup_file.write(str(pid)) + os.rmdir(cgroup_path) + return + +def check_memory_usage(cgroup_name, pids, delete): + memory_sum = 0 + for pid in pids: + memory_usage = get_process_memory_usage(pid, cgroup_name) + if delete == False: + assert memory_usage is not None, f"Memory usage not found for PID {pid}" + assert memory_usage >= 0, f"Memory usage should be non-negative for PID {pid}" + memory_sum += memory_usage + else: + assert memory_usage is None, f"Error: Memory usage should not be available for PID {pid} after deleting the cgroup." 
+ if delete == False: + with open(f"/sys/fs/cgroup/memory/{cgroup_name}/memory.usage_in_bytes", 'r') as file: + content = file.readline().strip() + memory_read = int(content) + memory_sum *= 1024 + delta = abs(memory_read - memory_sum) + # print(f"read: {memory_read}") + # print(f"sum : {memory_sum}") + if (delta > max(memory_read, memory_sum) * 0.1): + return 1 + else: + return 0 + else: + return 0 + +def check_kmem_usage(cgroup_name, pids, delete): + kmem_sum = 0 + for pid in pids: + kmem_usage = get_process_kmem_usage(pid, cgroup_name) + if delete == False: + assert kmem_usage is not None, f"Kmem usage not found for PID {pid}" + assert kmem_usage >= 0, f"Kmem usage should be non-negative for PID {pid}" + kmem_sum += kmem_usage + else: + assert kmem_usage is None, f"Error: Kmem usage should not be available for PID {pid} after deleting the cgroup." + if delete == False: + with open(f"/sys/fs/cgroup/memory/{cgroup_name}/memory.kmem.usage_in_bytes", 'r') as file: + content = file.readline().strip() + kmem_read = int(content) + kmem_sum *= 1024 + delta = abs(kmem_read - kmem_sum) + # print(f"kmem read: {kmem_read}") + # print(f"kmem sum : {kmem_sum}") + # assert delta <= max(kmem_read, kmem_sum) * 0.2, f"Kmem read by probeCgroup is not accurate, {kmem_read}, {kmem_sum}" + +def cleanup(processes): + for process in processes: + process.terminate() + process.wait() + +def get_oom_process_memory_usage(pid, cgroup_name): + # Look up the usage recorded for pid in the OOM section of the given cgroup + cur_name = '' + oom = False + with open('/proc/cgroup_memory_usage_per_process', 'r') as file: + for line in file: + parts = line.strip().split() + if len(parts) >= 4 and parts[0] == 'cgroup': + cur_name = parts[3] + oom = False + if len(parts) >= 1 and parts[0] == 'oom:': + oom = True + if len(parts) >= 3 and parts[0] != 'cgroup' and cur_name == cgroup_name and parts[0] != 'pid' and int(parts[0]) == pid and oom == True: + return int(parts[2]) + return None \ No newline at end of file diff --git 
a/tools/probeCgroup/testcases/mem-allocate.c b/tools/probeCgroup/testcases/mem-allocate.c new file mode 100644 index 000000000000..e78e37cae61b --- /dev/null +++ b/tools/probeCgroup/testcases/mem-allocate.c @@ -0,0 +1,35 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * mem-allocate.c - The program to test probeCgroup + * + * Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#define MB (1024 * 1024) + +char *arr[40]; + +int main(int argc, char *argv[]) +{ + char *p; + int i = 0; + + while (1) { + for (i = 0; i < 40; i++) { + p = (char *)malloc(MB); + memset(p, 0, MB); + arr[i] = p; + usleep(100000); + } + for (i = 0; i < 40; i++) { + free(arr[i]); + usleep(100000); + } + } + return 0; +} diff --git a/tools/probeCgroup/testcases/multiple-thread-mem-allocate.c b/tools/probeCgroup/testcases/multiple-thread-mem-allocate.c new file mode 100644 index 000000000000..55f4c068f55e --- /dev/null +++ b/tools/probeCgroup/testcases/multiple-thread-mem-allocate.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * multiple-thread-mem-allocate.c - The program to test probeCgroup + * + * Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <pthread.h> + +#define MB (1024 * 1024) + +void *memory_test(void *arg) +{ + char *arr[25]; + char *p; + int i = 0; + int cnt = 0; + + while (1) { + for (i = 0; i < 20; i++) { + p = (char *)malloc(MB); + memset(p, 0, MB); + arr[i] = p; + usleep(10000); + } + for (i = 0; i < 20; i++) { + free(arr[i]); + usleep(10000); + } + } +} + +int main(int argc, char *argv[]) +{ + pthread_t threads[4]; + int rc; + + // create threads + for (int i = 0; i < 4; i++) { + rc = pthread_create(&threads[i], NULL, memory_test, NULL); + if (rc != 0) { + fprintf(stderr, "Error creating thread: %d\n", rc); + return 1; + } + } + + for (int i = 0; i < 4; i++) { + rc = 
pthread_join(threads[i], NULL); + if (rc != 0) { + fprintf(stderr, "Error joining thread: %d\n", rc); + return 1; + } + } + + return 0; +} diff --git a/tools/probeCgroup/testcases/run.py b/tools/probeCgroup/testcases/run.py new file mode 100755 index 000000000000..8ffd0ca720d8 --- /dev/null +++ b/tools/probeCgroup/testcases/run.py @@ -0,0 +1,32 @@ +#!/usr/bin/env python +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + +import os +import subprocess +import sys + +def run_tests(directory): + """Run all Python scripts in the given directory.""" + python_files = [f for f in os.listdir(directory) if f.endswith('test.py')] + python_files.sort() + + for filename in python_files: + try: + filepath = os.path.join(directory, filename) + + subprocess.check_call([sys.executable, filepath]) + except subprocess.CalledProcessError as e: + print(f"Error executing {filename}:") + return + except Exception as e: + print(f"Error executing {filename}:") + print(e) + return + +if __name__ == '__main__': + tests_directory = '.' 
+ subprocess.check_call(['gcc', 'mem-allocate.c', '-o', 'mem-allocate']) + subprocess.check_call(['gcc', 'simple-mem-allocate.c', '-o', 'simple-mem-allocate']) + subprocess.check_call(['gcc', '-pthread', 'multiple-thread-mem-allocate.c', '-o', 'multiple-thread-mem-allocate']) + run_tests(tests_directory) \ No newline at end of file diff --git a/tools/probeCgroup/testcases/simple-mem-allocate.c b/tools/probeCgroup/testcases/simple-mem-allocate.c new file mode 100644 index 000000000000..16328b10ba48 --- /dev/null +++ b/tools/probeCgroup/testcases/simple-mem-allocate.c @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * simple-mem-allocate.c - The program to test probeCgroup + * + * Copyright (C) Taoxy2004 221870066@smail.nju.edu.cn + */ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#define MB (1024 * 1024) + +int main(int argc, char *argv[]) +{ + char *p; + int i = 0; + + while (1) { + p = (char *)malloc(MB); + memset(p, 0, MB); + sleep(1); + } + + return 0; +}
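
Note on the accuracy criterion: check_memory_usage() in cgroup_utils.py sums the per-process figures from the proc file, multiplies by 1024, and flags a sample when it differs from memory.usage_in_bytes by more than 10% of the larger value. The sketch below restates that comparison on its own so the arithmetic is easy to check. It assumes, as the `memory_sum *= 1024` step suggests, that the proc file reports usage in KiB; the helper name `within_tolerance` and the sample figures are hypothetical and are not part of the patch.

```python
def within_tolerance(per_process_kb, cgroup_bytes, tolerance=0.1):
    """Return True when the summed per-process usage agrees with the cgroup counter.

    per_process_kb: per-process usage values, assumed to be in KiB.
    cgroup_bytes:   value read from memory.usage_in_bytes.
    tolerance:      allowed relative difference (0.1 mirrors the 10% in the test helper).
    """
    summed_bytes = sum(per_process_kb) * 1024
    delta = abs(cgroup_bytes - summed_bytes)
    # Same comparison as check_memory_usage(): relative to the larger of the two values.
    return delta <= max(cgroup_bytes, summed_bytes) * tolerance


if __name__ == "__main__":
    # Hypothetical figures: two processes holding 1 MiB and 2 MiB.
    print(within_tolerance([1024, 2048], 3 * 1024 * 1024))
    print(within_tolerance([1024, 2048], 4 * 1024 * 1024))
```

With these made-up inputs the first call agrees exactly and the second is off by 1 MiB against a 4 MiB counter, which is outside the 10% band. 2_multiple_process_test.py tolerates up to 50 such out-of-band samples in 2000 reads before failing.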