Backport two page_owner patchsets:
 * Fix page_owner's use of free timestamps
 * mm/page_owner: record and dump free_pid and free_tgid
Audra Mitchell (5):
  mm/page_owner: remove free_ts from page_owner output
  tools/mm: remove references to free_ts from page_owner_sort
  tools/mm: filter out timestamps for correct collation
  tools/mm: fix the default case for page_owner_sort
  tools/mm: update the usage output to be more organized

Barry Song (1):
  mm/page_owner: record and dump free_pid and free_tgid
 mm/page_owner.c            |  13 ++-
 tools/mm/page_owner_sort.c | 217 ++++++++++++++++++-------------------
 2 files changed, 113 insertions(+), 117 deletions(-)
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/4087 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/M...
From: Audra Mitchell audra@redhat.com
mainline inclusion
from mainline-v6.7-rc1
commit b459f0905eec31196b6e841773aabe3194e3c3fe
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XGY0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------
Patch series "Fix page_owner's use of free timestamps".
While page_owner output is used to investigate memory utilization, typically the allocation pathway, the introduction of timestamps to the page_owner records caused each record to become unique due to the nanosecond granularity of the timestamps, for example:

  Page allocated via order 0 ... ts 5206196026 ns, free_ts 5187156703 ns
  Page allocated via order 0 ... ts 5206198540 ns, free_ts 5187162702 ns
Furthermore, the page_owner output only dumps the currently allocated records, so having the free timestamps is nonsensical for the typical use case.
In addition, the introduction of timestamps was not properly handled in the page_owner_sort tool causing most use cases to be broken. This series is meant to remove the free timestamps from the page_owner output and fix the page_owner_sort tool so proper collation can occur.
This patch (of 5):
When printing page_owner data via the debugfs interface (/sys/kernel/debug/page_owner), no free pages will ever be dumped due to the series of checks in read_page_owner():
	/*
	 * Although we do have the info about past allocation of free
	 * pages, it's not relevant for current memory usage.
	 */
	if (!test_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags))
The free_ts values are still used when dump_page_owner() is called, so keep the field for those use cases but remove it from the output of the typical page_owner case.
Link: https://lkml.kernel.org/r/20231013190350.579407-1-audra@redhat.com
Link: https://lkml.kernel.org/r/20231013190350.579407-2-audra@redhat.com
Fixes: 866b48526217 ("mm/page_owner: record the timestamp of all pages during free")
Signed-off-by: Audra Mitchell audra@redhat.com
Acked-by: Rafael Aquini aquini@redhat.com
Reviewed-by: Vlastimil Babka vbabka@suse.cz
Cc: Georgi Djakov djakov@kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Jinjiang Tu tujinjiang@huawei.com
---
 mm/page_owner.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/page_owner.c b/mm/page_owner.c index 4e2723e1b300..4f13ce7d2452 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -408,11 +408,11 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn, return -ENOMEM;
ret = scnprintf(kbuf, count, - "Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns, free_ts %llu ns\n", + "Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n", page_owner->order, page_owner->gfp_mask, &page_owner->gfp_mask, page_owner->pid, page_owner->tgid, page_owner->comm, - page_owner->ts_nsec, page_owner->free_ts_nsec); + page_owner->ts_nsec);
/* Print information relevant to grouping pages by mobility */ pageblock_mt = get_pageblock_migratetype(page);
From: Audra Mitchell audra@redhat.com
mainline inclusion
from mainline-v6.7-rc1
commit 0179c62839bdc49769a986e6eb1a6ca6fc7d274a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XGY0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------
With the removal of free timestamps from page_owner output, we no longer need to handle this case or the "unreleased" case. Remove all references to both cases.
Link: https://lkml.kernel.org/r/20231013190350.579407-3-audra@redhat.com
Signed-off-by: Audra Mitchell audra@redhat.com
Acked-by: Rafael Aquini aquini@redhat.com
Reviewed-by: Vlastimil Babka vbabka@suse.cz
Cc: Georgi Djakov djakov@kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Jinjiang Tu tujinjiang@huawei.com
---
 tools/mm/page_owner_sort.c | 98 +++++---------------------------------
 1 file changed, 12 insertions(+), 86 deletions(-)
diff --git a/tools/mm/page_owner_sort.c b/tools/mm/page_owner_sort.c index 99798894b879..9c93f3f4514f 100644 --- a/tools/mm/page_owner_sort.c +++ b/tools/mm/page_owner_sort.c @@ -33,7 +33,6 @@ struct block_list { char *comm; // task command name char *stacktrace; __u64 ts_nsec; - __u64 free_ts_nsec; int len; int num; int page_num; @@ -42,18 +41,16 @@ struct block_list { int allocator; }; enum FILTER_BIT { - FILTER_UNRELEASE = 1<<1, - FILTER_PID = 1<<2, - FILTER_TGID = 1<<3, - FILTER_COMM = 1<<4 + FILTER_PID = 1<<1, + FILTER_TGID = 1<<2, + FILTER_COMM = 1<<3 }; enum CULL_BIT { - CULL_UNRELEASE = 1<<1, - CULL_PID = 1<<2, - CULL_TGID = 1<<3, - CULL_COMM = 1<<4, - CULL_STACKTRACE = 1<<5, - CULL_ALLOCATOR = 1<<6 + CULL_PID = 1<<1, + CULL_TGID = 1<<2, + CULL_COMM = 1<<3, + CULL_STACKTRACE = 1<<4, + CULL_ALLOCATOR = 1<<5 }; enum ALLOCATOR_BIT { ALLOCATOR_CMA = 1<<1, @@ -62,9 +59,8 @@ enum ALLOCATOR_BIT { ALLOCATOR_OTHERS = 1<<4 }; enum ARG_TYPE { - ARG_TXT, ARG_COMM, ARG_STACKTRACE, ARG_ALLOC_TS, ARG_FREE_TS, - ARG_CULL_TIME, ARG_PAGE_NUM, ARG_PID, ARG_TGID, ARG_UNKNOWN, ARG_FREE, - ARG_ALLOCATOR + ARG_TXT, ARG_COMM, ARG_STACKTRACE, ARG_ALLOC_TS, ARG_CULL_TIME, + ARG_PAGE_NUM, ARG_PID, ARG_TGID, ARG_UNKNOWN, ARG_ALLOCATOR }; enum SORT_ORDER { SORT_ASC = 1, @@ -90,7 +86,6 @@ static regex_t pid_pattern; static regex_t tgid_pattern; static regex_t comm_pattern; static regex_t ts_nsec_pattern; -static regex_t free_ts_nsec_pattern; static struct block_list *list; static int list_size; static int max_size; @@ -181,24 +176,6 @@ static int compare_ts(const void *p1, const void *p2) return l1->ts_nsec < l2->ts_nsec ? -1 : 1; }
-static int compare_free_ts(const void *p1, const void *p2) -{ - const struct block_list *l1 = p1, *l2 = p2; - - return l1->free_ts_nsec < l2->free_ts_nsec ? -1 : 1; -} - -static int compare_release(const void *p1, const void *p2) -{ - const struct block_list *l1 = p1, *l2 = p2; - - if (!l1->free_ts_nsec && !l2->free_ts_nsec) - return 0; - if (l1->free_ts_nsec && l2->free_ts_nsec) - return 0; - return l1->free_ts_nsec ? 1 : -1; -} - static int compare_cull_condition(const void *p1, const void *p2) { if (cull == 0) @@ -211,8 +188,6 @@ static int compare_cull_condition(const void *p1, const void *p2) return compare_tgid(p1, p2); if ((cull & CULL_COMM) && compare_comm(p1, p2)) return compare_comm(p1, p2); - if ((cull & CULL_UNRELEASE) && compare_release(p1, p2)) - return compare_release(p1, p2); if ((cull & CULL_ALLOCATOR) && compare_allocator(p1, p2)) return compare_allocator(p1, p2); return 0; @@ -366,24 +341,6 @@ static __u64 get_ts_nsec(char *buf) return ts_nsec; }
-static __u64 get_free_ts_nsec(char *buf) -{ - __u64 free_ts_nsec; - char free_ts_nsec_str[FIELD_BUFF] = {0}; - char *endptr; - - search_pattern(&free_ts_nsec_pattern, free_ts_nsec_str, buf); - errno = 0; - free_ts_nsec = strtoull(free_ts_nsec_str, &endptr, 10); - if (errno != 0 || endptr == free_ts_nsec_str || *endptr != '\0') { - if (debug_on) - fprintf(stderr, "wrong free_ts_nsec in follow buf:\n%s\n", buf); - return -1; - } - - return free_ts_nsec; -} - static char *get_comm(char *buf) { char *comm_str = malloc(TASK_COMM_LEN); @@ -411,12 +368,8 @@ static int get_arg_type(const char *arg) return ARG_COMM; else if (!strcmp(arg, "stacktrace") || !strcmp(arg, "st")) return ARG_STACKTRACE; - else if (!strcmp(arg, "free") || !strcmp(arg, "f")) - return ARG_FREE; else if (!strcmp(arg, "txt") || !strcmp(arg, "T")) return ARG_TXT; - else if (!strcmp(arg, "free_ts") || !strcmp(arg, "ft")) - return ARG_FREE_TS; else if (!strcmp(arg, "alloc_ts") || !strcmp(arg, "at")) return ARG_ALLOC_TS; else if (!strcmp(arg, "allocator") || !strcmp(arg, "ator")) @@ -471,13 +424,6 @@ static bool match_str_list(const char *str, char **list, int list_size)
static bool is_need(char *buf) { - __u64 ts_nsec, free_ts_nsec; - - ts_nsec = get_ts_nsec(buf); - free_ts_nsec = get_free_ts_nsec(buf); - - if ((filter & FILTER_UNRELEASE) && free_ts_nsec != 0 && ts_nsec < free_ts_nsec) - return false; if ((filter & FILTER_PID) && !match_num_list(get_pid(buf), fc.pids, fc.pids_size)) return false; if ((filter & FILTER_TGID) && @@ -528,7 +474,6 @@ static bool add_list(char *buf, int len, char *ext_buf) if (*list[list_size].stacktrace == '\n') list[list_size].stacktrace++; list[list_size].ts_nsec = get_ts_nsec(buf); - list[list_size].free_ts_nsec = get_free_ts_nsec(buf); list[list_size].allocator = get_allocator(buf, ext_buf); list_size++; if (list_size % 1000 == 0) { @@ -554,8 +499,6 @@ static bool parse_cull_args(const char *arg_str) cull |= CULL_COMM; else if (arg_type == ARG_STACKTRACE) cull |= CULL_STACKTRACE; - else if (arg_type == ARG_FREE) - cull |= CULL_UNRELEASE; else if (arg_type == ARG_ALLOCATOR) cull |= CULL_ALLOCATOR; else { @@ -616,8 +559,6 @@ static bool parse_sort_args(const char *arg_str) sc.cmps[i] = compare_stacktrace; else if (arg_type == ARG_ALLOC_TS) sc.cmps[i] = compare_ts; - else if (arg_type == ARG_FREE_TS) - sc.cmps[i] = compare_free_ts; else if (arg_type == ARG_TXT) sc.cmps[i] = compare_txt; else if (arg_type == ARG_ALLOCATOR) @@ -679,8 +620,6 @@ static void usage(void) "-P\t\tSort by tgid.\n" "-n\t\tSort by task command name.\n" "-a\t\tSort by memory allocate time.\n" - "-r\t\tSort by memory release time.\n" - "-f\t\tFilter out the information of blocks whose memory has been released.\n" "-d\t\tPrint debug information.\n" "--pid <pidlist>\tSelect by pid. This selects the information of blocks whose process ID numbers appear in <pidlist>.\n" "--tgid <tgidlist>\tSelect by tgid. This selects the information of blocks whose Thread Group ID numbers appear in <tgidlist>.\n" @@ -706,7 +645,7 @@ int main(int argc, char **argv) { 0, 0, 0, 0}, };
- while ((opt = getopt_long(argc, argv, "adfmnprstP", longopts, NULL)) != -1) + while ((opt = getopt_long(argc, argv, "admnpstP", longopts, NULL)) != -1) switch (opt) { case 'a': set_single_cmp(compare_ts, SORT_ASC); @@ -714,18 +653,12 @@ int main(int argc, char **argv) case 'd': debug_on = true; break; - case 'f': - filter = filter | FILTER_UNRELEASE; - break; case 'm': set_single_cmp(compare_page_num, SORT_DESC); break; case 'p': set_single_cmp(compare_pid, SORT_ASC); break; - case 'r': - set_single_cmp(compare_free_ts, SORT_ASC); - break; case 's': set_single_cmp(compare_stacktrace, SORT_ASC); break; @@ -800,10 +733,8 @@ int main(int argc, char **argv) goto out_tgid; if (!check_regcomp(&comm_pattern, "tgid\s*[0-9]*\s*\((.*)\),\s*ts")) goto out_comm; - if (!check_regcomp(&ts_nsec_pattern, "ts\s*([0-9]*)\s*ns,")) + if (!check_regcomp(&ts_nsec_pattern, "ts\s*([0-9]*)\s*ns")) goto out_ts; - if (!check_regcomp(&free_ts_nsec_pattern, "free_ts\s*([0-9]*)\s*ns")) - goto out_free_ts;
fstat(fileno(fin), &st); max_size = st.st_size / 100; /* hack ... */ @@ -864,9 +795,6 @@ int main(int argc, char **argv) fprintf(fout, ", "); print_allocator(fout, list[i].allocator); } - if (cull & CULL_UNRELEASE) - fprintf(fout, " (%s)", - list[i].free_ts_nsec ? "UNRELEASED" : "RELEASED"); if (cull & CULL_STACKTRACE) fprintf(fout, ":\n%s", list[i].stacktrace); fprintf(fout, "\n"); @@ -880,8 +808,6 @@ int main(int argc, char **argv) free(buf); if (list) free(list); -out_free_ts: - regfree(&free_ts_nsec_pattern); out_ts: regfree(&ts_nsec_pattern); out_comm:
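After this patch the timestamp pattern no longer expects a trailing comma, because "ts ... ns" is now the last field of the record header. The following is a minimal userspace sketch of that parsing step, not the tool's actual code: parse_alloc_ts() is a hypothetical name, and the sketch uses [[:space:]] and [0-9]+ with REG_EXTENDED where the tool uses \s and [0-9]*.

```c
#include <errno.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical, not part of page_owner_sort.c):
 * extract the allocation timestamp from one record header line.
 * Returns 0 and stores the value in *ts_nsec on success, -1 otherwise.
 */
static int parse_alloc_ts(const char *line, unsigned long long *ts_nsec)
{
	regex_t re;
	regmatch_t m[2];
	char digits[32] = {0};
	char *end;
	size_t len;
	int ret = -1;

	/* No trailing comma: "ts <n> ns" ends the header line. */
	if (regcomp(&re, "ts[[:space:]]*([0-9]+)[[:space:]]*ns", REG_EXTENDED))
		return -1;

	if (regexec(&re, line, 2, m, 0) == 0 && m[1].rm_so != -1) {
		len = (size_t)(m[1].rm_eo - m[1].rm_so);
		if (len < sizeof(digits)) {
			/* Copy capture group 1 (the digits) and convert it. */
			memcpy(digits, line + m[1].rm_so, len);
			errno = 0;
			*ts_nsec = strtoull(digits, &end, 10);
			if (errno == 0 && end != digits && *end == '\0')
				ret = 0;
		}
	}
	regfree(&re);
	return ret;
}
```

Because POSIX regexec() reports the leftmost match, the same sketch also picks the allocation timestamp out of an old-format line that still carries a free_ts field after it.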
From: Audra Mitchell audra@redhat.com
mainline inclusion
from mainline-v6.7-rc1
commit 63a150623a2bf94c9ed503719a3423675a3aa0d3
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XGY0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------
With the introduction of allocation timestamps being included in page_owner output, each record becomes unique due to the timestamp nanosecond granularity. Remove the check in add_list that tries to collate each record during processing as the memcmp() is just additional overhead at this point.
Also keep the allocation timestamps, but allow collation to occur without consideration of the allocation timestamp, except in the case where allocation timestamps are requested by the user (the -a option).
Link: https://lkml.kernel.org/r/20231013190350.579407-4-audra@redhat.com
Signed-off-by: Audra Mitchell audra@redhat.com
Acked-by: Rafael Aquini aquini@redhat.com
Acked-by: Vlastimil Babka vbabka@suse.cz
Cc: Georgi Djakov djakov@kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Jinjiang Tu tujinjiang@huawei.com
---
 tools/mm/page_owner_sort.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/tools/mm/page_owner_sort.c b/tools/mm/page_owner_sort.c index 9c93f3f4514f..7ddabcb3a073 100644 --- a/tools/mm/page_owner_sort.c +++ b/tools/mm/page_owner_sort.c @@ -203,6 +203,21 @@ static int compare_sort_condition(const void *p1, const void *p2) return cmp; }
+static int remove_pattern(regex_t *pattern, char *buf, int len) +{ + regmatch_t pmatch[2]; + int err; + + err = regexec(pattern, buf, 2, pmatch, REG_NOTBOL); + if (err != 0 || pmatch[1].rm_so == -1) + return len; + + memcpy(buf + pmatch[1].rm_so, + buf + pmatch[1].rm_eo, len - pmatch[1].rm_eo); + + return len - (pmatch[1].rm_eo - pmatch[1].rm_so); +} + static int search_pattern(regex_t *pattern, char *pattern_str, char *buf) { int err, val_len; @@ -443,13 +458,6 @@ static bool is_need(char *buf)
static bool add_list(char *buf, int len, char *ext_buf) { - if (list_size != 0 && - len == list[list_size-1].len && - memcmp(buf, list[list_size-1].txt, len) == 0) { - list[list_size-1].num++; - list[list_size-1].page_num += get_page_num(buf); - return true; - } if (list_size == max_size) { fprintf(stderr, "max_size too small??\n"); return false; @@ -465,6 +473,9 @@ static bool add_list(char *buf, int len, char *ext_buf) return false; } memcpy(list[list_size].txt, buf, len); + if (sc.cmps[0] != compare_ts) { + len = remove_pattern(&ts_nsec_pattern, list[list_size].txt, len); + } list[list_size].txt[len] = 0; list[list_size].len = len; list[list_size].num = 1;
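The collation fix relies on deleting the timestamp digits (capture group 1 of the pattern) from the stored record text, so that records differing only in their timestamp compare equal. Below is a minimal userspace sketch of that idea under stated assumptions: strip_capture() is a hypothetical name, it compiles the pattern itself rather than taking a precompiled regex_t, and it uses memmove() because the source and destination ranges overlap.

```c
#include <regex.h>
#include <string.h>

/*
 * Illustrative variant of the remove_pattern() idea (hypothetical name):
 * delete the text matched by capture group 1 from buf and return the new
 * length. buf must hold len bytes followed by a terminating NUL.
 */
static int strip_capture(const char *pattern, char *buf, int len)
{
	regex_t re;
	regmatch_t m[2];

	if (regcomp(&re, pattern, REG_EXTENDED))
		return len;

	if (regexec(&re, buf, 2, m, 0) == 0 && m[1].rm_so != -1) {
		/* Shift the tail (including the NUL) over the captured span. */
		memmove(buf + m[1].rm_so, buf + m[1].rm_eo,
			(size_t)(len - m[1].rm_eo) + 1);
		len -= (int)(m[1].rm_eo - m[1].rm_so);
	}
	regfree(&re);
	return len;
}
```

Applied with the pattern "ts[[:space:]]*([0-9]+)[[:space:]]*ns", two headers that differ only in their ts value reduce to the same bytes, which is what lets a later sort-and-merge pass count them as one record.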
From: Audra Mitchell audra@redhat.com
mainline inclusion
from mainline-v6.7-rc1
commit c6d5e4901e00031650132aa30aa082a47c3796e5
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XGY0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------
With the additional commands and timestamps added to the tool, the default case (-t) has been broken. Now that the allocation timestamps are saved outside of the txt field, properly sort the data by the number of times a record has been seen. Furthermore, prevent misuse of the command-line arguments so that only one compare option can be used.
Link: https://lkml.kernel.org/r/20231013190350.579407-5-audra@redhat.com
Signed-off-by: Audra Mitchell audra@redhat.com
Acked-by: Rafael Aquini aquini@redhat.com
Acked-by: Vlastimil Babka vbabka@suse.cz
Cc: Georgi Djakov djakov@kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Jinjiang Tu tujinjiang@huawei.com
---
 tools/mm/page_owner_sort.c | 61 +++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)
diff --git a/tools/mm/page_owner_sort.c b/tools/mm/page_owner_sort.c index 7ddabcb3a073..5a260096ebaa 100644 --- a/tools/mm/page_owner_sort.c +++ b/tools/mm/page_owner_sort.c @@ -66,6 +66,16 @@ enum SORT_ORDER { SORT_ASC = 1, SORT_DESC = -1, }; +enum COMP_FLAG { + COMP_NO_FLAG = 0, + COMP_ALLOC = 1<<0, + COMP_PAGE_NUM = 1<<1, + COMP_PID = 1<<2, + COMP_STACK = 1<<3, + COMP_NUM = 1<<4, + COMP_TGID = 1<<5, + COMP_COMM = 1<<6 +}; struct filter_condition { pid_t *pids; pid_t *tgids; @@ -644,7 +654,7 @@ int main(int argc, char **argv) { FILE *fin, *fout; char *buf, *ext_buf; - int i, count; + int i, count, compare_flag; struct stat st; int opt; struct option longopts[] = { @@ -656,31 +666,33 @@ int main(int argc, char **argv) { 0, 0, 0, 0}, };
+ compare_flag = COMP_NO_FLAG; + while ((opt = getopt_long(argc, argv, "admnpstP", longopts, NULL)) != -1) switch (opt) { case 'a': - set_single_cmp(compare_ts, SORT_ASC); + compare_flag |= COMP_ALLOC; break; case 'd': debug_on = true; break; case 'm': - set_single_cmp(compare_page_num, SORT_DESC); + compare_flag |= COMP_PAGE_NUM; break; case 'p': - set_single_cmp(compare_pid, SORT_ASC); + compare_flag |= COMP_PID; break; case 's': - set_single_cmp(compare_stacktrace, SORT_ASC); + compare_flag |= COMP_STACK; break; case 't': - set_single_cmp(compare_num, SORT_DESC); + compare_flag |= COMP_NUM; break; case 'P': - set_single_cmp(compare_tgid, SORT_ASC); + compare_flag |= COMP_TGID; break; case 'n': - set_single_cmp(compare_comm, SORT_ASC); + compare_flag |= COMP_COMM; break; case 1: filter = filter | FILTER_PID; @@ -728,6 +740,39 @@ int main(int argc, char **argv) exit(1); }
+ /* Only one compare option is allowed, yet we also want handle the + * default case were no option is provided, but we still want to + * match the behavior of the -t option (compare by number of times + * a record is seen + */ + switch (compare_flag) { + case COMP_ALLOC: + set_single_cmp(compare_ts, SORT_ASC); + break; + case COMP_PAGE_NUM: + set_single_cmp(compare_page_num, SORT_DESC); + break; + case COMP_PID: + set_single_cmp(compare_pid, SORT_ASC); + break; + case COMP_STACK: + set_single_cmp(compare_stacktrace, SORT_ASC); + break; + case COMP_NO_FLAG: + case COMP_NUM: + set_single_cmp(compare_num, SORT_DESC); + break; + case COMP_TGID: + set_single_cmp(compare_tgid, SORT_ASC); + break; + case COMP_COMM: + set_single_cmp(compare_comm, SORT_ASC); + break; + default: + usage(); + exit(1); + } + fin = fopen(argv[optind], "r"); fout = fopen(argv[optind + 1], "w"); if (!fin || !fout) {
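The "only one compare option" rule falls out of switching on the accumulated bitmask: each case label is a single bit, so any combination of two or more options matches no label and lands in the default branch. A minimal sketch of that selection logic follows, assuming a hypothetical pick_sort_key() helper that returns a key name instead of calling set_single_cmp(); the flag values mirror the COMP_FLAG enum in the patch.

```c
#include <stddef.h>

enum comp_flag {
	COMP_NO_FLAG  = 0,
	COMP_ALLOC    = 1 << 0,
	COMP_PAGE_NUM = 1 << 1,
	COMP_PID      = 1 << 2,
	COMP_STACK    = 1 << 3,
	COMP_NUM      = 1 << 4,
	COMP_TGID     = 1 << 5,
	COMP_COMM     = 1 << 6
};

/*
 * Hypothetical helper: map the accumulated option bits to one sort key.
 * Returns NULL when more than one compare option was given, which is the
 * case where the tool prints its usage text and exits.
 */
static const char *pick_sort_key(int compare_flag)
{
	switch (compare_flag) {
	case COMP_ALLOC:
		return "alloc_ts";
	case COMP_PAGE_NUM:
		return "page_num";
	case COMP_PID:
		return "pid";
	case COMP_STACK:
		return "stacktrace";
	case COMP_NO_FLAG:	/* no option given: same as -t */
	case COMP_NUM:
		return "times";
	case COMP_TGID:
		return "tgid";
	case COMP_COMM:
		return "comm";
	default:		/* two or more bits set */
		return NULL;
	}
}
```

Repeating the same option (for example -a -a) sets the same bit twice and is still accepted, since OR-ing a bit with itself leaves the mask unchanged.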
From: Audra Mitchell audra@redhat.com
mainline inclusion
from mainline-v6.7-rc1
commit d8ea435f071592d82479414e97e3c70bed204666
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XGY0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------
Organize the usage options alphabetically and improve the description of some options. Also separate the more complicated cull options from the single use compare options.
Link: https://lkml.kernel.org/r/20231013190350.579407-6-audra@redhat.com
Signed-off-by: Audra Mitchell audra@redhat.com
Acked-by: Rafael Aquini aquini@redhat.com
Acked-by: Vlastimil Babka vbabka@suse.cz
Cc: Georgi Djakov djakov@kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Jinjiang Tu tujinjiang@huawei.com
---
 tools/mm/page_owner_sort.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/tools/mm/page_owner_sort.c b/tools/mm/page_owner_sort.c index 5a260096ebaa..e1f264444342 100644 --- a/tools/mm/page_owner_sort.c +++ b/tools/mm/page_owner_sort.c @@ -634,19 +634,26 @@ static void print_allocator(FILE *out, int allocator) static void usage(void) { printf("Usage: ./page_owner_sort [OPTIONS] <input> <output>\n" - "-m\t\tSort by total memory.\n" - "-s\t\tSort by the stack trace.\n" - "-t\t\tSort by times (default).\n" - "-p\t\tSort by pid.\n" - "-P\t\tSort by tgid.\n" - "-n\t\tSort by task command name.\n" - "-a\t\tSort by memory allocate time.\n" - "-d\t\tPrint debug information.\n" - "--pid <pidlist>\tSelect by pid. This selects the information of blocks whose process ID numbers appear in <pidlist>.\n" - "--tgid <tgidlist>\tSelect by tgid. This selects the information of blocks whose Thread Group ID numbers appear in <tgidlist>.\n" - "--name <cmdlist>\n\t\tSelect by command name. This selects the information of blocks whose command name appears in <cmdlist>.\n" - "--cull <rules>\tCull by user-defined rules.<rules> is a single argument in the form of a comma-separated list with some common fields predefined\n" - "--sort <order>\tSpecify sort order as: [+|-]key[,[+|-]key[,...]]\n" + "-a\t\t\tSort by memory allocation time.\n" + "-m\t\t\tSort by total memory.\n" + "-n\t\t\tSort by task command name.\n" + "-p\t\t\tSort by pid.\n" + "-P\t\t\tSort by tgid.\n" + "-s\t\t\tSort by the stacktrace.\n" + "-t\t\t\tSort by number of times record is seen (default).\n\n" + "--pid <pidlist>\t\tSelect by pid. This selects the information" + " of\n\t\t\tblocks whose process ID numbers appear in <pidlist>.\n" + "--tgid <tgidlist>\tSelect by tgid. This selects the information" + " of\n\t\t\tblocks whose Thread Group ID numbers appear in " + "<tgidlist>.\n" + "--name <cmdlist>\tSelect by command name. This selects the" + " information\n\t\t\tof blocks whose command name appears in" + " <cmdlist>.\n" + "--cull <rules>\t\tCull by user-defined rules. 
<rules> is a " + "single\n\t\t\targument in the form of a comma-separated list " + "with some\n\t\t\tcommon fields predefined (pid, tgid, comm, " + "stacktrace, allocator)\n" + "--sort <order>\t\tSpecify sort order as: [+|-]key[,[+|-]key[,...]]\n" ); }
From: Barry Song 21cnbao@gmail.com
mainline inclusion
from mainline-v6.8-rc1
commit 1b5c65b64cd417c801945b26a2a50c4d4eefaec8
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I8XGY0
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
-------------------------------------------
While investigating some complex memory allocation and free bugs, especially in multi-process and multi-thread cases, I have found from time to time that the free stack isn't sufficient, as a page can be freed by processes or threads other than the one that allocated it. Those other processes and threads often have exactly the same free stack as the one that allocated the page. We can't tell who freed the page from the free stack alone, even though the current page_owner does tell us the pid and tgid of the allocating task. This often makes bug investigation hard.
So this patch adds free pid and tgid in page_owner, so that we can easily figure out if the freeing is crossing processes or threads.
Link: https://lkml.kernel.org/r/20231114034202.73098-1-v-songbaohua@oppo.com
Signed-off-by: Barry Song v-songbaohua@oppo.com
Cc: Audra Mitchell audra@redhat.com
Cc: Hyeonggon Yoo 42.hyeyoo@gmail.com
Cc: Joonsoo Kim iamjoonsoo.kim@lge.com
Cc: Kassey Li quic_yingangl@quicinc.com
Cc: Kemeng Shi shikemeng@huaweicloud.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Jinjiang Tu tujinjiang@huawei.com
---
 mm/page_owner.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/page_owner.c b/mm/page_owner.c index 4f13ce7d2452..e7eba7688881 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -32,6 +32,8 @@ struct page_owner { char comm[TASK_COMM_LEN]; pid_t pid; pid_t tgid; + pid_t free_pid; + pid_t free_tgid; };
static bool page_owner_enabled __initdata; @@ -152,6 +154,8 @@ void __reset_page_owner(struct page *page, unsigned short order) page_owner = get_page_owner(page_ext); page_owner->free_handle = handle; page_owner->free_ts_nsec = free_ts_nsec; + page_owner->free_pid = current->pid; + page_owner->free_tgid = current->tgid; page_ext = page_ext_next(page_ext); } page_ext_put(page_ext); @@ -253,6 +257,8 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old) new_page_owner->handle = old_page_owner->handle; new_page_owner->pid = old_page_owner->pid; new_page_owner->tgid = old_page_owner->tgid; + new_page_owner->free_pid = old_page_owner->free_pid; + new_page_owner->free_tgid = old_page_owner->free_tgid; new_page_owner->ts_nsec = old_page_owner->ts_nsec; new_page_owner->free_ts_nsec = old_page_owner->ts_nsec; strcpy(new_page_owner->comm, old_page_owner->comm); @@ -495,7 +501,8 @@ void __dump_page_owner(const struct page *page) if (!handle) { pr_alert("page_owner free stack trace missing\n"); } else { - pr_alert("page last free stack trace:\n"); + pr_alert("page last free pid %d tgid %d stack trace:\n", + page_owner->free_pid, page_owner->free_tgid); stack_depot_print(handle); }
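With free_pid and free_tgid recorded next to the allocating pid and tgid, deciding whether a free crossed a thread or process boundary becomes a field comparison. The following userspace sketch only illustrates that comparison; struct owner_record, enum free_origin and classify_free() are hypothetical names and are not part of mm/page_owner.c.

```c
#include <sys/types.h>

/* Hypothetical mirror of the four identity fields kept per page. */
struct owner_record {
	pid_t pid;		/* thread that allocated the page */
	pid_t tgid;		/* its thread group (process) */
	pid_t free_pid;		/* thread that last freed the page */
	pid_t free_tgid;	/* its thread group (process) */
};

enum free_origin {
	FREE_SAME_THREAD,
	FREE_OTHER_THREAD,	/* same process, different thread */
	FREE_OTHER_PROCESS
};

/* Classify a free relative to the allocating task. */
static enum free_origin classify_free(const struct owner_record *r)
{
	if (r->tgid != r->free_tgid)
		return FREE_OTHER_PROCESS;
	if (r->pid != r->free_pid)
		return FREE_OTHER_THREAD;
	return FREE_SAME_THREAD;
}
```

In practice the values would be read from the "page last free pid %d tgid %d stack trace:" line printed by __dump_page_owner() and compared against the pid and tgid reported for the allocation.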