mailweb.openeuler.org
Manage this list

Keyboard Shortcuts

Thread View

  • j: Next unread message
  • k: Previous unread message
  • j a: Jump to all threads
  • j l: Jump to MailingList overview

Kernel

Threads by month
  • ----- 2025 -----
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2024 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2023 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2022 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2021 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2020 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2019 -----
  • December
kernel@openeuler.org

  • 51 participants
  • 18713 discussions
[PATCH openEuler-1.0-LTS] scsi: lpfc: Fix link down processing to address NULL pointer dereference
by dinglongwei 22 Apr '24

22 Apr '24
From: James Smart <jsmart2021(a)gmail.com> mainline inclusion from mainline-v5.16-rc1 commit 1854f53ccd88ad4e7568ddfafafffe71f1ceb0a6 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9FNFF CVE: CVE-2021-47183 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- If an FC link down transition while PLOGIs are outstanding to fabric well known addresses, outstanding ABTS requests may result in a NULL pointer dereference. Driver unload requests may hang with repeated "2878" log messages. The Link down processing results in ABTS requests for outstanding ELS requests. The Abort WQEs are sent for the ELSs before the driver had set the link state to down. Thus the driver is sending the Abort with the expectation that an ABTS will be sent on the wire. The Abort request is stalled waiting for the link to come up. In some conditions the driver may auto-complete the ELSs thus if the link does come up, the Abort completions may reference an invalid structure. Fix by ensuring that Abort set the flag to avoid link traffic if issued due to conditions where the link failed. Link: https://lore.kernel.org/r/20211020211417.88754-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee(a)broadcom.com> Signed-off-by: Justin Tee <justin.tee(a)broadcom.com> Signed-off-by: James Smart <jsmart2021(a)gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com> --- drivers/scsi/lpfc/lpfc_sli.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index 6c775a8bf..86d31fe32 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -10839,10 +10839,12 @@ lpfc_sli_abort_iotag_issue(struct lpfc_hba *phba, struct lpfc_sli_ring *pring, if (cmdiocb->iocb_flag & LPFC_IO_FOF) abtsiocbp->iocb_flag |= LPFC_IO_FOF; - if (phba->link_state >= LPFC_LINK_UP) - iabt->ulpCommand = CMD_ABORT_XRI_CN; - else + if (phba->link_state < LPFC_LINK_UP || + (phba->sli_rev == LPFC_SLI_REV4 && + phba->sli4_hba.link_state.status == LPFC_FC_LA_TYPE_LINK_DOWN)) iabt->ulpCommand = CMD_CLOSE_XRI_CN; + else + iabt->ulpCommand = CMD_ABORT_XRI_CN; abtsiocbp->iocb_cmpl = lpfc_sli_abort_els_cmpl; abtsiocbp->vport = vport; -- 2.17.1
2 1
0 0
[PATCH OLK-6.6 v3 0/2] m: convert mm's rss stats to use atomic mode
by Peng Zhang 22 Apr '24

22 Apr '24
From: ZhangPeng <zhangpeng362(a)huawei.com> Since commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter"), the rss_stats have converted into percpu_counter, which convert the error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2). However, the new percpu allocation in mm_init() causes a performance regression on fork/exec/shell. Even after commit 14ef95be6f55 ("kernel/fork: group allocation/free of per-cpu counters for mm struct"), the performance of fork/exec/shell is still poor compared to previous kernel versions. To mitigate performance regression, we delay the allocation of percpu memory for rss_stats. Therefore, we convert mm's rss stats to use percpu_counter atomic mode. For single-thread processes, rss_stat is in atomic mode, which reduces the memory consumption and performance regression caused by using percpu. For multiple-thread processes, rss_stat is switched to the percpu mode to reduce the error margin. We convert rss_stats from atomic mode to percpu mode only when the second thread is created. After lmbench test, we can get 2% ~ 4% performance improvement for lmbench fork_proc/exec_proc/shell_proc and 6.7% performance improvement for lmbench page_fault (before batch mode[1]). The test results are as follows: base base+revert base+this patch fork_proc 416.3ms 400.0ms (3.9%) 398.6ms (4.2%) exec_proc 2095.9ms 2061.1ms (1.7%) 2047.7ms (2.3%) shell_proc 3028.2ms 2954.7ms (2.4%) 2961.2ms (2.2%) page_fault 0.3603ms 0.3358ms (6.8%) 0.3361ms (6.7%) [1] https://lore.kernel.org/all/20240412064751.119015-1-wangkefeng.wang@huawei.… ChangeLog: v2->v3: - remove patch 3. v1->v2: - Split patch 2 into two patches. ZhangPeng (2): percpu_counter: introduce atomic mode for percpu_counter mm: convert mm's rss stats to use atomic mode include/linux/mm.h | 50 +++++++++++++++++++++++++++++----- include/linux/percpu_counter.h | 48 ++++++++++++++++++++++++++++++-- include/trace/events/kmem.h | 4 +-- kernel/fork.c | 20 ++++++++------ lib/percpu_counter.c | 35 ++++++++++++++++++++++-- 5 files changed, 135 insertions(+), 22 deletions(-) -- 2.25.1
2 3
0 0
[PATCH OLK-6.6] dm: call the resume method on internal suspend
by Li Lingfeng 22 Apr '24

22 Apr '24
From: Mikulas Patocka <mpatocka(a)redhat.com> mainline inclusion from mainline-v6.9-rc1 commit 65e8fbde64520001abf1c8d0e573561b4746ef38 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HJXV CVE: CVE-2024-26880 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- There is this reported crash when experimenting with the lvm2 testsuite. The list corruption is caused by the fact that the postsuspend and resume methods were not paired correctly; there were two consecutive calls to the origin_postsuspend function. The second call attempts to remove the "hash_list" entry from a list, while it was already removed by the first call. Fix __dm_internal_resume so that it calls the preresume and resume methods of the table's targets. If a preresume method of some target fails, we are in a tricky situation. We can't return an error because dm_internal_resume isn't supposed to return errors. We can't return success, because then the "resume" and "postsuspend" methods would not be paired correctly. So, we set the DMF_SUSPENDED flag and we fake normal suspend - it may confuse userspace tools, but it won't cause a kernel crash. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 8343 Comm: dmsetup Not tainted 6.8.0-rc6 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 RIP: 0010:__list_del_entry_valid_or_report+0x77/0xc0 <snip> RSP: 0018:ffff8881b831bcc0 EFLAGS: 00010282 RAX: 000000000000004e RBX: ffff888143b6eb80 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff819053d0 RDI: 00000000ffffffff RBP: ffff8881b83a3400 R08: 00000000fffeffff R09: 0000000000000058 R10: 0000000000000000 R11: ffffffff81a24080 R12: 0000000000000001 R13: ffff88814538e000 R14: ffff888143bc6dc0 R15: ffffffffa02e4bb0 FS: 00000000f7c0f780(0000) GS:ffff8893f0a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000057fb5000 CR3: 0000000143474000 CR4: 00000000000006b0 Call Trace: <TASK> ? die+0x2d/0x80 ? do_trap+0xeb/0xf0 ? __list_del_entry_valid_or_report+0x77/0xc0 ? do_error_trap+0x60/0x80 ? __list_del_entry_valid_or_report+0x77/0xc0 ? exc_invalid_op+0x49/0x60 ? __list_del_entry_valid_or_report+0x77/0xc0 ? asm_exc_invalid_op+0x16/0x20 ? table_deps+0x1b0/0x1b0 [dm_mod] ? __list_del_entry_valid_or_report+0x77/0xc0 origin_postsuspend+0x1a/0x50 [dm_snapshot] dm_table_postsuspend_targets+0x34/0x50 [dm_mod] dm_suspend+0xd8/0xf0 [dm_mod] dev_suspend+0x1f2/0x2f0 [dm_mod] ? table_deps+0x1b0/0x1b0 [dm_mod] ctl_ioctl+0x300/0x5f0 [dm_mod] dm_compat_ctl_ioctl+0x7/0x10 [dm_mod] __x64_compat_sys_ioctl+0x104/0x170 do_syscall_64+0x184/0x1b0 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0xf7e6aead <snip> ---[ end trace 0000000000000000 ]--- Fixes: ffcc39364160 ("dm: enhance internal suspend and resume interface") Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com> Signed-off-by: Mike Snitzer <snitzer(a)kernel.org> Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com> --- drivers/md/dm.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index f7212e8fc27f..6fa1dfe72ddb 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2920,6 +2920,9 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned int suspend static void __dm_internal_resume(struct mapped_device *md) { + int r; + struct dm_table *map; + BUG_ON(!md->internal_suspend_count); if (--md->internal_suspend_count) @@ -2928,12 +2931,23 @@ static void __dm_internal_resume(struct mapped_device *md) if (dm_suspended_md(md)) goto done; /* resume from nested suspend */ - /* - * NOTE: existing callers don't need to call dm_table_resume_targets - * (which may fail -- so best to avoid it for now by passing NULL map) - */ - (void) __dm_resume(md, NULL); - + map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock)); + r = __dm_resume(md, map); + if (r) { + /* + * If a preresume method of some target failed, we are in a + * tricky situation. We can't return an error to the caller. We + * can't fake success because then the "resume" and + * "postsuspend" methods would not be paired correctly, and it + * would break various targets, for example it would cause list + * corruption in the "origin" target. + * + * So, we fake normal suspend here, to make sure that the + * "resume" and "postsuspend" methods will be paired correctly. + */ + DMERR("Preresume method failed: %d", r); + set_bit(DMF_SUSPENDED, &md->flags); + } done: clear_bit(DMF_SUSPENDED_INTERNALLY, &md->flags); smp_mb__after_atomic(); -- 2.31.1
2 1
0 0
[PATCH OLK-5.10] dm: call the resume method on internal suspend
by Li Lingfeng 22 Apr '24

22 Apr '24
From: Mikulas Patocka <mpatocka(a)redhat.com> mainline inclusion from mainline-v6.9-rc1 commit 65e8fbde64520001abf1c8d0e573561b4746ef38 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HJXV CVE: CVE-2024-26880 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- There is this reported crash when experimenting with the lvm2 testsuite. The list corruption is caused by the fact that the postsuspend and resume methods were not paired correctly; there were two consecutive calls to the origin_postsuspend function. The second call attempts to remove the "hash_list" entry from a list, while it was already removed by the first call. Fix __dm_internal_resume so that it calls the preresume and resume methods of the table's targets. If a preresume method of some target fails, we are in a tricky situation. We can't return an error because dm_internal_resume isn't supposed to return errors. We can't return success, because then the "resume" and "postsuspend" methods would not be paired correctly. So, we set the DMF_SUSPENDED flag and we fake normal suspend - it may confuse userspace tools, but it won't cause a kernel crash. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 8343 Comm: dmsetup Not tainted 6.8.0-rc6 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 RIP: 0010:__list_del_entry_valid_or_report+0x77/0xc0 <snip> RSP: 0018:ffff8881b831bcc0 EFLAGS: 00010282 RAX: 000000000000004e RBX: ffff888143b6eb80 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff819053d0 RDI: 00000000ffffffff RBP: ffff8881b83a3400 R08: 00000000fffeffff R09: 0000000000000058 R10: 0000000000000000 R11: ffffffff81a24080 R12: 0000000000000001 R13: ffff88814538e000 R14: ffff888143bc6dc0 R15: ffffffffa02e4bb0 FS: 00000000f7c0f780(0000) GS:ffff8893f0a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000057fb5000 CR3: 0000000143474000 CR4: 00000000000006b0 Call Trace: <TASK> ? die+0x2d/0x80 ? do_trap+0xeb/0xf0 ? __list_del_entry_valid_or_report+0x77/0xc0 ? do_error_trap+0x60/0x80 ? __list_del_entry_valid_or_report+0x77/0xc0 ? exc_invalid_op+0x49/0x60 ? __list_del_entry_valid_or_report+0x77/0xc0 ? asm_exc_invalid_op+0x16/0x20 ? table_deps+0x1b0/0x1b0 [dm_mod] ? __list_del_entry_valid_or_report+0x77/0xc0 origin_postsuspend+0x1a/0x50 [dm_snapshot] dm_table_postsuspend_targets+0x34/0x50 [dm_mod] dm_suspend+0xd8/0xf0 [dm_mod] dev_suspend+0x1f2/0x2f0 [dm_mod] ? table_deps+0x1b0/0x1b0 [dm_mod] ctl_ioctl+0x300/0x5f0 [dm_mod] dm_compat_ctl_ioctl+0x7/0x10 [dm_mod] __x64_compat_sys_ioctl+0x104/0x170 do_syscall_64+0x184/0x1b0 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0xf7e6aead <snip> ---[ end trace 0000000000000000 ]--- Fixes: ffcc39364160 ("dm: enhance internal suspend and resume interface") Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com> Signed-off-by: Mike Snitzer <snitzer(a)kernel.org> Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com> --- drivers/md/dm.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 1c79eaede4df..e90b3e96fafc 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2707,6 +2707,9 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_fla static void __dm_internal_resume(struct mapped_device *md) { + int r; + struct dm_table *map; + BUG_ON(!md->internal_suspend_count); if (--md->internal_suspend_count) @@ -2715,12 +2718,23 @@ static void __dm_internal_resume(struct mapped_device *md) if (dm_suspended_md(md)) goto done; /* resume from nested suspend */ - /* - * NOTE: existing callers don't need to call dm_table_resume_targets - * (which may fail -- so best to avoid it for now by passing NULL map) - */ - (void) __dm_resume(md, NULL); - + map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock)); + r = __dm_resume(md, map); + if (r) { + /* + * If a preresume method of some target failed, we are in a + * tricky situation. We can't return an error to the caller. We + * can't fake success because then the "resume" and + * "postsuspend" methods would not be paired correctly, and it + * would break various targets, for example it would cause list + * corruption in the "origin" target. + * + * So, we fake normal suspend here, to make sure that the + * "resume" and "postsuspend" methods will be paired correctly. + */ + DMERR("Preresume method failed: %d", r); + set_bit(DMF_SUSPENDED, &md->flags); + } done: clear_bit(DMF_SUSPENDED_INTERNALLY, &md->flags); smp_mb__after_atomic(); -- 2.31.1
2 1
0 0
[PATCH openEuler-1.0-LTS] dm: call the resume method on internal suspend
by Li Lingfeng 22 Apr '24

22 Apr '24
From: Mikulas Patocka <mpatocka(a)redhat.com> mainline inclusion from mainline-v6.9-rc1 commit 65e8fbde64520001abf1c8d0e573561b4746ef38 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HJXV CVE: CVE-2024-26880 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- There is this reported crash when experimenting with the lvm2 testsuite. The list corruption is caused by the fact that the postsuspend and resume methods were not paired correctly; there were two consecutive calls to the origin_postsuspend function. The second call attempts to remove the "hash_list" entry from a list, while it was already removed by the first call. Fix __dm_internal_resume so that it calls the preresume and resume methods of the table's targets. If a preresume method of some target fails, we are in a tricky situation. We can't return an error because dm_internal_resume isn't supposed to return errors. We can't return success, because then the "resume" and "postsuspend" methods would not be paired correctly. So, we set the DMF_SUSPENDED flag and we fake normal suspend - it may confuse userspace tools, but it won't cause a kernel crash. ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 8343 Comm: dmsetup Not tainted 6.8.0-rc6 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 RIP: 0010:__list_del_entry_valid_or_report+0x77/0xc0 <snip> RSP: 0018:ffff8881b831bcc0 EFLAGS: 00010282 RAX: 000000000000004e RBX: ffff888143b6eb80 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffffff819053d0 RDI: 00000000ffffffff RBP: ffff8881b83a3400 R08: 00000000fffeffff R09: 0000000000000058 R10: 0000000000000000 R11: ffffffff81a24080 R12: 0000000000000001 R13: ffff88814538e000 R14: ffff888143bc6dc0 R15: ffffffffa02e4bb0 FS: 00000000f7c0f780(0000) GS:ffff8893f0a40000(0000) knlGS:0000000000000000 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000057fb5000 CR3: 0000000143474000 CR4: 00000000000006b0 Call Trace: <TASK> ? die+0x2d/0x80 ? do_trap+0xeb/0xf0 ? __list_del_entry_valid_or_report+0x77/0xc0 ? do_error_trap+0x60/0x80 ? __list_del_entry_valid_or_report+0x77/0xc0 ? exc_invalid_op+0x49/0x60 ? __list_del_entry_valid_or_report+0x77/0xc0 ? asm_exc_invalid_op+0x16/0x20 ? table_deps+0x1b0/0x1b0 [dm_mod] ? __list_del_entry_valid_or_report+0x77/0xc0 origin_postsuspend+0x1a/0x50 [dm_snapshot] dm_table_postsuspend_targets+0x34/0x50 [dm_mod] dm_suspend+0xd8/0xf0 [dm_mod] dev_suspend+0x1f2/0x2f0 [dm_mod] ? table_deps+0x1b0/0x1b0 [dm_mod] ctl_ioctl+0x300/0x5f0 [dm_mod] dm_compat_ctl_ioctl+0x7/0x10 [dm_mod] __x64_compat_sys_ioctl+0x104/0x170 do_syscall_64+0x184/0x1b0 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0xf7e6aead <snip> ---[ end trace 0000000000000000 ]--- Fixes: ffcc39364160 ("dm: enhance internal suspend and resume interface") Signed-off-by: Mikulas Patocka <mpatocka(a)redhat.com> Signed-off-by: Mike Snitzer <snitzer(a)kernel.org> Signed-off-by: Li Lingfeng <lilingfeng3(a)huawei.com> --- drivers/md/dm.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index f964a0818ddf..469aefecfe0b 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2848,6 +2848,9 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_fla static void __dm_internal_resume(struct mapped_device *md) { + int r; + struct dm_table *map; + BUG_ON(!md->internal_suspend_count); if (--md->internal_suspend_count) @@ -2856,12 +2859,23 @@ static void __dm_internal_resume(struct mapped_device *md) if (dm_suspended_md(md)) goto done; /* resume from nested suspend */ - /* - * NOTE: existing callers don't need to call dm_table_resume_targets - * (which may fail -- so best to avoid it for now by passing NULL map) - */ - (void) __dm_resume(md, NULL); - + map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock)); + r = __dm_resume(md, map); + if (r) { + /* + * If a preresume method of some target failed, we are in a + * tricky situation. We can't return an error to the caller. We + * can't fake success because then the "resume" and + * "postsuspend" methods would not be paired correctly, and it + * would break various targets, for example it would cause list + * corruption in the "origin" target. + * + * So, we fake normal suspend here, to make sure that the + * "resume" and "postsuspend" methods will be paired correctly. + */ + DMERR("Preresume method failed: %d", r); + set_bit(DMF_SUSPENDED, &md->flags); + } done: clear_bit(DMF_SUSPENDED_INTERNALLY, &md->flags); smp_mb__after_atomic(); -- 2.31.1
2 1
0 0
[PATCH OLK-6.6 v2 0/3] mm: convert mm's rss stats to use atomic mode
by Peng Zhang 22 Apr '24

22 Apr '24
From: ZhangPeng <zhangpeng362(a)huawei.com> Since commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter"), the rss_stats have converted into percpu_counter, which convert the error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2). However, the new percpu allocation in mm_init() causes a performance regression on fork/exec/shell. Even after commit 14ef95be6f55 ("kernel/fork: group allocation/free of per-cpu counters for mm struct"), the performance of fork/exec/shell is still poor compared to previous kernel versions. To mitigate performance regression, we delay the allocation of percpu memory for rss_stats. Therefore, we convert mm's rss stats to use percpu_counter atomic mode. For single-thread processes, rss_stat is in atomic mode, which reduces the memory consumption and performance regression caused by using percpu. For multiple-thread processes, rss_stat is switched to the percpu mode to reduce the error margin. We convert rss_stats from atomic mode to percpu mode only when the second thread is created. After lmbench test, we can get 2% ~ 4% performance improvement for lmbench fork_proc/exec_proc/shell_proc and 6.7% performance improvement for lmbench page_fault (before batch mode[1]). The test results are as follows: base base+revert base+this patch fork_proc 416.3ms 400.0ms (3.9%) 398.6ms (4.2%) exec_proc 2095.9ms 2061.1ms (1.7%) 2047.7ms (2.3%) shell_proc 3028.2ms 2954.7ms (2.4%) 2961.2ms (2.2%) page_fault 0.3603ms 0.3358ms (6.8%) 0.3361ms (6.7%) [1] https://lore.kernel.org/all/20240412064751.119015-1-wangkefeng.wang@huawei.… ChangeLog: v1->v2: - Split patch 2 into two patches. ZhangPeng (3): percpu_counter: introduce atomic mode for percpu_counter mm: convert mm's rss stats to use atomic mode mm: introduce cmdline to disable mm counter atomic mode include/linux/mm.h | 50 +++++++++++++++++++++++++++++----- include/linux/percpu_counter.h | 48 ++++++++++++++++++++++++++++++-- include/trace/events/kmem.h | 4 +-- kernel/fork.c | 46 ++++++++++++++++++++++++++++--- lib/percpu_counter.c | 35 ++++++++++++++++++++++-- 5 files changed, 165 insertions(+), 18 deletions(-) -- 2.25.1
2 4
0 0
[PATCH OLK-5.10] net/sched: flower: Fix unable to handle page fault bug in fl_init
by Zhengchao Shao 22 Apr '24

22 Apr '24
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I9IQLI CVE: NA -------------------------------- The tmplt_reoffload function pointer is of the const type, and the value is assigned to the constant in fl_init. As a result, the following issue occurs. BUG: unable to handle page fault for address: ffffffff98715da0 PF: supervisor write access in kernel mode PF: error_code(0x0003) - permissions violation PGD ec0d067 P4D ec0d067 PUD ec0e063 PMD 800000000e0001e1 Oops: 0003 [#1] SMP PTI CPU: 20 PID: 7533 Comm: tc Kdump: loaded Not tainted 5.10.0+ #40 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:fl_init+0xcf/0x100 RSP: 0018:ffffb6e7c0fe7978 EFLAGS: 00010202 RAX: 0000000000000049 RBX: ffff99c6b3580480 RCX: 0000000000000027 RDX: 0000000000000000 RSI: ffffffff98718740 RDI: ffff99c6a359f800 RBP: ffff99c6a359f800 R08: ffff99cfdce1fe50 R09: ffffb6e7c0fe77a0 R10: ffffb6e7c0fe7798 R11: ffffffff9967d5a8 R12: ffff99c6b3580480 R13: ffffb6e7c0fe7b80 R14: 0000000000000001 R15: ffffb6e7c0fe7ab0 FS: 00007fbaef7b1800(0000) GS:ffff99cfdce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff98715da0 CR3: 000000011299c000 CR4: 00000000000006e0 Call Trace: tcf_proto_create.cold+0x66/0x9e tc_new_tfilter+0x611/0xa70 rtnetlink_rcv_msg+0x406/0x560 netlink_rcv_skb+0x64/0x180 rtnetlink_rcv+0x19/0x30 netlink_unicast_kernel+0x7b/0x180 netlink_unicast+0x13d/0x230 netlink_sendmsg+0x432/0x610 __sock_sendmsg+0xc6/0xd0 ____sys_sendmsg+0x1f5/0x380 ___sys_sendmsg+0x82/0xe Fixes: fbc634d37f5a ("net/sched: flower: Fix kabi change") Signed-off-by: Zhengchao Shao <shaozhengchao(a)huawei.com> --- net/sched/cls_api.c | 4 ++-- net/sched/cls_flower.c | 5 +---- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 7801d8c552c9..b6dd697a3d5f 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1397,8 +1397,8 @@ void tcf_block_put(struct tcf_block *block) EXPORT_SYMBOL(tcf_block_put); -void (* const tmplt_reoffload)(struct tcf_chain *chain, bool add, - flow_setup_cb_t *cb, void *cb_priv); +void (*tmplt_reoffload)(struct tcf_chain *chain, bool add, + flow_setup_cb_t *cb, void *cb_priv); EXPORT_SYMBOL(tmplt_reoffload); static void cls_tmplt_reoffload(struct tcf_chain *chain, bool add, diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 3a1c139c426e..d15729328aef 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -356,8 +356,6 @@ static int fl_init(struct tcf_proto *tp) rcu_assign_pointer(tp->root, head); idr_init(&head->handle_idr); - tmplt_reoffload = &fl_tmplt_reoffload; - return rhashtable_init(&head->ht, &mask_ht_params); } @@ -596,8 +594,6 @@ static void fl_destroy(struct tcf_proto *tp, bool rtnl_held, __module_get(THIS_MODULE); tcf_queue_work(&head->rwork, fl_destroy_sleepable); - - tmplt_reoffload = NULL; } static void fl_put(struct tcf_proto *tp, void *arg) @@ -3250,6 +3246,7 @@ static struct tcf_proto_ops cls_fl_ops __read_mostly = { static int __init cls_fl_init(void) { + tmplt_reoffload = &fl_tmplt_reoffload; return register_tcf_proto_ops(&cls_fl_ops); } -- 2.34.1
2 1
0 0
[PATCH OLK-6.6 0/2] mm/migrate: correct nr_failed in migrate_pages_sync()
by Liu Shixin 22 Apr '24

22 Apr '24
Correct nr_failed in migrate_pages_sync(). Zi Yan (2): mm/migrate: correct nr_failed in migrate_pages_sync() mm/migrate: add nr_split to trace_mm_migrate_pages stats. include/trace/events/migrate.h | 24 ++++++++++++++---------- mm/migrate.c | 19 +++++++++++++++---- 2 files changed, 29 insertions(+), 14 deletions(-) -- 2.25.1
2 3
0 0
[PATCH OLK-5.10] tracing/trigger: Fix to return error if failed to alloc snapshot
by Zheng Yejian 22 Apr '24

22 Apr '24
From: "Masami Hiramatsu (Google)" <mhiramat(a)kernel.org> stable inclusion from stable-v5.10.210 commit 56cfbe60710772916a5ba092c99542332b48e870 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HL6U CVE: CVE-2024-26920 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- commit 0958b33ef5a04ed91f61cef4760ac412080c4e08 upstream. Fix register_snapshot_trigger() to return error code if it failed to allocate a snapshot instead of 0 (success). Unless that, it will register snapshot trigger without an error. Link: https://lore.kernel.org/linux-trace-kernel/170622977792.270660.278929864275… Fixes: 0bbe7f719985 ("tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation") Cc: stable(a)vger.kernel.org Cc: Vincent Donnefort <vdonnefort(a)google.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com> --- kernel/trace/trace_events_trigger.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c index 3c6229f16e81..64abc66b82e5 100644 --- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -1140,8 +1140,10 @@ register_snapshot_trigger(char *glob, struct event_trigger_ops *ops, struct event_trigger_data *data, struct trace_event_file *file) { - if (tracing_alloc_snapshot_instance(file->tr) != 0) - return 0; + int ret = tracing_alloc_snapshot_instance(file->tr); + + if (ret < 0) + return ret; return register_trigger(glob, ops, data, file); } -- 2.25.1
2 1
0 0
[PATCH OLK-5.10] RDMA/mlx5: Fix fortify source warning while accessing Eth segment
by Ziyang Xuan 22 Apr '24

22 Apr '24
From: Leon Romanovsky <leonro(a)nvidia.com> stable inclusion from stable-v5.10.214 commit d27c48dc309da72c3b46351a1205d89687272baa category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9HK6L CVE: CVE-2024-26907 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit 4d5e86a56615cc387d21c629f9af8fb0e958d350 ] ------------[ cut here ]------------ memcpy: detected field-spanning write (size 56) of single field "eseg->inline_hdr.start" at /var/lib/dkms/mlnx-ofed-kernel/5.8/build/drivers/infiniband/hw/mlx5/wr.c:131 (size 2) WARNING: CPU: 0 PID: 293779 at /var/lib/dkms/mlnx-ofed-kernel/5.8/build/drivers/infiniband/hw/mlx5/wr.c:131 mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib] Modules linked in: 8021q garp mrp stp llc rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) mlx5_core(OE) pci_hyperv_intf mlxdevm(OE) mlx_compat(OE) tls mlxfw(OE) psample nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink mst_pciconf(OE) knem(OE) vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd irqbypass cuse nfsv3 nfs fscache netfs xfrm_user xfrm_algo ipmi_devintf ipmi_msghandler binfmt_misc crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 snd_pcsp aesni_intel crypto_simd cryptd snd_pcm snd_timer joydev snd soundcore input_leds serio_raw evbug nfsd auth_rpcgss nfs_acl lockd grace sch_fq_codel sunrpc drm efi_pstore ip_tables x_tables autofs4 psmouse virtio_net net_failover failover floppy [last unloaded: mlx_compat(OE)] CPU: 0 PID: 293779 Comm: ssh Tainted: G OE 6.2.0-32-generic #32~22.04.1-Ubuntu Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib] Code: 0c 01 00 a8 01 75 25 48 8b 75 a0 b9 02 00 00 00 48 c7 c2 10 5b fd c0 48 c7 c7 80 5b fd c0 c6 05 57 0c 03 00 01 e8 95 4d 93 da <0f> 0b 44 8b 4d b0 4c 8b 45 c8 48 8b 4d c0 e9 49 fb ff ff 41 0f b7 RSP: 0018:ffffb5b48478b570 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffb5b48478b628 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffb5b48478b5e8 R13: ffff963a3c609b5e R14: ffff9639c3fbd800 R15: ffffb5b480475a80 FS: 00007fc03b444c80(0000) GS:ffff963a3dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556f46bdf000 CR3: 0000000006ac6003 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? show_regs+0x72/0x90 ? mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib] ? __warn+0x8d/0x160 ? mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib] ? report_bug+0x1bb/0x1d0 ? handle_bug+0x46/0x90 ? exc_invalid_op+0x19/0x80 ? asm_exc_invalid_op+0x1b/0x20 ? mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib] mlx5_ib_post_send_nodrain+0xb/0x20 [mlx5_ib] ipoib_send+0x2ec/0x770 [ib_ipoib] ipoib_start_xmit+0x5a0/0x770 [ib_ipoib] dev_hard_start_xmit+0x8e/0x1e0 ? validate_xmit_skb_list+0x4d/0x80 sch_direct_xmit+0x116/0x3a0 __dev_xmit_skb+0x1fd/0x580 __dev_queue_xmit+0x284/0x6b0 ? _raw_spin_unlock_irq+0xe/0x50 ? __flush_work.isra.0+0x20d/0x370 ? push_pseudo_header+0x17/0x40 [ib_ipoib] neigh_connected_output+0xcd/0x110 ip_finish_output2+0x179/0x480 ? __smp_call_single_queue+0x61/0xa0 __ip_finish_output+0xc3/0x190 ip_finish_output+0x2e/0xf0 ip_output+0x78/0x110 ? __pfx_ip_finish_output+0x10/0x10 ip_local_out+0x64/0x70 __ip_queue_xmit+0x18a/0x460 ip_queue_xmit+0x15/0x30 __tcp_transmit_skb+0x914/0x9c0 tcp_write_xmit+0x334/0x8d0 tcp_push_one+0x3c/0x60 tcp_sendmsg_locked+0x2e1/0xac0 tcp_sendmsg+0x2d/0x50 inet_sendmsg+0x43/0x90 sock_sendmsg+0x68/0x80 sock_write_iter+0x93/0x100 vfs_write+0x326/0x3c0 ksys_write+0xbd/0xf0 ? do_syscall_64+0x69/0x90 __x64_sys_write+0x19/0x30 do_syscall_64+0x59/0x90 ? do_user_addr_fault+0x1d0/0x640 ? exit_to_user_mode_prepare+0x3b/0xd0 ? irqentry_exit_to_user_mode+0x9/0x20 ? irqentry_exit+0x43/0x50 ? exc_page_fault+0x92/0x1b0 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7fc03ad14a37 Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 RSP: 002b:00007ffdf8697fe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000008024 RCX: 00007fc03ad14a37 RDX: 0000000000008024 RSI: 0000556f46bd8270 RDI: 0000000000000003 RBP: 0000556f46bb1800 R08: 0000000000007fe3 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 0000556f46bc66b0 R14: 000000000000000a R15: 0000556f46bb2f50 </TASK> ---[ end trace 0000000000000000 ]--- Link: https://lore.kernel.org/r/8228ad34bd1a25047586270f7b1fb4ddcd046282.17064339… Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> Signed-off-by: Ziyang Xuan <william.xuanziyang(a)huawei.com> --- drivers/infiniband/hw/mlx5/wr.c | 2 +- include/linux/mlx5/qp.h | 5 ++++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/wr.c b/drivers/infiniband/hw/mlx5/wr.c index d6038fb6c50c..19fd440a6ce3 100644 --- a/drivers/infiniband/hw/mlx5/wr.c +++ b/drivers/infiniband/hw/mlx5/wr.c @@ -128,7 +128,7 @@ static void set_eth_seg(const struct ib_send_wr *wr, struct mlx5_ib_qp *qp, */ copysz = min_t(u64, *cur_edge - (void *)eseg->inline_hdr.start, left); - memcpy(eseg->inline_hdr.start, pdata, copysz); + memcpy(eseg->inline_hdr.data, pdata, copysz); stride = ALIGN(sizeof(struct mlx5_wqe_eth_seg) - sizeof(eseg->inline_hdr.start) + copysz, 16); *size += stride / 16; diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h index d75ef8aa8fac..28d44061d670 100644 --- a/include/linux/mlx5/qp.h +++ b/include/linux/mlx5/qp.h @@ -261,7 +261,10 @@ struct mlx5_wqe_eth_seg { union { struct { __be16 sz; - u8 start[2]; + union { + u8 start[2]; + DECLARE_FLEX_ARRAY(u8, data); + }; } inline_hdr; struct { __be16 type; -- 2.25.1
2 1
0 0
  • ← Newer
  • 1
  • ...
  • 1114
  • 1115
  • 1116
  • 1117
  • 1118
  • 1119
  • 1120
  • ...
  • 1872
  • Older →

HyperKitty Powered by HyperKitty