October 2024 - Kernel - mailweb.openeuler.org

[PATCH openEuler-22.03-LTS-SP1] [Backport] KVM: x86: Acquire kvm->srcu when handling KVM_SET_VCPU_EVENTS
by Zhao Yipeng 17 Oct '24

17 Oct '24

From: Sean Christopherson <seanjc(a)google.com> mainline inclusion from mainline-v6.11-rc7 commit 4bcdd831d9d01e0fb64faea50732b59b2ee88da1 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAU9K5 CVE: CVE-2024-46830 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… -------------------------------- KVM: x86: Acquire kvm->srcu when handling KVM_SET_VCPU_EVENTS Grab kvm->srcu when processing KVM_SET_VCPU_EVENTS, as KVM will forcibly leave nested VMX/SVM if SMM mode is being toggled, and leaving nested VMX reads guest memory. Note, kvm_vcpu_ioctl_x86_set_vcpu_events() can also be called from KVM_RUN via sync_regs(), which already holds SRCU. I.e. trying to precisely use kvm_vcpu_srcu_read_lock() around the problematic SMM code would cause problems. Acquiring SRCU isn't all that expensive, so for simplicity, grab it unconditionally for KVM_SET_VCPU_EVENTS. ============================= WARNING: suspicious RCU usage 6.10.0-rc7-332d2c1d713e-next-vm #552 Not tainted ----------------------------- include/linux/kvm_host.h:1027 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by repro/1071: #0: ffff88811e424430 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x7d/0x970 [kvm] stack backtrace: CPU: 15 PID: 1071 Comm: repro Not tainted 6.10.0-rc7-332d2c1d713e-next-vm #552 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x7f/0x90 lockdep_rcu_suspicious+0x13f/0x1a0 kvm_vcpu_gfn_to_memslot+0x168/0x190 [kvm] kvm_vcpu_read_guest+0x3e/0x90 [kvm] nested_vmx_load_msr+0x6b/0x1d0 [kvm_intel] load_vmcs12_host_state+0x432/0xb40 [kvm_intel] vmx_leave_nested+0x30/0x40 [kvm_intel] kvm_vcpu_ioctl_x86_set_vcpu_events+0x15d/0x2b0 [kvm] kvm_arch_vcpu_ioctl+0x1107/0x1750 [kvm] ? mark_held_locks+0x49/0x70 ? kvm_vcpu_ioctl+0x7d/0x970 [kvm] ? kvm_vcpu_ioctl+0x497/0x970 [kvm] kvm_vcpu_ioctl+0x497/0x970 [kvm] ? lock_acquire+0xba/0x2d0 ? find_held_lock+0x2b/0x80 ? do_user_addr_fault+0x40c/0x6f0 ? lock_release+0xb7/0x270 __x64_sys_ioctl+0x82/0xb0 do_syscall_64+0x6c/0x170 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7ff11eb1b539 </TASK> Fixes: f7e570780efc ("KVM: x86: Forcibly leave nested virt when SMM state is toggled") Cc: stable(a)vger.kernel.org Link: https://lore.kernel.org/r/20240723232055.3643811-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc(a)google.com> Conflicts: arch/x86/kvm/x86.c [The functions used in the patch depend on the pre-patch, and the introduction involves interface changes. Therefore, the original version of the function is used instead, and the same effect of locking kvm->srcu is achieved.] Signed-off-by: Zhao Yipeng <zhaoyipeng5(a)huawei.com> --- arch/x86/kvm/x86.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 32fa24d0e5d6..fd484302864c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5030,7 +5030,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp, if (copy_from_user(&events, argp, sizeof(struct kvm_vcpu_events))) break; + vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu); r = kvm_vcpu_ioctl_x86_set_vcpu_events(vcpu, &events); + srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); break; } case KVM_GET_DEBUGREGS: { -- 2.34.1

2 1

[PATCH OLK-5.10] xfs: atomic write file dio convert to mark IOCB_ATOMIC
by Long Li 17 Oct '24

17 Oct '24

hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I9VTE3 CVE: NA -------------------------------- Add mount option "atomicalways", If the file support atomic write, while the user initiates a direct I/O write, convert the write to atomic write, it makes database like Mysql could use atomic write without any other more modify. This also causes some direct writes (such as offset misalignment and invalid length) to the files which with atomic write enabled to return fails. Fixes: d10050f6ee57 ("fs: xfs: Support FS_XFLAG_ATOMICWRITES for forcealign") Signed-off-by: Long Li <leo.lilong(a)huawei.com> --- fs/xfs/xfs_file.c | 5 +++++ fs/xfs/xfs_mount.h | 2 ++ fs/xfs/xfs_super.c | 33 +++++++++++++++++++++++++++++++-- 3 files changed, 38 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 49db1611de96..d4185b63dcd6 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -814,6 +814,11 @@ xfs_file_write_iter( return xfs_file_dax_write(iocb, from); if (iocb->ki_flags & IOCB_DIRECT) { + + if (xfs_inode_atomicwrites(ip) && + xfs_has_atomicalways(ip->i_mount) && + !(iocb->ki_flags & IOCB_ATOMIC)) + iocb->ki_flags |= IOCB_ATOMIC; /* * Allow a directio write to fall back to a buffered * write *only* in the case that we're doing a reflink diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 888d6bf9bea7..6b952ff031b6 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -279,6 +279,7 @@ typedef struct xfs_mount { /* Mount features */ +#define XFS_FEAT_ATOMICALWAYS (1ULL << 47) /* convert dio to atomic writes always */ #define XFS_FEAT_NOATTR2 (1ULL << 48) /* disable attr2 creation */ #define XFS_FEAT_NOALIGN (1ULL << 49) /* ignore alignment */ #define XFS_FEAT_ALLOCSIZE (1ULL << 50) /* user specified allocation size */ @@ -365,6 +366,7 @@ __XFS_HAS_FEAT(dax_always, DAX_ALWAYS) __XFS_HAS_FEAT(dax_never, DAX_NEVER) __XFS_HAS_FEAT(norecovery, NORECOVERY) __XFS_HAS_FEAT(nouuid, NOUUID) +__XFS_HAS_FEAT(atomicalways, ATOMICALWAYS) /* * Operational mount state flags diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index f2ff547e760c..451386e21262 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -116,7 +116,7 @@ enum { Opt_filestreams, Opt_quota, Opt_noquota, Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_uquota, Opt_gquota, Opt_pquota, Opt_uqnoenforce, Opt_gqnoenforce, Opt_pqnoenforce, Opt_qnoenforce, - Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum, + Opt_discard, Opt_nodiscard, Opt_dax, Opt_dax_enum, Opt_atomicalways, }; static const struct fs_parameter_spec xfs_fs_parameters[] = { @@ -161,6 +161,7 @@ static const struct fs_parameter_spec xfs_fs_parameters[] = { fsparam_flag("nodiscard", Opt_nodiscard), fsparam_flag("dax", Opt_dax), fsparam_enum("dax", Opt_dax_enum, dax_param_enums), + fsparam_flag("atomicalways", Opt_atomicalways), {} }; @@ -189,6 +190,7 @@ xfs_fs_show_options( { XFS_FEAT_LARGE_IOSIZE, ",largeio" }, { XFS_FEAT_DAX_ALWAYS, ",dax=always" }, { XFS_FEAT_DAX_NEVER, ",dax=never" }, + { XFS_FEAT_ATOMICALWAYS, ",atomicalways"}, { 0, NULL } }; struct xfs_mount *mp = XFS_M(root->d_sb); @@ -1361,6 +1363,9 @@ xfs_fc_parse_param( xfs_fs_warn_deprecated(fc, param, XFS_FEAT_NOATTR2, true); parsing_mp->m_features |= XFS_FEAT_NOATTR2; return 0; + case Opt_atomicalways: + parsing_mp->m_features |= XFS_FEAT_ATOMICALWAYS; + return 0; default: xfs_warn(parsing_mp, "unknown mount option [%s].", param->key); return -EINVAL; @@ -1671,9 +1676,33 @@ xfs_fc_fill_super( } - if (xfs_has_atomicwrites(mp)) + if (xfs_has_atomicwrites(mp)) { + if (!xfs_has_forcealign(mp)) { + xfs_alert(mp, + "forcealign should support for atomic writes!"); + error = -EINVAL; + goto out_filestream_unmount; + } + xfs_warn(mp, "EXPERIMENTAL atomicwrites feature in use. Use at your own risk!"); + } + + if (xfs_has_atomicalways(mp)) { + if (!xfs_has_atomicwrites(mp)) { + xfs_alert(mp, + "mounting with \"atomicalways\" option, but fs not support atomic writes!"); + error = -EINVAL; + goto out_filestream_unmount; + } + + if (!bdev_can_atomic_write(mp->m_ddev_targp->bt_bdev)) { + xfs_alert(mp, + "mounting with \"atomicalways\" option, but device not support atomic writes!"); + error = -EINVAL; + goto out_filestream_unmount; + } + } if (xfs_has_reflink(mp)) { if (mp->m_sb.sb_rblocks) { -- 2.39.2

2 1

[PATCH OLK-6.6] [Backport] Bluetooth: btnxpuart: Fix Null pointer dereference in btnxpuart_flush()
by Zhao Yipeng 17 Oct '24

17 Oct '24

From: Neeraj Sanjay Kale <neeraj.sanjaykale(a)nxp.com> stable inclusion from stable-v6.6.51 commit 013dae4735d2010544d1f2121bdeb8e6c9ea171e category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARWYD CVE: CVE-2024-46749 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- Bluetooth: btnxpuart: Fix Null pointer dereference in btnxpuart_flush() [ Upstream commit c68bbf5e334b35b36ac5b9f0419f1f93f796bad1 ] This adds a check before freeing the rx->skb in flush and close functions to handle the kernel crash seen while removing driver after FW download fails or before FW download completes. dmesg log: [ 54.634586] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080 [ 54.643398] Mem abort info: [ 54.646204] ESR = 0x0000000096000004 [ 54.649964] EC = 0x25: DABT (current EL), IL = 32 bits [ 54.655286] SET = 0, FnV = 0 [ 54.658348] EA = 0, S1PTW = 0 [ 54.661498] FSC = 0x04: level 0 translation fault [ 54.666391] Data abort info: [ 54.669273] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 54.674768] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 54.674771] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 54.674775] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000048860000 [ 54.674780] [0000000000000080] pgd=0000000000000000, p4d=0000000000000000 [ 54.703880] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 54.710152] Modules linked in: btnxpuart(-) overlay fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine authenc libdes crct10dif_ce polyval_ce polyval_generic snd_soc_imx_spdif snd_soc_imx_card snd_soc_ak5558 snd_soc_ak4458 caam secvio error snd_soc_fsl_micfil snd_soc_fsl_spdif snd_soc_fsl_sai snd_soc_fsl_utils imx_pcm_dma gpio_ir_recv rc_core sch_fq_codel fuse [ 54.744357] CPU: 3 PID: 72 Comm: kworker/u9:0 Not tainted 6.6.3-otbr-g128004619037 #2 [ 54.744364] Hardware name: FSL i.MX8MM EVK board (DT) [ 54.744368] Workqueue: hci0 hci_power_on [ 54.757244] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 54.757249] pc : kfree_skb_reason+0x18/0xb0 [ 54.772299] lr : btnxpuart_flush+0x40/0x58 [btnxpuart] [ 54.782921] sp : ffff8000805ebca0 [ 54.782923] x29: ffff8000805ebca0 x28: ffffa5c6cf1869c0 x27: ffffa5c6cf186000 [ 54.782931] x26: ffff377b84852400 x25: ffff377b848523c0 x24: ffff377b845e7230 [ 54.782938] x23: ffffa5c6ce8dbe08 x22: ffffa5c6ceb65410 x21: 00000000ffffff92 [ 54.782945] x20: ffffa5c6ce8dbe98 x19: ffffffffffffffac x18: ffffffffffffffff [ 54.807651] x17: 0000000000000000 x16: ffffa5c6ce2824ec x15: ffff8001005eb857 [ 54.821917] x14: 0000000000000000 x13: ffffa5c6cf1a02e0 x12: 0000000000000642 [ 54.821924] x11: 0000000000000040 x10: ffffa5c6cf19d690 x9 : ffffa5c6cf19d688 [ 54.821931] x8 : ffff377b86000028 x7 : 0000000000000000 x6 : 0000000000000000 [ 54.821938] x5 : ffff377b86000000 x4 : 0000000000000000 x3 : 0000000000000000 [ 54.843331] x2 : 0000000000000000 x1 : 0000000000000002 x0 : ffffffffffffffac [ 54.857599] Call trace: [ 54.857601] kfree_skb_reason+0x18/0xb0 [ 54.863878] btnxpuart_flush+0x40/0x58 [btnxpuart] [ 54.863888] hci_dev_open_sync+0x3a8/0xa04 [ 54.872773] hci_power_on+0x54/0x2e4 [ 54.881832] process_one_work+0x138/0x260 [ 54.881842] worker_thread+0x32c/0x438 [ 54.881847] kthread+0x118/0x11c [ 54.881853] ret_from_fork+0x10/0x20 [ 54.896406] Code: a9be7bfd 910003fd f9000bf3 aa0003f3 (b940d400) [ 54.896410] ---[ end trace 0000000000000000 ]--- Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale(a)nxp.com> Tested-by: Guillaume Legoupil <guillaume.legoupil(a)nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> Conflicts: drivers/bluetooth/btnxpuart.c [Trace context conflict, no adaptation required.] Signed-off-by: Zhao Yipeng <zhaoyipeng5(a)huawei.com> --- drivers/bluetooth/btnxpuart.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/bluetooth/btnxpuart.c b/drivers/bluetooth/btnxpuart.c index b5d40e0e05f3..36883f9cddf0 100644 --- a/drivers/bluetooth/btnxpuart.c +++ b/drivers/bluetooth/btnxpuart.c @@ -1279,8 +1279,10 @@ static int btnxpuart_close(struct hci_dev *hdev) ps_wakeup(nxpdev); serdev_device_close(nxpdev->serdev); skb_queue_purge(&nxpdev->txq); - kfree_skb(nxpdev->rx_skb); - nxpdev->rx_skb = NULL; + if (!IS_ERR_OR_NULL(nxpdev->rx_skb)) { + kfree_skb(nxpdev->rx_skb); + nxpdev->rx_skb = NULL; + } clear_bit(BTNXPUART_SERDEV_OPEN, &nxpdev->tx_state); return 0; } @@ -1295,8 +1297,10 @@ static int btnxpuart_flush(struct hci_dev *hdev) cancel_work_sync(&nxpdev->tx_work); - kfree_skb(nxpdev->rx_skb); - nxpdev->rx_skb = NULL; + if (!IS_ERR_OR_NULL(nxpdev->rx_skb)) { + kfree_skb(nxpdev->rx_skb); + nxpdev->rx_skb = NULL; + } return 0; } -- 2.34.1

2 1

[PATCH OLK-5.10] nilfs2: fix state management in error path of log writing function
by Chen Ridong 17 Oct '24

17 Oct '24

From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> mainline inclusion from mainline-v6.11-rc7 commit 6576dd6695f2afca3f4954029ac4a64f82ba60ab category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAVUFV CVE: CVE-2024-47669 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… ---------------------------------------------------------------------- After commit a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") was applied, the log writing function nilfs_segctor_do_construct() was able to issue I/O requests continuously even if user data blocks were split into multiple logs across segments, but two potential flaws were introduced in its error handling. First, if nilfs_segctor_begin_construction() fails while creating the second or subsequent logs, the log writing function returns without calling nilfs_segctor_abort_construction(), so the writeback flag set on pages/folios will remain uncleared. This causes page cache operations to hang waiting for the writeback flag. For example, truncate_inode_pages_final(), which is called via nilfs_evict_inode() when an inode is evicted from memory, will hang. Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared. As a result, if the next log write involves checkpoint creation, that's fine, but if a partial log write is performed that does not, inodes with NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files" list, and their data and b-tree blocks may not be written to the device, corrupting the block mapping. Fix these issues by uniformly calling nilfs_segctor_abort_construction() on failure of each step in the loop in nilfs_segctor_do_construct(), having it clean up logs and segment usages according to progress, and correcting the conditions for calling nilfs_redirty_inodes() to ensure that the NILFS_I_COLLECTED flag is cleared. Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> Signed-off-by: Chen Ridong <chenridong(a)huawei.com> --- fs/nilfs2/segment.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c index 02407c524382..76898acb579b 100644 --- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -1838,6 +1838,9 @@ static void nilfs_segctor_abort_construction(struct nilfs_sc_info *sci, nilfs_abort_logs(&logs, ret ? : err); list_splice_tail_init(&sci->sc_segbufs, &logs); + if (list_empty(&logs)) + return; /* if the first segment buffer preparation failed */ + nilfs_cancel_segusage(&logs, nilfs->ns_sufile); nilfs_free_incomplete_logs(&logs, nilfs); @@ -2082,7 +2085,7 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode) err = nilfs_segctor_begin_construction(sci, nilfs); if (unlikely(err)) - goto out; + goto failed; /* Update time stamp */ sci->sc_seg_ctime = ktime_get_real_seconds(); @@ -2145,10 +2148,9 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode) return err; failed_to_write: - if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED) - nilfs_redirty_inodes(&sci->sc_dirty_files); - failed: + if (mode == SC_LSEG_SR && nilfs_sc_cstage_get(sci) >= NILFS_ST_IFILE) + nilfs_redirty_inodes(&sci->sc_dirty_files); if (nilfs_doing_gc()) nilfs_redirty_inodes(&sci->sc_gc_inodes); nilfs_segctor_abort_construction(sci, nilfs, err); -- 2.34.1

2 1

[PATCH openEuler-22.03-LTS-SP1] nilfs2: fix state management in error path of log writing function
by Chen Ridong 17 Oct '24

17 Oct '24

From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> mainline inclusion from mainline-v6.11-rc7 commit 6576dd6695f2afca3f4954029ac4a64f82ba60ab category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAVUFV CVE: CVE-2024-47669 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… ---------------------------------------------------------------------- After commit a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") was applied, the log writing function nilfs_segctor_do_construct() was able to issue I/O requests continuously even if user data blocks were split into multiple logs across segments, but two potential flaws were introduced in its error handling. First, if nilfs_segctor_begin_construction() fails while creating the second or subsequent logs, the log writing function returns without calling nilfs_segctor_abort_construction(), so the writeback flag set on pages/folios will remain uncleared. This causes page cache operations to hang waiting for the writeback flag. For example, truncate_inode_pages_final(), which is called via nilfs_evict_inode() when an inode is evicted from memory, will hang. Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared. As a result, if the next log write involves checkpoint creation, that's fine, but if a partial log write is performed that does not, inodes with NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files" list, and their data and b-tree blocks may not be written to the device, corrupting the block mapping. Fix these issues by uniformly calling nilfs_segctor_abort_construction() on failure of each step in the loop in nilfs_segctor_do_construct(), having it clean up logs and segment usages according to progress, and correcting the conditions for calling nilfs_redirty_inodes() to ensure that the NILFS_I_COLLECTED flag is cleared. Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> Signed-off-by: Chen Ridong <chenridong(a)huawei.com> --- fs/nilfs2/segment.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c index 02407c524382..76898acb579b 100644 --- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -1838,6 +1838,9 @@ static void nilfs_segctor_abort_construction(struct nilfs_sc_info *sci, nilfs_abort_logs(&logs, ret ? : err); list_splice_tail_init(&sci->sc_segbufs, &logs); + if (list_empty(&logs)) + return; /* if the first segment buffer preparation failed */ + nilfs_cancel_segusage(&logs, nilfs->ns_sufile); nilfs_free_incomplete_logs(&logs, nilfs); @@ -2082,7 +2085,7 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode) err = nilfs_segctor_begin_construction(sci, nilfs); if (unlikely(err)) - goto out; + goto failed; /* Update time stamp */ sci->sc_seg_ctime = ktime_get_real_seconds(); @@ -2145,10 +2148,9 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode) return err; failed_to_write: - if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED) - nilfs_redirty_inodes(&sci->sc_dirty_files); - failed: + if (mode == SC_LSEG_SR && nilfs_sc_cstage_get(sci) >= NILFS_ST_IFILE) + nilfs_redirty_inodes(&sci->sc_dirty_files); if (nilfs_doing_gc()) nilfs_redirty_inodes(&sci->sc_gc_inodes); nilfs_segctor_abort_construction(sci, nilfs, err); -- 2.34.1

2 1

[PATCH OLK-6.6] nilfs2: fix state management in error path of log writing function
by Chen Ridong 17 Oct '24

17 Oct '24

From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> mainline inclusion from mainline-v6.11-rc7 commit 6576dd6695f2afca3f4954029ac4a64f82ba60ab category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IAVUFV CVE: CVE-2024-47669 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… ---------------------------------------------------------------------- After commit a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") was applied, the log writing function nilfs_segctor_do_construct() was able to issue I/O requests continuously even if user data blocks were split into multiple logs across segments, but two potential flaws were introduced in its error handling. First, if nilfs_segctor_begin_construction() fails while creating the second or subsequent logs, the log writing function returns without calling nilfs_segctor_abort_construction(), so the writeback flag set on pages/folios will remain uncleared. This causes page cache operations to hang waiting for the writeback flag. For example, truncate_inode_pages_final(), which is called via nilfs_evict_inode() when an inode is evicted from memory, will hang. Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared. As a result, if the next log write involves checkpoint creation, that's fine, but if a partial log write is performed that does not, inodes with NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files" list, and their data and b-tree blocks may not be written to the device, corrupting the block mapping. Fix these issues by uniformly calling nilfs_segctor_abort_construction() on failure of each step in the loop in nilfs_segctor_do_construct(), having it clean up logs and segment usages according to progress, and correcting the conditions for calling nilfs_redirty_inodes() to ensure that the NILFS_I_COLLECTED flag is cleared. Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> Signed-off-by: Chen Ridong <chenridong(a)huawei.com> --- fs/nilfs2/segment.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c index e10f8a777ab0..0610cb12c11c 100644 --- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -1835,6 +1835,9 @@ static void nilfs_segctor_abort_construction(struct nilfs_sc_info *sci, nilfs_abort_logs(&logs, ret ? : err); list_splice_tail_init(&sci->sc_segbufs, &logs); + if (list_empty(&logs)) + return; /* if the first segment buffer preparation failed */ + nilfs_cancel_segusage(&logs, nilfs->ns_sufile); nilfs_free_incomplete_logs(&logs, nilfs); @@ -2079,7 +2082,7 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode) err = nilfs_segctor_begin_construction(sci, nilfs); if (unlikely(err)) - goto out; + goto failed; /* Update time stamp */ sci->sc_seg_ctime = ktime_get_real_seconds(); @@ -2142,10 +2145,9 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode) return err; failed_to_write: - if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED) - nilfs_redirty_inodes(&sci->sc_dirty_files); - failed: + if (mode == SC_LSEG_SR && nilfs_sc_cstage_get(sci) >= NILFS_ST_IFILE) + nilfs_redirty_inodes(&sci->sc_dirty_files); if (nilfs_doing_gc()) nilfs_redirty_inodes(&sci->sc_gc_inodes); nilfs_segctor_abort_construction(sci, nilfs, err); -- 2.34.1

2 1

[PATCH OLK-5.10 v2] netfilter: nf_tables: use timestamp to check for set element timeout
by Liu Jian 17 Oct '24

17 Oct '24

From: Pablo Neira Ayuso <pablo(a)netfilter.org> mainline inclusion from mainline-v6.8-rc4 commit 7395dfacfff65e9938ac0889dafa1ab01e987d15 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I9O0MR CVE: CVE-2024-27397 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… --------------------------- Add a timestamp field at the beginning of the transaction, store it in the nftables per-netns area. Update set backend .insert, .deactivate and sync gc path to use the timestamp, this avoids that an element expires while control plane transaction is still unfinished. .lookup and .update, which are used from packet path, still use the current time to check if the element has expired. And .get path and dump also since this runs lockless under rcu read size lock. Then, there is async gc which also needs to check the current time since it runs asynchronously from a workqueue. Fixes: c3e1b005ed1c ("netfilter: nf_tables: add set element timeout support") Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org> Conflicts: include/net/netfilter/nf_tables.h net/netfilter/nf_tables_api.c net/netfilter/nft_set_pipapo.c net/netfilter/nft_set_rbtree.c [Do not backport nft_rbtree_gc() part, it is a async gc in our repository. Others caused by we did not backport commit ab482c6b66a4, b2d306542ff9, 8837ba3e58ea, 5f68718b34a5, d59d2f82f984.] Signed-off-by: Liu Jian <liujian56(a)huawei.com> --- v1->v2: update bugzilla. include/net/netfilter/nf_tables.h | 24 ++++++++++++++++++++++-- net/netfilter/nf_tables_api.c | 1 + net/netfilter/nft_set_hash.c | 8 +++++++- net/netfilter/nft_set_pipapo.c | 18 +++++++++++------- net/netfilter/nft_set_rbtree.c | 6 ++++-- 5 files changed, 45 insertions(+), 12 deletions(-) diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 17659138e15a..a6e221de31ef 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -13,6 +13,7 @@ #include <net/netfilter/nf_flow_table.h> #include <net/netlink.h> #include <net/flow_offload.h> +#include <net/netns/generic.h> #define NFT_MAX_HOOKS (NF_INET_INGRESS + 1) @@ -700,10 +701,16 @@ static inline struct nft_expr *nft_set_ext_expr(const struct nft_set_ext *ext) return nft_set_ext(ext, NFT_SET_EXT_EXPR); } -static inline bool nft_set_elem_expired(const struct nft_set_ext *ext) +static inline bool __nft_set_elem_expired(const struct nft_set_ext *ext, + u64 tstamp) { return nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION) && - time_is_before_eq_jiffies64(*nft_set_ext_expiration(ext)); + time_after_eq64(tstamp, *nft_set_ext_expiration(ext)); +} + +static inline bool nft_set_elem_expired(const struct nft_set_ext *ext) +{ + return __nft_set_elem_expired(ext, get_jiffies_64()); } static inline struct nft_set_ext *nft_set_elem_ext(const struct nft_set *set, @@ -1592,9 +1599,22 @@ struct nftables_pernet { struct list_head module_list; struct list_head notify_list; struct mutex commit_mutex; + u64 tstamp; unsigned int base_seq; u8 validate_state; unsigned int gc_seq; }; +extern unsigned int nf_tables_net_id; + +static inline struct nftables_pernet *nft_pernet(const struct net *net) +{ + return net_generic(net, nf_tables_net_id); +} + +static inline u64 nft_net_tstamp(const struct net *net) +{ + return nft_pernet(net)->tstamp; +} + #endif /* _NET_NF_TABLES_H */ diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 82526a768e15..81caffd53912 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9074,6 +9074,7 @@ static bool nf_tables_valid_genid(struct net *net, u32 genid) bool genid_ok; mutex_lock(&nft_net->commit_mutex); + nft_net->tstamp = get_jiffies_64(); genid_ok = genid == 0 || nft_net->base_seq == genid; if (!genid_ok) diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c index f4a76ae8a11c..1c52e0559011 100644 --- a/net/netfilter/nft_set_hash.c +++ b/net/netfilter/nft_set_hash.c @@ -38,6 +38,7 @@ struct nft_rhash_cmp_arg { const struct nft_set *set; const u32 *key; u8 genmask; + u64 tstamp; }; static inline u32 nft_rhash_key(const void *data, u32 len, u32 seed) @@ -64,7 +65,7 @@ static inline int nft_rhash_cmp(struct rhashtable_compare_arg *arg, return 1; if (nft_set_elem_is_dead(&he->ext)) return 1; - if (nft_set_elem_expired(&he->ext)) + if (__nft_set_elem_expired(&he->ext, x->tstamp)) return 1; if (!nft_set_elem_active(&he->ext, x->genmask)) return 1; @@ -88,6 +89,7 @@ static bool nft_rhash_lookup(const struct net *net, const struct nft_set *set, .genmask = nft_genmask_cur(net), .set = set, .key = key, + .tstamp = get_jiffies_64(), }; he = rhashtable_lookup(&priv->ht, &arg, nft_rhash_params); @@ -106,6 +108,7 @@ static void *nft_rhash_get(const struct net *net, const struct nft_set *set, .genmask = nft_genmask_cur(net), .set = set, .key = elem->key.val.data, + .tstamp = get_jiffies_64(), }; he = rhashtable_lookup(&priv->ht, &arg, nft_rhash_params); @@ -129,6 +132,7 @@ static bool nft_rhash_update(struct nft_set *set, const u32 *key, .genmask = NFT_GENMASK_ANY, .set = set, .key = key, + .tstamp = get_jiffies_64(), }; he = rhashtable_lookup(&priv->ht, &arg, nft_rhash_params); @@ -172,6 +176,7 @@ static int nft_rhash_insert(const struct net *net, const struct nft_set *set, .genmask = nft_genmask_next(net), .set = set, .key = elem->key.val.data, + .tstamp = nft_net_tstamp(net), }; struct nft_rhash_elem *prev; @@ -214,6 +219,7 @@ static void *nft_rhash_deactivate(const struct net *net, .genmask = nft_genmask_next(net), .set = set, .key = elem->key.val.data, + .tstamp = nft_net_tstamp(net), }; rcu_read_lock(); diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 2a147e969c66..65ecf486f8d1 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -504,6 +504,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, * @set: nftables API set representation * @data: Key data to be matched against existing elements * @genmask: If set, check that element is active in given genmask + * @tstamp: timestamp to check for expired elements * * This is essentially the same as the lookup function, except that it matches * key data against the uncommitted copy and doesn't use preallocated maps for @@ -513,7 +514,8 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, */ static struct nft_pipapo_elem *pipapo_get(const struct net *net, const struct nft_set *set, - const u8 *data, u8 genmask) + const u8 *data, u8 genmask, + u64 tstamp) { struct nft_pipapo_elem *ret = ERR_PTR(-ENOENT); struct nft_pipapo *priv = nft_set_priv(set); @@ -566,7 +568,7 @@ static struct nft_pipapo_elem *pipapo_get(const struct net *net, goto out; if (last) { - if (nft_set_elem_expired(&f->mt[b].e->ext)) + if (__nft_set_elem_expired(&f->mt[b].e->ext, tstamp)) goto next_match; if ((genmask && !nft_set_elem_active(&f->mt[b].e->ext, genmask))) @@ -603,7 +605,7 @@ static void *nft_pipapo_get(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem, unsigned int flags) { return pipapo_get(net, set, (const u8 *)elem->key.val.data, - nft_genmask_cur(net)); + nft_genmask_cur(net), get_jiffies_64()); } /** @@ -1197,6 +1199,7 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m = priv->clone; u8 genmask = nft_genmask_next(net); + u64 tstamp = nft_net_tstamp(net); struct nft_pipapo_field *f; const u8 *start_p, *end_p; int i, bsize_max, err = 0; @@ -1206,7 +1209,7 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, else end = start; - dup = pipapo_get(net, set, start, genmask); + dup = pipapo_get(net, set, start, genmask, tstamp); if (!IS_ERR(dup)) { /* Check if we already have the same exact entry */ const struct nft_data *dup_key, *dup_end; @@ -1228,7 +1231,7 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, if (PTR_ERR(dup) == -ENOENT) { /* Look for partially overlapping entries */ - dup = pipapo_get(net, set, end, nft_genmask_next(net)); + dup = pipapo_get(net, set, end, nft_genmask_next(net), tstamp); } if (PTR_ERR(dup) != -ENOENT) { @@ -1580,6 +1583,7 @@ static void pipapo_gc(const struct nft_set *_set, struct nft_pipapo_match *m) struct nft_set *set = (struct nft_set *) _set; struct nft_pipapo *priv = nft_set_priv(set); struct net *net = read_pnet(&set->net); + u64 tstamp = nft_net_tstamp(net); int rules_f0, first_rule = 0; struct nft_trans_gc *gc; @@ -1613,7 +1617,7 @@ static void pipapo_gc(const struct nft_set *_set, struct nft_pipapo_match *m) /* synchronous gc never fails, there is no need to set on * NFT_SET_ELEM_DEAD_BIT. */ - if (nft_set_elem_expired(&e->ext)) { + if (__nft_set_elem_expired(&e->ext, tstamp)) { priv->dirty = true; gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); @@ -1772,7 +1776,7 @@ static void *pipapo_deactivate(const struct net *net, const struct nft_set *set, { struct nft_pipapo_elem *e; - e = pipapo_get(net, set, data, nft_genmask_next(net)); + e = pipapo_get(net, set, data, nft_genmask_next(net), nft_net_tstamp(net)); if (IS_ERR(e)) return NULL; diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 323cbb446d19..2d29cf8d51e0 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -314,6 +314,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set, struct rb_node *node, *next, *parent, **p, *first = NULL; struct nft_rbtree *priv = nft_set_priv(set); u8 genmask = nft_genmask_next(net); + u64 tstamp = nft_net_tstamp(net); int d, err; /* Descend the tree to search for an existing element greater than the @@ -359,7 +360,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set, continue; /* perform garbage collection to avoid bogus overlap reports. */ - if (nft_set_elem_expired(&rbe->ext)) { + if (__nft_set_elem_expired(&rbe->ext, tstamp)) { err = nft_rbtree_gc_elem(set, priv, rbe); if (err < 0) return err; @@ -533,6 +534,7 @@ static void *nft_rbtree_deactivate(const struct net *net, const struct rb_node *parent = priv->root.rb_node; struct nft_rbtree_elem *rbe, *this = elem->priv; u8 genmask = nft_genmask_next(net); + u64 tstamp = nft_net_tstamp(net); int d; while (parent != NULL) { @@ -553,7 +555,7 @@ static void *nft_rbtree_deactivate(const struct net *net, nft_rbtree_interval_end(this)) { parent = parent->rb_right; continue; - } else if (nft_set_elem_expired(&rbe->ext)) { + } else if (__nft_set_elem_expired(&rbe->ext, tstamp)) { break; } else if (!nft_set_elem_active(&rbe->ext, genmask)) { parent = parent->rb_left; -- 2.34.1

2 1

[PATCH OLK-5.10 v2] net/mlx5e: Fix netif state handling
by Liu Jian 17 Oct '24

17 Oct '24

From: Shay Drory <shayd(a)nvidia.com> mainline inclusion from mainline-v6.10-rc1 commit 3d5918477f94e4c2f064567875c475468e264644 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IA6S7G CVE: CVE-2024-38608 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… --------------------------- mlx5e_suspend cleans resources only if netif_device_present() returns true. However, mlx5e_resume changes the state of netif, via mlx5e_nic_enable, only if reg_state == NETREG_REGISTERED. In the below case, the above leads to NULL-ptr Oops[1] and memory leaks: mlx5e_probe _mlx5e_resume mlx5e_attach_netdev mlx5e_nic_enable <-- netdev not reg, not calling netif_device_attach() register_netdev <-- failed for some reason. ERROR_FLOW: _mlx5e_suspend <-- netif_device_present return false, resources aren't freed :( Hence, clean resources in this case as well. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0010 [#1] SMP CPU: 2 PID: 9345 Comm: test-ovs-ct-gen Not tainted 6.5.0_for_upstream_min_debug_2023_09_05_16_01 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:0x0 Code: Unable to access opcode bytes at0xffffffffffffffd6. RSP: 0018:ffff888178aaf758 EFLAGS: 00010246 Call Trace: <TASK> ? __die+0x20/0x60 ? page_fault_oops+0x14c/0x3c0 ? exc_page_fault+0x75/0x140 ? asm_exc_page_fault+0x22/0x30 notifier_call_chain+0x35/0xb0 blocking_notifier_call_chain+0x3d/0x60 mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core] mlx5_core_uplink_netdev_event_replay+0x3e/0x60 [mlx5_core] mlx5_mdev_netdev_track+0x53/0x60 [mlx5_ib] mlx5_ib_roce_init+0xc3/0x340 [mlx5_ib] __mlx5_ib_add+0x34/0xd0 [mlx5_ib] mlx5r_probe+0xe1/0x210 [mlx5_ib] ? auxiliary_match_id+0x6a/0x90 auxiliary_bus_probe+0x38/0x80 ? driver_sysfs_add+0x51/0x80 really_probe+0xc9/0x3e0 ? driver_probe_device+0x90/0x90 __driver_probe_device+0x80/0x160 driver_probe_device+0x1e/0x90 __device_attach_driver+0x7d/0x100 bus_for_each_drv+0x80/0xd0 __device_attach+0xbc/0x1f0 bus_probe_device+0x86/0xa0 device_add+0x637/0x840 __auxiliary_device_add+0x3b/0xa0 add_adev+0xc9/0x140 [mlx5_core] mlx5_rescan_drivers_locked+0x22a/0x310 [mlx5_core] mlx5_register_device+0x53/0xa0 [mlx5_core] mlx5_init_one_devl_locked+0x5c4/0x9c0 [mlx5_core] mlx5_init_one+0x3b/0x60 [mlx5_core] probe_one+0x44c/0x730 [mlx5_core] local_pci_probe+0x3e/0x90 pci_device_probe+0xbf/0x210 ? kernfs_create_link+0x5d/0xa0 ? sysfs_do_create_link_sd+0x60/0xc0 really_probe+0xc9/0x3e0 ? driver_probe_device+0x90/0x90 __driver_probe_device+0x80/0x160 driver_probe_device+0x1e/0x90 __device_attach_driver+0x7d/0x100 bus_for_each_drv+0x80/0xd0 __device_attach+0xbc/0x1f0 pci_bus_add_device+0x54/0x80 pci_iov_add_virtfn+0x2e6/0x320 sriov_enable+0x208/0x420 mlx5_core_sriov_configure+0x9e/0x200 [mlx5_core] sriov_numvfs_store+0xae/0x1a0 kernfs_fop_write_iter+0x10c/0x1a0 vfs_write+0x291/0x3c0 ksys_write+0x5f/0xe0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- Fixes: 2c3b5beec46a ("net/mlx5e: More generic netdev management API") Signed-off-by: Shay Drory <shayd(a)nvidia.com> Signed-off-by: Tariq Toukan <tariqt(a)nvidia.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20240509112951.590184-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Conflicts: drivers/net/ethernet/mellanox/mlx5/core/en_main.c [Did not backport 912cebf420c2.] Signed-off-by: Liu Jian <liujian56(a)huawei.com> --- v1->v2: update bugzilla. drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 5673a4113253..160ab564d641 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5578,7 +5578,7 @@ static int mlx5e_attach(struct mlx5_core_dev *mdev, void *vpriv) return 0; } -static void mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv) +static void __mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv, bool pre_netdev_reg) { struct mlx5e_priv *priv = vpriv; struct net_device *netdev = priv->netdev; @@ -5588,13 +5588,18 @@ static void mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv) return; #endif - if (!netif_device_present(netdev)) + if (!pre_netdev_reg && !netif_device_present(netdev)) return; mlx5e_detach_netdev(priv); mlx5e_destroy_mdev_resources(mdev); } +static void mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv) +{ + __mlx5e_detach(mdev, vpriv, false); +} + static void *mlx5e_add(struct mlx5_core_dev *mdev) { struct net_device *netdev; @@ -5642,7 +5647,7 @@ static void *mlx5e_add(struct mlx5_core_dev *mdev) return priv; err_detach: - mlx5e_detach(mdev, priv); + __mlx5e_detach(mdev, priv, true); err_destroy_netdev: mlx5e_destroy_netdev(priv); return NULL; -- 2.34.1

2 1

[PATCH openEuler-22.03-LTS-SP1 v2] net/mlx5e: Fix netif state handling
by Liu Jian 17 Oct '24

17 Oct '24

From: Shay Drory <shayd(a)nvidia.com> mainline inclusion from mainline-v6.10-rc1 commit 3d5918477f94e4c2f064567875c475468e264644 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IA6S7G CVE: CVE-2024-38608 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… --------------------------- mlx5e_suspend cleans resources only if netif_device_present() returns true. However, mlx5e_resume changes the state of netif, via mlx5e_nic_enable, only if reg_state == NETREG_REGISTERED. In the below case, the above leads to NULL-ptr Oops[1] and memory leaks: mlx5e_probe _mlx5e_resume mlx5e_attach_netdev mlx5e_nic_enable <-- netdev not reg, not calling netif_device_attach() register_netdev <-- failed for some reason. ERROR_FLOW: _mlx5e_suspend <-- netif_device_present return false, resources aren't freed :( Hence, clean resources in this case as well. [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0010 [#1] SMP CPU: 2 PID: 9345 Comm: test-ovs-ct-gen Not tainted 6.5.0_for_upstream_min_debug_2023_09_05_16_01 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:0x0 Code: Unable to access opcode bytes at0xffffffffffffffd6. RSP: 0018:ffff888178aaf758 EFLAGS: 00010246 Call Trace: <TASK> ? __die+0x20/0x60 ? page_fault_oops+0x14c/0x3c0 ? exc_page_fault+0x75/0x140 ? asm_exc_page_fault+0x22/0x30 notifier_call_chain+0x35/0xb0 blocking_notifier_call_chain+0x3d/0x60 mlx5_blocking_notifier_call_chain+0x22/0x30 [mlx5_core] mlx5_core_uplink_netdev_event_replay+0x3e/0x60 [mlx5_core] mlx5_mdev_netdev_track+0x53/0x60 [mlx5_ib] mlx5_ib_roce_init+0xc3/0x340 [mlx5_ib] __mlx5_ib_add+0x34/0xd0 [mlx5_ib] mlx5r_probe+0xe1/0x210 [mlx5_ib] ? auxiliary_match_id+0x6a/0x90 auxiliary_bus_probe+0x38/0x80 ? driver_sysfs_add+0x51/0x80 really_probe+0xc9/0x3e0 ? driver_probe_device+0x90/0x90 __driver_probe_device+0x80/0x160 driver_probe_device+0x1e/0x90 __device_attach_driver+0x7d/0x100 bus_for_each_drv+0x80/0xd0 __device_attach+0xbc/0x1f0 bus_probe_device+0x86/0xa0 device_add+0x637/0x840 __auxiliary_device_add+0x3b/0xa0 add_adev+0xc9/0x140 [mlx5_core] mlx5_rescan_drivers_locked+0x22a/0x310 [mlx5_core] mlx5_register_device+0x53/0xa0 [mlx5_core] mlx5_init_one_devl_locked+0x5c4/0x9c0 [mlx5_core] mlx5_init_one+0x3b/0x60 [mlx5_core] probe_one+0x44c/0x730 [mlx5_core] local_pci_probe+0x3e/0x90 pci_device_probe+0xbf/0x210 ? kernfs_create_link+0x5d/0xa0 ? sysfs_do_create_link_sd+0x60/0xc0 really_probe+0xc9/0x3e0 ? driver_probe_device+0x90/0x90 __driver_probe_device+0x80/0x160 driver_probe_device+0x1e/0x90 __device_attach_driver+0x7d/0x100 bus_for_each_drv+0x80/0xd0 __device_attach+0xbc/0x1f0 pci_bus_add_device+0x54/0x80 pci_iov_add_virtfn+0x2e6/0x320 sriov_enable+0x208/0x420 mlx5_core_sriov_configure+0x9e/0x200 [mlx5_core] sriov_numvfs_store+0xae/0x1a0 kernfs_fop_write_iter+0x10c/0x1a0 vfs_write+0x291/0x3c0 ksys_write+0x5f/0xe0 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- Fixes: 2c3b5beec46a ("net/mlx5e: More generic netdev management API") Signed-off-by: Shay Drory <shayd(a)nvidia.com> Signed-off-by: Tariq Toukan <tariqt(a)nvidia.com> Reviewed-by: Simon Horman <horms(a)kernel.org> Link: https://lore.kernel.org/r/20240509112951.590184-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> Conflicts: drivers/net/ethernet/mellanox/mlx5/core/en_main.c [Did not backport 912cebf420c2.] Signed-off-by: Liu Jian <liujian56(a)huawei.com> --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index cfc3bfcb04a2..b8717cbf23dd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5578,7 +5578,7 @@ static int mlx5e_attach(struct mlx5_core_dev *mdev, void *vpriv) return 0; } -static void mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv) +static void __mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv, bool pre_netdev_reg) { struct mlx5e_priv *priv = vpriv; struct net_device *netdev = priv->netdev; @@ -5588,13 +5588,18 @@ static void mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv) return; #endif - if (!netif_device_present(netdev)) + if (!pre_netdev_reg && !netif_device_present(netdev)) return; mlx5e_detach_netdev(priv); mlx5e_destroy_mdev_resources(mdev); } +static void mlx5e_detach(struct mlx5_core_dev *mdev, void *vpriv) +{ + __mlx5e_detach(mdev, vpriv, false); +} + static void *mlx5e_add(struct mlx5_core_dev *mdev) { struct net_device *netdev; @@ -5642,7 +5647,7 @@ static void *mlx5e_add(struct mlx5_core_dev *mdev) return priv; err_detach: - mlx5e_detach(mdev, priv); + __mlx5e_detach(mdev, priv, true); err_destroy_netdev: mlx5e_destroy_netdev(priv); return NULL; -- 2.34.1

2 1

[PATCH OLK-6.6] dmaengine: altera-msgdma: properly free descriptor in msgdma_free_descriptor
by Jinjiang Tu 16 Oct '24

16 Oct '24

From: Olivier Dautricourt <olivierdautricourt(a)gmail.com> stable inclusion from stable-v6.6.50 commit 20bf2920a869f9dbda0ef8c94c87d1901a64a716 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IARVBS CVE: CVE-2024-46716 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id… -------------------------------- [ Upstream commit 54e4ada1a4206f878e345ae01cf37347d803d1b1 ] Remove list_del call in msgdma_chan_desc_cleanup, this should be the role of msgdma_free_descriptor. In consequence replace list_add_tail with list_move_tail in msgdma_free_descriptor. This fixes the path: msgdma_free_chan_resources -> msgdma_free_descriptors -> msgdma_free_desc_list -> msgdma_free_descriptor which does not correctly free the descriptors as first nodes were not removed from the list. Signed-off-by: Olivier Dautricourt <olivierdautricourt(a)gmail.com> Tested-by: Olivier Dautricourt <olivierdautricourt(a)gmail.com> Link: https://lore.kernel.org/r/20240608213216.25087-3-olivierdautricourt@gmail.c… Signed-off-by: Vinod Koul <vkoul(a)kernel.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> Conflicts: drivers/dma/altera-msgdma.c [Context conflicts.] Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com> --- drivers/dma/altera-msgdma.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/dma/altera-msgdma.c b/drivers/dma/altera-msgdma.c index 4153c2edb049..ff231c984b9f 100644 --- a/drivers/dma/altera-msgdma.c +++ b/drivers/dma/altera-msgdma.c @@ -233,7 +233,7 @@ static void msgdma_free_descriptor(struct msgdma_device *mdev, struct msgdma_sw_desc *child, *next; mdev->desc_free_cnt++; - list_add_tail(&desc->node, &mdev->free_list); + list_move_tail(&desc->node, &mdev->free_list); list_for_each_entry_safe(child, next, &desc->tx_list, node) { mdev->desc_free_cnt++; list_move_tail(&child->node, &mdev->free_list); @@ -587,8 +587,6 @@ static void msgdma_chan_desc_cleanup(struct msgdma_device *mdev) list_for_each_entry_safe(desc, next, &mdev->done_list, node) { struct dmaengine_desc_callback cb; - list_del(&desc->node); - dmaengine_desc_get_callback(&desc->async_tx, &cb); if (dmaengine_desc_callback_valid(&cb)) { spin_unlock(&mdev->lock); -- 2.34.1

2 1