mailweb.openeuler.org
Manage this list

Keyboard Shortcuts

Thread View

  • j: Next unread message
  • k: Previous unread message
  • j a: Jump to all threads
  • j l: Jump to MailingList overview

Kernel

Threads by month
  • ----- 2025 -----
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2024 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2023 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2022 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2021 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2020 -----
  • December
  • November
  • October
  • September
  • August
  • July
  • June
  • May
  • April
  • March
  • February
  • January
  • ----- 2019 -----
  • December
kernel@openeuler.org

  • 18444 discussions
[PATCH openEuler-21.09 1/2] mm: fix oom killing for disabled pid
by Zheng Zengkai 26 Aug '21

26 Aug '21
From: Jing Xiangfeng <jingxiangfeng(a)huawei.com> hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I46IUJ CVE: NA -------------------------------------- In oom_next_task(), if points is equal to LONG_MIN, then we choose it to kill. That is not correct. LONG_MIN means to disable killing it. So fix it. Fixes: 4da32073a8fe ("memcg: support priority for oom") Signed-off-by: Jing Xiangfeng <jingxiangfeng(a)huawei.com> Reviewed-by: Chen Wandun <chenwandun(a)huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai(a)huawei.com> --- mm/oom_kill.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 41668bd37f52..fb39b0902476 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -311,15 +311,15 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc) * choose the task with the highest number of 'points'. */ static bool oom_next_task(struct task_struct *task, struct oom_control *oc, - unsigned long points) + long points) { struct mem_cgroup *cur_memcg; struct mem_cgroup *oc_memcg; if (!static_branch_likely(&memcg_qos_stat_key)) - return !points || points < oc->chosen_points; + return (points == LONG_MIN || points < oc->chosen_points); - if (!points) + if (points == LONG_MIN) return true; if (!oc->chosen) @@ -341,9 +341,9 @@ static bool oom_next_task(struct task_struct *task, struct oom_control *oc, } #else static inline bool oom_next_task(struct task_struct *task, - struct oom_control *oc, unsigned long points) + struct oom_control *oc, long points) { - return !points || points < oc->chosen_points; + return (points == LONG_MIN || points < oc->chosen_points); } #endif -- 2.20.1
1 1
0 0
[PATCH openEuler-1.0-LTS] KVM: X86: MMU: Use the correct inherited permissions to get shadow page
by Yang Yingliang 26 Aug '21

26 Aug '21
From: Lai Jiangshan <laijs(a)linux.alibaba.com> stable inclusion from linux-4.19.204 commit 4c07e70141eebd3db64297515a427deea4822957 CVE: CVE-2021-38198 -------------------------------- commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream. When computing the access permissions of a shadow page, use the effective permissions of the walk up to that point, i.e. the logic AND of its parents' permissions. Two guest PxE entries that point at the same table gfn need to be shadowed with different shadow pages if their parents' permissions are different. KVM currently uses the effective permissions of the last non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have full ("uwx") permissions, and the effective permissions are recorded only in role.access and merged into the leaves, this can lead to incorrect reuse of a shadow page and eventually to a missing guest protection page fault. For example, here is a shared pagetable: pgd[] pud[] pmd[] virtual address pointers /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--) /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-) pgd-| (shared pmd[] as above) \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--) \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--) pud1 and pud2 point to the same pmd table, so: - ptr1 and ptr3 points to the same page. - ptr2 and ptr4 points to the same page. (pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries) - First, the guest reads from ptr1 first and KVM prepares a shadow page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1. "u--" comes from the effective permissions of pgd, pud1 and pmd1, which are stored in pt->access. "u--" is used also to get the pagetable for pud1, instead of "uw-". - Then the guest writes to ptr2 and KVM reuses pud1 which is present. The hypervisor set up a shadow page for ptr2 with pt->access is "uw-" even though the pud1 pmd (because of the incorrect argument to kvm_mmu_get_page in the previous step) has role.access="u--". - Then the guest reads from ptr3. The hypervisor reuses pud1's shadow pmd for pud2, because both use "u--" for their permissions. Thus, the shadow pmd already includes entries for both pmd1 and pmd2. - At last, the guest writes to ptr4. This causes no vmexit or pagefault, because pud1's shadow page structures included an "uw-" page even though its role.access was "u--". Any kind of shared pagetable might have the similar problem when in virtual machine without TDP enabled if the permissions are different from different ancestors. In order to fix the problem, we change pt->access to be an array, and any access in it will not include permissions ANDed from child ptes. The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/ Remember to test it with TDP disabled. The problem had existed long before the commit 41074d07c78b ("KVM: MMU: Fix inherited permissions for emulated guest pte updates"), and it is hard to find which is the culprit. So there is no fixes tag here. Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com> Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com> Cc: stable(a)vger.kernel.org Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching") Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com> [OP: - apply arch/x86/kvm/mmu/* changes to arch/x86/kvm - apply documentation changes to Documentation/virtual/kvm/mmu.txt - adjusted context in arch/x86/kvm/paging_tmpl.h] Signed-off-by: Ovidiu Panait <ovidiu.panait(a)windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> Reviewed-by: Xiu Jianfeng <xiujianfeng(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- Documentation/virtual/kvm/mmu.txt | 4 ++-- arch/x86/kvm/paging_tmpl.h | 14 +++++++++----- 2 files changed, 11 insertions(+), 7 deletions(-) diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt index e507a9e0421ed..851a8abcadce4 100644 --- a/Documentation/virtual/kvm/mmu.txt +++ b/Documentation/virtual/kvm/mmu.txt @@ -152,8 +152,8 @@ Shadow pages contain the following information: shadow pages) so role.quadrant takes values in the range 0..3. Each quadrant maps 1GB virtual address space. role.access: - Inherited guest access permissions in the form uwx. Note execute - permission is positive, not negative. + Inherited guest access permissions from the parent ptes in the form uwx. + Note execute permission is positive, not negative. role.invalid: The page is invalid and should not be used. It is a root page that is currently pinned (by a cpu hardware register pointing to it); once it is diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index adf42dc8d38b0..31014a746aede 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -93,8 +93,8 @@ struct guest_walker { gpa_t pte_gpa[PT_MAX_FULL_LEVELS]; pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS]; bool pte_writable[PT_MAX_FULL_LEVELS]; - unsigned pt_access; - unsigned pte_access; + unsigned int pt_access[PT_MAX_FULL_LEVELS]; + unsigned int pte_access; gfn_t gfn; struct x86_exception fault; }; @@ -388,13 +388,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, } walker->ptes[walker->level - 1] = pte; + + /* Convert to ACC_*_MASK flags for struct guest_walker. */ + walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask); } while (!is_last_gpte(mmu, walker->level, pte)); pte_pkey = FNAME(gpte_pkeys)(vcpu, pte); accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0; /* Convert to ACC_*_MASK flags for struct guest_walker. */ - walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask); walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask); errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access); if (unlikely(errcode)) @@ -432,7 +434,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker, } pgprintk("%s: pte %llx pte_access %x pt_access %x\n", - __func__, (u64)pte, walker->pte_access, walker->pt_access); + __func__, (u64)pte, walker->pte_access, + walker->pt_access[walker->level - 1]); return 1; error: @@ -601,7 +604,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr, { struct kvm_mmu_page *sp = NULL; struct kvm_shadow_walk_iterator it; - unsigned direct_access, access = gw->pt_access; + unsigned int direct_access, access; int top_level, ret; gfn_t gfn, base_gfn; @@ -633,6 +636,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr, sp = NULL; if (!is_shadow_present_pte(*it.sptep)) { table_gfn = gw->table_gfn[it.level - 2]; + access = gw->pt_access[it.level - 2]; sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1, false, access); } -- 2.25.1
1 0
0 0
[PATCH kernel-4.19] x86/config: Enable CONFIG_USERSWAP for openeuler_defconfig
by Yang Yingliang 26 Aug '21

26 Aug '21
From: Xiongfeng Wang <wangxiongfeng2(a)huawei.com> hulk inclusion category: feature bugzilla: 47439 CVE: NA ------------------------------------------------- Enable CONFIG_USERSWAP for openeuler_defconfig Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com> Reviewed-by: tong tiangen <tongtiangen(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- arch/x86/configs/openeuler_defconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/configs/openeuler_defconfig b/arch/x86/configs/openeuler_defconfig index 80f5769d2cc5a..4e33229af5e26 100644 --- a/arch/x86/configs/openeuler_defconfig +++ b/arch/x86/configs/openeuler_defconfig @@ -1005,6 +1005,7 @@ CONFIG_TRANSPARENT_HUGE_PAGECACHE=y CONFIG_CLEANCACHE=y CONFIG_FRONTSWAP=y CONFIG_SHRINK_PAGECACHE=y +CONFIG_USERSWAP=y # CONFIG_CMA is not set CONFIG_MEM_SOFT_DIRTY=y CONFIG_ZSWAP=y -- 2.25.1
1 0
0 0
[PATCH openEuler-1.0-LTS] ext4: fix panic when mount failed with parallel flush_stashed_error_work
by Yang Yingliang 26 Aug '21

26 Aug '21
From: yangerkun <yangerkun(a)huawei.com> hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I46HJ6 CVE: NA --------------------------- 'commit c92dc856848f ("ext4: defer saving error info from atomic context")' and '2d01ddc86606 ("ext4: save error info to sb through journal if available")' add s_error_work to fix checksum error problem. But the error path in ext4_fill_super can lead the follow BUG_ON. Our testcase got follow BUG: [32031.759805] ------------[ cut here ]------------ [32031.759807] kernel BUG at fs/jbd2/transaction.c:373! [32031.760075] invalid opcode: 0000 [#1] SMP PTI [32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded Tainted: G OE --------- - - 4.18.0 ... [32031.766665] jbd2__journal_start+0xf1/0x1f0 [jbd2] [32031.766934] jbd2_journal_start+0x19/0x20 [jbd2] [32031.767218] flush_stashed_error_work+0x30/0x90 [ext4] [32031.767487] process_one_work+0x195/0x390 [32031.767747] worker_thread+0x30/0x390 [32031.768007] ? process_one_work+0x390/0x390 [32031.768265] kthread+0x10d/0x130 [32031.768521] ? kthread_flush_work_fn+0x10/0x10 [32031.768778] ret_from_fork+0x35/0x40 static int start_this_handle(...) BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this For this case, flush_stashed_error_work will try to access journal with parallel ext4_fill_super try to destroy journal. We need make sure there is no work before me destroy journal. Fix it by flush error work before we destroy journal. Fixes: c92dc856848f ("ext4: defer saving error info from atomic context") Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available") Signed-off-by: yangerkun <yangerkun(a)huawei.com> Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- fs/ext4/super.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index d793e597c0623..280e991e61f47 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4832,6 +4832,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_ea_block_cache = NULL; } if (sbi->s_journal) { + flush_work(&sbi->s_error_work); jbd2_journal_destroy(sbi->s_journal); sbi->s_journal = NULL; } -- 2.25.1
1 0
0 0
[PATCH kernel-4.19] ext4: fix panic when mount failed with parallel flush_stashed_error_work
by Yang Yingliang 26 Aug '21

26 Aug '21
From: yangerkun <yangerkun(a)huawei.com> hulk inclusion category: bugfix bugzilla: 172146 CVE: NA --------------------------- 'commit c92dc856848f ("ext4: defer saving error info from atomic context")' and '2d01ddc86606 ("ext4: save error info to sb through journal if available")' add s_error_work to fix checksum error problem. But the error path in ext4_fill_super can lead the follow BUG_ON. Our testcase got follow BUG: [32031.759805] ------------[ cut here ]------------ [32031.759807] kernel BUG at fs/jbd2/transaction.c:373! [32031.760075] invalid opcode: 0000 [#1] SMP PTI [32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded Tainted: G OE --------- - - 4.18.0 ... [32031.766665] jbd2__journal_start+0xf1/0x1f0 [jbd2] [32031.766934] jbd2_journal_start+0x19/0x20 [jbd2] [32031.767218] flush_stashed_error_work+0x30/0x90 [ext4] [32031.767487] process_one_work+0x195/0x390 [32031.767747] worker_thread+0x30/0x390 [32031.768007] ? process_one_work+0x390/0x390 [32031.768265] kthread+0x10d/0x130 [32031.768521] ? kthread_flush_work_fn+0x10/0x10 [32031.768778] ret_from_fork+0x35/0x40 static int start_this_handle(...) BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this For this case, flush_stashed_error_work will try to access journal with parallel ext4_fill_super try to destroy journal. We need make sure there is no work before me destroy journal. Fix it by flush error work before we destroy journal. Fixes: c92dc856848f ("ext4: defer saving error info from atomic context") Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available") Signed-off-by: yangerkun <yangerkun(a)huawei.com> Reviewed-by: Zhang Yi <yi.zhang(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- fs/ext4/super.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index d793e597c0623..280e991e61f47 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4832,6 +4832,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_ea_block_cache = NULL; } if (sbi->s_journal) { + flush_work(&sbi->s_error_work); jbd2_journal_destroy(sbi->s_journal); sbi->s_journal = NULL; } -- 2.25.1
1 0
0 0
[PATCH openEuler-1.0-LTS] device core: Consolidate locking and unlocking of parent and device
by Yang Yingliang 26 Aug '21

26 Aug '21
From: Alexander Duyck <alexander.h.duyck(a)linux.intel.com> mainline inclusion from mainline-v5.1-rc1 commit ed88747 category: bugfix bugzilla: 176200 CVE: NA -------------------------------- Try to consolidate all of the locking and unlocking of both the parent and device when attaching or removing a driver from a given device. To do that I first consolidated the lock pattern into two functions __device_driver_lock and __device_driver_unlock. After doing that I then created functions specific to attaching and detaching the driver while acquiring these locks. By doing this I was able to reduce the number of spots where we touch need_parent_lock from 12 down to 4. This patch should produce no functional changes, it is meant to be a code clean-up/consolidation only. Reviewed-by: Luis Chamberlain <mcgrof(a)kernel.org> Reviewed-by: Bart Van Assche <bvanassche(a)acm.org> Reviewed-by: Dan Williams <dan.j.williams(a)intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck(a)linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com> Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- drivers/base/base.h | 2 + drivers/base/bus.c | 23 ++--------- drivers/base/dd.c | 95 +++++++++++++++++++++++++++++++++++---------- 3 files changed, 81 insertions(+), 39 deletions(-) diff --git a/drivers/base/base.h b/drivers/base/base.h index 559b047de9f75..2d270b8c731a0 100644 --- a/drivers/base/base.h +++ b/drivers/base/base.h @@ -128,6 +128,8 @@ extern int driver_add_groups(struct device_driver *drv, const struct attribute_group **groups); extern void driver_remove_groups(struct device_driver *drv, const struct attribute_group **groups); +int device_driver_attach(struct device_driver *drv, struct device *dev); +void device_driver_detach(struct device *dev); extern char *make_class_name(const char *name, struct kobject *kobj); diff --git a/drivers/base/bus.c b/drivers/base/bus.c index e06a57936cc96..38a09ca932a3b 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -187,11 +187,7 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf, dev = bus_find_device_by_name(bus, NULL, buf); if (dev && dev->driver == drv) { - if (dev->parent && dev->bus->need_parent_lock) - device_lock(dev->parent); - device_release_driver(dev); - if (dev->parent && dev->bus->need_parent_lock) - device_unlock(dev->parent); + device_driver_detach(dev); err = count; } put_device(dev); @@ -214,13 +210,7 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf, dev = bus_find_device_by_name(bus, NULL, buf); if (dev && dev->driver == NULL && driver_match_device(drv, dev)) { - if (dev->parent && bus->need_parent_lock) - device_lock(dev->parent); - device_lock(dev); - err = driver_probe_device(drv, dev); - device_unlock(dev); - if (dev->parent && bus->need_parent_lock) - device_unlock(dev->parent); + err = device_driver_attach(drv, dev); if (err > 0) { /* success */ @@ -774,13 +764,8 @@ EXPORT_SYMBOL_GPL(bus_rescan_devices); */ int device_reprobe(struct device *dev) { - if (dev->driver) { - if (dev->parent && dev->bus->need_parent_lock) - device_lock(dev->parent); - device_release_driver(dev); - if (dev->parent && dev->bus->need_parent_lock) - device_unlock(dev->parent); - } + if (dev->driver) + device_driver_detach(dev); return bus_rescan_devices_helper(dev, NULL); } EXPORT_SYMBOL_GPL(device_reprobe); diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 35112842b50e8..24d3a6289e9bf 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -865,6 +865,64 @@ void device_initial_probe(struct device *dev) __device_attach(dev, true); } +/* + * __device_driver_lock - acquire locks needed to manipulate dev->drv + * @dev: Device we will update driver info for + * @parent: Parent device. Needed if the bus requires parent lock + * + * This function will take the required locks for manipulating dev->drv. + * Normally this will just be the @dev lock, but when called for a USB + * interface, @parent lock will be held as well. + */ +static void __device_driver_lock(struct device *dev, struct device *parent) +{ + if (parent && dev->bus->need_parent_lock) + device_lock(parent); + device_lock(dev); +} + +/* + * __device_driver_unlock - release locks needed to manipulate dev->drv + * @dev: Device we will update driver info for + * @parent: Parent device. Needed if the bus requires parent lock + * + * This function will release the required locks for manipulating dev->drv. + * Normally this will just be the the @dev lock, but when called for a + * USB interface, @parent lock will be released as well. + */ +static void __device_driver_unlock(struct device *dev, struct device *parent) +{ + device_unlock(dev); + if (parent && dev->bus->need_parent_lock) + device_unlock(parent); +} + +/** + * device_driver_attach - attach a specific driver to a specific device + * @drv: Driver to attach + * @dev: Device to attach it to + * + * Manually attach driver to a device. Will acquire both @dev lock and + * @dev->parent lock if needed. + */ +int device_driver_attach(struct device_driver *drv, struct device *dev) +{ + int ret = 0; + + __device_driver_lock(dev, dev->parent); + + /* + * If device has been removed or someone has already successfully + * bound a driver before us just skip the driver probe call. + */ + if (!dev->p->dead && !dev->driver) + ret = driver_probe_device(drv, dev); + + __device_driver_unlock(dev, dev->parent); + + return ret; +} + static int __driver_attach(struct device *dev, void *data) { struct device_driver *drv = data; @@ -892,14 +950,7 @@ static int __driver_attach(struct device *dev, void *data) return ret; } /* ret > 0 means positive match */ - if (dev->parent && dev->bus->need_parent_lock) - device_lock(dev->parent); - device_lock(dev); - if (!dev->p->dead && !dev->driver) - driver_probe_device(drv, dev); - device_unlock(dev); - if (dev->parent && dev->bus->need_parent_lock) - device_unlock(dev->parent); + device_driver_attach(drv, dev); return 0; } @@ -930,15 +981,11 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv = dev->driver; if (drv) { while (device_links_busy(dev)) { - device_unlock(dev); - if (parent && dev->bus->need_parent_lock) - device_unlock(parent); + __device_driver_unlock(dev, parent); device_links_unbind_consumers(dev); - if (parent && dev->bus->need_parent_lock) - device_lock(parent); - device_lock(dev); + __device_driver_lock(dev, parent); /* * A concurrent invocation of the same function might * have released the driver successfully while this one @@ -991,16 +1038,12 @@ void device_release_driver_internal(struct device *dev, struct device_driver *drv, struct device *parent) { - if (parent && dev->bus->need_parent_lock) - device_lock(parent); + __device_driver_lock(dev, parent); - device_lock(dev); if (!drv || drv == dev->driver) __device_release_driver(dev, parent); - device_unlock(dev); - if (parent && dev->bus->need_parent_lock) - device_unlock(parent); + __device_driver_unlock(dev, parent); } /** @@ -1025,6 +1068,18 @@ void device_release_driver(struct device *dev) } EXPORT_SYMBOL_GPL(device_release_driver); +/** + * device_driver_detach - detach driver from a specific device + * @dev: device to detach driver from + * + * Detach driver from device. Will acquire both @dev lock and @dev->parent + * lock if needed. + */ +void device_driver_detach(struct device *dev) +{ + device_release_driver_internal(dev, NULL, dev->parent); +} + /** * driver_detach - detach driver from all devices it controls. * @drv: driver. -- 2.25.1
1 0
0 0
[PATCH openEuler-21.03] tracing: Correct the length check which causes memory corruption
by Hongyu Li 26 Aug '21

26 Aug '21
From: Liangyan <liangyan.peng(a)linux.alibaba.com> stable inclusion from stable-v5.10.44 commit 43c32c22254b9328d7abb1c2b0f689dc67838e60 bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=344 CVE: NA -------------------------------- commit 3e08a9f9760f4a70d633c328a76408e62d6f80a3 upstream. We've suffered from severe kernel crashes due to memory corruption on our production environment, like, Call Trace: [1640542.554277] general protection fault: 0000 [#1] SMP PTI [1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G [1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190 [1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286 [1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX: 0000000006e931bf [1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI: ffff9a45ff004300 [1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09: 0000000000000000 [1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9a20608d [1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15: 696c662f65636976 [1640542.563128] FS: 00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000) knlGS:0000000000000000 [1640542.563937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4: 00000000003606e0 [1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [1640542.566742] Call Trace: [1640542.567009] anon_vma_clone+0x5d/0x170 [1640542.567417] __split_vma+0x91/0x1a0 [1640542.567777] do_munmap+0x2c6/0x320 [1640542.568128] vm_munmap+0x54/0x70 [1640542.569990] __x64_sys_munmap+0x22/0x30 [1640542.572005] do_syscall_64+0x5b/0x1b0 [1640542.573724] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1640542.575642] RIP: 0033:0x7f45d6e61e27 James Wang has reproduced it stably on the latest 4.19 LTS. After some debugging, we finally proved that it's due to ftrace buffer out-of-bound access using a debug tool as follows: [ 86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000 [ 86.780806] no_context+0xdf/0x3c0 [ 86.784327] __do_page_fault+0x252/0x470 [ 86.788367] do_page_fault+0x32/0x140 [ 86.792145] page_fault+0x1e/0x30 [ 86.795576] strncpy_from_unsafe+0x66/0xb0 [ 86.799789] fetch_memory_string+0x25/0x40 [ 86.804002] fetch_deref_string+0x51/0x60 [ 86.808134] kprobe_trace_func+0x32d/0x3a0 [ 86.812347] kprobe_dispatcher+0x45/0x50 [ 86.816385] kprobe_ftrace_handler+0x90/0xf0 [ 86.820779] ftrace_ops_assist_func+0xa1/0x140 [ 86.825340] 0xffffffffc00750bf [ 86.828603] do_sys_open+0x5/0x1f0 [ 86.832124] do_syscall_64+0x5b/0x1b0 [ 86.835900] entry_SYSCALL_64_after_hwframe+0x44/0xa9 commit b220c049d519 ("tracing: Check length before giving out the filter buffer") adds length check to protect trace data overflow introduced in 0fc1b09ff1ff, seems that this fix can't prevent overflow entirely, the length check should also take the sizeof entry->array[0] into account, since this array[0] is filled the length of trace data and occupy addtional space and risk overflow. Link: https://lkml.kernel.org/r/20210607125734.1770447-1-liangyan.peng@linux.alib… Cc: stable(a)vger.kernel.org Cc: Ingo Molnar <mingo(a)redhat.com> Cc: Xunlei Pang <xlpang(a)linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Fixes: b220c049d519 ("tracing: Check length before giving out the filter buffer") Reviewed-by: Xunlei Pang <xlpang(a)linux.alibaba.com> Reviewed-by: yinbinbin <yinbinbin(a)alibabacloud.com> Reviewed-by: Wetp Zhang <wetp.zy(a)linux.alibaba.com> Tested-by: James Wang <jnwang(a)linux.alibaba.com> Signed-off-by: Liangyan <liangyan.peng(a)linux.alibaba.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: 李弘宇 <l543306408(a)bupt.edu.cn> --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 321f7f7a29b4..b2c141eaca02 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2734,7 +2734,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb, (entry = this_cpu_read(trace_buffered_event))) { /* Try to use the per cpu buffer first */ val = this_cpu_inc_return(trace_buffered_event_cnt); - if ((len < (PAGE_SIZE - sizeof(*entry))) && val == 1) { + if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) { trace_event_setup(entry, type, flags, pc); entry->array[0] = len; return entry; -- 2.17.1
1 0
0 0
[PATCH kernel-4.19] device core: Consolidate locking and unlocking of parent and device
by Yang Yingliang 25 Aug '21

25 Aug '21
From: Alexander Duyck <alexander.h.duyck(a)linux.intel.com> mainline inclusion from mainline-v5.1-rc1 commit ed88747 category: bugfix bugzilla: 176200 CVE: NA ------------------------------------------------- Try to consolidate all of the locking and unlocking of both the parent and device when attaching or removing a driver from a given device. To do that I first consolidated the lock pattern into two functions __device_driver_lock and __device_driver_unlock. After doing that I then created functions specific to attaching and detaching the driver while acquiring these locks. By doing this I was able to reduce the number of spots where we touch need_parent_lock from 12 down to 4. This patch should produce no functional changes, it is meant to be a code clean-up/consolidation only. Reviewed-by: Luis Chamberlain <mcgrof(a)kernel.org> Reviewed-by: Bart Van Assche <bvanassche(a)acm.org> Reviewed-by: Dan Williams <dan.j.williams(a)intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck(a)linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Xiongfeng Wang <wangxiongfeng2(a)huawei.com> Reviewed-by: Hanjun Guo <guohanjun(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- drivers/base/base.h | 2 + drivers/base/bus.c | 23 ++--------- drivers/base/dd.c | 95 +++++++++++++++++++++++++++++++++++---------- 3 files changed, 81 insertions(+), 39 deletions(-) diff --git a/drivers/base/base.h b/drivers/base/base.h index 559b047de9f75..2d270b8c731a0 100644 --- a/drivers/base/base.h +++ b/drivers/base/base.h @@ -128,6 +128,8 @@ extern int driver_add_groups(struct device_driver *drv, const struct attribute_group **groups); extern void driver_remove_groups(struct device_driver *drv, const struct attribute_group **groups); +int device_driver_attach(struct device_driver *drv, struct device *dev); +void device_driver_detach(struct device *dev); extern char *make_class_name(const char *name, struct kobject *kobj); diff --git a/drivers/base/bus.c b/drivers/base/bus.c index e06a57936cc96..38a09ca932a3b 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -187,11 +187,7 @@ static ssize_t unbind_store(struct device_driver *drv, const char *buf, dev = bus_find_device_by_name(bus, NULL, buf); if (dev && dev->driver == drv) { - if (dev->parent && dev->bus->need_parent_lock) - device_lock(dev->parent); - device_release_driver(dev); - if (dev->parent && dev->bus->need_parent_lock) - device_unlock(dev->parent); + device_driver_detach(dev); err = count; } put_device(dev); @@ -214,13 +210,7 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf, dev = bus_find_device_by_name(bus, NULL, buf); if (dev && dev->driver == NULL && driver_match_device(drv, dev)) { - if (dev->parent && bus->need_parent_lock) - device_lock(dev->parent); - device_lock(dev); - err = driver_probe_device(drv, dev); - device_unlock(dev); - if (dev->parent && bus->need_parent_lock) - device_unlock(dev->parent); + err = device_driver_attach(drv, dev); if (err > 0) { /* success */ @@ -774,13 +764,8 @@ EXPORT_SYMBOL_GPL(bus_rescan_devices); */ int device_reprobe(struct device *dev) { - if (dev->driver) { - if (dev->parent && dev->bus->need_parent_lock) - device_lock(dev->parent); - device_release_driver(dev); - if (dev->parent && dev->bus->need_parent_lock) - device_unlock(dev->parent); - } + if (dev->driver) + device_driver_detach(dev); return bus_rescan_devices_helper(dev, NULL); } EXPORT_SYMBOL_GPL(device_reprobe); diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 26ba7a99b7d5b..aca447eacdb2b 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -869,6 +869,64 @@ void device_initial_probe(struct device *dev) __device_attach(dev, true); } +/* + * __device_driver_lock - acquire locks needed to manipulate dev->drv + * @dev: Device we will update driver info for + * @parent: Parent device. Needed if the bus requires parent lock + * + * This function will take the required locks for manipulating dev->drv. + * Normally this will just be the @dev lock, but when called for a USB + * interface, @parent lock will be held as well. + */ +static void __device_driver_lock(struct device *dev, struct device *parent) +{ + if (parent && dev->bus->need_parent_lock) + device_lock(parent); + device_lock(dev); +} + +/* + * __device_driver_unlock - release locks needed to manipulate dev->drv + * @dev: Device we will update driver info for + * @parent: Parent device. Needed if the bus requires parent lock + * + * This function will release the required locks for manipulating dev->drv. + * Normally this will just be the the @dev lock, but when called for a + * USB interface, @parent lock will be released as well. + */ +static void __device_driver_unlock(struct device *dev, struct device *parent) +{ + device_unlock(dev); + if (parent && dev->bus->need_parent_lock) + device_unlock(parent); +} + +/** + * device_driver_attach - attach a specific driver to a specific device + * @drv: Driver to attach + * @dev: Device to attach it to + * + * Manually attach driver to a device. Will acquire both @dev lock and + * @dev->parent lock if needed. + */ +int device_driver_attach(struct device_driver *drv, struct device *dev) +{ + int ret = 0; + + __device_driver_lock(dev, dev->parent); + + /* + * If device has been removed or someone has already successfully + * bound a driver before us just skip the driver probe call. + */ + if (!dev->p->dead && !dev->driver) + ret = driver_probe_device(drv, dev); + + __device_driver_unlock(dev, dev->parent); + + return ret; +} + static int __driver_attach(struct device *dev, void *data) { struct device_driver *drv = data; @@ -896,14 +954,7 @@ static int __driver_attach(struct device *dev, void *data) return ret; } /* ret > 0 means positive match */ - if (dev->parent && dev->bus->need_parent_lock) - device_lock(dev->parent); - device_lock(dev); - if (!dev->p->dead && !dev->driver) - driver_probe_device(drv, dev); - device_unlock(dev); - if (dev->parent && dev->bus->need_parent_lock) - device_unlock(dev->parent); + device_driver_attach(drv, dev); return 0; } @@ -936,15 +987,11 @@ static void __device_release_driver(struct device *dev, struct device *parent) pm_runtime_get_sync(dev); while (device_links_busy(dev)) { - device_unlock(dev); - if (parent && dev->bus->need_parent_lock) - device_unlock(parent); + __device_driver_unlock(dev, parent); device_links_unbind_consumers(dev); - if (parent && dev->bus->need_parent_lock) - device_lock(parent); - device_lock(dev); + __device_driver_lock(dev, parent); /* * A concurrent invocation of the same function might * have released the driver successfully while this one @@ -998,16 +1045,12 @@ void device_release_driver_internal(struct device *dev, struct device_driver *drv, struct device *parent) { - if (parent && dev->bus->need_parent_lock) - device_lock(parent); + __device_driver_lock(dev, parent); - device_lock(dev); if (!drv || drv == dev->driver) __device_release_driver(dev, parent); - device_unlock(dev); - if (parent && dev->bus->need_parent_lock) - device_unlock(parent); + __device_driver_unlock(dev, parent); } /** @@ -1032,6 +1075,18 @@ void device_release_driver(struct device *dev) } EXPORT_SYMBOL_GPL(device_release_driver); +/** + * device_driver_detach - detach driver from a specific device + * @dev: device to detach driver from + * + * Detach driver from device. Will acquire both @dev lock and @dev->parent + * lock if needed. + */ +void device_driver_detach(struct device *dev) +{ + device_release_driver_internal(dev, NULL, dev->parent); +} + /** * driver_detach - detach driver from all devices it controls. * @drv: driver. -- 2.25.1
1 0
0 0
[PATCH openEuler-1.0-LTS 1/2] ext4: make the updating inode data procedure atomic
by Yang Yingliang 25 Aug '21

25 Aug '21
From: Zhang Yi <yi.zhang(a)huawei.com> hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I476C7 CVE: NA --------------------------- Now that ext4_do_update_inode() return error before filling the whole inode data if we fail to set inode blocks in ext4_inode_blocks_set(). This error should never happen in theory since sb->s_maxbytes should not have allowed this, we have already init sb->s_maxbytes according to this feature in ext4_fill_super(). So even through that could only happen due to the filesystem corruption, we'd better to return after we finish updating the inode because it may left an uninitialized buffer and we could read this buffer later in "errors=continue" mode. This patch make the updating inode data procedure atomic, call EXT4_ERROR_INODE() after we dropping i_raw_lock after something bad happened, make sure that the inode is integrated, and also drop a BUG_ON and do some small cleanups. Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com> Reviewed-by: Jan Kara <jack(a)suse.cz> Reviewed-by: Yang Erkun <yangerkun(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- fs/ext4/inode.c | 44 ++++++++++++++++++++++++++++---------------- 1 file changed, 28 insertions(+), 16 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index b809d383cc5ae..a032f211b80cf 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5169,8 +5169,14 @@ static int ext4_inode_blocks_set(handle_t *handle, ext4_clear_inode_flag(inode, EXT4_INODE_HUGE_FILE); return 0; } + + /* + * This should never happen since sb->s_maxbytes should not have + * allowed this, sb->s_maxbytes was set according to the huge_file + * feature in ext4_fill_super(). + */ if (!ext4_has_feature_huge_file(sb)) - return -EFBIG; + return -EFSCORRUPTED; if (i_blocks <= 0xffffffffffffULL) { /* @@ -5277,16 +5283,14 @@ static int ext4_do_update_inode(handle_t *handle, spin_lock(&ei->i_raw_lock); - /* For fields not tracked in the in-memory inode, - * initialise them to zero for new inodes. */ + /* + * For fields not tracked in the in-memory inode, initialise them + * to zero for new inodes. + */ if (ext4_test_inode_state(inode, EXT4_STATE_NEW)) memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size); err = ext4_inode_blocks_set(handle, raw_inode, ei); - if (err) { - spin_unlock(&ei->i_raw_lock); - goto out_brelse; - } raw_inode->i_mode = cpu_to_le16(inode->i_mode); i_uid = i_uid_read(inode); @@ -5295,10 +5299,11 @@ static int ext4_do_update_inode(handle_t *handle, if (!(test_opt(inode->i_sb, NO_UID32))) { raw_inode->i_uid_low = cpu_to_le16(low_16_bits(i_uid)); raw_inode->i_gid_low = cpu_to_le16(low_16_bits(i_gid)); -/* - * Fix up interoperability with old kernels. Otherwise, old inodes get - * re-used with the upper 16 bits of the uid/gid intact - */ + /* + * Fix up interoperability with old kernels. Otherwise, + * old inodes get re-used with the upper 16 bits of the + * uid/gid intact. + */ if (ei->i_dtime && list_empty(&ei->i_orphan)) { raw_inode->i_uid_high = 0; raw_inode->i_gid_high = 0; @@ -5367,8 +5372,9 @@ static int ext4_do_update_inode(handle_t *handle, } } - BUG_ON(!ext4_has_feature_project(inode->i_sb) && - i_projid != EXT4_DEF_PROJID); + if (i_projid != EXT4_DEF_PROJID && + !ext4_has_feature_project(inode->i_sb)) + err = err ?: -EFSCORRUPTED; if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE && EXT4_FITS_IN_INODE(raw_inode, ei, i_projid)) @@ -5376,6 +5382,11 @@ static int ext4_do_update_inode(handle_t *handle, ext4_inode_csum_set(inode, raw_inode, ei); spin_unlock(&ei->i_raw_lock); + if (err) { + EXT4_ERROR_INODE(inode, "corrupted inode contents"); + goto out_brelse; + } + if (inode->i_sb->s_flags & SB_LAZYTIME) ext4_update_other_inodes_time(inode->i_sb, inode->i_ino, bh->b_data); @@ -5383,13 +5394,13 @@ static int ext4_do_update_inode(handle_t *handle, BUFFER_TRACE(bh, "call ext4_handle_dirty_metadata"); err = ext4_handle_dirty_metadata(handle, NULL, bh); if (err) - goto out_brelse; + goto out_error; ext4_clear_inode_state(inode, EXT4_STATE_NEW); if (set_large_file) { BUFFER_TRACE(EXT4_SB(sb)->s_sbh, "get write access"); err = ext4_journal_get_write_access(handle, EXT4_SB(sb)->s_sbh); if (err) - goto out_brelse; + goto out_error; lock_buffer(EXT4_SB(sb)->s_sbh); ext4_set_feature_large_file(sb); ext4_superblock_csum_set(sb); @@ -5399,9 +5410,10 @@ static int ext4_do_update_inode(handle_t *handle, EXT4_SB(sb)->s_sbh); } ext4_update_inode_fsync_trans(handle, inode, need_datasync); +out_error: + ext4_std_error(inode->i_sb, err); out_brelse: brelse(bh); - ext4_std_error(inode->i_sb, err); return err; } -- 2.25.1
1 1
0 0
[PATCH kernel-4.19 1/3] ext4: move inode eio simulation behind io completeion
by Yang Yingliang 25 Aug '21

25 Aug '21
From: Zhang Yi <yi.zhang(a)huawei.com> hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I476C7 CVE: NA --------------------------- No EIO simulation is required if the buffer is uptodate, so move the simulation behind read bio completeion just like inode/block bitmap simulation does. Link: https://lore.kernel.org/linux-ext4/20210821065450.1397451-2-yi.zhang@huawei… Signed-off-by: Zhang Yi <yi.zhang(a)huawei.com> Reviewed-by: Jan Kara <jack(a)suse.cz> Reviewed-by: Yang Erkun <yangerkun(a)huawei.com> Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com> --- fs/ext4/inode.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 800e6e3de40aa..1b2ccf61354f3 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4635,8 +4635,6 @@ static int __ext4_get_inode_loc(struct inode *inode, bh = sb_getblk(sb, block); if (unlikely(!bh)) return -ENOMEM; - if (ext4_simulate_fail(sb, EXT4_SIM_INODE_EIO)) - goto simulate_eio; if (!buffer_uptodate(bh)) { lock_buffer(bh); @@ -4721,8 +4719,8 @@ static int __ext4_get_inode_loc(struct inode *inode, trace_ext4_load_inode(inode); ext4_read_bh_nowait(bh, REQ_META | REQ_PRIO, NULL); wait_on_buffer(bh); + ext4_simulate_fail_bh(sb, bh, EXT4_SIM_INODE_EIO); if (!buffer_uptodate(bh)) { - simulate_eio: ext4_error_inode_block(inode, block, EIO, "unable to read itable block"); brelse(bh); -- 2.25.1
1 2
0 0
  • ← Newer
  • 1
  • ...
  • 1722
  • 1723
  • 1724
  • 1725
  • 1726
  • 1727
  • 1728
  • ...
  • 1845
  • Older →

HyperKitty Powered by HyperKitty