From: Barry Song <v-songbaohua(a)oppo.com>
mainline inclusion
from mainline-v6.12-rc6
commit 01626a18230246efdcea322aa8f067e60ffe5ccd
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IB7294
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
introduced an unconditional one-tick sleep when `swapcache_prepare()`
fails, which has led to reports of UI stuttering on latency-sensitive
Android devices. To address this, we can use a waitqueue to wake up tasks
that fail `swapcache_prepare()` sooner, instead of always sleeping for a
full tick. While tasks may occasionally be woken by an unrelated
`do_swap_page()`, this method is preferable to two scenarios: rapid
re-entry into page faults, which can cause livelocks, and multiple
millisecond sleeps, which visibly degrade user experience.
Oven's testing shows that a single waitqueue resolves the UI stuttering
issue. If a 'thundering herd' problem becomes apparent later, a waitqueue
hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]` for page bit
locks can be introduced.
[v-songbaohua(a)oppo.com: wake_up only when swapcache_wq waitqueue is active]
Link: https://lkml.kernel.org/r/20241008130807.40833-1-21cnbao@gmail.com
Link: https://lkml.kernel.org/r/20240926211936.75373-1-21cnbao@gmail.com
Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
Signed-off-by: Barry Song <v-songbaohua(a)oppo.com>
Reported-by: Oven Liyang <liyangouwen1(a)oppo.com>
Tested-by: Oven Liyang <liyangouwen1(a)oppo.com>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Conflicts:
mm/memory.c
[Context conflicts.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
mm/memory.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 20869d0cd5a1..f580b9c54247 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3397,6 +3397,8 @@ void unmap_mapping_range(struct address_space *mapping,
}
EXPORT_SYMBOL(unmap_mapping_range);
+static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
+
/*
* We enter with non-exclusive mmap_lock (to exclude vma changes,
* but allow concurrent faults), and pte mapped but not yet locked.
@@ -3408,6 +3410,7 @@ EXPORT_SYMBOL(unmap_mapping_range);
vm_fault_t do_swap_page(struct vm_fault *vmf)
{
struct vm_area_struct *vma = vmf->vma;
+ DECLARE_WAITQUEUE(wait, current);
struct page *page = NULL, *swapcache;
struct swap_info_struct *si = NULL;
bool need_clear_cache = false;
@@ -3475,7 +3478,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
*/
if (swapcache_prepare(entry)) {
/* Relax a bit to prevent rapid repeated page faults */
+ add_wait_queue(&swapcache_wq, &wait);
schedule_timeout_uninterruptible(1);
+ remove_wait_queue(&swapcache_wq, &wait);
goto out;
}
need_clear_cache = true;
@@ -3650,8 +3655,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
/* Clear the swap cache pin for direct swapin after PTL unlock */
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry);
+ if (waitqueue_active(&swapcache_wq))
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
@@ -3665,8 +3673,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
unlock_page(swapcache);
put_page(swapcache);
}
- if (need_clear_cache)
+ if (need_clear_cache) {
swapcache_clear(si, entry);
+ if (waitqueue_active(&swapcache_wq))
+ wake_up(&swapcache_wq);
+ }
if (si)
put_swap_device(si);
return ret;
--
2.34.1
From: Jann Horn <jannh(a)google.com>
mainline inclusion
from mainline-v6.2-rc7
commit 023f47a8250c6bdb4aebe744db4bf7f73414028b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/IBE8BO
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
--------------------------------
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires
it to be locked.
Page table traversal is allowed under any one of the mmap lock, the
anon_vma lock (if the VMA is associated with an anon_vma), and the
mapping lock (if the VMA is associated with a mapping); and so to be
able to remove page tables, we must hold all three of them.
retract_page_tables() bails out if an ->anon_vma is attached, but does
this check before holding the mmap lock (as the comment above the check
explains).
If we racily merged an existing ->anon_vma (shared with a child
process) from a neighboring VMA, subsequent rmap traversals on pages
belonging to the child will be able to see the page tables that we are
concurrently removing while assuming that nothing else can access them.
Repeat the ->anon_vma check once we hold the mmap lock to ensure that
there really is no concurrent page table access.
Hitting this bug causes a lockdep warning in collapse_and_free_pmd(),
in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)".
It can also lead to use-after-free access.
Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_…
Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Jann Horn <jannh(a)google.com>
Reported-by: Zach O'Keefe <zokeefe(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)intel.linux.com>
Reviewed-by: Yang Shi <shy828301(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Conflicts:
mm/khugepaged.c
[Conflicts due to 34488399fa08 ("mm/madvise: add file and shmem support to
MADV_COLLAPSE") is not merged.]
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
---
mm/khugepaged.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1d5a985ae4cd..a3f45bca187b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1623,7 +1623,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* has higher cost too. It would also probably require locking
* the anon_vma.
*/
- if (vma->anon_vma)
+ if (READ_ONCE(vma->anon_vma))
continue;
addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
if (addr & ~HPAGE_PMD_MASK)
@@ -1642,6 +1642,18 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
* reverse order. Trylock is a way to avoid deadlock.
*/
if (mmap_write_trylock(mm)) {
+ /*
+ * Re-check whether we have an ->anon_vma, because
+ * collapse_and_free_pmd() requires that either no
+ * ->anon_vma exists or the anon_vma is locked.
+ * We already checked ->anon_vma above, but that check
+ * is racy because ->anon_vma can be populated under the
+ * mmap lock in read mode.
+ */
+ if (vma->anon_vma) {
+ mmap_write_unlock(mm);
+ continue;
+ }
if (!khugepaged_test_exit(mm)) {
struct mmu_notifier_range range;
--
2.34.1
From: Takashi Iwai <tiwai(a)suse.de>
stable inclusion
from stable-v6.6.64
commit 74cb86e1006c5437b1d90084d22018da30fddc77
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IBDHGG
CVE: CVE-2024-53150
Reference: https://git.kernel.org/stable/c/74cb86e1006c5437b1d90084d22018da30fddc77
commit a3dd4d63eeb452cfb064a13862fb376ab108f6a6 upstream.
The current USB-audio driver code doesn't check bLength of each
descriptor at traversing for clock descriptors. That is, when a
device provides a bogus descriptor with a shorter bLength, the driver
might hit out-of-bounds reads.
For addressing it, this patch adds sanity checks to the validator
functions for the clock descriptor traversal. When the descriptor
length is shorter than expected, it's skipped in the loop.
For the clock source and clock multiplier descriptors, we can just
check bLength against the sizeof() of each descriptor type.
OTOH, the clock selector descriptor of UAC2 and UAC3 has an array
of bNrInPins elements and two more fields at its tail, hence those
have to be checked in addition to the sizeof() check.
Reported-by: Benoît Sevens <bsevens(a)google.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/20241121140613.3651-1-bsevens@google.com
Link: https://patch.msgid.link/20241125144629.20757-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Luo Gengkun <luogengkun2(a)huawei.com>
---
sound/usb/clock.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/sound/usb/clock.c b/sound/usb/clock.c
index a676ad093d18..f0f1e445cc56 100644
--- a/sound/usb/clock.c
+++ b/sound/usb/clock.c
@@ -36,6 +36,12 @@ union uac23_clock_multiplier_desc {
struct uac_clock_multiplier_descriptor v3;
};
+/* check whether the descriptor bLength has the minimal length */
+#define DESC_LENGTH_CHECK(p, proto) \
+ ((proto) == UAC_VERSION_3 ? \
+ ((p)->v3.bLength >= sizeof((p)->v3)) : \
+ ((p)->v2.bLength >= sizeof((p)->v2)))
+
#define GET_VAL(p, proto, field) \
((proto) == UAC_VERSION_3 ? (p)->v3.field : (p)->v2.field)
@@ -58,6 +64,8 @@ static bool validate_clock_source(void *p, int id, int proto)
{
union uac23_clock_source_desc *cs = p;
+ if (!DESC_LENGTH_CHECK(cs, proto))
+ return false;
return GET_VAL(cs, proto, bClockID) == id;
}
@@ -65,13 +73,27 @@ static bool validate_clock_selector(void *p, int id, int proto)
{
union uac23_clock_selector_desc *cs = p;
- return GET_VAL(cs, proto, bClockID) == id;
+ if (!DESC_LENGTH_CHECK(cs, proto))
+ return false;
+ if (GET_VAL(cs, proto, bClockID) != id)
+ return false;
+ /* additional length check for baCSourceID array (in bNrInPins size)
+ * and two more fields (which sizes depend on the protocol)
+ */
+ if (proto == UAC_VERSION_3)
+ return cs->v3.bLength >= sizeof(cs->v3) + cs->v3.bNrInPins +
+ 4 /* bmControls */ + 2 /* wCSelectorDescrStr */;
+ else
+ return cs->v2.bLength >= sizeof(cs->v2) + cs->v2.bNrInPins +
+ 1 /* bmControls */ + 1 /* iClockSelector */;
}
static bool validate_clock_multiplier(void *p, int id, int proto)
{
union uac23_clock_multiplier_desc *cs = p;
+ if (!DESC_LENGTH_CHECK(cs, proto))
+ return false;
return GET_VAL(cs, proto, bClockID) == id;
}
--
2.34.1
Hi Shile,
FYI, the error/warning still remains.
tree: https://gitee.com/openeuler/kernel.git openEuler-1.0-LTS
head: a8a9daf97ebc8e303e039402fe784e7bd4e5a9bb
commit: badd79c400ed404df871e1d035bed971d20ead4c [1327/1327] x86/unwind/orc: Remove boot-time ORC unwind tables sorting
config: x86_64-buildonly-randconfig-004-20241216 (https://download.01.org/0day-ci/archive/20241228/202412280732.HThqDDS9-lkp@…)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241228/202412280732.HThqDDS9-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412280732.HThqDDS9-lkp@intel.com/
All warnings (new ones prefixed by >>):
arch/x86/kernel/unwind_orc.c:179:13: warning: unused function 'orc_sort_swap' [-Wunused-function]
179 | static void orc_sort_swap(void *_a, void *_b, int size)
| ^~~~~~~~~~~~~
arch/x86/kernel/unwind_orc.c:199:12: warning: unused function 'orc_sort_cmp' [-Wunused-function]
199 | static int orc_sort_cmp(const void *_a, const void *_b)
| ^~~~~~~~~~~~
2 warnings generated.
>> arch/x86/kernel/unwind_orc.o: warning: objtool: missing symbol for section .text
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
tree: https://gitee.com/openeuler/kernel.git OLK-5.10
head: 053a6b6f8e4c86200cdb20bc80c063c3bb119859
commit: 66ae8ddda388386daea0623a65ea2ac85c24ca00 [2600/2600] ascend/arm64: Add ascend_enable_all kernel parameter
config: arm64-randconfig-r133-20241227 (https://download.01.org/0day-ci/archive/20241228/202412280553.3KEHilDu-lkp@…)
compiler: aarch64-linux-gcc (GCC) 14.2.0
reproduce: (https://download.01.org/0day-ci/archive/20241228/202412280553.3KEHilDu-lkp@…)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412280553.3KEHilDu-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
arch/arm64/mm/init.c:746:9: sparse: sparse: mixing declarations and code
>> arch/arm64/mm/init.c:730:6: sparse: sparse: symbol 'ascend_enable_all_features' was not declared. Should it be static?
vim +/ascend_enable_all_features +730 arch/arm64/mm/init.c
729
> 730 void ascend_enable_all_features(void)
731 {
732 if (IS_ENABLED(CONFIG_ASCEND_DVPP_MMAP))
733 enable_mmap_dvpp = 1;
734
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki