David Gow (1):
      x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit

David Laight (1):
      x86: fix off-by-one in access_ok()

Heiko Carstens (1):
      s390: Add runtime constant support

Linus Torvalds (12):
      vfs: dcache: move hashlen_hash() from callers into d_hash()
      runtime constants: add default dummy infrastructure
      runtime constants: add x86 architecture support
      arm64: add 'runtime constant' support
      runtime constants: deal with old decrepit linkers
      x86: support user address masking instead of non-speculative conditional
      x86: fix whitespace in runtime-const assembler output
      x86/uaccess: Improve the 8-byte getuser() case
      um: Use generic runtime constant implementation
      x86: do the user address masking outside the user access area
      x86: make the masked_user_access_begin() macro use its argument only once
      x86: fix user address masking non-canonical speculation issue

Sabyrzhan Tasbolatov (1):
      kasan: move checks to do_strncpy_from_user

Stephen Brennan (1):
      dcache: keep dentry_hashtable or d_hash_shift even when not used

Thomas Gleixner (1):
      x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()
 arch/arm64/include/asm/runtime-const.h | 88 ++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S        |  3 +
 arch/s390/include/asm/runtime-const.h  | 77 ++++++++++++++++++++++
 arch/s390/kernel/vmlinux.lds.S         |  3 +
 arch/um/include/asm/Kbuild             |  1 +
 arch/x86/include/asm/runtime-const.h   | 61 ++++++++++++++++++
 arch/x86/include/asm/uaccess_64.h      | 53 ++++++++++------
 arch/x86/kernel/cpu/common.c           | 10 +++
 arch/x86/kernel/vmlinux.lds.S          |  4 ++
 arch/x86/lib/getuser.S                 | 80 +++++++++--------------
 fs/dcache.c                            | 26 ++++++--
 fs/select.c                            |  4 +-
 include/asm-generic/Kbuild             |  1 +
 include/asm-generic/runtime-const.h    | 15 +++++
 include/asm-generic/vmlinux.lds.h      |  7 ++
 include/linux/uaccess.h                |  7 ++
 lib/strncpy_from_user.c                | 14 +++-
 lib/strnlen_user.c                     |  9 +++
 18 files changed, 385 insertions(+), 78 deletions(-)
 create mode 100644 arch/arm64/include/asm/runtime-const.h
 create mode 100644 arch/s390/include/asm/runtime-const.h
 create mode 100644 arch/x86/include/asm/runtime-const.h
 create mode 100644 include/asm-generic/runtime-const.h
From: Thomas Gleixner tglx@linutronix.de
mainline inclusion from mainline-v6.9-rc1 commit ae6b0195f5c503f22673a4acf21111822305c1e0 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Sparse complains about losing the __user address space due to the cast to long:
uaccess_64.h:88:24: sparse: warning: cast removes address space '__user' of expression
Annotate it with __force to tell sparse that this is intentional.
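For illustration only (not part of the patch), a minimal sketch of the sparse behaviour in question, using an illustrative helper that mirrors valid_user_address():

  #include <linux/compiler.h>	/* __user, __force */

  /*
   * Casting a __user pointer to a plain integer type drops the '__user'
   * address space, and sparse warns unless the cast is marked __force to
   * document that the drop is intentional.
   */
  static inline int ptr_is_in_user_half(const void __user *ptr)
  {
          /* long val = (long)ptr;   <-- sparse: cast removes address space '__user' */
          long val = (__force long)ptr;

          return val >= 0;
  }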
Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20240304005104.677606054@linutronix.de Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/uaccess_64.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index e42507ec4b1d..c8d758496f8c 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -60,7 +60,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, * half and a user half. When cast to a signed type, user pointers * are positive and kernel pointers are negative. */ -#define valid_user_address(x) ((long)(x) >= 0) +#define valid_user_address(x) ((__force long)(x) >= 0)
/* * User pointers can have tag bits on x86-64. This scheme tolerates @@ -93,8 +93,9 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size) if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) { return valid_user_address(ptr); } else { - unsigned long sum = size + (unsigned long)ptr; - return valid_user_address(sum) && sum >= (unsigned long)ptr; + unsigned long sum = size + (__force unsigned long)ptr; + + return valid_user_address(sum) && sum >= (__force unsigned long)ptr; } } #define __access_ok __access_ok
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc1 commit e60cc61153e61e4e38bd983492df9959e82ae4dc category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Both __d_lookup_rcu() and __d_lookup_rcu_op_compare() have the full 'name_hash' value of the qstr that they want to look up, and mask it off to just the low 32-bit hash before calling down to d_hash().
Other callers just load the 32-bit hash and pass it as the argument.
If we move the masking into d_hash() itself, it simplifies the two callers that currently do the masking, and is a no-op for the other cases. It doesn't actually change the generated code since the compiler will inline d_hash() and see that the end result is the same.
[ Technically, since the parse tree changes, the code generation may not be 100% the same, and for me on x86-64, this does result in gcc switching the operands around for one 'cmpl' instruction. So not necessarily the exact same code generation, but equivalent ]
However, this does encapsulate the 'd_hash()' operation more, and makes the shift operation in particular be a "shift 32 bits right, return full word". Which matches the instruction semantics on both x86-64 and arm64 better, since a 32-bit shift will clear the upper bits.
That makes the next step of introducing a "shift by runtime constant" more obvious and generates the shift with no extraneous type masking.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Conflicts: fs/dcache.c [Context conflicts] Signed-off-by: Qi Xi xiqi2@huawei.com --- fs/dcache.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c index 3e0b9abbf4d8..ff2bfc3af5b6 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -100,9 +100,9 @@ static unsigned int d_hash_shift __read_mostly;
static struct hlist_bl_head *dentry_hashtable __read_mostly;
-static inline struct hlist_bl_head *d_hash(unsigned int hash) +static inline struct hlist_bl_head *d_hash(unsigned long hashlen) { - return dentry_hashtable + (hash >> d_hash_shift); + return dentry_hashtable + ((u32)hashlen >> d_hash_shift); }
#define IN_LOOKUP_SHIFT 10 @@ -2379,7 +2379,7 @@ static noinline struct dentry *__d_lookup_rcu_op_compare( unsigned *seqp) { u64 hashlen = name->hash_len; - struct hlist_bl_head *b = d_hash(hashlen_hash(hashlen)); + struct hlist_bl_head *b = d_hash(hashlen); struct hlist_bl_node *node; struct dentry *dentry;
@@ -2446,7 +2446,7 @@ struct dentry *__d_lookup_rcu(const struct dentry *parent, { u64 hashlen = name->hash_len; const unsigned char *str = name->name; - struct hlist_bl_head *b = d_hash(hashlen_hash(hashlen)); + struct hlist_bl_head *b = d_hash(hashlen); struct hlist_bl_node *node; struct dentry *dentry;
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc1 commit e78298556ee5d881f6679effb2a6743969ea6e2d category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
This adds the initial dummy support for 'runtime constants' for when an architecture doesn't actually support an implementation of fixing up said runtime constants.
This ends up being the fallback to just using the variables as regular __ro_after_init variables, and changes the dcache d_hash() function to use this model.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- fs/dcache.c | 11 ++++++++++- include/asm-generic/Kbuild | 1 + include/asm-generic/runtime-const.h | 15 +++++++++++++++ include/asm-generic/vmlinux.lds.h | 8 ++++++++ 4 files changed, 34 insertions(+), 1 deletion(-) create mode 100644 include/asm-generic/runtime-const.h
diff --git a/fs/dcache.c b/fs/dcache.c index ff2bfc3af5b6..d58804e4939b 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -35,6 +35,8 @@ #include "internal.h" #include "mount.h"
+#include <asm/runtime-const.h> + /* * Usage: * dcache->d_inode->i_lock protects: @@ -102,7 +104,8 @@ static struct hlist_bl_head *dentry_hashtable __read_mostly;
static inline struct hlist_bl_head *d_hash(unsigned long hashlen) { - return dentry_hashtable + ((u32)hashlen >> d_hash_shift); + return runtime_const_ptr(dentry_hashtable) + + runtime_const_shift_right_32(hashlen, d_hash_shift); }
#define IN_LOOKUP_SHIFT 10 @@ -3390,6 +3393,9 @@ static void __init dcache_init_early(void) 0, 0); d_hash_shift = 32 - d_hash_shift; + + runtime_const_init(shift, d_hash_shift); + runtime_const_init(ptr, dentry_hashtable); }
static void __init dcache_init(void) @@ -3418,6 +3424,9 @@ static void __init dcache_init(void) 0, 0); d_hash_shift = 32 - d_hash_shift; + + runtime_const_init(shift, d_hash_shift); + runtime_const_init(ptr, dentry_hashtable); }
/* SLAB cache for __getname() consumers */ diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild index 941be574bbe0..22673ec5defb 100644 --- a/include/asm-generic/Kbuild +++ b/include/asm-generic/Kbuild @@ -46,6 +46,7 @@ mandatory-y += pci.h mandatory-y += percpu.h mandatory-y += pgalloc.h mandatory-y += preempt.h +mandatory-y += runtime-const.h mandatory-y += rwonce.h mandatory-y += sections.h mandatory-y += serial.h diff --git a/include/asm-generic/runtime-const.h b/include/asm-generic/runtime-const.h new file mode 100644 index 000000000000..670499459514 --- /dev/null +++ b/include/asm-generic/runtime-const.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_RUNTIME_CONST_H +#define _ASM_RUNTIME_CONST_H + +/* + * This is the fallback for when the architecture doesn't + * support the runtime const operations. + * + * We just use the actual symbols as-is. + */ +#define runtime_const_ptr(sym) (sym) +#define runtime_const_shift_right_32(val, sym) ((u32)(val)>>(sym)) +#define runtime_const_init(type,sym) do { } while (0) + +#endif diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 174d865ce46e..88ecb31df393 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -954,6 +954,14 @@ #define CON_INITCALL \ BOUNDED_SECTION_POST_LABEL(.con_initcall.init, __con_initcall, _start, _end)
+#define RUNTIME_NAME(t,x) runtime_##t##_##x + +#define RUNTIME_CONST(t,x) \ + . = ALIGN(8); \ + RUNTIME_NAME(t,x) : AT(ADDR(RUNTIME_NAME(t,x)) - LOAD_OFFSET) { \ + *(RUNTIME_NAME(t,x)); \ + } + /* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */ #define KUNIT_TABLE() \ . = ALIGN(8); \
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc1 commit e3c92e81711d14b46c3121d36bc8e152cb843923 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
This implements the runtime constant infrastructure for x86, allowing the dcache d_hash() function to be generated with the hash table address as a runtime constant, followed by a shift of the hash index by a runtime-constant shift count.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/runtime-const.h | 61 ++++++++++++++++++++++++++++ arch/x86/kernel/vmlinux.lds.S | 3 ++ 2 files changed, 64 insertions(+) create mode 100644 arch/x86/include/asm/runtime-const.h
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h new file mode 100644 index 000000000000..24e3a53ca255 --- /dev/null +++ b/arch/x86/include/asm/runtime-const.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_RUNTIME_CONST_H +#define _ASM_RUNTIME_CONST_H + +#define runtime_const_ptr(sym) ({ \ + typeof(sym) __ret; \ + asm_inline("mov %1,%0\n1:\n" \ + ".pushsection runtime_ptr_" #sym ","a"\n\t" \ + ".long 1b - %c2 - .\n\t" \ + ".popsection" \ + :"=r" (__ret) \ + :"i" ((unsigned long)0x0123456789abcdefull), \ + "i" (sizeof(long))); \ + __ret; }) + +// The 'typeof' will create at _least_ a 32-bit type, but +// will happily also take a bigger type and the 'shrl' will +// clear the upper bits +#define runtime_const_shift_right_32(val, sym) ({ \ + typeof(0u+(val)) __ret = (val); \ + asm_inline("shrl $12,%k0\n1:\n" \ + ".pushsection runtime_shift_" #sym ","a"\n\t" \ + ".long 1b - 1 - .\n\t" \ + ".popsection" \ + :"+r" (__ret)); \ + __ret; }) + +#define runtime_const_init(type, sym) do { \ + extern s32 __start_runtime_##type##_##sym[]; \ + extern s32 __stop_runtime_##type##_##sym[]; \ + runtime_const_fixup(__runtime_fixup_##type, \ + (unsigned long)(sym), \ + __start_runtime_##type##_##sym, \ + __stop_runtime_##type##_##sym); \ +} while (0) + +/* + * The text patching is trivial - you can only do this at init time, + * when the text section hasn't been marked RO, and before the text + * has ever been executed. + */ +static inline void __runtime_fixup_ptr(void *where, unsigned long val) +{ + *(unsigned long *)where = val; +} + +static inline void __runtime_fixup_shift(void *where, unsigned long val) +{ + *(unsigned char *)where = val; +} + +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long), + unsigned long val, s32 *start, s32 *end) +{ + while (start < end) { + fn(*start + (void *)start, val); + start++; + } +} + +#endif diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index fbd34ee394a7..ad43f427bab0 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -373,6 +373,9 @@ SECTIONS PERCPU_SECTION(INTERNODE_CACHE_BYTES) #endif
+ RUNTIME_CONST(shift, d_hash_shift) + RUNTIME_CONST(ptr, dentry_hashtable) + . = ALIGN(PAGE_SIZE);
/* freed after init ends here */
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc1 commit 94a2bc0f611cd9fa4d26e4679bf7ea4b01b12d56 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
This implements the runtime constant infrastructure for arm64, allowing the dcache d_hash() function to be generated with the hash table address as a runtime constant, followed by a shift of the hash index by a runtime-constant shift count.
[ Fixed up to deal with the big-endian case as per Mark Rutland ]
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/arm64/include/asm/runtime-const.h | 88 ++++++++++++++++++++++++++ arch/arm64/kernel/vmlinux.lds.S | 3 + 2 files changed, 91 insertions(+) create mode 100644 arch/arm64/include/asm/runtime-const.h
diff --git a/arch/arm64/include/asm/runtime-const.h b/arch/arm64/include/asm/runtime-const.h new file mode 100644 index 000000000000..be5915669d23 --- /dev/null +++ b/arch/arm64/include/asm/runtime-const.h @@ -0,0 +1,88 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_RUNTIME_CONST_H +#define _ASM_RUNTIME_CONST_H + +#include <asm/cacheflush.h> + +/* Sigh. You can still run arm64 in BE mode */ +#include <asm/byteorder.h> + +#define runtime_const_ptr(sym) ({ \ + typeof(sym) __ret; \ + asm_inline("1:\t" \ + "movz %0, #0xcdef\n\t" \ + "movk %0, #0x89ab, lsl #16\n\t" \ + "movk %0, #0x4567, lsl #32\n\t" \ + "movk %0, #0x0123, lsl #48\n\t" \ + ".pushsection runtime_ptr_" #sym ","a"\n\t" \ + ".long 1b - .\n\t" \ + ".popsection" \ + :"=r" (__ret)); \ + __ret; }) + +#define runtime_const_shift_right_32(val, sym) ({ \ + unsigned long __ret; \ + asm_inline("1:\t" \ + "lsr %w0,%w1,#12\n\t" \ + ".pushsection runtime_shift_" #sym ","a"\n\t" \ + ".long 1b - .\n\t" \ + ".popsection" \ + :"=r" (__ret) \ + :"r" (0u+(val))); \ + __ret; }) + +#define runtime_const_init(type, sym) do { \ + extern s32 __start_runtime_##type##_##sym[]; \ + extern s32 __stop_runtime_##type##_##sym[]; \ + runtime_const_fixup(__runtime_fixup_##type, \ + (unsigned long)(sym), \ + __start_runtime_##type##_##sym, \ + __stop_runtime_##type##_##sym); \ +} while (0) + +/* 16-bit immediate for wide move (movz and movk) in bits 5..20 */ +static inline void __runtime_fixup_16(__le32 *p, unsigned int val) +{ + u32 insn = le32_to_cpu(*p); + insn &= 0xffe0001f; + insn |= (val & 0xffff) << 5; + *p = cpu_to_le32(insn); +} + +static inline void __runtime_fixup_caches(void *where, unsigned int insns) +{ + unsigned long va = (unsigned long)where; + caches_clean_inval_pou(va, va + 4*insns); +} + +static inline void __runtime_fixup_ptr(void *where, unsigned long val) +{ + __le32 *p = lm_alias(where); + __runtime_fixup_16(p, val); + __runtime_fixup_16(p+1, val >> 16); + __runtime_fixup_16(p+2, val >> 32); + __runtime_fixup_16(p+3, val >> 48); + __runtime_fixup_caches(where, 4); +} + +/* Immediate value is 6 bits starting at bit #16 */ +static inline void __runtime_fixup_shift(void *where, unsigned long val) +{ + __le32 *p = lm_alias(where); + u32 insn = le32_to_cpu(*p); + insn &= 0xffc0ffff; + insn |= (val & 63) << 16; + *p = cpu_to_le32(insn); + __runtime_fixup_caches(where, 1); +} + +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long), + unsigned long val, s32 *start, s32 *end) +{ + while (start < end) { + fn(*start + (void *)start, val); + start++; + } +} + +#endif diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S index 3cd7e76cc562..e2f26c56b682 100644 --- a/arch/arm64/kernel/vmlinux.lds.S +++ b/arch/arm64/kernel/vmlinux.lds.S @@ -264,6 +264,9 @@ SECTIONS EXIT_DATA }
+ RUNTIME_CONST(shift, d_hash_shift) + RUNTIME_CONST(ptr, dentry_hashtable) + PERCPU_SECTION(L1_CACHE_BYTES) HYPERVISOR_PERCPU_SECTION
From: Stephen Brennan stephen.s.brennan@oracle.com
mainline inclusion from mainline-v6.11-rc6 commit 04c8abae1b7b2abeb638a3d5d5950fa2a031c244 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The runtime constant feature removes all the users of these variables, allowing the compiler to optimize them away. Once they are optimized away, it is quite difficult to extract their values from the kernel text; the memory saved by removing them is tiny, and saving it was never the point of this optimization.
Since the dentry_hashtable is a core data structure, it's valuable for debugging tools to be able to read it easily. For instance, scripts built on drgn, like the dentrycache script[1], rely on it to be able to perform diagnostics on the contents of the dcache. Annotate it as used, so the compiler doesn't discard it.
Link: https://github.com/oracle-samples/drgn-tools/blob/3afc56146f54d09dfd1f6d3c1b... [1] Fixes: e3c92e81711d ("runtime constants: add x86 architecture support") Signed-off-by: Stephen Brennan stephen.s.brennan@oracle.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Conflicts: fs/dcache.c [Context conflicts] Signed-off-by: Qi Xi xiqi2@huawei.com --- fs/dcache.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c index d58804e4939b..0862feab776f 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -96,11 +96,16 @@ EXPORT_SYMBOL(dotdot_name); * * This hash-function tries to avoid losing too many bits of hash * information, yet avoid using a prime hash-size or similar. + * + * Marking the variables "used" ensures that the compiler doesn't + * optimize them away completely on architectures with runtime + * constant infrastructure, this allows debuggers to see their + * values. But updating these values has no effect on those arches. */
-static unsigned int d_hash_shift __read_mostly; +static unsigned int d_hash_shift __ro_after_init __used;
-static struct hlist_bl_head *dentry_hashtable __read_mostly; +static struct hlist_bl_head *dentry_hashtable __ro_after_init __used;
static inline struct hlist_bl_head *d_hash(unsigned long hashlen) {
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc2 commit b6547e54864b998af21fdcaa0a88cf8e7efe641a category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The runtime constants linker script depended on documented linker behavior [1]:
"If an output section’s name is the same as the input section’s name and is representable as a C identifier, then the linker will automatically PROVIDE two symbols: __start_SECNAME and __stop_SECNAME, where SECNAME is the name of the section. These indicate the start address and end address of the output section respectively"
to just automatically define the symbol names for the bounds of the runtime constant arrays.
It turns out that this isn't actually something we can rely on, with old linkers not generating these automatic symbols. It looks to have been introduced in binutils-2.29 back in 2017, and we still support building with versions all the way back to binutils-2.25 (from 2015).
And yes, Oleg actually seems to be using such ancient versions of binutils.
So instead of depending on the implicit symbols from "section names match and are representable C identifiers", just do this all manually. It's not like it causes us any extra pain, we already have to do that for all the other sections that we use that often have special characters in them.
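As an illustration of how these section bounds are consumed (a sketch modelled on the runtime_const_fixup() helpers added earlier in this series, not new code in this patch):

  #include <linux/types.h>

  /*
   * Each entry in a runtime_* section is a 32-bit offset relative to its
   * own location.  The __start_/__stop_ symbols (now emitted explicitly by
   * the linker script instead of relying on the automatic PROVIDE
   * behaviour quoted above) bound that array of entries.
   */
  extern s32 __start_runtime_shift_d_hash_shift[];
  extern s32 __stop_runtime_shift_d_hash_shift[];

  static void fixup_all(void (*fn)(void *, unsigned long), unsigned long val)
  {
          s32 *p;

          for (p = __start_runtime_shift_d_hash_shift;
               p < __stop_runtime_shift_d_hash_shift; p++)
                  fn((void *)p + *p, val);	/* resolve the self-relative offset */
  }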
Reported-and-tested-by: Oleg Nesterov oleg@redhat.com Link: https://sourceware.org/binutils/docs/ld/Input-Section-Example.html [1] Link: https://lore.kernel.org/all/20240802114518.GA20924@redhat.com/ Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- include/asm-generic/vmlinux.lds.h | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 88ecb31df393..afcc180910c6 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -954,13 +954,12 @@ #define CON_INITCALL \ BOUNDED_SECTION_POST_LABEL(.con_initcall.init, __con_initcall, _start, _end)
-#define RUNTIME_NAME(t,x) runtime_##t##_##x +#define NAMED_SECTION(name) \ + . = ALIGN(8); \ + name : AT(ADDR(name) - LOAD_OFFSET) \ + { BOUNDED_SECTION_PRE_LABEL(name, name, __start_, __stop_) }
-#define RUNTIME_CONST(t,x) \ - . = ALIGN(8); \ - RUNTIME_NAME(t,x) : AT(ADDR(RUNTIME_NAME(t,x)) - LOAD_OFFSET) { \ - *(RUNTIME_NAME(t,x)); \ - } +#define RUNTIME_CONST(t,x) NAMED_SECTION(runtime_##t##_##x)
/* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */ #define KUNIT_TABLE() \
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.12-rc1 commit 2865baf54077aa98fcdb478cefe6a42c417b9374 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The Spectre-v1 mitigations made "access_ok()" much more expensive, since it has to serialize execution with the test for a valid user address.
All the normal user copy routines avoid this by just masking the user address with a data-dependent mask instead, but the fast "unsafe_user_read()" kind of patterns that were supposed to be a fast case got slowed down.
This introduces a notion of using
src = masked_user_access_begin(src);
to do the user address sanity check using a data-dependent mask instead of the more traditional conditional
if (user_read_access_begin(src, len)) {
model.
This model only works for dense accesses that start at 'src' and on architectures that have a guard region that is guaranteed to fault in between the user space and the kernel space area.
With this, the user access doesn't need to be manually checked, because a bad address is guaranteed to fault (by some architecture masking trick: on x86-64 this involves just turning an invalid user address into all ones, since we don't map the top of address space).
This only converts a couple of examples for now. Example x86-64 code generation for loading two words from user space:
	stac
	mov    %rax,%rcx
	sar    $0x3f,%rcx
	or     %rax,%rcx
	mov    (%rcx),%r13
	mov    0x8(%rcx),%r14
	clac
where all the error handling and -EFAULT is now purely handled out of line by the exception path.
Of course, if the micro-architecture does badly at 'clac' and 'stac', the above is still pitifully slow. But at least we did as well as we could.
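A usage sketch of the new model, modelled on the fs/select.c conversion in this patch ('struct argpack' and read_argpack() are illustrative names, not part of the patch):

  #include <linux/errno.h>
  #include <linux/uaccess.h>

  struct argpack {
          void __user *p;
          unsigned long size;
  };

  static int read_argpack(struct argpack *to, const struct argpack __user *from)
  {
          if (can_do_masked_user_access())
                  from = masked_user_access_begin(from);	/* mask, no fence */
          else if (!user_read_access_begin(from, sizeof(*from)))
                  return -EFAULT;

          unsafe_get_user(to->p, &from->p, Efault);
          unsafe_get_user(to->size, &from->size, Efault);
          user_read_access_end();
          return 0;
  Efault:
          user_read_access_end();
          return -EFAULT;
  }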
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/uaccess_64.h | 8 ++++++++ fs/select.c | 4 +++- include/linux/uaccess.h | 7 +++++++ lib/strncpy_from_user.c | 9 +++++++++ lib/strnlen_user.c | 9 +++++++++ 5 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index c8d758496f8c..dbf12d4903c5 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -62,6 +62,14 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, */ #define valid_user_address(x) ((__force long)(x) >= 0)
+/* + * Masking the user address is an alternative to a conditional + * user_access_begin that can avoid the fencing. This only works + * for dense accesses starting at the address. + */ +#define mask_user_address(x) ((typeof(x))((long)(x)|((long)(x)>>63))) +#define masked_user_access_begin(x) ({ __uaccess_begin(); mask_user_address(x); }) + /* * User pointers can have tag bits on x86-64. This scheme tolerates * arbitrary values in those bits rather then masking them off. diff --git a/fs/select.c b/fs/select.c index d4d881d439dc..0057561cec68 100644 --- a/fs/select.c +++ b/fs/select.c @@ -780,7 +780,9 @@ static inline int get_sigset_argpack(struct sigset_argpack *to, { // the path is hot enough for overhead of copy_from_user() to matter if (from) { - if (!user_read_access_begin(from, sizeof(*from))) + if (can_do_masked_user_access()) + from = masked_user_access_begin(from); + else if (!user_read_access_begin(from, sizeof(*from))) return -EFAULT; unsafe_get_user(to->p, &from->p, Efault); unsafe_get_user(to->size, &from->size, Efault); diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 550287c92990..63666f44f876 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -32,6 +32,13 @@ }) #endif
+#ifdef masked_user_access_begin + #define can_do_masked_user_access() 1 +#else + #define can_do_masked_user_access() 0 + #define masked_user_access_begin(src) NULL +#endif + /* * Architectures should provide two primitives (raw_copy_{to,from}_user()) * and get rid of their private instances of copy_{to,from}_user() and diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index 6432b8c3e431..989a12a67872 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -120,6 +120,15 @@ long strncpy_from_user(char *dst, const char __user *src, long count) if (unlikely(count <= 0)) return 0;
+ if (can_do_masked_user_access()) { + long retval; + + src = masked_user_access_begin(src); + retval = do_strncpy_from_user(dst, src, count, count); + user_read_access_end(); + return retval; + } + max_addr = TASK_SIZE_MAX; src_addr = (unsigned long)untagged_addr(src); if (likely(src_addr < max_addr)) { diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c index feeb935a2299..6e489f9e90f1 100644 --- a/lib/strnlen_user.c +++ b/lib/strnlen_user.c @@ -96,6 +96,15 @@ long strnlen_user(const char __user *str, long count) if (unlikely(count <= 0)) return 0;
+ if (can_do_masked_user_access()) { + long retval; + + str = masked_user_access_begin(str); + retval = do_strnlen_user(str, count, count); + user_read_access_end(); + return retval; + } + max_addr = TASK_SIZE_MAX; src_addr = (unsigned long)untagged_addr(str); if (likely(src_addr < max_addr)) {
From: Sabyrzhan Tasbolatov snovitoll@gmail.com
mainline inclusion from mainline-v6.13-rc1 commit ae193dd79398970ee760e0c8129ac42ef8f5c6ff category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Patch series "kasan: migrate the last module test to kunit", v4.
copy_user_test() is the last KUnit-incompatible test that requires CONFIG_KASAN_MODULE_TEST; we are going to migrate it to the KUnit framework and delete the old test and its Kconfig option.
In this patch series:
- [1/3] move kasan_check_write() and check_object_size() into do_strncpy_from_user() so that the KASAN checks cover all code paths in strncpy_from_user().
- [2/3] migrate copy_user_test() to KUnit, where we can also test strncpy_from_user() thanks to [1/3].
  KUnits have been tested on:
  - x86_64 with CONFIG_KASAN_GENERIC. Passed
  - arm64 with CONFIG_KASAN_SW_TAGS. 1 fail. See [1]
  - arm64 with CONFIG_KASAN_HW_TAGS. 1 fail. See [1]

  [1] https://lore.kernel.org/linux-mm/CACzwLxj21h7nCcS2-KA_q7ybe+5pxH0uCDwu64q_9p...
- [3/3] delete CONFIG_KASAN_MODULE_TEST and documentation occurrences.
This patch (of 3):
Since commit 2865baf54077 ("x86: support user address masking instead of non-speculative conditional"), do_strncpy_from_user() is called from multiple places, so the checks on the kernel *dst memory and its size, previously done in strncpy_from_user(), should be done in do_strncpy_from_user() itself.
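For clarity, a sketch of the two checks being hoisted (taken from the diff below, wrapped in an illustrative helper):

  #include <linux/kasan-checks.h>
  #include <linux/thread_info.h>	/* check_object_size() */

  /*
   * Before any bytes are written, verify that the kernel destination
   * buffer of 'count' bytes is writable as far as KASAN is concerned and
   * is not an improperly sized object (hardened usercopy check).
   */
  static void check_dst_buffer(char *dst, long count)
  {
          kasan_check_write(dst, count);
          check_object_size(dst, count, false);
  }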
Link: https://lkml.kernel.org/r/20241016131802.3115788-1-snovitoll@gmail.com Link: https://lkml.kernel.org/r/20241016131802.3115788-2-snovitoll@gmail.com Fixes: 2865baf54077 ("x86: support user address masking instead of non-speculative conditional") Signed-off-by: Sabyrzhan Tasbolatov snovitoll@gmail.com Reviewed-by: Andrey Konovalov andreyknvl@gmail.com Cc: Alexander Potapenko glider@google.com Cc: Alex Shi alexs@kernel.org Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Hu Haowen 2023002089@link.tyut.edu.cn Cc: Jonathan Corbet corbet@lwn.net Cc: Marco Elver elver@google.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Yanteng Si siyanteng@loongson.cn Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- lib/strncpy_from_user.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index 989a12a67872..6dc234913dd5 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -120,6 +120,9 @@ long strncpy_from_user(char *dst, const char __user *src, long count) if (unlikely(count <= 0)) return 0;
+ kasan_check_write(dst, count); + check_object_size(dst, count, false); + if (can_do_masked_user_access()) { long retval;
@@ -142,8 +145,6 @@ long strncpy_from_user(char *dst, const char __user *src, long count) if (max > count) max = count;
- kasan_check_write(dst, count); - check_object_size(dst, count, false); if (user_read_access_begin(src, max)) { retval = do_strncpy_from_user(dst, src, count, max); user_read_access_end();
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.12-rc5 commit 4dc1f31ec3f13a065c7ae2ccdec562b0123e21bb category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The x86 user pointer validation changes made me look at compiler output a lot, and the wrong indentation for the ".popsection" in the generated assembler triggered me.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/runtime-const.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h index 24e3a53ca255..6652ebddfd02 100644 --- a/arch/x86/include/asm/runtime-const.h +++ b/arch/x86/include/asm/runtime-const.h @@ -6,7 +6,7 @@ typeof(sym) __ret; \ asm_inline("mov %1,%0\n1:\n" \ ".pushsection runtime_ptr_" #sym ","a"\n\t" \ - ".long 1b - %c2 - .\n\t" \ + ".long 1b - %c2 - .\n" \ ".popsection" \ :"=r" (__ret) \ :"i" ((unsigned long)0x0123456789abcdefull), \ @@ -20,7 +20,7 @@ typeof(0u+(val)) __ret = (val); \ asm_inline("shrl $12,%k0\n1:\n" \ ".pushsection runtime_shift_" #sym ","a"\n\t" \ - ".long 1b - 1 - .\n\t" \ + ".long 1b - 1 - .\n" \ ".popsection" \ :"+r" (__ret)); \ __ret; })
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc1 commit 8a2462df154799129d8259079ec1fecf78703189 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Streamline the 8-byte case and drop the special handling. Use a macro which hides the exception handling.
No functional changes.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Link: https://lore.kernel.org/r/CAHk-=whYb2L_atsRk9pBiFiVLGe5wNZLHhRinA69yu6FiKvDs... Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/lib/getuser.S | 69 ++++++++++++------------------------------ 1 file changed, 20 insertions(+), 49 deletions(-)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S index 6913fbce6544..ef22e4579a55 100644 --- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -44,21 +44,23 @@ or %rdx, %rax .else cmp $TASK_SIZE_MAX-\size+1, %eax -.if \size != 8 jae .Lbad_get_user -.else - jae .Lbad_get_user_8 -.endif sbb %edx, %edx /* array_index_mask_nospec() */ and %edx, %eax .endif .endm
+.macro UACCESS op src dst +1: \op \src,\dst + _ASM_EXTABLE_UA(1b, __get_user_handle_exception) +.endm + + .text SYM_FUNC_START(__get_user_1) check_range size=1 ASM_STAC -1: movzbl (%_ASM_AX),%edx + UACCESS movzbl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC RET @@ -68,7 +70,7 @@ EXPORT_SYMBOL(__get_user_1) SYM_FUNC_START(__get_user_2) check_range size=2 ASM_STAC -2: movzwl (%_ASM_AX),%edx + UACCESS movzwl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC RET @@ -78,7 +80,7 @@ EXPORT_SYMBOL(__get_user_2) SYM_FUNC_START(__get_user_4) check_range size=4 ASM_STAC -3: movl (%_ASM_AX),%edx + UACCESS movl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC RET @@ -89,10 +91,11 @@ SYM_FUNC_START(__get_user_8) check_range size=8 ASM_STAC #ifdef CONFIG_X86_64 -4: movq (%_ASM_AX),%rdx + UACCESS movq (%_ASM_AX),%rdx #else -4: movl (%_ASM_AX),%edx -5: movl 4(%_ASM_AX),%ecx + xor %ecx,%ecx + UACCESS movl (%_ASM_AX),%edx + UACCESS movl 4(%_ASM_AX),%ecx #endif xor %eax,%eax ASM_CLAC @@ -104,7 +107,7 @@ EXPORT_SYMBOL(__get_user_8) SYM_FUNC_START(__get_user_nocheck_1) ASM_STAC ASM_BARRIER_NOSPEC -6: movzbl (%_ASM_AX),%edx + UACCESS movzbl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC RET @@ -114,7 +117,7 @@ EXPORT_SYMBOL(__get_user_nocheck_1) SYM_FUNC_START(__get_user_nocheck_2) ASM_STAC ASM_BARRIER_NOSPEC -7: movzwl (%_ASM_AX),%edx + UACCESS movzwl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC RET @@ -124,7 +127,7 @@ EXPORT_SYMBOL(__get_user_nocheck_2) SYM_FUNC_START(__get_user_nocheck_4) ASM_STAC ASM_BARRIER_NOSPEC -8: movl (%_ASM_AX),%edx + UACCESS movl (%_ASM_AX),%edx xor %eax,%eax ASM_CLAC RET @@ -135,10 +138,11 @@ SYM_FUNC_START(__get_user_nocheck_8) ASM_STAC ASM_BARRIER_NOSPEC #ifdef CONFIG_X86_64 -9: movq (%_ASM_AX),%rdx + UACCESS movq (%_ASM_AX),%rdx #else -9: movl (%_ASM_AX),%edx -10: movl 4(%_ASM_AX),%ecx + xor %ecx,%ecx + UACCESS movl (%_ASM_AX),%edx + UACCESS movl 4(%_ASM_AX),%ecx #endif xor %eax,%eax ASM_CLAC @@ -154,36 +158,3 @@ SYM_CODE_START_LOCAL(__get_user_handle_exception) mov $(-EFAULT),%_ASM_AX RET SYM_CODE_END(__get_user_handle_exception) - -#ifdef CONFIG_X86_32 -SYM_CODE_START_LOCAL(__get_user_8_handle_exception) - ASM_CLAC -.Lbad_get_user_8: - xor %edx,%edx - xor %ecx,%ecx - mov $(-EFAULT),%_ASM_AX - RET -SYM_CODE_END(__get_user_8_handle_exception) -#endif - -/* get_user */ - _ASM_EXTABLE_UA(1b, __get_user_handle_exception) - _ASM_EXTABLE_UA(2b, __get_user_handle_exception) - _ASM_EXTABLE_UA(3b, __get_user_handle_exception) -#ifdef CONFIG_X86_64 - _ASM_EXTABLE_UA(4b, __get_user_handle_exception) -#else - _ASM_EXTABLE_UA(4b, __get_user_8_handle_exception) - _ASM_EXTABLE_UA(5b, __get_user_8_handle_exception) -#endif - -/* __get_user */ - _ASM_EXTABLE_UA(6b, __get_user_handle_exception) - _ASM_EXTABLE_UA(7b, __get_user_handle_exception) - _ASM_EXTABLE_UA(8b, __get_user_handle_exception) -#ifdef CONFIG_X86_64 - _ASM_EXTABLE_UA(9b, __get_user_handle_exception) -#else - _ASM_EXTABLE_UA(9b, __get_user_8_handle_exception) - _ASM_EXTABLE_UA(10b, __get_user_8_handle_exception) -#endif
From: Heiko Carstens hca@linux.ibm.com
mainline inclusion from mainline-v6.11-rc1 commit 57e29f4c8919af7276c35c2da0ae5efb4c36c33e category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Implement the runtime constant infrastructure for s390, allowing the dcache d_hash() function to be generated with the hash table address as a runtime constant, followed by a shift of the hash index by a runtime-constant shift count.
This is the s390 variant of commit 94a2bc0f611c ("arm64: add 'runtime constant' support") and commit e3c92e81711d ("runtime constants: add x86 architecture support").
Signed-off-by: Heiko Carstens hca@linux.ibm.com Acked-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com
Conflicts: arch/s390/kernel/vmlinux.lds.S [Context conflicts] Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/s390/include/asm/runtime-const.h | 77 +++++++++++++++++++++++++++ arch/s390/kernel/vmlinux.lds.S | 3 ++ 2 files changed, 80 insertions(+) create mode 100644 arch/s390/include/asm/runtime-const.h
diff --git a/arch/s390/include/asm/runtime-const.h b/arch/s390/include/asm/runtime-const.h new file mode 100644 index 000000000000..17878b1d048c --- /dev/null +++ b/arch/s390/include/asm/runtime-const.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_S390_RUNTIME_CONST_H +#define _ASM_S390_RUNTIME_CONST_H + +#include <linux/uaccess.h> + +#define runtime_const_ptr(sym) \ +({ \ + typeof(sym) __ret; \ + \ + asm_inline( \ + "0: iihf %[__ret],%[c1]\n" \ + " iilf %[__ret],%[c2]\n" \ + ".pushsection runtime_ptr_" #sym ","a"\n" \ + ".long 0b - .\n" \ + ".popsection" \ + : [__ret] "=d" (__ret) \ + : [c1] "i" (0x01234567UL), \ + [c2] "i" (0x89abcdefUL)); \ + __ret; \ +}) + +#define runtime_const_shift_right_32(val, sym) \ +({ \ + unsigned int __ret = (val); \ + \ + asm_inline( \ + "0: srl %[__ret],12\n" \ + ".pushsection runtime_shift_" #sym ","a"\n" \ + ".long 0b - .\n" \ + ".popsection" \ + : [__ret] "+d" (__ret)); \ + __ret; \ +}) + +#define runtime_const_init(type, sym) do { \ + extern s32 __start_runtime_##type##_##sym[]; \ + extern s32 __stop_runtime_##type##_##sym[]; \ + \ + runtime_const_fixup(__runtime_fixup_##type, \ + (unsigned long)(sym), \ + __start_runtime_##type##_##sym, \ + __stop_runtime_##type##_##sym); \ +} while (0) + +/* 32-bit immediate for iihf and iilf in bits in I2 field */ +static inline void __runtime_fixup_32(u32 *p, unsigned int val) +{ + s390_kernel_write(p, &val, sizeof(val)); +} + +static inline void __runtime_fixup_ptr(void *where, unsigned long val) +{ + __runtime_fixup_32(where + 2, val >> 32); + __runtime_fixup_32(where + 8, val); +} + +/* Immediate value is lower 12 bits of D2 field of srl */ +static inline void __runtime_fixup_shift(void *where, unsigned long val) +{ + u32 insn = *(u32 *)where; + + insn &= 0xfffff000; + insn |= (val & 63); + s390_kernel_write(where, &insn, sizeof(insn)); +} + +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long), + unsigned long val, s32 *start, s32 *end) +{ + while (start < end) { + fn(*start + (void *)start, val); + start++; + } +} + +#endif /* _ASM_S390_RUNTIME_CONST_H */ diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index de5f9f623f5b..bf99320c8456 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -187,6 +187,9 @@ SECTIONS . = ALIGN(PAGE_SIZE); INIT_DATA_SECTION(0x100)
+ RUNTIME_CONST(shift, d_hash_shift) + RUNTIME_CONST(ptr, dentry_hashtable) + PERCPU_SECTION(0x100)
.dynsym ALIGN(8) : {
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.11-rc1 commit 17e6a1213058d8759bcf043151c04612df55ebc4 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
UML should not be using the architecture native runtime constants, since that requires also having the appropriate instruction fixups (and all the linker script details).
Not that using that code would be impossible, but it's not worth it. Just point UML at the generic version.
Reported-by: Nathan Chancellor nathan@kernel.org Fixes: e3c92e81711d ("runtime constants: add x86 architecture support") Link: https://lore.kernel.org/all/20240716143644.GA1827132@thelio-3990X/ Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/um/include/asm/Kbuild | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild index b2d834a29f3a..44c741682686 100644 --- a/arch/um/include/asm/Kbuild +++ b/arch/um/include/asm/Kbuild @@ -21,6 +21,7 @@ generic-y += param.h generic-y += parport.h generic-y += percpu.h generic-y += preempt.h +generic-y += runtime-const.h generic-y += softirq_stack.h generic-y += switch_to.h generic-y += topology.h
From: David Gow davidgow@google.com
mainline inclusion from mainline-v6.11-rc2 commit dd35a0933269c636635b6af89dc6fa1782791e56 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
While zeroing the upper 32 bits of an 8-byte getuser on 32-bit x86 was fixed by commit 8c860ed825cb ("x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking") it was broken again in commit 8a2462df1547 ("x86/uaccess: Improve the 8-byte getuser() case").
This is because the register which holds the upper 32 bits (%ecx) is being cleared _after_ the check_range, so if the range check fails, %ecx is never cleared.
This can be reproduced with:

  ./tools/testing/kunit/kunit.py run --arch i386 usercopy
Instead, clear %ecx _before_ check_range in the 8-byte case. This reintroduces a bit of the ugliness we were trying to avoid by adding another #ifndef CONFIG_X86_64, but at least keeps check_range from needing a separate bad_get_user_8 jump.
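A sketch of the user-visible contract being restored ('bad_ptr' is an illustrative invalid user pointer, and the helper is not part of the patch):

  #include <linux/bug.h>
  #include <linux/types.h>
  #include <linux/uaccess.h>

  /*
   * On a failed get_user(), the destination must be fully zeroed,
   * including the upper 32 bits of a u64 on 32-bit kernels.
   */
  static void demo_get_user_8_failure(const u64 __user *bad_ptr)
  {
          u64 val;

          if (get_user(val, bad_ptr))	/* -EFAULT for an invalid address */
                  WARN_ON(val != 0);	/* both halves must still be zero */
  }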
Fixes: 8a2462df1547 ("x86/uaccess: Improve the 8-byte getuser() case") Signed-off-by: David Gow davidgow@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Linus Torvalds torvalds@linux-foundation.org Link: https://lore.kernel.org/all/20240731073031.4045579-1-davidgow@google.com Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/lib/getuser.S | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S index ef22e4579a55..35e88101edb9 100644 --- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -88,12 +88,14 @@ SYM_FUNC_END(__get_user_4) EXPORT_SYMBOL(__get_user_4)
SYM_FUNC_START(__get_user_8) +#ifndef CONFIG_X86_64 + xor %ecx,%ecx +#endif check_range size=8 ASM_STAC #ifdef CONFIG_X86_64 UACCESS movq (%_ASM_AX),%rdx #else - xor %ecx,%ecx UACCESS movl (%_ASM_AX),%edx UACCESS movl 4(%_ASM_AX),%ecx #endif
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.12-rc1 commit 05f4216272c4b588c87551d3ba9bfb88b1bffaba category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
In any normal situation this really shouldn't matter, but in case the address passed in to masked_user_access_begin() were to be some complex expression, we should evaluate it fully before doing the 'stac' instruction.
And even without that issue (which objdump would pick up on for any really bad case), just in general we should strive to minimize the amount of code we run with user accesses enabled.
For example, even for the trivial pselect6() case, the code-generation diff with this change (obviously with a non-debug build) ends up being
-	stac
 	mov    %rax,%rcx
 	sar    $0x3f,%rcx
 	or     %rax,%rcx
+	stac
 	mov    (%rcx),%r13
 	mov    0x8(%rcx),%r14
 	clac
so the area delimited by the 'stac / clac' pair is now literally just the two user access instructions, and the address generation has been moved out to before that code.
This will be much more noticeable if we end up deciding that we can go back to just inlining "get_user()" using the new masked user access model. The get_user() pointers can often be more complex expressions involving kernel memory accesses or even function calls.
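An illustrative sketch of the kind of "complex expression" this is about; get_slot_ptr() is a hypothetical helper standing in for pointer computations that may touch kernel memory or call other functions:

  #include <linux/errno.h>
  #include <linux/types.h>
  #include <linux/uaccess.h>

  static u32 __user *get_slot_ptr(u32 __user *base, unsigned int index)
  {
          return base + index;	/* stands in for arbitrary kernel-side work */
  }

  static int read_flag(u32 __user *base, unsigned int index, u32 *out)
  {
          const u32 __user *slot;

          if (!can_do_masked_user_access())
                  return -EOPNOTSUPP;	/* real code would fall back to access_ok() */

          /* The whole argument expression is evaluated here, before STAC... */
          slot = masked_user_access_begin(get_slot_ptr(base, index));
          /* ...so only the user load itself runs with user access enabled. */
          unsafe_get_user(*out, slot, Efault);
          user_read_access_end();
          return 0;
  Efault:
          user_read_access_end();
          return -EFAULT;
  }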
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/uaccess_64.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index dbf12d4903c5..ee7d78f085a0 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -68,7 +68,9 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, * for dense accesses starting at the address. */ #define mask_user_address(x) ((typeof(x))((long)(x)|((long)(x)>>63))) -#define masked_user_access_begin(x) ({ __uaccess_begin(); mask_user_address(x); }) +#define masked_user_access_begin(x) ({ \ + __auto_type __masked_ptr = mask_user_address(x); \ + __uaccess_begin(); __masked_ptr; })
/* * User pointers can have tag bits on x86-64. This scheme tolerates
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.12-rc1 commit 533ab223aa1a036cfe5d6747fa3be92069f80988 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
This doesn't actually matter for any of the current users, but before merging it mainline, make sure we don't have any surprising semantics.
We don't actually want to use an inline function here, because we want to allow - but not require - const pointer arguments, and return them as such. But we already had a local auto-type variable, so let's just use it to avoid any possible double evaluation.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/uaccess_64.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index ee7d78f085a0..461727d2b014 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -68,8 +68,9 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, * for dense accesses starting at the address. */ #define mask_user_address(x) ((typeof(x))((long)(x)|((long)(x)>>63))) -#define masked_user_access_begin(x) ({ \ - __auto_type __masked_ptr = mask_user_address(x); \ +#define masked_user_access_begin(x) ({ \ + __auto_type __masked_ptr = (x); \ + __masked_ptr = mask_user_address(__masked_ptr); \ __uaccess_begin(); __masked_ptr; })
/*
From: Linus Torvalds torvalds@linux-foundation.org
mainline inclusion from mainline-v6.12-rc5 commit 86e6b1547b3d013bc392adf775b89318441403c2 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
It turns out that AMD has a "Meltdown Lite(tm)" issue with non-canonical accesses in kernel space. And so using just the high bit to decide whether an access is in user space or kernel space ends up with the good old "leak speculative data" if you have the right gadget using the result:
CVE-2020-12965 "Transient Execution of Non-Canonical Accesses"
Now, the kernel surrounds the access with a STAC/CLAC pair, and those instructions end up serializing execution on older Zen architectures, which closes the speculation window.
But that was true only up until Zen 5, which renames the AC bit [1]. That improves performance of STAC/CLAC a lot, but also means that the speculation window is now open.
Note that this affects not just the new address masking, but also the regular valid_user_address() check used by access_ok(), and the asm version of the sign bit check in the get_user() helpers.
It does not affect put_user() or clear_user() variants, since there's no speculative result to be used in a gadget for those operations.
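A C sketch of the arithmetic behind the cmp/sbb sequence used by the fix (the kernel does this in inline asm against the USER_PTR_MAX runtime constant; this is only an illustration, not the implementation):

  static unsigned long mask_noncanonical(unsigned long addr, unsigned long user_ptr_max)
  {
          /* what cmp+sbb computes: all ones if addr is above the limit, else zero */
          unsigned long mask = (addr > user_ptr_max) ? ~0UL : 0UL;

          /*
           * An out-of-range pointer becomes all ones, which is guaranteed
           * to fault and never yields a usable speculative value.
           */
          return addr | mask;
  }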
Reported-by: Andrew Cooper andrew.cooper3@citrix.com Link: https://lore.kernel.org/all/80d94591-1297-4afb-b510-c665efd37f10@citrix.com/ Link: https://lore.kernel.org/all/20241023094448.GAZxjFkEOOF_DM83TQ@fat_crate.loca... [1] Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-1010.html Link: https://arxiv.org/pdf/2108.10771 Cc: Josh Poimboeuf jpoimboe@kernel.org Cc: Borislav Petkov bp@alien8.de Tested-by: Maciej Wieczor-Retman maciej.wieczor-retman@intel.com # LAM case Fixes: 2865baf54077 ("x86: support user address masking instead of non-speculative conditional") Fixes: 6014bc27561f ("x86-64: make access_ok() independent of LAM") Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()") Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Conflicts: arch/x86/kernel/vmlinux.lds.S arch/x86/kernel/cpu/common.c arch/x86/include/asm/uaccess_64.h [Context conflicts] Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/include/asm/uaccess_64.h | 43 +++++++++++++++++-------------- arch/x86/kernel/cpu/common.c | 10 +++++++ arch/x86/kernel/vmlinux.lds.S | 1 + arch/x86/lib/getuser.S | 9 +++++-- 4 files changed, 42 insertions(+), 21 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h index 461727d2b014..2bf3d4a8a71e 100644 --- a/arch/x86/include/asm/uaccess_64.h +++ b/arch/x86/include/asm/uaccess_64.h @@ -15,9 +15,16 @@ defined(CONFIG_X86_HYGON_LMC_AVX2_ON) #include <asm/fpu/api.h> #endif +#include <asm/runtime-const.h>
extern struct static_key_false hygon_lmc_key;
+/* + * Virtual variable: there's no actual backing store for this, + * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)' + */ +extern unsigned long USER_PTR_MAX; + #ifdef CONFIG_ADDRESS_MASKING /* * Mask out tag bits from the address. @@ -55,19 +62,24 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
#endif
-/* - * The virtual address space space is logically divided into a kernel - * half and a user half. When cast to a signed type, user pointers - * are positive and kernel pointers are negative. - */ -#define valid_user_address(x) ((__force long)(x) >= 0) +#define valid_user_address(x) \ + ((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
/* * Masking the user address is an alternative to a conditional * user_access_begin that can avoid the fencing. This only works * for dense accesses starting at the address. */ -#define mask_user_address(x) ((typeof(x))((long)(x)|((long)(x)>>63))) +static inline void __user *mask_user_address(const void __user *ptr) +{ + unsigned long mask; + asm("cmp %1,%0\n\t" + "sbb %0,%0" + :"=r" (mask) + :"r" (ptr), + "0" (runtime_const_ptr(USER_PTR_MAX))); + return (__force void __user *)(mask | (__force unsigned long)ptr); +} #define masked_user_access_begin(x) ({ \ __auto_type __masked_ptr = (x); \ __masked_ptr = mask_user_address(__masked_ptr); \ @@ -78,23 +90,16 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm, * arbitrary values in those bits rather then masking them off. * * Enforce two rules: - * 1. 'ptr' must be in the user half of the address space + * 1. 'ptr' must be in the user part of the address space * 2. 'ptr+size' must not overflow into kernel addresses * - * Note that addresses around the sign change are not valid addresses, - * and will GP-fault even with LAM enabled if the sign bit is set (see - * "CR3.LAM_SUP" that can narrow the canonicality check if we ever - * enable it, but not remove it entirely). - * - * So the "overflow into kernel addresses" does not imply some sudden - * exact boundary at the sign bit, and we can allow a lot of slop on the - * size check. + * Note that we always have at least one guard page between the + * max user address and the non-canonical gap, allowing us to + * ignore small sizes entirely. * * In fact, we could probably remove the size check entirely, since * any kernel accesses will be in increasing address order starting - * at 'ptr', and even if the end might be in kernel space, we'll - * hit the GP faults for non-canonical accesses before we ever get - * there. + * at 'ptr'. * * That's a separate optimization, for now just handle the small * constant case. diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 26d7a26ef2d2..cbdcdd2956e0 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -65,6 +65,7 @@ #include <asm/set_memory.h> #include <asm/traps.h> #include <asm/sev.h> +#include <asm/runtime-const.h>
#include "cpu.h"
@@ -2446,6 +2447,15 @@ void __init arch_cpu_finalize_init(void) alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) { + unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1; + + /* + * Enable this when LAM is gated on LASS support + if (cpu_feature_enabled(X86_FEATURE_LAM)) + USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1; + */ + runtime_const_init(ptr, USER_PTR_MAX); + /* * Make sure the first 2MB area is not mapped by huge pages * There are typically fixed size MTRRs in there and overlapping diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index ad43f427bab0..0ff946f21c27 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -375,6 +375,7 @@ SECTIONS
RUNTIME_CONST(shift, d_hash_shift) RUNTIME_CONST(ptr, dentry_hashtable) + RUNTIME_CONST(ptr, USER_PTR_MAX)
. = ALIGN(PAGE_SIZE);
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S index 35e88101edb9..8f41716c40bc 100644 --- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -39,8 +39,13 @@
.macro check_range size:req .if IS_ENABLED(CONFIG_X86_64) - mov %rax, %rdx - sar $63, %rdx + movq $0x0123456789abcdef,%rdx + 1: + .pushsection runtime_ptr_USER_PTR_MAX,"a" + .long 1b - 8 - . + .popsection + cmp %rax, %rdx + sbb %rdx, %rdx or %rdx, %rax .else cmp $TASK_SIZE_MAX-\size+1, %eax
From: David Laight David.Laight@ACULAB.COM
mainline inclusion from mainline-v6.13-rc1 commit 573f45a9f9a47fed4c7957609689b772121b33d7 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
When the size isn't a small constant, __access_ok() will call valid_user_address() with the address after the last byte of the user buffer.
It is valid for a buffer to end with the last valid user address so valid_user_address() must allow accesses to the base of the guard page.
[ This introduces an off-by-one in the other direction for the plain non-sized accesses, but since we have that guard region that is a whole page, those checks "allowing" accesses to that guard region don't really matter. The access will fault anyway, whether to the guard page or if the address has been masked to all ones - Linus ]
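A worked sketch of the boundary case (illustrative only, following the non-constant-size branch of __access_ok() shown earlier in this series):

  #include <linux/types.h>

  static bool access_ok_sketch(unsigned long ptr, unsigned long size,
                               unsigned long user_ptr_max)
  {
          unsigned long sum = ptr + size;	/* one past the buffer's last byte */

          /*
           * A buffer ending at the last valid user address gives
           * sum == TASK_SIZE_MAX, so user_ptr_max must be TASK_SIZE_MAX
           * (the base of the guard page), not TASK_SIZE_MAX - 1.
           */
          return sum <= user_ptr_max && sum >= ptr;
  }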
Fixes: 86e6b1547b3d0 ("x86: fix user address masking non-canonical speculation issue") Signed-off-by: David Laight david.laight@aculab.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Qi Xi xiqi2@huawei.com --- arch/x86/kernel/cpu/common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index cbdcdd2956e0..570ecdca404b 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -2447,12 +2447,12 @@ void __init arch_cpu_finalize_init(void) alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) { - unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1; + unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
/* * Enable this when LAM is gated on LASS support if (cpu_feature_enabled(X86_FEATURE_LAM)) - USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1; + USER_PTR_MAX = (1ul << 63) - PAGE_SIZE; */ runtime_const_init(ptr, USER_PTR_MAX);
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/14391 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/V...