David Gow (1):
      x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit

David Laight (1):
      x86: fix off-by-one in access_ok()

Linus Torvalds (5):
      runtime constants: deal with old decrepit linkers
      runtime constants: add x86 architecture support
      x86: fix whitespace in runtime-const assembler output
      x86/uaccess: Improve the 8-byte getuser() case
      x86: fix user address masking non-canonical speculation issue

Stephen Brennan (1):
      dcache: keep dentry_hashtable or d_hash_shift even when not used

Thomas Gleixner (1):
      x86/uaccess: Add missing __force to casts in __access_ok() and valid_user_address()
 arch/x86/include/asm/runtime-const.h | 61 +++++++++++++++++++++
 arch/x86/include/asm/uaccess_64.h    | 49 ++++++++++-------
 arch/x86/kernel/cpu/common.c         | 10 ++++
 arch/x86/kernel/vmlinux.lds.S        |  4 ++
 arch/x86/lib/getuser.S               | 80 ++++++++++------------------
 fs/dcache.c                          |  9 +++-
 include/asm-generic/vmlinux.lds.h    |  7 +++
 7 files changed, 149 insertions(+), 71 deletions(-)
 create mode 100644 arch/x86/include/asm/runtime-const.h
From: Thomas Gleixner <tglx@linutronix.de>

mainline inclusion
from mainline-v6.9-rc1
commit ae6b0195f5c503f22673a4acf21111822305c1e0
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Sparse complains about losing the __user address space due to the cast to long:
uaccess_64.h:88:24: sparse: warning: cast removes address space '__user' of expression
Annotate it with __force to tell sparse that this is intentional.
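For reference, the sparse machinery behind this warning works roughly as in the following sketch (simplified from the kernel's compiler annotations; not part of this patch):

	#ifdef __CHECKER__
	# define __user	__attribute__((noderef, address_space(__user)))
	# define __force	__attribute__((force))
	#else
	# define __user
	# define __force
	#endif

	/* Without __force, sparse flags the cast because it silently
	 * drops the '__user' address space; with __force, the drop is
	 * documented as intentional and the warning goes away. */
	static inline long user_ptr_as_long(const void __user *ptr)
	{
		return (__force long)ptr;
	}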
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.677606054@linutronix.de
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/include/asm/uaccess_64.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index e42507ec4b1d..c8d758496f8c 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -60,7 +60,7 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
  * half and a user half.  When cast to a signed type, user pointers
  * are positive and kernel pointers are negative.
  */
-#define valid_user_address(x) ((long)(x) >= 0)
+#define valid_user_address(x) ((__force long)(x) >= 0)
 
 /*
  * User pointers can have tag bits on x86-64.  This scheme tolerates
@@ -93,8 +93,9 @@ static inline bool __access_ok(const void __user *ptr, unsigned long size)
 	if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
 		return valid_user_address(ptr);
 	} else {
-		unsigned long sum = size + (unsigned long)ptr;
-		return valid_user_address(sum) && sum >= (unsigned long)ptr;
+		unsigned long sum = size + (__force unsigned long)ptr;
+
+		return valid_user_address(sum) && sum >= (__force unsigned long)ptr;
 	}
 }
 #define __access_ok __access_ok
From: Linus Torvalds <torvalds@linux-foundation.org>

mainline inclusion
from mainline-v6.11-rc2
commit b6547e54864b998af21fdcaa0a88cf8e7efe641a
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The runtime constants linker script depended on documented linker behavior [1]:
"If an output section’s name is the same as the input section’s name and is representable as a C identifier, then the linker will automatically PROVIDE two symbols: __start_SECNAME and __stop_SECNAME, where SECNAME is the name of the section. These indicate the start address and end address of the output section respectively"
to just automatically define the symbol names for the bounds of the runtime constant arrays.
It turns out that this isn't actually something we can rely on, with old linkers not generating these automatic symbols. It looks to have been introduced in binutils-2.29 back in 2017, and we still support building with versions all the way back to binutils-2.25 (from 2015).
And yes, Oleg actually seems to be using such ancient versions of binutils.
So instead of depending on the implicit symbols from "section names match and are representable C identifiers", just do this all manually. It's not like it causes us any extra pain, we already have to do that for all the other sections that we use that often have special characters in them.
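With the bounds spelled out in the linker script, C code can rely on them regardless of binutils version. A sketch of the consumer side (symbol names follow the RUNTIME_CONST convention from the diff below):

	/* Provided explicitly by the linker script, so available even on
	 * binutils < 2.29, where the automatic __start_/__stop_ symbols
	 * for C-identifier section names do not exist: */
	extern s32 __start_runtime_ptr_dentry_hashtable[];
	extern s32 __stop_runtime_ptr_dentry_hashtable[];

	static inline long num_fixup_sites(void)
	{
		return __stop_runtime_ptr_dentry_hashtable -
		       __start_runtime_ptr_dentry_hashtable;
	}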
Reported-and-tested-by: Oleg Nesterov <oleg@redhat.com>
Link: https://sourceware.org/binutils/docs/ld/Input-Section-Example.html [1]
Link: https://lore.kernel.org/all/20240802114518.GA20924@redhat.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	include/asm-generic/vmlinux.lds.h
[Context conflicts]
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 include/asm-generic/vmlinux.lds.h | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 174d865ce46e..afcc180910c6 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -954,6 +954,13 @@
 #define CON_INITCALL						\
 	BOUNDED_SECTION_POST_LABEL(.con_initcall.init, __con_initcall, _start, _end)
 
+#define NAMED_SECTION(name) \
+	. = ALIGN(8); \
+	name : AT(ADDR(name) - LOAD_OFFSET) \
+	{ BOUNDED_SECTION_PRE_LABEL(name, name, __start_, __stop_) }
+
+#define RUNTIME_CONST(t,x) NAMED_SECTION(runtime_##t##_##x)
+
 /* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h */
 #define KUNIT_TABLE()						\
 		. = ALIGN(8);					\
From: Linus Torvalds <torvalds@linux-foundation.org>

mainline inclusion
from mainline-v6.11-rc1
commit e3c92e81711d14b46c3121d36bc8e152cb843923
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
This implements the runtime constant infrastructure for x86, allowing the dcache d_hash() function to be generated using the hash table address as a constant, followed by a shift of the hash index by a constant.
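For context, the consumer side in mainline fs/dcache.c ends up looking roughly like this (abbreviated sketch): both the pointer load and the shift amount are immediates in the generated code, patched once at boot:

	static inline struct hlist_bl_head *d_hash(unsigned int hash)
	{
		return runtime_const_ptr(dentry_hashtable) +
			runtime_const_shift_right_32(hash, d_hash_shift);
	}

	static void __init dcache_init(void)
	{
		/* ... allocate dentry_hashtable, compute d_hash_shift ... */
		runtime_const_init(shift, d_hash_shift);
		runtime_const_init(ptr, dentry_hashtable);
	}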
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/include/asm/runtime-const.h | 61 ++++++++++++++++++++++++++++
 arch/x86/kernel/vmlinux.lds.S        |  3 ++
 2 files changed, 64 insertions(+)
 create mode 100644 arch/x86/include/asm/runtime-const.h
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
new file mode 100644
index 000000000000..24e3a53ca255
--- /dev/null
+++ b/arch/x86/include/asm/runtime-const.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RUNTIME_CONST_H
+#define _ASM_RUNTIME_CONST_H
+
+#define runtime_const_ptr(sym) ({				\
+	typeof(sym) __ret;					\
+	asm_inline("mov %1,%0\n1:\n"				\
+		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
+		".long 1b - %c2 - .\n\t"			\
+		".popsection"					\
+		:"=r" (__ret)					\
+		:"i" ((unsigned long)0x0123456789abcdefull),	\
+		 "i" (sizeof(long)));				\
+	__ret; })
+
+// The 'typeof' will create at _least_ a 32-bit type, but
+// will happily also take a bigger type and the 'shrl' will
+// clear the upper bits
+#define runtime_const_shift_right_32(val, sym) ({		\
+	typeof(0u+(val)) __ret = (val);				\
+	asm_inline("shrl $12,%k0\n1:\n"				\
+		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
+		".long 1b - 1 - .\n\t"				\
+		".popsection"					\
+		:"+r" (__ret));					\
+	__ret; })
+
+#define runtime_const_init(type, sym) do {		\
+	extern s32 __start_runtime_##type##_##sym[];	\
+	extern s32 __stop_runtime_##type##_##sym[];	\
+	runtime_const_fixup(__runtime_fixup_##type,	\
+		(unsigned long)(sym),			\
+		__start_runtime_##type##_##sym,		\
+		__stop_runtime_##type##_##sym);		\
+} while (0)
+
+/*
+ * The text patching is trivial - you can only do this at init time,
+ * when the text section hasn't been marked RO, and before the text
+ * has ever been executed.
+ */
+static inline void __runtime_fixup_ptr(void *where, unsigned long val)
+{
+	*(unsigned long *)where = val;
+}
+
+static inline void __runtime_fixup_shift(void *where, unsigned long val)
+{
+	*(unsigned char *)where = val;
+}
+
+static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
+	unsigned long val, s32 *start, s32 *end)
+{
+	while (start < end) {
+		fn(*start + (void *)start, val);
+		start++;
+	}
+}
+
+#endif
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index fbd34ee394a7..ad43f427bab0 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -373,6 +373,9 @@ SECTIONS
 	PERCPU_SECTION(INTERNODE_CACHE_BYTES)
 #endif
 
+	RUNTIME_CONST(shift, d_hash_shift)
+	RUNTIME_CONST(ptr, dentry_hashtable)
+
 	. = ALIGN(PAGE_SIZE);
 
 	/* freed after init ends here */
From: Stephen Brennan <stephen.s.brennan@oracle.com>

mainline inclusion
from mainline-v6.11-rc6
commit 04c8abae1b7b2abeb638a3d5d5950fa2a031c244
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The runtime constant feature removes all the users of these variables, allowing the compiler to optimize them away. That makes it quite difficult to extract their values from the kernel text; the memory saved by removing them is tiny, and saving it was never the point of the optimization anyway.
Since the dentry_hashtable is a core data structure, it's valuable for debugging tools to be able to read it easily. For instance, scripts built on drgn, like the dentrycache script[1], rely on it to be able to perform diagnostics on the contents of the dcache. Annotate it as used, so the compiler doesn't discard it.
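The effect of the annotation in isolation (minimal sketch with a hypothetical variable name):

	/* With no remaining C references, the compiler may discard a
	 * static variable entirely; __used forces it to be emitted so
	 * debuggers such as drgn can still locate and read it in a
	 * running kernel. */
	static unsigned int example_hash_shift __ro_after_init __used;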
Link: https://github.com/oracle-samples/drgn-tools/blob/3afc56146f54d09dfd1f6d3c1b... [1]
Fixes: e3c92e81711d ("runtime constants: add x86 architecture support")
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	fs/dcache.c
[Context conflicts]
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 fs/dcache.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index 3e0b9abbf4d8..ea0784aff435 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -94,11 +94,16 @@ EXPORT_SYMBOL(dotdot_name);
  *
  * This hash-function tries to avoid losing too many bits of hash
  * information, yet avoid using a prime hash-size or similar.
+ *
+ * Marking the variables "used" ensures that the compiler doesn't
+ * optimize them away completely on architectures with runtime
+ * constant infrastructure, this allows debuggers to see their
+ * values. But updating these values has no effect on those arches.
  */
 
-static unsigned int d_hash_shift __read_mostly;
+static unsigned int d_hash_shift __ro_after_init __used;
 
-static struct hlist_bl_head *dentry_hashtable __read_mostly;
+static struct hlist_bl_head *dentry_hashtable __ro_after_init __used;
 
 static inline struct hlist_bl_head *d_hash(unsigned int hash)
 {
From: Linus Torvalds <torvalds@linux-foundation.org>

mainline inclusion
from mainline-v6.12-rc5
commit 4dc1f31ec3f13a065c7ae2ccdec562b0123e21bb
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
The x86 user pointer validation changes made me look at compiler output a lot, and the wrong indentation for the ".popsection" in the generated assembler triggered me.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/include/asm/runtime-const.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/runtime-const.h b/arch/x86/include/asm/runtime-const.h
index 24e3a53ca255..6652ebddfd02 100644
--- a/arch/x86/include/asm/runtime-const.h
+++ b/arch/x86/include/asm/runtime-const.h
@@ -6,7 +6,7 @@
 	typeof(sym) __ret;					\
 	asm_inline("mov %1,%0\n1:\n"				\
 		".pushsection runtime_ptr_" #sym ",\"a\"\n\t"	\
-		".long 1b - %c2 - .\n\t"			\
+		".long 1b - %c2 - .\n"				\
 		".popsection"					\
 		:"=r" (__ret)					\
 		:"i" ((unsigned long)0x0123456789abcdefull),	\
@@ -20,7 +20,7 @@
 	typeof(0u+(val)) __ret = (val);				\
 	asm_inline("shrl $12,%k0\n1:\n"				\
 		".pushsection runtime_shift_" #sym ",\"a\"\n\t"	\
-		".long 1b - 1 - .\n\t"				\
+		".long 1b - 1 - .\n"				\
 		".popsection"					\
 		:"+r" (__ret));					\
 	__ret; })
From: Linus Torvalds <torvalds@linux-foundation.org>

mainline inclusion
from mainline-v6.11-rc1
commit 8a2462df154799129d8259079ec1fecf78703189
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Streamline the 8-byte case and drop the special handling. Use a macro which hides the exception handling.
No functional changes.
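For context, the contract these asm helpers implement is the usual get_user() one (caller sketch; names illustrative):

	u64 value;

	/* Returns 0 on success or -EFAULT on a bad user pointer;
	 * on failure the destination is zeroed. */
	if (get_user(value, (u64 __user *)uptr))
		return -EFAULT;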
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/CAHk-=whYb2L_atsRk9pBiFiVLGe5wNZLHhRinA69yu6FiKvDs...
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/lib/getuser.S | 69 ++++++++++++------------------------------
 1 file changed, 20 insertions(+), 49 deletions(-)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 6913fbce6544..ef22e4579a55 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -44,21 +44,23 @@
 	or %rdx, %rax
 .else
 	cmp $TASK_SIZE_MAX-\size+1, %eax
-.if \size != 8
 	jae .Lbad_get_user
-.else
-	jae .Lbad_get_user_8
-.endif
 	sbb %edx, %edx		/* array_index_mask_nospec() */
 	and %edx, %eax
 .endif
 .endm
 
+.macro UACCESS op src dst
+1:	\op \src,\dst
+	_ASM_EXTABLE_UA(1b, __get_user_handle_exception)
+.endm
+
+
 	.text
 SYM_FUNC_START(__get_user_1)
 	check_range size=1
 	ASM_STAC
-1:	movzbl (%_ASM_AX),%edx
+	UACCESS movzbl (%_ASM_AX),%edx
 	xor %eax,%eax
 	ASM_CLAC
 	RET
@@ -68,7 +70,7 @@ EXPORT_SYMBOL(__get_user_1)
 SYM_FUNC_START(__get_user_2)
 	check_range size=2
 	ASM_STAC
-2:	movzwl (%_ASM_AX),%edx
+	UACCESS movzwl (%_ASM_AX),%edx
 	xor %eax,%eax
 	ASM_CLAC
 	RET
@@ -78,7 +80,7 @@ EXPORT_SYMBOL(__get_user_2)
 SYM_FUNC_START(__get_user_4)
 	check_range size=4
 	ASM_STAC
-3:	movl (%_ASM_AX),%edx
+	UACCESS movl (%_ASM_AX),%edx
 	xor %eax,%eax
 	ASM_CLAC
 	RET
@@ -89,10 +91,11 @@ SYM_FUNC_START(__get_user_8)
 	check_range size=8
 	ASM_STAC
 #ifdef CONFIG_X86_64
-4:	movq (%_ASM_AX),%rdx
+	UACCESS movq (%_ASM_AX),%rdx
 #else
-4:	movl (%_ASM_AX),%edx
-5:	movl 4(%_ASM_AX),%ecx
+	xor %ecx,%ecx
+	UACCESS movl (%_ASM_AX),%edx
+	UACCESS movl 4(%_ASM_AX),%ecx
 #endif
 	xor %eax,%eax
 	ASM_CLAC
@@ -104,7 +107,7 @@ EXPORT_SYMBOL(__get_user_8)
 SYM_FUNC_START(__get_user_nocheck_1)
 	ASM_STAC
 	ASM_BARRIER_NOSPEC
-6:	movzbl (%_ASM_AX),%edx
+	UACCESS movzbl (%_ASM_AX),%edx
 	xor %eax,%eax
 	ASM_CLAC
 	RET
@@ -114,7 +117,7 @@ EXPORT_SYMBOL(__get_user_nocheck_1)
 SYM_FUNC_START(__get_user_nocheck_2)
 	ASM_STAC
 	ASM_BARRIER_NOSPEC
-7:	movzwl (%_ASM_AX),%edx
+	UACCESS movzwl (%_ASM_AX),%edx
 	xor %eax,%eax
 	ASM_CLAC
 	RET
@@ -124,7 +127,7 @@ EXPORT_SYMBOL(__get_user_nocheck_2)
 SYM_FUNC_START(__get_user_nocheck_4)
 	ASM_STAC
 	ASM_BARRIER_NOSPEC
-8:	movl (%_ASM_AX),%edx
+	UACCESS movl (%_ASM_AX),%edx
 	xor %eax,%eax
 	ASM_CLAC
 	RET
@@ -135,10 +138,11 @@ SYM_FUNC_START(__get_user_nocheck_8)
 	ASM_STAC
 	ASM_BARRIER_NOSPEC
 #ifdef CONFIG_X86_64
-9:	movq (%_ASM_AX),%rdx
+	UACCESS movq (%_ASM_AX),%rdx
 #else
-9:	movl (%_ASM_AX),%edx
-10:	movl 4(%_ASM_AX),%ecx
+	xor %ecx,%ecx
+	UACCESS movl (%_ASM_AX),%edx
+	UACCESS movl 4(%_ASM_AX),%ecx
 #endif
 	xor %eax,%eax
 	ASM_CLAC
@@ -154,36 +158,3 @@ SYM_CODE_START_LOCAL(__get_user_handle_exception)
 	mov $(-EFAULT),%_ASM_AX
 	RET
 SYM_CODE_END(__get_user_handle_exception)
-
-#ifdef CONFIG_X86_32
-SYM_CODE_START_LOCAL(__get_user_8_handle_exception)
-	ASM_CLAC
-.Lbad_get_user_8:
-	xor %edx,%edx
-	xor %ecx,%ecx
-	mov $(-EFAULT),%_ASM_AX
-	RET
-SYM_CODE_END(__get_user_8_handle_exception)
-#endif
-
-/* get_user */
-	_ASM_EXTABLE_UA(1b, __get_user_handle_exception)
-	_ASM_EXTABLE_UA(2b, __get_user_handle_exception)
-	_ASM_EXTABLE_UA(3b, __get_user_handle_exception)
-#ifdef CONFIG_X86_64
-	_ASM_EXTABLE_UA(4b, __get_user_handle_exception)
-#else
-	_ASM_EXTABLE_UA(4b, __get_user_8_handle_exception)
-	_ASM_EXTABLE_UA(5b, __get_user_8_handle_exception)
-#endif
-
-/* __get_user */
-	_ASM_EXTABLE_UA(6b, __get_user_handle_exception)
-	_ASM_EXTABLE_UA(7b, __get_user_handle_exception)
-	_ASM_EXTABLE_UA(8b, __get_user_handle_exception)
-#ifdef CONFIG_X86_64
-	_ASM_EXTABLE_UA(9b, __get_user_handle_exception)
-#else
-	_ASM_EXTABLE_UA(9b, __get_user_8_handle_exception)
-	_ASM_EXTABLE_UA(10b, __get_user_8_handle_exception)
-#endif
From: David Gow <davidgow@google.com>

mainline inclusion
from mainline-v6.11-rc2
commit dd35a0933269c636635b6af89dc6fa1782791e56
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
While zeroing the upper 32 bits of an 8-byte getuser on 32-bit x86 was fixed by commit 8c860ed825cb ("x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking"), it was broken again in commit 8a2462df1547 ("x86/uaccess: Improve the 8-byte getuser() case").
This is because the register which holds the upper 32 bits (%ecx) is being cleared _after_ the check_range, so if the range check fails, %ecx is never cleared.
This can be reproduced with:

  ./tools/testing/kunit/kunit.py run --arch i386 usercopy
Instead, clear %ecx _before_ check_range in the 8-byte case. This reintroduces a bit of the ugliness we were trying to avoid by adding another #ifndef CONFIG_X86_64, but at least keeps check_range from needing a separate bad_get_user_8 jump.
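In C terms, the regression looked like this to a 32-bit caller (illustrative sketch):

	u64 val;
	int ret = get_user(val, (u64 __user *)invalid_ptr);
	/* Expected: ret == -EFAULT and val == 0.
	 * Broken:   the upper 32 bits of val held stale register
	 *           contents, because %ecx was only cleared after
	 *           check_range had already branched to the common
	 *           exception path. */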
Fixes: 8a2462df1547 ("x86/uaccess: Improve the 8-byte getuser() case")
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/20240731073031.4045579-1-davidgow@google.com
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/lib/getuser.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index ef22e4579a55..35e88101edb9 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -88,12 +88,14 @@ SYM_FUNC_END(__get_user_4)
 EXPORT_SYMBOL(__get_user_4)
 
 SYM_FUNC_START(__get_user_8)
+#ifndef CONFIG_X86_64
+	xor %ecx,%ecx
+#endif
 	check_range size=8
 	ASM_STAC
 #ifdef CONFIG_X86_64
 	UACCESS movq (%_ASM_AX),%rdx
 #else
-	xor %ecx,%ecx
 	UACCESS movl (%_ASM_AX),%edx
 	UACCESS movl 4(%_ASM_AX),%ecx
 #endif
From: Linus Torvalds <torvalds@linux-foundation.org>

mainline inclusion
from mainline-v6.12-rc5
commit 86e6b1547b3d013bc392adf775b89318441403c2
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
It turns out that AMD has a "Meltdown Lite(tm)" issue with non-canonical accesses in kernel space. And so using just the high bit to decide whether an access is in user space or kernel space ends up with the good old "leak speculative data" if you have the right gadget using the result:
CVE-2020-12965 "Transient Execution of Non-Canonical Accesses"
Now, the kernel surrounds the access with a STAC/CLAC pair, and those instructions end up serializing execution on older Zen architectures, which closes the speculation window.
But that was true only up until Zen 5, which renames the AC bit [1]. That improves performance of STAC/CLAC a lot, but also means that the speculation window is now open.
Note that this affects not just the new address masking, but also the regular valid_user_address() check used by access_ok(), and the asm version of the sign bit check in the get_user() helpers.
It does not affect put_user() or clear_user() variants, since there's no speculative result to be used in a gadget for those operations.
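The cmp/sbb sequence used by the fix is, in plain C terms, roughly the following branchless select (sketch; the real code is the inline asm in the diff below):

	/* If ptr is above USER_PTR_MAX, mask becomes all ones and the
	 * returned address is all ones too - guaranteed to fault rather
	 * than speculatively read kernel data. Otherwise ptr is
	 * returned unchanged. */
	static inline unsigned long mask_user_ptr(unsigned long ptr,
						  unsigned long user_ptr_max)
	{
		unsigned long mask = (ptr > user_ptr_max) ? ~0UL : 0UL;

		return ptr | mask;
	}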
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Link: https://lore.kernel.org/all/80d94591-1297-4afb-b510-c665efd37f10@citrix.com/
Link: https://lore.kernel.org/all/20241023094448.GAZxjFkEOOF_DM83TQ@fat_crate.loca... [1]
Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-1010.html
Link: https://arxiv.org/pdf/2108.10771
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> # LAM case
Fixes: 2865baf54077 ("x86: support user address masking instead of non-speculative conditional")
Fixes: 6014bc27561f ("x86-64: make access_ok() independent of LAM")
Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	arch/x86/kernel/vmlinux.lds.S
	arch/x86/kernel/cpu/common.c
	arch/x86/include/asm/uaccess_64.h
[Context conflicts]
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/include/asm/uaccess_64.h | 44 ++++++++++++++++++++-----------
 arch/x86/kernel/cpu/common.c      | 10 +++++++
 arch/x86/kernel/vmlinux.lds.S     |  1 +
 arch/x86/lib/getuser.S            |  9 +++++--
 4 files changed, 46 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index c8d758496f8c..179bc5490b7e 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -15,6 +15,13 @@
 	defined(CONFIG_X86_HYGON_LMC_AVX2_ON)
 #include <asm/fpu/api.h>
 #endif
+#include <asm/runtime-const.h>
+
+/*
+ * Virtual variable: there's no actual backing store for this,
+ * it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'
+ */
+extern unsigned long USER_PTR_MAX;
 
 extern struct static_key_false hygon_lmc_key;
 
@@ -55,35 +62,40 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
 
 #endif
 
+#define valid_user_address(x) \
+	((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
+
 /*
- * The virtual address space space is logically divided into a kernel
- * half and a user half.  When cast to a signed type, user pointers
- * are positive and kernel pointers are negative.
+ * Masking the user address is an alternative to a conditional
+ * user_access_begin that can avoid the fencing. This only works
+ * for dense accesses starting at the address.
  */
-#define valid_user_address(x) ((__force long)(x) >= 0)
+static inline void __user *mask_user_address(const void __user *ptr)
+{
+	unsigned long mask;
+	asm("cmp %1,%0\n\t"
+	    "sbb %0,%0"
+		:"=r" (mask)
+		:"r" (ptr),
+		 "0" (runtime_const_ptr(USER_PTR_MAX)));
+	return (__force void __user *)(mask | (__force unsigned long)ptr);
+}
 
 /*
  * User pointers can have tag bits on x86-64.  This scheme tolerates
  * arbitrary values in those bits rather then masking them off.
  *
  * Enforce two rules:
- * 1. 'ptr' must be in the user half of the address space
+ * 1. 'ptr' must be in the user part of the address space
  * 2. 'ptr+size' must not overflow into kernel addresses
  *
- * Note that addresses around the sign change are not valid addresses,
- * and will GP-fault even with LAM enabled if the sign bit is set (see
- * "CR3.LAM_SUP" that can narrow the canonicality check if we ever
- * enable it, but not remove it entirely).
- *
- * So the "overflow into kernel addresses" does not imply some sudden
- * exact boundary at the sign bit, and we can allow a lot of slop on the
- * size check.
+ * Note that we always have at least one guard page between the
+ * max user address and the non-canonical gap, allowing us to
+ * ignore small sizes entirely.
  *
  * In fact, we could probably remove the size check entirely, since
  * any kernel accesses will be in increasing address order starting
- * at 'ptr', and even if the end might be in kernel space, we'll
- * hit the GP faults for non-canonical accesses before we ever get
- * there.
+ * at 'ptr'.
  *
  * That's a separate optimization, for now just handle the small
  * constant case.
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 26d7a26ef2d2..cbdcdd2956e0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -65,6 +65,7 @@
 #include <asm/set_memory.h>
 #include <asm/traps.h>
 #include <asm/sev.h>
+#include <asm/runtime-const.h>
 
 #include "cpu.h"
 
@@ -2446,6 +2447,15 @@ void __init arch_cpu_finalize_init(void)
 	alternative_instructions();
 
 	if (IS_ENABLED(CONFIG_X86_64)) {
+		unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;
+
+		/*
+		 * Enable this when LAM is gated on LASS support
+		if (cpu_feature_enabled(X86_FEATURE_LAM))
+			USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1;
+		 */
+		runtime_const_init(ptr, USER_PTR_MAX);
+
 		/*
 		 * Make sure the first 2MB area is not mapped by huge pages
 		 * There are typically fixed size MTRRs in there and overlapping
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index ad43f427bab0..0ff946f21c27 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -375,6 +375,7 @@ SECTIONS
 
 	RUNTIME_CONST(shift, d_hash_shift)
 	RUNTIME_CONST(ptr, dentry_hashtable)
+	RUNTIME_CONST(ptr, USER_PTR_MAX)
 
 	. = ALIGN(PAGE_SIZE);
 
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 35e88101edb9..8f41716c40bc 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -39,8 +39,13 @@
 
 .macro check_range size:req
 .if IS_ENABLED(CONFIG_X86_64)
-	mov %rax, %rdx
-	sar $63, %rdx
+	movq $0x0123456789abcdef,%rdx
+1:
+	.pushsection runtime_ptr_USER_PTR_MAX,"a"
+	.long 1b - 8 - .
+	.popsection
+	cmp %rax, %rdx
+	sbb %rdx, %rdx
 	or %rdx, %rax
 .else
 	cmp $TASK_SIZE_MAX-\size+1, %eax
From: David Laight <David.Laight@ACULAB.COM>

mainline inclusion
from mainline-v6.13-rc1
commit 573f45a9f9a47fed4c7957609689b772121b33d7
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/IB2BWT
CVE: CVE-2024-50102
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
When the size isn't a small constant, __access_ok() will call valid_user_address() with the address after the last byte of the user buffer.
It is valid for a buffer to end with the last valid user address, so valid_user_address() must allow accesses to the base of the guard page.
[ This introduces an off-by-one in the other direction for the plain non-sized accesses, but since we have that guard region that is a whole page, those checks "allowing" accesses to that guard region don't really matter. The access will fault anyway, whether to the guard page or if the address has been masked to all ones - Linus ]
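A worked example of the boundary arithmetic (illustrative):

	/* __access_ok(ptr, size) checks valid_user_address(ptr + size),
	 * i.e. the address one past the last byte of the buffer. For a
	 * buffer ending at the last valid user byte:
	 *
	 *	ptr + size == TASK_SIZE_MAX	(base of the guard page)
	 *
	 * With USER_PTR_MAX == TASK_SIZE_MAX - 1 this was rejected even
	 * though the access is legal; with USER_PTR_MAX == TASK_SIZE_MAX
	 * it is accepted, and the one extra address now "allowed" for
	 * plain accesses is in the guard page, which faults anyway. */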
Fixes: 86e6b1547b3d0 ("x86: fix user address masking non-canonical speculation issue")
Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Qi Xi <xiqi2@huawei.com>
---
 arch/x86/kernel/cpu/common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cbdcdd2956e0..570ecdca404b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2447,12 +2447,12 @@ void __init arch_cpu_finalize_init(void)
 	alternative_instructions();
 
 	if (IS_ENABLED(CONFIG_X86_64)) {
-		unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;
+		unsigned long USER_PTR_MAX = TASK_SIZE_MAX;
 
 		/*
 		 * Enable this when LAM is gated on LASS support
 		if (cpu_feature_enabled(X86_FEATURE_LAM))
-			USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1;
+			USER_PTR_MAX = (1ul << 63) - PAGE_SIZE;
 		 */
 		runtime_const_init(ptr, USER_PTR_MAX);
FeedBack: The patch(es) you have sent to the kernel@openeuler.org mailing list have been converted to a pull request successfully!
Pull request link: https://gitee.com/openeuler/kernel/pulls/14808
Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/O...