From: Peter Zijlstra peterz@infradead.org
stable inclusion from stable-v5.10.133 commit 3d13ee0d411a078ca1538d823c2c759b8b266fb1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5YVKO
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
--------------------------------
commit bbe2df3f6b6da7848398d55b1311d58a16ec21e4 upstream.
Try and replace retpoline thunk calls with:
LFENCE CALL *%\reg
for spectre_v2=retpoline,amd.
Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail.
Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted.
Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically:
Jncc.d8 1f LFENCE JMP *%\reg 1:
is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Borislav Petkov bp@suse.de Acked-by: Josh Poimboeuf jpoimboe@redhat.com Tested-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org [cascardo: RETPOLINE_AMD was renamed to RETPOLINE_LFENCE] Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Signed-off-by: Ben Hutchings ben@decadent.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Zheng Zengkai zhengzengkai@huawei.com Acked-by: Xie XiuQi xiexiuqi@huawei.com --- arch/x86/kernel/alternative.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 4964c4b52be8..487ade1768b1 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -545,6 +545,7 @@ static int emit_indirect(int op, int reg, u8 *bytes) * * CALL *%\reg * + * It also tries to inline spectre_v2=retpoline,amd when size permits. */ static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes) { @@ -561,7 +562,8 @@ static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes) /* If anyone ever does: CALL/JMP *%rsp, we're in deep trouble. */ BUG_ON(reg == 4);
- if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) + if (cpu_feature_enabled(X86_FEATURE_RETPOLINE) && + !cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) return -1;
op = insn->opcode.bytes[0]; @@ -574,8 +576,9 @@ static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes) * into: * * Jncc.d8 1f + * [ LFENCE ] * JMP *%\reg - * NOP + * [ NOP ] * 1: */ /* Jcc.d32 second opcode byte is in the range: 0x80-0x8f */ @@ -590,6 +593,15 @@ static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes) op = JMP32_INSN_OPCODE; }
+ /* + * For RETPOLINE_AMD: prepend the indirect CALL/JMP with an LFENCE. + */ + if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) { + bytes[i++] = 0x0f; + bytes[i++] = 0xae; + bytes[i++] = 0xe8; /* LFENCE */ + } + ret = emit_indirect(op, reg, bytes + i); if (ret < 0) return ret;