From: caijian caijian11@h-partners.com
Backport the code of enabling 'writable' ID registers from mainline to openEuler 6.6. It is the fundation of live migration between different platforms.
Jing Zhang (5): KVM: arm64: Allow userspace to get the writable masks for feature ID registers KVM: arm64: Document KVM_ARM_GET_REG_WRITABLE_MASKS KVM: arm64: Use guest ID register values for the sake of emulation KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1 KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
Oliver Upton (6): KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version KVM: arm64: Reject attempts to set invalid debug arch version KVM: arm64: Bump up the default KVM sanitised debug version to v8p8 KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1 KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1 KVM: arm64: Document vCPU feature selection UAPIs
Documentation/virt/kvm/api.rst | 52 +++++ Documentation/virt/kvm/arm/index.rst | 1 + Documentation/virt/kvm/arm/vcpu-features.rst | 48 +++++ arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/include/uapi/asm/kvm.h | 32 +++ arch/arm64/kvm/arm.c | 10 + arch/arm64/kvm/sys_regs.c | 194 ++++++++++++++++--- include/uapi/linux/kvm.h | 2 + 8 files changed, 309 insertions(+), 32 deletions(-) create mode 100644 Documentation/virt/kvm/arm/vcpu-features.rst
From: Jing Zhang jingzhangos@google.com
mainline inclusion from mainline-v6.7-rc1 commit 3f9cd0ca848413fd368278310d2cdd6c2bef48b2 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
While the Feature ID range is well defined and pretty large, it isn't inconceivable that the architecture will eventually grow some other ranges that will need to similarly be described to userspace.
Add a VM ioctl to allow userspace to get writable masks for feature ID registers in below system register space: op0 = 3, op1 = {0, 1, 3}, CRn = 0, CRm = {0 - 7}, op2 = {0 - 7} This is used to support mix-and-match userspace and kernels for writable ID registers, where userspace may want to know upfront whether it can actually tweak the contents of an idreg or not.
Add a new capability (KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES) that returns a bitmap of the valid ranges, which can subsequently be retrieved, one at a time by setting the index of the set bit as the range identifier.
Suggested-by: Marc Zyngier maz@kernel.org Suggested-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Jing Zhang jingzhangos@google.com Reviewed-by: Cornelia Huck cohuck@redhat.com Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/include/asm/kvm_host.h | 2 + arch/arm64/include/uapi/asm/kvm.h | 32 +++++++++++++++ arch/arm64/kvm/arm.c | 10 +++++ arch/arm64/kvm/sys_regs.c | 66 +++++++++++++++++++++++++++++++ include/uapi/linux/kvm.h | 2 + 5 files changed, 112 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 8c64fa87279b..ab005ceca231 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1146,6 +1146,8 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, struct kvm_arm_copy_mte_tags *copy_tags); int kvm_vm_ioctl_set_counter_offset(struct kvm *kvm, struct kvm_arm_counter_offset *offset); +int kvm_vm_ioctl_get_reg_writable_masks(struct kvm *kvm, + struct reg_mask_range *range);
/* Guest/host FPSIMD coordination helpers */ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index f7ddd73a8c0f..89d2fc872d9f 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -505,6 +505,38 @@ struct kvm_smccc_filter { #define KVM_HYPERCALL_EXIT_SMC (1U << 0) #define KVM_HYPERCALL_EXIT_16BIT (1U << 1)
+/* + * Get feature ID registers userspace writable mask. + * + * From DDI0487J.a, D19.2.66 ("ID_AA64MMFR2_EL1, AArch64 Memory Model + * Feature Register 2"): + * + * "The Feature ID space is defined as the System register space in + * AArch64 with op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, + * op2=={0-7}." + * + * This covers all currently known R/O registers that indicate + * anything useful feature wise, including the ID registers. + * + * If we ever need to introduce a new range, it will be described as + * such in the range field. + */ +#define KVM_ARM_FEATURE_ID_RANGE_IDX(op0, op1, crn, crm, op2) \ + ({ \ + __u64 __op1 = (op1) & 3; \ + __op1 -= (__op1 == 3); \ + (__op1 << 6 | ((crm) & 7) << 3 | (op2)); \ + }) + +#define KVM_ARM_FEATURE_ID_RANGE 0 +#define KVM_ARM_FEATURE_ID_RANGE_SIZE (3 * 8 * 8) + +struct reg_mask_range { + __u64 addr; /* Pointer to mask array */ + __u32 range; /* Requested range */ + __u32 reserved[13]; +}; + #endif
#endif /* __ARM_KVM_H__ */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 859689e80ec4..026c4de5d762 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -344,6 +344,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = sdev_enable; break; #endif + case KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES: + r = BIT(0); + break; default: r = 0; } @@ -1748,6 +1751,13 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) return 0; } #endif + case KVM_ARM_GET_REG_WRITABLE_MASKS: { + struct reg_mask_range range; + + if (copy_from_user(&range, argp, sizeof(range))) + return -EFAULT; + return kvm_vm_ioctl_get_reg_writable_masks(kvm, &range); + } default: return -EINVAL; } diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index b4608d1b08b5..e6631cce6dfa 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1429,6 +1429,13 @@ static inline bool is_id_reg(u32 id) sys_reg_CRm(id) < 8); }
+static inline bool is_aa32_id_reg(u32 id) +{ + return (sys_reg_Op0(id) == 3 && sys_reg_Op1(id) == 0 && + sys_reg_CRn(id) == 0 && sys_reg_CRm(id) >= 1 && + sys_reg_CRm(id) <= 3); +} + static unsigned int id_visibility(const struct kvm_vcpu *vcpu, const struct sys_reg_desc *r) { @@ -3648,6 +3655,65 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices) return write_demux_regids(uindices); }
+#define KVM_ARM_FEATURE_ID_RANGE_INDEX(r) \ + KVM_ARM_FEATURE_ID_RANGE_IDX(sys_reg_Op0(r), \ + sys_reg_Op1(r), \ + sys_reg_CRn(r), \ + sys_reg_CRm(r), \ + sys_reg_Op2(r)) + +static bool is_feature_id_reg(u32 encoding) +{ + return (sys_reg_Op0(encoding) == 3 && + (sys_reg_Op1(encoding) < 2 || sys_reg_Op1(encoding) == 3) && + sys_reg_CRn(encoding) == 0 && + sys_reg_CRm(encoding) <= 7); +} + +int kvm_vm_ioctl_get_reg_writable_masks(struct kvm *kvm, struct reg_mask_range *range) +{ + const void *zero_page = page_to_virt(ZERO_PAGE(0)); + u64 __user *masks = (u64 __user *)range->addr; + + /* Only feature id range is supported, reserved[13] must be zero. */ + if (range->range || + memcmp(range->reserved, zero_page, sizeof(range->reserved))) + return -EINVAL; + + /* Wipe the whole thing first */ + if (clear_user(masks, KVM_ARM_FEATURE_ID_RANGE_SIZE * sizeof(__u64))) + return -EFAULT; + + for (int i = 0; i < ARRAY_SIZE(sys_reg_descs); i++) { + const struct sys_reg_desc *reg = &sys_reg_descs[i]; + u32 encoding = reg_to_encoding(reg); + u64 val; + + if (!is_feature_id_reg(encoding) || !reg->set_user) + continue; + + /* + * For ID registers, we return the writable mask. Other feature + * registers return a full 64bit mask. That's not necessary + * compliant with a given revision of the architecture, but the + * RES0/RES1 definitions allow us to do that. + */ + if (is_id_reg(encoding)) { + if (!reg->val || + (is_aa32_id_reg(encoding) && !kvm_supports_32bit_el0())) + continue; + val = reg->val; + } else { + val = ~0UL; + } + + if (put_user(val, (masks + KVM_ARM_FEATURE_ID_RANGE_INDEX(encoding)))) + return -EFAULT; + } + + return 0; +} + int __init kvm_sys_reg_table_init(void) { struct sys_reg_params params; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index d2ab4a3d6993..48b9cfda093a 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1200,6 +1200,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_COUNTER_OFFSET 227 #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228 #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229 +#define KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES 230
#define KVM_CAP_ARM_VIRT_MSI_BYPASS 799
@@ -1578,6 +1579,7 @@ struct kvm_s390_ucas_mapping { #define KVM_ARM_MTE_COPY_TAGS _IOR(KVMIO, 0xb4, struct kvm_arm_copy_mte_tags) /* Available with KVM_CAP_COUNTER_OFFSET */ #define KVM_ARM_SET_COUNTER_OFFSET _IOW(KVMIO, 0xb5, struct kvm_arm_counter_offset) +#define KVM_ARM_GET_REG_WRITABLE_MASKS _IOR(KVMIO, 0xb6, struct reg_mask_range)
/* ioctl for SW vcpu init*/ #define KVM_SW64_VCPU_INIT _IO(KVMIO, 0xba)
From: Jing Zhang jingzhangos@google.com
mainline inclusion from mainline-v6.7-rc1 commit 6656cda0f3b269cb87cfb5773d3369e6d96e83db category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Add some basic documentation on how to get feature ID register writable masks from userspace.
Signed-off-by: Jing Zhang jingzhangos@google.com Reviewed-by: Cornelia Huck cohuck@redhat.com Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- Documentation/virt/kvm/api.rst | 48 ++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index edc682a94ca4..b5aed9faf2bb 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6122,6 +6122,54 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG interface. No error will be returned, but the resulting offset will not be applied.
+4.139 KVM_ARM_GET_REG_WRITABLE_MASKS +------------------------------------------- + +:Capability: KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES +:Architectures: arm64 +:Type: vm ioctl +:Parameters: struct reg_mask_range (in/out) +:Returns: 0 on success, < 0 on error + + +:: + + #define KVM_ARM_FEATURE_ID_RANGE 0 + #define KVM_ARM_FEATURE_ID_RANGE_SIZE (3 * 8 * 8) + + struct reg_mask_range { + __u64 addr; /* Pointer to mask array */ + __u32 range; /* Requested range */ + __u32 reserved[13]; + }; + +This ioctl copies the writable masks for a selected range of registers to +userspace. + +The ``addr`` field is a pointer to the destination array where KVM copies +the writable masks. + +The ``range`` field indicates the requested range of registers. +``KVM_CHECK_EXTENSION`` for the ``KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES`` +capability returns the supported ranges, expressed as a set of flags. Each +flag's bit index represents a possible value for the ``range`` field. +All other values are reserved for future use and KVM may return an error. + +The ``reserved[13]`` array is reserved for future use and should be 0, or +KVM may return an error. + +KVM_ARM_FEATURE_ID_RANGE (0) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Feature ID range is defined as the AArch64 System register space with +op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}. + +The mask returned array pointed to by ``addr`` is indexed by the macro +``ARM64_FEATURE_ID_RANGE_IDX(op0, op1, crn, crm, op2)``, allowing userspace +to know what fields can be changed for the system register described by +``op0, op1, crn, crm, op2``. KVM rejects ID register values that describe a +superset of the features supported by the system. + 5. The kvm_run structure ========================
From: Jing Zhang jingzhangos@google.com
mainline inclusion from mainline-v6.7-rc1 commit 8b6958d6ace19d56b44ca0f961f4fa480d4b48fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Since KVM now supports per-VM ID registers, use per-VM ID register values for the sake of emulation for DBGDIDR and LORegion.
Signed-off-by: Jing Zhang jingzhangos@google.com Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index e6631cce6dfa..74851be2d946 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -379,7 +379,7 @@ static bool trap_loregion(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { - u64 val = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); + u64 val = IDREG(vcpu->kvm, SYS_ID_AA64MMFR1_EL1); u32 sr = reg_to_encoding(r);
if (!(val & (0xfUL << ID_AA64MMFR1_EL1_LO_SHIFT))) { @@ -2512,8 +2512,8 @@ static bool trap_dbgdidr(struct kvm_vcpu *vcpu, if (p->is_write) { return ignore_write(vcpu, p); } else { - u64 dfr = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1); - u64 pfr = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1); + u64 dfr = IDREG(vcpu->kvm, SYS_ID_AA64DFR0_EL1); + u64 pfr = IDREG(vcpu->kvm, SYS_ID_AA64PFR0_EL1); u32 el3 = !!cpuid_feature_extract_unsigned_field(pfr, ID_AA64PFR0_EL1_EL3_SHIFT);
p->regval = ((((dfr >> ID_AA64DFR0_EL1_WRPs_SHIFT) & 0xf) << 28) |
From: Oliver Upton oliver.upton@linux.dev
mainline inclusion from mainline-v6.7-rc1 commit 5a23e5c7cb0d5b2b38c33417579cd628be7d14c5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Much like we do for other fields, extract the Debug architecture version from the ID register to populate the corresponding field in DBGDIDR. Rewrite the existing sysreg field extractors to use SYS_FIELD_GET() for consistency.
Suggested-by: Marc Zyngier maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 74851be2d946..fd770f7221d4 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2514,12 +2514,13 @@ static bool trap_dbgdidr(struct kvm_vcpu *vcpu, } else { u64 dfr = IDREG(vcpu->kvm, SYS_ID_AA64DFR0_EL1); u64 pfr = IDREG(vcpu->kvm, SYS_ID_AA64PFR0_EL1); - u32 el3 = !!cpuid_feature_extract_unsigned_field(pfr, ID_AA64PFR0_EL1_EL3_SHIFT); + u32 el3 = !!SYS_FIELD_GET(ID_AA64PFR0_EL1, EL3, pfr);
- p->regval = ((((dfr >> ID_AA64DFR0_EL1_WRPs_SHIFT) & 0xf) << 28) | - (((dfr >> ID_AA64DFR0_EL1_BRPs_SHIFT) & 0xf) << 24) | - (((dfr >> ID_AA64DFR0_EL1_CTX_CMPs_SHIFT) & 0xf) << 20) - | (6 << 16) | (1 << 15) | (el3 << 14) | (el3 << 12)); + p->regval = ((SYS_FIELD_GET(ID_AA64DFR0_EL1, WRPs, dfr) << 28) | + (SYS_FIELD_GET(ID_AA64DFR0_EL1, BRPs, dfr) << 24) | + (SYS_FIELD_GET(ID_AA64DFR0_EL1, CTX_CMPs, dfr) << 20) | + (SYS_FIELD_GET(ID_AA64DFR0_EL1, DebugVer, dfr) << 16) | + (1 << 15) | (el3 << 14) | (el3 << 12)); return true; } }
From: Oliver Upton oliver.upton@linux.dev
mainline inclusion from mainline-v6.7-rc1 commit a9bc4a1c1e0c206838bc2d0c8621ca6df9704a2f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
The debug architecture is mandatory in ARMv8, so KVM should not allow userspace to configure a vCPU with less than that. Of course, this isn't handled elegantly by the generic ID register plumbing, as the respective ID register fields have a nonzero starting value.
Add an explicit check for debug versions less than v8 of the architecture.
Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 32 +++++++++++++++++++++++++++++--- 1 file changed, 29 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index fd770f7221d4..b3b87363d7d5 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1241,8 +1241,14 @@ static s64 kvm_arm64_ftr_safe_value(u32 id, const struct arm64_ftr_bits *ftrp, /* Some features have different safe value type in KVM than host features */ switch (id) { case SYS_ID_AA64DFR0_EL1: - if (kvm_ftr.shift == ID_AA64DFR0_EL1_PMUVer_SHIFT) + switch (kvm_ftr.shift) { + case ID_AA64DFR0_EL1_PMUVer_SHIFT: kvm_ftr.type = FTR_LOWER_SAFE; + break; + case ID_AA64DFR0_EL1_DebugVer_SHIFT: + kvm_ftr.type = FTR_LOWER_SAFE; + break; + } break; case SYS_ID_DFR0_EL1: if (kvm_ftr.shift == ID_DFR0_EL1_PerfMon_SHIFT) @@ -1540,14 +1546,22 @@ static u64 read_sanitised_id_aa64pfr0_el1(struct kvm_vcpu *vcpu, return val; }
+#define ID_REG_LIMIT_FIELD_ENUM(val, reg, field, limit) \ +({ \ + u64 __f_val = FIELD_GET(reg##_##field##_MASK, val); \ + (val) &= ~reg##_##field##_MASK; \ + (val) |= FIELD_PREP(reg##_##field##_MASK, \ + min(__f_val, (u64)reg##_##field##_##limit)); \ + (val); \ +}) + static u64 read_sanitised_id_aa64dfr0_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd) { u64 val = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
/* Limit debug to ARMv8.0 */ - val &= ~ID_AA64DFR0_EL1_DebugVer_MASK; - val |= SYS_FIELD_PREP_ENUM(ID_AA64DFR0_EL1, DebugVer, IMP); + val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64DFR0_EL1, DebugVer, IMP);
/* * Only initialize the PMU version if the vCPU was configured with one. @@ -1567,6 +1581,7 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, u64 val) { + u8 debugver = SYS_FIELD_GET(ID_AA64DFR0_EL1, DebugVer, val); u8 pmuver = SYS_FIELD_GET(ID_AA64DFR0_EL1, PMUVer, val);
/* @@ -1586,6 +1601,13 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu, if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) val &= ~ID_AA64DFR0_EL1_PMUVer_MASK;
+ /* + * ID_AA64DFR0_EL1.DebugVer is one of those awkward fields with a + * nonzero minimum safe value. + */ + if (debugver < ID_AA64DFR0_EL1_DebugVer_IMP) + return -EINVAL; + return set_id_reg(vcpu, rd, val); }
@@ -1607,6 +1629,7 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu, u64 val) { u8 perfmon = SYS_FIELD_GET(ID_DFR0_EL1, PerfMon, val); + u8 copdbg = SYS_FIELD_GET(ID_DFR0_EL1, CopDbg, val);
if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) { val &= ~ID_DFR0_EL1_PerfMon_MASK; @@ -1622,6 +1645,9 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu, if (perfmon != 0 && perfmon < ID_DFR0_EL1_PerfMon_PMUv3) return -EINVAL;
+ if (copdbg < ID_DFR0_EL1_CopDbg_Armv8) + return -EINVAL; + return set_id_reg(vcpu, rd, val); }
From: Oliver Upton oliver.upton@linux.dev
mainline inclusion from mainline-v6.7-rc1 commit 9f9917bc71b08335819ea667d1e392424fb76450 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Since ID_AA64DFR0_EL1 and ID_DFR0_EL1 are now writable from userspace, it is safe to bump up the default KVM sanitised debug version to v8p8.
Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index b3b87363d7d5..fe7d1caf4c4d 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1560,8 +1560,7 @@ static u64 read_sanitised_id_aa64dfr0_el1(struct kvm_vcpu *vcpu, { u64 val = read_sanitised_ftr_reg(SYS_ID_AA64DFR0_EL1);
- /* Limit debug to ARMv8.0 */ - val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64DFR0_EL1, DebugVer, IMP); + val = ID_REG_LIMIT_FIELD_ENUM(val, ID_AA64DFR0_EL1, DebugVer, V8P8);
/* * Only initialize the PMU version if the vCPU was configured with one. @@ -1621,6 +1620,8 @@ static u64 read_sanitised_id_dfr0_el1(struct kvm_vcpu *vcpu, if (kvm_vcpu_has_pmu(vcpu)) val |= SYS_FIELD_PREP(ID_DFR0_EL1, PerfMon, perfmon);
+ val = ID_REG_LIMIT_FIELD_ENUM(val, ID_DFR0_EL1, CopDbg, Debugv8p8); + return val; }
@@ -2077,7 +2078,8 @@ static const struct sys_reg_desc sys_reg_descs[] = { .set_user = set_id_dfr0_el1, .visibility = aa32_id_visibility, .reset = read_sanitised_id_dfr0_el1, - .val = ID_DFR0_EL1_PerfMon_MASK, }, + .val = ID_DFR0_EL1_PerfMon_MASK | + ID_DFR0_EL1_CopDbg_MASK, }, ID_HIDDEN(ID_AFR0_EL1), AA32_ID_SANITISED(ID_MMFR0_EL1), AA32_ID_SANITISED(ID_MMFR1_EL1), @@ -2126,7 +2128,8 @@ static const struct sys_reg_desc sys_reg_descs[] = { .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, .reset = read_sanitised_id_aa64dfr0_el1, - .val = ID_AA64DFR0_EL1_PMUVer_MASK, }, + .val = ID_AA64DFR0_EL1_PMUVer_MASK | + ID_AA64DFR0_EL1_DebugVer_MASK, }, ID_SANITISED(ID_AA64DFR1_EL1), ID_UNALLOCATED(5,2), ID_UNALLOCATED(5,3),
From: Oliver Upton oliver.upton@linux.dev
mainline inclusion from mainline-v6.7-rc1 commit 56d77aa8bdf527f3b767fa27a3f135769151c92d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Almost all of the features described by the ISA registers have no KVM involvement. Allow userspace to change the value of these registers with a couple exceptions:
- MOPS is not writable as KVM does not currently virtualize FEAT_MOPS.
- The PAuth fields are not writable as KVM requires both address and generic authentication be enabled.
Co-developed-by: Jing Zhang jingzhangos@google.com Signed-off-by: Jing Zhang jingzhangos@google.com Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-7-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 38 ++++++++++++++++++++++++++------------ 1 file changed, 26 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index fe7d1caf4c4d..abd3e1d7d810 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1915,11 +1915,14 @@ static unsigned int elx2_visibility(const struct kvm_vcpu *vcpu, * from userspace. */
-/* sys_reg_desc initialiser for known cpufeature ID registers */ -#define ID_SANITISED(name) { \ +#define ID_DESC(name) \ SYS_DESC(SYS_##name), \ .access = access_id_reg, \ - .get_user = get_id_reg, \ + .get_user = get_id_reg \ + +/* sys_reg_desc initialiser for known cpufeature ID registers */ +#define ID_SANITISED(name) { \ + ID_DESC(name), \ .set_user = set_id_reg, \ .visibility = id_visibility, \ .reset = kvm_read_sanitised_id_reg, \ @@ -1928,15 +1931,22 @@ static unsigned int elx2_visibility(const struct kvm_vcpu *vcpu,
/* sys_reg_desc initialiser for known cpufeature ID registers */ #define AA32_ID_SANITISED(name) { \ - SYS_DESC(SYS_##name), \ - .access = access_id_reg, \ - .get_user = get_id_reg, \ + ID_DESC(name), \ .set_user = set_id_reg, \ .visibility = aa32_id_visibility, \ .reset = kvm_read_sanitised_id_reg, \ .val = 0, \ }
+/* sys_reg_desc initialiser for writable ID registers */ +#define ID_WRITABLE(name, mask) { \ + ID_DESC(name), \ + .set_user = set_id_reg, \ + .visibility = id_visibility, \ + .reset = kvm_read_sanitised_id_reg, \ + .val = mask, \ +} + /* * sys_reg_desc initialiser for architecturally unallocated cpufeature ID * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2 @@ -1958,9 +1968,7 @@ static unsigned int elx2_visibility(const struct kvm_vcpu *vcpu, * RAZ for the guest. */ #define ID_HIDDEN(name) { \ - SYS_DESC(SYS_##name), \ - .access = access_id_reg, \ - .get_user = get_id_reg, \ + ID_DESC(name), \ .set_user = set_id_reg, \ .visibility = raz_visibility, \ .reset = kvm_read_sanitised_id_reg, \ @@ -2139,9 +2147,15 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_UNALLOCATED(5,7),
/* CRm=6 */ - ID_SANITISED(ID_AA64ISAR0_EL1), - ID_SANITISED(ID_AA64ISAR1_EL1), - ID_SANITISED(ID_AA64ISAR2_EL1), + ID_WRITABLE(ID_AA64ISAR0_EL1, ~ID_AA64ISAR0_EL1_RES0), + ID_WRITABLE(ID_AA64ISAR1_EL1, ~(ID_AA64ISAR1_EL1_GPI | + ID_AA64ISAR1_EL1_GPA | + ID_AA64ISAR1_EL1_API | + ID_AA64ISAR1_EL1_APA)), + ID_WRITABLE(ID_AA64ISAR2_EL1, ~(ID_AA64ISAR2_EL1_RES0 | + ID_AA64ISAR2_EL1_MOPS | + ID_AA64ISAR2_EL1_APA3 | + ID_AA64ISAR2_EL1_GPA3)), ID_UNALLOCATED(6,3), ID_UNALLOCATED(6,4), ID_UNALLOCATED(6,5),
From: Jing Zhang jingzhangos@google.com
mainline inclusion from mainline-v6.7-rc1 commit d5a32b60dc184cc7309f83648a368b94d91c797f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Allow userspace to modify the guest-visible values of these ID registers. Prevent changes to any of the virtualization features until KVM picks up support for nested and we have a handle on managing NV features.
Signed-off-by: Jing Zhang jingzhangos@google.com [oliver: prevent changes to EL2 features for now] Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-8-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index abd3e1d7d810..c1d8d8dfe1ff 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2163,9 +2163,23 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_UNALLOCATED(6,7),
/* CRm=7 */ - ID_SANITISED(ID_AA64MMFR0_EL1), - ID_SANITISED(ID_AA64MMFR1_EL1), - ID_SANITISED(ID_AA64MMFR2_EL1), + ID_WRITABLE(ID_AA64MMFR0_EL1, ~(ID_AA64MMFR0_EL1_RES0 | + ID_AA64MMFR0_EL1_TGRAN4_2 | + ID_AA64MMFR0_EL1_TGRAN64_2 | + ID_AA64MMFR0_EL1_TGRAN16_2)), + ID_WRITABLE(ID_AA64MMFR1_EL1, ~(ID_AA64MMFR1_EL1_RES0 | + ID_AA64MMFR1_EL1_HCX | + ID_AA64MMFR1_EL1_XNX | + ID_AA64MMFR1_EL1_TWED | + ID_AA64MMFR1_EL1_XNX | + ID_AA64MMFR1_EL1_VH | + ID_AA64MMFR1_EL1_VMIDBits)), + ID_WRITABLE(ID_AA64MMFR2_EL1, ~(ID_AA64MMFR2_EL1_RES0 | + ID_AA64MMFR2_EL1_EVT | + ID_AA64MMFR2_EL1_FWB | + ID_AA64MMFR2_EL1_IDS | + ID_AA64MMFR2_EL1_NV | + ID_AA64MMFR2_EL1_CCIDX)), ID_SANITISED(ID_AA64MMFR3_EL1), ID_UNALLOCATED(7,4), ID_UNALLOCATED(7,5),
From: Jing Zhang jingzhangos@google.com
mainline inclusion from mainline-v6.7-rc1 commit 8cfd5be88ebe3d71be1a2c52fc72a355bb924b49 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
Allow userspace to change the guest-visible value of the register with some severe limitation:
- No changes to features not virtualized by KVM (AMU, MPAM, RAS)
- Short of full GICv2 emulation in kernel, hiding GICv3 from the guest makes absolutely no sense.
- FP is effectively assumed for KVM VMs.
Signed-off-by: Jing Zhang jingzhangos@google.com [oliver: restrict features that are illogical to change] Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index c1d8d8dfe1ff..15f44b84efd3 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2121,7 +2121,13 @@ static const struct sys_reg_desc sys_reg_descs[] = { .get_user = get_id_reg, .set_user = set_id_reg, .reset = read_sanitised_id_aa64pfr0_el1, - .val = ID_AA64PFR0_EL1_CSV2_MASK | ID_AA64PFR0_EL1_CSV3_MASK, }, + .val = ~(ID_AA64PFR0_EL1_AMU | + ID_AA64PFR0_EL1_MPAM | + ID_AA64PFR0_EL1_SVE | + ID_AA64PFR0_EL1_RAS | + ID_AA64PFR0_EL1_GIC | + ID_AA64PFR0_EL1_AdvSIMD | + ID_AA64PFR0_EL1_FP), }, ID_SANITISED(ID_AA64PFR1_EL1), ID_UNALLOCATED(4,2), ID_UNALLOCATED(4,3),
From: Oliver Upton oliver.upton@linux.dev
mainline inclusion from mainline-v6.7-rc1 commit f89fbb350dd76d6b5f080954309b9dec5ad220ac category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
All known fields in ID_AA64ZFR0_EL1 describe the unprivileged instructions supported by the PE's SVE implementation. Allow userspace to pick and choose the advertised feature set, though nothing stops the guest from using undisclosed instructions.
Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- arch/arm64/kvm/sys_regs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 15f44b84efd3..fbced3eeeeb8 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2131,7 +2131,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { ID_SANITISED(ID_AA64PFR1_EL1), ID_UNALLOCATED(4,2), ID_UNALLOCATED(4,3), - ID_SANITISED(ID_AA64ZFR0_EL1), + ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0), ID_HIDDEN(ID_AA64SMFR0_EL1), ID_UNALLOCATED(4,6), ID_UNALLOCATED(4,7),
From: Oliver Upton oliver.upton@linux.dev
mainline inclusion from mainline-v6.7-rc1 commit dafa493dd01d5992f1cb70b08d1741c3ab99e04a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I97RDO CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
------------------------------------------------------------------------
KVM/arm64 has a couple schemes for handling vCPU feature selection now, which is a lot to put on userspace. Add some documentation about how these interact and provide some recommendations for how to use the writable ID register scheme.
Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20231003230408.3405722-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: caijian caijian11@h-partners.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com --- Documentation/virt/kvm/api.rst | 4 ++ Documentation/virt/kvm/arm/index.rst | 1 + Documentation/virt/kvm/arm/vcpu-features.rst | 48 ++++++++++++++++++++ 3 files changed, 53 insertions(+) create mode 100644 Documentation/virt/kvm/arm/vcpu-features.rst
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index b5aed9faf2bb..d75c9a7a7193 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -3422,6 +3422,8 @@ return indicates the attribute is implemented. It does not necessarily indicate that the attribute can be read or written in the device's current state. "addr" is ignored.
+.. _KVM_ARM_VCPU_INIT: + 4.82 KVM_ARM_VCPU_INIT ----------------------
@@ -6122,6 +6124,8 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG interface. No error will be returned, but the resulting offset will not be applied.
+.. _KVM_ARM_GET_REG_WRITABLE_MASKS: + 4.139 KVM_ARM_GET_REG_WRITABLE_MASKS -------------------------------------------
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst index e84848432158..7f231c724e16 100644 --- a/Documentation/virt/kvm/arm/index.rst +++ b/Documentation/virt/kvm/arm/index.rst @@ -11,3 +11,4 @@ ARM hypercalls pvtime ptp_kvm + vcpu-features diff --git a/Documentation/virt/kvm/arm/vcpu-features.rst b/Documentation/virt/kvm/arm/vcpu-features.rst new file mode 100644 index 000000000000..f7cc6d8d8b74 --- /dev/null +++ b/Documentation/virt/kvm/arm/vcpu-features.rst @@ -0,0 +1,48 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============================== +vCPU feature selection on arm64 +=============================== + +KVM/arm64 provides two mechanisms that allow userspace to configure +the CPU features presented to the guest. + +KVM_ARM_VCPU_INIT +================= + +The ``KVM_ARM_VCPU_INIT`` ioctl accepts a bitmap of feature flags +(``struct kvm_vcpu_init::features``). Features enabled by this interface are +*opt-in* and may change/extend UAPI. See :ref:`KVM_ARM_VCPU_INIT` for complete +documentation of the features controlled by the ioctl. + +Otherwise, all CPU features supported by KVM are described by the architected +ID registers. + +The ID Registers +================ + +The Arm architecture specifies a range of *ID Registers* that describe the set +of architectural features supported by the CPU implementation. KVM initializes +the guest's ID registers to the maximum set of CPU features supported by the +system. The ID register values may be VM-scoped in KVM, meaning that the +values could be shared for all vCPUs in a VM. + +KVM allows userspace to *opt-out* of certain CPU features described by the ID +registers by writing values to them via the ``KVM_SET_ONE_REG`` ioctl. The ID +registers are mutable until the VM has started, i.e. userspace has called +``KVM_RUN`` on at least one vCPU in the VM. Userspace can discover what fields +are mutable in the ID registers using the ``KVM_ARM_GET_REG_WRITABLE_MASKS``. +See the :ref:`ioctl documentation <KVM_ARM_GET_REG_WRITABLE_MASKS>` for more +details. + +Userspace is allowed to *limit* or *mask* CPU features according to the rules +outlined by the architecture in DDI0487J.a D19.1.3 'Principles of the ID +scheme for fields in ID register'. KVM does not allow ID register values that +exceed the capabilities of the system. + +.. warning:: + It is **strongly recommended** that userspace modify the ID register values + before accessing the rest of the vCPU's CPU register state. KVM may use the + ID register values to control feature emulation. Interleaving ID register + modification with other system register accesses may lead to unpredictable + behavior.
反馈: 您发送到kernel@openeuler.org的补丁/补丁集,已成功转换为PR! PR链接地址: https://gitee.com/openeuler/kernel/pulls/5176 邮件列表地址:https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/V...
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/5176 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/V...