Re: [PATCH v4 1/5] hisi_acc_vfio_pci: fix XQE dma address error

-----Original Message----- From: Alex Williamson <alex.williamson@redhat.com> Sent: Wednesday, February 26, 2025 12:10 AM To: liulongfang <liulongfang@huawei.com> Cc: jgg@nvidia.com; Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; Jonathan Cameron <jonathan.cameron@huawei.com>; kvm@vger.kernel.org; linux- kernel@vger.kernel.org; linuxarm@openeuler.org Subject: Re: [PATCH v4 1/5] hisi_acc_vfio_pci: fix XQE dma address error
On Tue, 25 Feb 2025 14:27:53 +0800 Longfang Liu <liulongfang@huawei.com> wrote:
The dma addresses of EQE and AEQE are wrong after migration and results in guest kernel-mode encryption services failure. Comparing the definition of hardware registers, we found that there was an error when the data read from the register was combined into an address. Therefore, the address combination sequence needs to be corrected.
Even after fixing the above problem, we still have an issue where the Guest from an old kernel can get migrated to new kernel and may result in wrong data.
In order to ensure that the address is correct after migration, if an old magic number is detected, the dma address needs to be updated.
Fixes: b0eed085903e ("hisi_acc_vfio_pci: Add support for VFIO live migration") Signed-off-by: Longfang Liu <liulongfang@huawei.com> --- .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 40 ++++++++++++++++--- .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h | 14 ++++++- 2 files changed, 46 insertions(+), 8 deletions(-)
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c index 451c639299eb..35316984089b 100644 --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c @@ -350,6 +350,31 @@ static int vf_qm_func_stop(struct hisi_qm *qm) return hisi_qm_mb(qm, QM_MB_CMD_PAUSE_QM, 0, 0, 0); }
+static int vf_qm_version_check(struct acc_vf_data *vf_data, struct device *dev) +{ + switch (vf_data->acc_magic) { + case ACC_DEV_MAGIC_V2: + if (vf_data->major_ver < ACC_DRV_MAJOR_VER || + vf_data->minor_ver < ACC_DRV_MINOR_VER) + dev_info(dev, "migration driver version not match!\n"); + return -EINVAL; + break;
What's your major/minor update strategy?
Note that minor_ver is a u16 and ACC_DRV_MINOR_VER is defined as 0, so testing less than 0 against an unsigned is useless.
Arguably testing major and minor independently is pretty useless too.
You're defining what a future "old" driver version will accept, which is very nearly anything, so any breaking change *again* requires a new magic, so we're accomplishing very little here.
Maybe you want to consider a strategy where you'd increment the major number for a breaking change and minor for a compatible feature. In that case you'd want to verify the major_ver matches ACC_DRV_MAJOR_VER exactly and minor_ver would be more of a feature level.
Agree. I think the above check should be just major_ver != ACC_DRV_MAJOR_VER and we can make use of minor version whenever we need a specific handling for a feature support. Also I think it would be good to print the version info above in case of mismatch.
+ case ACC_DEV_MAGIC_V1: + /* Correct dma address */ + vf_data->eqe_dma = vf_data- qm_eqc_dw[QM_XQC_ADDR_HIGH]; + vf_data->eqe_dma <<= QM_XQC_ADDR_OFFSET; + vf_data->eqe_dma |= vf_data- qm_eqc_dw[QM_XQC_ADDR_LOW]; + vf_data->aeqe_dma = vf_data- qm_aeqc_dw[QM_XQC_ADDR_HIGH]; + vf_data->aeqe_dma <<= QM_XQC_ADDR_OFFSET; + vf_data->aeqe_dma |= vf_data- qm_aeqc_dw[QM_XQC_ADDR_LOW]; + break; + default: + return -EINVAL; + } + + return 0; +} + static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev, struct hisi_acc_vf_migration_file *migf) { @@ -363,7 +388,8 @@ static int vf_qm_check_match(struct hisi_acc_vf_core_device *hisi_acc_vdev, if (migf->total_length < QM_MATCH_SIZE || hisi_acc_vdev- match_done) return 0;
- if (vf_data->acc_magic != ACC_DEV_MAGIC) { + ret = vf_qm_version_check(vf_data, dev); + if (ret) { dev_err(dev, "failed to match ACC_DEV_MAGIC\n"); return -EINVAL; } @@ -418,7 +444,9 @@ static int vf_qm_get_match_data(struct hisi_acc_vf_core_device *hisi_acc_vdev, int vf_id = hisi_acc_vdev->vf_id; int ret;
- vf_data->acc_magic = ACC_DEV_MAGIC; + vf_data->acc_magic = ACC_DEV_MAGIC_V2; + vf_data->major_ver = ACC_DRV_MAR; + vf_data->minor_ver = ACC_DRV_MIN;
Where are "MAR" and "MIN" defined? I can't see how this would even compile. Thanks,
Yes. Please make sure to do a compile test and run basic sanity tested even if you think the changes are minor. Chances are there that you will miss out things like this. Thanks, Shameer
participants (1)
-
Shameerali Kolothum Thodi