From: jiangdongxu jiangdongxu1@huawei.com
Patch 1-17: some bugfixes and ops introduced by upstream
Patch 18-19: introduce vdpa device logging ops
Patch 20-21: introduce vdpa device state ops
Patch 22-23: introduce vdpa device migrate state ops
Patch 24: introduce new vhost feature BYTEMAPLOG
Patch 25: export iommu_get_resv_regions/iommu_set_resv_regions
Patch 26-27: some optimization about vhost-vdpa
Patch 28: add vdpa/vhost-vdpa build config
Arnaldo Carvalho de Melo (1): tools include UAPI: Sync linux/vhost.h with the kernel sources
Cindy Lu (2): vhost_vdpa: fix the crash in unmap a large memory vhost_vdpa: fix unmap process in no-batch mode
Eugenio Pérez (1): vdpa: add get_backend_features vdpa operation
Gautam Dawar (1): vhost-vdpa: free iommu domain after last use during cleanup
Greg Kroah-Hartman (1): vhost-vdpa: vhost_vdpa_alloc_domain() should be using a const struct bus_type *
Jason Gunthorpe (1): PCI/IOV: Add pci_iov_vf_id() to get VF index
Sebastien Boeuf (3): vdpa: Add resume operation vhost-vdpa: Introduce RESUME backend feature bit vhost-vdpa: uAPI to resume the device
Shannon Nelson (2): vhost_vdpa: tell vqs about the negotiated vhost_vdpa: support PACKED when setting-getting vring_base
Shunsuke Mie (1): virtio: fix virtio transitional ids
Stefano Garzarella (3): vhost-vdpa: fix an iotlb memory leak vdpa: add bind_mm/unbind_mm callbacks vhost-vdpa: use bind_mm/unbind_mm device callbacks
Zhu Lingshan (1): virtio: update virtio id table, add transitional ids
jiangdongxu (11): vdpa: add log operations vhost-vdpa: add uAPI for logging vdpa: add device state operations vhost-vdpa: add uAPI for device buffer vdpa: add vdpa device migration status ops vhost-vdpa: add uAPI for device migration status vhost: add VHOST feature VHOST_BACKEND_F_BYTEMAPLOG export iommu_get_resv_regions and iommu_set_resv_regions vhost-vdpa: Allow transparent MSI IOV vhost-vdpa: fix msi irq request err arm64: openeuler_defconfig: add VDPA config
arch/arm64/configs/openeuler_defconfig | 6 +- drivers/iommu/iommu.c | 2 + drivers/pci/iov.c | 14 + drivers/vhost/vdpa.c | 385 ++++++++++++++++++++++--- include/linux/pci.h | 8 +- include/linux/vdpa.h | 60 +++- include/uapi/linux/vhost.h | 20 ++ include/uapi/linux/vhost_types.h | 21 ++ include/uapi/linux/virtio_ids.h | 12 + tools/include/uapi/linux/vhost.h | 8 + 10 files changed, 490 insertions(+), 46 deletions(-)
From: Stefano Garzarella sgarzare@redhat.com
mainline inclusion from mainline-v6.2-rc3 commit c070c1912a83432530cbb4271d5b9b11fa36b67a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
--------------------------------
Before commit 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB") we called vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1) during release to free all the resources allocated when processing user IOTLB messages through vhost_vdpa_process_iotlb_update(). That commit changed the handling of IOTLB a bit, and we accidentally removed some code called during the release.
We partially fixed this with commit 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") but a potential memory leak is still there, as shown by kmemleak, if the application does not send VHOST_IOTLB_INVALIDATE or crashes:
unreferenced object 0xffff888007fbaa30 (size 16): comm "blkio-bench", pid 914, jiffies 4294993521 (age 885.500s) hex dump (first 16 bytes): 40 73 41 07 80 88 ff ff 00 00 00 00 00 00 00 00 @sA............. backtrace: [<0000000087736d2a>] kmem_cache_alloc_trace+0x142/0x1c0 [<0000000060740f50>] vhost_vdpa_process_iotlb_msg+0x68c/0x901 [vhost_vdpa] [<0000000083e8e205>] vhost_chr_write_iter+0xc0/0x4a0 [vhost] [<000000008f2f414a>] vhost_vdpa_chr_write_iter+0x18/0x20 [vhost_vdpa] [<00000000de1cd4a0>] vfs_write+0x216/0x4b0 [<00000000a2850200>] ksys_write+0x71/0xf0 [<00000000de8e720b>] __x64_sys_write+0x19/0x20 [<0000000018b12cbb>] do_syscall_64+0x3f/0x90 [<00000000986ec465>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Let's fix this by calling vhost_vdpa_iotlb_unmap() on the whole range in vhost_vdpa_remove_as(). We move that call before vhost_dev_cleanup() since we need a valid v->vdev.mm in vhost_vdpa_pa_unmap(). The vhost_iotlb_reset() call can be removed, since vhost_vdpa_iotlb_unmap() on the whole range removes all the entries.
The kmemleak log reported was observed with a vDPA device that has `use_va` set to true (e.g. VDUSE). This patch has been tested with both types of devices.
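As an illustration of why unmapping [0ULL, 0ULL - 1] releases every entry, here is a minimal userspace sketch. It is not the kernel code: `toy_entry`, `toy_add()` and `toy_unmap()` are hypothetical stand-ins for the vhost IOTLB and its helpers, and only the range arithmetic is meant to carry over.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one IOTLB entry (not the kernel struct). */
struct toy_entry {
	uint64_t start, last;		/* inclusive range */
	struct toy_entry *next;
};

/* Allocate an entry and put it at the head of the list. */
static int toy_add(struct toy_entry **head, uint64_t start, uint64_t last)
{
	struct toy_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->start = start;
	e->last = last;
	e->next = *head;
	*head = e;
	return 0;
}

/* Free every entry overlapping [start, last]; returns how many were freed. */
static unsigned int toy_unmap(struct toy_entry **head, uint64_t start,
			      uint64_t last)
{
	struct toy_entry **pp = head;
	unsigned int freed = 0;

	while (*pp) {
		struct toy_entry *e = *pp;

		if (e->start <= last && e->last >= start) {
			*pp = e->next;
			free(e);
			freed++;
		} else {
			pp = &e->next;
		}
	}
	return freed;
}

/* Release path: 0ULL - 1 wraps to UINT64_MAX, so the range covers everything. */
static unsigned int toy_release_demo(void)
{
	struct toy_entry *head = NULL;
	unsigned int freed;

	if (toy_add(&head, 0x1000, 0x1fff) ||
	    toy_add(&head, 0xffff000000000000ULL, UINT64_MAX))
		return 0;
	freed = toy_unmap(&head, 0ULL, 0ULL - 1);
	return head == NULL ? freed : 0;
}
```

In this sketch nothing is left on the list after the whole-range call, which is the property the patch relies on when it drops vhost_iotlb_reset().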
Fixes: 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") Fixes: 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB") Signed-off-by: Stefano Garzarella sgarzare@redhat.com Message-Id: 20221109154213.146789-1-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index ebafc05d2b74..eee3189985b9 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -65,6 +65,10 @@ static DEFINE_IDA(vhost_vdpa_ida);
static dev_t vhost_vdpa_major;
+static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, + struct vhost_iotlb *iotlb, + u64 start, u64 last); + static inline u32 iotlb_to_asid(struct vhost_iotlb *iotlb) { struct vhost_vdpa_as *as = container_of(iotlb, struct @@ -135,7 +139,7 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid) return -EINVAL;
hlist_del(&as->hash_link); - vhost_iotlb_reset(&as->iotlb); + vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1); kfree(as);
return 0; @@ -1166,14 +1170,14 @@ static void vhost_vdpa_cleanup(struct vhost_vdpa *v) struct vhost_vdpa_as *as; u32 asid;
- vhost_dev_cleanup(&v->vdev); - kfree(v->vdev.vqs); - for (asid = 0; asid < v->vdpa->nas; asid++) { as = asid_to_as(v, asid); if (as) vhost_vdpa_remove_as(v, asid); } + + vhost_dev_cleanup(&v->vdev); + kfree(v->vdev.vqs); }
static int vhost_vdpa_open(struct inode *inode, struct file *filep)
From: Cindy Lu lulu@redhat.com
mainline inclusion from mainline-v6.2-rc3 commit e794070af224ade46db368271896b2685ff4f96b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
While testing with vIOMMU, the guest sometimes unmaps a very large memory range, which causes a crash. To fix this, add a new function, vhost_vdpa_general_unmap(), which only unmaps the memory that is saved in the iotlb.
Call Trace: [ 647.820144] ------------[ cut here ]------------ [ 647.820848] kernel BUG at drivers/iommu/intel/iommu.c:1174! [ 647.821486] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 647.822082] CPU: 10 PID: 1181 Comm: qemu-system-x86 Not tainted 6.0.0-rc1home_lulu_2452_lulu7_vhost+ #62 [ 647.823139] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qem4 [ 647.824365] RIP: 0010:domain_unmap+0x48/0x110 [ 647.825424] Code: 48 89 fb 8d 4c f6 1e 39 c1 0f 4f c8 83 e9 0c 83 f9 3f 7f 18 48 89 e8 48 d3 e8 48 85 c0 75 59 [ 647.828064] RSP: 0018:ffffae5340c0bbf0 EFLAGS: 00010202 [ 647.828973] RAX: 0000000000000001 RBX: ffff921793d10540 RCX: 000000000000001b [ 647.830083] RDX: 00000000080000ff RSI: 0000000000000001 RDI: ffff921793d10540 [ 647.831214] RBP: 0000000007fc0100 R08: ffffae5340c0bcd0 R09: 0000000000000003 [ 647.832388] R10: 0000007fc0100000 R11: 0000000000100000 R12: 00000000080000ff [ 647.833668] R13: ffffae5340c0bcd0 R14: ffff921793d10590 R15: 0000008000100000 [ 647.834782] FS: 00007f772ec90640(0000) GS:ffff921ce7a80000(0000) knlGS:0000000000000000 [ 647.836004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 647.836990] CR2: 00007f02c27a3a20 CR3: 0000000101b0c006 CR4: 0000000000372ee0 [ 647.838107] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 647.839283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 647.840666] Call Trace: [ 647.841437] <TASK> [ 647.842107] intel_iommu_unmap_pages+0x93/0x140 [ 647.843112] __iommu_unmap+0x91/0x1b0 [ 647.844003] iommu_unmap+0x6a/0x95 [ 647.844885] vhost_vdpa_unmap+0x1de/0x1f0 [vhost_vdpa] [ 647.845985] vhost_vdpa_process_iotlb_msg+0xf0/0x90b [vhost_vdpa] [ 647.847235] ? _raw_spin_unlock+0x15/0x30 [ 647.848181] ? 
_copy_from_iter+0x8c/0x580 [ 647.849137] vhost_chr_write_iter+0xb3/0x430 [vhost] [ 647.850126] vfs_write+0x1e4/0x3a0 [ 647.850897] ksys_write+0x53/0xd0 [ 647.851688] do_syscall_64+0x3a/0x90 [ 647.852508] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 647.853457] RIP: 0033:0x7f7734ef9f4f [ 647.854408] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 76 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c8 [ 647.857217] RSP: 002b:00007f772ec8f040 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 647.858486] RAX: ffffffffffffffda RBX: 00000000fef00000 RCX: 00007f7734ef9f4f [ 647.859713] RDX: 0000000000000048 RSI: 00007f772ec8f090 RDI: 0000000000000010 [ 647.860942] RBP: 00007f772ec8f1a0 R08: 0000000000000000 R09: 0000000000000000 [ 647.862206] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000010 [ 647.863446] R13: 0000000000000002 R14: 0000000000000000 R15: ffffffff01100000 [ 647.864692] </TASK> [ 647.865458] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs v] [ 647.874688] ---[ end trace 0000000000000000 ]---
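The idea of the fix can be sketched in plain userspace C: instead of handing the guest's (possibly huge) range to the backend, walk the recorded mappings and unmap each one with its own start and size. All names below (`toy_mapping`, `toy_backend_unmap()`, `toy_unmap_range()`) are hypothetical, and the flat array is only an assumption standing in for the iotlb.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one mapping that was actually installed. */
struct toy_mapping {
	uint64_t start, size;
	int live;
};

/* Total bytes the toy backend was asked to unmap. */
static uint64_t toy_backend_unmapped;

/* Stand-in for the dma_unmap()/iommu_unmap() call made per mapping. */
static void toy_backend_unmap(uint64_t start, uint64_t size)
{
	(void)start;
	toy_backend_unmapped += size;
}

/* Unmap only recorded mappings that overlap the requested [start, last]. */
static unsigned int toy_unmap_range(struct toy_mapping *maps, size_t n,
				    uint64_t start, uint64_t last)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_mapping *m = &maps[i];
		uint64_t m_last = m->start + m->size - 1;

		if (!m->live || m->start > last || m_last < start)
			continue;
		toy_backend_unmap(m->start, m->size);
		m->live = 0;
		count++;
	}
	return count;
}

/* The guest asks for a 512 GiB span, but only 0x3000 bytes were ever mapped. */
static uint64_t toy_large_unmap_demo(void)
{
	struct toy_mapping maps[] = {
		{ 0x100000, 0x1000, 1 },
		{ 0x200000, 0x2000, 1 },
	};

	toy_backend_unmapped = 0;
	toy_unmap_range(maps, 2, 0, (1ULL << 39) - 1);
	return toy_backend_unmapped;
}
```

The backend never sees the oversized request, only the two recorded mappings.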
Cc: stable@vger.kernel.org Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend") Signed-off-by: Cindy Lu lulu@redhat.com Message-Id: 20221219073331.556140-1-lulu@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 46 +++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 20 deletions(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index eee3189985b9..e90a848cfffc 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -66,8 +66,8 @@ static DEFINE_IDA(vhost_vdpa_ida); static dev_t vhost_vdpa_major;
static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, - struct vhost_iotlb *iotlb, - u64 start, u64 last); + struct vhost_iotlb *iotlb, u64 start, + u64 last, u32 asid);
static inline u32 iotlb_to_asid(struct vhost_iotlb *iotlb) { @@ -139,7 +139,7 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid) return -EINVAL;
hlist_del(&as->hash_link); - vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1); + vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid); kfree(as);
return 0; @@ -687,10 +687,20 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, mutex_unlock(&d->mutex); return r; } +static void vhost_vdpa_general_unmap(struct vhost_vdpa *v, + struct vhost_iotlb_map *map, u32 asid) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + if (ops->dma_map) { + ops->dma_unmap(vdpa, asid, map->start, map->size); + } else if (ops->set_map == NULL) { + iommu_unmap(v->domain, map->start, map->size); + } +}
-static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, - struct vhost_iotlb *iotlb, - u64 start, u64 last) +static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, struct vhost_iotlb *iotlb, + u64 start, u64 last, u32 asid) { struct vhost_dev *dev = &v->vdev; struct vhost_iotlb_map *map; @@ -707,13 +717,13 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, unpin_user_page(page); } atomic64_sub(PFN_DOWN(map->size), &dev->mm->pinned_vm); + vhost_vdpa_general_unmap(v, map, asid); vhost_iotlb_map_free(iotlb, map); } }
-static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, - struct vhost_iotlb *iotlb, - u64 start, u64 last) +static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, struct vhost_iotlb *iotlb, + u64 start, u64 last, u32 asid) { struct vhost_iotlb_map *map; struct vdpa_map_file *map_file; @@ -722,20 +732,21 @@ static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, map_file = (struct vdpa_map_file *)map->opaque; fput(map_file->file); kfree(map_file); + vhost_vdpa_general_unmap(v, map, asid); vhost_iotlb_map_free(iotlb, map); } }
static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, - struct vhost_iotlb *iotlb, - u64 start, u64 last) + struct vhost_iotlb *iotlb, u64 start, + u64 last, u32 asid) { struct vdpa_device *vdpa = v->vdpa;
if (vdpa->use_va) - return vhost_vdpa_va_unmap(v, iotlb, start, last); + return vhost_vdpa_va_unmap(v, iotlb, start, last, asid);
- return vhost_vdpa_pa_unmap(v, iotlb, start, last); + return vhost_vdpa_pa_unmap(v, iotlb, start, last, asid); }
static int perm_to_iommu_flags(u32 perm) @@ -802,17 +813,12 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, const struct vdpa_config_ops *ops = vdpa->config; u32 asid = iotlb_to_asid(iotlb);
- vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1); + vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1, asid);
- if (ops->dma_map) { - ops->dma_unmap(vdpa, asid, iova, size); - } else if (ops->set_map) { + if (ops->set_map) { if (!v->in_batch) ops->set_map(vdpa, asid, iotlb); - } else { - iommu_unmap(v->domain, iova, size); } - /* If we are in the middle of batch processing, delay the free * of AS until BATCH_END. */
From: Sebastien Boeuf sebastien.boeuf@intel.com
mainline inclusion from mainline-v6.3-rc1 commit 1538a8a49ecbe6d3302cd7f347632338e56857f8 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
Add a new operation to allow a vDPA device to be resumed after it has been suspended. Trying to resume a device that wasn't suspended will result in a no-op.
This operation is optional. If it's not implemented, the associated backend feature bit will not be exposed. And if the feature bit is not exposed, invoking this operation will return an error.
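A minimal sketch of the optional-callback pattern described above, written as standalone C. `toy_dev`, `toy_ops` and `toy_try_resume()` are hypothetical names, not part of the vDPA API; the only assumption taken from this series is that a missing callback is reported as -EOPNOTSUPP.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device state; not struct vdpa_device. */
struct toy_dev {
	int suspended;
};

/* Hypothetical ops table with an optional resume callback. */
struct toy_ops {
	int (*suspend)(struct toy_dev *dev);
	int (*resume)(struct toy_dev *dev);	/* may be NULL */
};

static int toy_suspend(struct toy_dev *dev)
{
	dev->suspended = 1;
	return 0;
}

/* Resuming a device that was not suspended is a no-op. */
static int toy_resume(struct toy_dev *dev)
{
	dev->suspended = 0;
	return 0;
}

/* Call the optional op only when the driver provides it. */
static int toy_try_resume(const struct toy_ops *ops, struct toy_dev *dev)
{
	if (!ops->resume)
		return -EOPNOTSUPP;
	return ops->resume(dev);
}
```

A driver that leaves `resume` NULL gets the error path, while one that fills it in can be resumed whether or not it was suspended first.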
Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sebastien Boeuf sebastien.boeuf@intel.com Message-Id: 6e05c4b31b47f3e29cb2bd7ebd56c81f84b8f48a.1672742878.git.sebastien.boeuf@intel.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/linux/vdpa.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 3751b672451d..ffb1fd135182 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -216,7 +216,10 @@ struct vdpa_map_file { * @reset: Reset device * @vdev: vdpa device * Returns integer: success (0) or error (< 0) - * @suspend: Suspend or resume the device (optional) + * @suspend: Suspend the device (optional) + * @vdev: vdpa device + * Returns integer: success (0) or error (< 0) + * @resume: Resume the device (optional) * @vdev: vdpa device * Returns integer: success (0) or error (< 0) * @get_config_size: Get the size of the configuration space includes @@ -321,6 +324,7 @@ struct vdpa_config_ops { void (*set_status)(struct vdpa_device *vdev, u8 status); int (*reset)(struct vdpa_device *vdev); int (*suspend)(struct vdpa_device *vdev); + int (*resume)(struct vdpa_device *vdev); size_t (*get_config_size)(struct vdpa_device *vdev); void (*get_config)(struct vdpa_device *vdev, unsigned int offset, void *buf, unsigned int len);
From: Sebastien Boeuf sebastien.boeuf@intel.com
mainline inclusion from mainline-v6.3-rc1 commit 69106b6fb3d73bd4252daa48ae96e600c9701147 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
Userspace knows if the device can be resumed or not by checking this feature bit.
It's only exposed if the vdpa driver backend implements the resume() operation callback. Userspace trying to negotiate this feature when it hasn't been exposed will result in an error.
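The negotiation rule can be modelled in a few lines of standalone C. This is a simplified sketch: the kernel checks the known-bits mask and the per-feature capability in separate steps, while here both are folded into one "offered" mask, which should give the same accept/reject result. The `TOY_` names are hypothetical; the bit numbers 0x4 and 0x5 are taken from the diff below, and `TOY_BASE_FEATURES` is an assumed placeholder for VHOST_VDPA_BACKEND_FEATURES.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BIT(n)		(1ULL << (n))
#define TOY_F_SUSPEND		0x4
#define TOY_F_RESUME		0x5
/* Assumed placeholder for the always-offered backend feature bits. */
#define TOY_BASE_FEATURES	(TOY_BIT(0x1) | TOY_BIT(0x2) | TOY_BIT(0x3))

/* Feature bits the backend exposes, given what the driver implements. */
static uint64_t toy_offered(bool can_suspend, bool can_resume)
{
	uint64_t features = TOY_BASE_FEATURES;

	if (can_suspend)
		features |= TOY_BIT(TOY_F_SUSPEND);
	if (can_resume)
		features |= TOY_BIT(TOY_F_RESUME);
	return features;
}

/* Reject any requested bit that was not offered. */
static int toy_negotiate(uint64_t wanted, bool can_suspend, bool can_resume)
{
	if (wanted & ~toy_offered(can_suspend, can_resume))
		return -EOPNOTSUPP;
	return 0;
}
```

Userspace would first read the offered mask, then request only bits present in it.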
Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sebastien Boeuf sebastien.boeuf@intel.com Message-Id: b18db236ba3d990cdb41278eb4703be9201d9514.1672742878.git.sebastien.boeuf@intel.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 16 +++++++++++++++- include/uapi/linux/vhost_types.h | 2 ++ 2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index e90a848cfffc..2a0288c019d0 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -359,6 +359,14 @@ static bool vhost_vdpa_can_suspend(const struct vhost_vdpa *v) return ops->suspend; }
+static bool vhost_vdpa_can_resume(const struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + return ops->resume; +} + static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) { struct vdpa_device *vdpa = v->vdpa; @@ -606,11 +614,15 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, if (copy_from_user(&features, featurep, sizeof(features))) return -EFAULT; if (features & ~(VHOST_VDPA_BACKEND_FEATURES | - BIT_ULL(VHOST_BACKEND_F_SUSPEND))) + BIT_ULL(VHOST_BACKEND_F_SUSPEND) | + BIT_ULL(VHOST_BACKEND_F_RESUME))) return -EOPNOTSUPP; if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && !vhost_vdpa_can_suspend(v)) return -EOPNOTSUPP; + if ((features & BIT_ULL(VHOST_BACKEND_F_RESUME)) && + !vhost_vdpa_can_resume(v)) + return -EOPNOTSUPP; vhost_set_backend_features(&v->vdev, features); return 0; } @@ -662,6 +674,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, features = VHOST_VDPA_BACKEND_FEATURES; if (vhost_vdpa_can_suspend(v)) features |= BIT_ULL(VHOST_BACKEND_F_SUSPEND); + if (vhost_vdpa_can_resume(v)) + features |= BIT_ULL(VHOST_BACKEND_F_RESUME); if (copy_to_user(featurep, &features, sizeof(features))) r = -EFAULT; break; diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h index 1bdd6e363f4c..9e072926d633 100644 --- a/include/uapi/linux/vhost_types.h +++ b/include/uapi/linux/vhost_types.h @@ -163,5 +163,7 @@ struct vhost_vdpa_iova_range { #define VHOST_BACKEND_F_IOTLB_ASID 0x3 /* Device can be suspended */ #define VHOST_BACKEND_F_SUSPEND 0x4 +/* Device can be resumed */ +#define VHOST_BACKEND_F_RESUME 0x5
#endif
From: Sebastien Boeuf sebastien.boeuf@intel.com
mainline inclusion from mainline-v6.3-rc1 commit 3b688d7a086d0438649ea37990c6e811954fc780 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
This new ioctl adds support for resuming the device from userspace.
This is required when trying to restore the device in a functioning state after it's been suspended. It is already possible to reset a suspended device, but that means the device must be reconfigured and all the IOMMU/IOTLB mappings must be recreated. This new operation allows the device to be resumed without going through a full reset.
This is particularly useful when trying to perform offline migration of a virtual machine (also known as snapshot/restore) as it allows the VMM to resume the virtual machine back to a running state after the snapshot is performed.
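From the VMM side, the snapshot flow might look like the following userspace sketch. It is an illustration only: the `SKETCH_` macros are local definitions that are assumed to mirror the uAPI (ioctl numbers 0x7D and 0x7E come from this patch, and the 0xAF magic is assumed to match VHOST_VIRTIO in linux/vhost.h), and `sketch_snapshot()` with its `save` callback is a hypothetical helper, not an existing API.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Local mirrors of the uAPI definitions (assumptions, see above). */
#define SKETCH_VHOST_VIRTIO		0xAF
#define SKETCH_VHOST_VDPA_SUSPEND	_IO(SKETCH_VHOST_VIRTIO, 0x7D)
#define SKETCH_VHOST_VDPA_RESUME	_IO(SKETCH_VHOST_VIRTIO, 0x7E)

/* Issue an argument-less ioctl; returns 0 or a negative errno. */
static int sketch_vdpa_ioctl(int fd, unsigned long request)
{
	if (ioctl(fd, request) < 0)
		return -errno;
	return 0;
}

static int sketch_vdpa_resume(int fd)
{
	return sketch_vdpa_ioctl(fd, SKETCH_VHOST_VDPA_RESUME);
}

/*
 * Suspend the device, let the caller save its state, then resume it
 * without a full reset. 'save' is a hypothetical VMM callback.
 */
static int sketch_snapshot(int fd, int (*save)(void *opaque), void *opaque)
{
	int ret, resume_ret;

	ret = sketch_vdpa_ioctl(fd, SKETCH_VHOST_VDPA_SUSPEND);
	if (ret)
		return ret;
	ret = save(opaque);
	/* Resume even if saving failed, so the guest keeps running. */
	resume_ret = sketch_vdpa_resume(fd);
	return ret ? ret : resume_ret;
}
```

Resuming even when the save step fails is a design choice of this sketch, so that a failed snapshot does not leave the guest's device stopped.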
Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sebastien Boeuf sebastien.boeuf@intel.com Message-Id: 73b75fb87d25cff59768b4955a81fe7ffe5b4770.1672742878.git.sebastien.boeuf@intel.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 18 ++++++++++++++++++ include/uapi/linux/vhost.h | 8 ++++++++ 2 files changed, 26 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 2a0288c019d0..6144b730ec78 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -506,6 +506,21 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v) return ops->suspend(vdpa); }
+/* After a successful return of this ioctl the device resumes processing + * virtqueue descriptors. The device becomes fully operational the same way it + * was before it was suspended. + */ +static long vhost_vdpa_resume(struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + if (!ops->resume) + return -EOPNOTSUPP; + + return ops->resume(vdpa); +} + static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd, void __user *argp) { @@ -691,6 +706,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, case VHOST_VDPA_SUSPEND: r = vhost_vdpa_suspend(v); break; + case VHOST_VDPA_RESUME: + r = vhost_vdpa_resume(v); + break; default: r = vhost_dev_ioctl(&v->vdev, cmd, argp); if (r == -ENOIOCTLCMD) diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index f9f115a7c75b..92e1b700b51c 100644 --- a/include/uapi/linux/vhost.h +++ b/include/uapi/linux/vhost.h @@ -180,4 +180,12 @@ */ #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D)
+/* Resume a device so it can resume processing virtqueue requests + * + * After the return of this ioctl the device will have restored all the + * necessary states and it is fully operational to continue processing the + * virtqueue descriptors. + */ +#define VHOST_VDPA_RESUME _IO(VHOST_VIRTIO, 0x7E) + #endif
From: Gautam Dawar gautam.dawar@amd.com
mainline inclusion from mainline-v6.3-rc3 commit 5a522150093a0eabae9470a70a37a6e436bfad08 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
Currently vhost_vdpa_cleanup() unmaps the DMA mappings by calling `iommu_unmap(v->domain, map->start, map->size);` from vhost_vdpa_general_unmap() when the parent vDPA driver doesn't provide DMA config operations.
However, the IOMMU domain referred to by `v->domain` is freed in vhost_vdpa_free_domain() before vhost_vdpa_cleanup() runs in vhost_vdpa_release(), which results in a NULL pointer dereference. Accordingly, moving the call to vhost_vdpa_free_domain() into vhost_vdpa_cleanup() makes sense. This will also help detach the dma device in the error handling of vhost_vdpa_alloc_domain().
This issue was observed on terminating QEMU with SIGQUIT.
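The ordering problem can be reproduced in a toy model. In the sketch below, which uses hypothetical `toy_` names and is not the driver code, unmapping needs a live domain, so freeing the domain first makes the unmap step fail where the kernel would dereference NULL.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the IOMMU domain and the vhost-vdpa device. */
struct toy_domain {
	unsigned int unmaps;
};

struct toy_vdpa {
	struct toy_domain *domain;
	unsigned int nmaps;	/* mappings still installed */
};

/* Remove all mappings; needs a live domain if any mapping is left. */
static int toy_unmap_all(struct toy_vdpa *v)
{
	if (v->nmaps && !v->domain)
		return -1;	/* the kernel would dereference NULL here */
	while (v->nmaps) {
		v->domain->unmaps++;
		v->nmaps--;
	}
	return 0;
}

/* Free the domain and clear the pointer so later users can detect it. */
static void toy_free_domain(struct toy_vdpa *v)
{
	free(v->domain);
	v->domain = NULL;
}

/* free_domain_first != 0 models the old order, 0 models the fixed order. */
static int toy_release(int free_domain_first)
{
	struct toy_vdpa v = { calloc(1, sizeof(struct toy_domain)), 3 };
	int ret;

	if (!v.domain)
		return -2;
	if (free_domain_first)
		toy_free_domain(&v);
	ret = toy_unmap_all(&v);
	toy_free_domain(&v);	/* the second call is a no-op on NULL */
	return ret;
}
```

With the fixed order the domain outlives its last user and is freed exactly once at the end of cleanup.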
Fixes: 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") Signed-off-by: Gautam Dawar gautam.dawar@amd.com Message-Id: 20230301163203.29883-1-gautam.dawar@amd.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 6144b730ec78..0bcc4f47555b 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -1166,6 +1166,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
err_attach: iommu_domain_free(v->domain); + v->domain = NULL; return ret; }
@@ -1214,6 +1215,7 @@ static void vhost_vdpa_cleanup(struct vhost_vdpa *v) vhost_vdpa_remove_as(v, asid); }
+ vhost_vdpa_free_domain(v); vhost_dev_cleanup(&v->vdev); kfree(v->vdev.vqs); } @@ -1286,7 +1288,6 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep) vhost_vdpa_clean_irq(v); vhost_vdpa_reset(v); vhost_dev_stop(&v->vdev); - vhost_vdpa_free_domain(v); vhost_vdpa_config_put(v); vhost_vdpa_cleanup(v); mutex_unlock(&d->mutex);
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
mainline inclusion from mainline-v6.4-rc1 commit 94a1150421940ff2e8113b5fa6837774777d5b3d category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
The function, vhost_vdpa_alloc_domain(), has a pointer to a struct bus_type, but it should be constant as the function it passes it to expects it to be const, and the vhost code does not modify it in any way.
Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Jason Wang jasowang@redhat.com Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20230313182918.1312597-31-gregkh@linuxfoundation.o... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 0bcc4f47555b..99b08eefac6f 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -1140,7 +1140,7 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v) struct vdpa_device *vdpa = v->vdpa; const struct vdpa_config_ops *ops = vdpa->config; struct device *dma_dev = vdpa_get_dma_dev(vdpa); - struct bus_type *bus; + const struct bus_type *bus; int ret;
/* Device want to do DMA by itself */
From: Stefano Garzarella sgarzare@redhat.com
mainline inclusion from mainline-v6.4-rc1 commit 9067de4725a299bc1baf11de9f5040fdd0bd05c3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
These new optional callbacks are used to bind/unbind the device to a specific address space so the vDPA framework can use VA when these callbacks are implemented.
Suggested-by: Jason Wang jasowang@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Stefano Garzarella sgarzare@redhat.com Message-Id: 20230404131326.44403-2-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/linux/vdpa.h | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index ffb1fd135182..0f6d8bef66c2 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -282,6 +282,14 @@ struct vdpa_map_file { * @iova: iova to be unmapped * @size: size of the area * Returns integer: success (0) or error (< 0) + * @bind_mm: Bind the device to a specific address space + * so the vDPA framework can use VA when this + * callback is implemented. (optional) + * @vdev: vdpa device + * @mm: address space to bind + * @unbind_mm: Unbind the device from the address space + * bound using the bind_mm callback. (optional) + * @vdev: vdpa device * @free: Free resources that belongs to vDPA (optional) * @vdev: vdpa device */ @@ -342,6 +350,8 @@ struct vdpa_config_ops { u64 iova, u64 size); int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group, unsigned int asid); + int (*bind_mm)(struct vdpa_device *vdev, struct mm_struct *mm); + void (*unbind_mm)(struct vdpa_device *vdev);
/* Free device resources */ void (*free)(struct vdpa_device *vdev);
From: Stefano Garzarella sgarzare@redhat.com
mainline inclusion from mainline-v6.4-rc1 commit 9067de4725a299bc1baf11de9f5040fdd0bd05c3 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
When the user calls the VHOST_SET_OWNER ioctl and the vDPA device has `use_va` set to true, let's call the bind_mm callback. In this way we can bind the device to the user address space and directly use the user VA.
The unbind_mm callback is called during the release after stopping the device.
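The control flow described above, binding after the owner is set and undoing the ownership if the bind fails, can be sketched as follows. The types and helpers (`toy_mm`, `toy_vdev`, `toy_set_owner()`) are hypothetical and only model the decision logic, not the vhost data structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical address-space handle. */
struct toy_mm {
	int id;
};

/* Hypothetical device: owner is set by SET_OWNER, bound by bind_mm. */
struct toy_vdev {
	int use_va;
	struct toy_mm *owner;
	struct toy_mm *bound;
	int (*bind_mm)(struct toy_vdev *dev, struct toy_mm *mm); /* optional */
};

static int toy_bind_ok(struct toy_vdev *dev, struct toy_mm *mm)
{
	dev->bound = mm;
	return 0;
}

static int toy_bind_fail(struct toy_vdev *dev, struct toy_mm *mm)
{
	(void)dev;
	(void)mm;
	return -ENOMEM;
}

/* Set the owner, then bind; roll the owner back if binding fails. */
static int toy_set_owner(struct toy_vdev *dev, struct toy_mm *mm)
{
	int ret;

	dev->owner = mm;
	if (!dev->use_va || !dev->bind_mm)
		return 0;	/* nothing to bind, same as the kernel helper */
	ret = dev->bind_mm(dev, mm);
	if (ret)
		dev->owner = NULL;	/* models vhost_dev_reset_owner() */
	return ret;
}
```

Devices that do not use VA, or that do not implement the callback, keep working exactly as before.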
Signed-off-by: Stefano Garzarella sgarzare@redhat.com Message-Id: 20230404131326.44403-3-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 99b08eefac6f..4473b73a0318 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -219,6 +219,28 @@ static int vhost_vdpa_reset(struct vhost_vdpa *v) return vdpa_reset(vdpa); }
+static long vhost_vdpa_bind_mm(struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + if (!vdpa->use_va || !ops->bind_mm) + return 0; + + return ops->bind_mm(vdpa, v->vdev.mm); +} + +static void vhost_vdpa_unbind_mm(struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + if (!vdpa->use_va || !ops->unbind_mm) + return; + + ops->unbind_mm(vdpa); +} + static long vhost_vdpa_get_device_id(struct vhost_vdpa *v, u8 __user *argp) { struct vdpa_device *vdpa = v->vdpa; @@ -716,6 +738,17 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, break; }
+ if (r) + goto out; + + switch (cmd) { + case VHOST_SET_OWNER: + r = vhost_vdpa_bind_mm(v); + if (r) + vhost_dev_reset_owner(d, NULL); + break; + } +out: mutex_unlock(&d->mutex); return r; } @@ -1288,6 +1321,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep) vhost_vdpa_clean_irq(v); vhost_vdpa_reset(v); vhost_dev_stop(&v->vdev); + vhost_vdpa_unbind_mm(v); vhost_vdpa_config_put(v); vhost_vdpa_cleanup(v); mutex_unlock(&d->mutex);
From: Cindy Lu lulu@redhat.com
mainline inclusion from mainline-v6.4-rc1 commit c82729e06644f4e087f5ff0f91b8fb15e03b8890 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
While using a vdpa device with vIOMMU enabled in the guest VM, the system fails to unmap when the vdpa device is bound to vfio-pci and testpmd is run. The test process is: load guest VM --> attach to virtio driver --> bind to vfio-pci driver. So the mapping process is:
1) batched mode maps the normal MR
2) batched mode unmaps the normal MR
3) unmap all the memory
4) map the iommu MR
This error happens in step 3). The iotlb was freed in step 2), so vhost_vdpa_process_iotlb_msg() returns an error, which causes the failure.
To fix this, do not remove the AS when iotlb->nmaps is 0. The AS will be freed in vhost_vdpa_cleanup().
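A toy model of the failing sequence, as a hedged illustration: once the address space is removed when its last mapping goes away, the next unmap message has nothing to look up and fails. The names are hypothetical, and the -EINVAL value is an assumption about the error returned by the message handler.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical address space: present until removed, with a map count. */
struct toy_as {
	bool present;
	unsigned int nmaps;
};

/*
 * Process one unmap message. With remove_when_empty set (old behaviour),
 * the AS disappears as soon as it holds no mappings.
 */
static int toy_process_unmap(struct toy_as *as, unsigned int count,
			     bool remove_when_empty)
{
	if (!as->present)
		return -EINVAL;	/* no iotlb to operate on */
	as->nmaps = count >= as->nmaps ? 0 : as->nmaps - count;
	if (remove_when_empty && as->nmaps == 0)
		as->present = false;
	return 0;
}

/* Unmap the normal MR (step 2), then unmap all the memory (step 3). */
static int toy_sequence(bool remove_when_empty)
{
	struct toy_as as = { true, 2 };
	int ret;

	ret = toy_process_unmap(&as, 2, remove_when_empty);
	if (ret)
		return ret;
	return toy_process_unmap(&as, 1, remove_when_empty);
}
```

Keeping the empty AS around until device cleanup makes the second unmap a harmless no-op.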
Cc: stable@vger.kernel.org Fixes: aaca8373c4b1 ("vhost-vdpa: support ASID based IOTLB API") Signed-off-by: Cindy Lu lulu@redhat.com Message-Id: 20230420151734.860168-1-lulu@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 4473b73a0318..c7b4543210de 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -884,11 +884,7 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, if (!v->in_batch) ops->set_map(vdpa, asid, iotlb); } - /* If we are in the middle of batch processing, delay the free - * of AS until BATCH_END. - */ - if (!v->in_batch && !iotlb->nmaps) - vhost_vdpa_remove_as(v, asid); + }
static int vhost_vdpa_va_map(struct vhost_vdpa *v, @@ -1145,8 +1141,6 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid, if (v->in_batch && ops->set_map) ops->set_map(vdpa, asid, iotlb); v->in_batch = false; - if (!iotlb->nmaps) - vhost_vdpa_remove_as(v, asid); break; default: r = -EINVAL;
From: Shannon Nelson shannon.nelson@amd.com
mainline inclusion from mainline-v6.4-rc6 commit 376daf317753ccb6b1ecbdece66018f7f6313a7f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
As is done in the net, iscsi, and vsock vhost support, let the vdpa vqs know about the features that have been negotiated. This allows vhost to more safely make decisions based on the features, such as when using PACKED vs split queues.
Signed-off-by: Shannon Nelson shannon.nelson@amd.com Acked-by: Jason Wang jasowang@redhat.com Message-Id: 20230424225031.18947-2-shannon.nelson@amd.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index c7b4543210de..61f60fcb010a 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -407,7 +407,10 @@ static long vhost_vdpa_set_features(struct vhost_vdpa *v, u64 __user *featurep) { struct vdpa_device *vdpa = v->vdpa; const struct vdpa_config_ops *ops = vdpa->config; + struct vhost_dev *d = &v->vdev; + u64 actual_features; u64 features; + int i;
/* * It's not allowed to change the features after they have @@ -422,6 +425,16 @@ static long vhost_vdpa_set_features(struct vhost_vdpa *v, u64 __user *featurep) if (vdpa_set_features(vdpa, features)) return -EINVAL;
+ /* let the vqs know what has been configured */ + actual_features = ops->get_driver_features(vdpa); + for (i = 0; i < d->nvqs; ++i) { + struct vhost_virtqueue *vq = d->vqs[i]; + + mutex_lock(&vq->mutex); + vq->acked_features = actual_features; + mutex_unlock(&vq->mutex); + } + return 0; }
From: Shannon Nelson shannon.nelson@amd.com
mainline inclusion from mainline-v6.4-rc6 commit beee7fdb5b56a46415a4992d28dd4c2d06eb52df category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
Use the right structs for PACKED or split vqs when setting and getting the vring base.
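As an illustration only, the packed-ring handling folds a 15-bit ring index and a 1-bit wrap counter into the single 16-bit value that vhost keeps in last_avail_idx/last_used_idx. The sketch below is standalone userspace C; struct packed_state and the two helpers are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one index/counter pair of the packed vq state. */
struct packed_state {
	uint16_t idx;     /* 15-bit ring index */
	bool counter;     /* wrap counter */
};

/* Fold index and wrap counter into one 16-bit value (counter in bit 15). */
static uint16_t packed_encode(struct packed_state s)
{
	assert(s.idx < 0x8000); /* the index must fit in 15 bits */
	return (uint16_t)(s.idx | ((uint16_t)s.counter << 15));
}

/* Split the 16-bit value back into index and wrap counter. */
static struct packed_state packed_decode(uint16_t v)
{
	struct packed_state s = {
		.idx = v & 0x7fff,
		.counter = !!(v & 0x8000),
	};
	return s;
}
```

For split rings no such folding is needed, which is why the split branch copies avail_index unchanged.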
Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Message-Id: 20230424225031.18947-4-shannon.nelson@amd.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 61f60fcb010a..80d3c49f9854 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -607,7 +607,14 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd, if (r) return r;
- vq->last_avail_idx = vq_state.split.avail_index; + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) { + vq->last_avail_idx = vq_state.packed.last_avail_idx | + (vq_state.packed.last_avail_counter << 15); + vq->last_used_idx = vq_state.packed.last_used_idx | + (vq_state.packed.last_used_counter << 15); + } else { + vq->last_avail_idx = vq_state.split.avail_index; + } break; }
@@ -625,9 +632,15 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd, break;
case VHOST_SET_VRING_BASE: - vq_state.split.avail_index = vq->last_avail_idx; - if (ops->set_vq_state(vdpa, idx, &vq_state)) - r = -EINVAL; + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) { + vq_state.packed.last_avail_idx = vq->last_avail_idx & 0x7fff; + vq_state.packed.last_avail_counter = !!(vq->last_avail_idx & 0x8000); + vq_state.packed.last_used_idx = vq->last_used_idx & 0x7fff; + vq_state.packed.last_used_counter = !!(vq->last_used_idx & 0x8000); + } else { + vq_state.split.avail_index = vq->last_avail_idx; + } + r = ops->set_vq_state(vdpa, idx, &vq_state); break;
case VHOST_SET_VRING_CALL:
From: Jason Gunthorpe jgg@nvidia.com
mainline inclusion from mainline-v5.18-rc1 commit 21ca9fb62d4688da41825e0f05d8e7e26afc69d6 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
The PCI core uses the VF index internally, often called the vf_id, during the setup of the VF, e.g. in pci_iov_add_virtfn().
This index is needed by device drivers that implement live migration, for the internal operations that configure and control their VFs.
Specifically, the mlx5_vfio_pci driver introduced in later patches of this series needs the vf_id rather than the bus/device/function that is exposed today.
Add pci_iov_vf_id() which computes the vf_id by reversing the math that was used to create the bus/device/function.
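A rough sketch of the arithmetic being reversed, written as standalone userspace C. struct pf_info and both helpers are hypothetical simplifications, not kernel API; the routing ID is taken to be (bus << 8) + devfn.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the PF fields the math depends on. */
struct pf_info {
	uint8_t bus;
	uint8_t devfn;
	uint16_t offset;  /* First VF Offset */
	uint16_t stride;  /* VF Stride */
};

/* Routing ID of the PF itself: bus number in the high byte, devfn in the low. */
static unsigned int pf_routing_id(const struct pf_info *pf)
{
	return ((unsigned int)pf->bus << 8) + pf->devfn;
}

/* Forward direction: routing ID assigned to VF number vf_id. */
static unsigned int vf_routing_id(const struct pf_info *pf, unsigned int vf_id)
{
	return pf_routing_id(pf) + pf->offset + pf->stride * vf_id;
}

/* Reverse direction: recover vf_id from a VF routing ID. */
static int vf_id_from_routing_id(const struct pf_info *pf, unsigned int rid)
{
	assert(pf->stride != 0); /* a zero stride would divide by zero */
	return (int)((rid - (pf_routing_id(pf) + pf->offset)) / pf->stride);
}
```

Working on the combined routing ID rather than on bus and devfn separately keeps the reverse computation correct when VFs spill over onto the next bus number.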
Link: https://lore.kernel.org/all/20220224142024.147653-2-yishaih@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Acked-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Yishai Hadas yishaih@nvidia.com Signed-off-by: Leon Romanovsky leonro@nvidia.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/pci/iov.c | 14 ++++++++++++++ include/linux/pci.h | 8 +++++++- 2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index 4afd4ee4f7f0..08db0610da76 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -32,6 +32,20 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id) dev->sriov->stride * vf_id) & 0xff; }
+int pci_iov_vf_id(struct pci_dev *dev) +{ + struct pci_dev *pf; + + if (!dev->is_virtfn) + return -EINVAL; + + pf = pci_physfn(dev); + return (((dev->bus->number << 8) + dev->devfn) - + ((pf->bus->number << 8) + pf->devfn + pf->sriov->offset)) / + pf->sriov->stride; +} +EXPORT_SYMBOL_GPL(pci_iov_vf_id); + /* * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may * change when NumVFs changes. diff --git a/include/linux/pci.h b/include/linux/pci.h index c05a2cc63c8a..cde61998a055 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2118,7 +2118,7 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar); #ifdef CONFIG_PCI_IOV int pci_iov_virtfn_bus(struct pci_dev *dev, int id); int pci_iov_virtfn_devfn(struct pci_dev *dev, int id); - +int pci_iov_vf_id(struct pci_dev *dev); int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn); void pci_disable_sriov(struct pci_dev *dev);
@@ -2146,6 +2146,12 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev *dev, int id) { return -ENOSYS; } + +static inline int pci_iov_vf_id(struct pci_dev *dev) +{ + return -ENOSYS; +} + static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn) { return -ENODEV; }
From: Zhu Lingshan lingshan.zhu@intel.com
mainline inclusion from mainline-v5.14-rc1 commit d61914ea6adabde9126b0bed64a7a3a42249435e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
This commit updates the virtio ID table by adding transitional device IDs.
Signed-off-by: Zhu Lingshan lingshan.zhu@intel.com Link: https://lore.kernel.org/r/20210510081015.4212-2-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/uapi/linux/virtio_ids.h | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h index b052355ac7a3..dddb4ad18174 100644 --- a/include/uapi/linux/virtio_ids.h +++ b/include/uapi/linux/virtio_ids.h @@ -49,4 +49,16 @@ #define VIRTIO_ID_PMEM 27 /* virtio pmem */ #define VIRTIO_ID_MAC80211_HWSIM 29 /* virtio mac80211-hwsim */
+/* + * Virtio Transitional IDs + */ + +#define VIRTIO_TRANS_ID_NET 1000 /* transitional virtio net */ +#define VIRTIO_TRANS_ID_BLOCK 1001 /* transitional virtio block */ +#define VIRTIO_TRANS_ID_BALLOON 1002 /* transitional virtio balloon */ +#define VIRTIO_TRANS_ID_CONSOLE 1003 /* transitional virtio console */ +#define VIRTIO_TRANS_ID_SCSI 1004 /* transitional virtio SCSI */ +#define VIRTIO_TRANS_ID_RNG 1005 /* transitional virtio rng */ +#define VIRTIO_TRANS_ID_9P 1009 /* transitional virtio 9p console */ + #endif /* _LINUX_VIRTIO_IDS_H */
From: Shunsuke Mie mie@igel.co.jp
mainline inclusion from mainline-v5.18-rc7 commit 7ff960a6fe399fdcbca6159063684671ae57eee9 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
Fixes: d61914ea6ada ("virtio: update virtio id table, add transitional ids") Signed-off-by: Shunsuke Mie mie@igel.co.jp Link: https://lore.kernel.org/r/20220510102723.87666-1-mie@igel.co.jp Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/uapi/linux/virtio_ids.h | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/uapi/linux/virtio_ids.h b/include/uapi/linux/virtio_ids.h index dddb4ad18174..60555f7d169c 100644 --- a/include/uapi/linux/virtio_ids.h +++ b/include/uapi/linux/virtio_ids.h @@ -53,12 +53,12 @@ * Virtio Transitional IDs */
-#define VIRTIO_TRANS_ID_NET 1000 /* transitional virtio net */ -#define VIRTIO_TRANS_ID_BLOCK 1001 /* transitional virtio block */ -#define VIRTIO_TRANS_ID_BALLOON 1002 /* transitional virtio balloon */ -#define VIRTIO_TRANS_ID_CONSOLE 1003 /* transitional virtio console */ -#define VIRTIO_TRANS_ID_SCSI 1004 /* transitional virtio SCSI */ -#define VIRTIO_TRANS_ID_RNG 1005 /* transitional virtio rng */ -#define VIRTIO_TRANS_ID_9P 1009 /* transitional virtio 9p console */ +#define VIRTIO_TRANS_ID_NET 0x1000 /* transitional virtio net */ +#define VIRTIO_TRANS_ID_BLOCK 0x1001 /* transitional virtio block */ +#define VIRTIO_TRANS_ID_BALLOON 0x1002 /* transitional virtio balloon */ +#define VIRTIO_TRANS_ID_CONSOLE 0x1003 /* transitional virtio console */ +#define VIRTIO_TRANS_ID_SCSI 0x1004 /* transitional virtio SCSI */ +#define VIRTIO_TRANS_ID_RNG 0x1005 /* transitional virtio rng */ +#define VIRTIO_TRANS_ID_9P 0x1009 /* transitional virtio 9p console */
#endif /* _LINUX_VIRTIO_IDS_H */
From: Eugenio Pérez eperezma@redhat.com
mainline inclusion from mainline-v6.6-rc1 commit b63e5c70c39349ea5b9e7dbb604551902fc753fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
This operation allows the vdpa parent to expose its own backend feature bits.
Later patches introduce a feature that is not compatible with all parent drivers: the ability to enable a vq after driver_ok. Each parent must declare whether it allows this.
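The optional-callback pattern can be sketched in isolation as follows. This is standalone userspace C with hypothetical demo_* types, not the vdpa structures themselves: a parent that leaves the callback NULL contributes no extra bits, and otherwise its bits are OR-ed into what the core advertises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dev;

/* Hypothetical ops table with one optional callback. */
struct demo_ops {
	uint64_t (*get_backend_features)(const struct demo_dev *dev);
};

struct demo_dev {
	const struct demo_ops *ops;
	uint64_t parent_bits; /* bits a hypothetical parent driver wants to expose */
};

/* Example parent callback: report the parent's own bits. */
static uint64_t demo_parent_features(const struct demo_dev *dev)
{
	return dev->parent_bits;
}

/* Combine the core's bits with the parent's bits, if the parent has any. */
static uint64_t demo_backend_features(const struct demo_dev *dev,
				      uint64_t core_bits)
{
	assert(dev && dev->ops);
	if (!dev->ops->get_backend_features)
		return core_bits; /* optional op missing: nothing to add */
	return core_bits | dev->ops->get_backend_features(dev);
}
```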
Signed-off-by: Eugenio Pérez eperezma@redhat.com Acked-by: Shannon Nelson shannon.nelson@amd.com Message-Id: 20230609092127.170673-4-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 12 ++++++++++++ include/linux/vdpa.h | 4 ++++ 2 files changed, 16 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 80d3c49f9854..202cc30d6adc 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -403,6 +403,17 @@ static long vhost_vdpa_get_features(struct vhost_vdpa *v, u64 __user *featurep) return 0; }
+static u64 vhost_vdpa_get_backend_features(const struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + if (!ops->get_backend_features) + return 0; + else + return ops->get_backend_features(vdpa); +} + static long vhost_vdpa_set_features(struct vhost_vdpa *v, u64 __user *featurep) { struct vdpa_device *vdpa = v->vdpa; @@ -739,6 +750,7 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, features |= BIT_ULL(VHOST_BACKEND_F_SUSPEND); if (vhost_vdpa_can_resume(v)) features |= BIT_ULL(VHOST_BACKEND_F_RESUME); + features |= vhost_vdpa_get_backend_features(v); if (copy_to_user(featurep, &features, sizeof(features))) r = -EFAULT; break; diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 0f6d8bef66c2..1d8d50b61363 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -185,6 +185,9 @@ struct vdpa_map_file { * @vdev: vdpa device * Returns the virtio features support by the * device + * @get_backend_features: Get parent-specific backend features (optional) + * Returns the vdpa features supported by the + * device. * @set_driver_features: Set virtio features supported by the driver * @vdev: vdpa device * @features: feature support by the driver @@ -320,6 +323,7 @@ struct vdpa_config_ops { u32 (*get_vq_align)(struct vdpa_device *vdev); u32 (*get_vq_group)(struct vdpa_device *vdev, u16 idx); u64 (*get_device_features)(struct vdpa_device *vdev); + u64 (*get_backend_features)(const struct vdpa_device *vdev); int (*set_driver_features)(struct vdpa_device *vdev, u64 features); u64 (*get_driver_features)(struct vdpa_device *vdev); void (*set_config_cb)(struct vdpa_device *vdev,
From: Arnaldo Carvalho de Melo acme@redhat.com
mainline inclusion from mainline-v6.3-rc2 commit 14e998ed42208f60d5f848f5e025fa2c2e9667b0 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
----------------------------------------------------------------------
To get the changes in:
3b688d7a086d0438 ("vhost-vdpa: uAPI to resume the device")
To pick up these changes and support them:
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before $ cp ../linux/include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after $ diff -u before after --- before 2023-03-06 09:26:14.889251817 -0300 +++ after 2023-03-06 09:26:20.594406270 -0300 @@ -30,6 +30,7 @@ [0x77] = "VDPA_SET_CONFIG_CALL", [0x7C] = "VDPA_SET_GROUP_ASID", [0x7D] = "VDPA_SUSPEND", + [0x7E] = "VDPA_RESUME", }; static const char *vhost_virtio_ioctl_read_cmds[] = { [0x00] = "GET_FEATURES", $
For instance, see how those 'cmd' ioctl arguments get translated, now VDPA_RESUME will be as well:
# perf trace -a -e ioctl --max-events=10 0.000 ( 0.011 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1) = 0 21.353 ( 0.014 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1) = 0 25.766 ( 0.014 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740) = 0 25.845 ( 0.034 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0 25.916 ( 0.011 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0) = 0 25.941 ( 0.025 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ATOMIC, arg: 0x7ffe4a22c840) = 0 32.915 ( 0.009 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_RMFB, arg: 0x7ffe4a22cf9c) = 0 42.522 ( 0.013 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740) = 0 42.579 ( 0.031 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0 42.644 ( 0.010 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0) = 0 #
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Cc: Michael S. Tsirkin mst@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Sebastien Boeuf sebastien.boeuf@intel.com Link: https://lore.kernel.org/lkml/ZAXdCTecxSNwAoeK@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- tools/include/uapi/linux/vhost.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/tools/include/uapi/linux/vhost.h b/tools/include/uapi/linux/vhost.h index f9f115a7c75b..92e1b700b51c 100644 --- a/tools/include/uapi/linux/vhost.h +++ b/tools/include/uapi/linux/vhost.h @@ -180,4 +180,12 @@ */ #define VHOST_VDPA_SUSPEND _IO(VHOST_VIRTIO, 0x7D)
+/* Resume a device so it can resume processing virtqueue requests + * + * After the return of this ioctl the device will have restored all the + * necessary states and it is fully operational to continue processing the + * virtqueue descriptors. + */ +#define VHOST_VDPA_RESUME _IO(VHOST_VIRTIO, 0x7E) + #endif
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
Introduce several new interfaces that allow a vdpa device to log guest memory during live migration and return the log to the VMM.
The set_log_base interface sets the base address of the buffer that stores the bitmaps.
The set_log_size interface sets the size of the buffer that stores the bitmaps.
The log_sync interface copies the bitmaps from kernel space to the user space of the VMM.
These operations are optional. If a parent driver does not implement one of them, the corresponding request fails with EOPNOTSUPP.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/linux/vdpa.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 1d8d50b61363..c628003104db 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -293,6 +293,15 @@ struct vdpa_map_file { * @unbind_mm: Unbind the device from the address space * bound using the bind_mm callback. (optional) * @vdev: vdpa device + * @set_log_base Set base address for logging. (optional) + * @vdev: vdpa device + * @base: base address + * @set_log_size Set buffer size for logging. (optional) + * @vdev: vdpa device + * @size: logging buffer size + * @log_sync Synchronize logging buffer from kernel space to + * user space. (optional) + * @vdev: vdpa device * @free: Free resources that belongs to vDPA (optional) * @vdev: vdpa device */ @@ -357,6 +366,11 @@ struct vdpa_config_ops { int (*bind_mm)(struct vdpa_device *vdev, struct mm_struct *mm); void (*unbind_mm)(struct vdpa_device *vdev);
+ /* Log ops */ + int (*set_log_base)(struct vdpa_device *vdev, uint64_t base); + int (*set_log_size)(struct vdpa_device *vdev, uint64_t size); + int (*log_sync)(struct vdpa_device *vdev); + /* Free device resources */ void (*free)(struct vdpa_device *vdev); };
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
These new ioctls let userspace set the bitmap configuration, such as the base address and the buffer size.
When setting up migration, the VMM calls VHOST_SET_LOG_BASE and VHOST_SET_LOG_SIZE to set the address and size of the buffer used for storing bitmaps.
When the VMM then starts live migration, it enables logging on the vhost device by setting the VHOST_F_LOG_ALL feature.
During each live migration iteration, the VMM gets dirty page information from the vhost device by calling VHOST_LOG_SYNC.
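A minimal sketch of how a VMM might drive this sequence from userspace, assuming fd is an open vhost-vdpa device and that the ioctl numbers match the uAPI added by this patch. The demo_* helpers are hypothetical; the ioctl macros are redefined locally because installed headers may not carry them.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/types.h>

/* Local copies of the ioctl numbers as given in the patch. */
#define VHOST_VIRTIO 0xAF
#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
#define VHOST_SET_LOG_SIZE _IOW(VHOST_VIRTIO, 0x05, __u64)
#define VHOST_LOG_SYNC     _IO(VHOST_VIRTIO, 0x06)

static_assert(sizeof(uint64_t) == sizeof(__u64), "argument size must match");

/* Migration setup: tell the device where the log buffer is and how big. */
static int demo_log_setup(int fd, uint64_t base, uint64_t size)
{
	if (ioctl(fd, VHOST_SET_LOG_BASE, &base) < 0)
		return -errno;
	if (ioctl(fd, VHOST_SET_LOG_SIZE, &size) < 0)
		return -errno;
	return 0;
}

/*
 * One migration iteration: ask the device to copy its dirty log to the
 * buffer. Enabling VHOST_F_LOG_ALL beforehand is left to the caller.
 */
static int demo_log_sync(int fd)
{
	return ioctl(fd, VHOST_LOG_SYNC) < 0 ? -errno : 0;
}
```

A caller would treat -EOPNOTSUPP from either helper as "this parent driver does not implement the log ops".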
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 49 ++++++++++++++++++++++++++++++++++++++ include/uapi/linux/vhost.h | 4 ++++ 2 files changed, 53 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 202cc30d6adc..a48dd419ba76 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -567,6 +567,47 @@ static long vhost_vdpa_resume(struct vhost_vdpa *v) return ops->resume(vdpa); }
+static long vhost_vdpa_set_log_base(struct vhost_vdpa *v, u64 __user *argp) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + u64 log; + + if (!ops->set_log_base) + return -EOPNOTSUPP; + + if (copy_from_user(&log, argp, sizeof(uint64_t))) + return -EFAULT; + + return ops->set_log_base(vdpa, log); +} + +static long vhost_vdpa_set_log_size(struct vhost_vdpa *v, u64 __user *sizep) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + u64 log_size; + + if (!ops->set_log_size) + return -EOPNOTSUPP; + + if (copy_from_user(&log_size, sizep, sizeof(log_size))) + return -EFAULT; + + return ops->set_log_size(vdpa, log_size); +} + +static long vhost_vdpa_log_sync(struct vhost_vdpa *v) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + + if (!ops->log_sync) + return -EOPNOTSUPP; + + return ops->log_sync(vdpa); +} + static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd, void __user *argp) { @@ -738,6 +779,14 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, r = -EFAULT; break; case VHOST_SET_LOG_BASE: + r = vhost_vdpa_set_log_base(v, argp); + break; + case VHOST_SET_LOG_SIZE: + r = vhost_vdpa_set_log_size(v, argp); + break; + case VHOST_LOG_SYNC: + r = vhost_vdpa_log_sync(v); + break; case VHOST_SET_LOG_FD: r = -ENOIOCTLCMD; break; diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index 92e1b700b51c..7c22c18f51d2 100644 --- a/include/uapi/linux/vhost.h +++ b/include/uapi/linux/vhost.h @@ -43,6 +43,10 @@ * The bit is set using an atomic 32 bit operation. */ /* Set base address for logging. 
*/ #define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64) +/* Set buffer size for logging */ +#define VHOST_SET_LOG_SIZE _IOW(VHOST_VIRTIO, 0x05, __u64) +/* Synchronize logging buffer from kernel space to user space */ +#define VHOST_LOG_SYNC _IO(VHOST_VIRTIO, 0x06) /* Specify an eventfd file descriptor to signal on log write. */ #define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
Introduce several interfaces that allow a vdpa device to save and load its device state when the guest machine suspends and resumes.
The get_dev_buffer_size interface returns the size of the buffer that holds the vdpa device state.
The get_dev_buffer interface reads the device state buffer from the vdpa device so that the VMM can save it.
The set_dev_buffer interface sets the device state from userspace.
These operations are optional. If they are not implemented, the request fails with EOPNOTSUPP.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/linux/vdpa.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index c628003104db..dc86e2829a35 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -302,6 +302,19 @@ struct vdpa_map_file { * @log_sync Synchronize logging buffer from kernel space to * user space. (optional) * @vdev: vdpa device + * @get_dev_buffer_size Get device state buffer size. (optional) + * @vdev: vdpa device + * Return device status buffer size of vdpa device. + * @get_dev_buffer Get device state buffer. (optional) + * @vdev: vdpa device + * @offset: offset of dest for saving device state. + * @dest: userspace address for saving device state. + * @len: device state buffer length. + * @set_dev_buffer Set device state buffer. (optional) + * @vdev: vdpa device + * @offset: offset of src addr of device state. + * @src: userspace addr of device state + * @len: device state buffer length. * @free: Free resources that belongs to vDPA (optional) * @vdev: vdpa device */ @@ -371,6 +384,13 @@ struct vdpa_config_ops { int (*set_log_size)(struct vdpa_device *vdev, uint64_t size); int (*log_sync)(struct vdpa_device *vdev);
+ /* device state ops */ + uint32_t (*get_dev_buffer_size)(struct vdpa_device *vdpa); + int (*get_dev_buffer)(struct vdpa_device *vdev, unsigned int offset, + void __user *dest, unsigned int len); + int (*set_dev_buffer)(struct vdpa_device *vdev, unsigned int offset, + const void __user *src, unsigned int len); + /* Free device resources */ void (*free)(struct vdpa_device *vdev); };
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
These new ioctls add support for saving and loading device state from userspace.
When a vhost-vdpa device starts migration, the VMM needs to call these ioctls to save and load the device state of vhost-vdpa devices, if these ops are implemented.
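A minimal sketch of the save path as a VMM might use it, assuming fd is an open vhost-vdpa device, that the ioctl numbers match the uAPI added by this patch, and that the device returns its whole state in one read at offset 0. demo_save_dev_state() is a hypothetical helper; the ioctl macros are defined locally in case the installed headers lack them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* Local copies of the new ioctl numbers as given in the patch. */
#ifndef VHOST_GET_DEV_BUFFER
#define VHOST_GET_DEV_BUFFER      _IOR(VHOST_VIRTIO, 0xb0, struct vhost_vdpa_config)
#define VHOST_GET_DEV_BUFFER_SIZE _IOR(VHOST_VIRTIO, 0xb3, __u32)
#endif

/*
 * Query the state size, then read the state into a buffer that follows
 * the vhost_vdpa_config header. Returns a malloc()ed copy of the state
 * (caller frees) or NULL on any failure.
 */
static void *demo_save_dev_state(int fd, uint32_t *len_out)
{
	struct vhost_vdpa_config *cfg;
	uint32_t size = 0;
	void *state;

	assert(len_out != NULL);

	if (ioctl(fd, VHOST_GET_DEV_BUFFER_SIZE, &size) < 0 || size == 0)
		return NULL;

	cfg = malloc(sizeof(*cfg) + size);
	if (!cfg)
		return NULL;
	cfg->off = 0;
	cfg->len = size;

	if (ioctl(fd, VHOST_GET_DEV_BUFFER, cfg) < 0) {
		free(cfg);
		return NULL;
	}

	state = malloc(size);
	if (state) {
		memcpy(state, cfg->buf, size);
		*len_out = size;
	}
	free(cfg);
	return state;
}
```

The load path on the destination would mirror this with VHOST_SET_DEV_BUFFER, filling cfg->buf before the ioctl.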
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 71 ++++++++++++++++++++++++++++++++++++++ include/uapi/linux/vhost.h | 5 +++ 2 files changed, 76 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index a48dd419ba76..d6a4221ecd60 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -567,6 +567,68 @@ static long vhost_vdpa_resume(struct vhost_vdpa *v) return ops->resume(vdpa); }
+static int vhost_vdpa_get_dev_buffer_size(struct vhost_vdpa *v, + uint32_t __user *argp) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + uint32_t size; + + if (!ops->get_dev_buffer_size) + return -EOPNOTSUPP; + + size = ops->get_dev_buffer_size(vdpa); + + if (copy_to_user(argp, &size, sizeof(size))) + return -EFAULT; + + return 0; +} + +static int vhost_vdpa_get_dev_buffer(struct vhost_vdpa *v, + struct vhost_vdpa_config __user *c) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + struct vhost_vdpa_config config; + int ret; + unsigned long size = offsetof(struct vhost_vdpa_config, buf); + + if (copy_from_user(&config, c, size)) + return -EFAULT; + + if (!ops->get_dev_buffer) + return -EOPNOTSUPP; + + down_read(&vdpa->cf_lock); + ret = ops->get_dev_buffer(vdpa, config.off, c->buf, config.len); + up_read(&vdpa->cf_lock); + + return ret; +} + +static int vhost_vdpa_set_dev_buffer(struct vhost_vdpa *v, + struct vhost_vdpa_config __user *c) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + struct vhost_vdpa_config config; + int ret; + unsigned long size = offsetof(struct vhost_vdpa_config, buf); + + if (copy_from_user(&config, c, size)) + return -EFAULT; + + if (!ops->set_dev_buffer) + return -EOPNOTSUPP; + + down_write(&vdpa->cf_lock); + ret = ops->set_dev_buffer(vdpa, config.off, c->buf, config.len); + up_write(&vdpa->cf_lock); + + return ret; +} + static long vhost_vdpa_set_log_base(struct vhost_vdpa *v, u64 __user *argp) { struct vdpa_device *vdpa = v->vdpa; @@ -818,6 +880,15 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, case VHOST_VDPA_RESUME: r = vhost_vdpa_resume(v); break; + case VHOST_GET_DEV_BUFFER_SIZE: + r = vhost_vdpa_get_dev_buffer_size(v, argp); + break; + case VHOST_GET_DEV_BUFFER: + r = vhost_vdpa_get_dev_buffer(v, argp); + break; + case VHOST_SET_DEV_BUFFER: + r = vhost_vdpa_set_dev_buffer(v, argp); + 
break; default: r = vhost_dev_ioctl(&v->vdev, cmd, argp); if (r == -ENOIOCTLCMD) diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index 7c22c18f51d2..da1c83d19896 100644 --- a/include/uapi/linux/vhost.h +++ b/include/uapi/linux/vhost.h @@ -192,4 +192,9 @@ */ #define VHOST_VDPA_RESUME _IO(VHOST_VIRTIO, 0x7E)
+/* set and get device buffer */ +#define VHOST_GET_DEV_BUFFER _IOR(VHOST_VIRTIO, 0xb0, struct vhost_vdpa_config) +#define VHOST_SET_DEV_BUFFER _IOW(VHOST_VIRTIO, 0xb1, struct vhost_vdpa_config) +#define VHOST_GET_DEV_BUFFER_SIZE _IOR(VHOST_VIRTIO, 0xb3, __u32) + #endif
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
Introduce a new interface to set the vdpa device migration state, such as migration start/stop, pre_start/pre_stop, etc.
Some vdpa devices need to do some work in different states. As not all vdpa devices need this, the interface is optional.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- include/linux/vdpa.h | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index dc86e2829a35..83df2380eeb9 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -315,6 +315,9 @@ struct vdpa_map_file { * @offset: offset of src addr of device state. * @src: userspace addr of device state * @len: device state buffer length. + * @set_mig_state Set device migration status. (optional) + * @vdev: vdpa device + * @status: migration status * @free: Free resources that belongs to vDPA (optional) * @vdev: vdpa device */ @@ -391,6 +394,9 @@ struct vdpa_config_ops { int (*set_dev_buffer)(struct vdpa_device *vdev, unsigned int offset, const void __user *src, unsigned int len);
+ /* device mig state ops */ + int (*set_mig_state)(struct vdpa_device *v, u8 state); + /* Free device resources */ void (*free)(struct vdpa_device *vdev); };
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
This new ioctl adds support for setting the vhost-vdpa device migration state.
During migration there are several migration states, such as start/stop, pre_start/pre_stop, post_start/post_stop, and cancel. Some hardware needs to do work at these stages, so introduce a new ioctl to support it.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com --- drivers/vhost/vdpa.c | 18 ++++++++++++++++++ include/uapi/linux/vhost.h | 3 +++ include/uapi/linux/vhost_types.h | 16 ++++++++++++++++ 3 files changed, 37 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index d6a4221ecd60..1413264d29ed 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -629,6 +629,21 @@ static int vhost_vdpa_set_dev_buffer(struct vhost_vdpa *v, return ret; }
+static int vhost_vdpa_set_mig_state(struct vhost_vdpa *v, u8 __user *c) +{ + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + u8 state; + + if (!ops->set_mig_state) + return -EOPNOTSUPP; + + if (get_user(state, c)) + return -EFAULT; + + return ops->set_mig_state(vdpa, state); +} + static long vhost_vdpa_set_log_base(struct vhost_vdpa *v, u64 __user *argp) { struct vdpa_device *vdpa = v->vdpa; @@ -889,6 +904,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, case VHOST_SET_DEV_BUFFER: r = vhost_vdpa_set_dev_buffer(v, argp); break; + case VHOST_VDPA_SET_MIG_STATE: + r = vhost_vdpa_set_mig_state(v, argp); + break; default: r = vhost_dev_ioctl(&v->vdev, cmd, argp); if (r == -ENOIOCTLCMD) diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index da1c83d19896..e916ac7a10ce 100644 --- a/include/uapi/linux/vhost.h +++ b/include/uapi/linux/vhost.h @@ -197,4 +197,7 @@ #define VHOST_SET_DEV_BUFFER _IOW(VHOST_VIRTIO, 0xb1, struct vhost_vdpa_config) #define VHOST_GET_DEV_BUFFER_SIZE _IOR(VHOST_VIRTIO, 0xb3, __u32)
+/* set device migration state */ +#define VHOST_VDPA_SET_MIG_STATE _IOW(VHOST_VIRTIO, 0xb2, __u8) + #endif diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h index 9e072926d633..d1114838c99c 100644 --- a/include/uapi/linux/vhost_types.h +++ b/include/uapi/linux/vhost_types.h @@ -147,6 +147,22 @@ struct vhost_vdpa_iova_range { __u64 last; };
+/* vhost vdpa device migration status */ +enum { + VHOST_VDPA_DEVICE_START, + VHOST_VDPA_DEVICE_STOP, + VHOST_VDPA_DEVICE_PRE_START, + VHOST_VDPA_DEVICE_PRE_STOP, + VHOST_VDPA_DEVICE_CANCEL, + VHOST_VDPA_DEVICE_POST_START, + VHOST_VDPA_DEVICE_START_ASYNC, + VHOST_VDPA_DEVICE_STOP_ASYNC, + VHOST_VDPA_DEVICE_PRE_START_ASYNC, + VHOST_VDPA_DEVICE_QUERY_OP_STATE, + VHOST_VDPA_DEVICE_MSIX_MASK, + VHOST_VDPA_DEVICE_MSIX_UNMASK, +}; + /* Feature bits */ /* Log all write descriptors. Can be changed while device is active. */ #define VHOST_F_LOG_ALL 26
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
Introduce a new feature bit, VHOST_BACKEND_F_BYTEMAPLOG, for negotiating the format of the dirty page log.
Some hardware supports only a bytemap for logging, hence the new feature bit. The dirty page log format is negotiated when the vhost device starts.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com
---
 drivers/vhost/vdpa.c             | 3 ++-
 include/uapi/linux/vhost_types.h | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 1413264d29ed..23a27df16cff 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -807,7 +807,8 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
 			return -EFAULT;
 		if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
 				 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
-				 BIT_ULL(VHOST_BACKEND_F_RESUME)))
+				 BIT_ULL(VHOST_BACKEND_F_RESUME) |
+				 BIT_ULL(VHOST_BACKEND_F_BYTEMAPLOG)))
 			return -EOPNOTSUPP;
 		if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
 		     !vhost_vdpa_can_suspend(v))
diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index d1114838c99c..95cfaf6678e6 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -182,4 +182,7 @@ enum {
 /* Device can be resumed */
 #define VHOST_BACKEND_F_RESUME 0x5

+/* Device can use bytemap to deal log */
+#define VHOST_BACKEND_F_BYTEMAPLOG 0x3f
+
 #endif
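The effect of the mask change can be sketched in userspace as below. This is an illustration under stated assumptions, not the driver code: check_backend_features() is a hypothetical name, the base parameter stands in for VHOST_VDPA_BACKEND_FEATURES, and VHOST_BACKEND_F_SUSPEND is assumed to be 0x4 as in upstream vhost_types.h. The other two bit numbers are taken from the hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

#define VHOST_BACKEND_F_SUSPEND    0x4	/* assumption: upstream value */
#define VHOST_BACKEND_F_RESUME     0x5
#define VHOST_BACKEND_F_BYTEMAPLOG 0x3f	/* added by this patch */

/* The new bit is the top bit of the 64-bit feature word. */
static_assert(VHOST_BACKEND_F_BYTEMAPLOG < 64, "bit must fit in a u64");

/*
 * Mirrors the mask test in the VHOST_SET_BACKEND_FEATURES path: any bit
 * outside the accepted set is rejected with -EOPNOTSUPP.
 */
static int check_backend_features(uint64_t features, uint64_t base)
{
	uint64_t allowed = base |
			   BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
			   BIT_ULL(VHOST_BACKEND_F_RESUME) |
			   BIT_ULL(VHOST_BACKEND_F_BYTEMAPLOG);

	if (features & ~allowed)
		return -EOPNOTSUPP;
	return 0;
}
```

With the patch applied, a feature word carrying only bit 0x3f passes this test, whereas a neighbouring unknown bit such as 0x3e is still refused.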
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
Export the iommu_get_resv_regions and iommu_put_resv_regions symbols, as vhost-vdpa needs to use them.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com
---
 drivers/iommu/iommu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b888efd65e92..6d4f515294f9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3040,6 +3040,7 @@ void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 	if (ops && ops->get_resv_regions)
 		ops->get_resv_regions(dev, list);
 }
+EXPORT_SYMBOL_GPL(iommu_get_resv_regions);

 void iommu_put_resv_regions(struct device *dev, struct list_head *list)
 {
@@ -3048,6 +3049,7 @@ void iommu_put_resv_regions(struct device *dev, struct list_head *list)
 	if (ops && ops->put_resv_regions)
 		ops->put_resv_regions(dev, list);
 }
+EXPORT_SYMBOL_GPL(iommu_put_resv_regions);

 /**
  * generic_iommu_put_resv_regions - Reserved region driver helper
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
When attaching dma_dev to the IOMMU domain, check the device's reserved regions and test whether the IOMMU translates MSI transactions. If it does, initialize an IOVA allocator through the iommu_get_msi_cookie() API. This allows MSI IOVAs to be allocated transparently when the MSI controller's compose() is called.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com
---
 drivers/vhost/vdpa.c | 60 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 23a27df16cff..41c345f9426f 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -22,6 +22,7 @@
 #include <linux/vdpa.h>
 #include <linux/nospec.h>
 #include <linux/vhost.h>
+#include <linux/dma-iommu.h>

 #include "vhost.h"

@@ -49,6 +50,7 @@ struct vhost_vdpa {
 	struct completion completion;
 	struct vdpa_device *vdpa;
 	struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
+	struct vhost_iotlb resv_iotlb;
 	struct device dev;
 	struct cdev cdev;
 	atomic_t opened;
@@ -1251,6 +1253,10 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
 	    msg->iova + msg->size - 1 > v->range.last)
 		return -EINVAL;

+	if (vhost_iotlb_itree_first(&v->resv_iotlb, msg->iova,
+				    msg->iova + msg->size - 1))
+		return -EINVAL;
+
 	if (vhost_iotlb_itree_first(iotlb, msg->iova,
 				    msg->iova + msg->size - 1))
 		return -EEXIST;
@@ -1339,6 +1345,46 @@ static ssize_t vhost_vdpa_chr_write_iter(struct kiocb *iocb,
 	return vhost_chr_write_iter(dev, from);
 }

+static int vhost_vdpa_resv_iommu_region(struct iommu_domain *domain, struct device *dma_dev,
+		struct vhost_iotlb *resv_iotlb)
+{
+	struct list_head dev_resv_regions;
+	phys_addr_t resv_msi_base = 0;
+	struct iommu_resv_region *region;
+	int ret = 0;
+	bool with_sw_msi = false;
+	bool with_hw_msi = false;
+
+	INIT_LIST_HEAD(&dev_resv_regions);
+	iommu_get_resv_regions(dma_dev, &dev_resv_regions);
+
+	list_for_each_entry(region, &dev_resv_regions, list) {
+		ret = vhost_iotlb_add_range_ctx(resv_iotlb, region->start,
+				region->start + region->length - 1,
+				0, 0, NULL);
+		if (ret) {
+			vhost_iotlb_reset(resv_iotlb);
+			break;
+		}
+
+		if (region->type == IOMMU_RESV_MSI)
+			with_hw_msi = true;
+
+		if (region->type == IOMMU_RESV_SW_MSI) {
+			resv_msi_base = region->start;
+			with_sw_msi = true;
+		}
+
+	}
+
+	if (!ret && !with_hw_msi && with_sw_msi)
+		ret = iommu_get_msi_cookie(domain, resv_msi_base);
+
+	iommu_put_resv_regions(dma_dev, &dev_resv_regions);
+
+	return ret;
+}
+
 static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
 {
 	struct vdpa_device *vdpa = v->vdpa;
@@ -1364,11 +1410,16 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)

 	ret = iommu_attach_device(v->domain, dma_dev);
 	if (ret)
-		goto err_attach;
+		goto err_alloc_domain;

-	return 0;
+	ret = vhost_vdpa_resv_iommu_region(v->domain, dma_dev, &v->resv_iotlb);
+	if (ret)
+		goto err_attach_device;

-err_attach:
+	return 0;
+err_attach_device:
+	iommu_detach_device(v->domain, dma_dev);
+err_alloc_domain:
 	iommu_domain_free(v->domain);
 	v->domain = NULL;
 	return ret;
@@ -1495,6 +1546,7 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
 	vhost_vdpa_unbind_mm(v);
 	vhost_vdpa_config_put(v);
 	vhost_vdpa_cleanup(v);
+	vhost_iotlb_reset(&v->resv_iotlb);
 	mutex_unlock(&d->mutex);

 	atomic_dec(&v->opened);
@@ -1627,6 +1679,8 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
 		goto err;
 	}

+	vhost_iotlb_init(&v->resv_iotlb, 0, 0);
+
 	r = dev_set_name(&v->dev, "vhost-vdpa-%u", minor);
 	if (r)
 		goto err;
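The two decisions this patch makes can be modelled in plain C as below: whether an MSI cookie is needed, and whether an IOTLB update touches a reserved range. This is a simplified sketch under assumptions, not the driver code. The struct, the enum and both function names are hypothetical, a flat array replaces the kernel list and interval tree, and the addresses used with it are purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the iommu_resv_type values the patch looks at. */
enum resv_type { RESV_OTHER, RESV_MSI, RESV_SW_MSI };

struct resv_region {
	uint64_t start;
	uint64_t length;
	enum resv_type type;
};

/*
 * True when a software MSI window exists and no hardware MSI region does,
 * which is the case where the patch calls iommu_get_msi_cookie(). The base
 * of the software window is stored in *msi_base.
 */
static bool needs_msi_cookie(const struct resv_region *r, size_t n,
			     uint64_t *msi_base)
{
	bool with_sw_msi = false, with_hw_msi = false;

	for (size_t i = 0; i < n; i++) {
		if (r[i].type == RESV_MSI)
			with_hw_msi = true;
		if (r[i].type == RESV_SW_MSI) {
			*msi_base = r[i].start;
			with_sw_msi = true;
		}
	}
	return !with_hw_msi && with_sw_msi;
}

/* True if [iova, iova + size - 1] intersects any reserved region. */
static bool overlaps_reserved(const struct resv_region *r, size_t n,
			      uint64_t iova, uint64_t size)
{
	uint64_t last = iova + size - 1;

	assert(size > 0 && last >= iova);	/* sketch assumes no wraparound */
	for (size_t i = 0; i < n; i++) {
		uint64_t r_last = r[i].start + r[i].length - 1;

		if (iova <= r_last && r[i].start <= last)
			return true;
	}
	return false;
}
```

In the patch itself, every reserved region is recorded in resv_iotlb regardless of type, so an update for which the equivalent of overlaps_reserved() holds is rejected with -EINVAL before the existing -EEXIST check runs.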
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
When vhost_vdpa_reset() is called, the vdpa device's PCI driver may request an IRQ. At that point the MSI IOVA for the device has not been initialized yet, which may cause an error. Call vhost_vdpa_alloc_domain() first to avoid this case.
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com
---
 drivers/vhost/vdpa.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 41c345f9426f..874877ee7f8f 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -1488,6 +1488,9 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
 	opened = atomic_cmpxchg(&v->opened, 0, 1);
 	if (opened)
 		return -EBUSY;
+	r = vhost_vdpa_alloc_domain(v);
+	if (r)
+		return r;

 	nvqs = v->nvqs;
 	r = vhost_vdpa_reset(v);
@@ -1508,19 +1511,14 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
 	vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
 		       vhost_vdpa_process_iotlb_msg);

-	r = vhost_vdpa_alloc_domain(v);
-	if (r)
-		goto err_alloc_domain;
-
 	vhost_vdpa_set_iova_range(v);

 	filep->private_data = v;

 	return 0;

-err_alloc_domain:
-	vhost_vdpa_cleanup(v);
 err:
+	vhost_vdpa_free_domain(v);
 	atomic_dec(&v->opened);
 	return r;
 }
@@ -1546,7 +1544,6 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
 	vhost_vdpa_unbind_mm(v);
 	vhost_vdpa_config_put(v);
 	vhost_vdpa_cleanup(v);
-	vhost_iotlb_reset(&v->resv_iotlb);
 	mutex_unlock(&d->mutex);

 	atomic_dec(&v->opened);
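The reordering in vhost_vdpa_open() can be illustrated with a small userspace model. This is a sketch under assumptions rather than the driver code: struct fake_vdpa and the fake_*() functions are hypothetical stand-ins, and a plain flag replaces the atomic opened counter. Unlike the hunk above, the sketch also clears the opened flag when domain allocation fails; that is a choice made for the sketch, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct fake_vdpa {
	bool opened;
	bool domain;	/* stands for v->domain being allocated and attached */
	int reset_err;	/* injected result of the reset step */
};

static int fake_alloc_domain(struct fake_vdpa *v)
{
	v->domain = true;
	return 0;
}

static void fake_free_domain(struct fake_vdpa *v)
{
	v->domain = false;
}

static int fake_reset(struct fake_vdpa *v)
{
	/* The device driver may request IRQs here, so the domain must exist. */
	assert(v->domain);
	return v->reset_err;
}

static int fake_open(struct fake_vdpa *v)
{
	int r;

	if (v->opened)
		return -EBUSY;
	v->opened = true;

	r = fake_alloc_domain(v);	/* moved ahead of the reset */
	if (r)
		goto err_opened;

	r = fake_reset(v);
	if (r)
		goto err;

	return 0;
err:
	fake_free_domain(v);		/* undo in reverse order */
err_opened:
	v->opened = false;
	return r;
}
```

The assert in fake_reset() encodes the ordering requirement from the commit message: by the time reset runs, the domain and its MSI setup are already in place.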
From: jiangdongxu jiangdongxu1@huawei.com
virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I86ITO
----------------------------------------------------------------------
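Add the VDPA config: enable CONFIG_VDPA and CONFIG_VHOST_VDPA as modules in the arm64 openeuler_defconfig. The IFCVF, MLX5_VDPA_NET and VP_VDPA drivers are left disabled.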
Signed-off-by: jiangdongxu jiangdongxu1@huawei.com
---
 arch/arm64/configs/openeuler_defconfig | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/configs/openeuler_defconfig b/arch/arm64/configs/openeuler_defconfig
index 52ff508b84c6..165de3fd17d5 100644
--- a/arch/arm64/configs/openeuler_defconfig
+++ b/arch/arm64/configs/openeuler_defconfig
@@ -5832,13 +5832,17 @@ CONFIG_VIRTIO_INPUT=m
 CONFIG_VIRTIO_MMIO=m
 # CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set
 CONFIG_VIRTIO_DMA_SHARED_BUFFER=m
-# CONFIG_VDPA is not set
+CONFIG_VDPA=m
+# CONFIG_IFCVF is not set
+# CONFIG_MLX5_VDPA_NET is not set
+# CONFIG_VP_VDPA is not set
 CONFIG_VHOST_IOTLB=m
 CONFIG_VHOST=m
 CONFIG_VHOST_MENU=y
 CONFIG_VHOST_NET=m
 CONFIG_VHOST_SCSI=m
 CONFIG_VHOST_VSOCK=m
+CONFIG_VHOST_VDPA=m
 # CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set

 #
FeedBack: The patch(es) which you have sent to kernel@openeuler.org mailing list has been converted to a pull request successfully! Pull request link: https://gitee.com/openeuler/kernel/pulls/2578 Mailing list address: https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/C...