From: Kunkun Jiang <jiangkunkun@huawei.com>

virt inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I61SPO
CVE: NA

--------------------------------
This reverts commit aa2addedeae2756de0265c56c4e8d96aac737a23.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
---
 Documentation/driver-api/vfio.rst | 77 -------------------------------
 1 file changed, 77 deletions(-)

diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst
index b57a96d20d3b..d3a02300913a 100644
--- a/Documentation/driver-api/vfio.rst
+++ b/Documentation/driver-api/vfio.rst
@@ -239,83 +239,6 @@ group and can access them as follows::
 	/* Gratuitous device reset and go... */
 	ioctl(device, VFIO_DEVICE_RESET);
 
-IOMMU Dual Stage Control
-------------------------
-
-Some IOMMUs support 2 stages/levels of translation. "Stage" corresponds to
-the ARM terminology while "level" corresponds to Intel's VTD terminology. In
-the following text we use either without distinction.
-
-This is useful when the guest is exposed with a virtual IOMMU and some
-devices are assigned to the guest through VFIO. Then the guest OS can use
-stage 1 (IOVA -> GPA), while the hypervisor uses stage 2 for VM isolation
-(GPA -> HPA).
-
-The guest gets ownership of the stage 1 page tables and also owns stage 1
-configuration structures. The hypervisor owns the root configuration structure
-(for security reason), including stage 2 configuration. This works as long
-configuration structures and page table format are compatible between the
-virtual IOMMU and the physical IOMMU.
-
-Assuming the HW supports it, this nested mode is selected by choosing the
-VFIO_TYPE1_NESTING_IOMMU type through:
-
-ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_NESTING_IOMMU);
-
-This forces the hypervisor to use the stage 2, leaving stage 1 available for
-guest usage.
-
-Once groups are attached to the container, the guest stage 1 translation
-configuration data can be passed to VFIO by using
-
-ioctl(container, VFIO_IOMMU_SET_PASID_TABLE, &pasid_table_info);
-
-This allows to combine the guest stage 1 configuration structure along with
-the hypervisor stage 2 configuration structure. Stage 1 configuration
-structures are dependent on the IOMMU type.
-
-As the stage 1 translation is fully delegated to the HW, translation faults
-encountered during the translation process need to be propagated up to
-the virtualizer and re-injected into the guest.
-
-The userspace must be prepared to receive faults. The VFIO-PCI device
-exposes one dedicated DMA FAULT region: it contains a ring buffer and
-its header that allows to manage the head/tail indices. The region is
-identified by the following index/subindex:
-- VFIO_REGION_TYPE_NESTED/VFIO_REGION_SUBTYPE_NESTED_DMA_FAULT
-
-The DMA FAULT region exposes a VFIO_REGION_INFO_CAP_DMA_FAULT
-region capability that allows the userspace to retrieve the ABI version
-of the fault records filled by the host.
-
-On top of that region, the userspace can be notified whenever a fault
-occurs at the physical level. It can use the VFIO_IRQ_TYPE_NESTED/
-VFIO_IRQ_SUBTYPE_DMA_FAULT specific IRQ to attach the eventfd to be
-signalled.
-
-The ring buffer containing the fault records can be mmapped. When
-the userspace consumes a fault in the queue, it should increment
-the consumer index to allow new fault records to replace the used ones.
-
-The queue size and the entry size can be retrieved in the header.
-The tail index should never overshoot the producer index as in any
-other circular buffer scheme. Also it must be less than the queue size
-otherwise the change fails.
-
-When the guest invalidates stage 1 related caches, invalidations must be
-forwarded to the host through
-ioctl(container, VFIO_IOMMU_CACHE_INVALIDATE, &inv_data);
-Those invalidations can happen at various granularity levels, page, context, ...
-
-The ARM SMMU specification introduces another challenge: MSIs are translated by
-both the virtual SMMU and the physical SMMU. To build a nested mapping for the
-IOVA programmed into the assigned device, the guest needs to pass its IOVA/MSI
-doorbell GPA binding to the host. Then the hypervisor can build a nested stage 2
-binding eventually translating into the physical MSI doorbell.
-
-This is achieved by calling
-ioctl(container, VFIO_IOMMU_SET_MSI_BINDING, &guest_binding);
-
 VFIO User API
 -------------------------------------------------------------------------------
 