Hi Shameer,
On Wed, Jul 21, 2021 at 08:54:00AM +0000, Shameerali Kolothum Thodi wrote:
More generally I think this pinned VMID set conflicts with that of stage-2-only domains (which is the default state until a guest attaches a PASID table). Say you have one guest using DOMAIN_NESTED without PASID table, just DMA to IPA using VMID 0x8000. Now another guest attaches a PASID table and obtains the same VMID from KVM. The stage-2 translation might use TLB entries from the other guest, no? They'll both create stage-2 TLB entries with {StreamWorld=NS-EL1, VMID=0x8000}
Now that we are trying to align the KVM VMID allocation algorithm with that of the ASID allocator [1], I attempted to use that for the SMMU pinned VMID allocation. But the issue you have mentioned above is still valid.
And as a solution, what I have tried now is to follow what pinned ASID is doing in SVA:
 - Use an xarray for private VMIDs
 - Get the pinned VMID from KVM for DOMAIN_NESTED with a PASID table
 - If the new pinned VMID is in use by a private domain, then update the private VMID (a VMID update to a live STE)
This seems to work, but I still need to run more tests.
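To make the three steps above concrete, here is a toy userspace model of the collision handling. It is only a sketch: a flat array stands in for the xarray, and every name in it (toy_domain, toy_alloc_private, toy_install_pinned) is hypothetical, not taken from the SMMUv3 driver or from the patches under discussion. It also assumes a pinned VMID is used by a single domain, which a real implementation would not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_VMID_COUNT 16	/* tiny space for illustration only */

enum toy_owner { TOY_FREE = 0, TOY_PRIVATE, TOY_PINNED };

struct toy_domain {
	uint16_t vmid;		/* stands in for the VMID field of the STE */
};

static enum toy_owner toy_owner_of[TOY_VMID_COUNT];
static struct toy_domain *toy_private_dom[TOY_VMID_COUNT];

/* Give a stage-2-only domain the lowest free private VMID (0 is reserved). */
static int toy_alloc_private(struct toy_domain *d)
{
	for (uint16_t v = 1; v < TOY_VMID_COUNT; v++) {
		if (toy_owner_of[v] == TOY_FREE) {
			toy_owner_of[v] = TOY_PRIVATE;
			toy_private_dom[v] = d;
			d->vmid = v;
			return 0;
		}
	}
	return -1;
}

/*
 * Install a pinned VMID obtained from KVM. If a private domain already
 * holds that value, move the private domain to a new VMID first.
 */
static int toy_install_pinned(struct toy_domain *d, uint16_t kvm_vmid)
{
	if (kvm_vmid == 0 || kvm_vmid >= TOY_VMID_COUNT ||
	    toy_owner_of[kvm_vmid] == TOY_PINNED)
		return -1;

	if (toy_owner_of[kvm_vmid] == TOY_PRIVATE) {
		struct toy_domain *victim = toy_private_dom[kvm_vmid];

		/* Mark the slot first so the allocator skips it. */
		toy_owner_of[kvm_vmid] = TOY_PINNED;
		toy_private_dom[kvm_vmid] = NULL;
		if (toy_alloc_private(victim) < 0) {
			/* No free VMID left: undo and report failure. */
			toy_owner_of[kvm_vmid] = TOY_PRIVATE;
			toy_private_dom[kvm_vmid] = victim;
			return -1;
		}
		/*
		 * A real driver would rewrite the live STE and invalidate
		 * TLB entries tagged with the old VMID at this point.
		 */
	}

	toy_owner_of[kvm_vmid] = TOY_PINNED;
	d->vmid = kvm_vmid;
	return 0;
}
```

In this model, a private domain holding VMID 1 gets moved to VMID 2 when KVM hands out VMID 1 as pinned, which is the "VMID update to a live STE" case.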
It's tempting to allocate all VMIDs through KVM instead, but that will force a dependency on KVM to use VFIO_TYPE1_NESTING_IOMMU and might break existing users of that extension (though I'm not sure there are any). Instead we might need to restrict the SMMU VMID bitmap to match the private VMID set in KVM.
Another solution I have in mind is to make the new KVM VMID allocator common between SMMUv3 and KVM. This will help to avoid all the private and shared VMID splitting, and there is no need for live updates to the STE VMID. One possible drawback is a smaller number of available KVM VMIDs, but with a 16-bit VMID space I am not sure how much of a concern that is.
Yes, I think that works too. In practice there shouldn't be many VMIDs on the SMMU side, since the feature's only enabled when a user wants to assign devices with nesting translation (unlike ASIDs, where each device in the system gets a private ASID by default).
Note that you still need to pin all VMIDs used by the SMMU, otherwise you'll have to update the STE after rollover.
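As an illustration of why pinning matters, here is a toy generation-based allocator. It is a sketch under simplifying assumptions, not the KVM or ASID allocator: all names (ro_ctx, ro_get_vmid, ro_pin) are made up, and unlike the real ASID allocator it makes no attempt to let an unpinned context keep its old value after rollover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RO_VMID_COUNT 4		/* usable VMIDs are 1..3 in this toy */

struct ro_ctx {
	uint16_t vmid;
	uint32_t generation;	/* generation in which vmid was assigned */
	bool pinned;
};

static uint32_t ro_generation = 1;
static bool ro_used[RO_VMID_COUNT];
static bool ro_pinned[RO_VMID_COUNT];

/* Start a new generation: only pinned VMIDs stay reserved. */
static void ro_rollover(void)
{
	ro_generation++;
	for (int v = 0; v < RO_VMID_COUNT; v++)
		ro_used[v] = ro_pinned[v];
}

/* Return a VMID valid in the current generation, or -1 if none is left. */
static int ro_get_vmid(struct ro_ctx *c)
{
	if (c->generation == ro_generation)
		return c->vmid;
	if (c->pinned) {
		/* Pinned VMIDs survive rollover unchanged. */
		c->generation = ro_generation;
		return c->vmid;
	}
	for (int pass = 0; pass < 2; pass++) {
		for (uint16_t v = 1; v < RO_VMID_COUNT; v++) {
			if (!ro_used[v]) {
				ro_used[v] = true;
				c->vmid = v;
				c->generation = ro_generation;
				return v;
			}
		}
		if (pass == 0)
			ro_rollover();
	}
	return -1;	/* every VMID is pinned */
}

/* Pin the context's VMID so rollover never changes it. */
static int ro_pin(struct ro_ctx *c)
{
	if (ro_get_vmid(c) < 0)
		return -1;
	c->pinned = true;
	ro_pinned[c->vmid] = true;
	return c->vmid;
}
```

With one pinned context and three unpinned ones competing for three VMIDs, the pinned context keeps its value across the rollover, while an unpinned one can come back with a different VMID. For an SMMU user, that second case is what would force an STE update.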
The problem we have with VFIO_TYPE1_NESTING_IOMMU might be solved by the upcoming deprecation of VFIO_*_IOMMU [2]. We need a specific sequence from userspace:

1. Attach VFIO group to KVM (KVM_DEV_VFIO_GROUP_ADD)
2. Create nesting IOMMU domain and attach the group to it
   (VFIO_GROUP_SET_CONTAINER, VFIO_SET_IOMMU becomes
    IOMMU_IOASID_ALLOC, VFIO_DEVICE_ATTACH_IOASID)

Currently QEMU does 2 then 1, which would cause the SMMU to allocate a separate VMID. If we wanted to extend VFIO_TYPE1_NESTING_IOMMU with PASID tables we'd need to mandate 1-2 and may break existing users. In the new design we can require from the start that creating a nesting IOMMU container through /dev/iommu *must* come with a KVM context, that way we're sure to reuse the existing VMID.
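The ordering dependency can be modelled in a few lines. This is a toy model only: it issues no ioctls, and the struct, functions and SEQ_PRIVATE_VMID value are all hypothetical. It assumes the VMID is chosen once, at the time the nesting domain is created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ_PRIVATE_VMID 0x0001	/* hypothetical SMMU-private VMID */

struct seq_group {
	bool has_kvm;		/* set by step 1 */
	uint16_t kvm_vmid;	/* VMID of the VM the group belongs to */
	bool attached;		/* set by step 2 */
	uint16_t smmu_vmid;	/* VMID chosen when the domain is created */
};

/* Step 1: associate the group with a KVM context. */
static void seq_kvm_group_add(struct seq_group *g, uint16_t kvm_vmid)
{
	g->has_kvm = true;
	g->kvm_vmid = kvm_vmid;
}

/*
 * Step 2: create the nesting domain and attach the group. Without a KVM
 * context at this point, the SMMU falls back to a private VMID.
 */
static void seq_attach_nesting(struct seq_group *g)
{
	g->smmu_vmid = g->has_kvm ? g->kvm_vmid : SEQ_PRIVATE_VMID;
	g->attached = true;
}

/* True if the SMMU ended up using the VM's own VMID. */
static bool seq_shares_vmid(const struct seq_group *g)
{
	return g->attached && g->has_kvm && g->smmu_vmid == g->kvm_vmid;
}
```

Calling the two steps in 1-2 order yields a shared VMID. The 2-1 order leaves the domain on the private VMID even after the KVM context shows up.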
Thanks,
Jean
[2] https://lore.kernel.org/linux-iommu/BN9PR11MB5433B1E4AE5B0480369F97178C189@B...