[Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio

14 May 2021

On Thu, 13 May 2021 17:52:56 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote:
...
Hi Alex,
...
-----Original Message-----
From: Alex Williamson [mailto:alex.williamson@redhat.com]
Sent: 13 May 2021 18:04
To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>; liulongfang
<liulongfang@huawei.com>; cohuck@redhat.com;
linux-kernel@vger.kernel.org; linuxarm@openeuler.org
Subject: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to
vfio
On Thu, 13 May 2021 15:49:25 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
wrote:
...
...
-----Original Message-----
From: Jason Gunthorpe [mailto:jgg@nvidia.com]
Sent: 13 May 2021 14:44
To: liulongfang <liulongfang@huawei.com>
Cc: Alex Williamson <alex.williamson@redhat.com>; cohuck@redhat.com;
linux-kernel@vger.kernel.org; linuxarm@openeuler.org
Subject: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to
vfio
On Thu, May 13, 2021 at 10:08:28AM +0800, liulongfang wrote:
...
On 2021/5/12 20:10, Jason Gunthorpe wrote:
...
On Wed, May 12, 2021 at 04:39:43PM +0800, liulongfang wrote:
> Therefore, this method of limiting the length of the BAR
> configuration space can prevent unsafe operations of the memory.
The issue is DMA controlled by the guest accessing the secure BAR
area, not the guest CPU.
Jason
.
This secure BAR area is not presented to the Guest,
which makes it impossible for the Guest to obtain the secure BAR area
when establishing the DMA mapping of the configuration space.
If the DMA controller accesses the secure BAR area, the access will
be blocked by the SMMU.
There are scenarios where this is not true.
At a minimum the mdev driver should refuse to work in those cases.
Hi,
I think the idea here is not a generic solution, but a quirk for this specific dev.
Something like,

--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -866,7 +866,12 @@ static long vfio_pci_ioctl(struct vfio_device  
*core_vdev,  
                        break;
                case VFIO_PCI_BAR0_REGION_INDEX ...  
VFIO_PCI_BAR5_REGION_INDEX:  
                        info.offset =  
VFIO_PCI_INDEX_TO_OFFSET(info.index);  
-                       info.size = pci_resource_len(pdev, info.index);
+
+                       if (check_hisi_acc_quirk(pdev, info))
+                               info.size = new_size;// BAR is limited  
without migration region.  
+                       else
+                               info.size = pci_resource_len(pdev,  
info.index);  
+
                        if (!info.size) {
                                info.flags = 0;
                                break;
Is this an acceptable/workable solution here?
As Jason says, this only restricts CPU access to the BAR, the issue is
DMA access.  As the hardware vendor you may be able to guarantee that
a DMA transaction generated by the device targeting the remainder of
the BAR will always go upstream, but can you guarantee the routing
between the device and the SMMU?  For instance if this device can be
implemented as a plugin card, then it can be installed into a
downstream port that may not support ACS.  That downstream port may
implement request redirection allowing the transaction to reflect back
to the device without IOMMU translation.  At that point the userspace
driver can target the kernel driver half of the BAR and potentially
expose a security risk.  Thanks,
The ACC devices on this platform are not pluggable devices. They are exposed
as integrated endpoint devices. So I am not sure the above concern is valid in this
case.
I had a look at the userspace driver approach you suggested. But unfortunately
the migration state change for the vf has to check some of the pf registers for
confirming the state. So even if we move the implementation to Qemu, we
still may have to use the migration uAPI to access the pf device registers.
Since the devices we are concerned here are all integrated endpoints and if the 
above quirk is an acceptable one, then we can use the uAPI as done in this
series without overly complicating things here.
If you expect this device to appear only as an integrated endpoint, then
I think Jason's suggestion above is correct.  Your driver that supports
migration can refuse to load for devices there the topology is other
than expected and you're effectively guaranteeing DMA isolation of the
user and in-kernel drivers by hardware DMA semantics and topology.

Requiring access to the PF to support the migration protocol also
suggests that an in-kernel driver to support migration is our best
option.  Thanks,

Alex