From: Robin Murphy <robin.murphy@arm.com>
To: Auger Eric <eric.auger@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
iommu@lists.linux-foundation.org,
virtualization@lists.linux-foundation.org
Cc: Alex Williamson <alex.williamson@redhat.com>,
Bharat Bhushan <bbhushan2@marvell.com>
Subject: Re: [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE
Date: Wed, 18 Mar 2020 17:13:59 +0000 [thread overview]
Message-ID: <5b637bf5-dc49-b6eb-852a-a4317602d43e@arm.com> (raw)
In-Reply-To: <09a32736-ea01-21f9-6bd5-9344b368f90a@redhat.com>
On 2020-03-18 4:14 pm, Auger Eric wrote:
> Hi,
>
> On 3/18/20 1:00 PM, Robin Murphy wrote:
>> On 2020-03-18 11:40 am, Jean-Philippe Brucker wrote:
>>> We don't currently support IOMMUs with a page granule larger than the
>>> system page size. The IOVA allocator has a BUG_ON() in this case, and
>>> VFIO has a WARN_ON().
>
> Adding Alex in CC in case he has time to jump in. At the moment I don't
> get why this WARN_ON() is here.
>
> This was introduced in
> commit c8dbca165bb0 ("vfio/iommu_type1: Avoid overflow")
>
>>>
>>> It might be possible to remove these obstacles if necessary. If the host
>>> uses 64kB pages and the guest uses 4kB, then a device driver calling
>>> alloc_page() followed by dma_map_page() will create a 64kB mapping for a
>>> 4kB physical page, allowing the endpoint to access the neighbouring 60kB
>>> of memory. This problem could be worked around with bounce buffers.
>>
>> FWIW the fundamental issue is that callers of iommu_map() may expect to
>> be able to map two or more page-aligned regions directly adjacent to
>> each other for scatter-gather purposes (or ring buffer tricks), and
>> that's just not possible if the IOMMU granule is too big. Bounce
>> buffering would be a viable workaround for the streaming DMA API and
>> certain similar use-cases, but not in general (e.g. coherent DMA, VFIO,
>> GPUs, etc.)
>>
>> Robin.
>>
>>> For the moment, rather than triggering the IOVA BUG_ON() on mismatched
>>> page sizes, abort the virtio-iommu probe with an error message.
>
> I understand this is introduced as a temporary solution, but it sounds
> like an important limitation to me. For instance, it will prevent
> running a Fedora guest exposed with a virtio-iommu on a RHEL host.
As above, even if you bypassed all the warnings, it wouldn't really work
properly anyway. In all cases that wouldn't be considered broken, the
underlying hardware IOMMUs should support the same set of granules as
the CPUs (or at least the smallest one), so is it actually appropriate
for RHEL to (presumably) expose a 64K granule in the first place, rather
than "works with anything" 4K? And/or more generally is there perhaps a
hole in the virtio-iommu spec WRT being able to negotiate page_size_mask
for a particular granule if multiple options are available?
Robin.
>
> Thanks
>
> Eric
>>>
>>> Reported-by: Bharat Bhushan <bbhushan2@marvell.com>
>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>> ---
>>> drivers/iommu/virtio-iommu.c | 9 +++++++++
>>> 1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
>>> index 6d4e3c2a2ddb..80d5d8f621ab 100644
>>> --- a/drivers/iommu/virtio-iommu.c
>>> +++ b/drivers/iommu/virtio-iommu.c
>>> @@ -998,6 +998,7 @@ static int viommu_probe(struct virtio_device *vdev)
>>>  	struct device *parent_dev = vdev->dev.parent;
>>>  	struct viommu_dev *viommu = NULL;
>>>  	struct device *dev = &vdev->dev;
>>> +	unsigned long viommu_page_size;
>>>  	u64 input_start = 0;
>>>  	u64 input_end = -1UL;
>>>  	int ret;
>>> @@ -1028,6 +1029,14 @@ static int viommu_probe(struct virtio_device *vdev)
>>>  		goto err_free_vqs;
>>>  	}
>>> +	viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
>>> +	if (viommu_page_size > PAGE_SIZE) {
>>> +		dev_err(dev, "granule 0x%lx larger than system page size 0x%lx\n",
>>> +			viommu_page_size, PAGE_SIZE);
>>> +		ret = -EINVAL;
>>> +		goto err_free_vqs;
>>> +	}
>>> +
>>>  	viommu->map_flags = VIRTIO_IOMMU_MAP_F_READ |
>>> 			    VIRTIO_IOMMU_MAP_F_WRITE;
>>>  	viommu->last_domain = ~0U;
>>>
>>
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Thread overview: 8+ messages
2020-03-18 11:40 [PATCH] iommu/virtio: Reject IOMMU page granule larger than PAGE_SIZE Jean-Philippe Brucker
2020-03-18 12:00 ` Robin Murphy
2020-03-18 16:14 ` Auger Eric
2020-03-18 17:06 ` Jean-Philippe Brucker
2020-03-18 17:13 ` Robin Murphy [this message]
2020-03-18 17:47 ` Jean-Philippe Brucker
2020-03-19 18:54 ` Alex Williamson
2020-03-25 21:05 ` Auger Eric