* [RFC] Simple device assignment with VFIO platform
@ 2024-09-27 16:17 Mostafa Saleh
2024-09-30 7:19 ` Yi Liu
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Mostafa Saleh @ 2024-09-27 16:17 UTC
To: kvm, open list
Cc: Eric Auger, Alex Williamson, kwankhede, Marc Zyngier, Will Deacon,
Quentin Perret
Hi All,
Background
==========
I have been looking into assigning simple devices which are not DMA
capable to VMs on Android using VFIO platform.
I have mainly been looking at this with respect to Protected KVM (pKVM),
which would need some extra modifications, mostly to KVM-VFIO. That work
is still at an early prototyping stage and has core pKVM dependencies
pending upstream, such as guest memfd[1] and IOMMU support[2].
However, this problem is not pKVM (or KVM) specific; it is about the
design of VFIO.
[1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
[2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
Problem
=======
At the moment, VFIO platform will refuse to probe a device (through
vfio_group_find_or_alloc()) if it’s not part of an IOMMU group,
unless CONFIG_VFIO_NOIOMMU is configured.
As far as I understand, the current solutions to pass through platform
devices that are not DMA capable are:
- Use VFIO platform + CONFIG_VFIO_NOIOMMU: The problem with that is that
it taints the kernel, and it doesn’t actually fit the device description:
the device doesn’t merely lack an IOMMU, it is not DMA capable at
all, so the kernel should be safe assigning the device without
DMA isolation.
- Use VFIO mdev with an emulated IOMMU: This seems like it could work, but
much of the code would duplicate the VFIO platform code, as the
device is a platform device.
- Use UIO: It can map MMIO to userspace, but it seems to be aimed at
userspace drivers rather than VM passthrough, and I can’t find
support for it in QEMU.
One other benefit of supporting this in VFIO platform is that we can
use the existing UAPI for platform devices (and the support in VMMs).
Proposal
========
Extend VFIO platform to allow assigning devices without an IOMMU. This
could possibly be done by:
- Checking the device capability from the platform bus (this would be
something ACPI/OF specific, similar to how DMA is configured from
platform_dma_configure(); we can add a new function, something like
platform_dma_capable()).
- Using an emulated IOMMU for such devices
(vfio_register_emulated_iommu_dev()), instead of making intrusive
changes around the existence of IOMMUs.
If that makes sense, I can work on an RFC (I don’t have any code at the moment).
Thanks,
Mostafa
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-27 16:17 [RFC] Simple device assignment with VFIO platform Mostafa Saleh
@ 2024-09-30 7:19 ` Yi Liu
2024-09-30 12:18 ` Mostafa Saleh
2024-09-30 8:19 ` Tian, Kevin
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Yi Liu @ 2024-09-30 7:19 UTC
To: Mostafa Saleh, kvm, open list
Cc: Eric Auger, Alex Williamson, kwankhede, Marc Zyngier, Will Deacon,
Quentin Perret
On 2024/9/28 00:17, Mostafa Saleh wrote:
> Hi All,
>
> Background
> ==========
> I have been looking into assigning simple devices which are not DMA
> capable to VMs on Android using VFIO platform.
>
> I have been mainly looking with respect to Protected KVM (pKVM), which
> would need some extra modifications mostly to KVM-VFIO, that is quite
> early under prototyping at the moment, which have core pending pKVM
> dependencies upstream as guest memfd[1] and IOMMUs support[2].
>
> However, this problem is not pKVM(or KVM) specific, and about the
> design of VFIO.
>
> [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
>
> Problem
> =======
> At the moment, VFIO platform will deny a device from probing (through
> vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> unless (CONFIG_VFIO_NOIOMMU is configured)
So your device does not have an IOMMU, and it also does not do DMA at all?
> As far as I understand the current solutions to pass through platform
> devices that are not DMA capable are:
> - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> taints the kernel and this doesn’t actually fit the device description
> as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> all, so the kernel should be safe with assigning the device without
> DMA isolation.
You need to set the vfio noiommu module parameter
(enable_unsafe_noiommu_mode) as well. Yes, this would give your device a
fake iommu group, but the kernel would report that this taints it.
>
> - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> many of the code would be duplicate with the VFIO platform code as the
> device is a platform device.
>
> - Use UIO: Can map MMIO to userspace which seems to be focused for
> userspace drivers rather than VM passthrough and I can’t find its
> support in Qemu.
QEMU is for device passthrough, so it makes sense that it needs to use VFIO
without noiommu instead of UIO. The link below has more explanation.
https://wiki.qemu.org/Features/VT-d
With the introduction of vfio cdev, you can compile the vfio group code
out with CONFIG_VFIO_GROUP=n. Supposedly, you will then not be blocked by
vfio_group_find_or_alloc(), but you might still be blocked because no
IOMMU is present. It may be worth a try, though.
> One other benefit from supporting this in VFIO platform, that we can
> use the existing UAPI for platform devices (and support in VMMs)
>
> Proposal
> ========
> Extend VFIO platform to allow assigning devices without an IOMMU, this
> can be possibly done by
> - Checking device capability from the platform bus (would be something
> ACPI/OF specific similar to how it configures DMA from
> platform_dma_configure(), we can add a new function something like
> platfrom_dma_capable())
>
> - Using emulated IOMMU for such devices
> (vfio_register_emulated_iommu_dev()), instead of having intrusive
> changes about IOMMUs existence.
Is this the mdev approach listed above?
--
Regards,
Yi Liu
* RE: [RFC] Simple device assignment with VFIO platform
2024-09-27 16:17 [RFC] Simple device assignment with VFIO platform Mostafa Saleh
2024-09-30 7:19 ` Yi Liu
@ 2024-09-30 8:19 ` Tian, Kevin
2024-09-30 12:26 ` Mostafa Saleh
2024-09-30 13:05 ` Eric Auger
2024-09-30 17:10 ` Alex Williamson
3 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2024-09-30 8:19 UTC
To: Mostafa Saleh, kvm@vger.kernel.org, open list
Cc: Eric Auger, Alex Williamson, kwankhede@nvidia.com, Marc Zyngier,
Will Deacon, Quentin Perret
> From: Mostafa Saleh <smostafa@google.com>
> Sent: Saturday, September 28, 2024 12:17 AM
>
> Hi All,
>
> Background
> ==========
> I have been looking into assigning simple devices which are not DMA
> capable to VMs on Android using VFIO platform.
>
> I have been mainly looking with respect to Protected KVM (pKVM), which
> would need some extra modifications mostly to KVM-VFIO, that is quite
> early under prototyping at the moment, which have core pending pKVM
> dependencies upstream as guest memfd[1] and IOMMUs support[2].
>
> However, this problem is not pKVM(or KVM) specific, and about the
> design of VFIO.
>
> [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
>
> Problem
> =======
> At the moment, VFIO platform will deny a device from probing (through
> vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> unless (CONFIG_VFIO_NOIOMMU is configured)
>
> As far as I understand the current solutions to pass through platform
> devices that are not DMA capable are:
> - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> taints the kernel and this doesn’t actually fit the device description
> as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> all, so the kernel should be safe with assigning the device without
> DMA isolation.
>
> - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> many of the code would be duplicate with the VFIO platform code as the
> device is a platform device.
The emulated IOMMU is not tied to mdev:
/*
* Virtual device without IOMMU backing. The VFIO core fakes up an
* iommu_group as the iommu_group sysfs interface is part of the
* userspace ABI. The user of these devices must not be able to
* directly trigger unmediated DMA.
*/
VFIO_EMULATED_IOMMU,
Except that it's not a virtual device, it does match the last sentence:
such a device cannot trigger unmediated DMA.
>
> - Use UIO: Can map MMIO to userspace which seems to be focused for
> userspace drivers rather than VM passthrough and I can’t find its
> support in Qemu.
>
> One other benefit from supporting this in VFIO platform, that we can
> use the existing UAPI for platform devices (and support in VMMs)
>
> Proposal
> ========
> Extend VFIO platform to allow assigning devices without an IOMMU, this
> can be possibly done by
> - Checking device capability from the platform bus (would be something
> ACPI/OF specific similar to how it configures DMA from
> platform_dma_configure(), we can add a new function something like
> platfrom_dma_capable())
>
> - Using emulated IOMMU for such devices
> (vfio_register_emulated_iommu_dev()), instead of having intrusive
> changes about IOMMUs existence.
>
> If that makes sense I can work on RFC(I don’t have any code at the moment)
This sounds like the best option off the top of my head for now...
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-30 7:19 ` Yi Liu
@ 2024-09-30 12:18 ` Mostafa Saleh
0 siblings, 0 replies; 10+ messages in thread
From: Mostafa Saleh @ 2024-09-30 12:18 UTC
To: Yi Liu
Cc: kvm, open list, Eric Auger, Alex Williamson, kwankhede,
Marc Zyngier, Will Deacon, Quentin Perret
Hi Yi,
Thanks a lot for the feedback.
On Mon, Sep 30, 2024 at 03:19:05PM +0800, Yi Liu wrote:
> On 2024/9/28 00:17, Mostafa Saleh wrote:
> > Hi All,
> >
> > Background
> > ==========
> > I have been looking into assigning simple devices which are not DMA
> > capable to VMs on Android using VFIO platform.
> >
> > I have been mainly looking with respect to Protected KVM (pKVM), which
> > would need some extra modifications mostly to KVM-VFIO, that is quite
> > early under prototyping at the moment, which have core pending pKVM
> > dependencies upstream as guest memfd[1] and IOMMUs support[2].
> >
> > However, this problem is not pKVM(or KVM) specific, and about the
> > design of VFIO.
> >
> > [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> > [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
> >
> > Problem
> > =======
> > At the moment, VFIO platform will deny a device from probing (through
> > vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> > unless (CONFIG_VFIO_NOIOMMU is configured)
>
> so your device does not have an IOMMU and also it does not do DMA at all?
Exactly.
>
> > As far as I understand the current solutions to pass through platform
> > devices that are not DMA capable are:
> > - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> > taints the kernel and this doesn’t actually fit the device description
> > as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> > all, so the kernel should be safe with assigning the device without
> > DMA isolation.
>
> you need to set the vfio_noiommu parameter as well. yes, this would give
> your device a fake iommu group. But the kernel would say this taints it.
Yes, that would be the main problem with this approach.
>
> >
> > - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> > many of the code would be duplicate with the VFIO platform code as the
> > device is a platform device.
> >
> > - Use UIO: Can map MMIO to userspace which seems to be focused for
> > userspace drivers rather than VM passthrough and I can’t find its
> > support in Qemu.
>
> QEMU is for device passthrough, it makes sense it needs to use the VFIO
> without noiommu instead of UIO. The below link has more explanations.
>
I agree. The reason I considered UIO is that it allows mmap of the device
MMIO, so in theory those mappings can be passed to KVM, but that might be
an abuse of the UAPI? Especially as I can’t find any VMM support for that.
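For reference, mapping device MMIO through UIO from userspace looks roughly like the sketch below. It is only an illustration under assumptions that are not from this thread: the device is already bound to a UIO driver, and the helper names (uio_map_offset(), uio_map_region()) are made up. UIO selects map N by passing an mmap offset of N * page size.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* UIO encodes the map index in the mmap offset: map N is at N * page size. */
off_t uio_map_offset(unsigned int map_index)
{
	return (off_t)map_index * (off_t)sysconf(_SC_PAGESIZE);
}

/*
 * Hypothetical helper: map `len` bytes of map `map_index` of the UIO
 * device node at `path` (for example "/dev/uio0").
 * Returns NULL on failure. The fd can be closed once the mapping exists.
 */
volatile uint32_t *uio_map_region(const char *path, unsigned int map_index,
				  size_t len)
{
	void *regs;
	int fd = open(path, O_RDWR | O_SYNC);

	if (fd < 0)
		return NULL;

	regs = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		    uio_map_offset(map_index));
	close(fd);

	return regs == MAP_FAILED ? NULL : (volatile uint32_t *)regs;
}
```

This only covers the userspace mapping; whether such a mapping should then be handed to KVM as a guest region is exactly the open UAPI question above.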
> https://wiki.qemu.org/Features/VT-d
>
> As the introduction of vfio cdev, you may compile the vfio group out
> by CONFIG_VFIO_GROUP==n. Supposedly, you will not be blocked by the
> vfio_group_find_or_alloc(). But you might be blocked due to no present
> iommu. You may have a try though.
That also fails at probe, because the device has no IOMMU and
vfio_platform_probe() ends up calling device_iommu_capable().
Also, AFAIU, using cdev must be tied to IOMMUFD, which assumes the
existence of an IOMMU and would fail at bind.
>
> > One other benefit from supporting this in VFIO platform, that we can
> > use the existing UAPI for platform devices (and support in VMMs)
> >
> > Proposal
> > ========
> > Extend VFIO platform to allow assigning devices without an IOMMU, this
> > can be possibly done by
> > - Checking device capability from the platform bus (would be something
> > ACPI/OF specific similar to how it configures DMA from
> > platform_dma_configure(), we can add a new function something like
> > platfrom_dma_capable())
> >
> > - Using emulated IOMMU for such devices
> > (vfio_register_emulated_iommu_dev()), instead of having intrusive
> > changes about IOMMUs existence.
>
> is it the mdev approach listed in the above?
No, I meant to extend VFIO-platform to understand that, so in case the
device has no IOMMU group, it can check with the platform bus whether the
device is DMA capable or not, and it can allow those devices (maybe
gated behind a command line option) to be probed with an emulated group.
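To make the intended flow concrete, here is a small userspace model of that probe decision. It is a sketch under assumptions, not kernel code: every name in it (dev_model, policy_model, decide_probe() and the outcome values) is hypothetical, and the dma_capable field stands in for whatever a platform_dma_capable()-style check on the platform bus would report.

```c
#include <stdbool.h>

/* Possible results of probing a platform device for assignment (model only). */
enum probe_outcome {
	PROBE_IOMMU_GROUP,	/* device has a real IOMMU group: today's path */
	PROBE_EMULATED_IOMMU,	/* no IOMMU, no DMA: use an emulated group */
	PROBE_NOIOMMU_TAINTED,	/* no IOMMU, noiommu mode: kernel is tainted */
	PROBE_REJECTED,		/* no IOMMU and nothing permits assignment */
};

struct dev_model {
	bool has_iommu_group;	/* device is part of an IOMMU group */
	bool dma_capable;	/* result of the hypothetical bus-level check */
};

struct policy_model {
	bool allow_non_dma;	/* hypothetical command-line gate */
	bool noiommu_enabled;	/* noiommu mode configured and enabled */
};

enum probe_outcome decide_probe(const struct dev_model *dev,
				const struct policy_model *pol)
{
	/* A real IOMMU group always wins; nothing changes for such devices. */
	if (dev->has_iommu_group)
		return PROBE_IOMMU_GROUP;

	/* Proposed new path: provably non-DMA devices get an emulated group. */
	if (!dev->dma_capable && pol->allow_non_dma)
		return PROBE_EMULATED_IOMMU;

	/* Existing fallback: noiommu mode, at the cost of tainting the kernel. */
	if (pol->noiommu_enabled)
		return PROBE_NOIOMMU_TAINTED;

	return PROBE_REJECTED;
}
```

In this model a DMA-capable device without an IOMMU never reaches the emulated-group path, which is the property the bus-level check would have to guarantee.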
Having a new mdev just for platform devices with no DMA capabilities might
be good if we don’t want to change VFIO platform, but my main concern is that
it might need more code and duplicate parts of VFIO-platform.
Thanks,
Mostafa
> --
> Regards,
> Yi Liu
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-30 8:19 ` Tian, Kevin
@ 2024-09-30 12:26 ` Mostafa Saleh
0 siblings, 0 replies; 10+ messages in thread
From: Mostafa Saleh @ 2024-09-30 12:26 UTC
To: Tian, Kevin
Cc: kvm@vger.kernel.org, open list, Eric Auger, Alex Williamson,
kwankhede@nvidia.com, Marc Zyngier, Will Deacon, Quentin Perret
Hi Tian,
On Mon, Sep 30, 2024 at 08:19:45AM +0000, Tian, Kevin wrote:
> > From: Mostafa Saleh <smostafa@google.com>
> > Sent: Saturday, September 28, 2024 12:17 AM
> >
> > Hi All,
> >
> > Background
> > ==========
> > I have been looking into assigning simple devices which are not DMA
> > capable to VMs on Android using VFIO platform.
> >
> > I have been mainly looking with respect to Protected KVM (pKVM), which
> > would need some extra modifications mostly to KVM-VFIO, that is quite
> > early under prototyping at the moment, which have core pending pKVM
> > dependencies upstream as guest memfd[1] and IOMMUs support[2].
> >
> > However, this problem is not pKVM(or KVM) specific, and about the
> > design of VFIO.
> >
> > [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> > [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
> >
> > Problem
> > =======
> > At the moment, VFIO platform will deny a device from probing (through
> > vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> > unless (CONFIG_VFIO_NOIOMMU is configured)
> >
> > As far as I understand the current solutions to pass through platform
> > devices that are not DMA capable are:
> > - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> > taints the kernel and this doesn’t actually fit the device description
> > as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> > all, so the kernel should be safe with assigning the device without
> > DMA isolation.
> >
> > - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> > many of the code would be duplicate with the VFIO platform code as the
> > device is a platform device.
>
> emulated IOMMU is not tied to mdev:
Makes sense. I see it’s also used by a couple of other drivers, so in
that case it can be just a driver and not an mdev.
>
> /*
> * Virtual device without IOMMU backing. The VFIO core fakes up an
> * iommu_group as the iommu_group sysfs interface is part of the
> * userspace ABI. The user of these devices must not be able to
> * directly trigger unmediated DMA.
> */
> VFIO_EMULATED_IOMMU,
>
> Except it's not a virtual device, it does match the last sentence that
> such device cannot trigger unmediated DMA.
>
> >
> > - Use UIO: Can map MMIO to userspace which seems to be focused for
> > userspace drivers rather than VM passthrough and I can’t find its
> > support in Qemu.
> >
> > One other benefit from supporting this in VFIO platform, that we can
> > use the existing UAPI for platform devices (and support in VMMs)
> >
> > Proposal
> > ========
> > Extend VFIO platform to allow assigning devices without an IOMMU, this
> > can be possibly done by
> > - Checking device capability from the platform bus (would be something
> > ACPI/OF specific similar to how it configures DMA from
> > platform_dma_configure(), we can add a new function something like
> > platfrom_dma_capable())
> >
> > - Using emulated IOMMU for such devices
> > (vfio_register_emulated_iommu_dev()), instead of having intrusive
> > changes about IOMMUs existence.
> >
> > If that makes sense I can work on RFC(I don’t have any code at the moment)
>
> This sounds the best option out of my head now...
Thanks a lot for the feedback. I will work on RFC patches unless someone
strongly disagrees with the approach.
Thanks,
Mostafa
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-27 16:17 [RFC] Simple device assignment with VFIO platform Mostafa Saleh
2024-09-30 7:19 ` Yi Liu
2024-09-30 8:19 ` Tian, Kevin
@ 2024-09-30 13:05 ` Eric Auger
2024-09-30 13:33 ` Mostafa Saleh
2024-10-01 9:44 ` Mostafa Saleh
2024-09-30 17:10 ` Alex Williamson
3 siblings, 2 replies; 10+ messages in thread
From: Eric Auger @ 2024-09-30 13:05 UTC
To: Mostafa Saleh, kvm, open list
Cc: Alex Williamson, kwankhede, Marc Zyngier, Will Deacon,
Quentin Perret
Hi Mostafa,
On 9/27/24 18:17, Mostafa Saleh wrote:
> Hi All,
>
> Background
> ==========
> I have been looking into assigning simple devices which are not DMA
> capable to VMs on Android using VFIO platform.
>
> I have been mainly looking with respect to Protected KVM (pKVM), which
> would need some extra modifications mostly to KVM-VFIO, that is quite
> early under prototyping at the moment, which have core pending pKVM
> dependencies upstream as guest memfd[1] and IOMMUs support[2].
>
> However, this problem is not pKVM(or KVM) specific, and about the
> design of VFIO.
>
> [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
>
> Problem
> =======
> At the moment, VFIO platform will deny a device from probing (through
> vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> unless (CONFIG_VFIO_NOIOMMU is configured)
>
> As far as I understand the current solutions to pass through platform
> devices that are not DMA capable are:
> - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> taints the kernel and this doesn’t actually fit the device description
> as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> all, so the kernel should be safe with assigning the device without
> DMA isolation.
>
> - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> many of the code would be duplicate with the VFIO platform code as the
> device is a platform device.
>
> - Use UIO: Can map MMIO to userspace which seems to be focused for
> userspace drivers rather than VM passthrough and I can’t find its
> support in Qemu.
In case you did not have this reference, you may have a look at Alex'
reply in
https://patchew.org/QEMU/1518189456-2873-1-git-send-email-geert+renesas@glider.be/1518189456-2873-5-git-send-email-geert+renesas@glider.be/
>
> One other benefit from supporting this in VFIO platform, that we can
> use the existing UAPI for platform devices (and support in VMMs)
>
> Proposal
> ========
> Extend VFIO platform to allow assigning devices without an IOMMU, this
> can be possibly done by
> - Checking device capability from the platform bus (would be something
> ACPI/OF specific similar to how it configures DMA from
> platform_dma_configure(), we can add a new function something like
> platfrom_dma_capable())
>
> - Using emulated IOMMU for such devices
> (vfio_register_emulated_iommu_dev()), instead of having intrusive
> changes about IOMMUs existence.
>
> If that makes sense I can work on RFC(I don’t have any code at the moment)
So if I understand correctly, assuming you are able to safely detect that
the device is not DMA capable, you would use the
vfio_register_emulated_iommu_dev() trick. Is that correct?
Thanks
Eric
>
> Thanks,
> Mostafa
>
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-30 13:05 ` Eric Auger
@ 2024-09-30 13:33 ` Mostafa Saleh
2024-10-01 9:44 ` Mostafa Saleh
1 sibling, 0 replies; 10+ messages in thread
From: Mostafa Saleh @ 2024-09-30 13:33 UTC
To: Eric Auger
Cc: kvm, open list, Alex Williamson, kwankhede, Marc Zyngier,
Will Deacon, Quentin Perret
Hi Eric,
On Mon, Sep 30, 2024 at 03:05:38PM +0200, Eric Auger wrote:
> Hi Mostafa,
>
> On 9/27/24 18:17, Mostafa Saleh wrote:
> > Hi All,
> >
> > Background
> > ==========
> > I have been looking into assigning simple devices which are not DMA
> > capable to VMs on Android using VFIO platform.
> >
> > I have been mainly looking with respect to Protected KVM (pKVM), which
> > would need some extra modifications mostly to KVM-VFIO, that is quite
> > early under prototyping at the moment, which have core pending pKVM
> > dependencies upstream as guest memfd[1] and IOMMUs support[2].
> >
> > However, this problem is not pKVM(or KVM) specific, and about the
> > design of VFIO.
> >
> > [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> > [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
> >
> > Problem
> > =======
> > At the moment, VFIO platform will deny a device from probing (through
> > vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> > unless (CONFIG_VFIO_NOIOMMU is configured)
> >
> > As far as I understand the current solutions to pass through platform
> > devices that are not DMA capable are:
> > - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> > taints the kernel and this doesn’t actually fit the device description
> > as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> > all, so the kernel should be safe with assigning the device without
> > DMA isolation.
> >
> > - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> > many of the code would be duplicate with the VFIO platform code as the
> > device is a platform device.
> >
> > - Use UIO: Can map MMIO to userspace which seems to be focused for
> > userspace drivers rather than VM passthrough and I can’t find its
> > support in Qemu.
> In case you did not have this reference, you may have a look at Alex'
> reply in
> https://patchew.org/QEMU/1518189456-2873-1-git-send-email-geert+renesas@glider.be/1518189456-2873-5-git-send-email-geert+renesas@glider.be/
> >
> > One other benefit from supporting this in VFIO platform, that we can
> > use the existing UAPI for platform devices (and support in VMMs)
> >
> > Proposal
> > ========
> > Extend VFIO platform to allow assigning devices without an IOMMU, this
> > can be possibly done by
> > - Checking device capability from the platform bus (would be something
> > ACPI/OF specific similar to how it configures DMA from
> > platform_dma_configure(), we can add a new function something like
> > platfrom_dma_capable())
> >
> > - Using emulated IOMMU for such devices
> > (vfio_register_emulated_iommu_dev()), instead of having intrusive
> > changes about IOMMUs existence.
> >
> > If that makes sense I can work on RFC(I don’t have any code at the moment)
> So if I understand correctly, assuming you are able to safely detect the
> device is not DMA capable you would use the
>
> vfio_register_emulated_iommu_dev() trick. Is that correct?
Yes, my guess is that the challenge here would be discovering that
through the platform bus.
Thanks,
Mostafa
>
> Thanks
>
> Eric
>
> >
> > Thanks,
> > Mostafa
> >
>
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-27 16:17 [RFC] Simple device assignment with VFIO platform Mostafa Saleh
` (2 preceding siblings ...)
2024-09-30 13:05 ` Eric Auger
@ 2024-09-30 17:10 ` Alex Williamson
2024-10-01 10:15 ` Mostafa Saleh
3 siblings, 1 reply; 10+ messages in thread
From: Alex Williamson @ 2024-09-30 17:10 UTC
To: Mostafa Saleh
Cc: kvm, open list, Eric Auger, kwankhede, Marc Zyngier, Will Deacon,
Quentin Perret
On Fri, 27 Sep 2024 17:17:02 +0100
Mostafa Saleh <smostafa@google.com> wrote:
> Hi All,
>
> Background
> ==========
> I have been looking into assigning simple devices which are not DMA
> capable to VMs on Android using VFIO platform.
>
> I have been mainly looking with respect to Protected KVM (pKVM), which
> would need some extra modifications mostly to KVM-VFIO, that is quite
> early under prototyping at the moment, which have core pending pKVM
> dependencies upstream as guest memfd[1] and IOMMUs support[2].
>
> However, this problem is not pKVM(or KVM) specific, and about the
> design of VFIO.
>
> [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
>
> Problem
> =======
> At the moment, VFIO platform will deny a device from probing (through
> vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> unless (CONFIG_VFIO_NOIOMMU is configured)
>
> As far as I understand the current solutions to pass through platform
> devices that are not DMA capable are:
> - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> taints the kernel and this doesn’t actually fit the device description
> as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> all, so the kernel should be safe with assigning the device without
> DMA isolation.
If the device is not capable of DMA, then what do you get from using
vfio? Essentially the device is reduced to some MMIO ranges and
something to configure line level interrupt notification.
Traditionally this is the realm of UIO.
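As a rough illustration of the interrupt half of that UIO model: a userspace driver blocks in read() on the UIO device node, and each successful 4-byte read returns the total event count. The sketch below assumes a device already bound to a UIO driver; uio_wait_irq() is an illustrative name, not an existing API.

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Block until the UIO device behind `fd` signals an interrupt.
 * A UIO read transfers exactly 4 bytes holding the total event count.
 * Returns 0 on success, -1 on error, end of file, or a short read.
 */
int uio_wait_irq(int fd, uint32_t *event_count)
{
	ssize_t n = read(fd, event_count, sizeof(*event_count));

	return n == (ssize_t)sizeof(*event_count) ? 0 : -1;
}
```

A caller would typically open the device node once, loop on uio_wait_irq(), and service the device through its mapped MMIO ranges after each wakeup.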
> - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> many of the code would be duplicate with the VFIO platform code as the
> device is a platform device.
Per Eric's talk recently at KVM Forum[1] we're already at an inflection
point for vfio-platform. We're suffering from lack of contributions
for any current devices, agreement in the community to end it as a
failed experiment, while at the same time vendors quietly indicate they
depend on it. It seems that at a minimum, we can't support
vfio-platform like we do vfio-pci, where a meta driver pretends it can
support exposing any platform device. There's not enough definition to
a platform device. Therefore if vfio-platform is to survive, it's
probably going to need to do so through device specific drivers which
understand how a specific device operates, and potentially whether it
can or cannot perform DMA. That might mean that vfio-platform needs to
take the mdev or vfio-pci variant driver approach, and the code
duplication you're concerned about should instead be refactoring in
order to re-use the existing code from more device specific drivers.
> - Use UIO: Can map MMIO to userspace which seems to be focused for
> userspace drivers rather than VM passthrough and I can’t find its
> support in Qemu.
This would need to be device specific code on the QEMU side, so there's
probably not much to share here.
> One other benefit from supporting this in VFIO platform, that we can
> use the existing UAPI for platform devices (and support in VMMs)
But it's not like there's ubiquitous support for vfio-platform devices
in QEMU either. Each platform device needs hooks to at least setup
device tree entries to describe the device to the VM. AIUI, QEMU needs
to understand the device and how to describe it to the VM whether the
approach is vfio-platform or UIO.
> Proposal
> ========
> Extend VFIO platform to allow assigning devices without an IOMMU, this
> can be possibly done by
> - Checking device capability from the platform bus (would be something
> ACPI/OF specific similar to how it configures DMA from
> platform_dma_configure(), we can add a new function something like
> platfrom_dma_capable())
>
> - Using emulated IOMMU for such devices
> (vfio_register_emulated_iommu_dev()), instead of having intrusive
> changes about IOMMUs existence.
>
> If that makes sense I can work on RFC(I don’t have any code at the moment)
As noted in the thread referenced by Eric, I don't think we want to add
any sort of vfio no-iommu into QEMU. vfio-platform in particular is in
no position to drive such a feature. If you want to use vfio for this,
the most viable approach would seem to be one of using an emulated
IOMMU in a device specific context which can understand the device is
not capable of DMA. We likely need to let vfio-platform die as a generic
means to expose arbitrary platform devices. Thanks,
Alex
[1] https://www.youtube.com/watch?v=Q5BOSbtwRr8
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-30 13:05 ` Eric Auger
2024-09-30 13:33 ` Mostafa Saleh
@ 2024-10-01 9:44 ` Mostafa Saleh
1 sibling, 0 replies; 10+ messages in thread
From: Mostafa Saleh @ 2024-10-01 9:44 UTC
To: eric.auger
Cc: kvm, open list, Alex Williamson, kwankhede, Marc Zyngier,
Will Deacon, Quentin Perret
On Mon, Sep 30, 2024 at 2:05 PM Eric Auger <eric.auger@redhat.com> wrote:
>
> Hi Mostafa,
>
> On 9/27/24 18:17, Mostafa Saleh wrote:
> > Hi All,
> >
> > Background
> > ==========
> > I have been looking into assigning simple devices which are not DMA
> > capable to VMs on Android using VFIO platform.
> >
> > I have been mainly looking with respect to Protected KVM (pKVM), which
> > would need some extra modifications mostly to KVM-VFIO, that is quite
> > early under prototyping at the moment, which have core pending pKVM
> > dependencies upstream as guest memfd[1] and IOMMUs support[2].
> >
> > However, this problem is not pKVM(or KVM) specific, and about the
> > design of VFIO.
> >
> > [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> > [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
> >
> > Problem
> > =======
> > At the moment, VFIO platform will deny a device from probing (through
> > vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> > unless (CONFIG_VFIO_NOIOMMU is configured)
> >
> > As far as I understand the current solutions to pass through platform
> > devices that are not DMA capable are:
> > - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> > taints the kernel and this doesn’t actually fit the device description
> > as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> > all, so the kernel should be safe with assigning the device without
> > DMA isolation.
> >
> > - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> > many of the code would be duplicate with the VFIO platform code as the
> > device is a platform device.
> >
> > - Use UIO: Can map MMIO to userspace which seems to be focused for
> > userspace drivers rather than VM passthrough and I can’t find its
> > support in Qemu.
> In case you did not have this reference, you may have a look at Alex'
> reply in
> https://patchew.org/QEMU/1518189456-2873-1-git-send-email-geert+renesas@glider.be/1518189456-2873-5-git-send-email-geert+renesas@glider.be/
Sorry, I missed that in my last reply. I see; it seems VFIO-platform
is not the right place for this.
Thanks,
Mostafa
> >
> > One other benefit from supporting this in VFIO platform, that we can
> > use the existing UAPI for platform devices (and support in VMMs)
> >
> > Proposal
> > ========
> > Extend VFIO platform to allow assigning devices without an IOMMU, this
> > can be possibly done by
> > - Checking device capability from the platform bus (would be something
> > ACPI/OF specific similar to how it configures DMA from
> > platform_dma_configure(), we can add a new function something like
> > platfrom_dma_capable())
> >
> > - Using emulated IOMMU for such devices
> > (vfio_register_emulated_iommu_dev()), instead of having intrusive
> > changes about IOMMUs existence.
> >
> > If that makes sense I can work on RFC(I don’t have any code at the moment)
> So if I understand correctly, assuming you are able to safely detect the
> device is not DMA capable you would use the
>
> vfio_register_emulated_iommu_dev() trick. Is that correct?
>
> Thanks
>
> Eric
>
> >
> > Thanks,
> > Mostafa
> >
>
* Re: [RFC] Simple device assignment with VFIO platform
2024-09-30 17:10 ` Alex Williamson
@ 2024-10-01 10:15 ` Mostafa Saleh
0 siblings, 0 replies; 10+ messages in thread
From: Mostafa Saleh @ 2024-10-01 10:15 UTC (permalink / raw)
To: Alex Williamson
Cc: kvm, open list, Eric Auger, kwankhede, Marc Zyngier, Will Deacon,
Quentin Perret
Hi Alex,
On Mon, Sep 30, 2024 at 11:10:13AM -0600, Alex Williamson wrote:
> On Fri, 27 Sep 2024 17:17:02 +0100
> Mostafa Saleh <smostafa@google.com> wrote:
>
> > Hi All,
> >
> > Background
> > ==========
> > I have been looking into assigning simple devices which are not DMA
> > capable to VMs on Android using VFIO platform.
> >
> > I have been mainly looking with respect to Protected KVM (pKVM), which
> > would need some extra modifications mostly to KVM-VFIO, that is quite
> > early under prototyping at the moment, which have core pending pKVM
> > dependencies upstream as guest memfd[1] and IOMMUs support[2].
> >
> > However, this problem is not pKVM(or KVM) specific, and about the
> > design of VFIO.
> >
> > [1] https://lore.kernel.org/kvm/20240801090117.3841080-1-tabba@google.com/
> > [2] https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-philippe@linaro.org/
> >
> > Problem
> > =======
> > At the moment, VFIO platform will deny a device from probing (through
> > vfio_group_find_or_alloc()), if it’s not part of an IOMMU group,
> > unless (CONFIG_VFIO_NOIOMMU is configured)
> >
> > As far as I understand the current solutions to pass through platform
> > devices that are not DMA capable are:
> > - Use VFIO platform + (CONFIG_VFIO_NOIOMMU): The problem with that, it
> > taints the kernel and this doesn’t actually fit the device description
> > as the device doesn’t only have an IOMMU, but it’s not DMA capable at
> > all, so the kernel should be safe with assigning the device without
> > DMA isolation.
>
> If the device is not capable of DMA, then what do you get from using
> vfio? Essentially the device is reduced to some MMIO ranges and
> something to configure line level interrupt notification.
> Traditionally this is the realm of UIO.
My simplistic understanding was that VFIO mainly deals with device passthrough
to VMs, while UIO deals with userspace drivers.
Also, it seems that UIO lacks eventfd support for IRQs, which makes it
inefficient to use with KVM.
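
To make the eventfd point concrete, here is a minimal userspace sketch of the eventfd counter semantics that VFIO interrupt signalling and KVM irqfds build on. It is not from this thread and opens no VFIO, UIO, or KVM device; the helper name signal_and_drain() is hypothetical, and the writes stand in for what the kernel side would do when a device interrupt fires.

```c
#include <assert.h>      /* only needed for checks like the one in the note below */
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: signal an eventfd `n` times, then drain it once.
 * Returns the drained counter value, or -1 on error or if nothing was
 * pending (the descriptor is non-blocking, so an empty read fails).
 */
long signal_and_drain(unsigned int n)
{
    int fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (fd < 0)
        return -1;

    /* Each write adds its value to the 64-bit counter held by the kernel. */
    for (unsigned int i = 0; i < n; i++) {
        uint64_t one = 1;
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return -1;
        }
    }

    /* One read returns the accumulated count and resets it to zero. */
    uint64_t count = 0;
    ssize_t r = read(fd, &count, sizeof(count));
    close(fd);

    return r == (ssize_t)sizeof(count) ? (long)count : -1;
}
```

For example, assert(signal_and_drain(3) == 3) holds: three signals coalesce into one wakeup carrying the count. With VFIO, such a descriptor is what userspace hands to the kernel for a device IRQ, and the same descriptor can be registered with KVM as an irqfd so the interrupt reaches the guest without a round trip through the VMM.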
>
> > - Use VFIO mdev with an emulated IOMMU, this seems it could work. But
> > many of the code would be duplicate with the VFIO platform code as the
> > device is a platform device.
>
> Per Eric's talk recently at KVM Forum[1] we're already at an inflection
> point for vfio-platform. We're suffering from lack of contributions
> for any current devices, agreement in the community to end it as a
> failed experiment, while at the same time vendors quietly indicate they
> depend on it. It seems that at a minimum, we can't support
> vfio-platform like we do vfio-pci, where a meta driver pretends it can
> support exposing any platform device. There's not enough definition to
> a platform device. Therefore if vfio-platform is to survive, it's
> probably going to need to do so through device specific drivers which
> understand how a specific device operates, and potentially whether it
> can or cannot perform DMA. That might mean that vfio-platform needs to
> take the mdev or vfio-pci variant driver approach, and the code
> duplication you're concerned about should instead be refactoring in
> order to re-use the existing code from more device specific drivers.
I see, that makes sense, but I guess that won't progress much without
vendors contributing drivers :/
> > - Use UIO: Can map MMIO to userspace which seems to be focused for
> > userspace drivers rather than VM passthrough and I can’t find its
> > support in Qemu.
>
> This would need to be device specific code on the QEMU side, so there's
> probably not much to share here.
>
> > One other benefit from supporting this in VFIO platform, that we can
> > use the existing UAPI for platform devices (and support in VMMs)
>
> But it's not like there's ubiquitous support for vfio-platform devices
> in QEMU either. Each platform device needs hooks to at least setup
> device tree entries to describe the device to the VM. AIUI, QEMU needs
> to understand the device and how to describe it to the VM whether the
> approach is vfio-platform or UIO.
Makes sense, although it's sad there is no upstream support for any
device.
>
> > Proposal
> > ========
> > Extend VFIO platform to allow assigning devices without an IOMMU, this
> > can be possibly done by
> > - Checking device capability from the platform bus (would be something
> > ACPI/OF specific similar to how it configures DMA from
> > platform_dma_configure(), we can add a new function something like
> > platfrom_dma_capable())
> >
> > - Using emulated IOMMU for such devices
> > (vfio_register_emulated_iommu_dev()), instead of having intrusive
> > changes about IOMMUs existence.
> >
> > If that makes sense I can work on RFC(I don’t have any code at the moment)
>
> As noted in the thread referenced by Eric, I don't think we want to add
> any sort of vfio no-iommu into QEMU. vfio-platform in particular is in
> no position to drive such a feature. If you want to use vfio for this,
> the most viable approach would seem to be one of using an emulated
> IOMMU in a device specific context which can understand the device is
> not capable of DMA. We likely need to let vfio-platform die as a generic
> means to expose arbitrary platform devices. Thanks,
I see, thanks a lot for the feedback. It seems vfio-platform is not the
right place for this, and it'd be better to have separate drivers that
register vfio devices; maybe we can share more common code for platform
devices as this progresses.
Thanks,
Mostafa
>
> Alex
>
> [1] https://www.youtube.com/watch?v=Q5BOSbtwRr8
>
end of thread, other threads:[~2024-10-01 10:15 UTC | newest]
Thread overview: 10+ messages
2024-09-27 16:17 [RFC] Simple device assignment with VFIO platform Mostafa Saleh
2024-09-30 7:19 ` Yi Liu
2024-09-30 12:18 ` Mostafa Saleh
2024-09-30 8:19 ` Tian, Kevin
2024-09-30 12:26 ` Mostafa Saleh
2024-09-30 13:05 ` Eric Auger
2024-09-30 13:33 ` Mostafa Saleh
2024-10-01 9:44 ` Mostafa Saleh
2024-09-30 17:10 ` Alex Williamson
2024-10-01 10:15 ` Mostafa Saleh