* nested-smmuv3 topic for QEMU/libvirt, Nov 2024
@ 2024-11-01 4:09 Nicolin Chen
2024-11-01 11:55 ` Jason Gunthorpe
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Nicolin Chen @ 2024-11-01 4:09 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc, arighi, ianm, jan, mochs
Hi,
This continues last month's discussion:
https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
Kernel changes are getting closer to merging, as Jason is planning to
take the vIOMMU series and the smmuv3_nesting series into the iommufd tree:
https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
So, I think it's probably a good time to align with each other and
talk about kicking off some QEMU upstream work in the month ahead.
@Shameer,
Do you have some update on the pluggable smmuv3 module?
Updates on my side:
1) I have kept the uAPI updated to the latest version and verified it.
Some polishing changes will likely be needed, depending on what the
basic nesting infrastructure from Intel/Duan's work ends up looking like.
2) I got some help from NVIDIA folks for the libvirt task. And they
have done some drafting and are now verifying the PCI topology
with "iommu=none".
Once the pluggable smmuv3 module is ready to test, we will make some
changes to libvirt for that and drop the auto-assigning patches from
the VIRT code, so as to converge on a libvirt+QEMU test.
FWIW, Robin requested a different solution for MSI mapping [1], vs.
the RMR one that we have been using since Eric's work. I drafted a
few VFIO/IOMMUFD patches for that, but paused them to get the vIOMMU
series merged in this kernel cycle. I plan to continue in Nov/Dec.
So, for the near term we will continue with the RMR solution, until
we have something solid later.
[1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
Thanks
Nicolin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 4:09 nested-smmuv3 topic for QEMU/libvirt, Nov 2024 Nicolin Chen
@ 2024-11-01 11:55 ` Jason Gunthorpe
2024-12-02 6:04 ` Zhangfei Gao
2024-11-01 18:35 ` Shameerali Kolothum Thodi via
2024-11-07 11:11 ` Eric Auger
2 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2024-11-01 11:55 UTC (permalink / raw)
To: Nicolin Chen
Cc: Shameerali Kolothum Thodi, Eric Auger, Mostafa Saleh,
qemu-arm@nongnu.org, qemu-devel@nongnu.org, Peter Maydell,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc, arighi, ianm, jan, mochs
On Thu, Oct 31, 2024 at 09:09:20PM -0700, Nicolin Chen wrote:
> FWIW, Robin requested a different solution for MSI mapping [1], v.s.
> the RMR one that we have been using since Eric's work. I drafted a
> few VFIO/IOMMUFD patches for that,
I also talked to MarcZ about this at LPC and he seems willing to
consider it. It took a bit to explain everything though. So I think we
should try in Nov/Dec
Jason
* RE: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 4:09 nested-smmuv3 topic for QEMU/libvirt, Nov 2024 Nicolin Chen
2024-11-01 11:55 ` Jason Gunthorpe
@ 2024-11-01 18:35 ` Shameerali Kolothum Thodi via
2024-11-01 19:29 ` Nicolin Chen
2024-11-20 21:13 ` Andrea Bolognani
2024-11-07 11:11 ` Eric Auger
2 siblings, 2 replies; 17+ messages in thread
From: Shameerali Kolothum Thodi via @ 2024-11-01 18:35 UTC (permalink / raw)
To: Nicolin Chen
Cc: Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc@nvidia.com, arighi@nvidia.com, ianm@nvidia.com,
jan@nvidia.com, mochs@nvidia.com
Hi Nicolin,
> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Friday, November 1, 2024 4:09 AM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Eric Auger <eric.auger@redhat.com>; Mostafa Saleh
> <smostafa@google.com>; qemu-arm@nongnu.org; qemu-
> devel@nongnu.org; Peter Maydell <peter.maydell@linaro.org>; Jason
> Gunthorpe <jgg@nvidia.com>; Jean-Philippe Brucker <jean-
> philippe@linaro.org>; Moritz Fischer <mdf@kernel.org>; Michael Shavit
> <mshavit@google.com>; Andrea Bolognani <abologna@redhat.com>;
> Michael S. Tsirkin <mst@redhat.com>; Peter Xu <peterx@redhat.com>;
> Zhangfei Gao <zhangfei.gao@linaro.org>; nathanc@nvidia.com;
> arighi@nvidia.com; ianm@nvidia.com; jan@nvidia.com; mochs@nvidia.com
> Subject: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
>
> Hi,
>
> This is a continued discussion following previous month's:
> https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
>
> Kernel changes are getting closer to merge, as Jason's planning to
> take vIOMMU series and smmuv3_nesting series into the iommufd tree:
> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
> https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
>
> So, I think it's probably a good time to align with each others and
> talk about kicking off some QEMU upstream work in the month ahead.
>
> @Shameer,
> Do you have some update on the pluggable smmuv3 module?
I have bare-minimum prototype code that works with a pluggable smmuv3.
...
-device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1 \
-device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \
-device vfio-pci-nohotplug,host=0000:75:00.1,bus=pcie.port1,iommufd=iommufd0 \
-device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=8 \
-device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \
-device vfio-pci-nohotplug,host=0000:7d:02.1,bus=pcie.port2,iommufd=iommufd0 \
...
Something like the above can now boot a guest with the latest kernel, but I am
not sure it actually works correctly. I need a bit more time to update this and
carry out some tests. I will target that in Nov.
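To make the pattern in the fragment above explicit: each vSMMU gets its own pxb-pcie expander bus, with a root port and the passthrough device behind it. The repeated group could be generated roughly as below. This is only a sketch; the helper name `nested_smmu_args` and its parameters are hypothetical, and the device and property names are copied from the prototype command line, so they may change as the work evolves.

```python
def nested_smmu_args(index, bus_nr, host_bdf, chassis=None, iommufd="iommufd0"):
    """Build the -device arguments for one pxb-pcie / vSMMU / VFIO group.

    Hypothetical helper: device and property names follow the prototype
    command line quoted in this thread, not a stable QEMU interface.
    """
    pxb = f"pcie.{index}"
    port = f"pcie.port{index}"
    port_props = f"pcie-root-port,id={port},bus={pxb}"
    if chassis is not None:
        # Only emit chassis= when the caller asks for it, as in the example.
        port_props += f",chassis={chassis}"
    return [
        "-device", f"pxb-pcie,id={pxb},bus_nr={bus_nr},bus=pcie.0",
        "-device", port_props,
        "-device", f"arm-smmuv3-nested,id=smmuv{index},pci-bus={pxb}",
        "-device", f"vfio-pci-nohotplug,host={host_bdf},bus={port},iommufd={iommufd}",
    ]


# Rebuild the two groups from the fragment above.
args = (nested_smmu_args(1, 2, "0000:75:00.1")
        + nested_smmu_args(2, 8, "0000:7d:02.1", chassis=8))
print(" \\\n".join(f"{a} {b}" for a, b in zip(args[::2], args[1::2])))
```

The point of keeping one function per group is that the vSMMU, its bus and the devices behind it are always emitted together, which is the association the rest of the thread is concerned with.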
>
> Updates on my side:
> 1) I have kept uAPI updated to the latest version and verified too.
> There should be some polishing changes depending on how the basic
> nesting infrastructure would look like from Intel/Duan's work.
> 2) I got some help from NVIDIA folks for the libvirt task. And they
> have done some drafting and are now verifying the PCI topology
> with "iommu=none".
>
> Once the pluggable smmuv3 module is ready to test, we will make some
> change to libvirt for that and drop the auto-assigning patches from
> the VIRT code, so as to converge for a libvirt+QEMU test.
One query I have is: does QEMU still need to check whether the VFIO devices are
assigned correctly behind the nested vSMMUv3 w.r.t. the phys SMMUv3s or not?
Or can we just trust whatever the user/libvirt specifies?
(I think even if we don't explicitly check, at present it will eventually fail in the
s2 HWPT attach for the viommu if the device belongs to a different phys SMMUv3).
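For illustration, a user-space consistency check of this kind could group the host devices by the physical IOMMU they sit behind before the guest is started. The sketch below is not what QEMU or libvirt does. It assumes the host kernel exposes an `iommu` symlink under /sys/bus/pci/devices/<BDF>/ for devices behind an IOMMU; the function names and the vSMMU-to-BDF mapping format are hypothetical.

```python
import os


def physical_iommu(host_bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the IOMMU instance behind a host PCI device, or None."""
    link = os.path.join(sysfs_root, host_bdf, "iommu")
    if not os.path.islink(link):
        return None
    # The symlink target is the IOMMU's own sysfs device; use its name as the key.
    return os.path.basename(os.path.realpath(link))


def check_assignment(vsmmu_to_bdfs, sysfs_root="/sys/bus/pci/devices"):
    """Return {vsmmu_id: set_of_physical_iommus} for every vSMMU whose devices
    do not all sit behind exactly one known physical IOMMU."""
    problems = {}
    for vsmmu, bdfs in vsmmu_to_bdfs.items():
        iommus = {physical_iommu(bdf, sysfs_root) for bdf in bdfs}
        if len(iommus) != 1 or None in iommus:
            problems[vsmmu] = iommus
    return problems
```

A caller would pass something like `{"smmuv1": ["0000:75:00.1"], "smmuv2": ["0000:7d:02.1"]}` and treat a non-empty result as a misconfiguration, rather than waiting for the later attach failure.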
Thanks,
Shameer
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 18:35 ` Shameerali Kolothum Thodi via
@ 2024-11-01 19:29 ` Nicolin Chen
2024-11-08 13:05 ` Shameerali Kolothum Thodi via
2024-11-20 21:13 ` Andrea Bolognani
1 sibling, 1 reply; 17+ messages in thread
From: Nicolin Chen @ 2024-11-01 19:29 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc@nvidia.com, arighi@nvidia.com, ianm@nvidia.com,
jan@nvidia.com, mochs@nvidia.com
On Fri, Nov 01, 2024 at 06:35:23PM +0000, Shameerali Kolothum Thodi wrote:
> > @Shameer,
> > Do you have some update on the pluggable smmuv3 module?
>
> I have a bare minimum prototype code that works with a pluggable smmuv3.
>
> ...
> -device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1 \
> -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \
> -device vfio-pci-nohotplug,host=0000:75:00.1,bus=pcie.port1,iommufd=iommufd0 \
> -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=8 \
> -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \
> -device vfio-pci-nohotplug,host=0000:7d:02.1,bus=pcie.port2,iommufd=iommufd0 \
> ...
>
> Something like above can now boot a Guest with the latest kernel. But I am not
> sure it actually works correctly. I need a bit more time to update this and carry
> out some tests. Will target that in Nov.
That looks nice to me! Thanks for the update.
> > Updates on my side:
> > 1) I have kept uAPI updated to the latest version and verified too.
> > There should be some polishing changes depending on how the basic
> > nesting infrastructure would look like from Intel/Duan's work.
> > 2) I got some help from NVIDIA folks for the libvirt task. And they
> > have done some drafting and are now verifying the PCI topology
> > with "iommu=none".
> >
> > Once the pluggable smmuv3 module is ready to test, we will make some
> > change to libvirt for that and drop the auto-assigning patches from
> > the VIRT code, so as to converge for a libvirt+QEMU test.
>
> One query I have is, do Qemu still need to check whether the VFIO devices are
> assigned correctly behind the nested vSMMUv3 w.r.t the phys SMMUv3s or not?
> Or we can just trust whatever the user/libvirt specifies?
That's a good point. I assume ideally QEMU should do something,
though that would somewhat duplicate what libvirt does.
> (I think even if we don't explicitly check, at present it will eventually fail in s2
> HWPT attach for the viommu if it belongs to a different phys SMMUv3).
Yes. That's what we have. Perhaps simply print something like:
"is the device assigned to the correct arm-smmuv3-nested/pxb?"
?
Thanks
Nicolin
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 4:09 nested-smmuv3 topic for QEMU/libvirt, Nov 2024 Nicolin Chen
2024-11-01 11:55 ` Jason Gunthorpe
2024-11-01 18:35 ` Shameerali Kolothum Thodi via
@ 2024-11-07 11:11 ` Eric Auger
2024-11-07 20:31 ` Nicolin Chen
2 siblings, 1 reply; 17+ messages in thread
From: Eric Auger @ 2024-11-07 11:11 UTC (permalink / raw)
To: Nicolin Chen, Shameerali Kolothum Thodi
Cc: Mostafa Saleh, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
Peter Maydell, Jason Gunthorpe, Jean-Philippe Brucker,
Moritz Fischer, Michael Shavit, Andrea Bolognani,
Michael S. Tsirkin, Peter Xu, Zhangfei Gao, nathanc, arighi, ianm,
jan, mochs, Don Dutile
Hi Nicolin,
On 11/1/24 05:09, Nicolin Chen wrote:
> Hi,
>
> This is a continued discussion following previous month's:
> https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
>
> Kernel changes are getting closer to merge, as Jason's planning to
> take vIOMMU series and smmuv3_nesting series into the iommufd tree:
> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
> https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
>
> So, I think it's probably a good time to align with each others and
> talk about kicking off some QEMU upstream work in the month ahead.
>
> @Shameer,
> Do you have some update on the pluggable smmuv3 module?
>
> Updates on my side:
> 1) I have kept uAPI updated to the latest version and verified too.
> There should be some polishing changes depending on how the basic
> nesting infrastructure would look like from Intel/Duan's work.
> 2) I got some help from NVIDIA folks for the libvirt task. And they
> have done some drafting and are now verifying the PCI topology
> with "iommu=none".
>
> Once the pluggable smmuv3 module is ready to test, we will make some
> change to libvirt for that and drop the auto-assigning patches from
> the VIRT code, so as to converge for a libvirt+QEMU test.
>
> FWIW, Robin requested a different solution for MSI mapping [1], v.s.
> the RMR one that we have been using since Eric's work. I drafted a
> few VFIO/IOMMUFD patches for that, yet paused for getting the vIOMMU
> series merged to this kernel cycle. I plan to continue in Nov/Dec.
> So, for the near term we will continue with the RMR solution, until
> we have something solid later.
>
> [1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
At Red Hat we may find some cycles to resume working on the QEMU
integration. Could you please sketch some tasks we could carry out in
coordination with you and Shameer? Adding Don to the loop.
Thank you in advance
Eric
>
> Thanks
> Nicolin
>
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-07 11:11 ` Eric Auger
@ 2024-11-07 20:31 ` Nicolin Chen
2024-11-07 22:10 ` Donald Dutile
2024-11-18 17:59 ` Eric Auger
0 siblings, 2 replies; 17+ messages in thread
From: Nicolin Chen @ 2024-11-07 20:31 UTC (permalink / raw)
To: Eric Auger
Cc: Shameerali Kolothum Thodi, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc, arighi, ianm, jan, mochs, Don Dutile
Hi Eric,
On Thu, Nov 07, 2024 at 12:11:05PM +0100, Eric Auger wrote:
> On 11/1/24 05:09, Nicolin Chen wrote:
> > Hi,
> >
> > This is a continued discussion following previous month's:
> > https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
> >
> > Kernel changes are getting closer to merge, as Jason's planning to
> > take vIOMMU series and smmuv3_nesting series into the iommufd tree:
> > https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
> > https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
> > https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
> >
> > So, I think it's probably a good time to align with each others and
> > talk about kicking off some QEMU upstream work in the month ahead.
> >
> > @Shameer,
> > Do you have some update on the pluggable smmuv3 module?
> >
> > Updates on my side:
> > 1) I have kept uAPI updated to the latest version and verified too.
> > There should be some polishing changes depending on how the basic
> > nesting infrastructure would look like from Intel/Duan's work.
> > 2) I got some help from NVIDIA folks for the libvirt task. And they
> > have done some drafting and are now verifying the PCI topology
> > with "iommu=none".
> >
> > Once the pluggable smmuv3 module is ready to test, we will make some
> > change to libvirt for that and drop the auto-assigning patches from
> > the VIRT code, so as to converge for a libvirt+QEMU test.
> >
> > FWIW, Robin requested a different solution for MSI mapping [1], v.s.
> > the RMR one that we have been using since Eric's work. I drafted a
> > few VFIO/IOMMUFD patches for that, yet paused for getting the vIOMMU
> > series merged to this kernel cycle. I plan to continue in Nov/Dec.
> > So, for the near term we will continue with the RMR solution, until
> > we have something solid later.
> >
> > [1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
>
> At Red Hat we may find some cycles to resume working on the QEMU
> integration. Please can you sketch some tasks we could carry out in
> coordination with you and Shameer? Adding Don in the loop.
That is great!
So, given that Shameer is working on the pluggable module part and we
have folks working on libvirt, I think the only big thing here is
the SMMUv3 series itself. Please refer to the changes in the link:
- cover-letter: Add HW accelerated nesting support for arm SMMUv3
- https://github.com/nicolinc/qemu/commits/wip/for_smmuv3_nesting-v4/
I expect the IOMMU_HWPT_ALLOC (backend APIs) will go with Intel's
series once their current "emulated devices" one gets merged. And
I think I can prepare IOMMU_VIOMMU_ALLOC patches for backend APIs
aligning with HWPT's.
That being said, one thing I am not sure about is how we should bridge
between these backend APIs and the virtual IOMMUs (vSMMU/intel). I
think it'd be better for you and Red Hat to provide some insight,
if calling the backend APIs directly from a viommu module isn't a
preferable way.
We also need your comments on the vSMMU module patches, which are still
roughly a draft and, I think, need some rework in the details.
So, if your (and Don's) bandwidth allows, perhaps you could take
over the vSMMU patches? Otherwise, we can also share the rework
task.
Thanks!
Nicolin
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-07 20:31 ` Nicolin Chen
@ 2024-11-07 22:10 ` Donald Dutile
2024-11-18 17:59 ` Eric Auger
1 sibling, 0 replies; 17+ messages in thread
From: Donald Dutile @ 2024-11-07 22:10 UTC (permalink / raw)
To: Nicolin Chen, Eric Auger
Cc: Shameerali Kolothum Thodi, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc, arighi, ianm, jan, mochs
On 11/7/24 3:31 PM, Nicolin Chen wrote:
> Hi Eric,
>
> On Thu, Nov 07, 2024 at 12:11:05PM +0100, Eric Auger wrote:
>> On 11/1/24 05:09, Nicolin Chen wrote:
>>> Hi,
>>>
>>> This is a continued discussion following previous month's:
>>> https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
>>>
>>> Kernel changes are getting closer to merge, as Jason's planning to
>>> take vIOMMU series and smmuv3_nesting series into the iommufd tree:
>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>> https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
>>>
>>> So, I think it's probably a good time to align with each others and
>>> talk about kicking off some QEMU upstream work in the month ahead.
>>>
>>> @Shameer,
>>> Do you have some update on the pluggable smmuv3 module?
>>>
>>> Updates on my side:
>>> 1) I have kept uAPI updated to the latest version and verified too.
>>> There should be some polishing changes depending on how the basic
>>> nesting infrastructure would look like from Intel/Duan's work.
>>> 2) I got some help from NVIDIA folks for the libvirt task. And they
>>> have done some drafting and are now verifying the PCI topology
>>> with "iommu=none".
>>>
>>> Once the pluggable smmuv3 module is ready to test, we will make some
>>> change to libvirt for that and drop the auto-assigning patches from
>>> the VIRT code, so as to converge for a libvirt+QEMU test.
>>>
>>> FWIW, Robin requested a different solution for MSI mapping [1], v.s.
>>> the RMR one that we have been using since Eric's work. I drafted a
>>> few VFIO/IOMMUFD patches for that, yet paused for getting the vIOMMU
>>> series merged to this kernel cycle. I plan to continue in Nov/Dec.
>>> So, for the near term we will continue with the RMR solution, until
>>> we have something solid later.
>>>
>>> [1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
>>
>> At Red Hat we may find some cycles to resume working on the QEMU
>> integration. Please can you sketch some tasks we could carry out in
>> coordination with you and Shameer? Adding Don in the loop.
>
> That is great!
>
> So, given that Shameer is working on pluggable module part and we
> have folks working on libvirt. I think the only big thing here is
> the SMMUv3 series itself. Please refer to the changes in the link:
> - cover-letter: Add HW accelerated nesting support for arm SMMUv3
> - https://github.com/nicolinc/qemu/commits/wip/for_smmuv3_nesting-v4/
>
> I expect the IOMMU_HWPT_ALLOC (backend APIs) will go with Intel's
> series once their current "emulated devices" one gets merged. And
> I think I can prepare IOMMU_VIOMMU_ALLOC patches for backend APIs
> aligning with HWPT's.
>
> That being said, one thing I am not sure is how we should bridge
> between these backend APIs and a virtual IOMMUs (vSMMU/intel). I
> think it'd be better for you and Red Hat to provide some insight,
> if calling the backend APIs directly from a viommu module isn't a
> preferable way.
>
> We also need your comments on vSMMU module patches that are still
> roughly a draft requiring a rework at some small details I think.
> So, if your (and Don's) bandwidth allows, perhaps you could take
> over the vSMMU patches? Otherwise, we can also share the task for
> reworking.
>
> Thanks!
> Nicolin
>
Nicolin,
Hi!
Sounds like a plan. Glad to join the effort.
- Don
* RE: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 19:29 ` Nicolin Chen
@ 2024-11-08 13:05 ` Shameerali Kolothum Thodi via
0 siblings, 0 replies; 17+ messages in thread
From: Shameerali Kolothum Thodi via @ 2024-11-08 13:05 UTC (permalink / raw)
To: Nicolin Chen
Cc: Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc@nvidia.com, arighi@nvidia.com, ianm@nvidia.com,
jan@nvidia.com, mochs@nvidia.com
> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Friday, November 1, 2024 7:29 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Eric Auger <eric.auger@redhat.com>; Mostafa Saleh
> <smostafa@google.com>; qemu-arm@nongnu.org; qemu-
> devel@nongnu.org; Peter Maydell <peter.maydell@linaro.org>; Jason
> Gunthorpe <jgg@nvidia.com>; Jean-Philippe Brucker <jean-
> philippe@linaro.org>; Moritz Fischer <mdf@kernel.org>; Michael Shavit
> <mshavit@google.com>; Andrea Bolognani <abologna@redhat.com>;
> Michael S. Tsirkin <mst@redhat.com>; Peter Xu <peterx@redhat.com>;
> Zhangfei Gao <zhangfei.gao@linaro.org>; nathanc@nvidia.com;
> arighi@nvidia.com; ianm@nvidia.com; jan@nvidia.com; mochs@nvidia.com
> Subject: Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
>
> On Fri, Nov 01, 2024 at 06:35:23PM +0000, Shameerali Kolothum Thodi wrote:
> > > @Shameer,
> > > Do you have some update on the pluggable smmuv3 module?
> >
> > I have a bare minimum prototype code that works with a pluggable smmuv3.
> >
> > ...
> > -device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
> > -device pcie-root-port,id=pcie.port1,bus=pcie.1 \
> > -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \
> > -device vfio-pci-nohotplug,host=0000:75:00.1,bus=pcie.port1,iommufd=iommufd0 \
> > -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=8 \
> > -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \
> > -device vfio-pci-nohotplug,host=0000:7d:02.1,bus=pcie.port2,iommufd=iommufd0 \
> > ...
> >
> > Something like above can now boot a Guest with the latest kernel. But I am not
> > sure it actually works correctly. I need a bit more time to update this and carry
> > out some tests. Will target that in Nov.
>
> That looks nice to me! Thanks for the update.
I have sent out an initial RFC here (I might have missed CCing a few people from this email, apologies):
https://lore.kernel.org/qemu-devel/20241108125242.60136-1-shameerali.kolothum.thodi@huawei.com/
Please take a look and let me know.
Thanks,
Shameer
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-07 20:31 ` Nicolin Chen
2024-11-07 22:10 ` Donald Dutile
@ 2024-11-18 17:59 ` Eric Auger
2024-11-19 7:07 ` Duan, Zhenzhong
2024-11-20 20:44 ` Nicolin Chen
1 sibling, 2 replies; 17+ messages in thread
From: Eric Auger @ 2024-11-18 17:59 UTC (permalink / raw)
To: Nicolin Chen
Cc: Shameerali Kolothum Thodi, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc, arighi, ianm, jan, mochs, Don Dutile, Zhenzhong Duan
Hi Nicolin,
On 11/7/24 21:31, Nicolin Chen wrote:
> Hi Eric,
>
> On Thu, Nov 07, 2024 at 12:11:05PM +0100, Eric Auger wrote:
>> On 11/1/24 05:09, Nicolin Chen wrote:
>>> Hi,
>>>
>>> This is a continued discussion following previous month's:
>>> https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
>>>
>>> Kernel changes are getting closer to merge, as Jason's planning to
>>> take vIOMMU series and smmuv3_nesting series into the iommufd tree:
>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>> https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
>>>
>>> So, I think it's probably a good time to align with each others and
>>> talk about kicking off some QEMU upstream work in the month ahead.
>>>
>>> @Shameer,
>>> Do you have some update on the pluggable smmuv3 module?
>>>
>>> Updates on my side:
>>> 1) I have kept uAPI updated to the latest version and verified too.
>>> There should be some polishing changes depending on how the basic
>>> nesting infrastructure would look like from Intel/Duan's work.
>>> 2) I got some help from NVIDIA folks for the libvirt task. And they
>>> have done some drafting and are now verifying the PCI topology
>>> with "iommu=none".
>>>
>>> Once the pluggable smmuv3 module is ready to test, we will make some
>>> change to libvirt for that and drop the auto-assigning patches from
>>> the VIRT code, so as to converge for a libvirt+QEMU test.
>>>
>>> FWIW, Robin requested a different solution for MSI mapping [1], v.s.
>>> the RMR one that we have been using since Eric's work. I drafted a
>>> few VFIO/IOMMUFD patches for that, yet paused for getting the vIOMMU
>>> series merged to this kernel cycle. I plan to continue in Nov/Dec.
>>> So, for the near term we will continue with the RMR solution, until
>>> we have something solid later.
>>>
>>> [1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
>> At Red Hat we may find some cycles to resume working on the QEMU
>> integration. Please can you sketch some tasks we could carry out in
>> coordination with you and Shameer? Adding Don in the loop.
> That is great!
>
> So, given that Shameer is working on pluggable module part and we
> have folks working on libvirt. I think the only big thing here is
> the SMMUv3 series itself. Please refer to the changes in the link:
> - cover-letter: Add HW accelerated nesting support for arm SMMUv3
> - https://github.com/nicolinc/qemu/commits/wip/for_smmuv3_nesting-v4/
Looking at your branch I see the following series (marked with cover-letter):
* cover-letter: Add RMR WAR for MSI mappings (based on former RMR flat
  mapping and not related to [PATCH RFCv1 0/7] vfio: Allow userspace
  to specify the address for each MSI vector
  <https://lore.kernel.org/kvm/cover.1731130093.git.nicolinc@nvidia.com/#r>
  I guess)
* cover-letter: hw/arm/virt: Add multiple nested SMMUs (Nicolin ->
  Shameer)
* cover-letter: Add HW accelerated nesting support for arm SMMUv3
  (Nicolin)
* cover-letter: Add VIOMMU infrastructure support (Nicolin)
* cover-letter: intel_iommu: Enable stage-1 translation for
  passthrough device (Zhenzhong)
* cover-letter: intel_iommu: Enable stage-1 translation for emulated
  device (Zhenzhong)
The last one is covered by [PATCH v5 00/20] intel_iommu: Enable stage-1
translation for emulated device
<https://lore.kernel.org/all/20241111083457.2090664-1-zhenzhong.duan@intel.com/#r>
I see there is a reference to the "Enable stage-1 translation for
passthrough device" series but has it been posted for review? Adding
Zhenzhong in copy.
Are there any posts upstream for the rest, besides Shameer's respin?
>
> I expect the IOMMU_HWPT_ALLOC (backend APIs) will go with Intel's
> series once their current "emulated devices" one gets merged. And
> I think I can prepare IOMMU_VIOMMU_ALLOC patches for backend APIs
> aligning with HWPT's.
Can you point us to the actual series including this IOMMU_HWPT_ALLOC
support? This would clarify which part you are going to work on next.
>
> That being said, one thing I am not sure is how we should bridge
> between these backend APIs and a virtual IOMMUs (vSMMU/intel). I
> think it'd be better for you and Red Hat to provide some insight,
> if calling the backend APIs directly from a viommu module isn't a
> preferable way.
Can you clarify what you call the backend API in that context?
>
> We also need your comments on vSMMU module patches that are still
> roughly a draft requiring a rework at some small details I think.
> So, if your (and Don's) bandwidth allows, perhaps you could take
> over the vSMMU patches? Otherwise, we can also share the task for
> reworking.
So we have started reviewing the multi-SMMU instantiation series and
provided feedback.
Which part can we work on then?
Thanks
Eric
>
> Thanks!
> Nicolin
>
* RE: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-18 17:59 ` Eric Auger
@ 2024-11-19 7:07 ` Duan, Zhenzhong
2024-11-19 8:17 ` Eric Auger
2024-11-20 20:44 ` Nicolin Chen
1 sibling, 1 reply; 17+ messages in thread
From: Duan, Zhenzhong @ 2024-11-19 7:07 UTC (permalink / raw)
To: eric.auger@redhat.com, Nicolin Chen
Cc: Shameerali Kolothum Thodi, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc@nvidia.com, arighi@nvidia.com, ianm@nvidia.com,
jan@nvidia.com, mochs@nvidia.com, Don Dutile
Hi Eric,
>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Sent: Tuesday, November 19, 2024 2:00 AM
>Subject: Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
>
>Hi Nicolin,
>
>On 11/7/24 21:31, Nicolin Chen wrote:
>> Hi Eric,
>>
>> On Thu, Nov 07, 2024 at 12:11:05PM +0100, Eric Auger wrote:
>>> On 11/1/24 05:09, Nicolin Chen wrote:
>>>> Hi,
>>>>
>>>> This is a continued discussion following previous month's:
>>>> https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
>>>>
>>>> Kernel changes are getting closer to merge, as Jason's planning to
>>>> take vIOMMU series and smmuv3_nesting series into the iommufd tree:
>>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>>> https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
>>>>
>>>> So, I think it's probably a good time to align with each others and
>>>> talk about kicking off some QEMU upstream work in the month ahead.
>>>>
>>>> @Shameer,
>>>> Do you have some update on the pluggable smmuv3 module?
>>>>
>>>> Updates on my side:
>>>> 1) I have kept uAPI updated to the latest version and verified too.
>>>> There should be some polishing changes depending on how the basic
>>>> nesting infrastructure would look like from Intel/Duan's work.
>>>> 2) I got some help from NVIDIA folks for the libvirt task. And they
>>>> have done some drafting and are now verifying the PCI topology
>>>> with "iommu=none".
>>>>
>>>> Once the pluggable smmuv3 module is ready to test, we will make some
>>>> change to libvirt for that and drop the auto-assigning patches from
>>>> the VIRT code, so as to converge for a libvirt+QEMU test.
>>>>
>>>> FWIW, Robin requested a different solution for MSI mapping [1], v.s.
>>>> the RMR one that we have been using since Eric's work. I drafted a
>>>> few VFIO/IOMMUFD patches for that, yet paused for getting the vIOMMU
>>>> series merged to this kernel cycle. I plan to continue in Nov/Dec.
>>>> So, for the near term we will continue with the RMR solution, until
>>>> we have something solid later.
>>>>
>>>> [1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
>>> At Red Hat we may find some cycles to resume working on the QEMU
>>> integration. Please can you sketch some tasks we could carry out in
>>> coordination with you and Shameer? Adding Don in the loop.
>> That is great!
>>
>> So, given that Shameer is working on pluggable module part and we
>> have folks working on libvirt. I think the only big thing here is
>> the SMMUv3 series itself. Please refer to the changes in the link:
>> - cover-letter: Add HW accelerated nesting support for arm SMMUv3
>> - https://github.com/nicolinc/qemu/commits/wip/for_smmuv3_nesting-v4/
>Looking at your branch I see the following series (marked with cover-letter):
>
> * cover-letter: Add RMR WAR for MSI mappings (based on former RMR flat
>   mapping and not related to [PATCH RFCv1 0/7] vfio: Allow userspace
>   to specify the address for each MSI vector
>   <https://lore.kernel.org/kvm/cover.1731130093.git.nicolinc@nvidia.com/#r>
>   I guess)
> * cover-letter: hw/arm/virt: Add multiple nested SMMUs (Nicolin ->
>   Shameer)
> * cover-letter: Add HW accelerated nesting support for arm SMMUv3
>   (Nicolin)
> * cover-letter: Add VIOMMU infrastructure support (Nicolin)
> * cover-letter: intel_iommu: Enable stage-1 translation for
>   passthrough device (Zhenzhong)
> * cover-letter: intel_iommu: Enable stage-1 translation for emulated
>   device (Zhenzhong)
>
>The last one is covered by [PATCH v5 00/20] intel_iommu: Enable stage-1
>translation for emulated device
><https://lore.kernel.org/all/20241111083457.2090664-1-zhenzhong.duan@intel.com/#r>
>
>I see there is a reference to "Enable stage-1 translation for
>passthrough device" series but has it been posted for review? Adding
>Zhenzhong in copy.
There is an RFCv1 posted upstream which is a combination of
"intel_iommu: Enable stage-1 translation for emulated device" and
"intel_iommu: Enable stage-1 translation for passthrough device".
Link: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg02740.html
If you mean the split version, it has not been sent yet. I plan to send it
after the series about emulated devices is accepted.
Thanks
Zhenzhong
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-19 7:07 ` Duan, Zhenzhong
@ 2024-11-19 8:17 ` Eric Auger
0 siblings, 0 replies; 17+ messages in thread
From: Eric Auger @ 2024-11-19 8:17 UTC (permalink / raw)
To: Duan, Zhenzhong, Nicolin Chen
Cc: Shameerali Kolothum Thodi, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc@nvidia.com, arighi@nvidia.com, ianm@nvidia.com,
jan@nvidia.com, mochs@nvidia.com, Don Dutile
Hi Zhenzhong,
On 11/19/24 08:07, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -----Original Message-----
>> From: Eric Auger <eric.auger@redhat.com>
>> Sent: Tuesday, November 19, 2024 2:00 AM
>> Subject: Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
>>
>> Hi Nicolin,
>>
>> On 11/7/24 21:31, Nicolin Chen wrote:
>>> Hi Eric,
>>>
>>> On Thu, Nov 07, 2024 at 12:11:05PM +0100, Eric Auger wrote:
>>>> On 11/1/24 05:09, Nicolin Chen wrote:
>>>>> Hi,
>>>>>
>>>>> This is a continued discussion following previous month's:
>>>>> https://lore.kernel.org/qemu-devel/Zvr%2Fbf7KgLN1cjOl@Asurada-Nvidia/
>>>>>
>>>>> Kernel changes are getting closer to merge, as Jason's planning to
>>>>> take vIOMMU series and smmuv3_nesting series into the iommufd tree:
>>>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>>>> https://lore.kernel.org/all/cover.1730313494.git.nicolinc@nvidia.com/
>>>>> https://lore.kernel.org/all/0-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com/
>>>>> So, I think it's probably a good time to align with each others and
>>>>> talk about kicking off some QEMU upstream work in the month ahead.
>>>>>
>>>>> @Shameer,
>>>>> Do you have some update on the pluggable smmuv3 module?
>>>>>
>>>>> Updates on my side:
>>>>> 1) I have kept uAPI updated to the latest version and verified too.
>>>>> There should be some polishing changes depending on how the basic
>>>>> nesting infrastructure would look like from Intel/Duan's work.
>>>>> 2) I got some help from NVIDIA folks for the libvirt task. And they
>>>>> have done some drafting and are now verifying the PCI topology
>>>>> with "iommu=none".
>>>>>
>>>>> Once the pluggable smmuv3 module is ready to test, we will make some
>>>>> change to libvirt for that and drop the auto-assigning patches from
>>>>> the VIRT code, so as to converge for a libvirt+QEMU test.
>>>>>
>>>>> FWIW, Robin requested a different solution for MSI mapping [1], v.s.
>>>>> the RMR one that we have been using since Eric's work. I drafted a
>>>>> few VFIO/IOMMUFD patches for that, yet paused for getting the vIOMMU
>>>>> series merged to this kernel cycle. I plan to continue in Nov/Dec.
>>>>> So, for the near term we will continue with the RMR solution, until
>>>>> we have something solid later.
>>>>>
>>>>> [1] https://lore.kernel.org/linux-iommu/ZrVN05VylFq8lK4q@Asurada-Nvidia/
>>>> At Red Hat we may find some cycles to resume working on the QEMU
>>>> integration. Please can you sketch some tasks we could carry out in
>>>> coordination with you and Shameer? Adding Don in the loop.
>>> That is great!
>>>
>>> So, given that Shameer is working on pluggable module part and we
>>> have folks working on libvirt. I think the only big thing here is
>>> the SMMUv3 series itself. Please refer to the changes in the link:
>>> - cover-letter: Add HW accelerated nesting support for arm SMMUv3
>>> - https://github.com/nicolinc/qemu/commits/wip/for_smmuv3_nesting-v4/
>> Looking at your branch I see the following series (marked with cover-letter):
>>
>>  * cover-letter: Add RMR WAR for MSI mappings (based on former RMR flat
>>    mapping and not related to [PATCH RFCv1 0/7] vfio: Allow userspace
>>    to specify the address for each MSI vector
>>    <https://lore.kernel.org/kvm/cover.1731130093.git.nicolinc@nvidia.com/#r>
>>    I guess)
>>  * cover-letter: hw/arm/virt: Add multiple nested SMMUs (Nicolin ->
>>    Shameer)
>>  * cover-letter: Add HW accelerated nesting support for arm SMMUv3
>>    (Nicolin)
>>  * cover-letter: Add VIOMMU infrastructure support (Nicolin)
>>  * cover-letter: intel_iommu: Enable stage-1 translation for
>>    passthrough device (Zhenzhong)
>>  * cover-letter: intel_iommu: Enable stage-1 translation for emulated
>>    device (Zhenzhong)
>>
>> The last one is covered by [PATCH v5 00/20] intel_iommu: Enable stage-1
>> translation for emulated device
>> <https://lore.kernel.org/all/20241111083457.2090664-1-zhenzhong.duan@intel.com/#r>
>>
>> I see there is a reference to "Enable stage-1 translation for
>> passthrough device" series but has it been posted for review? Adding
>> Zhenzhong in copy.
> There is an RFCv1 posted upstream which is a combination of
> "intel_iommu: Enable stage-1 translation for emulated device" and
> "intel_iommu: Enable stage-1 translation for passthrough device".
> Link: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg02740.html
>
> If you mean the split version, it has not been sent yet. I plan to send it
> after the series about emulated devices is accepted.
OK thank you very much for the link and confirmation.
Eric
>
> Thanks
> Zhenzhong
>
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-18 17:59 ` Eric Auger
2024-11-19 7:07 ` Duan, Zhenzhong
@ 2024-11-20 20:44 ` Nicolin Chen
1 sibling, 0 replies; 17+ messages in thread
From: Nicolin Chen @ 2024-11-20 20:44 UTC (permalink / raw)
To: Eric Auger
Cc: Shameerali Kolothum Thodi, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu, Zhangfei Gao,
nathanc, arighi, ianm, jan, mochs, Don Dutile, Zhenzhong Duan
On Mon, Nov 18, 2024 at 06:59:53PM +0100, Eric Auger wrote:
> Looking at your branch I see the following series (marked with cover-letter)
..
> cover-letter: Add RMR WAR for MSI mappings (based on former RMR flat
> mapping and not related to *[PATCH RFCv1 0/7] vfio: Allow userspace
> to specify the address for each MSI vector
> <https://lore.kernel.org/kvm/cover.1731130093.git.nicolinc@nvidia.com/#r>
> I guess)*
Yes, I think we would postpone this series, simply keeping it
on top of our future shared branches while waiting for the kernel
solution to be finalized to a certain degree, to make sure an MSI
2-stage mapping could still work.
> cover-letter: hw/arm/virt: Add multiple nested SMMUs (Nicolin ->
> Shameer)
Yes
> cover-letter: Add HW accelerated nesting support for arm SMMUv3
> (Nicolin)
I think this will be moved to Don (or Don/Nic)?
> cover-letter: Add VIOMMU infrastructure support (Nicolin)
Yes.
> cover-letter: intel_iommu: Enable stage-1 translation for
> passthrough device (Zhenzhong)
..
> cover-letter: intel_iommu: Enable stage-1 translation for emulated
> device (Zhenzhong)
Yes. We'll need the HWPT infrastructure patches in Intel's series
and I think Intel's progress is faster so Zhenzhong should be the
submitter.
> Are there any posts upstream for the rest, besides Shameer's respin?
Not yet. I think the dependency should be
(1) HWPT infrastructure (Zhenzhong's Intel series)
(2) VIOMMU infrastructure
(3) smmuv3-acc series (2-stage translation)
(4) Multi vSMMU instance support (VIRT and IORT)
(5) RMR
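For anyone tracking the branches, the ordering above can be written down as a small dependency graph. This is only a sketch in Python, assuming each item depends solely on the one listed before it (the list reads as a linear chain, though the real patch-level dependencies may be looser). The short labels are names picked here for illustration, not branch or series names.

```python
from graphlib import TopologicalSorter

# Each series maps to the set of series it is assumed to depend on.
# The labels are illustrative shorthand for items (1)-(5) above.
DEPENDS_ON = {
    "hwpt-infrastructure": set(),                      # (1) Zhenzhong's Intel series
    "viommu-infrastructure": {"hwpt-infrastructure"},  # (2)
    "smmuv3-acc": {"viommu-infrastructure"},           # (3) 2-stage translation
    "multi-vsmmu": {"smmuv3-acc"},                     # (4) VIRT and IORT
    "rmr": {"multi-vsmmu"},                            # (5)
}


def merge_order(depends_on):
    """Return one valid order in which the series could be stacked."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for position, series in enumerate(merge_order(DEPENDS_ON), start=1):
        print(f"({position}) {series}")
```

If a dependency turns out to be optional (for example RMR being carried on top only for testing), dropping the corresponding edge from `DEPENDS_ON` is enough to see which orderings remain valid.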
> >
> > I expect the IOMMU_HWPT_ALLOC (backend APIs) will go with Intel's
> > series once their current "emulated devices" one gets merged. And
> > I think I can prepare IOMMU_VIOMMU_ALLOC patches for backend APIs
> > aligning with HWPT's.
> Can you point us to the actual series including this IOMMU_HWPT_ALLOC
> support? This would clarify which part you are going to work on next.
https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
I'm not planning to do so, since it's unlikely to be the case, but the
HWPT changes could go with the vSMMU series if we end up running faster
than Zhenzhong :)
> >
> > That being said, one thing I am not sure is how we should bridge
> > between these backend APIs and a virtual IOMMUs (vSMMU/intel). I
> > think it'd be better for you and Red Hat to provide some insight,
> > if calling the backend APIs directly from a viommu module isn't a
> > preferable way.
> can you clarify what you call backend API in that context?
> >
> > We also need your comments on vSMMU module patches that are still
> > roughly a draft requiring a rework at some small details I think.
> > So, if your (and Don's) bandwidth allows, perhaps you could take
> > over the vSMMU patches? Otherwise, we can also share the task for
> > reworking.
> So we have started the multi smmu instantiation review and provided
> feedbacks.
> Which part can we work on then?
This one:
cover-letter: Add HW accelerated nesting support for arm SMMUv3
i.e. item (3) in the list I put above.
Thanks
Nicolin
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 18:35 ` Shameerali Kolothum Thodi via
2024-11-01 19:29 ` Nicolin Chen
@ 2024-11-20 21:13 ` Andrea Bolognani
2024-11-21 9:47 ` Shameerali Kolothum Thodi via
1 sibling, 1 reply; 17+ messages in thread
From: Andrea Bolognani @ 2024-11-20 21:13 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Nicolin Chen, Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Michael S. Tsirkin, Peter Xu, Zhangfei Gao, nathanc@nvidia.com,
arighi@nvidia.com, ianm@nvidia.com, jan@nvidia.com,
mochs@nvidia.com
On Fri, Nov 01, 2024 at 06:35:23PM +0000, Shameerali Kolothum Thodi wrote:
> I have a bare minimum prototype code that works with a pluggable smmuv3.
>
> ...
> -device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1 \
> -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \
> -device vfio-pci-nohotplug,host=0000:75:00.1,bus=pcie.port1,iommufd=iommufd0 \
> -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=8 \
> -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \
> -device vfio-pci-nohotplug,host=0000:7d:02.1,bus=pcie.port2,iommufd=iommufd0 \
Silly bit of feedback on the interface, but why is the
arm-smmuv3-nested property called "pci-bus" instead of just "bus"?
All other devices that need to refer to an existing PCI bus use the
latter. Is there a reason for this specific one to diverge?
--
Andrea Bolognani / Red Hat / Virtualization
* RE: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-20 21:13 ` Andrea Bolognani
@ 2024-11-21 9:47 ` Shameerali Kolothum Thodi via
0 siblings, 0 replies; 17+ messages in thread
From: Shameerali Kolothum Thodi via @ 2024-11-21 9:47 UTC (permalink / raw)
To: Andrea Bolognani
Cc: Nicolin Chen, Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jason Gunthorpe,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Michael S. Tsirkin, Peter Xu, Zhangfei Gao, nathanc@nvidia.com,
arighi@nvidia.com, ianm@nvidia.com, jan@nvidia.com,
mochs@nvidia.com
> -----Original Message-----
> From: Andrea Bolognani <abologna@redhat.com>
> Sent: Wednesday, November 20, 2024 9:14 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Nicolin Chen <nicolinc@nvidia.com>; Eric Auger
> <eric.auger@redhat.com>; Mostafa Saleh <smostafa@google.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org; Peter Maydell
> <peter.maydell@linaro.org>; Jason Gunthorpe <jgg@nvidia.com>; Jean-
> Philippe Brucker <jean-philippe@linaro.org>; Moritz Fischer
> <mdf@kernel.org>; Michael Shavit <mshavit@google.com>; Michael S.
> Tsirkin <mst@redhat.com>; Peter Xu <peterx@redhat.com>; Zhangfei Gao
> <zhangfei.gao@linaro.org>; nathanc@nvidia.com; arighi@nvidia.com;
> ianm@nvidia.com; jan@nvidia.com; mochs@nvidia.com
> Subject: Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
>
> On Fri, Nov 01, 2024 at 06:35:23PM +0000, Shameerali Kolothum Thodi wrote:
> > I have a bare minimum prototype code that works with a pluggable smmuv3.
> >
> > ...
> > -device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
> > -device pcie-root-port,id=pcie.port1,bus=pcie.1 \
> > -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \
> > -device vfio-pci-nohotplug,host=0000:75:00.1,bus=pcie.port1,iommufd=iommufd0 \
> > -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=8 \
> > -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \
> > -device vfio-pci-nohotplug,host=0000:7d:02.1,bus=pcie.port2,iommufd=iommufd0 \
>
> Silly bit of feedback on the interface, but why is the
> arm-smmuv3-nested property called "pci-bus" instead of just "bus"?
>
> All other devices that need to refer to an existing PCI bus use the
> latter. Is there a reason for this specific one to diverge?
Nope. This will be changed to "bus" in next respin.
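For scripts that already generate the prototype command line, the rename could be handled with a small helper until everything is regenerated. This is a sketch only, assuming the respin changes nothing but the property name from "pci-bus" to "bus" on the arm-smmuv3-nested device; the function name is made up for illustration.

```python
def rename_smmu_bus_property(device_arg, old="pci-bus", new="bus"):
    """Rename one property key in a '-device' value for arm-smmuv3-nested.

    Values for any other device driver are returned unchanged.
    """
    driver, *props = device_arg.split(",")
    if driver != "arm-smmuv3-nested":
        return device_arg
    renamed = []
    for prop in props:
        key, sep, value = prop.partition("=")
        renamed.append(f"{new}{sep}{value}" if key == old else prop)
    return ",".join([driver, *renamed])


# '-device' values taken from the prototype command line quoted above.
OLD_DEVICE_ARGS = [
    "pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0",
    "pcie-root-port,id=pcie.port1,bus=pcie.1",
    "arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1",
]

if __name__ == "__main__":
    for arg in OLD_DEVICE_ARGS:
        print("-device", rename_smmu_bus_property(arg))
```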
Thanks,
Shameer
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-11-01 11:55 ` Jason Gunthorpe
@ 2024-12-02 6:04 ` Zhangfei Gao
2024-12-02 8:07 ` Shameerali Kolothum Thodi via
0 siblings, 1 reply; 17+ messages in thread
From: Zhangfei Gao @ 2024-12-02 6:04 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Nicolin Chen, Shameerali Kolothum Thodi, Eric Auger,
Mostafa Saleh, qemu-arm@nongnu.org, qemu-devel@nongnu.org,
Peter Maydell, Jean-Philippe Brucker, Moritz Fischer,
Michael Shavit, Andrea Bolognani, Michael S. Tsirkin, Peter Xu,
nathanc, arighi, ianm, jan, mochs
Hi, Nico
On Fri, 1 Nov 2024 at 19:55, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Thu, Oct 31, 2024 at 09:09:20PM -0700, Nicolin Chen wrote:
>
> > FWIW, Robin requested a different solution for MSI mapping [1], v.s.
> > the RMR one that we have been using since Eric's work. I drafted a
> > few VFIO/IOMMUFD patches for that,
>
> I also talked to MarcZ about this at LPC and he seems willing to
> consider it. It took a bit to explain everything though. So I think we
> should try in Nov/Dec
When booting QEMU, it reports this:
qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
qemu-system-aarch64: vfio_container_dma_map(0xaaaadd30f110, 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
qemu-system-aarch64: vfio_container_dma_map(0xaaaadd2bc310, 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
qemu-system-aarch64: vfio_container_dma_map(0xaaaadcf90000, 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
Will this also be solved in the new MSI mapping patchset?
Thanks
* RE: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-12-02 6:04 ` Zhangfei Gao
@ 2024-12-02 8:07 ` Shameerali Kolothum Thodi via
2024-12-02 15:00 ` Zhangfei Gao
0 siblings, 1 reply; 17+ messages in thread
From: Shameerali Kolothum Thodi via @ 2024-12-02 8:07 UTC (permalink / raw)
To: Zhangfei Gao, Jason Gunthorpe
Cc: Nicolin Chen, Eric Auger, Mostafa Saleh, qemu-arm@nongnu.org,
qemu-devel@nongnu.org, Peter Maydell, Jean-Philippe Brucker,
Moritz Fischer, Michael Shavit, Andrea Bolognani,
Michael S. Tsirkin, Peter Xu, nathanc@nvidia.com,
arighi@nvidia.com, ianm@nvidia.com, jan@nvidia.com,
mochs@nvidia.com
> -----Original Message-----
> From: Zhangfei Gao <zhangfei.gao@linaro.org>
> Sent: Monday, December 2, 2024 6:05 AM
> To: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Nicolin Chen <nicolinc@nvidia.com>; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Eric Auger
> <eric.auger@redhat.com>; Mostafa Saleh <smostafa@google.com>; qemu-
> arm@nongnu.org; qemu-devel@nongnu.org; Peter Maydell
> <peter.maydell@linaro.org>; Jean-Philippe Brucker <jean-
> philippe@linaro.org>; Moritz Fischer <mdf@kernel.org>; Michael Shavit
> <mshavit@google.com>; Andrea Bolognani <abologna@redhat.com>;
> Michael S. Tsirkin <mst@redhat.com>; Peter Xu <peterx@redhat.com>;
> nathanc@nvidia.com; arighi@nvidia.com; ianm@nvidia.com;
> jan@nvidia.com; mochs@nvidia.com
> Subject: Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
>
> Hi, Nico
>
> On Fri, 1 Nov 2024 at 19:55, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Thu, Oct 31, 2024 at 09:09:20PM -0700, Nicolin Chen wrote:
> >
> > > FWIW, Robin requested a different solution for MSI mapping [1], v.s.
> > > the RMR one that we have been using since Eric's work. I drafted a
> > > few VFIO/IOMMUFD patches for that,
> >
> > I also talked to MarcZ about this at LPC and he seems willing to
> > consider it. It took a bit to explain everything though. So I think we
> > should try in Nov/Dec
>
> When boot qemu, reports this
>
> qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> BAR?
> qemu-system-aarch64: vfio_container_dma_map(0xaaaadd30f110,
> 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
> qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> BAR?
> qemu-system-aarch64: vfio_container_dma_map(0xaaaadd2bc310,
> 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
> qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> BAR?
> qemu-system-aarch64: vfio_container_dma_map(0xaaaadcf90000,
> 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
>
> Will this also be solved in the new MSI mapping patchset?
Nope. These are not related to MSIs. These mappings are required for
P2P DMA between devices, which IOMMUFD does not support at the moment.
See the discussion here:
https://lore.kernel.org/all/20220426103507.5693a0ca.alex.williamson@redhat.com/
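When triaging such logs, it can help to decode the failing call mechanically. The following is a rough Python sketch, assuming the four arguments in the message are the container pointer, the IOVA, the size and the host virtual address, in that order, and that the number after "=" is a negated Linux errno. The function and field names are made up for illustration.

```python
import errno
import os
import re

# Matches the vfio_container_dma_map(...) failure line as printed in the log above.
LINE_RE = re.compile(
    r"vfio_container_dma_map\((0x[0-9a-f]+),\s*(0x[0-9a-f]+),\s*"
    r"(0x[0-9a-f]+),\s*(0x[0-9a-f]+)\)\s*=\s*(-?\d+)",
    re.IGNORECASE,
)


def decode_dma_map_failure(line):
    """Extract IOVA, size, vaddr and the symbolic errno from one log line."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    _container, iova, size, vaddr, ret = match.groups()
    code = -int(ret)  # the log prints the negated errno value
    return {
        "iova": int(iova, 16),
        "size": int(size, 16),
        "vaddr": int(vaddr, 16),
        "errno": errno.errorcode.get(code, "unknown"),
        "message": os.strerror(code),
    }


if __name__ == "__main__":
    sample = (
        "qemu-system-aarch64: vfio_container_dma_map(0xaaaadd30f110, "
        "0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)"
    )
    print(decode_dma_map_failure(sample))
```

On Linux, errno 14 is EFAULT, which matches the "Bad address" text in the warning.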
Thanks,
Shameer
* Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
2024-12-02 8:07 ` Shameerali Kolothum Thodi via
@ 2024-12-02 15:00 ` Zhangfei Gao
0 siblings, 0 replies; 17+ messages in thread
From: Zhangfei Gao @ 2024-12-02 15:00 UTC (permalink / raw)
To: Shameerali Kolothum Thodi
Cc: Jason Gunthorpe, Nicolin Chen, Eric Auger, Mostafa Saleh,
qemu-arm@nongnu.org, qemu-devel@nongnu.org, Peter Maydell,
Jean-Philippe Brucker, Moritz Fischer, Michael Shavit,
Andrea Bolognani, Michael S. Tsirkin, Peter Xu,
nathanc@nvidia.com, arighi@nvidia.com, ianm@nvidia.com,
jan@nvidia.com, mochs@nvidia.com
On Mon, 2 Dec 2024 at 16:07, Shameerali Kolothum Thodi
<shameerali.kolothum.thodi@huawei.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Zhangfei Gao <zhangfei.gao@linaro.org>
> > Sent: Monday, December 2, 2024 6:05 AM
> > To: Jason Gunthorpe <jgg@nvidia.com>
> > Cc: Nicolin Chen <nicolinc@nvidia.com>; Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@huawei.com>; Eric Auger
> > <eric.auger@redhat.com>; Mostafa Saleh <smostafa@google.com>; qemu-
> > arm@nongnu.org; qemu-devel@nongnu.org; Peter Maydell
> > <peter.maydell@linaro.org>; Jean-Philippe Brucker <jean-
> > philippe@linaro.org>; Moritz Fischer <mdf@kernel.org>; Michael Shavit
> > <mshavit@google.com>; Andrea Bolognani <abologna@redhat.com>;
> > Michael S. Tsirkin <mst@redhat.com>; Peter Xu <peterx@redhat.com>;
> > nathanc@nvidia.com; arighi@nvidia.com; ianm@nvidia.com;
> > jan@nvidia.com; mochs@nvidia.com
> > Subject: Re: nested-smmuv3 topic for QEMU/libvirt, Nov 2024
> >
> > Hi, Nico
> >
> > On Fri, 1 Nov 2024 at 19:55, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Thu, Oct 31, 2024 at 09:09:20PM -0700, Nicolin Chen wrote:
> > >
> > > > FWIW, Robin requested a different solution for MSI mapping [1], v.s.
> > > > the RMR one that we have been using since Eric's work. I drafted a
> > > > few VFIO/IOMMUFD patches for that,
> > >
> > > I also talked to MarcZ about this at LPC and he seems willing to
> > > consider it. It took a bit to explain everything though. So I think we
> > > should try in Nov/Dec
> >
> > When boot qemu, reports this
> >
> > qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> > BAR?
> > qemu-system-aarch64: vfio_container_dma_map(0xaaaadd30f110,
> > 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
> > qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> > BAR?
> > qemu-system-aarch64: vfio_container_dma_map(0xaaaadd2bc310,
> > 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
> > qemu-system-aarch64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI
> > BAR?
> > qemu-system-aarch64: vfio_container_dma_map(0xaaaadcf90000,
> > 0x8000200000, 0x10000, 0xffffb8031000) = -14 (Bad address)
> >
> > Will this also be solved in the new MSI mapping patchset?
>
> Nope. These are not related to MSIs. These mappings are required for
> P2P DMA between devices and is not supported by IOMMUFD at the moment.
>
> See the discussion here,
> https://lore.kernel.org/all/20220426103507.5693a0ca.alex.williamson@redhat.com/
Thanks Shameer
Thread overview: 17+ messages
2024-11-01 4:09 nested-smmuv3 topic for QEMU/libvirt, Nov 2024 Nicolin Chen
2024-11-01 11:55 ` Jason Gunthorpe
2024-12-02 6:04 ` Zhangfei Gao
2024-12-02 8:07 ` Shameerali Kolothum Thodi via
2024-12-02 15:00 ` Zhangfei Gao
2024-11-01 18:35 ` Shameerali Kolothum Thodi via
2024-11-01 19:29 ` Nicolin Chen
2024-11-08 13:05 ` Shameerali Kolothum Thodi via
2024-11-20 21:13 ` Andrea Bolognani
2024-11-21 9:47 ` Shameerali Kolothum Thodi via
2024-11-07 11:11 ` Eric Auger
2024-11-07 20:31 ` Nicolin Chen
2024-11-07 22:10 ` Donald Dutile
2024-11-18 17:59 ` Eric Auger
2024-11-19 7:07 ` Duan, Zhenzhong
2024-11-19 8:17 ` Eric Auger
2024-11-20 20:44 ` Nicolin Chen