qemu-devel.nongnu.org archive mirror
From: Shameerali Kolothum Thodi via <qemu-devel@nongnu.org>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: "qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"ddutile@redhat.com" <ddutile@redhat.com>,
	Linuxarm <linuxarm@huawei.com>,
	"Wangzhou (B)" <wangzhou1@hisilicon.com>,
	jiangkunkun <jiangkunkun@huawei.com>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	"zhangfei.gao@linaro.org" <zhangfei.gao@linaro.org>,
	"nathanc@nvidia.com" <nathanc@nvidia.com>
Subject: RE: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Date: Thu, 6 Feb 2025 10:02:25 +0000
Message-ID: <47d2c2556d794d87abf440263b2f7cd8@huawei.com>
In-Reply-To: <Z51DmtP83741RAsb@redhat.com>

Hi Daniel,

> -----Original Message-----
> From: Daniel P. Berrangé <berrange@redhat.com>
> Sent: Friday, January 31, 2025 9:42 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com;
> nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org
> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> nested SMMUv3
> 
> On Thu, Jan 30, 2025 at 06:09:24PM +0000, Shameerali Kolothum Thodi
> wrote:
> >
> > Each "arm-smmuv3-nested" instance, when the first device gets attached
> > to it, will create a S2 HWPT and a corresponding SMMUv3 domain in kernel
> > SMMUv3 driver. This domain will have a pointer representing the physical
> > SMMUv3 that the device belongs. And any other device which belongs to
> > the same physical SMMUv3 can share this S2 domain.
> 
> Ok, so given two guest SMMUv3s,   A and B, and two host SMMUv3s,
> C and D, we could end up with A&C and B&D paired, or we could
> end up with A&D and B&C paired, depending on whether we plug
> the first VFIO device into guest SMMUv3  A or B.
> 
> This is bad.  Behaviour must not vary depending on the order
> in which we create devices.
> 
> A guest SMMUv3 is paired to a guest PXB. A guest PXB is liable
> to be paired to a guest NUMA node. A guest NUMA node is liable
> to be paired to a host NUMA node. The guest/host SMMU pairing
> must be chosen such that it makes conceptual sense wrt the
> guest PXB NUMA to host NUMA pairing.
> 
> If the kernel picks guest<->host SMMU pairings on a first-device
> first-paired basis, this can end up with incorrect guest NUMA
> configurations.

Ok. I am trying to understand how this can happen, as I assume the
guest PXB NUMA node is chosen based on the device we are attaching
to it, i.e. on the numa_id that device belongs to on the physical
host.

And the physical SMMUv3's NUMA id will be the same as that of the
device it is associated with, won't it?
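
For reference, the device-side NUMA node I am referring to is the one the
host kernel reports in sysfs. A minimal sketch of reading it, assuming the
standard /sys/bus/pci/devices/<BDF>/numa_node attribute; the helper name and
the sysfs_root parameter are made up here purely for illustration:

```python
from pathlib import Path


def pci_numa_node(bdf, sysfs_root="/sys"):
    """Return the host NUMA node of a PCI device, or None if unknown.

    Reads <sysfs_root>/bus/pci/devices/<bdf>/numa_node. The kernel
    reports -1 when no node is associated with the device.
    """
    path = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        # Device not present, or the attribute is not readable/parsable.
        return None
    return node if node >= 0 else None
```

Something like pci_numa_node("0000:7d:02.2") would then give the value a
mgmt app could use for the PXB numa_id. This only covers the device side;
it says nothing about which physical SMMUv3 the device sits behind.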

For example, I have a system here that has 8 physical SMMUv3s, and the
NUMA assignments on it are something like below:

Phys SMMUv3.0 --> node 0
  \..dev1 --> node 0
Phys SMMUv3.1 --> node 0
  \..dev2 --> node 0
Phys SMMUv3.2 --> node 0
Phys SMMUv3.3 --> node 0

Phys SMMUv3.4 --> node 1
Phys SMMUv3.5 --> node 1
  \..dev5 --> node 1
Phys SMMUv3.6 --> node 1
Phys SMMUv3.7 --> node 1


If I have to assign, say, dev1, dev2 and dev5 to a guest, we need to specify
3 "arm-smmuv3-accel" instances, as they belong to different physical SMMUv3s.

-device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0,numa_id=0 \
-device pxb-pcie,id=pcie.2,bus_nr=2,bus=pcie.0,numa_id=0 \
-device pxb-pcie,id=pcie.3,bus_nr=3,bus=pcie.0,numa_id=1 \
-device arm-smmuv3-accel,id=smmuv1,bus=pcie.1 \
-device arm-smmuv3-accel,id=smmuv2,bus=pcie.2 \
-device arm-smmuv3-accel,id=smmuv3,bus=pcie.3 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
-device pcie-root-port,id=pcie.port3,bus=pcie.3,chassis=3 \
-device vfio-pci,host=0000:dev1,bus=pcie.port1,iommufd=iommufd0 \
-device vfio-pci,host=0000:dev2,bus=pcie.port2,iommufd=iommufd0 \
-device vfio-pci,host=0000:dev5,bus=pcie.port3,iommufd=iommufd0
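
The rule I am assuming for the above can be written down as a small sketch.
The table is just the example topology from this mail (dev1/dev2/dev5 and
the smmuv3.N names are placeholders, not real identifiers), and the function
name is made up for illustration:

```python
# Example topology from this mail: device -> (physical SMMUv3, host NUMA node).
# dev1/dev2/dev5 are placeholders, not real PCI addresses.
HOST_TOPOLOGY = {
    "dev1": ("smmuv3.0", 0),
    "dev2": ("smmuv3.1", 0),
    "dev5": ("smmuv3.5", 1),
}


def plan_guest_smmus(devices, topology=HOST_TOPOLOGY):
    """Group devices by physical SMMUv3; one guest SMMUv3/PXB per group.

    Returns a list of (phys_smmu, numa_id, [devices]) sorted by physical
    SMMUv3 name, so the result does not depend on the order in which
    the devices are listed.
    """
    groups = {}
    for dev in devices:
        smmu, node = topology[dev]
        # First device seen for this SMMU creates the group; others join it.
        groups.setdefault(smmu, (node, []))[1].append(dev)
    return [(smmu, node, sorted(devs))
            for smmu, (node, devs) in sorted(groups.items())]
```

For dev1, dev2 and dev5 this gives three groups, i.e. three guest SMMUv3
instances, with the PXB numa_id simply following the device's host node.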

So I guess even if we don't specify the physical SMMUv3 association
explicitly, the kernel will check it based on the devices attached to
the guest SMMUv3 (and hence the NUMA association), right?

In other words, how does an explicit association help us here?

Or is it that the guest PXB numa_id allocation is not always based
on the device numa_id?

(Maybe I am missing something here. Sorry)

Thanks,
Shameer

> The mgmt apps need to be able to tell QEMU exactly which
> host SMMU to pair with each guest SMMU, and QEMU needs to
> then tell the kernel.
> 
> > And as I mentioned in cover letter, Qemu will report,
> >
> > "
> > Attempt to add the HNS VF to a different SMMUv3 will result in,
> >
> > -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: Unable to attach viommu
> > -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: vfio 0000:7d:02.2:
> >    Failed to set iommu_device: [iommufd=29] error attach 0000:7d:02.2 (38) to id=11: Invalid argument
> >
> > At present Qemu is not doing any extra validation other than the above
> > failure to make sure the user configuration is correct or not. The
> > assumption is libvirt will take care of this.
> > "
> > So in summary, if the libvirt gets it wrong, Qemu will fail with error.
> 
> That's good error checking, and required, but also insufficient
> as illustrated above IMHO.
> 
> > If a more explicit association is required, some help from kernel is
> > required to identify the physical SMMUv3 associated with the device.
> 
> Yep, I think SMMUv3 info for devices needs to be exposed to userspace,
> as well as a mechanism for QEMU to tell the kernel the SMMU mapping.
> 
> 
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 


