From: Tushar Dave <tdave@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Vasant Hegde <vasant.hegde@amd.com>
Cc: Baolu Lu <baolu.lu@linux.intel.com>,
joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
kevin.tian@intel.com, yi.l.liu@intel.com, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH rc] iommu: Skip PASID validation for devices without PASID capability
Date: Thu, 24 Apr 2025 17:49:20 -0700
Message-ID: <77be6671-e4e8-4b17-bf72-74bde325671a@nvidia.com>
In-Reply-To: <20250424123156.GO1648741@nvidia.com>
On 4/24/25 05:31, Jason Gunthorpe wrote:
> On Thu, Apr 24, 2025 at 12:08:56PM +0530, Vasant Hegde wrote:
>
>>> What the iommu driver should do when set_dev_pasid is called for a non-
>>> PASID device?
>
> That's a good point, maybe the core code should filter that out based
> on max_pasids? I think we do run into trouble here because the drivers
> are allocating PASID table space based on max_pasids so the non-pasid
> device should fail to add the pasid. Tushar, you should have hit this
> in your testing???
When we have a multi-device group with a PASID device and non-PASID devices,
set_dev_pasid doesn't fail for the non-PASID devices in my testing.
Here is an example topology and a bit more detail:
0008:00:00.0 root_port
└─0008:01:00.0 upstream_port
  ├─0008:02:00.0 downstream_port
  │ └─0008:03:00.0 endpoint (NIC DMA-PF)
  └─0008:02:03.0 downstream_port
    └─0008:04:00.0 upstream_port
      └─0008:05:00.0 downstream_port
        └─0008:06:00.0 endpoint (GPU)
In the above topology, we set up ACS flags on DSPs 0008:02:03.0 and 0008:02:00.0
to achieve the desired p2p configuration for the GPU and DMA-PF.
Apparently, this creates a multi-device group with the GPU being the only device
with PASID support in that group. In this case, the set_dev_pasid() op is invoked
for each device within the group with pasid=1 and doesn't fail.
e.g.
...
..
.
pcieport 0008:02:03.0: debug: __iommu_set_group_pasid(): pasid=1 dev->iommu->max_pasids=0 iommu_group 30
pcieport 0008:02:03.0: debug: __iommu_set_group_pasid(): ret 0
pcieport 0008:04:00.0: debug: __iommu_set_group_pasid(): pasid=1 dev->iommu->max_pasids=0 iommu_group 30
pcieport 0008:04:00.0: debug: __iommu_set_group_pasid(): ret 0
pcieport 0008:05:00.0: debug: __iommu_set_group_pasid(): pasid=1 dev->iommu->max_pasids=0 iommu_group 30
pcieport 0008:05:00.0: debug: __iommu_set_group_pasid(): ret 0
nvidia 0008:06:00.0: debug: __iommu_set_group_pasid(): pasid=1 dev->iommu->max_pasids=1048576 iommu_group 30
nvidia 0008:06:00.0: debug: __iommu_set_group_pasid(): ret 0
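To make the debug output easier to follow, here is a minimal user-space
sketch (not kernel code) of the behaviour the log shows: the PASID is set on
every device in the group, whatever its max_pasids value, and each call
returns 0. The struct model_dev type and the model_* functions are
hypothetical stand-ins invented for this illustration. The always-succeeding
driver op is an assumption that mirrors the "ret 0" lines above, not a
statement about any particular IOMMU driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a group member; not a kernel type. */
struct model_dev {
	const char *name;
	uint32_t max_pasids;	/* 0 means the device has no PASID capability */
	int pasid_attached;	/* set once the modelled op has run for this device */
};

/*
 * Stand-in for a driver's set_dev_pasid op. In this model it always
 * succeeds, matching the "ret 0" lines in the debug log (assumption).
 */
static int model_set_dev_pasid(struct model_dev *dev, uint32_t pasid)
{
	(void)pasid;
	dev->pasid_attached = 1;
	return 0;
}

/*
 * Walk every device in the group and invoke the op on each one,
 * stopping at the first failure. No device is skipped based on
 * max_pasids, which is what the log above shows.
 */
static int model_set_group_pasid(struct model_dev *devs, size_t n,
				 uint32_t pasid)
{
	for (size_t i = 0; i < n; i++) {
		int ret = model_set_dev_pasid(&devs[i], pasid);

		if (ret)
			return ret;
	}
	return 0;
}
```

Feeding it the four devices from the log (three ports with max_pasids=0 and
the GPU with max_pasids=1048576) and pasid=1 returns 0 and marks all four as
attached.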
IMO this outcome is expected. Quoting the text from commit
https://github.com/torvalds/linux/commit/16603704559c7a68718059c4f75287886c01b20f
"If multiple devices share a single group, it's fine as long the fabric
always routes every TLP marked with a PASID to the host bridge and only
the host bridge. For example, ACS achieves this universally and has been
checked when pci_enable_pasid() is called. As we can't reliably tell the
source apart in a group, all the devices in a group have to be considered
as the same source, and mapped to the same PASID table."
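For reference, the check named in the subject line could be modelled in user
space roughly like this. It is only a sketch: it assumes the validation
compares the requested PASID against each group member's max_pasids and that
members with max_pasids == 0 are skipped because they cannot issue PASID-tagged
TLPs. struct model_member and model_validate_group_pasid() are hypothetical
names for this illustration, not kernel symbols.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device PASID limit of a group member. */
struct model_member {
	uint32_t max_pasids;	/* 0 means no PASID capability */
};

/*
 * Return 0 if the PASID is acceptable for the whole group, -EINVAL otherwise.
 * Members without PASID capability are skipped (assumption based on the
 * patch subject); PASID-capable members must have pasid < max_pasids.
 */
static int model_validate_group_pasid(const struct model_member *members,
				      size_t n, uint32_t pasid)
{
	for (size_t i = 0; i < n; i++) {
		if (members[i].max_pasids == 0)
			continue;	/* non-PASID device: nothing to validate */
		if (pasid >= members[i].max_pasids)
			return -EINVAL;
	}
	return 0;
}
```

With the group from the log above (max_pasids of 0, 0, 0 and 1048576),
pasid=1 passes, while pasid=1048576 is rejected with -EINVAL because it is
out of range for the GPU.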
-Tushar
>
> We also have a problem setting up the default domain - it won't
> compute IOMMU_HWPT_ALLOC_PASID properly across the group. If the
> no-pasid device probes first then PASID will be broken on the group.
>
> Tushar isn't hitting this because ARM always uses a PASID compatible
> domain today, but it will not work on AMD.
>
> That's a huge pain to deal with :\
>
>> Per device max_pasids check should cover that right?
>
> The driver shouldn't be doing this though, if the driver is told to
> make a pasid then it should make a pasid.. The driver can fail
> attaching a pasid to a device that is over the device's max_pasid.
>
>> FYI. One example of such device is some of the AMD GPUs which has
>> both VGA and audio in same group. while VGA supports PASID, audio is
>> not. This used to work fine when we had AMD IOMMU PASID specific
>> driver. GPUs stopped using PASIDs in upstream kernel. So I didn't
>> look into this part in details.
>
> Uhhh.. That sounds like a worse problem, the only way you should end
> up with same group is if the ACS flags are missing on the GPU so Linux
> assumes the VGA and audio can loopback to each other internally.
>
> That should completely block PASID support on the GPU side due the
> wrong routing. We can't have a hole in the PASID address space where
> the audio BAR is.
>
> I suppose the HW doesn't actually behave this way but since it doesn't
> have the right ACS flags the SW doesn't know? Guessing..
>
> Jason