From: Tushar Dave <tdave@nvidia.com>
To: Jason Gunthorpe <jgg@nvidia.com>, Vasant Hegde <vasant.hegde@amd.com>
Cc: Baolu Lu <baolu.lu@linux.intel.com>,
	joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
	kevin.tian@intel.com, yi.l.liu@intel.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH rc] iommu: Skip PASID validation for devices without PASID capability
Date: Thu, 24 Apr 2025 17:49:20 -0700
Message-ID: <77be6671-e4e8-4b17-bf72-74bde325671a@nvidia.com>
In-Reply-To: <20250424123156.GO1648741@nvidia.com>



On 4/24/25 05:31, Jason Gunthorpe wrote:
> On Thu, Apr 24, 2025 at 12:08:56PM +0530, Vasant Hegde wrote:
> 
>>> What the iommu driver should do when set_dev_pasid is called for a non-
>>> PASID device?
> 
> That's a good point, maybe the core code should filter that out based
> on max_pasids? I think we do run into trouble here because the drivers
> are allocating PASID table space based on max_pasids so the non-pasid
> device should fail to add the pasid. Tushar, you should have hit this
> in your testing???

When we have a multi-device group containing a PASID device and non-PASID
devices, set_dev_pasid doesn't fail for the non-PASID devices in my testing.

Here is the example topology and a bit more detail:

0008:00:00.0 root_port
  └─0008:01:00.0 upstream_port
     ├─0008:02:00.0 downstream_port
     │  └─0008:03:00.0 endpoint (NIC DMA-PF)
     └─0008:02:03.0 downstream_port
        └─0008:04:00.0 upstream_port
           └─0008:05:00.0 downstream_port
              └─0008:06:00.0 endpoint (GPU)


In the above topology, we set up ACS flags on the DSPs 0008:02:03.0 and
0008:02:00.0 to achieve the desired p2p configuration for the GPU and DMA-PF.
Apparently, this creates a multi-device group in which the GPU is the only
device with PASID support. In this case, the set_dev_pasid() op is invoked for
each device within the group with pasid=1 and doesn't fail.

e.g.

  ...
  ..
  .
  pcieport 0008:02:03.0: debug: __iommu_set_group_pasid():  pasid=1 dev->iommu->max_pasids=0 iommu_group 30
  pcieport 0008:02:03.0: debug: __iommu_set_group_pasid():  ret 0
  pcieport 0008:04:00.0: debug: __iommu_set_group_pasid():  pasid=1 dev->iommu->max_pasids=0 iommu_group 30
  pcieport 0008:04:00.0: debug: __iommu_set_group_pasid():  ret 0
  pcieport 0008:05:00.0: debug: __iommu_set_group_pasid():  pasid=1 dev->iommu->max_pasids=0 iommu_group 30
  pcieport 0008:05:00.0: debug: __iommu_set_group_pasid():  ret 0
  nvidia 0008:06:00.0: debug: __iommu_set_group_pasid():  pasid=1 dev->iommu->max_pasids=1048576 iommu_group 30
  nvidia 0008:06:00.0: debug: __iommu_set_group_pasid():  ret 0
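
To make the numbers in the log concrete, here is a rough userspace model of
the group-wide PASID validation that the patch subject refers to. It is only a
sketch and not the kernel code: struct model_dev, group30, both function names,
and the -ENOENT return value are assumptions made for illustration. Only the
four devices visible in the log are modeled. A strict check of pasid against
every device's max_pasids rejects pasid=1 for this group because of the
pcieport devices, while a check that skips devices with max_pasids=0 accepts it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-device IOMMU state; not a kernel struct. */
struct model_dev {
	const char *bdf;        /* PCI address as printed in the log */
	uint32_t max_pasids;    /* 0 means no PASID capability */
};

/* The four devices of iommu_group 30 that appear in the debug log. */
const struct model_dev group30[] = {
	{ "0008:02:03.0", 0 },
	{ "0008:04:00.0", 0 },
	{ "0008:05:00.0", 0 },
	{ "0008:06:00.0", 1048576 },
};
const size_t group30_len = sizeof(group30) / sizeof(group30[0]);

/* Strict variant: every device in the group must be able to hold the PASID. */
int validate_pasid_strict(const struct model_dev *devs, size_t n,
			  uint32_t pasid)
{
	assert(devs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (pasid >= devs[i].max_pasids)
			return -ENOENT; /* assumed error code */
	}
	return 0;
}

/*
 * Relaxed variant: devices without PASID capability (max_pasids == 0)
 * are skipped, so only PASID-capable devices bound the usable range.
 */
int validate_pasid_skip_non_pasid(const struct model_dev *devs, size_t n,
				  uint32_t pasid)
{
	assert(devs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].max_pasids > 0 && pasid >= devs[i].max_pasids)
			return -ENOENT; /* assumed error code */
	}
	return 0;
}
```

With the group above, validate_pasid_strict(group30, group30_len, 1) returns
-ENOENT, whereas validate_pasid_skip_non_pasid(group30, group30_len, 1)
returns 0. A PASID at or beyond the GPU's limit is still rejected by both.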


IMO this outcome is expected. Quoting the text of commit
https://github.com/torvalds/linux/commit/16603704559c7a68718059c4f75287886c01b20f


"If multiple devices share a single group, it's fine as long the fabric
always routes every TLP marked with a PASID to the host bridge and only
the host bridge. For example, ACS achieves this universally and has been
checked when pci_enable_pasid() is called. As we can't reliably tell the
source apart in a group, all the devices in a group have to be considered
as the same source, and mapped to the same PASID table."

-Tushar


> 
> We also have a problem setting up the default domain - it won't
> compute IOMMU_HWPT_ALLOC_PASID properly across the group. If the
> no-pasid device probes first then PASID will be broken on the group.
> 
> Tushar isn't hitting this because ARM always uses a PASID compatible
> domain today, but it will not work on AMD.
> 
> That's a huge pain to deal with :\
> 
>> Per device max_pasids check should cover that right?
> 
> The driver shouldn't be doing this though, if the driver is told to
> make a pasid then it should make a pasid.. The driver can fail
> attaching a pasid to a device that is over the device's max_pasid.
> 
>> FYI. One example of such device is some of the AMD GPUs which has
>> both VGA and audio in same group. while VGA supports PASID, audio is
>> not. This used to work fine when we had AMD IOMMU PASID specific
>> driver. GPUs stopped using PASIDs in upstream kernel. So I didn't
>> look into this part in details.
> 
> Uhhh.. That sounds like a worse problem, the only way you should end
> up with same group is if the ACS flags are missing on the GPU so Linux
> assumes the VGA and audio can loopback to each other internally.
> 
> That should completely block PASID support on the GPU side due the
> wrong routing. We can't have a hole in the PASID address space where
> the audio BAR is.
> 
> I suppose the HW doesn't actually behave this way but since it doesn't
> have the right ACS flags the SW doesn't know? Guessing..
> 
> Jason

