[RFC] iommufd: Use accurate dev_id in the PRI forwarding path

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
@ 2025-03-05 13:03 Yi Liu
  2025-03-05 17:18 ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: Yi Liu @ 2025-03-05 13:03 UTC (permalink / raw)
  To: kevin.tian, baolu.lu, jgg; +Cc: yi.l.liu, iommu

Hi Jason, Kevin, Baolu,

This is more a query, the patch here does not really work as there
is locking issue. So I'd like to hear from you if any good idea.

Detail as below:

Existing code has a problem when the PRI happens on a device that has
alias. The PRI reporting path uses the idev stored in the handle. While
this idev is the first idev that attachs to the domain.  If the PRI
happens on devices other than the first attached device, then the idev
stored in handle does not match.

To solve it, the PRI reporting path needs to loop the attached devices
to get the correct idev and its dev_id. The trouble is that the device_list
is protected by igroup->lock. While the PRI path hold the fault->mutex
lock. This fault->mutex lock might be used in the attach path or the
replace path. Both path would hold igroup->lock first, and then the
fault->mutex if auto_response needed. This will have A-B-B-A locking
issue.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c          | 31 +++++++++++++++++++------
 drivers/iommu/iommufd/fault.c           |  4 +++-
 drivers/iommu/iommufd/iommufd_private.h |  5 +++-
 3 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index bd50146e2ad0..586d008d6bac 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -390,26 +390,26 @@ int iommufd_sw_msi(struct iommu_domain *domain, struct msi_desc *desc,
 
 	handle = to_iommufd_handle(raw_handle);
 	/* No IOMMU_RESV_SW_MSI means no change to the msi_msg */
-	if (handle->idev->igroup->sw_msi_start == PHYS_ADDR_MAX)
+	if (handle->igroup->sw_msi_start == PHYS_ADDR_MAX)
 		return 0;
 
-	ictx = handle->idev->ictx;
+	ictx = handle->igroup->ictx;
 	guard(mutex)(&ictx->sw_msi_lock);
 	/*
 	 * The input msi_addr is the exact byte offset of the MSI doorbell, we
 	 * assume the caller has checked that it is contained with a MMIO region
 	 * that is secure to map at PAGE_SIZE.
 	 */
-	msi_map = iommufd_sw_msi_get_map(handle->idev->ictx,
+	msi_map = iommufd_sw_msi_get_map(ictx,
 					 msi_addr & PAGE_MASK,
-					 handle->idev->igroup->sw_msi_start);
+					 handle->igroup->sw_msi_start);
 	if (IS_ERR(msi_map))
 		return PTR_ERR(msi_map);
 
 	rc = iommufd_sw_msi_install(ictx, hwpt_paging, msi_map);
 	if (rc)
 		return rc;
-	__set_bit(msi_map->id, handle->idev->igroup->required_sw_msi.bitmap);
+	__set_bit(msi_map->id, handle->igroup->required_sw_msi.bitmap);
 
 	iova = msi_map->sw_msi_start + msi_map->pgoff * PAGE_SIZE;
 	msi_desc_set_iommu_msi_iova(desc, iova, PAGE_SHIFT);
@@ -482,6 +482,26 @@ static bool iommufd_device_is_attached(struct iommufd_device *idev)
 	return false;
 }
 
+struct iommufd_device *
+iommufd_group_find_device(struct iommufd_group *igroup, struct device *dev)
+{
+	struct iommufd_device *cur, *idev = NULL;
+
+	/* !!! Headsup !!!
+	 * taking igroup->lock has dead lock with fault->mutex
+	 */
+	mutex_lock(&igroup->lock);
+	list_for_each_entry(cur, &igroup->device_list, group_item) {
+		if (cur->dev == dev) {
+			idev = cur;
+			break;
+		}
+	}
+	mutex_unlock(&igroup->lock);
+
+	return idev;
+}
+
 static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
 				      struct iommufd_device *idev)
 {
@@ -500,7 +517,7 @@ static int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt,
 			goto out_free_handle;
 	}
 
-	handle->idev = idev;
+	handle->igroup = idev->igroup;
 	rc = iommu_attach_group_handle(hwpt->domain, idev->igroup->group,
 				       &handle->handle);
 	if (rc)
@@ -562,7 +579,7 @@ static int iommufd_hwpt_replace_device(struct iommufd_device *idev,
 			goto out_free_handle;
 	}
 
-	handle->idev = idev;
+	handle->igroup = idev->igroup;
 	rc = iommu_replace_group_handle(idev->igroup->group, hwpt->domain,
 					&handle->handle);
 	if (rc)
diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c
index c48d72c9668c..b7e0283d024a 100644
--- a/drivers/iommu/iommufd/fault.c
+++ b/drivers/iommu/iommufd/fault.c
@@ -168,7 +168,11 @@ static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *buf,
 			break;
 		}
-		idev = to_iommufd_handle(group->attach_handle)->idev;
+		/* Or just add a WARN_ON if the idev->dev does not match
+		   group->fault_param->dev ?*/
+		idev = iommufd_group_find_device(
+				to_iommufd_handle(group->attach_handle)->igroup,
+				group->fault_param->dev);
 		list_for_each_entry(iopf, &group->faults, list) {
 			iommufd_compose_fault_message(&iopf->fault,
 						      &data, idev,
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 246297452a44..df0db45deac0 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -433,6 +433,9 @@ iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
 			    struct iommufd_device, obj);
 }
 
+struct iommufd_device *
+iommufd_group_find_device(struct iommufd_group *igroup, struct device *dev);
+
 void iommufd_device_destroy(struct iommufd_object *obj);
 int iommufd_get_hw_info(struct iommufd_ucmd *ucmd);
 
@@ -499,7 +502,7 @@ static inline void iommufd_fault_deliver_restore(struct iommufd_fault *fault,
 
 struct iommufd_attach_handle {
 	struct iommu_attach_handle handle;
-	struct iommufd_device *idev;
+	struct iommufd_group *igroup;
 };
 
 /* Convert an iommu attach handle to iommufd handle. */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-05 13:03 [RFC] iommufd: Use accurate dev_id in the PRI forwarding path Yi Liu
@ 2025-03-05 17:18 ` Jason Gunthorpe
  2025-03-06  3:10   ` Tian, Kevin
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2025-03-05 17:18 UTC (permalink / raw)
  To: Yi Liu; +Cc: kevin.tian, baolu.lu, iommu

On Wed, Mar 05, 2025 at 05:03:56AM -0800, Yi Liu wrote:
> Hi Jason, Kevin, Baolu,
> 
> This is more a query, the patch here does not really work as there
> is locking issue. So I'd like to hear from you if any good idea.
> 
> Detail as below:
> 
> Existing code has a problem when the PRI happens on a device that has
> alias. The PRI reporting path uses the idev stored in the handle. While
> this idev is the first idev that attachs to the domain.  If the PRI
> happens on devices other than the first attached device, then the idev
> stored in handle does not match.

Do you actually care about this? Can we say that devices that have
aliases do not allow PRI?

I'm confused about how that can even work if the RID is erased in the
fabric, how do we deliver the PRI response to the correct entity?

Or do you mean something else by alias?

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-05 17:18 ` Jason Gunthorpe
@ 2025-03-06  3:10   ` Tian, Kevin
  2025-03-06  3:19     ` Baolu Lu
  2025-03-06  3:30     ` Yi Liu
  0 siblings, 2 replies; 12+ messages in thread
From: Tian, Kevin @ 2025-03-06  3:10 UTC (permalink / raw)
  To: Jason Gunthorpe, Liu, Yi L
  Cc: baolu.lu@linux.intel.com, iommu@lists.linux.dev

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, March 6, 2025 1:19 AM
> 
> On Wed, Mar 05, 2025 at 05:03:56AM -0800, Yi Liu wrote:
> > Hi Jason, Kevin, Baolu,
> >
> > This is more a query, the patch here does not really work as there
> > is locking issue. So I'd like to hear from you if any good idea.
> >
> > Detail as below:
> >
> > Existing code has a problem when the PRI happens on a device that has
> > alias. The PRI reporting path uses the idev stored in the handle. While
> > this idev is the first idev that attachs to the domain.  If the PRI
> > happens on devices other than the first attached device, then the idev
> > stored in handle does not match.
> 
> Do you actually care about this? Can we say that devices that have
> aliases do not allow PRI?
> 
> I'm confused about how that can even work if the RID is erased in the
> fabric, how do we deliver the PRI response to the correct entity?
> 

Another tricky thing is when some devices in the group support PRI
but others don't...

I'm inclined to agree that it's simpler to disable PRI on non-singleton
group, than trying to fix and eanble it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  3:10   ` Tian, Kevin
@ 2025-03-06  3:19     ` Baolu Lu
  2025-03-06  3:30     ` Yi Liu
  1 sibling, 0 replies; 12+ messages in thread
From: Baolu Lu @ 2025-03-06  3:19 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe, Liu, Yi L; +Cc: iommu@lists.linux.dev

On 3/6/25 11:10, Tian, Kevin wrote:
>> From: Jason Gunthorpe<jgg@nvidia.com>
>> Sent: Thursday, March 6, 2025 1:19 AM
>>
>> On Wed, Mar 05, 2025 at 05:03:56AM -0800, Yi Liu wrote:
>>> Hi Jason, Kevin, Baolu,
>>>
>>> This is more a query, the patch here does not really work as there
>>> is locking issue. So I'd like to hear from you if any good idea.
>>>
>>> Detail as below:
>>>
>>> Existing code has a problem when the PRI happens on a device that has
>>> alias. The PRI reporting path uses the idev stored in the handle. While
>>> this idev is the first idev that attachs to the domain.  If the PRI
>>> happens on devices other than the first attached device, then the idev
>>> stored in handle does not match.
>> Do you actually care about this? Can we say that devices that have
>> aliases do not allow PRI?
>>
>> I'm confused about how that can even work if the RID is erased in the
>> fabric, how do we deliver the PRI response to the correct entity?
>>
> Another tricky thing is when some devices in the group support PRI
> but others don't...
> 
> I'm inclined to agree that it's simpler to disable PRI on non-singleton
> group, than trying to fix and eanble it.

+1

Thanks,
baolu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  3:10   ` Tian, Kevin
  2025-03-06  3:19     ` Baolu Lu
@ 2025-03-06  3:30     ` Yi Liu
  2025-03-06  4:10       ` Tian, Kevin
  2025-03-06  7:09       ` Yi Liu
  1 sibling, 2 replies; 12+ messages in thread
From: Yi Liu @ 2025-03-06  3:30 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: baolu.lu@linux.intel.com, iommu@lists.linux.dev

On 2025/3/6 11:10, Tian, Kevin wrote:
>> From: Jason Gunthorpe <jgg@nvidia.com>
>> Sent: Thursday, March 6, 2025 1:19 AM
>>
>> On Wed, Mar 05, 2025 at 05:03:56AM -0800, Yi Liu wrote:
>>> Hi Jason, Kevin, Baolu,
>>>
>>> This is more a query, the patch here does not really work as there
>>> is locking issue. So I'd like to hear from you if any good idea.
>>>
>>> Detail as below:
>>>
>>> Existing code has a problem when the PRI happens on a device that has
>>> alias. The PRI reporting path uses the idev stored in the handle. While
>>> this idev is the first idev that attachs to the domain.  If the PRI
>>> happens on devices other than the first attached device, then the idev
>>> stored in handle does not match.
>>
>> Do you actually care about this? Can we say that devices that have
>> aliases do not allow PRI?

It would be simpler if we can enforce it.

>> I'm confused about how that can even work if the RID is erased in the
>> fabric, how do we deliver the PRI response to the correct entity?
>>

If the RID is detached, kernel should response any pending PRIs. If user
sends a response, it won't suit any pending PRI then dropped.

> 
> Another tricky thing is when some devices in the group support PRI
> but others don't...
> 
> I'm inclined to agree that it's simpler to disable PRI on non-singleton
> group, than trying to fix and eanble it.

yes. This also tricky.

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  3:30     ` Yi Liu
@ 2025-03-06  4:10       ` Tian, Kevin
  2025-03-06  4:55         ` Baolu Lu
  2025-03-06  5:01         ` Yi Liu
  2025-03-06  7:09       ` Yi Liu
  1 sibling, 2 replies; 12+ messages in thread
From: Tian, Kevin @ 2025-03-06  4:10 UTC (permalink / raw)
  To: Liu, Yi L, Jason Gunthorpe
  Cc: baolu.lu@linux.intel.com, iommu@lists.linux.dev

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, March 6, 2025 11:30 AM
> 
> On 2025/3/6 11:10, Tian, Kevin wrote:
> >> From: Jason Gunthorpe <jgg@nvidia.com>
> >> Sent: Thursday, March 6, 2025 1:19 AM
> >>
> >> I'm confused about how that can even work if the RID is erased in the
> >> fabric, how do we deliver the PRI response to the correct entity?
> >>
> 
> If the RID is detached, kernel should response any pending PRIs. If user
> sends a response, it won't suit any pending PRI then dropped.
> 

Jason talked about the case where the iommu group is created e.g
for a legacy PCI bridge, with all devices behind sharing a same RID. 
Then there is no way to know which device should receive the response.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  4:10       ` Tian, Kevin
@ 2025-03-06  4:55         ` Baolu Lu
  2025-03-06  5:01         ` Yi Liu
  1 sibling, 0 replies; 12+ messages in thread
From: Baolu Lu @ 2025-03-06  4:55 UTC (permalink / raw)
  To: Tian, Kevin, Liu, Yi L, Jason Gunthorpe; +Cc: iommu@lists.linux.dev

On 3/6/25 12:10, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, March 6, 2025 11:30 AM
>>
>> On 2025/3/6 11:10, Tian, Kevin wrote:
>>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>>> Sent: Thursday, March 6, 2025 1:19 AM
>>>>
>>>> I'm confused about how that can even work if the RID is erased in the
>>>> fabric, how do we deliver the PRI response to the correct entity?
>>>>
>>
>> If the RID is detached, kernel should response any pending PRIs. If user
>> sends a response, it won't suit any pending PRI then dropped.
>>
> 
> Jason talked about the case where the iommu group is created e.g
> for a legacy PCI bridge, with all devices behind sharing a same RID.
> Then there is no way to know which device should receive the response.

The PCI spec requires,

"It (ATS) is permitted to be implemented by Endpoints or RCiEPs."

... but DMA aliases could also be created by pci_add_dma_alias().

Thanks,
baolu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  4:10       ` Tian, Kevin
  2025-03-06  4:55         ` Baolu Lu
@ 2025-03-06  5:01         ` Yi Liu
  2025-03-06 12:13           ` Jason Gunthorpe
  1 sibling, 1 reply; 12+ messages in thread
From: Yi Liu @ 2025-03-06  5:01 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: baolu.lu@linux.intel.com, iommu@lists.linux.dev

On 2025/3/6 12:10, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, March 6, 2025 11:30 AM
>>
>> On 2025/3/6 11:10, Tian, Kevin wrote:
>>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>>> Sent: Thursday, March 6, 2025 1:19 AM
>>>>
>>>> I'm confused about how that can even work if the RID is erased in the
>>>> fabric, how do we deliver the PRI response to the correct entity?
>>>>
>>
>> If the RID is detached, kernel should response any pending PRIs. If user
>> sends a response, it won't suit any pending PRI then dropped.
>>
> 
> Jason talked about the case where the iommu group is created e.g
> for a legacy PCI bridge, with all devices behind sharing a same RID.
> Then there is no way to know which device should receive the response.

hmmm. such devices shall not support PRI I suppose. But yes, this also
pushes us to not supporting PRI for devices that have alias. We might
need an iommu API to count the device number of a given iommu group.

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  3:30     ` Yi Liu
  2025-03-06  4:10       ` Tian, Kevin
@ 2025-03-06  7:09       ` Yi Liu
  2025-03-06 19:06         ` Jason Gunthorpe
  1 sibling, 1 reply; 12+ messages in thread
From: Yi Liu @ 2025-03-06  7:09 UTC (permalink / raw)
  To: Tian, Kevin, Jason Gunthorpe
  Cc: baolu.lu@linux.intel.com, iommu@lists.linux.dev

On 2025/3/6 11:30, Yi Liu wrote:
> On 2025/3/6 11:10, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Thursday, March 6, 2025 1:19 AM
>>>
>>> On Wed, Mar 05, 2025 at 05:03:56AM -0800, Yi Liu wrote:
>>>> Hi Jason, Kevin, Baolu,
>>>>
>>>> This is more a query, the patch here does not really work as there
>>>> is locking issue. So I'd like to hear from you if any good idea.
>>>>
>>>> Detail as below:
>>>>
>>>> Existing code has a problem when the PRI happens on a device that has
>>>> alias. The PRI reporting path uses the idev stored in the handle. While
>>>> this idev is the first idev that attachs to the domain.  If the PRI
>>>> happens on devices other than the first attached device, then the idev
>>>> stored in handle does not match.
>>>
>>> Do you actually care about this? Can we say that devices that have
>>> aliases do not allow PRI?
> 
> It would be simpler if we can enforce it.

how about something like the below?

diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
index b4508423e13b..faefe06c212c 100644
--- a/drivers/iommu/iommu-priv.h
+++ b/drivers/iommu/iommu-priv.h
@@ -32,6 +32,8 @@ void iommu_device_unregister_bus(struct iommu_device *iommu,
  				 const struct bus_type *bus,
  				 struct notifier_block *nb);

+bool iommu_group_is_singular(struct iommu_group *group);
+
  struct iommu_attach_handle *iommu_attach_handle_get(struct iommu_group 
*group,
  						    ioasid_t pasid,
  						    unsigned int type);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0ee17893810f..7b24b49110c6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1040,6 +1040,18 @@ void *iommu_group_get_iommudata(struct iommu_group 
*group)
  }
  EXPORT_SYMBOL_GPL(iommu_group_get_iommudata);

+bool iommu_group_is_singular(struct iommu_group *group)
+{
+	bool res;
+
+	mutex_lock(&group->mutex);
+	res = list_count_nodes(&group->devices) == 1;
+	mutex_unlock(&group->mutex);
+
+	return res;
+}
+EXPORT_SYMBOL_NS_GPL(iommu_group_is_singular, "IOMMUFD_INTERNAL");
+
  /**
   * iommu_group_set_iommudata - set iommu_data for a group
   * @group: the group
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 92dc1ca010dc..6a1ca3874806 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -591,6 +591,9 @@ int iommufd_hw_pagetable_attach(struct 
iommufd_hw_pagetable *hwpt,
  	struct iommufd_hwpt_paging *hwpt_paging = find_hwpt_paging(hwpt);
  	int rc;

+	if (hwpt->fault && !iommu_group_is_singular(idev->igroup->group))
+		return -EOPNOTSUPP;
+
  	mutex_lock(&idev->igroup->lock);

  	if (idev->igroup->hwpt != NULL && idev->igroup->hwpt != hwpt) {
@@ -720,6 +723,9 @@ iommufd_device_do_replace(struct iommufd_device *idev,
  	unsigned int num_devices;
  	int rc;

+	if (hwpt->fault && !iommu_group_is_singular(idev->igroup->group))
+		return -EOPNOTSUPP;
+
  	mutex_lock(&idev->igroup->lock);

  	if (igroup->hwpt == NULL) {
diff --git a/drivers/iommu/iommufd/hw_pagetable.c 
b/drivers/iommu/iommufd/hw_pagetable.c
index 268315b1d8bc..bd2bab5a3b86 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -405,6 +405,11 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
  	if (cmd->flags & IOMMU_HWPT_FAULT_ID_VALID) {
  		struct iommufd_fault *fault;

+		if (!iommu_group_is_singular(idev->igroup->group)) {
+			rc = -EOPNOTSUPP;
+			goto out_hwpt;
+		}
+
  		fault = iommufd_get_fault(ucmd, cmd->fault_id);
  		if (IS_ERR(fault)) {
  			rc = PTR_ERR(fault);

-- 
Regards,
Yi Liu

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  5:01         ` Yi Liu
@ 2025-03-06 12:13           ` Jason Gunthorpe
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2025-03-06 12:13 UTC (permalink / raw)
  To: Yi Liu; +Cc: Tian, Kevin, baolu.lu@linux.intel.com, iommu@lists.linux.dev

On Thu, Mar 06, 2025 at 01:01:30PM +0800, Yi Liu wrote:
> On 2025/3/6 12:10, Tian, Kevin wrote:
> > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > Sent: Thursday, March 6, 2025 11:30 AM
> > > 
> > > On 2025/3/6 11:10, Tian, Kevin wrote:
> > > > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > > > Sent: Thursday, March 6, 2025 1:19 AM
> > > > > 
> > > > > I'm confused about how that can even work if the RID is erased in the
> > > > > fabric, how do we deliver the PRI response to the correct entity?
> > > > > 
> > > 
> > > If the RID is detached, kernel should response any pending PRIs. If user
> > > sends a response, it won't suit any pending PRI then dropped.
> > > 
> > 
> > Jason talked about the case where the iommu group is created e.g
> > for a legacy PCI bridge, with all devices behind sharing a same RID.
> > Then there is no way to know which device should receive the response.
> 
> hmmm. such devices shall not support PRI I suppose. But yes, this also
> pushes us to not supporting PRI for devices that have alias. We might
> need an iommu API to count the device number of a given iommu group.

iommu groups are not the same thing as aliases..

Aliased devices go into groups, but groups are not just consistent of
aliased devices..

Jason 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06  7:09       ` Yi Liu
@ 2025-03-06 19:06         ` Jason Gunthorpe
  2025-03-07  1:32           ` Baolu Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2025-03-06 19:06 UTC (permalink / raw)
  To: Yi Liu; +Cc: Tian, Kevin, baolu.lu@linux.intel.com, iommu@lists.linux.dev

On Thu, Mar 06, 2025 at 03:09:34PM +0800, Yi Liu wrote:
> +bool iommu_group_is_singular(struct iommu_group *group)
> +{
> +	bool res;
> +
> +	mutex_lock(&group->mutex);
> +	res = list_count_nodes(&group->devices) == 1;
> +	mutex_unlock(&group->mutex);
> +
> +	return res;
> +}
> +EXPORT_SYMBOL_NS_GPL(iommu_group_is_singular, "IOMMUFD_INTERNAL");

I'm not so excited about something racy like this..

I still think this is a bit wrong, we only care about aliases, not
multi device groups.

Aliases are created here:

	if (pci_for_each_dma_alias(pdev, get_pci_alias_or_group, &data))
		return data.group;

And maybe in a few other places..

Can you tag the group that it has aliases in it directly during group
creation so there are no races?

I've long had a desire to more clearly separate the two cases of
groups:

 1) The HW cannot tell the different struct device sources on a per
    DMA basis
 2) There is a security intertwinement between the devices but they
    can all run separately in the IOMMU HW

Maybe it is not so hard?

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] iommufd: Use accurate dev_id in the PRI forwarding path
  2025-03-06 19:06         ` Jason Gunthorpe
@ 2025-03-07  1:32           ` Baolu Lu
  0 siblings, 0 replies; 12+ messages in thread
From: Baolu Lu @ 2025-03-07  1:32 UTC (permalink / raw)
  To: Jason Gunthorpe, Yi Liu; +Cc: Tian, Kevin, iommu@lists.linux.dev

On 3/7/25 03:06, Jason Gunthorpe wrote:
> On Thu, Mar 06, 2025 at 03:09:34PM +0800, Yi Liu wrote:
>> +bool iommu_group_is_singular(struct iommu_group *group)
>> +{
>> +	bool res;
>> +
>> +	mutex_lock(&group->mutex);
>> +	res = list_count_nodes(&group->devices) == 1;
>> +	mutex_unlock(&group->mutex);
>> +
>> +	return res;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(iommu_group_is_singular, "IOMMUFD_INTERNAL");
> I'm not so excited about something racy like this..
> 
> I still think this is a bit wrong, we only care about aliases, not
> multi device groups.
> 
> Aliases are created here:
> 
> 	if (pci_for_each_dma_alias(pdev, get_pci_alias_or_group, &data))
> 		return data.group;
> 
> And maybe in a few other places..
> 
> Can you tag the group that it has aliases in it directly during group
> creation so there are no races?
> 
> I've long had a desire to more clearly separate the two cases of
> groups:
> 
>   1) The HW cannot tell the different struct device sources on a per
>      DMA basis
>   2) There is a security intertwinement between the devices but they
>      can all run separately in the IOMMU HW
> 
> Maybe it is not so hard?

This is a long-standing issue. If I recall correctly, Joerg once
proposed the concept of iommu subgroup.

Thanks,
baolu

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-03-07  1:35 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-05 13:03 [RFC] iommufd: Use accurate dev_id in the PRI forwarding path Yi Liu
2025-03-05 17:18 ` Jason Gunthorpe
2025-03-06  3:10   ` Tian, Kevin
2025-03-06  3:19     ` Baolu Lu
2025-03-06  3:30     ` Yi Liu
2025-03-06  4:10       ` Tian, Kevin
2025-03-06  4:55         ` Baolu Lu
2025-03-06  5:01         ` Yi Liu
2025-03-06 12:13           ` Jason Gunthorpe
2025-03-06  7:09       ` Yi Liu
2025-03-06 19:06         ` Jason Gunthorpe
2025-03-07  1:32           ` Baolu Lu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.