From: Baolu Lu <baolu.lu@linux.intel.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: baolu.lu@linux.intel.com, Joerg Roedel <joro@8bytes.org>,
Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Kevin Tian <kevin.tian@intel.com>,
Huang Jiaqing <jiaqing.huang@intel.com>,
Ethan Zhao <haifeng.zhao@linux.intel.com>,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] iommu/vt-d: Use device rbtree in iopf reporting path
Date: Sun, 18 Feb 2024 15:02:00 +0800
Message-ID: <67391b2d-b441-4d43-aa46-2a30c95420a3@linux.intel.com>
In-Reply-To: <20240215175534.GD1299735@ziepe.ca>
On 2024/2/16 1:55, Jason Gunthorpe wrote:
> On Thu, Feb 15, 2024 at 03:22:49PM +0800, Lu Baolu wrote:
>> The existing IO page fault handler currently locates the PCI device by
>> calling pci_get_domain_bus_and_slot(). This function searches the list
>> of all PCI devices until the desired device is found. To improve lookup
>> efficiency, a helper function named device_rbtree_find() is introduced
>> to search for the device within the rbtree. Replace
>> pci_get_domain_bus_and_slot() in the IO page fault handling path.
>>
>> Co-developed-by: Huang Jiaqing <jiaqing.huang@intel.com>
>> Signed-off-by: Huang Jiaqing <jiaqing.huang@intel.com>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>> drivers/iommu/intel/iommu.h | 1 +
>> drivers/iommu/intel/iommu.c | 29 +++++++++++++++++++++++++++++
>> drivers/iommu/intel/svm.c | 14 ++++++--------
>> 3 files changed, 36 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
>> index 54eeaa8e35a9..f13c228924f8 100644
>> --- a/drivers/iommu/intel/iommu.h
>> +++ b/drivers/iommu/intel/iommu.h
>> @@ -1081,6 +1081,7 @@ void free_pgtable_page(void *vaddr);
>> void iommu_flush_write_buffer(struct intel_iommu *iommu);
>> struct iommu_domain *intel_nested_domain_alloc(struct iommu_domain *parent,
>> const struct iommu_user_data *user_data);
>> +struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid);
>>
>> #ifdef CONFIG_INTEL_IOMMU_SVM
>> void intel_svm_check(struct intel_iommu *iommu);
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 09009d96e553..d92c680bcc96 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -120,6 +120,35 @@ static int device_rid_cmp(struct rb_node *lhs, const struct rb_node *rhs)
>> return device_rid_cmp_key(&key, rhs);
>> }
>>
>> +/*
>> + * Looks up an IOMMU-probed device using its source ID.
>> + *
>> + * If the device is found:
>> + * - Increments its reference count.
>> + * - Returns a pointer to the device.
>> + * - The caller must call put_device() after using the pointer.
>> + *
>> + * If the device is not found, returns NULL.
>> + */
>> +struct device *device_rbtree_find(struct intel_iommu *iommu, u16 rid)
>> +{
>> + struct device_domain_info *info;
>> + struct device *dev = NULL;
>> + struct rb_node *node;
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&iommu->device_rbtree_lock, flags);
>> + node = rb_find(&rid, &iommu->device_rbtree, device_rid_cmp_key);
>> + if (node) {
>> + info = rb_entry(node, struct device_domain_info, node);
>> + dev = info->dev;
>> + get_device(dev);
>
> This get_device() is a bit troubling. It eventually calls into
> iommu_report_device_fault() which does:
>
> struct dev_iommu *param = dev->iommu;
>
> Which is going to explode if the iommu driver release has already
> happened, which is a precondition to getting to a unref'd struct
> device.
>
> The driver needs to do something to fence these events during its
> release function.

Yes, in theory dev->iommu should be protected in the
iommu_report_device_fault() path.

>
> If we are already doing that then I'd suggest to drop the get_device
> and add a big fat comment explaining the special rules about lifetime
> that are in effect here.
>
> Otherwise you need to add that barrier and rethink the way the locking
> works.

Hot removal of a device goes through at least the following steps:

- Disable PRI.
- Drain all outstanding I/O page faults.
- Stop DMA.
- Unload the device driver.
- Call iommu_release_device() upon the BUS_NOTIFY_REMOVED_DEVICE event.

This sequence ensures that a device cannot generate an I/O page fault
after PRI has been disabled. So in practice it is impossible for a device
to raise an I/O page fault just before PRI is disabled and then complete
the whole removal path down to iommu_release_device() before the page
fault interrupt handling thread calls iopf_get_dev_fault_param().

Given this, would it be simpler to drop the get_device()/put_device()
pair and add a comment to the code explaining the sequence?
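
To make the ordering argument concrete, here is a minimal userspace
sketch in C. It is only a model, not kernel code: struct model_dev and
the model_* helpers are hypothetical names invented for this
illustration, and the iommu_param flag merely stands in for a valid
dev->iommu.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	bool pri_enabled;      /* device may raise page requests */
	int  faults_in_flight; /* page requests raised but not yet handled */
	bool iommu_param;      /* stands in for a valid dev->iommu */
};

/* Device side: a page request can only be raised while PRI is enabled. */
static bool model_raise_fault(struct model_dev *d)
{
	if (!d->pri_enabled)
		return false;
	d->faults_in_flight++;
	return true;
}

/* Handler side: models the fault thread dereferencing dev->iommu. */
static bool model_handle_fault(struct model_dev *d)
{
	if (d->faults_in_flight == 0)
		return false;
	/* If this fired, it would be the crash discussed in this thread. */
	assert(d->iommu_param);
	d->faults_in_flight--;
	return true;
}

/* Hot removal, following the order of the steps listed above. */
static void model_hot_remove(struct model_dev *d)
{
	d->pri_enabled = false;          /* disable PRI */
	while (model_handle_fault(d))    /* drain outstanding faults */
		;
	/* Stopping DMA and unloading the driver are not modelled here. */
	assert(d->faults_in_flight == 0);
	d->iommu_param = false;          /* models iommu_release_device() */
}
```

In this model, raising a couple of faults and then calling
model_hot_remove() leaves faults_in_flight at zero, and any later
model_raise_fault() returns false, so the handler never runs after the
release step.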
>
> Aside from that this looks like a great improvement to me
>
> Thanks,
> Jason
Best regards,
baolu