From: Jason Gunthorpe <jgg@ziepe.ca>
To: Will Deacon <will@kernel.org>
Cc: Kunkun Jiang <jiangkunkun@huawei.com>,
Lu Baolu <baolu.lu@linux.intel.com>,
Robin Murphy <robin.murphy@arm.com>,
Joerg Roedel <joro@8bytes.org>,
Nicolin Chen <nicolinc@nvidia.com>,
Michael Shavit <mshavit@google.com>,
Mostafa Saleh <smostafa@google.com>,
"moderated list:ARM SMMU DRIVERS"
<linux-arm-kernel@lists.infradead.org>,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
wanghaibin.wang@huawei.com, yuzenghui@huawei.com,
tangnianyao@huawei.com
Subject: Re: [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios
Date: Wed, 24 Jul 2024 10:03:20 -0300 [thread overview]
Message-ID: <20240724130320.GO14050@ziepe.ca> (raw)
In-Reply-To: <20240724102417.GA27376@willie-the-truck>
On Wed, Jul 24, 2024 at 11:24:17AM +0100, Will Deacon wrote:
> > This event handling process is as follows:
> > arm_smmu_evtq_thread
> > ret = arm_smmu_handle_evt
> > iommu_report_device_fault
> > iopf_param = iopf_get_dev_fault_param(dev);
> > // iopf is not enabled.
> > // No RESUME will be sent!
> > if (WARN_ON(!iopf_param))
> > return;
> > if (!ret || !__ratelimit(&rs))
> > continue;
> >
> > In this scenario, the io page-fault capability is not enabled.
> > There are two problems here:
> > 1. The event information is not printed.
> > 2. The entire device(PF level) is stalled,not just the current
> > VF. This affects other normal VFs.
>
> Oh, so that stall is probably also due to b554e396e51c ("iommu: Make
> iopf_group_response() return void"). I agree that we need a way to
> propagate error handling back to the driver in the case that
> 'iopf_param' is NULL, otherwise we're making the unexpected fault
> considerably more problematic than it needs to be.
The expectation was the driver would not call this function if it is
not handling a fault path, that's why there is a WARN_ON..
It seems those details were missed and drivers are not equipped to do
so..
Broadly if a fault is received and there is no domain fault handler to
process it then we should still generate all the DMA failure logging
as normal.
So maybe something like this to capture those corners as well:
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index 7c9011992d3f04..52ffdb8324de50 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -113,14 +113,55 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param,
return group;
}
+static struct iommu_attach_handle *find_fault_handler(struct device *dev,
+ struct iopf_fault *evt)
+{
+ struct iommu_fault *fault = &evt->fault;
+ struct iommu_attach_handle *attach_handle;
+
+ if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) {
+ attach_handle = iommu_attach_handle_get(dev->iommu_group,
+ fault->prm.pasid, 0);
+ if (IS_ERR(attach_handle)) {
+ const struct iommu_ops *ops = dev_iommu_ops(dev);
+
+ if (!ops->user_pasid_table)
+ return NULL;
+
+ /*
+ * The iommu driver for this device supports user-
+ * managed PASID table. Therefore page faults for
+ * any PASID should go through the NESTING domain
+ * attached to the device RID.
+ */
+ attach_handle = iommu_attach_handle_get(
+ dev->iommu_group, IOMMU_NO_PASID,
+ IOMMU_DOMAIN_NESTED);
+ if (IS_ERR(attach_handle))
+ return NULL;
+ }
+ } else {
+ attach_handle = iommu_attach_handle_get(dev->iommu_group,
+ IOMMU_NO_PASID, 0);
+ if (IS_ERR(attach_handle))
+ return NULL;
+ }
+
+ if (!attach_handle->domain->iopf_handler)
+ return NULL;
+ return attach_handle;
+}
+
/**
* iommu_report_device_fault() - Report fault event to device driver
* @dev: the device
* @evt: fault event data
*
* Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ
- * handler. If this function fails then ops->page_response() was called to
- * complete evt if required.
+ * handler. If this function fails then their is no fault handler for this evt
+ * and the fault remains owned by the caller. The caller should log the DMA
+ * protection failure and resolve the fault. Otherwise on success the fault is
+ * always completed eventually.
*
* This module doesn't handle PCI PASID Stop Marker; IOMMU drivers must discard
* them before reporting faults. A PASID Stop Marker (LRW = 0b100) doesn't
@@ -153,16 +194,25 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param,
* hardware has been set to block the page faults) and the pending page faults
* have been flushed.
*/
-void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
+int iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
{
+ struct iommu_attach_handle *attach_handle;
struct iommu_fault *fault = &evt->fault;
struct iommu_fault_param *iopf_param;
struct iopf_group abort_group = {};
struct iopf_group *group;
+ attach_handle = find_fault_handler(dev, evt);
+ if (!attach_handle)
+ return -EINVAL;
+
+ /*
+ * Something has gone wrong if a fault capable domain is attached but no
+ * iopf_param is setup.
+ */
iopf_param = iopf_get_dev_fault_param(dev);
if (WARN_ON(!iopf_param))
- return;
+ return -EINVAL;
if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
report_partial_fault(iopf_param, fault);
@@ -181,39 +231,7 @@ void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
group = iopf_group_alloc(iopf_param, evt, &abort_group);
if (group == &abort_group)
goto err_abort;
-
- if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) {
- group->attach_handle = iommu_attach_handle_get(dev->iommu_group,
- fault->prm.pasid,
- 0);
- if (IS_ERR(group->attach_handle)) {
- const struct iommu_ops *ops = dev_iommu_ops(dev);
-
- if (!ops->user_pasid_table)
- goto err_abort;
-
- /*
- * The iommu driver for this device supports user-
- * managed PASID table. Therefore page faults for
- * any PASID should go through the NESTING domain
- * attached to the device RID.
- */
- group->attach_handle =
- iommu_attach_handle_get(dev->iommu_group,
- IOMMU_NO_PASID,
- IOMMU_DOMAIN_NESTED);
- if (IS_ERR(group->attach_handle))
- goto err_abort;
- }
- } else {
- group->attach_handle =
- iommu_attach_handle_get(dev->iommu_group, IOMMU_NO_PASID, 0);
- if (IS_ERR(group->attach_handle))
- goto err_abort;
- }
-
- if (!group->attach_handle->domain->iopf_handler)
- goto err_abort;
+ group->attach_handle = attach_handle;
/*
* On success iopf_handler must call iopf_group_response() and
next prev parent reply other threads:[~2024-07-24 13:05 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-24 1:42 [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios Kunkun Jiang
2024-07-24 9:15 ` Mostafa Saleh
2024-07-24 9:22 ` Kunkun Jiang
2024-07-24 10:24 ` Will Deacon
2024-07-24 13:03 ` Jason Gunthorpe [this message]
2024-07-25 7:35 ` Tian, Kevin
2024-07-25 12:58 ` Jason Gunthorpe
2024-07-26 0:04 ` Tian, Kevin
2024-07-29 5:29 ` Baolu Lu
2024-08-02 14:38 ` Pranjal Shrivastava
2024-08-05 12:13 ` Kunkun Jiang
2024-08-05 12:30 ` Will Deacon
[not found] ` <ZrDwolC6oXN44coq@google.com>
2024-08-06 0:09 ` Baolu Lu
2024-08-06 12:49 ` Jason Gunthorpe
2024-08-06 15:58 ` Pranjal Shrivastava
2024-08-07 5:35 ` Baolu Lu
2024-08-08 13:50 ` Pranjal Shrivastava
2024-08-13 17:56 ` Jason Gunthorpe
2024-08-14 9:02 ` Pranjal Shrivastava
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240724130320.GO14050@ziepe.ca \
--to=jgg@ziepe.ca \
--cc=baolu.lu@linux.intel.com \
--cc=iommu@lists.linux.dev \
--cc=jiangkunkun@huawei.com \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mshavit@google.com \
--cc=nicolinc@nvidia.com \
--cc=robin.murphy@arm.com \
--cc=smostafa@google.com \
--cc=tangnianyao@huawei.com \
--cc=wanghaibin.wang@huawei.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).