Re: [PATCH v13 06/10] iommu: Add a page fault handler

public inbox for devicetree@vger.kernel.org
 help / color / mirror / Atom feed

From: Jean-Philippe Brucker <jean-philippe@linaro.org>
To: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: joro@8bytes.org, will@kernel.org, lorenzo.pieralisi@arm.com,
	robh+dt@kernel.org, guohanjun@huawei.com, sudeep.holla@arm.com,
	rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com,
	Jonathan.Cameron@huawei.com, eric.auger@redhat.com,
	iommu@lists.linux-foundation.org, devicetree@vger.kernel.org,
	linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-accelerators@lists.ozlabs.org, baolu.lu@linux.intel.com,
	kevin.tian@intel.com, vdumpa@nvidia.com, zhangfei.gao@linaro.org,
	shameerali.kolothum.thodi@huawei.com, vivek.gautam@arm.com,
	zhukeqian1@huawei.com, wangzhou1@hisilicon.com, "Raj,
	Ashok" <ashok.raj@intel.com>
Subject: Re: [PATCH v13 06/10] iommu: Add a page fault handler
Date: Tue, 23 Mar 2021 11:50:03 +0100	[thread overview]
Message-ID: <YFnH20Fy31CwEj7n@myrica> (raw)
In-Reply-To: <20210302155957.620db372@jacob-builder>

On Tue, Mar 02, 2021 at 03:59:57PM -0800, Jacob Pan wrote:
> Hi Jean-Philippe,
> 
> A few comments from the p.o.v of converting VT-d to this framework. Mostly
> about potential optimization. I think VT-d SVA code will be able to use this
> work.
> +Ashok provided many insight.
> 
> FWIW,
> Reviewed-by:Jacob Pan <jacob.jun.pan@linux.intel.com>

Thanks!

> On Tue,  2 Mar 2021 10:26:42 +0100, Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> > +int iommu_queue_iopf(struct iommu_fault *fault, void *cookie)
> > +{
> > +	int ret;
> > +	struct iopf_group *group;
> > +	struct iopf_fault *iopf, *next;
> > +	struct iopf_device_param *iopf_param;
> > +
> > +	struct device *dev = cookie;
> > +	struct dev_iommu *param = dev->iommu;
> > +
> > +	lockdep_assert_held(&param->lock);
> > +
> > +	if (fault->type != IOMMU_FAULT_PAGE_REQ)
> > +		/* Not a recoverable page fault */
> > +		return -EOPNOTSUPP;
> > +
> > +	/*
> > +	 * As long as we're holding param->lock, the queue can't be
> > unlinked
> > +	 * from the device and therefore cannot disappear.
> > +	 */
> > +	iopf_param = param->iopf_param;
> > +	if (!iopf_param)
> > +		return -ENODEV;
> > +
> > +	if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
> > +		iopf = kzalloc(sizeof(*iopf), GFP_KERNEL);
> > +		if (!iopf)
> > +			return -ENOMEM;
> > +
> > +		iopf->fault = *fault;
> > +
> > +		/* Non-last request of a group. Postpone until the last
> > one */
> Would be nice to have an option here to allow non-deferred handle_mm_fault.
> Some devices may have a large group.
> 
> FYI, VT-d also needs to send page response before the last one (LPIG).
> "A Page Group Response Descriptor is issued by software in response to a
> page request with data or a page request (with or without data) that
> indicated that it was the last request in a group."
> 
> But I think we deal with that when we convert. Perhaps just treat the
> request with data as LPIG.

Sure that seems fine. Do you mean that the vt-d driver will set the
IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE flag for PR-with-data?  Otherwise we
could introduce a new flag. I prefer making it explicit rather than having
IOPF consumers infer that a response is needed when seeing
IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA set, because private_data wouldn't be
reusable by other architectures.

> Also adding a trace event would be nice, similar to CPU page fault.

Agreed, the tracepoints in my development tree (along with the lower-level
page_request/response tracepoints) have been indispensable for debugging
hardware and SVA applications

> > +		list_add(&iopf->list, &iopf_param->partial);
> > +
> > +		return 0;
> > +	}
> > +
> > +	group = kzalloc(sizeof(*group), GFP_KERNEL);
> > +	if (!group) {
> > +		/*
> > +		 * The caller will send a response to the hardware. But
> > we do
> > +		 * need to clean up before leaving, otherwise partial
> > faults
> > +		 * will be stuck.
> > +		 */
> > +		ret = -ENOMEM;
> > +		goto cleanup_partial;
> > +	}
> > +
> > +	group->dev = dev;
> > +	group->last_fault.fault = *fault;
> > +	INIT_LIST_HEAD(&group->faults);
> > +	list_add(&group->last_fault.list, &group->faults);
> > +	INIT_WORK(&group->work, iopf_handle_group);
> > +
> > +	/* See if we have partial faults for this group */
> > +	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list)
> > {
> > +		if (iopf->fault.prm.grpid == fault->prm.grpid)
> Just curious, the iopf handler is registered per arm_smmu_master dev. Is
> the namespace of group ID also within an arm_smmu_master?

Yes for both PCIe PRI and SMMU stall, the namespace is within one device
(one RequesterID or StreamID)

> Can one arm_smmu_master support multiple devices?

No, it's a single device

> 
> For VT-d, group ID is per PCI device.
> 
> > +			/* Insert *before* the last fault */
> > +			list_move(&iopf->list, &group->faults);
> > +	}
> > +
> This is fine with devices supports small number of outstanding PRQs.
> VT-d is setting the limit to 32. But the incoming DSA device will support
> 512.
> 
> So if we pre-sort IOPF by group ID and put them in a per group list, would
> it be faster? I mean once the LPIG comes in, we just need to search the
> list of groups instead of all partial faults. I am not against what is done
> here, just exploring optimization.

Yes I think that's worth exploring, keeping the groups in a rb_tree or something
like that, for easy access and update. Note that I don't have a system
with non-LPIG faults, so I can't test any of this at the moment

Thanks,
Jean

next prev parent reply	other threads:[~2021-03-23 10:51 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-02  9:26 [PATCH v13 00/10] iommu: I/O page faults for SMMUv3 Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 01/10] iommu: Fix comment for struct iommu_fwspec Jean-Philippe Brucker
2021-03-25 17:37   ` Will Deacon
2021-03-02  9:26 ` [PATCH v13 02/10] iommu/arm-smmu-v3: Use device properties for pasid-num-bits Jean-Philippe Brucker
2021-03-25 17:36   ` Will Deacon
2021-03-02  9:26 ` [PATCH v13 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA Jean-Philippe Brucker
2021-03-03  5:04   ` Lu Baolu
2021-03-02  9:26 ` [PATCH v13 04/10] iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 05/10] uacce: Enable IOMMU_DEV_FEAT_IOPF Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 06/10] iommu: Add a page fault handler Jean-Philippe Brucker
2021-03-02 23:59   ` Jacob Pan
2021-03-23 10:50     ` Jean-Philippe Brucker [this message]
2021-03-03  5:27   ` Lu Baolu
2021-03-23 10:51     ` Jean-Philippe Brucker
2021-03-03  5:57   ` Raj, Ashok
2021-03-23 10:53     ` Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 07/10] iommu/arm-smmu-v3: Maintain a SID->device structure Jean-Philippe Brucker
2021-03-02 12:24   ` Keqian Zhu
2021-03-25 17:48   ` Will Deacon
2021-03-26  9:49     ` Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 08/10] dt-bindings: document stall property for IOMMU masters Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 09/10] ACPI/IORT: Enable stall support for platform devices Jean-Philippe Brucker
2021-03-02  9:26 ` [PATCH v13 10/10] iommu/arm-smmu-v3: Add " Jean-Philippe Brucker
2021-03-19 17:40   ` Auger Eric
2021-03-26  9:52   ` Auger Eric
2021-03-18  0:25 ` [PATCH v13 00/10] iommu: I/O page faults for SMMUv3 Krishna Reddy
2021-03-30 17:17 ` Jean-Philippe Brucker
2021-04-01  8:57   ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YFnH20Fy31CwEj7n@myrica \
    --to=jean-philippe@linaro.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=devicetree@vger.kernel.org \
    --cc=eric.auger@redhat.com \
    --cc=guohanjun@huawei.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-accelerators@lists.ozlabs.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=rjw@rjwysocki.net \
    --cc=robh+dt@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=sudeep.holla@arm.com \
    --cc=vdumpa@nvidia.com \
    --cc=vivek.gautam@arm.com \
    --cc=wangzhou1@hisilicon.com \
    --cc=will@kernel.org \
    --cc=zhangfei.gao@linaro.org \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox