* [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases
@ 2024-10-15 21:08 Joel Granados
2024-10-15 21:08 ` [PATCH v4 1/5] iommu/vt-d: Separate page request queue from SVM Joel Granados
` (5 more replies)
0 siblings, 6 replies; 21+ messages in thread
From: Joel Granados @ 2024-10-15 21:08 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: linux-kernel, iommu, Joel Granados, Joel Granados, Klaus Jensen
This series makes use of iommufd_hwpt_replace_device to execute
non-pasid/non-svm user space IOPFs. Our main motivation is to expand or
facilitate user-space driver driven device verification by enabling IOPF
without SVM/PASID.
What?
* Enable IO page fault handling in user space for a non-pasid, non-svm
and non-virtualised use case.
* Move IOMMU_IOPF configuration from INTEL_IOMMU_SVM into INTEL_IOMMU.
* Move all page request queue related logic to a new (prq.c) file.
* Remove PASID checks from PRQ event handling as well as PRQ
initialization.
* Allow execution of IOMMU_HWPT_ALLOC with a valid fault id
(IOMMU_HWPT_FAULT_ID_VALID)
Why?
The PCI ATS Extended Capability allows peripheral devices to
participate in the caching of translations when operating under an
IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
devices to handle missing mappings. Currently, PRI is mainly used in
the context of Shared Virtual Addressing, requiring support for the
Process Address Space Identifier (PASID) capability which is not
strictly necessary. Relaxing this requirement adds to the
possibilities available for user-space driver driver device
verification as well as for avoiding pinning.
Testing?
The non-svm IOPF interface is exercised by first initializing an IOPF
enabled IOAS and then reading the fault file descriptor. Pseudocode on
the IOPF initializing and handling is in [3] and [4] (using libvfn).
Supplementary repositories supporting this patchset:
1. A user space library libvfn [1] which is used for testing and
verification (see examples/iopf.c), and
2. Basic emulation of PCIe ATS/PRI and Intel VT-d PRQ in QEMU [2].
Changes in v4:
- Include the "trailers" from Kevin that I missed on V3
- Link to v3: https://lore.kernel.org/r/20241009-jag-iopfv8-v3-0-bd4271df5b2b@kernel.org
Changes in v3:
- Adjust wording in cover letter
- Include "_iommu_" in the prq Intel function names to be more in line
with functions in iommu.h file
- Rebase on top of 6.12-rc2
- Update my ID in e-mail, git author and my Signed-off-by.
- Link to v2: https://lore.kernel.org/r/20240913-jag-iopfv8-v2-0-dea01c2343bc@samsung.com
Changes in v2:
- Remove "nesting" from wording. This wording is left over from initial
versions that are now irrelevant.
- Dropped "iommu: init pasid array while doing domain_replace and iopf
is active" as the initialization of the pasid_array x-array happens
automatically when an iopf capable domain is replaced on a device.
- Corrected commit message in "iommu/vt-d: Separate page request queue
from SVM"
- Link to v1: https://lore.kernel.org/r/20240904-jag-iopfv8-v1-0-e3549920adf3@samsung.com
V1:
- This is the first version of the series after initial feedback from
the RFC [5].
Comments and feedback are greatly appreciated
Best
Joel
[1] https://github.com/SamsungDS/libvfn/tree/iommufd-fault-queue
[2] https://gitlab.com/birkelund/qemu/-/tree/pcie-ats-pri
[3] Initializing
```
int iopf_init(struct iommu_ioas *ioas, const char *bdf)
{
// open vfio device from bdf
int devfd = open('/dev/vfio/devices/VFIO_DEV', O_RDWR);
struct vfio_device_bind_iommufd bind = {
.argsz = sizeof(bind),
.flags = 0,
.iommufd = __iommufd,
};
ioctl(devfd, VFIO_DEVICE_BIND_IOMMUFD, &bind);
struct iommu_ioas *ioas = ioas;
struct vfio_device_attach_iommufd_pt attach_data = {
.argsz = sizeof(attach_data),
.flags = 0,
.pt_id = ioas->id,
};
ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);
struct iommu_fault_alloc fault = {
.size = sizeof(fault),
.flags = 0,
};
ioctl(__iommufd, IOMMU_FAULT_QUEUE_ALLOC, &fault);
struct iommu_hwpt_alloc fault_cmd = {
.size = sizeof(fault_cmd),
.flags = IOMMU_HWPT_FAULT_ID_VALID,
.dev_id = bind.out_devid,
.pt_id = ioas->id,
.data_len = 0,
.data_uptr = (uint64_t)NULL,
.fault_id = fault.out_fault_id,
.__reserved = 0,
};
ioctl(__iommufd, IOMMU_HWPT_ALLOC, &fault_cmd);
// This is a re-attach
struct vfio_device_attach_iommufd_pt attach = {
.argsz = sizeof(attach),
.flags = 0,
.pt_id = fault_cmd.out_hwpt_id
};
ioctl(dev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach);
}
```
[4] Handling
```
int handle_iopf(void *vaddr, int len, uint64_t iova) {
exec_command(CMD)
int iopf_fd = fault_cmd.fault_id;
struct iommu_hwpt_pgfault pgfault = {0};
if(read(iopf_fd, &pgfault, sizeof(pgfault)) == 0);
return; // no page fault
ret = iommu_map_vaddr(__iommmufd, vaddr, len, &iova)
struct iommu_hwpt_page_response pgfault_response = {
.cookie = pgfault.cookie,
.code = ret ? IOMMUFD_PAGE_RESP_SUCCESS : IOMMUFD_PAGE_RESP_INVALID,
};
write(iopf_fd, &pgfault_response, sizeof(pgfault_response));
return;
}
```
[5] https://lore.kernel.org/20240826-iopf-for-all-v1-0-59174e6a7528@samsung.com
Signed-off-by: Joel Granados <j.granados@samsung.com>
---
Joel Granados (3):
iommu/vt-d: Separate page request queue from SVM
iommu: kconfig: Move IOMMU_IOPF into INTEL_IOMMU
iommufd: Enable PRI when doing the iommufd_hwpt_alloc
Klaus Jensen (2):
iommu/vt-d: Remove the pasid present check in prq_event_thread
iommu/vt-d: drop pasid requirement for prq initialization
drivers/iommu/intel/Kconfig | 2 +-
drivers/iommu/intel/Makefile | 2 +-
drivers/iommu/intel/iommu.c | 31 ++-
drivers/iommu/intel/iommu.h | 14 +-
drivers/iommu/intel/prq.c | 406 +++++++++++++++++++++++++++++++++++
drivers/iommu/intel/svm.c | 397 ----------------------------------
drivers/iommu/iommufd/hw_pagetable.c | 3 +-
7 files changed, 428 insertions(+), 427 deletions(-)
---
base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
change-id: 20240904-jag-iopfv8-1577fd20422d
Best regards,
--
Joel Granados <joel.granados@kernel.org>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v4 1/5] iommu/vt-d: Separate page request queue from SVM
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
@ 2024-10-15 21:08 ` Joel Granados
2024-10-15 21:08 ` [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread Joel Granados
` (4 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Joel Granados @ 2024-10-15 21:08 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: linux-kernel, iommu, Joel Granados, Joel Granados
IO page faults are no longer dependent on CONFIG_INTEL_IOMMU_SVM. Move
all Page Request Queue (PRQ) functions that handle prq events to a new
file in drivers/iommu/intel/prq.c. The page_req_des struct is now
declared in drivers/iommu/intel/prq.c.
No functional changes are intended. This is a preparation patch to
enable the use of IO page faults outside the SVM/PASID use cases.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
drivers/iommu/intel/Makefile | 2 +-
drivers/iommu/intel/iommu.c | 20 +--
drivers/iommu/intel/iommu.h | 14 +-
drivers/iommu/intel/prq.c | 412 +++++++++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/svm.c | 397 -----------------------------------------
5 files changed, 426 insertions(+), 419 deletions(-)
diff --git a/drivers/iommu/intel/Makefile b/drivers/iommu/intel/Makefile
index c8beb0281559..d3bb0798092d 100644
--- a/drivers/iommu/intel/Makefile
+++ b/drivers/iommu/intel/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_DMAR_TABLE) += dmar.o
-obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o cache.o
+obj-$(CONFIG_INTEL_IOMMU) += iommu.o pasid.o nested.o cache.o prq.o
obj-$(CONFIG_DMAR_TABLE) += trace.o cap_audit.o
obj-$(CONFIG_DMAR_PERF) += perf.o
obj-$(CONFIG_INTEL_IOMMU_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9f6b0780f2ef..5d9c3333c92d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1428,12 +1428,10 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
/* free context mapping */
free_context_table(iommu);
-#ifdef CONFIG_INTEL_IOMMU_SVM
if (pasid_supported(iommu)) {
if (ecap_prs(iommu->ecap))
- intel_svm_finish_prq(iommu);
+ intel_iommu_finish_prq(iommu);
}
-#endif
}
/*
@@ -2354,19 +2352,18 @@ static int __init init_dmars(void)
iommu_flush_write_buffer(iommu);
-#ifdef CONFIG_INTEL_IOMMU_SVM
if (pasid_supported(iommu) && ecap_prs(iommu->ecap)) {
/*
* Call dmar_alloc_hwirq() with dmar_global_lock held,
* could cause possible lock race condition.
*/
up_write(&dmar_global_lock);
- ret = intel_svm_enable_prq(iommu);
+ ret = intel_iommu_enable_prq(iommu);
down_write(&dmar_global_lock);
if (ret)
goto free_iommu;
}
-#endif
+
ret = dmar_set_interrupt(iommu);
if (ret)
goto free_iommu;
@@ -2786,13 +2783,12 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
intel_iommu_init_qi(iommu);
iommu_flush_write_buffer(iommu);
-#ifdef CONFIG_INTEL_IOMMU_SVM
if (pasid_supported(iommu) && ecap_prs(iommu->ecap)) {
- ret = intel_svm_enable_prq(iommu);
+ ret = intel_iommu_enable_prq(iommu);
if (ret)
goto disable_iommu;
}
-#endif
+
ret = dmar_set_interrupt(iommu);
if (ret)
goto disable_iommu;
@@ -4281,7 +4277,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
kfree(dev_pasid);
intel_pasid_tear_down_entry(iommu, dev, pasid, false);
- intel_drain_pasid_prq(dev, pasid);
+ intel_iommu_drain_pasid_prq(dev, pasid);
}
static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
@@ -4609,9 +4605,7 @@ const struct iommu_ops intel_iommu_ops = {
.def_domain_type = device_def_domain_type,
.remove_dev_pasid = intel_iommu_remove_dev_pasid,
.pgsize_bitmap = SZ_4K,
-#ifdef CONFIG_INTEL_IOMMU_SVM
- .page_response = intel_svm_page_response,
-#endif
+ .page_response = intel_iommu_page_response,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = intel_iommu_attach_device,
.set_dev_pasid = intel_iommu_set_dev_pasid,
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 1497f3112b12..7c3ef89ae959 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -730,12 +730,10 @@ struct intel_iommu {
struct iommu_flush flush;
#endif
-#ifdef CONFIG_INTEL_IOMMU_SVM
struct page_req_dsc *prq;
unsigned char prq_name[16]; /* Name for PRQ interrupt */
unsigned long prq_seq_number;
struct completion prq_complete;
-#endif
struct iopf_queue *iopf_queue;
unsigned char iopfq_name[16];
/* Synchronization between fault report and iommu device release. */
@@ -1278,18 +1276,18 @@ void intel_context_flush_present(struct device_domain_info *info,
struct context_entry *context,
u16 did, bool affect_domains);
+int intel_iommu_enable_prq(struct intel_iommu *iommu);
+int intel_iommu_finish_prq(struct intel_iommu *iommu);
+void intel_iommu_page_response(struct device *dev, struct iopf_fault *evt,
+ struct iommu_page_response *msg);
+void intel_iommu_drain_pasid_prq(struct device *dev, u32 pasid);
+
#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
-int intel_svm_enable_prq(struct intel_iommu *iommu);
-int intel_svm_finish_prq(struct intel_iommu *iommu);
-void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
- struct iommu_page_response *msg);
struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
struct mm_struct *mm);
-void intel_drain_pasid_prq(struct device *dev, u32 pasid);
#else
static inline void intel_svm_check(struct intel_iommu *iommu) {}
-static inline void intel_drain_pasid_prq(struct device *dev, u32 pasid) {}
static inline struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
struct mm_struct *mm)
{
diff --git a/drivers/iommu/intel/prq.c b/drivers/iommu/intel/prq.c
new file mode 100644
index 000000000000..d4f18eb46475
--- /dev/null
+++ b/drivers/iommu/intel/prq.c
@@ -0,0 +1,412 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright © 2015 Intel Corporation.
+ *
+ * Originally split from drivers/iommu/intel/svm.c
+ */
+
+#include <linux/pci.h>
+#include <linux/pci-ats.h>
+
+#include "iommu.h"
+#include "pasid.h"
+#include "../iommu-pages.h"
+#include "trace.h"
+
+/* Page request queue descriptor */
+struct page_req_dsc {
+ union {
+ struct {
+ u64 type:8;
+ u64 pasid_present:1;
+ u64 rsvd:7;
+ u64 rid:16;
+ u64 pasid:20;
+ u64 exe_req:1;
+ u64 pm_req:1;
+ u64 rsvd2:10;
+ };
+ u64 qw_0;
+ };
+ union {
+ struct {
+ u64 rd_req:1;
+ u64 wr_req:1;
+ u64 lpig:1;
+ u64 prg_index:9;
+ u64 addr:52;
+ };
+ u64 qw_1;
+ };
+ u64 qw_2;
+ u64 qw_3;
+};
+
+/**
+ * intel_iommu_drain_pasid_prq - Drain page requests and responses for a pasid
+ * @dev: target device
+ * @pasid: pasid for draining
+ *
+ * Drain all pending page requests and responses related to @pasid in both
+ * software and hardware. This is supposed to be called after the device
+ * driver has stopped DMA, the pasid entry has been cleared, and both IOTLB
+ * and DevTLB have been invalidated.
+ *
+ * It waits until all pending page requests for @pasid in the page fault
+ * queue are completed by the prq handling thread. Then follow the steps
+ * described in VT-d spec CH7.10 to drain all page requests and page
+ * responses pending in the hardware.
+ */
+void intel_iommu_drain_pasid_prq(struct device *dev, u32 pasid)
+{
+ struct device_domain_info *info;
+ struct dmar_domain *domain;
+ struct intel_iommu *iommu;
+ struct qi_desc desc[3];
+ struct pci_dev *pdev;
+ int head, tail;
+ u16 sid, did;
+ int qdep;
+
+ info = dev_iommu_priv_get(dev);
+ if (WARN_ON(!info || !dev_is_pci(dev)))
+ return;
+
+ if (!info->pri_enabled)
+ return;
+
+ iommu = info->iommu;
+ domain = info->domain;
+ pdev = to_pci_dev(dev);
+ sid = PCI_DEVID(info->bus, info->devfn);
+ did = domain ? domain_id_iommu(domain, iommu) : FLPT_DEFAULT_DID;
+
+ qdep = pci_ats_queue_depth(pdev);
+
+ /*
+ * Check and wait until all pending page requests in the queue are
+ * handled by the prq handling thread.
+ */
+prq_retry:
+ reinit_completion(&iommu->prq_complete);
+ tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
+ head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
+ while (head != tail) {
+ struct page_req_dsc *req;
+
+ req = &iommu->prq[head / sizeof(*req)];
+ if (!req->pasid_present || req->pasid != pasid) {
+ head = (head + sizeof(*req)) & PRQ_RING_MASK;
+ continue;
+ }
+
+ wait_for_completion(&iommu->prq_complete);
+ goto prq_retry;
+ }
+
+ iopf_queue_flush_dev(dev);
+
+ /*
+ * Perform steps described in VT-d spec CH7.10 to drain page
+ * requests and responses in hardware.
+ */
+ memset(desc, 0, sizeof(desc));
+ desc[0].qw0 = QI_IWD_STATUS_DATA(QI_DONE) |
+ QI_IWD_FENCE |
+ QI_IWD_TYPE;
+ desc[1].qw0 = QI_EIOTLB_PASID(pasid) |
+ QI_EIOTLB_DID(did) |
+ QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
+ QI_EIOTLB_TYPE;
+ desc[2].qw0 = QI_DEV_EIOTLB_PASID(pasid) |
+ QI_DEV_EIOTLB_SID(sid) |
+ QI_DEV_EIOTLB_QDEP(qdep) |
+ QI_DEIOTLB_TYPE |
+ QI_DEV_IOTLB_PFSID(info->pfsid);
+qi_retry:
+ reinit_completion(&iommu->prq_complete);
+ qi_submit_sync(iommu, desc, 3, QI_OPT_WAIT_DRAIN);
+ if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
+ wait_for_completion(&iommu->prq_complete);
+ goto qi_retry;
+ }
+}
+
+
+static bool is_canonical_address(u64 addr)
+{
+ int shift = 64 - (__VIRTUAL_MASK_SHIFT + 1);
+ long saddr = (long) addr;
+
+ return (((saddr << shift) >> shift) == saddr);
+}
+
+static void handle_bad_prq_event(struct intel_iommu *iommu,
+ struct page_req_dsc *req, int result)
+{
+ struct qi_desc desc = { };
+
+ pr_err("%s: Invalid page request: %08llx %08llx\n",
+ iommu->name, ((unsigned long long *)req)[0],
+ ((unsigned long long *)req)[1]);
+
+ if (!req->lpig)
+ return;
+
+ desc.qw0 = QI_PGRP_PASID(req->pasid) |
+ QI_PGRP_DID(req->rid) |
+ QI_PGRP_PASID_P(req->pasid_present) |
+ QI_PGRP_RESP_CODE(result) |
+ QI_PGRP_RESP_TYPE;
+ desc.qw1 = QI_PGRP_IDX(req->prg_index) |
+ QI_PGRP_LPIG(req->lpig);
+
+ qi_submit_sync(iommu, &desc, 1, 0);
+}
+
+static int prq_to_iommu_prot(struct page_req_dsc *req)
+{
+ int prot = 0;
+
+ if (req->rd_req)
+ prot |= IOMMU_FAULT_PERM_READ;
+ if (req->wr_req)
+ prot |= IOMMU_FAULT_PERM_WRITE;
+ if (req->exe_req)
+ prot |= IOMMU_FAULT_PERM_EXEC;
+ if (req->pm_req)
+ prot |= IOMMU_FAULT_PERM_PRIV;
+
+ return prot;
+}
+
+static void intel_prq_report(struct intel_iommu *iommu, struct device *dev,
+ struct page_req_dsc *desc)
+{
+ struct iopf_fault event = { };
+
+ /* Fill in event data for device specific processing */
+ event.fault.type = IOMMU_FAULT_PAGE_REQ;
+ event.fault.prm.addr = (u64)desc->addr << VTD_PAGE_SHIFT;
+ event.fault.prm.pasid = desc->pasid;
+ event.fault.prm.grpid = desc->prg_index;
+ event.fault.prm.perm = prq_to_iommu_prot(desc);
+
+ if (desc->lpig)
+ event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+ if (desc->pasid_present) {
+ event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+ event.fault.prm.flags |= IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
+ }
+
+ iommu_report_device_fault(dev, &event);
+}
+
+static irqreturn_t prq_event_thread(int irq, void *d)
+{
+ struct intel_iommu *iommu = d;
+ struct page_req_dsc *req;
+ int head, tail, handled;
+ struct device *dev;
+ u64 address;
+
+ /*
+ * Clear PPR bit before reading head/tail registers, to ensure that
+ * we get a new interrupt if needed.
+ */
+ writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
+
+ tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
+ head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
+ handled = (head != tail);
+ while (head != tail) {
+ req = &iommu->prq[head / sizeof(*req)];
+ address = (u64)req->addr << VTD_PAGE_SHIFT;
+
+ if (unlikely(!req->pasid_present)) {
+ pr_err("IOMMU: %s: Page request without PASID\n",
+ iommu->name);
+bad_req:
+ handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
+ goto prq_advance;
+ }
+
+ if (unlikely(!is_canonical_address(address))) {
+ pr_err("IOMMU: %s: Address is not canonical\n",
+ iommu->name);
+ goto bad_req;
+ }
+
+ if (unlikely(req->pm_req && (req->rd_req | req->wr_req))) {
+ pr_err("IOMMU: %s: Page request in Privilege Mode\n",
+ iommu->name);
+ goto bad_req;
+ }
+
+ if (unlikely(req->exe_req && req->rd_req)) {
+ pr_err("IOMMU: %s: Execution request not supported\n",
+ iommu->name);
+ goto bad_req;
+ }
+
+ /* Drop Stop Marker message. No need for a response. */
+ if (unlikely(req->lpig && !req->rd_req && !req->wr_req))
+ goto prq_advance;
+
+ /*
+ * If prq is to be handled outside iommu driver via receiver of
+ * the fault notifiers, we skip the page response here.
+ */
+ mutex_lock(&iommu->iopf_lock);
+ dev = device_rbtree_find(iommu, req->rid);
+ if (!dev) {
+ mutex_unlock(&iommu->iopf_lock);
+ goto bad_req;
+ }
+
+ intel_prq_report(iommu, dev, req);
+ trace_prq_report(iommu, dev, req->qw_0, req->qw_1,
+ req->qw_2, req->qw_3,
+ iommu->prq_seq_number++);
+ mutex_unlock(&iommu->iopf_lock);
+prq_advance:
+ head = (head + sizeof(*req)) & PRQ_RING_MASK;
+ }
+
+ dmar_writeq(iommu->reg + DMAR_PQH_REG, tail);
+
+ /*
+ * Clear the page request overflow bit and wake up all threads that
+ * are waiting for the completion of this handling.
+ */
+ if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
+ pr_info_ratelimited("IOMMU: %s: PRQ overflow detected\n",
+ iommu->name);
+ head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
+ tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
+ if (head == tail) {
+ iopf_queue_discard_partial(iommu->iopf_queue);
+ writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG);
+ pr_info_ratelimited("IOMMU: %s: PRQ overflow cleared",
+ iommu->name);
+ }
+ }
+
+ if (!completion_done(&iommu->prq_complete))
+ complete(&iommu->prq_complete);
+
+ return IRQ_RETVAL(handled);
+}
+
+int intel_iommu_enable_prq(struct intel_iommu *iommu)
+{
+ struct iopf_queue *iopfq;
+ int irq, ret;
+
+ iommu->prq = iommu_alloc_pages_node(iommu->node, GFP_KERNEL, PRQ_ORDER);
+ if (!iommu->prq) {
+ pr_warn("IOMMU: %s: Failed to allocate page request queue\n",
+ iommu->name);
+ return -ENOMEM;
+ }
+
+ irq = dmar_alloc_hwirq(IOMMU_IRQ_ID_OFFSET_PRQ + iommu->seq_id, iommu->node, iommu);
+ if (irq <= 0) {
+ pr_err("IOMMU: %s: Failed to create IRQ vector for page request queue\n",
+ iommu->name);
+ ret = -EINVAL;
+ goto free_prq;
+ }
+ iommu->pr_irq = irq;
+
+ snprintf(iommu->iopfq_name, sizeof(iommu->iopfq_name),
+ "dmar%d-iopfq", iommu->seq_id);
+ iopfq = iopf_queue_alloc(iommu->iopfq_name);
+ if (!iopfq) {
+ pr_err("IOMMU: %s: Failed to allocate iopf queue\n", iommu->name);
+ ret = -ENOMEM;
+ goto free_hwirq;
+ }
+ iommu->iopf_queue = iopfq;
+
+ snprintf(iommu->prq_name, sizeof(iommu->prq_name), "dmar%d-prq", iommu->seq_id);
+
+ ret = request_threaded_irq(irq, NULL, prq_event_thread, IRQF_ONESHOT,
+ iommu->prq_name, iommu);
+ if (ret) {
+ pr_err("IOMMU: %s: Failed to request IRQ for page request queue\n",
+ iommu->name);
+ goto free_iopfq;
+ }
+ dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQA_REG, virt_to_phys(iommu->prq) | PRQ_ORDER);
+
+ init_completion(&iommu->prq_complete);
+
+ return 0;
+
+free_iopfq:
+ iopf_queue_free(iommu->iopf_queue);
+ iommu->iopf_queue = NULL;
+free_hwirq:
+ dmar_free_hwirq(irq);
+ iommu->pr_irq = 0;
+free_prq:
+ iommu_free_pages(iommu->prq, PRQ_ORDER);
+ iommu->prq = NULL;
+
+ return ret;
+}
+
+int intel_iommu_finish_prq(struct intel_iommu *iommu)
+{
+ dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL);
+ dmar_writeq(iommu->reg + DMAR_PQA_REG, 0ULL);
+
+ if (iommu->pr_irq) {
+ free_irq(iommu->pr_irq, iommu);
+ dmar_free_hwirq(iommu->pr_irq);
+ iommu->pr_irq = 0;
+ }
+
+ if (iommu->iopf_queue) {
+ iopf_queue_free(iommu->iopf_queue);
+ iommu->iopf_queue = NULL;
+ }
+
+ iommu_free_pages(iommu->prq, PRQ_ORDER);
+ iommu->prq = NULL;
+
+ return 0;
+}
+
+void intel_iommu_page_response(struct device *dev, struct iopf_fault *evt,
+ struct iommu_page_response *msg)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu = info->iommu;
+ u8 bus = info->bus, devfn = info->devfn;
+ struct iommu_fault_page_request *prm;
+ struct qi_desc desc;
+ bool pasid_present;
+ bool last_page;
+ u16 sid;
+
+ prm = &evt->fault.prm;
+ sid = PCI_DEVID(bus, devfn);
+ pasid_present = prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+ last_page = prm->flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
+
+ desc.qw0 = QI_PGRP_PASID(prm->pasid) | QI_PGRP_DID(sid) |
+ QI_PGRP_PASID_P(pasid_present) |
+ QI_PGRP_RESP_CODE(msg->code) |
+ QI_PGRP_RESP_TYPE;
+ desc.qw1 = QI_PGRP_IDX(prm->grpid) | QI_PGRP_LPIG(last_page);
+ desc.qw2 = 0;
+ desc.qw3 = 0;
+
+ qi_submit_sync(iommu, &desc, 1, 0);
+}
+
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 078d1e32a24e..3cc43a958b4d 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -25,92 +25,6 @@
#include "../iommu-pages.h"
#include "trace.h"
-static irqreturn_t prq_event_thread(int irq, void *d);
-
-int intel_svm_enable_prq(struct intel_iommu *iommu)
-{
- struct iopf_queue *iopfq;
- int irq, ret;
-
- iommu->prq = iommu_alloc_pages_node(iommu->node, GFP_KERNEL, PRQ_ORDER);
- if (!iommu->prq) {
- pr_warn("IOMMU: %s: Failed to allocate page request queue\n",
- iommu->name);
- return -ENOMEM;
- }
-
- irq = dmar_alloc_hwirq(IOMMU_IRQ_ID_OFFSET_PRQ + iommu->seq_id, iommu->node, iommu);
- if (irq <= 0) {
- pr_err("IOMMU: %s: Failed to create IRQ vector for page request queue\n",
- iommu->name);
- ret = -EINVAL;
- goto free_prq;
- }
- iommu->pr_irq = irq;
-
- snprintf(iommu->iopfq_name, sizeof(iommu->iopfq_name),
- "dmar%d-iopfq", iommu->seq_id);
- iopfq = iopf_queue_alloc(iommu->iopfq_name);
- if (!iopfq) {
- pr_err("IOMMU: %s: Failed to allocate iopf queue\n", iommu->name);
- ret = -ENOMEM;
- goto free_hwirq;
- }
- iommu->iopf_queue = iopfq;
-
- snprintf(iommu->prq_name, sizeof(iommu->prq_name), "dmar%d-prq", iommu->seq_id);
-
- ret = request_threaded_irq(irq, NULL, prq_event_thread, IRQF_ONESHOT,
- iommu->prq_name, iommu);
- if (ret) {
- pr_err("IOMMU: %s: Failed to request IRQ for page request queue\n",
- iommu->name);
- goto free_iopfq;
- }
- dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL);
- dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL);
- dmar_writeq(iommu->reg + DMAR_PQA_REG, virt_to_phys(iommu->prq) | PRQ_ORDER);
-
- init_completion(&iommu->prq_complete);
-
- return 0;
-
-free_iopfq:
- iopf_queue_free(iommu->iopf_queue);
- iommu->iopf_queue = NULL;
-free_hwirq:
- dmar_free_hwirq(irq);
- iommu->pr_irq = 0;
-free_prq:
- iommu_free_pages(iommu->prq, PRQ_ORDER);
- iommu->prq = NULL;
-
- return ret;
-}
-
-int intel_svm_finish_prq(struct intel_iommu *iommu)
-{
- dmar_writeq(iommu->reg + DMAR_PQH_REG, 0ULL);
- dmar_writeq(iommu->reg + DMAR_PQT_REG, 0ULL);
- dmar_writeq(iommu->reg + DMAR_PQA_REG, 0ULL);
-
- if (iommu->pr_irq) {
- free_irq(iommu->pr_irq, iommu);
- dmar_free_hwirq(iommu->pr_irq);
- iommu->pr_irq = 0;
- }
-
- if (iommu->iopf_queue) {
- iopf_queue_free(iommu->iopf_queue);
- iommu->iopf_queue = NULL;
- }
-
- iommu_free_pages(iommu->prq, PRQ_ORDER);
- iommu->prq = NULL;
-
- return 0;
-}
-
void intel_svm_check(struct intel_iommu *iommu)
{
if (!pasid_supported(iommu))
@@ -240,317 +154,6 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
return ret;
}
-/* Page request queue descriptor */
-struct page_req_dsc {
- union {
- struct {
- u64 type:8;
- u64 pasid_present:1;
- u64 rsvd:7;
- u64 rid:16;
- u64 pasid:20;
- u64 exe_req:1;
- u64 pm_req:1;
- u64 rsvd2:10;
- };
- u64 qw_0;
- };
- union {
- struct {
- u64 rd_req:1;
- u64 wr_req:1;
- u64 lpig:1;
- u64 prg_index:9;
- u64 addr:52;
- };
- u64 qw_1;
- };
- u64 qw_2;
- u64 qw_3;
-};
-
-static bool is_canonical_address(u64 addr)
-{
- int shift = 64 - (__VIRTUAL_MASK_SHIFT + 1);
- long saddr = (long) addr;
-
- return (((saddr << shift) >> shift) == saddr);
-}
-
-/**
- * intel_drain_pasid_prq - Drain page requests and responses for a pasid
- * @dev: target device
- * @pasid: pasid for draining
- *
- * Drain all pending page requests and responses related to @pasid in both
- * software and hardware. This is supposed to be called after the device
- * driver has stopped DMA, the pasid entry has been cleared, and both IOTLB
- * and DevTLB have been invalidated.
- *
- * It waits until all pending page requests for @pasid in the page fault
- * queue are completed by the prq handling thread. Then follow the steps
- * described in VT-d spec CH7.10 to drain all page requests and page
- * responses pending in the hardware.
- */
-void intel_drain_pasid_prq(struct device *dev, u32 pasid)
-{
- struct device_domain_info *info;
- struct dmar_domain *domain;
- struct intel_iommu *iommu;
- struct qi_desc desc[3];
- struct pci_dev *pdev;
- int head, tail;
- u16 sid, did;
- int qdep;
-
- info = dev_iommu_priv_get(dev);
- if (WARN_ON(!info || !dev_is_pci(dev)))
- return;
-
- if (!info->pri_enabled)
- return;
-
- iommu = info->iommu;
- domain = info->domain;
- pdev = to_pci_dev(dev);
- sid = PCI_DEVID(info->bus, info->devfn);
- did = domain ? domain_id_iommu(domain, iommu) : FLPT_DEFAULT_DID;
- qdep = pci_ats_queue_depth(pdev);
-
- /*
- * Check and wait until all pending page requests in the queue are
- * handled by the prq handling thread.
- */
-prq_retry:
- reinit_completion(&iommu->prq_complete);
- tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
- head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
- while (head != tail) {
- struct page_req_dsc *req;
-
- req = &iommu->prq[head / sizeof(*req)];
- if (!req->pasid_present || req->pasid != pasid) {
- head = (head + sizeof(*req)) & PRQ_RING_MASK;
- continue;
- }
-
- wait_for_completion(&iommu->prq_complete);
- goto prq_retry;
- }
-
- iopf_queue_flush_dev(dev);
-
- /*
- * Perform steps described in VT-d spec CH7.10 to drain page
- * requests and responses in hardware.
- */
- memset(desc, 0, sizeof(desc));
- desc[0].qw0 = QI_IWD_STATUS_DATA(QI_DONE) |
- QI_IWD_FENCE |
- QI_IWD_TYPE;
- desc[1].qw0 = QI_EIOTLB_PASID(pasid) |
- QI_EIOTLB_DID(did) |
- QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
- QI_EIOTLB_TYPE;
- desc[2].qw0 = QI_DEV_EIOTLB_PASID(pasid) |
- QI_DEV_EIOTLB_SID(sid) |
- QI_DEV_EIOTLB_QDEP(qdep) |
- QI_DEIOTLB_TYPE |
- QI_DEV_IOTLB_PFSID(info->pfsid);
-qi_retry:
- reinit_completion(&iommu->prq_complete);
- qi_submit_sync(iommu, desc, 3, QI_OPT_WAIT_DRAIN);
- if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
- wait_for_completion(&iommu->prq_complete);
- goto qi_retry;
- }
-}
-
-static int prq_to_iommu_prot(struct page_req_dsc *req)
-{
- int prot = 0;
-
- if (req->rd_req)
- prot |= IOMMU_FAULT_PERM_READ;
- if (req->wr_req)
- prot |= IOMMU_FAULT_PERM_WRITE;
- if (req->exe_req)
- prot |= IOMMU_FAULT_PERM_EXEC;
- if (req->pm_req)
- prot |= IOMMU_FAULT_PERM_PRIV;
-
- return prot;
-}
-
-static void intel_svm_prq_report(struct intel_iommu *iommu, struct device *dev,
- struct page_req_dsc *desc)
-{
- struct iopf_fault event = { };
-
- /* Fill in event data for device specific processing */
- event.fault.type = IOMMU_FAULT_PAGE_REQ;
- event.fault.prm.addr = (u64)desc->addr << VTD_PAGE_SHIFT;
- event.fault.prm.pasid = desc->pasid;
- event.fault.prm.grpid = desc->prg_index;
- event.fault.prm.perm = prq_to_iommu_prot(desc);
-
- if (desc->lpig)
- event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
- if (desc->pasid_present) {
- event.fault.prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
- event.fault.prm.flags |= IOMMU_FAULT_PAGE_RESPONSE_NEEDS_PASID;
- }
-
- iommu_report_device_fault(dev, &event);
-}
-
-static void handle_bad_prq_event(struct intel_iommu *iommu,
- struct page_req_dsc *req, int result)
-{
- struct qi_desc desc = { };
-
- pr_err("%s: Invalid page request: %08llx %08llx\n",
- iommu->name, ((unsigned long long *)req)[0],
- ((unsigned long long *)req)[1]);
-
- if (!req->lpig)
- return;
-
- desc.qw0 = QI_PGRP_PASID(req->pasid) |
- QI_PGRP_DID(req->rid) |
- QI_PGRP_PASID_P(req->pasid_present) |
- QI_PGRP_RESP_CODE(result) |
- QI_PGRP_RESP_TYPE;
- desc.qw1 = QI_PGRP_IDX(req->prg_index) |
- QI_PGRP_LPIG(req->lpig);
-
- qi_submit_sync(iommu, &desc, 1, 0);
-}
-
-static irqreturn_t prq_event_thread(int irq, void *d)
-{
- struct intel_iommu *iommu = d;
- struct page_req_dsc *req;
- int head, tail, handled;
- struct device *dev;
- u64 address;
-
- /*
- * Clear PPR bit before reading head/tail registers, to ensure that
- * we get a new interrupt if needed.
- */
- writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
-
- tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
- head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
- handled = (head != tail);
- while (head != tail) {
- req = &iommu->prq[head / sizeof(*req)];
- address = (u64)req->addr << VTD_PAGE_SHIFT;
-
- if (unlikely(!req->pasid_present)) {
- pr_err("IOMMU: %s: Page request without PASID\n",
- iommu->name);
-bad_req:
- handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
- goto prq_advance;
- }
-
- if (unlikely(!is_canonical_address(address))) {
- pr_err("IOMMU: %s: Address is not canonical\n",
- iommu->name);
- goto bad_req;
- }
-
- if (unlikely(req->pm_req && (req->rd_req | req->wr_req))) {
- pr_err("IOMMU: %s: Page request in Privilege Mode\n",
- iommu->name);
- goto bad_req;
- }
-
- if (unlikely(req->exe_req && req->rd_req)) {
- pr_err("IOMMU: %s: Execution request not supported\n",
- iommu->name);
- goto bad_req;
- }
-
- /* Drop Stop Marker message. No need for a response. */
- if (unlikely(req->lpig && !req->rd_req && !req->wr_req))
- goto prq_advance;
-
- /*
- * If prq is to be handled outside iommu driver via receiver of
- * the fault notifiers, we skip the page response here.
- */
- mutex_lock(&iommu->iopf_lock);
- dev = device_rbtree_find(iommu, req->rid);
- if (!dev) {
- mutex_unlock(&iommu->iopf_lock);
- goto bad_req;
- }
-
- intel_svm_prq_report(iommu, dev, req);
- trace_prq_report(iommu, dev, req->qw_0, req->qw_1,
- req->qw_2, req->qw_3,
- iommu->prq_seq_number++);
- mutex_unlock(&iommu->iopf_lock);
-prq_advance:
- head = (head + sizeof(*req)) & PRQ_RING_MASK;
- }
-
- dmar_writeq(iommu->reg + DMAR_PQH_REG, tail);
-
- /*
- * Clear the page request overflow bit and wake up all threads that
- * are waiting for the completion of this handling.
- */
- if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) {
- pr_info_ratelimited("IOMMU: %s: PRQ overflow detected\n",
- iommu->name);
- head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
- tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
- if (head == tail) {
- iopf_queue_discard_partial(iommu->iopf_queue);
- writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG);
- pr_info_ratelimited("IOMMU: %s: PRQ overflow cleared",
- iommu->name);
- }
- }
-
- if (!completion_done(&iommu->prq_complete))
- complete(&iommu->prq_complete);
-
- return IRQ_RETVAL(handled);
-}
-
-void intel_svm_page_response(struct device *dev, struct iopf_fault *evt,
- struct iommu_page_response *msg)
-{
- struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct intel_iommu *iommu = info->iommu;
- u8 bus = info->bus, devfn = info->devfn;
- struct iommu_fault_page_request *prm;
- struct qi_desc desc;
- bool pasid_present;
- bool last_page;
- u16 sid;
-
- prm = &evt->fault.prm;
- sid = PCI_DEVID(bus, devfn);
- pasid_present = prm->flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
- last_page = prm->flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE;
-
- desc.qw0 = QI_PGRP_PASID(prm->pasid) | QI_PGRP_DID(sid) |
- QI_PGRP_PASID_P(pasid_present) |
- QI_PGRP_RESP_CODE(msg->code) |
- QI_PGRP_RESP_TYPE;
- desc.qw1 = QI_PGRP_IDX(prm->grpid) | QI_PGRP_LPIG(last_page);
- desc.qw2 = 0;
- desc.qw3 = 0;
-
- qi_submit_sync(iommu, &desc, 1, 0);
-}
-
static void intel_svm_domain_free(struct iommu_domain *domain)
{
struct dmar_domain *dmar_domain = to_dmar_domain(domain);
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
2024-10-15 21:08 ` [PATCH v4 1/5] iommu/vt-d: Separate page request queue from SVM Joel Granados
@ 2024-10-15 21:08 ` Joel Granados
2024-10-28 7:50 ` Yi Liu
2024-10-15 21:08 ` [PATCH v4 3/5] iommu: kconfig: Move IOMMU_IOPF into INTEL_IOMMU Joel Granados
` (3 subsequent siblings)
5 siblings, 1 reply; 21+ messages in thread
From: Joel Granados @ 2024-10-15 21:08 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: linux-kernel, iommu, Joel Granados, Klaus Jensen, Joel Granados
From: Klaus Jensen <k.jensen@samsung.com>
PASID is not strictly needed when handling a PRQ event; remove the check
for the pasid present bit in the request. This change was not included
in the creation of prq.c to emphasize the change in capability checks
when handing PRQ events.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
drivers/iommu/intel/prq.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/intel/prq.c b/drivers/iommu/intel/prq.c
index d4f18eb46475..3c50c848893f 100644
--- a/drivers/iommu/intel/prq.c
+++ b/drivers/iommu/intel/prq.c
@@ -223,18 +223,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
req = &iommu->prq[head / sizeof(*req)];
address = (u64)req->addr << VTD_PAGE_SHIFT;
- if (unlikely(!req->pasid_present)) {
- pr_err("IOMMU: %s: Page request without PASID\n",
- iommu->name);
-bad_req:
- handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
- goto prq_advance;
- }
-
if (unlikely(!is_canonical_address(address))) {
pr_err("IOMMU: %s: Address is not canonical\n",
iommu->name);
- goto bad_req;
+bad_req:
+ handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
+ goto prq_advance;
}
if (unlikely(req->pm_req && (req->rd_req | req->wr_req))) {
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 3/5] iommu: kconfig: Move IOMMU_IOPF into INTEL_IOMMU
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
2024-10-15 21:08 ` [PATCH v4 1/5] iommu/vt-d: Separate page request queue from SVM Joel Granados
2024-10-15 21:08 ` [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread Joel Granados
@ 2024-10-15 21:08 ` Joel Granados
2024-10-15 21:08 ` [PATCH v4 4/5] iommufd: Enable PRI when doing the iommufd_hwpt_alloc Joel Granados
` (2 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Joel Granados @ 2024-10-15 21:08 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: linux-kernel, iommu, Joel Granados, Joel Granados
Move IOMMU_IOPF from under INTEL_IOMMU_SVM into INTEL_IOMMU. This
certifies that the core intel iommu utilizes the IOPF library
functions, independent of the INTEL_IOMMU_SVM config.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
drivers/iommu/intel/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig
index 88fd32a9323c..f2f538c70650 100644
--- a/drivers/iommu/intel/Kconfig
+++ b/drivers/iommu/intel/Kconfig
@@ -14,6 +14,7 @@ config INTEL_IOMMU
depends on PCI_MSI && ACPI && X86
select IOMMU_API
select IOMMU_IOVA
+ select IOMMU_IOPF
select IOMMUFD_DRIVER if IOMMUFD
select NEED_DMA_MAP_STATE
select DMAR_TABLE
@@ -50,7 +51,6 @@ config INTEL_IOMMU_SVM
depends on X86_64
select MMU_NOTIFIER
select IOMMU_SVA
- select IOMMU_IOPF
help
Shared Virtual Memory (SVM) provides a facility for devices
to access DMA resources through process address space by
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 4/5] iommufd: Enable PRI when doing the iommufd_hwpt_alloc
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
` (2 preceding siblings ...)
2024-10-15 21:08 ` [PATCH v4 3/5] iommu: kconfig: Move IOMMU_IOPF into INTEL_IOMMU Joel Granados
@ 2024-10-15 21:08 ` Joel Granados
2024-10-15 21:08 ` [PATCH v4 5/5] iommu/vt-d: drop pasid requirement for prq initialization Joel Granados
2024-11-02 3:27 ` [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Baolu Lu
5 siblings, 0 replies; 21+ messages in thread
From: Joel Granados @ 2024-10-15 21:08 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: linux-kernel, iommu, Joel Granados, Joel Granados
Add IOMMU_HWPT_FAULT_ID_VALID as part of the valid flags when doing an
iommufd_hwpt_alloc allowing the use of an iommu fault allocation
(iommu_fault_alloc) with the IOMMU_HWPT_ALLOC ioctl.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
drivers/iommu/intel/iommu.c | 3 ++-
drivers/iommu/iommufd/hw_pagetable.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5d9c3333c92d..dd399ca09859 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3535,7 +3535,8 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
}
if (flags &
- (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING)))
+ (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING
+ | IOMMU_HWPT_FAULT_ID_VALID)))
return ERR_PTR(-EOPNOTSUPP);
if (nested_parent && !nested_supported(iommu))
return ERR_PTR(-EOPNOTSUPP);
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index d06bf6e6c19f..8f020bc0815f 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -107,7 +107,8 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
const struct iommu_user_data *user_data)
{
const u32 valid_flags = IOMMU_HWPT_ALLOC_NEST_PARENT |
- IOMMU_HWPT_ALLOC_DIRTY_TRACKING;
+ IOMMU_HWPT_ALLOC_DIRTY_TRACKING |
+ IOMMU_HWPT_FAULT_ID_VALID;
const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
struct iommufd_hwpt_paging *hwpt_paging;
struct iommufd_hw_pagetable *hwpt;
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 5/5] iommu/vt-d: drop pasid requirement for prq initialization
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
` (3 preceding siblings ...)
2024-10-15 21:08 ` [PATCH v4 4/5] iommufd: Enable PRI when doing the iommufd_hwpt_alloc Joel Granados
@ 2024-10-15 21:08 ` Joel Granados
2024-11-02 3:27 ` [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Baolu Lu
5 siblings, 0 replies; 21+ messages in thread
From: Joel Granados @ 2024-10-15 21:08 UTC (permalink / raw)
To: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: linux-kernel, iommu, Joel Granados, Klaus Jensen, Joel Granados
From: Klaus Jensen <k.jensen@samsung.com>
PASID support within the IOMMU is not required to enable the Page
Request Queue, only the PRS capability.
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
drivers/iommu/intel/iommu.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index dd399ca09859..53bbaba847d3 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1428,10 +1428,8 @@ static void free_dmar_iommu(struct intel_iommu *iommu)
/* free context mapping */
free_context_table(iommu);
- if (pasid_supported(iommu)) {
- if (ecap_prs(iommu->ecap))
- intel_iommu_finish_prq(iommu);
- }
+ if (ecap_prs(iommu->ecap))
+ intel_iommu_finish_prq(iommu);
}
/*
@@ -2352,7 +2350,7 @@ static int __init init_dmars(void)
iommu_flush_write_buffer(iommu);
- if (pasid_supported(iommu) && ecap_prs(iommu->ecap)) {
+ if (ecap_prs(iommu->ecap)) {
/*
* Call dmar_alloc_hwirq() with dmar_global_lock held,
* could cause possible lock race condition.
@@ -2783,7 +2781,7 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
intel_iommu_init_qi(iommu);
iommu_flush_write_buffer(iommu);
- if (pasid_supported(iommu) && ecap_prs(iommu->ecap)) {
+ if (ecap_prs(iommu->ecap)) {
ret = intel_iommu_enable_prq(iommu);
if (ret)
goto disable_iommu;
--
2.43.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-15 21:08 ` [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread Joel Granados
@ 2024-10-28 7:50 ` Yi Liu
2024-10-28 8:23 ` Baolu Lu
2024-10-28 10:24 ` Joel Granados
0 siblings, 2 replies; 21+ messages in thread
From: Yi Liu @ 2024-10-28 7:50 UTC (permalink / raw)
To: Joel Granados, David Woodhouse, Lu Baolu, Joerg Roedel,
Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian,
Klaus Jensen
Cc: linux-kernel, iommu, Klaus Jensen
On 2024/10/16 05:08, Joel Granados wrote:
> From: Klaus Jensen <k.jensen@samsung.com>
>
> PASID is not strictly needed when handling a PRQ event; remove the check
> for the pasid present bit in the request. This change was not included
> in the creation of prq.c to emphasize the change in capability checks
> when handing PRQ events.
>
> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Joel Granados <joel.granados@kernel.org>
looks like the PRQ draining is missed for the PRI usage. When a pasid
entry is destroyed, it might need to add helper similar to the
intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
> ---
> drivers/iommu/intel/prq.c | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/intel/prq.c b/drivers/iommu/intel/prq.c
> index d4f18eb46475..3c50c848893f 100644
> --- a/drivers/iommu/intel/prq.c
> +++ b/drivers/iommu/intel/prq.c
> @@ -223,18 +223,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
> req = &iommu->prq[head / sizeof(*req)];
> address = (u64)req->addr << VTD_PAGE_SHIFT;
>
> - if (unlikely(!req->pasid_present)) {
> - pr_err("IOMMU: %s: Page request without PASID\n",
> - iommu->name);
> -bad_req:
> - handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
> - goto prq_advance;
> - }
> -
> if (unlikely(!is_canonical_address(address))) {
> pr_err("IOMMU: %s: Address is not canonical\n",
> iommu->name);
> - goto bad_req;
> +bad_req:
> + handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
> + goto prq_advance;
> }
>
> if (unlikely(req->pm_req && (req->rd_req | req->wr_req))) {
>
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-28 7:50 ` Yi Liu
@ 2024-10-28 8:23 ` Baolu Lu
2024-10-29 5:11 ` Yi Liu
2024-10-28 10:24 ` Joel Granados
1 sibling, 1 reply; 21+ messages in thread
From: Baolu Lu @ 2024-10-28 8:23 UTC (permalink / raw)
To: Yi Liu, Joel Granados, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: baolu.lu, linux-kernel, iommu, Klaus Jensen
On 2024/10/28 15:50, Yi Liu wrote:
> On 2024/10/16 05:08, Joel Granados wrote:
>> From: Klaus Jensen <k.jensen@samsung.com>
>>
>> PASID is not strictly needed when handling a PRQ event; remove the check
>> for the pasid present bit in the request. This change was not included
>> in the creation of prq.c to emphasize the change in capability checks
>> when handing PRQ events.
>>
>> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>> Signed-off-by: Joel Granados <joel.granados@kernel.org>
>
> looks like the PRQ draining is missed for the PRI usage. When a pasid
> entry is destroyed, it might need to add helper similar to the
> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
Perhaps we can move intel_drain_pasid_prq() into
intel_pasid_tear_down_entry(), indicating that once a translation is
removed from the pasid and PRI is enabled on the device, the page
requests for the pasid should be flushed.
Thanks,
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-28 7:50 ` Yi Liu
2024-10-28 8:23 ` Baolu Lu
@ 2024-10-28 10:24 ` Joel Granados
2024-10-29 3:12 ` Baolu Lu
1 sibling, 1 reply; 21+ messages in thread
From: Joel Granados @ 2024-10-28 10:24 UTC (permalink / raw)
To: Yi Liu
Cc: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen,
linux-kernel, iommu, Klaus Jensen
On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
> On 2024/10/16 05:08, Joel Granados wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> >
> > PASID is not strictly needed when handling a PRQ event; remove the check
> > for the pasid present bit in the request. This change was not included
> > in the creation of prq.c to emphasize the change in capability checks
> > when handing PRQ events.
> >
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Joel Granados <joel.granados@kernel.org>
>
> looks like the PRQ draining is missed for the PRI usage. When a pasid
> entry is destroyed, it might need to add helper similar to the
> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
These types of user space PRIs (non-pasid, non-svm) are created by
making use of iommufd_hwpt_replace_device. Which adds an entry to the
pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
iommufd_hwpt_replace_device
-> iommufd_fault_domain_repalce_dev
-> __fault_domain_replace_dev
-> iommu_replace_group_handle
-> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
It is my understanding that this will provide the needed relation
between the device and the prq in such a way that when remove_dev_pasid
is called, intel_iommu_drain_pasid_prq will be called with the
appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
mistaken.
Does this answer your question? Do you have a specific path that you are
looking at where a specific non-pasid drain is needed?
Best
>
> > ---
> > drivers/iommu/intel/prq.c | 12 +++---------
> > 1 file changed, 3 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/prq.c b/drivers/iommu/intel/prq.c
> > index d4f18eb46475..3c50c848893f 100644
> > --- a/drivers/iommu/intel/prq.c
> > +++ b/drivers/iommu/intel/prq.c
> > @@ -223,18 +223,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
> > req = &iommu->prq[head / sizeof(*req)];
> > address = (u64)req->addr << VTD_PAGE_SHIFT;
> >
> > - if (unlikely(!req->pasid_present)) {
> > - pr_err("IOMMU: %s: Page request without PASID\n",
> > - iommu->name);
> > -bad_req:
> > - handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
> > - goto prq_advance;
> > - }
> > -
> > if (unlikely(!is_canonical_address(address))) {
> > pr_err("IOMMU: %s: Address is not canonical\n",
> > iommu->name);
> > - goto bad_req;
> > +bad_req:
> > + handle_bad_prq_event(iommu, req, QI_RESP_INVALID);
> > + goto prq_advance;
> > }
> >
> > if (unlikely(req->pm_req && (req->rd_req | req->wr_req))) {
> >
>
> --
> Regards,
> Yi Liu
--
Joel Granados
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-28 10:24 ` Joel Granados
@ 2024-10-29 3:12 ` Baolu Lu
2024-10-29 5:13 ` Yi Liu
2024-10-30 14:28 ` Joel Granados
0 siblings, 2 replies; 21+ messages in thread
From: Baolu Lu @ 2024-10-29 3:12 UTC (permalink / raw)
To: Joel Granados, Yi Liu
Cc: baolu.lu, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen,
linux-kernel, iommu, Klaus Jensen
On 2024/10/28 18:24, Joel Granados wrote:
> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>> On 2024/10/16 05:08, Joel Granados wrote:
>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>
>>> PASID is not strictly needed when handling a PRQ event; remove the check
>>> for the pasid present bit in the request. This change was not included
>>> in the creation of prq.c to emphasize the change in capability checks
>>> when handing PRQ events.
>>>
>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>> entry is destroyed, it might need to add helper similar to the
>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
> These types of user space PRIs (non-pasid, non-svm) are created by
> making use of iommufd_hwpt_replace_device. Which adds an entry to the
> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>
> iommufd_hwpt_replace_device
> -> iommufd_fault_domain_repalce_dev
> -> __fault_domain_replace_dev
> -> iommu_replace_group_handle
-> __iommu_group_set_domain
-> intel_iommu_attach_device
-> device_block_translation
-> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
Here a domain is removed from the pasid entry, hence we need to flush
all page requests that are pending in the IOMMU page request queue or
the PCI fabric.
> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
>
> It is my understanding that this will provide the needed relation
> between the device and the prq in such a way that when remove_dev_pasid
> is called, intel_iommu_drain_pasid_prq will be called with the
> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
> mistaken.
Removing a domain from a RID and a PASID are different paths.
Previously, this IOMMU driver only supported page requests on PASID
(non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
the domain-removing RID path.
With the changes made in this series, the driver now supports page
requests for RID. It should also flush the PRQ when removing a domain
from a PASID entry for IOMMU_NO_PASID.
>
> Does this answer your question? Do you have a specific path that you are
> looking at where a specific non-pasid drain is needed?
Perhaps we can simply add below change.
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e860bc9439a2..a24a42649621 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
device *dev, ioasid_t pasid,
intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
kfree(dev_pasid);
intel_pasid_tear_down_entry(iommu, dev, pasid, false);
- intel_drain_pasid_prq(dev, pasid);
}
static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 2e5fa0a23299..8639f3eb4264 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
*iommu, struct device *dev,
iommu->flush.flush_iotlb(iommu, did, 0, 0,
DMA_TLB_DSI_FLUSH);
devtlb_invalidation_with_pasid(iommu, dev, pasid);
+ intel_drain_pasid_prq(dev, pasid);
}
/*
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 078d1e32a24e..ff88f31053d1 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
pasid)
int qdep;
info = dev_iommu_priv_get(dev);
- if (WARN_ON(!info || !dev_is_pci(dev)))
- return;
-
if (!info->pri_enabled)
return;
Generally, intel_drain_pasid_prq() should be called if
- a translation is removed from a pasid entry; and
- PRI on this device is enabled.
Thought?
--
baolu
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-28 8:23 ` Baolu Lu
@ 2024-10-29 5:11 ` Yi Liu
0 siblings, 0 replies; 21+ messages in thread
From: Yi Liu @ 2024-10-29 5:11 UTC (permalink / raw)
To: Baolu Lu, Joel Granados, David Woodhouse, Joerg Roedel,
Will Deacon, Robin Murphy, Jason Gunthorpe, Kevin Tian,
Klaus Jensen
Cc: linux-kernel, iommu, Klaus Jensen
On 2024/10/28 16:23, Baolu Lu wrote:
> On 2024/10/28 15:50, Yi Liu wrote:
>> On 2024/10/16 05:08, Joel Granados wrote:
>>> From: Klaus Jensen <k.jensen@samsung.com>
>>>
>>> PASID is not strictly needed when handling a PRQ event; remove the check
>>> for the pasid present bit in the request. This change was not included
>>> in the creation of prq.c to emphasize the change in capability checks
>>> when handing PRQ events.
>>>
>>> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>>> Signed-off-by: Joel Granados <joel.granados@kernel.org>
>>
>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>> entry is destroyed, it might need to add helper similar to the
>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>
> Perhaps we can move intel_drain_pasid_prq() into
> intel_pasid_tear_down_entry(), indicating that once a translation is
> removed from the pasid and PRI is enabled on the device, the page
> requests for the pasid should be flushed.
>
yes, something like the below patch but no flag to opt-out the PRQ drain.
https://lore.kernel.org/linux-iommu/20241018055402.23277-3-yi.l.liu@intel.com/
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-29 3:12 ` Baolu Lu
@ 2024-10-29 5:13 ` Yi Liu
2024-10-29 5:39 ` Baolu Lu
2024-10-30 14:28 ` Joel Granados
1 sibling, 1 reply; 21+ messages in thread
From: Yi Liu @ 2024-10-29 5:13 UTC (permalink / raw)
To: Baolu Lu, Joel Granados
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe, Kevin Tian, Klaus Jensen, linux-kernel, iommu,
Klaus Jensen
On 2024/10/29 11:12, Baolu Lu wrote:
> On 2024/10/28 18:24, Joel Granados wrote:
>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>>
>>>> PASID is not strictly needed when handling a PRQ event; remove the check
>>>> for the pasid present bit in the request. This change was not included
>>>> in the creation of prq.c to emphasize the change in capability checks
>>>> when handing PRQ events.
>>>>
>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>> entry is destroyed, it might need to add helper similar to the
>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>> These types of user space PRIs (non-pasid, non-svm) are created by
>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>
>> iommufd_hwpt_replace_device
>> -> iommufd_fault_domain_repalce_dev
>> -> __fault_domain_replace_dev
>> -> iommu_replace_group_handle
> -> __iommu_group_set_domain
> -> intel_iommu_attach_device
> -> device_block_translation
> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>
> Here a domain is removed from the pasid entry, hence we need to flush
> all page requests that are pending in the IOMMU page request queue or
> the PCI fabric.
>
>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
>>
>> It is my understanding that this will provide the needed relation
>> between the device and the prq in such a way that when remove_dev_pasid
>> is called, intel_iommu_drain_pasid_prq will be called with the
>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
>> mistaken.
>
> Removing a domain from a RID and a PASID are different paths.
> Previously, this IOMMU driver only supported page requests on PASID
> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
> the domain-removing RID path.
>
> With the changes made in this series, the driver now supports page
> requests for RID. It should also flush the PRQ when removing a domain
> from a PASID entry for IOMMU_NO_PASID.
>
>>
>> Does this answer your question? Do you have a specific path that you are
>> looking at where a specific non-pasid drain is needed?
>
> Perhaps we can simply add below change.
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index e860bc9439a2..a24a42649621 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
> device *dev, ioasid_t pasid,
> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
> kfree(dev_pasid);
> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> - intel_drain_pasid_prq(dev, pasid);
> }
>
> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 2e5fa0a23299..8639f3eb4264 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
> *iommu, struct device *dev,
> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> DMA_TLB_DSI_FLUSH);
>
> devtlb_invalidation_with_pasid(iommu, dev, pasid);
> + intel_drain_pasid_prq(dev, pasid);
> }
>
> /*
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 078d1e32a24e..ff88f31053d1 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32 pasid)
> int qdep;
>
> info = dev_iommu_priv_get(dev);
> - if (WARN_ON(!info || !dev_is_pci(dev)))
> - return;
> -
> if (!info->pri_enabled)
> return;
>
> Generally, intel_drain_pasid_prq() should be called if
>
> - a translation is removed from a pasid entry; and
> - PRI on this device is enabled.
If the @pasid==IOMMU_NO_PASID, PRQ drain should use the iotlb invalidation
and dev-tlb invalidation descriptors. So extra code change is needed in
intel_drain_pasid_prq(). Or perhaps it's better to have a separate helper
for draining prq for non-pasid case.
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-29 5:13 ` Yi Liu
@ 2024-10-29 5:39 ` Baolu Lu
2024-10-30 5:51 ` Yi Liu
0 siblings, 1 reply; 21+ messages in thread
From: Baolu Lu @ 2024-10-29 5:39 UTC (permalink / raw)
To: Yi Liu, Joel Granados
Cc: baolu.lu, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen,
linux-kernel, iommu, Klaus Jensen
On 2024/10/29 13:13, Yi Liu wrote:
> On 2024/10/29 11:12, Baolu Lu wrote:
>> On 2024/10/28 18:24, Joel Granados wrote:
>>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>>>
>>>>> PASID is not strictly needed when handling a PRQ event; remove the
>>>>> check
>>>>> for the pasid present bit in the request. This change was not included
>>>>> in the creation of prq.c to emphasize the change in capability checks
>>>>> when handing PRQ events.
>>>>>
>>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>>> entry is destroyed, it might need to add helper similar to the
>>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>>> These types of user space PRIs (non-pasid, non-svm) are created by
>>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>>
>>> iommufd_hwpt_replace_device
>>> -> iommufd_fault_domain_repalce_dev
>>> -> __fault_domain_replace_dev
>>> -> iommu_replace_group_handle
>> -> __iommu_group_set_domain
>> -> intel_iommu_attach_device
>> -> device_block_translation
>> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>>
>> Here a domain is removed from the pasid entry, hence we need to flush
>> all page requests that are pending in the IOMMU page request queue or
>> the PCI fabric.
>>
>>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
>>>
>>> It is my understanding that this will provide the needed relation
>>> between the device and the prq in such a way that when remove_dev_pasid
>>> is called, intel_iommu_drain_pasid_prq will be called with the
>>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
>>> mistaken.
>>
>> Removing a domain from a RID and a PASID are different paths.
>> Previously, this IOMMU driver only supported page requests on PASID
>> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
>> the domain-removing RID path.
>>
>> With the changes made in this series, the driver now supports page
>> requests for RID. It should also flush the PRQ when removing a domain
>> from a PASID entry for IOMMU_NO_PASID.
>>
>>>
>>> Does this answer your question? Do you have a specific path that you are
>>> looking at where a specific non-pasid drain is needed?
>>
>> Perhaps we can simply add below change.
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index e860bc9439a2..a24a42649621 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
>> device *dev, ioasid_t pasid,
>> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
>> kfree(dev_pasid);
>> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>> - intel_drain_pasid_prq(dev, pasid);
>> }
>>
>> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>> index 2e5fa0a23299..8639f3eb4264 100644
>> --- a/drivers/iommu/intel/pasid.c
>> +++ b/drivers/iommu/intel/pasid.c
>> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct
>> intel_iommu *iommu, struct device *dev,
>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>> DMA_TLB_DSI_FLUSH);
>>
>> devtlb_invalidation_with_pasid(iommu, dev, pasid);
>> + intel_drain_pasid_prq(dev, pasid);
>> }
>>
>> /*
>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>> index 078d1e32a24e..ff88f31053d1 100644
>> --- a/drivers/iommu/intel/svm.c
>> +++ b/drivers/iommu/intel/svm.c
>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
>> pasid)
>> int qdep;
>>
>> info = dev_iommu_priv_get(dev);
>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>> - return;
>> -
>> if (!info->pri_enabled)
>> return;
>>
>> Generally, intel_drain_pasid_prq() should be called if
>>
>> - a translation is removed from a pasid entry; and
>> - PRI on this device is enabled.
>
> If the @pasid==IOMMU_NO_PASID, PRQ drain should use the iotlb invalidation
> and dev-tlb invalidation descriptors. So extra code change is needed in
> intel_drain_pasid_prq(). Or perhaps it's better to have a separate helper
> for draining prq for non-pasid case.
According to VT-d spec, section 7.10, "Software Steps to Drain Page
Requests & Responses", we can simply replace p_iotlb_inv_dsc and
p_dev_tlb_inv_dsc with iotlb_inv_dsc and dev_tlb_inv_dsc. Any
significant negative performance impact?
--
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-29 5:39 ` Baolu Lu
@ 2024-10-30 5:51 ` Yi Liu
2024-10-30 8:32 ` Baolu Lu
0 siblings, 1 reply; 21+ messages in thread
From: Yi Liu @ 2024-10-30 5:51 UTC (permalink / raw)
To: Baolu Lu, Joel Granados
Cc: David Woodhouse, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe, Kevin Tian, Klaus Jensen, linux-kernel, iommu,
Klaus Jensen
On 2024/10/29 13:39, Baolu Lu wrote:
> On 2024/10/29 13:13, Yi Liu wrote:
>> On 2024/10/29 11:12, Baolu Lu wrote:
>>> On 2024/10/28 18:24, Joel Granados wrote:
>>>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>>>>
>>>>>> PASID is not strictly needed when handling a PRQ event; remove the check
>>>>>> for the pasid present bit in the request. This change was not included
>>>>>> in the creation of prq.c to emphasize the change in capability checks
>>>>>> when handing PRQ events.
>>>>>>
>>>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>>>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>>>> entry is destroyed, it might need to add helper similar to the
>>>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>>>> These types of user space PRIs (non-pasid, non-svm) are created by
>>>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>>>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>>>
>>>> iommufd_hwpt_replace_device
>>>> -> iommufd_fault_domain_repalce_dev
>>>> -> __fault_domain_replace_dev
>>>> -> iommu_replace_group_handle
>>> -> __iommu_group_set_domain
>>> -> intel_iommu_attach_device
>>> -> device_block_translation
>>> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>>>
>>> Here a domain is removed from the pasid entry, hence we need to flush
>>> all page requests that are pending in the IOMMU page request queue or
>>> the PCI fabric.
>>>
>>>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
>>>>
>>>> It is my understanding that this will provide the needed relation
>>>> between the device and the prq in such a way that when remove_dev_pasid
>>>> is called, intel_iommu_drain_pasid_prq will be called with the
>>>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
>>>> mistaken.
>>>
>>> Removing a domain from a RID and a PASID are different paths.
>>> Previously, this IOMMU driver only supported page requests on PASID
>>> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
>>> the domain-removing RID path.
>>>
>>> With the changes made in this series, the driver now supports page
>>> requests for RID. It should also flush the PRQ when removing a domain
>>> from a PASID entry for IOMMU_NO_PASID.
>>>
>>>>
>>>> Does this answer your question? Do you have a specific path that you are
>>>> looking at where a specific non-pasid drain is needed?
>>>
>>> Perhaps we can simply add below change.
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index e860bc9439a2..a24a42649621 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
>>> device *dev, ioasid_t pasid,
>>> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
>>> kfree(dev_pasid);
>>> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>>> - intel_drain_pasid_prq(dev, pasid);
>>> }
>>>
>>> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>>> index 2e5fa0a23299..8639f3eb4264 100644
>>> --- a/drivers/iommu/intel/pasid.c
>>> +++ b/drivers/iommu/intel/pasid.c
>>> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
>>> *iommu, struct device *dev,
>>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>>> DMA_TLB_DSI_FLUSH);
>>>
>>> devtlb_invalidation_with_pasid(iommu, dev, pasid);
>>> + intel_drain_pasid_prq(dev, pasid);
>>> }
>>>
>>> /*
>>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>>> index 078d1e32a24e..ff88f31053d1 100644
>>> --- a/drivers/iommu/intel/svm.c
>>> +++ b/drivers/iommu/intel/svm.c
>>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
>>> pasid)
>>> int qdep;
>>>
>>> info = dev_iommu_priv_get(dev);
>>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>>> - return;
>>> -
>>> if (!info->pri_enabled)
>>> return;
>>>
>>> Generally, intel_drain_pasid_prq() should be called if
>>>
>>> - a translation is removed from a pasid entry; and
>>> - PRI on this device is enabled.
>>
>> If the @pasid==IOMMU_NO_PASID, PRQ drain should use the iotlb invalidation
>> and dev-tlb invalidation descriptors. So extra code change is needed in
>> intel_drain_pasid_prq(). Or perhaps it's better to have a separate helper
>> for draining prq for non-pasid case.
>
> According to VT-d spec, section 7.10, "Software Steps to Drain Page
> Requests & Responses", we can simply replace p_iotlb_inv_dsc and
> p_dev_tlb_inv_dsc with iotlb_inv_dsc and dev_tlb_inv_dsc. Any
> significant negative performance impact?
It's not about performance impact. My point is to use iotlb_inv_dsc and
dev_tlb_inv_dsc for the @pasid==IOMMU_NO_PASID case. The existing
intel_drain_pasid_prq() only uses p_iotlb_inv_dsc and p_dev_tlb_inv_dsc.
The way you described in above reply works. But it needs to add if/else
to use the correct invalidation descriptor. Since the descriptor
composition has several lines, so just an ask if it's better to have a
separate helper. :)
--
Regards,
Yi Liu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-30 5:51 ` Yi Liu
@ 2024-10-30 8:32 ` Baolu Lu
0 siblings, 0 replies; 21+ messages in thread
From: Baolu Lu @ 2024-10-30 8:32 UTC (permalink / raw)
To: Yi Liu, Joel Granados
Cc: baolu.lu, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen,
linux-kernel, iommu, Klaus Jensen
On 2024/10/30 13:51, Yi Liu wrote:
> On 2024/10/29 13:39, Baolu Lu wrote:
>> On 2024/10/29 13:13, Yi Liu wrote:
>>> On 2024/10/29 11:12, Baolu Lu wrote:
>>>> On 2024/10/28 18:24, Joel Granados wrote:
>>>>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>>>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>>>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>>>>>
>>>>>>> PASID is not strictly needed when handling a PRQ event; remove
>>>>>>> the check
>>>>>>> for the pasid present bit in the request. This change was not
>>>>>>> included
>>>>>>> in the creation of prq.c to emphasize the change in capability
>>>>>>> checks
>>>>>>> when handing PRQ events.
>>>>>>>
>>>>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>>>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>>>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>>>>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>>>>> entry is destroyed, it might need to add helper similar to the
>>>>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>>>>> These types of user space PRIs (non-pasid, non-svm) are created by
>>>>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>>>>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>>>>
>>>>> iommufd_hwpt_replace_device
>>>>> -> iommufd_fault_domain_repalce_dev
>>>>> -> __fault_domain_replace_dev
>>>>> -> iommu_replace_group_handle
>>>> -> __iommu_group_set_domain
>>>> -> intel_iommu_attach_device
>>>> -> device_block_translation
>>>> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>>>>
>>>> Here a domain is removed from the pasid entry, hence we need to flush
>>>> all page requests that are pending in the IOMMU page request queue or
>>>> the PCI fabric.
>>>>
>>>>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID,
>>>>> GFP_KERNEL);
>>>>>
>>>>> It is my understanding that this will provide the needed relation
>>>>> between the device and the prq in such a way that when
>>>>> remove_dev_pasid
>>>>> is called, intel_iommu_drain_pasid_prq will be called with the
>>>>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if
>>>>> I'm
>>>>> mistaken.
>>>>
>>>> Removing a domain from a RID and a PASID are different paths.
>>>> Previously, this IOMMU driver only supported page requests on PASID
>>>> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the
>>>> PRQ in
>>>> the domain-removing RID path.
>>>>
>>>> With the changes made in this series, the driver now supports page
>>>> requests for RID. It should also flush the PRQ when removing a domain
>>>> from a PASID entry for IOMMU_NO_PASID.
>>>>
>>>>>
>>>>> Does this answer your question? Do you have a specific path that
>>>>> you are
>>>>> looking at where a specific non-pasid drain is needed?
>>>>
>>>> Perhaps we can simply add below change.
>>>>
>>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>>> index e860bc9439a2..a24a42649621 100644
>>>> --- a/drivers/iommu/intel/iommu.c
>>>> +++ b/drivers/iommu/intel/iommu.c
>>>> @@ -4283,7 +4283,6 @@ static void
>>>> intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
>>>> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
>>>> kfree(dev_pasid);
>>>> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>>>> - intel_drain_pasid_prq(dev, pasid);
>>>> }
>>>>
>>>> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>>>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>>>> index 2e5fa0a23299..8639f3eb4264 100644
>>>> --- a/drivers/iommu/intel/pasid.c
>>>> +++ b/drivers/iommu/intel/pasid.c
>>>> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct
>>>> intel_iommu *iommu, struct device *dev,
>>>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>>>> DMA_TLB_DSI_FLUSH);
>>>>
>>>> devtlb_invalidation_with_pasid(iommu, dev, pasid);
>>>> + intel_drain_pasid_prq(dev, pasid);
>>>> }
>>>>
>>>> /*
>>>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>>>> index 078d1e32a24e..ff88f31053d1 100644
>>>> --- a/drivers/iommu/intel/svm.c
>>>> +++ b/drivers/iommu/intel/svm.c
>>>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev,
>>>> u32 pasid)
>>>> int qdep;
>>>>
>>>> info = dev_iommu_priv_get(dev);
>>>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>>>> - return;
>>>> -
>>>> if (!info->pri_enabled)
>>>> return;
>>>>
>>>> Generally, intel_drain_pasid_prq() should be called if
>>>>
>>>> - a translation is removed from a pasid entry; and
>>>> - PRI on this device is enabled.
>>>
>>> If the @pasid==IOMMU_NO_PASID, PRQ drain should use the iotlb
>>> invalidation
>>> and dev-tlb invalidation descriptors. So extra code change is needed in
>>> intel_drain_pasid_prq(). Or perhaps it's better to have a separate
>>> helper
>>> for draining prq for non-pasid case.
>>
>> According to VT-d spec, section 7.10, "Software Steps to Drain Page
>> Requests & Responses", we can simply replace p_iotlb_inv_dsc and
>> p_dev_tlb_inv_dsc with iotlb_inv_dsc and dev_tlb_inv_dsc. Any
>> significant negative performance impact?
>
> It's not about performance impact. My point is to use iotlb_inv_dsc and
> dev_tlb_inv_dsc for the @pasid==IOMMU_NO_PASID case. The existing
> intel_drain_pasid_prq() only uses p_iotlb_inv_dsc and p_dev_tlb_inv_dsc.
> The way you described in above reply works. But it needs to add if/else
> to use the correct invalidation descriptor. Since the descriptor
> composition has several lines, so just an ask if it's better to have a
> separate helper. :)
The spec says (7.10 Software Steps to Drain Page Requests & Responses):
"
Submit an IOTLB invalidate descriptor (iotlb_inv_dsc or p_iotlb_inv_dsc)
followed by DeviceTLB invalidation descriptor (dev_tlb_inv_dsc or
p_dev_tlb_inv_dsc) targeting the endpoint device. These invalidation
requests can be of any granularity. Per the ordering requirements
described in Section 7.8, older page group responses issued by software
to the endpoint device before step (a) are guaranteed to be received by
the endpoint before the endpoint receives this Device-TLB invalidation
request.
"
The purpose of the cache invalidation requests sent to the device is to
leverage PCI ordering requirements to ensure that all page group
responses are received by the device before it processes the TLB
invalidation request. Therefore, the specification doesn't mandate the
type and granularity of invalidation requests, as long as they are
valid.
Given that the Page Request Interface (PRI) is only supported when the
IOMMU operates in scalable mode, I don't believe we need to make any
changes to the invalidation types at this time.
--
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-29 3:12 ` Baolu Lu
2024-10-29 5:13 ` Yi Liu
@ 2024-10-30 14:28 ` Joel Granados
2024-10-31 3:42 ` Baolu Lu
2024-10-31 9:57 ` Baolu Lu
1 sibling, 2 replies; 21+ messages in thread
From: Joel Granados @ 2024-10-30 14:28 UTC (permalink / raw)
To: Baolu Lu
Cc: Yi Liu, David Woodhouse, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe, Kevin Tian, Klaus Jensen, linux-kernel, iommu,
Klaus Jensen
On Tue, Oct 29, 2024 at 11:12:49AM +0800, Baolu Lu wrote:
> On 2024/10/28 18:24, Joel Granados wrote:
> > On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
> >> On 2024/10/16 05:08, Joel Granados wrote:
> >>> From: Klaus Jensen<k.jensen@samsung.com>
> >>>
> >>> PASID is not strictly needed when handling a PRQ event; remove the check
> >>> for the pasid present bit in the request. This change was not included
> >>> in the creation of prq.c to emphasize the change in capability checks
> >>> when handing PRQ events.
> >>>
> >>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
> >>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
> >>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
> >> looks like the PRQ draining is missed for the PRI usage. When a pasid
> >> entry is destroyed, it might need to add helper similar to the
> >> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
> > These types of user space PRIs (non-pasid, non-svm) are created by
> > making use of iommufd_hwpt_replace_device. Which adds an entry to the
> > pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
> >
> > iommufd_hwpt_replace_device
> > -> iommufd_fault_domain_repalce_dev
> > -> __fault_domain_replace_dev
> > -> iommu_replace_group_handle
> -> __iommu_group_set_domain
> -> intel_iommu_attach_device
> -> device_block_translation
> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>
> Here a domain is removed from the pasid entry, hence we need to flush
> all page requests that are pending in the IOMMU page request queue or
> the PCI fabric.
This make a lot of sense: To use iommufd_hwpt_replace_device to replace
the existing hwpt with a iopf enabled one, the soon to be irrelevant
page requests from the existing hwpt need to be flushed. And we were not
doing that here.
>
> > -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
> >
> > It is my understanding that this will provide the needed relation
> > between the device and the prq in such a way that when remove_dev_pasid
> > is called, intel_iommu_drain_pasid_prq will be called with the
> > appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
> > mistaken.
>
> Removing a domain from a RID and a PASID are different paths.
> Previously, this IOMMU driver only supported page requests on PASID
> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
> the domain-removing RID path.
>
> With the changes made in this series, the driver now supports page
> requests for RID. It should also flush the PRQ when removing a domain
> from a PASID entry for IOMMU_NO_PASID.
Thank you for your explanation. Clarifies where I lacked understanding.
>
> >
> > Does this answer your question? Do you have a specific path that you are
> > looking at where a specific non-pasid drain is needed?
>
> Perhaps we can simply add below change.
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index e860bc9439a2..a24a42649621 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
> device *dev, ioasid_t pasid,
> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
> kfree(dev_pasid);
> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> - intel_drain_pasid_prq(dev, pasid);
> }
>
> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> index 2e5fa0a23299..8639f3eb4264 100644
> --- a/drivers/iommu/intel/pasid.c
> +++ b/drivers/iommu/intel/pasid.c
> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
> *iommu, struct device *dev,
> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> DMA_TLB_DSI_FLUSH);
>
> devtlb_invalidation_with_pasid(iommu, dev, pasid);
> + intel_drain_pasid_prq(dev, pasid);
> }
This make sense logically as the intel_drain_pasid_prq keeps being
called at the end of intel_iommu_remove_dev_pasid, but it is now also
included in the intel_pasid_tear_down_entry call which adds it to the
case discussed.
>
> /*
> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> index 078d1e32a24e..ff88f31053d1 100644
> --- a/drivers/iommu/intel/svm.c
> +++ b/drivers/iommu/intel/svm.c
> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
> pasid)
> int qdep;
>
> info = dev_iommu_priv_get(dev);
> - if (WARN_ON(!info || !dev_is_pci(dev)))
> - return;
Did you mean to take out both checks?:
1. The info pointer check
2. the dev_is_pci check
I can understand the dev_is_pci check, but we should definitely take
action if info is NULL. Right?
> -
> if (!info->pri_enabled)
> return;
>
> Generally, intel_drain_pasid_prq() should be called if
>
> - a translation is removed from a pasid entry; and
This is the path that is already mentiond
> - PRI on this device is enabled.
And this path is:
-> intel_iommu_enable_iopf
-> context_flip_pri
-> intel_context_flush_present
-> qi_flush_pasid_cache
Right?
I'll put this in my next version if I see that there is a consensus in
the current discussion.
thx again for the feedback.
Best
--
Joel Granados
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-30 14:28 ` Joel Granados
@ 2024-10-31 3:42 ` Baolu Lu
2024-10-31 9:57 ` Baolu Lu
1 sibling, 0 replies; 21+ messages in thread
From: Baolu Lu @ 2024-10-31 3:42 UTC (permalink / raw)
To: Joel Granados
Cc: Yi Liu, David Woodhouse, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe, Kevin Tian, Klaus Jensen, linux-kernel, iommu,
Klaus Jensen
On 10/30/24 22:28, Joel Granados wrote:
>> /*
>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>> index 078d1e32a24e..ff88f31053d1 100644
>> --- a/drivers/iommu/intel/svm.c
>> +++ b/drivers/iommu/intel/svm.c
>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
>> pasid)
>> int qdep;
>>
>> info = dev_iommu_priv_get(dev);
>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>> - return;
> Did you mean to take out both checks?:
> 1. The info pointer check
> 2. the dev_is_pci check
>
> I can understand the dev_is_pci check, but we should definitely take
> action if info is NULL. Right?
WARN_ON(!info) is duplicate as far as I can see. Accessing
info->pri_enable when info is NULL will cause a null pointer dereference
warning. This appears irrelevant to this patch though.
>
>> -
>> if (!info->pri_enabled)
>> return;
>>
>> Generally, intel_drain_pasid_prq() should be called if
>>
>> - a translation is removed from a pasid entry; and
> This is the path that is already mentiond
>
>> - PRI on this device is enabled.
> And this path is:
> -> intel_iommu_enable_iopf
> -> context_flip_pri
> -> intel_context_flush_present
> -> qi_flush_pasid_cache
>
> Right?
Sorry that I didn't make it clear. It should be "PRI on this device was
enabled", a.k.a. info->pri_enabled is true. I didn't meant to say in the
PRI enabling path.
--
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-30 14:28 ` Joel Granados
2024-10-31 3:42 ` Baolu Lu
@ 2024-10-31 9:57 ` Baolu Lu
2024-10-31 11:18 ` Joel Granados
1 sibling, 1 reply; 21+ messages in thread
From: Baolu Lu @ 2024-10-31 9:57 UTC (permalink / raw)
To: Joel Granados
Cc: baolu.lu, Yi Liu, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen,
linux-kernel, iommu, Klaus Jensen
On 2024/10/30 22:28, Joel Granados wrote:
> On Tue, Oct 29, 2024 at 11:12:49AM +0800, Baolu Lu wrote:
>> On 2024/10/28 18:24, Joel Granados wrote:
>>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>>>
>>>>> PASID is not strictly needed when handling a PRQ event; remove the check
>>>>> for the pasid present bit in the request. This change was not included
>>>>> in the creation of prq.c to emphasize the change in capability checks
>>>>> when handing PRQ events.
>>>>>
>>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>>> entry is destroyed, it might need to add helper similar to the
>>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>>> These types of user space PRIs (non-pasid, non-svm) are created by
>>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>>
>>> iommufd_hwpt_replace_device
>>> -> iommufd_fault_domain_repalce_dev
>>> -> __fault_domain_replace_dev
>>> -> iommu_replace_group_handle
>> -> __iommu_group_set_domain
>> -> intel_iommu_attach_device
>> -> device_block_translation
>> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>>
>> Here a domain is removed from the pasid entry, hence we need to flush
>> all page requests that are pending in the IOMMU page request queue or
>> the PCI fabric.
> This make a lot of sense: To use iommufd_hwpt_replace_device to replace
> the existing hwpt with a iopf enabled one, the soon to be irrelevant
> page requests from the existing hwpt need to be flushed. And we were not
> doing that here.
>
>>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
>>>
>>> It is my understanding that this will provide the needed relation
>>> between the device and the prq in such a way that when remove_dev_pasid
>>> is called, intel_iommu_drain_pasid_prq will be called with the
>>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
>>> mistaken.
>> Removing a domain from a RID and a PASID are different paths.
>> Previously, this IOMMU driver only supported page requests on PASID
>> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
>> the domain-removing RID path.
>>
>> With the changes made in this series, the driver now supports page
>> requests for RID. It should also flush the PRQ when removing a domain
>> from a PASID entry for IOMMU_NO_PASID.
> Thank you for your explanation. Clarifies where I lacked understanding.
>
>>> Does this answer your question? Do you have a specific path that you are
>>> looking at where a specific non-pasid drain is needed?
>> Perhaps we can simply add below change.
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index e860bc9439a2..a24a42649621 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
>> device *dev, ioasid_t pasid,
>> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
>> kfree(dev_pasid);
>> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>> - intel_drain_pasid_prq(dev, pasid);
>> }
>>
>> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>> index 2e5fa0a23299..8639f3eb4264 100644
>> --- a/drivers/iommu/intel/pasid.c
>> +++ b/drivers/iommu/intel/pasid.c
>> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
>> *iommu, struct device *dev,
>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>> DMA_TLB_DSI_FLUSH);
>>
>> devtlb_invalidation_with_pasid(iommu, dev, pasid);
>> + intel_drain_pasid_prq(dev, pasid);
>> }
> This make sense logically as the intel_drain_pasid_prq keeps being
> called at the end of intel_iommu_remove_dev_pasid, but it is now also
> included in the intel_pasid_tear_down_entry call which adds it to the
> case discussed.
>
>> /*
>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>> index 078d1e32a24e..ff88f31053d1 100644
>> --- a/drivers/iommu/intel/svm.c
>> +++ b/drivers/iommu/intel/svm.c
>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
>> pasid)
>> int qdep;
>>
>> info = dev_iommu_priv_get(dev);
>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>> - return;
> Did you mean to take out both checks?:
> 1. The info pointer check
> 2. the dev_is_pci check
>
> I can understand the dev_is_pci check, but we should definitely take
> action if info is NULL. Right?
>
>> -
>> if (!info->pri_enabled)
>> return;
>>
>> Generally, intel_drain_pasid_prq() should be called if
>>
>> - a translation is removed from a pasid entry; and
> This is the path that is already mentiond
>
>> - PRI on this device is enabled.
> And this path is:
> -> intel_iommu_enable_iopf
> -> context_flip_pri
> -> intel_context_flush_present
> -> qi_flush_pasid_cache
>
> Right?
>
> I'll put this in my next version if I see that there is a consensus in
> the current discussion.
I post a patch to address what we are discussing here, so that you don't
need to send a new version.
https://lore.kernel.org/linux-iommu/20241031095139.44220-1-baolu.lu@linux.intel.com/
--
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-31 9:57 ` Baolu Lu
@ 2024-10-31 11:18 ` Joel Granados
2024-10-31 11:46 ` Baolu Lu
0 siblings, 1 reply; 21+ messages in thread
From: Joel Granados @ 2024-10-31 11:18 UTC (permalink / raw)
To: Baolu Lu
Cc: Yi Liu, David Woodhouse, Joerg Roedel, Will Deacon, Robin Murphy,
Jason Gunthorpe, Kevin Tian, Klaus Jensen, linux-kernel, iommu,
Klaus Jensen
On Thu, Oct 31, 2024 at 05:57:01PM +0800, Baolu Lu wrote:
> On 2024/10/30 22:28, Joel Granados wrote:
> > On Tue, Oct 29, 2024 at 11:12:49AM +0800, Baolu Lu wrote:
> >> On 2024/10/28 18:24, Joel Granados wrote:
> >>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
> >>>> On 2024/10/16 05:08, Joel Granados wrote:
> >>>>> From: Klaus Jensen<k.jensen@samsung.com>
> >>>>>
> >>>>> PASID is not strictly needed when handling a PRQ event; remove the check
> >>>>> for the pasid present bit in the request. This change was not included
> >>>>> in the creation of prq.c to emphasize the change in capability checks
> >>>>> when handing PRQ events.
> >>>>>
> >>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
> >>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
> >>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
> >>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
> >>>> entry is destroyed, it might need to add helper similar to the
> >>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
> >>> These types of user space PRIs (non-pasid, non-svm) are created by
> >>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
> >>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
> >>>
> >>> iommufd_hwpt_replace_device
> >>> -> iommufd_fault_domain_repalce_dev
> >>> -> __fault_domain_replace_dev
> >>> -> iommu_replace_group_handle
> >> -> __iommu_group_set_domain
> >> -> intel_iommu_attach_device
> >> -> device_block_translation
> >> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
> >>
> >> Here a domain is removed from the pasid entry, hence we need to flush
> >> all page requests that are pending in the IOMMU page request queue or
> >> the PCI fabric.
> > This make a lot of sense: To use iommufd_hwpt_replace_device to replace
> > the existing hwpt with a iopf enabled one, the soon to be irrelevant
> > page requests from the existing hwpt need to be flushed. And we were not
> > doing that here.
> >
> >>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
> >>>
> >>> It is my understanding that this will provide the needed relation
> >>> between the device and the prq in such a way that when remove_dev_pasid
> >>> is called, intel_iommu_drain_pasid_prq will be called with the
> >>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
> >>> mistaken.
> >> Removing a domain from a RID and a PASID are different paths.
> >> Previously, this IOMMU driver only supported page requests on PASID
> >> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
> >> the domain-removing RID path.
> >>
> >> With the changes made in this series, the driver now supports page
> >> requests for RID. It should also flush the PRQ when removing a domain
> >> from a PASID entry for IOMMU_NO_PASID.
> > Thank you for your explanation. Clarifies where I lacked understanding.
> >
> >>> Does this answer your question? Do you have a specific path that you are
> >>> looking at where a specific non-pasid drain is needed?
> >> Perhaps we can simply add below change.
> >>
> >> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> >> index e860bc9439a2..a24a42649621 100644
> >> --- a/drivers/iommu/intel/iommu.c
> >> +++ b/drivers/iommu/intel/iommu.c
> >> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
> >> device *dev, ioasid_t pasid,
> >> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
> >> kfree(dev_pasid);
> >> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> >> - intel_drain_pasid_prq(dev, pasid);
> >> }
> >>
> >> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> >> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
> >> index 2e5fa0a23299..8639f3eb4264 100644
> >> --- a/drivers/iommu/intel/pasid.c
> >> +++ b/drivers/iommu/intel/pasid.c
> >> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
> >> *iommu, struct device *dev,
> >> iommu->flush.flush_iotlb(iommu, did, 0, 0,
> >> DMA_TLB_DSI_FLUSH);
> >>
> >> devtlb_invalidation_with_pasid(iommu, dev, pasid);
> >> + intel_drain_pasid_prq(dev, pasid);
> >> }
> > This make sense logically as the intel_drain_pasid_prq keeps being
> > called at the end of intel_iommu_remove_dev_pasid, but it is now also
> > included in the intel_pasid_tear_down_entry call which adds it to the
> > case discussed.
> >
> >> /*
> >> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
> >> index 078d1e32a24e..ff88f31053d1 100644
> >> --- a/drivers/iommu/intel/svm.c
> >> +++ b/drivers/iommu/intel/svm.c
> >> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
> >> pasid)
> >> int qdep;
> >>
> >> info = dev_iommu_priv_get(dev);
> >> - if (WARN_ON(!info || !dev_is_pci(dev)))
> >> - return;
> > Did you mean to take out both checks?:
> > 1. The info pointer check
> > 2. the dev_is_pci check
> >
> > I can understand the dev_is_pci check, but we should definitely take
> > action if info is NULL. Right?
> >
> >> -
> >> if (!info->pri_enabled)
> >> return;
> >>
> >> Generally, intel_drain_pasid_prq() should be called if
> >>
> >> - a translation is removed from a pasid entry; and
> > This is the path that is already mentiond
> >
> >> - PRI on this device is enabled.
> > And this path is:
> > -> intel_iommu_enable_iopf
> > -> context_flip_pri
> > -> intel_context_flush_present
> > -> qi_flush_pasid_cache
> >
> > Right?
> >
> > I'll put this in my next version if I see that there is a consensus in
> > the current discussion.
>
> I post a patch to address what we are discussing here, so that you don't
> need to send a new version.
>
> https://lore.kernel.org/linux-iommu/20241031095139.44220-1-baolu.lu@linux.intel.com/
Thx for that :). A few comments:
1. I see that you have correctly changed the intel/prq.c file. This
means that that patch depends on this series. Would it be easier (for
upstreaming) to just put them together? I can take your patch into
the series leaving you as the author. Tell me what you think.
2. I see the mail in the list and I see that I'm cced, but I have not
received it in my mail box yet. I'll wait for it to arrive to see if
my comments still apply to that one
Best
>
> --
> baolu
--
Joel Granados
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread
2024-10-31 11:18 ` Joel Granados
@ 2024-10-31 11:46 ` Baolu Lu
0 siblings, 0 replies; 21+ messages in thread
From: Baolu Lu @ 2024-10-31 11:46 UTC (permalink / raw)
To: Joel Granados
Cc: baolu.lu, Yi Liu, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen,
linux-kernel, iommu, Klaus Jensen
On 2024/10/31 19:18, Joel Granados wrote:
> On Thu, Oct 31, 2024 at 05:57:01PM +0800, Baolu Lu wrote:
>> On 2024/10/30 22:28, Joel Granados wrote:
>>> On Tue, Oct 29, 2024 at 11:12:49AM +0800, Baolu Lu wrote:
>>>> On 2024/10/28 18:24, Joel Granados wrote:
>>>>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>>>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>>>>> From: Klaus Jensen<k.jensen@samsung.com>
>>>>>>>
>>>>>>> PASID is not strictly needed when handling a PRQ event; remove the check
>>>>>>> for the pasid present bit in the request. This change was not included
>>>>>>> in the creation of prq.c to emphasize the change in capability checks
>>>>>>> when handing PRQ events.
>>>>>>>
>>>>>>> Signed-off-by: Klaus Jensen<k.jensen@samsung.com>
>>>>>>> Reviewed-by: Kevin Tian<kevin.tian@intel.com>
>>>>>>> Signed-off-by: Joel Granados<joel.granados@kernel.org>
>>>>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>>>>> entry is destroyed, it might need to add helper similar to the
>>>>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>>>>> These types of user space PRIs (non-pasid, non-svm) are created by
>>>>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>>>>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>>>>
>>>>> iommufd_hwpt_replace_device
>>>>> -> iommufd_fault_domain_repalce_dev
>>>>> -> __fault_domain_replace_dev
>>>>> -> iommu_replace_group_handle
>>>> -> __iommu_group_set_domain
>>>> -> intel_iommu_attach_device
>>>> -> device_block_translation
>>>> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>>>>
>>>> Here a domain is removed from the pasid entry, hence we need to flush
>>>> all page requests that are pending in the IOMMU page request queue or
>>>> the PCI fabric.
>>> This make a lot of sense: To use iommufd_hwpt_replace_device to replace
>>> the existing hwpt with a iopf enabled one, the soon to be irrelevant
>>> page requests from the existing hwpt need to be flushed. And we were not
>>> doing that here.
>>>
>>>>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID, GFP_KERNEL);
>>>>>
>>>>> It is my understanding that this will provide the needed relation
>>>>> between the device and the prq in such a way that when remove_dev_pasid
>>>>> is called, intel_iommu_drain_pasid_prq will be called with the
>>>>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if I'm
>>>>> mistaken.
>>>> Removing a domain from a RID and a PASID are different paths.
>>>> Previously, this IOMMU driver only supported page requests on PASID
>>>> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the PRQ in
>>>> the domain-removing RID path.
>>>>
>>>> With the changes made in this series, the driver now supports page
>>>> requests for RID. It should also flush the PRQ when removing a domain
>>>> from a PASID entry for IOMMU_NO_PASID.
>>> Thank you for your explanation. Clarifies where I lacked understanding.
>>>
>>>>> Does this answer your question? Do you have a specific path that you are
>>>>> looking at where a specific non-pasid drain is needed?
>>>> Perhaps we can simply add below change.
>>>>
>>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>>> index e860bc9439a2..a24a42649621 100644
>>>> --- a/drivers/iommu/intel/iommu.c
>>>> +++ b/drivers/iommu/intel/iommu.c
>>>> @@ -4283,7 +4283,6 @@ static void intel_iommu_remove_dev_pasid(struct
>>>> device *dev, ioasid_t pasid,
>>>> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
>>>> kfree(dev_pasid);
>>>> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>>>> - intel_drain_pasid_prq(dev, pasid);
>>>> }
>>>>
>>>> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>>>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>>>> index 2e5fa0a23299..8639f3eb4264 100644
>>>> --- a/drivers/iommu/intel/pasid.c
>>>> +++ b/drivers/iommu/intel/pasid.c
>>>> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct intel_iommu
>>>> *iommu, struct device *dev,
>>>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>>>> DMA_TLB_DSI_FLUSH);
>>>>
>>>> devtlb_invalidation_with_pasid(iommu, dev, pasid);
>>>> + intel_drain_pasid_prq(dev, pasid);
>>>> }
>>> This make sense logically as the intel_drain_pasid_prq keeps being
>>> called at the end of intel_iommu_remove_dev_pasid, but it is now also
>>> included in the intel_pasid_tear_down_entry call which adds it to the
>>> case discussed.
>>>
>>>> /*
>>>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>>>> index 078d1e32a24e..ff88f31053d1 100644
>>>> --- a/drivers/iommu/intel/svm.c
>>>> +++ b/drivers/iommu/intel/svm.c
>>>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev, u32
>>>> pasid)
>>>> int qdep;
>>>>
>>>> info = dev_iommu_priv_get(dev);
>>>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>>>> - return;
>>> Did you mean to take out both checks?:
>>> 1. The info pointer check
>>> 2. the dev_is_pci check
>>>
>>> I can understand the dev_is_pci check, but we should definitely take
>>> action if info is NULL. Right?
>>>
>>>> -
>>>> if (!info->pri_enabled)
>>>> return;
>>>>
>>>> Generally, intel_drain_pasid_prq() should be called if
>>>>
>>>> - a translation is removed from a pasid entry; and
>>> This is the path that is already mentiond
>>>
>>>> - PRI on this device is enabled.
>>> And this path is:
>>> -> intel_iommu_enable_iopf
>>> -> context_flip_pri
>>> -> intel_context_flush_present
>>> -> qi_flush_pasid_cache
>>>
>>> Right?
>>>
>>> I'll put this in my next version if I see that there is a consensus in
>>> the current discussion.
>> I post a patch to address what we are discussing here, so that you don't
>> need to send a new version.
>>
>> https://lore.kernel.org/linux-iommu/20241031095139.44220-1-
>> baolu.lu@linux.intel.com/
> Thx for that 🙂. A few comments:
>
> 1. I see that you have correctly changed the intel/prq.c file. This
> means that that patch depends on this series. Would it be easier (for
> upstreaming) to just put them together? I can take your patch into
> the series leaving you as the author. Tell me what you think.
>
> 2. I see the mail in the list and I see that I'm cced, but I have not
> received it in my mail box yet. I'll wait for it to arrive to see if
> my comments still apply to that one
No need to put them together and resend a new version if there's no
further comment. Then, I'll queue them together for v6.13 through the
iommu tree.
--
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
` (4 preceding siblings ...)
2024-10-15 21:08 ` [PATCH v4 5/5] iommu/vt-d: drop pasid requirement for prq initialization Joel Granados
@ 2024-11-02 3:27 ` Baolu Lu
5 siblings, 0 replies; 21+ messages in thread
From: Baolu Lu @ 2024-11-02 3:27 UTC (permalink / raw)
To: Joel Granados, David Woodhouse, Joerg Roedel, Will Deacon,
Robin Murphy, Jason Gunthorpe, Kevin Tian, Klaus Jensen
Cc: baolu.lu, linux-kernel, iommu, Klaus Jensen
On 2024/10/16 5:08, Joel Granados wrote:
> This series makes use of iommufd_hwpt_replace_device to execute
> non-pasid/non-svm user space IOPFs. Our main motivation is to expand or
> facilitate user-space driver driven device verification by enabling IOPF
> without SVM/PASID.
>
> What?
> * Enable IO page fault handling in user space for a non-pasid, non-svm
> and non-virtualised use case.
> * Move IOMMU_IOPF configuration from INTEL_IOMMU_SVM into INTEL_IOMMU.
> * Move all page request queue related logic to a new (prq.c) file.
> * Remove PASID checks from PRQ event handling as well as PRQ
> initialization.
> * Allow execution of IOMMU_HWPT_ALLOC with a valid fault id
> (IOMMU_HWPT_FAULT_ID_VALID)
>
> Why?
> The PCI ATS Extended Capability allows peripheral devices to
> participate in the caching of translations when operating under an
> IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
> devices to handle missing mappings. Currently, PRI is mainly used in
> the context of Shared Virtual Addressing, requiring support for the
> Process Address Space Identifier (PASID) capability which is not
> strictly necessary. Relaxing this requirement adds to the
> possibilities available for user-space driver driver device
> verification as well as for avoiding pinning.
>
> Testing?
> The non-svm IOPF interface is exercised by first initializing an IOPF
> enabled IOAS and then reading the fault file descriptor. Pseudocode on
> the IOPF initializing and handling is in [3] and [4] (using libvfn).
>
> Supplementary repositories supporting this patchset:
> 1. A user space library libvfn [1] which is used for testing and
> verification (see examples/iopf.c), and
> 2. Basic emulation of PCIe ATS/PRI and Intel VT-d PRQ in QEMU [2].
>
> Changes in v4:
> - Include the "trailers" from Kevin that I missed on V3
> - Link to v3:https://lore.kernel.org/r/20241009-jag-iopfv8-v3-0-bd4271df5b2b@kernel.org
>
> Changes in v3:
> - Adjust wording in cover letter
> - Include "_iommu_" in the prq Intel function names to be more in line
> with functions in iommu.h file
> - Rebase on top of 6.12-rc2
> - Update my ID in e-mail, git author and my Signed-off-by.
> - Link to v2:https://lore.kernel.org/r/20240913-jag-iopfv8-v2-0-dea01c2343bc@samsung.com
>
> Changes in v2:
> - Remove "nesting" from wording. This wording is left over from initial
> versions that are now irrelevant.
> - Dropped "iommu: init pasid array while doing domain_replace and iopf
> is active" as the initialization of the pasid_array x-array happens
> automatically when an iopf capable domain is replaced on a device.
> - Corrected commit message in "iommu/vt-d: Separate page request queue
> from SVM"
> - Link to v1:https://lore.kernel.org/r/20240904-jag-iopfv8-v1-0-e3549920adf3@samsung.com
>
> V1:
> - This is the first version of the series after initial feedback from
> the RFC [5].
>
> Comments and feedback are greatly appreciated
> Best
>
> Joel
>
> [1]https://github.com/SamsungDS/libvfn/tree/iommufd-fault-queue
> [2]https://gitlab.com/birkelund/qemu/-/tree/pcie-ats-pri
>
> [3] Initializing
> ```
> int iopf_init(struct iommu_ioas *ioas, const char *bdf)
> {
> // open vfio device from bdf
> int devfd = open('/dev/vfio/devices/VFIO_DEV', O_RDWR);
>
> struct vfio_device_bind_iommufd bind = {
> .argsz = sizeof(bind),
> .flags = 0,
> .iommufd = __iommufd,
> };
> ioctl(devfd, VFIO_DEVICE_BIND_IOMMUFD, &bind);
>
> struct iommu_ioas *ioas = ioas;
> struct vfio_device_attach_iommufd_pt attach_data = {
> .argsz = sizeof(attach_data),
> .flags = 0,
> .pt_id = ioas->id,
> };
> ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);
>
> struct iommu_fault_alloc fault = {
> .size = sizeof(fault),
> .flags = 0,
> };
> ioctl(__iommufd, IOMMU_FAULT_QUEUE_ALLOC, &fault);
>
> struct iommu_hwpt_alloc fault_cmd = {
> .size = sizeof(fault_cmd),
> .flags = IOMMU_HWPT_FAULT_ID_VALID,
> .dev_id = bind.out_devid,
> .pt_id = ioas->id,
> .data_len = 0,
> .data_uptr = (uint64_t)NULL,
> .fault_id = fault.out_fault_id,
> .__reserved = 0,
> };
> ioctl(__iommufd, IOMMU_HWPT_ALLOC, &fault_cmd);
>
> // This is a re-attach
> struct vfio_device_attach_iommufd_pt attach = {
> .argsz = sizeof(attach),
> .flags = 0,
> .pt_id = fault_cmd.out_hwpt_id
> };
> ioctl(dev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach);
> }
> ```
>
> [4] Handling
> ```
> int handle_iopf(void *vaddr, int len, uint64_t iova) {
> exec_command(CMD)
>
> int iopf_fd = fault_cmd.fault_id;
>
> struct iommu_hwpt_pgfault pgfault = {0};
> if(read(iopf_fd, &pgfault, sizeof(pgfault)) == 0);
> return; // no page fault
>
> ret = iommu_map_vaddr(__iommmufd, vaddr, len, &iova)
> struct iommu_hwpt_page_response pgfault_response = {
> .cookie = pgfault.cookie,
> .code = ret ? IOMMUFD_PAGE_RESP_SUCCESS : IOMMUFD_PAGE_RESP_INVALID,
> };
>
> write(iopf_fd, &pgfault_response, sizeof(pgfault_response));
>
> return;
> }
> ```
>
> [5]https://lore.kernel.org/20240826-iopf-for-all-v1-0-59174e6a7528@samsung.com
>
> Signed-off-by: Joel Granados<j.granados@samsung.com>
> ---
> Joel Granados (3):
> iommu/vt-d: Separate page request queue from SVM
> iommu: kconfig: Move IOMMU_IOPF into INTEL_IOMMU
> iommufd: Enable PRI when doing the iommufd_hwpt_alloc
>
> Klaus Jensen (2):
> iommu/vt-d: Remove the pasid present check in prq_event_thread
> iommu/vt-d: drop pasid requirement for prq initialization
>
> drivers/iommu/intel/Kconfig | 2 +-
> drivers/iommu/intel/Makefile | 2 +-
> drivers/iommu/intel/iommu.c | 31 ++-
> drivers/iommu/intel/iommu.h | 14 +-
> drivers/iommu/intel/prq.c | 406 +++++++++++++++++++++++++++++++++++
> drivers/iommu/intel/svm.c | 397 ----------------------------------
> drivers/iommu/iommufd/hw_pagetable.c | 3 +-
> 7 files changed, 428 insertions(+), 427 deletions(-)
With the issue discussed for patch 2/5 addressed by this patch:
Link:
https://lore.kernel.org/r/20241101045543.70086-1-baolu.lu@linux.intel.com
Queued this series for v6.13. Thank you, Joel.
--
baolu
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2024-11-02 3:27 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-15 21:08 [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Joel Granados
2024-10-15 21:08 ` [PATCH v4 1/5] iommu/vt-d: Separate page request queue from SVM Joel Granados
2024-10-15 21:08 ` [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in prq_event_thread Joel Granados
2024-10-28 7:50 ` Yi Liu
2024-10-28 8:23 ` Baolu Lu
2024-10-29 5:11 ` Yi Liu
2024-10-28 10:24 ` Joel Granados
2024-10-29 3:12 ` Baolu Lu
2024-10-29 5:13 ` Yi Liu
2024-10-29 5:39 ` Baolu Lu
2024-10-30 5:51 ` Yi Liu
2024-10-30 8:32 ` Baolu Lu
2024-10-30 14:28 ` Joel Granados
2024-10-31 3:42 ` Baolu Lu
2024-10-31 9:57 ` Baolu Lu
2024-10-31 11:18 ` Joel Granados
2024-10-31 11:46 ` Baolu Lu
2024-10-15 21:08 ` [PATCH v4 3/5] iommu: kconfig: Move IOMMU_IOPF into INTEL_IOMMU Joel Granados
2024-10-15 21:08 ` [PATCH v4 4/5] iommufd: Enable PRI when doing the iommufd_hwpt_alloc Joel Granados
2024-10-15 21:08 ` [PATCH v4 5/5] iommu/vt-d: drop pasid requirement for prq initialization Joel Granados
2024-11-02 3:27 ` [PATCH v4 0/5] iommu: Enable user space IOPFs in non-PASID and non-svm cases Baolu Lu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox