All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary
@ 2026-05-18  2:28 Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 1/4] iommufd: Move vevent memory allocation outside spinlock Nicolin Chen
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-05-18  2:28 UTC (permalink / raw)
  To: jgg, kevin.tian; +Cc: iommu, linux-kernel, linux-kselftest

The upper bound of veventq_depth has been missing for veventq allocation,
leaving a vulnerability where userspace could exhaust atomic memory pool.

Fix it properly:
 - Allocate outside the spinlock to avoid GFP_ATOMIC
 - Cap the veventq_depth upper bound
 - Fix event_data byte-count
 - Add selftest coverage

Note that QEMU's SMMU has been already allocating veventq using a "HW"
EVTQ entry number. So, picking 19 as the known use case, for a minimal
level of ABI consistency.

This is on github:
https://github.com/nicolinc/iommufd/commits/fix_veventq_depth-v1

Nicolin Chen (4):
  iommufd: Move vevent memory allocation outside spinlock
  iommufd: Set veventq_depth upper bound
  iommufd: Fix data_len byte-count vs element-count mismatch
  iommufd/selftest: Add boundary tests for veventq_depth

 drivers/iommu/iommufd/iommufd_private.h       |  2 +-
 tools/testing/selftests/iommu/iommufd_utils.h | 17 +++++++++--------
 drivers/iommu/iommufd/driver.c                | 19 ++++++++++++++++---
 drivers/iommu/iommufd/eventq.c                |  5 ++++-
 tools/testing/selftests/iommu/iommufd.c       | 19 +++++++++++++++++--
 .../selftests/iommu/iommufd_fail_nth.c        |  2 +-
 6 files changed, 48 insertions(+), 16 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH rc v1 1/4] iommufd: Move vevent memory allocation outside spinlock
  2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
@ 2026-05-18  2:28 ` Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 2/4] iommufd: Set veventq_depth upper bound Nicolin Chen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-05-18  2:28 UTC (permalink / raw)
  To: jgg, kevin.tian; +Cc: iommu, linux-kernel, linux-kselftest

The veventq memory allocation happens inside the spinlock. Given its depth
is decided by the user space, this leaves a vulnerability, where userspace
can allocate large queues to exhaust atomic memory reserves.

Move the allocation outside the spinlock and use GFP_NOWAIT, which can fail
fast under memory pressure without dipping into the GFP_ATOMIC reserves or
direct-reclaiming from the threaded IRQ handler. Treat the failure the same
as a queue-overflow and return -ENOMEM to notify the caller.

A subsequent change will cap the upper bound of the veventq_depth.

Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/driver.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c
index 61e6b02601d1a..b0dec089aeee1 100644
--- a/drivers/iommu/iommufd/driver.c
+++ b/drivers/iommu/iommufd/driver.c
@@ -149,15 +149,28 @@ int iommufd_viommu_report_event(struct iommufd_viommu *viommu,
 		goto out_unlock_veventqs;
 	}
 
-	spin_lock(&veventq->common.lock);
-	if (veventq->num_events == veventq->depth) {
+	/*
+	 * Optimistic skip when clearly full. A concurrent reader may free a slot
+	 * before the spinlock; the cost is recording one extra event as lost.
+	 */
+	if (READ_ONCE(veventq->num_events) >= veventq->depth) {
+		spin_lock(&veventq->common.lock);
 		vevent = &veventq->lost_events_header;
 		goto out_set_header;
 	}
 
-	vevent = kzalloc_flex(*vevent, event_data, data_len, GFP_ATOMIC);
+	/* Pre-allocate to avoid GFP_ATOMIC; use GFP_NOWAIT to avoid sleeping */
+	vevent = kzalloc_flex(*vevent, event_data, data_len, GFP_NOWAIT);
 	if (!vevent) {
+		spin_lock(&veventq->common.lock);
+		vevent = &veventq->lost_events_header;
 		rc = -ENOMEM;
+		goto out_set_header;
+	}
+
+	spin_lock(&veventq->common.lock);
+	if (veventq->num_events == veventq->depth) {
+		kfree(vevent);
 		vevent = &veventq->lost_events_header;
 		goto out_set_header;
 	}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH rc v1 2/4] iommufd: Set veventq_depth upper bound
  2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 1/4] iommufd: Move vevent memory allocation outside spinlock Nicolin Chen
@ 2026-05-18  2:28 ` Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 3/4] iommufd: Fix data_len byte-count vs element-count mismatch Nicolin Chen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-05-18  2:28 UTC (permalink / raw)
  To: jgg, kevin.tian; +Cc: iommu, linux-kernel, linux-kselftest

iommufd_veventq_alloc() accepts any !0 veventq_depth from userspace, with
an upper bound at U32_MAX.

This leaves a vulnerability where userspace can allocate excessively large
queues to exhaust kernel memory reserves.

Cap the veventq_depth (maximum number of entries) to 1 << 19, matching the
maximum number of entries in the SMMUv3 EVTQ (the largest use case today).

Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/eventq.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/eventq.c b/drivers/iommu/iommufd/eventq.c
index 78689fb52d24c..1f1e415285b1a 100644
--- a/drivers/iommu/iommufd/eventq.c
+++ b/drivers/iommu/iommufd/eventq.c
@@ -473,6 +473,9 @@ int iommufd_fault_iopf_handler(struct iopf_group *group)
 static const struct file_operations iommufd_veventq_fops =
 	INIT_EVENTQ_FOPS(iommufd_veventq_fops_read, NULL);
 
+/* An arbitrary upper bound for veventq_depth that fits all existing HWs */
+#define VEVENTQ_MAX_DEPTH (1U << 19)
+
 int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd)
 {
 	struct iommu_veventq_alloc *cmd = ucmd->cmd;
@@ -484,7 +487,7 @@ int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd)
 	if (cmd->flags || cmd->__reserved ||
 	    cmd->type == IOMMU_VEVENTQ_TYPE_DEFAULT)
 		return -EOPNOTSUPP;
-	if (!cmd->veventq_depth)
+	if (!cmd->veventq_depth || cmd->veventq_depth > VEVENTQ_MAX_DEPTH)
 		return -EINVAL;
 
 	viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH rc v1 3/4] iommufd: Fix data_len byte-count vs element-count mismatch
  2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 1/4] iommufd: Move vevent memory allocation outside spinlock Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 2/4] iommufd: Set veventq_depth upper bound Nicolin Chen
@ 2026-05-18  2:28 ` Nicolin Chen
  2026-05-18  2:28 ` [PATCH rc v1 4/4] iommufd/selftest: Add boundary tests for veventq_depth Nicolin Chen
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-05-18  2:28 UTC (permalink / raw)
  To: jgg, kevin.tian; +Cc: iommu, linux-kernel, linux-kselftest

kzalloc_flex() computes the allocation size. With event_data typed as u64,
data_len is interpreted as a u64 element count. Yet, every caller and the
read path treat data_len as a byte count. The current code over-allocates
by sizeof(u64) and the __counted_by() annotation overstates the length by
the same factor.

Re-type event_data as u8. No functional change in user-visible behavior.

Fixes: e36ba5ab808e ("iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC")
Cc: stable@vger.kernel.org
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/iommufd/iommufd_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 6ac1965199e9a..43fbc5bed8de3 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -602,7 +602,7 @@ struct iommufd_vevent {
 	struct iommufd_vevent_header header;
 	struct list_head node; /* for iommufd_eventq::deliver */
 	ssize_t data_len;
-	u64 event_data[] __counted_by(data_len);
+	u8 event_data[] __counted_by(data_len);
 };
 
 #define vevent_for_lost_events_header(vevent) \
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH rc v1 4/4] iommufd/selftest: Add boundary tests for veventq_depth
  2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
                   ` (2 preceding siblings ...)
  2026-05-18  2:28 ` [PATCH rc v1 3/4] iommufd: Fix data_len byte-count vs element-count mismatch Nicolin Chen
@ 2026-05-18  2:28 ` Nicolin Chen
  2026-05-18 18:11 ` [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Jason Gunthorpe
  2026-05-21 14:30 ` Jason Gunthorpe
  5 siblings, 0 replies; 9+ messages in thread
From: Nicolin Chen @ 2026-05-18  2:28 UTC (permalink / raw)
  To: jgg, kevin.tian; +Cc: iommu, linux-kernel, linux-kselftest

Test veventq_depth to cover a memory exhaustion vulnerability.

Keep veventq_depth=2 for the existing callers.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 tools/testing/selftests/iommu/iommufd_utils.h | 17 +++++++++--------
 tools/testing/selftests/iommu/iommufd.c       | 19 +++++++++++++++++--
 .../selftests/iommu/iommufd_fail_nth.c        |  2 +-
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 5502751d500c8..b4928cbd4d9c8 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -1060,12 +1060,13 @@ static int _test_cmd_hw_queue_alloc(int fd, __u32 viommu_id, __u32 type,
 					      base_addr, len, out_qid))
 
 static int _test_cmd_veventq_alloc(int fd, __u32 viommu_id, __u32 type,
-				   __u32 *veventq_id, __u32 *veventq_fd)
+				   __u32 depth, __u32 *veventq_id,
+				   __u32 *veventq_fd)
 {
 	struct iommu_veventq_alloc cmd = {
 		.size = sizeof(cmd),
 		.type = type,
-		.veventq_depth = 2,
+		.veventq_depth = depth,
 		.viommu_id = viommu_id,
 	};
 	int ret;
@@ -1080,13 +1081,13 @@ static int _test_cmd_veventq_alloc(int fd, __u32 viommu_id, __u32 type,
 	return 0;
 }
 
-#define test_cmd_veventq_alloc(viommu_id, type, veventq_id, veventq_fd) \
-	ASSERT_EQ(0, _test_cmd_veventq_alloc(self->fd, viommu_id, type, \
+#define test_cmd_veventq_alloc(viommu_id, type, depth, veventq_id, veventq_fd) \
+	ASSERT_EQ(0, _test_cmd_veventq_alloc(self->fd, viommu_id, type, depth, \
 					     veventq_id, veventq_fd))
-#define test_err_veventq_alloc(_errno, viommu_id, type, veventq_id,     \
-			       veventq_fd)                              \
-	EXPECT_ERRNO(_errno,                                            \
-		     _test_cmd_veventq_alloc(self->fd, viommu_id, type, \
+#define test_err_veventq_alloc(_errno, viommu_id, type, depth, veventq_id,     \
+			       veventq_fd)                                     \
+	EXPECT_ERRNO(_errno,                                                   \
+		     _test_cmd_veventq_alloc(self->fd, viommu_id, type, depth, \
 					     veventq_id, veventq_fd))
 
 static int _test_cmd_trigger_vevents(int fd, __u32 dev_id, __u32 nvevents)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index d1fe5dbc2813e..2e8a27dab0bb8 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -2986,11 +2986,26 @@ TEST_F(iommufd_viommu, vdevice_alloc)
 		test_err_mock_domain_replace(ENOENT, self->stdev_id,
 					     self->nested_hwpt_id);
 
+		/* Test depth lower and upper bounds (mirrors kernel cap) */
+#define VEVENTQ_MAX_DEPTH (1U << 19)
+		test_err_veventq_alloc(EINVAL, viommu_id,
+				       IOMMU_VEVENTQ_TYPE_SELFTEST, 0, NULL,
+				       NULL);
+		test_err_veventq_alloc(EINVAL, viommu_id,
+				       IOMMU_VEVENTQ_TYPE_SELFTEST,
+				       VEVENTQ_MAX_DEPTH + 1, NULL, NULL);
+		test_cmd_veventq_alloc(viommu_id, IOMMU_VEVENTQ_TYPE_SELFTEST,
+				       VEVENTQ_MAX_DEPTH, &veventq_id,
+				       &veventq_fd);
+		close(veventq_fd);
+		test_ioctl_destroy(veventq_id);
+
 		/* Allocate a vEVENTQ with veventq_depth=2 */
 		test_cmd_veventq_alloc(viommu_id, IOMMU_VEVENTQ_TYPE_SELFTEST,
-				       &veventq_id, &veventq_fd);
+				       2, &veventq_id, &veventq_fd);
 		test_err_veventq_alloc(EEXIST, viommu_id,
-				       IOMMU_VEVENTQ_TYPE_SELFTEST, NULL, NULL);
+				       IOMMU_VEVENTQ_TYPE_SELFTEST, 2, NULL,
+				       NULL);
 		/* Set vdev_id to 0x99, unset it, and set to 0x88 */
 		test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
 		test_cmd_mock_domain_replace(self->stdev_id,
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index 45c14323a6183..25495d8dceb3d 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -712,7 +712,7 @@ TEST_FAIL_NTH(basic_fail_nth, device)
 		return -1;
 
 	if (_test_cmd_veventq_alloc(self->fd, viommu_id,
-				    IOMMU_VEVENTQ_TYPE_SELFTEST, &veventq_id,
+				    IOMMU_VEVENTQ_TYPE_SELFTEST, 2, &veventq_id,
 				    &veventq_fd))
 		return -1;
 	close(veventq_fd);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary
  2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
                   ` (3 preceding siblings ...)
  2026-05-18  2:28 ` [PATCH rc v1 4/4] iommufd/selftest: Add boundary tests for veventq_depth Nicolin Chen
@ 2026-05-18 18:11 ` Jason Gunthorpe
  2026-05-21 14:30 ` Jason Gunthorpe
  5 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2026-05-18 18:11 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: kevin.tian, iommu, linux-kernel, linux-kselftest

On Sun, May 17, 2026 at 07:28:45PM -0700, Nicolin Chen wrote:
> The upper bound of veventq_depth has been missing for veventq allocation,
> leaving a vulnerability where userspace could exhaust atomic memory pool.
> 
> Fix it properly:
>  - Allocate outside the spinlock to avoid GFP_ATOMIC
>  - Cap the veventq_depth upper bound
>  - Fix event_data byte-count
>  - Add selftest coverage
> 
> Note that QEMU's SMMU has been already allocating veventq using a "HW"
> EVTQ entry number. So, picking 19 as the known use case, for a minimal
> level of ABI consistency.
> 
> This is on github:
> https://github.com/nicolinc/iommufd/commits/fix_veventq_depth-v1
> 
> Nicolin Chen (4):
>   iommufd: Move vevent memory allocation outside spinlock
>   iommufd: Set veventq_depth upper bound
>   iommufd: Fix data_len byte-count vs element-count mismatch
>   iommufd/selftest: Add boundary tests for veventq_depth

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary
  2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
                   ` (4 preceding siblings ...)
  2026-05-18 18:11 ` [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Jason Gunthorpe
@ 2026-05-21 14:30 ` Jason Gunthorpe
  2026-05-21 18:01   ` Nicolin Chen
  5 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2026-05-21 14:30 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: kevin.tian, iommu, linux-kernel, linux-kselftest

On Sun, May 17, 2026 at 07:28:45PM -0700, Nicolin Chen wrote:
> The upper bound of veventq_depth has been missing for veventq allocation,
> leaving a vulnerability where userspace could exhaust atomic memory pool.
> 
> Fix it properly:
>  - Allocate outside the spinlock to avoid GFP_ATOMIC
>  - Cap the veventq_depth upper bound
>  - Fix event_data byte-count
>  - Add selftest coverage
> 
> Note that QEMU's SMMU has been already allocating veventq using a "HW"
> EVTQ entry number. So, picking 19 as the known use case, for a minimal
> level of ABI consistency.
> 
> This is on github:
> https://github.com/nicolinc/iommufd/commits/fix_veventq_depth-v1
> 
> Nicolin Chen (4):
>   iommufd: Move vevent memory allocation outside spinlock
>   iommufd: Set veventq_depth upper bound
>   iommufd: Fix data_len byte-count vs element-count mismatch
>   iommufd/selftest: Add boundary tests for veventq_depth

Please adjust for the sashiko remarks:

1) Put "iommufd: Fix data_len byte-count vs element-count mismatch"
   first
2) This "Returning -ENOMEM for allocation failures but 0 for queue overflows treats
   the conditions differently, which seems to contradict the stated
   intent." Seems bogus, I think adjust the commit message. We do want
   0 for queue full conditions.
3) Let's fix the "Will this lockless read concurrent with a plain write cause a
   data race?" by removing the optimization, just pre-allocate and
   fail. We don't expect this to be a normal condition worth
   optimizing
4) I'm OK with ENOMEM here, leave it, EAGAIN should mean it is
   pollable and it won't become pollable..
5) The sizeof(hdr) has been fixed in my rc branch. You can rebase on
   top of that and also ensure to send a base-commit trailer to help
   Sashiko apply the patches properly
6) What do you think about the "but done has
   already been incremented by sizeof(*hdr)" ? unrelated issue? If it
   is simple lets add a patch here to fix it

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary
  2026-05-21 14:30 ` Jason Gunthorpe
@ 2026-05-21 18:01   ` Nicolin Chen
  2026-05-21 23:27     ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Nicolin Chen @ 2026-05-21 18:01 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: kevin.tian, iommu, linux-kernel, linux-kselftest

On Thu, May 21, 2026 at 11:30:04AM -0300, Jason Gunthorpe wrote:
> 1) Put "iommufd: Fix data_len byte-count vs element-count mismatch"
>    first

OK.

> 2) This "Returning -ENOMEM for allocation failures but 0 for queue overflows treats
>    the conditions differently, which seems to contradict the stated
>    intent." Seems bogus, I think adjust the commit message. We do want
>    0 for queue full conditions.

Ack.

> 3) Let's fix the "Will this lockless read concurrent with a plain write cause a
>    data race?" by removing the optimization, just pre-allocate and
>    fail. We don't expect this to be a normal condition worth
>    optimizing

I can drop it.

FWIW, it was added to address a Sashiko review also:

  By moving the allocation outside the spinlock, the precondition check that
  skipped the allocation when the queue was full is bypassed.

  When the queue is full, which can be common during a hardware fault storm
  if userspace cannot keep up, the code now unconditionally allocates memory,
  copies data, acquires the lock, and then immediately frees the memory and
  drops the event.

  Can this tight loop of wasteful slab allocations, memory copies, and
  deallocations exacerbate IOMMU fault storms by adding unnecessary CPU
  overhead?

  Would it be possible to add an optimistic lockless check, such as
  READ_ONCE(veventq->num_events) < veventq->depth, to bypass the allocation
  when the queue appears full?

> 4) I'm OK with ENOMEM here, leave it, EAGAIN should mean it is
>    pollable and it won't become pollable..

Yea. Sashiko would complain about an EAGAIN as well :-)

> 5) The sizeof(hdr) has been fixed in my rc branch. You can rebase on
>    top of that and also ensure to send a base-commit trailer to help
>    Sashiko apply the patches properly

Oh, I forgot to add base commit ID. Will use your for-rc branch.

> 6) What do you think about the "but done has
>    already been incremented by sizeof(*hdr)" ? unrelated issue? If it
>    is simple lets add a patch here to fix it

I added a patch but didn't include in the series -- Sashiko would
raise more questions against that patch...

I think it's a separate bug; Sashiko pointed out another in fault
queue as well. Both bugs are at failure (corner cases?) path.

I'd like to address them separately.

Thanks
Nicolin

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary
  2026-05-21 18:01   ` Nicolin Chen
@ 2026-05-21 23:27     ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2026-05-21 23:27 UTC (permalink / raw)
  To: Nicolin Chen; +Cc: kevin.tian, iommu, linux-kernel, linux-kselftest

On Thu, May 21, 2026 at 11:01:48AM -0700, Nicolin Chen wrote:

> FWIW, it was added to address a Sashiko review also:
> 
>   By moving the allocation outside the spinlock, the precondition check that
>   skipped the allocation when the queue was full is bypassed.
> 
>   When the queue is full, which can be common during a hardware fault storm
>   if userspace cannot keep up, the code now unconditionally allocates memory,
>   copies data, acquires the lock, and then immediately frees the memory and
>   drops the event.
> 
>   Can this tight loop of wasteful slab allocations, memory copies, and
>   deallocations exacerbate IOMMU fault storms by adding unnecessary CPU
>   overhead?
> 
>   Would it be possible to add an optimistic lockless check, such as
>   READ_ONCE(veventq->num_events) < veventq->depth, to bypass the allocation
>   when the queue appears full?

That seems like nonsense to me.

> > 6) What do you think about the "but done has
> >    already been incremented by sizeof(*hdr)" ? unrelated issue? If it
> >    is simple lets add a patch here to fix it
> 
> I added a patch but didn't include in the series -- Sashiko would
> raise more questions against that patch...
> 
> I think it's a separate bug; Sashiko pointed out another in fault
> queue as well. Both bugs are at failure (corner cases?) path.
> 
> I'd like to address them separately.

Ok

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-05-21 23:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18  2:28 [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Nicolin Chen
2026-05-18  2:28 ` [PATCH rc v1 1/4] iommufd: Move vevent memory allocation outside spinlock Nicolin Chen
2026-05-18  2:28 ` [PATCH rc v1 2/4] iommufd: Set veventq_depth upper bound Nicolin Chen
2026-05-18  2:28 ` [PATCH rc v1 3/4] iommufd: Fix data_len byte-count vs element-count mismatch Nicolin Chen
2026-05-18  2:28 ` [PATCH rc v1 4/4] iommufd/selftest: Add boundary tests for veventq_depth Nicolin Chen
2026-05-18 18:11 ` [PATCH rc v1 0/4] iommufd: Fix veventq_depth boundary Jason Gunthorpe
2026-05-21 14:30 ` Jason Gunthorpe
2026-05-21 18:01   ` Nicolin Chen
2026-05-21 23:27     ` Jason Gunthorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.