* [PATCH v4 0/2] SMMU v3 CMDQ fix and improvement
@ 2025-11-14 17:17 Jacob Pan
From: Jacob Pan @ 2025-11-14 17:17 UTC
To: linux-kernel, iommu@lists.linux.dev, Will Deacon, Joerg Roedel,
Mostafa Saleh, Jason Gunthorpe, Robin Murphy, Nicolin Chen
Cc: Jacob Pan, Zhang Yu, Jean Philippe-Brucker, Alexander Grest
Hi Will et al,
These two patches address logic issues that occur when SMMU CMDQ spaces
are nearly exhausted at runtime. The problems become more pronounced
when multiple CPUs submit to a single queue, a common scenario under SVA
when shared buffers (used by both CPU and device) are being unmapped.
Thanks,
Jacob
Alexander Grest (1):
iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency
Jacob Pan (1):
iommu/arm-smmu-v3: Fix CMDQ timeout warning
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 72 +++++++++++----------
1 file changed, 37 insertions(+), 35 deletions(-)
--
2.43.0

* [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Jacob Pan @ 2025-11-14 17:17 UTC
To: linux-kernel, iommu@lists.linux.dev, Will Deacon, Joerg Roedel,
Mostafa Saleh, Jason Gunthorpe, Robin Murphy, Nicolin Chen
Cc: Jacob Pan, Zhang Yu, Jean Philippe-Brucker, Alexander Grest

While polling for n spaces in the cmdq, the current code instead checks
if the queue is full. If the queue is almost full but not enough space
(<n), then the CMDQ timeout warning is never triggered even if the
polling has exceeded timeout limit.

The existing arm_smmu_cmdq_poll_until_not_full() doesn't fit efficiently
nor ideally to the only caller arm_smmu_cmdq_issue_cmdlist():
 - It uses a new timer at every single call, which fails to limit to the
   preset ARM_SMMU_POLL_TIMEOUT_US per issue.
 - It has a redundant internal queue_full(), which doesn't detect whether
   there is enough space for n commands.

This patch polls for the availability of exact space instead of full and
emits a timeout warning accordingly.

Fixes: 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion")
Co-developed-by: Yu Zhang <zhangyu1@linux.microsoft.com>
Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com>
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
v4:
- Deleted non-ETIMEOUT error handling for queue_poll (Nicolin)
v3:
- Use a helper for cmdq poll instead of open coding (Nicolin)
- Add more explanation in the commit message (Nicolin)
v2:
- Reduced debug print info (Nicolin)
- Use a separate irq flags for exclusive lock
- Handle queue_poll err code other than ETIMEOUT
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 41 ++++++++-------------
 1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2a8b46b948f0..9824bd808725 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -138,12 +138,6 @@ static bool queue_has_space(struct arm_smmu_ll_queue *q, u32 n)
 	return space >= n;
 }
 
-static bool queue_full(struct arm_smmu_ll_queue *q)
-{
-	return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) &&
-	       Q_WRP(q, q->prod) != Q_WRP(q, q->cons);
-}
-
 static bool queue_empty(struct arm_smmu_ll_queue *q)
 {
 	return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) &&
@@ -633,14 +627,13 @@ static void arm_smmu_cmdq_poll_valid_map(struct arm_smmu_cmdq *cmdq,
 	__arm_smmu_cmdq_poll_set_valid_map(cmdq, sprod, eprod, false);
 }
 
-/* Wait for the command queue to become non-full */
-static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
-					     struct arm_smmu_cmdq *cmdq,
-					     struct arm_smmu_ll_queue *llq)
+
+static inline void arm_smmu_cmdq_poll(struct arm_smmu_device *smmu,
+				      struct arm_smmu_cmdq *cmdq,
+				      struct arm_smmu_ll_queue *llq,
+				      struct arm_smmu_queue_poll *qp)
 {
 	unsigned long flags;
-	struct arm_smmu_queue_poll qp;
-	int ret = 0;
 
 	/*
 	 * Try to update our copy of cons by grabbing exclusive cmdq access. If
@@ -650,19 +643,16 @@ static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
 		WRITE_ONCE(cmdq->q.llq.cons, readl_relaxed(cmdq->q.cons_reg));
 		arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags);
 		llq->val = READ_ONCE(cmdq->q.llq.val);
-		return 0;
+		return;
 	}
 
-	queue_poll_init(smmu, &qp);
-	do {
-		llq->val = READ_ONCE(cmdq->q.llq.val);
-		if (!queue_full(llq))
-			break;
-
-		ret = queue_poll(&qp);
-	} while (!ret);
-
-	return ret;
+	if (queue_poll(qp) == -ETIMEDOUT) {
+		dev_err_ratelimited(smmu->dev, "CMDQ timed out, cons: %08x, prod: 0x%08x\n",
+				    llq->cons, llq->prod);
+		/* Restart the timer */
+		queue_poll_init(smmu, qp);
+	}
+	llq->val = READ_ONCE(cmdq->q.llq.val);
 }
 
 /*
@@ -804,12 +794,13 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 	local_irq_save(flags);
 	llq.val = READ_ONCE(cmdq->q.llq.val);
 	do {
+		struct arm_smmu_queue_poll qp;
 		u64 old;
 
+		queue_poll_init(smmu, &qp);
 		while (!queue_has_space(&llq, n + sync)) {
 			local_irq_restore(flags);
-			if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
-				dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
+			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);
 			local_irq_save(flags);
 		}
 
-- 
2.43.0
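
For illustration only, and not part of the patch: the gap between "the queue is not full" and "the queue has room for n commands" can be shown with a small user-space model. The sketch below assumes a simplified `struct ll_queue_model` in place of `struct arm_smmu_ll_queue`, and `queue_space_model()` is a hypothetical helper. Only the prod/cons index and wrap-bit arithmetic is meant to follow the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct arm_smmu_ll_queue. */
struct ll_queue_model {
	uint32_t prod;		/* producer index plus wrap bit */
	uint32_t cons;		/* consumer index plus wrap bit */
	uint32_t max_n_shift;	/* queue holds 1 << max_n_shift entries */
};

#define Q_IDX(q, p)	((p) & ((1u << (q)->max_n_shift) - 1))
#define Q_WRP(q, p)	((p) & (1u << (q)->max_n_shift))

/* Number of free slots, derived from the index and wrap bits. */
static uint32_t queue_space_model(const struct ll_queue_model *q)
{
	uint32_t prod, cons;

	assert(q->max_n_shift < 31);

	prod = Q_IDX(q, q->prod);
	cons = Q_IDX(q, q->cons);

	if (Q_WRP(q, q->prod) == Q_WRP(q, q->cons))
		return (1u << q->max_n_shift) - (prod - cons);
	return cons - prod;
}

/* The check the old polling helper looped on: zero free slots. */
static bool queue_full(const struct ll_queue_model *q)
{
	return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) &&
	       Q_WRP(q, q->prod) != Q_WRP(q, q->cons);
}

/* The check the caller actually needs: at least n free slots. */
static bool queue_has_space(const struct ll_queue_model *q, uint32_t n)
{
	return queue_space_model(q) >= n;
}
```

With an 8-entry queue holding 7 commands, `queue_full()` is false while `queue_has_space(&q, 4)` is also false. In that state the old helper left its loop at once without timing out, so the caller could spin past the timeout limit without ever printing the warning.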

* Re: [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Nicolin Chen @ 2025-11-14 18:29 UTC
To: Jacob Pan
Cc: linux-kernel, iommu@lists.linux.dev, Will Deacon, Joerg Roedel,
Mostafa Saleh, Jason Gunthorpe, Robin Murphy, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

On Fri, Nov 14, 2025 at 09:17:17AM -0800, Jacob Pan wrote:
> While polling for n spaces in the cmdq, the current code instead checks
> if the queue is full. If the queue is almost full but not enough space
> (<n), then the CMDQ timeout warning is never triggered even if the
> polling has exceeded timeout limit.
>
> The existing arm_smmu_cmdq_poll_until_not_full() doesn't fit efficiently
> nor ideally to the only caller arm_smmu_cmdq_issue_cmdlist():
>  - It uses a new timer at every single call, which fails to limit to the
>    preset ARM_SMMU_POLL_TIMEOUT_US per issue.
>  - It has a redundant internal queue_full(), which doesn't detect whether
>    there is enough space for n commands.
>
> This patch polls for the availability of exact space instead of full and
> emits a timeout warning accordingly.
>
> Fixes: 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion")
> Co-developed-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

* Re: [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Will Deacon @ 2025-11-25 17:19 UTC
To: Jacob Pan
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Robin Murphy, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

On Fri, Nov 14, 2025 at 09:17:17AM -0800, Jacob Pan wrote:
> While polling for n spaces in the cmdq, the current code instead checks
> if the queue is full. If the queue is almost full but not enough space
> (<n), then the CMDQ timeout warning is never triggered even if the
> polling has exceeded timeout limit.
>
> The existing arm_smmu_cmdq_poll_until_not_full() doesn't fit efficiently
> nor ideally to the only caller arm_smmu_cmdq_issue_cmdlist():
>  - It uses a new timer at every single call, which fails to limit to the
>    preset ARM_SMMU_POLL_TIMEOUT_US per issue.
>  - It has a redundant internal queue_full(), which doesn't detect whether
>    there is enough space for n commands.
>
> This patch polls for the availability of exact space instead of full and
> emits a timeout warning accordingly.
>
> Fixes: 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion")
> Co-developed-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>

I'm assuming you're seeing problems with an emulated command queue? Any
chance you could make that bigger?

> @@ -804,12 +794,13 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>  	local_irq_save(flags);
>  	llq.val = READ_ONCE(cmdq->q.llq.val);
>  	do {
> +		struct arm_smmu_queue_poll qp;
>  		u64 old;
>
> +		queue_poll_init(smmu, &qp);
>  		while (!queue_has_space(&llq, n + sync)) {
>  			local_irq_restore(flags);
> -			if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
> -				dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> +			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);

Isn't this broken for wfe-based polling? The SMMU only generates the
wake-up event when the queue becomes non-full.

Will

* Re: [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Jacob Pan @ 2025-11-30 23:06 UTC
To: Will Deacon
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Robin Murphy, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

Hi Will,

On Tue, 25 Nov 2025 17:19:16 +0000
Will Deacon <will@kernel.org> wrote:

> On Fri, Nov 14, 2025 at 09:17:17AM -0800, Jacob Pan wrote:
> > While polling for n spaces in the cmdq, the current code instead
> > checks if the queue is full. If the queue is almost full but not
> > enough space (<n), then the CMDQ timeout warning is never triggered
> > even if the polling has exceeded timeout limit.
> >
> > The existing arm_smmu_cmdq_poll_until_not_full() doesn't fit
> > efficiently nor ideally to the only caller
> > arm_smmu_cmdq_issue_cmdlist():
> >  - It uses a new timer at every single call, which fails to limit
> >    to the preset ARM_SMMU_POLL_TIMEOUT_US per issue.
> >  - It has a redundant internal queue_full(), which doesn't detect
> >    whether there is enough space for n commands.
> >
> > This patch polls for the availability of exact space instead of
> > full and emits a timeout warning accordingly.
> >
> > Fixes: 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion")
> > Co-developed-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> > Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> > Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
>
> I'm assuming you're seeing problems with an emulated command queue?
> Any chance you could make that bigger?
>
This is not related to queue size, but rather a logic issue whenever the
queue is nearly full.

> > @@ -804,12 +794,13 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> >  	local_irq_save(flags);
> >  	llq.val = READ_ONCE(cmdq->q.llq.val);
> >  	do {
> > +		struct arm_smmu_queue_poll qp;
> >  		u64 old;
> >
> > +		queue_poll_init(smmu, &qp);
> >  		while (!queue_has_space(&llq, n + sync)) {
> >  			local_irq_restore(flags);
> > -			if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
> > -				dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> > +			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);
> >
>
> Isn't this broken for wfe-based polling? The SMMU only generates the
> wake-up event when the queue becomes non-full.
I don't see this as a problem since any interrupts such as scheduler
tick can be a break event for WFE, no?

I have also tested this with WFE on BM with no issues. HyperV VM does not
support WFE.

* Re: [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Robin Murphy @ 2025-12-01 19:57 UTC
To: Jacob Pan, Will Deacon
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Nicolin Chen, Zhang Yu, Jean Philippe-Brucker,
Alexander Grest

On 2025-11-30 11:06 pm, Jacob Pan wrote:
> Hi Will,
>
> On Tue, 25 Nov 2025 17:19:16 +0000
> Will Deacon <will@kernel.org> wrote:
>
>> On Fri, Nov 14, 2025 at 09:17:17AM -0800, Jacob Pan wrote:
>>> While polling for n spaces in the cmdq, the current code instead
>>> checks if the queue is full. If the queue is almost full but not
>>> enough space (<n), then the CMDQ timeout warning is never triggered
>>> even if the polling has exceeded timeout limit.
>>>
>>> The existing arm_smmu_cmdq_poll_until_not_full() doesn't fit
>>> efficiently nor ideally to the only caller
>>> arm_smmu_cmdq_issue_cmdlist():
>>>  - It uses a new timer at every single call, which fails to limit
>>>    to the preset ARM_SMMU_POLL_TIMEOUT_US per issue.
>>>  - It has a redundant internal queue_full(), which doesn't detect
>>>    whether there is enough space for n commands.
>>>
>>> This patch polls for the availability of exact space instead of
>>> full and emits a timeout warning accordingly.
>>>
>>> Fixes: 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion")
>>> Co-developed-by: Yu Zhang <zhangyu1@linux.microsoft.com>
>>> Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com>
>>> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
>>
>> I'm assuming you're seeing problems with an emulated command queue?
>> Any chance you could make that bigger?
>>
> This is not related to queue size, but rather a logic issue whenever the
> queue is nearly full.

It is absolutely related to queue size, because there are only two
reasons why this warning should ever be seen:

1: The SMMU is in some unexpected error state and has stopped consuming
   commands altogether.
2: The queue is far too small to do its job of buffering commands for
   the number of present CPUs.

>>> @@ -804,12 +794,13 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>>>  	local_irq_save(flags);
>>>  	llq.val = READ_ONCE(cmdq->q.llq.val);
>>>  	do {
>>> +		struct arm_smmu_queue_poll qp;
>>>  		u64 old;
>>>
>>> +		queue_poll_init(smmu, &qp);
>>>  		while (!queue_has_space(&llq, n + sync)) {
>>>  			local_irq_restore(flags);
>>> -			if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
>>> -				dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
>>> +			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);
>>>
>>
>> Isn't this broken for wfe-based polling? The SMMU only generates the
>> wake-up event when the queue becomes non-full.
> I don't see this as a problem since any interrupts such as scheduler
> tick can be a break event for WFE, no?

No, we cannot assume that any other WFE wakeup event is *guaranteed*.
It's certainly possible to get stuck in WFE on a NOHZ_FULL kernel with
the arch timer event stream disabled - I recall proving that back when I
hoped I might not have to bother upstreaming a workaround for MMU-600
erratum #1076982.

Yes, a large part of the reason we enable the event stream by default is
to help mitigate errata and software bugs which lead to missed events,
but that still doesn't mean we should consciously abuse it. I guess
something like the diff below (completely untested) would be a bit
closer to correct (basically, allow WFE when the queue is completely
full, but suppress it if we're then still waiting for more space in a
non-full queue).

Thanks,
Robin.

----->8-----
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a0c6d87f85a1..206c6c6860dd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -123,7 +123,7 @@ static void parse_driver_options(struct arm_smmu_device *smmu)
 }
 
 /* Low-level queue manipulation functions */
-static bool queue_has_space(struct arm_smmu_ll_queue *q, u32 n)
+static int queue_space(struct arm_smmu_ll_queue *q)
 {
 	u32 space, prod, cons;
 
@@ -135,7 +135,7 @@ static bool queue_has_space(struct arm_smmu_ll_queue *q, u32 n)
 	else
 		space = cons - prod;
 
-	return space >= n;
+	return space;
 }
 
 static bool queue_empty(struct arm_smmu_ll_queue *q)
@@ -796,9 +796,11 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
 	do {
 		struct arm_smmu_queue_poll qp;
 		u64 old;
+		int space;
 
 		queue_poll_init(smmu, &qp);
-		while (!queue_has_space(&llq, n + sync)) {
+		while (space = queue_space(&llq), space < n + sync) {
+			qp.wfe &= !space;
 			local_irq_restore(flags);
 			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);
 			local_irq_save(flags);
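
For illustration only: the WFE gating that the untested diff above proposes can be modelled in isolation. The sketch assumes a hypothetical `struct poll_model` that keeps only the `wfe` flag of `struct arm_smmu_queue_poll`, and `must_keep_polling()` is an invented name for the loop condition plus the `qp.wfe &= !space` step. It is a user-space approximation, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct arm_smmu_queue_poll; only wfe is modelled. */
struct poll_model {
	bool wfe;	/* true: the next poll may sleep in WFE */
};

/*
 * Returns true while the caller must keep polling for "needed" slots.
 * Mirrors "qp.wfe &= !space": WFE stays enabled only while the queue is
 * completely full, and is latched off once any free space has been seen,
 * because the SMMU sends no further wake-up event for a non-full queue.
 */
static bool must_keep_polling(struct poll_model *qp, uint32_t space,
			      uint32_t needed)
{
	assert(qp);

	if (space >= needed)
		return false;

	qp->wfe = qp->wfe && (space == 0);
	return true;
}
```

In this model the flag stays off until the poll state is initialised again, which corresponds to the `queue_poll_init()` call at the top of each iteration of the outer loop in the diff.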

* Re: [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Jacob Pan @ 2025-12-01 21:49 UTC
To: Robin Murphy
Cc: Will Deacon, linux-kernel, iommu@lists.linux.dev, Joerg Roedel,
Mostafa Saleh, Jason Gunthorpe, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

Hi Robin,

On Mon, 1 Dec 2025 19:57:31 +0000
Robin Murphy <robin.murphy@arm.com> wrote:

> On 2025-11-30 11:06 pm, Jacob Pan wrote:
> > Hi Will,
> >
> > On Tue, 25 Nov 2025 17:19:16 +0000
> > Will Deacon <will@kernel.org> wrote:
> >
> >> On Fri, Nov 14, 2025 at 09:17:17AM -0800, Jacob Pan wrote:
> >>> While polling for n spaces in the cmdq, the current code instead
> >>> checks if the queue is full. If the queue is almost full but not
> >>> enough space (<n), then the CMDQ timeout warning is never
> >>> triggered even if the polling has exceeded timeout limit.
> >>>
> >>> The existing arm_smmu_cmdq_poll_until_not_full() doesn't fit
> >>> efficiently nor ideally to the only caller
> >>> arm_smmu_cmdq_issue_cmdlist():
> >>>  - It uses a new timer at every single call, which fails to limit
> >>>    to the preset ARM_SMMU_POLL_TIMEOUT_US per issue.
> >>>  - It has a redundant internal queue_full(), which doesn't detect
> >>>    whether there is enough space for n commands.
> >>>
> >>> This patch polls for the availability of exact space instead of
> >>> full and emits a timeout warning accordingly.
> >>>
> >>> Fixes: 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion")
> >>> Co-developed-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> >>> Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com>
> >>> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
> >>
> >> I'm assuming you're seeing problems with an emulated command queue?
> >> Any chance you could make that bigger?
> >>
> > This is not related to queue size, but rather a logic issue whenever
> > the queue is nearly full.
>
> It is absolutely related to queue size, because there are only two
> reasons why this warning should ever be seen:
>
> 1: The SMMU is in some unexpected error state and has stopped
>    consuming commands altogether.
> 2: The queue is far too small to do its job of buffering commands for
>    the number of present CPUs.
I agree that smaller queue size or slow emulation makes this problem
more visible, in this sense they are related.

> >>> @@ -804,12 +794,13 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> >>>  	local_irq_save(flags);
> >>>  	llq.val = READ_ONCE(cmdq->q.llq.val);
> >>>  	do {
> >>> +		struct arm_smmu_queue_poll qp;
> >>>  		u64 old;
> >>>
> >>> +		queue_poll_init(smmu, &qp);
> >>>  		while (!queue_has_space(&llq, n + sync)) {
> >>>  			local_irq_restore(flags);
> >>> -			if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
> >>> -				dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> >>> +			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);
> >>
> >> Isn't this broken for wfe-based polling? The SMMU only generates
> >> the wake-up event when the queue becomes non-full.
> > I don't see this as a problem since any interrupts such as scheduler
> > tick can be a break event for WFE, no?
>
> No, we cannot assume that any other WFE wakeup event is *guaranteed*.
> It's certainly possible to get stuck in WFE on a NOHZ_FULL kernel with
> the arch timer event stream disabled - I recall proving that back
> when I hoped I might not have to bother upstreaming a workaround for
> MMU-600 erratum #1076982.
>
Makes sense, thanks for sharing this history.

> Yes, a large part of the reason we enable the event stream by default
> is to help mitigate errata and software bugs which lead to missed
> events, but that still doesn't mean we should consciously abuse it. I
> guess something like the diff below (completely untested) would be a
> bit closer to correct (basically, allow WFE when the queue is
> completely full, but suppress it if we're then still waiting for more
> space in a non-full queue).
>
The diff below looks good to me, I can give it a try. But at the same
time, since queue full is supposed to be an exceptional scenario, I
wonder if we can just disable WFE altogether in this case. Power saving
may not matter here?

> Thanks,
> Robin.
>
> ----->8-----
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index a0c6d87f85a1..206c6c6860dd 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -123,7 +123,7 @@ static void parse_driver_options(struct arm_smmu_device *smmu)
>  }
>
>  /* Low-level queue manipulation functions */
> -static bool queue_has_space(struct arm_smmu_ll_queue *q, u32 n)
> +static int queue_space(struct arm_smmu_ll_queue *q)
>  {
>  	u32 space, prod, cons;
>
> @@ -135,7 +135,7 @@ static bool queue_has_space(struct arm_smmu_ll_queue *q, u32 n)
>  	else
>  		space = cons - prod;
>
> -	return space >= n;
> +	return space;
>  }
>
>  static bool queue_empty(struct arm_smmu_ll_queue *q)
> @@ -796,9 +796,11 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>  	do {
>  		struct arm_smmu_queue_poll qp;
>  		u64 old;
> +		int space;
>
>  		queue_poll_init(smmu, &qp);
> -		while (!queue_has_space(&llq, n + sync)) {
> +		while (space = queue_space(&llq), space < n + sync) {
> +			qp.wfe &= !space;
>  			local_irq_restore(flags);
>  			arm_smmu_cmdq_poll(smmu, cmdq, &llq, &qp);
>  			local_irq_save(flags);

* Re: [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
From: Jacob Pan @ 2025-12-01 17:42 UTC
To: Will Deacon
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Robin Murphy, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

Hi Will,

On Tue, 25 Nov 2025 17:19:16 +0000
Will Deacon <will@kernel.org> wrote:

> I'm assuming you're seeing problems with an emulated command queue?
> Any chance you could make that bigger?
Yes, it was initially observed on HyperV emulated CMDQ with small queue
size, which was chosen for latency reasons. But as I explained in the
other thread, the queue space polling timeout detection problem is not
directly related to queue size. It is a code logic bug IMHO.

Sorry I forgot to directly answer these.

* [PATCH v4 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency
From: Jacob Pan @ 2025-11-14 17:17 UTC
To: linux-kernel, iommu@lists.linux.dev, Will Deacon, Joerg Roedel,
Mostafa Saleh, Jason Gunthorpe, Robin Murphy, Nicolin Chen
Cc: Jacob Pan, Zhang Yu, Jean Philippe-Brucker, Alexander Grest

From: Alexander Grest <Alexander.Grest@microsoft.com>

The SMMU CMDQ lock is highly contentious when there are multiple CPUs
issuing commands and the queue is nearly full.

The lock has the following states:
 - 0:		Unlocked
 - >0:		Shared lock held with count
 - INT_MIN+N:	Exclusive lock held, where N is the # of shared waiters
 - INT_MIN:	Exclusive lock held, no shared waiters

When multiple CPUs are polling for space in the queue, they attempt to
grab the exclusive lock to update the cons pointer from the hardware. If
they fail to get the lock, they will spin until the cons pointer is
updated by another CPU.

The current code allows the possibility of shared lock starvation
if there is a constant stream of CPUs trying to grab the exclusive lock.
This leads to severe latency issues and soft lockups.

Consider the following scenario where CPU1's attempt to acquire the
shared lock is starved by CPU2 and CPU0 contending for the exclusive
lock.

CPU0 (exclusive)  | CPU1 (shared)     | CPU2 (exclusive)    | `cmdq->lock`
--------------------------------------------------------------------------
trylock() //takes |                   |                     | 0
                  | shared_lock()     |                     | INT_MIN
                  | fetch_inc()       |                     | INT_MIN
                  | no return         |                     | INT_MIN + 1
                  | spins // VAL >= 0 |                     | INT_MIN + 1
unlock()          | spins...          |                     | INT_MIN + 1
set_release(0)    | spins...          |                     | 0 see[NOTE]
(done)            | (sees 0)          | trylock() // takes  | 0
                  | *exits loop*      | cmpxchg(0, INT_MIN) | 0
                  |                   | *cuts in*           | INT_MIN
                  | cmpxchg(0, 1)     |                     | INT_MIN
                  | fails // != 0     |                     | INT_MIN
                  | spins // VAL >= 0 |                     | INT_MIN
                  | *starved*         |                     | INT_MIN

[NOTE] The current code resets the exclusive lock to 0 regardless of the
state of the lock. This causes two problems:
1. It opens the possibility of back-to-back exclusive locks and the
   downstream effect of starving shared lock.
2. The count of shared lock waiters is lost.

To mitigate this, we release the exclusive lock by only clearing the sign
bit while retaining the shared lock waiter count as a way to avoid
starving the shared lock waiters.

Also deleted cmpxchg loop while trying to acquire the shared lock as it
is not needed. The waiters can see the positive lock count and proceed
immediately after the exclusive lock is released.

Exclusive lock is not starved in that submitters will try exclusive lock
first when new spaces become available.

Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Alexander Grest <Alexander.Grest@microsoft.com>
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
---
v4:
- No change
v3:
- Add flow chart for example starvation case (Nicolin) no code change.
v2:
- Changed shared lock acquire condition from VAL>=0 to VAL>0 (Mostafa)
- Added more comments to explain shared lock change (Nicolin)
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 31 ++++++++++++++-------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9824bd808725..0b62b3b0f994 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -481,20 +481,26 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
  */
 static void arm_smmu_cmdq_shared_lock(struct arm_smmu_cmdq *cmdq)
 {
-	int val;
-
 	/*
-	 * We can try to avoid the cmpxchg() loop by simply incrementing the
-	 * lock counter. When held in exclusive state, the lock counter is set
-	 * to INT_MIN so these increments won't hurt as the value will remain
-	 * negative.
+	 * When held in exclusive state, the lock counter is set to INT_MIN
+	 * so these increments won't hurt as the value will remain negative.
+	 * The increment will also signal the exclusive locker that there are
+	 * shared waiters.
 	 */
 	if (atomic_fetch_inc_relaxed(&cmdq->lock) >= 0)
 		return;
 
-	do {
-		val = atomic_cond_read_relaxed(&cmdq->lock, VAL >= 0);
-	} while (atomic_cmpxchg_relaxed(&cmdq->lock, val, val + 1) != val);
+	/*
+	 * Someone else is holding the lock in exclusive state, so wait
+	 * for them to finish. Since we already incremented the lock counter,
+	 * no exclusive lock can be acquired until we finish. We don't need
+	 * the return value since we only care that the exclusive lock is
+	 * released (i.e. the lock counter is non-negative).
+	 * Once the exclusive locker releases the lock, the sign bit will
+	 * be cleared and our increment will make the lock counter positive,
+	 * allowing us to proceed.
+	 */
+	atomic_cond_read_relaxed(&cmdq->lock, VAL > 0);
 }
 
 static void arm_smmu_cmdq_shared_unlock(struct arm_smmu_cmdq *cmdq)
@@ -521,9 +527,14 @@ static bool arm_smmu_cmdq_shared_tryunlock(struct arm_smmu_cmdq *cmdq)
 	__ret;								\
 })
 
+/*
+ * Only clear the sign bit when releasing the exclusive lock this will
+ * allow any shared_lock() waiters to proceed without the possibility
+ * of entering the exclusive lock in a tight loop.
+ */
 #define arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags)		\
 ({									\
-	atomic_set_release(&cmdq->lock, 0);				\
+	atomic_fetch_and_release(~INT_MIN, &cmdq->lock);		\
 	local_irq_restore(flags);					\
 })
 
-- 
2.43.0
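
For illustration only: the effect of the two release strategies on the lock word can be traced in a single-threaded user-space model. The sketch uses C11 `<stdatomic.h>` rather than the kernel's `atomic_t` API, and `struct cmdq_lock_model` and all function names are hypothetical. It models only the value transitions from the commit message, not the spinning or the IRQ handling.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of cmdq->lock: 0 unlocked, >0 shared, <0 exclusive. */
struct cmdq_lock_model {
	atomic_int lock;
};

static void lock_init(struct cmdq_lock_model *l)
{
	atomic_init(&l->lock, 0);
}

static int lock_value(struct cmdq_lock_model *l)
{
	return atomic_load_explicit(&l->lock, memory_order_relaxed);
}

/* Exclusive trylock: succeeds only if the lock word is exactly 0. */
static bool exclusive_trylock(struct cmdq_lock_model *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->lock, &expected,
			INT_MIN, memory_order_relaxed, memory_order_relaxed);
}

/* Old release: store 0, discarding any increments made by shared waiters. */
static void exclusive_unlock_old(struct cmdq_lock_model *l)
{
	atomic_store_explicit(&l->lock, 0, memory_order_release);
}

/* Patched release: clear only the sign bit so the waiter count survives. */
static void exclusive_unlock_new(struct cmdq_lock_model *l)
{
	assert(lock_value(l) < 0);	/* caller must hold the exclusive lock */
	atomic_fetch_and_explicit(&l->lock, ~INT_MIN, memory_order_release);
}

/*
 * First step of shared_lock(): announce ourselves with an increment.
 * Returns true if the lock was not held exclusively, i.e. we own it now.
 */
static bool shared_lock_announce(struct cmdq_lock_model *l)
{
	return atomic_fetch_add_explicit(&l->lock, 1,
					 memory_order_relaxed) >= 0;
}
```

With the old release, a waiter's increment is wiped and the lock word returns to 0, so a second exclusive locker can cut in. With the patched release the word becomes 1 when one waiter is queued, so `exclusive_trylock()` fails until that waiter has taken and dropped its shared lock.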

* Re: [PATCH v4 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency
From: Will Deacon @ 2025-11-25 17:18 UTC
To: Jacob Pan
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Robin Murphy, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

On Fri, Nov 14, 2025 at 09:17:18AM -0800, Jacob Pan wrote:
> @@ -521,9 +527,14 @@ static bool arm_smmu_cmdq_shared_tryunlock(struct arm_smmu_cmdq *cmdq)
>  	__ret;								\
>  })
>
> +/*
> + * Only clear the sign bit when releasing the exclusive lock this will
> + * allow any shared_lock() waiters to proceed without the possibility
> + * of entering the exclusive lock in a tight loop.
> + */
>  #define arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags)		\
>  ({									\
> -	atomic_set_release(&cmdq->lock, 0);				\
> +	atomic_fetch_and_release(~INT_MIN, &cmdq->lock);		\

nit: you can use atomic_fetch_andnot_release(INT_MIN)

That aside, doesn't this introduce a new fairness issue in that a steady
stream of shared lockers will starve somebody trying to take the lock
in exclusive state?

Will

* Re: [PATCH v4 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency
From: Jacob Pan @ 2025-11-30 22:52 UTC
To: Will Deacon
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Robin Murphy, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

Hi Will,

On Tue, 25 Nov 2025 17:18:57 +0000
Will Deacon <will@kernel.org> wrote:

> On Fri, Nov 14, 2025 at 09:17:18AM -0800, Jacob Pan wrote:
> > @@ -521,9 +527,14 @@ static bool arm_smmu_cmdq_shared_tryunlock(struct arm_smmu_cmdq *cmdq)
> >  	__ret;								\
> >  })
> >
> > +/*
> > + * Only clear the sign bit when releasing the exclusive lock this will
> > + * allow any shared_lock() waiters to proceed without the possibility
> > + * of entering the exclusive lock in a tight loop.
> > + */
> >  #define arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags)		\
> >  ({									\
> > -	atomic_set_release(&cmdq->lock, 0);				\
> > +	atomic_fetch_and_release(~INT_MIN, &cmdq->lock);		\
>
> nit: you can use atomic_fetch_andnot_release(INT_MIN)
>
Good point, will do.

> That aside, doesn't this introduce a new fairness issue in that a
> steady stream of shared lockers will starve somebody trying to take
> the lock in exclusive state?
>
I don't think this change will starve exclusive lockers in the current
code flow since new shared locker must acquire exclusive locker first
while polling for available queue spaces.
* Re: [PATCH v4 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency
2025-11-30 22:52 ` Jacob Pan
@ 2025-12-10 3:11 ` Will Deacon
0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2025-12-10 3:11 UTC
To: Jacob Pan
Cc: linux-kernel, iommu@lists.linux.dev, Joerg Roedel, Mostafa Saleh,
Jason Gunthorpe, Robin Murphy, Nicolin Chen, Zhang Yu,
Jean Philippe-Brucker, Alexander Grest

On Sun, Nov 30, 2025 at 02:52:36PM -0800, Jacob Pan wrote:
> On Tue, 25 Nov 2025 17:18:57 +0000
> Will Deacon <will@kernel.org> wrote:
> > That aside, doesn't this introduce a new fairness issue in that a
> > steady stream of shared lockers will starve somebody trying to take
> > the lock in exclusive state?
> >
> I don't think this change will starve exclusive lockers in the
> current code flow since new shared locker must acquire exclusive locker
> first while polling for available queue spaces.

Looking at this again, we already have the same starvation problem in
that the lockword has to hit zero for the exclusive locker to succeed.
So my initial worry was unfounded.

Will
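Will's closing point, that an exclusive locker can only succeed once the lockword reaches zero, can be illustrated with a small userspace model. This is a sketch under stated assumptions rather than the driver's macro: `model_exclusive_trylock()` is a hypothetical name, C11 atomics stand in for the kernel's `atomic_t` operations, the memory orderings are chosen for the sketch, and interrupt masking is left out entirely.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical model of an exclusive trylock: it succeeds only when the
 * lockword is exactly zero (no shared holders, no exclusive holder), and
 * marks the exclusive state by setting the sign bit (INT_MIN).
 */
static inline bool model_exclusive_trylock(atomic_int *lock)
{
	int expected = 0;

	/* Acquire on success is an assumption of this sketch. */
	return atomic_compare_exchange_strong_explicit(lock, &expected, INT_MIN,
						       memory_order_acquire,
						       memory_order_relaxed);
}
```

In this model any positive count, whether it was reached after a full reset to zero or left behind by clearing only the sign bit, makes the trylock fail until the last shared holder drops out. That matches the observation that the starvation window already existed before the patch.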
* Re: [PATCH v4 0/2] SMMU v3 CMDQ fix and improvement
2025-11-14 17:17 [PATCH v4 0/2] SMMU v3 CMDQ fix and improvement Jacob Pan
2025-11-14 17:17 ` [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning Jacob Pan
2025-11-14 17:17 ` [PATCH v4 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency Jacob Pan
@ 2025-11-20 17:10 ` Jacob Pan
2 siblings, 0 replies; 13+ messages in thread
From: Jacob Pan @ 2025-11-20 17:10 UTC
To: linux-kernel, iommu@lists.linux.dev, Will Deacon, Joerg Roedel,
Mostafa Saleh, Jason Gunthorpe, Robin Murphy, Nicolin Chen
Cc: Zhang Yu, Jean Philippe-Brucker, Alexander Grest

Hi Joerg/Will,

Any more comments on these? I could spin another version to add
Nicolin's Reviewed-by tag for patch 1/2.

Thanks,

Jacob

On Fri, 14 Nov 2025 09:17:16 -0800
Jacob Pan <jacob.pan@linux.microsoft.com> wrote:

> Hi Will et al,
>
> These two patches address logic issues that occur when SMMU CMDQ
> spaces are nearly exhausted at runtime. The problems become more
> pronounced when multiple CPUs submit to a single queue, a common
> scenario under SVA when shared buffers (used by both CPU and device)
> are being unmapped.
>
> Thanks,
>
> Jacob
>
> Alexander Grest (1):
>   iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency
>
> Jacob Pan (1):
>   iommu/arm-smmu-v3: Fix CMDQ timeout warning
>
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 72 +++++++++++----------
>  1 file changed, 37 insertions(+), 35 deletions(-)
>
end of thread (newest message: ~2025-12-10 3:11 UTC)

Thread overview: 13+ messages
2025-11-14 17:17 [PATCH v4 0/2] SMMU v3 CMDQ fix and improvement Jacob Pan
2025-11-14 17:17 ` [PATCH v4 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning Jacob Pan
2025-11-14 18:29 ` Nicolin Chen
2025-11-25 17:19 ` Will Deacon
2025-11-30 23:06 ` Jacob Pan
2025-12-01 19:57 ` Robin Murphy
2025-12-01 21:49 ` Jacob Pan
2025-12-01 17:42 ` Jacob Pan
2025-11-14 17:17 ` [PATCH v4 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency Jacob Pan
2025-11-25 17:18 ` Will Deacon
2025-11-30 22:52 ` Jacob Pan
2025-12-10 3:11 ` Will Deacon
2025-11-20 17:10 ` [PATCH v4 0/2] SMMU v3 CMDQ fix and improvement Jacob Pan