From: Pranjal Shrivastava <praan@google.com>
To: Daniel Mentz <danielmentz@google.com>
Cc: iommu@lists.linux.dev, Will Deacon <will@kernel.org>,
Joerg Roedel <joro@8bytes.org>,
Robin Murphy <robin.murphy@arm.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
Mostafa Saleh <smostafa@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
Ashish Mhetre <amhetre@nvidia.com>,
Sairaj Kodilkar <sarunkod@amd.com>
Subject: Re: [PATCH v5 06/10] iommu/arm-smmu-v3: Add a usage counter for cmdq
Date: Wed, 11 Mar 2026 13:53:13 +0000 [thread overview]
Message-ID: <abFzyWG-ASb1Nv4J@google.com> (raw)
In-Reply-To: <CAE2F3rA_uyYGx2PCHMpFPWD6B4_2WxK-ahpoRHioXfuNf5Mpxw@mail.gmail.com>
On Tue, Mar 10, 2026 at 09:43:05PM -0700, Daniel Mentz wrote:
> Hi Pranjal,
>
> I have thought about this further and would like to propose an
> alternative design.
>
> A key difference in this design is that no commands are written to the
> command queue, and the producer index is not updated while the SMMU is
> suspended.
>
> Similar to the existing CMDQ_PROD_OWNED_FLAG, we could define a new
> flag, CMDQ_PROD_STOP_FLAG, stored in cmdq->q.llq.prod.
>
> In arm_smmu_cmdq_issue_cmdlist():
> Perform "old = cmpxchg_relaxed(&cmdq->q.llq.val, llq.val, head.val);".
> If CMDQ_PROD_STOP_FLAG is set in the variable named old, return from
> arm_smmu_cmdq_issue_cmdlist() immediately.
>
> In the Suspend handler:
>
> 1. Clear SMMU_CR0.SMMUEN.
> 2. Set CMDQ_PROD_STOP_FLAG in cmdq->q.llq.prod.
> 3. Wait until CMDQ_PROD_OWNED_FLAG is cleared from cmdq->q.llq.prod.
> 4. Wait until cmdq->owner_prod matches cmdq->q.llq.prod (excluding the
> CMDQ_PROD_STOP_FLAG).
> 5. Wait until cmdq->lock is 0.
> 6. Clear SMMU_CR0.
>
> In the Resume handler:
> Follow the same logic as your patch, but clear CMDQ_PROD_STOP_FLAG in
> cmdq->q.llq.prod instead of setting nr_cmdq_users to 1.
>
> One benefit of this proposal is that it avoids adding a new atomic
> variable that multiple CPU threads might compete for. Instead, we
> reuse existing atomic operations, which should reduce the performance
> impact for server use cases where many CPU threads compete for command
> queue access.
>
I like the idea/direction! Although, there could be a few (solvable)
challenges, for example:
1. Abandoned Command Batch
In the SMMUv3 driver, cmdq->q.llq.val is the shared global state in
RAM. The cmpxchg is the "Point of Commitment." Consider the following
sequence of events:
a. Assume STOP_FLAG is already set. Global prod is 10 | STOP_FLAG.
b. CPU-A calls issue_cmdlist. It reads the global state:
llq.val = 10 | STOP_FLAG.
c. CPU-A calculates its new target:
head.prod = 14 | STOP_FLAG | OWNED_FLAG.
d. CPU-A executes: old = cmpxchg(&cmdq->q.llq.val, llq.val, head.val)
The swap succeeds because the global state was exactly llq.val.
Global prod is now 14. We have just told the entire system that
there are 14 commands.
e. CPU-A now checks the old variable. It sees STOP_FLAG is set and
returns immediately.
CPU-A has "reserved" the indices 10, 11, 12, and 13 in the global
producer pointer, but it never wrote the commands to those memory slots.
Indices 10–13 now contain garbage/stale data. When the SMMU resumes and
processes up to index 14, it will execute that garbage cmd leading to a
fault (CERROR_ILL).
2. The OWNED_FLAG Deadlock
a. The CPU who finds OWNED_FLAG is 0 in the cmpxchg becomes the owner
Its job is to gather all concurrent work and update the hardware.
b. Any subsequent CPUs that find OWNED_FLAG is 1 in the cmpxchg. They
write their commands to RAM and then just wait. They rely on the
Owner to "kick" the HW for them & to clear the flag for the next
batch.
The Deadlock Scenario:
a. Suspend Handler: Sets the STOP_FLAG in the global cmdq->q.llq.val.
b. CPU-A (The Owner): Executes cmpxchg. It moves the global prod from
10 to 14. Because it was the first in this batch, it does not set
the OWNED_FLAG.
c. CPU-B (The Follower): Executes cmpxchg. It moves the global prod
from 14 to 18. Because it is joining CPU-A's batch, its cmpxchg
sets the OWNED_FLAG in the global state.
d. Now, CPU-A checks the old value, sees STOP_FLAG, and returns.
CPU-B checks the old value, sees STOP_FLAG, and returns.
e. The Consequence: The global state cmdq->q.llq.prod now has the
OWNED_FLAG set.
f. Suspend Handler: Calls Wait until CMDQ_PROD_OWNED_FLAG is cleared
g. Since the Owner (CPU-A) should clear that flag (it does so at the
very end of the function). The Suspend Handler will spin on this
flag forever.
This can be solved by checking the flag before setting the OWNED_FLAG.
While the atomic counter (nr_cmdq_users) used in v5 adds an tomic, I
believe that it provides a cleaner "gate" that prevents CPUs from ever
entering the "Commitment" phase once a suspend is imminent (with fixes).
However, I agree that the atomic cntr can cause pref drops on servers.
I'm okay to adopt Daniel's direction (it only impacts 2 patches) & post
a v6. However, given we've been working through this for a year now, I'd
want to make sure we have a solid consensus on the design from everyone.
I felt with v4 that we have consensus on the design but it still seems
to be evolving. I’m ready to post v6 once we settle on the path forward.
Thanks,
Praan
next prev parent reply other threads:[~2026-03-11 13:53 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-26 15:11 [PATCH v5 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 01/10] iommu/arm-smmu-v3: Refactor arm_smmu_setup_irqs Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 02/10] iommu/arm-smmu-v3: Add a helper to drain cmd queues Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 03/10] iommu/tegra241-cmdqv: Add a helper to drain VCMDQs Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 04/10] iommu/tegra241-cmdqv: Restore PROD and CONS after resume Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 05/10] iommu/arm-smmu-v3: Cache and restore MSI config Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 06/10] iommu/arm-smmu-v3: Add a usage counter for cmdq Pranjal Shrivastava
2026-03-08 21:23 ` Daniel Mentz
2026-03-09 19:46 ` Pranjal Shrivastava
2026-03-09 21:13 ` Pranjal Shrivastava
2026-03-09 22:56 ` Daniel Mentz
2026-03-10 16:46 ` Pranjal Shrivastava
2026-03-09 22:41 ` Daniel Mentz
2026-03-10 16:54 ` Pranjal Shrivastava
2026-03-11 4:43 ` Daniel Mentz
2026-03-11 13:53 ` Pranjal Shrivastava [this message]
2026-03-11 22:56 ` Daniel Mentz
2026-03-16 15:36 ` Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 07/10] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops Pranjal Shrivastava
2026-03-10 1:45 ` Daniel Mentz
2026-03-10 16:58 ` Pranjal Shrivastava
2026-03-10 21:16 ` Daniel Mentz
2026-03-10 21:40 ` Pranjal Shrivastava
2026-03-11 0:12 ` Daniel Mentz
2026-03-11 5:31 ` Pranjal Shrivastava
2026-03-11 17:26 ` Daniel Mentz
2026-03-16 15:05 ` Pranjal Shrivastava
2026-03-12 19:20 ` Daniel Mentz
2026-01-26 15:11 ` [PATCH v5 08/10] iommu/arm-smmu-v3: Handle gerror during suspend Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 09/10] iommu/arm-smmu-v3: Enable pm_runtime and setup devlinks Pranjal Shrivastava
2026-01-26 15:11 ` [PATCH v5 10/10] iommu/arm-smmu-v3: Invoke pm_runtime before hw access Pranjal Shrivastava
2026-02-06 15:48 ` [PATCH v5 00/10] iommu/arm-smmu-v3: Implement Runtime/System Sleep ops Ashish Mhetre
2026-02-11 13:33 ` Pranjal Shrivastava
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=abFzyWG-ASb1Nv4J@google.com \
--to=praan@google.com \
--cc=amhetre@nvidia.com \
--cc=danielmentz@google.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@ziepe.ca \
--cc=joro@8bytes.org \
--cc=nicolinc@nvidia.com \
--cc=robin.murphy@arm.com \
--cc=sarunkod@amd.com \
--cc=smostafa@google.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox