Re: [PATCH 0/2] SMMU v3 CMDQ fix and improvement

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jacob Pan <jacob.pan@linux.microsoft.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Mostafa Saleh <smostafa@google.com>,
	linux-kernel@vger.kernel.org,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	Zhang Yu <zhangyu1@linux.microsoft.com>,
	Jean Philippe-Brucker <jean-philippe@linaro.org>,
	Alexander Grest <Alexander.Grest@microsoft.com>
Subject: Re: [PATCH 0/2] SMMU v3 CMDQ fix and improvement
Date: Mon, 20 Oct 2025 11:57:10 -0700	[thread overview]
Message-ID: <20251020115710.0000258b@linux.microsoft.com> (raw)
In-Reply-To: <20251020120240.GI316284@nvidia.com>

On Mon, 20 Oct 2025 09:02:40 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Oct 17, 2025 at 09:50:31AM -0700, Jacob Pan wrote:
> > On Fri, 17 Oct 2025 10:51:45 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Fri, Oct 17, 2025 at 10:57:52AM +0000, Mostafa Saleh wrote:  
> > > > On Wed, Sep 24, 2025 at 10:54:36AM -0700, Jacob Pan wrote:    
> > > > > Hi Will et al,
> > > > > 
> > > > > These two patches are derived from testing SMMU driver with
> > > > > smaller CMDQ sizes where we see soft lockups.
> > > > > 
> > > > > This happens on HyperV emulated SMMU v3 as well as baremetal
> > > > > ARM servers with artificially reduced queue size and
> > > > > microbenchmark to stress test concurrency.    
> > > > 
> > > > Is it possible to share what are the artificial sizes and does
> > > > the HW/emulation support range invalidation (IRD3.RIL)?
> > > > 
> > > > I'd expect it would be really hard to overwhelm the command
> > > > queue, unless the HW doesn't support range invalidation and/or
> > > > the queue entries are close to the number of CPUs.    
> > > 
> > > At least on Jacob's system there is no RIL and there are 72/144
> > > CPU cores potentially banging on this.
> > > 
> > > I think it is combination of lots of required invalidation
> > > commands, low queue depth and slow retirement of commands that
> > > make it easier to create a queue full condition.
> > > 
> > > Without RIL one SVA invalidation may take out the entire small
> > > queue, for example.  
> > Right, no range invalidation and queue depth is 256 in this case.  
> 
> I think Robin is asking you to justify why the queue depth is 256 when
> ARM is recommending much larger depths specifically to fix issues like
> this?
The smaller queue depth is chosen for CMD_SYNC latency reasons. But I
don't know the implementation details of HyperV and host SMMU driver.

IMHO, queue size is orthogonal to what this patch is trying to
address, which is a specific locking problem and improve efficiency.
e.g. eliminated cmpxchg
-	do {
-		val = atomic_cond_read_relaxed(&cmdq->lock, VAL >= 0);
-	} while (atomic_cmpxchg_relaxed(&cmdq->lock, val, val + 1) != val);
+	atomic_cond_read_relaxed(&cmdq->lock, VAL > 0);

Even on BM with restricted queue size, this patch reduces latency of
concurrent madvise(MADV_DONTNEED) from multiple CPUs (I tested 32 CPUs,
cutting 50% latency unmap 1GB buffer in 2MB chucks per CPU).

next prev parent reply	other threads:[~2025-10-20 18:57 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-24 17:54 [PATCH 0/2] SMMU v3 CMDQ fix and improvement Jacob Pan
2025-09-24 17:54 ` [PATCH 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning Jacob Pan
2025-10-07  0:44   ` Nicolin Chen
2025-10-07 16:12     ` Jacob Pan
2025-10-07 16:32       ` Nicolin Chen
2025-09-24 17:54 ` [PATCH 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency Jacob Pan
2025-10-07  1:08   ` Nicolin Chen
2025-10-07 18:16     ` Jacob Pan
2025-10-17 11:04   ` Mostafa Saleh
2025-10-19  5:32     ` Jacob Pan
2025-10-06 15:14 ` [PATCH 0/2] SMMU v3 CMDQ fix and improvement Jacob Pan
2025-10-16 15:31 ` Jacob Pan
2025-10-17 10:57 ` Mostafa Saleh
2025-10-17 13:51   ` Jason Gunthorpe
2025-10-17 14:44     ` Robin Murphy
2025-10-17 16:50     ` Jacob Pan
2025-10-20 12:02       ` Jason Gunthorpe
2025-10-20 18:57         ` Jacob Pan [this message]
2025-10-21 11:45           ` Robin Murphy
2025-10-21 20:37             ` Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251020115710.0000258b@linux.microsoft.com \
    --to=jacob.pan@linux.microsoft.com \
    --cc=Alexander.Grest@microsoft.com \
    --cc=iommu@lists.linux.dev \
    --cc=jean-philippe@linaro.org \
    --cc=jgg@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=smostafa@google.com \
    --cc=will@kernel.org \
    --cc=zhangyu1@linux.microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.