Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Leonardo Bras <leo.bras@arm.com>
To: Oliver Upton <oupton@kernel.org>
Cc: Leonardo Bras <leo.bras@arm.com>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Fuad Tabba <tabba@google.com>,
	Raghavendra Rao Ananta <rananta@google.com>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] KVM: arm64: Introduce S2 walker SKIP return options
Date: Wed, 20 May 2026 12:49:56 +0100	[thread overview]
Message-ID: <ag2f5C3i3LBGSPS5@devkitleo> (raw)
In-Reply-To: <agzUZS1l-uHQoodG@kernel.org>

On Tue, May 19, 2026 at 02:21:41PM -0700, Oliver Upton wrote:
> On Tue, May 19, 2026 at 03:35:19PM +0100, Leonardo Bras wrote:
> > On Tue, May 19, 2026 at 02:15:41PM +0100, Will Deacon wrote:
> > > On Tue, May 19, 2026 at 01:56:48PM +0100, Leonardo Bras wrote:
> > > > On Tue, May 19, 2026 at 01:43:37PM +0100, Will Deacon wrote:
> > > > > > > I was wondering along similar lines, but maybe it would be useful just
> > > > > > > to pass a maximum level to the walker logic? That feels like the most
> > > > > > > general case without complicating the existing logic.
> 
> FWIW, I had considered this too but decided that it requires a bit more
> churn since we cannot rely on zero initialization in the existing
> callsites (level 0 is a valid level).
> 
> But that's extremely minor.
> 
> > > > > > This proposal seems simpler for me to understand, and indeed looks like a 
> > > > > > better solution than what I have proposed, taking care of  the 
> > > > > > 'already split' case with better performance, as it don't even walk a 
> > > > > > single level-3 entry. 
> > > > > > 
> > > > > > On the 'splitting' case, it also works flawlessly if the memory is given in 
> > > > > > level-2 blocks. There is only one case that I would like to address here:
> > > > > > 
> > > > > > - Memory given in level-1 blocks (say 1GB)
> > > > > > - Walker flag says 'walk down to level-2 only'
> > > > > > - Split Walker on level-1 will break page down to (up to) level-3 entries.
> > > > > > - Walker will continue to be called on level-2 entries, even though it's 
> > > > > >   not necessary.
> > > > > 
> > > > > If you're only visiting leaves, why would it be called on the level-2
> > > > > table entries?
> > > > > 
> > > > 
> > > > Because once the leaf is turned into a table by the splitting walker, it 
> > > > gets reloaded and walked. This is an excerpt of __kvm_pgtable_visit():
> > > 
> > > Sorry, I was musing about the semantics after adding something to limit
> > > the maximum level. I don't dispute what the current code would do.
> > > 
> > > > Example:
> > > > - Split this level-1 leave:
> > > >   - Walker creates the whole structure up to given level (currently 3)
> > > >   - Walker returns, gets reloaded, table detected, go down on that one
> > > >   - Level 2 entries walked (which is unnecessary)
> > > > 
> > > > Please let me know if I am misunderstanding something.
> > > 
> > > I just don't grok why this would happen if we limited the maximum level
> > > to '2' _and_ said we only wanted to visit the leaf entries. In that
> > > case, I wouldn't expect to descend into any of the L2 table entries
> > > (because that would imply going beyond level 2) and I wouldn't expect to
> > > be called for the table entries either (because we're only interested in
> > > leaves).
> > 
> > Agree, if we specify to skip level-3 entries, it would only walk up to 
> > level-2 entries, but take above example in detail:
> > - Split these level-1 leaves, up to level-3 leaves (regular)
> >   - INFO: kvm_pgtable_walk will call walker:
> >     - only up to level-2 entries (skip level-3)
> >     - only on leaf entries
> >   - Walk first level-1 leaf, calls walker
> >     - walker will split the level-1 leaf in level-3 leaves
> >     - walker return from that first level-1 leaf
> >   - level-1 leaf is reloaded as a table
> >     - level-2 entries of that table are also walked (unnecessary)
> >     - on each of the level-2 table entries, level-3 entries are skipped
> > 
> > To avoid the unecessary walk of the level-2 entries above, we would need to 
> > specify 'skip level-2' that could be an issue if we have a mix of level-1 
> > and level-2 leaves, as the level-2 leaves in that case would not be split.
> > 
> > That's why I suggest something like "skip recently created table" as a flag 
> > as well, so we can guarantee no newly created table gets walked 
> > unecessarily.
> > 
> > Please help me if I am missing something important.
> 
> I'm not sure the added complexity of handling this case perfectly results
> in a measurable performance improvement. Just avoiding the level 3
> tables would be an exponential reduction (~ 512-8192x) in the number of
> walk steps.
> 

Humm, I agree most of the improvement is due to skipping level-3 walks.

The added complexity would be, appart from a new flag, just a single line 
change:

---
        if (!table && (ctx.flags & KVM_PGTABLE_WALK_LEAF)) {
                ret = kvm_pgtable_visitor_cb(data, &ctx, KVM_PGTABLE_WALK_LEAF);
-               reload = true;
+               /* Do not reload if flagged to skip new table */
+               reload = !(ctx.flags & KVM_PGTABLE_WALK_SKIP_NEW_TABLE);
        }
--- 

Even then, do you think it would not be worth having it?

Thanks!
Leo
 


  reply	other threads:[~2026-05-20 11:50 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 19:59 [RFC PATCH 0/2] Optimize S2 page splitting Leonardo Bras
2026-05-15 19:59 ` [RFC PATCH 1/2] KVM: arm64: Introduce S2 walker SKIP return options Leonardo Bras
2026-05-18  7:22   ` Oliver Upton
2026-05-18  8:52     ` Will Deacon
2026-05-18 13:45       ` Leonardo Bras
2026-05-19 12:43         ` Will Deacon
2026-05-19 12:56           ` Leonardo Bras
2026-05-19 13:15             ` Will Deacon
2026-05-19 14:35               ` Leonardo Bras
2026-05-19 21:21                 ` Oliver Upton
2026-05-20 11:49                   ` Leonardo Bras [this message]
2026-05-15 19:59 ` [RFC PATCH 2/2] KVM: arm64: Improve splitting performance by using SKIP return values Leonardo Bras
2026-05-16  9:15 ` [RFC PATCH 0/2] Optimize S2 page splitting Marc Zyngier
2026-05-18 14:09   ` Leonardo Bras

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ag2f5C3i3LBGSPS5@devkitleo \
    --to=leo.bras@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oupton@kernel.org \
    --cc=rananta@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox