All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oliver Upton <oupton@kernel.org>
To: Leonardo Bras <leo.bras@arm.com>
Cc: Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Fuad Tabba <tabba@google.com>,
	Raghavendra Rao Ananta <rananta@google.com>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] KVM: arm64: Introduce S2 walker SKIP return options
Date: Tue, 19 May 2026 14:21:41 -0700	[thread overview]
Message-ID: <agzUZS1l-uHQoodG@kernel.org> (raw)
In-Reply-To: <agx1J1tQFn0cdtBf@devkitleo>

On Tue, May 19, 2026 at 03:35:19PM +0100, Leonardo Bras wrote:
> On Tue, May 19, 2026 at 02:15:41PM +0100, Will Deacon wrote:
> > On Tue, May 19, 2026 at 01:56:48PM +0100, Leonardo Bras wrote:
> > > On Tue, May 19, 2026 at 01:43:37PM +0100, Will Deacon wrote:
> > > > > > I was wondering along similar lines, but maybe it would be useful just
> > > > > > to pass a maximum level to the walker logic? That feels like the most
> > > > > > general case without complicating the existing logic.

FWIW, I had considered this too but decided that it requires a bit more
churn since we cannot rely on zero initialization in the existing
callsites (level 0 is a valid level).

But that's extremely minor.

> > > > > This proposal seems simpler for me to understand, and indeed looks like a 
> > > > > better solution than what I have proposed, taking care of  the 
> > > > > 'already split' case with better performance, as it don't even walk a 
> > > > > single level-3 entry. 
> > > > > 
> > > > > On the 'splitting' case, it also works flawlessly if the memory is given in 
> > > > > level-2 blocks. There is only one case that I would like to address here:
> > > > > 
> > > > > - Memory given in level-1 blocks (say 1GB)
> > > > > - Walker flag says 'walk down to level-2 only'
> > > > > - Split Walker on level-1 will break page down to (up to) level-3 entries.
> > > > > - Walker will continue to be called on level-2 entries, even though it's 
> > > > >   not necessary.
> > > > 
> > > > If you're only visiting leaves, why would it be called on the level-2
> > > > table entries?
> > > > 
> > > 
> > > Because once the leaf is turned into a table by the splitting walker, it 
> > > gets reloaded and walked. This is an excerpt of __kvm_pgtable_visit():
> > 
> > Sorry, I was musing about the semantics after adding something to limit
> > the maximum level. I don't dispute what the current code would do.
> > 
> > > Example:
> > > - Split this level-1 leave:
> > >   - Walker creates the whole structure up to given level (currently 3)
> > >   - Walker returns, gets reloaded, table detected, go down on that one
> > >   - Level 2 entries walked (which is unnecessary)
> > > 
> > > Please let me know if I am misunderstanding something.
> > 
> > I just don't grok why this would happen if we limited the maximum level
> > to '2' _and_ said we only wanted to visit the leaf entries. In that
> > case, I wouldn't expect to descend into any of the L2 table entries
> > (because that would imply going beyond level 2) and I wouldn't expect to
> > be called for the table entries either (because we're only interested in
> > leaves).
> 
> Agree, if we specify to skip level-3 entries, it would only walk up to 
> level-2 entries, but take above example in detail:
> - Split these level-1 leaves, up to level-3 leaves (regular)
>   - INFO: kvm_pgtable_walk will call walker:
>     - only up to level-2 entries (skip level-3)
>     - only on leaf entries
>   - Walk first level-1 leaf, calls walker
>     - walker will split the level-1 leaf in level-3 leaves
>     - walker return from that first level-1 leaf
>   - level-1 leaf is reloaded as a table
>     - level-2 entries of that table are also walked (unnecessary)
>     - on each of the level-2 table entries, level-3 entries are skipped
> 
> To avoid the unecessary walk of the level-2 entries above, we would need to 
> specify 'skip level-2' that could be an issue if we have a mix of level-1 
> and level-2 leaves, as the level-2 leaves in that case would not be split.
> 
> That's why I suggest something like "skip recently created table" as a flag 
> as well, so we can guarantee no newly created table gets walked 
> unecessarily.
> 
> Please help me if I am missing something important.

I'm not sure the added complexity of handling this case perfectly results
in a measurable performance improvement. Just avoiding the level 3
tables would be an exponential reduction (~ 512-8192x) in the number of
walk steps.

Thanks,
Oliver


  reply	other threads:[~2026-05-19 21:21 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 19:59 [RFC PATCH 0/2] Optimize S2 page splitting Leonardo Bras
2026-05-15 19:59 ` [RFC PATCH 1/2] KVM: arm64: Introduce S2 walker SKIP return options Leonardo Bras
2026-05-18  7:22   ` Oliver Upton
2026-05-18  8:52     ` Will Deacon
2026-05-18 13:45       ` Leonardo Bras
2026-05-19 12:43         ` Will Deacon
2026-05-19 12:56           ` Leonardo Bras
2026-05-19 13:15             ` Will Deacon
2026-05-19 14:35               ` Leonardo Bras
2026-05-19 21:21                 ` Oliver Upton [this message]
2026-05-20 11:49                   ` Leonardo Bras
2026-05-28 16:03                   ` Leonardo Bras
2026-05-15 19:59 ` [RFC PATCH 2/2] KVM: arm64: Improve splitting performance by using SKIP return values Leonardo Bras
2026-05-16  9:15 ` [RFC PATCH 0/2] Optimize S2 page splitting Marc Zyngier
2026-05-18 14:09   ` Leonardo Bras
2026-05-28 17:00   ` Leonardo Bras
2026-05-29 12:25     ` Marc Zyngier
2026-05-29 12:56       ` Leonardo Bras

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=agzUZS1l-uHQoodG@kernel.org \
    --to=oupton@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=leo.bras@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=rananta@google.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.