From: Marc Zyngier <maz@kernel.org>
To: Quentin Perret <qperret@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>,
Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org, Leo Yan <leo.yan@arm.com>
Subject: Re: [PATCH] KVM: arm64: Adjust range correctly during host stage-2 faults
Date: Thu, 05 Mar 2026 13:22:33 +0000
Message-ID: <86o6l276na.wl-maz@kernel.org>
In-Reply-To: <fdqyxxlu2n4hngowq2ksllhwew33swrsj6mqpeyzb7vaofzuzf@ks7z6dnatyoo>

On Thu, 05 Mar 2026 13:13:40 +0000,
Quentin Perret <qperret@google.com> wrote:
>
> On Thursday 05 Mar 2026 at 10:55:42 (+0000), Marc Zyngier wrote:
> > On Wed, 04 Mar 2026 18:55:04 +0000,
> > Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Wed, 25 Jun 2025 11:55:48 +0100,
> > > Quentin Perret <qperret@google.com> wrote:
> > > >
> > > > host_stage2_adjust_range() tries to find the largest block mapping that
> > > > fits within a memory or mmio region (represented by a kvm_mem_range in
> > > > this function) during host stage-2 faults under pKVM. To do so, it walks
> > > > the host stage-2 page-table, finds the faulting PTE and its level, and
> > > > then progressively increments the level until it finds a granule of the
> > > > appropriate size. However, the condition in the loop implementing the
> > > > above is broken as it checks kvm_level_supports_block_mapping() for the
> > > > next level instead of the current, so pKVM may attempt to map a region
> > > > larger than can be covered with a single block.
> > > >
> > > > This is not a security problem and is quite rare in practice (the
> > > > kvm_mem_range check usually forces host_stage2_adjust_range() to choose a
> > > > smaller granule), but this is clearly not the expected behaviour.
> > > >
> > > > Refactor the loop to fix the bug and improve readability.
> > > >
> > > > Fixes: c4f0935e4d95 ("KVM: arm64: Optimize host memory aborts")
> > > > Signed-off-by: Quentin Perret <qperret@google.com>
> > >
> > > This patch prevents my O6 board from booting in protected mode as of
> > > > e728e705802fe. Reverting it on top of 7.0-rc2 makes the box work again.
> > >
> > > I haven't quite worked out why though. The hack below makes it work,
> > > but implies that we can get ranges that are smaller than a page. That
> > > feels unlikely, but I'm not sure we can rule it out (the kernel page
> > > size could be pretty large anyway).
> >
> > Having spent a bit of time on this, I'm pretty sure this is the cause
> > of the issue. The memblock tables are as such:
> >
> > maz@cosmic-debris:~/vminstall$ sudo cat /sys/kernel/debug/memblock/memory
> > 0: 0x0000000080000000..0x00000000843fffff 0 NOMAP
> > 1: 0x0000000084400000..0x00000000845fffff 0 NONE
> > 2: 0x0000000085000000..0x000000009fffffff 0 NONE
> > 3: 0x00000000a0000000..0x00000000a7ffffff 0 NOMAP
> > 4: 0x00000000a8000000..0x00000000fffbffff 0 NONE
> > 5: 0x00000000fffc0000..0x00000000fffeffff 0 NOMAP
> > 6: 0x00000000ffff0000..0x00000000ffffdfff 0 NONE
> > 7: 0x00000000ffffe000..0x00000000ffffffff 0 NOMAP
> > 8: 0x0000000100000000..0x00000007fe4effff 0 NONE
> > 9: 0x00000007fe4f0000..0x00000007fedeffff 0 NOMAP
> > 10: 0x00000007fedf0000..0x00000007ffffffff 0 NONE
> > 11: 0x0000008000000000..0x000000807a290fff 0 NONE
> > 12: 0x000000807a291000..0x000000807a2927b2 0 NOMAP
> > 13: 0x000000807a2927b3..0x000000807fffffff 0 NONE
>
> Ouch, these last few are 'interesting', oh well :-)
>
> > Any access to page 0x000000807a292000 is going to blow up in your
> > face, because there is no way you can map this and still respect the
> > memblock boundary. Same thing for any region that is smaller than
> > PAGE_SIZE, or not aligned on PAGE_SIZE. Which is even more annoying.
> >
> > I'm starting to think that my hack is not that idiotic in the end...
>
> Yes, I can't think of anything better TBH. We've already asserted that
> we don't have an annotated PTE here, and at the last level we're
> guaranteed not to accidentally map a neighbouring private region, so yes
> we should just proceed with a page-aligned mapping there.
>
> Want me to post a proper patch or do you already have one in stock?

I have that ready, but I wanted your feedback on it before posting it.
I'll send that now.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.
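
Note for readers of the archive: the arithmetic behind "any access to page
0x000000807a292000 is going to blow up" can be checked with a small script.
This is only a sketch. It assumes a 4 KiB page size by default (the thread
notes the kernel page size could be larger), copies entries 11-13 from the
memblock dump quoted above, and uses helper names (page_of, regions_touching)
that are made up for illustration and are not kernel functions.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages; pass another size to compare

# (start, inclusive end, flags) for entries 11-13 of the memblock dump
REGIONS = [
    (0x0000008000000000, 0x000000807A290FFF, "NONE"),
    (0x000000807A291000, 0x000000807A2927B2, "NOMAP"),
    (0x000000807A2927B3, 0x000000807FFFFFFF, "NONE"),
]


def page_of(addr, page_size=PAGE_SIZE):
    """Return the (start, inclusive end) of the page containing addr."""
    start = addr & ~(page_size - 1)
    return start, start + page_size - 1


def regions_touching(addr, page_size=PAGE_SIZE, regions=REGIONS):
    """List every region that overlaps the page containing addr."""
    start, end = page_of(addr, page_size)
    return [r for r in regions if r[0] <= end and start <= r[1]]


def page_fits_one_region(addr, page_size=PAGE_SIZE, regions=REGIONS):
    """True if the whole page lies inside a single region."""
    start, end = page_of(addr, page_size)
    return any(r[0] <= start and end <= r[1] for r in regions)


if __name__ == "__main__":
    for r in regions_touching(0x807A292000):
        print(f"{r[0]:#018x}..{r[1]:#018x} {r[2]}")
```

With 4 KiB pages, the page at 0x807a292000 overlaps both the NOMAP entry and
the NONE entry that follows it, so no page-sized mapping can stay within a
single memblock region there.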
Thread overview:
2025-06-25 10:55 [PATCH] KVM: arm64: Adjust range correctly during host stage-2 faults Quentin Perret
2025-06-26  7:53 ` Marc Zyngier
2026-03-04 18:55 ` Marc Zyngier
2026-03-05 10:55   ` Marc Zyngier
2026-03-05 13:13     ` Quentin Perret
2026-03-05 13:22       ` Marc Zyngier
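
Note for readers of the archive: the level walk described in the original
commit message, together with the last-level behaviour agreed on in this
thread, can be modelled roughly as follows. This is a simplified sketch and
not the kernel code or the patch that was posted. It assumes a 4 KiB granule
with four levels (0 to 3), 9 address bits per level, and block mappings
allowed from level 1. All names are hypothetical stand-ins; only the idea of
checking block support for the current level (rather than the next one) and
of falling back to a page-aligned mapping at the last level comes from the
thread.

```python
PAGE_SHIFT = 12       # assumption: 4 KiB granule
LAST_LEVEL = 3        # assumption: levels 0..3
MIN_BLOCK_LEVEL = 1   # assumption: no block mappings at level 0


def granule(level):
    """Size in bytes covered by one entry at the given level."""
    return 1 << (PAGE_SHIFT + 9 * (LAST_LEVEL - level))


def supports_block(level):
    """Stand-in for a 'does this level support block mappings' check."""
    return level >= MIN_BLOCK_LEVEL


def adjust_range(addr, rng_start, rng_end, fault_level):
    """Pick the largest granule around addr that fits in [rng_start, rng_end).

    Walks from the faulting level towards the last level, testing the
    CURRENT level on each iteration. Returns (start, end, level).
    """
    for level in range(fault_level, LAST_LEVEL + 1):
        if not supports_block(level):
            continue
        size = granule(level)
        start = addr & ~(size - 1)
        end = start + size
        if rng_start <= start and end <= rng_end:
            return start, end, level

    # Nothing fits, e.g. the range is smaller than a page or not page
    # aligned. Model the outcome discussed in the thread: proceed with a
    # page-aligned mapping at the last level.
    size = granule(LAST_LEVEL)
    start = addr & ~(size - 1)
    return start, start + size, LAST_LEVEL


if __name__ == "__main__":
    # Entry 12 of the memblock dump, expressed with an exclusive end.
    start, end, level = adjust_range(0x807A292000, 0x807A291000,
                                     0x807A2927B3, fault_level=0)
    print(f"{start:#x}..{end:#x} at level {level}")
```

For the sub-page NOMAP entry, no granule fits inside the range, so the model
ends up with a single page-aligned mapping at the last level. For a large,
well-aligned range it stops at the first level whose granule fits.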