From: Marc Zyngier <maz@kernel.org>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.cs.columbia.edu, kvmarm@lists.linux.dev,
kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Ard Biesheuvel <ardb@kernel.org>, Will Deacon <will@kernel.org>,
Quentin Perret <qperret@google.com>,
stable@vger.kernel.org
Subject: Re: [PATCH 1/3] KVM: arm64: Fix S1PTW handling on RO memslots
Date: Wed, 21 Dec 2022 09:35:06 +0000 [thread overview]
Message-ID: <86pmcdaylx.wl-maz@kernel.org> (raw)
In-Reply-To: <Y6IteDoK406o9pM+@google.com>
On Tue, 20 Dec 2022 21:47:36 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> Hi Marc,
>
> On Tue, Dec 20, 2022 at 08:09:21PM +0000, Marc Zyngier wrote:
> > A recent development on the EFI front has resulted in guests having
> > their page tables baked in the firmware binary, and mapped into
> > the IPA space as part as a read-only memslot.
>
> as part of a
>
> > Not only this is legitimate, but it also results in added security,
> > so thumbs up. However, this clashes mildly with our handling of a S1PTW
> > as a write to correctly handle AF/DB updates to the S1 PTs, and results
> > in the guest taking an abort it won't recover from (the PTs mapping the
> > vectors will suffer freom the same problem...).
>
> To be clear, the read-only page tables already have the AF set,
> right? They certainly must, or else the guest isn't getting far :)
Yes, the guest definitely has the AF set in the PT, and is not trying
to use the HW-assisted AF (which obviously wouldn't work).
>
> I understand you're trying to describe _why_ we promote S1PTW to a
> write, but doing it inline with the context of the EFI issue makes
> it slightly unclear. Could you break these ideas up into two
> paragraphs and maybe spell out the fault conditions a bit more?
>
> A recent development on the EFI front has resulted in guests having
> their page tables baked in the firmware binary, and mapped into the
> IPA space as part of a read-only memslot. Not only is this legitimate,
> but it also results in added security, so thumbs up.
>
> It is possible to take an S1PTW translation fault if the S1 PTs are
> unmapped at stage-2. However, KVM unconditionally treats S1PTW as a
> write to correctly handle hardware AF/DB updates to the S1 PTs.
> Furthermore, KVM injects a data abort into the guest for S1PTW writes.
> In the aforementioned case this results in the guest taking an abort
> it won't recover from, as the S1 PTs mapping the vectors suffer from
> the same problem.
>
> Dunno, maybe I stink at reading which is why I got confused in the
> first place.
Nothing wrong with you, just that my write-up is indeed sloppy. I'll
copy paste the above, thanks!
>
> > So clearly our handling is... wrong.
> >
> > Instead, switch to a two-pronged approach:
> >
> > - On S1PTW translation fault, handle the fault as a read
> >
> > - On S1PTW permission fault, handle the fault as a write
> >
> > This is of no consequence to SW that *writes* to its PTs (the write
> > will trigger a non-S1PTW fault), and SW that uses RO PTs will not
> > use AF/DB anyway, as that'd be wrong.
> >
> > Only in the case described in c4ad98e4b72c ("KVM: arm64: Assume write
> > fault on S1PTW permission fault on instruction fetch") do we end-up
> > with two back-to-back faults (page being evicted and faulted back).
> > I don't think this is a case worth optimising for.
> >
> > Fixes: c4ad98e4b72c ("KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch")
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: stable@vger.kernel.org
> > ---
> > arch/arm64/include/asm/kvm_emulate.h | 22 ++++++++++++++++++++--
> > 1 file changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index 9bdba47f7e14..fd6ad8b21f85 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -373,8 +373,26 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
> >
> > static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
> > {
> > - if (kvm_vcpu_abt_iss1tw(vcpu))
> > - return true;
> > + if (kvm_vcpu_abt_iss1tw(vcpu)) {
> > + /*
> > + * Only a permission fault on a S1PTW should be
> > + * considered as a write. Otherwise, page tables baked
> > + * in a read-only memslot will result in an exception
> > + * being delivered in the guest.
>
> Somewhat of a tangent, but:
>
> Aren't we somewhat unaligned with the KVM UAPI by injecting an
> exception in this case? I know we've been doing it for a while, but it
> flies in the face of the rules outlined in the
> KVM_SET_USER_MEMORY_REGION documentation.
That's an interesting point, and I certainly haven't considered that
for faults introduced by page table walks.
I'm not sure what userspace can do with that though. The problem is
that this is a write for which we don't have useful data: although we
know it is a page-table walker access, we don't know what it was about
to write. The instruction that caused the write is meaningless (it
could either be a load, a store, or an instruction fetch). How do you
populate the data[] field then?
If anything, this is closer to KVM_EXIT_ARM_NISV, for which we give
userspace the full ESR and ask it to sort it out. I doubt it will be
able to, but hey, maybe it is worth a shot. This would need to be a
different exit reason though, as NISV is explicitly for non-memslot
stuff.
In any case, the documentation for KVM_SET_USER_MEMORY_REGION needs to
reflect the fact that KVM_EXIT_MMIO cannot represent a fault due to a
S1 PTW.
>
> > + * The drawback is that we end-up fauling twice if the
>
> typo: s/fauling/faulting/
>
> > + * guest is using any of HW AF/DB: a translation fault
> > + * to map the page containing the PT (read only at
> > + * first), then a permission fault to allow the flags
> > + * to be set.
> > + */
> > + switch (kvm_vcpu_trap_get_fault_type(vcpu)) {
> > + case ESR_ELx_FSC_PERM:
> > + return true;
> > + default:
> > + return false;
> > + }
> > + }
> >
> > if (kvm_vcpu_trap_is_iabt(vcpu))
> > return false;
> > --
> > 2.34.1
> >
>
> Besides the changelog/comment suggestions, the patch looks good to me.
>
> Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Thanks for the quick review! I'll wait a bit before respinning the
series, as I'd like to get closure on the UAPI point you have raised.
Cheers,
M.
--
Without deviation from the norm, progress is not possible.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-12-21 9:36 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-20 20:09 [PATCH 0/3] KVM: arm64: Fix handling of S1PTW S2 fault on RO memslots Marc Zyngier
2022-12-20 20:09 ` [PATCH 1/3] KVM: arm64: Fix S1PTW handling " Marc Zyngier
2022-12-20 21:47 ` Oliver Upton
2022-12-21 9:35 ` Marc Zyngier [this message]
2022-12-21 16:50 ` Oliver Upton
2022-12-21 17:53 ` Marc Zyngier
2022-12-21 18:26 ` Oliver Upton
2022-12-22 13:01 ` Ard Biesheuvel
2022-12-24 12:18 ` Marc Zyngier
2022-12-24 13:09 ` Ard Biesheuvel
2022-12-20 20:09 ` [PATCH 2/3] KVM: arm64: Handle S1PTW translation with TCR_HA set as a write Marc Zyngier
2022-12-21 16:46 ` Ricardo Koller
2022-12-21 17:43 ` Marc Zyngier
2022-12-23 0:33 ` Ricardo Koller
2022-12-21 17:46 ` Oliver Upton
2022-12-22 9:01 ` Marc Zyngier
2022-12-22 20:58 ` Oliver Upton
2022-12-23 1:00 ` Ricardo Koller
2022-12-24 11:59 ` Marc Zyngier
2022-12-20 20:09 ` [PATCH 3/3] KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_* Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86pmcdaylx.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=alexandru.elisei@arm.com \
--cc=ardb@kernel.org \
--cc=james.morse@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=oliver.upton@linux.dev \
--cc=qperret@google.com \
--cc=stable@vger.kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).