From: Alexandru Elisei <alexandru.elisei@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oliver.upton@linux.dev>,
Zenghui Yu <yuzenghui@huawei.com>,
Joey Gouly <joey.gouly@arm.com>
Subject: Re: [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation
Date: Mon, 15 Jul 2024 16:30:19 +0100 [thread overview]
Message-ID: <ZpVAi3dqOOysMMnE@raptor> (raw)
In-Reply-To: <874j8wp1hx.wl-maz@kernel.org>
Hi Marc,
On Thu, Jul 11, 2024 at 01:16:42PM +0100, Marc Zyngier wrote:
> On Thu, 11 Jul 2024 11:56:13 +0100,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 08, 2024 at 05:57:58PM +0100, Marc Zyngier wrote:
> > > In order to plug the brokenness of our current AT implementation,
> > > we need a SW walker that is going to... err.. walk the S1 tables
> > > and tell us what it finds.
> > >
> > > Of course, it builds on top of our S2 walker, and share similar
> > > concepts. The beauty of it is that since it uses kvm_read_guest(),
> > > it is able to bring back pages that have been otherwise evicted.
> > >
> > > This is then plugged in the two AT S1 emulation functions as
> > > a "slow path" fallback. I'm not sure it is that slow, but hey.
> > > [..]
> > > switch (op) {
> > > case OP_AT_S1E1RP:
> > > case OP_AT_S1E1WP:
> > > + retry_slow = false;
> > > fail = check_at_pan(vcpu, vaddr, &par);
> > > break;
> > > default:
> > > goto nopan;
> > > }
> >
> > For context, this is what check_at_pan() does:
> >
> > static int check_at_pan(struct kvm_vcpu *vcpu, u64 vaddr, u64 *res)
> > {
> > u64 par_e0;
> > int error;
> >
> > /*
> > * For PAN-involved AT operations, perform the same translation,
> > * using EL0 this time. Twice. Much fun.
> > */
> > error = __kvm_at(OP_AT_S1E0R, vaddr);
> > if (error)
> > return error;
> >
> > par_e0 = read_sysreg_par();
> > if (!(par_e0 & SYS_PAR_EL1_F))
> > goto out;
> >
> > error = __kvm_at(OP_AT_S1E0W, vaddr);
> > if (error)
> > return error;
> >
> > par_e0 = read_sysreg_par();
> > out:
> > *res = par_e0;
> > return 0;
> > }
> >
> > I'm having a hard time understanding why KVM is doing both AT S1E0R and AT S1E0W
> > regardless of the type of the access (read/write) in the PAN-aware AT
> > instruction. Would you mind elaborating on that?
>
> Because that's the very definition of an AT S1E1{W,R}P instruction
> when PAN is set. If *any* EL0 permission is set, then the translation
> must equally fail. Just like a load or a store from EL1 would fail if
> any EL0 permission is set when PSTATE.PAN is set.
>
> Since we cannot check for both permissions at once, we do it twice.
> It is worth noting that we don't quite handle the PAN3 case correctly
> (because we can't retrieve the *execution* property using AT). I'll
> add that to the list of stuff to fix.
>
> >
> > > + if (fail) {
> > > + vcpu_write_sys_reg(vcpu, SYS_PAR_EL1_F, PAR_EL1);
> > > + goto nopan;
> > > + }
> > > [..]
> > > + if (par & SYS_PAR_EL1_F) {
> > > + u8 fst = FIELD_GET(SYS_PAR_EL1_FST, par);
> > > +
> > > + /*
> > > + * If we get something other than a permission fault, we
> > > + * need to retry, as we're likely to have missed in the PTs.
> > > + */
> > > + if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
> > > + retry_slow = true;
> > > + } else {
> > > + /*
> > > + * The EL0 access succeded, but we don't have the full
> > > + * syndrom information to synthetize the failure. Go slow.
> > > + */
> > > + retry_slow = true;
> > > + }
> >
> > This is what PSTATE.PAN controls:
> >
> > If the Effective value of PSTATE.PAN is 1, then a privileged data access from
> > any of the following Exception levels to a virtual memory address that is
> > accessible to data accesses at EL0 generates a stage 1 Permission fault:
> >
> > - A privileged data access from EL1.
> > - If HCR_EL2.E2H is 1, then a privileged data access from EL2.
> >
> > With that in mind, I am really struggling to understand the logic.
>
> I don't quite see what you don't understand, you'll have to be more
> precise. Are you worried about the page tables we're looking at, the
> value of PSTATE.PAN, the permission fault, or something else?
>
> It also doesn't help that you're looking at the patch that contains
> the integration with the slow-path, which is pretty hard to read (I
> have a reworked version that's a bit better). You probably want to
> look at the "fast" path alone.
I was referring to checking both unprivileged read and write permissions.
And you are right, sorry, I managed to get myself terribly confused. For
completeness sake, this matches AArch64.S1DirectBasePermissions(), where if PAN
&& (UnprivRead || UnprivWrite) then PrivRead = False and PrivWrite = False. So
you need to check that both UnprivRead and UnprivWrite are false for the PAN
variants of AT to succeed.
>
> >
> > If AT S1E0{R,W} (from check_at_pan()) failed, doesn't that mean that the virtual
> > memory address is not accessible to EL0? Add that to the fact that the AT
> > S1E1{R,W} (from the beginning of __kvm_at_s1e01()) succeeded, doesn't that mean
> > that AT S1E1{R,W}P should succeed, and furthermore the PAR_EL1 value should be
> > the one KVM got from AT S1E1{R,W}?
>
> There are plenty of ways for AT S1E0 to fail when AT S1E1 succeeded:
>
> - no EL0 permission: that's the best case, and the PAR_EL1 obtained
> from the AT S1E1 is the correct one. That's what we return.
Yes, that is correct, the place where VCPUs PAR_EL1 register is set is far
enough from this code that I didn't make the connection.
>
> - The EL0 access failed, but for another reason than a permission
> fault. This contradicts the EL1 walk, and is a sure sign that
> someone is playing behind our back. We fail.
>
> - exception from AT S1E0: something went wrong (again the guest
> playing with the PTs behind our back). We fail as well.
>
> Do you at least agree with these as goals? If you do, what in
> the implementation does not satisfy these goals? If you don't, what in
> these goals seem improper to you?
I agree with the goals.
In this patch, if I'm reading the code right (and I'm starting to doubt myself)
if PAR_EL1.F is set and PAR_EL1 doesn't indicate a permissions fault, then KVM
falls back to walking the S1 tables:
if (par & SYS_PAR_EL1_F) {
u8 fst = FIELD_GET(SYS_PAR_EL1_FST, par);
/*
* If we get something other than a permission fault, we
* need to retry, as we're likely to have missed in the PTs.
*/
if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
retry_slow = true;
}
I suppose that's because KVM cannot distinguish between two very different
reasons for AT failing: 1, because of something being wrong with the stage 1
tables when the AT S1E0* instruction was executed and 2, because of missing
entries at stage 2, as per the comment. Is that correct?
Thanks,
Alex
next prev parent reply other threads:[~2024-07-15 15:31 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-25 13:34 [PATCH 00/12] KVM: arm64: nv: Add support for address translation instructions Marc Zyngier
2024-06-25 13:35 ` [PATCH 01/12] arm64: Add missing APTable and TCR_ELx.HPD masks Marc Zyngier
2024-07-12 8:32 ` Anshuman Khandual
2024-07-13 8:04 ` Marc Zyngier
2024-06-25 13:35 ` [PATCH 02/12] arm64: Add PAR_EL1 field description Marc Zyngier
2024-07-12 7:06 ` Anshuman Khandual
2024-07-13 7:56 ` Marc Zyngier
2024-06-25 13:35 ` [PATCH 03/12] KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor Marc Zyngier
2024-06-25 13:35 ` [PATCH 04/12] KVM: arm64: nv: Honor absence of FEAT_PAN2 Marc Zyngier
2024-07-12 8:40 ` Anshuman Khandual
2024-06-25 13:35 ` [PATCH 05/12] KVM: arm64: make kvm_at() take an OP_AT_* Marc Zyngier
2024-07-12 8:52 ` Anshuman Khandual
2024-06-25 13:35 ` [PATCH 06/12] KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}[P] Marc Zyngier
2024-06-25 13:35 ` [PATCH 07/12] KVM: arm64: nv: Add basic emulation of AT S1E2{R,W} Marc Zyngier
2024-06-25 13:35 ` [PATCH 08/12] KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W} Marc Zyngier
2024-07-18 15:10 ` Alexandru Elisei
2024-07-20 9:49 ` Marc Zyngier
2024-07-22 10:33 ` Alexandru Elisei
2024-06-25 13:35 ` [PATCH 09/12] KVM: arm64: nv: Make ps_to_output_size() generally available Marc Zyngier
2024-07-08 16:28 ` [PATCH 00/12] KVM: arm64: nv: Add support for address translation instructions Alexandru Elisei
2024-07-08 17:00 ` Marc Zyngier
2024-07-08 16:57 ` [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation Marc Zyngier
2024-07-08 16:57 ` [PATCH 11/12] KVM: arm64: nv: Plumb handling of AT S1* traps from EL2 Marc Zyngier
2024-07-08 16:58 ` [PATCH 12/12] KVM: arm64: nv: Add support for FEAT_ATS1A Marc Zyngier
2024-07-10 15:12 ` [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation Alexandru Elisei
2024-07-11 8:05 ` Marc Zyngier
2024-07-11 10:56 ` Alexandru Elisei
2024-07-11 12:16 ` Marc Zyngier
2024-07-15 15:30 ` Alexandru Elisei [this message]
2024-07-18 11:37 ` Marc Zyngier
2024-07-18 15:16 ` Alexandru Elisei
2024-07-20 13:49 ` Marc Zyngier
2024-07-22 10:53 ` Alexandru Elisei
2024-07-22 15:25 ` Marc Zyngier
2024-07-23 8:57 ` Alexandru Elisei
2024-07-25 14:16 ` Alexandru Elisei
2024-07-25 14:30 ` Marc Zyngier
2024-07-25 15:13 ` Alexandru Elisei
2024-07-25 15:33 ` Marc Zyngier
2024-07-29 15:26 ` Alexandru Elisei
2024-07-31 8:55 ` Marc Zyngier
2024-07-31 9:53 ` Alexandru Elisei
2024-07-31 10:18 ` Marc Zyngier
2024-07-31 10:28 ` Alexandru Elisei
2024-07-31 14:33 ` Alexandru Elisei
2024-07-31 15:43 ` Marc Zyngier
2024-07-31 16:05 ` Alexandru Elisei
2024-07-31 10:05 ` [PATCH 00/12] KVM: arm64: nv: Add support for address translation instructions Alexandru Elisei
2024-07-31 11:02 ` Marc Zyngier
2024-07-31 14:19 ` Alexandru Elisei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZpVAi3dqOOysMMnE@raptor \
--to=alexandru.elisei@arm.com \
--cc=james.morse@arm.com \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=suzuki.poulose@arm.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox