From: Alexandru Elisei <alexandru.elisei@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org, James Morse <james.morse@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oliver.upton@linux.dev>,
Zenghui Yu <yuzenghui@huawei.com>,
Joey Gouly <joey.gouly@arm.com>
Subject: Re: [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation
Date: Mon, 15 Jul 2024 16:30:19 +0100 [thread overview]
Message-ID: <ZpVAi3dqOOysMMnE@raptor> (raw)
In-Reply-To: <874j8wp1hx.wl-maz@kernel.org>
Hi Marc,
On Thu, Jul 11, 2024 at 01:16:42PM +0100, Marc Zyngier wrote:
> On Thu, 11 Jul 2024 11:56:13 +0100,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> >
> > Hi,
> >
> > On Mon, Jul 08, 2024 at 05:57:58PM +0100, Marc Zyngier wrote:
> > > In order to plug the brokenness of our current AT implementation,
> > > we need a SW walker that is going to... err.. walk the S1 tables
> > > and tell us what it finds.
> > >
> > > Of course, it builds on top of our S2 walker, and share similar
> > > concepts. The beauty of it is that since it uses kvm_read_guest(),
> > > it is able to bring back pages that have been otherwise evicted.
> > >
> > > This is then plugged in the two AT S1 emulation functions as
> > > a "slow path" fallback. I'm not sure it is that slow, but hey.
> > > [..]
> > > switch (op) {
> > > case OP_AT_S1E1RP:
> > > case OP_AT_S1E1WP:
> > > + retry_slow = false;
> > > fail = check_at_pan(vcpu, vaddr, &par);
> > > break;
> > > default:
> > > goto nopan;
> > > }
> >
> > For context, this is what check_at_pan() does:
> >
> > static int check_at_pan(struct kvm_vcpu *vcpu, u64 vaddr, u64 *res)
> > {
> > u64 par_e0;
> > int error;
> >
> > /*
> > * For PAN-involved AT operations, perform the same translation,
> > * using EL0 this time. Twice. Much fun.
> > */
> > error = __kvm_at(OP_AT_S1E0R, vaddr);
> > if (error)
> > return error;
> >
> > par_e0 = read_sysreg_par();
> > if (!(par_e0 & SYS_PAR_EL1_F))
> > goto out;
> >
> > error = __kvm_at(OP_AT_S1E0W, vaddr);
> > if (error)
> > return error;
> >
> > par_e0 = read_sysreg_par();
> > out:
> > *res = par_e0;
> > return 0;
> > }
> >
> > I'm having a hard time understanding why KVM is doing both AT S1E0R and AT S1E0W
> > regardless of the type of the access (read/write) in the PAN-aware AT
> > instruction. Would you mind elaborating on that?
>
> Because that's the very definition of an AT S1E1{W,R}P instruction
> when PAN is set. If *any* EL0 permission is set, then the translation
> must equally fail. Just like a load or a store from EL1 would fail if
> any EL0 permission is set when PSTATE.PAN is set.
>
> Since we cannot check for both permissions at once, we do it twice.
> It is worth noting that we don't quite handle the PAN3 case correctly
> (because we can't retrieve the *execution* property using AT). I'll
> add that to the list of stuff to fix.
>
> >
> > > + if (fail) {
> > > + vcpu_write_sys_reg(vcpu, SYS_PAR_EL1_F, PAR_EL1);
> > > + goto nopan;
> > > + }
> > > [..]
> > > + if (par & SYS_PAR_EL1_F) {
> > > + u8 fst = FIELD_GET(SYS_PAR_EL1_FST, par);
> > > +
> > > + /*
> > > + * If we get something other than a permission fault, we
> > > + * need to retry, as we're likely to have missed in the PTs.
> > > + */
> > > + if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
> > > + retry_slow = true;
> > > + } else {
> > > + /*
> > > + * The EL0 access succeded, but we don't have the full
> > > + * syndrom information to synthetize the failure. Go slow.
> > > + */
> > > + retry_slow = true;
> > > + }
> >
> > This is what PSTATE.PAN controls:
> >
> > If the Effective value of PSTATE.PAN is 1, then a privileged data access from
> > any of the following Exception levels to a virtual memory address that is
> > accessible to data accesses at EL0 generates a stage 1 Permission fault:
> >
> > - A privileged data access from EL1.
> > - If HCR_EL2.E2H is 1, then a privileged data access from EL2.
> >
> > With that in mind, I am really struggling to understand the logic.
>
> I don't quite see what you don't understand, you'll have to be more
> precise. Are you worried about the page tables we're looking at, the
> value of PSTATE.PAN, the permission fault, or something else?
>
> It also doesn't help that you're looking at the patch that contains
> the integration with the slow-path, which is pretty hard to read (I
> have a reworked version that's a bit better). You probably want to
> look at the "fast" path alone.
I was referring to checking both unprivileged read and write permissions.
And you are right, sorry, I managed to get myself terribly confused. For
completeness sake, this matches AArch64.S1DirectBasePermissions(), where if PAN
&& (UnprivRead || UnprivWrite) then PrivRead = False and PrivWrite = False. So
you need to check that both UnprivRead and UnprivWrite are false for the PAN
variants of AT to succeed.
>
> >
> > If AT S1E0{R,W} (from check_at_pan()) failed, doesn't that mean that the virtual
> > memory address is not accessible to EL0? Add that to the fact that the AT
> > S1E1{R,W} (from the beginning of __kvm_at_s1e01()) succeeded, doesn't that mean
> > that AT S1E1{R,W}P should succeed, and furthermore the PAR_EL1 value should be
> > the one KVM got from AT S1E1{R,W}?
>
> There are plenty of ways for AT S1E0 to fail when AT S1E1 succeeded:
>
> - no EL0 permission: that's the best case, and the PAR_EL1 obtained
> from the AT S1E1 is the correct one. That's what we return.
Yes, that is correct, the place where VCPUs PAR_EL1 register is set is far
enough from this code that I didn't make the connection.
>
> - The EL0 access failed, but for another reason than a permission
> fault. This contradicts the EL1 walk, and is a sure sign that
> someone is playing behind our back. We fail.
>
> - exception from AT S1E0: something went wrong (again the guest
> playing with the PTs behind our back). We fail as well.
>
> Do you at least agree with these as goals? If you do, what in
> the implementation does not satisfy these goals? If you don't, what in
> these goals seem improper to you?
I agree with the goals.
In this patch, if I'm reading the code right (and I'm starting to doubt myself)
if PAR_EL1.F is set and PAR_EL1 doesn't indicate a permissions fault, then KVM
falls back to walking the S1 tables:
if (par & SYS_PAR_EL1_F) {
u8 fst = FIELD_GET(SYS_PAR_EL1_FST, par);
/*
* If we get something other than a permission fault, we
* need to retry, as we're likely to have missed in the PTs.
*/
if ((fst & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_PERM)
retry_slow = true;
}
I suppose that's because KVM cannot distinguish between two very different
reasons for AT failing: 1, because of something being wrong with the stage 1
tables when the AT S1E0* instruction was executed and 2, because of missing
entries at stage 2, as per the comment. Is that correct?
Thanks,
Alex
next prev parent reply other threads:[~2024-07-15 15:30 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-25 13:34 [PATCH 00/12] KVM: arm64: nv: Add support for address translation instructions Marc Zyngier
2024-06-25 13:35 ` [PATCH 01/12] arm64: Add missing APTable and TCR_ELx.HPD masks Marc Zyngier
2024-07-12 8:32 ` Anshuman Khandual
2024-07-13 8:04 ` Marc Zyngier
2024-06-25 13:35 ` [PATCH 02/12] arm64: Add PAR_EL1 field description Marc Zyngier
2024-07-12 7:06 ` Anshuman Khandual
2024-07-13 7:56 ` Marc Zyngier
2024-06-25 13:35 ` [PATCH 03/12] KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor Marc Zyngier
2024-06-25 13:35 ` [PATCH 04/12] KVM: arm64: nv: Honor absence of FEAT_PAN2 Marc Zyngier
2024-07-12 8:40 ` Anshuman Khandual
2024-06-25 13:35 ` [PATCH 05/12] KVM: arm64: make kvm_at() take an OP_AT_* Marc Zyngier
2024-07-12 8:52 ` Anshuman Khandual
2024-06-25 13:35 ` [PATCH 06/12] KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}[P] Marc Zyngier
2024-06-25 13:35 ` [PATCH 07/12] KVM: arm64: nv: Add basic emulation of AT S1E2{R,W} Marc Zyngier
2024-06-25 13:35 ` [PATCH 08/12] KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W} Marc Zyngier
2024-07-18 15:10 ` Alexandru Elisei
2024-07-20 9:49 ` Marc Zyngier
2024-07-22 10:33 ` Alexandru Elisei
2024-06-25 13:35 ` [PATCH 09/12] KVM: arm64: nv: Make ps_to_output_size() generally available Marc Zyngier
2024-07-08 16:28 ` [PATCH 00/12] KVM: arm64: nv: Add support for address translation instructions Alexandru Elisei
2024-07-08 17:00 ` Marc Zyngier
2024-07-08 16:57 ` [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation Marc Zyngier
2024-07-08 16:57 ` [PATCH 11/12] KVM: arm64: nv: Plumb handling of AT S1* traps from EL2 Marc Zyngier
2024-07-08 16:58 ` [PATCH 12/12] KVM: arm64: nv: Add support for FEAT_ATS1A Marc Zyngier
2024-07-10 15:12 ` [PATCH 10/12] KVM: arm64: nv: Add SW walker for AT S1 emulation Alexandru Elisei
2024-07-11 8:05 ` Marc Zyngier
2024-07-11 10:56 ` Alexandru Elisei
2024-07-11 12:16 ` Marc Zyngier
2024-07-15 15:30 ` Alexandru Elisei [this message]
2024-07-18 11:37 ` Marc Zyngier
2024-07-18 15:16 ` Alexandru Elisei
2024-07-20 13:49 ` Marc Zyngier
2024-07-22 10:53 ` Alexandru Elisei
2024-07-22 15:25 ` Marc Zyngier
2024-07-23 8:57 ` Alexandru Elisei
2024-07-25 14:16 ` Alexandru Elisei
2024-07-25 14:30 ` Marc Zyngier
2024-07-25 15:13 ` Alexandru Elisei
2024-07-25 15:33 ` Marc Zyngier
2024-07-29 15:26 ` Alexandru Elisei
2024-07-31 8:55 ` Marc Zyngier
2024-07-31 9:53 ` Alexandru Elisei
2024-07-31 10:18 ` Marc Zyngier
2024-07-31 10:28 ` Alexandru Elisei
2024-07-31 14:33 ` Alexandru Elisei
2024-07-31 15:43 ` Marc Zyngier
2024-07-31 16:05 ` Alexandru Elisei
2024-07-31 10:05 ` [PATCH 00/12] KVM: arm64: nv: Add support for address translation instructions Alexandru Elisei
2024-07-31 11:02 ` Marc Zyngier
2024-07-31 14:19 ` Alexandru Elisei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZpVAi3dqOOysMMnE@raptor \
--to=alexandru.elisei@arm.com \
--cc=james.morse@arm.com \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=suzuki.poulose@arm.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.