linux-mips.vger.kernel.org archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Jim Mattson <jmattson@google.com>,
	alexandru.elisei@arm.com, anup@brainfault.org,
	 aou@eecs.berkeley.edu, atishp@atishpatra.org,
	borntraeger@linux.ibm.com,  chenhuacai@kernel.org,
	david@redhat.com, frankja@linux.ibm.com,  imbrenda@linux.ibm.com,
	james.morse@arm.com, kvm-riscv@lists.infradead.org,
	 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	 linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
	 linux-riscv@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
	 maz@kernel.org, oliver.upton@linux.dev, palmer@dabbelt.com,
	 paul.walmsley@sifive.com, pbonzini@redhat.com,
	suzuki.poulose@arm.com
Subject: Re: [PATCH v4 10/12] KVM: x86: never write to memory from kvm_vcpu_check_block()
Date: Wed, 13 Dec 2023 14:59:16 -0800
Message-ID: <ZXo3REB8g-ZecT2U@google.com>
In-Reply-To: <5ca5592b21131f515e296afae006e5bb28b1fb87.camel@redhat.com>

On Thu, Dec 14, 2023, Maxim Levitsky wrote:
> On Tue, 2023-12-12 at 07:28 -0800, Sean Christopherson wrote:
> > On Sun, Dec 10, 2023, Jim Mattson wrote:
> > > On Thu, Dec 7, 2023 at 8:21 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > Doh.  We got the less obvious cases and missed the obvious one.
> > > > 
> > > > Ugh, and we also missed a related mess in kvm_guest_apic_has_interrupt().  That
> > > > thing should really be folded into vmx_has_nested_events().
> > > > 
> > > > Good gravy.  And vmx_interrupt_blocked() does the wrong thing because that
> > > > specifically checks if L1 interrupts are blocked.
> > > > 
> > > > Compile tested only, and definitely needs to be chunked into multiple patches,
> > > > but I think something like this mess?
> > > 
> > > The proposed patch does not fix the problem. In fact, it messes things
> > > up so much that I don't get any test results back.
> > 
> > Drat.
> > 
> > > Google has an internal K-U-T test that demonstrates the problem. I
> > > will post it soon.
> > 
> > Received, I'll dig in soonish, though "soonish" might unfortunately mean
> > 2024.
> > 
> 
> Hi,
> 
> So this is what I think:
> 
> KVM does have kvm_guest_apic_has_interrupt() for this exact purpose,
> to check if nested APICv has a pending interrupt before halting.

For all intents and purposes, that is also what nested_ops->has_events() is for.  I
don't see any reason to have two APIs that do the same thing, and the call to
kvm_guest_apic_has_interrupt() is wrong in that it doesn't verify that IRQs are
enabled for _L2_.  That's why my preference is to fold the two together.

> However the problem is bigger - with APICv we have in essence 2 pending
> interrupt bitmaps - the PIR and the IRR, and to know if the guest has a
> pending interrupt one has in theory to copy PIR to IRR, then see if the max
> is larger than the current PPR.

Yeah, this is what my untested hack-a-patch tried to do.

> Since we don't want to write to guest memory,

The changelog is misleading/wrong.  Writing guest memory is ok; what isn't safe
is blocking or sleeping, i.e. KVM must not trigger a host page fault by
accessing a page that's been swapped out.  Read vs. write doesn't matter.

So KVM can safely read and write guest memory so long as it is already mapped by
kvm_vcpu_map() (or I suppose if we wrapped an access with pagefault_disable(),
but I can't think of a sane reason to do that).  E.g. nVMX can access a vCPU's
PID mapping, but synthesizing a nested VM-Exit will cause explosions on nSVM.
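
To make the rule above concrete, here is a minimal userspace sketch, not KVM
code.  struct guest_map, peek_mapped() and poke_mapped() are hypothetical names
invented for illustration; a plain pointer stands in for a page that has
already been mapped.  The point is only that reads and writes are treated
identically, and that a path which must not block bails out when the mapping
is absent instead of faulting the page in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a guest page.  'hva' is non-NULL only while the
 * page is already mapped, i.e. accessing it cannot fault or sleep.
 */
struct guest_map {
	uint32_t *hva;
};

/* Read one word, but only if the page is already mapped. */
static inline bool peek_mapped(const struct guest_map *map, size_t idx,
			       uint32_t *val)
{
	if (!map->hva)
		return false;	/* not mapped: give up rather than block */

	*val = map->hva[idx];
	return true;
}

/* Write one word under exactly the same condition as the read. */
static inline bool poke_mapped(struct guest_map *map, size_t idx,
			       uint32_t val)
{
	if (!map->hva)
		return false;	/* not mapped: give up rather than block */

	map->hva[idx] = val;
	return true;
}
```

A caller in a non-blocking path would treat a false return as "can't tell yet"
and defer the work to a context that is allowed to sleep.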

> and the IRR here resides in the guest memory, I guess we have to do a
> 'dry-run' version of 'vmx_complete_nested_posted_interrupt' and call it from
> kvm_guest_apic_has_interrupt().

nested_ops->has_events() is the much better fit, e.g. the naming won't get weird
and we can gate the whole thing on is_guest_mode().  Though we probably need a
wrapper to handle any commonalities between nVMX and nSVM.

> What do you think? I can prepare a patch for this.

As above, this is what I tried to do, sort of.  Though it's obviously broken.  We
don't need a full dry-run because KVM only needs to detect events that are unique
to L2, e.g. nVMX's preemption timer, MTF, and pending virtual interrupts (hmm,
I suspect nSVM's vNMI is broken too).  Things like INIT and SMI don't require
nested virtualization awareness because the event itself is tracked for the vCPU
as a whole.
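
For reference, the "is there a pending virtual interrupt" part of the check
discussed above can be sketched in plain C as below.  This is a userspace
illustration under stated assumptions, not the eventual KVM patch:
struct vec_bitmap, set_vector(), highest_vector() and l2_has_pending_irq() are
hypothetical names, and the two bitmaps are ordinary arrays rather than the
real PID and virtual-APIC page.  It computes the highest vector in PIR | IRR
without copying PIR into IRR, then compares priority classes (bits 7:4)
against PPR.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS	256
#define NR_WORDS	(NR_VECTORS / 32)

/* Hypothetical stand-in for a 256-bit interrupt bitmap (PIR or IRR). */
struct vec_bitmap {
	uint32_t w[NR_WORDS];
};

/* Mark a vector as pending in the given bitmap. */
static inline void set_vector(struct vec_bitmap *b, unsigned int vec)
{
	b->w[vec / 32] |= UINT32_C(1) << (vec % 32);
}

/*
 * Return the highest vector set in either bitmap, or -1 if both are empty.
 * Both inputs are const: nothing is merged back into the IRR.
 */
static inline int highest_vector(const struct vec_bitmap *pir,
				 const struct vec_bitmap *irr)
{
	for (int i = NR_WORDS - 1; i >= 0; i--) {
		uint32_t merged = pir->w[i] | irr->w[i];

		if (!merged)
			continue;

		for (int bit = 31; bit >= 0; bit--) {
			if (merged & (UINT32_C(1) << bit))
				return i * 32 + bit;
		}
	}
	return -1;
}

/* An interrupt is deliverable only if its priority class exceeds PPR's. */
static inline bool l2_has_pending_irq(const struct vec_bitmap *pir,
				      const struct vec_bitmap *irr,
				      uint8_t ppr)
{
	int max = highest_vector(pir, irr);

	return max >= 0 && (max & 0xf0) > (ppr & 0xf0);
}
```

Whether L2 actually has IRQs enabled would have to be checked separately; the
sketch covers only the bitmap and priority comparison.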

