public inbox for kvm@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Konstantin Khorenko <khorenko@virtuozzo.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org,  Thomas Gleixner <tglx@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	"H . Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org,  linux-kernel@vger.kernel.org,
	Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Subject: Re: [RFC PATCH 0/1] KVM: VMX: restore host CR2 after VM exit
Date: Wed, 22 Apr 2026 11:56:05 -0700
Message-ID: <aekZxVXzxJ_sj4Ev@google.com>
In-Reply-To: <20260422175000.1544258-1-khorenko@virtuozzo.com>

On Wed, Apr 22, 2026, Konstantin Khorenko wrote:
> All four oopses happened inside the L1 host itself: the original fault
> plus three further faults taken inside the oops-reporting code
> (dump_pagetable() -> copy_from_kernel_nofault(), vt_console_print() ->
> lf(), vsnprintf() in the "Modules linked in" path).
> They are not extra levels of guest nesting; the nesting stack in this
> setup is just two deep (outer hypervisor, then this L1 host running its
> own L2 guests).

...

> The mechanical fact (VMX leaves the guest CR2 in the hardware register
> after VM exit, and the rest of the kernel treats CR2 as "address of
> the last host #PF") is easy to verify from the source.  What I cannot
> pin down from that one dump is which exact delivery path brought a #PF
> handler into play with the CPU not having updated CR2 on that run.
> The plausible candidates include:
> 
>   - corner cases of outer-hypervisor event injection into this host;
>   - NMI/MCE entries racing with oops reporting;
>   - crash/__show_regs() invoked from contexts other than a freshly
>     taken #PF, where die()/oops code reads CR2 as if it were fresh.
> 
> All of these stop mattering the moment the host CR2 stops being a
> guest-controlled value after a VM exit.  The patch targets the
> weakest link directly: the "CR2 on the host == address of the last
> host #PF" invariant should hold across VM entry/exit on VMX, and
> today it does not.

And it never will (barring a hardware/ucode change).  This flaw is impossible to
completely fix on Intel.  The best we can do is "restore" host CR2 within a few
instructions of VM-Exit.  Intel doesn't provide a GIF equivalent, and so NMIs
can't be blocked in the entry/exit path, i.e. the kernel already needs to be
prepared to handle NMIs with guest CR2 loaded.
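
To illustrate the window described above, here is a minimal userspace sketch
in C.  It is a model, not KVM code: hw_cr2, the step names, and the two address
constants are hypothetical stand-ins for the hardware register and for the
entry/exit sequence.  Under those assumptions, it shows that an NMI landing
between VM-Exit and a software restore still observes the guest's CR2.

```c
#include <stdint.h>

/* Hypothetical addresses, chosen only to tell the two values apart. */
#define HOST_CR2	0xffff888000002000ULL
#define GUEST_CR2	0x00007f0000001000ULL

/* Points in the (simplified) entry/exit sequence where an NMI may land. */
enum step {
	STEP_BEFORE_ENTRY,	/* host CR2 still loaded */
	STEP_AFTER_EXIT,	/* VM-Exit done, restore not yet executed */
	STEP_AFTER_RESTORE,	/* software has written host CR2 back */
};

/* Return the CR2 value a simulated NMI handler reads when it fires at nmi_at. */
static uint64_t cr2_seen_by_nmi(enum step nmi_at)
{
	uint64_t hw_cr2 = HOST_CR2;	/* stand-in for the CR2 register */
	uint64_t saved_host = hw_cr2;	/* what a restore would write back */
	uint64_t seen = 0;

	if (nmi_at == STEP_BEFORE_ENTRY)
		seen = hw_cr2;

	/* Guest CR2 is loaded, the guest runs, and VM-Exit leaves it in place. */
	hw_cr2 = GUEST_CR2;

	/* NMIs cannot be blocked here, so this window cannot be closed. */
	if (nmi_at == STEP_AFTER_EXIT)
		seen = hw_cr2;

	/* The restore step: a few instructions after VM-Exit. */
	hw_cr2 = saved_host;

	if (nmi_at == STEP_AFTER_RESTORE)
		seen = hw_cr2;

	return seen;
}
```

In this model, cr2_seen_by_nmi(STEP_AFTER_EXIT) returns GUEST_CR2 no matter
how early the restore is placed; the restore only shrinks the window.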

More importantly, I just don't see the point; the host CR2 is _guaranteed_ to be
stale.  KVM obviously doesn't do VM-Enter from #PF context.  It'll probably be
less garbage than guest CR2, but it's still garbage.

I appreciate that seeing a bogus CR2 can make debug difficult, but IMO, the
benefit of making KVM moderately less painful on rare occasions where all hell
breaks loose isn't worth the cost of the extra CR2 writes.  And practically
speaking, the kernel _must_ be hardened against bogus CR2 values when dealing
with OOPses and panics, because pretty much by definition something has gone
sideways and so CR2 can't be assumed to be benign.

> Patch properties:
> 
>   - Hot path impact: one extra register compare in the common case,
>     one extra MOV to CR2 under unlikely() when the guest modified CR2.

That's not unlikely.  The odds of guest CR2 matching host CR2 are basically zero.
In practice, this likely adds two extra CR2 writes on the majority of entry/exit
transitions.
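
The write-count argument can be sketched with a small userspace model in C.
It assumes the entry path writes CR2 only when the vCPU's saved value differs
from what is currently loaded, and it uses a hypothetical restore_host flag to
stand in for the proposed restore on exit.  All names and addresses are
illustrative, and guest #PFs (which would change guest CR2) are not modeled.

```c
#include <stdint.h>

#define HOST_CR2	0xffff888000002000ULL	/* hypothetical stale host value */
#define GUEST_CR2	0x00007f0000001000ULL	/* hypothetical guest value */

static uint64_t hw_cr2;			/* stand-in for the CR2 register */
static unsigned long cr2_writes;	/* software MOV-to-CR2 count */

static uint64_t read_cr2(void)
{
	return hw_cr2;
}

static void write_cr2(uint64_t val)
{
	hw_cr2 = val;
	cr2_writes++;
}

struct vcpu {
	uint64_t cr2;	/* guest CR2 saved across exits */
};

/* One entry/exit round trip; restore_host models the proposed change. */
static void enter_exit(struct vcpu *v, int restore_host)
{
	uint64_t host_cr2 = read_cr2();

	/* Entry: load guest CR2 only if it is not already in the register. */
	if (v->cr2 != host_cr2)
		write_cr2(v->cr2);

	/* The guest runs here; VM-Exit leaves guest CR2 in the register. */
	v->cr2 = read_cr2();

	/* Exit: compare, then write host CR2 back if the values differ. */
	if (restore_host && v->cr2 != host_cr2)
		write_cr2(host_cr2);
}

/* Count software CR2 writes over a number of round trips. */
static unsigned long simulate(int restore_host, int iterations)
{
	struct vcpu v = { .cr2 = GUEST_CR2 };

	hw_cr2 = HOST_CR2;
	cr2_writes = 0;
	for (int i = 0; i < iterations; i++)
		enter_exit(&v, restore_host);
	return cr2_writes;
}
```

In this model, back-to-back runs of one vCPU without the restore write CR2
once in total, because guest CR2 simply stays loaded.  With the restore, every
round trip pays one write on exit and another on the next entry.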

>   - Stays within the existing noinstr region.  native_read_cr2() and
>     native_write_cr2() are plain inline asm with no instrumentation,
>     so noinstr constraints are preserved.
> 
>   - Not a security fix for a user-triggerable issue per se, but it
>     removes a class of confusing "kernel CR2 points into guest memory"
>     oops reports and hardens the CR2 invariant for the whole kernel.
