Re: [RFC PATCH] x86/kvm: Allow kvm_read_and_reset_apf_flags() to be instrumented

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	Lai Jiangshan <laijs@linux.alibaba.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	kvm@vger.kernel.org
Subject: Re: [RFC PATCH] x86/kvm: Allow kvm_read_and_reset_apf_flags() to be instrumented
Date: Thu, 30 Dec 2021 18:40:47 +0000	[thread overview]
Message-ID: <Yc39L/2Ltx53bP/0@google.com> (raw)
In-Reply-To: <20211126123145.2772-1-jiangshanlai@gmail.com>

On Fri, Nov 26, 2021, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> Both VMX and SVM made mistakes of calling kvm_read_and_reset_apf_flags()
> in instrumentable code:
> 	vmx_vcpu_enter_exit()
> 		ASYNC PF induced VMEXIT
> 		save cr2
> 	leave noinstr section
> 	-- kernel #PF can happen here due to instrumentation
> 	handle_exception_nmi_irqoff
> 		kvm_read_and_reset_apf_flags()
> 
> If kernel #PF happens before handle_exception_nmi_irqoff() after leaving
> noinstr section due to instrumentation, kernel #PF handler will consume
> the incorrect apf flags and panic.
> 
> The problem might be fixed by moving kvm_read_and_reset_apf_flags()
> into vmx_vcpu_enter_exit() to consume apf flags earlier before leaving
> noinstr section like the way it handles CR2.
> 
> But linux kernel only resigters ASYNC PF for userspace and guest, so

I'd omit the guest part.  While technically true, it's not relevant to the change
as the instrumentation safety comes purely from the user_mode() check.  Mentioning
the "guest" side of things gets confusing as the "host" may be an L1 kernel, in
which case it is also a guest and may have its own guests.

It'd also be helpful for future readers to call out that this is true only since
commit 3a7c8fafd1b4 ("x86/kvm: Restrict ASYNC_PF to user space").  Allowing async
#PF in kernel mode was presumably why this code was non-instrumentable in the
first place.

> ASYNC PF can't hit when it is in kernel, so apf flags can be changed to
> be consumed only when the #PF is from guest or userspace.

...

> @@ -1538,7 +1514,20 @@ DEFINE_IDTENTRY_RAW_ERRORCODE(exc_page_fault)
>  	state = irqentry_enter(regs);
>  
>  	instrumentation_begin();
> -	handle_page_fault(regs, error_code, address);
> +
> +	/*
> +	 * KVM uses #PF vector to deliver 'page not present' events to guests
> +	 * (asynchronous page fault mechanism). The event happens when a
> +	 * userspace task is trying to access some valid (from guest's point of
> +	 * view) memory which is not currently mapped by the host (e.g. the
> +	 * memory is swapped out). Note, the corresponding "page ready" event
> +	 * which is injected when the memory becomes available, is delivered via
> +	 * an interrupt mechanism and not a #PF exception
> +	 * (see arch/x86/kernel/kvm.c: sysvec_kvm_asyncpf_interrupt()).
> +	 */
> +	if (!user_mode(regs) || !kvm_handle_async_user_pf(regs, (u32)address))
> +		handle_page_fault(regs, error_code, address);

To preserve the panic() if an async #PF is injected for !user_mode, any reason
not to do?

	if (!kvm_handle_async_user_pf(regs, (u32)address))
		handle_page_fault(regs, error_code, address);

Then __kvm_handle_async_user_pf() can keep its panic() on !user_mode().  And it
also saves a conditional when kvm_async_pf_enabled is not true.

     prev parent reply	other threads:[~2021-12-30 18:40 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-26 12:31 [RFC PATCH] x86/kvm: Allow kvm_read_and_reset_apf_flags() to be instrumented Lai Jiangshan
2021-12-30 18:40 ` Sean Christopherson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yc39L/2Ltx53bP/0@google.com \
    --to=seanjc@google.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jiangshanlai@gmail.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=laijs@linux.alibaba.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.