public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: pbonzini@redhat.com, rick.p.edgecombe@intel.com,
	kai.huang@intel.com,  isaku.yamahata@intel.com,
	dmatlack@google.com, sagis@google.com,  erdemaktas@google.com,
	graf@amazon.com, linux-kernel@vger.kernel.org,
	 kvm@vger.kernel.org
Subject: Re: [PATCH v2 1/4] KVM: x86/mmu: Introduce a quirk to control memslot zap behavior
Date: Mon, 23 Sep 2024 11:37:14 -0700	[thread overview]
Message-ID: <ZvG1Wki4GvIyVWqB@google.com> (raw)
In-Reply-To: <20240703021043.13881-1-yan.y.zhao@intel.com>

On Wed, Jul 03, 2024, Yan Zhao wrote:
> Introduce the quirk KVM_X86_QUIRK_SLOT_ZAP_ALL to allow users to select
> KVM's behavior when a memslot is moved or deleted for KVM_X86_DEFAULT_VM
> VMs. Make sure KVM behave as if the quirk is always disabled for
> non-KVM_X86_DEFAULT_VM VMs.
 
...

> Suggested-by: Kai Huang <kai.huang@intel.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>

Bad Sean, bad.

> +/*
> + * Zapping leaf SPTEs with memslot range when a memslot is moved/deleted.
> + *
> + * Zapping non-leaf SPTEs, a.k.a. not-last SPTEs, isn't required, worst
> + * case scenario we'll have unused shadow pages lying around until they
> + * are recycled due to age or when the VM is destroyed.
> + */
> +static void kvm_mmu_zap_memslot_leafs(struct kvm *kvm, struct kvm_memory_slot *slot)
> +{
> +	struct kvm_gfn_range range = {
> +		.slot = slot,
> +		.start = slot->base_gfn,
> +		.end = slot->base_gfn + slot->npages,
> +		.may_block = true,
> +	};
> +	bool flush = false;
> +
> +	write_lock(&kvm->mmu_lock);
> +
> +	if (kvm_memslots_have_rmaps(kvm))
> +		flush = kvm_handle_gfn_range(kvm, &range, kvm_zap_rmap);

This, and Paolo's merged variant, break shadow paging.  As was tried in commit
4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot"),
all shadow pages, i.e. non-leaf SPTEs, need to be zapped.  All of the accounting
for a shadow page is tied to the memslot, i.e. the shadow page holds a reference
to the memslot, for all intents and purposes.  Deleting the memslot without removing
all relevant shadow pages results in NULL pointer derefs when tearing down the VM.

Note, that commit is/was buggy, and I suspect my follow-up attempt[*] was as well.
https://lore.kernel.org/all/20190820200318.GA15808@linux.intel.com

Rather than trying to get this functional for shadow paging (which includes nested
TDP), I think we should scrap the quirk idea and simply make this the behavior for
S-EPT and nothing else.

 BUG: kernel NULL pointer dereference, address: 00000000000000b0
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 6085f43067 P4D 608c080067 PUD 608c081067 PMD 0 
 Oops: Oops: 0000 [#1] SMP NOPTI
 CPU: 79 UID: 0 PID: 187063 Comm: set_memory_regi Tainted: G        W          6.11.0-smp--24867312d167-cpl #395
 Tainted: [W]=WARN
 Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024
 RIP: 0010:__kvm_mmu_prepare_zap_page+0x3a9/0x7b0 [kvm]
 Code:  <48> 8b 8e b0 00 00 00 48 8b 96 e0 00 00 00 48 c1 e9 09 48 29 c8 8b
 RSP: 0018:ff314a25b19f7c28 EFLAGS: 00010212
 Call Trace:
  <TASK>
  kvm_arch_flush_shadow_all+0x7a/0xf0 [kvm]
  kvm_mmu_notifier_release+0x6c/0xb0 [kvm]
  mmu_notifier_unregister+0x85/0x140
  kvm_put_kvm+0x263/0x410 [kvm]
  kvm_vm_release+0x21/0x30 [kvm]
  __fput+0x8d/0x2c0
  __se_sys_close+0x71/0xc0
  do_syscall_64+0x83/0x160
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

  reply	other threads:[~2024-09-23 18:37 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-03  2:09 [PATCH v2 0/4] Introduce a quirk to control memslot zap behavior Yan Zhao
2024-07-03  2:10 ` [PATCH v2 1/4] KVM: x86/mmu: " Yan Zhao
2024-09-23 18:37   ` Sean Christopherson [this message]
2024-09-24  0:58     ` Yan Zhao
2024-09-24  7:45       ` Sean Christopherson
2024-09-24  8:37         ` Yan Zhao
2024-11-26  9:15       ` Yan Zhao
2024-07-03  2:11 ` [PATCH v2 2/4] KVM: selftests: Test slot move/delete with slot zap quirk enabled/disabled Yan Zhao
2024-07-03  2:12 ` [PATCH v2 3/4] KVM: selftests: Allow slot modification stress test with quirk disabled Yan Zhao
2024-09-24 12:26   ` Aishwarya TCV
2024-09-25  0:42     ` Yan Zhao
2024-09-30 18:15       ` Mark Brown
2024-07-03  2:12 ` [PATCH v2 4/4] KVM: selftests: Test memslot move in memslot_perf_test " Yan Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZvG1Wki4GvIyVWqB@google.com \
    --to=seanjc@google.com \
    --cc=dmatlack@google.com \
    --cc=erdemaktas@google.com \
    --cc=graf@amazon.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=sagis@google.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox