From: Alex Williamson <alex.williamson@redhat.com>
To: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
kvm@vger.kernel.org, "Xiao Guangrong" <guangrong.xiao@gmail.com>
Subject: Re: [PATCH v2 11/27] KVM: x86/mmu: Zap only the relevant pages when removing a memslot
Date: Wed, 21 Aug 2019 13:08:59 -0600 [thread overview]
Message-ID: <20190821130859.4330bcf4@x1.home> (raw)
In-Reply-To: <20190820210245.GC15808@linux.intel.com>
On Tue, 20 Aug 2019 14:02:45 -0700
Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> On Tue, Aug 20, 2019 at 02:42:04PM -0600, Alex Williamson wrote:
> > On Tue, 20 Aug 2019 13:03:19 -0700
> > Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> > > All that being said, it doesn't explain why gfns like 0xfec00 and 0xfee00
> > > were sensitive to (lack of) zapping. My theory is that zapping what were
> > > effectively random-but-interesting shadow pages cleaned things up enough
> > > to avoid noticeable badness.
> > >
> > >
> > > Alex,
> > >
> > > Can you please test the attached patch? It implements a very slimmed down
> > > version of kvm_mmu_zap_all() to zap only shadow pages that can hold sptes
> > > pointing at the memslot being removed, which was the original intent of
> > > kvm_mmu_invalidate_zap_pages_in_memslot(). I apologize in advance if it
> > > crashes the host. I'm hopeful it's correct, but given how broken the
> > > previous version was, I'm not exactly confident.
> >
> > It doesn't crash the host, but the guest is not happy, failing to boot
> > the desktop in one case and triggering errors in the guest w/o even
> > running test programs in another case. Seems like it might be worse
> > than previous. Thanks,
>
> Hrm, I'm back to being completely flummoxed.
>
> Would you be able to generate a trace of all events/kvmmmu, using the
> latest patch? I'd like to rule out a stupid code bug if it's not too
> much trouble.
I tried to simplify the patch, making it closer to zap_all, so I
removed the max_level calculation and exclusion based on s->role.level,
as well as the gfn range filtering. For good measure I even removed
the sp->root_count test, so any sp not marked invalid is zapped. This
works, and I can also add back the sp->root_count test and things
remain working.
From there I added back the gfn range test, but I left out the gfn_mask
because I'm not doing the level filtering and I think(?) this is just
another optimization. So essentially I only add:
if (sp->gfn < slot->base_gfn ||
sp->gfn > (slot->base_gfn + slot->npages - 1))
continue;
Not only does this not work, the host will sometimes oops:
[ 808.541168] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 808.555065] #PF: supervisor read access in kernel mode
[ 808.565326] #PF: error_code(0x0000) - not-present page
[ 808.575588] PGD 0 P4D 0
[ 808.580649] Oops: 0000 [#1] SMP PTI
[ 808.587617] CPU: 3 PID: 1965 Comm: CPU 0/KVM Not tainted 5.3.0-rc4+ #4
[ 808.600652] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[ 808.618907] RIP: 0010:gfn_to_rmap+0xd9/0x120 [kvm]
[ 808.628472] Code: c7 48 8d 0c 80 48 8d 04 48 4d 8d 14 c0 49 8b 02 48 39 c6 72 15 49 03 42 08 48 39 c6 73 0c 41 89 b9 08 b4 00 00 49 8b 3a eb 0b <48> 8b 3c 25 00 00 00 00 45 31 d2 0f b6 42 24 83 e0 0f 83 e8 01 8d
[ 808.665945] RSP: 0018:ffffa888009a3b20 EFLAGS: 00010202
[ 808.676381] RAX: 00000000000c1040 RBX: ffffa888007d5000 RCX: 0000000000000014
[ 808.690628] RDX: ffff8eadd0708260 RSI: 00000000000c1080 RDI: 0000000000000004
[ 808.704877] RBP: ffff8eadc3d11400 R08: ffff8ead97cf0008 R09: ffff8ead97cf0000
[ 808.719124] R10: ffff8ead97cf0168 R11: 0000000000000004 R12: ffff8eadd0708260
[ 808.733374] R13: ffffa888007d5000 R14: 0000000000000000 R15: 0000000000000004
[ 808.747620] FS: 00007f28dab7c700(0000) GS:ffff8eb19f4c0000(0000) knlGS:0000000000000000
[ 808.763776] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 808.775249] CR2: 0000000000000000 CR3: 000000003f508006 CR4: 00000000001626e0
[ 808.789499] Call Trace:
[ 808.794399] drop_spte+0x77/0xa0 [kvm]
[ 808.801885] mmu_page_zap_pte+0xac/0xe0 [kvm]
[ 808.810587] __kvm_mmu_prepare_zap_page+0x69/0x350 [kvm]
[ 808.821196] kvm_mmu_invalidate_zap_pages_in_memslot+0x87/0xf0 [kvm]
[ 808.833881] kvm_page_track_flush_slot+0x55/0x80 [kvm]
[ 808.844140] __kvm_set_memory_region+0x821/0xaa0 [kvm]
[ 808.854402] kvm_set_memory_region+0x26/0x40 [kvm]
[ 808.863971] kvm_vm_ioctl+0x59a/0x940 [kvm]
[ 808.872318] ? pagevec_lru_move_fn+0xb8/0xd0
[ 808.880846] ? __seccomp_filter+0x7a/0x680
[ 808.889028] do_vfs_ioctl+0xa4/0x630
[ 808.896168] ? security_file_ioctl+0x32/0x50
[ 808.904695] ksys_ioctl+0x60/0x90
[ 808.911316] __x64_sys_ioctl+0x16/0x20
[ 808.918807] do_syscall_64+0x5f/0x1a0
[ 808.926121] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 808.936209] RIP: 0033:0x7f28ebf2b0fb
Does this suggests something is still fundamentally wrong with the
premise of this change or have I done something stupid? Thanks,
Alex
next prev parent reply other threads:[~2019-08-21 19:09 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-05 20:54 [PATCH v2 00/27] KVM: x86/mmu: Remove fast invalidate mechanism Sean Christopherson
2019-02-05 20:54 ` [PATCH v2 01/27] KVM: Call kvm_arch_memslots_updated() before updating memslots Sean Christopherson
2019-02-06 9:12 ` Cornelia Huck
2019-02-12 12:36 ` [PATCH v2 00/27] KVM: x86/mmu: Remove fast invalidate mechanism Paolo Bonzini
[not found] ` <20190205210137.1377-11-sean.j.christopherson@intel.com>
2019-08-13 16:04 ` [PATCH v2 11/27] KVM: x86/mmu: Zap only the relevant pages when removing a memslot Alex Williamson
2019-08-13 17:04 ` Sean Christopherson
2019-08-13 17:57 ` Alex Williamson
2019-08-13 19:33 ` Alex Williamson
2019-08-13 20:19 ` Sean Christopherson
2019-08-13 20:37 ` Paolo Bonzini
2019-08-13 21:14 ` Alex Williamson
2019-08-13 21:15 ` Paolo Bonzini
2019-08-13 22:10 ` Alex Williamson
2019-08-15 14:46 ` Sean Christopherson
2019-08-15 15:23 ` Alex Williamson
2019-08-15 16:00 ` Sean Christopherson
2019-08-15 18:16 ` Alex Williamson
2019-08-15 19:25 ` Sean Christopherson
2019-08-15 20:11 ` Alex Williamson
2019-08-19 16:03 ` Paolo Bonzini
2019-08-20 20:03 ` Sean Christopherson
2019-08-20 20:42 ` Alex Williamson
2019-08-20 21:02 ` Sean Christopherson
2019-08-21 19:08 ` Alex Williamson [this message]
2019-08-21 19:35 ` Alex Williamson
2019-08-21 20:30 ` Sean Christopherson
2019-08-23 2:25 ` Sean Christopherson
2019-08-23 22:05 ` Alex Williamson
2019-08-21 20:10 ` Sean Christopherson
2019-08-26 7:36 ` Tian, Kevin
2019-08-26 14:56 ` Sean Christopherson
2020-06-26 17:32 ` Sean Christopherson
2022-10-20 18:31 ` Alexander Graf
2022-10-20 20:37 ` Sean Christopherson
2022-10-20 21:06 ` Alexander Graf
2022-10-21 19:40 ` Sean Christopherson
2022-10-24 6:12 ` Alexander Graf
2022-10-24 15:55 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190821130859.4330bcf4@x1.home \
--to=alex.williamson@redhat.com \
--cc=guangrong.xiao@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rkrcmar@redhat.com \
--cc=sean.j.christopherson@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).