From: "Huang, Kai"
Subject: Re: [4.1.y] vmwrite error: reg 401e value a9 (err 1)
Date: Wed, 9 Nov 2016 16:10:03 +1300
To: Greg Edwards, kvm@vger.kernel.org
Cc: Sasha Levin, Paolo Bonzini, Radim Krčmář, Jim Mattson, pfeiner@google.com
In-Reply-To: <20161109001702.GA24512@psuche>
References: <20161109001702.GA24512@psuche>

Hi Greg,

Thanks for reporting this issue. I don't have a 4.1.y source tree at
hand, but from a quick look, commit
a3eaa8649e4c6a6afdafaa04b9114fb230617bb1 ("KVM: VMX: Fix commit which
broke PML") fixes this by removing the vmwrite to
SECONDARY_VM_EXEC_CONTROL in vmx_disable_pml, so yes, I think this
commit fixes the issue.

However, I think you probably need another commit as well, to fix a
potential vmwrite error when creating a vcpu:
4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec ("kvm: vmx: ensure VMCS is
current while enabling PML"). Peter found and fixed this issue, so I
have also added him to the Cc list.

Paolo/Radim, please comment if I have made a mistake here.

Thanks,
-Kai

On 11/9/2016 1:17 PM, Greg Edwards wrote:
> On current 4.1.y stable kernel (4.1.35) on a Broadwell-EP system, I see the
> following when shutting down a multiple vcpu VM:
>
> [ 758.387722] vmwrite error: reg 401e value a9 (err 1)
> [ 758.392860] CPU: 33 PID: 14969 Comm: qemu-system-x86 Not tainted 4.1.35 #1
> [ 758.399897] Hardware name: DDN 14000x/14000, BIOS 0229 09/23/2016
> [ 758.406156] 0000000000000286 0000000028b15def ffff88202f3fbb38 ffffffff8159de63
> [ 758.413942] ffff88402a938000 0000000000000001 ffff88202f3fbb48 ffffffffa060fa1c
> [ 758.421736] ffff88202f3fbb58 ffffffffa060fa49 ffff88202f3fbb78 ffffffffa0618fab
> [ 758.429534] Call Trace:
> [ 758.432147] [] dump_stack+0x4d/0x63
> [ 758.437449] [] vmwrite_error+0x2c/0x30 [kvm_intel]
> [ 758.444059] [] vmcs_writel+0x29/0x30 [kvm_intel]
> [ 758.450493] [] vmx_free_vcpu+0xdb/0xf0 [kvm_intel]
> [ 758.457111] [] kvm_arch_vcpu_free+0x48/0x50 [kvm]
> [ 758.463637] [] kvm_arch_destroy_vm+0x10a/0x200 [kvm]
> [ 758.470418] [] ? synchronize_srcu+0x28/0x30
> [ 758.476419] [] kvm_put_kvm+0x105/0x220 [kvm]
> [ 758.482505] [] kvm_vcpu_release+0x18/0x20 [kvm]
> [ 758.488853] [] __fput+0xcb/0x1d0
> [ 758.493899] [] ____fput+0xe/0x10
> [ 758.498939] [] task_work_run+0xd4/0xf0
> [ 758.504497] [] do_exit+0x2a1/0xb40
> [ 758.509708] [] do_group_exit+0x47/0xc0
> [ 758.515269] [] get_signal+0x1f3/0x6c0
> [ 758.520743] [] do_signal+0x37/0x800
> [ 758.526042] [] ? SyS_futex+0x85/0x1a0
> [ 758.531513] [] do_notify_resume+0x70/0x80
> [ 758.537334] [] int_signal+0x12/0x17
>
> This started with the inclusion of 6c2ca21665b99ce2f76389c353b985d8195387cc
> ("KVM: nVMX: Fix memory corruption when using VMCS shadowing") in 4.1.31.
>
> The error is coming out of vmx_disable_pml() when freeing the 2nd and
> subsequent vcpus, as SECONDARY_EXEC_ENABLE_PML was already cleared from the
> SECONDARY_VM_EXEC_CONTROL when the first vcpu was freed.
>
> Additionally pulling back a3eaa8649e4c6a6afdafaa04b9114fb230617bb1 ("KVM: VMX:
> Fix commit which broke PML") from 4.4 resolves it for me, as it fixes
> the above condition.
>
> Is this the correct fix for 4.1.y?
>
> Greg
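
For anyone reading the log line in the quoted report, here is a minimal sketch of how "reg 401e value a9" can be decoded. It assumes the bit positions of the secondary processor-based VM-execution controls as given in the Intel SDM, with names modelled on the kernel's SECONDARY_EXEC_* macros. The helper names parse_vmwrite_error and decode_secondary_exec are made up for illustration, and only a handful of bits are listed. It shows that 0x401e is the SECONDARY_VM_EXEC_CONTROL field and that the value being written has the PML bit (bit 17) clear, which is what one would expect from a write issued by vmx_disable_pml(). It does not interpret the "err" number.

```python
import re

# VMCS field encoding reported as "reg 401e" in the log line.
SECONDARY_VM_EXEC_CONTROL = 0x401E

# Subset of secondary processor-based VM-execution control bits.
# Bit positions are taken from the Intel SDM; this table is an
# illustrative assumption, not a complete list.
SECONDARY_EXEC_BITS = {
    0: "VIRTUALIZE_APIC_ACCESSES",
    1: "ENABLE_EPT",
    3: "RDTSCP",
    5: "ENABLE_VPID",
    7: "UNRESTRICTED_GUEST",
    17: "ENABLE_PML",
}


def parse_vmwrite_error(line):
    """Extract (reg, value, err) from a 'vmwrite error' log line.

    reg and value are printed in hex by the kernel, err in decimal.
    Returns None if the line does not match.
    """
    m = re.search(
        r"vmwrite error: reg ([0-9a-f]+) value ([0-9a-f]+) \(err (-?\d+)\)",
        line,
    )
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16), int(m.group(3))


def decode_secondary_exec(value):
    """Return the names of the known control bits set in value."""
    return [
        name
        for bit, name in sorted(SECONDARY_EXEC_BITS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    line = "[ 758.387722] vmwrite error: reg 401e value a9 (err 1)"
    reg, value, err = parse_vmwrite_error(line)
    print("is SECONDARY_VM_EXEC_CONTROL:", reg == SECONDARY_VM_EXEC_CONTROL)
    print("bits set:", decode_secondary_exec(value))
```

The decode only says what the kernel tried to write, not why the vmwrite failed, so it is a reading aid for the report rather than a diagnosis.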