From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
To: Paolo Bonzini, Chris Friesen, "Han, Huaitong", "hangaohuai@huawei.com"
CC: "kvm@vger.kernel.org", "fangying1@huawei.com", "xudong.hao@linux.intel.com", "qemu-devel@nongnu.org", "wangxinxin.wang@huawei.com", "kai.huang@linux.intel.com", "rkrcmar@redhat.com", "guangrong.xiao@linux.intel.com"
From: "Herongguang (Stephen)"
Message-ID: <58AF9921.6060201@huawei.com>
Date: Fri, 24 Feb 2017 10:23:29 +0800
In-Reply-To: <6c5567f4-192d-aefd-90e4-89f53479c24e@redhat.com>
References: <589C7E96.9060905@huawei.com> <589D83CE.1090803@huawei.com> <589DDC05.9010807@windriver.com> <58AA51D6.6020508@huawei.com> <1487565495.3740.27.camel@intel.com> <58AD0094.90304@windriver.com> <4dd92012-626a-2d80-9adb-0be398f73eb1@redhat.com> <58AD92AE.6040502@windriver.com> <6c5567f4-192d-aefd-90e4-89f53479c24e@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/2/22 22:43, Paolo Bonzini wrote:
>
>
> On 22/02/2017 14:31, Chris Friesen wrote:
>>>>
>>>
>>> Can you reproduce it with kernel 4.8+? I'm suspecting commit
>>> 4e59516a12a6 ("kvm: vmx: ensure VMCS is current while enabling PML",
>>> 2016-07-14) to be the fix.
>>
>> I can't easily try with a newer kernel, the software package we're using
>> has kernel patches that would have to be ported.
>>
>> I'm at a conference, don't really have time to set up a pair of test
>> machines from scratch with a custom kernel.
>
> Hopefully Gaohuai and Rongguang can help with this too.
>
> Paolo
>
> .
>
Yes, we are looking into this and testing it.

I think this can cause arbitrary memory corruption. If VM1 writes its PML
buffer address into VM2's VMCS (because VM1's sched_in/sched_out notifier is
not registered yet) and VM1 is then destroyed, its PML buffer is freed back to
the kernel. If VM2 later starts migration, the CPU logs VM2's dirty GFNs into
that freed memory, which can corrupt whatever the kernel has placed there
since.

Given its severity, this commit
(http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
should be eligible for backporting to the stable kernels.
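
To make the sequence above easier to follow, here is a toy Python model of it.
This is not kernel code: the classes and functions (PmlBuffer, Vmcs, Cpu,
enable_pml, log_dirty_gfn, run_scenario) are hypothetical names made up for
illustration, and the only behaviour modelled is the assumption that a VMCS
write lands in whichever VMCS is current on the CPU.

```python
class PmlBuffer:
    """Toy PML buffer: a list of logged GFNs plus a 'freed' flag."""

    def __init__(self, owner):
        self.owner = owner
        self.entries = []
        self.freed = False


class Vmcs:
    """Toy VMCS that holds only the PML buffer pointer."""

    def __init__(self, owner):
        self.owner = owner
        self.pml_buffer = None


class Cpu:
    """Toy CPU: VMCS writes always go to whichever VMCS is current."""

    def __init__(self):
        self.current = None

    def load(self, vmcs):
        self.current = vmcs

    def write_pml_address(self, buf):
        self.current.pml_buffer = buf


def enable_pml(cpu, vmcs, buf, load_first):
    # With load_first=False the write hits the VMCS that happens to be
    # current, which is the situation described in the mail above.
    if load_first:
        cpu.load(vmcs)
    cpu.write_pml_address(buf)


def log_dirty_gfn(vmcs, gfn):
    # Stand-in for the CPU logging a dirty GFN through the VMCS pointer.
    buf = vmcs.pml_buffer
    if buf.freed:
        return "use-after-free"
    buf.entries.append(gfn)
    return "ok"


def run_scenario(load_first):
    cpu = Cpu()
    vmcs1, vmcs2 = Vmcs("VM1"), Vmcs("VM2")
    buf1, buf2 = PmlBuffer("VM1"), PmlBuffer("VM2")

    # VM2 is already running with its own PML buffer.
    enable_pml(cpu, vmcs2, buf2, load_first=True)

    # VM1 is created while VM2's VMCS is still current on this CPU.
    enable_pml(cpu, vmcs1, buf1, load_first)

    # VM1 is destroyed, so its PML buffer goes back to the allocator.
    buf1.freed = True

    # VM2 starts migration and the CPU logs a dirty GFN for it.
    cpu.load(vmcs2)
    return log_dirty_gfn(vmcs2, gfn=0x1234)
```

In this model, run_scenario(load_first=False) returns "use-after-free" because
VM2's VMCS ends up pointing at VM1's freed buffer, while
run_scenario(load_first=True) returns "ok" because each VMCS keeps its own
buffer.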