From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Herongguang (Stephen)"
Subject: Re: [Qemu-devel] kvm bug in __rmap_clear_dirty during live migration
Date: Sat, 25 Feb 2017 09:44:49 +0800
Message-ID: <58B0E191.6040108@huawei.com>
References: <589C7E96.9060905@huawei.com> <589D83CE.1090803@huawei.com>
 <589DDC05.9010807@windriver.com> <58AA51D6.6020508@huawei.com>
 <1487565495.3740.27.camel@intel.com> <58AD0094.90304@windriver.com>
 <4dd92012-626a-2d80-9adb-0be398f73eb1@redhat.com>
 <58AD92AE.6040502@windriver.com>
 <6c5567f4-192d-aefd-90e4-89f53479c24e@redhat.com>
 <58AF9921.6060201@huawei.com> <58B04CD3.7010304@windriver.com>
 <7fdf2551-3d55-1bd9-2848-720a880cc93e@redhat.com>
In-Reply-To: <7fdf2551-3d55-1bd9-2848-720a880cc93e@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
To: Paolo Bonzini, Chris Friesen, "Han, Huaitong",
 "hangaohuai@huawei.com", stable@vger.kernel.org
Cc: "kvm@vger.kernel.org", "fangying1@huawei.com",
 "xudong.hao@linux.intel.com", "qemu-devel@nongnu.org",
 "wangxinxin.wang@huawei.com", "kai.huang@linux.intel.com",
 "rkrcmar@redhat.com", "guangrong.xiao@linux.intel.com",
 linux-kernel@vger.kernel.org
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 2017/2/24 23:14, Paolo Bonzini wrote:
>
>
> On 24/02/2017 16:10, Chris Friesen wrote:
>> On 02/23/2017 08:23 PM, Herongguang (Stephen) wrote:
>>
>>> On 2017/2/22 22:43, Paolo Bonzini wrote:
>>
>>>> Hopefully Gaohuai and Rongguang can help with this too.
>>>>
>>>> Paolo
>>>
>>> Yes, we are looking into and testing this.
>>>
>>> I think this can result in any memory corruption, if VM1 writes its
>>> PML buffer into VM2’s VMCS (since sched_in/sched_out notifier of VM1
>>> is not registered yet), then VM1 is destroyed (hence its PML buffer
>>> is freed back to kernel), after that, VM2 starts migration, so CPU
>>> logs VM2’s dirty GFNS into a freed memory, results in any memory
>>> corruption.
>>>
>>> As its severity, this commit
>>> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4e59516a12a6ef6dcb660cb3a3f70c64bd60cfec)
>>>
>>> is eligible to back port to kernel stable.
>>
>> Are we expecting that fix to resolve the original issue, or is it a
>> separate issue that needs fixing in stable?
>
> It should be the original issue.
>
> Paolo
>
> .
>
Yes, I agree, though we are still testing.