From: "Ouyang Zhaowei (Charles)"
Date: Tue, 28 Apr 2015 20:30:56 +0800
To: Boris Ostrovsky, Konrad Rzeszutek Wilk, David Vrabel
CC: Dingweiping, Yanqiangjun
Subject: Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu
Message-ID: <553F7D80.3010409@huawei.com>
References: <553A0D49.2020300@huawei.com> <553C23EE.9090101@oracle.com>
In-Reply-To: <553C23EE.9090101@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2015.4.26 7:31, Boris Ostrovsky wrote:
>
> On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote:
>> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info,
>> and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized,
>> the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info.
>> This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to
>> finish the job.
>> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'.
>>
>> Signed-off-by: Charles Ouyang
>> ---
>>  arch/x86/xen/suspend.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
>> index d949769..b2bed45 100644
>> --- a/arch/x86/xen/suspend.c
>> +++ b/arch/x86/xen/suspend.c
>> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled)
>>  {
>>  #ifdef CONFIG_XEN_PVHVM
>>  	int cpu;
>> -	xen_hvm_init_shared_info();
>> +	if (!suspend_cancelled)
>> +		xen_hvm_init_shared_info();
>>  	xen_callback_vector();
>>  	xen_unplug_emulated_devices();
>>  	if (xen_feature(XENFEAT_hvm_safe_pvclock)) {
>
> Do we need to call other routines if suspend is canceled?
>
> Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization?
>

Hi Boris,

Sorry I didn't make myself clear. By "vcpu_info will be reinitialized" I mean
that xen_hvm_init_shared_info() resets the per-cpu pointer "xen_vcpu" of every
online cpu, so that they all point to HYPERVISOR_shared_info->vcpu_info[cpu]:

void __ref xen_hvm_init_shared_info(void)
----
	 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is
	 * online but xen_hvm_init_shared_info is run at resume time too and
	 * in that case multiple vcpus might be online. */
	for_each_online_cpu(cpu) {
		/* Leave it to be NULL. */
		if (cpu >= MAX_VIRT_CPUS)
			continue;
		per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
	}
}

But at boot the init function xen_start_kernel() only sets cpu0 to point to
HYPERVISOR_shared_info->vcpu_info[0]:

asmlinkage __visible void __init xen_start_kernel(void)
----
	/* Don't do the full vcpu_info placement stuff until we have a
	   possible map and a non-dummy shared_info. */
	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];

	local_irq_disable();

The other cpus are set to point to "xen_vcpu_info" in xen_vcpu_setup().

So after 'xl save -c'/'xl restore', xen_hvm_init_shared_info() resets the
xen_vcpu pointer of those cpus and leaves it pointing to the wrong place.
As a result, the cpus other than cpu0 may no longer be able to handle irqs.
So IMHO it's not necessary to call xen_hvm_init_shared_info() again if
suspend is canceled.

> (The patch itself looks like the right thing to do though).
>
> -boris
>
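
To make the pointer mix-up described above easier to see, here is a small
userspace model in C. It is only a sketch and not kernel code: every model_*
name and the cpu count are made up for illustration, and the two arrays merely
stand in for HYPERVISOR_shared_info->vcpu_info[] and for the xen_vcpu_info
areas that xen_vcpu_setup() uses.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4 /* hypothetical cpu count, for the model only */

struct model_vcpu_info {
	int pending; /* placeholder field, not the real layout */
};

/* Stand-in for the vcpu_info slots inside HYPERVISOR_shared_info. */
static struct model_vcpu_info model_shared_slots[MODEL_NR_CPUS];
/* Stand-in for the xen_vcpu_info areas set up by xen_vcpu_setup(). */
static struct model_vcpu_info model_placed[MODEL_NR_CPUS];
/* Stand-in for the per-cpu xen_vcpu pointer. */
static struct model_vcpu_info *model_xen_vcpu[MODEL_NR_CPUS];

/* Boot: cpu0 uses its shared_info slot, the other cpus use their own area. */
static void model_boot(void)
{
	assert(MODEL_NR_CPUS > 1);
	model_xen_vcpu[0] = &model_shared_slots[0];
	for (int cpu = 1; cpu < MODEL_NR_CPUS; cpu++)
		model_xen_vcpu[cpu] = &model_placed[cpu];
}

/* Mirrors the quoted loop: every cpu is pointed back at shared_info. */
static void model_init_shared_info(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_xen_vcpu[cpu] = &model_shared_slots[cpu];
}

/* Post-suspend hook; skip_on_cancel selects the behaviour of the patch. */
static void model_post_suspend(bool suspend_cancelled, bool skip_on_cancel)
{
	if (!(skip_on_cancel && suspend_cancelled))
		model_init_shared_info();
}

/* Count cpus whose pointer differs from what model_boot() set up. */
static int model_count_moved(void)
{
	int moved = 0;

	if (model_xen_vcpu[0] != &model_shared_slots[0])
		moved++;
	for (int cpu = 1; cpu < MODEL_NR_CPUS; cpu++)
		if (model_xen_vcpu[cpu] != &model_placed[cpu])
			moved++;
	return moved;
}
```

In this model, model_boot() followed by model_post_suspend(true, false)
moves the pointer of every cpu except cpu0, while model_post_suspend(true, true)
leaves all pointers where boot put them.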