* [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu @ 2015-04-24 9:30 Ouyang Zhaowei (Charles) 2015-04-25 23:31 ` Boris Ostrovsky 0 siblings, 1 reply; 7+ messages in thread From: Ouyang Zhaowei (Charles) @ 2015-04-24 9:30 UTC (permalink / raw) To: Konrad Rzeszutek Wilk, Boris Ostrovsky, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian, Ouyangzhaowei (Charles) If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to finish the job. So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> --- arch/x86/xen/suspend.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c index d949769..b2bed45 100644 --- a/arch/x86/xen/suspend.c +++ b/arch/x86/xen/suspend.c @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) { #ifdef CONFIG_XEN_PVHVM int cpu; - xen_hvm_init_shared_info(); + if (!suspend_cancelled) + xen_hvm_init_shared_info(); xen_callback_vector(); xen_unplug_emulated_devices(); if (xen_feature(XENFEAT_hvm_safe_pvclock)) { ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu 2015-04-24 9:30 [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu Ouyang Zhaowei (Charles) @ 2015-04-25 23:31 ` Boris Ostrovsky 2015-04-28 12:30 ` Ouyang Zhaowei (Charles) 0 siblings, 1 reply; 7+ messages in thread From: Boris Ostrovsky @ 2015-04-25 23:31 UTC (permalink / raw) To: Ouyang Zhaowei (Charles), Konrad Rzeszutek Wilk, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote: > If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, > and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, > the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. > This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to > finish the job. > So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. > > Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> > --- > arch/x86/xen/suspend.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c > index d949769..b2bed45 100644 > --- a/arch/x86/xen/suspend.c > +++ b/arch/x86/xen/suspend.c > @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) > { > #ifdef CONFIG_XEN_PVHVM > int cpu; > - xen_hvm_init_shared_info(); > + if (!suspend_cancelled) > + xen_hvm_init_shared_info(); > xen_callback_vector(); > xen_unplug_emulated_devices(); > if (xen_feature(XENFEAT_hvm_safe_pvclock)) { Do we need to call other routines if suspend is canceled? Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization? (The patch itself looks like the right thing to do though). -boris ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu 2015-04-25 23:31 ` Boris Ostrovsky @ 2015-04-28 12:30 ` Ouyang Zhaowei (Charles) 2015-04-28 21:31 ` Boris Ostrovsky 0 siblings, 1 reply; 7+ messages in thread From: Ouyang Zhaowei (Charles) @ 2015-04-28 12:30 UTC (permalink / raw) To: Boris Ostrovsky, Konrad Rzeszutek Wilk, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian On 2015.4.26 7:31, Boris Ostrovsky wrote: > > On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote: >> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, >> and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, >> the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. >> This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to >> finish the job. >> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. >> >> Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> >> --- >> arch/x86/xen/suspend.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c >> index d949769..b2bed45 100644 >> --- a/arch/x86/xen/suspend.c >> +++ b/arch/x86/xen/suspend.c >> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) >> { >> #ifdef CONFIG_XEN_PVHVM >> int cpu; >> - xen_hvm_init_shared_info(); >> + if (!suspend_cancelled) >> + xen_hvm_init_shared_info(); >> xen_callback_vector(); >> xen_unplug_emulated_devices(); >> if (xen_feature(XENFEAT_hvm_safe_pvclock)) { > > Do we need to call other routines if suspend is canceled? > > Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization? > Hi Boris, Sorry I didn't make myself clear. About the "vcpu_info reinitialize", I mean in the function "xen_hvm_init_shared_info()" the pointer "xen_vcpu" will be reset and all point to HYPERVISOR_shared_info->vcpu_info[cpu]. void __ref xen_hvm_init_shared_info(void) ---- 1702 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is 1703 * online but xen_hvm_init_shared_info is run at resume time too and 1704 * in that case multiple vcpus might be online. */ 1705 for_each_online_cpu(cpu) { 1706 /* Leave it to be NULL. */ 1707 if (cpu >= MAX_VIRT_CPUS) 1708 continue; 1709 per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; 1710 } 1711 } But on Xen boot the init function "xen_start_kernel" only set the cpu0 to point to HYPERVISOR_shared_info->vcpu_info[0] asmlinkage __visible void __init xen_start_kernel(void) ---- 1563 /* Don't do the full vcpu_info placement stuff until we have a 1564 possible map and a non-dummy shared_info. */ 1565 per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; 1566 1567 local_irq_disable(); Other cpus are set to point to "xen_vcpu_info" in function xen_vcpu_setup(). So after xl save -c/restore, the pointer xen_vcpu will be reset in function "xen_hvm_init_shared_info" and point to a wrong place. This may cause all the cpus cannot handle irqs except cpu0, so IMHO it's not necessary to call xen_hvm_init_shared_info again if suspend is canceled. > (The patch itself looks like the right thing to do though). > > -boris > > . > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu 2015-04-28 12:30 ` Ouyang Zhaowei (Charles) @ 2015-04-28 21:31 ` Boris Ostrovsky 2015-04-30 7:27 ` Ouyang Zhaowei (Charles) 0 siblings, 1 reply; 7+ messages in thread From: Boris Ostrovsky @ 2015-04-28 21:31 UTC (permalink / raw) To: Ouyang Zhaowei (Charles), Konrad Rzeszutek Wilk, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian On 04/28/2015 08:30 AM, Ouyang Zhaowei (Charles) wrote: > > On 2015.4.26 7:31, Boris Ostrovsky wrote: >> On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote: >>> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, >>> and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, >>> the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. >>> This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to >>> finish the job. >>> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. >>> >>> Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> >>> --- >>> arch/x86/xen/suspend.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c >>> index d949769..b2bed45 100644 >>> --- a/arch/x86/xen/suspend.c >>> +++ b/arch/x86/xen/suspend.c >>> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) >>> { >>> #ifdef CONFIG_XEN_PVHVM >>> int cpu; >>> - xen_hvm_init_shared_info(); >>> + if (!suspend_cancelled) >>> + xen_hvm_init_shared_info(); >>> xen_callback_vector(); >>> xen_unplug_emulated_devices(); >>> if (xen_feature(XENFEAT_hvm_safe_pvclock)) { >> Do we need to call other routines if suspend is canceled? >> >> Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization? >> > Hi Boris, > > Sorry I didn't make myself clear. > > About the "vcpu_info reinitialize", I mean in the function "xen_hvm_init_shared_info()" the pointer "xen_vcpu" will be reset and all > point to HYPERVISOR_shared_info->vcpu_info[cpu]. > > void __ref xen_hvm_init_shared_info(void) > ---- > 1702 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is > 1703 * online but xen_hvm_init_shared_info is run at resume time too and > 1704 * in that case multiple vcpus might be online. */ > 1705 for_each_online_cpu(cpu) { > 1706 /* Leave it to be NULL. */ > 1707 if (cpu >= MAX_VIRT_CPUS) > 1708 continue; > 1709 per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; > 1710 } > 1711 } > > > But on Xen boot the init function "xen_start_kernel" only set the cpu0 to point to HYPERVISOR_shared_info->vcpu_info[0] > > asmlinkage __visible void __init xen_start_kernel(void) We are talking about HVM guests here and xen_start_kernel is only called for PV. But even if it was, xen_vcpu pointers for other VCPUs are set in xen_vcpu_setup(), which is called when non-boot VCPUs are coming up. And I wonder whether the actual problem is that we don't call xen_vcpu_setup() on canceled suspend (as we don't need to, really) and therefore if we call xen_hvm_init_shared_info() then per_cpu(xen_vcpu,cpu) for *non-boot* cpus is will become wrong. -boris > ---- > 1563 /* Don't do the full vcpu_info placement stuff until we have a > 1564 possible map and a non-dummy shared_info. */ > 1565 per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; > 1566 > 1567 local_irq_disable(); > > Other cpus are set to point to "xen_vcpu_info" in function xen_vcpu_setup(). > > So after xl save -c/restore, the pointer xen_vcpu will be reset in function "xen_hvm_init_shared_info" and point to a wrong place. > This may cause all the cpus cannot handle irqs except cpu0, so IMHO it's not necessary to call xen_hvm_init_shared_info again if > suspend is canceled. > >> (The patch itself looks like the right thing to do though). >> >> -boris >> >> . >> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu 2015-04-28 21:31 ` Boris Ostrovsky @ 2015-04-30 7:27 ` Ouyang Zhaowei (Charles) 2015-05-01 18:55 ` Boris Ostrovsky 0 siblings, 1 reply; 7+ messages in thread From: Ouyang Zhaowei (Charles) @ 2015-04-30 7:27 UTC (permalink / raw) To: Boris Ostrovsky, Konrad Rzeszutek Wilk, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian, herongguang.he On 2015.4.29 5:31, Boris Ostrovsky wrote: > On 04/28/2015 08:30 AM, Ouyang Zhaowei (Charles) wrote: >> >> On 2015.4.26 7:31, Boris Ostrovsky wrote: >>> On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote: >>>> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, >>>> and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, >>>> the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. >>>> This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to >>>> finish the job. >>>> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. >>>> >>>> Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> >>>> --- >>>> arch/x86/xen/suspend.c | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c >>>> index d949769..b2bed45 100644 >>>> --- a/arch/x86/xen/suspend.c >>>> +++ b/arch/x86/xen/suspend.c >>>> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) >>>> { >>>> #ifdef CONFIG_XEN_PVHVM >>>> int cpu; >>>> - xen_hvm_init_shared_info(); >>>> + if (!suspend_cancelled) >>>> + xen_hvm_init_shared_info(); >>>> xen_callback_vector(); >>>> xen_unplug_emulated_devices(); >>>> if (xen_feature(XENFEAT_hvm_safe_pvclock)) { >>> Do we need to call other routines if suspend is canceled? >>> >>> Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization? >>> >> Hi Boris, >> >> Sorry I didn't make myself clear. >> >> About the "vcpu_info reinitialize", I mean in the function "xen_hvm_init_shared_info()" the pointer "xen_vcpu" will be reset and all >> point to HYPERVISOR_shared_info->vcpu_info[cpu]. >> >> void __ref xen_hvm_init_shared_info(void) >> ---- >> 1702 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is >> 1703 * online but xen_hvm_init_shared_info is run at resume time too and >> 1704 * in that case multiple vcpus might be online. */ >> 1705 for_each_online_cpu(cpu) { >> 1706 /* Leave it to be NULL. */ >> 1707 if (cpu >= MAX_VIRT_CPUS) >> 1708 continue; >> 1709 per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; >> 1710 } >> 1711 } >> >> >> But on Xen boot the init function "xen_start_kernel" only set the cpu0 to point to HYPERVISOR_shared_info->vcpu_info[0] >> >> asmlinkage __visible void __init xen_start_kernel(void) > > > We are talking about HVM guests here and xen_start_kernel is only called for PV. But even if it was, xen_vcpu pointers for other VCPUs are set in xen_vcpu_setup(), which is called when non-boot VCPUs are coming up. > > And I wonder whether the actual problem is that we don't call xen_vcpu_setup() on canceled suspend (as we don't need to, really) and therefore if we call xen_hvm_init_shared_info() then per_cpu(xen_vcpu,cpu) for *non-boot* cpus is will become wrong. > Yes, you are right, in xen_vcpu_setup() non-boot VCPUs is set to point to xen_vcpu_info static void xen_vcpu_setup(int cpu) ---- 208 vcpup = &per_cpu(xen_vcpu_info, cpu); ... 227 /* This cpu is using the registered vcpu info, even if 228 later ones fail to. */ 229 per_cpu(xen_vcpu, cpu) = vcpup; But on canceled suspend if we call xen_hvm_init_shared_info(), the non-boot VCPUS will be reset to point to HYPERVISOR_shared_info->vcpu_info[cpu] which is a wrong address. So I suggest we don't call xen_hvm_init_shared_info() when suspend is canceled. > -boris > >> ---- >> 1563 /* Don't do the full vcpu_info placement stuff until we have a >> 1564 possible map and a non-dummy shared_info. */ >> 1565 per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; >> 1566 >> 1567 local_irq_disable(); >> >> Other cpus are set to point to "xen_vcpu_info" in function xen_vcpu_setup(). >> >> So after xl save -c/restore, the pointer xen_vcpu will be reset in function "xen_hvm_init_shared_info" and point to a wrong place. >> This may cause all the cpus cannot handle irqs except cpu0, so IMHO it's not necessary to call xen_hvm_init_shared_info again if >> suspend is canceled. >> >>> (The patch itself looks like the right thing to do though). >>> >>> -boris >>> >>> . >>> > > > . > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu 2015-04-30 7:27 ` Ouyang Zhaowei (Charles) @ 2015-05-01 18:55 ` Boris Ostrovsky 2015-05-04 2:25 ` Ouyang Zhaowei (Charles) 0 siblings, 1 reply; 7+ messages in thread From: Boris Ostrovsky @ 2015-05-01 18:55 UTC (permalink / raw) To: Ouyang Zhaowei (Charles), Konrad Rzeszutek Wilk, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian, herongguang.he On 04/30/2015 03:27 AM, Ouyang Zhaowei (Charles) wrote: > > On 2015.4.29 5:31, Boris Ostrovsky wrote: >> On 04/28/2015 08:30 AM, Ouyang Zhaowei (Charles) wrote: >>> On 2015.4.26 7:31, Boris Ostrovsky wrote: >>>> On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote: >>>>> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, >>>>> and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, >>>>> the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. >>>>> This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to >>>>> finish the job. >>>>> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. >>>>> >>>>> Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> >>>>> --- >>>>> arch/x86/xen/suspend.c | 3 ++- >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c >>>>> index d949769..b2bed45 100644 >>>>> --- a/arch/x86/xen/suspend.c >>>>> +++ b/arch/x86/xen/suspend.c >>>>> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) >>>>> { >>>>> #ifdef CONFIG_XEN_PVHVM >>>>> int cpu; >>>>> - xen_hvm_init_shared_info(); >>>>> + if (!suspend_cancelled) >>>>> + xen_hvm_init_shared_info(); >>>>> xen_callback_vector(); >>>>> xen_unplug_emulated_devices(); >>>>> if (xen_feature(XENFEAT_hvm_safe_pvclock)) { >>>> Do we need to call other routines if suspend is canceled? >>>> >>>> Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization? >>>> >>> Hi Boris, >>> >>> Sorry I didn't make myself clear. >>> >>> About the "vcpu_info reinitialize", I mean in the function "xen_hvm_init_shared_info()" the pointer "xen_vcpu" will be reset and all >>> point to HYPERVISOR_shared_info->vcpu_info[cpu]. >>> >>> void __ref xen_hvm_init_shared_info(void) >>> ---- >>> 1702 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is >>> 1703 * online but xen_hvm_init_shared_info is run at resume time too and >>> 1704 * in that case multiple vcpus might be online. */ >>> 1705 for_each_online_cpu(cpu) { >>> 1706 /* Leave it to be NULL. */ >>> 1707 if (cpu >= MAX_VIRT_CPUS) >>> 1708 continue; >>> 1709 per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; >>> 1710 } >>> 1711 } >>> >>> >>> But on Xen boot the init function "xen_start_kernel" only set the cpu0 to point to HYPERVISOR_shared_info->vcpu_info[0] >>> >>> asmlinkage __visible void __init xen_start_kernel(void) >> >> We are talking about HVM guests here and xen_start_kernel is only called for PV. But even if it was, xen_vcpu pointers for other VCPUs are set in xen_vcpu_setup(), which is called when non-boot VCPUs are coming up. >> >> And I wonder whether the actual problem is that we don't call xen_vcpu_setup() on canceled suspend (as we don't need to, really) and therefore if we call xen_hvm_init_shared_info() then per_cpu(xen_vcpu,cpu) for *non-boot* cpus is will become wrong. >> > Yes, you are right, in xen_vcpu_setup() non-boot VCPUs is set to point to xen_vcpu_info > > static void xen_vcpu_setup(int cpu) > ---- > 208 vcpup = &per_cpu(xen_vcpu_info, cpu); > ... > 227 /* This cpu is using the registered vcpu info, even if > 228 later ones fail to. */ > 229 per_cpu(xen_vcpu, cpu) = vcpup; > > But on canceled suspend if we call xen_hvm_init_shared_info(), the non-boot VCPUS will be reset to point to HYPERVISOR_shared_info->vcpu_info[cpu] which is a wrong address. > So I suggest we don't call xen_hvm_init_shared_info() when suspend is canceled. Right, so can you resubmit the patch with updated commit message? (Just note there that the hypervisor continues assuming that vcpu_info is stored in per-cpu data which was set up by xen_vcpu_setup(), while the call to xen_hvm_init_shared_info() will now make the guest think that vcpu_info is in the shared page). Thanks. -boris > >> -boris >> >>> ---- >>> 1563 /* Don't do the full vcpu_info placement stuff until we have a >>> 1564 possible map and a non-dummy shared_info. */ >>> 1565 per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; >>> 1566 >>> 1567 local_irq_disable(); >>> >>> Other cpus are set to point to "xen_vcpu_info" in function xen_vcpu_setup(). >>> >>> So after xl save -c/restore, the pointer xen_vcpu will be reset in function "xen_hvm_init_shared_info" and point to a wrong place. >>> This may cause all the cpus cannot handle irqs except cpu0, so IMHO it's not necessary to call xen_hvm_init_shared_info again if >>> suspend is canceled. >>> >>>> (The patch itself looks like the right thing to do though). >>>> >>>> -boris >>>> >>>> . >>>> >> >> . >> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu 2015-05-01 18:55 ` Boris Ostrovsky @ 2015-05-04 2:25 ` Ouyang Zhaowei (Charles) 0 siblings, 0 replies; 7+ messages in thread From: Ouyang Zhaowei (Charles) @ 2015-05-04 2:25 UTC (permalink / raw) To: Boris Ostrovsky, Konrad Rzeszutek Wilk, David Vrabel Cc: linux-kernel, Dingweiping, Yanqiangjun, jinjian, herongguang.he On 2015.5.2 2:55, Boris Ostrovsky wrote: > On 04/30/2015 03:27 AM, Ouyang Zhaowei (Charles) wrote: >> >> On 2015.4.29 5:31, Boris Ostrovsky wrote: >>> On 04/28/2015 08:30 AM, Ouyang Zhaowei (Charles) wrote: >>>> On 2015.4.26 7:31, Boris Ostrovsky wrote: >>>>> On 04/24/2015 05:30 AM, Ouyang Zhaowei (Charles) wrote: >>>>>> If a PVOPS VM has multi-cpu the vcpu_info of cpu0 is the member of the structure HYPERVISOR_shared_info, >>>>>> and the others is not, but after 'xl save -c/restore' the vcpu_info will be reinitialized, >>>>>> the vcpu_info of all the vcpus will be considered as the member of HYPERVISOR_shared_info. >>>>>> This will cause the cpu1 and other cpu keep receiving interrupts, and the cpu0 is waiting them to >>>>>> finish the job. >>>>>> So we do not reinit the vcpu_info when PVOPS vm is doing 'xl save -c/restore'. >>>>>> >>>>>> Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> >>>>>> --- >>>>>> arch/x86/xen/suspend.c | 3 ++- >>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c >>>>>> index d949769..b2bed45 100644 >>>>>> --- a/arch/x86/xen/suspend.c >>>>>> +++ b/arch/x86/xen/suspend.c >>>>>> @@ -32,7 +32,8 @@ static void xen_hvm_post_suspend(int suspend_cancelled) >>>>>> { >>>>>> #ifdef CONFIG_XEN_PVHVM >>>>>> int cpu; >>>>>> - xen_hvm_init_shared_info(); >>>>>> + if (!suspend_cancelled) >>>>>> + xen_hvm_init_shared_info(); >>>>>> xen_callback_vector(); >>>>>> xen_unplug_emulated_devices(); >>>>>> if (xen_feature(XENFEAT_hvm_safe_pvclock)) { >>>>> Do we need to call other routines if suspend is canceled? >>>>> >>>>> Also, if suspend is canceled then we don't do xen_irq_resume() if that's what you meant by "vcpu_info will be reinitialized". Were you referring some other re-initialization? >>>>> >>>> Hi Boris, >>>> >>>> Sorry I didn't make myself clear. >>>> >>>> About the "vcpu_info reinitialize", I mean in the function "xen_hvm_init_shared_info()" the pointer "xen_vcpu" will be reset and all >>>> point to HYPERVISOR_shared_info->vcpu_info[cpu]. >>>> >>>> void __ref xen_hvm_init_shared_info(void) >>>> ---- >>>> 1702 * When xen_hvm_init_shared_info is run at boot time only vcpu 0 is >>>> 1703 * online but xen_hvm_init_shared_info is run at resume time too and >>>> 1704 * in that case multiple vcpus might be online. */ >>>> 1705 for_each_online_cpu(cpu) { >>>> 1706 /* Leave it to be NULL. */ >>>> 1707 if (cpu >= MAX_VIRT_CPUS) >>>> 1708 continue; >>>> 1709 per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu]; >>>> 1710 } >>>> 1711 } >>>> >>>> >>>> But on Xen boot the init function "xen_start_kernel" only set the cpu0 to point to HYPERVISOR_shared_info->vcpu_info[0] >>>> >>>> asmlinkage __visible void __init xen_start_kernel(void) >>> >>> We are talking about HVM guests here and xen_start_kernel is only called for PV. But even if it was, xen_vcpu pointers for other VCPUs are set in xen_vcpu_setup(), which is called when non-boot VCPUs are coming up. >>> >>> And I wonder whether the actual problem is that we don't call xen_vcpu_setup() on canceled suspend (as we don't need to, really) and therefore if we call xen_hvm_init_shared_info() then per_cpu(xen_vcpu,cpu) for *non-boot* cpus is will become wrong. >>> >> Yes, you are right, in xen_vcpu_setup() non-boot VCPUs is set to point to xen_vcpu_info >> >> static void xen_vcpu_setup(int cpu) >> ---- >> 208 vcpup = &per_cpu(xen_vcpu_info, cpu); >> ... >> 227 /* This cpu is using the registered vcpu info, even if >> 228 later ones fail to. */ >> 229 per_cpu(xen_vcpu, cpu) = vcpup; >> >> But on canceled suspend if we call xen_hvm_init_shared_info(), the non-boot VCPUS will be reset to point to HYPERVISOR_shared_info->vcpu_info[cpu] which is a wrong address. >> So I suggest we don't call xen_hvm_init_shared_info() when suspend is canceled. > > > Right, so can you resubmit the patch with updated commit message? (Just note there that the hypervisor continues assuming that vcpu_info is stored in per-cpu data which was set up by xen_vcpu_setup(), while the call to xen_hvm_init_shared_info() will now make the guest think that vcpu_info is in the shared page). > > Thanks. > -boris OK, Thank for the advise, I'll resend the patch now > >> >>> -boris >>> >>>> ---- >>>> 1563 /* Don't do the full vcpu_info placement stuff until we have a >>>> 1564 possible map and a non-dummy shared_info. */ >>>> 1565 per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0]; >>>> 1566 >>>> 1567 local_irq_disable(); >>>> >>>> Other cpus are set to point to "xen_vcpu_info" in function xen_vcpu_setup(). >>>> >>>> So after xl save -c/restore, the pointer xen_vcpu will be reset in function "xen_hvm_init_shared_info" and point to a wrong place. >>>> This may cause all the cpus cannot handle irqs except cpu0, so IMHO it's not necessary to call xen_hvm_init_shared_info again if >>>> suspend is canceled. >>>> >>>>> (The patch itself looks like the right thing to do though). >>>>> >>>>> -boris >>>>> >>>>> . >>>>> >>> >>> . >>> > > > . > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-05-04 2:26 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-04-24 9:30 [PATCH] xen: vcpu_info reinit error after 'xl save -c' & 'xl restore' on PVOPS VM which has multi-cpu Ouyang Zhaowei (Charles) 2015-04-25 23:31 ` Boris Ostrovsky 2015-04-28 12:30 ` Ouyang Zhaowei (Charles) 2015-04-28 21:31 ` Boris Ostrovsky 2015-04-30 7:27 ` Ouyang Zhaowei (Charles) 2015-05-01 18:55 ` Boris Ostrovsky 2015-05-04 2:25 ` Ouyang Zhaowei (Charles)
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.