From: Keir Fraser <keir@xen.org>
To: Juergen Gross <juergen.gross@ts.fujitsu.com>
Cc: xen-devel@lists.xensource.com, Jan Beulich <JBeulich@suse.com>
Subject: Re: [PATCH 2 of 2] Avoid vcpu migration of paused vcpus
Date: Thu, 22 Mar 2012 11:20:30 +0000 [thread overview]
Message-ID: <CB90BB7F.3C24E%keir@xen.org> (raw)
In-Reply-To: <4F6AFD53.1040505@ts.fujitsu.com>
[-- Attachment #1: Type: text/plain, Size: 2508 bytes --]
On 22/03/2012 10:22, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> On 03/22/2012 11:12 AM, Keir Fraser wrote:
>> Your original patch didn't touch this code. Was that an omission in the
>> original version? On reflection I prefer your original patch to this new
>> approach. I'll apply it if you still believe your original patch is complete
>> and correct as it stands.
>
> I like my second patch more :-)
>
> It covers more cases, not just poweroff. In hibernate case no vcpu pinnings
> will be lost. Today all vcpus pinned to a cpu other than 0 will lose their
> pinnings at cpu offlining. At reactivation those pinnings will not be
> restored automatically. My patch will cover that by checking availability
> of the cpus after reactivation.
>
> Poweroff (which was my primary concern) works with both versions. I did not
> test other ACPI state changes with either version, but would expect better
> results in hibernate case with my second approach.
How about the attached patch? Which is similar to your original patch except
I added the global state variable, and I added a check for it to
cpu_disable_scheduler(). It's nice and small. :-)
-- Keir
>
> Juergen
>
>> -- Keir
>>
>> On 22/03/2012 08:22, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote:
>>
>>> Currently offlining a cpu will migrate all vcpus which were active on that
>>> cpu to another one, possibly breaking existing cpu affinities.
>>>
>>> In case of an ACPI state change the cpus are taken offline and online later
>>> (if not poweroff) while all domains are paused. There is no reason to
>>> migrate the vcpus during offlining the cpus, as the cpus will be available
>>> again when the domains are being unpaused.
>>>
>>> This patch defers the migration check in case of paused vcpus or domains
>>> by adding vcpu_arouse() to wake up a vcpu and check whether it must be
>>> migrated to another cpu.
>>>
>>> Signed-off-by: Juergen Gross<juergen.gross@ts.fujitsu.com>
>>>
>>>
>>> 3 files changed, 64 insertions(+), 24 deletions(-)
>>> xen/common/domain.c | 4 +-
>>> xen/common/schedule.c | 83
>>> ++++++++++++++++++++++++++++++++++-------------
>>> xen/include/xen/sched.h | 1
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>>
>>
>
[-- Attachment #2: 00-suspend-affinities --]
[-- Type: application/octet-stream, Size: 6209 bytes --]
diff -r 6bf50858c3c5 xen/arch/arm/setup.c
--- a/xen/arch/arm/setup.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/arch/arm/setup.c Thu Mar 22 11:11:39 2012 +0000
@@ -253,6 +253,8 @@ void __init start_xen(unsigned long boot
/* Hide UART from DOM0 if we're using it */
serial_endboot();
+ system_state = SYS_STATE_active;
+
domain_unpause_by_systemcontroller(dom0);
/* Switch on to the dynamically allocated stack for the idle vcpu
diff -r 6bf50858c3c5 xen/arch/x86/acpi/power.c
--- a/xen/arch/x86/acpi/power.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/arch/x86/acpi/power.c Thu Mar 22 11:11:39 2012 +0000
@@ -135,6 +135,9 @@ static int enter_state(u32 state)
if ( !spin_trylock(&pm_lock) )
return -EBUSY;
+ BUG_ON(system_state != SYS_STATE_active);
+ system_state = SYS_STATE_suspend;
+
printk(XENLOG_INFO "Preparing system for ACPI S%d state.\n", state);
freeze_domains();
@@ -142,7 +145,10 @@ static int enter_state(u32 state)
acpi_dmar_reinstate();
if ( (error = disable_nonboot_cpus()) )
+ {
+ system_state = SYS_STATE_resume;
goto enable_cpu;
+ }
cpufreq_del_cpu(0);
@@ -159,6 +165,7 @@ static int enter_state(u32 state)
if ( (error = device_power_down()) )
{
printk(XENLOG_ERR "Some devices failed to power down.");
+ system_state = SYS_STATE_resume;
goto done;
}
@@ -179,6 +186,8 @@ static int enter_state(u32 state)
break;
}
+ system_state = SYS_STATE_resume;
+
/* Restore CR4 and EFER from cached values. */
cr4 = read_cr4();
write_cr4(cr4 & ~X86_CR4_MCE);
@@ -212,6 +221,7 @@ static int enter_state(u32 state)
mtrr_aps_sync_end();
acpi_dmar_zap();
thaw_domains();
+ system_state = SYS_STATE_active;
spin_unlock(&pm_lock);
return error;
}
diff -r 6bf50858c3c5 xen/arch/x86/mm.c
--- a/xen/arch/x86/mm.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/arch/x86/mm.c Thu Mar 22 11:11:39 2012 +0000
@@ -5316,7 +5316,7 @@ int ptwr_do_page_fault(struct vcpu *v, u
void free_xen_pagetable(void *v)
{
- if ( early_boot )
+ if ( system_state == SYS_STATE_early_boot )
return;
if ( is_xen_heap_page(virt_to_page(v)) )
diff -r 6bf50858c3c5 xen/arch/x86/setup.c
--- a/xen/arch/x86/setup.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/arch/x86/setup.c Thu Mar 22 11:11:39 2012 +0000
@@ -81,8 +81,6 @@ boolean_param("noapic", skip_ioapic_setu
s8 __read_mostly xen_cpuidle = -1;
boolean_param("cpuidle", xen_cpuidle);
-bool_t __read_mostly early_boot = 1;
-
cpumask_t __read_mostly cpu_present_map;
unsigned long __read_mostly xen_phys_start;
@@ -271,7 +269,7 @@ static void *__init bootstrap_map(const
void *ret;
#ifdef __x86_64__
- if ( !early_boot )
+ if ( system_state != SYS_STATE_early_boot )
return mod ? mfn_to_virt(mod->mod_start) : NULL;
#endif
@@ -1168,7 +1166,7 @@ void __init __start_xen(unsigned long mb
#endif
end_boot_allocator();
- early_boot = 0;
+ system_state = SYS_STATE_boot;
#if defined(CONFIG_X86_64)
vesa_init();
@@ -1391,6 +1389,8 @@ void __init __start_xen(unsigned long mb
dmi_end_boot();
+ system_state = SYS_STATE_active;
+
domain_unpause_by_systemcontroller(dom0);
reset_stack_and_jump(init_done);
diff -r 6bf50858c3c5 xen/arch/x86/x86_32/mm.c
--- a/xen/arch/x86/x86_32/mm.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/arch/x86/x86_32/mm.c Thu Mar 22 11:11:39 2012 +0000
@@ -43,7 +43,7 @@ void *alloc_xen_pagetable(void)
{
unsigned long mfn;
- if ( !early_boot )
+ if ( system_state != SYS_STATE_early_boot )
{
void *v = alloc_xenheap_page();
diff -r 6bf50858c3c5 xen/arch/x86/x86_64/mm.c
--- a/xen/arch/x86/x86_64/mm.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/arch/x86/x86_64/mm.c Thu Mar 22 11:11:39 2012 +0000
@@ -85,7 +85,7 @@ void *alloc_xen_pagetable(void)
{
unsigned long mfn;
- if ( !early_boot )
+ if ( system_state != SYS_STATE_early_boot )
{
struct page_info *pg = alloc_domheap_page(NULL, 0);
diff -r 6bf50858c3c5 xen/common/cpupool.c
--- a/xen/common/cpupool.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/common/cpupool.c Thu Mar 22 11:11:39 2012 +0000
@@ -629,6 +629,10 @@ static int cpu_callback(
unsigned int cpu = (unsigned long)hcpu;
int rc = 0;
+ if ( (system_state == SYS_STATE_suspend) ||
+ (system_state == SYS_STATE_resume) )
+ goto out;
+
switch ( action )
{
case CPU_DOWN_FAILED:
@@ -642,6 +646,7 @@ static int cpu_callback(
break;
}
+out:
return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
}
diff -r 6bf50858c3c5 xen/common/kernel.c
--- a/xen/common/kernel.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/common/kernel.c Thu Mar 22 11:11:39 2012 +0000
@@ -20,6 +20,8 @@
#ifndef COMPAT
+enum system_state system_state = SYS_STATE_early_boot;
+
int tainted;
xen_commandline_t saved_cmdline;
diff -r 6bf50858c3c5 xen/common/schedule.c
--- a/xen/common/schedule.c Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/common/schedule.c Thu Mar 22 11:11:39 2012 +0000
@@ -538,7 +538,7 @@ int cpu_disable_scheduler(unsigned int c
int ret = 0;
c = per_cpu(cpupool, cpu);
- if ( c == NULL )
+ if ( (c == NULL) || (system_state == SYS_STATE_suspend) )
return ret;
for_each_domain_in_cpupool ( d, c )
diff -r 6bf50858c3c5 xen/include/asm-x86/setup.h
--- a/xen/include/asm-x86/setup.h Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/include/asm-x86/setup.h Thu Mar 22 11:11:39 2012 +0000
@@ -3,7 +3,6 @@
#include <xen/multiboot.h>
-extern bool_t early_boot;
extern unsigned long xenheap_initial_phys_start;
void early_cpu_init(void);
diff -r 6bf50858c3c5 xen/include/xen/kernel.h
--- a/xen/include/xen/kernel.h Thu Mar 22 10:26:03 2012 +0000
+++ b/xen/include/xen/kernel.h Thu Mar 22 11:11:39 2012 +0000
@@ -87,5 +87,13 @@ extern char _sinittext[], _einittext[];
(__p >= _sinittext) && (__p < _einittext); \
})
+extern enum system_state {
+ SYS_STATE_early_boot,
+ SYS_STATE_boot,
+ SYS_STATE_active,
+ SYS_STATE_suspend,
+ SYS_STATE_resume
+} system_state;
+
#endif /* _LINUX_KERNEL_H */
[-- Attachment #3: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
next prev parent reply other threads:[~2012-03-22 11:20 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-22 8:22 [PATCH 0 of 2] ACPI state change corrections Juergen Gross
2012-03-22 8:22 ` [PATCH 1 of 2] Allow ACPI state change with active cpupools Juergen Gross
2012-03-22 10:09 ` Jan Beulich
2012-03-22 8:22 ` [PATCH 2 of 2] Avoid vcpu migration of paused vcpus Juergen Gross
2012-03-22 10:12 ` Keir Fraser
2012-03-22 10:22 ` Juergen Gross
2012-03-22 11:20 ` Keir Fraser [this message]
2012-03-22 11:59 ` Juergen Gross
2012-03-23 8:17 ` Juergen Gross
2012-03-22 10:26 ` Jan Beulich
2012-03-22 10:38 ` Juergen Gross
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CB90BB7F.3C24E%keir@xen.org \
--to=keir@xen.org \
--cc=JBeulich@suse.com \
--cc=juergen.gross@ts.fujitsu.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).