* [PATCH] Allocate vmcs pages when system booting
@ 2009-11-12 10:52 Jiang, Yunhong
2009-11-12 11:22 ` Keir Fraser
0 siblings, 1 reply; 16+ messages in thread
From: Jiang, Yunhong @ 2009-11-12 10:52 UTC (permalink / raw)
To: xen-devel@lists.xensource.com, Keir Fraser
Currently the VMCS page is allocated when the CPU is brought up and identified, after which spin_debug_enable() is called.
This does not work for CPU hot-add: when a hot-added CPU is brought up and tries to allocate its VMCS page, it hits check_lock, because IRQs are still disabled at that point.
This patch allocates the VMCS pages for all possible CPUs at system boot, so that no VMCS allocation is needed when a secondary CPU comes up.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
diff -r 1db7034f4498 xen/arch/x86/hvm/vmx/vmcs.c
--- a/xen/arch/x86/hvm/vmx/vmcs.c Thu Nov 12 00:05:59 2009 +0800
+++ b/xen/arch/x86/hvm/vmx/vmcs.c Thu Nov 12 04:08:25 2009 +0800
@@ -267,6 +267,34 @@ static void vmx_free_vmcs(struct vmcs_st
static void vmx_free_vmcs(struct vmcs_struct *vmcs)
{
free_xenheap_page(vmcs);
+}
+
+int vmx_allocate_vmcs_pages(void)
+{
+ int cpu, ret = 0;
+
+ for_each_possible_cpu(cpu)
+ {
+ if (per_cpu(host_vmcs, cpu) == NULL)
+ {
+ per_cpu(host_vmcs, cpu) = vmx_alloc_vmcs();
+ if (per_cpu(host_vmcs, cpu) == NULL)
+ {
+ ret = -1;
+ break;
+ }
+ }
+ }
+
+ if (ret < 0)
+ {
+ for_each_possible_cpu(cpu)
+ {
+ if (per_cpu(host_vmcs, cpu))
+ {
+ vmx_free_vmcs(per_cpu(host_vmcs, cpu));
+ per_cpu(host_vmcs, cpu) = NULL;
+ }
+ }
+ }
+ return ret;
}
static void __vmx_clear_vmcs(void *info)
diff -r 1db7034f4498 xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c Thu Nov 12 00:05:59 2009 +0800
+++ b/xen/arch/x86/hvm/vmx/vmx.c Thu Nov 12 04:08:25 2009 +0800
@@ -1450,6 +1450,16 @@ void start_vmx(void)
if ( !test_bit(X86_FEATURE_VMXE, &boot_cpu_data.x86_capability) )
return;
+ /*
+ * We can't allocate VMCS pages in start_secondary() for a hot-added CPU,
+ * because allocating a page at that point triggers a check_lock error.
+ */
+ if (vmx_allocate_vmcs_pages())
+ {
+ printk("VMX: failed to alloca vmcs pages\n");
+ return;
+ }
+
set_in_cr4(X86_CR4_VMXE);
if ( !vmx_cpu_up() )
diff -r 1db7034f4498 xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h Thu Nov 12 00:05:59 2009 +0800
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h Thu Nov 12 04:08:25 2009 +0800
@@ -28,6 +28,7 @@ extern void setup_vmcs_dump(void);
extern void setup_vmcs_dump(void);
extern int vmx_cpu_up(void);
extern void vmx_cpu_down(void);
+extern int vmx_allocate_vmcs_pages(void);
struct vmcs_struct {
u32 vmcs_revision_id;
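For context, the check that the hot-added CPU trips is Xen's spinlock debug hook. A rough paraphrase follows (the exact field names and control flow are approximated, not copied from the spinlock.c of that era):

/* Paraphrase of xen/common/spinlock.c's debug check. With spin debugging
 * enabled, each lock records whether it was first taken with IRQs
 * disabled; any later acquisition in the opposite mode is flagged, which
 * is exactly what a xenheap allocation with IRQs still off runs into. */
static void check_lock(struct lock_debug *debug)
{
    int irq_safe = !local_irq_is_enabled();

    if ( unlikely(atomic_read(&spin_debug) <= 0) )
        return;

    /* Record the first acquisition mode; later ones must match it. */
    if ( cmpxchg(&debug->irq_safe, -1, irq_safe) == !irq_safe )
        BUG();
}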
* Re: [PATCH] Allocate vmcs pages when system booting
2009-11-12 10:52 [PATCH] Allocate vmcs pages when system booting Jiang, Yunhong
@ 2009-11-12 11:22 ` Keir Fraser
2009-11-12 14:58 ` Jiang, Yunhong
2010-01-15 9:06 ` Jiang, Yunhong
0 siblings, 2 replies; 16+ messages in thread
From: Keir Fraser @ 2009-11-12 11:22 UTC (permalink / raw)
To: Jiang, Yunhong, xen-devel@lists.xensource.com
On 12/11/2009 10:52, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> Currently the VMCS page is allocated when the CPU is brought up and
> identified, after which spin_debug_enable() is called.
>
> This does not work for CPU hot-add: when a hot-added CPU is brought up
> and tries to allocate its VMCS page, it hits check_lock, because IRQs
> are still disabled at that point.
>
> This patch allocates the VMCS pages for all possible CPUs at system boot,
> so that no VMCS allocation is needed when a secondary CPU comes up.
>
> Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
A general point: using cpu_possible_map is not a good idea any longer, since
it is going to be all-ones as soon as your physical cpu hotplug patches go
in (I don't intend to make that a build-time option). Hence we could
potentially massively over-allocate pages with this approach.
The good news is that, when bringing a CPU online, we don't need to worry
about acquiring IRQ-unsafe locks with IRQS disabled until the CPU is added
to the cpu_online_map. This is because the race we are trying to avoid with
the spin-debug checks involves rendezvous of CPUs via IPIs, and a CPU not in
cpu_online_map will never get waited on by others.
So, your better fix would be to spin_debug_disable() at the top of
start_secondary(), and spin_debug_enable() just before
cpu_set(...online_map).
Can you try this alternative fix please?
-- Keir
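A sketch of the placement being suggested here (the body of start_secondary() is abbreviated, and the exact call sites are approximated from the discussion, not taken from the committed change):

/* xen/arch/x86/smpboot.c -- sketch only. */
void __devinit start_secondary(void *unused)
{
    /* IRQs are still disabled here, so any IRQ-unsafe lock taken during
     * bring-up (e.g. the heap lock inside vmx_alloc_vmcs()) would
     * falsely trip the spinlock debug checks. */
    spin_debug_disable();

    /* ... existing bring-up work: smp_callin(), identify_cpu() (which
     * ends up in start_vmx()/vmx_cpu_up()), etc. ... */

    /* Nobody can rendezvous with us via IPI until we appear in
     * cpu_online_map, so it is safe to re-enable the checks here. */
    spin_debug_enable();
    cpu_set(smp_processor_id(), cpu_online_map);

    /* ... */
}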
* RE: [PATCH] Allocate vmcs pages when system booting
2009-11-12 11:22 ` Keir Fraser
@ 2009-11-12 14:58 ` Jiang, Yunhong
2009-11-12 15:04 ` Keir Fraser
2010-01-15 9:06 ` Jiang, Yunhong
1 sibling, 1 reply; 16+ messages in thread
From: Jiang, Yunhong @ 2009-11-12 14:58 UTC (permalink / raw)
To: Keir Fraser, xen-devel@lists.xensource.com
Keir Fraser wrote:
> On 12/11/2009 10:52, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> Currently the VMCS page is allocated when the CPU is brought up and
>> identified, after which spin_debug_enable() is called.
>>
>> This does not work for CPU hot-add: when a hot-added CPU is brought up
>> and tries to allocate its VMCS page, it hits check_lock, because IRQs
>> are still disabled at that point.
>>
>> This patch allocates the VMCS pages for all possible CPUs at system
>> boot, so that no VMCS allocation is needed when a secondary CPU comes up.
>>
>> Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
>
> A general point: using cpu_possible_map is not a good idea any
> longer, since it is going to be all-ones as soon as your physical cpu
> hotplug patches go
> in (I don't intend to make that a build-time option). Hence we could
> potentially massively over-allocate pages with this approach.
>
> The good news is that, when bringing a CPU online, we don't
> need to worry
> about acquiring IRQ-unsafe locks with IRQS disabled until the
> CPU is added
> to the cpu_online_map. This is because the race we are trying
> to avoid with
> the spin-debug checks involves rendezvous of CPUs via IPIs,
> and a CPU not in
> cpu_online_map will never get waited on by others.
>
> So, your better fix would be to spin_debug_disable() at the top of
> start_secondary(), and spin_debug_enable() just before
> cpu_set(...online_map).
>
> Can you try this alternative fix please?
Yes, spin_debug_disable() was the first idea that came to my mind. I changed my mind because spin_debug is a global variable: disabling it across start_secondary() means we may not catch errors that happen in that window, so I chose this method instead. Previously I thought a high-end system that supports CPU hotplug may not care about the 64 pages :$
Or do you assume this window is quite small, so that we can safely assume any error occurring in it would also occur in some other timeslot eventually, especially since this function itself doesn't need the check?
Thanks
Yunhong Jiang
>
> -- Keir
* Re: [PATCH] Allocate vmcs pages when system booting
2009-11-12 14:58 ` Jiang, Yunhong
@ 2009-11-12 15:04 ` Keir Fraser
2009-11-12 15:15 ` Jiang, Yunhong
0 siblings, 1 reply; 16+ messages in thread
From: Keir Fraser @ 2009-11-12 15:04 UTC (permalink / raw)
To: Jiang, Yunhong, xen-devel@lists.xensource.com
On 12/11/2009 14:58, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>> Can you try this alternative fix please?
>
> Yes, spin_debug_disable() was the first idea that came to my mind. I changed
> my mind because spin_debug is a global variable: disabling it across
> start_secondary() means we may not catch errors that happen in that window,
> so I chose this method instead. Previously I thought a high-end system that
> supports CPU hotplug may not care about the 64 pages :$
CONFIG_HOTPLUG_CPU is not a user-accessible build option. In fact manually
disabling it in config.h resulted in a broken build, and now I basically
stripped out the non-HOTPLUG_CPU code (in c/s 20431). So this would be an
overhead for absolutely everyone.
> Or do you assume this window is quite small, so that we can safely assume
> any error occurring in it would also occur in some other timeslot
> eventually, especially since this function itself doesn't need the check?
Exactly. The spinlock debug checking can be best effort. I did think about
allowing the checks to be disabled per-cpu, but I don't think it's worth it.
I will check in my suggested fix, then.
-- Keir
* RE: [PATCH] Allocate vmcs pages when system booting
2009-11-12 15:04 ` Keir Fraser
@ 2009-11-12 15:15 ` Jiang, Yunhong
0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Yunhong @ 2009-11-12 15:15 UTC (permalink / raw)
To: Keir Fraser, xen-devel@lists.xensource.com
Keir Fraser wrote:
> On 12/11/2009 14:58, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>>> Can you try this alternative fix please?
>>
>> Yes, spin_debug_disable() was the first idea that came to my mind. I
>> changed my mind because spin_debug is a global variable: disabling it
>> across start_secondary() means we may not catch errors that happen in
>> that window, so I chose this method instead. Previously I thought a
>> high-end system that supports CPU hotplug may not care about the
>> 64 pages :$
>
> CONFIG_HOTPLUG_CPU is not a user-accessible build option. In
> fact manually
> disabling it in config.h resulted in a broken build, and now I
> basically stripped out the non-HOTPLUG_CPU code (in c/s 20431). So
> this
> would be an
> overhead for absolutely everyone.
>
>> Or do you assume this window is quite small, so that we can safely
>> assume any error occurring in it would also occur in some other
>> timeslot eventually, especially since this function itself doesn't
>> need the check?
>
> Exactly. The spinlock debug checking can be best effort. I did
> think about
> allowing the checks to be disabled per-cpu, but I don't think
> it's worth it.
Agreed, it doesn't need to be per-CPU.
>
> I will check in my suggested fix, then.
OK, I will rework it this way tomorrow.
Thanks
Yunhong Jiang
>
> -- Keir
* RE: [PATCH] Allocate vmcs pages when system booting
2009-11-12 11:22 ` Keir Fraser
2009-11-12 14:58 ` Jiang, Yunhong
@ 2010-01-15 9:06 ` Jiang, Yunhong
2010-01-15 9:30 ` Jan Beulich
2010-01-15 10:28 ` Keir Fraser
1 sibling, 2 replies; 16+ messages in thread
From: Jiang, Yunhong @ 2010-01-15 9:06 UTC (permalink / raw)
To: Keir Fraser, xen-devel@lists.xensource.com
Hi Keir, recently a bug was reported in the VMCS page allocation for a newly added CPU. When we try to allocate the VMCS page, if the allocated page needs a TLB flush, we hit the ASSERT in flush_area_mask() in the following code.
So my questions are:
a) Why do we need to make sure IRQs are enabled in flush_area_mask()? I assume it is because this function may take a long time waiting for all CPUs to flush their TLBs; am I right?
b) Can we still turn to the original patch, i.e. pre-allocate the VMCS pages for all possible CPUs?
Thanks
Yunhong Jiang
void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
{
    ASSERT(local_irq_is_enabled());

    if ( cpu_isset(smp_processor_id(), *mask) )
        flush_area_local(va, flags);
The trace for the error:
----------------------
(XEN) Xen call trace:
(XEN) [<ffff82c48016d40b>] flush_area_mask+0x2c/0x140
(XEN) [<ffff82c480114da7>] alloc_heap_pages+0x44f/0x493
(XEN) [<ffff82c48011642b>] alloc_domheap_pages+0x115/0x186
(XEN) [<ffff82c4801164ec>] alloc_xenheap_pages+0x50/0xd8
(XEN) [<ffff82c4801b3b4f>] vmx_alloc_vmcs+0x18/0x65
(XEN) [<ffff82c4801b45e9>] vmx_cpu_up+0x4d3/0x72a
(XEN) [<ffff82c4801b83fc>] start_vmx+0x86/0x144
(XEN) [<ffff82c480191703>] init_intel+0x2a2/0x2ff
(XEN) [<ffff82c480191094>] identify_cpu+0xc3/0x23d
(XEN) [<ffff82c48016d7b4>] smp_store_cpu_info+0x3b/0xc1
(XEN) [<ffff82c48016d93b>] smp_callin+0x101/0x227
(XEN) [<ffff82c48016ec85>] start_secondary+0xb3/0x43f
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 56:
(XEN) Assertion 'local_irq_is_enabled()' failed at smp.c:223
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Thursday, November 12, 2009 7:23 PM
>To: Jiang, Yunhong; xen-devel@lists.xensource.com
>Subject: Re: [PATCH] Allocate vmcs pages when system booting
>
>On 12/11/2009 10:52, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> Currently the VMCS page is allocated when the CPU is brought up and
>> identified, after which spin_debug_enable() is called.
>>
>> This does not work for CPU hot-add: when a hot-added CPU is brought up
>> and tries to allocate its VMCS page, it hits check_lock, because IRQs
>> are still disabled at that point.
>>
>> This patch allocates the VMCS pages for all possible CPUs at system
>> boot, so that no VMCS allocation is needed when a secondary CPU comes up.
>>
>> Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
>
>A general point: using cpu_possible_map is not a good idea any longer, since
>it is going to be all-ones as soon as your physical cpu hotplug patches go
>in (I don't intend to make that a build-time option). Hence we could
>potentially massively over-allocate pages with this approach.
>
>The good news is that, when bringing a CPU online, we don't need to worry
>about acquiring IRQ-unsafe locks with IRQS disabled until the CPU is added
>to the cpu_online_map. This is because the race we are trying to avoid with
>the spin-debug checks involves rendezvous of CPUs via IPIs, and a CPU not in
>cpu_online_map will never get waited on by others.
>
>So, your better fix would be to spin_debug_disable() at the top of
>start_secondary(), and spin_debug_enable() just before
>cpu_set(...online_map).
>
>Can you try this alternative fix please?
>
> -- Keir
>
* RE: [PATCH] Allocate vmcs pages when system booting
2010-01-15 9:06 ` Jiang, Yunhong
@ 2010-01-15 9:30 ` Jan Beulich
2010-01-15 10:31 ` Keir Fraser
2010-01-15 10:28 ` Keir Fraser
1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2010-01-15 9:30 UTC (permalink / raw)
To: Keir Fraser, Yunhong Jiang; +Cc: xen-devel@lists.xensource.com
>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 15.01.10 10:06 >>>
>b) Can we still turn to the original patch, i.e. pre-allocate the VMCS pages for all possible CPUs?
I'm generally opposed to pre-allocations, at least as long as all CPUs are
considered 'possible'. Building with a relatively high nr_cpus= setting
and then running on a relatively small system results in (close to)
unacceptable overhead.
In fact it's really not clear to me why cpu_possible_map must be set to
CPU_MASK_ALL - if Linux has ways to avoid this, Xen should have too.
Jan
* Re: [PATCH] Allocate vmcs pages when system booting
2010-01-15 9:06 ` Jiang, Yunhong
2010-01-15 9:30 ` Jan Beulich
@ 2010-01-15 10:28 ` Keir Fraser
1 sibling, 0 replies; 16+ messages in thread
From: Keir Fraser @ 2010-01-15 10:28 UTC (permalink / raw)
To: Jiang, Yunhong, xen-devel@lists.xensource.com
Another example where a hotplug notifier would be handy. Can you add a VMX
hook in the path bringing the CPU up (e.g., cpu_up, or do_boot_cpu), kind-of
equivalent to what would happen if we had CPU_UP_PREPARE notifiers?
From there you can allocate memory in a safe (irqs enabled, on a cpu that is
already fully online) context.
This is the right direction for when notifiers get added -- I think I will
sort that out after 4.0 is done.
Btw, I am away rest of today and not back on email until Monday. I can
discuss further then. But we'll make sure this is sorted one way or another
before 4.0.0-rc2.
-- Keir
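The CPU_UP_PREPARE-style hook described above could look roughly like the sketch below. Xen had no notifier chains at this point, so the notifier API here is hypothetical (modelled on Linux's); hvm_cpu_prepare() is the hook introduced in the patch later in this thread:

/* Hypothetical notifier -- sketch only. */
static int cpu_callback(struct notifier_block *nfb,
                        unsigned long action, void *hcpu)
{
    unsigned int cpu = (unsigned long)hcpu;

    switch ( action )
    {
    case CPU_UP_PREPARE:
        /* Runs with IRQs enabled on a CPU that is already fully online,
         * so xenheap allocations for the incoming CPU are safe here. */
        if ( hvm_cpu_prepare(cpu) != 0 )
            return NOTIFY_BAD;
        break;
    }

    return NOTIFY_DONE;
}

static struct notifier_block cpu_nfb = { .notifier_call = cpu_callback };
/* Registered once at boot: register_cpu_notifier(&cpu_nfb); */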
On 15/01/2010 09:06, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> Hi Keir, recently a bug was reported in the VMCS page allocation for a newly
> added CPU. When we try to allocate the VMCS page, if the allocated page needs
> a TLB flush, we hit the ASSERT in flush_area_mask() in the following code.
>
> So my questions are:
> a) Why do we need to make sure IRQs are enabled in flush_area_mask()? I
> assume it is because this function may take a long time waiting for all CPUs
> to flush their TLBs; am I right?
> b) Can we still turn to the original patch, i.e. pre-allocate the VMCS pages
> for all possible CPUs?
>
> Thanks
> Yunhong Jiang
>
> void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int
> flags)
> {
> ASSERT(local_irq_is_enabled());
>
> if ( cpu_isset(smp_processor_id(), *mask) )
> flush_area_local(va, flags);
>
> The trace for the error:
> ----------------------
> (XEN) Xen call trace:
> (XEN) [<ffff82c48016d40b>] flush_area_mask+0x2c/0x140
> (XEN) [<ffff82c480114da7>] alloc_heap_pages+0x44f/0x493
> (XEN) [<ffff82c48011642b>] alloc_domheap_pages+0x115/0x186
> (XEN) [<ffff82c4801164ec>] alloc_xenheap_pages+0x50/0xd8
> (XEN) [<ffff82c4801b3b4f>] vmx_alloc_vmcs+0x18/0x65
> (XEN) [<ffff82c4801b45e9>] vmx_cpu_up+0x4d3/0x72a
> (XEN) [<ffff82c4801b83fc>] start_vmx+0x86/0x144
> (XEN) [<ffff82c480191703>] init_intel+0x2a2/0x2ff
> (XEN) [<ffff82c480191094>] identify_cpu+0xc3/0x23d
> (XEN) [<ffff82c48016d7b4>] smp_store_cpu_info+0x3b/0xc1
> (XEN) [<ffff82c48016d93b>] smp_callin+0x101/0x227
> (XEN) [<ffff82c48016ec85>] start_secondary+0xb3/0x43f
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 56:
> (XEN) Assertion 'local_irq_is_enabled()' failed at smp.c:223
>
>
>
>> -----Original Message-----
>> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>> Sent: Thursday, November 12, 2009 7:23 PM
>> To: Jiang, Yunhong; xen-devel@lists.xensource.com
>> Subject: Re: [PATCH] Allocate vmcs pages when system booting
>>
>> On 12/11/2009 10:52, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>
>>> Currently the VMCS page is allocated when the CPU is brought up and
>>> identified, after which spin_debug_enable() is called.
>>>
>>> This does not work for CPU hot-add: when a hot-added CPU is brought up
>>> and tries to allocate its VMCS page, it hits check_lock, because IRQs
>>> are still disabled at that point.
>>>
>>> This patch allocates the VMCS pages for all possible CPUs at system
>>> boot, so that no VMCS allocation is needed when a secondary CPU comes up.
>>>
>>> Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
>>
>> A general point: using cpu_possible_map is not a good idea any longer, since
>> it is going to be all-ones as soon as your physical cpu hotplug patches go
>> in (I don't intend to make that a build-time option). Hence we could
>> potentially massively over-allocate pages with this approach.
>>
>> The good news is that, when bringing a CPU online, we don't need to worry
>> about acquiring IRQ-unsafe locks with IRQS disabled until the CPU is added
>> to the cpu_online_map. This is because the race we are trying to avoid with
>> the spin-debug checks involves rendezvous of CPUs via IPIs, and a CPU not in
>> cpu_online_map will never get waited on by others.
>>
>> So, your better fix would be to spin_debug_disable() at the top of
>> start_secondary(), and spin_debug_enable() just before
>> cpu_set(...online_map).
>>
>> Can you try this alternative fix please?
>>
>> -- Keir
>>
>
* Re: RE: [PATCH] Allocate vmcs pages when system booting
2010-01-15 9:30 ` Jan Beulich
@ 2010-01-15 10:31 ` Keir Fraser
2010-01-17 2:02 ` Jiang, Yunhong
2010-03-18 10:45 ` Jan Beulich
0 siblings, 2 replies; 16+ messages in thread
From: Keir Fraser @ 2010-01-15 10:31 UTC (permalink / raw)
To: Jan Beulich, Yunhong Jiang; +Cc: xen-devel@lists.xensource.com
On 15/01/2010 09:30, "Jan Beulich" <JBeulich@novell.com> wrote:
>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 15.01.10 10:06 >>>
>> b) Can we still turn to the original patch, i.e. pre-allocate the VMCS
>> pages for all possible CPUs?
>
> I'm generally opposed to pre-allocations, at least as long as all CPUs are
> considered 'possible'. Building with a relatively high nr_cpus= setting
> and then running on a relatively small system results in (close to)
> unacceptable overhead.
>
> In fact it's really not clear to me why cpu_possible_map must be set to
> CPU_MASK_ALL - if Linux has ways to avoid this, Xen should have too.
Does Linux manage a good reliable job of cutting down cpu_possible_map? That
would save cpu_possible_map in my eyes, if we could do that. Otherwise it is
indeed pretty useless now. Either way, I'd like cpu hotplug notifier chains
in the 4.1 development window.
-- Keir
* RE: RE: [PATCH] Allocate vmcs pages when system booting
2010-01-15 10:31 ` Keir Fraser
@ 2010-01-17 2:02 ` Jiang, Yunhong
2010-01-17 21:35 ` Keir Fraser
2010-03-18 10:45 ` Jan Beulich
1 sibling, 1 reply; 16+ messages in thread
From: Jiang, Yunhong @ 2010-01-17 2:02 UTC (permalink / raw)
To: Keir Fraser, Jan Beulich; +Cc: xen-devel@lists.xensource.com
On the Linux side, it requires some ACPI support. See the information from Documentation/x86/x86_64/cpu-hotplug-spec:
Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to
know in advance of boot time the maximum number of CPUs that could be plugged
into the system. ACPI 3.0 currently has no official way to supply
this information from the firmware to the operating system.
But this requirement was not added in ACPI 4.0 either, so when working on CPU hotplug I didn't try to use the same method.
After reading the code again, I think I missed one possible place to achieve the same purpose. We can simply allocate the VMCS page from cpu_hotadd in the XENPF_cpu_hotadd handler, where the CPU changes from possible to present. Later, when we bring up the CPU, we don't need to allocate the VMCS page anymore. That is, we still allocate VMCS pages, but only for present CPUs, not all possible CPUs. When a CPU changes from logically online to offline, its VMCS page will not be freed.
Do you agree with this method?
Thanks
Yunhong Jiang
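A minimal sketch of that proposal (the shape of cpu_add() and the mp_register_lapic() signature are approximated from the hot-add code of the time; hvm_cpu_prepare() is borrowed from the patch that follows later in this thread):

/* Called from the XENPF_cpu_hotadd platform op, where the CPU moves from
 * possible to present. We run with IRQs enabled on a fully online CPU,
 * so allocating here avoids the start_secondary()/check_lock problem. */
long cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm)
{
    int cpu;

    /* ... existing ACPI/MADT bookkeeping marks the CPU present ... */
    cpu = mp_register_lapic(apic_id, 0 /* not yet enabled */);
    if ( cpu < 0 )
        return cpu;

    /* Pre-allocate what the CPU will need at bring-up time. */
    if ( hvm_cpu_prepare(cpu) != 0 )
        return -ENOMEM;

    return cpu;
}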
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Friday, January 15, 2010 6:32 PM
>To: Jan Beulich; Jiang, Yunhong
>Cc: xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] RE: [PATCH] Allocate vmcs pages when system booting
>
>On 15/01/2010 09:30, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 15.01.10 10:06 >>>
>>> b) Can we still turn to the original patch, i.e. pre-allocate the VMCS
>>> pages for all possible CPUs?
>>
>> I'm generally opposed to pre-allocations, at least as long as all CPUs are
>> considered 'possible'. Building with a relatively high nr_cpus= setting
>> and then running on a relatively small system results in (close to)
>> unacceptable overhead.
>>
>> In fact it's really not clear to me why cpu_possible_map must be set to
>> CPU_MASK_ALL - if Linux has ways to avoid this, Xen should have too.
>
>Does Linux manage a good reliable job of cutting down cpu_possible_map? That
>would save cpu_possible_map in my eyes, if we could do that. Otherwise it is
>indeed pretty useless now. Either way, I'd like cpu hotplug notifier chains
>in the 4.1 development window.
>
> -- Keir
>
* Re: RE: [PATCH] Allocate vmcs pages when system booting
2010-01-17 2:02 ` Jiang, Yunhong
@ 2010-01-17 21:35 ` Keir Fraser
2010-01-18 8:11 ` Keir Fraser
0 siblings, 1 reply; 16+ messages in thread
From: Keir Fraser @ 2010-01-17 21:35 UTC (permalink / raw)
To: Jiang, Yunhong, Jan Beulich; +Cc: xen-devel@lists.xensource.com
On 17/01/2010 02:02, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
> After reading the code again, I think I missed one possible place to
> achieve the same purpose. We can simply allocate the VMCS page from
> cpu_hotadd in the XENPF_cpu_hotadd handler, where the CPU changes from
> possible to present. Later, when we bring up the CPU, we don't need to
> allocate the VMCS page anymore. That is, we still allocate VMCS pages,
> but only for present CPUs, not all possible CPUs. When a CPU changes
> from logically online to offline, its VMCS page will not be freed.
>
> Do you agree with this method?
I will come up with a patch tomorrow for you to try.
-- Keir
* Re: RE: [PATCH] Allocate vmcs pages when system booting
2010-01-17 21:35 ` Keir Fraser
@ 2010-01-18 8:11 ` Keir Fraser
2010-01-18 8:22 ` Jiang, Yunhong
0 siblings, 1 reply; 16+ messages in thread
From: Keir Fraser @ 2010-01-18 8:11 UTC (permalink / raw)
To: Jiang, Yunhong, Jan Beulich; +Cc: xen-devel@lists.xensource.com
On 17/01/2010 21:35, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
> On 17/01/2010 02:02, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>
>> Do you agree with this method?
>
> I will come up with a patch tomorrow for you to try.
Does the attached patch work okay for you?
-- Keir
[-- Attachment #2: 00-hvm-cpu-prepare --]
[-- Type: application/octet-stream, Size: 4515 bytes --]
diff -r 7a8cee80597e xen/arch/x86/hvm/svm/svm.c
--- a/xen/arch/x86/hvm/svm/svm.c Sun Jan 17 18:20:04 2010 +0000
+++ b/xen/arch/x86/hvm/svm/svm.c Mon Jan 18 08:10:28 2010 +0000
@@ -809,6 +809,16 @@
return 0;
}
+static int svm_cpu_prepare(unsigned int cpu)
+{
+ if ( ((hsa[cpu] == NULL) &&
+ ((hsa[cpu] = alloc_host_save_area()) == NULL)) ||
+ ((root_vmcb[cpu] == NULL) &&
+ ((root_vmcb[cpu] = alloc_vmcb()) == NULL)) )
+ return -ENOMEM;
+ return 0;
+}
+
static int svm_cpu_up(struct cpuinfo_x86 *c)
{
u32 eax, edx, phys_hsa_lo, phys_hsa_hi;
@@ -823,10 +833,7 @@
return 0;
}
- if ( ((hsa[cpu] == NULL) &&
- ((hsa[cpu] = alloc_host_save_area()) == NULL)) ||
- ((root_vmcb[cpu] == NULL) &&
- ((root_vmcb[cpu] = alloc_vmcb()) == NULL)) )
+ if ( svm_cpu_prepare(cpu) != 0 )
return 0;
write_efer(read_efer() | EFER_SVME);
@@ -1231,6 +1238,7 @@
static struct hvm_function_table __read_mostly svm_function_table = {
.name = "SVM",
+ .cpu_prepare = svm_cpu_prepare,
.cpu_down = svm_cpu_down,
.domain_initialise = svm_domain_initialise,
.domain_destroy = svm_domain_destroy,
diff -r 7a8cee80597e xen/arch/x86/hvm/vmx/vmcs.c
--- a/xen/arch/x86/hvm/vmx/vmcs.c Sun Jan 17 18:20:04 2010 +0000
+++ b/xen/arch/x86/hvm/vmx/vmcs.c Mon Jan 18 08:10:28 2010 +0000
@@ -320,6 +320,19 @@
local_irq_restore(flags);
}
+int vmx_cpu_prepare(unsigned int cpu)
+{
+ if ( this_cpu(host_vmcs) != NULL )
+ return 0;
+
+ this_cpu(host_vmcs) = vmx_alloc_vmcs();
+ if ( this_cpu(host_vmcs) != NULL )
+ return 0;
+
+ printk("CPU%d: Could not allocate host VMCS\n", cpu);
+ return -ENOMEM;
+}
+
int vmx_cpu_up(void)
{
u32 eax, edx;
@@ -367,15 +380,8 @@
INIT_LIST_HEAD(&this_cpu(active_vmcs_list));
- if ( this_cpu(host_vmcs) == NULL )
- {
- this_cpu(host_vmcs) = vmx_alloc_vmcs();
- if ( this_cpu(host_vmcs) == NULL )
- {
- printk("CPU%d: Could not allocate host VMCS\n", cpu);
- return 0;
- }
- }
+ if ( vmx_cpu_prepare(cpu) != 0 )
+ return 0;
switch ( __vmxon(virt_to_maddr(this_cpu(host_vmcs))) )
{
diff -r 7a8cee80597e xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c Sun Jan 17 18:20:04 2010 +0000
+++ b/xen/arch/x86/hvm/vmx/vmx.c Mon Jan 18 08:10:28 2010 +0000
@@ -1377,6 +1377,7 @@
static struct hvm_function_table __read_mostly vmx_function_table = {
.name = "VMX",
+ .cpu_prepare = vmx_cpu_prepare,
.domain_initialise = vmx_domain_initialise,
.domain_destroy = vmx_domain_destroy,
.vcpu_initialise = vmx_vcpu_initialise,
diff -r 7a8cee80597e xen/arch/x86/smpboot.c
--- a/xen/arch/x86/smpboot.c Sun Jan 17 18:20:04 2010 +0000
+++ b/xen/arch/x86/smpboot.c Mon Jan 18 08:10:28 2010 +0000
@@ -1518,7 +1518,11 @@
int __devinit __cpu_up(unsigned int cpu)
{
- int ret = 0;
+ int ret;
+
+ ret = hvm_cpu_prepare(cpu);
+ if (ret)
+ return ret;
/*
* We do warm boot only on cpus that had booted earlier
diff -r 7a8cee80597e xen/include/asm-x86/hvm/hvm.h
--- a/xen/include/asm-x86/hvm/hvm.h Sun Jan 17 18:20:04 2010 +0000
+++ b/xen/include/asm-x86/hvm/hvm.h Mon Jan 18 08:10:28 2010 +0000
@@ -111,6 +111,7 @@
int (*event_pending)(struct vcpu *v);
int (*do_pmu_interrupt)(struct cpu_user_regs *regs);
+ int (*cpu_prepare)(unsigned int cpu);
int (*cpu_up)(void);
void (*cpu_down)(void);
@@ -290,11 +291,15 @@
void hvm_set_rdtsc_exiting(struct domain *d, bool_t enable);
int hvm_gtsc_need_scale(struct domain *d);
+static inline int
+hvm_cpu_prepare(unsigned int cpu)
+{
+ return (hvm_funcs.cpu_prepare ? hvm_funcs.cpu_prepare(cpu) : 0);
+}
+
static inline int hvm_cpu_up(void)
{
- if ( hvm_funcs.cpu_up )
- return hvm_funcs.cpu_up();
- return 1;
+ return (hvm_funcs.cpu_up ? hvm_funcs.cpu_up() : 1);
}
static inline void hvm_cpu_down(void)
diff -r 7a8cee80597e xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h Sun Jan 17 18:20:04 2010 +0000
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h Mon Jan 18 08:10:28 2010 +0000
@@ -26,6 +26,7 @@
extern void start_vmx(void);
extern void vmcs_dump_vcpu(struct vcpu *v);
extern void setup_vmcs_dump(void);
+extern int vmx_cpu_prepare(unsigned int cpu);
extern int vmx_cpu_up(void);
extern void vmx_cpu_down(void);
* RE: RE: [PATCH] Allocate vmcs pages when system booting
2010-01-18 8:11 ` Keir Fraser
@ 2010-01-18 8:22 ` Jiang, Yunhong
0 siblings, 0 replies; 16+ messages in thread
From: Jiang, Yunhong @ 2010-01-18 8:22 UTC (permalink / raw)
To: Keir Fraser, Jan Beulich; +Cc: xen-devel@lists.xensource.com
Thanks a lot; I will give it a try.
Thanks
--jyh
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>Sent: Monday, January 18, 2010 4:12 PM
>To: Jiang, Yunhong; Jan Beulich
>Cc: xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] RE: [PATCH] Allocate vmcs pages when system booting
>
>On 17/01/2010 21:35, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:
>
>> On 17/01/2010 02:02, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:
>>
>>> Do you agree with this method?
>>
>> I will come up with a patch tomorrow for you to try.
>
>Does the attached patch work okay for you?
>
> -- Keir
* Re: RE: [PATCH] Allocate vmcs pages when system booting
2010-01-15 10:31 ` Keir Fraser
2010-01-17 2:02 ` Jiang, Yunhong
@ 2010-03-18 10:45 ` Jan Beulich
2010-03-18 10:51 ` Keir Fraser
2010-03-19 2:17 ` Jiang, Yunhong
1 sibling, 2 replies; 16+ messages in thread
From: Jan Beulich @ 2010-03-18 10:45 UTC (permalink / raw)
To: Keir Fraser, Yunhong Jiang; +Cc: xen-devel@lists.xensource.com
>>> Keir Fraser <keir.fraser@eu.citrix.com> 15.01.10 11:31 >>>
>On 15/01/2010 09:30, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 15.01.10 10:06 >>>
>>> b) Can we still turn to the original patch, i.e. pre-allocate the VMCS
>>> pages for all possible CPUs?
>>
>> I'm generally opposed to pre-allocations, at least as long as all CPUs are
>> considered 'possible'. Building with a relatively high nr_cpus= setting
>> and then running on a relatively small system results in (close to)
>> unacceptable overhead.
>>
>> In fact it's really not clear to me why cpu_possible_map must be set to
>> CPU_MASK_ALL - if Linux has ways to avoid this, Xen should have too.
>
>Does Linux manage a good reliable job of cutting down cpu_possible_map? That
>would save cpu_possible_map in my eyes, if we could do that. Otherwise it is
>indeed pretty useless now. Either way, I'd like cpu hotplug notifier chains
>in the 4.1 development window.
Only now got to look into this: Linux simply counts the disabled CPUs
along with the present ones, plus it allows a command line override to
the number of CPU slots reserved for hotplug. I just put together
something similar, partly copying code over from there. Will submit
post-4.0.
Jan
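The sizing scheme Jan describes boils down to something like the following sketch (identifier names follow Linux's prefill_possible_map(), not the eventual Xen patch; disabled_cpus stands for the count of MADT LAPIC entries with the Enabled bit clear):

/* Hypothetical command-line override, as in Linux's additional_cpus=. */
static int __initdata additional_cpus = -1;
integer_param("additional_cpus", additional_cpus);

void __init prefill_possible_map(void)
{
    unsigned int i, possible = num_present_cpus();

    /* Reserve hotplug slots: an explicit override wins, otherwise use
     * the number of disabled (not-yet-plugged) LAPICs from the MADT. */
    possible += (additional_cpus >= 0) ? additional_cpus : disabled_cpus;

    if ( possible > NR_CPUS )
        possible = NR_CPUS;

    for ( i = 0; i < possible; i++ )
        cpu_set(i, cpu_possible_map);
}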
* Re: RE: [PATCH] Allocate vmcs pages when system booting
2010-03-18 10:45 ` Jan Beulich
@ 2010-03-18 10:51 ` Keir Fraser
2010-03-19 2:17 ` Jiang, Yunhong
1 sibling, 0 replies; 16+ messages in thread
From: Keir Fraser @ 2010-03-18 10:51 UTC (permalink / raw)
To: Jan Beulich, Yunhong Jiang; +Cc: xen-devel@lists.xensource.com
On 18/3/10 10:45, "Jan Beulich" <JBeulich@novell.com> wrote:
>>> In fact it's really not clear to me why cpu_possible_map must be set to
>>> CPU_MASK_ALL - if Linux has ways to avoid this, Xen should have too.
>>
>> Does Linux manage a good reliable job of cutting down cpu_possible_map? That
>> would save cpu_possible_map in my eyes, if we could do that. Otherwise it is
>> indeed pretty useless now. Either way, I'd like cpu hotplug notifier chains
>> in the 4.1 development window.
>
> Only now got to look into this: Linux simply counts the disabled CPUs
> along with the present ones, plus it allows a command line override to
> the number of CPU slots reserved for hotplug. I just put together
> something similar, partly copying code over from there. Will submit
> post-4.0.
Intel are going to submit patches for cpu hotplug notifiers post-4.0, which
is the best solution. I think we can get rid of all possible_map uses in the
end.
-- Keir
* RE: RE: [PATCH] Allocate vmcs pages when system booting
2010-03-18 10:45 ` Jan Beulich
2010-03-18 10:51 ` Keir Fraser
@ 2010-03-19 2:17 ` Jiang, Yunhong
1 sibling, 0 replies; 16+ messages in thread
From: Jiang, Yunhong @ 2010-03-19 2:17 UTC (permalink / raw)
To: Jan Beulich, Keir Fraser; +Cc: xen-devel@lists.xensource.com
>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@novell.com]
>Sent: Thursday, March 18, 2010 6:45 PM
>To: Keir Fraser; Jiang, Yunhong
>Cc: xen-devel@lists.xensource.com
>Subject: Re: [Xen-devel] RE: [PATCH] Allocate vmcs pages when system booting
>
>>>> Keir Fraser <keir.fraser@eu.citrix.com> 15.01.10 11:31 >>>
>>On 15/01/2010 09:30, "Jan Beulich" <JBeulich@novell.com> wrote:
>>
>>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 15.01.10 10:06 >>>
>>>> b) Can we still turn to the original patch, i.e. pre-allocate the VMCS
>>>> pages for all possible CPUs?
>>>
>>> I'm generally opposed to pre-allocations, at least as long as all CPUs are
>>> considered 'possible'. Building with a relatively high nr_cpus= setting
>>> and then running on a relatively small system results in (close to)
>>> unacceptable overhead.
>>>
>>> In fact it's really not clear to me why cpu_possible_map must be set to
>>> CPU_MASK_ALL - if Linux has ways to avoid this, Xen should have too.
>>
>>Does Linux manage a good reliable job of cutting down cpu_possible_map? That
>>would save cpu_possible_map in my eyes, if we could do that. Otherwise it is
>>indeed pretty useless now. Either way, I'd like cpu hotplug notifier chains
>>in the 4.1 development window.
>
>Only now got to look into this: Linux simply counts the disabled CPUs
>along with the present ones, plus it allows a command line override to
>the number of CPU slots reserved for hotplug. I just put together
>something similar, partly copying code over from there. Will submit
>post-4.0.
>
>Jan
Jan, thanks for looking into this. Yes, that's explained in the kernel's "linux-2.6/Documentation/x86/x86_64/cpu-hotplug-spec". I didn't take the assumption that the MADT will always list hot-pluggable CPUs.
The reasons I took this approach are:
a) At that time, I thought Linux is not the only possible dom0; other OSes like Solaris or FreeBSD could in theory work as dom0 too, so we had better follow the spec's statement rather than a particular kernel's implementation; after all, Xen is the HYPERVISOR :-)
b) That statement talks about ACPI 3.0, and AFAIK ACPI 4.0 still didn't add this requirement.
c) I didn't realize this VMCS page issue when I was working on the patch :$
Of course, I'm not strongly against the kernel's method, and your patch is OK for me.
Below is the kernel's doc:
"Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to
know in advance of boot time the maximum number of CPUs that could be plugged
into the system. ACPI 3.0 currently has no official way to supply
this information from the firmware to the operating system.
......
For CPU hotplug Linux/x86-64 expects now that any possible future hotpluggable
CPU is already available in the MADT. If the CPU is not available yet
it should have its LAPIC Enabled bit set to 0. Linux will use the number
of disabled LAPICs to compute the maximum number of future CPUs.
..
In the worst case the user can overwrite this choice using a command line
option (additional_cpus=...), but it is recommended to supply the correct
number (or a reasonable approximation of it, with erring towards more not less)
in the MADT to avoid manual configuration."
--jyh