* Re: [Xen-users] 2.6.23 oops
[not found] <20071011144817.n5x1drwcgug44sg0@whitelist.dk>
@ 2007-10-11 15:29 ` Mark Williamson
2007-10-11 16:27 ` Mark Williamson
2007-10-11 16:41 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 16+ messages in thread
From: Mark Williamson @ 2007-10-11 15:29 UTC (permalink / raw)
To: xen-devel; +Cc: Jeremy Fitzhardinge, Morten Bøgeskov
I'm bringing this discussion onto Xen-devel as it smells like it needs some
more specific developer input than I can give.
> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> boot, with it panics. I've attached my config, if somebody thinks I've
> left something out
Do you mean that it hangs during boot if SMP is not compiled into the guest
kernel? That seems strange, I'll try that out myself and see what happens.
How far into the boot does it manage to get? Has it started running userspace
apps (the normal startup messages starting essential services), or is it
still during the kernel initialisation?
> Does anybody else see this:
I've booted a kernel successfully on a UP AMD64 box with an SMP 2.6.23 kernel,
vcpus = 1 but I didn't bother giving it a virtual disk, so I don't know if
userspace worked. I'll give it a try shortly...
Cheers,
Mark
> ------------[ cut here ]------------
> kernel BUG at arch/i386/xen/multicalls.c:68!
> invalid opcode: 0000 [#1]
> SMP
> CPU: 0
> EIP: 0061:[<c01019a6>] Not tainted VLI
> EFLAGS: 00010002 (2.6.23 #26)
> EIP is at xen_mc_flush+0xa6/0xb0
> eax: 00000001 ebx: 00000003 ecx: 00000001 edx: c10c1060
> esi: 00000000 edi: 00000001 ebp: c5c8d000 esp: c1101d28
> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: e021
> Process swapper (pid: 1, ti=c1100000 task=c10eda90 task.ti=c1100000)
> Stack: 000218ee c10c10a0 c10c1460 c0101ede c5c8ce40 c02b90e0 c10eda90 00000000
>        c0101ff3 c5c8ce40 c015caca c5d3b49c 00000080 c5eadaa0 00000000 c10eed80
>        c01584d0 c02b9a80 c5cfd560 c02a53d2 c1142440 c015c077 c1101d84 c5eadaa0
> Call Trace:
> [<c0101ede>] xen_pgd_pin+0x9e/0x100
> [<c0101ff3>] xen_activate_mm+0x13/0x20
> [<c015caca>] flush_old_exec+0x3ca/0x7f0
> [<c01584d0>] do_sync_read+0x0/0x120
> [<c015c077>] kernel_read+0x37/0x50
> [<c018308e>] load_elf_binary+0x2fe/0x1af0
> [<c013ec27>] __alloc_pages+0x57/0x310
> [<c0155387>] kmem_cache_alloc+0x47/0x90
> [<c0147610>] handle_mm_fault+0x540/0x710
> [<c01585a5>] do_sync_read+0xd5/0x120
> [<c0145e10>] vm_normal_page+0x10/0x70
> [<c0145e10>] vm_normal_page+0x10/0x70
> [<c0146471>] follow_page+0xf1/0x170
> [<c01478ca>] get_user_pages+0xea/0x2e0
> [<c015bbe2>] get_arg_page+0x42/0xa0
> [<c015bdc6>] copy_strings+0x186/0x1a0
> [<c015be64>] search_binary_handler+0x54/0x110
> [<c015d83b>] do_execve+0x14b/0x170
> [<c010425f>] sys_execve+0x2f/0x90
> [<c0105c72>] syscall_call+0x7/0xb
> [<c010a364>] kernel_execve+0x14/0x20
> [<c0100173>] init_post+0xa3/0xf0
> [<c02d594f>] kernel_init+0x20f/0x310
> [<c0118464>] schedule_tail+0x34/0x90
> [<c0105b16>] ret_from_fork+0x6/0x20
> [<c0105c7b>] syscall_exit+0x5/0x1b
> [<c02d5740>] kernel_init+0x0/0x310
> [<c02d5740>] kernel_init+0x0/0x310
> [<c0106e77>] kernel_thread_helper+0x7/0x10
> =======================
--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
* Re: Re: [Xen-users] 2.6.23 oops
2007-10-11 15:29 ` [Xen-users] 2.6.23 oops Mark Williamson
@ 2007-10-11 16:27 ` Mark Williamson
2007-10-11 18:22 ` Morten Bøgeskov
2007-10-11 16:41 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 16+ messages in thread
From: Mark Williamson @ 2007-10-11 16:27 UTC (permalink / raw)
To: xen-devel; +Cc: Jeremy Fitzhardinge, Morten Bøgeskov
> > I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> > my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> > boot, with it panics. I've attached my config, if somebody thinks I've
> > left something out
>
> Do you mean that it hangs during boot if SMP is not compiled into the guest
> kernel? That seems strange, I'll try that out myself and see what happens.
Where does it hang for you?
I disabled SMP in my kernel config and found that the guest hung during the
kernel messages, just after:
installing Xen timer for CPU 0
Is this similar to what you observed?
Cheers,
Mark
* Re: [Xen-users] 2.6.23 oops
2007-10-11 15:29 ` [Xen-users] 2.6.23 oops Mark Williamson
2007-10-11 16:27 ` Mark Williamson
@ 2007-10-11 16:41 ` Jeremy Fitzhardinge
2007-10-11 19:00 ` Morten Bøgeskov
1 sibling, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-11 16:41 UTC (permalink / raw)
To: Mark Williamson; +Cc: xen-devel, Morten Bøgeskov
[-- Attachment #1: Type: text/plain, Size: 463 bytes --]
Mark Williamson wrote:
> I'm bringing this discussion onto Xen-devel as it smells like it needs some
> more specific developer input than I can give.
>
>
>> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>> boot, with it panics. I've attached my config, if somebody thinks I've
>> left something out
>>
Oh, I just fixed this. Try these patches.
J
[-- Attachment #2: xen-multicall-callbacks.patch --]
[-- Type: text/x-patch, Size: 2300 bytes --]
Subject: xen: add batch completion callbacks
This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued. This is typically used to unlock
things which must remain locked until the hypercall has taken place.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
---
arch/i386/xen/multicalls.c | 29 ++++++++++++++++++++++++++---
arch/i386/xen/multicalls.h | 3 +++
2 files changed, 29 insertions(+), 3 deletions(-)
===================================================================
--- a/arch/i386/xen/multicalls.c
+++ b/arch/i386/xen/multicalls.c
@@ -32,7 +32,11 @@ struct mc_buffer {
struct mc_buffer {
struct multicall_entry entries[MC_BATCH];
u64 args[MC_ARGS];
- unsigned mcidx, argidx;
+ struct callback {
+ void (*fn)(void *);
+ void *data;
+ } callbacks[MC_BATCH];
+ unsigned mcidx, argidx, cbidx;
};
static DEFINE_PER_CPU(struct mc_buffer, mc_buffer);
@@ -43,6 +47,7 @@ void xen_mc_flush(void)
struct mc_buffer *b = &__get_cpu_var(mc_buffer);
int ret = 0;
unsigned long flags;
+ int i;
BUG_ON(preemptible());
@@ -51,8 +56,6 @@ void xen_mc_flush(void)
local_irq_save(flags);
if (b->mcidx) {
- int i;
-
if (HYPERVISOR_multicall(b->entries, b->mcidx) != 0)
BUG();
for (i = 0; i < b->mcidx; i++)
@@ -64,6 +67,13 @@ void xen_mc_flush(void)
BUG_ON(b->argidx != 0);
local_irq_restore(flags);
+
+ for (i = 0; i < b->cbidx; i++) {
+ struct callback *cb = &b->callbacks[i];
+
+ (*cb->fn)(cb->data);
+ }
+ b->cbidx = 0;
BUG_ON(ret);
}
@@ -88,3 +98,16 @@ struct multicall_space __xen_mc_entry(si
return ret;
}
+
+void xen_mc_callback(void (*fn)(void *), void *data)
+{
+ struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+ struct callback *cb;
+
+ if (b->cbidx == MC_BATCH)
+ xen_mc_flush();
+
+ cb = &b->callbacks[b->cbidx++];
+ cb->fn = fn;
+ cb->data = data;
+}
===================================================================
--- a/arch/i386/xen/multicalls.h
+++ b/arch/i386/xen/multicalls.h
@@ -42,4 +42,7 @@ static inline void xen_mc_issue(unsigned
local_irq_restore(x86_read_percpu(xen_mc_irq_flags));
}
+/* Set up a callback to be called when the current batch is flushed */
+void xen_mc_callback(void (*fn)(void *), void *data);
+
#endif /* _XEN_MULTICALLS_H */
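The batch-completion idea in the patch above can be tried outside the kernel. What follows is only a minimal user-space sketch under stated assumptions: it uses one global buffer instead of a per-CPU one, ignores interrupt state, and replaces the hypercall with a counter. The names mc_queue, mc_flush, mc_callback, release_lock and issued, and the MC_BATCH value of 4, are made up for illustration and are not the kernel's.

```c
#include <assert.h>

#define MC_BATCH 4              /* illustrative size, not the kernel's value */

struct callback {
	void (*fn)(void *);
	void *data;
};

struct mc_buffer {
	int entries[MC_BATCH];  /* stand-in for struct multicall_entry */
	struct callback callbacks[MC_BATCH];
	unsigned mcidx, cbidx;
};

static struct mc_buffer buf;    /* assumption: one buffer, not one per CPU */
static unsigned issued;         /* counts simulated hypercalls */

/* Issue every queued operation, then run the registered callbacks. */
static void mc_flush(void)
{
	unsigned i;

	issued += buf.mcidx;    /* pretend the batch went to the hypervisor */
	buf.mcidx = 0;

	for (i = 0; i < buf.cbidx; i++)
		buf.callbacks[i].fn(buf.callbacks[i].data);
	buf.cbidx = 0;
}

/* Queue one operation, flushing first if the batch is full. */
static void mc_queue(int op)
{
	if (buf.mcidx == MC_BATCH)
		mc_flush();
	buf.entries[buf.mcidx++] = op;
}

/* Register fn(data) to run once the current batch has been issued. */
static void mc_callback(void (*fn)(void *), void *data)
{
	if (buf.cbidx == MC_BATCH)
		mc_flush();
	buf.callbacks[buf.cbidx].fn = fn;
	buf.callbacks[buf.cbidx].data = data;
	buf.cbidx++;
}

/* Example callback: drop a "lock" that had to be held until the flush. */
static void release_lock(void *data)
{
	*(int *)data = 0;
}
```

Queueing an operation and registering release_lock against an int flag leaves the flag set until mc_flush() runs, which is the ordering the patch description asks for: the thing stays locked until the hypercall has actually taken place.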
[-- Attachment #3: xen-handle-lazy-cr3-on-unpin.patch --]
[-- Type: text/x-patch, Size: 5927 bytes --]
Subject: xen: deal with stale cr3 values when unpinning pagetables
When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed. However, this is only possible if there are no
stray uses of the pagetable. The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.
This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3. It is only updated once
the actual hypercall to set cr3 has been completed. Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
---
arch/i386/xen/enlighten.c | 63 ++++++++++++++++++++++++++++++---------------
arch/i386/xen/mmu.c | 33 ++++++++++++++++++++---
arch/i386/xen/xen-ops.h | 1
3 files changed, 71 insertions(+), 26 deletions(-)
===================================================================
--- a/arch/i386/xen/enlighten.c
+++ b/arch/i386/xen/enlighten.c
@@ -55,7 +55,23 @@ DEFINE_PER_CPU(enum paravirt_lazy_mode,
DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
-DEFINE_PER_CPU(unsigned long, xen_cr3);
+
+/*
+ * Note about cr3 (pagetable base) values:
+ *
+ * xen_cr3 contains the current logical cr3 value; it contains the
+ * last set cr3. This may not be the current effective cr3, because
+ * its update may be being lazily deferred. However, a vcpu looking
+ * at its own cr3 can use this value knowing that everything will
+ * be self-consistent.
+ *
+ * xen_current_cr3 contains the actual vcpu cr3; it is set once the
+ * hypercall to set the vcpu cr3 is complete (so it may be a little
+ * out of date, but it will never be set early). If one vcpu is
+ * looking at another vcpu's cr3 value, it should use this variable.
+ */
+DEFINE_PER_CPU(unsigned long, xen_cr3); /* cr3 stored as physaddr */
+DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */
struct start_info *xen_start_info;
EXPORT_SYMBOL_GPL(xen_start_info);
@@ -631,32 +647,36 @@ static unsigned long xen_read_cr3(void)
return x86_read_percpu(xen_cr3);
}
+static void set_current_cr3(void *v)
+{
+ x86_write_percpu(xen_current_cr3, (unsigned long)v);
+}
+
static void xen_write_cr3(unsigned long cr3)
{
+ struct mmuext_op *op;
+ struct multicall_space mcs;
+ unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
+
BUG_ON(preemptible());
- if (cr3 == x86_read_percpu(xen_cr3)) {
- /* just a simple tlb flush */
- xen_flush_tlb();
- return;
- }
-
+ mcs = xen_mc_entry(sizeof(*op)); /* disables interrupts */
+
+ /* Update while interrupts are disabled, so it's atomic with
+ respect to IPIs */
x86_write_percpu(xen_cr3, cr3);
-
- {
- struct mmuext_op *op;
- struct multicall_space mcs = xen_mc_entry(sizeof(*op));
- unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
-
- op = mcs.args;
- op->cmd = MMUEXT_NEW_BASEPTR;
- op->arg1.mfn = mfn;
-
- MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
-
- xen_mc_issue(PARAVIRT_LAZY_CPU);
- }
+ op = mcs.args;
+ op->cmd = MMUEXT_NEW_BASEPTR;
+ op->arg1.mfn = mfn;
+
+ MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
+
+ /* Update xen_current_cr3 once the batch has actually
+ been submitted. */
+ xen_mc_callback(set_current_cr3, (void *)cr3);
+
+ xen_mc_issue(PARAVIRT_LAZY_CPU); /* interrupts restored */
}
/* Early in boot, while setting up the initial pagetable, assume
@@ -1124,6 +1144,7 @@ asmlinkage void __init xen_start_kernel(
/* keep using Xen gdt for now; no urgent need to change it */
x86_write_percpu(xen_cr3, __pa(pgd));
+ x86_write_percpu(xen_current_cr3, __pa(pgd));
#ifdef CONFIG_SMP
/* Don't do the full vcpu_info placement stuff until we have a
===================================================================
--- a/arch/i386/xen/mmu.c
+++ b/arch/i386/xen/mmu.c
@@ -564,20 +564,43 @@ static void drop_other_mm_ref(void *info
if (__get_cpu_var(cpu_tlbstate).active_mm == mm)
leave_mm(smp_processor_id());
+
+ /* If this cpu still has a stale cr3 reference, then make sure
+ it has been flushed. */
+ if (x86_read_percpu(xen_current_cr3) == __pa(mm->pgd)) {
+ load_cr3(swapper_pg_dir);
+ arch_flush_lazy_cpu_mode();
+ }
}
static void drop_mm_ref(struct mm_struct *mm)
{
+ cpumask_t mask;
+ unsigned cpu;
+
if (current->active_mm == mm) {
if (current->mm == mm)
load_cr3(swapper_pg_dir);
else
leave_mm(smp_processor_id());
- }
-
- if (!cpus_empty(mm->cpu_vm_mask))
- xen_smp_call_function_mask(mm->cpu_vm_mask, drop_other_mm_ref,
- mm, 1);
+ arch_flush_lazy_cpu_mode();
+ }
+
+ /* Get the "official" set of cpus referring to our pagetable. */
+ mask = mm->cpu_vm_mask;
+
+ /* It's possible that a vcpu may have a stale reference to our
+ cr3, because it's in lazy mode and hasn't yet flushed
+ its set of pending hypercalls. In this case, we can
+ look at its actual current cr3 value, and force it to flush
+ if needed. */
+ for_each_online_cpu(cpu) {
+ if (per_cpu(xen_current_cr3, cpu) == __pa(mm->pgd))
+ cpu_set(cpu, mask);
+ }
+
+ if (!cpus_empty(mask))
+ xen_smp_call_function_mask(mask, drop_other_mm_ref, mm, 1);
}
#else
static void drop_mm_ref(struct mm_struct *mm)
===================================================================
--- a/arch/i386/xen/xen-ops.h
+++ b/arch/i386/xen/xen-ops.h
@@ -11,6 +11,7 @@ void xen_copy_trap_info(struct trap_info
DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
DECLARE_PER_CPU(unsigned long, xen_cr3);
+DECLARE_PER_CPU(unsigned long, xen_current_cr3);
extern struct start_info *xen_start_info;
extern struct shared_info *HYPERVISOR_shared_info;
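The distinction between the logical cr3 (xen_cr3) and the effective cr3 (xen_current_cr3) can be modelled in a few lines of user-space C. This is a rough sketch under assumptions, not the kernel code: plain arrays stand in for per-CPU variables, a pending flag stands in for the lazy multicall batch, and the names write_cr3_lazy, flush_cpu and cpus_still_using are hypothetical. It only computes the set of stale CPUs; the patch additionally ORs that set into mm->cpu_vm_mask before sending IPIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2                           /* illustrative CPU count */

static unsigned long logical_cr3[NR_CPUS];  /* models xen_cr3 */
static unsigned long current_cr3[NR_CPUS];  /* models xen_current_cr3 */
static bool pending[NR_CPUS];               /* a deferred switch is queued */

/* Lazy write: the logical value changes now, the effective one later. */
static void write_cr3_lazy(int cpu, unsigned long cr3)
{
	logical_cr3[cpu] = cr3;
	pending[cpu] = true;
}

/* Flush: the switch "happens", then the completion step records it. */
static void flush_cpu(int cpu)
{
	if (pending[cpu]) {
		current_cr3[cpu] = logical_cr3[cpu];
		pending[cpu] = false;
	}
}

/* Bitmask of CPUs whose effective cr3 still points at pgd. */
static unsigned cpus_still_using(unsigned long pgd)
{
	unsigned mask = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (current_cr3[cpu] == pgd)
			mask |= 1u << cpu;
	return mask;
}
```

In this model a CPU that has lazily switched away from a pagetable still shows up in cpus_still_using() until it flushes, which is the stale reference the unpin path has to chase before the pagetable pages can be freed.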
* Re: Re: [Xen-users] 2.6.23 oops
2007-10-11 16:27 ` Mark Williamson
@ 2007-10-11 18:22 ` Morten Bøgeskov
2007-10-11 18:53 ` Mark Williamson
0 siblings, 1 reply; 16+ messages in thread
From: Morten Bøgeskov @ 2007-10-11 18:22 UTC (permalink / raw)
To: Mark Williamson; +Cc: Jeremy Fitzhardinge, xen-devel
Quoting Mark Williamson <mark.williamson@cl.cam.ac.uk>:
>> > I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>> > my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>> > boot, with it panics. I've attached my config, if somebody thinks I've
>> > left something out
>>
>> Do you mean that it hangs during boot if SMP is not compiled into the guest
>> kernel? That seems strange, I'll try that out myself and see what happens.
>
> Where does it hang for you?
>
> I disabled SMP in my kernel config and found that the guest hung during the
> kernel messages, just after:
>
> installing Xen timer for CPU 0
>
> Is this similar to what you observed?
That is exactly what I experienced. I tracked it down to
xen_vcpuop_set_next_event(...), where
    ret = HYPERVISOR_vcpu_op(VCPUOP_set_singleshot_timer, cpu, &single);
always returns -ETIME. This results in an infinite loop in
tick_setup_periodic(...), because this check never succeeds:
    if (!clockevents_program_event(dev, next, ktime_get()))
        return;
Now my brain needs a rest. I never thought I would have to dive head first
into the Linux kernel ;-) I can't claim that I got any wiser ;-)
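Why an always-failing timer call turns into a hang can be shown with a small user-space model. This is a sketch under assumptions, not the kernel source: times are plain integers, "now" is held fixed rather than re-read on each pass, and a max_tries bound is added so the example terminates. The names program_periodic, sane_set_next_event and broken_set_next_event are hypothetical.

```c
#include <assert.h>
#include <errno.h>

typedef long long ns_t;         /* nanoseconds, for illustration only */

/* Well-behaved backend: reject only deadlines that are not in the future. */
static int sane_set_next_event(ns_t expires, ns_t now)
{
	return expires <= now ? -ETIME : 0;
}

/* Backend that rejects every request, like the behaviour reported above. */
static int broken_set_next_event(ns_t expires, ns_t now)
{
	(void)expires;
	(void)now;
	return -ETIME;
}

/*
 * Retry loop: keep pushing the deadline forward by one period until the
 * backend accepts it. Returns the number of attempts used, or -1 if the
 * bound was hit (an unbounded loop would spin forever at that point).
 */
static int program_periodic(int (*set_next)(ns_t, ns_t),
			    ns_t next, ns_t now, ns_t period, int max_tries)
{
	int tries;

	for (tries = 1; tries <= max_tries; tries++) {
		if (set_next(next, now) == 0)
			return tries;
		next += period;
	}
	return -1;
}
```

With the sane backend the loop catches up as soon as the deadline passes "now"; with the broken one no amount of advancing helps, so a loop without the bound never returns.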
* Re: Re: [Xen-users] 2.6.23 oops
2007-10-11 18:22 ` Morten Bøgeskov
@ 2007-10-11 18:53 ` Mark Williamson
0 siblings, 0 replies; 16+ messages in thread
From: Mark Williamson @ 2007-10-11 18:53 UTC (permalink / raw)
To: Morten Bøgeskov; +Cc: Jeremy Fitzhardinge, xen-devel
> > Is this similar to what you observed?
>
> That is exactly what I experienced. I tracked it down to:
Awesome! Thanks for helping out here.
> xen_vcpuop_set_next_event(...)
> ret = HYPERVISOR_vcpu_op(VCPUOP_set_singleshot_timer, cpu, &single);
> always returns -ETIME
Which, AFAICS, has the expected meaning: the requested time is in the past.
> resulting in an infinite loop in
> tick_setup_periodic(...)
> Where this never succeeds
> if (!clockevents_program_event(dev, next, ktime_get()))
> return;
in kernel/time/tick-common.c, right?
I see what you mean, but it's not immediately obvious to me what's going
wrong. I don't think the kernel that mainline Xen uses even has clockevents,
anyway, so I've not seen it before :-)
> Now my brain needs a rest. I never thought I had to go head first into
> the linux-kernel ;-) can't claim that I got any wiser ;-)
Every little helps.
Dip your head into tepid water just to forestall any overheating.
I may be able to take a look at this later tonight if Jeremy doesn't beat me
to it. I'd like to get a bit more familiar with our patches to mainline.
Cheers,
Mark
* Re: [Xen-users] 2.6.23 oops
2007-10-11 16:41 ` Jeremy Fitzhardinge
@ 2007-10-11 19:00 ` Morten Bøgeskov
2007-10-11 21:04 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 16+ messages in thread
From: Morten Bøgeskov @ 2007-10-11 19:00 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel, Mark Williamson
Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
> Mark Williamson wrote:
>> I'm bringing this discussion onto Xen-devel as it smells like it needs some
>> more specific developer input than I can give.
>>
>>
>>> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>>> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>>> boot, with it panics. I've attached my config, if somebody thinks I've
>>> left something out
>>>
>
> Oh, I just fixed this. Try these patches.
>
> J
>
I've applied both, and tried with and without SMP, with exactly the
same result as before.
UP hangs after "installing Xen timer for CPU 0"
SMP oopses the same way, except the line number is now 78.
Morten Bøgeskov
* Re: [Xen-users] 2.6.23 oops
2007-10-11 19:00 ` Morten Bøgeskov
@ 2007-10-11 21:04 ` Jeremy Fitzhardinge
2007-10-11 21:21 ` Morten Bøgeskov
2007-10-12 4:14 ` Mark Williamson
0 siblings, 2 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-11 21:04 UTC (permalink / raw)
To: Morten Bøgeskov; +Cc: xen-devel, Mark Williamson
Morten Bøgeskov wrote:
> I've applied both, and tried with and without SMP, with exactly the
> same result as before.
>
> UP hangs after "installing Xen timer for CPU 0"
Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
while.
> SMP the oops'es the same way except the line no. is now 78.
Oh, that's odd. Could you resend your original bug report? What kind
of load does it fail under?
J
* Re: [Xen-users] 2.6.23 oops
2007-10-11 21:04 ` Jeremy Fitzhardinge
@ 2007-10-11 21:21 ` Morten Bøgeskov
2007-10-11 21:47 ` Jeremy Fitzhardinge
2007-10-12 4:14 ` Mark Williamson
1 sibling, 1 reply; 16+ messages in thread
From: Morten Bøgeskov @ 2007-10-11 21:21 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel, Mark Williamson
[-- Attachment #1: Type: text/plain, Size: 6671 bytes --]
Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
> Morten Bøgeskov wrote:
>> I've applied both, and tried with and without SMP, with exactly the
>> same result as before.
>>
>> UP hangs after "installing Xen timer for CPU 0"
>
> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
> while.
>
>> SMP the oops'es the same way except the line no. is now 78.
>
> Oh, that's odd. Could you resend your original bug report? What kind
> of load does it fail under?
I can't really say what load. It doesn't get that far.
I've included .config
config:
kernel = "/usr/src/linux-2.6.23/vmlinux"
memory = 96
name = "foo"
vif = [ 'mac=00:16:3e:00:00:00, bridge=br1' ]
vcpus = 1
disk = [ 'phy:/dev/VOL/foo,hda1,w' ]
root = "/dev/xvda1 ro"
extra = "ip=192.168.0.2::192.168.0.1:255.255.255.0:foo:eth0:"
Output:
Using config file "/etc/xen/foo".
Started domain foo
Reserving virtual address space above 0xfbffe000
Linux version 2.6.23 (root@hobbes) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #32 SMP Thu Oct 11 21:02:46 CEST 2007
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 0000000006000000 (usable)
0MB HIGHMEM available.
96MB LOWMEM available.
Zone PFN ranges:
DMA 0 -> 4096
Normal 4096 -> 24576
HighMem 24576 -> 24576
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
0: 0 -> 24576
DMI not present or invalid.
Allocating PCI resources starting at 10000000 (gap: 06000000:fa000000)
Built 1 zonelists in Zone order. Total pages: 24384
Kernel command line: root=/dev/xvda1 ro ip=192.168.0.2::192.168.0.1:255.255.255.0:foo:eth0:
Local APIC disabled by BIOS -- you can enable it with "lapic"
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 512 (order: 9, 2048 bytes)
Detected 1666.723 MHz processor.
console [hvc0] enabled
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 94884k/98304k available (1543k kernel code, 3360k reserved, 343k data, 176k init, 0k highmem)
virtual kernel memory layout:
fixmap : 0xfbf9d000 - 0xfbffd000 ( 384 kB)
pkmap : 0xfb800000 - 0xfbc00000 (4096 kB)
vmalloc : 0xc6800000 - 0xfb7fe000 ( 847 MB)
lowmem : 0xc0000000 - 0xc6000000 ( 96 MB)
.init : 0xc02dd000 - 0xc0309000 ( 176 kB)
.data : 0xc0281ed6 - 0xc02d7d6c ( 343 kB)
.text : 0xc0100000 - 0xc0281ed6 (1543 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
installing Xen timer for CPU 0
Calibrating delay using timer specific routine.. 3341.81 BogoMIPS (lpj=5567710)
Mount-cache hash table entries: 512
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Compat vDSO mapped to fbffc000.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 9k freed
Brought up 1 CPUs
Booting paravirtualized kernel on Xen
Hypervisor signature: xen-3.0-x86_32
Grant table initialized
NET: Registered protocol family 16
Setting up standard PCI resources
NET: Registered protocol family 2
Time: xen clocksource has been installed.
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 3, 49152 bytes)
TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
SGI XFS with no debug enabled
io scheduler noop registered
io scheduler deadline registered (default)
Initialising Xen virtual ethernet driver.
blkfront: xvda1: barriers enabled
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
XENBUS: Device with no driver: device/console/0
IP-Config: Complete:
device=eth0, addr=192.168.0.2, mask=255.255.255.0, gw=192.168.0.1,
host=foo, domain=, nis-domain=(none),
bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath=
blkfront: xvda1: write barrier op failed
blkfront: xvda1: barriers disabled
Filesystem "xvda1": Disabling barriers, trial barrier write failed
XFS mounting filesystem xvda1
VFS: Mounted root (xfs filesystem) readonly.
Freeing unused kernel memory: 176k freed
------------[ cut here ]------------
kernel BUG at arch/i386/xen/multicalls.c:78!
invalid opcode: 0000 [#1]
SMP
CPU: 0
EIP: 0061:[<c0101a82>] Not tainted VLI
EFLAGS: 00010002 (2.6.23 #32)
EIP is at xen_mc_flush+0xd2/0xe0
eax: 00000000 ebx: c10c1060 ecx: 00000003 edx: 00000003
esi: c10c1060 edi: 00000000 ebp: 00000001 esp: c10efd1c
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: e021
Process swapper (pid: 1, ti=c10ee000 task=c10eba90 task.ti=c10ee000)
Stack: 000d9b10 c10c10a0 c10c1460 c130b000 c01020d0 c1329e40 c02c1100 c10eba90
00000000 c01021f3 c1329e40 c015e498 c132849c 00000080 c12e0aa0 00000000
c10ece60 c0159b50 c02c1ac0 c12fa560 c1253c20 c1144440 c015da4d c10efd7c
Call Trace:
[<c01020d0>] xen_pgd_pin+0xb0/0x120
[<c01021f3>] xen_activate_mm+0x13/0x20
[<c015e498>] flush_old_exec+0x3c8/0x7e0
[<c0159b50>] do_sync_read+0x0/0x120
[<c015da4d>] kernel_read+0x3d/0x60
[<c0185ed6>] load_elf_binary+0x316/0x1aa0
[<c013fe27>] __alloc_pages+0x57/0x2f0
[<c0156796>] kmem_cache_alloc+0x56/0xb0
[<c01488eb>] handle_mm_fault+0x52b/0x6f0
[<c014710b>] vm_normal_page+0x1b/0x80
[<c014710b>] vm_normal_page+0x1b/0x80
[<c0147786>] follow_page+0x106/0x180
[<c0148bb7>] get_user_pages+0x107/0x2d0
[<c015d59b>] get_arg_page+0x4b/0xb0
[<c015d774>] copy_strings+0x174/0x190
[<c015d824>] search_binary_handler+0x54/0x110
[<c015f236>] do_execve+0x166/0x190
[<c010460f>] sys_execve+0x2f/0x90
[<c010605e>] syscall_call+0x7/0xb
[<c010a3ac>] kernel_execve+0x1c/0x30
[<c0100173>] init_post+0xa3/0xf0
[<c02dd942>] kernel_init+0x222/0x330
[<c0118173>] schedule_tail+0x33/0xa0
[<c0105f1a>] ret_from_fork+0x6/0x1c
[<c0106067>] syscall_exit+0x5/0x1b
[<c02dd720>] kernel_init+0x0/0x330
[<c02dd720>] kernel_init+0x0/0x330
[<c0106c33>] kernel_thread_helper+0x7/0x10
=======================
Code: 89 d8 72 e9 85 ed c7 86 08 07 00 00 00 00 00 00 75 19 5b 5e 5f 5d c3 0f 0b eb fe 8b 96 04 07 00 00 31 ed 85 d2 74 ac 0f 0b eb fe <0f> 0b eb fe 8d 76 00 8d bc 00 00 00 00 83 ec 0c 89 1c 24 89
EIP: [<c0101a82>] xen_mc_flush+0xd2/0xe0 SS:ESP e021:c10efd1c
Kernel panic - not syncing: Attempted to kill init!
[-- Attachment #2: config --]
[-- Type: text/plain, Size: 3709 bytes --]
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_BLOCK=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_DEFAULT_DEADLINE=y
CONFIG_DEFAULT_IOSCHED="deadline"
CONFIG_SMP=y
CONFIG_X86_PC=y
CONFIG_PARAVIRT=y
CONFIG_XEN=y
CONFIG_MK7=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_NR_CPUS=2
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_DMIID=y
CONFIG_HIGHMEM4G=y
CONFIG_VMSPLIT_3G=y
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_SPLIT_PTLOCK_CPUS=4096
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_NR_QUICK=1
CONFIG_VIRT_TO_BUS=y
CONFIG_HZ_300=y
CONFIG_HZ=300
CONFIG_PHYSICAL_START=0x100000
CONFIG_PHYSICAL_ALIGN=0x100000
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_SUSPEND_SMP_POSSIBLE=y
CONFIG_HIBERNATION_SMP_POSSIBLE=y
CONFIG_ISA_DMA_API=y
CONFIG_BINFMT_ELF=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_FIB_HASH=y
CONFIG_IP_PNP=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_BLK_DEV=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_NETDEVICES=y
CONFIG_XEN_NETDEV_FRONTEND=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_HVC_DRIVER=y
CONFIG_HVC_XEN=y
CONFIG_XFS_FS=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_DNOTIFY=y
CONFIG_PROC_FS=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_SCHED_DEBUG=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_FORCED_INLINING=y
CONFIG_EARLY_PRINTK=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-11 21:21 ` Morten Bøgeskov
@ 2007-10-11 21:47 ` Jeremy Fitzhardinge
2007-10-12 9:46 ` Morten Bøgeskov
0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-11 21:47 UTC (permalink / raw)
To: Morten Bøgeskov; +Cc: xen-devel, Mark Williamson
Morten Bøgeskov wrote:
> Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
>
>> Morten Bøgeskov wrote:
>>> I've applied both, and tried with and without SMP, with exactly the
>>> same result as before.
>>>
>>> UP hangs after "installing Xen timer for CPU 0"
>>
>> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
>> while.
>>
>>> SMP oopses the same way, except the line no. is now 78.
>>
>> Oh, that's odd. Could you resend your original bug report? What kind
>> of load does it fail under?
>
> I can't really say what load. It doesn't get that far.
Odd.
> I've included .config
It seems a bit small. Can you send your whole .config so I can rebuild
your kernel?
J
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-11 21:04 ` Jeremy Fitzhardinge
2007-10-11 21:21 ` Morten Bøgeskov
@ 2007-10-12 4:14 ` Mark Williamson
2007-10-12 6:21 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 16+ messages in thread
From: Mark Williamson @ 2007-10-12 4:14 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel, Morten Bøgeskov
> Morten Bøgeskov wrote:
> > I've applied both, and tried with and without SMP, with exactly the
> > same result as before.
> >
> > UP hangs after "installing Xen timer for CPU 0"
>
> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
> while.
Actually, I think the kernel is hanging in an infinite loop whilst trying to
measure loops per jiffy... For some reason the UP kernel isn't getting
timer interrupts from Xen at this point, whereas the SMP kernel is. I've not
figured out why this could be happening, yet, but that's where I am at the
moment.
Cheers,
Mark
> > SMP oopses the same way, except the line no. is now 78.
>
> Oh, that's odd. Could you resend your original bug report? What kind
> of load does it fail under?
>
> J
--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-12 4:14 ` Mark Williamson
@ 2007-10-12 6:21 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-12 6:21 UTC (permalink / raw)
To: Mark Williamson; +Cc: xen-devel, Morten Bøgeskov
Mark Williamson wrote:
> Actually, I think the kernel is hanging in an infinite loop whilst trying to
> measure loops per jiffy... For some reason the UP kernel isn't getting
> timer interrupts from Xen at this point, whereas the SMP kernel is. I've not
> figured out why this could be happening, yet, but that's where I am at the
> moment.
I can't reproduce that specific symptom, but this may help.
diff -r 25f5c8bdd699 arch/i386/xen/enlighten.c
--- a/arch/i386/xen/enlighten.c Thu Oct 11 18:46:33 2007 -0700
+++ b/arch/i386/xen/enlighten.c Thu Oct 11 23:20:29 2007 -0700
@@ -115,7 +115,7 @@ static void __init xen_vcpu_setup(int cp
 	info.mfn = virt_to_mfn(vcpup);
 	info.offset = offset_in_page(vcpup);
 
-	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %x, offset %d\n",
+	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %llx, offset %d\n",
 	       cpu, vcpup, info.mfn, info.offset);
 
 	/* Check to see if the hypervisor will put the vcpu_info
diff -r 25f5c8bdd699 include/xen/interface/vcpu.h
--- a/include/xen/interface/vcpu.h Thu Oct 11 18:46:33 2007 -0700
+++ b/include/xen/interface/vcpu.h Thu Oct 11 23:20:29 2007 -0700
@@ -160,8 +160,9 @@ struct vcpu_set_singleshot_timer {
  */
 #define VCPUOP_register_vcpu_info 10 /* arg == struct vcpu_info */
 struct vcpu_register_vcpu_info {
-    uint32_t mfn; /* mfn of page to place vcpu_info */
-    uint32_t offset; /* offset within page */
+    uint64_t mfn; /* mfn of page to place vcpu_info */
+    uint32_t offset; /* offset within page */
+    uint32_t rsvd; /* unused */
 };
 
 #endif /* __XEN_PUBLIC_VCPU_H__ */
J
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-11 21:47 ` Jeremy Fitzhardinge
@ 2007-10-12 9:46 ` Morten Bøgeskov
2007-10-12 15:30 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 16+ messages in thread
From: Morten Bøgeskov @ 2007-10-12 9:46 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel, Mark Williamson
[-- Attachment #1: Type: text/plain, Size: 4001 bytes --]
Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
> Morten Bøgeskov wrote:
>> Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
>>
>>> Morten Bøgeskov wrote:
>>>> I've applied both, and tried with and without SMP, with exactly the
>>>> same result as before.
>>>>
>>>> UP hangs after "installing Xen timer for CPU 0"
>>>
>>> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
>>> while.
>>>
>>>> SMP oopses the same way, except the line no. is now 78.
>>>
>>> Oh, that's odd. Could you resend your original bug report? What kind
>>> of load does it fail under?
>>
>> I can't really say what load. It doesn't get that far.
>
> Odd.
>
>> I've included .config
>
> It seems a bit small. Can you send your whole .config so I can rebuild
> your kernel?
>
> Ah, you don't have PAE enabled. Could you try it with (HIGHMEM64G)? I
> do technically support non-PAE kernels, but they're a bit tricky to test
> (stock Xen doesn't really support non-PAE any more).
> I'll look at the UP issue too.
> J
I've newly compiled xen-3.1.1-rc3 with PAE, 2.6.18-xen (domU) with PAE, and
2.6.23 with PAE (config included). Still no difference.
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
blkfront: xvda1: barriers enabled
XENBUS: Device with no driver: device/console/0
IP-Config: Complete:
device=eth0, addr=192.168.0.2, mask=255.255.255.0, gw=192.168.0.1,
host=foo, domain=, nis-domain=(none),
bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath=
blkfront: xvda1: write barrier op failed
blkfront: xvda1: barriers disabled
Filesystem "xvda1": Disabling barriers, trial barrier write failed
XFS mounting filesystem xvda1
VFS: Mounted root (xfs filesystem) readonly.
Freeing unused kernel memory: 180k freed
------------[ cut here ]------------
kernel BUG at arch/i386/xen/multicalls.c:78!
invalid opcode: 0000 [#1]
SMP
CPU: 0
EIP: 0061:[<c0101a92>] Not tainted VLI
EFLAGS: 00010002 (2.6.23 #33)
EIP is at xen_mc_flush+0xd2/0xe0
eax: 00000000 ebx: c10c1060 ecx: 00000007 edx: 00000007
esi: c10c1060 edi: 00000000 ebp: 00000001 esp: c10efd1c
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: e021
Process swapper (pid: 1, ti=c10ee000 task=c10eba90 task.ti=c10ee000)
Stack: 000d991c c10c1120 c10c1460 c12d6000 c0102400 c12d5e40 c02c5100 c10eba90
00000000 c01025d3 c12d5e40 c0161728 c125f49c 00000080 c1240aa0 00000000
c10ece60 c015cdd0 c02c5ac0 c131d560 c134cbc0 c1141440 c0160ccd c10efd7c
Call Trace:
[<c0102400>] xen_pgd_pin+0xb0/0x120
[<c01025d3>] xen_activate_mm+0x13/0x20
[<c0161728>] flush_old_exec+0x3c8/0x7e0
[<c015cdd0>] do_sync_read+0x0/0x120
[<c0160ccd>] kernel_read+0x3d/0x60
[<c01891e6>] load_elf_binary+0x316/0x1aa0
[<c0159a16>] kmem_cache_alloc+0x56/0xb0
[<c01f610c>] xfs_file_aio_read+0x6c/0x80
[<c0148b3a>] vm_normal_page+0x2a/0xb0
[<c01492ef>] follow_page+0x1af/0x230
[<c014ac25>] get_user_pages+0x105/0x360
[<c016081b>] get_arg_page+0x4b/0xb0
[<c01609f4>] copy_strings+0x174/0x190
[<c0160aa4>] search_binary_handler+0x54/0x110
[<c01624c6>] do_execve+0x166/0x190
[<c0104bef>] sys_execve+0x2f/0x90
[<c010663e>] syscall_call+0x7/0xb
[<c010a98c>] kernel_execve+0x1c/0x30
[<c0100173>] init_post+0xa3/0xf0
[<c02e1942>] kernel_init+0x222/0x330
[<c0119453>] schedule_tail+0x33/0xa0
[<c01064fa>] ret_from_fork+0x6/0x1c
[<c0106647>] syscall_exit+0x5/0x1b
[<c02e1720>] kernel_init+0x0/0x330
[<c02e1720>] kernel_init+0x0/0x330
[<c0107213>] kernel_thread_helper+0x7/0x10
=======================
Code: 89 d8 72 e9 85 ed c7 86 08 07 00 00 00 00 00 00 75 19 5b 5e 5f 5d c3 0f 0b eb fe 8b 96 04 07 00 00 31 ed 85 d2 74 ac 0f 0b eb fe <0f> 0b eb fe 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89
EIP: [<c0101a92>] xen_mc_flush+0xd2/0xe0 SS:ESP e021:c10efd1c
Kernel panic - not syncing: Attempted to kill init!
[-- Attachment #2: config --]
[-- Type: text/plain, Size: 205 bytes --]
menu "Index" {
"now" = menu_now;
empty;
"Exit" = exec "exit 0";
}
menu_now "Connamds" {
"foo" = call "echo foo";
"bar" = call "echo bar";
empty;
"return" = menu;
"Exit" = exec "exit 0";
}
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-12 9:46 ` Morten Bøgeskov
@ 2007-10-12 15:30 ` Jeremy Fitzhardinge
2007-10-12 15:34 ` Mark Williamson
0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-12 15:30 UTC (permalink / raw)
To: Morten Bøgeskov; +Cc: xen-devel, Mark Williamson
Morten Bøgeskov wrote:
> I've newly compiled xen-3.1.1-rc3 with PAE, 2.6.18-xen (domU) with PAE, and
> 2.6.23 with PAE (config included). Still no difference.
I just realized I hadn't been reading your backtrace closely enough,
since it looks similar to the bug I'd been working on. Turns out having
an xfs rootfs is what triggers your bug - I can repro it now, so I'll
see if I can work out what's going on.
BTW, did last night's little patch help with the UP time issue?
J
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-12 15:30 ` Jeremy Fitzhardinge
@ 2007-10-12 15:34 ` Mark Williamson
2007-10-12 15:40 ` Keir Fraser
2007-10-12 15:50 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 16+ messages in thread
From: Mark Williamson @ 2007-10-12 15:34 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel, Morten Bøgeskov
> I just realized I hadn't been reading your backtrace closely enough,
> since it looks similar to the bug I'd been working on. Turns out having
> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
> see if I can work out what's going on.
>
> BTW, did last night's little patch help with the UP time issue?
I've not had a chance to try it out yet... I'll try and take a look.
But I didn't entirely understand the semantic significance of the change?
Could you possibly elaborate?
Cheers,
Mark
--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Re: [Xen-users] 2.6.23 oops
2007-10-12 15:34 ` Mark Williamson
@ 2007-10-12 15:40 ` Keir Fraser
2007-10-12 15:50 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 16+ messages in thread
From: Keir Fraser @ 2007-10-12 15:40 UTC (permalink / raw)
To: Mark Williamson, Jeremy Fitzhardinge; +Cc: xen-devel, Morten Bøgeskov
On 12/10/07 16:34, "Mark Williamson" <mark.williamson@cl.cam.ac.uk> wrote:
>> I just realized I hadn't been reading your backtrace closely enough,
>> since it looks similar to the bug I'd been working on. Turns out having
>> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
>> see if I can work out what's going on.
>>
>> BTW, did last night's little patch help with the UP time issue?
>
> I've not had a chance to try it out yet... I'll try and take a look.
>
> But I didn't entirely understand the semantic significance of the change?
> Could you possibly elaborate?
It fixed the layout of the structure passed to VCPUOP_register_vcpu_info. I
would expect that to improve stability!
-- Keir
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
2007-10-12 15:34 ` Mark Williamson
2007-10-12 15:40 ` Keir Fraser
@ 2007-10-12 15:50 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-12 15:50 UTC (permalink / raw)
To: Mark Williamson; +Cc: xen-devel, Morten Bøgeskov
Mark Williamson wrote:
>> I just realized I hadn't been reading your backtrace closely enough,
>> since it looks similar to the bug I'd been working on. Turns out having
>> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
>> see if I can work out what's going on.
>>
>> BTW, did last night's little patch help with the UP time issue?
>>
>
> I've not had a chance to try it out yet... I'll try and take a look.
>
> But I didn't entirely understand the semantic significance of the change?
> Could you possibly elaborate?
>
There was version drift in the register_vcpu_info hypercall arg
structure, and the version of the structure being used by the kernel was
smaller than the one that xen was expecting.
That meant that the mfn argument was OK, but the offset was being
corrupted, and so the vcpu_info structure could have been placed
anywhere, corrupting kernel memory. For me it manifested as an oops,
but it could also have corrupted the timing parameters - or at the very
least, reading the time from the vcpu_info structure wouldn't work.
So I think there's a good chance this change would fix the UP problem.
It doesn't hit in the same way in SMP because the per-cpu data area is
elsewhere, but it could still have caused havoc.
J
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2007-10-12 15:50 UTC | newest]
Thread overview: 16+ messages
[not found] <20071011144817.n5x1drwcgug44sg0@whitelist.dk>
2007-10-11 15:29 ` [Xen-users] 2.6.23 oops Mark Williamson
2007-10-11 16:27 ` Mark Williamson
2007-10-11 18:22 ` Morten Bøgeskov
2007-10-11 18:53 ` Mark Williamson
2007-10-11 16:41 ` Jeremy Fitzhardinge
2007-10-11 19:00 ` Morten Bøgeskov
2007-10-11 21:04 ` Jeremy Fitzhardinge
2007-10-11 21:21 ` Morten Bøgeskov
2007-10-11 21:47 ` Jeremy Fitzhardinge
2007-10-12 9:46 ` Morten Bøgeskov
2007-10-12 15:30 ` Jeremy Fitzhardinge
2007-10-12 15:34 ` Mark Williamson
2007-10-12 15:40 ` Keir Fraser
2007-10-12 15:50 ` Jeremy Fitzhardinge
2007-10-12 4:14 ` Mark Williamson
2007-10-12 6:21 ` Jeremy Fitzhardinge