* Re: [Xen-users] 2.6.23 oops
From: Mark Williamson @ 2007-10-11 15:29 UTC
To: xen-devel; +Cc: Jeremy Fitzhardinge, Morten Bøgeskov

I'm bringing this discussion onto Xen-devel as it smells like it needs some
more specific developer input than I can give.

> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> boot, with it panics. I've attached my config, if somebody thinks I've
> left something out

Do you mean that it hangs during boot if SMP is not compiled into the guest
kernel?  That seems strange, I'll try that out myself and see what happens.

How far into the boot does it manage to get?  Has it started running userspace
apps (the normal startup messages starting essential services), or is it still
during the kernel initialisation?

> Does anybody else see this:

I've booted a kernel successfully on a UP AMD64 box with an SMP 2.6.23 kernel,
vcpus = 1 but I didn't bother giving it a virtual disk, so I don't know if
userspace worked.  I'll give it a try shortly...

Cheers,
Mark

> ------------[ cut here ]------------
> kernel BUG at arch/i386/xen/multicalls.c:68!
> invalid opcode: 0000 [#1]
> SMP
> CPU: 0
> EIP: 0061:[<c01019a6>] Not tainted VLI
> EFLAGS: 00010002 (2.6.23 #26)
> EIP is at xen_mc_flush+0xa6/0xb0
> eax: 00000001 ebx: 00000003 ecx: 00000001 edx: c10c1060
> esi: 00000000 edi: 00000001 ebp: c5c8d000 esp: c1101d28
> ds: 007b es: 007b fs: 00d8 gs: 0000 ss: e021
> Process swapper (pid: 1, ti=c1100000 task=c10eda90 task.ti=c1100000)
> Stack: 000218ee c10c10a0 c10c1460 c0101ede c5c8ce40 c02b90e0 c10eda90
> 00000000 c0101ff3 c5c8ce40 c015caca c5d3b49c 00000080 c5eadaa0 00000000
> c10eed80 c01584d0 c02b9a80 c5cfd560 c02a53d2 c1142440 c015c077 c1101d84
> c5eadaa0
> Call Trace:
> [<c0101ede>] xen_pgd_pin+0x9e/0x100
> [<c0101ff3>] xen_activate_mm+0x13/0x20
> [<c015caca>] flush_old_exec+0x3ca/0x7f0
> [<c01584d0>] do_sync_read+0x0/0x120
> [<c015c077>] kernel_read+0x37/0x50
> [<c018308e>] load_elf_binary+0x2fe/0x1af0
> [<c013ec27>] __alloc_pages+0x57/0x310
> [<c0155387>] kmem_cache_alloc+0x47/0x90
> [<c0147610>] handle_mm_fault+0x540/0x710
> [<c01585a5>] do_sync_read+0xd5/0x120
> [<c0145e10>] vm_normal_page+0x10/0x70
> [<c0145e10>] vm_normal_page+0x10/0x70
> [<c0146471>] follow_page+0xf1/0x170
> [<c01478ca>] get_user_pages+0xea/0x2e0
> [<c015bbe2>] get_arg_page+0x42/0xa0
> [<c015bdc6>] copy_strings+0x186/0x1a0
> [<c015be64>] search_binary_handler+0x54/0x110
> [<c015d83b>] do_execve+0x14b/0x170
> [<c010425f>] sys_execve+0x2f/0x90
> [<c0105c72>] syscall_call+0x7/0xb
> [<c010a364>] kernel_execve+0x14/0x20
> [<c0100173>] init_post+0xa3/0xf0
> [<c02d594f>] kernel_init+0x20f/0x310
> [<c0118464>] schedule_tail+0x34/0x90
> [<c0105b16>] ret_from_fork+0x6/0x20
> [<c0105c7b>] syscall_exit+0x5/0x1b
> [<c02d5740>] kernel_init+0x0/0x310
> [<c02d5740>] kernel_init+0x0/0x310
> [<c0106e77>] kernel_thread_helper+0x7/0x10
> =======================

-- 
Dave: Just a question. What use is a unicycle with no seat?  And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

* Re: Re: [Xen-users] 2.6.23 oops
From: Mark Williamson @ 2007-10-11 16:27 UTC
To: xen-devel; +Cc: Jeremy Fitzhardinge, Morten Bøgeskov

> > I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
> > my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
> > boot, with it panics. I've attached my config, if somebody thinks I've
> > left something out
>
> Do you mean that it hangs during boot if SMP is not compiled into the guest
> kernel?  That seems strange, I'll try that out myself and see what happens.

Where does it hang for you?

I disabled SMP in my kernel config and found that the guest hung during the
kernel messages, just after:

installing Xen timer for CPU 0

Is this similar to what you observed?

Cheers,
Mark

> How far into the boot does it manage to get?  Has it started running
> userspace apps (the normal startup messages starting essential services),
> or is it still during the kernel initialisation?
>
> > Does anybody else see this:
>
> I've booted a kernel successfully on a UP AMD64 box with an SMP 2.6.23
> kernel, vcpus = 1 but I didn't bother giving it a virtual disk, so I don't
> know if userspace worked.  I'll give it a try shortly...
>
> Cheers,
> Mark
>
> > ------------[ cut here ]------------
> > kernel BUG at arch/i386/xen/multicalls.c:68!
> > [...]

* Re: Re: [Xen-users] 2.6.23 oops
From: Morten Bøgeskov @ 2007-10-11 18:22 UTC
To: Mark Williamson; +Cc: Jeremy Fitzhardinge, xen-devel

Quoting Mark Williamson <mark.williamson@cl.cam.ac.uk>:

>> > I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>> > my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>> > boot, with it panics. I've attached my config, if somebody thinks I've
>> > left something out
>>
>> Do you mean that it hangs during boot if SMP is not compiled into the guest
>> kernel?  That seems strange, I'll try that out myself and see what happens.
>
> Where does it hang for you?
>
> I disabled SMP in my kernel config and found that the guest hung during the
> kernel messages, just after:
>
> installing Xen timer for CPU 0
>
> Is this similar to what you observed?

That is exactly what I experienced. I tracked it down to:

xen_vcpuop_set_next_event(...)
    ret = HYPERVISOR_vcpu_op(VCPUOP_set_singleshot_timer, cpu, &single);

This always returns -ETIME, resulting in an infinite loop in

tick_setup_periodic(...)

where this never succeeds:

    if (!clockevents_program_event(dev, next, ktime_get()))
        return;

Now my brain needs a rest. I never thought I had to go head first into
the linux-kernel ;-)  can't claim that I got any wiser ;-)

>
> Cheers,
> Mark
>
>> How far into the boot does it manage to get?  Has it started running
>> userspace apps (the normal startup messages starting essential services),
>> or is it still during the kernel initialisation?
>>
>> > Does anybody else see this:
>>
>> I've booted a kernel successfully on a UP AMD64 box with an SMP 2.6.23
>> kernel, vcpus = 1 but I didn't bother giving it a virtual disk, so I don't
>> know if userspace worked.  I'll give it a try shortly...
>> > [...]

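To make the hang described above a bit more concrete, here is a small user-space sketch of the retry pattern, not kernel code. It assumes the periodic setup keeps pushing the expiry forward by one tick period and retrying for as long as programming the event fails; that detail is not spelled out in this thread. Every `model_` name is hypothetical, and the attempt limit exists only so the model terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a clock event backend: returns 0 on success,
 * -ETIME if the requested expiry is rejected as being in the past. */
typedef int (*model_set_next_event_fn)(int64_t expires_ns, int64_t now_ns);

/* Well-behaved backend: only rejects deadlines that have already passed. */
static int model_backend_ok(int64_t expires_ns, int64_t now_ns)
{
	return expires_ns > now_ns ? 0 : -ETIME;
}

/* Misbehaving backend, modelling the reported symptom: always -ETIME. */
static int model_backend_always_etime(int64_t expires_ns, int64_t now_ns)
{
	(void)expires_ns;
	(void)now_ns;
	return -ETIME;
}

/*
 * Model of the retry loop: push the expiry forward by one period until the
 * backend accepts it. Returns the number of attempts used, or -1 when
 * max_attempts is exhausted (the loop being modelled has no such limit,
 * which is why it would spin forever).
 */
static int model_setup_periodic(model_set_next_event_fn set_next_event,
				int64_t now_ns, int64_t next_ns,
				int64_t period_ns, int max_attempts)
{
	int attempts;

	for (attempts = 1; attempts <= max_attempts; attempts++) {
		if (!set_next_event(next_ns, now_ns))
			return attempts;
		next_ns += period_ns;
	}
	return -1;
}
```

With the well-behaved backend a stale deadline is caught up after a few retries; with the always-failing one no amount of advancing helps, which matches a guest that stops right after "installing Xen timer for CPU 0".
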
* Re: Re: [Xen-users] 2.6.23 oops
From: Mark Williamson @ 2007-10-11 18:53 UTC
To: Morten Bøgeskov; +Cc: Jeremy Fitzhardinge, xen-devel

> > Is this similar to what you observed?
>
> That is exactly what I experienced. I tracked it down to:

Awesome!  Thanks for helping out here.

> xen_vcpuop_set_next_event(...)
> ret = HYPERVISOR_vcpu_op(VCPUOP_set_singleshot_timer, cpu, &single);
> always returns -ETIME

Which, AFAICS has the expected meaning that the requested time is in the past.

> resulting in an infinite loop in
> tick_setup_periodic(...)
> Where this never succeeds
> if (!clockevents_program_event(dev, next, ktime_get()))
> return;

in kernel/time/tick-common.c, right?  I see what you mean, but it's not
immediately obvious to me what's going wrong.  I don't think the kernel
mainline Xen uses even has clockevents, anyway, so I've not seen it before :-)

> Now my brain needs a rest. I never thought I had to go head first into
> the linux-kernel ;-)  can't claim that I got any wiser ;-)

Every little helps.  Dip your head into tepid water just to forestall any
overheating.

I may be able to take a look at this later tonight if Jeremy doesn't beat me
to it.  I'd like to get a bit more familiar with our patches to mainline.

Cheers,
Mark

> > Cheers,
> > Mark
> >
> >> How far into the boot does it manage to get?  Has it started running
> >> userspace apps (the normal startup messages starting essential
> >> services), or is it still during the kernel initialisation?
> >>
> >> > Does anybody else see this:
> >>
> >> I've booted a kernel successfully on a UP AMD64 box with an SMP 2.6.23
> >> kernel, vcpus = 1 but I didn't bother giving it a virtual disk, so I
> >> don't know if userspace worked.  I'll give it a try shortly...
> >> > [...]

* Re: [Xen-users] 2.6.23 oops
From: Jeremy Fitzhardinge @ 2007-10-11 16:41 UTC
To: Mark Williamson; +Cc: xen-devel, Morten Bøgeskov

[-- Attachment #1: Type: text/plain, Size: 463 bytes --]

Mark Williamson wrote:
> I'm bringing this discussion onto Xen-devel as it smells like it needs some
> more specific developer input than I can give.
>
>
>> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>> boot, with it panics. I've attached my config, if somebody thinks I've
>> left something out
>>

Oh, I just fixed this.  Try these patches.

    J

[-- Attachment #2: xen-multicall-callbacks.patch --]
[-- Type: text/x-patch, Size: 2300 bytes --]

Subject: xen: add batch completion callbacks

This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued.  This is typically used to unlock
things which must remain locked until the hypercall has taken place.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 arch/i386/xen/multicalls.c |   29 ++++++++++++++++++++++++++---
 arch/i386/xen/multicalls.h |    3 +++
 2 files changed, 29 insertions(+), 3 deletions(-)

===================================================================
--- a/arch/i386/xen/multicalls.c
+++ b/arch/i386/xen/multicalls.c
@@ -32,7 +32,11 @@ struct mc_buffer {
 struct mc_buffer {
 	struct multicall_entry entries[MC_BATCH];
 	u64 args[MC_ARGS];
-	unsigned mcidx, argidx;
+	struct callback {
+		void (*fn)(void *);
+		void *data;
+	} callbacks[MC_BATCH];
+	unsigned mcidx, argidx, cbidx;
 };
 
 static DEFINE_PER_CPU(struct mc_buffer, mc_buffer);
@@ -43,6 +47,7 @@ void xen_mc_flush(void)
 	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
 	int ret = 0;
 	unsigned long flags;
+	int i;
 
 	BUG_ON(preemptible());
 
@@ -51,8 +56,6 @@ void xen_mc_flush(void)
 	local_irq_save(flags);
 
 	if (b->mcidx) {
-		int i;
-
 		if (HYPERVISOR_multicall(b->entries, b->mcidx) != 0)
 			BUG();
 		for (i = 0; i < b->mcidx; i++)
@@ -64,6 +67,13 @@ void xen_mc_flush(void)
 		BUG_ON(b->argidx != 0);
 
 	local_irq_restore(flags);
+
+	for(i = 0; i < b->cbidx; i++) {
+		struct callback *cb = &b->callbacks[i];
+
+		(*cb->fn)(cb->data);
+	}
+	b->cbidx = 0;
 
 	BUG_ON(ret);
 }
@@ -88,3 +98,16 @@ struct multicall_space __xen_mc_entry(si
 
 	return ret;
 }
+
+void xen_mc_callback(void (*fn)(void *), void *data)
+{
+	struct mc_buffer *b = &__get_cpu_var(mc_buffer);
+	struct callback *cb;
+
+	if (b->cbidx == MC_BATCH)
+		xen_mc_flush();
+
+	cb = &b->callbacks[b->cbidx++];
+	cb->fn = fn;
+	cb->data = data;
+}
===================================================================
--- a/arch/i386/xen/multicalls.h
+++ b/arch/i386/xen/multicalls.h
@@ -42,4 +42,7 @@ static inline void xen_mc_issue(unsigned
 	local_irq_restore(x86_read_percpu(xen_mc_irq_flags));
 }
 
+/* Set up a callback to be called when the current batch is flushed */
+void xen_mc_callback(void (*fn)(void *), void *data);
+
 #endif /* _XEN_MULTICALLS_H */

[-- Attachment #3: xen-handle-lazy-cr3-on-unpin.patch --]
[-- Type: text/x-patch, Size: 5927 bytes --]

Subject: xen: deal with stale cr3 values when unpinning pagetables

When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed.  However, this is only possible if there are no
stray uses of the pagetable.  The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3.  It is only updated once
the actual hypercall to set cr3 has been completed.  Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>

---
 arch/i386/xen/enlighten.c |   63 ++++++++++++++++++++++++++++++---------------
 arch/i386/xen/mmu.c       |   33 ++++++++++++++++++++---
 arch/i386/xen/xen-ops.h   |    1
 3 files changed, 71 insertions(+), 26 deletions(-)

===================================================================
--- a/arch/i386/xen/enlighten.c
+++ b/arch/i386/xen/enlighten.c
@@ -55,7 +55,23 @@ DEFINE_PER_CPU(enum paravirt_lazy_mode,
 
 DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
-DEFINE_PER_CPU(unsigned long, xen_cr3);
+
+/*
+ * Note about cr3 (pagetable base) values:
+ *
+ * xen_cr3 contains the current logical cr3 value; it contains the
+ * last set cr3.  This may not be the current effective cr3, because
+ * its update may be being lazily deferred.  However, a vcpu looking
+ * at its own cr3 can use this value knowing that it everything will
+ * be self-consistent.
+ *
+ * xen_current_cr3 contains the actual vcpu cr3; it is set once the
+ * hypercall to set the vcpu cr3 is complete (so it may be a little
+ * out of date, but it will never be set early).  If one vcpu is
+ * looking at another vcpu's cr3 value, it should use this variable.
+ */
+DEFINE_PER_CPU(unsigned long, xen_cr3);	 /* cr3 stored as physaddr */
+DEFINE_PER_CPU(unsigned long, xen_current_cr3);	 /* actual vcpu cr3 */
 
 struct start_info *xen_start_info;
 EXPORT_SYMBOL_GPL(xen_start_info);
@@ -631,32 +647,36 @@ static unsigned long xen_read_cr3(void)
 	return x86_read_percpu(xen_cr3);
 }
 
+static void set_current_cr3(void *v)
+{
+	x86_write_percpu(xen_current_cr3, (unsigned long)v);
+}
+
 static void xen_write_cr3(unsigned long cr3)
 {
+	struct mmuext_op *op;
+	struct multicall_space mcs;
+	unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
+
 	BUG_ON(preemptible());
 
-	if (cr3 == x86_read_percpu(xen_cr3)) {
-		/* just a simple tlb flush */
-		xen_flush_tlb();
-		return;
-	}
-
+	mcs = xen_mc_entry(sizeof(*op));  /* disables interrupts */
+
+	/* Update while interrupts are disabled, so its atomic with
+	   respect to ipis */
 	x86_write_percpu(xen_cr3, cr3);
 
-
-	{
-		struct mmuext_op *op;
-		struct multicall_space mcs = xen_mc_entry(sizeof(*op));
-		unsigned long mfn = pfn_to_mfn(PFN_DOWN(cr3));
-
-		op = mcs.args;
-		op->cmd = MMUEXT_NEW_BASEPTR;
-		op->arg1.mfn = mfn;
-
-		MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
-
-		xen_mc_issue(PARAVIRT_LAZY_CPU);
-	}
+	op = mcs.args;
+	op->cmd = MMUEXT_NEW_BASEPTR;
+	op->arg1.mfn = mfn;
+
+	MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
+
+	/* Update xen_update_cr3 once the batch has actually
+	   been submitted. */
+	xen_mc_callback(set_current_cr3, (void *)cr3);
+
+	xen_mc_issue(PARAVIRT_LAZY_CPU);  /* interrupts restored */
 }
 
 /* Early in boot, while setting up the initial pagetable, assume
@@ -1124,6 +1144,7 @@ asmlinkage void __init xen_start_kernel(
 	/* keep using Xen gdt for now; no urgent need to change it */
 
 	x86_write_percpu(xen_cr3, __pa(pgd));
+	x86_write_percpu(xen_current_cr3, __pa(pgd));
 
 #ifdef CONFIG_SMP
 	/* Don't do the full vcpu_info placement stuff until we have a
===================================================================
--- a/arch/i386/xen/mmu.c
+++ b/arch/i386/xen/mmu.c
@@ -564,20 +564,43 @@ static void drop_other_mm_ref(void *info
 
 	if (__get_cpu_var(cpu_tlbstate).active_mm == mm)
 		leave_mm(smp_processor_id());
+
+	/* If this cpu still has a stale cr3 reference, then make sure
+	   it has been flushed. */
+	if (x86_read_percpu(xen_current_cr3) == __pa(mm->pgd)) {
+		load_cr3(swapper_pg_dir);
+		arch_flush_lazy_cpu_mode();
+	}
 }
 
 static void drop_mm_ref(struct mm_struct *mm)
 {
+	cpumask_t mask;
+	unsigned cpu;
+
 	if (current->active_mm == mm) {
 		if (current->mm == mm)
 			load_cr3(swapper_pg_dir);
 		else
 			leave_mm(smp_processor_id());
-	}
-
-	if (!cpus_empty(mm->cpu_vm_mask))
-		xen_smp_call_function_mask(mm->cpu_vm_mask, drop_other_mm_ref,
-					   mm, 1);
+		arch_flush_lazy_cpu_mode();
+	}
+
+	/* Get the "official" set of cpus referring to our pagetable. */
+	mask = mm->cpu_vm_mask;
+
+	/* It's possible that a vcpu may have a stale reference to our
+	   cr3, because its in lazy mode, and it hasn't yet flushed
+	   its set of pending hypercalls yet.  In this case, we can
+	   look at its actual current cr3 value, and force it to flush
+	   if needed. */
+	for_each_online_cpu(cpu) {
+		if (per_cpu(xen_current_cr3, cpu) == __pa(mm->pgd))
+			cpu_set(cpu, mask);
+	}
+
+	if (!cpus_empty(mask))
+		xen_smp_call_function_mask(mask, drop_other_mm_ref, mm, 1);
 }
 #else
 static void drop_mm_ref(struct mm_struct *mm)
===================================================================
--- a/arch/i386/xen/xen-ops.h
+++ b/arch/i386/xen/xen-ops.h
@@ -11,6 +11,7 @@ void xen_copy_trap_info(struct trap_info
 
 DECLARE_PER_CPU(struct vcpu_info *, xen_vcpu);
 DECLARE_PER_CPU(unsigned long, xen_cr3);
+DECLARE_PER_CPU(unsigned long, xen_current_cr3);
 
 extern struct start_info *xen_start_info;
 extern struct shared_info *HYPERVISOR_shared_info;

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

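For readers who want to see how the two patches above fit together, the following is a rough user-space model, not the kernel implementation. It shows a batch that runs registered callbacks only after the queued operations have been issued, and a cr3 write that updates the "logical" value immediately but the "actual" value only from such a callback. All `model_` names and the tiny batch size are hypothetical, and the immediate flush outside lazy mode is an assumption about what issuing a batch does.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BATCH 4	/* hypothetical batch size, kept tiny for the model */

struct model_callback {
	void (*fn)(void *);
	void *data;
};

struct model_batch {
	int ops[MODEL_BATCH];	/* queued operations (payload is irrelevant here) */
	struct model_callback callbacks[MODEL_BATCH];
	unsigned opidx, cbidx;
	unsigned issued;	/* total operations issued so far */
};

/* Issue everything queued, then run the completion callbacks in order. */
static void model_flush(struct model_batch *b)
{
	unsigned i;

	b->issued += b->opidx;	/* stands in for the real batched hypercall */
	b->opidx = 0;

	for (i = 0; i < b->cbidx; i++)
		b->callbacks[i].fn(b->callbacks[i].data);
	b->cbidx = 0;
}

static void model_queue_op(struct model_batch *b, int op)
{
	if (b->opidx == MODEL_BATCH)
		model_flush(b);
	b->ops[b->opidx++] = op;
}

/* Register fn(data) to run once the current batch has been flushed. */
static void model_callback_add(struct model_batch *b,
			       void (*fn)(void *), void *data)
{
	if (b->cbidx == MODEL_BATCH)
		model_flush(b);
	b->callbacks[b->cbidx].fn = fn;
	b->callbacks[b->cbidx].data = data;
	b->cbidx++;
}

static uintptr_t model_cr3;		/* logical value: last cr3 written */
static uintptr_t model_current_cr3;	/* actual value: set after the flush */

static void model_set_current_cr3(void *v)
{
	model_current_cr3 = (uintptr_t)v;
}

/* In lazy mode the operation stays queued; otherwise it is flushed at once. */
static void model_write_cr3(struct model_batch *b, uintptr_t cr3, int lazy)
{
	model_cr3 = cr3;
	model_queue_op(b, 1);
	model_callback_add(b, model_set_current_cr3, (void *)cr3);
	if (!lazy)
		model_flush(b);
}
```

In this model, another CPU that only consulted the logical value would believe the switch had already happened, while the actual value still names the old pagetable until the flush. That window is what the second patch's description is about.
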
* Re: [Xen-users] 2.6.23 oops
From: Morten Bøgeskov @ 2007-10-11 19:00 UTC
To: Jeremy Fitzhardinge; +Cc: xen-devel, Mark Williamson

Quoting Jeremy Fitzhardinge <jeremy@goop.org>:

> Mark Williamson wrote:
>> I'm bringing this discussion onto Xen-devel as it smells like it needs some
>> more specific developer input than I can give.
>>
>>
>>> I've tried to run 2.6.23 on xen 3.1.0, my hardware is a Dual Athlon MP.
>>> my DomU is running 2.6.23 SMP (vcpus = 1), without SMP it hang during
>>> boot, with it panics. I've attached my config, if somebody thinks I've
>>> left something out
>>>
>
> Oh, I just fixed this.  Try these patches.
>
>     J
>

I've applied both, and tried with and without SMP, with exactly the
same result as before.

UP hangs after "installing Xen timer for CPU 0"

SMP oopses the same way, except the line no. is now 78.

Morten Bøgeskov

* Re: [Xen-users] 2.6.23 oops
From: Jeremy Fitzhardinge @ 2007-10-11 21:04 UTC
To: Morten Bøgeskov; +Cc: xen-devel, Mark Williamson

Morten Bøgeskov wrote:
> I've applied both, and tried with and without SMP, with exactly the
> same result as before.
>
> UP hangs after "installing Xen timer for CPU 0"

Ah, OK.  I'd overlooked this one.  Hm, I probably haven't tried UP in a
while.

> SMP oopses the same way, except the line no. is now 78.

Oh, that's odd.  Could you resend your original bug report?  What kind
of load does it fail under?

    J

* Re: [Xen-users] 2.6.23 oops
From: Morten Bøgeskov @ 2007-10-11 21:21 UTC
To: Jeremy Fitzhardinge; +Cc: xen-devel, Mark Williamson

[-- Attachment #1: Type: text/plain, Size: 6671 bytes --]

Quoting Jeremy Fitzhardinge <jeremy@goop.org>:

> Morten Bøgeskov wrote:
>> I've applied both, and tried with and without SMP, with exactly the
>> same result as before.
>>
>> UP hangs after "installing Xen timer for CPU 0"
>
> Ah, OK.  I'd overlooked this one.  Hm, I probably haven't tried UP in a
> while.
>
>> SMP oopses the same way, except the line no. is now 78.
>
> Oh, that's odd.  Could you resend your original bug report?  What kind
> of load does it fail under?

I can't really say what load. It doesn't get that far. I've included .config

config:
kernel = "/usr/src/linux-2.6.23/vmlinux"
memory = 96
name = "foo"
vif = [ 'mac=00:16:3e:00:00:00, bridge=br1' ]
vcpus = 1
disk = [ 'phy:/dev/VOL/foo,hda1,w' ]
root = "/dev/xvda1 ro"
extra = "ip=192.168.0.2::192.168.0.1:255.255.255.0:foo:eth0:"

Output:
Using config file "/etc/xen/foo".
Started domain foo
Reserving virtual address space above 0xfbffe000
Linux version 2.6.23 (root@hobbes) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #32 SMP Thu Oct 11 21:02:46 CEST 2007
BIOS-provided physical RAM map:
 Xen: 0000000000000000 - 0000000006000000 (usable)
0MB HIGHMEM available.
96MB LOWMEM available.
Zone PFN ranges:
  DMA 0 -> 4096
  Normal 4096 -> 24576
  HighMem 24576 -> 24576
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0 -> 24576
DMI not present or invalid.
Allocating PCI resources starting at 10000000 (gap: 06000000:fa000000)
Built 1 zonelists in Zone order.  Total pages: 24384
Kernel command line: root=/dev/xvda1 ro ip=192.168.0.2::192.168.0.1:255.255.255.0:foo:eth0:
Local APIC disabled by BIOS -- you can enable it with "lapic"
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 512 (order: 9, 2048 bytes)
Detected 1666.723 MHz processor.
console [hvc0] enabled
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 94884k/98304k available (1543k kernel code, 3360k reserved, 343k data, 176k init, 0k highmem)
virtual kernel memory layout:
    fixmap : 0xfbf9d000 - 0xfbffd000 ( 384 kB)
    pkmap : 0xfb800000 - 0xfbc00000 (4096 kB)
    vmalloc : 0xc6800000 - 0xfb7fe000 ( 847 MB)
    lowmem : 0xc0000000 - 0xc6000000 ( 96 MB)
    .init : 0xc02dd000 - 0xc0309000 ( 176 kB)
    .data : 0xc0281ed6 - 0xc02d7d6c ( 343 kB)
    .text : 0xc0100000 - 0xc0281ed6 (1543 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
installing Xen timer for CPU 0
Calibrating delay using timer specific routine.. 3341.81 BogoMIPS (lpj=5567710)
Mount-cache hash table entries: 512
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Compat vDSO mapped to fbffc000.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 9k freed
Brought up 1 CPUs
Booting paravirtualized kernel on Xen
Hypervisor signature: xen-3.0-x86_32
Grant table initialized
NET: Registered protocol family 16
Setting up standard PCI resources
NET: Registered protocol family 2
Time: xen clocksource has been installed.
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 3, 49152 bytes)
TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
SGI XFS with no debug enabled
io scheduler noop registered
io scheduler deadline registered (default)
Initialising Xen virtual ethernet driver.
blkfront: xvda1: barriers enabled
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
XENBUS: Device with no driver: device/console/0
IP-Config: Complete:
      device=eth0, addr=192.168.0.2, mask=255.255.255.0, gw=192.168.0.1,
     host=foo, domain=, nis-domain=(none),
     bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath=
blkfront: xvda1: write barrier op failed
blkfront: xvda1: barriers disabled
Filesystem "xvda1": Disabling barriers, trial barrier write failed
XFS mounting filesystem xvda1
VFS: Mounted root (xfs filesystem) readonly.
Freeing unused kernel memory: 176k freed
------------[ cut here ]------------
kernel BUG at arch/i386/xen/multicalls.c:78!
invalid opcode: 0000 [#1]
SMP
CPU: 0
EIP: 0061:[<c0101a82>] Not tainted VLI
EFLAGS: 00010002 (2.6.23 #32)
EIP is at xen_mc_flush+0xd2/0xe0
eax: 00000000 ebx: c10c1060 ecx: 00000003 edx: 00000003
esi: c10c1060 edi: 00000000 ebp: 00000001 esp: c10efd1c
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: e021
Process swapper (pid: 1, ti=c10ee000 task=c10eba90 task.ti=c10ee000)
Stack: 000d9b10 c10c10a0 c10c1460 c130b000 c01020d0 c1329e40 c02c1100 c10eba90
       00000000 c01021f3 c1329e40 c015e498 c132849c 00000080 c12e0aa0 00000000
       c10ece60 c0159b50 c02c1ac0 c12fa560 c1253c20 c1144440 c015da4d c10efd7c
Call Trace:
 [<c01020d0>] xen_pgd_pin+0xb0/0x120
 [<c01021f3>] xen_activate_mm+0x13/0x20
 [<c015e498>] flush_old_exec+0x3c8/0x7e0
 [<c0159b50>] do_sync_read+0x0/0x120
 [<c015da4d>] kernel_read+0x3d/0x60
 [<c0185ed6>] load_elf_binary+0x316/0x1aa0
 [<c013fe27>] __alloc_pages+0x57/0x2f0
 [<c0156796>] kmem_cache_alloc+0x56/0xb0
 [<c01488eb>] handle_mm_fault+0x52b/0x6f0
 [<c014710b>] vm_normal_page+0x1b/0x80
 [<c014710b>] vm_normal_page+0x1b/0x80
 [<c0147786>] follow_page+0x106/0x180
 [<c0148bb7>] get_user_pages+0x107/0x2d0
 [<c015d59b>] get_arg_page+0x4b/0xb0
 [<c015d774>] copy_strings+0x174/0x190
 [<c015d824>] search_binary_handler+0x54/0x110
 [<c015f236>] do_execve+0x166/0x190
 [<c010460f>] sys_execve+0x2f/0x90
 [<c010605e>] syscall_call+0x7/0xb
 [<c010a3ac>] kernel_execve+0x1c/0x30
 [<c0100173>] init_post+0xa3/0xf0
 [<c02dd942>] kernel_init+0x222/0x330
 [<c0118173>] schedule_tail+0x33/0xa0
 [<c0105f1a>] ret_from_fork+0x6/0x1c
 [<c0106067>] syscall_exit+0x5/0x1b
 [<c02dd720>] kernel_init+0x0/0x330
 [<c02dd720>] kernel_init+0x0/0x330
 [<c0106c33>] kernel_thread_helper+0x7/0x10
 =======================
Code: 89 d8 72 e9 85 ed c7 86 08 07 00 00 00 00 00 00 75 19 5b 5e 5f 5d c3 0f 0b eb fe 8b 96 04 07 00 00 31 ed 85 d2 74 ac 0f 0b eb fe <0f> 0b eb fe 8d 76 00 8d bc 00 00 00 00 83 ec 0c 89 1c 24 89
EIP: [<c0101a82>] xen_mc_flush+0xd2/0xe0 SS:ESP e021:c10efd1c
Kernel panic - not syncing: Attempted to kill init!

[-- Attachment #2: config --] [-- Type: text/plain, Size: 3709 bytes --] CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_LOG_BUF_SHIFT=15 CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLAB=y CONFIG_RT_MUTEXES=y CONFIG_BASE_SMALL=0 CONFIG_BLOCK=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_DEADLINE=y CONFIG_DEFAULT_DEADLINE=y CONFIG_DEFAULT_IOSCHED="deadline" CONFIG_SMP=y CONFIG_X86_PC=y CONFIG_PARAVIRT=y CONFIG_XEN=y CONFIG_MK7=y CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_X86_XADD=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_USE_3DNOW=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_FAMILY=4 CONFIG_NR_CPUS=2 CONFIG_PREEMPT_NONE=y CONFIG_PREEMPT_BKL=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_DMIID=y CONFIG_HIGHMEM4G=y CONFIG_VMSPLIT_3G=y CONFIG_PAGE_OFFSET=0xC0000000 CONFIG_HIGHMEM=y CONFIG_ARCH_FLATMEM_ENABLE=y 
CONFIG_ARCH_SPARSEMEM_ENABLE=y CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_ARCH_POPULATES_NODE_MAP=y CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y CONFIG_SPARSEMEM_STATIC=y CONFIG_SPLIT_PTLOCK_CPUS=4096 CONFIG_ZONE_DMA_FLAG=1 CONFIG_BOUNCE=y CONFIG_NR_QUICK=1 CONFIG_VIRT_TO_BUS=y CONFIG_HZ_300=y CONFIG_HZ=300 CONFIG_PHYSICAL_START=0x100000 CONFIG_PHYSICAL_ALIGN=0x100000 CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y CONFIG_SUSPEND_SMP_POSSIBLE=y CONFIG_HIBERNATION_SMP_POSSIBLE=y CONFIG_ISA_DMA_API=y CONFIG_BINFMT_ELF=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_FIB_HASH=y CONFIG_IP_PNP=y CONFIG_TCP_CONG_CUBIC=y CONFIG_DEFAULT_TCP_CONG="cubic" CONFIG_BLK_DEV=y CONFIG_XEN_BLKDEV_FRONTEND=y CONFIG_NETDEVICES=y CONFIG_XEN_NETDEV_FRONTEND=y CONFIG_FIX_EARLYCON_MEM=y CONFIG_UNIX98_PTYS=y CONFIG_LEGACY_PTYS=y CONFIG_LEGACY_PTY_COUNT=256 CONFIG_HVC_DRIVER=y CONFIG_HVC_XEN=y CONFIG_XFS_FS=y CONFIG_INOTIFY=y CONFIG_INOTIFY_USER=y CONFIG_DNOTIFY=y CONFIG_PROC_FS=y CONFIG_PROC_SYSCTL=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_MSDOS_PARTITION=y CONFIG_TRACE_IRQFLAGS_SUPPORT=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y CONFIG_DETECT_SOFTLOCKUP=y CONFIG_SCHED_DEBUG=y CONFIG_DEBUG_BUGVERBOSE=y CONFIG_DEBUG_INFO=y CONFIG_FORCED_INLINING=y CONFIG_EARLY_PRINTK=y CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y CONFIG_PLIST=y CONFIG_HAS_IOMEM=y CONFIG_HAS_IOPORT=y CONFIG_HAS_DMA=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_X86_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y CONFIG_KTIME_SCALAR=y [-- Attachment #3: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
  2007-10-11 21:21 ` Morten Bøgeskov
@ 2007-10-11 21:47 ` Jeremy Fitzhardinge
  2007-10-12  9:46 ` Morten Bøgeskov
  0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-11 21:47 UTC (permalink / raw)
  To: Morten Bøgeskov; +Cc: xen-devel, Mark Williamson

Morten Bøgeskov wrote:
> Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
>
>> Morten Bøgeskov wrote:
>>> I've applied both, and tried with and without SMP, with exactly the
>>> same result as before.
>>>
>>> UP hangs after "installing Xen timer for CPU 0"
>>
>> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
>> while.
>>
>>> SMP oopses the same way, except the line no. is now 78.
>>
>> Oh, that's odd. Could you resend your original bug report? What kind
>> of load does it fail under?
>
> I can't really say what load. It doesn't get that far.

Odd.

> I've included .config

It seems a bit small. Can you send your whole .config so I can rebuild
your kernel?

    J

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
  2007-10-11 21:47 ` Jeremy Fitzhardinge
@ 2007-10-12  9:46 ` Morten Bøgeskov
  2007-10-12 15:30 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 16+ messages in thread
From: Morten Bøgeskov @ 2007-10-12 9:46 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel, Mark Williamson

[-- Attachment #1: Type: text/plain, Size: 4001 bytes --]

Quoting Jeremy Fitzhardinge <jeremy@goop.org>:

> Morten Bøgeskov wrote:
>> Quoting Jeremy Fitzhardinge <jeremy@goop.org>:
>>
>>> Morten Bøgeskov wrote:
>>>> I've applied both, and tried with and without SMP, with exactly the
>>>> same result as before.
>>>>
>>>> UP hangs after "installing Xen timer for CPU 0"
>>>
>>> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
>>> while.
>>>
>>>> SMP oopses the same way, except the line no. is now 78.
>>>
>>> Oh, that's odd. Could you resend your original bug report? What kind
>>> of load does it fail under?
>>
>> I can't really say what load. It doesn't get that far.
>
> Odd.
>
>> I've included .config
>
> It seems a bit small. Can you send your whole .config so I can rebuild
> your kernel?
>
Ah, you don't have PAE enabled. Could you try it with (HIGHMEM64G)? I
do technically support non-PAE kernels, but they're a bit tricky to test
(stock Xen doesn't really support non-PAE any more). I'll look at the UP
issue too.

    J

Newly compiled xen-3.1.1-rc3 PAE, 2.6.18-xen (domU) PAE &
2.6.23 PAE (config included). Still no difference.

TCP cubic registered NET: Registered protocol family 1 NET: Registered protocol family 17 Using IPI Shortcut mode blkfront: xvda1: barriers enabled XENBUS: Device with no driver: device/console/0 IP-Config: Complete: device=eth0, addr=192.168.0.2, mask=255.255.255.0, gw=192.168.0.1, host=foo, domain=, nis-domain=(none), bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath= blkfront: xvda1: write barrier op failed blkfront: xvda1: barriers disabled Filesystem "xvda1": Disabling barriers, trial barrier write failed XFS mounting filesystem xvda1 VFS: Mounted root (xfs filesystem) readonly. Freeing unused kernel memory: 180k freed ------------[ cut here ]------------ kernel BUG at arch/i386/xen/multicalls.c:78! invalid opcode: 0000 [#1] SMP CPU: 0 EIP: 0061:[<c0101a92>] Not tainted VLI EFLAGS: 00010002 (2.6.23 #33) EIP is at xen_mc_flush+0xd2/0xe0 eax: 00000000 ebx: c10c1060 ecx: 00000007 edx: 00000007 esi: c10c1060 edi: 00000000 ebp: 00000001 esp: c10efd1c ds: 007b es: 007b fs: 00d8 gs: 0000 ss: e021 Process swapper (pid: 1, ti=c10ee000 task=c10eba90 task.ti=c10ee000) Stack: 000d991c c10c1120 c10c1460 c12d6000 c0102400 c12d5e40 c02c5100 c10eba90 00000000 c01025d3 c12d5e40 c0161728 c125f49c 00000080 c1240aa0 00000000 c10ece60 c015cdd0 c02c5ac0 c131d560 c134cbc0 c1141440 c0160ccd c10efd7c Call Trace: [<c0102400>] xen_pgd_pin+0xb0/0x120 [<c01025d3>] xen_activate_mm+0x13/0x20 [<c0161728>] flush_old_exec+0x3c8/0x7e0 [<c015cdd0>] do_sync_read+0x0/0x120 [<c0160ccd>] kernel_read+0x3d/0x60 [<c01891e6>] load_elf_binary+0x316/0x1aa0 [<c0159a16>] kmem_cache_alloc+0x56/0xb0 [<c01f610c>] xfs_file_aio_read+0x6c/0x80 [<c0148b3a>] vm_normal_page+0x2a/0xb0 [<c01492ef>] follow_page+0x1af/0x230 [<c014ac25>] get_user_pages+0x105/0x360 [<c016081b>] get_arg_page+0x4b/0xb0 [<c01609f4>] copy_strings+0x174/0x190 [<c0160aa4>] search_binary_handler+0x54/0x110 [<c01624c6>] do_execve+0x166/0x190 [<c0104bef>] sys_execve+0x2f/0x90 [<c010663e>] syscall_call+0x7/0xb [<c010a98c>] 
kernel_execve+0x1c/0x30 [<c0100173>] init_post+0xa3/0xf0 [<c02e1942>] kernel_init+0x222/0x330 [<c0119453>] schedule_tail+0x33/0xa0 [<c01064fa>] ret_from_fork+0x6/0x1c [<c0106647>] syscall_exit+0x5/0x1b [<c02e1720>] kernel_init+0x0/0x330 [<c02e1720>] kernel_init+0x0/0x330 [<c0107213>] kernel_thread_helper+0x7/0x10 ======================= Code: 89 d8 72 e9 85 ed c7 86 08 07 00 00 00 00 00 00 75 19 5b 5e 5f 5d c3 0f 0b eb fe 8b 96 04 07 00 00 31 ed 85 d2 74 ac 0f 0b eb fe <0f> 0b eb fe 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89 EIP: [<c0101a92>] xen_mc_flush+0xd2/0xe0 SS:ESP e021:c10efd1c Kernel panic - not syncing: Attempted to kill init! [-- Attachment #2: config --] [-- Type: text/plain, Size: 205 bytes --] menu "Index" { "now" = menu_now; empty; "Exit" = exec "exit 0"; } menu_now "Connamds" { "foo" = call "echo foo"; "bar" = call "echo bar"; empty; "return" = menu; "Exit" = exec "exit 0"; } [-- Attachment #3: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
  2007-10-12  9:46 ` Morten Bøgeskov
@ 2007-10-12 15:30 ` Jeremy Fitzhardinge
  2007-10-12 15:34 ` Mark Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-12 15:30 UTC (permalink / raw)
  To: Morten Bøgeskov; +Cc: xen-devel, Mark Williamson

Morten Bøgeskov wrote:
> Newly compiled xen-3.1.1-rc3 PAE, 2.6.18-xen (domU) PAE &
> 2.6.23 PAE (config included). Still no difference.

I just realized I hadn't been reading your backtrace closely enough,
since it looks similar to the bug I'd been working on. Turns out having
an xfs rootfs is what triggers your bug - I can repro it now, so I'll
see if I can work out what's going on.

BTW, did last night's little patch help with the UP time issue?

    J

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
  2007-10-12 15:30 ` Jeremy Fitzhardinge
@ 2007-10-12 15:34 ` Mark Williamson
  2007-10-12 15:40 ` Keir Fraser
  2007-10-12 15:50 ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 16+ messages in thread
From: Mark Williamson @ 2007-10-12 15:34 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel, Morten Bøgeskov

> I just realized I hadn't been reading your backtrace closely enough,
> since it looks similar to the bug I'd been working on. Turns out having
> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
> see if I can work out what's going on.
>
> BTW, did last night's little patch help with the UP time issue?

I've not had a chance to try it out yet... I'll try and take a look.

But I didn't entirely understand the semantic significance of the change.
Could you possibly elaborate?

Cheers,
Mark

-- 
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Re: [Xen-users] 2.6.23 oops
  2007-10-12 15:34 ` Mark Williamson
@ 2007-10-12 15:40 ` Keir Fraser
  2007-10-12 15:50 ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 16+ messages in thread
From: Keir Fraser @ 2007-10-12 15:40 UTC (permalink / raw)
  To: Mark Williamson, Jeremy Fitzhardinge; +Cc: xen-devel, Morten Bøgeskov

On 12/10/07 16:34, "Mark Williamson" <mark.williamson@cl.cam.ac.uk> wrote:

>> I just realized I hadn't been reading your backtrace closely enough,
>> since it looks similar to the bug I'd been working on. Turns out having
>> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
>> see if I can work out what's going on.
>>
>> BTW, did last night's little patch help with the UP time issue?
>
> I've not had a chance to try it out yet... I'll try and take a look.
>
> But I didn't entirely understand the semantic significance of the change.
> Could you possibly elaborate?

It fixed the layout of the structure passed to VCPUOP_register_vcpu_info. I
would expect that to improve stability!

 -- Keir

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Xen-users] 2.6.23 oops
  2007-10-12 15:34 ` Mark Williamson
  2007-10-12 15:40 ` Keir Fraser
@ 2007-10-12 15:50 ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-12 15:50 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel, Morten Bøgeskov

Mark Williamson wrote:
>> I just realized I hadn't been reading your backtrace closely enough,
>> since it looks similar to the bug I'd been working on. Turns out having
>> an xfs rootfs is what triggers your bug - I can repro it now, so I'll
>> see if I can work out what's going on.
>>
>> BTW, did last night's little patch help with the UP time issue?
>>
>
> I've not had a chance to try it out yet... I'll try and take a look.
>
> But I didn't entirely understand the semantic significance of the change.
> Could you possibly elaborate?
>

There was version drift in the register_vcpu_info hypercall arg
structure, and the version of the structure being used by the kernel was
smaller than the one that Xen was expecting. That meant that the mfn
argument was OK, but the offset was being corrupted, and so the
vcpu_info structure could have been placed anywhere, corrupting kernel
memory. For me it manifested as an oops, but it could also have
corrupted the timing parameters - or at the very least, reading the time
from the vcpu_info structure wouldn't work.

So I think there's a good chance this change would fix the UP problem.
It doesn't hit in the same way in SMP because the per-cpu data area is
elsewhere, but it could still have caused havoc.

    J

^ permalink raw reply [flat|nested] 16+ messages in thread
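
[Editorial illustration, not part of the original message.] The size
mismatch described above can be sketched in plain userspace C. This is
only a rough model under stated assumptions: the two struct layouts are
copied from the patch posted in this thread, but the struct names, the
helper offset_seen_by_new_reader() and the 0xAA filler bytes are
hypothetical, and the real hypercall copies guest memory rather than a
local buffer. It shows only that a reader expecting the 16-byte layout
takes "offset" from bytes the 8-byte writer never filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout the guest kernel used before the patch (8 bytes). */
struct old_vcpu_info_arg {
    uint32_t mfn;
    uint32_t offset;
};

/* Layout after the patch (16 bytes), the one Xen expects per the thread. */
struct new_vcpu_info_arg {
    uint64_t mfn;
    uint32_t offset;
    uint32_t rsvd;
};

/* "offset" moves from byte 4 to byte 8 between the two layouts. */
static_assert(sizeof(struct old_vcpu_info_arg) == 8, "old layout is 8 bytes");
static_assert(sizeof(struct new_vcpu_info_arg) == 16, "new layout is 16 bytes");
static_assert(offsetof(struct old_vcpu_info_arg, offset) == 4, "old offset at 4");
static_assert(offsetof(struct new_vcpu_info_arg, offset) == 8, "new offset at 8");

/*
 * Hypothetical helper: the writer fills in only the old 8-byte layout,
 * the reader interprets the same memory with the new 16-byte layout.
 * The bytes after the old struct are pre-filled with 0xAA to stand in
 * for whatever unrelated data happens to follow it in memory.
 */
uint32_t offset_seen_by_new_reader(uint32_t mfn, uint32_t offset)
{
    unsigned char mem[sizeof(struct new_vcpu_info_arg)];
    struct old_vcpu_info_arg written = { .mfn = mfn, .offset = offset };
    struct new_vcpu_info_arg seen;

    memset(mem, 0xAA, sizeof(mem));          /* neighbouring garbage */
    memcpy(mem, &written, sizeof(written));  /* writer: 8 bytes only */
    memcpy(&seen, mem, sizeof(seen));        /* reader: 16 bytes     */

    return seen.offset;                      /* comes from bytes 8..11 */
}
```

Calling offset_seen_by_new_reader(0x1234, 0x40) in this model returns the
filler pattern rather than 0x40, which is the kind of corrupted offset the
message describes.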
* Re: [Xen-users] 2.6.23 oops
  2007-10-11 21:04 ` Jeremy Fitzhardinge
  2007-10-11 21:21 ` Morten Bøgeskov
@ 2007-10-12  4:14 ` Mark Williamson
  2007-10-12  6:21 ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 16+ messages in thread
From: Mark Williamson @ 2007-10-12 4:14 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel, Morten Bøgeskov

> Morten Bøgeskov wrote:
> > I've applied both, and tried with and without SMP, with exactly the
> > same result as before.
> >
> > UP hangs after "installing Xen timer for CPU 0"
>
> Ah, OK. I'd overlooked this one. Hm, I probably haven't tried UP in a
> while.

Actually, I think the kernel is hanging in an infinite loop whilst trying to
measure loops per jiffy... For some reason the UP kernel isn't getting
timer interrupts from Xen at this point, whereas the SMP kernel is. I've not
figured out why this could be happening yet, but that's where I am at the
moment.

Cheers,
Mark

> > SMP oopses the same way, except the line no. is now 78.
>
> Oh, that's odd. Could you resend your original bug report? What kind
> of load does it fail under?
>
> J

-- 
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!

^ permalink raw reply [flat|nested] 16+ messages in thread
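
[Editorial illustration, not part of the original message.] The hang
Mark suspects can be pictured with a small userspace model. This is a
simplified sketch, not the kernel's calibration code: wait_for_tick(),
tick_source_fn, no_timer() and timer_fires() are hypothetical names, and
the max_spins bound exists only so the example terminates. A loop that
waits for the tick counter to change with no such bound would spin
forever if the timer interrupt never arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the timer interrupt: called once per spin,
 * it may or may not advance the tick counter.
 */
typedef void (*tick_source_fn)(volatile unsigned long *jiffies);

/* Models a guest that never receives timer interrupts. */
void no_timer(volatile unsigned long *jiffies)
{
    (void)jiffies;
}

/* Models a working timer: the counter advances. */
void timer_fires(volatile unsigned long *jiffies)
{
    (*jiffies)++;
}

/*
 * Wait for the tick counter to change, as a delay calibration loop has
 * to do before it can start counting. Returns true if a tick was seen,
 * false if max_spins iterations passed without one.
 */
bool wait_for_tick(volatile unsigned long *jiffies,
                   tick_source_fn source,
                   unsigned long max_spins)
{
    assert(jiffies != NULL && source != NULL);

    unsigned long start = *jiffies;

    for (unsigned long spins = 0; spins < max_spins; spins++) {
        source(jiffies);            /* "interrupt" may or may not fire */
        if (*jiffies != start)
            return true;            /* tick edge observed */
    }
    return false;                   /* no tick: unbounded code would hang here */
}
```

With timer_fires as the source the function returns true almost
immediately; with no_timer it gives up after max_spins, which corresponds
to the point where the UP guest appears to hang.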
* Re: [Xen-users] 2.6.23 oops
  2007-10-12  4:14 ` Mark Williamson
@ 2007-10-12  6:21 ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-12 6:21 UTC (permalink / raw)
  To: Mark Williamson; +Cc: xen-devel, Morten Bøgeskov

Mark Williamson wrote:
> Actually, I think the kernel is hanging in an infinite loop whilst trying to
> measure loops per jiffies... For some reason the UP kernel isn't getting
> timer interrupts from Xen at this point, whereas the SMP kernel is. I've not
> figured out why this could be happening, yet, but that's where I am at the
> moment.

I can't reproduce that specific symptom, but this may help.

diff -r 25f5c8bdd699 arch/i386/xen/enlighten.c
--- a/arch/i386/xen/enlighten.c	Thu Oct 11 18:46:33 2007 -0700
+++ b/arch/i386/xen/enlighten.c	Thu Oct 11 23:20:29 2007 -0700
@@ -115,7 +115,7 @@ static void __init xen_vcpu_setup(int cp
 	info.mfn = virt_to_mfn(vcpup);
 	info.offset = offset_in_page(vcpup);
 
-	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %x, offset %d\n",
+	printk(KERN_DEBUG "trying to map vcpu_info %d at %p, mfn %llx, offset %d\n",
 	       cpu, vcpup, info.mfn, info.offset);
 
 	/* Check to see if the hypervisor will put the vcpu_info
diff -r 25f5c8bdd699 include/xen/interface/vcpu.h
--- a/include/xen/interface/vcpu.h	Thu Oct 11 18:46:33 2007 -0700
+++ b/include/xen/interface/vcpu.h	Thu Oct 11 23:20:29 2007 -0700
@@ -160,8 +160,9 @@ struct vcpu_set_singleshot_timer {
  */
 #define VCPUOP_register_vcpu_info 10 /* arg == struct vcpu_info */
 struct vcpu_register_vcpu_info {
-	uint32_t mfn;		/* mfn of page to place vcpu_info */
-	uint32_t offset;	/* offset within page */
+	uint64_t mfn;		/* mfn of page to place vcpu_info */
+	uint32_t offset;	/* offset within page */
+	uint32_t rsvd;		/* unused */
 };
 
 #endif /* __XEN_PUBLIC_VCPU_H__ */

    J

^ permalink raw reply [flat|nested] 16+ messages in thread
Thread overview: 16+ messages
[not found] <20071011144817.n5x1drwcgug44sg0@whitelist.dk>
2007-10-11 15:29 ` [Xen-users] 2.6.23 oops Mark Williamson
2007-10-11 16:27 ` Mark Williamson
2007-10-11 18:22 ` Morten Bøgeskov
2007-10-11 18:53 ` Mark Williamson
2007-10-11 16:41 ` Jeremy Fitzhardinge
2007-10-11 19:00 ` Morten Bøgeskov
2007-10-11 21:04 ` Jeremy Fitzhardinge
2007-10-11 21:21 ` Morten Bøgeskov
2007-10-11 21:47 ` Jeremy Fitzhardinge
2007-10-12 9:46 ` Morten Bøgeskov
2007-10-12 15:30 ` Jeremy Fitzhardinge
2007-10-12 15:34 ` Mark Williamson
2007-10-12 15:40 ` Keir Fraser
2007-10-12 15:50 ` Jeremy Fitzhardinge
2007-10-12 4:14 ` Mark Williamson
2007-10-12 6:21 ` Jeremy Fitzhardinge