From mboxrd@z Thu Jan 1 00:00:00 1970 From: Konrad Rzeszutek Wilk Subject: Linux guests hitting VCPUOP_stop_singleshot_timer returning -EINVAL. Date: Wed, 8 May 2013 13:02:27 -0400 Message-ID: <20130508170227.GA32140@phenom.dumpdata.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="9amGYk9869ThD9tj" Return-path: Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: George.Dunlap@eu.citrix.com, xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org --9amGYk9869ThD9tj Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hey George, The issue I am hitting happens quite often but not always. And I hope you can help me understand the bootup part of PVHVM guests which I think is part of the problem. I can reproduce this way back to Xen 4.1 so I think this has been in the hypervisor for some time. First of the guest starts with just one VCPU: builder=3D'hvm' device_model_version =3D 'qemu-xen-traditional' disk =3D [ 'file:/mnt/lab/latest/root_image.iso,hdc:cdrom,r'] memory =3D 2048 boot=3D"d" maxvcpus=3D32 vcpus=3D1 serial=3D'pty' vnclisten=3D"0.0.0.0" name=3D"latest" vif =3D [ 'mac=3D00:0F:4B:00:00:68, bridge=3Dswitch' ] And hit the issue when I try to add more VCPUs: xl vcpu-set latest 16 which triggers this bug in the Linux code (see below for the code): =2E. May 8 16:03:48 (none) udevd-work[2153]: error opening ATTR{/sys/devices/sy= stem/cpu/cpu1/online} for writing: No such file or directory =0D=0D [ 84.585905] CPU 1 got hotplugged=0D [ 84.590192] installing Xen timer for CPU 1=0D [ 84.596371] SMP alternatives: lockdep: fixing up alternatives=0D [ 84.603560] SMP alternatives: switching to SMP code=0D [ 84.619508] smpboot: Booting Node 0 Processor 1 APIC 0x8=0D [ 84.639766] ------------[ cut here ]------------=0D [ 84.639766] WARNING: at /home/konrad/ssd/konrad/linux/arch/x86/xen/time.= c:336 xen_vcpuop_set_mode+0xc2/0xd0()=0D [ 84.639766] Hardware name: HVM domU=0D [ 84.639766] VCPUOP_stop_singleshot_timer on CPU1 ret: -22=0D [ 84.639766] Modules linked in: dm_multipath dm_mod iscsi_boot_sysfs iscs= i_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c sg sr_mod= cdrom ata_generic crc32c_intel ata_piix libata scsi_mod xen_blkfront xen_n= etfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_kbdfront xenfs xe= n_privcmd=0D [ 84.639766] Pid: 0, comm: swapper/1 Not tainted 3.9.0upstream-00022-g49c= 1bf1-dirty #6=0D [ 84.639766] Call Trace:=0D [ 84.639766] [] warn_slowpath_common+0x7a/0xc0=0D [ 84.639766] [] warn_slowpath_fmt+0x41/0x50=0D [ 84.639766] [] ? tick_notify+0x114/0x420=0D [ 84.639766] [] xen_vcpuop_set_mode+0xc2/0xd0=0D [ 84.639766] [] clockevents_set_mode+0x25/0x70=0D [ 84.639766] [] clockevents_shutdown+0x16/0x30=0D [ 84.639766] [] clockevents_exchange_device+0xb7/0x110= =0D [ 84.639766] [] ? tick_notify+0x114/0x420=0D [ 84.639766] [] tick_notify+0x1c9/0x420=0D [ 84.639766] [] ? clockevents_register_device+0x31/0x1= 70=0D [ 84.639766] [] notifier_call_chain+0x4d/0x70=0D [ 84.639766] [] raw_notifier_call_chain+0x11/0x20=0D [ 84.639766] [] clockevents_register_device+0xe0/0x170= =0D [ 84.639766] [] xen_setup_cpu_clockevents+0x2c/0x50=0D [ 84.639766] [] xen_hvm_setup_cpu_clockevents+0x16/0x2= 0=0D [ 84.639766] [] start_secondary+0x1ea/0x1f9=0D [ 84.639766] ---[ end trace f218984223a7067d ]---=0D [ 84.639766] ------------[ cut here ]------------=0D [ 84.639766] WARNING: at /home/konrad/ssd/konrad/linux/arch/x86/xen/time.= c:338 xen_vcpuop_set_mode+0x9f/0xd0()=0D [ 84.639766] Hardware name: HVM domU=0D [ 84.639766] VCPUOP_stop_periodic_timer on CPU1 ret: -22=0D [ 84.639766] Modules linked in: dm_multipath dm_mod iscsi_boot_sysfs iscs= i_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c sg sr_mod= cdrom ata_generic crc32c_intel ata_piix libata scsi_mod xen_blkfront xen_n= etfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_kbdfront xenfs xe= n_privcmd=0D [ 84.639766] Pid: 0, comm: swapper/1 Tainted: G W 3.9.0upstream= -00022-g49c1bf1-dirty #6=0D [ 84.639766] Call Trace:=0D [ 84.639766] [] warn_slowpath_common+0x7a/0xc0=0D [ 84.639766] [] warn_slowpath_fmt+0x41/0x50=0D [ 84.639766] [] ? tick_notify+0x114/0x420=0D [ 84.639766] [] xen_vcpuop_set_mode+0x9f/0xd0=0D [ 84.639766] [] clockevents_set_mode+0x25/0x70=0D [ 84.639766] [] clockevents_shutdown+0x16/0x30=0D [ 84.639766] [] clockevents_exchange_device+0xb7/0x110= =0D [ 84.639766] [] ? tick_notify+0x114/0x420=0D [ 84.639766] [] tick_notify+0x1c9/0x420=0D [ 84.639766] [] ? clockevents_register_device+0x31/0x1= 70=0D [ 84.639766] [] notifier_call_chain+0x4d/0x70=0D [ 84.639766] [] raw_notifier_call_chain+0x11/0x20=0D [ 84.639766] [] clockevents_register_device+0xe0/0x170= =0D [ 84.639766] [] xen_setup_cpu_clockevents+0x2c/0x50=0D [ 84.639766] [] xen_hvm_setup_cpu_clockevents+0x16/0x2= 0=0D [ 84.639766] [] start_secondary+0x1ea/0x1f9=0D [ 84.639766] ---[ end trace f218984223a7067e ]---=0D [ 84.639766] ------------[ cut here ]------------=0D [ 84.639766] kernel BUG at /home/konrad/ssd/konrad/linux/arch/x86/xen/tim= e.c:340!=0D [ 84.639766] invalid opcode: 0000 [#1] SMP =0D [ 84.639766] Modules linked in: dm_multipath dm_mod iscsi_boot_sysfs iscs= i_tcp libiscsi_tcp libiscsi scsi_transport_iscsi libcrc32c crc32c sg sr_mod= cdrom ata_generic crc32c_intel ata_piix libata scsi_mod xen_blkfront xen_n= etfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_kbdfront xenfs xe= n_privcmd=0D [ 84.639766] CPU 1 =0D [ 84.639766] Pid: 0, comm: swapper/1 Tainted: G W 3.9.0upstream= -00022-g49c1bf1-dirty #6 Xen HVM domU=0D [ 84.639766] RIP: 0010:[] [] xen_vcp= uop_set_mode+0x9f/0xd0=0D [ 84.639766] RSP: 0000:ffff880073c6bd98 EFLAGS: 00010092=0D [ 84.639766] RAX: 0000000000000024 RBX: 0000000000000001 RCX: 00000000000= 00000=0D [ 84.639766] RDX: ffff880073c68300 RSI: 0000000000000000 RDI: 00000000000= 00009=0D [ 84.639766] RBP: ffff880073c6bdb8 R08: 0000000000000001 R09: 00000000000= 00000=0D [ 84.639766] R10: 0000000000000258 R11: 0000000000000000 R12: 00000000fff= fffea=0D [ 84.639766] R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000= 00082=0D [ 84.639766] FS: 0000000000000000(0000) GS:ffff880074220000(0000) knlGS:= 0000000000000000=0D [ 84.639766] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033=0D [ 84.639766] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000= 406e0=0D [ 84.639766] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000= 00000=0D [ 84.639766] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000= 00400=0D [ 84.639766] Process swapper/1 (pid: 0, threadinfo ffff880073c6a000, task= ffff880073c68300)=0D [ 84.639766] Stack:=0D [ 84.639766] 0000000000000000 ffff88007422be40 0000000000000001 00000000= 00000082=0D [ 84.639766] ffff880073c6bdd8 ffffffff810e8385 ffff88007422be40 ffff8800= 7422be40=0D [ 84.639766] ffff880073c6bdf8 ffffffff810e83e6 ffff880073c6bdf8 00000000= 00000000=0D [ 84.639766] Call Trace:=0D [ 84.639766] [] clockevents_set_mode+0x25/0x70=0D [ 84.639766] [] clockevents_shutdown+0x16/0x30=0D [ 84.639766] [] clockevents_exchange_device+0xb7/0x110= =0D [ 84.639766] [] ? tick_notify+0x114/0x420=0D [ 84.639766] [] tick_notify+0x1c9/0x420=0D [ 84.639766] [] ? clockevents_register_device+0x31/0x1= 70=0D [ 84.639766] [] notifier_call_chain+0x4d/0x70=0D [ 84.639766] [] raw_notifier_call_chain+0x11/0x20=0D [ 84.639766] [] clockevents_register_device+0xe0/0x170= =0D [ 84.639766] [] xen_setup_cpu_clockevents+0x2c/0x50=0D [ 84.639766] [] xen_hvm_setup_cpu_clockevents+0x16/0x2= 0=0D [ 84.639766] [] start_secondary+0x1ea/0x1f9=0D [ 84.639766] Code: 00 48 c7 c7 f8 bc 9b 81 e8 bf d6 04 00 eb c9 89 d9 48 = c7 c2 98 bd 9b 81 be 52 01 00 00 48 c7 c7 f8 bc 9b 81 31 c0 e8 01 d7 04 00 = <0f> 0b eb fe 41 89 c0 89 d9 48 c7 c2 68 bd 9b 81 be 50 01 00 00 =0D [ 84.639766] RIP [] xen_vcpuop_set_mode+0x9f/0xd0=0D [ 84.639766] RSP =0D [ 84.639766] ---[ end trace f218984223a7067f ]---=0D [ 84.639766] Kernel panic - not syncing: Attempted to kill the idle task!= =0D Parsing config from /mnt/lab/latest/vm.cfg Daemon running with PID 7341 Here is the patch that I applied on v3.9 (along with some other patches to fix the VCPU hotplug path in there): diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 3d88bfd..d2d1248 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -319,7 +319,7 @@ static void xen_vcpuop_set_mode(enum clock_event_mode m= ode, struct clock_event_device *evt) { int cpu =3D smp_processor_id(); - + int rc; switch (mode) { case CLOCK_EVT_MODE_PERIODIC: WARN_ON(1); /* unsupported */ @@ -332,9 +332,13 @@ static void xen_vcpuop_set_mode(enum clock_event_mode = mode, =20 case CLOCK_EVT_MODE_UNUSED: case CLOCK_EVT_MODE_SHUTDOWN: - if (HYPERVISOR_vcpu_op(VCPUOP_stop_singleshot_timer, cpu, NULL) || - HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL)) + rc =3D HYPERVISOR_vcpu_op(VCPUOP_stop_singleshot_timer, cpu, NULL); + WARN(rc, "VCPUOP_stop_singleshot_timer on CPU%d ret: %d\n", cpu, rc); + rc |=3D HYPERVISOR_vcpu_op(VCPUOP_stop_periodic_timer, cpu, NULL); + WARN(rc, "VCPUOP_stop_periodic_timer on CPU%d ret: %d\n", cpu, rc); + if (rc) { BUG(); + } break; case CLOCK_EVT_MODE_RESUME: break; If you look in the kernel it says that it does on VCPU1, but the hypervisor (this is with another debug patch - this time attached), you can see: =2E. snip.. (XEN) irq.c:375: Dom5 callback via changed to Direct Vector 0xf3 (XEN) irq.c:270: Dom5 PCI link 0 changed 5 -> 0 (XEN) irq.c:270: Dom5 PCI link 1 changed 10 -> 0 (XEN) irq.c:270: Dom5 PCI link 2 changed 11 -> 0 (XEN) irq.c:270: Dom5 PCI link 3 changed 5 -> 0 =3D=3D=3D> (XEN) d5 [`=06=04:-137435614851072] on VCPU4 <=3D=3D=3D=3D (XEN) d5v0: timer [singleshot] in 385066us vcpu_singleshot_timer_fn+0/0= xb (XEN) d5v0 on CPU-1 - - 0 (XEN) d5v0: poll:0 not pending - (XEN) d5v1 on CPU-1 - - 0 (XEN) d5v1: poll:0 pending pending sel It thinks that 'current' is for VCPU4, not VCPU1! 1022 = =20 1023 case VCPUOP_stop_singleshot_timer: = =20 1024 if ( v !=3D current ) = =20 [v is VCPU1, current is VCPU4] 1025 return -EINVAL; = =20 1026 = =20 1027 stop_timer(&v->singleshot_timer); = =20 1028 = =20 1029 break; =20 And if I run xentrace and xenalyze it confirms: 0.097606174 --x- d32767v2 runstate_change d7v4 offline->runnable Creating vcpu 4 for dom 7 ] 0.097606174 --x- d32767v2 22006(2:2:6) 1 [ 3 ] ] 0.097607074 --x- d32767v2 28004(2:8:4) 2 [ 7 4 ] 0.097607786 ---x d32767v3 pm_idle_end c1 0.097607995 --x- d32767v2 runstate_change d7v0 offline->runnable ] 0.097607995 --x- d32767v2 22006(2:2:6) 1 [ 1 ] ] 0.097608930 --x- d32767v2 28004(2:8:4) 2 [ 7 0 ] ] 0.097609486 ---x d32767v3 2800e(2:8:e) 2 [ 7fff 9a2abb ] ] 0.097609562 --x- d32767v2 2800e(2:8:e) 2 [ 7fff 49e9 ] ] 0.097609620 ---x d32767v3 2800f(2:8:f) 3 [ 7 8c2 1c9c380 ] 0.097609660 -x-- d32767v1 pm_idle_end c1 ] 0.097609690 --x- d32767v2 2800f(2:8:f) 3 [ 7 3d7 1c9c380 ] ] 0.097609780 ---x d32767v3 2800a(2:8:a) 4 [ 7fff 3 7 4 ] ] 0.097609845 --x- d32767v2 2800a(2:8:a) 4 [ 7fff 2 7 0 ] 0.097609933 ---x d32767v3 runstate_change d32767v3 running->runnable 0.097609998 --x d32767v2 runstate_change d32767v2 running->runnable 0.097610125 -- x d?v? runstate_change d7v4 runnable->running 0.097610168 --x| d?v? runstate_change d7v0 runnable->running 0.097612126 -x|| d32767v1 runstate_continue d32767v1 running->running 0.097612739 -x|| d32767v1 pm_idle_start c1 ! 0.097614366 --x| d7v0 vmentry ! 0.097615354 --|x d7v4 vmentry ] 0.097616526 --|x d7v4 vmexit exit_reason WBINVD eip 1 ] 0.097617403 --|x d7v4 vmentry cycles 2104=20 ] 0.097618322 --|x d7v4 vmexit exit_reason CPUID eip 1073 ] 0.097618322 --|x d7v4 cpuid [ 0 d 756e6547 6c65746e 49656e69 ] ] 0.097619549 --|x d7v4 vmentry cycles 2944=20 ] 0.097620297 --|x d7v4 vmexit exit_reason CPUID eip 10c0 =2E.blah blah.. bootup of the VCPU. Somehow it is using VCPU4 but the Linux guest thinks it is running in VCPU1. Please note that I can only cause this if maxvcpus !=3D vcpus. And sometimes it works just great.=20 Any ideas? --9amGYk9869ThD9tj Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0001-xen-vcpu-Add-VCPUOP_marker.patch" >>From d7b5aee76865943b92cf4b40fa6a1adec9f7eed3 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Tue, 16 Apr 2013 16:17:48 -0400 Subject: [PATCH] xen/vcpu: Add VCPUOP_marker Signed-off-by: Konrad Rzeszutek Wilk debug stop_singlehost_timer --- xen/arch/x86/hvm/hvm.c | 1 + xen/common/compat/domain.c | 1 + xen/common/domain.c | 69 +++++++++++++++++++++++++++++++++++++++++-- xen/include/public/vcpu.h | 8 +++++ 4 files changed, 75 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 7c3cb15..65a8dff 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3174,6 +3174,7 @@ static long hvm_vcpu_op( case VCPUOP_set_singleshot_timer: case VCPUOP_stop_singleshot_timer: case VCPUOP_register_vcpu_info: + case VCPUOP_marker: rc = do_vcpu_op(cmd, vcpuid, arg); break; default: diff --git a/xen/common/compat/domain.c b/xen/common/compat/domain.c index 44ba78d..b0336c8 100644 --- a/xen/common/compat/domain.c +++ b/xen/common/compat/domain.c @@ -65,6 +65,7 @@ int compat_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) case VCPUOP_stop_periodic_timer: case VCPUOP_stop_singleshot_timer: case VCPUOP_send_nmi: + case VCPUOP_marker: rc = do_vcpu_op(cmd, vcpuid, arg); break; diff --git a/xen/common/domain.c b/xen/common/domain.c index fac3470..6e00095 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -41,7 +41,8 @@ #include #include #include - +#include +#include /* Linux config option: propageted to domain0 */ /* xen_processor_pmbits: xen control Cx, Px, ... */ unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; @@ -1082,9 +1083,10 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) } case VCPUOP_stop_singleshot_timer: - if ( v != current ) - return -EINVAL; - + if ( v != current ) { + rc = -EINVAL; + goto print_out; + } stop_timer(&v->singleshot_timer); break; @@ -1143,7 +1145,66 @@ long do_vcpu_op(int cmd, int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) break; #endif + case VCPUOP_marker: + { + struct timer_info { + struct timer *t; + char *name; + } t[2]; + int i,set; + struct vcpu_marker marker = { .s = "\0", .line = 0 }; + if ( copy_from_guest(&marker, arg, 1) ) + return -EFAULT; + rc = 0; +print_out: + t[0].name = "singleshot"; + t[1].name = "periodic"; + printk("d%d [%s:%8"PRId64"] on VCPU%d\n", d->domain_id, marker.s, marker.line, current->vcpu_id); + for_each_vcpu ( d, v ) + { + struct tasklet *task; + + t[0].t = &v->singleshot_timer; + t[1].t = &v->periodic_timer; + for (i = 0; i < 2; i++) { + struct timer *_t = t[i].t; + if (!_t) + continue; + if (_t->status < TIMER_STATUS_in_heap) + continue; + + printk(" d%dv%d: timer [%s] in %8"PRId64"us", + d->domain_id, v->vcpu_id, t[i].name, _t->expires - NOW()); + print_symbol(" %s\n", (unsigned long)_t->function); + } + task = &v->arch.hvm_vcpu.assert_evtchn_irq_tasklet; + if (task) + printk(" d%dv%d on CPU%d %s %s %d\n", + d->domain_id, v->vcpu_id, task->scheduled_on, + task->is_dead ? "dead" : "-", + task->is_running ? "running" : "-", + task->is_running); + printk(" d%dv%d: %s %s poll:%d %s pending %s\n", + d->domain_id, v->vcpu_id, + test_bit(VCPU_KICK_SOFTIRQ, &softirq_pending(v->processor)) ? "IPIed": "", + /* Which means SOFTIRQ handler does whatever it needs to ? */ + /* If runing, should send IPI of 0xFC (XEn) */ + v->paused_for_shutdown ? "paused" : "", + v->poll_evtchn, + vcpu_info(v, evtchn_upcall_pending) ? "" : "not", + vcpu_info(v, evtchn_pending_sel) ? "pending sel": "-"); + for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) { + set = test_bit(i, &vcpu_info(v, evtchn_pending_sel)); + if (set) + printk(" d%d event: %d %s\n", d->domain_id, i, + test_bit(i, &shared_info(d, evtchn_mask)) ? "masked" : "-"); + } + } + /* TODO: Will show up as TRC_HYPERCALL_V2 */ + + break; + } default: rc = arch_do_vcpu_op(cmd, v, arg); break; diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h index e888daf..b0578a3 100644 --- a/xen/include/public/vcpu.h +++ b/xen/include/public/vcpu.h @@ -227,6 +227,14 @@ struct vcpu_register_time_memory_area { typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t; DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t); +#define VCPUOP_marker 14 +struct vcpu_marker { + char s[4]; + uint64_t line; +}; +typedef struct vcpu_marker vcpu_marker_t; +DEFINE_XEN_GUEST_HANDLE(vcpu_marker_t); + #endif /* __XEN_PUBLIC_VCPU_H__ */ /* -- 1.7.7.6 --9amGYk9869ThD9tj Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --9amGYk9869ThD9tj--