* Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-21 8:42 UTC
To: xen-devel@lists.xensource.com

Hi,

while testing my "linear p2m list" patches I saw the following
problem (even without my patches in place):

In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
disk image of a guest by attaching it to dom0:

    xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
    mount /dev/xvda2 /mnt
    ...
    umount /mnt
    xl block-detach 0 xvda

Worked without any problem. After some seconds the following messages
were issued on the console:

(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61110 (pfn 1f3f21c)
(XEN) mm.c:2995:d0 Error while pinning mfn 61110
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61110 (pfn 1f3f21c)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61111 (pfn 1f3f21d)
(XEN) mm.c:2995:d0 Error while pinning mfn 61111
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61111 (pfn 1f3f21d)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61120 (pfn 1f3f22c)
(XEN) mm.c:2995:d0 Error while pinning mfn 61120
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61120 (pfn 1f3f22c)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61121 (pfn 1f3f22d)
(XEN) mm.c:2995:d0 Error while pinning mfn 61121
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61121 (pfn 1f3f22d)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61102 (pfn 1f3f20e)
(XEN) mm.c:2995:d0 Error while pinning mfn 61102
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61102 (pfn 1f3f20e)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61103 (pfn 1f3f20f)
(XEN) mm.c:2995:d0 Error while pinning mfn 61103
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61103 (pfn 1f3f20f)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms

Is this a known issue?


Juergen
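When a console log repeats the same "Bad type" report many times, it can help to collapse it to one line per affected frame. The following is only a rough sketch, assuming the message layout is exactly as in the report above; the function name `summarize_bad_type` and the regular expression are illustrative and not part of Xen or any Xen tool.

```python
import re
from collections import Counter

# Matches the "Bad type" lines as they appear in the report above.
# \s+ before "for mfn" tolerates the line wrapping added by mail clients.
BAD_TYPE = re.compile(
    r"\(XEN\) mm\.c:\d+:d(?P<dom>\d+) Bad type "
    r"\(saw (?P<saw>[0-9a-f]+) != exp (?P<exp>[0-9a-f]+)\)"
    r"\s+for mfn (?P<mfn>[0-9a-f]+) \(pfn (?P<pfn>[0-9a-f]+)\)"
)


def summarize_bad_type(log_text):
    """Count "Bad type" reports per (mfn, pfn) pair; both are parsed as hex."""
    counts = Counter()
    for m in BAD_TYPE.finditer(log_text):
        counts[(int(m["mfn"], 16), int(m["pfn"], 16))] += 1
    return counts


# Sample input copied from the console output in the report.
sample = """\
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61110 (pfn 1f3f21c)
(XEN) mm.c:2995:d0 Error while pinning mfn 61110
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61110 (pfn 1f3f21c)
(XEN) mm.c:906:d0 Attempt to create linear p.t. with write perms
(XEN) mm.c:2352:d0 Bad type (saw 7400000000000002 != exp 1000000000000000) for mfn 61111 (pfn 1f3f21d)
"""

for (mfn, pfn), n in sorted(summarize_bad_type(sample).items()):
    print(f"mfn {mfn:x} pfn {pfn:x}: {n} report(s)")
```

The sketch only groups the reports; it makes no attempt to interpret the "saw" and "exp" type values.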
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Konrad Rzeszutek Wilk @ 2014-11-21 13:57 UTC
To: Juergen Gross
Cc: xen-devel@lists.xensource.com

On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
> Hi,
>
> while testing my "linear p2m list" patches I saw the following
> problem (even without my patches in place):
>
> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
> disk image of a guest by attaching it to dom0:
>
> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
> mount /dev/xvda2 /mnt
> ...
> umount /mnt
> xl block-detach 0 xvda
>
> Worked without any problem. After some seconds the following messages
> were issued on the console:
>
> [...]
>
> Is this a known issue?

No. First time I see it.
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-24 9:55 UTC
To: Konrad Rzeszutek Wilk
Cc: xen-devel@lists.xensource.com

On 11/21/2014 02:57 PM, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 21, 2014 at 09:42:11AM +0100, Juergen Gross wrote:
>> In dom0 running linux 3.18-rc5 on top of Xen 4.4.1 I modified the
>> disk image of a guest by attaching it to dom0:
>>
>> xl block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
>> mount /dev/xvda2 /mnt
>> ...
>> umount /mnt
>> xl block-detach 0 xvda

Did some more testing:

- Seems to occur only when attaching a block device to dom0, problem
  shows up only after doing the block-detach.
- Sometimes I see only NMI watchdog messages, looking into hanging cpu
  state via xen debug keys I can see the cpu(s) in question are spinning
  in _raw_spin_lock():
  __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
  The hanging cpus were executing some random user processes (cron,
  bash, xargs), cr2 contained user addresses.
- Up to now I've seen the problem on a rather huge machine only (128GB,
  120 cpus). I just did a quick test on my laptop and nothing bad
  happened.
- Even on the huge machine just doing block-attach and block-detach
  without mounting the filesystem on the disk was harmless.

Any ideas how to find the problem?


Juergen

>> Worked without any problem. After some seconds the following messages
>> were issued on the console:
>>
>> [...]
>>
>> Is this a known issue?
>
> No. First time I see it.
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Ian Campbell @ 2014-11-24 10:02 UTC
To: Juergen Gross
Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
> Did some more testing:
> - Seems to occur only when attaching a block device to dom0

That means a qdisk backed situation then, I think.

Is your qemu up to date?

http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e
resulted in slightly different messages, but could be related?

Ian.
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-25 16:10 UTC
To: Ian Campbell
Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On 11/24/2014 11:02 AM, Ian Campbell wrote:
> On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
>> Did some more testing:
>> - Seems to occur only when attaching a block device to dom0
>
> That means a qdisk backed situation then, I think.
>
> Is your qemu up to date?
>
> http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e
> resulted in slightly different messages, but could be related?

Tried it with xen-unstable and newest qemu. Now I have another problem:

xl -vvv block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: create: how=(nil) callback=(nil) poller=0xbac1d0
libxl: error: libxl.c:1897:device_addrm_aocomplete: unable to add device
libxl: debug: libxl_event.c:1739:libxl__ao_complete: ao 0xbac0f0: complete, rc=-16
libxl: debug: libxl.c:4178:libxl_device_disk_add: ao 0xbac0f0: inprogress: poller=0xbac1d0, flags=ic
libxl: debug: libxl_event.c:1711:libxl__ao__destroy: ao 0xbac0f0: destroy
libxl_device_disk_add failed.

rc=-16 means ERROR_JSON_CONFIG_EMPTY.

Could it be that newest tools won't let me attach a block device to
dom0? IMHO that's a severe regression!


Juergen
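For longer `xl -vvv` traces, the failing return code can be picked out mechanically. This is a small sketch under the assumption that the "complete, rc=..." line always has the shape shown above; the helper `find_failures` and the `KNOWN_RC` table are hypothetical, and the table lists only the one code named in this thread rather than guessing at others.

```python
import re

# Only the code identified in the thread is listed here; any other value is
# reported as "unknown" instead of being guessed.
KNOWN_RC = {-16: "ERROR_JSON_CONFIG_EMPTY"}

# \s+ after the colon tolerates the line wrapping added by mail clients.
RC_RE = re.compile(
    r"libxl__ao_complete: ao (0x[0-9a-f]+):\s+complete, rc=(-?\d+)"
)


def find_failures(output):
    """Return (ao, rc, name) for every completed operation with rc != 0."""
    failures = []
    for ao, rc_text in RC_RE.findall(output):
        rc = int(rc_text)
        if rc != 0:
            failures.append((ao, rc, KNOWN_RC.get(rc, "unknown")))
    return failures


# Sample input copied from the xl output in the message above.
sample = """\
libxl: error: libxl.c:1897:device_addrm_aocomplete: unable to add device
libxl: debug: libxl_event.c:1739:libxl__ao_complete: ao 0xbac0f0: complete, rc=-16
libxl: debug: libxl_event.c:1711:libxl__ao__destroy: ao 0xbac0f0: destroy
"""

for ao, rc, name in find_failures(sample):
    print(f"ao {ao} failed with rc={rc} ({name})")
```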
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Ian Campbell @ 2014-11-25 16:18 UTC
To: Juergen Gross
Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On Tue, 2014-11-25 at 17:10 +0100, Juergen Gross wrote:
> Tried it with xen-unstable and newest qemu. Now I have another problem:
>
> xl -vvv block-attach 0 file:/var/lib/libvirt/images/opensuse13-1/xvda,xvda,w
> [...]
> libxl_device_disk_add failed.
>
> rc=-16 means ERROR_JSON_CONFIG_EMPTY.

Have you updated your initscripts to run the xen-init-dom0 helper? This
replaced the manual xenstore-write stuff a little while back and added
some other init, including creating the necessary json description of
dom0.

Ian.
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-25 16:48 UTC
To: Ian Campbell
Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On 11/25/2014 05:18 PM, Ian Campbell wrote:
> On Tue, 2014-11-25 at 17:10 +0100, Juergen Gross wrote:
>> rc=-16 means ERROR_JSON_CONFIG_EMPTY.
>
> Have you updated your initscripts to run the xen-init-dom0 helper? This
> replaced the manual xenstore-write stuff a little while back and added
> some other init, including creating the necessary json description of
> dom0.

Bingo! Now it works.

Thanks,

Juergen
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-25 17:01 UTC
To: Ian Campbell
Cc: xen-devel@lists.xensource.com, Stefano Stabellini

On 11/24/2014 11:02 AM, Ian Campbell wrote:
> On Mon, 2014-11-24 at 10:55 +0100, Juergen Gross wrote:
>> Did some more testing:
>> - Seems to occur only when attaching a block device to dom0
>
> That means a qdisk backed situation then, I think.
>
> Is your qemu up to date?
>
> http://xenbits.xen.org/gitweb/?p=qemu-upstream-unstable.git;a=commit;h=abbbc2f09a53f8f9ee565356ab11a78af006e45e
> resulted in slightly different messages, but could be related?

Newest qemu and xen-unstable together with my dom0 kernel patch made all
issues go away at a first glance.


Juergen
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Jan Beulich @ 2014-11-24 10:20 UTC
To: Juergen Gross
Cc: xen-devel@lists.xensource.com

>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>   state via xen debug keys I can see the cpu(s) in question are spinning
>   in _raw_spin_lock():
>   __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>   The hanging cpus were executing some random user processes (cron,
>   bash, xargs), cr2 contained user addresses.

Is this perhaps what
http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
appears to be about?

Jan
[parent not found: <5473147C020000780004A3D5@suse.com>]
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-24 10:59 UTC
To: Jan Beulich
Cc: xen-devel@lists.xensource.com

On 11/24/2014 11:20 AM, Jan Beulich wrote:
>>>> On 24.11.14 at 10:55, <JGross@suse.com> wrote:
>> - Sometimes I see only NMI watchdog messages, looking into hanging cpu
>>   state via xen debug keys I can see the cpu(s) in question are spinning
>>   in _raw_spin_lock():
>>   __handle_mm_fault()->__pte_alloc()->pmd_lock()->_raw_spin_lock()
>>   The hanging cpus were executing some random user processes (cron,
>>   bash, xargs), cr2 contained user addresses.
>
> Is this perhaps what
> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
> appears to be about?

Hmm, I'm not sure.

I'll try a 3.17 kernel to verify.


Juergen
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-24 15:09 UTC
To: Jan Beulich
Cc: xen-devel@lists.xensource.com

On 11/24/2014 11:59 AM, Juergen Gross wrote:
> On 11/24/2014 11:20 AM, Jan Beulich wrote:
>> Is this perhaps what
>> http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg02135.html
>> appears to be about?
>
> Hmm, I'm not sure.
>
> I'll try a 3.17 kernel to verify.

Still seeing the issue, but less frequent. OTOH I just found in above
thread in lkml that 3.17 is showing that issue, too. :-(

I'll try to setup a pv-variant of Linus' patch and test it...


Juergen
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-24 17:27 UTC
To: Jan Beulich
Cc: xen-devel@lists.xensource.com

On 11/24/2014 04:09 PM, Juergen Gross wrote:
> Still seeing the issue, but less frequent. OTOH I just found in above
> thread in lkml that 3.17 is showing that issue, too. :-(
>
> I'll try to setup a pv-variant of Linus' patch and test it...

First test seems to be okay, no immediate NMI message...

Any idea why the block-attach/detach would trigger this problem so
easily? I can see the dependency on the high cpu count, but fail to do
so for the xl actions.


Juergen
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Jan Beulich @ 2014-11-25 7:50 UTC
To: Juergen Gross
Cc: xen-devel@lists.xensource.com

>>> On 24.11.14 at 18:27, <JGross@suse.com> wrote:
> On 11/24/2014 04:09 PM, Juergen Gross wrote:
>> Still seeing the issue, but less frequent. OTOH I just found in above
>> thread in lkml that 3.17 is showing that issue, too. :-(
>>
>> I'll try to setup a pv-variant of Linus' patch and test it...
>
> First test seems to be okay, no immediate NMI message...
>
> Any idea why the block-attach/detach would trigger this problem so
> easily? I can see the dependency on the high cpu count, but fail to do
> so for the xl actions.

No, no idea. It was only the similar symptoms that made me consider
this a possibility.

Jan
* Re: Hypervisor error messages after xl block-detach with linux 3.18-rc5
From: Juergen Gross @ 2014-11-25 4:47 UTC
To: Jan Beulich
Cc: xen-devel@lists.xensource.com, Ian.Campbell@citrix.com

On 11/24/2014 04:09 PM, Juergen Gross wrote:
> Still seeing the issue, but less frequent. OTOH I just found in above
> thread in lkml that 3.17 is showing that issue, too. :-(
>
> I'll try to setup a pv-variant of Linus' patch and test it...

Okay, test survived the night. Seems really to be the same issue.

I think I'm seeing the qemu issue now Ian mentioned:

[ 140.182849] xen:grant_table: WARNING: g.e. 0x10 still in use!
[ 140.182859] deferring g.e. 0x10 (pfn 0xffffffffffffffff)
[ 140.182864] xen:grant_table: WARNING: g.e. 0xf still in use!
[ 140.182866] deferring g.e. 0xf (pfn 0xffffffffffffffff)
...
[ 140.183128] xen:grant_table: WARNING: g.e. 0x2a still in use!
[ 140.183129] deferring g.e. 0x2a (pfn 0xffffffffffffffff)
[ 142.182274] xen:grant_table: freeing g.e. 0x9
[ 145.182284] xen:grant_table: freeing g.e. 0x44
[ 147.182272] xen:grant_table: freeing g.e. 0x43
[ 501.182282] xen:grant_table: g.e. 0x10 still pending
[ 501.182315] xen:grant_table: g.e. 0xf still pending
...

I'll update qemu and try again...


Juergen
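To see at a glance which deferred grant entries never got freed, the kernel messages can be replayed in order. The sketch below assumes the dmesg wording is exactly as quoted in this message; the function name `outstanding_grants` is made up for illustration and is not a kernel or Xen interface.

```python
import re

# Patterns follow the wording of the kernel messages quoted above.
DEFER_RE = re.compile(r"deferring g\.e\. (0x[0-9a-f]+)")
FREE_RE = re.compile(r"xen:grant_table: freeing g\.e\. (0x[0-9a-f]+)")


def outstanding_grants(dmesg_text):
    """Return grant entries that were deferred but not freed afterwards."""
    pending = set()
    for line in dmesg_text.splitlines():
        m = DEFER_RE.search(line)
        if m:
            pending.add(int(m.group(1), 16))
            continue
        m = FREE_RE.search(line)
        if m:
            # discard() is a no-op for entries whose deferral was not logged
            pending.discard(int(m.group(1), 16))
    return sorted(pending)


# Sample input copied from the excerpt above (which is itself abbreviated).
sample = """\
[ 140.182849] xen:grant_table: WARNING: g.e. 0x10 still in use!
[ 140.182859] deferring g.e. 0x10 (pfn 0xffffffffffffffff)
[ 140.182864] xen:grant_table: WARNING: g.e. 0xf still in use!
[ 140.182866] deferring g.e. 0xf (pfn 0xffffffffffffffff)
[ 140.183128] xen:grant_table: WARNING: g.e. 0x2a still in use!
[ 140.183129] deferring g.e. 0x2a (pfn 0xffffffffffffffff)
[ 142.182274] xen:grant_table: freeing g.e. 0x9
[ 145.182284] xen:grant_table: freeing g.e. 0x44
[ 147.182272] xen:grant_table: freeing g.e. 0x43
[ 501.182282] xen:grant_table: g.e. 0x10 still pending
"""

print([hex(g) for g in outstanding_grants(sample)])
```

Because the quoted excerpt elides lines, the result for it is only indicative; on a complete log the same replay would list every entry still deferred at the end.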