From: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
To: xen-devel@lists.xen.org, glenn@rimuhosting.com
Cc: Juergen Gross <jgross@suse.com>
Subject: Re: null domains after xl destroy
Date: Tue, 11 Apr 2017 11:49:48 +0200
Message-ID: <3385656.IoOB642KYU@amur>
In-Reply-To: <70eae378-2392-bd82-670a-5dafff58c259@rimuhosting.com>
On Tuesday, 11 April 2017 at 20:03:14, Glenn Enright wrote:
> On 11/04/17 17:59, Juergen Gross wrote:
> > On 11/04/17 07:25, Glenn Enright wrote:
> >> Hi all
> >>
> >> We are seeing an odd issue with domU domains after xl destroy: under
> >> recent 4.9 kernels a (null) domain is left behind.
> >
> > I guess this is the dom0 kernel version?
> >
> >> This has occurred on a variety of hardware, with no obvious commonality.
> >>
> >> 4.4.55 does not show this behavior.
> >>
> >> On my test machine I have the following packages installed under
> >> centos6, from https://xen.crc.id.au/
> >>
> >> ~]# rpm -qa | grep xen
> >> xen47-licenses-4.7.2-4.el6.x86_64
> >> xen47-4.7.2-4.el6.x86_64
> >> kernel-xen-4.9.21-1.el6xen.x86_64
> >> xen47-ocaml-4.7.2-4.el6.x86_64
> >> xen47-libs-4.7.2-4.el6.x86_64
> >> xen47-libcacard-4.7.2-4.el6.x86_64
> >> xen47-hypervisor-4.7.2-4.el6.x86_64
> >> xen47-runtime-4.7.2-4.el6.x86_64
> >> kernel-xen-firmware-4.9.21-1.el6xen.x86_64
> >>
> >> I've also replicated the issue with 4.9.17 and 4.9.20.
> >>
> >> To replicate, on a cleanly booted dom0 with one PV VM, I run the
> >> following on the VM
> >>
> >> {
> >> while true; do
> >> dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
> >> done
> >> }
> >>
> >> Then on the dom0 I run this sequence to reliably get a null domain. This
> >> occurs with both oxenstored and xenstored.
> >>
> >> {
> >> xl sync 1
> >> xl destroy 1
> >> }
> >>
> >> xl list then renders something like ...
> >>
> >> (null)                             1     4     4     --p--d       9.8
> >
> > Something is referencing the domain, e.g. some of its memory pages are
> > still mapped by dom0.
You can try
# xl debug-keys q
and then
# xl dmesg
to see the output of the previous command. The 'q' key dumps domain
(and guest debug) info, and
# xl debug-keys h
lists all the available debug keys.
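For example, for the stuck domain above (assuming its ID is still 1; the
exact dump format differs between Xen versions):
# xl debug-keys q
# xl dmesg | grep -i -A 20 'domain 1'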
Dietmar.
> >
> >> From what I can see it appears to be disk related. Affected VMs all use
> >> LVM storage for their boot disk. lvdisplay of the affected LV shows that
> >> it is being held open by something.
> >
> > How are the disks configured? Especially the backend type is important.
> >
> >>
> >> ~]# lvdisplay test/test.img | grep open
> >> # open 1
> >>
> >> I've not yet been able to determine what that thing is. I tried lsof,
> >> dmsetup, and various LVM tools. Waiting for the disk to be released
> >> does not work.
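One more thing to look at (a sketch, untested on this setup): resolve the
LV to its dm-N node and check the sysfs holders directory. Note this only
lists devices stacked on top; a reference taken from kernel context (e.g.
by xen-blkback for a phy backend) shows up neither there nor in lsof, it
only bumps the open count.

{
# device-mapper path backing the LV, e.g. /dev/mapper/test-test.img
lv=$(lvs --noheadings -o lv_dm_path test/test.img | tr -d ' ')
# kernel name of that node, e.g. dm-3
dm=$(basename "$(readlink -f "$lv")")
dmsetup info "$lv"            # shows the open count again
ls "/sys/block/$dm/holders"   # stacked devices only
}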
> >>
> >> ~]# xl list
> >> Name                              ID   Mem VCPUs      State   Time(s)
> >> Domain-0                           0  1512     2     r-----      29.0
> >> (null)                             1     4     4     --p--d       9.8
> >>
> >> xenstore-ls reports nothing for the null domain id that I can see.
> >
> > Any qemu process related to the domain still running?
> >
> > Any dom0 kernel messages related to Xen?
> >
> >
> > Juergen
> >
>
> Yep, 4.9 dom0 kernel
>
> Typically we see an xl process still running, but that has already gone
> away in this case. The domU is a PV guest using a phy disk definition;
> the basic startup is like this...
>
> xl -v create -f paramfile extra="console=hvc0 elevator=noop
> xen-blkfront.max=64"
>
> There are no qemu processes or threads anywhere I can see.
>
> I don't see any meaningful messages in the Linux kernel log, and nothing
> at all in the hypervisor log. Here is the output from the dom0 starting
> and then stopping a domU using the above mechanism...
>
> br0: port 2(vif3.0) entered disabled state
> br0: port 2(vif4.0) entered blocking state
> br0: port 2(vif4.0) entered disabled state
> device vif4.0 entered promiscuous mode
> IPv6: ADDRCONF(NETDEV_UP): vif4.0: link is not ready
> xen-blkback: backend/vbd/4/51713: using 2 queues, protocol 1
> (x86_64-abi) persistent grants
> xen-blkback: backend/vbd/4/51721: using 2 queues, protocol 1
> (x86_64-abi) persistent grants
> vif vif-4-0 vif4.0: Guest Rx ready
> IPv6: ADDRCONF(NETDEV_CHANGE): vif4.0: link becomes ready
> br0: port 2(vif4.0) entered blocking state
> br0: port 2(vif4.0) entered forwarding state
> br0: port 2(vif4.0) entered disabled state
> br0: port 2(vif4.0) entered disabled state
> device vif4.0 left promiscuous mode
> br0: port 2(vif4.0) entered disabled state
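The 'persistent grants' lines in that log may be relevant to Juergen's
point about pages still being mapped. If your Xen build has the 'g' debug
key (grant-table usage), its dump may show whether the dead domain still
has active grant mappings:
# xl debug-keys g
# xl dmesg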
>
> ... here is xl info ...
>
> host : xxxxxxxxxxxx
> release : 4.9.21-1.el6xen.x86_64
> version : #1 SMP Sat Apr 8 18:03:45 AEST 2017
> machine : x86_64
> nr_cpus : 4
> max_cpu_id : 3
> nr_nodes : 1
> cores_per_socket : 4
> threads_per_core : 1
> cpu_mhz : 2394
> hw_caps : b7ebfbff:0000e3bd:20100800:00000001:00000000:00000000:00000000:00000000
> virt_caps :
> total_memory : 8190
> free_memory : 6577
> sharing_freed_memory : 0
> sharing_used_memory : 0
> outstanding_claims : 0
> free_cpus : 0
> xen_major : 4
> xen_minor : 7
> xen_extra : .2
> xen_version : 4.7.2
> xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p
> xen_scheduler : credit
> xen_pagesize : 4096
> platform_params : virt_start=0xffff800000000000
> xen_changeset :
> xen_commandline : dom0_mem=1512M cpufreq=xen dom0_max_vcpus=2
> dom0_vcpus_pin log_lvl=all guest_loglvl=all vcpu_migration_delay=1000
> cc_compiler : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17)
> cc_compile_by : mockbuild
> cc_compile_domain : (none)
> cc_compile_date : Mon Apr 3 12:17:20 AEST 2017
> build_id : 0ec32d14d7c34e5d9deaaf6e3b7ea0c8006d68fa
> xend_config_format : 4
>
>
> # cat /proc/cmdline
> ro root=UUID=xxxxxxxxxx rd_MD_UUID=xxxxxxxxxxxx rd_NO_LUKS
> KEYBOARDTYPE=pc KEYTABLE=us LANG=en_US.UTF-8 rd_MD_UUID=xxxxxxxxxxxxx
> SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_NO_LVM rd_NO_DM rhgb quiet
> pcie_aspm=off panic=30 max_loop=64 dm_mod.use_blk_mq=y xen-blkfront.max=64
>
> The domU is using an LVM volume on top of an md RAID1 array, on directly
> attached HDDs. Nothing special hardware-wise. The disk line for that domU
> looks functionally like...
>
> disk = [ 'phy:/dev/testlv/test.img,xvda1,w' ]
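It may also be worth dumping the backend side of the vbd while the (null)
domain exists (path layout as in the kernel log above, e.g.
backend/vbd/4/51713; substitute the stuck domain's ID). If the node is
still present, its 'state' entry tells you whether blkback finished
tearing down (5 = Closing, 6 = Closed):
# xenstore-ls -f /local/domain/0/backend/vbd/<domid>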
>
> I would appreciate any suggestions on how to increase the debug level in
> a relevant way or where to look to get more useful information on what
> is happening.
>
> To clarify the actual shutdown sequence that causes problems...
>
> # xl sysrq $id s
> # xl destroy $id
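As a comparison, it would be interesting whether a cooperative shutdown,
where xl waits for the guest and the backends to disconnect before the
domain is torn down, leaves the same (null) domain behind (-w makes xl
wait for the shutdown to finish):
# xl shutdown -w $id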
>
>
> Regards, Glenn
>
--
Company details: http://ts.fujitsu.com/imprint.html
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel