xen-devel.lists.xenproject.org archive mirror
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	george.dunlap@eu.citrix.com, xen-devel@lists.xenproject.org,
	wei.liu2@citrix.com, Ian Campbell <ian.campbell@citrix.com>
Subject: Re: Second regression due to libxl: Remove linux udev rules (2ba368d13893402b2f1fb3c283ddcc714659dd9b)
Date: Tue, 4 Aug 2015 10:14:32 +0200
Message-ID: <55C07468.4040909@citrix.com>
In-Reply-To: <55B9E614.4040504@citrix.com>

On 30/07/15 at 10:53, Roger Pau Monné wrote:
> On 28/07/15 at 21:47, Konrad Rzeszutek Wilk wrote:
>> Hey,
>>
>> I launch a bunch of guests at the same time or in parallel and 
>> the scripts end up timing out with:
>>
>>
>> Parsing config from //g-vm8.cfg
>> WARNING: you seem to be using "kernel" directive to override HVM guest firmware. Ignore that. Use "firmware_override" instead if you really want a non-default firmware
>> Jul 28 19:20:53 tst036 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/13/5632
>> libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /etc/xen/scripts/block add because of timeout
>> libxl: error: libxl_create.c:1157:domcreate_launch_dm: unable to add disk devices
>> libxl: error: libxl_dm.c:1955:kill_device_model: unable to find device model pid in /local/domain/13/image/device-model-pid
>> libxl: error: libxl.c:1606:libxl__destroy_domid: libxl__destroy_device_model failed for 13
>> Jul 28 19:21:03 tst036 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/13/5632
>> Jul 28 19:21:04 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/13/5632/hotplug-error xenstore-read backend/vbd/13/5632/node failed. backend/vbd/13/5632/hotplug-status error to xenstore.
>> Jul 28 19:21:04 tst036 logger: /etc/xen/scripts/block: xenstore-read backend/vbd/13/5632/node failed.
>> Jul 28 19:21:05 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/13/5632/hotplug-error /etc/xen/scripts/block failed; error detected. backend/vbd/13/5632/hotplug-status error to xenstore.
>> Jul 28 19:21:05 tst036 logger: /etc/xen/scripts/block: /etc/xen/scripts/block failed; error detected.
>> libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [10344] exited with error status 1
>> libxl: error: libxl_device.c:1085:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected.
>> libxl: error: libxl.c:1569:libxl__destroy_domid: non-existant domain 13
>> libxl: error: libxl.c:1527:domain_destroy_callback: unable to destroy guest with domid 13
>> libxl: error: libxl.c:1454:domain_destroy_cb: destruction of domain 13 failed
>>
>> And I cannot start the guest.
>>
>> While if I revert the mentioned commit everything works peachy.
>>
>> What is interesting is that if I have the revert I can see that the
>>
>> Jul 28 19:39:03 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/14/5632/physical-device 7:d to xenstore.
>> Jul 28 19:39:03 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/14/5632/hotplug-status connected to xenstore.
>>
>> are often done much, much later after xl create has started.
>>
>> Attached is the bad log and the good log.
> 
> Can you do the same test with xl -vvv and the following patch applied 
> (with and without 2ba368 reverted):

Ping?

I've looked into this, and AFAICT you were probably using the udev 
rules before 2ba368 (do you have run_hotplug_scripts=0 in xl.conf?), and 
now you are forcibly switched to launching hotplug scripts from libxl.

The issue is that you have multiple guests all using the same image 
file, so the time to execute the block hotplug script is O(n), where n 
is the number of loop devices already backed by that image. The script 
runs a sharing check against each of them:

shared_list=$(losetup -a |
      sed -n -e "s@^\([^:]\+\)\(:[[:blank:]]\[0*${dev}\]:${inode}[[:blank:]](.*)\)@\1@p" )
for dev in $shared_list
do
  if [ -n "$dev" ]
  then
    check_file_sharing "$file" "$dev" "$mode"
  fi
done
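
To illustrate how that list grows with the number of guests, here is a
minimal sketch that applies a portable variant of the same sed expression
to canned input. The function name shared_loops and the sample
"losetup -a" lines (device 803, inode 1234, the /images paths) are made
up for illustration; they are not taken from the block script or from
the logs in this thread.

```shell
# Hypothetical helper: read `losetup -a`-style text on stdin and print the
# loop devices whose backing file matches a device number ($1) and inode ($2).
shared_loops() {
  sed -n -e "s@^\([^:][^:]*\)\(:[[:blank:]]\[0*${1}\]:${2}[[:blank:]](.*)\)@\1@p"
}

# Made-up sample: three loop devices backed by the same file and one
# backed by a different file.
sample='/dev/loop0: [0803]:1234 (/images/guest.img)
/dev/loop1: [0803]:1234 (/images/guest.img)
/dev/loop2: [0803]:9999 (/images/other.img)
/dev/loop3: [0803]:1234 (/images/guest.img)'

shared=$(printf '%s\n' "$sample" | shared_loops 803 1234)

# One sharing check would run per entry, so the work grows with this count.
count=$(( $(printf '%s\n' "$shared" | wc -l) ))
echo "loop devices sharing the image: $count"
```

Each additional guest started from the same image adds one more matching
line, and therefore one more check for every later invocation.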

The O(n) cost was not a problem when using udev, because there's no 
timeout there, but libxl enforces a hard timeout (10s) on hotplug script 
execution. The only ways I see to solve this are to remove the checks 
done in the block hotplug script, or to increase the timeout (but since 
the execution time is not bounded, this is doomed to fail if enough 
guests are using the same image).
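
As a rough back-of-the-envelope sketch of why raising the timeout only
moves the limit: if each sharing check took a fixed amount of time, the
number of checks that fit in the timeout is just a division. The helper
name max_checks_within_timeout and the 250ms per-check cost are
assumptions for illustration, not measurements, and the sketch ignores
everything else the script does.

```shell
# Hypothetical helper: how many sequential sharing checks fit in a timeout.
# $1 = timeout in milliseconds, $2 = assumed cost of one check in milliseconds.
max_checks_within_timeout() {
  echo $(( $1 / $2 ))
}

timeout_ms=10000    # the 10s libxl hotplug timeout mentioned above
per_check_ms=250    # made-up figure, not a measurement

limit=$(max_checks_within_timeout "$timeout_ms" "$per_check_ms")
echo "at ${per_check_ms}ms per check, at most ${limit} checks fit in ${timeout_ms}ms"
```

Whatever the real per-check cost is, doubling the timeout only doubles
this limit, so a large enough number of guests on one image still hits it.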

Roger.
