From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: george.dunlap@eu.citrix.com, xen-devel@lists.xenproject.org,
	wei.liu2@citrix.com, Ian Campbell <ian.campbell@citrix.com>
Subject: Re: Second regression due to libxl: Remove linux udev rules (2ba368d13893402b2f1fb3c283ddcc714659dd9b)
Date: Fri, 7 Aug 2015 10:54:40 -0400
Message-ID: <20150807145440.GK29527@l.oracle.com>
In-Reply-To: <55C07468.4040909@citrix.com>

On Tue, Aug 04, 2015 at 10:14:32AM +0200, Roger Pau Monné wrote:
> On 30/07/15 at 10:53, Roger Pau Monné wrote:
> > On 28/07/15 at 21:47, Konrad Rzeszutek Wilk wrote:
> >> Hey,
> >>
> >> I launch a bunch of guests at the same time or in parallel and 
> >> the scripts end up timing out with:
> >>
> >>
> >> Parsing config from //g-vm8.cfg
> >> WARNING: you seem to be using "kernel" directive to override HVM guest firmware. Ignore that. Use "firmware_override" instead if you really want a non-default firmware
> >> Jul 28 19:20:53 tst036 logger: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/13/5632
> >> libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /etc/xen/scripts/block add because of timeout
> >> libxl: error: libxl_create.c:1157:domcreate_launch_dm: unable to add disk devices
> >> libxl: error: libxl_dm.c:1955:kill_device_model: unable to find device model pid in /local/domain/13/image/device-model-pid
> >> libxl: error: libxl.c:1606:libxl__destroy_domid: libxl__destroy_device_model failed for 13
> >> Jul 28 19:21:03 tst036 logger: /etc/xen/scripts/block: remove XENBUS_PATH=backend/vbd/13/5632
> >> Jul 28 19:21:04 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/13/5632/hotplug-error xenstore-read backend/vbd/13/5632/node failed. backend/vbd/13/5632/hotplug-status error to xenstore.
> >> Jul 28 19:21:04 tst036 logger: /etc/xen/scripts/block: xenstore-read backend/vbd/13/5632/node failed.
> >> Jul 28 19:21:05 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/13/5632/hotplug-error /etc/xen/scripts/block failed; error detected. backend/vbd/13/5632/hotplug-status error to xenstore.
> >> Jul 28 19:21:05 tst036 logger: /etc/xen/scripts/block: /etc/xen/scripts/block failed; error detected.
> >> libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [10344] exited with error status 1
> >> libxl: error: libxl_device.c:1085:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected.
> >> libxl: error: libxl.c:1569:libxl__destroy_domid: non-existant domain 13
> >> libxl: error: libxl.c:1527:domain_destroy_callback: unable to destroy guest with domid 13
> >> libxl: error: libxl.c:1454:domain_destroy_cb: destruction of domain 13 failed
> >>
> >> And I cannot start the guest.
> >>
> >> While if I revert the mentioned commit everything works peachy.
> >>
> >> What is interesting is that if I have the revert I can see that the
> >>
> >> Jul 28 19:39:03 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/14/5632/physical-device 7:d to xenstore.
> >> Jul 28 19:39:03 tst036 logger: /etc/xen/scripts/block: Writing backend/vbd/14/5632/hotplug-status connected to xenstore.
> >>
> >> are often done much, much later after xl create has started.
> >>
> >> Attached is the bad log and the good log.
> > 
> > Can you do the same test with xl -vvv and the following patch applied 
> > (with and without 2ba368 reverted):
> 
> Ping?

Hey!
> 
> I've looked into this, and AFAICT you were probably using the udev 
> rules (you have run_hotplug_scripts=0 in xl.conf?) before 2ba368, and 

Correct. I think I needed that for driver domains and had left it in there.
> now you are forcefully switched to launching hotplug scripts from libxl.

OK.
> 
> The issue is that you have multiple guests all using the same image 
> file, so the time to execute the block hotplug script is O(n), where n 
> is the number of times the same image is used:
> 
> shared_list=$(losetup -a |
>       sed -n -e "s@^\([^:]\+\)\(:[[:blank:]]\[0*${dev}\]:${inode}[[:blank:]](.*)\)@\1@p" )
> for dev in $shared_list
> do
>   if [ -n "$dev" ]
>   then
>     check_file_sharing "$file" "$dev" "$mode"
>   fi
> done
> 
> This was not a problem when using udev, because there's no timeout, but 
> libxl has a hard timeout (10s) regarding hotplug script execution. The 
> only way I see to solve this is to remove the checks done in the block 
> hotplug script, or to increase the timeout (but since the execution 
> time is not bounded this is doomed to fail if enough guests are using 
> the same image).

Ok. I hadn't run your patch yet. Do you want me to run the latest staging
instead once more with my test-case?
> 
> Roger.
> 

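The O(n) behaviour Roger describes above can be sketched with a canned `losetup -a` listing. This is only an illustration under stated assumptions, not the block script itself: `fake_losetup` and `count_sharers` are hypothetical helpers, the device number, inode and image paths are made up, and the sed filter is the one quoted above with `\+` spelled portably. It counts how many loop devices one run of the script would hand to `check_file_sharing`.

```shell
#!/bin/sh
# Hypothetical stand-in for `losetup -a`: prints n loop devices that all
# back the same image file (device 0803 and inode 12345 are made-up values).
fake_losetup() {
  n=$1
  i=0
  while [ "$i" -lt "$n" ]
  do
    echo "/dev/loop$i: [0803]:12345 (/images/shared.img)"
    i=$((i + 1))
  done
  # One unrelated loop device that the filter must not match.
  echo "/dev/loop99: [0803]:99999 (/images/other.img)"
}

# Same filter as the quoted snippet, with `\+` written as `[^:][^:]*`.
# Prints how many loop devices share the device/inode pair, i.e. how many
# check_file_sharing calls a single run of the script would make.
count_sharers() {
  n=$1
  dev=803
  inode=12345
  fake_losetup "$n" |
    sed -n -e "s@^\([^:][^:]*\)\(:[[:blank:]]\[0*${dev}\]:${inode}[[:blank:]](.*)\)@\1@p" |
    wc -l | tr -d ' '
}

for guests in 1 10 100
do
  echo "guests=$guests checks_per_script_run=$(count_sharers "$guests")"
done
```

If each check takes roughly constant time, the run time of one script invocation grows linearly with the number of guests sharing the image, so any fixed timeout is eventually exceeded.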