* [BUG,PATCH] race in xen-block-hotplug
@ 2014-01-15  7:28 Philipp Hahn
  2014-01-15  8:07 ` Roger Pau Monné
  0 siblings, 1 reply; 3+ messages in thread
From: Philipp Hahn @ 2014-01-15  7:28 UTC (permalink / raw)
  To: Xen-devel, Keir Fraser

Hello,

we encountered a strange race condition in tools/hotplug/Linux/block,
which shows up only very rarely, and only once: for the first domain
started after the host server has booted. The complete details are in our
Bugzilla at <https://forge.univention.org/bugzilla/show_bug.cgi?id=20481>.

In our case it's Xen-4.1.3 (yes, I know it's ancient, but the code is
still the same in current Xen-git) and it only happens for file-backed
devices (in our case an ISO image as CD-ROM).

>         # Avoid a race with the remove if the path has been deleted, or
>         # otherwise changed from "InitWait" state e.g. due to a timeout
>         xenbus_state=$(xenstore_read_default "$XENBUS_PATH/state" 'unknown')
>         if [ "$xenbus_state" != '2' ]
>         then
>           release_lock "block"
>           fatal "Path closed or removed during hotplug add: $XENBUS_PATH state: $xenbus_state"
>         fi

The problem is that sometimes the other end (kernel/qemu?) is too slow
and the device is still in state 1=Initializing. If that happens, the
domU start is aborted and the domain is destroyed.
If the same VM is then started again, it works flawlessly.
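
For readers unfamiliar with the numeric states, here is a minimal sketch of a lookup helper. The names follow enum xenbus_state in xen/include/public/io/xenbus.h. The function name xenbus_state_name is an assumption made for illustration and is not part of the hotplug scripts.

```shell
# Map a numeric XenbusState value to its symbolic name.
# xenbus_state_name is a hypothetical helper, not part of tools/hotplug.
xenbus_state_name() {
    case "$1" in
        0) echo "Unknown" ;;
        1) echo "Initialising" ;;
        2) echo "InitWait" ;;
        3) echo "Initialised" ;;
        4) echo "Connected" ;;
        5) echo "Closing" ;;
        6) echo "Closed" ;;
        7) echo "Reconfiguring" ;;
        8) echo "Reconfigured" ;;
        # Anything else, including the 'unknown' default used by the script
        *) echo "invalid($1)" ;;
    esac
}
```

The check quoted above only accepts 2 (InitWait), so a backend that is still in 1 (Initialising) is treated the same as one that was closed or removed.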

That code block was added in
<http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=92e6cb5673b37bd883bdef0d0e83faf000edf61d>
by Keir Fraser.

My work-around is to sleep for one second and retry for as long as the
state is 1=Initializing. The printout confirmed that this case actually
happened.

Signed-off-by: Philipp Hahn <hahn@univention.de>

BYtE
Philipp Hahn
-- 
Philipp Hahn
Open Source Software Engineer

Univention GmbH
be open.
Mary-Somerville-Str. 1
D-28359 Bremen
Tel.: +49 421 22232-0
Fax : +49 421 22232-99
hahn@univention.de

http://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876

[-- Attachment #2: fix_hotplug_race.patch --]
[-- Type: text/x-patch, Size: 1242 bytes --]

# Bug #20481: Fix race during device hotplug
# Sometimes qemu/the kernel takes too long for the first domain to start. The
# hotplug script then finds the device still in state 1=Initialising and aborts.
# Add an artificial delay of 1 second and try again.
diff --git a/tools/hotplug/Linux/block b/tools/hotplug/Linux/block
index 06de5c9..cbf2af3 100644
--- a/tools/hotplug/Linux/block
+++ b/tools/hotplug/Linux/block
@@ -255,12 +255,16 @@ case "$command" in
 
         # Avoid a race with the remove if the path has been deleted, or
 	# otherwise changed from "InitWait" state e.g. due to a timeout
-        xenbus_state=$(xenstore_read_default "$XENBUS_PATH/state" 'unknown')
-        if [ "$xenbus_state" != '2' ]
-        then
+        while true
+        do
+          xenbus_state=$(xenstore_read_default "$XENBUS_PATH/state" 'unknown')
+          case "$xenbus_state" in
+          1) log notice "Path still initializing: $XENBUS_PATH" ; sleep 1 ; continue ;;
+          2) break ;;
+          esac
           release_lock "block"
           fatal "Path closed or removed during hotplug add: $XENBUS_PATH state: $xenbus_state"
-        fi
+        done
 
         if [ "$mode" = 'w' ] && ! stat "$file" -c %A | grep -q w
         then

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [BUG,PATCH] race in xen-block-hotplug
  2014-01-15  7:28 [BUG,PATCH] race in xen-block-hotplug Philipp Hahn
@ 2014-01-15  8:07 ` Roger Pau Monné
  2014-01-15 14:24   ` Ian Campbell
  0 siblings, 1 reply; 3+ messages in thread
From: Roger Pau Monné @ 2014-01-15  8:07 UTC (permalink / raw)
  To: Philipp Hahn, Xen-devel, Keir Fraser

On 15/01/14 08:28, Philipp Hahn wrote:
> Hello,
> 
> we encountered a strange race condition in tools/hotplug/Linux/block,
> which only shows very rarely and only once for the first domain started
> after the host server is started. The complete details are in our
> Bugzilla at <https://forge.univention.org/bugzilla/show_bug.cgi?id=20481>.
> 
> In our case it's Xen-4.1.3 (yes, I know it's ancient, but the code is
> still the same in current Xen-git) and it only happens for file-backed
> devices (in our case an ISO image as CD-ROM).
> 
>>         # Avoid a race with the remove if the path has been deleted, or
>>         # otherwise changed from "InitWait" state e.g. due to a timeout
>>         xenbus_state=$(xenstore_read_default "$XENBUS_PATH/state" 'unknown')
>>         if [ "$xenbus_state" != '2' ]
>>         then
>>           release_lock "block"
>>           fatal "Path closed or removed during hotplug add: $XENBUS_PATH state: $xenbus_state"
>>         fi
> 
> The problem is that sometimes the other end (kernel/qemu?) is too slow
> and the device is still in state 1=Initializing. If that happens, the
> domU start is aborted and the domain is destroyed.
> If the same VM is then started again, it works flawlessly.

This problem only manifests itself with xend, and with xl from Xen
versions < 4.2; from 4.2 onwards, libxl waits for the backend to switch
to state 2 before launching hotplug scripts.

I would like to avoid having this kind of infinite loop in the block
hotplug script. Is there any way this can be fixed in xend (which is the
only toolstack that has this problem in upstream versions)?
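
If the retry does stay in the script, one way to avoid an unbounded loop is to cap the number of polls. What follows is only a sketch under stated assumptions: wait_for_initwait, read_state and FAKE_STATE are hypothetical names, and the defaults of 10 polls and 1 second are arbitrary. In the real script, read_state would be replaced by the existing xenstore_read_default "$XENBUS_PATH/state" 'unknown' call.

```shell
# Hypothetical stand-in for the real lookup so the sketch runs on its own.
# In tools/hotplug/Linux/block this would be:
#   xenstore_read_default "$XENBUS_PATH/state" 'unknown'
read_state() {
    echo "${FAKE_STATE:-unknown}"
}

# Poll until the backend reaches state 2 (InitWait), with an upper bound.
#   $1: maximum number of polls (assumed default: 10)
#   $2: delay between polls in seconds (assumed default: 1)
# Returns 0 on InitWait, 1 if a state other than 1 or 2 is seen,
# 2 if the state is still 1 after the last poll.
# Leaves the last state read in the global variable xenbus_state.
wait_for_initwait() {
    max_tries=${1:-10}
    delay=${2:-1}
    tries=0
    while [ "$tries" -lt "$max_tries" ]
    do
        xenbus_state=$(read_state)
        case "$xenbus_state" in
            2) return 0 ;;   # InitWait: safe to continue
            1) ;;            # still initialising: poll again
            *) return 1 ;;   # closed, removed, or unexpected state
        esac
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 2
}
```

The caller would treat any non-zero return as the existing failure path, that is, call release_lock "block" and then fatal with the value of $xenbus_state, so a backend stuck in state 1 still ends in an error instead of hanging the hotplug script.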

Roger.



* Re: [BUG,PATCH] race in xen-block-hotplug
  2014-01-15  8:07 ` Roger Pau Monné
@ 2014-01-15 14:24   ` Ian Campbell
  0 siblings, 0 replies; 3+ messages in thread
From: Ian Campbell @ 2014-01-15 14:24 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Keir Fraser, Xen-devel, Philipp Hahn

On Wed, 2014-01-15 at 09:07 +0100, Roger Pau Monné wrote:
> This problem only manifests itself with xend and xl from Xen versions <
> 4.2, since 4.2 onwards libxl waits for the backend to switch to state 2
> before launching hotplug scripts.
> 
> I would like to avoid having this kind of infinite loop in the block
> hotplug script. Is there any way this can be fixed in xend (which is the
> only toolstack that has this problem in upstream versions)?

xend relies on udev running the scripts based on the kernel firing the
events, so I don't think xend can fix it, at least not without quite a
big change. I don't know how the xend maintainers would view that.

Does blkback fire the uevent before it has moved to state 2, though?
That sounds a bit iffy to me. Perhaps there is an underlying kernel bug here?

Ian.

