xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* Blktap fixes and kernel patch.
@ 2012-07-31 21:41 Dr. Greg Wettstein
  2012-08-01  9:17 ` Ian Campbell
  0 siblings, 1 reply; 4+ messages in thread
From: Dr. Greg Wettstein @ 2012-07-31 21:41 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.campbell

Hi, hope the day is going well for everyone.  Just came in from a 90
mile cycle ride and wanted to get this out since 4.1.3 and/or 4.2 are
looming in the near future.

After becoming far more familiar with the XEN vbd infrastructure then
I had intended I was finally able to run down the problem with the
blktap2 driver which has plagued 4.1.2.  Most specifically the
orphaning of the tapdisk2 driver process and after Ian's fix for that,
the hang and orphaning of a tapdisk minor which occurs when libxl
attempts to shutdown the driver process.

In the process I updated the blktap2 kernel driver to patch cleanly
into the Linux 3.4 kernel.  These fixes have been validated against
the 3.4 kernel as well as the 3.2 kernel.

Locations for the files are as follows:

Kernel driver:
	ftp://ftp.enjellic.com/pub/xen/blktap2-3.4.patch

Xen fixes:
	ftp://ftp.enjellic.com/pub/xen/xen-4.1.2.blktap1.patch
	ftp://ftp.enjellic.com/pub/xen/xen-4.1.2.blktap2.patch

The two XEN patches overlay sequentially onto a virgin 4.1.2
distribution and both are required for correct behavior.

The first patch is one which was done by Ian for the development tree
with minor corrections for 4.1.2.  I'm including it for completeness
for those who want a trouble free patch set for a 4.1.2 distribution.
This patch fixes the orphaning of the tapdisk2 driver process when xl
shuts down.

The second patch corrects the deadlock which occurs between the
blktap2 kernel driver and the blktap2 userspace control plane.  The
deadlock causes a delay in the shutdown of a XEN guest and results in
the 'orphaning' of tapdisk minor number allocations.  As seems to be
typical with these types of things the fix was trivially straight
forward once I finally figured out what was going on.

Ian for your reference the following change which you introduced to
address this issue:

79e3dbe4b659e78408a9eea76c51a601bd4a383a
tapdisk: respond to destroy request before tearing down the commuication channel

Is not needed and does not provide formally correct behavior in the
presence of the two patches noted above.

I don't know where all this went in 4.2 since I heard mention that
someone was giving xl a good work over with respect to device
management.  I also don't know if a 4.1.3 is planned, if it is the
above patches go a long ways toward improving the user experience for
those sites which find blktap2 useful.

Now on to NPIV support since it was the reason I waded into the xl/vbd
in the first place.... :-)

Best wishes for a pleasant remainder of the week.

Greg

Cut here for patch 1: -----------------------------------------------------
diff -urNp v4.1.2/xen-4.1.2/tools/blktap2/control/tap-ctl-list.c xen-4.1.2/tools/blktap2/control/tap-ctl-list.c
--- v4.1.2/xen-4.1.2/tools/blktap2/control/tap-ctl-list.c	Thu Oct 20 12:05:41 2011
+++ xen-4.1.2/tools/blktap2/control/tap-ctl-list.c	Thu Jul 26 09:43:10 2012
@@ -506,17 +506,15 @@ out:
 }
 
 int
-tap_ctl_find_minor(const char *type, const char *path)
+tap_ctl_find(const char *type, const char *path, tap_list_t *tap)
 {
 	tap_list_t **list, **_entry;
-	int minor, err;
+	int ret = -ENOENT, err;
 
 	err = tap_ctl_list(&list);
 	if (err)
 		return err;
 
-	minor = -1;
-
 	for (_entry = list; *_entry != NULL; ++_entry) {
 		tap_list_t *entry  = *_entry;
 
@@ -526,11 +524,13 @@ tap_ctl_find_minor(const char *type, con
 		if (path && (!entry->path || strcmp(entry->path, path)))
 			continue;
 
-		minor = entry->minor;
+		*tap = *entry;
+		tap->type = tap->path = NULL;
+		ret = 0;
 		break;
 	}
 
 	tap_ctl_free_list(list);
 
-	return minor >= 0 ? minor : -ENOENT;
+	return ret;
 }
diff -urNp v4.1.2/xen-4.1.2/tools/blktap2/control/tap-ctl.h xen-4.1.2/tools/blktap2/control/tap-ctl.h
--- v4.1.2/xen-4.1.2/tools/blktap2/control/tap-ctl.h	Thu Oct 20 12:05:41 2011
+++ xen-4.1.2/tools/blktap2/control/tap-ctl.h	Thu Jul 26 09:43:10 2012
@@ -76,7 +76,7 @@ int tap_ctl_get_driver_id(const char *ha
 
 int tap_ctl_list(tap_list_t ***list);
 void tap_ctl_free_list(tap_list_t **list);
-int tap_ctl_find_minor(const char *type, const char *path);
+int tap_ctl_find(const char *type, const char *path, tap_list_t *tap);
 
 int tap_ctl_allocate(int *minor, char **devname);
 int tap_ctl_free(const int minor);
diff -urNp v4.1.2/xen-4.1.2/tools/libxl/libxl_blktap2.c xen-4.1.2/tools/libxl/libxl_blktap2.c
--- v4.1.2/xen-4.1.2/tools/libxl/libxl_blktap2.c	Thu Oct 20 12:05:42 2011
+++ xen-4.1.2/tools/libxl/libxl_blktap2.c	Thu Jul 26 09:43:10 2012
@@ -18,6 +18,8 @@
 
 #include "tap-ctl.h"
 
+#include <string.h>
+
 int libxl__blktap_enabled(libxl__gc *gc)
 {
     const char *msg;
@@ -30,12 +32,13 @@ const char *libxl__blktap_devpath(libxl_
 {
     const char *type;
     char *params, *devname = NULL;
-    int minor, err;
+    tap_list_t tap;
+    int err;
 
     type = libxl__device_disk_string_of_format(format);
-    minor = tap_ctl_find_minor(type, disk);
-    if (minor >= 0) {
-        devname = libxl__sprintf(gc, "/dev/xen/blktap-2/tapdev%d", minor);
+    err = tap_ctl_find(type, disk, &tap);
+    if (err == 0) {
+        devname = libxl__sprintf(gc, "/dev/xen/blktap-2/tapdev%d", tap.minor);
         if (devname)
             return devname;
     }
@@ -48,4 +51,29 @@ const char *libxl__blktap_devpath(libxl_
     }
 
     return NULL;
+}
+
+
+void libxl__device_destroy_tapdisk(libxl__gc *gc, char *be_path)
+{
+    char *path, *params, *type, *disk;
+    int err;
+    tap_list_t tap;
+
+    path = libxl__sprintf(gc, "%s/tapdisk-params", be_path);
+    if (!path) return;
+
+    params = libxl__xs_read(gc, XBT_NULL, path);
+    if (!params) return;
+
+    type = params;
+    disk = strchr(params, ':');
+    if (!disk) return;
+
+    *disk++ = '\0';
+
+    err = tap_ctl_find(type, disk, &tap);
+    if (err < 0) return;
+
+    tap_ctl_destroy(tap.id, tap.minor);
 }
diff -urNp v4.1.2/xen-4.1.2/tools/libxl/libxl_device.c xen-4.1.2/tools/libxl/libxl_device.c
--- v4.1.2/xen-4.1.2/tools/libxl/libxl_device.c	Thu Oct 20 12:05:42 2011
+++ xen-4.1.2/tools/libxl/libxl_device.c	Thu Jul 26 09:48:03 2012
@@ -250,6 +250,7 @@ int libxl__device_destroy(libxl_ctx *ctx
     if (!state)
         goto out;
     if (atoi(state) != 4) {
+        libxl__device_destroy_tapdisk(&gc, be_path);
         xs_rm(ctx->xsh, XBT_NULL, be_path);
         goto out;
     }
@@ -368,6 +369,7 @@ int libxl__devices_destroy(libxl_ctx *ct
             }
         }
     }
+    libxl__device_destroy_tapdisk(&gc, be_path);
 out:
     libxl__free_all(&gc);
     return 0;
diff -urNp v4.1.2/xen-4.1.2/tools/libxl/libxl_internal.h xen-4.1.2/tools/libxl/libxl_internal.h
--- v4.1.2/xen-4.1.2/tools/libxl/libxl_internal.h	Thu Oct 20 12:05:43 2011
+++ xen-4.1.2/tools/libxl/libxl_internal.h	Thu Jul 26 09:43:10 2012
@@ -314,6 +314,12 @@ _hidden const char *libxl__blktap_devpat
                                  const char *disk,
                                  libxl_disk_format format);
 
+/* libxl__device_destroy_tapdisk:
+ *   Destroys any tapdisk process associated with the backend represented
+ *   by be_path.
+ */
+_hidden void libxl__device_destroy_tapdisk(libxl__gc *gc, char *be_path);
+
 _hidden char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid);
 
 struct libxl__xen_console_reader {
diff -urNp v4.1.2/xen-4.1.2/tools/libxl/libxl_noblktap2.c xen-4.1.2/tools/libxl/libxl_noblktap2.c
--- v4.1.2/xen-4.1.2/tools/libxl/libxl_noblktap2.c	Thu Oct 20 12:05:43 2011
+++ xen-4.1.2/tools/libxl/libxl_noblktap2.c	Thu Jul 26 09:43:10 2012
@@ -27,3 +27,7 @@ const char *libxl__blktap_devpath(libxl_
 {
     return NULL;
 }
+
+void libxl__device_destroy_tapdisk(libxl__gc *gc, char *be_path)
+{
+}
And here for patch1: ------------------------------------------------------


Cut here for patch 2: -----------------------------------------------------
diff -urNp v4.1.2/xen-4.1.2/tools/libxl/libxl_blktap2.c xen-4.1.2/tools/libxl/libxl_blktap2.c
--- v4.1.2/xen-4.1.2/tools/libxl/libxl_blktap2.c	Thu Jul 26 09:43:10 2012
+++ xen-4.1.2/tools/libxl/libxl_blktap2.c	Thu Jul 26 10:04:47 2012
@@ -59,6 +59,7 @@ void libxl__device_destroy_tapdisk(libxl
     char *path, *params, *type, *disk;
     int err;
     tap_list_t tap;
+    libxl_ctx *ctx = libxl__gc_owner(gc);
 
     path = libxl__sprintf(gc, "%s/tapdisk-params", be_path);
     if (!path) return;
@@ -74,6 +75,12 @@ void libxl__device_destroy_tapdisk(libxl
 
     err = tap_ctl_find(type, disk, &tap);
     if (err < 0) return;
+
+    /*
+     * Remove the instance of the backend device to avoid a deadlock with the
+     * removal of the tap device.
+     */
+    xs_rm(ctx->xsh, XBT_NULL, be_path);
 
     tap_ctl_destroy(tap.id, tap.minor);
 }
diff -urNp v4.1.2/xen-4.1.2/tools/libxl/libxl_device.c xen-4.1.2/tools/libxl/libxl_device.c
--- v4.1.2/xen-4.1.2/tools/libxl/libxl_device.c	Thu Jul 26 09:48:03 2012
+++ xen-4.1.2/tools/libxl/libxl_device.c	Thu Jul 26 10:06:20 2012
@@ -250,8 +250,7 @@ int libxl__device_destroy(libxl_ctx *ctx
     if (!state)
         goto out;
     if (atoi(state) != 4) {
-        libxl__device_destroy_tapdisk(&gc, be_path);
-        xs_rm(ctx->xsh, XBT_NULL, be_path);
+	libxl__device_destroy_tapdisk(&gc, be_path);
         goto out;
     }
 
And here for patch 2: -----------------------------------------------------

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"A distributed system is one on which I cannot get any work done because 
 a machine I have never heard of has crashed."
                                -- Leslie Lamport

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Blktap fixes and kernel patch.
  2012-07-31 21:41 Dr. Greg Wettstein
@ 2012-08-01  9:17 ` Ian Campbell
  0 siblings, 0 replies; 4+ messages in thread
From: Ian Campbell @ 2012-08-01  9:17 UTC (permalink / raw)
  To: greg@enjellic.com; +Cc: xen-devel@lists.xen.org

On Tue, 2012-07-31 at 22:41 +0100, Dr. Greg Wettstein wrote:

> After becoming far more familiar with the XEN vbd infrastructure then
> I had intended I was finally able to run down the problem with the
> blktap2 driver which has plagued 4.1.2.  Most specifically the
> orphaning of the tapdisk2 driver process and after Ian's fix for that,
> the hang and orphaning of a tapdisk minor which occurs when libxl
> attempts to shutdown the driver process.

Thanks for digging into this Greg.

> In the process I updated the blktap2 kernel driver to patch cleanly
> into the Linux 3.4 kernel.  These fixes have been validated against
> the 3.4 kernel as well as the 3.2 kernel.

Just to be clear this is just a straight forward port, there's no part
of the deadlock fix in here?

> The first patch is one which was done by Ian for the development tree
> with minor corrections for 4.1.2.  I'm including it for completeness
> for those who want a trouble free patch set for a 4.1.2 distribution.
> This patch fixes the orphaning of the tapdisk2 driver process when xl
> shuts down.

This is a fairly straight backport of a patch in unstable?

If you send a mail with a subject "Xen 4.1.x backport request
<commit-id>" explaining which commit it is and CC keir@xen.org &
ian.jackson@eu.citrix.com then we can see about getting this into a
future 4.1.x (perhaps even 4.1.3, not sure which stage of rcs we are at
there).

If the backport is reasonably trivial then there is often no need to
include it but since you have done so you might as well include the
patch for reference.

> The second patch corrects the deadlock which occurs between the
> blktap2 kernel driver and the blktap2 userspace control plane.  The
> deadlock causes a delay in the shutdown of a XEN guest and results in
> the 'orphaning' of tapdisk minor number allocations.  As seems to be
> typical with these types of things the fix was trivially straight
> forward once I finally figured out what was going on.

Thanks for this.

Am I right that the important functional change here is that the xs_rm
needs to come after we read the params node but before tap_ctl_destroy?
Obviously removing the node before calling libxl__device_destroy_tapdisk
is wrong since libxl__device_destroy_tapdisk reads from be_path!

Looking at 4.2.0-rc1 I see that libxl__device_destroy removes the
backend before calling libxl__device_destroy_tapdisk, so I think that a
fix is needed there too.

I'm less sure about the usage in libxl__initiate_device_remove. I wonder
if the call to libxl__device_destroy_tapdisk there needs to move to
device_hotplug_done right after the transaction which cleans up the
backend back?

The code which used to be in libxl__devices_destroy is now in
libxl__initiate_device_remove so I expect that fixing that would be
sufficient.

I'd really appreciate it if you could validate whether 4.2.0-rc1 works
for you or not, I suspect not. We would usually want fix the development
version before considering fixes for the stable branches (even if the
actual patch ends up looking totally different) otherwise we run the
risk of regressions in the next version.

Is there a simple command which will list the leaked tap devices? If so
we can consider adding it to the leak-check phase of the automated tests
(although I'm not sure how much use these make of blktap)

For future reference if you intend for a patch to be applied it is best
to submit it in the form described in
http://wiki.xen.org/wiki/Submitting_Xen_Patches, that is one patch per
email, with a changelog specific to that change and a Signed-off-by. In
this sort of scenario (a patch going to 4.1 which isn't a backport) the
changelog should also mention why the patch isn't a backport.

> Ian for your reference the following change which you introduced to
> address this issue:
> 
> 79e3dbe4b659e78408a9eea76c51a601bd4a383a
> tapdisk: respond to destroy request before tearing down the commuication channel
> 
> Is not needed and does not provide formally correct behavior in the
> presence of the two patches noted above.

Is it incorrect (i.e. should be reverted) or is it just incomplete/not
helpful?

Ian.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Blktap fixes and kernel patch.
@ 2012-08-02 16:06 Dr. Greg Wettstein
  2012-08-03  9:21 ` Ian Campbell
  0 siblings, 1 reply; 4+ messages in thread
From: Dr. Greg Wettstein @ 2012-08-02 16:06 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xen.org

On Aug 1, 10:17am, Ian Campbell wrote:
} Subject: Re: Blktap fixes and kernel patch.

Hi Ian et. al, hope your week is proceeding well.

> On Tue, 2012-07-31 at 22:41 +0100, Dr. Greg Wettstein wrote:

> > In the process I updated the blktap2 kernel driver to patch cleanly
> > into the Linux 3.4 kernel.  These fixes have been validated against
> > the 3.4 kernel as well as the 3.2 kernel.

> Just to be clear this is just a straight forward port, there's no
> part of the deadlock fix in here?

The kernel patch is just a forward port to 3.4, there were no issues
with the driver itself that needed to be addressed.

It seemed the community lacks ready access to the kernel driver so
hopefully others will find a solid reference site for the patches
useful.

> > The first patch is one which was done by Ian for the development tree
> > with minor corrections for 4.1.2.  I'm including it for completeness
> > for those who want a trouble free patch set for a 4.1.2 distribution.
> > This patch fixes the orphaning of the tapdisk2 driver process when xl
> > shuts down.

> This is a fairly straight backport of a patch in unstable?

Correct, the only change besides some version skew noise is a change
in the calling convention for the libxl__gc context passed to
libxl__device_destroy_tapdisk().

> If you send a mail with a subject "Xen 4.1.x backport request
> <commit-id>" explaining which commit it is and CC keir@xen.org &
> ian.jackson@eu.citrix.com then we can see about getting this into a
> future 4.1.x (perhaps even 4.1.3, not sure which stage of rcs we are at
> there).
> 
> If the backport is reasonably trivial then there is often no need to
> include it but since you have done so you might as well include the
> patch for reference.

I will forward along the reference and the patch.

> > The second patch corrects the deadlock which occurs between the
> > blktap2 kernel driver and the blktap2 userspace control plane.  The
> > deadlock causes a delay in the shutdown of a XEN guest and results in
> > the 'orphaning' of tapdisk minor number allocations.  As seems to be
> > typical with these types of things the fix was trivially straight
> > forward once I finally figured out what was going on.

> Thanks for this.
> 
> Am I right that the important functional change here is that the xs_rm
> needs to come after we read the params node but before tap_ctl_destroy?
> Obviously removing the node before calling libxl__device_destroy_tapdisk
> is wrong since libxl__device_destroy_tapdisk reads from be_path!

Correct.

I debated a bit about how to do this in the cleanest fashion
possible.  Since the be_path is passed to libxl__device_destroy_tapdisk()
the simplest strategy seemed to be to abstract the libxl_ctx context
and pull the entry from xenstore after the tapdisk-params key was read
from the xenstore.

> Looking at 4.2.0-rc1 I see that libxl__device_destroy removes the
> backend before calling libxl__device_destroy_tapdisk, so I think that a
> fix is needed there too.
> 
> I'm less sure about the usage in libxl__initiate_device_remove. I wonder
> if the call to libxl__device_destroy_tapdisk there needs to move to
> device_hotplug_done right after the transaction which cleans up the
> backend back?
> 
> The code which used to be in libxl__devices_destroy is now in
> libxl__initiate_device_remove so I expect that fixing that would be
> sufficient.

The root cause of the problem is a deadlock between the xen-blkback
and blktap2 drivers when the tapdisk2 user space process requests
unmapping of the ring buffer.  The only thing which saves the kernel
is the select timeout on the IPC channel between libxl and the
userspace process.  That timeout allows xl to proceed forward and
teardown the backend which releases the deadlock but the error results
in orphaning of the tapdisk2 minor.

So the overal fix is pretty straight forward, libxl needs to shutdown
the backend before initiating the teardown of the blktap2 userspace
component of the device.

While my fix vaguely felt like a layering violation it seemed to be
the most correct approach.  Since libxl__device_destroy_tapdisk() is
stubbed out in the non-blktap2 case having the teardown of the backend
there would seem to generically fix the problem.  Provided of course
the function can be provided with the correct context, I'm not looking
at the current code.

> I'd really appreciate it if you could validate whether 4.2.0-rc1
> works for you or not, I suspect not. We would usually want fix the
> development version before considering fixes for the stable branches
> (even if the actual patch ends up looking totally different)
> otherwise we run the risk of regressions in the next version.

I'd be happy to give it a test.  Is there a tarball cut or should I
hone up my Mercurial skills?

> Is there a simple command which will list the leaked tap devices? If
> so we can consider adding it to the leak-check phase of the
> automated tests (although I'm not sure how much use these make of
> blktap)

The tap-ctl tools make managingn all this pretty straight forward.

If you issue the following command:

	tap-ctl list

On a faulty implementation after the startup and shutdown of a blktap2
using guest you will see the orphaned minor.  They steadily increase
as guests startup and shutdown.

> For future reference if you intend for a patch to be applied it is best
> to submit it in the form described in
> http://wiki.xen.org/wiki/Submitting_Xen_Patches, that is one patch per
> email, with a changelog specific to that change and a Signed-off-by. In
> this sort of scenario (a patch going to 4.1 which isn't a backport) the
> changelog should also mention why the patch isn't a backport.

Will do, should have been more methodical in my practice.

It appears that 4.1.2 is not properly cleaning up xenstore.  I'm
chasing that down now and if there are trivial correctness fixes I
will pass them onward.

> > Ian for your reference the following change which you introduced to
> > address this issue:
> > 
> > 79e3dbe4b659e78408a9eea76c51a601bd4a383a
> > tapdisk: respond to destroy request before tearing down the commuication channel
> > 
> > Is not needed and does not provide formally correct behavior in the
> > presence of the two patches noted above.

> Is it incorrect (i.e. should be reverted) or is it just incomplete/not
> helpful?

Its incorrect in that it changes a logically correct implementation
only for the purposes of masking the delay, which in this case, lead
to the discovery of the root cause of the problem.  Arguably libxl
needs a bit better error reporting around all this and the patch
arguably works against that.

I'm currently running without the patch in 4.1.2 and the tapdisk2
devices are performing very nicely.

> Ian.

Let me know if you have any additional questions/issues.

Best wishes for a pleasant weekend.

Greg

}-- End of excerpt from Ian Campbell

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"There is no heavier burden than a great potential."
                                -- Linus' Law

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Blktap fixes and kernel patch.
  2012-08-02 16:06 Blktap fixes and kernel patch Dr. Greg Wettstein
@ 2012-08-03  9:21 ` Ian Campbell
  0 siblings, 0 replies; 4+ messages in thread
From: Ian Campbell @ 2012-08-03  9:21 UTC (permalink / raw)
  To: greg@enjellic.com; +Cc: Mike McClurg, xen-devel@lists.xen.org

On Thu, 2012-08-02 at 17:06 +0100, Dr. Greg Wettstein wrote:
> On Aug 1, 10:17am, Ian Campbell wrote:
> } Subject: Re: Blktap fixes and kernel patch.
> 
> Hi Ian et. al, hope your week is proceeding well.
> 
> > On Tue, 2012-07-31 at 22:41 +0100, Dr. Greg Wettstein wrote:
> 
> > > In the process I updated the blktap2 kernel driver to patch cleanly
> > > into the Linux 3.4 kernel.  These fixes have been validated against
> > > the 3.4 kernel as well as the 3.2 kernel.
> 
> > Just to be clear this is just a straight forward port, there's no
> > part of the deadlock fix in here?
> 
> The kernel patch is just a forward port to 3.4, there were no issues
> with the driver itself that needed to be addressed.
> 
> It seemed the community lacks ready access to the kernel driver so
> hopefully others will find a solid reference site for the patches
> useful.

I think they will, thanks.

The closest thing we have for an "upstream" for these now is the
blktap-dkms packages which are in Debian and Ubuntu, but I don't think
those really have a maintainer as such (I think Mike, CCd, wrangles with
them a bit). Might that be something you would like to maintain?

[...]
> > If you send a mail with a subject "Xen 4.1.x backport request
> > <commit-id>" explaining which commit it is and CC keir@xen.org &
> > ian.jackson@eu.citrix.com then we can see about getting this into a
> > future 4.1.x (perhaps even 4.1.3, not sure which stage of rcs we are at
> > there).
> > 
> > If the backport is reasonably trivial then there is often no need to
> > include it but since you have done so you might as well include the
> > patch for reference.
> 
> I will forward along the reference and the patch.

Great thanks.

> > > The second patch corrects the deadlock which occurs between the
> > > blktap2 kernel driver and the blktap2 userspace control plane.  The
> > > deadlock causes a delay in the shutdown of a XEN guest and results in
> > > the 'orphaning' of tapdisk minor number allocations.  As seems to be
> > > typical with these types of things the fix was trivially straight
> > > forward once I finally figured out what was going on.
> 
> > Thanks for this.
> > 
> > Am I right that the important functional change here is that the xs_rm
> > needs to come after we read the params node but before tap_ctl_destroy?
> > Obviously removing the node before calling libxl__device_destroy_tapdisk
> > is wrong since libxl__device_destroy_tapdisk reads from be_path!
> 
> Correct.
> 
> I debated a bit about how to do this in the cleanest fashion
> possible.  Since the be_path is passed to libxl__device_destroy_tapdisk()
> the simplest strategy seemed to be to abstract the libxl_ctx context
> and pull the entry from xenstore after the tapdisk-params key was read
> from the xenstore.

[...]
> While my fix vaguely felt like a layering violation it seemed to be
> the most correct approach.  Since libxl__device_destroy_tapdisk() is
> stubbed out in the non-blktap2 case having the teardown of the backend
> there would seem to generically fix the problem.  Provided of course
> the function can be provided with the correct context, I'm not looking
> at the current code.

It was a little more complicated in unstable, because the rm is now in a
transaction. What I ended up doing was making
libxl__device_destroy_tapdisk take the actual params string instead of
the be_path, so I could read it as part of the transaction and then call
destroy_tapdisk.

I've just sent out V2 of that patch, I CCd you. 
<2c21d5c75dcbdf52987f.1343985208@cosworth.uk.xensource.com>

> > I'd really appreciate it if you could validate whether 4.2.0-rc1
> > works for you or not, I suspect not. We would usually want fix the
> > development version before considering fixes for the stable branches
> > (even if the actual patch ends up looking totally different)
> > otherwise we run the risk of regressions in the next version.
> 
> I'd be happy to give it a test.  Is there a tarball cut or should I
> hone up my Mercurial skills?

We don't do tarballs for rcs. In any case I suggest you use the
Mercurial tip anyway since it has the prerequisites for my patch in it
now.

Those prerequisistes are still being tested so you will need the staging
tree:
        hg clone http://xenbits.xen.org/hg/staging/xen-unstable.hg

once they pass testing (this afternoon BST, assuming all is well) then
you can drop the "/staging" bit. We'll probably to rc2 early next week.

> > Is there a simple command which will list the leaked tap devices? If
> > so we can consider adding it to the leak-check phase of the
> > automated tests (although I'm not sure how much use these make of
> > blktap)
> 
> The tap-ctl tools make managingn all this pretty straight forward.
> 
> If you issue the following command:
> 
> 	tap-ctl list
> 
> On a faulty implementation after the startup and shutdown of a blktap2
> using guest you will see the orphaned minor.  They steadily increase
> as guests startup and shutdown.

Thanks, this rings a bell from last time I looked at this.

> [..]

> It appears that 4.1.2 is not properly cleaning up xenstore.  I'm
> chasing that down now and if there are trivial correctness fixes I
> will pass them onward.

Thanks.

> 
> > > Ian for your reference the following change which you introduced to
> > > address this issue:
> > > 
> > > 79e3dbe4b659e78408a9eea76c51a601bd4a383a
> > > tapdisk: respond to destroy request before tearing down the commuication channel
> > > 
> > > Is not needed and does not provide formally correct behavior in the
> > > presence of the two patches noted above.
> 
> > Is it incorrect (i.e. should be reverted) or is it just incomplete/not
> > helpful?
> 
> Its incorrect in that it changes a logically correct implementation
> only for the purposes of masking the delay, which in this case, lead
> to the discovery of the root cause of the problem.  Arguably libxl
> needs a bit better error reporting around all this and the patch
> arguably works against that.
> 
> I'm currently running without the patch in 4.1.2 and the tapdisk2
> devices are performing very nicely.

I remember this patch now but I can't seem to find it in any tree, I
suspect it was never applied, which sounds like a good thing.

> Best wishes for a pleasant weekend.

Thanks, and you.

Ian.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-08-03  9:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-02 16:06 Blktap fixes and kernel patch Dr. Greg Wettstein
2012-08-03  9:21 ` Ian Campbell
  -- strict thread matches above, loose matches on Subject: below --
2012-07-31 21:41 Dr. Greg Wettstein
2012-08-01  9:17 ` Ian Campbell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).