* [PATCH] Safely finish closing protocol when guest fails in blkfront
@ 2006-12-04 21:40 Glauber de Oliveira Costa
From: Glauber de Oliveira Costa @ 2006-12-04 21:40 UTC
To: xen-devel
[-- Attachment #1: Type: text/plain, Size: 250 bytes --]
If a guest finds any error and aborts the connection of a block device,
its online state, set during the device-create phase, will stop it from being
properly cleaned up.
A fix follows.
--
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
[-- Attachment #2: xen-frontend-fails-attach.patch --]
[-- Type: text/plain, Size: 1890 bytes --]
# HG changeset patch
# User gcosta@redhat.com
# Date 1165272179 18000
# Node ID 6702c80880a1ed7e7467a59135f3764fb145cd0b
# Parent fd28a1b139dea91b8bfcf06dd233dbdda8f51ff1
[LINUX] Get rid of device if block-attach fails
The current backend code checks whether a device is online
before unregistering it. However, when block attach fails
due to a guest failure, the guest is the one that starts the closing protocol,
and the state is never set to offline again.
The proposal is to check whether there are error entries in xenstore
and to unregister the device if so.
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
diff -r fd28a1b139de -r 6702c80880a1 linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c
--- a/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c Mon Dec 04 09:29:26 2006 +0000
+++ b/linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c Mon Dec 04 17:42:59 2006 -0500
@@ -306,6 +306,32 @@ static void backend_changed(struct xenbu
}
+static int dev_had_error(struct xenbus_device *dev)
+{
+	int err;
+	unsigned int virtual_device, frontend_id;
+	char *path = NULL;
+
+	err = xenbus_scanf(XBT_NIL, dev->otherend,
+			   "virtual-device", "%u", &virtual_device);
+	if (err < 0)
+		return 0;
+
+	err = xenbus_scanf(XBT_NIL, dev->nodename,
+			   "frontend-id", "%u", &frontend_id);
+	if (err < 0)
+		return 0;
+
+	path = kasprintf(GFP_KERNEL, "%u/error/device/vbd/%u",
+			 frontend_id, virtual_device);
+	if (!path)
+		return 0;
+
+	err = xenbus_exists(XBT_NIL, "/local/domain", path);
+	kfree(path);
+	return err;
+}
+
/**
* Callback received when the frontend's state changes.
*/
@@ -347,7 +373,7 @@ static void frontend_changed(struct xenb
 	case XenbusStateClosed:
 		xenbus_switch_state(dev, XenbusStateClosed);
-		if (xenbus_dev_is_online(dev))
+		if (xenbus_dev_is_online(dev) && !dev_had_error(dev))
 			break;
 		/* fall through if not online */
 	case XenbusStateUnknown:
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: [PATCH] Safely finish closing protocol when guest fails in blkfront
From: Keir Fraser @ 2006-12-05 7:39 UTC
To: Glauber de Oliveira Costa, xen-devel
On 4/12/06 9:40 pm, "Glauber de Oliveira Costa" <gcosta@redhat.com> wrote:
> If a guest finds any error and aborts the connection of a block device,
> its online state, set during the device-create phase, will stop it from being
> properly cleaned up.
>
> Follows a fix for it.
Assignment and unassignment of physical resources is really a tools issue.
Tools should really be integrated with device-hotplug success/failure anyway
-- for example, it is likely the initiator would like confirmation of
success/failure in most cases.
-- Keir
* Re: [PATCH] Safely finish closing protocol when guest fails in blkfront
From: Glauber de Oliveira Costa @ 2006-12-05 10:22 UTC
To: Keir Fraser; +Cc: xen-devel
On Tue, Dec 05, 2006 at 07:39:11AM +0000, Keir Fraser wrote:
> On 4/12/06 9:40 pm, "Glauber de Oliveira Costa" <gcosta@redhat.com> wrote:
>
> > If a guest finds any error and aborts the connection of a block device,
> > its online state, set during the device-create phase, will stop it from being
> > properly cleaned up.
> >
> > Follows a fix for it.
>
> Assignment and unassignment of physical resources is really a tools issue.
> Tools should really be integrated with device-hotplug success/failure anyway
> -- for example, it is likely the initiator would like confirmation of
> success/failure in most cases.
>
Agreed. But what if, after proper initiation, the frontend finds an error
and starts the Closing protocol? What will happen is that the test
if (xenbus_dev_is_online(dev))
will cause the device not to be unregistered. At that point, the backend does not
see frontend changes. (Putting the backend in Closing leads the frontend to go
through Closing and Closed, but the backend never sees the frontend closing, and
so never goes to Closed.)
Given that, what can the tools do? For now, this is what leads
me to believe that arbitrary frontend-failure cases should be handled in the frontend.
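To make the scenario above concrete, here is a minimal userspace sketch, not kernel code. It assumes a much-simplified model: `struct model_dev`, its fields, and both function names are hypothetical stand-ins for the xenbus device and the `XenbusStateClosed` case of `frontend_changed()`. It only contrasts the existing `xenbus_dev_is_online()` test with the extra error check proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the real XenbusState enum. */
enum model_state { MODEL_CONNECTED, MODEL_CLOSING, MODEL_CLOSED };

/* Hypothetical stand-in for a backend device. */
struct model_dev {
	bool online;          /* models the backend "online" flag */
	bool frontend_error;  /* models an error entry written by the frontend */
	bool registered;      /* becomes false once the device is unregistered */
	enum model_state state;
};

/* Existing behaviour: an online device is never unregistered here. */
void model_frontend_closed_current(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->state = MODEL_CLOSED;
	if (dev->online)
		return;
	dev->registered = false;
}

/* Proposed behaviour: an online device is kept only if no error was recorded. */
void model_frontend_closed_patched(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->state = MODEL_CLOSED;
	if (dev->online && !dev->frontend_error)
		return;
	dev->registered = false;
}
```

In this model, a device with `online` set and `frontend_error` set stays registered under the current check and is unregistered under the patched one; a device with `online` set and no error stays registered in both cases.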
--
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
* Re: [PATCH] Safely finish closing protocol when guest fails in blkfront
From: Glauber de Oliveira Costa @ 2006-12-05 20:25 UTC
To: Glauber de Oliveira Costa; +Cc: xen-devel, Keir Fraser
> > Assignment and unassignment of physical resources is really a tools issue.
> > Tools should really be integrated with device-hotplug success/failure anyway
> > -- for example, it is likely the initiator would like confirmation of
> > success/failure in most cases.
> >
> Agreed. But what if, after proper initiation, the frontend finds an error
> and starts the Closing protocol? What will happen is that the test
>
> if (xenbus_dev_is_online(dev))
>
> will cause the device not to be unregistered. At that point, the backend does not
> see frontend changes. (Putting the backend in Closing leads the frontend to go
> through Closing and Closed, but the backend never sees the frontend closing, and
> so never goes to Closed.)
>
> Given that, what can the tools do? For now, this is what leads
> me to believe that arbitrary frontend-failure cases should be handled in the frontend.
>
Keir,
Let me just try to clarify this. (After all, I just realised that even
if this is the right path, there's a piece missing.)
Right now, I think that handling failures in the frontend code is the
correct choice, because failures can happen at pretty much any time.
According to the diagram at
http://wiki.xensource.com/xenwiki/XenSplitDrivers, a closedown
initiated by the frontend should end in the device being unregistered,
and I don't think the tools will _ever_ be able to do that. The best they
can do is wait to see whether the device is properly connected, but what if
the error happens after that? If this is indeed the real scenario, the
missing piece would be to delete the error message, to avoid
unregistering devices that should not be unregistered.
If you can assure me that, now and in the future, errors on the frontend side
will _always_ be confined to the pre-Connect steps, then my proposal is
to set the online flag just after the device is connected. That would
ensure that the device is properly unregistered, and the tools would have a
way to know whether the process was successful (online = 1). Any comments
on that?
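A rough userspace sketch of this alternative, under stated assumptions: `struct alt_dev` and the `alt_*` functions are hypothetical names, and the meaning given to `online` here (set only once the device reaches Connected) is the proposal itself, not how the flag behaves today. It only shows why a failure before Connect would then lead to the device being unregistered without any extra check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the real XenbusState enum. */
enum alt_state { ALT_INITIALISING, ALT_CONNECTED, ALT_CLOSED };

/* Hypothetical stand-in for a backend device. */
struct alt_dev {
	bool online;      /* assumption: set only on reaching Connected */
	bool registered;  /* becomes false once the device is unregistered */
	enum alt_state state;
};

/* Device creation no longer marks the device online. */
void alt_create(struct alt_dev *dev)
{
	assert(dev != NULL);
	dev->online = false;
	dev->registered = true;
	dev->state = ALT_INITIALISING;
}

/* The online flag is set just after the device is connected. */
void alt_connect(struct alt_dev *dev)
{
	assert(dev != NULL);
	dev->state = ALT_CONNECTED;
	dev->online = true;
}

/* Unchanged close handling: only a device that is not online is unregistered. */
void alt_frontend_closed(struct alt_dev *dev)
{
	assert(dev != NULL);
	dev->state = ALT_CLOSED;
	if (dev->online)
		return;
	dev->registered = false;
}
```

In this model, a frontend that closes before Connect leaves `online` at 0, so the device is unregistered, and the tools could read `online` to learn whether the attach succeeded. A frontend error after Connect would still leave the device registered, which is the case the paragraph above asks about.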
I admit that I don't understand exactly the purpose of online. At
first I thought it was save & restore related, but I'm currently able
to save & restore with online always being 0. Can you shed some light
on it?
As soon as you answer those, I'll proceed with the right approach to fix this.
--
Glauber de Oliveira Costa.
"Free as in Freedom"
Add your comments to GPLv3 at:
http://gplv3.fsf.org/comments/gplv3-draft-2.html