* [PATCH] Use timeout on xenstore read_reply to avoid task hunging @ 2011-03-02 10:20 Frank Pan 2011-03-02 10:34 ` Ian Campbell 0 siblings, 1 reply; 8+ messages in thread From: Frank Pan @ 2011-03-02 10:20 UTC (permalink / raw) To: xen-devel, Chris Wright, Jeremy Fitzhardinge Recent pvops kernel uses wait_event on waiting reply from xenstored. This may cause xenstore clients hung inside kernel when xenstored does not response correctly. The hung even happens when xenstored is not running, and may confuse the developer. The patch attached uses wait_event_timeout instead, and return -EIO to userspace if xenstored does not response in 5 seconds. Simply change wait_event to wait_event_timeout is not correct. Right after the xenstored starts working, the requests abandoned before will be processed by xenstored, and responses sent to the reply_list queue will confuse the requests later. The patch also makes use of the req_id section in struct xsd_sockmsg, as a sequence id. This avoids the confusing of responses by abandoned requests. Any suggestions? Thanks. -- Frank Pan Computer Science and Technology Tsinghua University ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-02 10:20 [PATCH] Use timeout on xenstore read_reply to avoid task hunging Frank Pan @ 2011-03-02 10:34 ` Ian Campbell 2011-03-02 11:35 ` Frank Pan 0 siblings, 1 reply; 8+ messages in thread From: Ian Campbell @ 2011-03-02 10:34 UTC (permalink / raw) To: Frank Pan; +Cc: xen-devel@lists.xensource.com, Jeremy, Fitzhardinge (Chris isn't involved with Xen maintenance any more, moving him to BCC). On Wed, 2011-03-02 at 10:20 +0000, Frank Pan wrote: > Recent pvops kernel uses wait_event on waiting reply from xenstored. > This may cause xenstore clients hung inside kernel when xenstored does > not response correctly. The hung even happens when xenstored is not > running, and may confuse the developer. > > The patch attached uses wait_event_timeout instead, and return -EIO to > userspace if xenstored does not response in 5 seconds. > > Simply change wait_event to wait_event_timeout is not correct. Right > after the xenstored starts working, the requests abandoned before will > be processed by xenstored, and responses sent to the reply_list queue > will confuse the requests later. The patch also makes use of the > req_id section in struct xsd_sockmsg, as a sequence id. This avoids > the confusing of responses by abandoned requests. > > Any suggestions? Thanks. It all sounds very plausible to me but you've forgotten the patch ;-) Why wait_event_timeout and not wait_event_interruptible to allow users to interrupt? In particular I'm concerned about the arbitrarily chosen 5s timeout not being sufficient on a busy system. There was a recent discussion of the general issue you are solving on xen-devel which it might be worth looking over. It was a thread titled "libxl: error handling before xenstored runs" starting on 9 February. Once specific pitfall which I remember was that userspace clients are at liberty to make use of the req_id themselves (and some do). Fixing this might involve shadowing the user provided req_id with a kernel generated ID on the ring and unshadowing in the responses... Ian. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-02 10:34 ` Ian Campbell @ 2011-03-02 11:35 ` Frank Pan 2011-03-03 3:31 ` Frank Pan 2011-03-04 11:31 ` Ian Campbell 0 siblings, 2 replies; 8+ messages in thread From: Frank Pan @ 2011-03-02 11:35 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Jeremy Fitzhardinge [-- Attachment #1: Type: text/plain, Size: 1255 bytes --] Oh sorry. Patch attached. On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote: > It all sounds very plausible to me but you've forgotten the patch ;-) > > Why wait_event_timeout and not wait_event_interruptible to allow users > to interrupt? In particular I'm concerned about the arbitrarily chosen > 5s timeout not being sufficient on a busy system. FP: I wait_event_interruptible is a choice. But it needs user operation such as kill command. User-level tool(xenstore-ls for example) can also set SIGALRM or something else, but it sounds not so good. The timeout parameter is something discussible. 5s may not be a good one, but I believe xenstored on a healthy system should be response in 5s. What do you think? > Once specific pitfall which I remember was that userspace clients are at > liberty to make use of the req_id themselves (and some do). Fixing this > might involve shadowing the user provided req_id with a kernel generated > ID on the ring and unshadowing in the responses... FP: Yes, that's what I supposed to do. But I cannot find any dereference on the req_id section of the reply msg. If it exist somewhere, shadowing is surely needed. -- Frank Pan Computer Science and Technology Tsinghua University [-- Attachment #2: 0001-Use-timeout-on-xenstore-read_reply-to-avoid-task-hun.patch --] [-- Type: text/x-patch, Size: 3689 bytes --] From 379c83fa345444b696b5ab96f55ef55a18a9f309 Mon Sep 17 00:00:00 2001 From: Frank Pan <frankpzh@gmail.com> Date: Wed, 2 Mar 2011 17:52:52 +0800 Subject: [PATCH] Use timeout on xenstore read_reply to avoid task hunging. --- linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c | 57 ++++++++++++++++++------- 1 files changed, 41 insertions(+), 16 deletions(-) diff --git a/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c b/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c index 5534690..4e66b30 100644 --- a/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c +++ b/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c @@ -73,6 +73,9 @@ struct xs_handle { spinlock_t reply_lock; wait_queue_head_t reply_waitq; + /* Sequence number of last request */ + uint32_t reply_seq; + /* * Mutex ordering: transaction_mutex -> watch_mutex -> request_mutex. * response_mutex is never taken simultaneously with the other three. @@ -136,26 +139,45 @@ static int get_error(const char *errorstring) return xsd_errors[i].errnum; } -static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len) +#define XENSTORE_TIMEOUT (5*HZ) + +static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len, uint32_t seq) { - struct xs_stored_msg *msg; + unsigned long remain_time = XENSTORE_TIMEOUT; + struct xs_stored_msg *msg = 0; char *body; - spin_lock(&xs_state.reply_lock); - - while (list_empty(&xs_state.reply_list)) { - spin_unlock(&xs_state.reply_lock); + while (remain_time && !msg) { /* XXX FIXME: Avoid synchronous wait for response here. */ - wait_event(xs_state.reply_waitq, - !list_empty(&xs_state.reply_list)); + remain_time = wait_event_timeout(xs_state.reply_waitq, + !list_empty(&xs_state.reply_list), + remain_time); + spin_lock(&xs_state.reply_lock); - } - msg = list_entry(xs_state.reply_list.next, - struct xs_stored_msg, list); - list_del(&msg->list); + while (!list_empty(&xs_state.reply_list)) { + msg = list_entry(xs_state.reply_list.next, + struct xs_stored_msg, list); + list_del(&msg->list); + + /* Check sequence number */ + if (msg->hdr.req_id == seq) + break; - spin_unlock(&xs_state.reply_lock); + if (!IS_ERR(msg->u.reply.body)) + kfree(msg->u.reply.body); + kfree(msg); + msg = 0; + } + + spin_unlock(&xs_state.reply_lock); + } + + if (!msg) { + *type = XS_ERROR; + *len = 0; + return ERR_PTR(-EIO); + } *type = msg->hdr.type; if (len) @@ -202,13 +224,14 @@ void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg) transaction_start(); mutex_lock(&xs_state.request_mutex); + msg->req_id = xs_state.reply_seq++; err = xb_write(msg, sizeof(*msg) + msg->len); if (err) { msg->type = XS_ERROR; ret = ERR_PTR(err); } else - ret = read_reply(&msg->type, &msg->len); + ret = read_reply(&msg->type, &msg->len, msg->req_id); mutex_unlock(&xs_state.request_mutex); @@ -234,13 +257,13 @@ static void *xs_talkv(struct xenbus_transaction t, int err; msg.tx_id = t.id; - msg.req_id = 0; msg.type = type; msg.len = 0; for (i = 0; i < num_vecs; i++) msg.len += iovec[i].iov_len; mutex_lock(&xs_state.request_mutex); + msg.req_id = xs_state.reply_seq++; err = xb_write(&msg, sizeof(msg)); if (err) { @@ -256,7 +279,7 @@ static void *xs_talkv(struct xenbus_transaction t, } } - ret = read_reply(&msg.type, len); + ret = read_reply(&msg.type, len, msg.req_id); mutex_unlock(&xs_state.request_mutex); @@ -872,6 +895,8 @@ int xs_init(void) int err; struct task_struct *task; + xs_state.reply_seq = 0; + INIT_LIST_HEAD(&xs_state.reply_list); spin_lock_init(&xs_state.reply_lock); init_waitqueue_head(&xs_state.reply_waitq); -- 1.7.1 [-- Attachment #3: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-02 11:35 ` Frank Pan @ 2011-03-03 3:31 ` Frank Pan 2011-03-04 11:31 ` Ian Campbell 1 sibling, 0 replies; 8+ messages in thread From: Frank Pan @ 2011-03-03 3:31 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Jeremy Fitzhardinge Also post patch as content. Sorry for gmail's line wrapping. >From 379c83fa345444b696b5ab96f55ef55a18a9f309 Mon Sep 17 00:00:00 2001 From: Frank Pan <frankpzh@gmail.com> Date: Wed, 2 Mar 2011 17:52:52 +0800 Subject: [PATCH] Use timeout on xenstore read_reply to avoid task hunging. --- linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c | 57 ++++++++++++++++++------- 1 files changed, 41 insertions(+), 16 deletions(-) diff --git a/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c b/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c index 5534690..4e66b30 100644 --- a/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c +++ b/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c @@ -73,6 +73,9 @@ struct xs_handle { spinlock_t reply_lock; wait_queue_head_t reply_waitq; + /* Sequence number of last request */ + uint32_t reply_seq; + /* * Mutex ordering: transaction_mutex -> watch_mutex -> request_mutex. * response_mutex is never taken simultaneously with the other three. @@ -136,26 +139,45 @@ static int get_error(const char *errorstring) return xsd_errors[i].errnum; } -static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len) +#define XENSTORE_TIMEOUT (5*HZ) + +static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len, uint32_t seq) { - struct xs_stored_msg *msg; + unsigned long remain_time = XENSTORE_TIMEOUT; + struct xs_stored_msg *msg = 0; char *body; - spin_lock(&xs_state.reply_lock); - - while (list_empty(&xs_state.reply_list)) { - spin_unlock(&xs_state.reply_lock); + while (remain_time && !msg) { /* XXX FIXME: Avoid synchronous wait for response here. */ - wait_event(xs_state.reply_waitq, - !list_empty(&xs_state.reply_list)); + remain_time = wait_event_timeout(xs_state.reply_waitq, + !list_empty(&xs_state.reply_list), + remain_time); + spin_lock(&xs_state.reply_lock); - } - msg = list_entry(xs_state.reply_list.next, - struct xs_stored_msg, list); - list_del(&msg->list); + while (!list_empty(&xs_state.reply_list)) { + msg = list_entry(xs_state.reply_list.next, + struct xs_stored_msg, list); + list_del(&msg->list); + + /* Check sequence number */ + if (msg->hdr.req_id == seq) + break; - spin_unlock(&xs_state.reply_lock); + if (!IS_ERR(msg->u.reply.body)) + kfree(msg->u.reply.body); + kfree(msg); + msg = 0; + } + + spin_unlock(&xs_state.reply_lock); + } + + if (!msg) { + *type = XS_ERROR; + *len = 0; + return ERR_PTR(-EIO); + } *type = msg->hdr.type; if (len) @@ -202,13 +224,14 @@ void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg) transaction_start(); mutex_lock(&xs_state.request_mutex); + msg->req_id = xs_state.reply_seq++; err = xb_write(msg, sizeof(*msg) + msg->len); if (err) { msg->type = XS_ERROR; ret = ERR_PTR(err); } else - ret = read_reply(&msg->type, &msg->len); + ret = read_reply(&msg->type, &msg->len, msg->req_id); mutex_unlock(&xs_state.request_mutex); @@ -234,13 +257,13 @@ static void *xs_talkv(struct xenbus_transaction t, int err; msg.tx_id = t.id; - msg.req_id = 0; msg.type = type; msg.len = 0; for (i = 0; i < num_vecs; i++) msg.len += iovec[i].iov_len; mutex_lock(&xs_state.request_mutex); + msg.req_id = xs_state.reply_seq++; err = xb_write(&msg, sizeof(msg)); if (err) { @@ -256,7 +279,7 @@ static void *xs_talkv(struct xenbus_transaction t, } } - ret = read_reply(&msg.type, len); + ret = read_reply(&msg.type, len, msg.req_id); mutex_unlock(&xs_state.request_mutex); @@ -872,6 +895,8 @@ int xs_init(void) int err; struct task_struct *task; + xs_state.reply_seq = 0; + INIT_LIST_HEAD(&xs_state.reply_list); spin_lock_init(&xs_state.reply_lock); init_waitqueue_head(&xs_state.reply_waitq); -- 1.7.1 On Wed, Mar 2, 2011 at 7:35 PM, Frank Pan <frankpzh@gmail.com> wrote: > Oh sorry. > Patch attached. > > On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote: >> It all sounds very plausible to me but you've forgotten the patch ;-) >> >> Why wait_event_timeout and not wait_event_interruptible to allow users >> to interrupt? In particular I'm concerned about the arbitrarily chosen >> 5s timeout not being sufficient on a busy system. > > FP: I wait_event_interruptible is a choice. But it needs user > operation such as kill command. User-level tool(xenstore-ls for > example) can also set SIGALRM or something else, but it sounds not so > good. > > The timeout parameter is something discussible. 5s may not be a good > one, but I believe xenstored on a healthy system should be response in > 5s. What do you think? > >> Once specific pitfall which I remember was that userspace clients are at >> liberty to make use of the req_id themselves (and some do). Fixing this >> might involve shadowing the user provided req_id with a kernel generated >> ID on the ring and unshadowing in the responses... > > FP: Yes, that's what I supposed to do. But I cannot find any > dereference on the req_id section of the reply msg. If it exist > somewhere, shadowing is surely needed. > > -- > Frank Pan > > Computer Science and Technology > Tsinghua University > -- 潘震皓, Frank Pan Computer Science and Technology Tsinghua University ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-02 11:35 ` Frank Pan 2011-03-03 3:31 ` Frank Pan @ 2011-03-04 11:31 ` Ian Campbell 2011-03-04 11:41 ` Vincent Hanquez 2011-03-10 10:54 ` Frank Pan 1 sibling, 2 replies; 8+ messages in thread From: Ian Campbell @ 2011-03-04 11:31 UTC (permalink / raw) To: Frank Pan; +Cc: xen-devel@lists.xensource.com, Jeremy Fitzhardinge On Wed, 2011-03-02 at 11:35 +0000, Frank Pan wrote: > Oh sorry. > Patch attached. > > On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > It all sounds very plausible to me but you've forgotten the patch ;-) > > > > Why wait_event_timeout and not wait_event_interruptible to allow users > > to interrupt? In particular I'm concerned about the arbitrarily chosen > > 5s timeout not being sufficient on a busy system. > > FP: I wait_event_interruptible is a choice. But it needs user > operation such as kill command. User-level tool(xenstore-ls for > example) can also set SIGALRM or something else, but it sounds not so > good. I think this is preferable to requiring all clients be reworked to handle a timeout event which they aren't currently expecting. Rather than have all xenbus clients expect and handle an arbitrary timeout we should provide a new interface to in-kernel users of xenbus which includes a timeout parameter, e.g. foo_timeout. (assuming there are in kernel users who could do anything sane with a timeout, if not then we shouldn't bother with the interface). For userspace clients I think we would probably be better off making sure that poll/select work properly than trying to find a way to expose timeouts of this sort, that combined with making the sleeps interruptible, would cover the problems we care about, I think. > The timeout parameter is something discussible. 5s may not be a good > one, but I believe xenstored on a healthy system should be response in > 5s. What do you think? I've definitely seen cases in heavily loaded domain 0 where it has taken longer than 5s (think of a system which is starting dozens of new domains concurrently for example) > > Once specific pitfall which I remember was that userspace clients are at > > liberty to make use of the req_id themselves (and some do). Fixing this > > might involve shadowing the user provided req_id with a kernel generated > > ID on the ring and unshadowing in the responses... > > FP: Yes, that's what I supposed to do. But I cannot find any > dereference on the req_id section of the reply msg. If it exist > somewhere, shadowing is surely needed. Userspace clients communicating via the drivers/xen/xenfs/xenbus.c driver are at liberty to use the req_id field for their own purposes. For example, the ocaml xenbus bindings in xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think (although I'm not 100% sure) that the oxenstored (tools/ocaml/xenstored) uses it. Ian. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-04 11:31 ` Ian Campbell @ 2011-03-04 11:41 ` Vincent Hanquez 2011-03-04 11:53 ` Ian Campbell 2011-03-10 10:54 ` Frank Pan 1 sibling, 1 reply; 8+ messages in thread From: Vincent Hanquez @ 2011-03-04 11:41 UTC (permalink / raw) To: xen-devel On 03/04/2011 11:31 AM, Ian Campbell wrote: > For example, the ocaml xenbus bindings in > xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think > (although I'm not 100% sure) that the oxenstored (tools/ocaml/xenstored) > uses it. > oxenstored is echoing it back in the reply, and it doesn't have any more business for it. The client bindings (ocaml/libs/xs not ocaml/libs/xb) doesn't use it. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-04 11:41 ` Vincent Hanquez @ 2011-03-04 11:53 ` Ian Campbell 0 siblings, 0 replies; 8+ messages in thread From: Ian Campbell @ 2011-03-04 11:53 UTC (permalink / raw) To: Vincent Hanquez; +Cc: xen-devel@lists.xensource.com On Fri, 2011-03-04 at 11:41 +0000, Vincent Hanquez wrote: > On 03/04/2011 11:31 AM, Ian Campbell wrote: > > For example, the ocaml xenbus bindings in > > xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think > > (although I'm not 100% sure) that the oxenstored (tools/ocaml/xenstored) > > uses it. > > > oxenstored is echoing it back in the reply, and it doesn't have any more > business for it. Duh, of course, it's the server end isn't it ;-) > The client bindings (ocaml/libs/xs not ocaml/libs/xb) doesn't use it. Ian. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] Use timeout on xenstore read_reply to avoid task hunging 2011-03-04 11:31 ` Ian Campbell 2011-03-04 11:41 ` Vincent Hanquez @ 2011-03-10 10:54 ` Frank Pan 1 sibling, 0 replies; 8+ messages in thread From: Frank Pan @ 2011-03-10 10:54 UTC (permalink / raw) To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Jeremy Fitzhardinge > I think this is preferable to requiring all clients be reworked to > handle a timeout event which they aren't currently expecting. > Rather than have all xenbus clients expect and handle an arbitrary > timeout we should provide a new interface to in-kernel users of xenbus > which includes a timeout parameter, e.g. foo_timeout. (assuming there > are in kernel users who could do anything sane with a timeout, if not > then we shouldn't bother with the interface). > For userspace clients I think we would probably be better off making > sure that poll/select work properly than trying to find a way to expose > timeouts of this sort, that combined with making the sleeps > interruptible, would cover the problems we care about, I think. How many clients are there? Is it possible to change them all? Poll/select sounds great, but it may change the current xenbus protocol. Also, if the protocol is changing, i think design a method to determine whether xenstored is up is needed. Can we alter xenbus protocol? -- 潘震皓, Frank Pan Computer Science and Technology Tsinghua University ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-03-10 10:54 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-03-02 10:20 [PATCH] Use timeout on xenstore read_reply to avoid task hunging Frank Pan 2011-03-02 10:34 ` Ian Campbell 2011-03-02 11:35 ` Frank Pan 2011-03-03 3:31 ` Frank Pan 2011-03-04 11:31 ` Ian Campbell 2011-03-04 11:41 ` Vincent Hanquez 2011-03-04 11:53 ` Ian Campbell 2011-03-10 10:54 ` Frank Pan
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.