* [PATCH 0/3] libceph: fix backoff handling
@ 2012-10-09 21:31 Alex Elder
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:31 UTC (permalink / raw)
To: ceph-devel
These three patches fix some problems related to how backoff
is handled when a ceph connection faults.
-Alex
[PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
[PATCH 2/3] libceph: let con_work() handle backoff
[PATCH 3/3] libceph: define common queue_con_delay()
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue 2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder @ 2012-10-09 21:33 ` Alex Elder 2012-10-09 21:33 ` Sage Weil 2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder 2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder 2 siblings, 1 reply; 7+ messages in thread From: Alex Elder @ 2012-10-09 21:33 UTC (permalink / raw) To: ceph-devel If ceph_fault() is unable to queue work after a delay, it sets the BACKOFF connection flag so con_work() will attempt to do so. In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't result in newly-queued work, it simply ignores this condition and proceeds as if no backoff delay were desired. There are two problems with this--one of which is a bug. The first problem is simply that the intended behavior is to back off, and if we aren't able queue the work item to run after a delay we're not doing that. The only reason queue_delayed_work() won't queue work is if the provided work item is already queued. In the messenger, this means that con_work() is already scheduled to be run again. So if we simply set the BACKOFF flag again when this occurs, we know the next con_work() call will again attempt to hold off activity on the connection until after the delay. The second problem--the bug--is a leak of a reference count. If queue_delayed_work() returns 0 in con_work(), con->ops->put() drops the connection reference held on entry to con_work(). However, processing is (was) allowed to continue, and at the end of the function a second con->ops->put() is called. This patch fixes both problems. Signed-off-by: Alex Elder <elder@inktank.com> --- net/ceph/messenger.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index f9f65fe..ece06bc 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -2300,10 +2300,11 @@ restart: mutex_unlock(&con->mutex); return; } else { - con->ops->put(con); dout("con_work %p FAILED to back off %lu\n", con, con->delay); + set_bit(CON_FLAG_BACKOFF, &con->flags); } + goto done; } if (con->state == CON_STATE_STANDBY) { -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue 2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder @ 2012-10-09 21:33 ` Sage Weil 0 siblings, 0 replies; 7+ messages in thread From: Sage Weil @ 2012-10-09 21:33 UTC (permalink / raw) To: Alex Elder; +Cc: ceph-devel Reviewed-by: Sage Weil <sage@inktank.com> On Tue, 9 Oct 2012, Alex Elder wrote: > If ceph_fault() is unable to queue work after a delay, it sets the > BACKOFF connection flag so con_work() will attempt to do so. > > In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't > result in newly-queued work, it simply ignores this condition and > proceeds as if no backoff delay were desired. There are two > problems with this--one of which is a bug. > > The first problem is simply that the intended behavior is to back > off, and if we aren't able queue the work item to run after a delay > we're not doing that. > > The only reason queue_delayed_work() won't queue work is if the > provided work item is already queued. In the messenger, this > means that con_work() is already scheduled to be run again. So > if we simply set the BACKOFF flag again when this occurs, we know > the next con_work() call will again attempt to hold off activity > on the connection until after the delay. > > The second problem--the bug--is a leak of a reference count. If > queue_delayed_work() returns 0 in con_work(), con->ops->put() drops > the connection reference held on entry to con_work(). However, > processing is (was) allowed to continue, and at the end of the > function a second con->ops->put() is called. > > This patch fixes both problems. > > Signed-off-by: Alex Elder <elder@inktank.com> > --- > net/ceph/messenger.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index f9f65fe..ece06bc 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -2300,10 +2300,11 @@ restart: > mutex_unlock(&con->mutex); > return; > } else { > - con->ops->put(con); > dout("con_work %p FAILED to back off %lu\n", con, > con->delay); > + set_bit(CON_FLAG_BACKOFF, &con->flags); > } > + goto done; > } > > if (con->state == CON_STATE_STANDBY) { > -- > 1.7.9.5 > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/3] libceph: let con_work() handle backoff 2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder 2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder @ 2012-10-09 21:33 ` Alex Elder 2012-10-09 21:34 ` Sage Weil 2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder 2 siblings, 1 reply; 7+ messages in thread From: Alex Elder @ 2012-10-09 21:33 UTC (permalink / raw) To: ceph-devel Both ceph_fault() and con_work() include handling for imposing a delay before doing further processing on a faulted connection. The latter is used only if ceph_fault() is unable to. Instead, just let con_work() always be responsible for implementing the delay. After setting up the delay value, set the BACKOFF flag on the connection unconditionally and call queue_con() to ensure con_work() will get called to handle it. Signed-off-by: Alex Elder <elder@inktank.com> --- net/ceph/messenger.c | 20 ++------------------ 1 file changed, 2 insertions(+), 18 deletions(-) diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index ece06bc..9170c20 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -2398,24 +2398,8 @@ static void ceph_fault(struct ceph_connection *con) con->delay = BASE_DELAY_INTERVAL; else if (con->delay < MAX_DELAY_INTERVAL) con->delay *= 2; - con->ops->get(con); - if (queue_delayed_work(ceph_msgr_wq, &con->work, - round_jiffies_relative(con->delay))) { - dout("fault queued %p delay %lu\n", con, con->delay); - } else { - con->ops->put(con); - dout("fault failed to queue %p delay %lu, backoff\n", - con, con->delay); - /* - * In many cases we see a socket state change - * while con_work is running and end up - * queuing (non-delayed) work, such that we - * can't backoff with a delay. Set a flag so - * that when con_work restarts we schedule the - * delay then. - */ - set_bit(CON_FLAG_BACKOFF, &con->flags); - } + set_bit(CON_FLAG_BACKOFF, &con->flags); + queue_con(con); } out_unlock: -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] libceph: let con_work() handle backoff 2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder @ 2012-10-09 21:34 ` Sage Weil 0 siblings, 0 replies; 7+ messages in thread From: Sage Weil @ 2012-10-09 21:34 UTC (permalink / raw) To: Alex Elder; +Cc: ceph-devel Reviewed-by: Sage Weil <sage@inktank.com> On Tue, 9 Oct 2012, Alex Elder wrote: > Both ceph_fault() and con_work() include handling for imposing a > delay before doing further processing on a faulted connection. > The latter is used only if ceph_fault() is unable to. > > Instead, just let con_work() always be responsible for implementing > the delay. After setting up the delay value, set the BACKOFF flag > on the connection unconditionally and call queue_con() to ensure > con_work() will get called to handle it. > > Signed-off-by: Alex Elder <elder@inktank.com> > --- > net/ceph/messenger.c | 20 ++------------------ > 1 file changed, 2 insertions(+), 18 deletions(-) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index ece06bc..9170c20 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -2398,24 +2398,8 @@ static void ceph_fault(struct ceph_connection *con) > con->delay = BASE_DELAY_INTERVAL; > else if (con->delay < MAX_DELAY_INTERVAL) > con->delay *= 2; > - con->ops->get(con); > - if (queue_delayed_work(ceph_msgr_wq, &con->work, > - round_jiffies_relative(con->delay))) { > - dout("fault queued %p delay %lu\n", con, con->delay); > - } else { > - con->ops->put(con); > - dout("fault failed to queue %p delay %lu, backoff\n", > - con, con->delay); > - /* > - * In many cases we see a socket state change > - * while con_work is running and end up > - * queuing (non-delayed) work, such that we > - * can't backoff with a delay. Set a flag so > - * that when con_work restarts we schedule the > - * delay then. > - */ > - set_bit(CON_FLAG_BACKOFF, &con->flags); > - } > + set_bit(CON_FLAG_BACKOFF, &con->flags); > + queue_con(con); > } > > out_unlock: > -- > 1.7.9.5 > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 3/3] libceph: define common queue_con_delay() 2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder 2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder 2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder @ 2012-10-09 21:33 ` Alex Elder 2012-10-09 21:37 ` Sage Weil 2 siblings, 1 reply; 7+ messages in thread From: Alex Elder @ 2012-10-09 21:33 UTC (permalink / raw) To: ceph-devel This patch defines a single function, queue_con_delay() to call queue_delayed_work() for a connection. It basically generalizes what was previously queue_con() by adding the delay argument. queue_con() is now a simple helper that passes 0 for its delay. queue_con_delay() returns 0 if it queued work or an errno if it did not for some reason. If con_work() finds the BACKOFF flag set for a connection, it now calls queue_con_delay() to handle arranging to start again after a delay. Note about connection reference counts: con_work() only ever gets called as a work item function. At the time that work is scheduled, a reference to the connection is acquired, and the corresponding con_work() call is then responsible for dropping that reference before it returns. Previously, the backoff handling inside con_work() silently handed off its reference to delayed work it scheduled. Now that queue_con_delay() is used, a new reference is acquired for the newly-scheduled work, and the original reference is dropped by the con->ops->put() call at the end of the function. Signed-off-by: Alex Elder <elder@inktank.com> --- net/ceph/messenger.c | 38 +++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index 9170c20..77cc8b1 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -2244,22 +2244,33 @@ bad_tag: /* - * Atomically queue work on a connection. Bump @con reference to - * avoid races with connection teardown. + * Atomically queue work on a connection after the specified delay. + * Bump @con reference to avoid races with connection teardown. + * Returns 0 if work was queued, or an error code otherwise. */ -static void queue_con(struct ceph_connection *con) +static int queue_con_delay(struct ceph_connection *con, unsigned long delay) { if (!con->ops->get(con)) { - dout("queue_con %p ref count 0\n", con); - return; + dout("%s %p ref count 0\n", __func__, con); + + return -ENOENT; } - if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) { - dout("queue_con %p - already queued\n", con); + if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) { + dout("%s %p - already queued\n", __func__, con); con->ops->put(con); - } else { - dout("queue_con %p\n", con); + + return -EBUSY; } + + dout("%s %p %lu\n", __func__, con, delay); + + return 0; +} + +static void queue_con(struct ceph_connection *con) +{ + (void) queue_con_delay(con, 0); } /* @@ -2294,14 +2305,11 @@ restart: if (test_and_clear_bit(CON_FLAG_BACKOFF, &con->flags)) { dout("con_work %p backing off\n", con); - if (queue_delayed_work(ceph_msgr_wq, &con->work, - round_jiffies_relative(con->delay))) { - dout("con_work %p backoff %lu\n", con, con->delay); - mutex_unlock(&con->mutex); - return; - } else { + ret = queue_con_delay(con, round_jiffies_relative(con->delay)); + if (ret) { dout("con_work %p FAILED to back off %lu\n", con, con->delay); + BUG_ON(ret == -ENOENT); set_bit(CON_FLAG_BACKOFF, &con->flags); } goto done; -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 3/3] libceph: define common queue_con_delay() 2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder @ 2012-10-09 21:37 ` Sage Weil 0 siblings, 0 replies; 7+ messages in thread From: Sage Weil @ 2012-10-09 21:37 UTC (permalink / raw) To: Alex Elder; +Cc: ceph-devel Reviewed-by: Sage Weil <sage@inktank.com> On Tue, 9 Oct 2012, Alex Elder wrote: > This patch defines a single function, queue_con_delay() to call > queue_delayed_work() for a connection. It basically generalizes > what was previously queue_con() by adding the delay argument. > queue_con() is now a simple helper that passes 0 for its delay. > queue_con_delay() returns 0 if it queued work or an errno if it > did not for some reason. > > If con_work() finds the BACKOFF flag set for a connection, it now > calls queue_con_delay() to handle arranging to start again after a > delay. > > > Note about connection reference counts: con_work() only ever gets > called as a work item function. At the time that work is scheduled, > a reference to the connection is acquired, and the corresponding > con_work() call is then responsible for dropping that reference > before it returns. > > Previously, the backoff handling inside con_work() silently handed > off its reference to delayed work it scheduled. Now that > queue_con_delay() is used, a new reference is acquired for the > newly-scheduled work, and the original reference is dropped by the > con->ops->put() call at the end of the function. > > Signed-off-by: Alex Elder <elder@inktank.com> > --- > net/ceph/messenger.c | 38 +++++++++++++++++++++++--------------- > 1 file changed, 23 insertions(+), 15 deletions(-) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index 9170c20..77cc8b1 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -2244,22 +2244,33 @@ bad_tag: > > > /* > - * Atomically queue work on a connection. Bump @con reference to > - * avoid races with connection teardown. > + * Atomically queue work on a connection after the specified delay. > + * Bump @con reference to avoid races with connection teardown. > + * Returns 0 if work was queued, or an error code otherwise. > */ > -static void queue_con(struct ceph_connection *con) > +static int queue_con_delay(struct ceph_connection *con, unsigned long delay) > { > if (!con->ops->get(con)) { > - dout("queue_con %p ref count 0\n", con); > - return; > + dout("%s %p ref count 0\n", __func__, con); > + > + return -ENOENT; > } > > - if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) { > - dout("queue_con %p - already queued\n", con); > + if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) { > + dout("%s %p - already queued\n", __func__, con); > con->ops->put(con); > - } else { > - dout("queue_con %p\n", con); > + > + return -EBUSY; > } > + > + dout("%s %p %lu\n", __func__, con, delay); > + > + return 0; > +} > + > +static void queue_con(struct ceph_connection *con) > +{ > + (void) queue_con_delay(con, 0); > } > > /* > @@ -2294,14 +2305,11 @@ restart: > > if (test_and_clear_bit(CON_FLAG_BACKOFF, &con->flags)) { > dout("con_work %p backing off\n", con); > - if (queue_delayed_work(ceph_msgr_wq, &con->work, > - round_jiffies_relative(con->delay))) { > - dout("con_work %p backoff %lu\n", con, con->delay); > - mutex_unlock(&con->mutex); > - return; > - } else { > + ret = queue_con_delay(con, > round_jiffies_relative(con->delay)); > + if (ret) { > dout("con_work %p FAILED to back off %lu\n", con, > con->delay); > + BUG_ON(ret == -ENOENT); > set_bit(CON_FLAG_BACKOFF, &con->flags); > } > goto done; > -- > 1.7.9.5 > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-10-09 21:37 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder 2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder 2012-10-09 21:33 ` Sage Weil 2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder 2012-10-09 21:34 ` Sage Weil 2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder 2012-10-09 21:37 ` Sage Weil
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox