From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jim Schutt"
Subject: Re: [PATCH 0/3] RFC: Enable clients to distinguish busy and unreachable OSDs
Date: Thu, 23 Jun 2011 08:22:28 -0600
Message-ID: <4E034C24.6020103@sandia.gov>
References: <1308767187-10376-1-git-send-email-jaschut@sandia.gov>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from sentry-two.sandia.gov ([132.175.109.14]:39317 "EHLO sentry-two.sandia.gov" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758901Ab1FWOXC (ORCPT ); Thu, 23 Jun 2011 10:23:02 -0400
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Sage Weil
Cc: ceph-devel@vger.kernel.org

Hi Sage,

Sage Weil wrote:
> Hey Jim-
>
> I wonder if the below is sufficient, actually. This avoids any change on
> the server side, and just changes the client to start the per-message
> timeout "clock" when the message is actually received by the server...
>
> This way we still time out if the request gets stuck in cosd's request
> queues somewhere, or if the disk blocks up, or something. Any requests
> that didn't get received don't time out, though.
>
> What do you think?

I like it. I'm pretty sure you've addressed the case
I'm after. Let me give it a try.

Thanks! -- Jim

>
> sage
>
>
>
>>>From e129e4f3f500f4e77cd1a7c64ff64edc54a9a9ea Mon Sep 17 00:00:00 2001
> From: Sage Weil
> Date: Wed, 22 Jun 2011 13:43:06 -0700
> Subject: [PATCH] libceph: don't time out osd requests that haven't been received
>
> Keep track of when an outgoing message is ACKed (i.e., the server fully
> received it and, presumably, queued it for processing). Time out OSD
> requests only if it's been too long since they've been received.
>
> This prevents timeouts and connection thrashing when the OSDs are simply
> busy and are throttling the requests they read off the network.
>
> Signed-off-by: Sage Weil
> ---
>  include/linux/ceph/messenger.h |    1 +
>  net/ceph/messenger.c           |   12 +++++-------
>  net/ceph/osd_client.c          |    6 ++++++
>  3 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 31d91a6..d7adf15 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -94,6 +94,7 @@ struct ceph_msg {
>  	bool more_to_follow;
>  	bool needs_out_seq;
>  	int front_max;
> +	unsigned long ack_stamp;        /* tx: when we were acked */
>
>  	struct ceph_msgpool *pool;
>  };
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 78b55f4..c340e2e 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -486,13 +486,10 @@ static void prepare_write_message(struct ceph_connection *con)
>  	m = list_first_entry(&con->out_queue,
>  		       struct ceph_msg, list_head);
>  	con->out_msg = m;
> -	if (test_bit(LOSSYTX, &con->state)) {
> -		list_del_init(&m->list_head);
> -	} else {
> -		/* put message on sent list */
> -		ceph_msg_get(m);
> -		list_move_tail(&m->list_head, &con->out_sent);
> -	}
> +
> +	/* put message on sent list */
> +	ceph_msg_get(m);
> +	list_move_tail(&m->list_head, &con->out_sent);
>
>  	/*
>  	 * only assign outgoing seq # if we haven't sent this message
> @@ -1399,6 +1396,7 @@ static void process_ack(struct ceph_connection *con)
>  			break;
>  		dout("got ack for seq %llu type %d at %p\n", seq,
>  		     le16_to_cpu(m->hdr.type), m);
> +		m->ack_stamp = jiffies;
>  		ceph_msg_remove(m);
>  	}
>  	prepare_read_tag(con);
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 7330c27..ce310ee 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1085,9 +1085,15 @@ static void handle_timeout(struct work_struct *work)
>  		req = list_entry(osdc->req_lru.next, struct ceph_osd_request,
>  				 r_req_lru_item);
>
> +		/* hasn't been long enough since we sent it? */
>  		if (time_before(jiffies, req->r_stamp + timeout))
>  			break;
>
> +		/* hasn't been long enough since it was acked? */
> +		if (req->r_request->ack_stamp == 0 ||
> +		    time_before(jiffies, req->r_request->ack_stamp + timeout))
> +			break;
> +
>  		BUG_ON(req == last_req && req->r_stamp == last_stamp);
>  		last_req = req;
>  		last_stamp = req->r_stamp;
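
For anyone following the thread, the timeout rule in the quoted patch can be modelled in userspace roughly as below. This is only a sketch of the decision, not the kernel code: `tick_before()`, `struct model_request` and `request_timed_out()` are hypothetical names made up for illustration, plain tick counts stand in for jiffies, and the LRU walk in handle_timeout() is left out. As in the patch, an ack stamp of 0 is taken to mean "not acked yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's time_before():
 * true if tick a is earlier than tick b, tolerating counter wraparound.
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Hypothetical, trimmed-down model of an OSD request. */
struct model_request {
	unsigned long sent_stamp;	/* tick when the request was sent */
	unsigned long ack_stamp;	/* tick when the server acked it; 0 = not acked */
};

/*
 * A request counts as timed out only if it was acked and both the send
 * stamp and the ack stamp are at least `timeout` ticks in the past.
 */
static bool request_timed_out(const struct model_request *req,
			      unsigned long now, unsigned long timeout)
{
	assert(req != NULL);

	/* hasn't been long enough since we sent it? */
	if (tick_before(now, req->sent_stamp + timeout))
		return false;

	/* never acked: the server has not fully received it yet */
	if (req->ack_stamp == 0)
		return false;

	/* hasn't been long enough since it was acked? */
	if (tick_before(now, req->ack_stamp + timeout))
		return false;

	return true;
}
```

With a timeout of 60 ticks, a request sent at tick 100 and never acked is not reported as timed out however late `now` is, while one acked at tick 150 is reported once `now` reaches 210. That matches the intent stated in the commit message: a busy OSD that is throttling reads does not trigger a timeout, but a request stuck after being received still does.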