From: Nicolas Williams <Nicolas.Williams@sun.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] server-side resending & bulk transfer
Date: Fri, 5 Feb 2010 14:20:13 -0600
Message-ID: <20100205202013.GN1061@Sun.COM>
In-Reply-To: <002d01caa686$72c40d40$584c27c0$@com>

On Fri, Feb 05, 2010 at 05:12:51PM +0000, Eric Barton wrote:
> On Feb 5, 2010, at 8:35 AM, Johann Lombardi wrote:
> > Unlike lock callback rpcs, losing the start bulk signal is not fatal since
> > the bulk transfer will time out on the server side, the request will be dropped
> > and the client will resend after reconnection. This is indeed harmless,
> > but still causes slowdown which could be avoided according to LLNL if we
> > try to resend the start bulk signal (bug 21714). Brian Behlendorf's
> > proposal is to resend the start bulk signal after the first l_wait_event()
> > timeout in ost_brw_write(). However, we don't know if this is safe to do,
> > e.g. how does the client react if it receives duplicated start bulk signals?
>
> Yes, the server could retry the bulk if it times out and this
> will be safe for the client since its bulk buffer is auto-unlinked,
> so only 1 bulk PUT/GET can match it. But if the problem happens
> on the way back to the server rather than the way out to the client,
> you're hosed since the bulk has completed from the client's POV.
>
> This should be an exceptional circumstance - i.e. a router has
> actually failed - so I think it's better just to stick with the
> client retrying from scratch rather than tying down a server thread
> until it has decided whether there was a router failure or the
> client really crashed.

I agree that tying down a server thread in a long blocking wait is not
a good thing. If the LLNL proposal (resend the start bulk signal) is on
the money, then the thing to do would be to create a queue and separate
service thread(s) to handle such resends, so that the thread serving the
request never waits on them.
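
To make the queue-plus-service-thread idea concrete, here is a minimal
sketch in Python rather than in Lustre's C. Every name in it
(ResendQueue, submit, the resend callback) is hypothetical and none of
it is Lustre or LNet API. It assumes the request thread hands the
transfer off after its first timeout and that a completion event is set
when the bulk finishes.

```python
import queue
import threading


class ResendQueue:
    """Hypothetical helper: resends the start bulk signal off the request thread."""

    def __init__(self, resend, workers=1, timeout=0.05, max_tries=2):
        self._resend = resend          # callback that re-sends the signal (assumed)
        self._timeout = timeout        # how long a worker waits per attempt, seconds
        self._max_tries = max_tries    # give up after this many resends
        self._q = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, xfer_id, done):
        """Called by the request thread after its first timeout; never blocks."""
        self._q.put((xfer_id, done, 1))

    def drain(self):
        """Block until every queued resend has completed or been abandoned."""
        self._q.join()

    def _run(self):
        while True:
            xfer_id, done, tries = self._q.get()
            self._resend(xfer_id)
            # Only the resend worker waits here, not a request-serving thread.
            if not done.wait(self._timeout) and tries < self._max_tries:
                self._q.put((xfer_id, done, tries + 1))
            self._q.task_done()
```

The point of the split is that the blocking wait moves onto a small,
dedicated pool whose size bounds how many resends can be outstanding.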

> Roll on the health network! :)

Well, if the deadline here is on the order of 1s, then the health
network isn't likely to help much, because we're not going to get
sub-second dead node detection. (If we jack up the ping rate, reduce
the time-to-declare-death far enough, and make sure that HN threads and
messaging are suitably prioritized, then we might be able to get
sub-second dead node detection, but my gut feeling is that any
heuristic approach should wait for longer than 1s.)
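
A back-of-the-envelope version of that trade-off, as a Python sketch.
The model is an assumption of mine, not the health network's actual
design: a node is declared dead after a fixed number of consecutive
unanswered pings, and each ping is given a fixed reply timeout. The
function and parameter names are hypothetical.

```python
def worst_case_detection_s(ping_interval_s, missed_pings, reply_timeout_s):
    """Worst-case time to declare a node dead under the assumed model.

    The node dies right after answering a ping, so the next `missed_pings`
    pings (one per interval) all go unanswered, and the last of them is
    only counted as missed once its reply timeout expires.
    """
    return ping_interval_s * missed_pings + reply_timeout_s


# Aggressive settings: sub-second, but only three lost pings from a false alarm.
aggressive = worst_case_detection_s(0.25, 3, 0.1)
# More conservative settings land well above the ~1s deadline.
relaxed = worst_case_detection_s(1.0, 3, 0.1)
```

Under this model the only ways below 1s are a very short ping interval
or declaring death after very few misses, which is where the
false-positive risk comes from.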

Nico