From: Nathaniel Rutman <Nathan.Rutman@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] hiding non-fatal communications errors
Date: Thu, 19 Jun 2008 13:24:42 -0700
Message-ID: <485AC08A.7070907@sun.com>
In-Reply-To: <067b01c8c7c6$530dbb40$0281a8c0@ebpc>
Eric Barton wrote:
> Oleg's comments about congestion and the ORNL discussions I've been
> involved in are effectively presenting arguments for allowing
> expedited communications. This is possible but comes at a cost.
>
> The "proper" implementation effectively holds an uncongested network
> in reserve for expedited communications. That's a high price to pay
> because it pretty well means doubling up all the LNET state - twice
> the number of queues/sockets/queuepairs/connections. That's
> unavoidable since we're using these structures for backpressure and
> once they're "full" you can only bypass with an additional connection.
>
That's assuming network congestion is the cause of the lock timeout.
What if the server disk is busy doing who knows what, and the client's
cache flush RPCs are all sitting in the server's request queue, just
waiting for some disk time? Furthermore, assume that a bunch of other
clients are all doing the same thing, so that we can't simply prioritize
this client's RPCs over everybody else's.
I think the method suggested by Oleg has the most potential in this
case: "sniff" the incoming RPCs to see if they are cache flushes, and do
not decide to evict those clients until after those RPCs have been
processed. As mentioned, we already do sniff the incoming reqs to check
adaptive timeout deadlines (ptlrpc_server_handle_req_in).
One further thing I would like to do is respond to "easy" RPCs
immediately (in a reserved thread). "Easy" would certainly include
pings, maybe others that have no disk access. This would allow us to
free up LNET buffers and other resources, prevent us from evicting
clients "we haven't heard from in X seconds" (although I just realized
we could fix that right now in ptlrpc_server_handle_req_in), and more
quickly determine network and server loading remotely.