From: Nathaniel Rutman <Nathan.Rutman@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] hiding non-fatal communications errors
Date: Thu, 19 Jun 2008 13:24:42 -0700
Message-ID: <485AC08A.7070907@sun.com>
In-Reply-To: <067b01c8c7c6$530dbb40$0281a8c0@ebpc>

Eric Barton wrote:
> Oleg's comments about congestion and the ORNL discussions I've been
> involved in are effectively presenting arguments for allowing
> expedited communications.  This is possible but comes at a cost.
>   
> The "proper" implementation effectively holds an uncongested network
> in reserve for expedited communications.  That's a high price to pay
> because it pretty well means doubling up all the LNET state - twice
> the number of queues/sockets/queuepairs/connections.  That's
> unavoidable since we're using these structures for backpressure and
> once they're "full" you can only bypass with an additional connection.
>   
That's assuming network congestion is the cause of the lock timeout.
What if the server disk is busy doing who knows what, and the client's
cache flush RPCs are all sitting in the server's request queue, just
waiting for some disk time?  Furthermore, assume that a bunch of other
clients are all doing the same thing, so that we can't simply prioritize
this client's RPCs over everybody else's.

I think the method suggested by Oleg has the most potential in this 
case: "sniff" the incoming RPCs to see if they are cache flushes, and do 
not decide to evict those clients until after those RPCs have been 
processed.  As mentioned, we already do sniff the incoming reqs to check 
adaptive timeout deadlines (ptlrpc_server_handle_req_in).
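
To make the idea concrete, here is a rough sketch in Python rather than real ptlrpc code. Everything in it is an assumption for illustration: the `Request` and `FlushAwareServer` names, the `"flush"` opcode string, and the per-client counter are hypothetical and do not correspond to existing Lustre structures. It only shows the shape of the logic: count cache flush RPCs as they arrive, and refuse to evict a client while any of its flushes are still queued.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    client: str
    opcode: str  # hypothetical; "flush" stands in for a cache flush RPC


class FlushAwareServer:
    def __init__(self):
        self.queue = deque()       # requests received but not yet processed
        self.pending_flushes = {}  # client -> number of flush RPCs still queued

    def handle_req_in(self, req):
        """Sniff the request on arrival, before it waits for service."""
        if req.opcode == "flush":
            count = self.pending_flushes.get(req.client, 0)
            self.pending_flushes[req.client] = count + 1
        self.queue.append(req)

    def process_one(self):
        """Service the oldest queued request and update the flush count."""
        req = self.queue.popleft()
        if req.opcode == "flush":
            self.pending_flushes[req.client] -= 1
        return req

    def may_evict(self, client):
        """On lock timeout: evict only if no flush from this client is queued."""
        return self.pending_flushes.get(client, 0) == 0
```

A lock timeout handler would call `may_evict()` and, if it returns False, re-arm the timer instead of evicting. A real implementation would also need an upper bound on how long eviction can be deferred, which this sketch leaves out.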

One further thing I would like to do is respond to "easy" RPCs 
immediately (in a reserved thread).  "Easy" would certainly include 
pings, maybe others that have no disk access.  This would allow us to 
free up LNET buffers and other resources, prevent us from evicting 
clients "we haven't heard from in X seconds" (although I just realized 
we could fix that right now in ptlrpc_server_handle_req_in), and more 
quickly determine network and server loading remotely.
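
Again as a sketch only, in Python and with made-up names (`EASY_OPCODES`, `EasyRpcService`, and the string return values are assumptions, not anything in ptlrpc): the arrival handler records that the client was heard from, answers "easy" requests on the spot, and queues everything else for the normal service threads.

```python
EASY_OPCODES = {"ping"}  # assumption: requests that need no disk access


class EasyRpcService:
    def __init__(self):
        self.last_heard = {}  # client -> time of most recent arrival
        self.queue = []       # requests waiting for a normal service thread

    def handle_req_in(self, client, opcode, now):
        # Any arrival shows the client is alive, even if the request
        # itself then sits in the queue for a long time.
        self.last_heard[client] = now
        if opcode in EASY_OPCODES:
            return "reply"    # answered immediately by the reserved thread
        self.queue.append((client, opcode))
        return "queued"

    def is_stale(self, client, now, timeout):
        """True if nothing has arrived from this client within `timeout`."""
        last = self.last_heard.get(client, float("-inf"))
        return now - last > timeout
```

Updating `last_heard` at arrival time, rather than when the request is finally processed, is what keeps a loaded server from evicting clients whose requests are merely stuck in its own queue.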
