From: Christopher J. Morrone <morrone2@llnl.gov>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] replacing Lustre pings with LNet Peer Health
Date: Thu, 12 May 2011 10:37:51 -0700
Message-ID: <4DCC1AEF.8020705@llnl.gov>
In-Reply-To: <4DCBF565.3060602@cray.com>

I think Eric's approach is the only sane way I've heard to reduce pings.

Here are some issues that I see with this:

1)  For your solution to work, the LNet layer must take on pinging
duties.  Usually the network, be it IB, TCP, or whatever, will not
provide any active notification of a peer failure.  To notice that a
peer has died, the LNet LND must, you guessed it, ping.

Usually the LNDs try to be smart.  They only generate their own pings if 
no traffic has been sent to the peer in a certain period of time.  So 
once you eliminate the higher-level pings, they will partly be replaced 
by lower-level pings.
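
To make the "only ping when idle" behavior concrete, here is a minimal
sketch in Python.  It is not taken from any real LND; the class name,
the idle_interval parameter, and the injectable clock are all
assumptions made for illustration.

```python
import time


class PeerKeepalive:
    """Hypothetical per-peer keepalive state (not an actual LND structure)."""

    def __init__(self, idle_interval, now=time.monotonic):
        self.idle_interval = idle_interval  # seconds of silence before we ping
        self._now = now                     # injectable clock, for testing
        self.last_tx = now()                # time of the last message sent
        self.keepalives_sent = 0

    def note_traffic(self):
        """Record that ordinary traffic was just sent to the peer."""
        self.last_tx = self._now()

    def poll(self):
        """Send a keepalive only if the peer has been idle long enough.

        Returns True if a keepalive was (notionally) sent.
        """
        if self._now() - self.last_tx >= self.idle_interval:
            self.keepalives_sent += 1
            self.last_tx = self._now()  # the keepalive itself counts as traffic
            return True
        return False
```

With this scheme, a peer that sees regular upper-layer traffic never
triggers a keepalive.  Take the upper-layer pings away and every idle
peer starts costing one keepalive per idle_interval instead.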

2)  This doesn't work in a routed environment.  You would need a health
network for clients behind routers to learn that a server has died, and
vice versa.

On 05/12/2011 07:57 AM, Nic Henke wrote:
> Just floating an idea... I'd much appreciate any feedback
>
> Given bug 12471 where the ptlrpc pinger traffic on a large system can
> approach the ridiculous (2.6M pings every 75s for 160 OSTs and 16K
> clients), I'd like to consider getting rid of the pings entirely.
>
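
As a rough check on that figure, the sketch below multiplies it out.
It assumes every client pings every OST once per interval and that
"16K" means 16 * 1024; neither assumption is stated in the bug, and the
function name is made up for illustration.

```python
def ping_load(num_osts, num_clients, interval_s):
    """Return (pings per interval, pings per second).

    Assumes one ping per client per OST per interval.
    """
    pings = num_osts * num_clients
    return pings, pings / interval_s


pings, rate = ping_load(num_osts=160, num_clients=16 * 1024, interval_s=75)
print(f"{pings} pings per interval, about {rate:.0f} per second")
```

Under those assumptions the total is 2,621,440 pings per 75s, which
matches the quoted 2.6M and works out to roughly 35,000 pings per
second across the system.
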
> The idea would be to extend the approach in the attached patch, where we
> add an upper layer callback for lnet_notify() signaling a peer going down
> or up. The ptlrpc pinger code would then be changed to record the 'down'
> event for an import/export, which would start an eviction timer counted
> from when the LNet peer was last_alive. If the node comes 'up' before
> the timer expires, no eviction. The eviction code would then only
> operate on nodes with 'down' events and trust that the rest are all
> ok and functional.
>
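
For reference, the down/up bookkeeping described above could be
sketched like this.  This is Python for brevity, not ptlrpc code; the
class, the method names, and the callback signature are hypothetical
and are not the real lnet_notify() interface.

```python
class PeerEvictionTracker:
    """Hypothetical tracker: only peers with a 'down' event are candidates."""

    def __init__(self, eviction_timeout):
        self.eviction_timeout = eviction_timeout
        self.down_since = {}  # peer id -> last_alive time recorded at 'down'

    def on_notify(self, peer, alive, last_alive):
        """Upper-layer callback for a peer going up or down."""
        if alive:
            # Peer came back before eviction: cancel its timer.
            self.down_since.pop(peer, None)
        else:
            # Timer runs from when the peer was last known alive.
            # A repeated 'down' event does not restart it.
            self.down_since.setdefault(peer, last_alive)

    def peers_to_evict(self, now):
        """Peers whose 'down' state has outlasted the eviction timeout."""
        return sorted(
            peer
            for peer, last_alive in self.down_since.items()
            if now - last_alive >= self.eviction_timeout
        )
```

Note that a peer that never generates a 'down' event is never examined
at all, which is exactly the "trust that the rest are ok" part of the
proposal.
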
> Eric - I know this doesn't get us that far down the road toward your new
> health network, but does solve a near term issue with pinger rates on
> large systems.
>
> Issues...
>
> - lacks "proof" that peer nodes' ptlrpc queues are moving forward, but
> I'm not really sure that is all that important in terms of pinger evictions.
>
> - LNet peer health is a bit "weird" in that it requires an upper layer
> sending a packet to trigger a node moving back to 'up'. We would need to
> address this for proper LNet peer health as it is.
>
> - Might need some beefing up of the standard LNDs to ensure we have good
> peer health data.
>
> Thoughts ?
>
> Nic
