[Lustre-devel] Imperative Recovery - forcing failover server stop blocking

From: Eric Barton <eeb@sun.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Imperative Recovery - forcing failover server stop blocking
Date: Tue, 23 Jun 2009 13:49:53 +0100
Message-ID: <003301c9f401$1adb8af0$5092a0d0$@com>
In-Reply-To: <4A3FCB96.4010201@cray.com>

Chris,

> Eric Barton wrote:
> > Consider a utility that runs on a client to notify it to reconnect
> > to a failover server, and which completes with a success status
> > only when the client has reconnected successfully.
>
> Would this be equivalent to monitoring the "completed_clients" field
> of the recovery_status proc file?

No, that field accounts for clients that have actually completed
recovery, not clients that have reconnected and are therefore ready
to participate in recovery - you'd want 'connected_clients' for that.
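
To make the distinction concrete, here is a minimal sketch of reading
both counters. It assumes recovery_status is a list of "key: value"
lines with client counts written as "n/total"; that layout and the
sample numbers are assumptions for illustration, and only the two
field names come from this thread.

```python
def parse_recovery_status(text):
    """Split 'key: value' lines into a dict (assumed file layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def client_count(fields, name):
    """Return (count, total) for a field assumed to look like '5/8'."""
    count, _, total = fields[name].partition("/")
    return int(count), int(total)


# Made-up sample content, not output captured from a real server.
SAMPLE = """\
status: RECOVERING
connected_clients: 5/8
completed_clients: 2/8
"""

fields = parse_recovery_status(SAMPLE)
connected, total = client_count(fields, "connected_clients")
completed, _ = client_count(fields, "completed_clients")
print(f"{connected}/{total} reconnected, {completed}/{total} finished recovery")
```

In this sample, five clients are ready to take part in recovery but
only two have finished it, which is why 'completed_clients' is the
wrong counter to watch for reconnection.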

But actually, counting reconnected clients is only half the story.
Currently clients don't even start to participate in recovery until
they detect an error communicating with the failed server - i.e. after
a timeout _and_ a failed reconnection attempt.  This utility
eliminates this latency by notifying the client explicitly to
reconnect NOW.

> > If you run this utility on all clients after starting a failover
> > server, you can notify the server to close the recovery window
> > once all instances have completed since that tells you that all
> > clients are healthy and ready to participate in recovery.
>
> Won't the server already begin replay by this time, since it has
> received connections from all clients?  Thus rendering our
> notification to the server (to close the recovery window) redundant?

Yes, in the optimistic case that all clients have reconnected.

> > Of course, you can decide to stop waiting and proceed with the
> > server notification at any time you like.  You can base this
> > decision on a timeout, knowing how many clients have reconnected
> > successfully, or any other criterion you chose - i.e. you are now
> > the effective arbiter of client health.
>
> Our initial plan was to do just this.  We would have a proxy running
> on the bootnode to aggregate client responses.  It would wait some
> configurable timeout period, say clnt_timeout, and if it received a
> # of responses equal to obd->obd_max_recoverable_clients, it would
> go ahead and notify the server to stop waiting for responses
> immediately (though this is the situation described in the last
> comment).  If the timeout expired it would notify the server to stop
> waiting.  However, it occurred to me that we would get the same
> behavior by simply tuning the server's recovery window down to
> whatever value we were going to assign clnt_timeout.  It seemed we
> were going through an awful lot of trouble to gain a tunable
> recovery_window.  I'm not sure if this is a result of our choosing
> poor criterion upon which to notify the server to stop waiting, or
> if there is something else (a use case perhaps) that I'm missing.

Yes, of course, you can just tune down the recovery window in the
knowledge that explicit notification has sped up the whole process of
client reconnection.  However, if you have better knowledge about
client health than Lustre can have - e.g. hardware-specific health
monitoring, or just using the success/failure of the explicit
notification method itself - then why not use it to control exactly
when to stop waiting for dead clients?
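
As a rough sketch of that decision, assuming the per-client
notification utility reports success or failure and that a failed
notification is taken to mean the client is dead: the function name,
the Decision type and the client names are all hypothetical, and the
actual call that tells the server to close the recovery window is
deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Decision:
    close_window: bool
    reason: str


def decide(expected: Set[str], results: Dict[str, bool],
           elapsed: float, timeout: float) -> Decision:
    """Decide whether to tell the server to stop waiting.

    expected: every client the server would otherwise wait for.
    results:  client -> True if its notification utility succeeded,
              False if it failed; clients not listed are still pending.
    """
    reconnected = {c for c, ok in results.items() if ok}
    failed = {c for c, ok in results.items() if not ok}
    pending = expected - reconnected - failed

    if not pending:
        # Every client is accounted for: healthy ones have reconnected,
        # the rest are treated as dead (an assumption of this sketch).
        return Decision(True, f"{len(reconnected)} reconnected, "
                              f"{len(failed)} treated as dead")
    if elapsed >= timeout:
        return Decision(True, f"timeout with {len(pending)} still pending")
    return Decision(False, f"waiting on {len(pending)} clients")


clients = {"c1", "c2", "c3"}
print(decide(clients, {"c1": True, "c2": True}, 5.0, 60.0))
print(decide(clients, {"c1": True, "c2": True, "c3": False}, 5.0, 60.0))
```

The second call closes the window after 5 seconds rather than 60
because the failed notification already says c3 is not coming back,
which is information a fixed recovery window cannot use.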

-- 

        Cheers,
                   Eric
