From: Olivier AUDRY <oaudry@predical.fr>
To: Lokendra Rathour <lokendrarathour@gmail.com>,
ceph-devel@vger.kernel.org, dev@ceph.io, ceph-users@ceph.io
Subject: Re: [ceph-users] [ Ceph MDS MON Config Variables ] Failover Delay issue
Date: Mon, 03 May 2021 15:48:36 +0200
Message-ID: <d81929d41edfc41c645e49220e8644ed924d6fb0.camel@predical.fr>
In-Reply-To: <CAJm6b-741TRptPWOqoqEJG6m00auekTkcWUD+z3sxH1-34THgA@mail.gmail.com>
Hello,
Perhaps you should have more than one active MDS.
mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-a=up:active} 1 up:standby-replay
I have 3 active MDS daemons and one standby (in standby-replay).
I'm using Rook in Kubernetes for this setup.
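
To read a status line like the one above at a glance, here is a minimal Python sketch that counts MDS daemons per state. The function name `summarize_mds_status` and the regular expressions are illustrative assumptions based only on this one line, not on any documented Ceph output format.

```python
import re

def summarize_mds_status(line: str) -> dict:
    """Count MDS daemons per state from a one-line 'mds:' summary."""
    # Ranks inside the braces look like "0=cephfs-d=up:active".
    ranks = re.findall(r"(\d+)=([\w.-]+)=(up:[\w-]+)", line)
    # Daemons listed after the braces look like "1 up:standby-replay".
    tail = line.split("}", 1)[1] if "}" in line else ""
    extras = re.findall(r"(\d+)\s+(up:[\w-]+)", tail)

    counts = {}
    for _rank, _name, state in ranks:
        counts[state] = counts.get(state, 0) + 1
    for number, state in extras:
        counts[state] = counts.get(state, 0) + int(number)
    return counts

status = ("mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,"
          "2=cephfs-a=up:active} 1 up:standby-replay")
print(summarize_mds_status(status))
```

For my cluster this reports three daemons in up:active and one in up:standby-replay.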
oau
On Monday, 03 May 2021 at 19:06 +0530, Lokendra Rathour wrote:
> Hi Team,
> I was setting up a Ceph cluster with:
>
> - Node Details: 3 Mon, 2 MDS, 2 Mgr, 2 RGW
> - Deployment Type: Active Standby
> - Testing Mode: Failover of MDS Node
> - Setup: Octopus (15.2.7)
> - OS: CentOS 8.3
> - Hardware: HP
> - RAM: 128 GB on each node
> - OSD: 2 (1 TB each)
> - Operation: Normal I/O with a mkdir every second.
>
> *Test Case: Power off any active MDS node for failover to happen*
>
> *Observation:*
> We have observed that whenever an active MDS node goes down, it takes
> around *40 seconds* to activate the standby MDS node.
> On further checking the logs of the MDS node taking over, we have seen
> the delay break down as follows:
>
> 1. 10 second delay after which the Mon calls for a new Monitor election
>    1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 calling monitor election
> 2. 5 second delay until the new Monitor leader is elected
>    1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 is new leader, mons cephnode1,cephnode3 in quorum (ranks 0,2)
> 3. The additional beacon grace time that the system waits before it
>    enables standby MDS node activation (approx. delay of 19 seconds)
>    1. defaults: sudo ceph config get mon mds_beacon_grace
>       15.000000
>    2. sudo ceph config get mon mds_beacon_interval
>       5.000000
>    3. [log] - 2021-04-30T18:23:10.136+0530 7f4e3925c700 1 mon.cephnode2@1(leader).mds e776 no beacon from mds.0.771 (gid: 639443 addr: [v2:10.0.4.10:6800/2172152716,v1:10.0.4.10:6801/2172152716] state: up:active) *since 18.7951*
> 4. *In total it takes around 40 seconds to hand over and activate the
>    passive standby node.*
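
The itemized delays above can be added up to see how much of the reported 40 seconds they cover. This is only a rough sketch: it assumes the three phases run strictly one after another, which may not hold (the "since 18.7951" figure is measured from the last beacon and could overlap with the monitor election). The names `observed_phases` and `account_for` are made up for illustration.

```python
# Delays in seconds, copied from the observations quoted above.
observed_phases = {
    "wait before monitor election is called": 10.0,
    "monitor election until new leader": 5.0,
    "time since last MDS beacon (mon log)": 18.7951,
}
reported_total = 40.0  # "around 40 seconds" in the report

def account_for(phases, total):
    """Return (seconds itemized, seconds left over), assuming no overlap."""
    accounted = sum(phases.values())
    return accounted, total - accounted

accounted, remainder = account_for(observed_phases, reported_total)
print(f"itemized:     {accounted:.1f} s")
print(f"not itemized: {remainder:.1f} s")
```

Under that assumption roughly 34 of the 40 seconds are itemized, so a few seconds of the failover are spent somewhere not listed in the logs above.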
>
> *Query:*
>
> 1. Can these variables be configured? We have tried this, but are not
>    aware of the overall impact of these changes on the Ceph cluster.
>    1. By tuning these values we could reach a minimum time of 12
>       seconds in which the new active node comes up.
>    2. Values used to reach that time:
>       1. *mon_election_timeout* (default 5) - configured as 1
>       2. *mon_lease* (default 5) - configured as 2
>       3. *mds_beacon_grace* (default 15) - configured as 5
>       4. *mds_beacon_interval* (default 5) - configured as 1
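
To compare the tuned values with the defaults quoted above, here is a small Python sketch. It uses only the numbers from the message; the `settings` dictionary and the helper `beacons_per_grace` are illustrative names, and the "beacons per grace period" ratio is plain arithmetic on those numbers, not a statement about how Ceph evaluates them internally.

```python
# (default, tuned) values in seconds, as given in the message above.
settings = {
    "mon_election_timeout": (5.0, 1.0),
    "mon_lease": (5.0, 2.0),
    "mds_beacon_grace": (15.0, 5.0),
    "mds_beacon_interval": (5.0, 1.0),
}

def beacons_per_grace(grace: float, interval: float) -> float:
    """How many beacon intervals fit into one grace period."""
    return grace / interval

for name, (default, tuned) in settings.items():
    print(f"{name}: {default:g} -> {tuned:g} (reduced {default / tuned:g}x)")

default_ratio = beacons_per_grace(settings["mds_beacon_grace"][0],
                                  settings["mds_beacon_interval"][0])
tuned_ratio = beacons_per_grace(settings["mds_beacon_grace"][1],
                                settings["mds_beacon_interval"][1])
print(f"beacon intervals per grace period: default {default_ratio:g}, tuned {tuned_ratio:g}")
```

With these numbers the grace period spans 3 beacon intervals by default and 5 after tuning, so the tuned setup sends beacons more often relative to the shorter grace period.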
>
> We need to tune this setup to get the failover duration as low as 5-7
> seconds.
>
> Please suggest/support and share your inputs. My setup is ready, and we
> are already testing multiple scenarios so that we can achieve the
> minimum failover duration.
>