From: Sam Lang <samlang@gmail.com>
To: ceph-devel@vger.kernel.org
Subject: heartbeat logic
Date: Wed, 03 Aug 2011 22:10:31 -0500
Message-ID: <4E3A0DA7.8090409@gmail.com>
During startup of an osd cluster with 37 osds, within the first few
seconds I see osds getting marked down, even though the osd processes
remain running and seem to be just fine. The up count fluctuates for a
while but seems to stabilize eventually at around 30 up osds, while 7 or
so remain down, and eventually get marked out.
With debugging enabled, I've tracked it down to this bit of logic in
OSD.cc:1502 (stable branch):
------snip------
  // ignore (and mark down connection for) old messages
  epoch_t e = m->map_epoch;
  if (!e)
    e = m->peer_as_of_epoch;
  if (e <= osdmap->get_epoch() &&
      ((heartbeat_to.count(from) == 0 && heartbeat_from.count(from) == 0) ||
       heartbeat_con[from] != m->get_connection())) {
    dout(5) << "handle_osd_ping marking down peer " << m->get_source_inst()
            << " after old message from epoch " << e
            << " <= current " << osdmap->get_epoch() << dendl;
    heartbeat_messenger->mark_down(m->get_connection());
    goto out;
  }
--------------------
It looks as though the osd getting marked down is sending a heartbeat
ping to another osd, at which point that osd marks it as down. It's not
clear to me why that happens. Is it because connections are getting
dropped and ports are changing?
In any case, the if conditional evaluates to true, so the receiving osd
marks down the osd that just sent it a heartbeat ping.
I modified the debug output to show the values for
heartbeat_to.count(from) and heartbeat_from.count(from), as well as
heartbeat_con[from] and m->get_connection(). The cases where osds are
marked down are when the ping message's epoch and the osdmap epoch are
the same (usually around 16), and the counts are always zero, suggesting
that this is the first heartbeat from osdA to osdB. Even if they
weren't zero, heartbeat_con[from] is null and doesn't get set until
later, so the conditional would succeed anyway. Can someone explain the
purpose and reasoning behind this bit of code? If I just remove the
second part of the conditional, will bad things happen? Any help is
greatly appreciated.
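To make the reasoning above concrete, here is a minimal standalone sketch
of the conditional. It is not the OSD.cc code: HeartbeatState, Connection
and would_mark_down are hypothetical names made up for illustration, and
heartbeat_to/heartbeat_from are modeled as plain sets of osd ids, which is
an assumption about their shape. Only the boolean structure is taken from
the snippet quoted earlier.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical stand-ins for the OSD state; these are NOT the real Ceph types.
using epoch_t = unsigned;
struct Connection {};

struct HeartbeatState {
  epoch_t current_epoch = 0;                 // stands in for osdmap->get_epoch()
  std::set<int> heartbeat_to;                // assumed: ids of peers we ping
  std::set<int> heartbeat_from;              // assumed: ids of peers that ping us
  std::map<int, Connection*> heartbeat_con;  // peer id -> known connection
};

// Same boolean shape as the quoted conditional: returns true when the
// sender's connection would be marked down.
bool would_mark_down(HeartbeatState& s, int from, epoch_t e, Connection* con) {
  const bool unknown_peer = s.heartbeat_to.count(from) == 0 &&
                            s.heartbeat_from.count(from) == 0;
  // std::map::operator[] default-inserts a null pointer when no entry
  // exists, so a peer with no recorded connection never matches `con`.
  const bool con_mismatch = s.heartbeat_con[from] != con;
  return e <= s.current_epoch && (unknown_peer || con_mismatch);
}
```

With current_epoch set to 16, would_mark_down returns true for a first ping
at epoch 16 from a peer in neither set, and still returns true once the
peer is in heartbeat_from as long as its heartbeat_con entry is null. It
returns false only when the recorded connection matches or the message
epoch is newer than the current one.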
Thanks,
-sam