From: Wido den Hollander <wido@widodh.nl>
To: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>, ceph-devel@vger.kernel.org
Subject: Re: handling fs errors
Date: Tue, 22 Jan 2013 14:12:23 +0100
Message-ID: <50FE9037.8040501@widodh.nl>
In-Reply-To: <CAC-hyiHnGXrr5hiZF__BP_uWZeSrhFk_psphAcRPwb5qGa69cw@mail.gmail.com>



On 01/22/2013 07:12 AM, Yehuda Sadeh wrote:
> On Mon, Jan 21, 2013 at 10:05 PM, Sage Weil <sage@inktank.com> wrote:
>> We observed an interesting situation over the weekend.  The XFS volume
>> ceph-osd locked up (hung in xfs_ilock) for somewhere between 2 and 4
>> minutes.  After 3 minutes (180s), ceph-osd gave up waiting and committed
>> suicide.  XFS seemed to unwedge itself a bit after that, as the daemon was
>> able to restart and continue.
>>
>> The problem is that during that 180s the OSD was claiming to be alive but
>> not able to do any IO.  That heartbeat check is meant as a sanity check
>> against a wedged kernel, but waiting so long meant that the ceph-osd
>> wasn't failed by the cluster quickly enough and client IO stalled.
>>
>> We could simply change that timeout to something close to the heartbeat
>> interval (currently default is 20s).  That will make ceph-osd much more
>> sensitive to fs stalls that may be transient (high load, whatever).
>>
>> Another option would be to make the osd heartbeat replies conditional on
>> whether the internal heartbeat is healthy.  Then the heartbeat warnings
>> could start at 10-20s, ping replies would pause, but the suicide could
>> still be 180s out.  If the stall is short-lived, pings will continue, the
>> osd will mark itself back up (if it was marked down) and continue.
>>
>> Having written that out, the last option sounds like the obvious choice.
>> Any other thoughts?
>>
>
> Another option would be to have the osd reply to the ping with some
> health description.
>

Looking to the future, with more monitoring, that might be a good idea.

If an OSD simply stops sending heartbeats when its internal conditions
aren't met, you don't know what's going on.

If the heartbeat carried metadata saying "I'm here, but not in such
good shape", that could be reported back to the monitors.

Monitoring tools could read this out and send notifications/alerts
wherever they are needed.
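
To make the idea concrete, here is a rough sketch in Python of a ping
reply that carries a health field instead of going silent. Everything in
it is hypothetical: the names (Health, PingReply, build_ping_reply) and
the 20s warning threshold are assumptions for illustration only and are
not taken from the ceph-osd code.

```python
from dataclasses import dataclass
from enum import Enum


class Health(Enum):
    OK = "ok"
    DEGRADED = "degraded"  # alive, but the internal heartbeat is overdue


@dataclass
class PingReply:
    osd_id: int
    health: Health
    stalled_for: float  # seconds since the internal heartbeat last ticked


# Assumed value, picked from the 10-20s range mentioned earlier in the thread.
WARN_AFTER = 20.0


def build_ping_reply(osd_id: int, stalled_for: float) -> PingReply:
    """Always answer the ping, but say how healthy we think we are."""
    health = Health.OK if stalled_for < WARN_AFTER else Health.DEGRADED
    return PingReply(osd_id=osd_id, health=health, stalled_for=stalled_for)
```

A peer or monitor receiving such a reply could then tell "no answer at
all" apart from "answering, but stalled for 45 seconds".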

Right now we assume I/O stalls completely, but the metadata could also
report high latency. If the latency goes over threshold X you can still
mark the OSD out temporarily, since it will impact clients, but some
information sent to the monitors might be useful.
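
A minimal sketch of that latency check, again in Python and purely
illustrative: assess_latency and the fields it returns are made-up names,
and the threshold X is left as a parameter because no value has been
proposed.

```python
def assess_latency(latency_ms: float, threshold_ms: float) -> dict:
    """Decide whether to mark the OSD out, and keep the numbers for the monitor."""
    over = latency_ms > threshold_ms
    return {
        "latency_ms": latency_ms,      # what was measured
        "threshold_ms": threshold_ms,  # the "threshold X" (hypothetical value)
        "mark_out": over,              # temporary mark-out, since clients are affected
        "note": "latency over threshold" if over else "ok",
    }
```

The point is that the decision (mark_out) and the reason (the measured
latency) travel together, so the monitor is not left guessing.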

Wido

> Yehuda

Thread overview: 10+ messages
2013-01-22  6:05 handling fs errors Sage Weil
2013-01-22  6:12 ` Yehuda Sadeh
2013-01-22 13:12   ` Wido den Hollander [this message]
2013-01-22 17:59     ` Gregory Farnum
2013-01-22  7:25 ` Andrey Korolyov
2013-01-22  8:09 ` Chen, Xiaoxi
2013-01-22 18:29 ` Dimitri Maziuk
2013-01-22 23:09   ` Sage Weil
2013-01-22 22:20 ` Andrey Korolyov
2013-01-22 23:08   ` Sage Weil
