From: Martin Mailand <martin@tuxadero.com>
To: Sage Weil <sage@newdream.net>
Cc: ceph-devel@vger.kernel.org
Subject: Re: OSD::disk_tp timeout
Date: Sun, 09 Oct 2011 00:15:24 +0200
Message-ID: <4E90CB7C.3010304@tuxadero.com>
In-Reply-To: <Pine.LNX.4.64.1110081427160.26432@cobra.newdream.net>

Hi,
I am using v3.1-rc9, so the fix is in there. Maybe I can narrow it down
a bit further.

Best Regards,
  martin

Sage Weil wrote:
> Hi Christian,
> 
> On Sat, 8 Oct 2011, Christian Brunner wrote:
>> Hi,
>>
>> I upgraded ceph from 0.32 to 0.36 yesterday. Now I have a totally
>> screwed ceph cluster. :(
>>
>> What bugs me most is the fact that OSDs become unresponsive
>> frequently. The process is eating a lot of cpu and I can see the
> 
> What version of btrfs are you running?  This sounds a bit like the bug 
> fixed by this patch:
> 
> http://www.spinics.net/lists/linux-btrfs/msg12627.html
> 
> (That was just merged into mainline this week.)
> 
>> following messages in the log:
>>
>> Oct  8 22:30:05 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
>> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>> Oct  8 22:30:10 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
>> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>> Oct  8 22:30:15 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
>> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>> Oct  8 22:30:20 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
>> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>> Oct  8 22:30:25 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
>> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>> Oct  8 22:30:30 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
>> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>>
>> Do you have any idea what to do about that?
> 
> Those messages just mean that a thread in the disk threadpool (which is 
> doing all the writes to btrfs) is blocked/stopped.
> 
> sage
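
To illustrate what a message like "heartbeat_map is_healthy '...' had
timed out after 60" is reporting, here is a minimal watchdog sketch in
Python. It is not Ceph's implementation; the class name HeartbeatMap,
the touch() method, and the injectable clock are assumptions made for
illustration only. The idea: each worker thread records when it last
made progress, and a health check flags any worker whose last record is
older than its grace period.

```python
import time


class HeartbeatMap:
    """Toy watchdog (hypothetical, not Ceph's code)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the behaviour can be tested
        # without actually sleeping.
        self._clock = clock
        # worker name -> (time of last progress, grace in seconds)
        self._last_touch = {}

    def touch(self, name, grace):
        """Called by a worker each time it makes progress."""
        self._last_touch[name] = (self._clock(), grace)

    def is_healthy(self):
        """Return (healthy, messages) for all registered workers."""
        now = self._clock()
        messages = []
        for name, (stamp, grace) in self._last_touch.items():
            if now - stamp > grace:
                messages.append(
                    f"heartbeat_map is_healthy '{name}' "
                    f"had timed out after {grace}"
                )
        return (not messages), messages
```

A worker that is blocked in a write never calls touch() again, so every
later health check repeats the same warning, which matches the pattern
of identical log lines every few seconds quoted above.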
