From: Martin Mailand <martin@tuxadero.com>
To: chb@muc.de
Cc: ceph-devel@vger.kernel.org
Subject: Re: OSD::disk_tp timeout
Date: Sat, 08 Oct 2011 23:04:27 +0200
Message-ID: <4E90BADB.2000908@tuxadero.com>
In-Reply-To: <CAO47_-_x3QqP4qmTUAFB5SrObaBuzaJUa4dh+p+mdTHzKGeojg@mail.gmail.com>
Hi Christian,
if I remember correctly, you are using Ceph with a qemu-kvm setup?
After the last Ceph update, the load average on the OSDs doubled and
the performance of the KVM machines became bad.
The really weird thing is that the cluster "needs" around 30 minutes to get
into this state. After I restart the OSDs everything is fine, then
after a while the load on the OSD nodes builds up again. Most of the load
is produced by btrfs kernel processes in the deferred state.
I am not sure whether I have the same problem as you, as I do not get any timeouts.
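To check which processes are behind the load, something like the following might help. It is only a rough sketch, and it assumes that "deferred state" means the uninterruptible sleep state that Linux reports as `D` (such tasks count towards the load average). The helper names `parse_stat` and `count_d_state` are made up for this example.

```python
import os
from collections import Counter


def parse_stat(stat_line):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' instead of on whitespace.
    left, right = stat_line.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    state = right.split()[0]
    return comm, state


def count_d_state(proc_root="/proc"):
    """Count tasks in state D (uninterruptible sleep), grouped by name."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited in the meantime or is unreadable
        if state == "D":
            counts[comm] += 1
    return counts


if __name__ == "__main__":
    for comm, n in count_d_state().most_common():
        print(n, comm)
```

Running this a few times while the load builds up should show whether the same btrfs kernel threads keep appearing.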
Best Regards,
martin
Christian Brunner wrote:
> Hi,
>
> I upgraded Ceph from 0.32 to 0.36 yesterday. Now I have a totally
> screwed Ceph cluster. :(
>
> What bugs me most is the fact that OSDs become unresponsive
> frequently. The process is eating a lot of CPU and I can see the
> following messages in the log:
>
> Oct 8 22:30:05 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> Oct 8 22:30:10 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> Oct 8 22:30:15 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> Oct 8 22:30:20 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> Oct 8 22:30:25 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
> Oct 8 22:30:30 os00 osd.000[31688]: 7fe0f3b9c700 heartbeat_map
> is_healthy 'OSD::disk_tp thread 0x7fe0e527e700' had timed out after 60
>
> Do you have any idea what to do about that?
>
> Regards,
> Christian
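In the quoted log the same thread (0x7fe0e527e700) is reported every 5 seconds, which suggests one disk_tp thread staying stuck rather than many threads timing out briefly. A minimal sketch for summarizing such messages from a longer syslog file could look like this. The regular expression is derived only from the six lines quoted above and assumes each message sits on a single line in the real log; the function name `summarize` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern modelled on the quoted syslog lines (unwrapped to one line each).
PATTERN = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<daemon>osd\.\d+)\[\d+\]:.*heartbeat_map\s+is_healthy\s+"
    r"'(?P<pool>\S+) thread (?P<thread>0x[0-9a-f]+)' "
    r"had timed out after (?P<secs>\d+)"
)


def summarize(lines):
    """Group heartbeat timeouts by (daemon, thread pool, thread id)."""
    summary = defaultdict(lambda: {"count": 0, "first": None, "last": None})
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # not a heartbeat_map timeout message
        key = (m.group("daemon"), m.group("pool"), m.group("thread"))
        entry = summary[key]
        entry["count"] += 1
        if entry["first"] is None:
            entry["first"] = m.group("ts")
        entry["last"] = m.group("ts")
    return dict(summary)


if __name__ == "__main__":
    import sys

    for key, info in summarize(sys.stdin).items():
        print(key, info["count"], info["first"], "->", info["last"])
```

Feeding the OSD host's syslog to the script on standard input gives one line per stuck thread with the first and last time it was reported.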
Thread overview:
2011-10-08 20:37 OSD::disk_tp timeout Christian Brunner
2011-10-08 21:04 ` Martin Mailand [this message]
2011-10-08 21:28 ` Sage Weil
2011-10-08 22:15 ` Martin Mailand
2011-10-08 22:44 ` Sage Weil
2011-10-09 6:02 ` Christian Brunner