From: Chris Dunlop <chris@onthe.net.au>
To: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: Mon losing touch with OSDs
Date: Fri, 1 Mar 2013 13:02:39 +1100
Message-ID: <20130301020239.GA16236@onthe.net.au>
In-Reply-To: <20130223020253.GA11899@onthe.net.au>
On Sat, Feb 23, 2013 at 01:02:53PM +1100, Chris Dunlop wrote:
> On Fri, Feb 22, 2013 at 05:52:11PM -0800, Sage Weil wrote:
>> On Sat, 23 Feb 2013, Chris Dunlop wrote:
>>> On Fri, Feb 22, 2013 at 05:30:04PM -0800, Sage Weil wrote:
>>>> On Sat, 23 Feb 2013, Chris Dunlop wrote:
>>>>> On Fri, Feb 22, 2013 at 04:13:21PM -0800, Sage Weil wrote:
>>>>>> On Sat, 23 Feb 2013, Chris Dunlop wrote:
>>>>>>> On Fri, Feb 22, 2013 at 03:43:22PM -0800, Sage Weil wrote:
>>>>>>>> On Sat, 23 Feb 2013, Chris Dunlop wrote:
>>>>>>>>> On Fri, Feb 22, 2013 at 01:57:32PM -0800, Sage Weil wrote:
>>>>>>>>>> I just looked at the logs. I can't tell what happened to cause that 10
>>>>>>>>>> second delay.. strangely, messages were passing from 0 -> 1, but nothing
>>>>>>>>>> came back from 1 -> 0 (although 1 was queuing, if not sending, them).
>>>>>>>
>>>>>>> Is there any way of telling where they were delayed, i.e. in the 1's output
>>>>>>> queue or 0's input queue?
>>>>>>
>>>>>> Yeah, if you bump it up to 'debug ms = 20'. Be aware that that will
>>>>>> generate a lot of logging, though.
>>>>>
>>>>> I really don't want to load the system with too much logging, but I'm happy
>>>>> modifying code... Are there specific interesting debug outputs which I can
>>>>> modify so they're output under "ms = 1"?
>>>>
>>>> I'm basically interested in everything in writer() and write_message(),
>>>> and reader() and read_message()...
>>>
>>> Like this?
>>
>> Yeah. You could do 2 instead of 1 so you can turn it down. I suspect
>> that this is the lion's share of what debug 20 will spam to the log, but
>> hopefully the load is manageable!
>
> Good idea on the '2'. I'll get that installed and wait for it to happen again.
FYI...

To avoid running out of disk space for the massive logs, I
started using logrotate on the ceph logs every two hours, which
does a 'service ceph reload' to re-open the log files.
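
For reference, a two-hourly rotation along those lines could look roughly like the sketch below. This is only an illustration: the log path, the retention count, and the cron entry shown in the trailing comment are assumptions, not the configuration actually in use here.

```
# Illustrative logrotate stanza; the file location /etc/logrotate.d/ceph
# and the /var/log/ceph path are assumed.
/var/log/ceph/*.log {
    # Assumed retention: one week of two-hourly logs (12 per day * 7 days).
    rotate 84
    compress
    missingok
    sharedscripts
    postrotate
        # Ask the daemons to re-open their log files after rotation.
        service ceph reload >/dev/null 2>&1 || true
    endscript
}

# logrotate has no "every two hours" directive, so the schedule would come
# from cron. Assumed entry in a hypothetical /etc/cron.d/ceph-logrotate:
#   0 */2 * * * root /usr/sbin/logrotate -f /etc/logrotate.d/ceph
```

The point of interest is the postrotate step: each rotation makes the daemons close and re-open their logs, so with this schedule that happens twelve times a day instead of once.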

In the week since doing that I haven't seen any 'slow requests'
at all (the load has stayed the same as before the change),
which means the issue with the osds dropping out, then the
system not recovering properly, also hasn't happened.

That's a bit suspicious, no?

I've now put the log dirs on each machine on their own 2TB
partition and reverted back to the default daily rotates.

And once more we're waiting... Godot, is that you?

Chris