From: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
To: Gregory Farnum <greg@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: Possible memory leak in mon?
Date: Fri, 18 May 2012 19:37:54 +0930
Message-ID: <4FB61F7A.2080602@bashkirtsev.com>
In-Reply-To: <CAPYLRzgpZmASDE=3XYF+BBydtOopN19nOk8Tkmqk_-EdscmrMg@mail.gmail.com>
On 16/05/12 02:43, Gregory Farnum wrote:
> On Sun, May 6, 2012 at 5:53 PM, Vladimir Bashkirtsev
> <vladimir@bashkirtsev.com> wrote:
>> On 03/05/12 16:23, Greg Farnum wrote:
>>> On Wednesday, May 2, 2012 at 11:24 PM, Vladimir Bashkirtsev wrote:
>>>> Greg,
>>>>
>>>> Apologies for multiple emails: my mail server is backed by ceph now and
>>>> it struggled this morning (separate issue). So my mail server reported
>>>> back to my mailer that sending of email failed when obviously it was not
>>>> the case.
>>> Interesting — I presume you're using the file system? That's not something
>>> we've heard of anybody doing with Ceph before. :)
>>>
>>>> [root@gamma ~]# ceph -s
>>>> 2012-05-03 15:46:55.640951 mds e2666: 1/1/1 up {0=1=up:active}, 1
>>>> up:standby
>>>> 2012-05-03 15:46:55.647106 osd e10728: 6 osds: 6 up, 6 in
>>>> 2012-05-03 15:46:55.654052 log 2012-05-03 15:46:26.557084 mon.2
>>>> 172.16.64.202:6789/0 2878 : [INF] mon.2 calling new monitor election
>>>> 2012-05-03 15:46:55.654425 mon e7: 3 mons at
>>>> {0=172.16.64.200:6789/0,1=172.16.64.201:6789/0,2=172.16.64.202:6789/0}
>>>> 2012-05-03 15:46:56.961624 pg v1251669: 600 pgs: 2 creating, 598
>>>> active+clean; 309 GB data, 963 GB used, 1098 GB / 2145 GB avail
>>>>
>>>> Logging is on but there is nothing obvious in there: the logs are quite
>>>> small. A number of ceph health entries are logged (ceph is monitored by
>>>> nagios, so this record appears every 5 minutes), and the monitors
>>>> periodically call for an election (at varying intervals of 1 to 15
>>>> minutes, as it looks). That's it.
>>> Hrm. Generally speaking the monitors shouldn't call for elections unless
>>> something changes (one of them crashes) or the leader monitor is slowing
>>> down.
>>> Can you increase the debug_mon to 20, the debug_ms to 1, and post one of
>>> the logs somewhere? The "Live Debugging" section of
>>> http://ceph.com/wiki/Debugging should give you what you need. :)
>> Here's the logs and core dumps:
>> http://www.bashkirtsev.com/logs-2012-05-07.tar.bz2
>>
>> The mons have grown to 1.2GB and 2GB of memory.
> When I look at the logs for mon.0, I see that there are a lot of
> places where mon.0 takes tens of seconds to write something to disk.
> If the disk is just about full, that might make sense (many
> filesystems don't handle a nearly-full disk very well at all); and a
> monitor getting stuck for that long could definitely explain why they
> start using up so much memory (they're buffering messages). I suspect
> that there's not anything particularly wrong here, unless I'm
> misunderstanding the story you're telling me. :) Have you noticed this
> problem when the monitor's disk partition isn't nearly full?
> -Greg
I have recreated the conditions under which mon starts to consume more
memory, and everything is in line with your suspicions. When the disk gets
almost full, mon slows down and finally crashes so badly that I cannot
recover it. I am then forced to destroy the mon altogether and create a
new one instead.
Long story short: the docs/wiki should recommend NOT keeping monfs on the
same partition as the ceph log (which can grow quickly), and preferably
keeping it on a separate partition altogether.
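A rough sketch of how this could be checked automatically (for example from
a nagios plugin): it warns when the mon data directory shares a filesystem
with the log directory, or when the mon filesystem is running low on space.
The function names, the 20% threshold and the two paths are my own
assumptions for illustration, not anything ceph ships; substitute the
locations from your ceph.conf.

```python
import os
import shutil


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem."""
    # st_dev identifies the device backing the filesystem that holds the path.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def check_mon_disk(mon_dir, log_dir, min_free=0.2):
    """Collect warnings as strings; an empty list means nothing was flagged."""
    warnings = []
    if same_filesystem(mon_dir, log_dir):
        warnings.append(f"{mon_dir} and {log_dir} share a filesystem")
    free = free_fraction(mon_dir)
    if free < min_free:
        warnings.append(f"{mon_dir} has only {free:.0%} free")
    return warnings


if __name__ == "__main__":
    # Hypothetical paths: replace with the mon data and log directories in use.
    mon_dir, log_dir = "/var/lib/ceph/mon", "/var/log/ceph"
    if os.path.isdir(mon_dir) and os.path.isdir(log_dir):
        for line in check_mon_disk(mon_dir, log_dir):
            print("WARNING:", line)
```

Comparing st_dev only tells you whether the two directories are on the same
filesystem; it says nothing about how large the mon partition ought to be.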
At the same time, this raises another question: what is the recommended
partition size for monfs?