From: Dan Mick <dan.mick@inktank.com>
To: Sage Weil <sage@inktank.com>
Cc: "Sébastien Han" <han.sebastien@gmail.com>,
	"Loic Dachary" <loic@dachary.org>,
	"Sylvain Munaut" <s.munaut@whatever-company.com>,
	ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: [0.48.3] OSD memory leak when scrubbing
Date: Mon, 04 Feb 2013 13:03:12 -0800
Message-ID: <51102210.2040300@inktank.com>
In-Reply-To: <alpine.DEB.2.00.1302040930540.27333@cobra.newdream.net>

...and/or is your core path set to something unusual, or is one of the
core-trapping mechanisms turned on?
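
On Linux, both questions can be answered by reading the core limit the running daemon actually inherited and the kernel's core pattern. The sketch below is only an illustration: the helper names are made up for this example, and it assumes the standard /proc/<pid>/limits and /proc/sys/kernel/core_pattern files are available.

```python
def core_limits(limits_text):
    """Return (soft, hard) core-size limits from the text of /proc/<pid>/limits."""
    prefix = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return soft, hard
    return None


def core_is_piped(core_pattern):
    """True when the kernel pipes cores to a helper program instead of a file."""
    return core_pattern.strip().startswith("|")


def report(pid):
    """Print the core limit and core pattern that apply to a running process."""
    with open(f"/proc/{pid}/limits") as f:
        limits = core_limits(f.read())
    with open("/proc/sys/kernel/core_pattern") as f:
        pattern = f.read().strip()
    print(f"pid {pid}: core limits (soft, hard) = {limits}")
    print(f"core_pattern = {pattern!r}")
    if limits and limits[0] == "0":
        print("soft core limit is 0 for this process")
    if core_is_piped(pattern):
        print("cores are piped to a helper, not written to the working directory")
```

Calling report() with the pid of the ceph-osd process shows the limits of the daemon itself, which can differ from those of the shell where ulimit -c was run.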

On 02/04/2013 11:29 AM, Sage Weil wrote:
> On Mon, 4 Feb 2013, Sébastien Han wrote:
>> Hum just tried several times on my test cluster and I can't get any
>> core dump. Does Ceph commit suicide or something? Is it expected
>> behavior?
>
> SIGSEGV should trigger the usual path that dumps a stack trace and then
> dumps core.  Was your ulimit -c set before the daemon was started?
>
> sage
>
>
>
>> --
>> Regards,
>> Sébastien Han.
>>
>>
>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>>> Hi Loïc,
>>>
>>> Thanks for bringing our discussion to the ML. I'll check that tomorrow :-).
>>>
>>> Cheers
>>> --
>>> Regards,
>>> Sébastien Han.
>>>
>>>
>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>>>> Hi Loïc,
>>>>
>>>> Thanks for bringing our discussion to the ML. I'll check that tomorrow :-).
>>>>
>>>> Cheers
>>>>
>>>> --
>>>> Regards,
>>>> Sébastien Han.
>>>>
>>>>
>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when it
>>>>> grows too much could be amended to make the OSD dump core instead of just
>>>>> killing & restarting it. The binary + core could probably be used to figure
>>>>> out where the leak is.
>>>>>
>>>>> You should make sure the OSD current working directory is in a file system
>>>>> with enough free disk space to accommodate the dump and set
>>>>>
>>>>> ulimit -c unlimited
>>>>>
>>>>> before running it (your system default is probably ulimit -c 0, which
>>>>> inhibits core dumps). When you detect that the OSD has grown too much,
>>>>> kill it with
>>>>>
>>>>> kill -SEGV $pid
>>>>>
>>>>> and upload the core found in the working directory, together with the
>>>>> binary, to a public place. If the osd binary is compiled with -g but without
>>>>> changing the -O settings, you should have a larger binary file but no
>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>> easier with the debugging symbols.
>>>>>
>>>>> My 2cts
>>>>>
>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I disabled scrubbing using
>>>>>>>
>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>
>>>>>>> and the leak seems to be gone.
>>>>>>>
>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD memory
>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>> small blocks.
>>>>>>>
>>>>>>> Of course I assume disabling scrubbing is not a long-term solution and
>>>>>>> I should re-enable it ... (how do I do that, btw? What were the
>>>>>>> default values for those parameters?)
>>>>>>
>>>>>> It depends on the exact commit you're on.  You can see the defaults if
>>>>>> you do
>>>>>>
>>>>>>   ceph-osd --show-config | grep osd_scrub
>>>>>>
>>>>>> Thanks for testing this... I have a few other ideas to try to reproduce.
>>>>>>
>>>>>> sage
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>> --
>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>
>>>>
>>
>>
>
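
Loic's suggestion quoted above (raise the core limit before starting the daemon, then send SIGSEGV once the OSD has grown too large) could be sketched roughly as follows. This is not the script discussed at FOSDEM, which is not shown in this thread; the function names and the memory threshold are hypothetical, and the sketch assumes a Linux /proc/<pid>/status file with a VmRSS line.

```python
import os
import signal


def rss_kib(status_text):
    """Extract VmRSS (resident set size, in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def over_limit(status_text, limit_kib):
    """True when the reported resident set size exceeds limit_kib."""
    rss = rss_kib(status_text)
    return rss is not None and rss > limit_kib


def dump_if_too_big(pid, limit_kib):
    """Send SIGSEGV (like `kill -SEGV $pid`) if the process is over the limit."""
    with open(f"/proc/{pid}/status") as f:
        status = f.read()
    if over_limit(status, limit_kib):
        os.kill(pid, signal.SIGSEGV)
        return True
    return False
```

A core file only results if the limit was raised with ulimit -c unlimited before the daemon was started, as Sage points out; the signal by itself does not change that.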
