* Re: OSD memory leaks?
  2012-12-17  8:28 ` Fwd: " Sébastien Han
@ 2012-12-17 18:12   ` Samuel Just
       [not found]     ` <CAOLwVUn5VbH1P=0wu-Oxb1bSKpaQfC6uQ5012wvPc7bvz606JA@mail.gmail.com>
  0 siblings, 1 reply; 48+ messages in thread
From: Samuel Just @ 2012-12-17 18:12 UTC
  To: Sébastien Han; +Cc: ceph-devel

Are you having network hiccups?  There was a bug noticed recently that
could cause a memory leak if nodes are being marked up and down.
-Sam

On Mon, Dec 17, 2012 at 12:28 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> Hi guys,
>
> Today, looking at my graphs, I noticed that one of my 4 Ceph nodes uses a
> lot of memory. It keeps growing and growing.
> See the graph attached to this mail.
> I run 0.48.2 on Ubuntu 12.04.
>
> The other nodes also grow, but more slowly than the first one.
>
> I'm not quite sure what information I should provide, so
> let me know. The only thing I can say is that the load hasn't
> increased that much this week. It seems to be consuming memory and not
> giving it back.
>
> Thank you in advance.
>
> --
> Regards,
> Sébastien Han.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: OSD memory leaks?
       [not found]     ` <CAOLwVUn5VbH1P=0wu-Oxb1bSKpaQfC6uQ5012wvPc7bvz606JA@mail.gmail.com>
@ 2012-12-17 22:41       ` Sébastien Han
  2012-12-17 22:55         ` Samuel Just
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2012-12-17 22:41 UTC
  To: Samuel Just; +Cc: ceph-devel

Hi,

No, I don't see anything abnormal in the network stats, and I don't see
anything in the logs... :(
The weird thing is that one node out of 4 seems to take way more memory
than the others...

--
Regards,
Sébastien Han.



* Re: OSD memory leaks?
  2012-12-17 22:41       ` Sébastien Han
@ 2012-12-17 22:55         ` Samuel Just
  2012-12-18 17:21           ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Samuel Just @ 2012-12-17 22:55 UTC
  To: Sébastien Han; +Cc: ceph-devel

What is the workload like?
-Sam


* Re: OSD memory leaks?
  2012-12-17 22:55         ` Samuel Just
@ 2012-12-18 17:21           ` Sébastien Han
  2012-12-19 16:37             ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2012-12-18 17:21 UTC
  To: Samuel Just; +Cc: ceph-devel

Nothing extraordinary...

Kernel logs from my clients are full of "libceph: osd4
172.20.11.32:6801 socket closed"

I saw this somewhere on the tracker.

Is this harmful?

Thanks.

--
Regards,
Sébastien Han.




* Re: OSD memory leaks?
  2012-12-18 17:21           ` Sébastien Han
@ 2012-12-19 16:37             ` Sébastien Han
  2012-12-19 21:43               ` Samuel Just
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2012-12-19 16:37 UTC
  To: Samuel Just; +Cc: ceph-devel

No more suggestions? :(
--
Regards,
Sébastien Han.



* Re: OSD memory leaks?
  2012-12-19 16:37             ` Sébastien Han
@ 2012-12-19 21:43               ` Samuel Just
  2013-01-04 15:20                 ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Samuel Just @ 2012-12-19 21:43 UTC
  To: Sébastien Han; +Cc: ceph-devel

Sorry, it's been very busy.  The next step would be to try to get a heap
dump.  You can start a heap profile on osd N by:

ceph osd tell N heap start_profiler

and you can get it to dump the collected profile using

ceph osd tell N heap dump

The dumps should show up in the osd log directory.

Assuming the heap profiler is working correctly, you can look at the
dump using pprof in google-perftools.
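
For repeated runs, the two commands above could be generated from a small
script. This is only a sketch under assumptions: the helper names are
hypothetical, the "ceph osd tell N heap ..." forms are the ones quoted
above, and the "pprof --text <binary> <dump>" call assumes the
google-perftools pprof script is on the PATH (some distributions install
it under another name). The binary and dump paths are left as parameters
because they depend on the installation.

```python
def heap_profile_commands(osd_id):
    """Return the two commands quoted above for one OSD, as argv lists."""
    base = ["ceph", "osd", "tell", str(osd_id), "heap"]
    return [base + ["start_profiler"], base + ["dump"]]


def pprof_command(osd_binary, dump_file, pprof="pprof"):
    """Build a text-report pprof call.

    Assumption: pprof takes the profiled binary first and the heap dump
    second; both paths must be supplied by the caller.
    """
    return [pprof, "--text", osd_binary, dump_file]


# Print the commands for osd 0 instead of running them.
for argv in heap_profile_commands(0):
    print(" ".join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to
run on a machine without a cluster; passing each list to subprocess.run
would execute it.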


* Re: OSD memory leaks?
  2012-12-19 21:43               ` Samuel Just
@ 2013-01-04 15:20                 ` Sébastien Han
  0 siblings, 0 replies; 48+ messages in thread
From: Sébastien Han @ 2013-01-04 15:20 UTC
  To: Samuel Just; +Cc: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 6043 bytes --]

Hi Sam,

Thanks for your answer and sorry for the late reply.

Unfortunately I can't get anything useful out of the profiler. Actually I
do get output, but I guess it doesn't show what it is supposed to show... I
will keep trying. Anyway, yesterday I thought that the problem might
be due to overuse of some OSDs. I was thinking that the
distribution of primary OSDs might be uneven, which could have
explained why the memory leaks are larger on some servers.
In the end, the distribution seems even, but while looking at the pg
dump I found something interesting in the scrub column: the timestamps
of the last scrubbing operations match the times shown on the
graph.

After this, I did some calculations: I compared the total number of
scrubbing operations with the number performed during the time range
where the memory leaks occurred.
First of all, here is my setup:

root@c2-ceph-01 ~ # ceph osd tree
dumped osdmap tree epoch 859
# id    weight  type name                       up/down reweight
-1      12      pool default
-3      12              rack lc2_rack33
-2      3                       host c2-ceph-01
0       1                               osd.0   up      1
1       1                               osd.1   up      1
2       1                               osd.2   up      1
-4      3                       host c2-ceph-04
10      1                               osd.10  up      1
11      1                               osd.11  up      1
9       1                               osd.9   up      1
-5      3                       host c2-ceph-02
3       1                               osd.3   up      1
4       1                               osd.4   up      1
5       1                               osd.5   up      1
-6      3                       host c2-ceph-03
6       1                               osd.6   up      1
7       1                               osd.7   up      1
8       1                               osd.8   up      1


And here are the results:

* Ceph node 1, which has the largest memory leak, performed 1608 scrubs
in total and 1059 during the time range where the memory leaks occurred
* Ceph node 2: 1168 in total and 776 during the time range where the
memory leaks occurred
* Ceph node 3: 940 in total and 94 during the time range where the memory
leaks occurred
* Ceph node 4: 899 in total and 191 during the time range where the
memory leaks occurred

I'm still not entirely sure that the scrub operation causes the leak,
but it is the only relevant correlation that I found...
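
To make the comparison easier to read, the counts above can be turned into
the share of each node's scrubs that fell inside the leak window. This is
just the arithmetic on the four pairs of numbers listed above; the variable
and function names are made up for the illustration.

```python
# (total scrubs, scrubs inside the leak time range) per node, from the list above
scrubs = {
    "node 1": (1608, 1059),
    "node 2": (1168, 776),
    "node 3": (940, 94),
    "node 4": (899, 191),
}


def leak_window_share(total, in_window):
    """Percentage of a node's scrubs that happened inside the leak window."""
    return 100.0 * in_window / total


shares = {node: leak_window_share(t, w) for node, (t, w) in scrubs.items()}
for node, pct in shares.items():
    print(f"{node}: {pct:.1f}% of scrubs inside the leak window")
```

This gives roughly 65.9% for node 1, 66.4% for node 2, 10.0% for node 3 and
21.2% for node 4, which shows a correlation for the first two nodes but
does not by itself prove that scrubbing causes the leak.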

Could it be that the scrubbing process doesn't release memory? Btw, I
was wondering how Ceph decides at what time it should run the
scrubbing operation. I know that it's once a day and controlled by the
following options:

OPTION(osd_scrub_min_interval, OPT_FLOAT, 300)
OPTION(osd_scrub_max_interval, OPT_FLOAT, 60*60*24)

But how does Ceph determine the time at which the operation starts?
During cluster creation, probably?

I just checked the options that control OSD scrubbing and found that by default:

OPTION(osd_max_scrubs, OPT_INT, 1)

So that might explain why only one OSD uses a lot of memory.

My dirty workaround at the moment is to check the memory
use of every OSD and restart it if it uses more than 25% of the total
memory. Also note that on ceph 1, 3 and 4 it's always one OSD that
uses a lot of memory; on ceph 2 the memory usage is also high, but almost
the same for all the OSD processes.
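
A check of this kind could look like the sketch below. It is not the actual
script: it assumes the OSD processes are listed with the procps command
"ps -C ceph-osd -o pid=,pmem=,args=", that each daemon was started with
"-i <id>", and all function names are hypothetical. The restart itself is
left out because the right command depends on the init system in use.

```python
import re
import subprocess

THRESHOLD_PCT = 25.0  # restart limit mentioned above


def parse_ps(ps_output):
    """Parse 'PID %MEM ARGS' lines into (pid, pmem, osd_id) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or truncated lines
        pid, pmem, args = parts
        # Assumption: the OSD id is passed on the command line as "-i <id>".
        match = re.search(r"(?:^|\s)-i\s+(\S+)", args)
        rows.append((int(pid), float(pmem), match.group(1) if match else None))
    return rows


def over_threshold(rows, threshold=THRESHOLD_PCT):
    """Keep only the processes using more than `threshold` percent of memory."""
    return [row for row in rows if row[1] > threshold]


def main():
    # List every ceph-osd process without a header line.
    out = subprocess.run(
        ["ps", "-C", "ceph-osd", "-o", "pid=,pmem=,args="],
        capture_output=True, text=True, check=False,
    ).stdout
    for pid, pmem, osd_id in over_threshold(parse_ps(out)):
        # The restart command is deliberately not run here.
        print(f"osd.{osd_id} (pid {pid}) uses {pmem}% of memory: restart needed")
```

Calling main() from cron every few minutes would report the OSDs that are
over the limit; keeping the parsing separate from the ps call makes the
threshold logic easy to check with canned output.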

Thank you in advance.

--
Regards,
Sébastien Han.



[-- Attachment #2: ceph-leak-scrub.png --]
[-- Type: image/png, Size: 38237 bytes --]

* Re: OSD memory leaks?
       [not found] ` <10953797.470.1357585419142.JavaMail.dspano@it1>
@ 2013-01-07 19:09   ` Samuel Just
  2013-01-09 15:20     ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Samuel Just @ 2013-01-07 19:09 UTC
  To: Dave Spano; +Cc: ceph-devel

Awesome!  What version are you running (ceph-osd -v, include the hash)?
-Sam

On Mon, Jan 7, 2013 at 11:03 AM, Dave Spano <dspano@optogenics.com> wrote:
> This failed the first time I sent it, so I'm resending in plain text.
>
> Dave Spano
> Optogenics
> Systems Administrator
>
>
>
> ----- Original Message -----
>
> From: "Dave Spano" <dspano@optogenics.com>
> To: "Sébastien Han" <han.sebastien@gmail.com>
> Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>
> Sent: Monday, January 7, 2013 12:40:06 PM
> Subject: Re: OSD memory leaks?
>
>
> Sam,
>
> Attached are some heaps that I collected today. 001 and 003 are just after I started the profiler; 011 is the most recent. If you need more, or anything different, let me know. Already the OSD in question is at 38% memory usage. As mentioned by Sébastien, restarting ceph-osd keeps things going.
>
> Not sure if this is helpful information, but out of the two OSDs that I have running, the first one (osd.0) is the one that develops this problem the quickest. osd.1 does have the same issue, it just takes much longer. Do the monitors hit the first osd in the list first, when there's activity?
>
>
> Dave Spano
> Optogenics
> Systems Administrator
>
>

* Re: OSD memory leaks?
  2013-01-07 19:09   ` Samuel Just
@ 2013-01-09 15:20     ` Sébastien Han
  2013-01-09 16:10       ` Dave Spano
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2013-01-09 15:20 UTC
  To: Samuel Just; +Cc: Dave Spano, ceph-devel

I guess he runs Argonaut as well.

More suggestions about this problem?

Thanks!

--
Regards,
Sébastien Han.


> > the same for all the OSD process.
> >
> > Thank you in advance.
> >
> > --
> > Regards,
> > Sébastien Han.
> >
> >
> > On Wed, Dec 19, 2012 at 10:43 PM, Samuel Just <sam.just@inktank.com> wrote:
> >>
> >> Sorry, it's been very busy. The next step would to try to get a heap
> >> dump. You can start a heap profile on osd N by:
> >>
> >> ceph osd tell N heap start_profiler
> >>
> >> and you can get it to dump the collected profile using
> >>
> >> ceph osd tell N heap dump.
> >>
> >> The dumps should show up in the osd log directory.
> >>
> >> Assuming the heap profiler is working correctly, you can look at the
> >> dump using pprof in google-perftools.
> >>
> >> On Wed, Dec 19, 2012 at 8:37 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> >> > No more suggestions? :(
> >> > --
> >> > Regards,
> >> > Sébastien Han.
> >> >
> >> >
> >> > On Tue, Dec 18, 2012 at 6:21 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> >> >> Nothing terrific...
> >> >>
> >> >> Kernel logs from my clients are full of "libceph: osd4
> >> >> 172.20.11.32:6801 socket closed"
> >> >>
> >> >> I saw this somewhere on the tracker.
> >> >>
> >> >> Does this harm?
> >> >>
> >> >> Thanks.
> >> >>
> >> >> --
> >> >> Regards,
> >> >> Sébastien Han.
> >> >>
> >> >>
> >> >>
> >> >> On Mon, Dec 17, 2012 at 11:55 PM, Samuel Just <sam.just@inktank.com> wrote:
> >> >>>
> >> >>> What is the workload like?
> >> >>> -Sam
> >> >>>
> >> >>> On Mon, Dec 17, 2012 at 2:41 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> >> >>> > Hi,
> >> >>> >
> >> >>> > No, I don't see nothing abnormal in the network stats. I don't see
> >> >>> > anything in the logs... :(
> >> >>> > The weird thing is that one node over 4 seems to take way more memory
> >> >>> > than the others...
> >> >>> >
> >> >>> > --
> >> >>> > Regards,
> >> >>> > Sébastien Han.
> >> >>> >
> >> >>> >
> >> >>> > On Mon, Dec 17, 2012 at 11:31 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> >> >>> >>
> >> >>> >> Hi,
> >> >>> >>
> >> >>> >> No, I don't see nothing abnormal in the network stats. I don't see anything in the logs... :(
> >> >>> >> The weird thing is that one node over 4 seems to take way more memory than the others...
> >> >>> >>
> >> >>> >> --
> >> >>> >> Regards,
> >> >>> >> Sébastien Han.
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> On Mon, Dec 17, 2012 at 7:12 PM, Samuel Just <sam.just@inktank.com> wrote:
> >> >>> >>>
> >> >>> >>> Are you having network hiccups? There was a bug noticed recently that
> >> >>> >>> could cause a memory leak if nodes are being marked up and down.
> >> >>> >>> -Sam
> >> >>> >>>
> >> >>> >>> On Mon, Dec 17, 2012 at 12:28 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> >> >>> >>> > Hi guys,
> >> >>> >>> >
> >> >>> >>> > Today looking at my graphs I noticed that one over 4 ceph nodes used a
> >> >>> >>> > lot of memory. It keeps growing and growing.
> >> >>> >>> > See the graph attached to this mail.
> >> >>> >>> > I run 0.48.2 on Ubuntu 12.04.
> >> >>> >>> >
> >> >>> >>> > The other nodes also grow, but slowly than the first one.
> >> >>> >>> >
> >> >>> >>> > I'm not quite sure about the information that I have to provide. So
> >> >>> >>> > let me know. The only thing I can say is that the load haven't
> >> >>> >>> > increase that much this week. It seems to be consuming and not giving
> >> >>> >>> > back the memory.
> >> >>> >>> >
> >> >>> >>> > Thank you in advance.
> >> >>> >>> >
> >> >>> >>> > --
> >> >>> >>> > Regards,
> >> >>> >>> > Sébastien Han.
> >> >>> >>
> >> >>> >>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 15:20     ` Sébastien Han
@ 2013-01-09 16:10       ` Dave Spano
  2013-01-09 16:35         ` Sébastien Han
  2013-01-10 21:43         ` Gregory Farnum
  0 siblings, 2 replies; 48+ messages in thread
From: Dave Spano @ 2013-01-09 16:10 UTC (permalink / raw)
  To: Sébastien Han; +Cc: ceph-devel, Samuel Just

Yes, I'm using argonaut. 

I've got 38 heap files from yesterday. Currently, the OSD in question is using 91.2% of memory according to top, and staying there. I initially thought it would keep growing until the OOM killer started killing processes, but I don't see anything in the system logs that indicates that.

On the other hand, the ceph-osd process on osd.1 is using far less memory. 

osd.0
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9151 root      20   0 20.4g  14g 2548 S    1 91.2 517:58.71 ceph-osd

osd.1
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10785 root      20   0  673m 310m 5164 S    3  1.9 107:04.39 ceph-osd

Here's what tcmalloc says when I run ceph osd tell 0 heap stats:
2013-01-09 11:09:36.778675 7f62aae23700  0 log [INF] : osd.0tcmalloc heap stats:------------------------------------------------
2013-01-09 11:09:36.779113 7f62aae23700  0 log [INF] : MALLOC:      210884768 (  201.1 MB) Bytes in use by application
2013-01-09 11:09:36.779348 7f62aae23700  0 log [INF] : MALLOC: +     89026560 (   84.9 MB) Bytes in page heap freelist
2013-01-09 11:09:36.779928 7f62aae23700  0 log [INF] : MALLOC: +      7926512 (    7.6 MB) Bytes in central cache freelist
2013-01-09 11:09:36.779951 7f62aae23700  0 log [INF] : MALLOC: +       144896 (    0.1 MB) Bytes in transfer cache freelist
2013-01-09 11:09:36.779972 7f62aae23700  0 log [INF] : MALLOC: +     11046512 (   10.5 MB) Bytes in thread cache freelists
2013-01-09 11:09:36.780013 7f62aae23700  0 log [INF] : MALLOC: +      5177344 (    4.9 MB) Bytes in malloc metadata
2013-01-09 11:09:36.780030 7f62aae23700  0 log [INF] : MALLOC:   ------------
2013-01-09 11:09:36.780056 7f62aae23700  0 log [INF] : MALLOC: =    324206592 (  309.2 MB) Actual memory used (physical + swap)
2013-01-09 11:09:36.780081 7f62aae23700  0 log [INF] : MALLOC: +    126177280 (  120.3 MB) Bytes released to OS (aka unmapped)
2013-01-09 11:09:36.780112 7f62aae23700  0 log [INF] : MALLOC:   ------------
2013-01-09 11:09:36.780127 7f62aae23700  0 log [INF] : MALLOC: =    450383872 (  429.5 MB) Virtual address space used
2013-01-09 11:09:36.780152 7f62aae23700  0 log [INF] : MALLOC:
2013-01-09 11:09:36.780168 7f62aae23700  0 log [INF] : MALLOC:          37492              Spans in use
2013-01-09 11:09:36.780330 7f62aae23700  0 log [INF] : MALLOC:             51              Thread heaps in use
2013-01-09 11:09:36.780359 7f62aae23700  0 log [INF] : MALLOC:           4096              Tcmalloc page size
2013-01-09 11:09:36.780384 7f62aae23700  0 log [INF] : ------------------------------------------------
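
As a rough cross-check of a dump like the one above, the component lines can be summed and compared with the "Actual memory used" total that tcmalloc prints itself. The sketch below assumes the log layout shown here; the function name check_heap_stats is made up for illustration and is not part of Ceph or tcmalloc.

```shell
#!/bin/sh
# Sum the component lines of a tcmalloc "heap stats" dump and print the
# result next to the "Actual memory used" total reported by tcmalloc.
# Reads log lines on stdin; the line layout is assumed to match the
# dump quoted in this message.
check_heap_stats() {
    awk '
        /MALLOC:/ {
            # Drop everything up to "MALLOC:" so that the optional sign
            # and the byte count become the first fields.
            sub(/^.*MALLOC:[ \t]*/, "")
            if ($1 == "=" && $0 ~ /Actual memory used/) { total = $2; done = 1 }
            else if (done)                               { next }
            else if ($1 == "+")                          { sum += $2 }
            else if ($1 ~ /^[0-9]+$/)                    { sum += $1 }
        }
        END { printf "sum=%d reported=%d\n", sum, total }
    '
}
```

It could be fed from the OSD log, for example `check_heap_stats < osd.0.log` (the file name is an assumption). If the two numbers differ, the dump was probably truncated or interleaved with other log lines.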


Dave Spano 
Optogenics 
Systems Administrator 




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 16:10       ` Dave Spano
@ 2013-01-09 16:35         ` Sébastien Han
  2013-01-09 18:09           ` Sylvain Munaut
  2013-01-09 21:42           ` Dave Spano
  2013-01-10 21:43         ` Gregory Farnum
  1 sibling, 2 replies; 48+ messages in thread
From: Sébastien Han @ 2013-01-09 16:35 UTC (permalink / raw)
  To: Dave Spano; +Cc: ceph-devel, Samuel Just

If you wait too long, the system will trigger the OOM killer :D. I have
already experienced that, unfortunately...

Sam?

On Wed, Jan 9, 2013 at 5:10 PM, Dave Spano <dspano@optogenics.com> wrote:
> OOM killer



--
Regards,
Sébastien Han.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 16:35         ` Sébastien Han
@ 2013-01-09 18:09           ` Sylvain Munaut
  2013-01-09 19:11             ` Sébastien Han
  2013-01-10 21:44             ` Gregory Farnum
  2013-01-09 21:42           ` Dave Spano
  1 sibling, 2 replies; 48+ messages in thread
From: Sylvain Munaut @ 2013-01-09 18:09 UTC (permalink / raw)
  To: Sébastien Han; +Cc: Dave Spano, ceph-devel, Samuel Just

Just fyi, I also have growing memory on OSD, and I have the same logs:

"libceph: osd4 172.20.11.32:6801 socket closed" in the RBD clients


I traced that problem and correlated it to some cephx issue in the OSD
some time ago in this thread

http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg10634.html

but the thread kind of died without a solution ...

Cheers,

   Sylvain

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 18:09           ` Sylvain Munaut
@ 2013-01-09 19:11             ` Sébastien Han
  2013-01-10 21:44             ` Gregory Farnum
  1 sibling, 0 replies; 48+ messages in thread
From: Sébastien Han @ 2013-01-09 19:11 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Dave Spano, ceph-devel, Samuel Just

Hi,

Thanks for the input.

I also have tons of "socket closed" messages, and I recall that this
message is harmless. Anyway, cephx has been disabled on my platform from
the beginning...
Can anyone confirm or refute my "scrub theory"?
--
Regards,
Sébastien Han.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 16:35         ` Sébastien Han
  2013-01-09 18:09           ` Sylvain Munaut
@ 2013-01-09 21:42           ` Dave Spano
  2013-01-09 22:12             ` Sébastien Han
  1 sibling, 1 reply; 48+ messages in thread
From: Dave Spano @ 2013-01-09 21:42 UTC (permalink / raw)
  To: Sébastien Han; +Cc: ceph-devel, Samuel Just

That's very good to know. I'll be restarting ceph-osd right now! Thanks for the heads up! 

Dave Spano 
Optogenics 
Systems Administrator 




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 21:42           ` Dave Spano
@ 2013-01-09 22:12             ` Sébastien Han
  2013-01-09 23:03               ` Dave Spano
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2013-01-09 22:12 UTC (permalink / raw)
  To: Dave Spano; +Cc: ceph-devel, Samuel Just

Dave, here is my little script for now, if you want it:

#!/bin/bash

# Restart every ceph-osd daemon that uses 25% or more of the node's memory.
# In `ps aux` output, field 4 is %MEM; field 13 is assumed to be the OSD id
# (the value that follows "-i" on the ceph-osd command line).
for entry in $(ps aux | grep '[c]eph-osd' | awk '{print $4 ":" $13}')
do
        MEM=${entry%%:*}
        OSD=${entry##*:}
        MEM_INTEGER=${MEM%%.*}
        if [[ $MEM_INTEGER -ge 25 ]]; then
                if service ceph restart "osd.$OSD" > /dev/null; then
                        logger -t ceph-memory-usage "The OSD number $OSD has been restarted since it was using $MEM % of the memory"
                else
                        logger -t ceph-memory-usage "ERROR while restarting the OSD daemon"
                fi
        else
                logger -t ceph-memory-usage "The OSD number $OSD is only using $MEM % of the memory, doing nothing"
        fi
        logger -t ceph-memory-usage "Waiting 60 seconds before testing the next OSD..."
        sleep 60
done

logger -t ceph-memory-usage "Ceph state after memory check operation is: $(ceph health)"

Cron runs it every 10 minutes, every day, on each storage node ;-).
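
Before letting cron restart anything, it may help to preview which OSDs would cross the threshold. The sketch below only lists ids and restarts nothing. The function name osds_over_threshold is hypothetical, and it makes the same assumption as the script: field 4 of `ps aux` is %MEM and field 13 is the OSD id.

```shell
#!/bin/sh
# Print the ids of the ceph-osd processes whose %MEM is at or above the
# threshold given as the first argument. Reads `ps aux`-style lines on
# stdin and does not restart anything.
# Assumption: field 13 is the OSD id (the value after "-i").
osds_over_threshold() {
    awk -v t="$1" '/ceph-osd/ && $4 + 0 >= t + 0 { print $13 }'
}
```

A dry run on a storage node could look like `ps aux | grep '[c]eph-osd' | osds_over_threshold 25`. Comparing the full %MEM value rather than its integer part gives the same result for a whole-number threshold such as 25.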

Waiting for some Inktank guys now :-).
--
Regards,
Sébastien Han.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 22:12             ` Sébastien Han
@ 2013-01-09 23:03               ` Dave Spano
  0 siblings, 0 replies; 48+ messages in thread
From: Dave Spano @ 2013-01-09 23:03 UTC (permalink / raw)
  To: Sébastien Han; +Cc: ceph-devel, Samuel Just

Thank you. I appreciate it! 

Dave Spano 
Optogenics 
Systems Administrator 




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 16:10       ` Dave Spano
  2013-01-09 16:35         ` Sébastien Han
@ 2013-01-10 21:43         ` Gregory Farnum
  1 sibling, 0 replies; 48+ messages in thread
From: Gregory Farnum @ 2013-01-10 21:43 UTC (permalink / raw)
  To: Dave Spano; +Cc: Sébastien Han, ceph-devel, Samuel Just

On Wed, Jan 9, 2013 at 8:10 AM, Dave Spano <dspano@optogenics.com> wrote:
> Yes, I'm using argonaut.
>
> I've got 38 heap files from yesterday. Currently, the OSD in question is using 91.2% of memory according to top, and staying there. I initially thought it would go until the OOM killer started killing processes, but I don't see anything funny in the system logs that indicate that.
>
> On the other hand, the ceph-osd process on osd.1 is using far less memory.

Is osd.1 using the heap profiler as well? Keep in mind that active use
of the memory profiler will itself cause memory usage to increase —
this sounds a bit like that to me since it's staying stable at a large
but finite portion of total memory.
-Greg

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-09 18:09           ` Sylvain Munaut
  2013-01-09 19:11             ` Sébastien Han
@ 2013-01-10 21:44             ` Gregory Farnum
  2013-01-11 14:57               ` Sébastien Han
  1 sibling, 1 reply; 48+ messages in thread
From: Gregory Farnum @ 2013-01-10 21:44 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Sébastien Han, Dave Spano, ceph-devel, Samuel Just

On Wed, Jan 9, 2013 at 10:09 AM, Sylvain Munaut
<s.munaut@whatever-company.com> wrote:
> Just fyi, I also have growing memory on OSD, and I have the same logs:
>
> "libceph: osd4 172.20.11.32:6801 socket closed" in the RBD clients

That message is not an error; it just happens if the RBD client
doesn't talk to that OSD for a while. I believe its volume has been
turned down quite a lot in the latest kernels/our git tree.
-Greg

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-10 21:44             ` Gregory Farnum
@ 2013-01-11 14:57               ` Sébastien Han
  2013-01-11 18:13                 ` Gregory Farnum
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2013-01-11 14:57 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sylvain Munaut, Dave Spano, ceph-devel, Samuel Just

> Is osd.1 using the heap profiler as well? Keep in mind that active use
> of the memory profiler will itself cause memory usage to increase —
> this sounds a bit like that to me since it's staying stable at a large
> but finite portion of total memory.

Well, the memory consumption was already high before the profiler was
started. So yes, with the memory profiler enabled an OSD might consume
more memory, but that is not what causes the leak.

Any ideas? Nothing to say about my scrubbing theory?

Thanks!
--
Regards,
Sébastien Han.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-11 14:57               ` Sébastien Han
@ 2013-01-11 18:13                 ` Gregory Farnum
  2013-02-22 15:24                   ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Gregory Farnum @ 2013-01-11 18:13 UTC (permalink / raw)
  To: Sébastien Han; +Cc: Sylvain Munaut, Dave Spano, ceph-devel, Samuel Just

On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
>> Is osd.1 using the heap profiler as well? Keep in mind that active use
>> of the memory profiler will itself cause memory usage to increase —
>> this sounds a bit like that to me since it's staying stable at a large
>> but finite portion of total memory.
>
> Well, the memory consumption was already high before the profiler was
> started. So yes, with the memory profiler enabled an OSD might consume
> more memory, but this doesn't cause the memory leaks.

My concern is that maybe you saw a leak but when you restarted with
the memory profiling you lost whatever conditions caused it.

> Any ideas? Nothing to say about my scrubbing theory?
I like it, but Sam indicates that without some heap dumps that capture
the actual leak, scrub is too large to effectively code review for
leaks. :(
-Greg

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-01-11 18:13                 ` Gregory Farnum
@ 2013-02-22 15:24                   ` Sébastien Han
  2013-02-23  0:44                     ` Sage Weil
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2013-02-22 15:24 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sylvain Munaut, Dave Spano, ceph-devel, Samuel Just

Hi all,

I finally got a core dump.

I did it with a kill -SEGV on the OSD process.

https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008

Hope we will get something out of it :-).
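
For reference, the `kill -SEGV` step above can be sketched in Python. This
is only an illustration: the helper names are hypothetical, and a throwaway
child process stands in for the OSD. Whether a core file is actually written
depends on the target's core-size limit and the system's core pattern, which
this sketch does not configure.

```python
import os
import resource
import signal
import subprocess
import sys


def core_dumps_enabled() -> bool:
    """Return True if this process's soft core-size limit is non-zero.

    Children started from here inherit the limit; an already running
    daemon has its own limit, which this helper does not inspect.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


def force_core_dump(pid: int) -> None:
    """Send SIGSEGV to pid, the equivalent of `kill -SEGV <pid>`."""
    os.kill(pid, signal.SIGSEGV)


def demo() -> int:
    """Signal a throwaway child (a stand-in for the OSD), return its status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    force_core_dump(child.pid)
    # A negative return code means the child was terminated by that signal.
    return child.wait(timeout=10)
```

Calling `demo()` should return the negated signal number, showing that the
child was terminated by SIGSEGV rather than exiting normally.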
--
Regards,
Sébastien Han.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-02-22 15:24                   ` Sébastien Han
@ 2013-02-23  0:44                     ` Sage Weil
  2013-02-24 23:10                       ` Sébastien Han
  2013-03-01 15:51                       ` Wido den Hollander
  0 siblings, 2 replies; 48+ messages in thread
From: Sage Weil @ 2013-02-23  0:44 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Gregory Farnum, Sylvain Munaut, Dave Spano, ceph-devel,
	Samuel Just

On Fri, 22 Feb 2013, Sébastien Han wrote:
> Hi all,
> 
> I finally got a core dump.
> 
> I did it with a kill -SEGV on the OSD process.
> 
> https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
> 
> Hope we will get something out of it :-).

AHA!  We have a theory.  The pg log isn't trimmed during scrub (because the
old scrub code required that), but the new (deep) scrub can take a very
long time, which means the pg log will eat RAM in the meantime,
especially under high IOPS.

Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
if that seems to work?  Note that the patch shouldn't be run in a mixed
argonaut+bobtail cluster, since it doesn't properly check whether the scrub
is classic or chunky/deep.
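
The pg log theory above can be illustrated with a toy model. This is a
minimal sketch and not Ceph code: the function name, the one-entry-per-write
model, and the `max_entries` trim target are all assumptions made for the
example. It only shows why a log that is not trimmed for the duration of a
long scrub grows with the write rate.

```python
from collections import deque


def simulate_pg_log(writes_per_sec, scrub_secs, max_entries, trim_during_scrub):
    """Return the peak log length seen during a scrub of scrub_secs seconds.

    Hypothetical model: every write appends one log entry. Trimming drops
    the oldest entries beyond max_entries, unless it is suspended for the
    whole scrub (trim_during_scrub=False).
    """
    log = deque()
    peak = 0
    for second in range(scrub_secs):
        for i in range(writes_per_sec):
            log.append((second, i))
        if trim_during_scrub:
            while len(log) > max_entries:
                log.popleft()
        peak = max(peak, len(log))
    return peak


if __name__ == "__main__":
    # 100 writes/s over a 10-minute scrub, with a 1000-entry trim target.
    print(simulate_pg_log(100, 600, 1000, trim_during_scrub=False))
    print(simulate_pg_log(100, 600, 1000, trim_during_scrub=True))
```

With these made-up numbers the untrimmed log peaks at 60000 entries, while
the trimmed one stays at 1000.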

Thanks!
sage



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-02-23  0:44                     ` Sage Weil
@ 2013-02-24 23:10                       ` Sébastien Han
  2013-02-25  0:21                         ` Sage Weil
  2013-03-01 15:51                       ` Wido den Hollander
  1 sibling, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2013-02-24 23:10 UTC (permalink / raw)
  To: Sage Weil
  Cc: Gregory Farnum, Sylvain Munaut, Dave Spano, ceph-devel,
	Samuel Just

Hi Sage,

Sorry, it's a production system, so I can't test it.
So in the end, you couldn't get anything out of the core dump?

--
Regards,
Sébastien Han.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-02-24 23:10                       ` Sébastien Han
@ 2013-02-25  0:21                         ` Sage Weil
  2013-02-25  7:51                           ` Wido den Hollander
  0 siblings, 1 reply; 48+ messages in thread
From: Sage Weil @ 2013-02-25  0:21 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Gregory Farnum, Sylvain Munaut, Dave Spano, ceph-devel,
	Samuel Just

On Mon, 25 Feb 2013, Sébastien Han wrote:
> Hi Sage,
> 
> Sorry it's a production system, so I can't test it.
> So at the end, you can't get anything out of the core dump?

I saw a bunch of dup object names, which is what led us to the pg log
theory.  I can look a bit more carefully to confirm, but in the end it
would be nice to see users scrubbing without leaking.

This may be a bit moot because we want to allow trimming for other 
reasons, so those patches are being tested and working their way into 
master.  We'll backport when things are solid.

In the meantime, if someone has been able to reproduce this in a test 
environment, testing is obviously welcome :)

sage





^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-02-25  0:21                         ` Sage Weil
@ 2013-02-25  7:51                           ` Wido den Hollander
  2013-02-25 17:18                             ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Wido den Hollander @ 2013-02-25  7:51 UTC (permalink / raw)
  To: Sage Weil
  Cc: Sébastien Han, Gregory Farnum, Sylvain Munaut, Dave Spano,
	ceph-devel, Samuel Just

On 02/25/2013 01:21 AM, Sage Weil wrote:
> On Mon, 25 Feb 2013, Sébastien Han wrote:
>> Hi Sage,
>>
>> Sorry it's a production system, so I can't test it.
>> So at the end, you can't get anything out of the core dump?
>
> I saw a bunch of dup object names, which is what led us to the pg log
> theory.  I can look a bit more carefully to confirm, but in the end it
> would be nice to see users scrubbing without leaking.
>
> This may be a bit moot because we want to allow trimming for other
> reasons, so those patches are being tested and working their way into
> master.  We'll backport when things are solid.
>
> In the meantime, if someone has been able to reproduce this in a test
> environment, testing is obviously welcome :)
>

I'll see what I can do later this week. I know of a cluster with the
same issues that is, as far as I know, in semi-production.

Wido



-- 
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-02-25  7:51                           ` Wido den Hollander
@ 2013-02-25 17:18                             ` Sébastien Han
  0 siblings, 0 replies; 48+ messages in thread
From: Sébastien Han @ 2013-02-25 17:18 UTC (permalink / raw)
  To: Wido den Hollander
  Cc: Sage Weil, Gregory Farnum, Sylvain Munaut, Dave Spano, ceph-devel,
	Samuel Just

Ok thanks guys. Hope we will find something :-).
--
Regards,
Sébastien Han.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-02-23  0:44                     ` Sage Weil
  2013-02-24 23:10                       ` Sébastien Han
@ 2013-03-01 15:51                       ` Wido den Hollander
  2013-03-01 18:07                         ` Samuel Just
  2013-03-01 19:10                         ` Sage Weil
  1 sibling, 2 replies; 48+ messages in thread
From: Wido den Hollander @ 2013-03-01 15:51 UTC (permalink / raw)
  To: Sage Weil
  Cc: Sébastien Han, Gregory Farnum, Sylvain Munaut, Dave Spano,
	ceph-devel, Samuel Just

On 02/23/2013 01:44 AM, Sage Weil wrote:
> On Fri, 22 Feb 2013, Sébastien Han wrote:
>> Hi all,
>>
>> I finally got a core dump.
>>
>> I did it with a kill -SEGV on the OSD process.
>>
>> https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
>>
>> Hope we will get something out of it :-).
>
> AHA!  We have a theory.  The pg log isn't trimmed during scrub (because the
> old scrub code required that), but the new (deep) scrub can take a very
> long time, which means the pg log will eat ram in the meantime..
> especially under high iops.
>

Does the number of PGs influence the memory leak? My theory is that
when you have a high number of PGs with a low number of objects per PG,
you don't see the memory leak.

I saw the memory leak on an RBD system where a pool had just 8 PGs, but
after going to 1024 PGs in a new pool it seemed to be resolved.
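
One way to read the PG-count observation above is as back-of-the-envelope
arithmetic. The sketch below rests on assumptions that are not from this
thread: writes and objects are spread evenly across PGs, scrub time is
proportional to the objects in a PG, and the log of a PG is left untrimmed
only while that PG is being scrubbed. The function and parameter names are
hypothetical.

```python
def untrimmed_entries_per_scrub(num_pgs, total_objects, writes_per_sec,
                                scrub_objects_per_sec):
    """Estimate log entries one PG accumulates while it is being scrubbed.

    Assumes an even spread of objects and writes over num_pgs, and a scrub
    that processes scrub_objects_per_sec objects per second.
    """
    objects_per_pg = total_objects / num_pgs
    scrub_secs = objects_per_pg / scrub_objects_per_sec
    writes_per_pg_per_sec = writes_per_sec / num_pgs
    return writes_per_pg_per_sec * scrub_secs


if __name__ == "__main__":
    # Made-up workload: 1M objects, 1000 writes/s, 100 objects/s scrubbed.
    for pgs in (8, 1024):
        print(pgs, untrimmed_entries_per_scrub(pgs, 1_000_000, 1000, 100))
```

Under this model the per-PG backlog scales with 1 / num_pgs squared: about
156250 entries with 8 PGs versus roughly 9.5 with 1024 PGs for the made-up
workload above.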

I've asked somebody else to try your patch since he's still seeing it on 
his systems. Hopefully that gives us some results.

Wido

> Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
> if that seems to work?  Note that the patch shouldn't be run in a mixed
> argonaut+bobtail cluster, since it doesn't properly check whether the scrub
> is classic or chunky/deep.
>
> Thanks!
> sage


-- 
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-01 15:51                       ` Wido den Hollander
@ 2013-03-01 18:07                         ` Samuel Just
  2013-03-01 19:10                         ` Sage Weil
  1 sibling, 0 replies; 48+ messages in thread
From: Samuel Just @ 2013-03-01 18:07 UTC (permalink / raw)
  To: Wido den Hollander
  Cc: Sage Weil, Sébastien Han, Gregory Farnum, Sylvain Munaut,
	Dave Spano, ceph-devel

That pattern would seem to support the log trimming theory of the leak.
-Sam


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-01 15:51                       ` Wido den Hollander
  2013-03-01 18:07                         ` Samuel Just
@ 2013-03-01 19:10                         ` Sage Weil
  2013-03-04 17:11                           ` Sébastien Han
  1 sibling, 1 reply; 48+ messages in thread
From: Sage Weil @ 2013-03-01 19:10 UTC (permalink / raw)
  To: Wido den Hollander
  Cc: Sébastien Han, Gregory Farnum, Sylvain Munaut, Dave Spano,
	ceph-devel, Samuel Just

On Fri, 1 Mar 2013, Wido den Hollander wrote:
> On 02/23/2013 01:44 AM, Sage Weil wrote:
> > On Fri, 22 Feb 2013, Sébastien Han wrote:
> > > Hi all,
> > > 
> > > I finally got a core dump.
> > > 
> > > I did it with a kill -SEGV on the OSD process.
> > > 
> > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
> > > 
> > > Hope we will get something out of it :-).
> > 
> > AHA!  We have a theory.  The pg log isn't trimmed during scrub (because the
> > old scrub code required that), but the new (deep) scrub can take a very
> > long time, which means the pg log will eat ram in the meantime..
> > especially under high iops.
> > 
> 
> Does the number of PGs influence the memory leak? So my theory is that when
> you have a high number of PGs with a low number of objects per PG you don't
> see the memory leak.
> 
> I saw the memory leak on an RBD system where a pool had just 8 PGs, but after
> going to 1024 PGs in a new pool it seemed to be resolved.
> 
> I've asked somebody else to try your patch since he's still seeing it on his
> systems. Hopefully that gives us some results.

Were the PGs active+clean when you saw the leak?  There is a problem (that
we just fixed in master) where pg logs aren't trimmed for degraded PGs.

sage


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-01 19:10                         ` Sage Weil
@ 2013-03-04 17:11                           ` Sébastien Han
       [not found]                             ` <13183200.155.1363027427897.JavaMail.dspano@it1>
  2013-03-12 11:45                             ` Vladislav Gorbunov
  0 siblings, 2 replies; 48+ messages in thread
From: Sébastien Han @ 2013-03-04 17:11 UTC (permalink / raw)
  To: Sage Weil
  Cc: Wido den Hollander, Gregory Farnum, Sylvain Munaut, Dave Spano,
	ceph-devel, Samuel Just

FYI I'm using 450 pgs for my pools.

--
Regards,
Sébastien Han.


On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com> wrote:
>
> On Fri, 1 Mar 2013, Wido den Hollander wrote:
> > On 02/23/2013 01:44 AM, Sage Weil wrote:
> > > On Fri, 22 Feb 2013, Sébastien Han wrote:
> > > > Hi all,
> > > >
> > > > I finally got a core dump.
> > > >
> > > > I did it with a kill -SEGV on the OSD process.
> > > >
> > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
> > > >
> > > > Hope we will get something out of it :-).
> > >
> > > AHA!  We have a theory.  The pg log isn't trimmed during scrub (because the
> > > old scrub code required that), but the new (deep) scrub can take a very
> > > long time, which means the pg log will eat RAM in the meantime,
> > > especially under high IOPS.
> > >
> >
> > Does the number of PGs influence the memory leak? So my theory is that when
> > you have a high number of PGs with a low number of objects per PG you don't
> > see the memory leak.
> >
> > I saw the memory leak on an RBD system where a pool had just 8 PGs, but after
> > going to 1024 PGs in a new pool it seemed to be resolved.
> >
> > I've asked somebody else to try your patch since he's still seeing it on his
> > systems. Hopefully that gives us some results.
>
> The PGs were active+clean when you saw the leak?  There is a problem (that
> we just fixed in master) where pg logs aren't trimmed for degraded PGs.
>
> sage
>
> >
> > Wido
> >
> > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
> > > if that seems to work?  Note that that patch shouldn't be run in a mixed
> > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is
> > > classic or chunky/deep.
> > >
> > > Thanks!
> > > sage
> > >
> > >
> > > > --
> > > > Regards,
> > > > Sébastien Han.
> > > >
> > > >
> > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com> wrote:
> > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active use
> > > > > > > of the memory profiler will itself cause memory usage to increase -
> > > > > > > this sounds a bit like that to me since it's staying stable at a large
> > > > > > > but finite portion of total memory.
> > > > > >
> > > > > > Well, the memory consumption was already high before the profiler was
> > > > > > started. So yes, with the memory profiler enabled an OSD might consume
> > > > > > more memory, but this doesn't cause the memory leaks.
> > > > >
> > > > > My concern is that maybe you saw a leak but when you restarted with
> > > > > the memory profiling you lost whatever conditions caused it.
> > > > >
> > > > > > Any ideas? Nothing to say about my scrubbing theory?
> > > > > I like it, but Sam indicates that without some heap dumps which
> > > > > capture the actual leak then scrub is too large to effectively code
> > > > > review for leaks. :(
> > > > > -Greg
> >
> >
> > --
> > Wido den Hollander
> > 42on B.V.
> >
> > Phone: +31 (0)20 700 9902
> > Skype: contact42on
> >
> >

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
       [not found]                             ` <13183200.155.1363027427897.JavaMail.dspano@it1>
@ 2013-03-11 22:23                               ` Sébastien Han
  0 siblings, 0 replies; 48+ messages in thread
From: Sébastien Han @ 2013-03-11 22:23 UTC (permalink / raw)
  To: Dave Spano
  Cc: Wido den Hollander, Gregory Farnum, Sylvain Munaut, ceph-devel,
	Samuel Just, Sage Weil

Dave,

It's still a production platform, so no, I didn't try it. I've also found
that the ceph-mon daemons are now constantly leaking... I truly hope your
log max recent = 10000 will help.
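
For anyone wanting to double-check what a node is actually configured with, here is a rough Python sketch that reads the option back out of a ceph.conf-style file. It is only an illustration under assumptions: the [osd] section placement is a guess (the thread only says the line was added to ceph.conf), the sample text and the helper name log_max_recent are hypothetical, and the 100000 fallback is simply the "100k lines" default mentioned in the quoted text below. A real ceph.conf may contain constructs that Python's configparser does not accept.

```python
import configparser

# Hypothetical ceph.conf fragment. Placing the option under [osd] is an
# assumption; the thread only says it was added to ceph.conf.
SAMPLE_CONF = """\
[osd]
log max recent = 10000
"""


def log_max_recent(conf_text, section="osd", default=100000):
    """Return the configured 'log max recent', or the default if it is unset."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback covers both a missing option and a missing section
    return parser.getint(section, "log max recent", fallback=default)


print(log_max_recent(SAMPLE_CONF))   # value set in the sample fragment
print(log_max_recent("[osd]\n"))     # option absent, falls back to the default
```

The fallback is deliberate: an unset option should report the daemon default rather than raise, so the same check can be run against nodes that were never tuned.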

Cheers.
--
Regards,
Sébastien Han.


On Mon, Mar 11, 2013 at 7:43 PM, Dave Spano <dspano@optogenics.com> wrote:
> Sebastien,
>
> Did the patch that Sage mentioned work for you? I've found that this behavior is happening more frequently with my first osd during deep scrubbing on version 0.56.3. OOM Killer now goes after the ceph-osd process after a couple of days.
>
> Sage,
> Yesterday after following the OSD memory requirements thread, I added log max recent = 10000 to ceph.conf, and osd.0 seems to have returned to a state of normalcy. If it makes it through a deep scrubbing with no problem, I'll be very happy.
>
>
>>> sage@inktank.com said:
>>>> - pg log trimming (probably a conservative subset) to avoid memory bloat
>>>
>>> Anything that reduces the size of OSD processes would be appreciated.
>>> You can probably do this with just
>>>  log max recent = 1000
>>> By default it's keeping 100k lines of logs in memory, which can eat a lot of
>>> RAM (but is great when debugging issues).
>
> Dave Spano

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-04 17:11                           ` Sébastien Han
       [not found]                             ` <13183200.155.1363027427897.JavaMail.dspano@it1>
@ 2013-03-12 11:45                             ` Vladislav Gorbunov
  2013-03-12 12:12                               ` Sébastien Han
  1 sibling, 1 reply; 48+ messages in thread
From: Vladislav Gorbunov @ 2013-03-12 11:45 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Sage Weil, Wido den Hollander, Gregory Farnum, Sylvain Munaut,
	Dave Spano, ceph-devel, Samuel Just

> FYI I'm using 450 pgs for my pools.
Please, can you show the number of object replicas?

ceph osd dump | grep 'rep size'

Vlad Gorbunov


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 11:45                             ` Vladislav Gorbunov
@ 2013-03-12 12:12                               ` Sébastien Han
  2013-03-12 13:00                                 ` Vladislav Gorbunov
  0 siblings, 1 reply; 48+ messages in thread
From: Sébastien Han @ 2013-03-12 12:12 UTC (permalink / raw)
  To: Vladislav Gorbunov
  Cc: Sage Weil, Wido den Hollander, Gregory Farnum, Sylvain Munaut,
	Dave Spano, ceph-devel, Samuel Just

Replica count has been set to 2.

Why?
--
Regards,
Sébastien Han.


On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote:
>> FYI I'm using 450 pgs for my pools.
> Please, can you show the number of object replicas?
>
> ceph osd dump | grep 'rep size'
>
> Vlad Gorbunov

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 12:12                               ` Sébastien Han
@ 2013-03-12 13:00                                 ` Vladislav Gorbunov
  2013-03-12 13:43                                   ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Vladislav Gorbunov @ 2013-03-12 13:00 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Sage Weil, Wido den Hollander, Gregory Farnum, Sylvain Munaut,
	Dave Spano, ceph-devel, Samuel Just

Sorry, I meant pg_num and pgp_num on all pools, as shown by "ceph osd
dump | grep 'rep size'".
The default pg_num value of 8 is NOT suitable for a big cluster.
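
As a rough illustration of that check, the Python sketch below pulls rep size, pg_num and pgp_num out of pool lines and lists pools still at the default of 8. The sample lines are made up, and their layout is an assumption about what the grep above would match on a cluster of this era; the names parse_pools and small_pools are hypothetical helpers, not part of any Ceph tool.

```python
import re

# Illustrative pool lines in the style of "ceph osd dump" output; the pool
# names and numbers are invented, and the line layout is an assumption.
SAMPLE_DUMP = """\
pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 450 pgp_num 450 last_change 1 owner 0
pool 3 'images' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 12 owner 0
"""

POOL_RE = re.compile(
    r"^pool (?P<id>\d+) '(?P<name>[^']+)' rep size (?P<size>\d+)"
    r".*? pg_num (?P<pg_num>\d+) pgp_num (?P<pgp_num>\d+)"
)


def parse_pools(dump_text):
    """Extract name, rep size, pg_num and pgp_num from each pool line."""
    pools = []
    for line in dump_text.splitlines():
        m = POOL_RE.match(line)
        if m:
            pools.append({
                "name": m.group("name"),
                "rep_size": int(m.group("size")),
                "pg_num": int(m.group("pg_num")),
                "pgp_num": int(m.group("pgp_num")),
            })
    return pools


def small_pools(pools, threshold=8):
    """Names of pools whose pg_num is at or below the threshold."""
    return [p["name"] for p in pools if p["pg_num"] <= threshold]


pools = parse_pools(SAMPLE_DUMP)
for p in pools:
    print(p)
print(small_pools(pools))
```

In practice the text would come from the command output rather than a constant; the threshold of 8 is only the default value named above, not a sizing recommendation.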

2013/3/13 Sébastien Han <han.sebastien@gmail.com>:
> Replica count has been set to 2.
>
> Why?
> --
> Regards,
> Sébastien Han.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 13:00                                 ` Vladislav Gorbunov
@ 2013-03-12 13:43                                   ` Sébastien Han
  0 siblings, 0 replies; 48+ messages in thread
From: Sébastien Han @ 2013-03-12 13:43 UTC (permalink / raw)
  To: Vladislav Gorbunov
  Cc: Sage Weil, Wido den Hollander, Gregory Farnum, Sylvain Munaut,
	Dave Spano, ceph-devel, Samuel Just

> Sorry, I meant pg_num and pgp_num on all pools, as shown by "ceph osd
> dump | grep 'rep size'".

Well, it's still 450 each...

> The default pg_num value of 8 is NOT suitable for a big cluster.

Thanks, I know; I'm not new to Ceph. What's your point here? I
already said that pg_num was 450...
--
Regards,
Sébastien Han.


On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote:
> Sorry, I meant pg_num and pgp_num on all pools, as shown by "ceph osd
> dump | grep 'rep size'".
> The default pg_num value of 8 is NOT suitable for a big cluster.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
       [not found] <7332688.5.1363110084349.JavaMail.dspano@it1>
@ 2013-03-12 18:09 ` Dave Spano
  2013-03-12 20:10   ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Spano @ 2013-03-12 18:09 UTC (permalink / raw)
  To: ceph-devel
  Cc: Sage Weil, Wido den Hollander, Gregory Farnum, Sylvain Munaut,
	Samuel Just, Vladislav Gorbunov, Sébastien Han

Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed! 

http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924 

Dave Spano 
Optogenics 
Systems Administrator 



----- Original Message ----- 

From: "Dave Spano" <dspano@optogenics.com> 
To: "Sébastien Han" <han.sebastien@gmail.com> 
Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com> 
Sent: Tuesday, March 12, 2013 1:41:21 PM 
Subject: Re: OSD memory leaks? 


If one were stupid enough to have pg_num and pgp_num set to 8 on two of their pools, how could one fix that?


Dave Spano 



----- Original Message -----

From: "Sébastien Han" <han.sebastien@gmail.com> 
To: "Vladislav Gorbunov" <vadikgo@gmail.com> 
Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com> 
Sent: Tuesday, March 12, 2013 9:43:44 AM 
Subject: Re: OSD memory leaks? 

> Sorry, I meant pg_num and pgp_num on all pools, as shown by "ceph osd
> dump | grep 'rep size'".

Well, it's still 450 each...

> The default pg_num value of 8 is NOT suitable for a big cluster.

Thanks, I know; I'm not new to Ceph. What's your point here? I
already said that pg_num was 450...
-- 
Regards, 
Sébastien Han. 


On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote: 
> Sorry, I meant pg_num and pgp_num on all pools, as shown by "ceph osd
> dump | grep 'rep size'".
> The default pg_num value of 8 is NOT suitable for a big cluster.
> 
> 2013/3/13 Sébastien Han <han.sebastien@gmail.com>: 
>> Replica count has been set to 2. 
>> 
>> Why? 
>> -- 
>> Regards, 
>> Sébastien Han. 
>> 
>> 
>> On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote: 
>>>> FYI I'm using 450 pgs for my pools. 
>>> Please, can you show the number of object replicas? 
>>> 
>>> ceph osd dump | grep 'rep size' 
>>> 
>>> Vlad Gorbunov 
>>> 
>>> 2013/3/5 Sébastien Han <han.sebastien@gmail.com>: 
>>>> FYI I'm using 450 pgs for my pools. 
>>>> 
>>>> -- 
>>>> Regards, 
>>>> Sébastien Han. 
>>>> 
>>>> 
>>>> On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com> wrote: 
>>>>> 
>>>>> On Fri, 1 Mar 2013, Wido den Hollander wrote: 
>>>>> > On 02/23/2013 01:44 AM, Sage Weil wrote: 
>>>>> > > On Fri, 22 Feb 2013, Sébastien Han wrote:
>>>>> > > > Hi all, 
>>>>> > > > 
>>>>> > > > I finally got a core dump. 
>>>>> > > > 
>>>>> > > > I did it with a kill -SEGV on the OSD process. 
>>>>> > > > 
>>>>> > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
>>>>> > > > 
>>>>> > > > Hope we will get something out of it :-). 
>>>>> > > 
>>>>> > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the
>>>>> > > old scrub code required that), but the new (deep) scrub can take a very
>>>>> > > long time, which means the pg log will eat RAM in the meantime,
>>>>> > > especially under high IOPS.
>>>>> > > 
>>>>> > 
>>>>> > Does the number of PGs influence the memory leak? So my theory is that when 
>>>>> > you have a high number of PGs with a low number of objects per PG you don't 
>>>>> > see the memory leak. 
>>>>> > 
>>>>> > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after 
>>>>> > going to 1024 PGs in a new pool it seemed to be resolved. 
>>>>> > 
>>>>> > I've asked somebody else to try your patch since he's still seeing it on his 
>>>>> > systems. Hopefully that gives us some results. 
>>>>> 
>>>>> The PGs were active+clean when you saw the leak? There is a problem (that 
>>>>> we just fixed in master) where pg logs aren't trimmed for degraded PGs. 
>>>>> 
>>>>> sage 
>>>>> 
>>>>> > 
>>>>> > Wido 
>>>>> > 
>>>>> > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see 
>>>>> > > if that seems to work? Note that that patch shouldn't be run in a mixed 
>>>>> > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is 
>>>>> > > classic or chunky/deep.
>>>>> > > 
>>>>> > > Thanks! 
>>>>> > > sage 
>>>>> > > 
>>>>> > > 
>>>>> > > > -- 
>>>>> > > > Regards, 
>>>>> > > > Sébastien Han.
>>>>> > > > 
>>>>> > > > 
>>>>> > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com> wrote: 
>>>>> > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com>
>>>>> > > > > wrote: 
>>>>> > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active use
>>>>> > > > > > > of the memory profiler will itself cause memory usage to increase;
>>>>> > > > > > > this sounds a bit like that to me since it's staying stable at a large
>>>>> > > > > > > but finite portion of total memory.
>>>>> > > > > > 
>>>>> > > > > > Well, the memory consumption was already high before the profiler was 
>>>>> > > > > > started. So yes, with the memory profiler enabled an OSD might consume
>>>>> > > > > > more memory, but this doesn't cause the memory leaks.
>>>>> > > > > 
>>>>> > > > > My concern is that maybe you saw a leak but when you restarted with 
>>>>> > > > > the memory profiling you lost whatever conditions caused it. 
>>>>> > > > > 
>>>>> > > > > > Any ideas? Nothing to say about my scrubbing theory?
>>>>> > > > > I like it, but Sam indicates that without some heap dumps which
>>>>> > > > > capture the actual leak, scrub is too large to effectively code
>>>>> > > > > review for leaks. :(
>>>>> > > > > -Greg 
>>>>> > 
>>>>> > 
>>>>> > -- 
>>>>> > Wido den Hollander 
>>>>> > 42on B.V. 
>>>>> > 
>>>>> > Phone: +31 (0)20 700 9902 
>>>>> > Skype: contact42on 
>>>>> > 
>>>>> > 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 18:09 ` OSD memory leaks? Dave Spano
@ 2013-03-12 20:10   ` Sébastien Han
  2013-03-12 20:20     ` Greg Farnum
  2013-03-12 21:01     ` Bryan K. Wright
  0 siblings, 2 replies; 48+ messages in thread
From: Sébastien Han @ 2013-03-12 20:10 UTC (permalink / raw)
  To: Dave Spano
  Cc: ceph-devel, Sage Weil, Wido den Hollander, Gregory Farnum,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

Well, to avoid unnecessary data movement, there is also an
_experimental_ feature to change the number of PGs in a pool on the fly.

ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature
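
As for choosing <numpgs>: a minimal sketch of the rule of thumb from the
Ceph placement-group documentation (roughly 100 PGs per OSD, divided by
the replica count, rounded up to a power of two). The function name and
the example cluster sizes are made up for illustration and do not come
from this thread; treat the result as a starting point only.

```python
def suggested_pg_num(num_osds: int, replicas: int, pgs_per_osd: int = 100) -> int:
    """Rule-of-thumb PG count: (OSDs * pgs_per_osd) / replicas, rounded up
    to the next power of two. The default of 100 PGs per OSD is the
    commonly quoted guideline, not a hard requirement."""
    if num_osds < 1 or replicas < 1 or pgs_per_osd < 1:
        raise ValueError("num_osds, replicas and pgs_per_osd must be >= 1")
    raw = (num_osds * pgs_per_osd) / replicas
    power = 1
    while power < raw:
        power *= 2
    return power


if __name__ == "__main__":
    # Hypothetical cluster: 12 OSDs, replica count 2.
    print(suggested_pg_num(12, 2))
```

With 12 OSDs and 2 replicas the raw figure is 600, which rounds up to
1024; a pool left at the default of 8 is far below that.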

Cheers!
--
Regards,
Sébastien Han.


On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com> wrote:
> Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed!
>
> http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924
>
> Dave Spano
> Optogenics
> Systems Administrator

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 20:10   ` Sébastien Han
@ 2013-03-12 20:20     ` Greg Farnum
  2013-03-12 21:15       ` Dave Spano
  2013-03-12 21:01     ` Bryan K. Wright
  1 sibling, 1 reply; 48+ messages in thread
From: Greg Farnum @ 2013-03-12 20:20 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Dave Spano, ceph-devel, Sage Weil, Wido den Hollander,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote:
> Well, to avoid unnecessary data movement, there is also an
> _experimental_ feature to change the number of PGs in a pool on the fly.
>  
> ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature
Don't do that. We have a set of three patches that fix known bugs and aren't in bobtail yet, and I'm sure there are more bugs we aren't aware of.
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com  


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 20:10   ` Sébastien Han
  2013-03-12 20:20     ` Greg Farnum
@ 2013-03-12 21:01     ` Bryan K. Wright
  1 sibling, 0 replies; 48+ messages in thread
From: Bryan K. Wright @ 2013-03-12 21:01 UTC (permalink / raw)
  To: han.sebastien; +Cc: ceph-devel


han.sebastien@gmail.com said:
> Well, to avoid unnecessary data movement, there is also an _experimental_
> feature to change the number of PGs in a pool on the fly.
> ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature 

I've been following the instructions here:

http://ceph.com/docs/master/rados/configuration/osd-config-ref/

under "data placement", trying to set the number of pgs in ceph.conf.
I've added these lines in the "global" section:

        osd pool default pg num = 500
        osd pool default pgp num = 500

but they don't seem to have any effect on how mkcephfs behaves.
Before I added these lines, mkcephfs created a data pool with
3904 pgs.  After wiping everything, adding the lines and 
re-creating the pool, it still ends up with 3904 pgs.  What
am I doing wrong?
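
To rule out a typo or a misplaced section, here is a small sketch that
parses a ceph.conf-style snippet with Python's configparser and reports
what ends up under [global]. It only checks the file contents; it says
nothing about what mkcephfs does with them. SAMPLE and both function
names are hypothetical, a real ceph.conf may contain syntax that
configparser does not accept, and the normalization relies on Ceph
treating spaces and underscores in option names as equivalent.

```python
import configparser

# Hypothetical snippet mirroring the two lines quoted above.
SAMPLE = """\
[global]
osd pool default pg num = 500
osd pool default pgp num = 500
"""


def normalize(name: str) -> str:
    """Fold 'osd pool default pg num' and 'osd_pool_default_pg_num' together."""
    return "_".join(name.replace("_", " ").split()).lower()


def global_settings(text: str) -> dict:
    """Return the options found in the [global] section, keyed by normalized name."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    if not parser.has_section("global"):
        return {}
    return {normalize(key): value for key, value in parser.items("global")}


if __name__ == "__main__":
    for key, value in sorted(global_settings(SAMPLE).items()):
        print(f"{key} = {value}")
```

If the two options show up under a section other than [global], or not
at all, the file being edited is probably not the one being read.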

					Thanks,
					Bryan
-- 
========================================================================
Bryan Wright              |"If you take cranberries and stew them like 
Physics Department        | applesauce, they taste much more like prunes
University of Virginia    | than rhubarb does."  --  Groucho 
Charlottesville, VA  22901|			
(434) 924-7218            |         bryan@virginia.edu
========================================================================



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 20:20     ` Greg Farnum
@ 2013-03-12 21:15       ` Dave Spano
  2013-03-12 21:37         ` Greg Farnum
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Spano @ 2013-03-12 21:15 UTC (permalink / raw)
  To: Greg Farnum
  Cc: ceph-devel, Sage Weil, Wido den Hollander, Sylvain Munaut,
	Samuel Just, Vladislav Gorbunov, Sébastien Han

I'd rather shut the cloud down and copy the pool to a new one than risk corruption by using an experimental feature. My guess is that there cannot be any I/O to the pool while copying; otherwise the changes made during the copy would be lost. Is that correct?
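
The offline copy described above could be sketched as the following dry
run, which only builds and prints the commands. It assumes that
`ceph osd pool create`, `rados cppool` and `ceph osd pool rename` behave
as documented on the installed version. The pool name "images", the PG
count of 512, the ".new"/".old" suffixes and the function name are
placeholders. Whether this matches the linked post step for step is not
confirmed here, and client I/O is assumed to be stopped for the duration.

```python
from typing import List


def pool_copy_plan(pool: str, pg_num: int) -> List[List[str]]:
    """Build (but do not run) the commands for copying a pool into a new
    pool with a larger PG count, then swapping the names."""
    if pg_num < 1:
        raise ValueError("pg_num must be >= 1")
    new_pool = pool + ".new"   # hypothetical temporary name
    old_pool = pool + ".old"   # original is kept as a fallback, not deleted
    return [
        # 1. Create the target pool with the desired pg_num and pgp_num.
        ["ceph", "osd", "pool", "create", new_pool, str(pg_num), str(pg_num)],
        # 2. Copy every object; clients should be stopped before this step.
        ["rados", "cppool", pool, new_pool],
        # 3. Move the original out of the way, then give the copy its name.
        ["ceph", "osd", "pool", "rename", pool, old_pool],
        ["ceph", "osd", "pool", "rename", new_pool, pool],
    ]


if __name__ == "__main__":
    for argv in pool_copy_plan("images", 512):
        print(" ".join(argv))
```

Keeping the original pool under a ".old" name leaves a way back until the
copy has been checked; deleting it is deliberately left out of the sketch.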

Dave Spano 
Optogenics 
Systems Administrator 



----- Original Message ----- 

From: "Greg Farnum" <greg@inktank.com> 
To: "Sébastien Han" <han.sebastien@gmail.com> 
Cc: "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com> 
Sent: Tuesday, March 12, 2013 4:20:13 PM 
Subject: Re: OSD memory leaks? 

On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: 
> Well, to avoid unnecessary data movement, there is also an
> _experimental_ feature to change the number of PGs in a pool on the fly.
> 
> ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature 
Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of… 
-Greg 

Software Engineer #42 @ http://inktank.com | http://ceph.com 

> 
> Cheers! 
> -- 
> Regards, 
> Sébastien Han. 
> 
> 
> On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com> wrote:
> > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed! 
> > 
> > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924 
> > 
> > Dave Spano 
> > Optogenics 
> > Systems Administrator 
> > 
> > 
> > 
> > ----- Original Message ----- 
> > 
> > From: "Dave Spano" <dspano@optogenics.com>
> > To: "Sébastien Han" <han.sebastien@gmail.com>
> > Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> > Sent: Tuesday, March 12, 2013 1:41:21 PM 
> > Subject: Re: OSD memory leaks? 
> > 
> > 
> > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? 
> > 
> > 
> > Dave Spano 
> > 
> > 
> > 
> > ----- Original Message ----- 
> > 
> > From: "Sébastien Han" <han.sebastien@gmail.com>
> > To: "Vladislav Gorbunov" <vadikgo@gmail.com>
> > Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>
> > Sent: Tuesday, March 12, 2013 9:43:44 AM 
> > Subject: Re: OSD memory leaks? 
> > 
> > > Sorry, I meant pg_num and pgp_num on all pools, as shown by "ceph osd
> > > dump | grep 'rep size'".
> >
> > Well, it's still 450 each...
> >
> > > The default pg_num value of 8 is NOT suitable for a big cluster.
> >
> > Thanks, I know; I'm not new to Ceph. What's your point here? I
> > already said that pg_num was 450...
> > -- 
> > Regards, 
> > Sébastien Han. 
> > 
> > 
> > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
> > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
> > > dump | grep 'rep size'" 
> > > The default pg_num value 8 is NOT suitable for big cluster. 
> > > 
> > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com>: 
> > > > Replica count has been set to 2. 
> > > > 
> > > > Why? 
> > > > -- 
> > > > Regards, 
> > > > Sébastien Han. 
> > > > 
> > > > 
> > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote: 
> > > > > > FYI I'm using 450 pgs for my pools. 
> > > > > 
> > > > > 
> > > > > Please, can you show the number of object replicas? 
> > > > > 
> > > > > ceph osd dump | grep 'rep size' 
> > > > > 
> > > > > Vlad Gorbunov 
> > > > > 
> > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com>: 
> > > > > > FYI I'm using 450 pgs for my pools. 
> > > > > > 
> > > > > > -- 
> > > > > > Regards, 
> > > > > > Sébastien Han. 
> > > > > > 
> > > > > > 
> > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com> wrote: 
> > > > > > > 
> > > > > > > On Fri, 1 Mar 2013, Wido den Hollander wrote: 
> > > > > > > > On 02/23/2013 01:44 AM, Sage Weil wrote: 
> > > > > > > > > On Fri, 22 Feb 2013, Sébastien Han wrote: 
> > > > > > > > > > Hi all, 
> > > > > > > > > > 
> > > > > > > > > > I finally got a core dump. 
> > > > > > > > > > 
> > > > > > > > > > I did it with a kill -SEGV on the OSD process. 
> > > > > > > > > > 
> > > > > > > > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
> > > > > > > > > > 
> > > > > > > > > > Hope we will get something out of it :-). 
> > > > > > > > > 
> > > > > > > > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the 
> > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very 
> > > > > > > > > long time, which means the pg log will eat RAM in the meantime, 
> > > > > > > > > especially under high IOPS. 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Does the number of PGs influence the memory leak? So my theory is that when 
> > > > > > > > you have a high number of PGs with a low number of objects per PG you don't 
> > > > > > > > see the memory leak. 
> > > > > > > > 
> > > > > > > > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after 
> > > > > > > > going to 1024 PGs in a new pool it seemed to be resolved. 
> > > > > > > > 
> > > > > > > > I've asked somebody else to try your patch since he's still seeing it on his 
> > > > > > > > systems. Hopefully that gives us some results. 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > The PGs were active+clean when you saw the leak? There is a problem (that 
> > > > > > > we just fixed in master) where pg logs aren't trimmed for degraded PGs. 
> > > > > > > 
> > > > > > > sage 
> > > > > > > 
> > > > > > > > 
> > > > > > > > Wido 
> > > > > > > > 
> > > > > > > > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see 
> > > > > > > > > if that seems to work? Note that that patch shouldn't be run in a mixed 
> > > > > > > > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is 
> > > > > > > > > classic or chunky/deep. 
> > > > > > > > > 
> > > > > > > > > Thanks! 
> > > > > > > > > sage 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > -- 
> > > > > > > > > > Regards, 
> > > > > > > > > > Sébastien Han. 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com> wrote: 
> > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com> wrote: 
> > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active use 
> > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase; 
> > > > > > > > > > > > > this sounds a bit like that to me since it's staying stable at a large 
> > > > > > > > > > > > > but finite portion of total memory. 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Well, the memory consumption was already high before the profiler was 
> > > > > > > > > > > > started. So yes, with the memory profiler enabled an OSD might consume 
> > > > > > > > > > > > more memory, but this doesn't cause the memory leaks. 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > My concern is that maybe you saw a leak but when you restarted with 
> > > > > > > > > > > the memory profiling you lost whatever conditions caused it. 
> > > > > > > > > > > 
> > > > > > > > > > > > Any ideas? Nothing to say about my scrubbing theory? 
> > > > > > > > > > > I like it, but Sam indicates that without some heap dumps which 
> > > > > > > > > > > capture the actual leak then scrub is too large to effectively code 
> > > > > > > > > > > review for leaks. :( 
> > > > > > > > > > > -Greg 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > -- 
> > > > > > > > Wido den Hollander 
> > > > > > > > 42on B.V. 
> > > > > > > > 
> > > > > > > > Phone: +31 (0)20 700 9902 
> > > > > > > > Skype: contact42on 
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> > 
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 21:15       ` Dave Spano
@ 2013-03-12 21:37         ` Greg Farnum
  2013-03-13 13:12           ` Dave Spano
  0 siblings, 1 reply; 48+ messages in thread
From: Greg Farnum @ 2013-03-12 21:37 UTC (permalink / raw)
  To: Dave Spano
  Cc: ceph-devel, Sage Weil, Wido den Hollander, Sylvain Munaut,
	Samuel Just, Vladislav Gorbunov, Sébastien Han

Yeah. There's not anything intelligent about that cppool mechanism. :)
-Greg

On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote:

> I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct?  
>  
> Dave Spano  
> Optogenics  
> Systems Administrator  
>  
>  
>  
> ----- Original Message -----  
>  
> From: "Greg Farnum" <greg@inktank.com>  
> To: "Sébastien Han" <han.sebastien@gmail.com>  
> Cc: "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>  
> Sent: Tuesday, March 12, 2013 4:20:13 PM  
> Subject: Re: OSD memory leaks?  
>  
> On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote:  
> > Well, to avoid unnecessary data movement, there is also an  
> > _experimental_ feature to change the number of PGs in a pool on the fly.  
> >  
> > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature  
> Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of…  
> -Greg  
>  
> Software Engineer #42 @ http://inktank.com | http://ceph.com  
>  
> >  
> > Cheers!  
> > --  
> > Regards,  
> > Sébastien Han.  
> >  
> >  
> > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com> wrote:  
> > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed!  
> > >  
> > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924  
> > >  
> > > Dave Spano  
> > > Optogenics  
> > > Systems Administrator  
> > >  
> > >  




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-12 21:37         ` Greg Farnum
@ 2013-03-13 13:12           ` Dave Spano
  2013-03-13 19:59             ` Sébastien Han
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Spano @ 2013-03-13 13:12 UTC (permalink / raw)
  To: Greg Farnum
  Cc: ceph-devel, Sage Weil, Wido den Hollander, Sylvain Munaut,
	Samuel Just, Vladislav Gorbunov, Sébastien Han

Lol. I'm totally fine with that. My glance images pool isn't used too often. I'm going to give that a try today and see what happens. 
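
For reference, the copy-to-a-new-pool approach can be sketched as the dry-run script below. It assumes the standard `ceph osd pool create`, `rados cppool` and `ceph osd pool rename` commands; the pool names and the PG count are placeholders rather than values taken from this thread, and the script only prints the commands instead of running them. Client I/O to the pool is assumed to be stopped for the duration of the copy, since writes made while the copy runs would be lost.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for copying a pool into a new pool
# that has a larger pg_num. Nothing is executed against the cluster.
# Arguments (all placeholders): old pool name, new pool name, PG count.
plan_pool_copy() {
    old="$1"
    new="$2"
    pgs="$3"
    # 1. Create the destination pool with the desired pg_num and pgp_num.
    echo "ceph osd pool create $new $pgs $pgs"
    # 2. Copy every object from the old pool into the new one.
    echo "rados cppool $old $new"
    # 3. Keep the old pool under a backup name, then swap the new pool in.
    echo "ceph osd pool rename $old $old.old"
    echo "ceph osd pool rename $new $old"
}

# Example: print the plan for a hypothetical "images" pool.
plan_pool_copy images images.new 1024
```

Keeping the old pool under a backup name, rather than deleting it straight away, leaves a way back if the copy turns out to be incomplete.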

I'm still crossing my fingers, but since I added log max recent=10000 to ceph.conf, I've been okay despite the improper pg_num, and a lot of scrubbing/deep scrubbing yesterday. 
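
In ceph.conf terms, that setting might look like the fragment below. Placing it under `[global]` is an assumption; which section was actually used is not stated here.

```ini
[global]
    ; value reported above; the [global] placement is an assumption
    log max recent = 10000
```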

Dave Spano 




----- Original Message ----- 

From: "Greg Farnum" <greg@inktank.com> 
To: "Dave Spano" <dspano@optogenics.com> 
Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>, "Sébastien Han" <han.sebastien@gmail.com> 
Sent: Tuesday, March 12, 2013 5:37:37 PM 
Subject: Re: OSD memory leaks? 

Yeah. There's not anything intelligent about that cppool mechanism. :) 
-Greg 

On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote: 

> I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct? 
> 
> Dave Spano 
> Optogenics 
> Systems Administrator 
> 
> 
> 
> ----- Original Message ----- 
> 
> From: "Greg Farnum" <greg@inktank.com (mailto:greg@inktank.com)> 
> To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
> Cc: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
> Sent: Tuesday, March 12, 2013 4:20:13 PM 
> Subject: Re: OSD memory leaks? 
> 
> On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: 
> > Well to avoid un necessary data movement, there is also an 
> > _experimental_ feature to change on fly the number of PGs in a pool. 
> > 
> > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature 
> Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of… 
> -Greg 
> 
> Software Engineer #42 @ http://inktank.com | http://ceph.com 
> 
> > 
> > Cheers! 
> > -- 
> > Regards, 
> > Sébastien Han. 
> > 
> > 
> > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com (mailto:dspano@optogenics.com)> wrote: 
> > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed! 
> > > 
> > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924 
> > > 
> > > Dave Spano 
> > > Optogenics 
> > > Systems Administrator 
> > > 
> > > 
> > > 
> > > ----- Original Message ----- 
> > > 
> > > From: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)> 
> > > To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
> > > Sent: Tuesday, March 12, 2013 1:41:21 PM 
> > > Subject: Re: OSD memory leaks? 
> > > 
> > > 
> > > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? 
> > > 
> > > 
> > > Dave Spano 
> > > 
> > > 
> > > 
> > > ----- Original Message ----- 
> > > 
> > > From: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
> > > To: "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)> 
> > > Sent: Tuesday, March 12, 2013 9:43:44 AM 
> > > Subject: Re: OSD memory leaks? 
> > > 
> > > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
> > > > dump | grep 'rep size'" 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > Well it's still 450 each... 
> > > 
> > > > The default pg_num value 8 is NOT suitable for big cluster. 
> > > 
> > > Thanks I know, I'm not new with Ceph. What's your point here? I 
> > > already said that pg_num was 450... 
> > > -- 
> > > Regards, 
> > > Sébastien Han. 
> > > 
> > > 
> > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
> > > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
> > > > dump | grep 'rep size'" 
> > > > The default pg_num value 8 is NOT suitable for big cluster. 
> > > > 
> > > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>: 
> > > > > Replica count has been set to 2. 
> > > > > 
> > > > > Why? 
> > > > > -- 
> > > > > Regards, 
> > > > > Sébastien Han. 
> > > > > 
> > > > > 
> > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
> > > > > > > FYI I'm using 450 pgs for my pools. 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Please, can you show the number of object replicas? 
> > > > > > 
> > > > > > ceph osd dump | grep 'rep size' 
> > > > > > 
> > > > > > Vlad Gorbunov 
> > > > > > 
> > > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>: 
> > > > > > > FYI I'm using 450 pgs for my pools. 
> > > > > > > 
> > > > > > > -- 
> > > > > > > Regards, 
> > > > > > > Sébastien Han. 
> > > > > > > 
> > > > > > > 
> > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com (mailto:sage@inktank.com)> wrote: 
> > > > > > > > 
> > > > > > > > On Fri, 1 Mar 2013, Wido den Hollander wrote: 
> > > > > > > > > On 02/23/2013 01:44 AM, Sage Weil wrote: 
> > > > > > > > > > On Fri, 22 Feb 2013, S?bastien Han wrote: 
> > > > > > > > > > > Hi all, 
> > > > > > > > > > > 
> > > > > > > > > > > I finally got a core dump. 
> > > > > > > > > > > 
> > > > > > > > > > > I did it with a kill -SEGV on the OSD process. 
> > > > > > > > > > > 
> > > > > > > > > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
> > > > > > > > > > > 
> > > > > > > > > > > Hope we will get something out of it :-). 
> > > > > > > > > > 
> > > > > > > > > > AHA! We have a theory. The pg log isnt trimmed during scrub (because teh 
> > > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very 
> > > > > > > > > > long time, which means the pg log will eat ram in the meantime.. 
> > > > > > > > > > especially under high iops. 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Does the number of PGs influence the memory leak? So my theory is that when 
> > > > > > > > > you have a high number of PGs with a low number of objects per PG you don't 
> > > > > > > > > see the memory leak. 
> > > > > > > > > 
> > > > > > > > > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after 
> > > > > > > > > going to 1024 PGs in a new pool it seemed to be resolved. 
> > > > > > > > > 
> > > > > > > > > I've asked somebody else to try your patch since he's still seeing it on his 
> > > > > > > > > systems. Hopefully that gives us some results. 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > The PGs were active+clean when you saw the leak? There is a problem (that 
> > > > > > > > we just fixed in master) where pg logs aren't trimmed for degraded PGs. 
> > > > > > > > 
> > > > > > > > sage 
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Wido 
> > > > > > > > > 
> > > > > > > > > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see 
> > > > > > > > > > if that seems to work? Note that that patch shouldn't be run in a mixed 
> > > > > > > > > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is 
> > > > > > > > > > class or chunky/deep. 
> > > > > > > > > > 
> > > > > > > > > > Thanks! 
> > > > > > > > > > sage 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > -- 
> > > > > > > > > > > Regards, 
> > > > > > > > > > > S?bastien Han. 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote: 
> > > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, S?bastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
> > > > > > > > > > > > wrote: 
> > > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active 
> > > > > > > > > > > > > > use 
> > > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase ? 
> > > > > > > > > > > > > > this sounds a bit like that to me since it's staying stable at a 
> > > > > > > > > > > > > > large 
> > > > > > > > > > > > > > but finite portion of total memory. 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Well, the memory consumption was already high before the profiler was 
> > > > > > > > > > > > > started. So yes with the memory profiler enable an OSD might consume 
> > > > > > > > > > > > > more memory but this doesn't cause the memory leaks. 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > My concern is that maybe you saw a leak but when you restarted with 
> > > > > > > > > > > > the memory profiling you lost whatever conditions caused it. 
> > > > > > > > > > > > 
> > > > > > > > > > > > > Any ideas? Nothing to say about my scrumbing theory? 
> > > > > > > > > > > > I like it, but Sam indicates that without some heap dumps which 
> > > > > > > > > > > > capture the actual leak then scrub is too large to effectively code 
> > > > > > > > > > > > review for leaks. :( 
> > > > > > > > > > > > -Greg 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > -- 
> > > > > > > > > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
> > > > > > > > > > > the body of a message to majordomo@vger.kernel.org (mailto:majordomo@vger.kernel.org) 
> > > > > > > > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > -- 
> > > > > > > > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
> > > > > > > > > > the body of a message to majordomo@vger.kernel.org (mailto:majordomo@vger.kernel.org) 
> > > > > > > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > -- 
> > > > > > > > > Wido den Hollander 
> > > > > > > > > 42on B.V. 
> > > > > > > > > 
> > > > > > > > > Phone: +31 (0)20 700 9902 
> > > > > > > > > Skype: contact42on 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-13 13:12           ` Dave Spano
@ 2013-03-13 19:59             ` Sébastien Han
  2013-03-13 22:38               ` Dave Spano
  2013-03-14 21:28               ` Dave Spano
  0 siblings, 2 replies; 48+ messages in thread
From: Sébastien Han @ 2013-03-13 19:59 UTC (permalink / raw)
  To: Dave Spano
  Cc: Greg Farnum, ceph-devel, Sage Weil, Wido den Hollander,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

Dave,

Just to be sure, did the log max recent=10000 _completely_ stop the
memory leak or did it slow it down?

Thanks!
--
Regards,
Sébastien Han.
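
One way to tell "stopped" from "slowed down" is to sample the resident memory of a ceph-osd process at intervals and look at the trend. The following is a minimal sketch, not something posted in this thread: the helper names read_rss_kib and growth_kib_per_hour are made up for illustration, and it assumes a Linux host where /proc/<pid>/status exposes a VmRSS line.

```python
import re
from typing import List, Tuple


def read_rss_kib(pid: int) -> int:
    """Return VmRSS of a process in KiB, read from /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        for line in fh:
            match = re.match(r"VmRSS:\s+(\d+)\s+kB", line)
            if match:
                return int(match.group(1))
    raise RuntimeError(f"no VmRSS entry for pid {pid}")


def growth_kib_per_hour(samples: List[Tuple[float, int]]) -> float:
    """Least-squares slope of (unix_time_seconds, rss_kib) samples, in KiB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    return cov / var_t * 3600.0
```

Collecting one (time.time(), read_rss_kib(pid)) pair every few minutes and feeding the list to growth_kib_per_hour gives a single number to compare before and after the config change. A slope that stays near zero across a round of scrubbing would suggest the growth has stopped; a smaller but still positive slope would suggest it has only slowed.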


On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano <dspano@optogenics.com> wrote:
> Lol. I'm totally fine with that. My glance images pool isn't used too often. I'm going to give that a try today and see what happens.
>
> I'm still crossing my fingers, but since I added log max recent=10000 to ceph.conf, I've been okay despite the improper pg_num, and a lot of scrubbing/deep scrubbing yesterday.
>
> Dave Spano
>
>
>
>
> ----- Original Message -----
>
> From: "Greg Farnum" <greg@inktank.com>
> To: "Dave Spano" <dspano@optogenics.com>
> Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>, "Sébastien Han" <han.sebastien@gmail.com>
> Sent: Tuesday, March 12, 2013 5:37:37 PM
> Subject: Re: OSD memory leaks?
>
> Yeah. There's not anything intelligent about that cppool mechanism. :)
> -Greg
>
> On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote:
>
>> I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct?
>>
>> Dave Spano
>> Optogenics
>> Systems Administrator
>>
>>
>>
>> ----- Original Message -----
>>
>> From: "Greg Farnum" <greg@inktank.com (mailto:greg@inktank.com)>
>> To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>
>> Cc: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)>
>> Sent: Tuesday, March 12, 2013 4:20:13 PM
>> Subject: Re: OSD memory leaks?
>>
>> On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote:
>> > Well, to avoid unnecessary data movement, there is also an
>> > _experimental_ feature to change the number of PGs in a pool on the fly.
>> >
>> > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature
>> Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of…
>> -Greg
>>
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> >
>> > Cheers!
>> > --
>> > Regards,
>> > Sébastien Han.
>> >
>> >
>> > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com (mailto:dspano@optogenics.com)> wrote:
>> > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed!
>> > >
>> > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924
>> > >
>> > > Dave Spano
>> > > Optogenics
>> > > Systems Administrator
>> > >
>> > >
>> > >
>> > > ----- Original Message -----
>> > >
>> > > From: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>
>> > > To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>
>> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)>
>> > > Sent: Tuesday, March 12, 2013 1:41:21 PM
>> > > Subject: Re: OSD memory leaks?
>> > >
>> > >
>> > > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that?
>> > >
>> > >
>> > > Dave Spano
>> > >
>> > >
>> > >
>> > > ----- Original Message -----
>> > >
>> > > From: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>
>> > > To: "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)>
>> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>
>> > > Sent: Tuesday, March 12, 2013 9:43:44 AM
>> > > Subject: Re: OSD memory leaks?
>> > >
>> > > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd
>> > > > dump | grep 'rep size'"
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > Well it's still 450 each...
>> > >
>> > > > The default pg_num value 8 is NOT suitable for big cluster.
>> > >
>> > > Thanks I know, I'm not new with Ceph. What's your point here? I
>> > > already said that pg_num was 450...
>> > > --
>> > > Regards,
>> > > Sébastien Han.
>> > >
>> > >
>> > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote:
>> > > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd
>> > > > dump | grep 'rep size'"
>> > > > The default pg_num value 8 is NOT suitable for big cluster.
>> > > >
>> > > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>:
>> > > > > Replica count has been set to 2.
>> > > > >
>> > > > > Why?
>> > > > > --
>> > > > > Regards,
>> > > > > Sébastien Han.
>> > > > >
>> > > > >
>> > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote:
>> > > > > > > FYI I'm using 450 pgs for my pools.
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Please, can you show the number of object replicas?
>> > > > > >
>> > > > > > ceph osd dump | grep 'rep size'
>> > > > > >
>> > > > > > Vlad Gorbunov
>> > > > > >
>> > > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>:
>> > > > > > > FYI I'm using 450 pgs for my pools.
>> > > > > > >
>> > > > > > > --
>> > > > > > > Regards,
>> > > > > > > Sébastien Han.
>> > > > > > >
>> > > > > > >
>> > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com (mailto:sage@inktank.com)> wrote:
>> > > > > > > >
>> > > > > > > > On Fri, 1 Mar 2013, Wido den Hollander wrote:
>> > > > > > > > > On 02/23/2013 01:44 AM, Sage Weil wrote:
>> > > > > > > > > > On Fri, 22 Feb 2013, Sébastien Han wrote:
>> > > > > > > > > > > Hi all,
>> > > > > > > > > > >
>> > > > > > > > > > > I finally got a core dump.
>> > > > > > > > > > >
>> > > > > > > > > > > I did it with a kill -SEGV on the OSD process.
>> > > > > > > > > > >
>> > > > > > > > > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
>> > > > > > > > > > >
>> > > > > > > > > > > Hope we will get something out of it :-).
>> > > > > > > > > >
>> > > > > > > > > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the
>> > > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very
>> > > > > > > > > > long time, which means the pg log will eat ram in the meantime..
>> > > > > > > > > > especially under high iops.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Does the number of PGs influence the memory leak? So my theory is that when
>> > > > > > > > > you have a high number of PGs with a low number of objects per PG you don't
>> > > > > > > > > see the memory leak.
>> > > > > > > > >
>> > > > > > > > > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after
>> > > > > > > > > going to 1024 PGs in a new pool it seemed to be resolved.
>> > > > > > > > >
>> > > > > > > > > I've asked somebody else to try your patch since he's still seeing it on his
>> > > > > > > > > systems. Hopefully that gives us some results.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > The PGs were active+clean when you saw the leak? There is a problem (that
>> > > > > > > > we just fixed in master) where pg logs aren't trimmed for degraded PGs.
>> > > > > > > >
>> > > > > > > > sage
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Wido
>> > > > > > > > >
>> > > > > > > > > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
>> > > > > > > > > > if that seems to work? Note that that patch shouldn't be run in a mixed
>> > > > > > > > > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is
>> > > > > > > > > > classic or chunky/deep.
>> > > > > > > > > >
>> > > > > > > > > > Thanks!
>> > > > > > > > > > sage
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > > --
>> > > > > > > > > > > Regards,
>> > > > > > > > > > > Sébastien Han.
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote:
>> > > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>
>> > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active
>> > > > > > > > > > > > > > use
>> > > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase;
>> > > > > > > > > > > > > > this sounds a bit like that to me since it's staying stable at a
>> > > > > > > > > > > > > > large
>> > > > > > > > > > > > > > but finite portion of total memory.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Well, the memory consumption was already high before the profiler was
>> > > > > > > > > > > > > started. So yes, with the memory profiler enabled an OSD might consume
>> > > > > > > > > > > > > more memory but this doesn't cause the memory leaks.
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > My concern is that maybe you saw a leak but when you restarted with
>> > > > > > > > > > > > the memory profiling you lost whatever conditions caused it.
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Any ideas? Nothing to say about my scrubbing theory?
>> > > > > > > > > > > > I like it, but Sam indicates that without some heap dumps which
>> > > > > > > > > > > > capture the actual leak then scrub is too large to effectively code
>> > > > > > > > > > > > review for leaks. :(
>> > > > > > > > > > > > -Greg
>> > > > > > > > > --
>> > > > > > > > > Wido den Hollander
>> > > > > > > > > 42on B.V.
>> > > > > > > > >
>> > > > > > > > > Phone: +31 (0)20 700 9902
>> > > > > > > > > Skype: contact42on
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-13 19:59             ` Sébastien Han
@ 2013-03-13 22:38               ` Dave Spano
  2013-03-13 22:52                 ` Greg Farnum
  2013-03-14 21:28               ` Dave Spano
  1 sibling, 1 reply; 48+ messages in thread
From: Dave Spano @ 2013-03-13 22:38 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Greg Farnum, ceph-devel, Sage Weil, Wido den Hollander,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

Sebastien,

I'm not totally sure yet, but everything is still working. 


Sage and Greg, 
I copied my glance image pool per the posting I mentioned previously, and everything works when I use the ceph tools. I can export rbds from the new pool and delete them as well.

I noticed that the copied images pool does not work with glance. 

I get this error when I try to create images in the new pool. If I put the old pool back, I can create images no problem. 

Is there something I'm missing in glance that I need to work with a pool created in bobtail? I'm using OpenStack Folsom.

  File "/usr/lib/python2.7/dist-packages/glance/api/v1/images.py", line 437, in _upload                 
    image_meta['size'])                                                                                  
  File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 244, in add                          
    image_size, order)                                                                                   
  File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 207, in _create_image                
    features=rbd.RBD_FEATURE_LAYERING)                                                                   
  File "/usr/lib/python2.7/dist-packages/rbd.py", line 194, in create                                    
    raise make_ex(ret, 'error creating image')                                                           
PermissionError: error creating image


Dave Spano 
 



----- Original Message ----- 

From: "Sébastien Han" <han.sebastien@gmail.com> 
To: "Dave Spano" <dspano@optogenics.com> 
Cc: "Greg Farnum" <greg@inktank.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com> 
Sent: Wednesday, March 13, 2013 3:59:03 PM 
Subject: Re: OSD memory leaks? 

Dave, 

Just to be sure, did the log max recent=10000 _completely_ stop the
memory leak or did it slow it down? 

Thanks! 
-- 
Regards, 
Sébastien Han. 


On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano <dspano@optogenics.com> wrote: 
> Lol. I'm totally fine with that. My glance images pool isn't used too often. I'm going to give that a try today and see what happens. 
> 
> I'm still crossing my fingers, but since I added log max recent=10000 to ceph.conf, I've been okay despite the improper pg_num, and a lot of scrubbing/deep scrubbing yesterday. 
> 
> Dave Spano 
> 
> 
> 
> 
> ----- Original Message ----- 
> 
> From: "Greg Farnum" <greg@inktank.com> 
> To: "Dave Spano" <dspano@optogenics.com> 
> Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>, "Sébastien Han" <han.sebastien@gmail.com> 
> Sent: Tuesday, March 12, 2013 5:37:37 PM 
> Subject: Re: OSD memory leaks? 
> 
> Yeah. There's not anything intelligent about that cppool mechanism. :) 
> -Greg 
> 
> On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote: 
> 
>> I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct? 
>> 
>> Dave Spano 
>> Optogenics 
>> Systems Administrator 
>> 
>> 
>> 
>> ----- Original Message ----- 
>> 
>> From: "Greg Farnum" <greg@inktank.com (mailto:greg@inktank.com)> 
>> To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> Cc: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
>> Sent: Tuesday, March 12, 2013 4:20:13 PM 
>> Subject: Re: OSD memory leaks? 
>> 
>> On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: 
>> > Well, to avoid unnecessary data movement, there is also an
>> > _experimental_ feature to change the number of PGs in a pool on the fly.
>> > 
>> > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature 
>> Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of… 
>> -Greg 
>> 
>> Software Engineer #42 @ http://inktank.com | http://ceph.com 
>> 
>> > 
>> > Cheers! 
>> > -- 
>> > Regards, 
>> > Sébastien Han. 
>> > 
>> > 
>> > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com (mailto:dspano@optogenics.com)> wrote: 
>> > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed! 
>> > > 
>> > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924 
>> > > 
>> > > Dave Spano 
>> > > Optogenics 
>> > > Systems Administrator 
>> > > 
>> > > 
>> > > 
>> > > ----- Original Message ----- 
>> > > 
>> > > From: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)> 
>> > > To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
>> > > Sent: Tuesday, March 12, 2013 1:41:21 PM 
>> > > Subject: Re: OSD memory leaks? 
>> > > 
>> > > 
>> > > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? 
>> > > 
>> > > 
>> > > Dave Spano 
>> > > 
>> > > 
>> > > 
>> > > ----- Original Message ----- 
>> > > 
>> > > From: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> > > To: "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
>> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)> 
>> > > Sent: Tuesday, March 12, 2013 9:43:44 AM 
>> > > Subject: Re: OSD memory leaks? 
>> > > 
>> > > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
>> > > > dump | grep 'rep size'" 
>> > > 
>> > > 
>> > > 
>> > > 
>> > > 
>> > > Well it's still 450 each... 
>> > > 
>> > > > The default pg_num value 8 is NOT suitable for big cluster. 
>> > > 
>> > > Thanks I know, I'm not new with Ceph. What's your point here? I 
>> > > already said that pg_num was 450... 
>> > > -- 
>> > > Regards, 
>> > > Sébastien Han. 
>> > > 
>> > > 
>> > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
>> > > > Sorry, i mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
>> > > > dump | grep 'rep size'" 
>> > > > The default pg_num value 8 is NOT suitable for big cluster. 
>> > > > 
>> > > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>: 
>> > > > > Replica count has been set to 2. 
>> > > > > 
>> > > > > Why? 
>> > > > > -- 
>> > > > > Regards, 
>> > > > > Sébastien Han. 
>> > > > > 
>> > > > > 
>> > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
>> > > > > > > FYI I'm using 450 pgs for my pools. 
>> > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > Please, can you show the number of object replicas? 
>> > > > > > 
>> > > > > > ceph osd dump | grep 'rep size' 
>> > > > > > 
>> > > > > > Vlad Gorbunov 
>> > > > > > 
>> > > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>: 
>> > > > > > > FYI I'm using 450 pgs for my pools. 
>> > > > > > > 
>> > > > > > > -- 
>> > > > > > > Regards, 
>> > > > > > > Sébastien Han. 
>> > > > > > > 
>> > > > > > > 
>> > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com (mailto:sage@inktank.com)> wrote: 
>> > > > > > > > 
>> > > > > > > > On Fri, 1 Mar 2013, Wido den Hollander wrote: 
>> > > > > > > > > On 02/23/2013 01:44 AM, Sage Weil wrote: 
>> > > > > > > > > > On Fri, 22 Feb 2013, Sébastien Han wrote:
>> > > > > > > > > > > Hi all, 
>> > > > > > > > > > > 
>> > > > > > > > > > > I finally got a core dump. 
>> > > > > > > > > > > 
>> > > > > > > > > > > I did it with a kill -SEGV on the OSD process. 
>> > > > > > > > > > > 
>> > > > > > > > > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
>> > > > > > > > > > > 
>> > > > > > > > > > > Hope we will get something out of it :-). 
>> > > > > > > > > > 
>> > > > > > > > > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the
>> > > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very 
>> > > > > > > > > > long time, which means the pg log will eat ram in the meantime.. 
>> > > > > > > > > > especially under high iops. 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > Does the number of PGs influence the memory leak? So my theory is that when 
>> > > > > > > > > you have a high number of PGs with a low number of objects per PG you don't 
>> > > > > > > > > see the memory leak. 
>> > > > > > > > > 
>> > > > > > > > > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after 
>> > > > > > > > > going to 1024 PGs in a new pool it seemed to be resolved. 
>> > > > > > > > > 
>> > > > > > > > > I've asked somebody else to try your patch since he's still seeing it on his 
>> > > > > > > > > systems. Hopefully that gives us some results. 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > The PGs were active+clean when you saw the leak? There is a problem (that 
>> > > > > > > > we just fixed in master) where pg logs aren't trimmed for degraded PGs. 
>> > > > > > > > 
>> > > > > > > > sage 
>> > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > Wido 
>> > > > > > > > > 
>> > > > > > > > > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see 
>> > > > > > > > > > if that seems to work? Note that that patch shouldn't be run in a mixed 
>> > > > > > > > > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is 
>> > > > > > > > > > classic or chunky/deep.
>> > > > > > > > > > 
>> > > > > > > > > > Thanks! 
>> > > > > > > > > > sage 
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > > -- 
>> > > > > > > > > > > Regards, 
>> > > > > > > > > > > Sébastien Han.
>> > > > > > > > > > > 
>> > > > > > > > > > > 
>> > > > > > > > > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote: 
>> > > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>
>> > > > > > > > > > > > wrote: 
>> > > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active 
>> > > > > > > > > > > > > > use 
>> > > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase;
>> > > > > > > > > > > > > > this sounds a bit like that to me since it's staying stable at a 
>> > > > > > > > > > > > > > large 
>> > > > > > > > > > > > > > but finite portion of total memory. 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > Well, the memory consumption was already high before the profiler was 
>> > > > > > > > > > > > > started. So yes, with the memory profiler enabled an OSD might consume
>> > > > > > > > > > > > > more memory but this doesn't cause the memory leaks. 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > My concern is that maybe you saw a leak but when you restarted with 
>> > > > > > > > > > > > the memory profiling you lost whatever conditions caused it. 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > > Any ideas? Nothing to say about my scrubbing theory?
>> > > > > > > > > > > > I like it, but Sam indicates that without some heap dumps which 
>> > > > > > > > > > > > capture the actual leak then scrub is too large to effectively code 
>> > > > > > > > > > > > review for leaks. :( 
>> > > > > > > > > > > > -Greg 
>> > > > > > > > > -- 
>> > > > > > > > > Wido den Hollander 
>> > > > > > > > > 42on B.V. 
>> > > > > > > > > 
>> > > > > > > > > Phone: +31 (0)20 700 9902 
>> > > > > > > > > Skype: contact42on 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: OSD memory leaks?
  2013-03-13 22:38               ` Dave Spano
@ 2013-03-13 22:52                 ` Greg Farnum
  2013-03-14  0:05                   ` Dave Spano
  0 siblings, 1 reply; 48+ messages in thread
From: Greg Farnum @ 2013-03-13 22:52 UTC (permalink / raw)
  To: Dave Spano
  Cc: Sébastien Han, ceph-devel, Sage Weil, Wido den Hollander,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

It sounds like maybe you didn't rename the new pool to use the old pool's name? Glance is looking for a specific pool to store its data in; I believe it's configurable but you'll need to do one or the other.
-Greg
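
The rename option could be sketched roughly as below, assuming the "ceph osd pool rename <old> <new>" command is available in the release in use. The pool names "images" and "images-new" and the "-old" suffix are hypothetical placeholders, and swap_pool_commands / run are illustrative helpers rather than anything shipped with Ceph or Glance. The alternative is to leave the pools alone and point Glance at the new pool name instead, which should be the rbd_store_pool option in glance-api.conf.

```python
import subprocess
from typing import List


def swap_pool_commands(live: str, copy: str, backup_suffix: str = "-old") -> List[List[str]]:
    """Build the two renames that move `copy` into the name Glance expects.

    The current pool is renamed out of the way first, so nothing is deleted.
    """
    backup = live + backup_suffix
    return [
        ["ceph", "osd", "pool", "rename", live, backup],
        ["ceph", "osd", "pool", "rename", copy, live],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> List[str]:
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for argv in commands:
        line = " ".join(argv)
        print(line)
        printed.append(line)
        if not dry_run:
            subprocess.check_call(argv)
    return printed


if __name__ == "__main__":
    # Hypothetical pool names; dry run only, nothing is executed.
    run(swap_pool_commands("images", "images-new"))
```

Keeping the old pool under a backup name rather than deleting it leaves a way back if the copied pool turns out to be incomplete. As noted earlier in the thread, clients should not be writing to the pool while it is being copied or renamed.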

On Wednesday, March 13, 2013 at 3:38 PM, Dave Spano wrote:

> Sebastien,
>  
> I'm not totally sure yet, but everything is still working.  
>  
>  
> Sage and Greg,  
> I copied my glance image pool per the posting I mentioned previously, and everything works when I use the ceph tools. I can export rbds from the new pool and delete them as well.
>  
> I noticed that the copied images pool does not work with glance.  
>  
> I get this error when I try to create images in the new pool. If I put the old pool back, I can create images no problem.  
>  
> Is there something I'm missing in glance that I need to work with a pool created in bobtail? I'm using Openstack Folsom.  
>  
>   File "/usr/lib/python2.7/dist-packages/glance/api/v1/images.py", line 437, in _upload
>     image_meta['size'])
>   File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 244, in add
>     image_size, order)
>   File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 207, in _create_image
>     features=rbd.RBD_FEATURE_LAYERING)
>   File "/usr/lib/python2.7/dist-packages/rbd.py", line 194, in create
>     raise make_ex(ret, 'error creating image')
> PermissionError: error creating image
>  
>  
> Dave Spano  
>  
>  
>  
>  
> ----- Original Message -----  
>  
> From: "Sébastien Han" <han.sebastien@gmail.com>
> To: "Dave Spano" <dspano@optogenics.com>
> Cc: "Greg Farnum" <greg@inktank.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> Sent: Wednesday, March 13, 2013 3:59:03 PM  
> Subject: Re: OSD memory leaks?  
>  
> Dave,  
>  
> Just to be sure, did the log max recent=10000 _completely_ stop the
> memory leak or did it slow it down?
>  
> Thanks!  
> --  
> Regards,  
> Sébastien Han.  
>  
>  
> On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano <dspano@optogenics.com> wrote:
> > Lol. I'm totally fine with that. My glance images pool isn't used too often. I'm going to give that a try today and see what happens.  
> >  
> > I'm still crossing my fingers, but since I added log max recent=10000 to ceph.conf, I've been okay despite the improper pg_num, and a lot of scrubbing/deep scrubbing yesterday.  
> >  
> > Dave Spano  
> >  
> >  
> >  
> >  
> > ----- Original Message -----  
> >  
> > From: "Greg Farnum" <greg@inktank.com>
> > To: "Dave Spano" <dspano@optogenics.com>
> > Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>, "Sébastien Han" <han.sebastien@gmail.com>
> > Sent: Tuesday, March 12, 2013 5:37:37 PM  
> > Subject: Re: OSD memory leaks?  
> >  
> > Yeah. There's not anything intelligent about that cppool mechanism. :)  
> > -Greg  
> >  
> > On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote:  
> >  
> > > I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct?  
> > >  
> > > Dave Spano  
> > > Optogenics  
> > > Systems Administrator  
> > >  
> > >  
> > >  
> > > ----- Original Message -----  
> > >  
> > > From: "Greg Farnum" <greg@inktank.com>
> > > To: "Sébastien Han" <han.sebastien@gmail.com>
> > > Cc: "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> > > Sent: Tuesday, March 12, 2013 4:20:13 PM  
> > > Subject: Re: OSD memory leaks?  
> > >  
> > > On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote:  
> > > > Well, to avoid unnecessary data movement, there is also an
> > > > _experimental_ feature to change the number of PGs in a pool on the fly.
> > > >  
> > > > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature  
> > > Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of…  
> > > -Greg  
> > >  
> > > Software Engineer #42 @ http://inktank.com | http://ceph.com  
> > >  
> > > >  
> > > > Cheers!  
> > > > --  
> > > > Regards,  
> > > > Sébastien Han.  
> > > >  
> > > >  
> > > > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com> wrote:
> > > > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed!  
> > > > >  
> > > > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924  
> > > > >  
> > > > > Dave Spano  
> > > > > Optogenics  
> > > > > Systems Administrator  
> > > > >  
> > > > >  
> > > > >  
> > > > > ----- Original Message -----  
> > > > >  
> > > > > From: "Dave Spano" <dspano@optogenics.com>
> > > > > To: "Sébastien Han" <han.sebastien@gmail.com>
> > > > > Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> > > > > Sent: Tuesday, March 12, 2013 1:41:21 PM  
> > > > > Subject: Re: OSD memory leaks?  
> > > > >  
> > > > >  
> > > > > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that?  
> > > > >  
> > > > >  
> > > > > Dave Spano  
> > > > >  
> > > > >  
> > > > >  
> > > > > ----- Original Message -----  
> > > > >  
> > > > > From: "Sébastien Han" <han.sebastien@gmail.com>
> > > > > To: "Vladislav Gorbunov" <vadikgo@gmail.com>
> > > > > Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>
> > > > > Sent: Tuesday, March 12, 2013 9:43:44 AM  
> > > > > Subject: Re: OSD memory leaks?  
> > > > >  
> > > > > > Sorry, I mean pg_num and pgp_num on all pools. Shown by
> > > > > > "ceph osd dump | grep 'rep size'"
> > > > >
> > > > > Well it's still 450 each...  
> > > > >  
> > > > > > The default pg_num value 8 is NOT suitable for a big cluster.
> > > > >  
> > > > > Thanks I know, I'm not new with Ceph. What's your point here? I  
> > > > > already said that pg_num was 450...  
> > > > > --  
> > > > > Regards,  
> > > > > Sébastien Han.  
> > > > >  
> > > > >  
> > > > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote:
> > > > > > Sorry, I mean pg_num and pgp_num on all pools. Shown by
> > > > > > "ceph osd dump | grep 'rep size'"
> > > > > > The default pg_num value 8 is NOT suitable for a big cluster.
> > > > > >
> > > > > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com>:
> > > > > > > Replica count has been set to 2.  
> > > > > > >  
> > > > > > > Why?  
> > > > > > > --  
> > > > > > > Regards,  
> > > > > > > Sébastien Han.  
> > > > > > >  
> > > > > > >  
> > > > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote:
> > > > > > > > > FYI I'm using 450 pgs for my pools.
> > > > > > > >
> > > > > > > > Please, can you show the number of object replicas?  
> > > > > > > >  
> > > > > > > > ceph osd dump | grep 'rep size'  
> > > > > > > >  
> > > > > > > > Vlad Gorbunov  
> > > > > > > >  
> > > > > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com>:
> > > > > > > > > FYI I'm using 450 pgs for my pools.  
> > > > > > > > >  
> > > > > > > > > --  
> > > > > > > > > Regards,  
> > > > > > > > > Sébastien Han.  
> > > > > > > > >  
> > > > > > > > >  
> > > > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com> wrote:
> > > > > > > > > >  
> > > > > > > > > > On Fri, 1 Mar 2013, Wido den Hollander wrote:  
> > > > > > > > > > > On 02/23/2013 01:44 AM, Sage Weil wrote:  
> > > > > > > > > > > > On Fri, 22 Feb 2013, Sébastien Han wrote:
> > > > > > > > > > > > > Hi all,  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > I finally got a core dump.  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > I did it with a kill -SEGV on the OSD process.  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > Hope we will get something out of it :-).  
> > > > > > > > > > > >  
> > > > > > > > > > > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the
> > > > > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very
> > > > > > > > > > > > long time, which means the pg log will eat RAM in the meantime,
> > > > > > > > > > > > especially under high IOPS.
> > > > > > > > > > >
> > > > > > > > > > > Does the number of PGs influence the memory leak? So my theory is that when  
> > > > > > > > > > > you have a high number of PGs with a low number of objects per PG you don't  
> > > > > > > > > > > see the memory leak.  
> > > > > > > > > > >  
> > > > > > > > > > > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after  
> > > > > > > > > > > going to 1024 PGs in a new pool it seemed to be resolved.  
> > > > > > > > > > >  
> > > > > > > > > > > I've asked somebody else to try your patch since he's still seeing it on his  
> > > > > > > > > > > systems. Hopefully that gives us some results.  
> > > > > > > > > >  
> > > > > > > > > > The PGs were active+clean when you saw the leak? There is a problem (that  
> > > > > > > > > > we just fixed in master) where pg logs aren't trimmed for degraded PGs.  
> > > > > > > > > >  
> > > > > > > > > > sage  
> > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > > > Wido  
> > > > > > > > > > >  
> > > > > > > > > > > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
> > > > > > > > > > > > if that seems to work? Note that that patch shouldn't be run in a mixed
> > > > > > > > > > > > argonaut+bobtail cluster, since it isn't properly checking if the scrub is
> > > > > > > > > > > > classic or chunky/deep.
> > > > > > > > > > > >  
> > > > > > > > > > > > Thanks!  
> > > > > > > > > > > > sage  
> > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > > > > --  
> > > > > > > > > > > > > Regards,  
> > > > > > > > > > > > > Sébastien Han.
> > > > > > > > > > > > >  
> > > > > > > > > > > > >  
> > > > > > > > > > > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com> wrote:
> > > > > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> > > > > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active use
> > > > > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase;
> > > > > > > > > > > > > > > > this sounds a bit like that to me since it's staying stable at a large
> > > > > > > > > > > > > > > > but finite portion of total memory.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Well, the memory consumption was already high before the profiler was
> > > > > > > > > > > > > > > started. So yes, with the memory profiler enabled an OSD might consume
> > > > > > > > > > > > > > > more memory, but this doesn't cause the memory leaks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > My concern is that maybe you saw a leak but when you restarted with
> > > > > > > > > > > > > > the memory profiling you lost whatever conditions caused it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Any ideas? Nothing to say about my scrubbing theory?
> > > > > > > > > > > > > > I like it, but Sam indicates that without some heap dumps which  
> > > > > > > > > > > > > > capture the actual leak then scrub is too large to effectively code  
> > > > > > > > > > > > > > review for leaks. :(  
> > > > > > > > > > > > > > -Greg  
> > > > > > > > > > > > >  
> > > > > > > > > > > >  
> > > > > > > > > > >  
> > > > > > > > > > > --  
> > > > > > > > > > > Wido den Hollander  
> > > > > > > > > > > 42on B.V.  
> > > > > > > > > > >  
> > > > > > > > > > > Phone: +31 (0)20 700 9902  
> > > > > > > > > > > Skype: contact42on  
> > > > > > > > > >  
> > > > > > > > >  
> > > > > > > >  
> > > > > > >  
> > > > > >  
> > > > >  
> > > >  
> > >  
> >  
>  




* Re: OSD memory leaks?
  2013-03-13 22:52                 ` Greg Farnum
@ 2013-03-14  0:05                   ` Dave Spano
  2013-03-14  0:15                     ` Josh Durgin
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Spano @ 2013-03-14  0:05 UTC (permalink / raw)
  To: Greg Farnum
  Cc: Sébastien Han, ceph-devel, Sage Weil, Wido den Hollander,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

I renamed the old one from images to images-old, and the new one from images-new to images. 
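
After a swap like this it may be worth confirming that the pool now named "images" really carries the intended pg_num and pgp_num. The "ceph osd dump | grep 'rep size'" check quoted further down prints one line per pool; the rough sketch below parses such lines. The sample text is illustrative only, written in the layout I would expect from a bobtail-era cluster rather than captured from one, and the function names are hypothetical.

```python
import re

# Matches pool lines such as:
#   pool 3 'images' rep size 2 ... pg_num 450 pgp_num 450 ...
POOL_LINE = re.compile(
    r"^pool \d+ '(?P<name>[^']+)' .*?\bpg_num (?P<pg_num>\d+) pgp_num (?P<pgp_num>\d+)"
)

# Illustrative sample, not real cluster output.
SAMPLE = """\
pool 3 'images' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 450 pgp_num 450 last_change 12 owner 0
pool 4 'images-old' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 13 owner 0
"""


def pool_pg_counts(osd_dump_text):
    """Map pool name -> (pg_num, pgp_num) for every pool line found."""
    counts = {}
    for line in osd_dump_text.splitlines():
        match = POOL_LINE.match(line.strip())
        if match:
            counts[match.group("name")] = (
                int(match.group("pg_num")),
                int(match.group("pgp_num")),
            )
    return counts


def undersized_pools(counts, minimum):
    """Return, sorted, the pools whose pg_num or pgp_num is below minimum."""
    return sorted(
        name for name, (pg_num, pgp_num) in counts.items()
        if pg_num < minimum or pgp_num < minimum
    )
```

Feeding it the real output of "ceph osd dump" in place of SAMPLE would show at a glance whether any pool is still sitting at the default of 8.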

Dave Spano
Optogenics
Systems Administrator



----- Original Message -----
From: Greg Farnum <greg@inktank.com>
To: Dave Spano <dspano@optogenics.com>
Cc: Sébastien Han <han.sebastien@gmail.com>, ceph-devel <ceph-devel@vger.kernel.org>, Sage Weil <sage@inktank.com>, Wido den Hollander <wido@42on.com>, Sylvain Munaut <s.munaut@whatever-company.com>, Samuel Just <sam.just@inktank.com>, Vladislav Gorbunov <vadikgo@gmail.com>
Sent: Wed, 13 Mar 2013 18:52:29 -0400 (EDT)
Subject: Re: OSD memory leaks?

It sounds like maybe you didn't rename the new pool to use the old pool's name? Glance is looking for a specific pool to store its data in; I believe it's configurable but you'll need to do one or the other.
-Greg

On Wednesday, March 13, 2013 at 3:38 PM, Dave Spano wrote:

> Sebastien,
>
> I'm not totally sure yet, but everything is still working.
>
> Sage and Greg,
> I copied my glance image pool per the posting I mentioned previously, and everything works when I use the ceph tools. I can export rbds from the new pool and delete them as well.
>
> I noticed that the copied images pool does not work with glance.
>
> I get this error when I try to create images in the new pool. If I put the old pool back, I can create images no problem.
>
> Is there something I'm missing in glance that I need to work with a pool created in bobtail? I'm using Openstack Folsom.
>
>   File "/usr/lib/python2.7/dist-packages/glance/api/v1/images.py", line 437, in _upload
>     image_meta['size'])
>   File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 244, in add
>     image_size, order)
>   File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 207, in _create_image
>     features=rbd.RBD_FEATURE_LAYERING)
>   File "/usr/lib/python2.7/dist-packages/rbd.py", line 194, in create
>     raise make_ex(ret, 'error creating image')
> PermissionError: error creating image
>
> Dave Spano
>
> ----- Original Message -----
>
> From: "Sébastien Han" <han.sebastien@gmail.com>
> To: "Dave Spano" <dspano@optogenics.com>
> Cc: "Greg Farnum" <greg@inktank.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> Sent: Wednesday, March 13, 2013 3:59:03 PM
> Subject: Re: OSD memory leaks?
>
> Dave,
>
> Just to be sure, did the log max recent=10000 _completely_ stop the
> memory leak or did it slow it down?
>
> Thanks!
> --
> Regards,
> Sébastien Han.
>
> On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano <dspano@optogenics.com> wrote:
> > Lol. I'm totally fine with that. My glance images pool isn't used too often. I'm going to give that a try today and see what happens.
> >
> > I'm still crossing my fingers, but since I added log max recent=10000 to ceph.conf, I've been okay despite the improper pg_num, and a lot of scrubbing/deep scrubbing yesterday.
> >
> > Dave Spano
> >
> > ----- Original Message -----
> >
> > From: "Greg Farnum" <greg@inktank.com>
> > To: "Dave Spano" <dspano@optogenics.com>
> > Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>, "Sébastien Han" <han.sebastien@gmail.com>
> > Sent: Tuesday, March 12, 2013 5:37:37 PM
> > Subject: Re: OSD memory leaks?
> >
> > Yeah. There's not anything intelligent about that cppool mechanism. :)
> > -Greg
> >
> > On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote:
> >
> > > I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct?
> > >
> > > Dave Spano
> > > Optogenics
> > > Systems Administrator
> > >
> > > ----- Original Message -----
> > >
> > > From: "Greg Farnum" <greg@inktank.com>
> > > To: "Sébastien Han" <han.sebastien@gmail.com>
> > > Cc: "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> > > Sent: Tuesday, March 12, 2013 4:20:13 PM
> > > Subject: Re: OSD memory leaks?
> > >
> > > On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote:
> > > > Well, to avoid unnecessary data movement, there is also an
> > > > _experimental_ feature to change the number of PGs in a pool on the fly.
> > > >
> > > > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature
> > > Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of…
> > > -Greg
> > >
> > > Software Engineer #42 @ http://inktank.com | http://ceph.com
> > >
> > > >
> > > > Cheers!
> > > > --
> > > > Regards,
> > > > Sébastien Han.
> > > >
> > > > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com> wrote:
> > > > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed!
> > > > >
> > > > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924
> > > > >
> > > > > Dave Spano
> > > > > Optogenics
> > > > > Systems Administrator
> > > > >
> > > > > ----- Original Message -----
> > > > >
> > > > > From: "Dave Spano" <dspano@optogenics.com>
> > > > > To: "Sébastien Han" <han.sebastien@gmail.com>
> > > > > Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>
> > > > > Sent: Tuesday, March 12, 2013 1:41:21 PM
> > > > > Subject: Re: OSD memory leaks?
> > > > >
> > > > > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that?
> > > > >
> > > > > Dave Spano
> > > > >
> > > > > ----- Original Message -----
> > > > >
> > > > > From: "Sébastien Han" <han.sebastien@gmail.com>
> > > > > To: "Vladislav Gorbunov" <vadikgo@gmail.com>
> > > > > Cc: "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Gregory Farnum" <greg@inktank.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Dave Spano" <dspano@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.just@inktank.com>
> > > > > Sent: Tuesday, March 12, 2013 9:43:44 AM
> > > > > Subject: Re: OSD memory leaks?
> > > > >
> > > > > > Sorry, I mean pg_num and pgp_num on all pools. Shown by
> > > > > > "ceph osd dump | grep 'rep size'"
> > > > >
> > > > > Well it's still 450 each...
> > > > >
> > > > > > The default pg_num value 8 is NOT suitable for a big cluster.
> > > > >
> > > > > Thanks I know, I'm not new with Ceph. What's your point here? I
> > > > > already said that pg_num was 450...
> > > > > --
> > > > > Regards,
> > > > > Sébastien Han.
> > > > >
> > > > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote:
> > > > > > Sorry, I mean pg_num and pgp_num on all pools. Shown by
> > > > > > "ceph osd dump | grep 'rep size'"
> > > > > > The default pg_num value 8 is NOT suitable for a big cluster.
> > > > > >
> > > > > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com>:
> > > > > > > Replica count has been set to 2.
> > > > > > >
> > > > > > > Why?
> > > > > > > --
> > > > > > > Regards,
> > > > > > > Sébastien Han.
> > > > > > >
> > > > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com> wrote:
> > > > > > > > > FYI I'm using 450 pgs for my pools.
> > > > > > > >
> > > > > > > > Please, can you show the number of object replicas?
> > > > > > > >
> > > > > > > > ceph osd dump | grep 'rep size'
> > > > > > > >
> > > > > > > > Vlad Gorbunov
> > > > > > > >
> > > > > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com>:
> > > > > > > > > FYI I'm using 450 pgs for my pools.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards,
> > > > > > > > > Sébastien Han.
> > > > > > > > >
> > > > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com> wrote:
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Fri, 1 Mar 2013, Wido den Hollander wrote: 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On 02/23/2013 01:44 AM, Sage Weil wrote: 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Fri, 22 Feb 2013, S?bastien Han wrote: 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hi all, 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; I finally got a core dump. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; I did it with a kill -SEGV on the OSD process. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Hope we will get something out of it :-). 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; AHA! We have a theory. The pg log isn't trimmed during scrub (because the 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; old scrub code required that), but the new (deep) scrub can take a very 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; long time, which means the pg log will eat RAM in the meantime, 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; especially under high IOPS. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Does the number of PGs influence the memory leak? So my theory is that when 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; you have a high number of PGs with a low number of objects per PG you don't 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; see the memory leak. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; I saw the memory leak on a RBD system where a pool had just 8 PGs, but after 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; going to 1024 PGs in a new pool it seemed to be resolved. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; I've asked somebody else to try your patch since he's still seeing it on his 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; systems. Hopefully that gives us some results. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; The PGs were active+clean when you saw the leak? There is a problem (that 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; we just fixed in master) where pg logs aren't trimmed for degraded PGs. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; sage 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Wido 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; if that seems to work? Note that this patch shouldn't be run in a mixed 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; argonaut+bobtail cluster, since it doesn't properly check whether the scrub 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; is classic or chunky/deep. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Thanks! 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; sage 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; -- 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Regards, 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Sébastien Han. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum &lt;greg@inktank.com (mailto:greg@inktank.com)&gt; wrote: 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han &lt;han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)&gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; wrote: 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Is osd.1 using the heap profiler as well? Keep in mind that active use 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; of the memory profiler will itself cause memory usage to increase; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; this sounds a bit like that to me since it's staying stable at a large 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; but finite portion of total memory. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Well, the memory consumption was already high before the profiler was 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; started. So yes, with the memory profiler enabled an OSD might consume 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; more memory, but this doesn't cause the memory leaks. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; My concern is that maybe you saw a leak but when you restarted with 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; the memory profiling you lost whatever conditions caused it. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Any ideas? Nothing to say about my scrubbing theory? 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; I like it, but Sam indicates that without some heap dumps that 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; capture the actual leak, scrub is too large to effectively code 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; review for leaks. :( 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; -Greg 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; -- 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Wido den Hollander 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 42on B.V. 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Phone: +31 (0)20 700 9902 
&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; Skype: contact42on 


* Re: OSD memory leaks?
  2013-03-14  0:05                   ` Dave Spano
@ 2013-03-14  0:15                     ` Josh Durgin
  0 siblings, 0 replies; 48+ messages in thread
From: Josh Durgin @ 2013-03-14  0:15 UTC (permalink / raw)
  To: Dave Spano
  Cc: Greg Farnum, Sébastien Han, ceph-devel, Sage Weil,
	Wido den Hollander, Sylvain Munaut, Samuel Just,
	Vladislav Gorbunov

On 03/13/2013 05:05 PM, Dave Spano wrote:
> I renamed the old one from images to images-old, and the new one from images-new to images.

This reminds me of a problem you might hit with this:

RBD clones track the parent image pool by id, so they'll continue
working after the pool is renamed. If you have any clones of the
images-old pool, they'll stop working when that pool is deleted.

To get around this, you'll need to flatten any clones whose parents are
in images-old before deleting the images-old pool.
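
A minimal sketch of that flatten step, assuming the `rbd children` and
`rbd flatten` subcommands are available in your release. The helper name
`flatten_children` and the image and snapshot names in the usage comment are
hypothetical, not taken from this thread:

```shell
#!/bin/sh
# flatten_children POOL/IMAGE@SNAP
# Flattens every clone whose parent is the given snapshot.
# Assumes `rbd children` prints one clone per line as pool/image.
flatten_children() {
    parent_snap="$1"
    rbd children "$parent_snap" | while read -r child; do
        # Skip empty lines defensively.
        [ -n "$child" ] || continue
        echo "flattening $child"
        # </dev/null keeps rbd from consuming the list the loop is reading.
        rbd flatten "$child" </dev/null
    done
}

# Hypothetical usage, once per snapshot in images-old that has clones:
#   flatten_children images-old/base-image@snap
```

Once `rbd children` reports nothing for every snapshot in images-old, no
clone should depend on that pool any more, so deleting it should be safe.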

Josh


* Re: OSD memory leaks?
  2013-03-13 19:59             ` Sébastien Han
  2013-03-13 22:38               ` Dave Spano
@ 2013-03-14 21:28               ` Dave Spano
  1 sibling, 0 replies; 48+ messages in thread
From: Dave Spano @ 2013-03-14 21:28 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Greg Farnum, ceph-devel, Sage Weil, Wido den Hollander,
	Sylvain Munaut, Samuel Just, Vladislav Gorbunov

Sebastien, 

I just had to restart the OSD about 10 minutes ago, so it looks like all the setting did was slow the leak down. 

Dave Spano 




----- Original Message ----- 

From: "Sébastien Han" <han.sebastien@gmail.com> 
To: "Dave Spano" <dspano@optogenics.com> 
Cc: "Greg Farnum" <greg@inktank.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com> 
Sent: Wednesday, March 13, 2013 3:59:03 PM 
Subject: Re: OSD memory leaks? 

Dave, 

Just to be sure, did log max recent=10000 _completely_ stop the 
memory leak, or did it just slow it down? 

Thanks! 
-- 
Regards, 
Sébastien Han. 


On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano <dspano@optogenics.com> wrote: 
> Lol. I'm totally fine with that. My glance images pool isn't used too often. I'm going to give that a try today and see what happens. 
> 
> I'm still crossing my fingers, but since I added log max recent=10000 to ceph.conf, I've been okay despite the improper pg_num, and a lot of scrubbing/deep scrubbing yesterday. 
> 
> Dave Spano 
> 
> 
> 
> 
> ----- Original Message ----- 
> 
> From: "Greg Farnum" <greg@inktank.com> 
> To: "Dave Spano" <dspano@optogenics.com> 
> Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <sage@inktank.com>, "Wido den Hollander" <wido@42on.com>, "Sylvain Munaut" <s.munaut@whatever-company.com>, "Samuel Just" <sam.just@inktank.com>, "Vladislav Gorbunov" <vadikgo@gmail.com>, "Sébastien Han" <han.sebastien@gmail.com> 
> Sent: Tuesday, March 12, 2013 5:37:37 PM 
> Subject: Re: OSD memory leaks? 
> 
> Yeah. There's not anything intelligent about that cppool mechanism. :) 
> -Greg 
> 
> On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote: 
> 
>> I'd rather shut the cloud down and copy the pool to a new one than take any chances of corruption by using an experimental feature. My guess is that there cannot be any i/o to the pool while copying, otherwise you'll lose the changes that are happening during the copy, correct? 
>> 
>> Dave Spano 
>> Optogenics 
>> Systems Administrator 
>> 
>> 
>> 
>> ----- Original Message ----- 
>> 
>> From: "Greg Farnum" <greg@inktank.com (mailto:greg@inktank.com)> 
>> To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> Cc: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
>> Sent: Tuesday, March 12, 2013 4:20:13 PM 
>> Subject: Re: OSD memory leaks? 
>> 
>> On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: 
>> > Well, to avoid unnecessary data movement, there is also an 
>> > _experimental_ feature to change the number of PGs in a pool on the fly. 
>> > 
>> > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature 
>> Don't do that. We've got a set of 3 patches which fix bugs we know about that aren't in bobtail yet, and I'm sure there's more we aren't aware of… 
>> -Greg 
>> 
>> Software Engineer #42 @ http://inktank.com | http://ceph.com 
>> 
>> > 
>> > Cheers! 
>> > -- 
>> > Regards, 
>> > Sébastien Han. 
>> > 
>> > 
>> > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano <dspano@optogenics.com (mailto:dspano@optogenics.com)> wrote: 
>> > > Disregard my previous question. I found my answer in the post below. Absolutely brilliant! I thought I was screwed! 
>> > > 
>> > > http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924 
>> > > 
>> > > Dave Spano 
>> > > Optogenics 
>> > > Systems Administrator 
>> > > 
>> > > 
>> > > 
>> > > ----- Original Message ----- 
>> > > 
>> > > From: "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)> 
>> > > To: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)>, "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
>> > > Sent: Tuesday, March 12, 2013 1:41:21 PM 
>> > > Subject: Re: OSD memory leaks? 
>> > > 
>> > > 
>> > > If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? 
>> > > 
>> > > 
>> > > Dave Spano 
>> > > 
>> > > 
>> > > 
>> > > ----- Original Message ----- 
>> > > 
>> > > From: "Sébastien Han" <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> > > To: "Vladislav Gorbunov" <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> 
>> > > Cc: "Sage Weil" <sage@inktank.com (mailto:sage@inktank.com)>, "Wido den Hollander" <wido@42on.com (mailto:wido@42on.com)>, "Gregory Farnum" <greg@inktank.com (mailto:greg@inktank.com)>, "Sylvain Munaut" <s.munaut@whatever-company.com (mailto:s.munaut@whatever-company.com)>, "Dave Spano" <dspano@optogenics.com (mailto:dspano@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org (mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.just@inktank.com (mailto:sam.just@inktank.com)> 
>> > > Sent: Tuesday, March 12, 2013 9:43:44 AM 
>> > > Subject: Re: OSD memory leaks? 
>> > > 
>> > > > Sorry, I mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
>> > > > dump | grep 'rep size'" 
>> > > 
>> > > 
>> > > 
>> > > 
>> > > 
>> > > Well it's still 450 each... 
>> > > 
>> > > > The default pg_num value 8 is NOT suitable for a big cluster. 
>> > > 
>> > > Thanks, I know; I'm not new to Ceph. What's your point here? I 
>> > > already said that pg_num was 450... 
>> > > -- 
>> > > Regards, 
>> > > Sébastien Han. 
>> > > 
>> > > 
>> > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
>> > > > Sorry, I mean pg_num and pgp_num on all pools. Shown by the "ceph osd 
>> > > > dump | grep 'rep size'" 
>> > > > The default pg_num value 8 is NOT suitable for a big cluster. 
>> > > > 
>> > > > 2013/3/13 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>: 
>> > > > > Replica count has been set to 2. 
>> > > > > 
>> > > > > Why? 
>> > > > > -- 
>> > > > > Regards, 
>> > > > > Sébastien Han. 
>> > > > > 
>> > > > > 
>> > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov <vadikgo@gmail.com (mailto:vadikgo@gmail.com)> wrote: 
>> > > > > > > FYI I'm using 450 pgs for my pools. 
>> > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > Please, can you show the number of object replicas? 
>> > > > > > 
>> > > > > > ceph osd dump | grep 'rep size' 
>> > > > > > 
>> > > > > > Vlad Gorbunov 
>> > > > > > 
>> > > > > > 2013/3/5 Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)>: 
>> > > > > > > FYI I'm using 450 pgs for my pools. 
>> > > > > > > 
>> > > > > > > -- 
>> > > > > > > Regards, 
>> > > > > > > Sébastien Han. 
>> > > > > > > 
>> > > > > > > 
>> > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil <sage@inktank.com (mailto:sage@inktank.com)> wrote: 
>> > > > > > > > 
>> > > > > > > > On Fri, 1 Mar 2013, Wido den Hollander wrote: 
>> > > > > > > > > On 02/23/2013 01:44 AM, Sage Weil wrote: 
>> > > > > > > > > > On Fri, 22 Feb 2013, Sébastien Han wrote: 
>> > > > > > > > > > > Hi all, 
>> > > > > > > > > > > 
>> > > > > > > > > > > I finally got a core dump. 
>> > > > > > > > > > > 
>> > > > > > > > > > > I did it with a kill -SEGV on the OSD process. 
>> > > > > > > > > > > 
>> > > > > > > > > > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
>> > > > > > > > > > > 
>> > > > > > > > > > > Hope we will get something out of it :-). 
>> > > > > > > > > > 
>> > > > > > > > > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the 
>> > > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very 
>> > > > > > > > > > long time, which means the pg log will eat RAM in the meantime, 
>> > > > > > > > > > especially under high IOPS. 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > Does the number of PGs influence the memory leak? So my theory is that when 
>> > > > > > > > > you have a high number of PGs with a low number of objects per PG you don't 
>> > > > > > > > > see the memory leak. 
>> > > > > > > > > 
>> > > > > > > > > I saw the memory leak on a RBD system where a pool had just 8 PGs, but after 
>> > > > > > > > > going to 1024 PGs in a new pool it seemed to be resolved. 
>> > > > > > > > > 
>> > > > > > > > > I've asked somebody else to try your patch since he's still seeing it on his 
>> > > > > > > > > systems. Hopefully that gives us some results. 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > The PGs were active+clean when you saw the leak? There is a problem (that 
>> > > > > > > > we just fixed in master) where pg logs aren't trimmed for degraded PGs. 
>> > > > > > > > 
>> > > > > > > > sage 
>> > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > Wido 
>> > > > > > > > > 
>> > > > > > > > > > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see 
>> > > > > > > > > > if that seems to work? Note that this patch shouldn't be run in a mixed 
>> > > > > > > > > > argonaut+bobtail cluster, since it doesn't properly check whether the scrub 
>> > > > > > > > > > is classic or chunky/deep. 
>> > > > > > > > > > 
>> > > > > > > > > > Thanks! 
>> > > > > > > > > > sage 
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > > -- 
>> > > > > > > > > > > Regards, 
>> > > > > > > > > > > Sébastien Han. 
>> > > > > > > > > > > 
>> > > > > > > > > > > 
>> > > > > > > > > > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <greg@inktank.com (mailto:greg@inktank.com)> wrote: 
>> > > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebastien@gmail.com (mailto:han.sebastien@gmail.com)> 
>> > > > > > > > > > > > wrote: 
>> > > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active use 
>> > > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase; 
>> > > > > > > > > > > > > > this sounds a bit like that to me since it's staying stable at a large 
>> > > > > > > > > > > > > > but finite portion of total memory. 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > 
>> > > > > > > > > > > > > Well, the memory consumption was already high before the profiler was 
>> > > > > > > > > > > > > started. So yes, with the memory profiler enabled an OSD might consume 
>> > > > > > > > > > > > > more memory, but this doesn't cause the memory leaks. 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > My concern is that maybe you saw a leak but when you restarted with 
>> > > > > > > > > > > > the memory profiling you lost whatever conditions caused it. 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > > Any ideas? Nothing to say about my scrubbing theory? 
>> > > > > > > > > > > > I like it, but Sam indicates that without some heap dumps that 
>> > > > > > > > > > > > capture the actual leak, scrub is too large to effectively code 
>> > > > > > > > > > > > review for leaks. :( 
>> > > > > > > > > > > > -Greg 
>> > > > > > > > > 
>> > > > > > > > > -- 
>> > > > > > > > > Wido den Hollander 
>> > > > > > > > > 42on B.V. 
>> > > > > > > > > 
>> > > > > > > > > Phone: +31 (0)20 700 9902 
>> > > > > > > > > Skype: contact42on 


end of thread, other threads:[~2013-03-14 21:28 UTC | newest]

Thread overview: 48+ messages
     [not found] <7332688.5.1363110084349.JavaMail.dspano@it1>
2013-03-12 18:09 ` OSD memory leaks? Dave Spano
2013-03-12 20:10   ` Sébastien Han
2013-03-12 20:20     ` Greg Farnum
2013-03-12 21:15       ` Dave Spano
2013-03-12 21:37         ` Greg Farnum
2013-03-13 13:12           ` Dave Spano
2013-03-13 19:59             ` Sébastien Han
2013-03-13 22:38               ` Dave Spano
2013-03-13 22:52                 ` Greg Farnum
2013-03-14  0:05                   ` Dave Spano
2013-03-14  0:15                     ` Josh Durgin
2013-03-14 21:28               ` Dave Spano
2013-03-12 21:01     ` Bryan K. Wright
     [not found] <4172429.450.1357580400977.JavaMail.dspano@it1>
     [not found] ` <10953797.470.1357585419142.JavaMail.dspano@it1>
2013-01-07 19:09   ` Samuel Just
2013-01-09 15:20     ` Sébastien Han
2013-01-09 16:10       ` Dave Spano
2013-01-09 16:35         ` Sébastien Han
2013-01-09 18:09           ` Sylvain Munaut
2013-01-09 19:11             ` Sébastien Han
2013-01-10 21:44             ` Gregory Farnum
2013-01-11 14:57               ` Sébastien Han
2013-01-11 18:13                 ` Gregory Farnum
2013-02-22 15:24                   ` Sébastien Han
2013-02-23  0:44                     ` Sage Weil
2013-02-24 23:10                       ` Sébastien Han
2013-02-25  0:21                         ` Sage Weil
2013-02-25  7:51                           ` Wido den Hollander
2013-02-25 17:18                             ` Sébastien Han
2013-03-01 15:51                       ` Wido den Hollander
2013-03-01 18:07                         ` Samuel Just
2013-03-01 19:10                         ` Sage Weil
2013-03-04 17:11                           ` Sébastien Han
     [not found]                             ` <13183200.155.1363027427897.JavaMail.dspano@it1>
2013-03-11 22:23                               ` Sébastien Han
2013-03-12 11:45                             ` Vladislav Gorbunov
2013-03-12 12:12                               ` Sébastien Han
2013-03-12 13:00                                 ` Vladislav Gorbunov
2013-03-12 13:43                                   ` Sébastien Han
2013-01-09 21:42           ` Dave Spano
2013-01-09 22:12             ` Sébastien Han
2013-01-09 23:03               ` Dave Spano
2013-01-10 21:43         ` Gregory Farnum
     [not found] <CAOLwVUmJb0y39dg3zTv=c4iPSkjLBhG3L7L4=L7Ms96iP-SiWw@mail.gmail.com>
2012-12-17  8:28 ` Fwd: " Sébastien Han
2012-12-17 18:12   ` Samuel Just
     [not found]     ` <CAOLwVUn5VbH1P=0wu-Oxb1bSKpaQfC6uQ5012wvPc7bvz606JA@mail.gmail.com>
2012-12-17 22:41       ` Sébastien Han
2012-12-17 22:55         ` Samuel Just
2012-12-18 17:21           ` Sébastien Han
2012-12-19 16:37             ` Sébastien Han
2012-12-19 21:43               ` Samuel Just
2013-01-04 15:20                 ` Sébastien Han
