* OSD memory leak when scrubbing [0.56.6]
@ 2013-05-21 15:24 Oliver Francke
2013-05-21 15:34 ` Oliver Francke
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Oliver Francke @ 2013-05-21 15:24 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org
Well,
The subject may seem familiar; the version was 0.48.3 in my last mail.
Some more of the story: before the successful upgrade to the latest bobtail,
everything related to scrubbing was disabled.
That is, via:
ceph osd tell \* injectargs '--osd-max-scrubs 0'
We have been running fine since the 9th of May. "Fine" means, though, that we
have not run any scrubbing for ages.
This morning I restarted scrubbing. After a couple of hours I noticed
the first OSDs eating up memory.
The top scorer was running with 23 GiB RSS. After stopping scrubbing again,
the memory was not released.
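One way to watch for the growth described above is to sample the resident set size of each OSD process while scrubbing runs. The following is only a minimal sketch, assuming a Linux host where the daemons show up under the process name `ceph-osd`; the helper names and the 4 GiB alert threshold are hypothetical and are not part of Ceph.

```python
import os
from typing import Dict, Optional


def parse_status(status_text: str) -> Dict[str, str]:
    """Turn the 'Key: value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def rss_kib(status_text: str) -> Optional[int]:
    """Return VmRSS in KiB, or None if the field is absent (e.g. kernel threads)."""
    value = parse_status(status_text).get("VmRSS")
    if value is None:
        return None
    return int(value.split()[0])  # the value looks like '24117248 kB'


def osd_rss(proc_root: str = "/proc", name: str = "ceph-osd") -> Dict[int, int]:
    """Map pid -> RSS in KiB for every process whose Name field equals `name`."""
    result = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return result  # no procfs available on this host
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # the process exited or is not readable
        if parse_status(text).get("Name") == name:
            rss = rss_kib(text)
            if rss is not None:
                result[int(entry)] = rss
    return result


if __name__ == "__main__":
    LIMIT_KIB = 4 * 1024 * 1024  # hypothetical 4 GiB alert threshold
    for pid, rss in sorted(osd_rss().items(), key=lambda kv: -kv[1]):
        flag = "  <-- above limit" if rss > LIMIT_KIB else ""
        print(f"pid {pid}: {rss / 1024 / 1024:.1f} GiB{flag}")
```

Run periodically (for example from cron) before, during, and after a scrub window, this would show whether memory is handed back once scrubbing stops.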
Is anyone else, perhaps with large PGs, experiencing such behaviour?
Any advice on how to proceed?
Thanks in advance,
Oliver.
--
Oliver Francke
filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh
Managing Directors: S.Grewing | J.Rehpöhler | C.Kunz
Follow us on Twitter: http://twitter.com/filoogmbh
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:24 OSD memory leak when scrubbing [0.56.6] Oliver Francke
@ 2013-05-21 15:34 ` Oliver Francke
2013-05-21 15:35 ` Sylvain Munaut
2013-05-22 6:21 ` Wolfgang Hennerbichler
2 siblings, 0 replies; 11+ messages in thread
From: Oliver Francke @ 2013-05-21 15:34 UTC (permalink / raw)
To: ceph-devel@vger.kernel.org
Uhm,
to be precise... there was a follow-up even with version 0.56 ;)
On 05/21/2013 05:24 PM, Oliver Francke wrote:
> Well,
>
> subject seems familiar, version was 0.48.3 in the last mail.
>
> Some more of the story. Before successful upgrade to latest bobtail
> everything with regards to scrubbing was disabled.
> That is via:
> ceph osd tell \* injectargs '--osd-max-scrubs 0'
>
> We are running fine now since 9th of may. Fine means though, not have
> ran any scrubbing for ages.
> This morning I re-started scrubbing. After a couple of hours I
> detected the first OSD's eating up memory.
> Top-scorer was running with 23GiB rss. After stopping scrubbing again
> there was no regain of memory.
>
> Not anyone else with perhaps large pg's experiencing such behaviour?
>
> Any advice on how to proceed?
>
> Thnx in advance,
>
> Oliver.
>
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:24 OSD memory leak when scrubbing [0.56.6] Oliver Francke
2013-05-21 15:34 ` Oliver Francke
@ 2013-05-21 15:35 ` Sylvain Munaut
2013-05-21 15:37 ` Stefan Priebe - Profihost AG
2013-05-21 15:38 ` Oliver Francke
2013-05-22 6:21 ` Wolfgang Hennerbichler
2 siblings, 2 replies; 11+ messages in thread
From: Sylvain Munaut @ 2013-05-21 15:35 UTC (permalink / raw)
To: Oliver Francke; +Cc: ceph-devel@vger.kernel.org
Hi,
> subject seems familiar, version was 0.48.3 in the last mail.
>
> Not anyone else with perhaps large pg's experiencing such behaviour?
>
> Any advice on how to proceed?
I had the same behavior in both argonaut and bobtail: memory rose sharply,
by ~100M or so, at each scrub (every 24h).
It's now resolved in cuttlefish AFAICT.
However given the mon leveldb issues I'm having with cuttlefish, I'm
not sure I'd recommend upgrading ...
Cheers,
Sylvain
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:35 ` Sylvain Munaut
@ 2013-05-21 15:37 ` Stefan Priebe - Profihost AG
2013-05-21 15:44 ` Sage Weil
2013-05-21 15:38 ` Oliver Francke
1 sibling, 1 reply; 11+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-21 15:37 UTC (permalink / raw)
To: Sylvain Munaut; +Cc: Oliver Francke, ceph-devel@vger.kernel.org
On 21.05.2013 at 17:35, Sylvain Munaut <s.munaut@whatever-company.com> wrote:
> Hi,
>
>> subject seems familiar, version was 0.48.3 in the last mail.
>>
>> Not anyone else with perhaps large pg's experiencing such behaviour?
>>
>> Any advice on how to proceed?
>
> I had the same behavior in both argonaut and bobtail, raising sharply
> ~ 100M or so at each scrub (every 24h).
>
> It's now resolved in cuttlefish AFAICT.
>
> However given the mon leveldb issues I'm having with cuttlefish, I'm
> not sure I'd recommend upgrading ...
I thought all mon leveldb issues were solved?
>
> Cheers,
>
> Sylvain
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:35 ` Sylvain Munaut
2013-05-21 15:37 ` Stefan Priebe - Profihost AG
@ 2013-05-21 15:38 ` Oliver Francke
1 sibling, 0 replies; 11+ messages in thread
From: Oliver Francke @ 2013-05-21 15:38 UTC (permalink / raw)
To: Sylvain Munaut; +Cc: ceph-devel@vger.kernel.org
Right,
On 05/21/2013 05:35 PM, Sylvain Munaut wrote:
> Hi,
>
>> subject seems familiar, version was 0.48.3 in the last mail.
>>
>> Not anyone else with perhaps large pg's experiencing such behaviour?
>>
>> Any advice on how to proceed?
> I had the same behavior in both argonaut and bobtail, raising sharply
> ~ 100M or so at each scrub (every 24h).
>
> It's now resolved in cuttlefish AFAICT.
>
> However given the mon leveldb issues I'm having with cuttlefish, I'm
> not sure I'd recommend upgrading ...
not really at the moment, because we have an otherwise stable and fast
cluster for our VMs ;)
Oliver.
>
> Cheers,
>
> Sylvain
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:37 ` Stefan Priebe - Profihost AG
@ 2013-05-21 15:44 ` Sage Weil
2013-05-21 16:04 ` Mike Dawson
2013-05-21 19:29 ` Stefan Priebe
0 siblings, 2 replies; 11+ messages in thread
From: Sage Weil @ 2013-05-21 15:44 UTC (permalink / raw)
To: Stefan Priebe - Profihost AG
Cc: Sylvain Munaut, Oliver Francke, ceph-devel@vger.kernel.org
On Tue, 21 May 2013, Stefan Priebe - Profihost AG wrote:
> On 21.05.2013 at 17:35, Sylvain Munaut <s.munaut@whatever-company.com> wrote:
>
> > Hi,
> >
> >> subject seems familiar, version was 0.48.3 in the last mail.
> >>
> >> Not anyone else with perhaps large pg's experiencing such behaviour?
> >>
> >> Any advice on how to proceed?
> >
> > I had the same behavior in both argonaut and bobtail, raising sharply
> > ~ 100M or so at each scrub (every 24h).
> >
> > It's now resolved in cuttlefish AFAICT.
> >
> > However given the mon leveldb issues I'm having with cuttlefish, I'm
> > not sure I'd recommend upgrading ...
>
> I thought all mon leveldb issues were solved?
Not quite. I'm now able to reproduce the leveldb growth from Mike
Dawson's trace (thanks!) but we don't have a fix yet.
I thought the scrub memory issues were addressed by
f80f64cf024bd7519d5a1fb2a5698db97a003ce8 in 0.56.4... :(
sage
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:44 ` Sage Weil
@ 2013-05-21 16:04 ` Mike Dawson
2013-05-21 19:29 ` Stefan Priebe
1 sibling, 0 replies; 11+ messages in thread
From: Mike Dawson @ 2013-05-21 16:04 UTC (permalink / raw)
To: Sage Weil
Cc: Stefan Priebe - Profihost AG, Sylvain Munaut, Oliver Francke,
ceph-devel@vger.kernel.org
Sage,
Great news! My mons are growing again, but are significantly smaller right now. If you need a smaller set of data, I can probably get you something today.
Mike
Sent from my iPhone
On May 21, 2013, at 11:44 AM, Sage Weil <sage@inktank.com> wrote:
> On Tue, 21 May 2013, Stefan Priebe - Profihost AG wrote:
>> On 21.05.2013 at 17:35, Sylvain Munaut <s.munaut@whatever-company.com> wrote:
>>
>>> Hi,
>>>
>>>> subject seems familiar, version was 0.48.3 in the last mail.
>>>>
>>>> Not anyone else with perhaps large pg's experiencing such behaviour?
>>>>
>>>> Any advice on how to proceed?
>>>
>>> I had the same behavior in both argonaut and bobtail, raising sharply
>>> ~ 100M or so at each scrub (every 24h).
>>>
>>> It's now resolved in cuttlefish AFAICT.
>>>
>>> However given the mon leveldb issues I'm having with cuttlefish, I'm
>>> not sure I'd recommend upgrading ...
>>
>> I thought all mon leveldb issues were solved?
>
> Not quite. I'm now able to reproduce the leveldb growth from Mike
> Dawson's trace (thanks!) but we don't have a fix yet.
>
> I thought the scrub memory issues were addressed by
> f80f64cf024bd7519d5a1fb2a5698db97a003ce8 in 0.56.4... :(
>
> sage
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:44 ` Sage Weil
2013-05-21 16:04 ` Mike Dawson
@ 2013-05-21 19:29 ` Stefan Priebe
2013-05-21 19:31 ` Sage Weil
1 sibling, 1 reply; 11+ messages in thread
From: Stefan Priebe @ 2013-05-21 19:29 UTC (permalink / raw)
To: Sage Weil; +Cc: Sylvain Munaut, Oliver Francke, ceph-devel@vger.kernel.org
On 21.05.2013 at 17:44, Sage Weil wrote:
> On Tue, 21 May 2013, Stefan Priebe - Profihost AG wrote:
>> On 21.05.2013 at 17:35, Sylvain Munaut <s.munaut@whatever-company.com> wrote:
>>
>>> Hi,
>>>
>>>> subject seems familiar, version was 0.48.3 in the last mail.
>>>>
>>>> Not anyone else with perhaps large pg's experiencing such behaviour?
>>>>
>>>> Any advice on how to proceed?
>>>
>>> I had the same behavior in both argonaut and bobtail, raising sharply
>>> ~ 100M or so at each scrub (every 24h).
>>>
>>> It's now resolved in cuttlefish AFAICT.
>>>
>>> However given the mon leveldb issues I'm having with cuttlefish, I'm
>>> not sure I'd recommend upgrading ...
>>
>> I thought all mon leveldb issues were solved?
>
> Not quite. I'm now able to reproduce the leveldb growth from Mike
> Dawson's trace (thanks!) but we don't have a fix yet.
Oh OK. Is there a tracker id?
> I thought the scrub memory issues were addressed by
> f80f64cf024bd7519d5a1fb2a5698db97a003ce8 in 0.56.4... :(
>
> sage
>
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 19:29 ` Stefan Priebe
@ 2013-05-21 19:31 ` Sage Weil
2013-05-21 19:38 ` Oliver Francke
0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2013-05-21 19:31 UTC (permalink / raw)
To: Stefan Priebe; +Cc: Sylvain Munaut, Oliver Francke, ceph-devel@vger.kernel.org
On Tue, 21 May 2013, Stefan Priebe wrote:
> On 21.05.2013 at 17:44, Sage Weil wrote:
> > On Tue, 21 May 2013, Stefan Priebe - Profihost AG wrote:
> > > On 21.05.2013 at 17:35, Sylvain Munaut
> > > <s.munaut@whatever-company.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > > subject seems familiar, version was 0.48.3 in the last mail.
> > > > >
> > > > > Not anyone else with perhaps large pg's experiencing such behaviour?
> > > > >
> > > > > Any advice on how to proceed?
> > > >
> > > > I had the same behavior in both argonaut and bobtail, raising sharply
> > > > ~ 100M or so at each scrub (every 24h).
> > > >
> > > > It's now resolved in cuttlefish AFAICT.
> > > >
> > > > However given the mon leveldb issues I'm having with cuttlefish, I'm
> > > > not sure I'd recommend upgrading ...
> > >
> > > I thought all mon leveldb issues were solved?
> >
> > Not quite. I'm now able to reproduce the leveldb growth from Mike
> > Dawson's trace (thanks!) but we don't have a fix yet.
>
> Oh OK. Is there a tracker id?
http://tracker.ceph.com/issues/4895
>
> > I thought the scrub memory issues were addressed by
> > f80f64cf024bd7519d5a1fb2a5698db97a003ce8 in 0.56.4... :(
> >
> > sage
> >
>
>
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 19:31 ` Sage Weil
@ 2013-05-21 19:38 ` Oliver Francke
0 siblings, 0 replies; 11+ messages in thread
From: Oliver Francke @ 2013-05-21 19:38 UTC (permalink / raw)
To: Sage Weil; +Cc: Stefan Priebe, Sylvain Munaut, ceph-devel@vger.kernel.org
Well,
On 21.05.2013 at 21:31, Sage Weil <sage@inktank.com> wrote:
> On Tue, 21 May 2013, Stefan Priebe wrote:
>> On 21.05.2013 at 17:44, Sage Weil wrote:
>>> On Tue, 21 May 2013, Stefan Priebe - Profihost AG wrote:
>>>> On 21.05.2013 at 17:35, Sylvain Munaut
>>>> <s.munaut@whatever-company.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>>> subject seems familiar, version was 0.48.3 in the last mail.
>>>>>>
>>>>>> Not anyone else with perhaps large pg's experiencing such behaviour?
>>>>>>
>>>>>> Any advice on how to proceed?
>>>>>
>>>>> I had the same behavior in both argonaut and bobtail, raising sharply
>>>>> ~ 100M or so at each scrub (every 24h).
>>>>>
>>>>> It's now resolved in cuttlefish AFAICT.
>>>>>
>>>>> However given the mon leveldb issues I'm having with cuttlefish, I'm
>>>>> not sure I'd recommend upgrading ...
>>>>
>>>> I thought all mon leveldb issues were solved?
>>>
>>> Not quite. I'm now able to reproduce the leveldb growth from Mike
>>> Dawson's trace (thanks!) but we don't have a fix yet.
>>
>> Oh OK. Is there a tracker id?
>
> http://tracker.ceph.com/issues/4895
>
True for this issue, but back to the topic: we are still not able to safely (deep-)scrub the whole cluster with 0.56.6.
>>
>>> I thought the scrub memory issues were addressed by
>>> f80f64cf024bd7519d5a1fb2a5698db97a003ce8 in 0.56.4... :(
>>>
Any advice is very welcome, though about 1/3 of the cluster is safe in the sense that "we deep-scrubbed it".
Best regards,
Oliver.
>>> sage
>>>
>>
>>
* Re: OSD memory leak when scrubbing [0.56.6]
2013-05-21 15:24 OSD memory leak when scrubbing [0.56.6] Oliver Francke
2013-05-21 15:34 ` Oliver Francke
2013-05-21 15:35 ` Sylvain Munaut
@ 2013-05-22 6:21 ` Wolfgang Hennerbichler
2 siblings, 0 replies; 11+ messages in thread
From: Wolfgang Hennerbichler @ 2013-05-22 6:21 UTC (permalink / raw)
To: Oliver Francke; +Cc: ceph-devel@vger.kernel.org
On 05/21/2013 05:24 PM, Oliver Francke wrote:
> Well,
>
> Not anyone else with perhaps large pg's experiencing such behaviour?
100 PGs per pool, 3 pools. Deep-scrub adds about 200M to every OSD, and it
gets worse over time. Interestingly enough, it gets better when using syslog
instead of direct logging.
> Any advice on how to proceed?
In my case it is: wait until they say the cuttlefish is swimming in a
stable direction, and restart OSDs every now and then...
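To put a rough number on "every now and then", the per-scrub growth figures reported in this thread (about 100M to 200M per scrub) can be turned into an estimate of how many scrubs an OSD can absorb before it reaches a memory budget. This is a back-of-the-envelope sketch only: it assumes linear growth with no memory returned between scrubs, and the starting RSS, the budget, and the scrub interval are hypothetical inputs, not values from the thread. Since growth is reported to get worse over time, the result is best read as an upper bound.

```python
import math


def scrubs_until_limit(current_mib: float, growth_mib: float, limit_mib: float) -> int:
    """Whole scrubs an OSD can absorb before its RSS would exceed limit_mib."""
    if growth_mib <= 0:
        raise ValueError("growth_mib must be positive")
    headroom = limit_mib - current_mib
    if headroom <= 0:
        return 0  # already at or above the budget
    return math.floor(headroom / growth_mib)


def days_until_limit(current_mib: float, growth_mib: float, limit_mib: float,
                     scrub_interval_h: float = 24.0) -> float:
    """Convert the scrub count into days, given hours between scrubs."""
    scrubs = scrubs_until_limit(current_mib, growth_mib, limit_mib)
    return scrubs * scrub_interval_h / 24.0


if __name__ == "__main__":
    # Hypothetical OSD: 1 GiB RSS after restart, 8 GiB budget per daemon.
    for growth in (100, 200):  # MiB per scrub, as reported in the thread
        n = scrubs_until_limit(1024, growth, 8192)
        d = days_until_limit(1024, growth, 8192, scrub_interval_h=24.0)
        print(f"{growth} MiB/scrub: restart within {n} scrubs (~{d:.0f} days)")
```

Restarting somewhat earlier than the computed bound leaves room for the non-linear growth mentioned above.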
> Thnx in advance,
>
> Oliver.
>
--
DI (FH) Wolfgang Hennerbichler
Software Development
Unit Advanced Computing Technologies
RISC Software GmbH
A company of the Johannes Kepler University Linz
IT-Center
Softwarepark 35
4232 Hagenberg
Austria
Phone: +43 7236 3343 245
Fax: +43 7236 3343 250
wolfgang.hennerbichler@risc-software.at
http://www.risc-software.at
Thread overview: 11+ messages
2013-05-21 15:24 OSD memory leak when scrubbing [0.56.6] Oliver Francke
2013-05-21 15:34 ` Oliver Francke
2013-05-21 15:35 ` Sylvain Munaut
2013-05-21 15:37 ` Stefan Priebe - Profihost AG
2013-05-21 15:44 ` Sage Weil
2013-05-21 16:04 ` Mike Dawson
2013-05-21 19:29 ` Stefan Priebe
2013-05-21 19:31 ` Sage Weil
2013-05-21 19:38 ` Oliver Francke
2013-05-21 15:38 ` Oliver Francke
2013-05-22 6:21 ` Wolfgang Hennerbichler