From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wido den Hollander
Subject: Re: OSD memory leaks?
Date: Mon, 25 Feb 2013 08:51:20 +0100
Message-ID: <512B17F8.2090604@42on.com>
References: <8366806.170.1357747859058.JavaMail.dspano@it1>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from websrv.42on.com ([31.25.102.167]:38058 "EHLO websrv.42on.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756611Ab3BYHvX (ORCPT ); Mon, 25 Feb 2013 02:51:23 -0500
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Sage Weil
Cc: =?ISO-8859-1?Q?S=E9bastien_Han?=, Gregory Farnum, Sylvain Munaut,
	Dave Spano, ceph-devel, Samuel Just

On 02/25/2013 01:21 AM, Sage Weil wrote:
> On Mon, 25 Feb 2013, Sébastien Han wrote:
>> Hi Sage,
>>
>> Sorry, it's a production system, so I can't test it.
>> So in the end, you can't get anything out of the core dump?
>
> I saw a bunch of dup object names, which is what led us to the pg log
> theory. I can look a bit more carefully to confirm, but in the end it
> would be nice to see users scrubbing without leaking.
>
> This may be a bit moot because we want to allow trimming for other
> reasons, so those patches are being tested and working their way into
> master. We'll backport when things are solid.
>
> In the meantime, if someone has been able to reproduce this in a test
> environment, testing is obviously welcome :)
>

I'll see what I can do later this week. I know of a cluster that has
the same issues and is, as far as I know, in semi-production.

Wido

> sage
>
>
>> --
>> Regards,
>> Sébastien Han.
>>
>>
>> On Sat, Feb 23, 2013 at 1:44 AM, Sage Weil wrote:
>>> On Fri, 22 Feb 2013, Sébastien Han wrote:
>>>> Hi all,
>>>>
>>>> I finally got a core dump.
>>>>
>>>> I did it with a kill -SEGV on the OSD process.
>>>>
>>>> https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
>>>>
>>>> Hope we will get something out of it :-).
>>>
>>> AHA! We have a theory. The pg log isn't trimmed during scrub (because the
>>> old scrub code required that), but the new (deep) scrub can take a very
>>> long time, which means the pg log will eat RAM in the meantime,
>>> especially under high IOPS.
>>>
>>> Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
>>> if that seems to work? Note that that patch shouldn't be run in a mixed
>>> argonaut+bobtail cluster, since it isn't properly checking if the scrub is
>>> classic or chunky/deep.
>>>
>>> Thanks!
>>> sage
>>>
>>>
>>>> --
>>>> Regards,
>>>> Sébastien Han.
>>>>
>>>>
>>>> On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum wrote:
>>>>> On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han wrote:
>>>>>>> Is osd.1 using the heap profiler as well? Keep in mind that active use
>>>>>>> of the memory profiler will itself cause memory usage to increase;
>>>>>>> this sounds a bit like that to me since it's staying stable at a large
>>>>>>> but finite portion of total memory.
>>>>>>
>>>>>> Well, the memory consumption was already high before the profiler was
>>>>>> started. So yes, with the memory profiler enabled an OSD might consume
>>>>>> more memory, but this doesn't cause the memory leaks.
>>>>>
>>>>> My concern is that maybe you saw a leak but when you restarted with
>>>>> the memory profiling you lost whatever conditions caused it.
>>>>>
>>>>>> Any ideas? Nothing to say about my scrubbing theory?
>>>>> I like it, but Sam indicates that without some heap dumps which
>>>>> capture the actual leak, scrub is too large to effectively code
>>>>> review for leaks.
>>>>> :(
>>>>> -Greg
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>
>>>>


-- 
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on