From: "guo tang" <tangguo77@gmail.com>
To: "Andreas Ericsson" <ae@op5.se>
Cc: "Git mailing list" <git@vger.kernel.org>
Subject: Re: out of memory problem
Date: Wed, 24 Sep 2008 09:45:44 -0700 [thread overview]
Message-ID: <4120f6ec0809240945t7b9313eagb1aa53f67ba9a76c@mail.gmail.com> (raw)
In-Reply-To: <48D89FB7.3000509@op5.se>
It happened again. I cannot make a copy of repository and try bisect.
The repository is 2.5GB, too time consuming to make a copy.
When I tried bisect without start from fresh. The bisect failed to
find the bad commit since the bad commit it found out can work now
once the repository is packed.
I did notice 3 things.
1. When I have memory trouble, "git gc" cannot finish the count
objects stage. It appears to be in an infinite loop.
2. When I ran "git count-objects -v", it reported there is an error: 1
garbage object found. The garbage object is named something like
.git/objects/35/tmp_xxx. Even if "git gc" finished without error, it
won't delete that garbage object. I guess there might be some
difference on how garbage object is handled between v1.5.5.1 and
v1.6.2
3. If I manually remove that garbage object using "rm", everything
seems still fine.
I will give valgrid a shoot next time I am in trouble.
Thanks,
Guo
On Tue, Sep 23, 2008 at 12:50 AM, Andreas Ericsson <ae@op5.se> wrote:
> guo tang wrote:
>>
>> On Mon, Sep 22, 2008 at 12:32 AM, Andreas Ericsson <ae@op5.se> wrote:
>>
>>> Guo Tang wrote:
>>>
>>>> Gentlemen,
>>>>
>>>> I try to run "git gc" on linux kernel tree. The virtual memory keeps
>>>> going
>>>> up until over 3GB, then crash. Tried twice with the v1.6.0.2, same
>>>> result.
>>>> Then I used the git coming with FC9 (v1.5.5.1), the peak virutal memory
>>>> usage is about 1.5GB. "git gc" finished without any trouble.
>>>> Could there be a memory leak in v1.6.0.2?
>>>>
>>>>
>>> There could be, but most likely it's commit
>>> 38bd64979a2a3ffa178af801c6a62e6fcd658274 (Enable threaded delta
>>> search on BSD and Linux). Do you have multiple cpu's in the
>>> computer where 'git gc' was running? If so, and if you've set
>>> pack.threads = 0, or --threads=0 it will autodetect the number
>>> of CPU's you have and then saturate all of them with work. Each
>>> thread will however consume memory close to that of a single
>>> process running the repack, so for large repositories you might
>>> want to set pack.threads = 1 in such large repositories.
>>
>>
>> It is a Pentium M single core machine. But I am not sure whether it is
>> using
>> just a single thread or
>> multiple threads. I will try setting pack.threads parameter next I run
>> into
>> trouble.
>>
>
> Unless you explicitly told it to run multiple threads (which
> would be a bit silly on a single-core machine), it just ran
> one thread.
>
>>> It's a shame you didn't save the unpacked repository, or this could
>>> have been properly debugged. While it's possible there is a memory
>>> leak, it's a dismal project trying to locate it by staring at the
>>> code, and the time it takes to repack huge repositories with memory
>>> intensive parameters is sort of prohibitive for finding the possible
>>> leak by bisection.
>>
>>
>> Yes, the repository is already packed now. One question, beside the
>> bisecting method, do we have
>> this ability built into kernel:
>> 1. Turn a flag on for a process.
>> 2. OS will keep track off process malloc(), free() calls and the call
>> stack.
>>
>> 3. For the malloc() calls without the the free() call (a memory leak), OS
>> will keep it count based on malloc() call stack.
>> 4. After some time, be able to dump this information out based on biggest
>> leak spot.
>>
>
> No, there's not. The kernel isn't the one handing out the memory when you
> call malloc(). That's handled by the C library, which can (and usually does)
> allocate a larger area of memory than the application needs, so that it
> doesn't have to run a system call for every malloc() call you do.
>
> You can pre-load a different memory allocator though, which can do whatever
> it wants with calls to malloc(), including ofcourse logging which function
> called them and how much memory was requested.
>
> Google for "memory leak check linux" and you'll get something like 750000
> results.
>
>
>> The complain when I ran out of memory if from mmap failure. Is it the same
>> as malloc() failure?
>>
>
> Sort of. Read 'man 2 mmap' for a more exhaustive description.
>
>> This kind of tool is available in Windows with its umdh (user mode heap
>> dump) tool.
>>
>
> There are a number of tools to detect leaks under Linux/Unix as well.
> valgrind is probably the most frequently used of all such leak checkers.
>
> --
> Andreas Ericsson andreas.ericsson@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225 Fax: +46 8-230231
>
prev parent reply other threads:[~2008-09-24 16:46 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-21 22:59 out of memory problem Guo Tang
2008-09-22 7:32 ` Andreas Ericsson
[not found] ` <4120f6ec0809220828i3b38eda3tfc4974df8a2568cb@mail.gmail.com>
2008-09-23 7:50 ` Andreas Ericsson
2008-09-24 16:45 ` guo tang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4120f6ec0809240945t7b9313eagb1aa53f67ba9a76c@mail.gmail.com \
--to=tangguo77@gmail.com \
--cc=ae@op5.se \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).