out of memory problem

* out of memory problem
@ 2008-09-21 22:59 Guo Tang
  2008-09-22  7:32 ` Andreas Ericsson
  0 siblings, 1 reply; 4+ messages in thread
From: Guo Tang @ 2008-09-21 22:59 UTC (permalink / raw)
  To: Git mailing list

Gentlemen,

I tried to run "git gc" on the Linux kernel tree. Virtual memory usage kept
growing until it exceeded 3GB, and then git crashed.
I tried twice with v1.6.0.2, with the same result.
Then I used the git that ships with FC9 (v1.5.5.1): peak virtual memory
usage was about 1.5GB, and "git gc" finished without any trouble.

Could there be a memory leak in v1.6.0.2?

Thanks,
Guo

* Re: out of memory problem
  2008-09-21 22:59 out of memory problem Guo Tang
@ 2008-09-22  7:32 ` Andreas Ericsson
       [not found]   ` <4120f6ec0809220828i3b38eda3tfc4974df8a2568cb@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Ericsson @ 2008-09-22  7:32 UTC (permalink / raw)
  To: Guo Tang; +Cc: Git mailing list

Guo Tang wrote:
> Gentlemen,
> 
> I tried to run "git gc" on the Linux kernel tree. Virtual memory usage kept
> growing until it exceeded 3GB, and then git crashed.
> I tried twice with v1.6.0.2, with the same result.
> Then I used the git that ships with FC9 (v1.5.5.1): peak virtual memory
> usage was about 1.5GB, and "git gc" finished without any trouble.
> 
> Could there be a memory leak in v1.6.0.2?
> 

There could be, but most likely it's commit
38bd64979a2a3ffa178af801c6a62e6fcd658274 (Enable threaded delta
search on BSD and Linux). Do you have multiple CPUs in the
computer where 'git gc' was running? If so, and if you've set
pack.threads = 0 or passed --threads=0, git will autodetect the
number of CPUs you have and saturate all of them with work. Each
thread, however, consumes memory close to that of a single
process running the repack, so for large repositories you might
want to set pack.threads = 1.

It's a shame you didn't save the unpacked repository, or this could
have been properly debugged. While it's possible there is a memory
leak, it's a dismal project trying to locate it by staring at the
code, and the time it takes to repack huge repositories with memory
intensive parameters is sort of prohibitive for finding the possible
leak by bisection.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: out of memory problem
       [not found]   ` <4120f6ec0809220828i3b38eda3tfc4974df8a2568cb@mail.gmail.com>
@ 2008-09-23  7:50     ` Andreas Ericsson
  2008-09-24 16:45       ` guo tang
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Ericsson @ 2008-09-23  7:50 UTC (permalink / raw)
  To: guo tang; +Cc: Git mailing list

guo tang wrote:
> On Mon, Sep 22, 2008 at 12:32 AM, Andreas Ericsson <ae@op5.se> wrote:
> 
>> Guo Tang wrote:
>>
>>> Gentlemen,
>>>
>>> I tried to run "git gc" on the Linux kernel tree. Virtual memory usage kept
>>> growing until it exceeded 3GB, and then git crashed. I tried twice with
>>> v1.6.0.2, with the same result.
>>> Then I used the git that ships with FC9 (v1.5.5.1): peak virtual memory
>>> usage was about 1.5GB, and "git gc" finished without any trouble.
>>> Could there be a memory leak in v1.6.0.2?
>>>
>>>
>> There could be, but most likely it's commit
>> 38bd64979a2a3ffa178af801c6a62e6fcd658274 (Enable threaded delta
>> search on BSD and Linux). Do you have multiple CPUs in the
>> computer where 'git gc' was running? If so, and if you've set
>> pack.threads = 0 or passed --threads=0, git will autodetect the
>> number of CPUs you have and saturate all of them with work. Each
>> thread, however, consumes memory close to that of a single
>> process running the repack, so for large repositories you might
>> want to set pack.threads = 1.
> 
> 
> It is a single-core Pentium M machine, but I am not sure whether git is
> using a single thread or multiple threads. I will try setting the
> pack.threads parameter the next time I run into trouble.
> 

Unless you explicitly told it to run multiple threads (which
would be a bit silly on a single-core machine), it just ran
one thread.

>> It's a shame you didn't save the unpacked repository, or this could
>> have been properly debugged. While it's possible there is a memory
>> leak, it's a dismal project trying to locate it by staring at the
>> code, and the time it takes to repack huge repositories with memory
>> intensive parameters is sort of prohibitive for finding the possible
>> leak by bisection.
> 
> 
> Yes, the repository is already packed now. One question: besides the
> bisecting method, is something like this built into the kernel?
> 1. Turn a flag on for a process.
> 2. The OS keeps track of the process's malloc() and free() calls and
>    their call stacks.
> 3. For malloc() calls without a matching free() call (a memory leak), the
>    OS keeps a count keyed on the malloc() call stack.
> 4. After some time, dump this information, sorted by the biggest
>    leak spot.
> 

No, there's not. The kernel isn't the one handing out the memory when you
call malloc(). That's handled by the C library, which can (and usually does)
allocate a larger area of memory than the application needs, so that it
doesn't have to run a system call for every malloc() call you do.

You can pre-load a different memory allocator though, which can do whatever
it wants with calls to malloc(), including of course logging which function
called them and how much memory was requested.
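
As a rough illustration of the bookkeeping such an allocator does, here is
a minimal sketch. It is not a pre-loadable library: it wraps malloc() at
the source level through a macro instead of interposing on it, which is a
simpler technique, and it records only the immediate call site rather than
a full call stack. The names tracked_malloc(), tracked_free(),
report_leaks() and TRACKED_MALLOC are made up for this example and do not
come from git or from any real leak checker.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record: one entry per allocation still live. */
struct live_alloc {
    void *ptr;                  /* address handed to the caller */
    size_t size;                /* bytes requested */
    const char *file;           /* call site that asked for the memory */
    int line;
    struct live_alloc *next;
};

static struct live_alloc *live_head;   /* allocations not yet freed */

/* Allocate memory and remember which call site asked for it. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    void *ptr;
    struct live_alloc *rec;

    assert(file != NULL);
    ptr = malloc(size);
    if (ptr == NULL)
        return NULL;
    rec = malloc(sizeof(*rec));
    if (rec == NULL) {          /* cannot track it, so do not hand it out */
        free(ptr);
        return NULL;
    }
    rec->ptr = ptr;
    rec->size = size;
    rec->file = file;
    rec->line = line;
    rec->next = live_head;
    live_head = rec;
    return ptr;
}

/* Free memory and drop its record, if there is one. */
void tracked_free(void *ptr)
{
    struct live_alloc **pp;

    for (pp = &live_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == ptr) {
            struct live_alloc *rec = *pp;
            *pp = rec->next;
            free(rec);
            break;
        }
    }
    free(ptr);
}

/* Print every allocation still live; return the total bytes outstanding. */
size_t report_leaks(FILE *out)
{
    size_t total = 0;
    const struct live_alloc *rec;

    for (rec = live_head; rec != NULL; rec = rec->next) {
        fprintf(out, "%zu bytes from %s:%d never freed\n",
                rec->size, rec->file, rec->line);
        total += rec->size;
    }
    return total;
}

/* Record the caller's file and line automatically. */
#define TRACKED_MALLOC(size) tracked_malloc((size), __FILE__, __LINE__)
```

Calling report_leaks(stderr) just before exit lists whatever is still
outstanding. A real tool does the same accounting without needing the
program to be recompiled.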

Google for "memory leak check linux" and you'll get something like 750000
results.


> The complaint when I ran out of memory is from an mmap failure. Is that the
> same as a malloc() failure?
> 

Sort of. Read 'man 2 mmap' for a more exhaustive description.
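
Both calls ask for memory and both can fail when the address space runs
out, but they report the failure differently. A minimal sketch, assuming a
Linux-like system with MAP_ANONYMOUS (try_mmap() and try_malloc() are
made-up helper names, not git functions):

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>

/*
 * Try an anonymous private mapping of len bytes.
 * Returns 0 on success, or the errno value set by mmap() on failure.
 */
int try_mmap(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int rc;

    if (p == MAP_FAILED)        /* mmap() signals failure with MAP_FAILED, not NULL */
        return errno;
    rc = munmap(p, len);
    assert(rc == 0);            /* unmapping what we just mapped should succeed */
    (void)rc;                   /* silence "unused" warnings when NDEBUG is set */
    return 0;
}

/*
 * Try a heap allocation of len bytes (len should be nonzero).
 * Returns 0 on success, or ENOMEM if malloc() returned NULL.
 */
int try_malloc(size_t len)
{
    void *p = malloc(len);

    if (p == NULL)              /* malloc() signals failure with NULL */
        return ENOMEM;
    free(p);
    return 0;
}
```

On a 32-bit process either one will start failing once the address space
is used up, whichever call happened to make the request.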

> This kind of tool is available on Windows as umdh (User-Mode Dump
> Heap).
> 

There are a number of tools to detect leaks under Linux/Unix as well.
valgrind is probably the most frequently used of all such leak checkers.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

* Re: out of memory problem
  2008-09-23  7:50     ` Andreas Ericsson
@ 2008-09-24 16:45       ` guo tang
  0 siblings, 0 replies; 4+ messages in thread
From: guo tang @ 2008-09-24 16:45 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Git mailing list

It happened again. I cannot make a copy of the repository and try bisecting:
the repository is 2.5GB, and copying it takes too long.

When I tried to bisect without starting from a fresh copy, the bisect failed
to find the bad commit, because the commit it flagged as bad works fine
once the repository is packed.

I did notice 3 things.

1. When I have the memory trouble, "git gc" cannot finish the counting
objects stage. It appears to be in an infinite loop.
2. When I ran "git count-objects -v", it reported an error: 1
garbage object found. The garbage object is named something like
.git/objects/35/tmp_xxx. Even when "git gc" finished without error, it
did not delete that garbage object. I guess there might be some
difference in how garbage objects are handled between v1.5.5.1 and
v1.6.0.2.
3. If I manually remove that garbage object using "rm", everything
still seems fine.

I will give valgrind a shot next time I am in trouble.

Thanks,
Guo
