* dmalloc and leaks in git
@ 2007-12-08 20:53 Jon Smirl
2007-12-08 20:58 ` Johannes Schindelin
2007-12-09 20:57 ` Linus Torvalds
0 siblings, 2 replies; 7+ messages in thread
From: Jon Smirl @ 2007-12-08 20:53 UTC
To: Git Mailing List
It is very easy to use dmalloc with git. Follow the instructions here,
http://dmalloc.com/docs/latest/online/dmalloc_4.html
But using dmalloc shows a pervasive problem, none of the git commands
are cleaning up after themselves. For example I ran a simple command,
git-status, and thousands of objects were not freed.
Normally this doesn't hurt since exiting the process obviously frees
all of the memory. But when programming this way it becomes impossible
to tell which leaks were on purpose and which were accidental.
To sort this out, a DMALLOC build flag needs to be defined, and code
for freeing all of the 'on purpose' leaks needs to be added inside an
#ifdef DMALLOC block right before the process exits. The test scripts
can then be modified to ensure that everything is freed when the
command exits.
I've used this process on several projects I've managed and it is a
very good thing to do. Once the new infrastructure is in place leaks
can be detected automatically and nipped in the bud before they get
out of control. The key to making this work is getting code in place
in the #ifdef to free those "on-purpose" leaks.
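A minimal sketch of what such a cleanup hook could look like, in C. This
is not code from git: the names table, names_add(), names_release() and
leak_check_setup() are all hypothetical, and it assumes the build passes
-DDMALLOC when leak checking is wanted. It also assumes the handler
registered with atexit() runs before dmalloc writes its shutdown report.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example: a table of names that a command fills once and
 * deliberately never frees, because the process is about to exit anyway.
 */
static char **names;
static size_t names_nr;

/* Append a private copy of `name` to the table; abort on allocation failure. */
void names_add(const char *name)
{
	size_t len = strlen(name) + 1;
	char **grown = realloc(names, (names_nr + 1) * sizeof(*names));
	char *copy = malloc(len);

	if (!grown || !copy)
		abort();
	memcpy(copy, name, len);
	names = grown;
	names[names_nr++] = copy;
}

/* Number of entries currently held in the table. */
size_t names_count(void)
{
	return names_nr;
}

/* Free every entry and the table itself; safe to call more than once. */
void names_release(void)
{
	size_t i;

	for (i = 0; i < names_nr; i++)
		free(names[i]);
	free(names);
	names = NULL;
	names_nr = 0;
}

/*
 * Call once at startup. In a normal build this does nothing, so exit
 * stays fast. In a -DDMALLOC build the "on purpose" leak is released at
 * exit, so whatever is still reported afterwards is an accidental leak.
 */
void leak_check_setup(void)
{
#ifdef DMALLOC
	atexit(names_release);
#endif
}
```

A command would call leak_check_setup() early in main() and use
names_add() as usual. Each subsystem that keeps memory until exit would
need its own release function registered the same way.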
I tried a couple of years ago to add leak detection to Mozilla but
Mozilla is way too far gone. There are 10,000 places where things are
allocated and not being freed. It is a huge manually intensive task
sorting out which of these were on-purpose vs accidental. If Mozilla
had followed a discipline of ensuring that nothing was ever leaked
(by using the scheme above) a lot of recent leak clean up work could
have been avoided.
I don't know enough about the structure of git to add the cleanups in
#ifdefs before exit. People who wrote the commands are going to have
to help out with this.
diff --git a/Makefile b/Makefile
index 0a5df7a..426830c 100644
--- a/Makefile
+++ b/Makefile
@@ -752,7 +752,7 @@ SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))
TCLTK_PATH_SQ = $(subst ','\'',$(TCLTK_PATH))
-LIBS = $(GITLIBS) $(EXTLIBS)
+LIBS = $(GITLIBS) $(EXTLIBS) -ldmalloc
BASIC_CFLAGS += -DSHA1_HEADER='$(SHA1_HEADER_SQ)' \
$(COMPAT_CFLAGS)
diff --git a/git-compat-util.h b/git-compat-util.h
index 79eb10e..8894c30 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -428,3 +428,5 @@ static inline int strtol_i(char const *s, int base, int *result)
}
#endif
+
+#include "dmalloc.h"
--
Jon Smirl
jonsmirl@gmail.com
* Re: dmalloc and leaks in git
2007-12-08 20:53 dmalloc and leaks in git Jon Smirl
@ 2007-12-08 20:58 ` Johannes Schindelin
2007-12-08 21:02 ` Jon Smirl
2007-12-09 20:57 ` Linus Torvalds
1 sibling, 1 reply; 7+ messages in thread
From: Johannes Schindelin @ 2007-12-08 20:58 UTC
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Sat, 8 Dec 2007, Jon Smirl wrote:
> It is very easy to use dmalloc with git. Follow the instructions here,
> http://dmalloc.com/docs/latest/online/dmalloc_4.html
>
> But using dmalloc shows a pervasive problem, none of the git commands
> are cleaning up after themselves. For example I ran a simple command,
> git-status, and thousands of objects were not freed.
Known problem. Goes by the name of "libification" on this list.
Ciao,
Dscho
* Re: dmalloc and leaks in git
2007-12-08 20:58 ` Johannes Schindelin
@ 2007-12-08 21:02 ` Jon Smirl
2007-12-08 21:19 ` Johannes Schindelin
0 siblings, 1 reply; 7+ messages in thread
From: Jon Smirl @ 2007-12-08 21:02 UTC
To: Johannes Schindelin; +Cc: Git Mailing List
On 12/8/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Sat, 8 Dec 2007, Jon Smirl wrote:
>
> > It is very easy to use dmalloc with git. Follow the instructions here,
> > http://dmalloc.com/docs/latest/online/dmalloc_4.html
> >
> > But using dmalloc shows a pervasive problem, none of the git commands
> > are cleaning up after themselves. For example I ran a simple command,
> > git-status, and thousands of objects were not freed.
>
> Known problem. Goes by the name of "libification" on this list.
I tried using dmalloc to find the leak in repack but it is impossible
to sort out the accidental leaks from the on-purpose ones. On exit
there were millions of unfreed objects coming from all over the place.
--
Jon Smirl
jonsmirl@gmail.com
* Re: dmalloc and leaks in git
2007-12-08 21:02 ` Jon Smirl
@ 2007-12-08 21:19 ` Johannes Schindelin
0 siblings, 0 replies; 7+ messages in thread
From: Johannes Schindelin @ 2007-12-08 21:19 UTC
To: Jon Smirl; +Cc: Git Mailing List
Hi,
On Sat, 8 Dec 2007, Jon Smirl wrote:
> On 12/8/07, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > Hi,
> >
> > On Sat, 8 Dec 2007, Jon Smirl wrote:
> >
> > > It is very easy to use dmalloc with git. Follow the instructions here,
> > > http://dmalloc.com/docs/latest/online/dmalloc_4.html
> > >
> > > But using dmalloc shows a pervasive problem, none of the git commands
> > > are cleaning up after themselves. For example I ran a simple command,
> > > git-status, and thousands of objects were not freed.
> >
> > Known problem. Goes by the name of "libification" on this list.
>
> I tried using dmalloc to find the leak in repack but it is impossible
> to sort out the accidental leaks from the on-purpose ones. On exit
> there were millions of unfreed objects coming from all over the place.
This might be a starting point:
http://repo.or.cz/w/git/dscho.git?a=commitdiff;h=2083418c5010f04fbcd6e1f67de522ad6acd863d
Hth,
Dscho
* Re: dmalloc and leaks in git
2007-12-08 20:53 dmalloc and leaks in git Jon Smirl
2007-12-08 20:58 ` Johannes Schindelin
@ 2007-12-09 20:57 ` Linus Torvalds
2007-12-10 16:34 ` Linus Torvalds
1 sibling, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2007-12-09 20:57 UTC
To: Jon Smirl; +Cc: Git Mailing List
On Sat, 8 Dec 2007, Jon Smirl wrote:
>
> But using dmalloc shows a pervasive problem, none of the git commands
> are cleaning up after themselves. For example I ran a simple command,
> git-status, and thousands of objects were not freed.
One thing to do is to use a better reporting tool.
For example, if you use
valgrind --tool=massif --heap=yes ...
it will generate a postscript file with the allocation history as a graph
of the different allocators in different colors etc. That would likely
show where the big users come from.
Linus
* Re: dmalloc and leaks in git
2007-12-09 20:57 ` Linus Torvalds
@ 2007-12-10 16:34 ` Linus Torvalds
2007-12-10 16:54 ` Nicolas Pitre
0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2007-12-10 16:34 UTC
To: Jon Smirl; +Cc: Git Mailing List
On Sun, 9 Dec 2007, Linus Torvalds wrote:
>
> For example, if you use
>
> valgrind --tool=massif --heap=yes ...
I tried this on my copy of the gcc thing, but I didn't do the extreme
packing thing, so I never saw the 3.4GB usage. Massif just reported a 200M
heap, and about half of that was "add_object_entry".
Of course, that doesn't report any mmap usage at all, so it totally
ignores the mapping of the original pack-file itself (which will obviously
be totally dense by the end, since we look at all objects).
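To illustrate why a heap profiler does not see that memory, here is a
rough sketch in C of mapping a file read-only with POSIX mmap(). It is
not git's pack-mapping code; map_whole_file() is a hypothetical helper.
The point is only that the pages come straight from the kernel and never
pass through malloc(), so a tool that tracks malloc()/free() calls has
nothing to record for them.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map a whole file read-only. Returns the mapping and stores its size
 * in *len, or returns NULL on failure (including an empty file, which
 * cannot be mapped). The caller releases the mapping with munmap().
 */
void *map_whole_file(const char *path, size_t *len)
{
	struct stat st;
	void *map;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return NULL;
	}
	/* No malloc() involved: the kernel backs these pages with the file. */
	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd); /* the mapping stays valid after the descriptor is closed */
	if (map == MAP_FAILED)
		return NULL;
	*len = (size_t)st.st_size;
	return map;
}
```

Pages of such a mapping only become resident as they are touched, which
is why a pass that reads every object ends up with the whole file dense
in memory even though the heap numbers stay unchanged.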
It also doesn't take into account various secondary effects. For example,
I don't think it looks at heap fragmentation issues etc, which normally
aren't a noticeable thing, but maybe some particular allocation pattern
can make the glibc allocator waste horrid amounts of memory or something
like that.
Linus
* Re: dmalloc and leaks in git
2007-12-10 16:34 ` Linus Torvalds
@ 2007-12-10 16:54 ` Nicolas Pitre
0 siblings, 0 replies; 7+ messages in thread
From: Nicolas Pitre @ 2007-12-10 16:54 UTC
To: Linus Torvalds; +Cc: Jon Smirl, Git Mailing List
On Mon, 10 Dec 2007, Linus Torvalds wrote:
>
>
> On Sun, 9 Dec 2007, Linus Torvalds wrote:
> >
> > For example, if you use
> >
> > valgrind --tool=massif --heap=yes ...
>
> I tried this on my copy of the gcc thing, but I didn't do the extreme
> packing thing, so I never saw the 3.4GB usage. Massif just reported a 200M
> heap, and about half of that was "add_object_entry".
So far, it seems that the problem occurs much more severely when you run
'git repack -a -f' while using the already highly packed gcc repo as a
starting point.
It remains to be determined whether it occurs only when the repack is
threaded, or whether threading has no significance.
Nicolas