* [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
@ 2011-07-22 13:05 Kirill Smelkov
2011-07-22 17:00 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Kirill Smelkov @ 2011-07-22 13:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Hello up there,
Today I've tried to upgrade to linux-3.0 on my netbook with 1GB of RAM,
with today's Git master (v1.7.6-233-gd79bcd6) and Git was killed by an OOM:
$ git new-workdir linux linux-3.0 linux-3.0.y
Checking out files: ~80% (XXXXX/36783)
Aborted
(with OOM messages in dmesg).
git new-workdir boils down to (uninteresting to the issue) symlinks
setup + final `git checkout -f <branch>` and that final git checkout is
failing.
It turned out that with Git v1.7.6 memory usage for git-checkout
linux-3.0.y as seen in top is
VIRTmax RESmax
~338M ~247M
and for master
VIRTmax RESmax
(both till not killed)
~2200M ~1000M
i.e. it looks like when residential memory usage approaches the amount of
physical RAM, the OOM killer comes into play.
And I've bisected this to b6691092 ("Add streaming filter API"; Junio C
Hamano, May 20 2011; merged to next on Jun 30 2011):
commit b6691092d707860019bbab80eaaf9173ada10586
Author: Junio C Hamano <gitster@pobox.com>
Date: Fri May 20 14:33:31 2011 -0700
Add streaming filter API
This introduces an API to plug custom filters to an input stream.
The caller gets get_stream_filter("path") to obtain an appropriate
filter for the path, and then uses it when opening an input stream
via open_istream(). After that, the caller can read from the stream
with read_istream(), and close it with close_istream(), just like an
unfiltered stream.
This only adds a "null" filter that is a pass-thru filter, but later
changes can add LF-to-CRLF and other filters, and the callers of the
streaming API do not have to change.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hope Git stays usable for us, with not enormous amount of RAM onboard,
and thanks beforehand for fixing,
Kirill
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 13:05 [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog Kirill Smelkov
@ 2011-07-22 17:00 ` Jeff King
2011-07-22 17:33 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2011-07-22 17:00 UTC (permalink / raw)
To: Kirill Smelkov; +Cc: Junio C Hamano, git
On Fri, Jul 22, 2011 at 05:05:18PM +0400, Kirill Smelkov wrote:
> It turned out that with Git v1.7.6 memory usage for git-checkout
> linux-3.0.y as seen in top is
>
> VIRTmax RESmax
>
> ~338M ~247M
>
> and for master
>
> VIRTmax RESmax
> (both till not killed)
> ~2200M ~1000M
>
>
> i.e. it looks like when residential memory usage approaches the amount of
> physical RAM, the OOM killer comes into play.
>
>
> And I've bisected this to b6691092 ("Add streaming filter API"; Junio C
> Hamano, May 20 2011; merged to next on Jun 30 2011):
Hmm, that series was supposed to _reduce_ memory usage. :)
According to valgrind, we are leaking gigabytes of memory allocated in
git_istream buffers:
$ cd linux-2.6
$ rm -vrf *
$ valgrind --leak-check=full git checkout -f
[...]
2,418,940,448 (1,163,660,832 direct, 1,255,279,616 indirect) bytes in 35,271
blocks are definitely lost in loss record 81 of 81
at 0x4C2780D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x517C62: xmalloc (wrapper.c:35)
by 0x503112: attach_stream_filter (streaming.c:255)
by 0x502D61: open_istream (streaming.c:152)
by 0x4B1B7A: streaming_write_entry (entry.c:130)
by 0x4B1E0B: write_entry (entry.c:193)
by 0x4B23F4: checkout_entry (entry.c:318)
by 0x511DA4: check_updates (unpack-trees.c:223)
by 0x513D72: unpack_trees (unpack-trees.c:1125)
by 0x41CCB4: reset_tree (checkout.c:333)
by 0x41CE32: merge_working_tree (checkout.c:378)
by 0x41DE6A: switch_branches (checkout.c:737)
This malloc is for the actual git_istream struct. It seems that we never
actually free it when calling close_istream(). And these structs are
quite big; they contain 32K of filter buffers inside a union.
>From my quick look, I came up with the fix below. It removes the leak
and doesn't trigger any memory errors according to valgrind. So it
_must_ be right. :)
-Peff
---
diff --git a/streaming.c b/streaming.c
index 565f000..f3acc5d 100644
--- a/streaming.c
+++ b/streaming.c
@@ -93,7 +93,9 @@ struct git_istream {
int close_istream(struct git_istream *st)
{
- return st->vtbl->close(st);
+ int r = st->vtbl->close(st);
+ free(st);
+ return r;
}
ssize_t read_istream(struct git_istream *st, char *buf, size_t sz)
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 17:00 ` Jeff King
@ 2011-07-22 17:33 ` Junio C Hamano
2011-07-22 17:36 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-07-22 17:33 UTC (permalink / raw)
To: Jeff King; +Cc: Kirill Smelkov, git
Jeff King <peff@peff.net> writes:
> From my quick look, I came up with the fix below. It removes the leak
> and doesn't trigger any memory errors according to valgrind. So it
> _must_ be right. :)
>
> -Peff
>
> ---
> diff --git a/streaming.c b/streaming.c
> index 565f000..f3acc5d 100644
> --- a/streaming.c
> +++ b/streaming.c
> @@ -93,7 +93,9 @@ struct git_istream {
>
> int close_istream(struct git_istream *st)
> {
> - return st->vtbl->close(st);
> + int r = st->vtbl->close(st);
> + free(st);
> + return r;
> }
I do not think of a reason any future callers would want to close the
stream first and then inspect some attribute of the stream afterwards (if
this were an output stream, there might be things like output stats that
may want such an interface, but even in such a case, the caller should say
"flush" first to drain whatever is pending, grab stats and then close), so
freeing the resource at close time seems sensible to me.
Both the external caller in entry.c and the internal caller (a filtered
istream reads from its underlying istream, and when it is getting closed,
closes the underlying istream) were leaking the istream exactly the same
way, so this fix should plug both of them.
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 17:33 ` Junio C Hamano
@ 2011-07-22 17:36 ` Jeff King
2011-07-22 20:26 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2011-07-22 17:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Kirill Smelkov, git
On Fri, Jul 22, 2011 at 10:33:31AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > From my quick look, I came up with the fix below. It removes the leak
> > and doesn't trigger any memory errors according to valgrind. So it
> > _must_ be right. :)
> >
> > -Peff
> >
> > ---
> > diff --git a/streaming.c b/streaming.c
> > index 565f000..f3acc5d 100644
> > --- a/streaming.c
> > +++ b/streaming.c
> > @@ -93,7 +93,9 @@ struct git_istream {
> >
> > int close_istream(struct git_istream *st)
> > {
> > - return st->vtbl->close(st);
> > + int r = st->vtbl->close(st);
> > + free(st);
> > + return r;
> > }
>
> I do not think of a reason any future callers would want to close the
> stream first and then inspect some attribute of the stream afterwards (if
> this were an output stream, there might be things like output stats that
> may want such an interface, but even in such a case, the caller should say
> "flush" first to drain whatever is pending, grab stats and then close), so
> freeing the resource at close time seems sensible to me.
Yeah. My impression was that these istreams would operate much like
"FILE *" streams. And closing those frees all resources, and you can
throw away the pointer afterwards.
> Both the external caller in entry.c and the internal caller (a filtered
> istream reads from its underlying istream, and when it is getting closed,
> closes the underlying istream) were leaking the istream exactly the same
> way, so this fix should plug both of them.
>
> Thanks.
Did you want me to write a commit message? I think you might do a better
job of writing a coherent one for this particular patch.
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 17:36 ` Jeff King
@ 2011-07-22 20:26 ` Junio C Hamano
2011-07-22 20:36 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-07-22 20:26 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, Kirill Smelkov, git
Jeff King <peff@peff.net> writes:
> Did you want me to write a commit message? I think you might do a better
> job of writing a coherent one for this particular patch.
I didn't do a very good job.
commit 06a21454b424d5bc0a36ed76ce46bb16ed156774
Author: Jeff King <peff@peff.net>
Date: Fri Jul 22 11:00:03 2011 -0600
streaming: free git_istream upon closing
Kirill Smelkov noticed that post 1.7.6 "git checkout" started leaking
tons of memory. close_istream() forgot to release its buffer and the
caller was not freeing it, either.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At least it needs your S-o-b ;-).
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 20:26 ` Junio C Hamano
@ 2011-07-22 20:36 ` Jeff King
2011-07-22 21:31 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2011-07-22 20:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Kirill Smelkov, git
On Fri, Jul 22, 2011 at 01:26:17PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Did you want me to write a commit message? I think you might do a better
> > job of writing a coherent one for this particular patch.
>
> I didn't do a very good job.
How about:
-- >8 --
Subject: [PATCH] streaming: free git_istream upon closing
Kirill Smelkov noticed that post-1.7.6 "git checkout"
started leaking tons of memory. The streaming_write_entry
function properly calls close_istream(), but that function
did not actually free() the allocated git_istream struct.
The git_istream struct is totally opaque to calling code,
and must be heap-allocated by open_istream. Therefore it's
not appropriate for callers to have to free it.
This patch makes close_istream() into "close and de-allocate
all associated resources". We could add a new "free_istream"
call, but there's not much point in letting callers inspect
the istream after close. And this patch's semantics make us
match fopen/fclose, which is well-known and understood.
Signed-off-by: Jeff King <peff@peff.net>
---
streaming.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/streaming.c b/streaming.c
index 565f000..f3acc5d 100644
--- a/streaming.c
+++ b/streaming.c
@@ -93,7 +93,9 @@ struct git_istream {
int close_istream(struct git_istream *st)
{
- return st->vtbl->close(st);
+ int r = st->vtbl->close(st);
+ free(st);
+ return r;
}
ssize_t read_istream(struct git_istream *st, char *buf, size_t sz)
--
1.7.6.rc1.12.g65e2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 20:36 ` Jeff King
@ 2011-07-22 21:31 ` Junio C Hamano
2011-07-23 18:04 ` Kirill Smelkov
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-07-22 21:31 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, Kirill Smelkov, git
Jeff King <peff@peff.net> writes:
> How about:
>
> -- >8 --
> Subject: [PATCH] streaming: free git_istream upon closing
>
> Kirill Smelkov noticed that post-1.7.6 "git checkout"
> started leaking tons of memory. The streaming_write_entry
> function properly calls close_istream(), but that function
> did not actually free() the allocated git_istream struct.
>
> The git_istream struct is totally opaque to calling code,
> and must be heap-allocated by open_istream. Therefore it's
> not appropriate for callers to have to free it.
>
> This patch makes close_istream() into "close and de-allocate
> all associated resources". We could add a new "free_istream"
> call, but there's not much point in letting callers inspect
> the istream after close. And this patch's semantics make us
> match fopen/fclose, which is well-known and understood.
>
> Signed-off-by: Jeff King <peff@peff.net>
Much nicer ;-) Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog
2011-07-22 21:31 ` Junio C Hamano
@ 2011-07-23 18:04 ` Kirill Smelkov
0 siblings, 0 replies; 8+ messages in thread
From: Kirill Smelkov @ 2011-07-23 18:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, git
On Fri, Jul 22, 2011 at 02:31:22PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > How about:
> >
> > -- >8 --
> > Subject: [PATCH] streaming: free git_istream upon closing
> >
> > Kirill Smelkov noticed that post-1.7.6 "git checkout"
> > started leaking tons of memory. The streaming_write_entry
> > function properly calls close_istream(), but that function
> > did not actually free() the allocated git_istream struct.
> >
> > The git_istream struct is totally opaque to calling code,
> > and must be heap-allocated by open_istream. Therefore it's
> > not appropriate for callers to have to free it.
> >
> > This patch makes close_istream() into "close and de-allocate
> > all associated resources". We could add a new "free_istream"
> > call, but there's not much point in letting callers inspect
> > the istream after close. And this patch's semantics make us
> > match fopen/fclose, which is well-known and understood.
> >
> > Signed-off-by: Jeff King <peff@peff.net>
>
> Much nicer ;-) Thanks.
Jeff, Junio,
Thanks for fixing this.
Kirill
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-07-23 18:05 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-07-22 13:05 [REGRESSION, BISECTED] `git checkout <branch>` started to be memory hog Kirill Smelkov
2011-07-22 17:00 ` Jeff King
2011-07-22 17:33 ` Junio C Hamano
2011-07-22 17:36 ` Jeff King
2011-07-22 20:26 ` Junio C Hamano
2011-07-22 20:36 ` Jeff King
2011-07-22 21:31 ` Junio C Hamano
2011-07-23 18:04 ` Kirill Smelkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).