git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] diff: release blobs after generating textual diff.
@ 2007-05-06  8:47 Junio C Hamano
  2007-05-06  8:49 ` [PATCH] diff.c: truly honor size_only request for populate_filespec Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Junio C Hamano @ 2007-05-06  8:47 UTC (permalink / raw)
  To: git

This reduces the memory pressure when dealing with many paths.

An unscientific test of running "diff-tree --stat --summary -M"
between v2.6.19 and v2.6.20-rc1 in the linux kernel repository
indicates that the number of minor faults are reduced by 2/3
(153k vs 49k).

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * This is still a WIP, not in the sense that it breaks anything
   (it doesn't seem to), but in the sense that it is not known
   if it is useful in general and would make that much of a
   difference with a project much larger than the kernel.

diff --git a/diff.c b/diff.c
index 7bbe759..9de053c 100644
--- a/diff.c
+++ b/diff.c
@@ -1236,6 +1236,8 @@ static void builtin_diff(const char *name_a,
 	}
 
  free_ab_and_return:
+	diff_free_filespec_data(one);
+	diff_free_filespec_data(two);
 	free(a_one);
 	free(b_two);
 	return;
@@ -1262,7 +1264,7 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		diff_populate_filespec(two, 0);
 		data->deleted = count_lines(one->data, one->size);
 		data->added = count_lines(two->data, two->size);
-		return;
+		goto free_and_return;
 	}
 	if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0)
 		die("unable to read files to diff");
@@ -1284,6 +1286,10 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		ecb.priv = diffstat;
 		xdl_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);
 	}
+
+ free_and_return:
+	diff_free_filespec_data(one);
+	diff_free_filespec_data(two);
 }
 
 static void builtin_checkdiff(const char *name_a, const char *name_b,
@@ -1306,7 +1312,7 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		die("unable to read files to diff");
 
 	if (file_is_binary(two))
-		return;
+		goto free_and_return;
 	else {
 		/* Crazy xdl interfaces.. */
 		xpparam_t xpp;
@@ -1320,6 +1326,9 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		ecb.priv = &data;
 		xdl_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);
 	}
+ free_and_return:
+	diff_free_filespec_data(one);
+	diff_free_filespec_data(two);
 }
 
 struct diff_filespec *alloc_filespec(const char *path)

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] diff.c: truly honor size_only request for populate_filespec.
  2007-05-06  8:47 [PATCH] diff: release blobs after generating textual diff Junio C Hamano
@ 2007-05-06  8:49 ` Junio C Hamano
  2007-05-06  8:51 ` [PATCH] diff: use filespec as in-place size cache Junio C Hamano
  2007-05-06 17:03 ` [PATCH] diff: release blobs after generating textual diff Nicolas Pitre
  2 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2007-05-06  8:49 UTC (permalink / raw)
  To: git

For some reason we did not honor size_only requests when "size cache"
was in use.  They should be independent.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * I think this is a right fix.  When used together with other
   patches, it seems to help reducing the memory footprint a
   bit, but by itself not that much and it seems to have
   slightly negative performance impact.

 diff.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 7bbe759..02c9467 100644
--- a/diff.c
+++ b/diff.c
@@ -1503,9 +1503,6 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 	if (S_ISDIR(s->mode))
 		return -1;
 
-	if (!use_size_cache)
-		size_only = 0;
-
 	if (s->data)
 		return err;
 
-- 
1.5.2.rc1.697.g5086

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] diff: use filespec as in-place size cache
  2007-05-06  8:47 [PATCH] diff: release blobs after generating textual diff Junio C Hamano
  2007-05-06  8:49 ` [PATCH] diff.c: truly honor size_only request for populate_filespec Junio C Hamano
@ 2007-05-06  8:51 ` Junio C Hamano
  2007-05-06 17:03 ` [PATCH] diff: release blobs after generating textual diff Nicolas Pitre
  2 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2007-05-06  8:51 UTC (permalink / raw)
  To: git

When we have a filespec primed with size, later size_only
request should be fillable without recomputing it.

---
diff --git a/diff.c b/diff.c
index 0f8c68f..fb3eba5 100644
--- a/diff.c
+++ b/diff.c
@@ -1515,6 +1515,9 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 	if (s->data)
 		return err;
 
+	if (size_only && 0 < s->size)
+		return err;
+
 	if (S_ISDIRLNK(s->mode))
 		return diff_populate_gitlink(s, size_only);
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: release blobs after generating textual diff.
  2007-05-06  8:47 [PATCH] diff: release blobs after generating textual diff Junio C Hamano
  2007-05-06  8:49 ` [PATCH] diff.c: truly honor size_only request for populate_filespec Junio C Hamano
  2007-05-06  8:51 ` [PATCH] diff: use filespec as in-place size cache Junio C Hamano
@ 2007-05-06 17:03 ` Nicolas Pitre
  2007-05-06 21:36   ` Robin Rosenberg
  2 siblings, 1 reply; 6+ messages in thread
From: Nicolas Pitre @ 2007-05-06 17:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, 6 May 2007, Junio C Hamano wrote:

> This reduces the memory pressure when dealing with many paths.
> 
> An unscientific test of running "diff-tree --stat --summary -M"
> between v2.6.19 and v2.6.20-rc1 in the linux kernel repository
> indicates that the number of minor faults are reduced by 2/3
> (153k vs 49k).
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>
> ---
> 
>  * This is still a WIP, not in the sense that it breaks anything
>    (it doesn't seem to), but in the sense that it is not known
>    if it is useful in general and would make that much of a
>    difference with a project much larger than the kernel.

This can only be good.  People are really starting to use Git with 
gigantic repos on limited memory hardware.


Nicolas

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: release blobs after generating textual diff.
  2007-05-06 17:03 ` [PATCH] diff: release blobs after generating textual diff Nicolas Pitre
@ 2007-05-06 21:36   ` Robin Rosenberg
  2007-05-08 20:58     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Rosenberg @ 2007-05-06 21:36 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

söndag 06 maj 2007 skrev Nicolas Pitre:
> On Sun, 6 May 2007, Junio C Hamano wrote:
> 
> > This reduces the memory pressure when dealing with many paths.
> > 
> > An unscientific test of running "diff-tree --stat --summary -M"
> > between v2.6.19 and v2.6.20-rc1 in the linux kernel repository
> > indicates that the number of minor faults are reduced by 2/3
> > (153k vs 49k).
> > 
> > Signed-off-by: Junio C Hamano <junkio@cox.net>
> > ---
> > 
> >  * This is still a WIP, not in the sense that it breaks anything
> >    (it doesn't seem to), but in the sense that it is not known
> >    if it is useful in general and would make that much of a
> >    difference with a project much larger than the kernel.
> 
> This can only be good.  People are really starting to use Git with 
> gigantic repos on limited memory hardware.

This did wonders on the usually unreasonable diffs on huge repos. The openoffice
diff mentioned in the openoffice thread went from 6 to ~3 minutes, and most importantly
the computer was perfectly usable meanwhile. Git memory usage dropped from 1,7GB to
400MB.

A more reasonable test diffing against a recent branch , master vs v33M4-patches, in 
the eclipse repo some of you have didn't gain much in performance, but memory usage
dropped from 700MB to a peak just under 400MB, which makes a huge difference in
responsiveness for the other applications that I have, since they were not swapped out
during the diff.

-- robin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] diff: release blobs after generating textual diff.
  2007-05-06 21:36   ` Robin Rosenberg
@ 2007-05-08 20:58     ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2007-05-08 20:58 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Nicolas Pitre, git

Robin Rosenberg <robin.rosenberg.lists@dewire.com> writes:

> söndag 06 maj 2007 skrev Nicolas Pitre:
>> On Sun, 6 May 2007, Junio C Hamano wrote:
>> 
>> > This reduces the memory pressure when dealing with many paths.
>> > 
>> > An unscientific test of running "diff-tree --stat --summary -M"
>> > between v2.6.19 and v2.6.20-rc1 in the linux kernel repository
>> > indicates that the number of minor faults are reduced by 2/3
>> > (153k vs 49k).
>> > 
>> > Signed-off-by: Junio C Hamano <junkio@cox.net>
>> > ---
>> > 
>> >  * This is still a WIP, not in the sense that it breaks anything
>> >    (it doesn't seem to), but in the sense that it is not known
>> >    if it is useful in general and would make that much of a
>> >    difference with a project much larger than the kernel.
>> 
>> This can only be good.  People are really starting to use Git with 
>> gigantic repos on limited memory hardware.
>
> This did wonders on the usually unreasonable diffs on huge
> repos. The openoffice diff mentioned in the openoffice thread
> went from 6 to ~3 minutes, and most importantly the computer
> was perfectly usable meanwhile. Git memory usage dropped from
> 1,7GB to 400MB.

I've parked a cleaned-up version in 'next'.  Hopefully we can
merge it to 'master' before 1.5.2, as it is not about a new
feature but about fixing a performance 'bug'.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2007-05-08 21:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-06  8:47 [PATCH] diff: release blobs after generating textual diff Junio C Hamano
2007-05-06  8:49 ` [PATCH] diff.c: truly honor size_only request for populate_filespec Junio C Hamano
2007-05-06  8:51 ` [PATCH] diff: use filespec as in-place size cache Junio C Hamano
2007-05-06 17:03 ` [PATCH] diff: release blobs after generating textual diff Nicolas Pitre
2007-05-06 21:36   ` Robin Rosenberg
2007-05-08 20:58     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).