git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] rerere: fix overeager gc
@ 2010-06-29 11:38 SZEDER Gábor
  2010-06-29 17:59 ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: SZEDER Gábor @ 2010-06-29 11:38 UTC (permalink / raw)
  To: git; +Cc: SZEDER Gábor

'rerere gc' prunes resolutions of conflicted merges that occurred long
time ago, and when doing so it takes the creation time of the merge
conflict resolution into account.  This can cause the loss of
frequently used merge resolutions (e.g. long-living topic branches are
merged into a regularly rebuilt integration branch (think of git's
pu)) when they become old enough to exceed 'rerere gc's threshold.

Prevent the loss of valuable merge resolutions by changing 'rerere gc'
to take the time of last usage of the merge resolution into account
when determining whether a merge resolution should be pruned.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---

I was wondering that every once in a while when I got a merge conflict
during rebuilding my integration branch then it was usually followed
with a bunch of other conflicts as well, even though nothing really
changed around the conflicting areas.  Until today at last I noticed
that it happens right after doing a 'git gc'...

RFC, because I would not say that I put in too much effort to fully
understand how rerere works internally...  As far as I observed
rerere's behaviour and understood its code, thisimage is always
rewritten each time a merge resolution is used.  But I'm not sure I
can rely on that when gc'ing.

 builtin/rerere.c  |   15 +++++++++++++--
 t/t4200-rerere.sh |   18 +++++++++++++++++-
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 0048f9e..e095852 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -19,6 +19,12 @@ static time_t rerere_created_at(const char *name)
 	return stat(rerere_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
 }
 
+static time_t rerere_last_used_at(const char *name)
+{
+	struct stat st;
+	return stat(rerere_path(name, "thisimage"), &st) ? (time_t) 0 : st.st_mtime;
+}
+
 static void unlink_rr_item(const char *name)
 {
 	unlink(rerere_path(name, "thisimage"));
@@ -56,8 +62,13 @@ static void garbage_collect(struct string_list *rr)
 		then = rerere_created_at(e->d_name);
 		if (!then)
 			continue;
-		cutoff = (has_rerere_resolution(e->d_name)
-			  ? cutoff_resolve : cutoff_noresolve);
+		if (has_rerere_resolution(e->d_name)) {
+			then = rerere_last_used_at(e->d_name);
+			if (!then)
+				continue;
+			cutoff = cutoff_resolve;
+		} else
+			cutoff = cutoff_noresolve;
 		if (then < now - cutoff * 86400)
 			string_list_append(e->d_name, &to_remove);
 	}
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 70856d0..45c9df8 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -154,33 +154,49 @@ test_expect_success 'clear removed the directory' "test ! -d $rr"
 mkdir $rr
 echo Hello > $rr/preimage
 echo World > $rr/postimage
+echo "!" > $rr/thisimage
 
 sha2=4000000000000000000000000000000000000000
 rr2=.git/rr-cache/$sha2
 mkdir $rr2
 echo Hello > $rr2/preimage
 
+sha3=ffffffffffffffffffffffffffffffffffffffff
+rr3=.git/rr-cache/$sha3
+mkdir $rr3
+echo Hello > $rr3/preimage
+echo World > $rr3/postimage
+echo "!" > $rr3/thisimage
+
 almost_15_days_ago=$((60-15*86400))
 just_over_15_days_ago=$((-1-15*86400))
 almost_60_days_ago=$((60-60*86400))
 just_over_60_days_ago=$((-1-60*86400))
 
 test-chmtime =$almost_60_days_ago $rr/preimage
+test-chmtime =$almost_60_days_ago $rr/thisimage
 test-chmtime =$almost_15_days_ago $rr2/preimage
+test-chmtime =$almost_60_days_ago $rr3/preimage
+test-chmtime =$almost_60_days_ago $rr3/thisimage
 
 test_expect_success 'garbage collection (part1)' 'git rerere gc'
 
 test_expect_success 'young records still live' \
-	"test -f $rr/preimage && test -f $rr2/preimage"
+	"test -f $rr/preimage && test -f $rr2/preimage && test -f $rr3/preimage"
 
 test-chmtime =$just_over_60_days_ago $rr/preimage
+test-chmtime =$just_over_60_days_ago $rr/thisimage
 test-chmtime =$just_over_15_days_ago $rr2/preimage
+test-chmtime =$just_over_60_days_ago $rr3/preimage
 
 test_expect_success 'garbage collection (part2)' 'git rerere gc'
 
 test_expect_success 'old records rest in peace' \
 	"test ! -f $rr/preimage && test ! -f $rr2/preimage"
 
+test_expect_success 'recently used records are still there' \
+	"test -f $rr3/preimage"
+
 test_expect_success 'file2 added differently in two branches' '
 	git reset --hard &&
 	git checkout -b fourth &&
-- 
1.7.2.rc0.42.g400d

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-06-29 11:38 [RFC PATCH] rerere: fix overeager gc SZEDER Gábor
@ 2010-06-29 17:59 ` Junio C Hamano
  2010-06-30  6:12   ` Johannes Sixt
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2010-06-29 17:59 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

SZEDER Gábor <szeder@ira.uka.de> writes:

> ...  But I'm not sure I
> can rely on that when gc'ing.

Looking at the timestamp of "thisimage" would probaboly be more sensible
than "preimage" alone, _if_ "thisimage" still exists.  It is rewritten
every time this particular conflict is observed; this is not necessarily
when the recorded resolution is _used_, but it may be close enough in
practice.

You probably would want to rename the helper as "last_checked_at", though.

After rerere does its work, however, "thisimage" does not have to stay
around (the user can remove it, or we could enhance "gc" to do so).

> +		if (has_rerere_resolution(e->d_name)) {
> +			then = rerere_last_used_at(e->d_name);
> +			if (!then)
> +				continue;

Here you already know that you have resolution (i.e. "postimage"), but
your new function cannot stat a corresponding "thisimage", so you err
on the safer side---but that means you may keep pre/post image pairs
forever if somebody removes otherwise unused "thisimage" from a distant
past.  Perhaps we should apply cutoff_noresolve to the entry here?

One possibility is to look at the timestamp of the directory itself
instead.  Then we can safely gc otherwise-unused "thisimage" file when
rerere is not in use.  I wonder if directory m_time timestamps are usable
for this purpose on non-POSIX platforms?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-06-29 17:59 ` Junio C Hamano
@ 2010-06-30  6:12   ` Johannes Sixt
  2010-06-30  8:01     ` SZEDER Gábor
                       ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Johannes Sixt @ 2010-06-30  6:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git

Am 6/29/2010 19:59, schrieb Junio C Hamano:
> One possibility is to look at the timestamp of the directory itself
> instead.  Then we can safely gc otherwise-unused "thisimage" file when
> rerere is not in use.  I wonder if directory m_time timestamps are usable
> for this purpose on non-POSIX platforms?

I don't think that will work at all: We only use fopen() to write
thisimage, which only truncates the file, but doesn't modify mtime of the
directory. Nor do we create any other (temporary) directory entries that
would modify the mtime.

Would it be possible to update the timestamp of preimage every time it is
used (e.g., in rerere.c:merge()), and check for that?

-- Hannes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-06-30  6:12   ` Johannes Sixt
@ 2010-06-30  8:01     ` SZEDER Gábor
  2010-06-30 15:22     ` Junio C Hamano
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: SZEDER Gábor @ 2010-06-30  8:01 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git

Hi,

thank you both for your comments.


On Wed, Jun 30, 2010 at 08:12:30AM +0200, Johannes Sixt wrote:
> Am 6/29/2010 19:59, schrieb Junio C Hamano:
> > One possibility is to look at the timestamp of the directory itself
> > instead.  Then we can safely gc otherwise-unused "thisimage" file when
> > rerere is not in use.  I wonder if directory m_time timestamps are usable
> > for this purpose on non-POSIX platforms?
> 
> I don't think that will work at all: We only use fopen() to write
> thisimage, which only truncates the file, but doesn't modify mtime of the
> directory. Nor do we create any other (temporary) directory entries that
> would modify the mtime.

Indeed; on Linux I have:

drwxr-xr-x 2 szeder szeder 4096 2010-06-24 10:59 .git/rr-cache/13e67feeb07f97d6fccc2257d793d93ec4e730bf/

-rw-r--r-- 1 szeder szeder 3095 2010-06-30 04:56 .git/rr-cache/13e67feeb07f97d6fccc2257d793d93ec4e730bf/thisimage


> Would it be possible to update the timestamp of preimage every time it is
> used (e.g., in rerere.c:merge()), and check for that?

Will take a look.


Best,
Gábor

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-06-30  6:12   ` Johannes Sixt
  2010-06-30  8:01     ` SZEDER Gábor
@ 2010-06-30 15:22     ` Junio C Hamano
  2010-06-30 15:40       ` Johannes Sixt
  2010-07-01  9:36     ` [PATCH v2] " SZEDER Gábor
  2010-07-01 16:27     ` [RFC PATCH] " Junio C Hamano
  3 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2010-06-30 15:22 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: SZEDER Gábor, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Am 6/29/2010 19:59, schrieb Junio C Hamano:
>> One possibility is to look at the timestamp of the directory itself
>> instead.  Then we can safely gc otherwise-unused "thisimage" file when
>> rerere is not in use.  I wonder if directory m_time timestamps are usable
>> for this purpose on non-POSIX platforms?
>
> I don't think that will work at all: We only use fopen() to write
> thisimage, which only truncates the file, but doesn't modify mtime of the
> directory. Nor do we create any other (temporary) directory entries that
> would modify the mtime.

Ah, I see; I don't mind a patch that fixes the creation of thisimage to
follow the "create into temporary and then commit by renaming" pattern.

Would that solve this issue?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-06-30 15:22     ` Junio C Hamano
@ 2010-06-30 15:40       ` Johannes Sixt
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Sixt @ 2010-06-30 15:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git

Am 6/30/2010 17:22, schrieb Junio C Hamano:
> Johannes Sixt <j.sixt@viscovery.net> writes:
>> Am 6/29/2010 19:59, schrieb Junio C Hamano:
>>> One possibility is to look at the timestamp of the directory itself
>>> instead.  Then we can safely gc otherwise-unused "thisimage" file when
>>> rerere is not in use.  I wonder if directory m_time timestamps are usable
>>> for this purpose on non-POSIX platforms?
>>
>> I don't think that will work at all: We only use fopen() to write
>> thisimage, which only truncates the file, but doesn't modify mtime of the
>> directory. Nor do we create any other (temporary) directory entries that
>> would modify the mtime.
> 
> Ah, I see; I don't mind a patch that fixes the creation of thisimage to
> follow the "create into temporary and then commit by renaming" pattern.
> 
> Would that solve this issue?

I think so. On Windows, the directory's mtime is updated.

-- Hannes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2] rerere: fix overeager gc
  2010-06-30  6:12   ` Johannes Sixt
  2010-06-30  8:01     ` SZEDER Gábor
  2010-06-30 15:22     ` Junio C Hamano
@ 2010-07-01  9:36     ` SZEDER Gábor
  2010-07-01 10:10       ` Johannes Sixt
  2010-07-01 16:27     ` [RFC PATCH] " Junio C Hamano
  3 siblings, 1 reply; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-01  9:36 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, SZEDER Gábor

'rerere gc' prunes resolutions of conflicted merges that occurred long
time ago, and when doing so it takes the creation time of the
conflicted automerge results into account.  This can cause the loss of
frequently used merge resolutions (e.g. long-living topic branches are
merged into a regularly rebuilt integration branch (think of git's
pu)) when they become old enough to exceed 'rerere gc's threshold.

Prevent the loss of valuable merge resolutions by updating the
timestamp of the conflicted automerge result each time when
encountering the same merge conflict.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---

On Wed, Jun 30, 2010 at 08:12:30AM +0200, Johannes Sixt wrote:
> Would it be possible to update the timestamp of preimage every time it is
> used (e.g., in rerere.c:merge()), and check for that?

So, how about this?


 builtin/rerere.c  |    4 ++--
 rerere.c          |    8 +++++++-
 t/t4200-rerere.sh |    8 ++++++++
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 0048f9e..4d4faae 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -13,7 +13,7 @@ static const char git_rerere_usage[] =
 static int cutoff_noresolve = 15;
 static int cutoff_resolve = 60;
 
-static time_t rerere_created_at(const char *name)
+static time_t rerere_last_checked_at(const char *name)
 {
 	struct stat st;
 	return stat(rerere_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
@@ -53,7 +53,7 @@ static void garbage_collect(struct string_list *rr)
 	while ((e = readdir(dir))) {
 		if (is_dot_or_dotdot(e->d_name))
 			continue;
-		then = rerere_created_at(e->d_name);
+		then = rerere_last_checked_at(e->d_name);
 		if (!then)
 			continue;
 		cutoff = (has_rerere_resolution(e->d_name)
diff --git a/rerere.c b/rerere.c
index 2197890..1cc7c65 100644
--- a/rerere.c
+++ b/rerere.c
@@ -378,7 +378,13 @@ static int merge(const char *name, const char *path)
 	}
 	ret = ll_merge(&result, path, &base, NULL, &cur, "", &other, "", 0);
 	if (!ret) {
-		FILE *f = fopen(path, "w");
+		FILE *f;
+
+		if (utime(rerere_path(name, "preimage"), NULL) < 0)
+			warning("failed utime() on %s: %s",
+					rerere_path(name, "preimage"),
+					strerror(errno));
+		f = fopen(path, "w");
 		if (!f)
 			return error("Could not open %s: %s", path,
 				     strerror(errno));
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 70856d0..c01d930 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -144,6 +144,14 @@ test_expect_success 'rerere kicked in' "! grep ^=======$ a1"
 
 test_expect_success 'rerere prefers first change' 'test_cmp a1 expect'
 
+test_expect_success 'rerere updates preimage timestamp' '
+	git reset --hard &&
+	oldmtime=$(test-chmtime -v -42 $rr/preimage |cut -f 1) &&
+	test_must_fail git pull . first &&
+	newmtime=$(test-chmtime -v +0 $rr/preimage |cut -f 1) &&
+	test $oldmtime -lt $newmtime
+'
+
 rm $rr/postimage
 echo "$sha1	a1" | perl -pe 'y/\012/\000/' > .git/MERGE_RR
 
-- 
1.7.2.rc0.54.g4d821

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] rerere: fix overeager gc
  2010-07-01  9:36     ` [PATCH v2] " SZEDER Gábor
@ 2010-07-01 10:10       ` Johannes Sixt
  2010-07-01 11:07         ` [PATCH 1/2] mingw: utime() handles NULL times parameter SZEDER Gábor
  2010-07-01 11:07         ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
  0 siblings, 2 replies; 24+ messages in thread
From: Johannes Sixt @ 2010-07-01 10:10 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git

Am 7/1/2010 11:36, schrieb SZEDER Gábor:
> -static time_t rerere_created_at(const char *name)
> +static time_t rerere_last_checked_at(const char *name)

rerere_last_used_at?

> @@ -378,7 +378,13 @@ static int merge(const char *name, const char *path)
>  	}
>  	ret = ll_merge(&result, path, &base, NULL, &cur, "", &other, "", 0);
>  	if (!ret) {
> -		FILE *f = fopen(path, "w");
> +		FILE *f;
> +
> +		if (utime(rerere_path(name, "preimage"), NULL) < 0)
> +			warning("failed utime() on %s: %s",
> +					rerere_path(name, "preimage"),
> +					strerror(errno));
> +		f = fopen(path, "w");

I think this should be outside of 'if (!ret)' condition because even if
the merge fails, the resolution was *used*.

Mental note: update mingw_utime to accept NULL for the second parameter...

-- Hannes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] mingw: utime() handles NULL times parameter
  2010-07-01 10:10       ` Johannes Sixt
@ 2010-07-01 11:07         ` SZEDER Gábor
  2010-07-01 12:16           ` [PATCH 1/2 fixed] mingw_utime(): handle " Johannes Sixt
  2010-07-01 11:07         ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
  1 sibling, 1 reply; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-01 11:07 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, SZEDER Gábor

POSIX sayeth:

  "If times is a null pointer, the access and modification
   times of the file shall be set to the current time."

Let's do so.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---

On Thu, Jul 01, 2010 at 12:10:55PM +0200, Johannes Sixt wrote:
> Mental note: update mingw_utime to accept NULL for the second parameter...

Here it is, but I don't have mingw, so it's completely untested.

 compat/mingw.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 9a8e336..a54db74 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -304,8 +304,14 @@ int mingw_utime (const char *file_name, const struct utimbuf *times)
 		goto revert_attrs;
 	}
 
-	time_t_to_filetime(times->modtime, &mft);
-	time_t_to_filetime(times->actime, &aft);
+	if (times) {
+		time_t_to_filetime(times->modtime, &mft);
+		time_t_to_filetime(times->actime, &aft);
+	} else {
+		GetSystemTimeAsFileTime(&mft);
+		aft->dwLowDateTime = mft->dwLowDateTime;
+		aft->dwHighDateTime = mft->dwHighDateTime;
+	}
 	if (!SetFileTime((HANDLE)_get_osfhandle(fh), NULL, &aft, &mft)) {
 		errno = EINVAL;
 		rc = -1;
-- 
1.7.2.rc0.54.g4d821

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] rerere: fix overeager gc
  2010-07-01 10:10       ` Johannes Sixt
  2010-07-01 11:07         ` [PATCH 1/2] mingw: utime() handles NULL times parameter SZEDER Gábor
@ 2010-07-01 11:07         ` SZEDER Gábor
  1 sibling, 0 replies; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-01 11:07 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git, SZEDER Gábor

'rerere gc' prunes resolutions of conflicted merges that occurred long
time ago, and when doing so it takes the creation time of the
conflicted automerge results into account.  This can cause the loss of
frequently used merge resolutions (e.g. long-living topic branches are
merged into a regularly rebuilt integration branch (think of git's
pu)) when they become old enough to exceed 'rerere gc's threshold.

Prevent the loss of valuable merge resolutions by updating the
timestamp of the conflicted automerge result each time when
encountering the same merge conflict.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---

On Thu, Jul 01, 2010 at 12:10:55PM +0200, Johannes Sixt wrote:
> rerere_last_used_at?

> I think this should be outside of 'if (!ret)' condition because even if
> the merge fails, the resolution was *used*.

Right on both points.


 builtin/rerere.c  |    4 ++--
 rerere.c          |    4 ++++
 t/t4200-rerere.sh |    8 ++++++++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 980d542..03434a7 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -13,7 +13,7 @@ static const char git_rerere_usage[] =
 static int cutoff_noresolve = 15;
 static int cutoff_resolve = 60;
 
-static time_t rerere_created_at(const char *name)
+static time_t rerere_last_used_at(const char *name)
 {
 	struct stat st;
 	return stat(rerere_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
@@ -53,7 +53,7 @@ static void garbage_collect(struct string_list *rr)
 	while ((e = readdir(dir))) {
 		if (is_dot_or_dotdot(e->d_name))
 			continue;
-		then = rerere_created_at(e->d_name);
+		then = rerere_last_used_at(e->d_name);
 		if (!then)
 			continue;
 		cutoff = (has_rerere_resolution(e->d_name)
diff --git a/rerere.c b/rerere.c
index d03a696..0d8a167 100644
--- a/rerere.c
+++ b/rerere.c
@@ -389,6 +389,10 @@ static int merge(const char *name, const char *path)
 				     strerror(errno));
 	}
 
+	if (utime(rerere_path(name, "preimage"), NULL) < 0)
+		warning("failed utime() on %s: %s",
+				rerere_path(name, "preimage"), strerror(errno));
+
 out:
 	free(cur.ptr);
 	free(base.ptr);
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 70856d0..c01d930 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -144,6 +144,14 @@ test_expect_success 'rerere kicked in' "! grep ^=======$ a1"
 
 test_expect_success 'rerere prefers first change' 'test_cmp a1 expect'
 
+test_expect_success 'rerere updates preimage timestamp' '
+	git reset --hard &&
+	oldmtime=$(test-chmtime -v -42 $rr/preimage |cut -f 1) &&
+	test_must_fail git pull . first &&
+	newmtime=$(test-chmtime -v +0 $rr/preimage |cut -f 1) &&
+	test $oldmtime -lt $newmtime
+'
+
 rm $rr/postimage
 echo "$sha1	a1" | perl -pe 'y/\012/\000/' > .git/MERGE_RR
 
-- 
1.7.2.rc0.54.g4d821

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 1/2 fixed] mingw_utime(): handle NULL times parameter
  2010-07-01 11:07         ` [PATCH 1/2] mingw: utime() handles NULL times parameter SZEDER Gábor
@ 2010-07-01 12:16           ` Johannes Sixt
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Sixt @ 2010-07-01 12:16 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git

From: SZEDER Gábor <szeder@ira.uka.de>

POSIX sayeth:

  "If times is a null pointer, the access and modification
   times of the file shall be set to the current time."

Let's do so.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 >> Mental note: update mingw_utime to accept NULL for the second
 >> parameter...
 > 
 > Here it is, but I don't have mingw, so it's completely untested.

 Thanks. Here is a version that compiles; the interdiff is

 @@ -309,8 +309,7 @@
                 time_t_to_filetime(times->actime, &aft);
         } else {
                 GetSystemTimeAsFileTime(&mft);
 -               aft->dwLowDateTime = mft->dwLowDateTime;
 -               aft->dwHighDateTime = mft->dwHighDateTime;
 +               aft = mft;
         }
         if (!SetFileTime((HANDLE)_get_osfhandle(fh), NULL, &aft, &mft)) {
                 errno = EINVAL;

 With this, the series passes the test suite on MinGW.

 -- Hannes

 compat/mingw.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 0722a6d..b6f0a7f 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -304,8 +304,13 @@ int mingw_utime (const char *file_name, const struct utimbuf *times)
 		goto revert_attrs;
 	}
 
-	time_t_to_filetime(times->modtime, &mft);
-	time_t_to_filetime(times->actime, &aft);
+	if (times) {
+		time_t_to_filetime(times->modtime, &mft);
+		time_t_to_filetime(times->actime, &aft);
+	} else {
+		GetSystemTimeAsFileTime(&mft);
+		aft = mft;
+	}
 	if (!SetFileTime((HANDLE)_get_osfhandle(fh), NULL, &aft, &mft)) {
 		errno = EINVAL;
 		rc = -1;
-- 
1.7.2.rc1.1057.g1270

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-06-30  6:12   ` Johannes Sixt
                       ` (2 preceding siblings ...)
  2010-07-01  9:36     ` [PATCH v2] " SZEDER Gábor
@ 2010-07-01 16:27     ` Junio C Hamano
  2010-07-02  5:49       ` Johannes Sixt
  3 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2010-07-01 16:27 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: SZEDER Gábor, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Would it be possible to update the timestamp of preimage every time it is
> used (e.g., in rerere.c:merge()), and check for that?

It would be _possible_, but we are _not_ modifying the file at that point,
so somehow that solution feels very wrong, no?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-07-01 16:27     ` [RFC PATCH] " Junio C Hamano
@ 2010-07-02  5:49       ` Johannes Sixt
  2010-07-02 17:25         ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Johannes Sixt @ 2010-07-02  5:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git

Am 7/1/2010 18:27, schrieb Junio C Hamano:
> Johannes Sixt <j.sixt@viscovery.net> writes:
> 
>> Would it be possible to update the timestamp of preimage every time it is
>> used (e.g., in rerere.c:merge()), and check for that?
> 
> It would be _possible_, but we are _not_ modifying the file at that point,
> so somehow that solution feels very wrong, no?

rr-cache is basically a static database. The fact that we have a file
named 'thisimage' is just an abuse - to put a temporary file somewhere.
Therefore, depending on lockfile infrastructure to change timestamps for
us while manipulating 'thisimage' should feel no less wrong, don't you
think so?

-- Hannes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-07-02  5:49       ` Johannes Sixt
@ 2010-07-02 17:25         ` Junio C Hamano
  2010-07-05  6:02           ` Johannes Sixt
  2010-07-08  6:27           ` [RFC PATCH] " Johannes Sixt
  0 siblings, 2 replies; 24+ messages in thread
From: Junio C Hamano @ 2010-07-02 17:25 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: SZEDER Gábor, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Am 7/1/2010 18:27, schrieb Junio C Hamano:
>> Johannes Sixt <j.sixt@viscovery.net> writes:
>> 
>>> Would it be possible to update the timestamp of preimage every time it is
>>> used (e.g., in rerere.c:merge()), and check for that?
>> 
>> It would be _possible_, but we are _not_ modifying the file at that point,
>> so somehow that solution feels very wrong, no?
>
> rr-cache is basically a static database. The fact that we have a file
> named 'thisimage' is just an abuse - to put a temporary file somewhere.
> Therefore, depending on lockfile infrastructure to change timestamps for
> us while manipulating 'thisimage' should feel no less wrong, don't you
> think so?

Not really.  

"This" in "thisimage" stands for "image involved in conflict we are
looking at _this moment_", and using the per-conflict-id directory to
house that file was a conscious design decision.  Because you are trying
to record a new data item (namely "the last time we saw this conflict")
somewhere in the rr-cache database, using the timestamp of that directory
to tell when was the last time that _this moment_ was makes sense, at
least to me.

If anything, "preimage" that has newer timestamp than "postimage" feels
wrong, as you would record pre, think for a while how the corresponding
post should look like, and then record post.

If we for whatever reason trust placing an extra timestamp on a regular
file more than using directory timestamp (which I think may be a valid
concern from portability point of view), I'd rather see "preimage"
timestamp to keep track of the time when we _first_ encountered the
particular conflict, and "postimage" used for recording the time when we
saw the conflict most recently.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-07-02 17:25         ` Junio C Hamano
@ 2010-07-05  6:02           ` Johannes Sixt
  2010-07-08 14:35             ` [PATCH v4] " SZEDER Gábor
  2010-07-09  0:06             ` [RFC PATCH] " Junio C Hamano
  2010-07-08  6:27           ` [RFC PATCH] " Johannes Sixt
  1 sibling, 2 replies; 24+ messages in thread
From: Johannes Sixt @ 2010-07-05  6:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git

Am 7/2/2010 19:25, schrieb Junio C Hamano:
> If anything, "preimage" that has newer timestamp than "postimage" feels
> wrong,...

Indeed; I agree.

> If we for whatever reason trust placing an extra timestamp on a regular
> file more than using directory timestamp (which I think may be a valid
> concern from portability point of view),

Windows behaves well in this regard. Writing of thisimage must be
converted to lockfile infrastructure, of course.

> I'd rather see "preimage"
> timestamp to keep track of the time when we _first_ encountered the
> particular conflict, and "postimage" used for recording the time when we
> saw the conflict most recently.

That would be fine, too.

-- Hannes

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-07-02 17:25         ` Junio C Hamano
  2010-07-05  6:02           ` Johannes Sixt
@ 2010-07-08  6:27           ` Johannes Sixt
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Sixt @ 2010-07-08  6:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: SZEDER Gábor, git

Instead of this series, I'll use this for a while; i.e., check the
timestamp of thisimage, and if it does not exist, check the directory's
timestamp. It doesn't pass t4200.20 because the test doesn't set the
timestamps of thisimage and the directory into the past.

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 980d542..49fcbf7 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -13,10 +13,14 @@ static const char git_rerere_usage[] =
 static int cutoff_noresolve = 15;
 static int cutoff_resolve = 60;
 
-static time_t rerere_created_at(const char *name)
+static time_t rerere_last_used_at(const char *name)
 {
 	struct stat st;
-	return stat(rerere_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
+	if (!stat(rerere_path(name, "thisimage"), &st))
+		return st.st_mtime;
+	if (errno == ENOENT && !stat(rerere_path(name, "."), &st))
+		return st.st_mtime;
+	return (time_t) 0;
 }
 
 static void unlink_rr_item(const char *name)
@@ -53,7 +57,7 @@ static void garbage_collect(struct string_list *rr)
 	while ((e = readdir(dir))) {
 		if (is_dot_or_dotdot(e->d_name))
 			continue;
-		then = rerere_created_at(e->d_name);
+		then = rerere_last_used_at(e->d_name);
 		if (!then)
 			continue;
 		cutoff = (has_rerere_resolution(e->d_name)

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v4] rerere: fix overeager gc
  2010-07-05  6:02           ` Johannes Sixt
@ 2010-07-08 14:35             ` SZEDER Gábor
  2010-07-09  0:06             ` [RFC PATCH] " Junio C Hamano
  1 sibling, 0 replies; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-08 14:35 UTC (permalink / raw)
  To: Johannes Sixt, Junio C Hamano; +Cc: git, SZEDER Gábor

'rerere gc' prunes resolutions of conflicted merges that occurred long
time ago, and when doing so it takes the creation time of the
conflicted automerge results into account.  This can cause the loss of
frequently used merge resolutions (e.g. long-living topic branches are
merged into a regularly rebuilt integration branch (think of git's
pu)) when they become old enough to exceed 'rerere gc's threshold.

To prevent the loss of valuable merge resolutions 'rerere' will (1)
write 'thisimage' into a temporary file and then rename it into its
final destination, thereby updating the directory timestamp each time
when encountering the same merge conflict, and (2) take the directory
timestamp, i.e. the time of the last usage into account when gc'ing.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---

On Mon, Jul 05, 2010 at 08:02:34AM +0200, Johannes Sixt wrote:
> Am 7/2/2010 19:25, schrieb Junio C Hamano:
> > If we for whatever reason trust placing an extra timestamp on a regular
> > file more than using directory timestamp (which I think may be a valid
> > concern from portability point of view),
> 
> Windows behaves well in this regard. Writing of thisimage must be
> converted to lockfile infrastructure, of course.

So, here is the next take with the tmpfile-rename stuff to update the
directory timestamp.  If there are any portability issues wrt directory
timestamp updating, the new test should catch them early.

 builtin/rerere.c  |    6 +++---
 rerere.c          |   19 +++++++++++++++++--
 t/t4200-rerere.sh |   16 ++++++++++++----
 3 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 980d542..bede3eb 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -13,10 +13,10 @@ static const char git_rerere_usage[] =
 static int cutoff_noresolve = 15;
 static int cutoff_resolve = 60;
 
-static time_t rerere_created_at(const char *name)
+static time_t rerere_last_used_at(const char *name)
 {
 	struct stat st;
-	return stat(rerere_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
+	return stat(rerere_path(name, "."), &st) ? (time_t) 0 : st.st_mtime;
 }
 
 static void unlink_rr_item(const char *name)
@@ -53,7 +53,7 @@ static void garbage_collect(struct string_list *rr)
 	while ((e = readdir(dir))) {
 		if (is_dot_or_dotdot(e->d_name))
 			continue;
-		then = rerere_created_at(e->d_name);
+		then = rerere_last_used_at(e->d_name);
 		if (!then)
 			continue;
 		cutoff = (has_rerere_resolution(e->d_name)
diff --git a/rerere.c b/rerere.c
index d03a696..e415f54 100644
--- a/rerere.c
+++ b/rerere.c
@@ -363,13 +363,28 @@ static int find_conflict(struct string_list *conflict)
 
 static int merge(const char *name, const char *path)
 {
-	int ret;
+	int fd, ret;
 	mmfile_t cur = {NULL, 0}, base = {NULL, 0}, other = {NULL, 0};
 	mmbuffer_t result = {NULL, 0};
+	char *tmpfile;
+
+	tmpfile = xstrdup(rerere_path(name, "tmp_thisimage_XXXXXX"));
+	fd = git_mkstemp_mode(tmpfile, 0600);
+	if (fd < 0) {
+		error("unable to create temporary file %s for rerere: %s\n",
+				tmpfile, strerror(errno));
+		return 1;
+	}
+	close(fd);
 
-	if (handle_file(path, NULL, rerere_path(name, "thisimage")) < 0)
+	if (handle_file(path, NULL, tmpfile) < 0)
 		return 1;
 
+	if (move_temp_to_file(tmpfile, rerere_path(name, "thisimage")) < 0)
+		return 1;
+
+	free(tmpfile);
+
 	if (read_mmfile(&cur, rerere_path(name, "thisimage")) ||
 			read_mmfile(&base, rerere_path(name, "preimage")) ||
 			read_mmfile(&other, rerere_path(name, "postimage"))) {
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 70856d0..a425329 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -144,6 +144,14 @@ test_expect_success 'rerere kicked in' "! grep ^=======$ a1"
 
 test_expect_success 'rerere prefers first change' 'test_cmp a1 expect'
 
+test_expect_success 'rerere updates directory timestamp' '
+	git reset --hard &&
+	oldmtime=$(test-chmtime -v -42 $rr |cut -f 1) &&
+	test_must_fail git pull . first &&
+	newmtime=$(test-chmtime -v +0 $rr |cut -f 1) &&
+	test $oldmtime -lt $newmtime
+'
+
 rm $rr/postimage
 echo "$sha1	a1" | perl -pe 'y/\012/\000/' > .git/MERGE_RR
 
@@ -165,16 +173,16 @@ just_over_15_days_ago=$((-1-15*86400))
 almost_60_days_ago=$((60-60*86400))
 just_over_60_days_ago=$((-1-60*86400))
 
-test-chmtime =$almost_60_days_ago $rr/preimage
-test-chmtime =$almost_15_days_ago $rr2/preimage
+test-chmtime =$almost_60_days_ago $rr
+test-chmtime =$almost_15_days_ago $rr2
 
 test_expect_success 'garbage collection (part1)' 'git rerere gc'
 
 test_expect_success 'young records still live' \
 	"test -f $rr/preimage && test -f $rr2/preimage"
 
-test-chmtime =$just_over_60_days_ago $rr/preimage
-test-chmtime =$just_over_15_days_ago $rr2/preimage
+test-chmtime =$just_over_60_days_ago $rr
+test-chmtime =$just_over_15_days_ago $rr2
 
 test_expect_success 'garbage collection (part2)' 'git rerere gc'
 
-- 
1.7.2.rc2.42.gfe876

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] rerere: fix overeager gc
  2010-07-05  6:02           ` Johannes Sixt
  2010-07-08 14:35             ` [PATCH v4] " SZEDER Gábor
@ 2010-07-09  0:06             ` Junio C Hamano
  2010-07-12 23:42               ` [PATCH 1/2] mingw_utime(): handle NULL times parameter SZEDER Gábor
  2010-07-12 23:42               ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
  1 sibling, 2 replies; 24+ messages in thread
From: Junio C Hamano @ 2010-07-09  0:06 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: SZEDER Gábor, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> Am 7/2/2010 19:25, schrieb Junio C Hamano:
>
>> I'd rather see "preimage"
>> timestamp to keep track of the time when we _first_ encountered the
>> particular conflict, and "postimage" used for recording the time when we
>> saw the conflict most recently.
>
> That would be fine, too.

Ok, then let's make it so ;-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] mingw_utime(): handle NULL times parameter
  2010-07-09  0:06             ` [RFC PATCH] " Junio C Hamano
@ 2010-07-12 23:42               ` SZEDER Gábor
  2010-07-12 23:42               ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
  1 sibling, 0 replies; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-12 23:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git, SZEDER Gábor, Johannes Sixt

POSIX sayeth:

  "If times is a null pointer, the access and modification
   times of the file shall be set to the current time."

Let's do so.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 compat/mingw.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 9a8e336..24333cb 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -304,8 +304,13 @@ int mingw_utime (const char *file_name, const struct utimbuf *times)
 		goto revert_attrs;
 	}
 
-	time_t_to_filetime(times->modtime, &mft);
-	time_t_to_filetime(times->actime, &aft);
+	if (times) {
+		time_t_to_filetime(times->modtime, &mft);
+		time_t_to_filetime(times->actime, &aft);
+	} else {
+		GetSystemTimeAsFileTime(&mft);
+		aft = mft;
+	}
 	if (!SetFileTime((HANDLE)_get_osfhandle(fh), NULL, &aft, &mft)) {
 		errno = EINVAL;
 		rc = -1;
-- 
1.7.2.rc2.37.g5e8ef

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] rerere: fix overeager gc
  2010-07-09  0:06             ` [RFC PATCH] " Junio C Hamano
  2010-07-12 23:42               ` [PATCH 1/2] mingw_utime(): handle NULL times parameter SZEDER Gábor
@ 2010-07-12 23:42               ` SZEDER Gábor
  2010-07-13  0:40                 ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-12 23:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git, SZEDER Gábor

'rerere gc' prunes resolutions of conflicted merges that occurred long
time ago, and when doing so it takes the creation time of the
conflicted automerge results into account.  This can cause the loss of
frequently used conflict resolutions (e.g. long-living topic branches
are merged into a regularly rebuilt integration branch (think of git's
pu)) when they become old enough to exceed 'rerere gc's threshold.

To prevent the loss of valuable merge resolutions 'rerere' will (1)
update the timestamp of the recorded conflict resolution (i.e.
'postimage') each time when encountering and resolving the same merge
conflict, and (2) take this timestamp, i.e. the time of the last usage
into account when gc'ing.

Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
---

On Thu, Jul 08, 2010 at 05:06:49PM -0700, Junio C Hamano wrote:
> Johannes Sixt <j.sixt@viscovery.net> writes:
> 
> > Am 7/2/2010 19:25, schrieb Junio C Hamano:
> >
> >> I'd rather see "preimage"
> >> timestamp to keep track of the time when we _first_ encountered the
> >> particular conflict, and "postimage" used for recording the time when we
> >> saw the conflict most recently.
> >
> > That would be fine, too.
> 
> Ok, then let's make it so ;-)

Here it is.  Now you have it all, take your pick (;

 builtin/rerere.c  |   15 +++++++++++++--
 rerere.c          |    8 +++++++-
 t/t4200-rerere.sh |   14 +++++++++++---
 3 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 980d542..52e4b64 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -19,6 +19,12 @@ static time_t rerere_created_at(const char *name)
 	return stat(rerere_path(name, "preimage"), &st) ? (time_t) 0 : st.st_mtime;
 }
 
+static time_t rerere_last_used_at(const char *name)
+{
+	struct stat st;
+	return stat(rerere_path(name, "postimage"), &st) ? (time_t) 0 : st.st_mtime;
+}
+
 static void unlink_rr_item(const char *name)
 {
 	unlink(rerere_path(name, "thisimage"));
@@ -56,8 +62,13 @@ static void garbage_collect(struct string_list *rr)
 		then = rerere_created_at(e->d_name);
 		if (!then)
 			continue;
-		cutoff = (has_rerere_resolution(e->d_name)
-			  ? cutoff_resolve : cutoff_noresolve);
+		if (has_rerere_resolution(e->d_name)) {
+			then = rerere_last_used_at(e->d_name);
+			if (!then)
+				continue;
+			cutoff = cutoff_resolve;
+		} else
+			cutoff = cutoff_noresolve;
 		if (then < now - cutoff * 86400)
 			string_list_append(&to_remove, e->d_name);
 	}
diff --git a/rerere.c b/rerere.c
index d03a696..1d95161 100644
--- a/rerere.c
+++ b/rerere.c
@@ -378,7 +378,13 @@ static int merge(const char *name, const char *path)
 	}
 	ret = ll_merge(&result, path, &base, NULL, &cur, "", &other, "", 0);
 	if (!ret) {
-		FILE *f = fopen(path, "w");
+		FILE *f;
+
+		if (utime(rerere_path(name, "postimage"), NULL) < 0)
+			warning("failed utime() on %s: %s",
+					rerere_path(name, "postimage"),
+					strerror(errno));
+		f = fopen(path, "w");
 		if (!f)
 			return error("Could not open %s: %s", path,
 				     strerror(errno));
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 70856d0..093b138 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -132,6 +132,8 @@ test_expect_success 'commit succeeds' \
 
 test_expect_success 'recorded postimage' "test -f $rr/postimage"
 
+oldmtimepost=$(test-chmtime -v -60 $rr/postimage |cut -f 1)
+
 test_expect_success 'another conflicting merge' '
 	git checkout -b third master &&
 	git show second^:a1 | sed "s/To die: t/To die! T/" > a1 &&
@@ -144,6 +146,11 @@ test_expect_success 'rerere kicked in' "! grep ^=======$ a1"
 
 test_expect_success 'rerere prefers first change' 'test_cmp a1 expect'
 
+test_expect_success 'rerere updates postimage timestamp' '
+	newmtimepost=$(test-chmtime -v +0 $rr/postimage |cut -f 1) &&
+	test $oldmtimepost -lt $newmtimepost
+'
+
 rm $rr/postimage
 echo "$sha1	a1" | perl -pe 'y/\012/\000/' > .git/MERGE_RR
 
@@ -165,15 +172,16 @@ just_over_15_days_ago=$((-1-15*86400))
 almost_60_days_ago=$((60-60*86400))
 just_over_60_days_ago=$((-1-60*86400))
 
-test-chmtime =$almost_60_days_ago $rr/preimage
+test-chmtime =$just_over_60_days_ago $rr/preimage
+test-chmtime =$almost_60_days_ago $rr/postimage
 test-chmtime =$almost_15_days_ago $rr2/preimage
 
 test_expect_success 'garbage collection (part1)' 'git rerere gc'
 
-test_expect_success 'young records still live' \
+test_expect_success 'young or recently used records still live' \
 	"test -f $rr/preimage && test -f $rr2/preimage"
 
-test-chmtime =$just_over_60_days_ago $rr/preimage
+test-chmtime =$just_over_60_days_ago $rr/postimage
 test-chmtime =$just_over_15_days_ago $rr2/preimage
 
 test_expect_success 'garbage collection (part2)' 'git rerere gc'
-- 
1.7.2.rc2.37.g5e8ef

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] rerere: fix overeager gc
  2010-07-12 23:42               ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
@ 2010-07-13  0:40                 ` Junio C Hamano
  2010-07-14 12:19                   ` SZEDER Gábor
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2010-07-13  0:40 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Johannes Sixt, git

SZEDER Gábor <szeder@ira.uka.de> writes:

> 'rerere gc' prunes resolutions of conflicted merges that occurred long
> time ago, and when doing so it takes the creation time of the
> conflicted automerge results into account.  This can cause the loss of
> frequently used conflict resolutions (e.g. long-living topic branches
> are merged into a regularly rebuilt integration branch (think of git's
> pu)) when they become old enough to exceed 'rerere gc's threshold.
>
> To prevent the loss of valuable merge resolutions 'rerere' will (1)
> update the timestamp of the recorded conflict resolution (i.e.
> 'postimage') each time when encountering and resolving the same merge
> conflict, and (2) take this timestamp, i.e. the time of the last usage
> into account when gc'ing.

Thanks.

> +static time_t rerere_last_used_at(const char *name)
> +{
> +	struct stat st;
> +	return stat(rerere_path(name, "postimage"), &st) ? (time_t) 0 : st.st_mtime;
> +}

Doesn't has_rerere_resolution() already do a stat on this path?  There are
only two allers of the function so it would probably make sense to pass a
pointer to struct stat from the caller to avoid one extra call to stat.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] rerere: fix overeager gc
  2010-07-13  0:40                 ` Junio C Hamano
@ 2010-07-14 12:19                   ` SZEDER Gábor
  2010-07-14 16:23                     ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-14 12:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

On Mon, Jul 12, 2010 at 05:40:11PM -0700, Junio C Hamano wrote:
> SZEDER Gábor <szeder@ira.uka.de> writes:
> > +static time_t rerere_last_used_at(const char *name)
> > +{
> > +	struct stat st;
> > +	return stat(rerere_path(name, "postimage"), &st) ? (time_t) 0 : st.st_mtime;
> > +}
> 
> Doesn't has_rerere_resolution() already do a stat on this path?  There are
> only two allers of the function so it would probably make sense to pass a
> pointer to struct stat from the caller to avoid one extra call to stat.

rerere_last_used_at() returns 0 when the stat() on 'postimage' fails,
exactly like has_rerere_resolution().  Consequently, we can use
rerere_last_used_at() to determine whether a resolution exists, too.
And if we do that, we dont have to "pollute" callers of
has_rerere_resolution() with superfluous struct stat variables.

So how about this squashed in instead?

-- >8 --

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 52e4b64..1dc424b 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -50,7 +50,7 @@ static void garbage_collect(struct string_list *rr)
 	DIR *dir;
 	struct dirent *e;
 	int i, cutoff;
-	time_t now = time(NULL), then;
+	time_t now = time(NULL), then, then_post;
 
 	git_config(git_rerere_gc_config, NULL);
 	dir = opendir(git_path("rr-cache"));
@@ -62,10 +62,9 @@ static void garbage_collect(struct string_list *rr)
 		then = rerere_created_at(e->d_name);
 		if (!then)
 			continue;
-		if (has_rerere_resolution(e->d_name)) {
-			then = rerere_last_used_at(e->d_name);
-			if (!then)
-				continue;
+		then_post = rerere_last_used_at(e->d_name);
+		if (then_post) {
+			then = then_post;
 			cutoff = cutoff_resolve;
 		} else
 			cutoff = cutoff_noresolve;

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] rerere: fix overeager gc
  2010-07-14 12:19                   ` SZEDER Gábor
@ 2010-07-14 16:23                     ` Junio C Hamano
  2010-07-14 18:33                       ` SZEDER Gábor
  0 siblings, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2010-07-14 16:23 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Johannes Sixt, git

SZEDER Gábor <szeder@ira.uka.de> writes:

> On Mon, Jul 12, 2010 at 05:40:11PM -0700, Junio C Hamano wrote:
>> SZEDER Gábor <szeder@ira.uka.de> writes:
>> > +static time_t rerere_last_used_at(const char *name)
>> > +{
>> > +	struct stat st;
>> > +	return stat(rerere_path(name, "postimage"), &st) ? (time_t) 0 : st.st_mtime;
>> > +}
>> 
>> Doesn't has_rerere_resolution() already do a stat on this path?  There are
>> only two allers of the function so it would probably make sense to pass a
>> pointer to struct stat from the caller to avoid one extra call to stat.
>
> rerere_last_used_at() returns 0 when the stat() on 'postimage' fails,
> exactly like has_rerere_resolution().  Consequently, we can use
> rerere_last_used_at() to determine whether a resolution exists, too.

"When was this last used?" is answered with "infinitely long time ago" for
an item that does not exist.  Between the ones that have never been
resolved (i.e. no "postimage") and the ones that have not been used for a
long time, there is no material difference in the expiration, as long as
"long time" exceeds rerere-unresolved.  Ok; I can see the logic in that.

I wonder if swapping the order of things to check may make it easier to
read, though.  If it was used, we want to compare gc.rerereresolved with
that timestamp, and otherwise we want to compare gc.rerereunresolved with
the timestamp of the creation.  I.e. something like this...

 builtin/rerere.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 7e45afe..6d1b580 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -59,16 +59,16 @@ static void garbage_collect(struct string_list *rr)
 	while ((e = readdir(dir))) {
 		if (is_dot_or_dotdot(e->d_name))
 			continue;
-		then = rerere_created_at(e->d_name);
-		if (!then)
-			continue;
-		if (has_rerere_resolution(e->d_name)) {
-			then = rerere_last_used_at(e->d_name);
+
+		then = rerere_last_used_at(e->d_name);
+		if (then) {
+			cutoff = cutoff_resolve;
+		} else {
+			then = rerere_created_at(e->d_name);
 			if (!then)
 				continue;
-			cutoff = cutoff_resolve;
-		} else
 			cutoff = cutoff_noresolve;
+		}
 		if (then < now - cutoff * 86400)
 			string_list_append(e->d_name, &to_remove);
 	}

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] rerere: fix overeager gc
  2010-07-14 16:23                     ` Junio C Hamano
@ 2010-07-14 18:33                       ` SZEDER Gábor
  0 siblings, 0 replies; 24+ messages in thread
From: SZEDER Gábor @ 2010-07-14 18:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

On Wed, Jul 14, 2010 at 09:23:17AM -0700, Junio C Hamano wrote:
> I wonder if swapping the order of things to check may make it easier to
> read, though.  If it was used, we want to compare gc.rerereresolved with
> that timestamp, and otherwise we want to compare gc.rerereunresolved with
> the timestamp of the creation.  I.e. something like this...

That's even better.


>  builtin/rerere.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/builtin/rerere.c b/builtin/rerere.c
> index 7e45afe..6d1b580 100644
> --- a/builtin/rerere.c
> +++ b/builtin/rerere.c
> @@ -59,16 +59,16 @@ static void garbage_collect(struct string_list *rr)
>  	while ((e = readdir(dir))) {
>  		if (is_dot_or_dotdot(e->d_name))
>  			continue;
> -		then = rerere_created_at(e->d_name);
> -		if (!then)
> -			continue;
> -		if (has_rerere_resolution(e->d_name)) {
> -			then = rerere_last_used_at(e->d_name);
> +
> +		then = rerere_last_used_at(e->d_name);
> +		if (then) {
> +			cutoff = cutoff_resolve;
> +		} else {
> +			then = rerere_created_at(e->d_name);
>  			if (!then)
>  				continue;
> -			cutoff = cutoff_resolve;
> -		} else
>  			cutoff = cutoff_noresolve;
> +		}
>  		if (then < now - cutoff * 86400)
>  			string_list_append(e->d_name, &to_remove);
>  	}
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2010-07-14 18:36 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-29 11:38 [RFC PATCH] rerere: fix overeager gc SZEDER Gábor
2010-06-29 17:59 ` Junio C Hamano
2010-06-30  6:12   ` Johannes Sixt
2010-06-30  8:01     ` SZEDER Gábor
2010-06-30 15:22     ` Junio C Hamano
2010-06-30 15:40       ` Johannes Sixt
2010-07-01  9:36     ` [PATCH v2] " SZEDER Gábor
2010-07-01 10:10       ` Johannes Sixt
2010-07-01 11:07         ` [PATCH 1/2] mingw: utime() handles NULL times parameter SZEDER Gábor
2010-07-01 12:16           ` [PATCH 1/2 fixed] mingw_utime(): handle " Johannes Sixt
2010-07-01 11:07         ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
2010-07-01 16:27     ` [RFC PATCH] " Junio C Hamano
2010-07-02  5:49       ` Johannes Sixt
2010-07-02 17:25         ` Junio C Hamano
2010-07-05  6:02           ` Johannes Sixt
2010-07-08 14:35             ` [PATCH v4] " SZEDER Gábor
2010-07-09  0:06             ` [RFC PATCH] " Junio C Hamano
2010-07-12 23:42               ` [PATCH 1/2] mingw_utime(): handle NULL times parameter SZEDER Gábor
2010-07-12 23:42               ` [PATCH 2/2] rerere: fix overeager gc SZEDER Gábor
2010-07-13  0:40                 ` Junio C Hamano
2010-07-14 12:19                   ` SZEDER Gábor
2010-07-14 16:23                     ` Junio C Hamano
2010-07-14 18:33                       ` SZEDER Gábor
2010-07-08  6:27           ` [RFC PATCH] " Johannes Sixt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).