git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>, git@vger.kernel.org
Cc: Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH 16/16] write_sha1_file: freshen existing objects
Date: Sun, 05 Oct 2014 11:12:05 +0200	[thread overview]
Message-ID: <54310B65.2010301@web.de> (raw)
In-Reply-To: <20141003204110.GP16293@peff.net>

Am 03.10.2014 um 22:41 schrieb Jeff King:
> When we try to write a loose object file, we first check
> whether that object already exists. If so, we skip the
> write as an optimization. However, this can interfere with
> prune's strategy of using mtimes to mark files in progress.
>
> For example, if a branch contains a particular tree object
> and is deleted, that tree object may become unreachable, and
> have an old mtime. If a new operation then tries to write
> the same tree, this ends up as a noop; we notice we
> already have the object and do nothing. A prune running
> simultaneously with this operation will see the object as
> old, and may delete it.
>
> We can solve this by "freshening" objects that we avoid
> writing by updating their mtime. The algorithm for doing so
> is essentially the same as that of has_sha1_file. Therefore
> we provide a new (static) interface "check_and_freshen",
> which finds and optionally freshens the object. It's trivial
> to implement freshening and simple checking by tweaking a
> single parameter.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   sha1_file.c                | 51 +++++++++++++++++++++++++++++++++++++++-------
>   t/t6501-freshen-objects.sh | 27 ++++++++++++++++++++++++
>   2 files changed, 71 insertions(+), 7 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index d017289..d263b38 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -442,27 +442,53 @@ void prepare_alt_odb(void)
>   	read_info_alternates(get_object_directory(), 0);
>   }
>
> -static int has_loose_object_local(const unsigned char *sha1)
> +static int freshen_file(const char *fn)
>   {
> -	return !access(sha1_file_name(sha1), F_OK);
> +	struct utimbuf t;
> +	t.actime = t.modtime = time(NULL);
> +	return utime(fn, &t);
>   }

Mental note: freshen_file() returns 0 on success.

>
> -int has_loose_object_nonlocal(const unsigned char *sha1)
> +static int check_and_freshen_file(const char *fn, int freshen)
> +{
> +	if (access(fn, F_OK))
> +		return 0;
> +	if (freshen && freshen_file(fn))
> +		return 0;
> +	return 1;
> +}

Returns 1 on success.

> +
> +static int check_and_freshen_local(const unsigned char *sha1, int freshen)
> +{
> +	return check_and_freshen_file(sha1_file_name(sha1), freshen);
> +}

Returns 1 on success.

> +
> +static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
>   {
>   	struct alternate_object_database *alt;
>   	prepare_alt_odb();
>   	for (alt = alt_odb_list; alt; alt = alt->next) {
>   		fill_sha1_path(alt->name, sha1);
> -		if (!access(alt->base, F_OK))
> +		if (check_and_freshen_file(alt->base, freshen))
>   			return 1;
>   	}
>   	return 0;
>   }

Returns 1 on success.

>
> +static int check_and_freshen(const unsigned char *sha1, int freshen)
> +{
> +	return check_and_freshen_local(sha1, freshen) ||
> +	       check_and_freshen_nonlocal(sha1, freshen);
> +}

Returns 1 on success.

> +
> +int has_loose_object_nonlocal(const unsigned char *sha1)
> +{
> +	return check_and_freshen_nonlocal(sha1, 0);
> +}
> +
>   static int has_loose_object(const unsigned char *sha1)
>   {
> -	return has_loose_object_local(sha1) ||
> -	       has_loose_object_nonlocal(sha1);
> +	return check_and_freshen(sha1, 0);
>   }
>
>   static unsigned int pack_used_ctr;
> @@ -2949,6 +2975,17 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
>   	return move_temp_to_file(tmp_file, filename);
>   }
>
> +static int freshen_loose_object(const unsigned char *sha1)
> +{
> +	return check_and_freshen(sha1, 1);
> +}

Returns 1 on success.

> +
> +static int freshen_packed_object(const unsigned char *sha1)
> +{
> +	struct pack_entry e;
> +	return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
> +}

Returns 1 if a pack entry is found and freshen_file() fails, and 0 if no 
entry is found or freshen_file() succeeds.

It should be "&& !freshen(...)" instead, no?

Or better, let freshen_file() return 1 on success as the other functions 
here.

> +
>   int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1)
>   {
>   	unsigned char sha1[20];
> @@ -2961,7 +2998,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign
>   	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
>   	if (returnsha1)
>   		hashcpy(returnsha1, sha1);
> -	if (has_sha1_file(sha1))
> +	if (freshen_loose_object(sha1) || freshen_packed_object(sha1))
>   		return 0;
>   	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
>   }
> diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
> index e25c47d..157f3f9 100755
> --- a/t/t6501-freshen-objects.sh
> +++ b/t/t6501-freshen-objects.sh
> @@ -100,6 +100,33 @@ for repack in '' true; do
>   	test_expect_success "repository passes fsck ($title)" '
>   		git fsck
>   	'
> +
> +	test_expect_success "abandon objects again ($title)" '
> +		git reset --hard HEAD^ &&
> +		find .git/objects -type f |
> +		xargs test-chmtime -v -86400
> +	'
> +
> +	test_expect_success "start writing new commit with same tree ($title)" '
> +		tree=$(
> +			GIT_INDEX_FILE=index.tmp &&
> +			export GIT_INDEX_FILE &&
> +			git read-tree HEAD &&
> +			add abandon &&
> +			add unrelated &&
> +			git write-tree
> +		)
> +	'
> +
> +	test_expect_success "simultaneous gc ($title)" '
> +		git gc --prune=12.hours.ago
> +	'
> +
> +	# tree should have been refreshed by write-tree
> +	test_expect_success "finish writing out commit ($title)" '
> +		commit=$(echo foo | git commit-tree -p HEAD $tree) &&
> +		git update-ref HEAD $commit
> +	'
>   done
>
>   test_done
>

  parent reply	other threads:[~2014-10-05  9:13 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
2014-10-03 22:55   ` René Scharfe
2014-10-04  0:31     ` Jeff King
2014-10-03 20:22 ` [PATCH 02/16] isxdigit: cast input to unsigned char Jeff King
2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
2014-10-07 11:25   ` Michael Haggerty
2014-10-08  7:36     ` Jeff King
2014-10-08  8:40       ` Michael Haggerty
2014-10-08  8:55         ` Jeff King
2014-10-03 20:22 ` [PATCH 04/16] object_array: add a "clear" function Jeff King
2014-10-03 20:23 ` [PATCH 05/16] clean up name allocation in prepare_revision_walk Jeff King
2014-10-03 20:24 ` [PATCH 06/16] reachable: clear pending array after walking it Jeff King
2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
2014-10-03 22:12   ` Junio C Hamano
2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
2014-10-03 22:17   ` Junio C Hamano
2014-10-04  0:13     ` Jeff King
2014-10-07 13:21   ` Michael Haggerty
2014-10-07 17:29     ` Junio C Hamano
2014-10-07 20:18       ` Jeff King
2014-10-07 20:35         ` Junio C Hamano
2014-10-07 21:29           ` Jeff King
2014-10-07 21:53             ` Junio C Hamano
2014-10-07 22:17               ` Michael Haggerty
2014-10-08  1:13                 ` Jeff King
2014-10-08 16:58                   ` Junio C Hamano
2014-10-07 21:16         ` Junio C Hamano
2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
2014-10-03 22:19   ` Junio C Hamano
2014-10-04  0:24     ` Jeff King
2014-10-07 14:07   ` Michael Haggerty
2014-10-08  7:33     ` Jeff King
2014-10-03 20:31 ` [PATCH 10/16] count-objects: do not use xsize_t when counting object size Jeff King
2014-10-03 20:31 ` [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir Jeff King
2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
2014-10-05  8:15   ` René Scharfe
2014-10-05 10:47     ` Ramsay Jones
2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
2014-10-03 21:47   ` Junio C Hamano
2014-10-04  0:09     ` Jeff King
2014-10-04  0:30     ` Jeff King
2014-10-04  3:04       ` Junio C Hamano
2014-10-07 16:29   ` Michael Haggerty
2014-10-08  7:19     ` Jeff King
2014-10-08 10:37       ` Michael Haggerty
2014-10-03 20:39 ` [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check Jeff King
2014-10-03 20:40 ` [PATCH 15/16] pack-objects: match prune logic for discarding objects Jeff King
2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
2014-10-03 21:29   ` Junio C Hamano
2014-10-04  0:01     ` Jeff King
2014-10-05  9:12   ` René Scharfe [this message]
2014-10-03 22:20 ` [PATCH 0/16] make prune mtime-checking more careful Junio C Hamano
2014-10-04 22:22 ` Junio C Hamano
2014-10-05  9:19   ` René Scharfe
2014-10-06  1:42   ` Jeff King
2014-10-08  8:31     ` Jeff King
2014-10-08 17:03       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54310B65.2010301@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).