git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: Elijah Newren <newren@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Is fetch.writeCommitGraph (and thus features.experimental) meant to work in the presence of shallow clones?
Date: Wed, 15 Apr 2020 07:55:19 -0400	[thread overview]
Message-ID: <cdc80c0b-696f-19b8-d132-592782537b03@gmail.com> (raw)
In-Reply-To: <20200414235057.GA6863@syl.local>

On 4/14/2020 7:50 PM, Taylor Blau wrote:
> On Tue, Apr 14, 2020 at 04:31:19PM -0400, Derrick Stolee wrote:
>> On 4/14/2020 4:22 PM, Elijah Newren wrote:
>>> Hi,
>>>
>>> I was building a version of git for internal use, and thought I'd try
>>> turning on features.experimental to get more testing of it.  The
>>> following test error in the testsuite scared me, though:
>>>
>>> t5537.9 (fetch --update-shallow):
>>>
>>> ...
>>> + git fetch --update-shallow ../shallow/.git refs/heads/*:refs/remotes/shallow/*
[snip]
>> Well, commit-graphs are not supposed to do anything if we have
>> shallow clones. We definitely don't load a commit-graph in that
>> case. Seems like we need an extra check in write_commit_graph()
>> to stop writing in the presence of shallow commits.

Here, I assumed that the commit-graph wasn't checking the shallow status
appropriately, but...

> This rang a bell to me, too. There's a bug, but it's due to the mutative
> side-effects of 'is_repository_shallow' along with '--update-shallow' (a
> normal 'git fetch' works fine here, with or without
> fetch.writeCommitGraph).

...this makes sense as to why this particular case failed.

(Please allow me this brief moment to communicate my extreme dislike
of shallow clones. There, I'm done.)
 
> Here's a patch that I didn't sign-off on that fixes the problem for me.
> 
> --- >8 ---
> 
> Subject: [PATCH] shallow.c: use 'reset_repository_shallow' when appropriate
> 
> In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
> 2019-01-10), the author noted that 'is_repository_shallow' produces
> visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
> 
> This is a problem for e.g., fetching with '--update-shallow' in a
> shallow repsoitory with 'fetch.writeCommitGraph' enabled, since the

repository

> update to '.git/shallow' will cause Git to think that the repository
> *isn't* shallow when it is, thereby circumventing the commit-graph
> compatability check.

compatibility

> This causes problems in shallow repositories with at least shallow refs
> that have at least one ancestor (since the client won't have those
> object(s), and therefore can't take the reachability closure over
> commits to be written to the commit-graph).
> 
> Address this by introducing 'reset_repository_shallow()', and calling it
> when the shallow file is updated, forcing 'is_repository_shallow' to
> re-evaluate whether the repository is still shallow after fetching in
> the above scenario.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  commit.h     |  1 +
>  fetch-pack.c |  1 +
>  shallow.c    | 15 ++++++++-------
>  3 files changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/commit.h b/commit.h
> index 008a0fa4a0..ee1ba139d4 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -251,6 +251,7 @@ int register_shallow(struct repository *r, const struct object_id *oid);
>  int unregister_shallow(const struct object_id *oid);
>  int for_each_commit_graft(each_commit_graft_fn, void *);
>  int is_repository_shallow(struct repository *r);
> +void reset_repository_shallow(struct repository *r);
>  struct commit_list *get_shallow_commits(struct object_array *heads,
>  					int depth, int shallow_flag, int not_shallow_flag);
>  struct commit_list *get_shallow_commits_by_rev_list(
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 1734a573b0..051902ef6d 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1630,6 +1630,7 @@ static void update_shallow(struct fetch_pack_args *args,
>  		if (*alternate_shallow_file == '\0') { /* --unshallow */
>  			unlink_or_warn(git_path_shallow(the_repository));
>  			rollback_lock_file(&shallow_lock);
> +			reset_repository_shallow(the_repository);
>  		} else
>  			commit_lock_file(&shallow_lock);
>  		alternate_shallow_file = NULL;
> diff --git a/shallow.c b/shallow.c
> index 7fd04afed1..fac383dec9 100644
> --- a/shallow.c
> +++ b/shallow.c
> @@ -40,13 +40,6 @@ int register_shallow(struct repository *r, const struct object_id *oid)
> 
>  int is_repository_shallow(struct repository *r)
>  {
> -	/*
> -	 * NEEDSWORK: This function updates
> -	 * r->parsed_objects->{is_shallow,shallow_stat} as a side effect but
> -	 * there is no corresponding function to clear them when the shallow
> -	 * file is updated.
> -	 */
> -

It's always good to remove these NEEDSWORK comments. I wonder if the
problem is more complicated than your patch makes it seem, or else
the original author would have done it correctly at first.

But you are definitely closing out one dangling side-effect, which is
an improvement.

>  	FILE *fp;
>  	char buf[1024];
>  	const char *path = r->parsed_objects->alternate_shallow_file;
> @@ -79,6 +72,12 @@ int is_repository_shallow(struct repository *r)
>  	return r->parsed_objects->is_shallow;
>  }
> 
> +void reset_repository_shallow(struct repository *r)
> +{
> +	r->parsed_objects->is_shallow = -1;
> +	stat_validity_clear(r->parsed_objects->shallow_stat);
> +}
> +
>  /*
>   * TODO: use "int" elemtype instead of "int *" when/if commit-slab
>   * supports a "valid" flag.
> @@ -362,6 +361,7 @@ void setup_alternate_shallow(struct lock_file *shallow_lock,
>  		 * shallow file".
>  		 */
>  		*alternate_shallow_file = "";
> +	reset_repository_shallow(the_repository);
>  	strbuf_release(&sb);
>  }
> 
> @@ -411,6 +411,7 @@ void prune_shallow(unsigned options)
>  			die_errno("failed to write to %s",
>  				  get_lock_file_path(&shallow_lock));
>  		commit_lock_file(&shallow_lock);
> +		reset_repository_shallow(the_repository);
>  	} else {

These are likely good places to call reset_repository_shallow(),
but perhaps we should also call it here in commit-graph.c?

 static int commit_graph_compatible(struct repository *r)
 {
        if (!r->gitdir)
                return 0;
 
        if (read_replace_refs) {
                prepare_replace_object(r);
                if (hashmap_get_size(&r->objects->replace_map->map))
                        return 0;
        }
 
        prepare_commit_graft(r);
        if (r->parsed_objects && r->parsed_objects->grafts_nr)
                return 0;
+       reset_repository_shallow(r);
        if (is_repository_shallow(r))
                return 0;
 
        return 1;
 }

Or am I misunderstanding the state that reset_repository_shallow()
puts us in? My expectation is that is_repository_shallow() will
act as if the shallow variables had never been set and will look
for shallow data from disk.

Thanks,
-Stolee

  parent reply	other threads:[~2020-04-15 11:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-14 20:22 Is fetch.writeCommitGraph (and thus features.experimental) meant to work in the presence of shallow clones? Elijah Newren
2020-04-14 20:31 ` Taylor Blau
2020-04-14 20:31 ` Derrick Stolee
2020-04-14 23:50   ` Taylor Blau
2020-04-15  0:07     ` Taylor Blau
2020-04-15 11:55     ` Derrick Stolee [this message]
2020-04-15 15:55       ` Taylor Blau
2020-04-15 18:07       ` Elijah Newren
2020-04-16  2:05     ` Jonathan Tan
2020-04-15 20:54 ` Jonathan Nieder
2020-04-15 22:54   ` Elijah Newren
2020-04-16  0:47     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cdc80c0b-696f-19b8-d132-592782537b03@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).