git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Bence Ferdinandy <bence@ferdinandy.com>
Cc: git@vger.kernel.org,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Patrick Steinhardt" <ps@pks.im>,
	"Christian Schlack" <christian@backhub.co>,
	"Jeff King" <peff@peff.net>
Subject: Re: [PATCH v3] set-head: no update without change and better output for --auto
Date: Mon, 16 Sep 2024 11:36:39 -0700	[thread overview]
Message-ID: <xmqq5xqvo37s.fsf@gitster.g> (raw)
In-Reply-To: <20240915221055.904107-1-bence@ferdinandy.com> (Bence Ferdinandy's message of "Mon, 16 Sep 2024 00:09:54 +0200")

The message I am responding to is sent as the first message to start
a brand-new thread, but for future reference, it makes it easier for
everybody involved if you sent a new iteration of a patch (or a
patch series) as a reply/follow-up to the last round.

If it is too cumbersome to do for any reason, as an alternative, you
can give a reference to previous rounds as URLs to the list archive,
i.e.

 (v1) https://lore.kernel.org/git/20240910203129.2251090-1-bence@ferdinandy.com/
 (v2) https://lore.kernel.org/git/20240910203835.2288291-1-bence@ferdinandy.com/

> Currently, even if there is no actual change to remote/HEAD calling
> remote set-head will overwrite the appropriate file and if set to --auto
> will also print a message saying "remote/HEAD set to branch", which
> implies something was changed.
>
> Change the behaviour of remote set-head so that the reference is only
> updated if it actually needs to change.

I sense a patch with a racy TOCTOU change coming ;-)

> Change the output of --auto, so the output actually reflects what was
> done: a) set a previously unset HEAD, b) change HEAD because remote
> changed or c) no updates. Make the output easily parsable, by using
> a slightly clunky wording that allows all three outputs to have the same
> structure and number of words.

This would be useful.

>         This patch was originally sent along when I thought set-head was
>         going to be invoked by fetch, but the discussion on the RFC
>         concluded that it should be not. This opened the possibility to make
>         it more explicit.

I am not quite sure what the discussion concluded "should be not",
though.  In a message from you

  https://lore.kernel.org/git/D43G2CGX2N7L.ZRETD4HLIH0E@ferdinandy.com/

what we agreed was that "git fetch" does not need a new option, but
we also agreed that it would be a good idea for "git fetch" to
create, refs/remotes/$remote/HEAD when it does not exist without
being told.

I take it that you still want to see such a change to "git fetch"
eventually happen, but decided to tackle the other half of the
original two-patch series first with this patch?

>  static int set_head(int argc, const char **argv, const char *prefix)
>  {
> -	int i, opt_a = 0, opt_d = 0, result = 0;
> -	struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT;
> +	int i, opt_a = 0, opt_d = 0, is_ref_changed = 0, result = 0;

Shall we simply call it ref_changed instead, so that I do not have
to wonder which is better between has_ref_changed and is_ref_changed? ;-)

> +	struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, buf3 = STRBUF_INIT;
>  	char *head_name = NULL;
>  
>  	struct option options[] = {
> @@ -1440,13 +1440,19 @@ static int set_head(int argc, const char **argv, const char *prefix)
>  
>  	if (head_name) {
>  		strbuf_addf(&buf2, "refs/remotes/%s/%s", argv[0], head_name);

> +		refs_read_symbolic_ref(get_main_ref_store(the_repository),buf.buf,&buf3);
> +		is_ref_changed = strcmp(buf2.buf,buf3.buf);
>  		/* make sure it's valid */
>  		if (!refs_ref_exists(get_main_ref_store(the_repository), buf2.buf))
>  			result |= error(_("Not a valid ref: %s"), buf2.buf);
> -		else if (refs_update_symref(get_main_ref_store(the_repository), buf.buf, buf2.buf, "remote set-head"))
> +		else if (is_ref_changed && refs_update_symref(get_main_ref_store(the_repository), buf.buf, buf2.buf, "remote set-head"))
>  			result |= error(_("Could not setup %s"), buf.buf);

Two and a half small issues:

 - Do not omit " " after "," and avoid overlong lines by folding
   lines at an appropriate place (usually after an operator at
   higher point in parse tree, i.e. after "is_ref_changed &&").

 - This is inherently racy, isn't it?  We read the _current_ value.
   After we do so, but before we write _our_ value, another process
   may update it, so we'd end up overwriting the value they wrote.

 - To compute is_ref_changed, we look at buf3.buf but what happens
   if such a ref does not exist, exists but is not a symbolic ref,
   or is a hierarchy under which other refs hang (e.g., a funny ref
   "refs/remotes/origin/HEAD/main" exists)?  Does the strcmp()
   compare a valid buf2.buf and an uninitialized junk in buf3.buf
   and give a random result?  Shouldn't the code checking the return
   value from the refs_read_symbolic_ref() call, and if we did so,
   would we learn enough to tell among the cases where (1) it is a
   symref, (2) it is a regular ref, (3) no such ref exists, and (4)
   the refs layer hit an error to prevent us from giving one of the
   previous 3 answers?

If we really wanted to resolve the raciness, I think we need to
enhance the set of values that create_symref() can optionally return
to its callers so that the caller can tell what the old value was.

I am not sure offhand how involved such an update would be, though.

> +		else if (opt_a && !strcmp(buf3.buf,""))
> +			printf("%s/HEAD was unset: set to %s\n", argv[0], head_name);

I suspect that this does not account for a case where the
"read_symbolic_ref()" we did earlier failed for a reason other than
"the ref did not exist".

> +		else if (opt_a && is_ref_changed)
> +			printf("%s/HEAD was changed: set to %s\n", argv[0], head_name);
>  		else if (opt_a)
> -			printf("%s/HEAD set to %s\n", argv[0], head_name);
> +			printf("%s/HEAD was unchanged: set to %s\n", argv[0], head_name);
>  		free(head_name);
>  	}

Quoting the values we obtained from the environment or the user may
be nicer, e.g.

    'refs/remotes/origin/HEAD' is now set to 'main'.
    'refs/remotes/origin/HEAD' is unchanged and points at 'main'.

This is a tangent outside the immediate topic of this patch, but I
wonder if we need "--quiet" mode.

Thanks.

  reply	other threads:[~2024-09-16 18:36 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-15 22:09 [PATCH v3] set-head: no update without change and better output for --auto Bence Ferdinandy
2024-09-16 18:36 ` Junio C Hamano [this message]
2024-09-16 19:44   ` Bence Ferdinandy
2024-09-17 20:32   ` Bence Ferdinandy
2024-09-17 20:51     ` Junio C Hamano
2024-09-19 20:53       ` Bence Ferdinandy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq5xqvo37s.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=bence@ferdinandy.com \
    --cc=christian@backhub.co \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).