From: Junio C Hamano <gitster@pobox.com>
To: Bence Ferdinandy <bence@ferdinandy.com>
Cc: git@vger.kernel.org,
"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"SZEDER Gábor" <szeder.dev@gmail.com>,
"Patrick Steinhardt" <ps@pks.im>,
"Christian Schlack" <christian@backhub.co>,
"Jeff King" <peff@peff.net>
Subject: Re: [PATCH v3] set-head: no update without change and better output for --auto
Date: Mon, 16 Sep 2024 11:36:39 -0700 [thread overview]
Message-ID: <xmqq5xqvo37s.fsf@gitster.g> (raw)
In-Reply-To: <20240915221055.904107-1-bence@ferdinandy.com> (Bence Ferdinandy's message of "Mon, 16 Sep 2024 00:09:54 +0200")
The message I am responding to is sent as the first message to start
a brand-new thread, but for future reference, it makes it easier for
everybody involved if you sent a new iteration of a patch (or a
patch series) as a reply/follow-up to the last round.
If it is too cumbersome to do for any reason, as an alternative, you
can give a reference to previous rounds as URLs to the list archive,
i.e.
(v1) https://lore.kernel.org/git/20240910203129.2251090-1-bence@ferdinandy.com/
(v2) https://lore.kernel.org/git/20240910203835.2288291-1-bence@ferdinandy.com/
> Currently, even if there is no actual change to remote/HEAD calling
> remote set-head will overwrite the appropriate file and if set to --auto
> will also print a message saying "remote/HEAD set to branch", which
> implies something was changed.
>
> Change the behaviour of remote set-head so that the reference is only
> updated if it actually needs to change.
I sense a patch with a racy TOCTOU change coming ;-)
> Change the output of --auto, so the output actually reflects what was
> done: a) set a previously unset HEAD, b) change HEAD because remote
> changed or c) no updates. Make the output easily parsable, by using
> a slightly clunky wording that allows all three outputs to have the same
> structure and number of words.
This would be useful.
> This patch was originally sent along when I thought set-head was
> going to be invoked by fetch, but the discussion on the RFC
> concluded that it should be not. This opened the possibility to make
> it more explicit.
I am not quite sure what the discussion concluded "should be not",
though. In a message from you
https://lore.kernel.org/git/D43G2CGX2N7L.ZRETD4HLIH0E@ferdinandy.com/
what we agreed was that "git fetch" does not need a new option, but
we also agreed that it would be a good idea for "git fetch" to
create, refs/remotes/$remote/HEAD when it does not exist without
being told.
I take it that you still want to see such a change to "git fetch"
eventually happen, but decided to tackle the other half of the
original two-patch series first with this patch?
> static int set_head(int argc, const char **argv, const char *prefix)
> {
> - int i, opt_a = 0, opt_d = 0, result = 0;
> - struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT;
> + int i, opt_a = 0, opt_d = 0, is_ref_changed = 0, result = 0;
Shall we simply call it ref_changed instead, so that I do not have
to wonder which is better between has_ref_changed and is_ref_changed? ;-)
> + struct strbuf buf = STRBUF_INIT, buf2 = STRBUF_INIT, buf3 = STRBUF_INIT;
> char *head_name = NULL;
>
> struct option options[] = {
> @@ -1440,13 +1440,19 @@ static int set_head(int argc, const char **argv, const char *prefix)
>
> if (head_name) {
> strbuf_addf(&buf2, "refs/remotes/%s/%s", argv[0], head_name);
> + refs_read_symbolic_ref(get_main_ref_store(the_repository),buf.buf,&buf3);
> + is_ref_changed = strcmp(buf2.buf,buf3.buf);
> /* make sure it's valid */
> if (!refs_ref_exists(get_main_ref_store(the_repository), buf2.buf))
> result |= error(_("Not a valid ref: %s"), buf2.buf);
> - else if (refs_update_symref(get_main_ref_store(the_repository), buf.buf, buf2.buf, "remote set-head"))
> + else if (is_ref_changed && refs_update_symref(get_main_ref_store(the_repository), buf.buf, buf2.buf, "remote set-head"))
> result |= error(_("Could not setup %s"), buf.buf);
Two and a half small issues:
- Do not omit " " after "," and avoid overlong lines by folding
lines at an appropriate place (usually after an operator at
higher point in parse tree, i.e. after "is_ref_changed &&").
- This is inherently racy, isn't it? We read the _current_ value.
After we do so, but before we write _our_ value, another process
may update it, so we'd end up overwriting the value they wrote.
- To compute is_ref_changed, we look at buf3.buf but what happens
if such a ref does not exist, exists but is not a symbolic ref,
or is a hierarchy under which other refs hang (e.g., a funny ref
"refs/remotes/origin/HEAD/main" exists)? Does the strcmp()
compare a valid buf2.buf and an uninitialized junk in buf3.buf
and give a random result? Shouldn't the code checking the return
value from the refs_read_symbolic_ref() call, and if we did so,
would we learn enough to tell among the cases where (1) it is a
symref, (2) it is a regular ref, (3) no such ref exists, and (4)
the refs layer hit an error to prevent us from giving one of the
previous 3 answers?
If we really wanted to resolve the raciness, I think we need to
enhance the set of values that create_symref() can optionally return
to its callers so that the caller can tell what the old value was.
I am not sure offhand how involved such an update would be, though.
> + else if (opt_a && !strcmp(buf3.buf,""))
> + printf("%s/HEAD was unset: set to %s\n", argv[0], head_name);
I suspect that this does not account for a case where the
"read_symbolic_ref()" we did earlier failed for a reason other than
"the ref did not exist".
> + else if (opt_a && is_ref_changed)
> + printf("%s/HEAD was changed: set to %s\n", argv[0], head_name);
> else if (opt_a)
> - printf("%s/HEAD set to %s\n", argv[0], head_name);
> + printf("%s/HEAD was unchanged: set to %s\n", argv[0], head_name);
> free(head_name);
> }
Quoting the values we obtained from the environment or the user may
be nicer, e.g.
'refs/remotes/origin/HEAD' is now set to 'main'.
'refs/remotes/origin/HEAD' is unchanged and points at 'main'.
This is a tangent outside the immediate topic of this patch, but I
wonder if we need "--quiet" mode.
Thanks.
next prev parent reply other threads:[~2024-09-16 18:36 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-15 22:09 [PATCH v3] set-head: no update without change and better output for --auto Bence Ferdinandy
2024-09-16 18:36 ` Junio C Hamano [this message]
2024-09-16 19:44 ` Bence Ferdinandy
2024-09-17 20:32 ` Bence Ferdinandy
2024-09-17 20:51 ` Junio C Hamano
2024-09-19 20:53 ` Bence Ferdinandy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq5xqvo37s.fsf@gitster.g \
--to=gitster@pobox.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=avarab@gmail.com \
--cc=bence@ferdinandy.com \
--cc=christian@backhub.co \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).