git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: "Manuel Quiñones" <manuel.por.aca@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Usability issue: "Your branch is up to date"
Date: Tue, 04 Feb 2025 09:43:10 -0800
Message-ID: <xmqqseottxld.fsf@gitster.g>
In-Reply-To: <CAPpV+Oaq3d3oNE-V3pnpQRNrGCoZr52uY91QtWYxcu1tgG_QXg@mail.gmail.com> ("Manuel Quiñones"'s message of "Tue, 4 Feb 2025 09:38:30 -0300")

Manuel Quiñones <manuel.por.aca@gmail.com> writes:

> Thanks for the insightful explanation Junio! Looking forward, do you
> think that it could be possible to record the timestamp that the
> remote-tracking branch has been updated with the remote branch? In
> order to make such information available to the end user.

The time at which each remote-tracking branch was updated is already
recorded in the reflog.  What is missing is the time at which a
fetch checked whether a remote-tracking branch needed updating, found
that the branch at the remote hadn't changed, and did not update the
remote-tracking branch.

You'd need to first design where to store that information and how.

It does not have to be in the reflog, but as a thought experiment,
let's see how the design would go if we decided to use the reflog to
store that information.

What a reflog entry records, in textual form, looks like

<old-object-name> <new-object-name> <user-ident> <timestamp> <comment>

We can imagine adding a new reflog entry whenever "git fetch" finds
that the branch at the remote hasn't been updated, with the same
value in <old-object-name> and <new-object-name>.
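
To make the entry layout concrete, here is a minimal Python sketch
that parses one line of an on-disk reflog file and tells whether it
is such a no-op entry.  It assumes the usual on-disk layout, where
<user-ident> is "Name <email>", <timestamp> is seconds since the
epoch followed by a timezone offset, and a tab precedes <comment>.
The function names and the "fetch: no change" comment in the usage
example are made up for illustration; nothing in git writes such an
entry today.

```python
import re
from typing import NamedTuple


class ReflogEntry(NamedTuple):
    old: str        # <old-object-name>
    new: str        # <new-object-name>
    ident: str      # <user-ident>, i.e. "Name <email>"
    timestamp: int  # seconds since the epoch
    tz: str         # timezone offset such as "-0800"
    comment: str    # <comment>, may be empty


# <old> <new> <ident> <seconds> <tz>[TAB<comment>]
_LINE = re.compile(
    r"^(?P<old>[0-9a-f]+) (?P<new>[0-9a-f]+) (?P<ident>.*>) "
    r"(?P<ts>\d+) (?P<tz>[+-]\d{4})(?:\t(?P<comment>.*))?$"
)


def parse_reflog_line(line: str) -> ReflogEntry:
    """Split one reflog line into its fields."""
    m = _LINE.match(line.rstrip("\n"))
    if m is None:
        raise ValueError(f"not a reflog line: {line!r}")
    return ReflogEntry(
        m["old"], m["new"], m["ident"], int(m["ts"]), m["tz"],
        m["comment"] or "",
    )


def is_noop(entry: ReflogEntry) -> bool:
    """A no-op entry records the same object name on both sides."""
    return entry.old == entry.new
```

For example, with a hypothetical entry:

```python
oid = "a" * 40
line = f"{oid} {oid} A U Thor <author@example.com> 1738690990 -0800\tfetch: no change\n"
entry = parse_reflog_line(line)
print(is_noop(entry), entry.timestamp)
```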

A reflog file I randomly picked as a sample is ~5k long with 34
entries (it keeps track of my fetching from and pushing to
https://git.kernel.org/pub/scm/git/git.git/#master), so a reflog
costs around 150 bytes per entry, and if you fetch once every hour
that would be like ~3k per branch per day.
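
The back-of-the-envelope arithmetic, spelled out as a small Python
sketch.  The exact 5000-byte figure is an assumption standing in for
"~5k", and the helper name is made up for illustration.

```python
def daily_reflog_growth(bytes_per_entry: float, fetches_per_day: int) -> float:
    """Bytes added per branch per day if every fetch writes one entry."""
    return bytes_per_entry * fetches_per_day


sample_size_bytes = 5000  # assumed value for "~5k"
sample_entries = 34

# Average size of one entry in the sample: a little under 150 bytes.
bytes_per_entry = sample_size_bytes / sample_entries

# One fetch per hour, using the rounded 150 bytes per entry.
per_day = daily_reflog_growth(150, 24)

print(round(bytes_per_entry), per_day)
```

With these inputs the daily growth comes out at 3600 bytes, which is
the same order as the rough figure above.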

While that is a trivial and insignificant number from a storage cost
point of view, if you are monitoring the progress of the remote with
"git reflog origin/main", I suspect that such a change would make it
unusably noisy, so the "git reflog" command may need to grow an
option that tells it to skip these no-op entries.
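
What such an option would have to do is simple.  A minimal sketch in
Python, working on raw lines of a reflog file; the function name is
hypothetical, and no such option exists in "git reflog" today:

```python
from typing import Iterable, Iterator


def skip_noop_entries(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the reflog lines whose old and new object names differ."""
    for line in lines:
        # The first two space-separated fields are the object names.
        old, new, _rest = line.split(" ", 2)
        if old != new:
            yield line
```

Feeding it the lines of a reflog file would leave only the entries
that record an actual change of the ref.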

As to the required change to "git fetch", this may be a bit tricky.

IIRC (I am writing from memory without looking at the code),
when you say "git fetch [<remote> [<refspec>...]]", what it does
is roughly to:

 - figure out what <remote> and <refspec>... to use from the
   configuration, if omitted on the command line.

 - connect to the remote, and ask for the current values of their refs.

 - drop any refspec <src>:<dst> whose <dst> side already has the
   value the remote has.

 - drive the object transfer machinery to receive the pack data from
   the remote and store it locally.

 - update the remote-tracking branches.

And the last step is where the remote-tracking branches are updated,
together with their reflogs (if enabled).  Because that step does not
even see the remote-tracking branches whose values do not need to
change (they are filtered out earlier to help reduce the number of
refs fed to the object transfer machinery), the "drop no-op early"
part needs to be designed differently (e.g. mark them as no-op, so
that the object transfer machinery can notice and ignore them) and
then the "update refs" step can see these no-op updates.
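
A toy model of that redesign, in Python.  This is only a sketch of
the idea and not how the code in "git fetch" is organized; all names
are made up, refs are plain dictionaries, and the refspec mapping is
assumed to have been applied already, so "advertised" maps each
remote-tracking ref to the value the remote has.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ZERO = "0" * 40  # stands in for "we do not have this ref yet"


@dataclass
class RefUpdate:
    dst: str  # remote-tracking ref, e.g. refs/remotes/origin/main
    old: str  # value we currently have
    new: str  # value the remote advertised

    @property
    def noop(self) -> bool:
        return self.old == self.new


def plan_updates(local: Dict[str, str],
                 advertised: Dict[str, str]) -> List[RefUpdate]:
    """Keep every ref in the plan; no-ops stay in instead of being dropped."""
    return [RefUpdate(dst, local.get(dst, ZERO), new)
            for dst, new in advertised.items()]


def objects_to_request(plan: List[RefUpdate]) -> List[str]:
    """The transfer step notices no-ops and ignores them."""
    return [u.new for u in plan if not u.noop]


def apply_updates(local: Dict[str, str], plan: List[RefUpdate], now: int,
                  reflog: Dict[str, List[Tuple[str, str, int]]]) -> None:
    """One update step sees real updates and no-ops alike and logs both."""
    for u in plan:
        local[u.dst] = u.new
        reflog.setdefault(u.dst, []).append((u.old, u.new, now))
```

The point of the sketch is that the no-op entries are written by the
same step that writes the real updates, while the list handed to the
transfer step stays as short as it is today.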

I do not think writing the "no-op" reflog entries should be done at
a step separate from the step that writes the real ref updates, as I
suspect that such a separate update scheme would have funny
interactions with "git fetch --atomic".

So, do I think it could be possible?  Sure.  Do I think it would be
too hard as a rocket surgery?  No.  Will I jump up and down excited
and start coding?  I am not interested all that much, but I can help
reviewing patches if somebody else works on it.

There may be some other downsides (other than the cost of storage
and making the reflog noisy) I haven't thought about, which need to
be considered if somebody decides to work on this.

Thanks.


