git.vger.kernel.org archive mirror
From: Nico Williams <nico@cryptonector.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Junio C Hamano <gitster@pobox.com>,
	Martin von Zweigbergk <martinvonz@google.com>,
	Git Mailing List <git@vger.kernel.org>,
	Edwin Kempin <ekempin@google.com>,
	Scott Chacon <scott@gitbutler.com>,
	remo@buenzli.dev,
	"philipmetzger@bluewin.ch" <philipmetzger@bluewin.ch>
Subject: Semantics of change IDs (Re: Gerrit, GitButler, and Jujutsu projects collaborating on change-id commit footer)
Date: Wed, 9 Apr 2025 11:54:10 -0500
Message-ID: <Z/amMj/eg0RbXdkS@ubby>
In-Reply-To: <20250409121924.GA148735@mit.edu>

On Wed, Apr 09, 2025 at 08:19:24AM -0400, Theodore Ts'o wrote:
> On Tue, Apr 08, 2025 at 10:53:06AM -0500, Nico Williams wrote:
> > I'm not keen on CR tools "intuiting" from.. similarity checks.
> > [...]
> 
> I'm not keen on fields that can have essentially random semantics.
> Part of this is because today Change-ID is in the footer, and so
> humans can randomly set it to any value they like.  Sometimes they cut
> and paste footers, and so completely unrelated commits have the same
> Change-Id which show up when you do a Gerrit lookup by Change-Id.
> Admittedly, this aspect gets better if we shove it into the git commit
> header.
>
> Part of it is because some tools will edit the Change-Id when doing a
> cherry-pick.  [...]

I was only proposing to leave some details out, not to have completely
undefined semantics.  The particular details we might want to leave out
are about resolving change IDs to URIs.  In particular, the editing of
change IDs on cherry-pick that you mention must not be permitted, or
perhaps a new change ID could be added alongside the old one -- i.e.,
are these headers single-valued or multi-valued?

Let's nail down the semantics of these change ID headers.  Here is a
proposal to bang on:

 - change IDs get preserved on cherry-pick and on `pick`s in rebases

 - users can manually remove or change these change IDs, naturally,
   though generally they would not

 - the actual change IDs are either free-form or they are URIs -- pick
   one, but if they are URIs they should be URIs to CRs, and approved
   CRs should perhaps have links to integration reports etc.

 - there should be one header for a change ID for the patch series (the
   MR/PR/whateverR); patch series IDs can be shared by many commits in
   one branch, so they are not in any way unique

 - there may be one header for a change ID for each commit, which should
   be unique within any one _branch_, but need not be unique within a
   repo (due to back- and forward-ports for example)

 - there should be another header listing the change IDs of the commits
   from which a commit was derived when it nonetheless has a different
   commit change ID of its own

 - these headers should be multi-valued to handle squashes and merges

 - if a commit change ID is missing but a patch series change ID is
   present then similarity checks could be used to link multiple
   versions of any one such commit
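
To make the list above a bit more concrete, here is a minimal Python
sketch of those semantics.  None of this is an existing git API: the
`Commit` class, its field names, and the helper functions are all made
up for illustration, and taking the union of the headers on a squash is
my assumption about what "multi-valued" would mean in practice.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Commit:
    """Toy commit carrying the three proposed (hypothetical) headers."""
    oid: str
    series_ids: frozenset = frozenset()    # patch series change IDs
    change_ids: frozenset = frozenset()    # per-commit change IDs
    derived_from: frozenset = frozenset()  # change IDs this commit derives from


def cherry_pick(src, new_oid):
    """A pick yields a new object ID but preserves every change ID header."""
    return replace(src, oid=new_oid)


def squash(commits, new_oid):
    """Assumption: a squash carries the union of its inputs' headers."""
    return Commit(
        oid=new_oid,
        series_ids=frozenset().union(*(c.series_ids for c in commits)),
        change_ids=frozenset().union(*(c.change_ids for c in commits)),
        derived_from=frozenset().union(*(c.derived_from for c in commits)),
    )


def duplicate_change_ids(branch):
    """Return commit change IDs found on more than one commit of a branch."""
    seen, dupes = set(), set()
    for commit in branch:
        for cid in commit.change_ids:
            (dupes if cid in seen else seen).add(cid)
    return dupes
```

A non-empty result from `duplicate_change_ids()` would flag a branch
that violates the "unique in any branch" rule, e.g. one holding both a
commit and a cherry-pick of it.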

Optional:

 - a commit change ID could be used as a ref to an object that lists the
   commits that have that change ID

 - a patch series change ID could be used as a ref to an object that lists
   the head commit of that patch series in every branch that contains
   it
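
The two optional items amount to a pair of lookup tables.  A rough
Python sketch of what those listing objects might contain follows; the
function names and the input shapes (plain tuples and dicts rather than
real git objects or refs) are assumptions made only for illustration.

```python
from collections import defaultdict


def index_commit_change_ids(commits):
    """Map each commit change ID to the object IDs that carry it.

    `commits` is an iterable of (oid, change_ids) pairs.
    """
    index = defaultdict(list)
    for oid, change_ids in commits:
        for cid in change_ids:
            index[cid].append(oid)
    return dict(index)


def index_series_heads(branches):
    """Map each patch series ID to {branch: head commit of the series there}.

    `branches` maps a branch name to its commits, oldest first, each given
    as an (oid, series_ids) pair; the last commit seen for a series on a
    branch is treated as that series' head on that branch.
    """
    heads = defaultdict(dict)
    for branch, commits in branches.items():
        for oid, series_ids in commits:
            for sid in series_ids:
                heads[sid][branch] = oid
    return {sid: dict(by_branch) for sid, by_branch in heads.items()}
```

Because a commit change ID is not unique across a repo, the first table
maps to a list of commits rather than to a single one.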

> Perhaps one approach might be that the heuristics that you hate being
> used as an automated way to sort it out, might get used to set the
> semantics at commit time, with perhaps a way for the user to override
> the heuristics, or where the user has to explicitly acknowledge that
> the heuristics correctly noticed that the patch has changed radically
> and maybe the Change-Id shouldn't be retained any more?

Yes, heuristics can be used to help the user make such decisions.  I've
no issue with that.

> Finally, perhaps there should be some discussion about whether we
> think git should be maintaining indexes based on the Commit-Id.

If they can be refs, then they should be.  Since they can't be unique
the ref should be to an object listing the actual commits (see above).

There could also be a non-ref index for these.

> Personally, cutting and pasting a random 17 character ID is painful
> and annoying, and when I see it in my shell history, I have no idea
> what might have been going on.  So if I need to cut and paste a
> Commit-Id, I might as well cut and paste the one-line commit summary,
> and do a "git log --grep" search based on that.  But if the Commit-Id
> is indexed, then maybe it might be more useful?  I dunno....

+1

> Well, see above about some possible semantics.  I'm *still* not
> convinced even with the better-defined semantics it's worth storing
> the extra baggage in the commit header.  But that's more of a
> value/philosophical question, much like how we "could" store explicit
> file rename information in the git commit, but in the very early days
> of the git design history, although BitKeeper did track file names,
> Linus consciously decided to go down a much simpler path.  So that's
> really more of a SMTP vs X.400 preference of simplicity versus
> complexity in the protocol versus implementation, which is something
> where people of good will might disagree --- and there Junio's
> opinions matter far more than mine.  :-)

I don't find file rename heuristics to be "simple", and they're often
wrong, though I've fully internalized that copies and renames have to be
done alone in separate commits with no content changes so as to make
incorrect rename determinations much less likely.

Nico