Git development
* [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
@ 2026-04-07  3:13 Matt Stark
  2026-04-07  4:09 ` Junio C Hamano
  2026-04-07 23:28 ` brian m. carlson
  0 siblings, 2 replies; 12+ messages in thread
From: Matt Stark @ 2026-04-07  3:13 UTC (permalink / raw)
  To: git
  Cc: ps, gitster, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	nico, rikingcoding, Matt Stark

In the discussions on
https://lore.kernel.org/git/Z_OGMb-1oV0Ex05e@pks.im/T/#m038be849b9b4020c16c562d810cf77bad91a2c87,
it seems to be that:
* There is consensus that a `change-id` header provides good value
* There is not consensus on what precise format that should take

This commit, rather than attempting to standardize the format, simply
preserves the change-id header in whatever format it used previously.

If we so choose, we can later decide on a standardized format, but since
git only preserves existing headers, this should not create backwards
incompatibility.

Signed-off-by: Matt Stark <msta@google.com>
---
 sequencer.c                           | 39 ++++++++++++++++++++++-----
 t/t3400-rebase.sh                     | 20 ++++++++++++++
 t/t3501-revert-cherry-pick.sh         | 15 +++++++++++
 t/t7501-commit-basic-functionality.sh | 15 +++++++++++
 4 files changed, 83 insertions(+), 6 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index b7d8dca47f..093d47d42a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1530,12 +1530,12 @@ static int try_to_commit(struct repository *r,
  struct strbuf *msg, const char *author,
  const char *reflog_action,
  struct replay_opts *opts, unsigned int flags,
- struct object_id *oid)
+ struct object_id *oid,
+ struct commit_extra_header *extra)
 {
  struct object_id tree;
  struct commit *current_head = NULL;
  struct commit_list *parents = NULL;
- struct commit_extra_header *extra = NULL;
  struct strbuf err = STRBUF_INIT;
  struct strbuf commit_msg = STRBUF_INIT;
  char *amend_author = NULL;
@@ -1721,7 +1721,8 @@ static int do_commit(struct repository *r,
       const char *msg_file, const char *author,
       const char *reflog_action,
       struct replay_opts *opts, unsigned int flags,
-      struct object_id *oid)
+      struct object_id *oid,
+      struct commit_extra_header *extra_headers)
 {
  int res = 1;

@@ -1735,7 +1736,7 @@ static int do_commit(struct repository *r,
     msg_file);

  res = try_to_commit(r, msg_file ? &sb : NULL,
-     author, reflog_action, opts, flags, &oid);
+     author, reflog_action, opts, flags, &oid, extra_headers);
  strbuf_release(&sb);
  if (!res) {
  refs_delete_ref(get_main_ref_store(r), "",
@@ -2511,10 +2512,36 @@ static int do_pick_commit(struct repository *r,
  oid_to_hex(&commit->object.oid), msg.subject);
  } /* else allow == 0 and there's nothing special to do */
  if (!opts->no_commit && !drop_commit) {
- if (author || command == TODO_REVERT || (flags & AMEND_MSG))
+ if (author || command == TODO_REVERT || (flags & AMEND_MSG)) {
+ struct commit_extra_header *extra_headers = NULL;
+ if (commit) {
+ unsigned long size;
+ const char *buffer = repo_get_commit_buffer(r, commit, &size);
+ size_t out_len;
+ // The Gerrit, GitButler, and Jujutsu projects all have a concept of
+ // a "change id", and it behaves in a similar way between the three
+ // tools. The change id is conceptually associated with a commit.
+ // It follows a commit as it is rewritten (e.g. by amending and
+ // rebasing).
+ // While git doesn't add this header itself, and currently has no plans
+ // to do so, there is consensus that if the header is added by another
+ // tool, git should at least preserve it.
+ const char *header_value = find_commit_header(buffer, "change-id", &out_len);
+ if (header_value) {
+ extra_headers = xmalloc(sizeof(*extra_headers));
+ *extra_headers = (struct commit_extra_header){
+ .next = NULL,
+ .key = xstrdup("change-id"),
+ .value = xmemdupz(header_value, out_len),
+ .len = out_len
+ };
+ }
+ repo_unuse_commit_buffer(r, commit, buffer);
+ }
  res = do_commit(r, msg_file, author, reflog_action,
  opts, flags,
- commit? &commit->object.oid : NULL);
+ commit ? &commit->object.oid : NULL, extra_headers);
+ }
  else
  res = error(_("unable to parse commit author"));
  *check_todo = !!(flags & EDIT_MSG);
diff --git a/t/t3400-rebase.sh b/t/t3400-rebase.sh
index c0c00fbb7b..6b5d6fe56f 100755
--- a/t/t3400-rebase.sh
+++ b/t/t3400-rebase.sh
@@ -474,4 +474,24 @@ test_expect_success 'git rebase --update-ref with core.commentChar and branch on
  test_grep "% Ref refs/heads/topic2 checked out at" actual
 '

+test_expect_success 'rebase preserves change-id header' '
+ test_commit "source-for-rebase" file-rebase content-rebase &&
+ git cat-file commit HEAD >commit_obj &&
+ awk "/^committer / { print; print \"change-id my-change-id\"; next }1" commit_obj >commit_obj_mod &&
+ new_commit=$(git hash-object -t commit -w commit_obj_mod) &&
+ git branch -f source-branch $new_commit &&
+
+ git checkout -b target-branch HEAD^ &&
+ echo "unrelated" >file-unrelated &&
+ git add file-unrelated &&
+ git commit -m "unrelated" &&
+
+ git checkout source-branch &&
+ git rebase target-branch &&
+
+ git cat-file commit HEAD >result_obj &&
+ grep "^change-id my-change-id$" result_obj
+'
+
 test_done
+
diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index 8025a28cfd..0ada99f216 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -256,4 +256,19 @@ test_expect_success 'cherry-pick is unaware of --reference (for now)' '
  grep "^usage: git cherry-pick" actual
 '

+test_expect_success 'cherry-pick preserves change-id header' '
+ test_commit "source-for-cherry" file-cherry content-cherry &&
+ git cat-file commit HEAD >commit_obj &&
+ awk "/^committer / { print; print \"change-id my-change-id\"; next }1" commit_obj >commit_obj_mod &&
+ new_commit=$(git hash-object -t commit -w commit_obj_mod) &&
+ git branch -f source-branch $new_commit &&
+
+ git checkout -b target-branch HEAD^ &&
+ git cherry-pick source-branch &&
+
+ git cat-file commit HEAD >result_obj &&
+ grep "^change-id my-change-id$" result_obj
+'
+
 test_done
+
diff --git a/t/t7501-commit-basic-functionality.sh b/t/t7501-commit-basic-functionality.sh
index a37509f004..e25dd9dc6f 100755
--- a/t/t7501-commit-basic-functionality.sh
+++ b/t/t7501-commit-basic-functionality.sh
@@ -793,4 +793,19 @@ test_expect_success '--dry-run --short' '
  git commit --dry-run --short
 '

+test_expect_success 'amend preserves change-id header' '
+ test_commit "source-for-amend" file-amend content-amend &&
+ git cat-file commit HEAD >commit_obj &&
+ awk "/^committer / { print; print \"change-id my-change-id\"; next }1" commit_obj >commit_obj_mod &&
+ new_commit=$(git hash-object -t commit -w commit_obj_mod) &&
+ git reset --hard $new_commit &&
+
+ echo "amended content" >>file-amend &&
+ git add file-amend &&
+ git commit --amend --no-edit &&
+
+ git cat-file commit HEAD >result_obj &&
+ grep "^change-id my-change-id$" result_obj
+'
+
 test_done
-- 
2.53.0.1213.gd9a14994de-goog
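
For readers less familiar with the raw commit format, the header lookup the patch relies on can be sketched roughly as below. This is a simplified stand-in and not git's own find_commit_header(); the name sketch_find_header is hypothetical, and the sketch assumes that headers are single-line "key SP value" entries and that the header section ends at the first empty line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: look for "<key> <value>"
 * among the header lines of a raw commit object. Scanning stops at the
 * first empty line, so text in the commit message is never matched.
 * Returns a pointer to the value (not NUL-terminated) and stores its
 * length in *out_len, or returns NULL when the header is absent.
 */
const char *sketch_find_header(const char *buf, const char *key,
			       size_t *out_len)
{
	const char *line = buf;
	size_t key_len;

	assert(buf && key && out_len);
	key_len = strlen(key);

	while (*line && *line != '\n') {
		const char *eol = strchr(line, '\n');
		size_t line_len = eol ? (size_t)(eol - line) : strlen(line);

		if (line_len > key_len && !strncmp(line, key, key_len) &&
		    line[key_len] == ' ') {
			*out_len = line_len - key_len - 1;
			return line + key_len + 1;
		}
		if (!eol)
			break;
		line = eol + 1;
	}
	return NULL;
}
```

Called with the key "change-id" on a commit buffer shaped like the one the tests above construct, this would yield the value "my-change-id", which the patch then copies into a commit_extra_header entry.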

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  3:13 [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick Matt Stark
@ 2026-04-07  4:09 ` Junio C Hamano
  2026-04-07  4:58   ` Nico Williams
  2026-04-07  9:41   ` Phillip Wood
  2026-04-07 23:28 ` brian m. carlson
  1 sibling, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-04-07  4:09 UTC (permalink / raw)
  To: Matt Stark
  Cc: git, ps, phillip.wood, Martin von Zweigbergk, remo, Edwin Kempin,
	schacon, philipmetzger, konstantin, newren, tytso, nico,
	rikingcoding

Matt Stark <msta@google.com> writes:

> In the discussions on
> https://lore.kernel.org/git/Z_OGMb-1oV0Ex05e@pks.im/T/#m038be849b9b4020c16c562d810cf77bad91a2c87,
> it seems to be that:
> * There is consensus that a `change-id` header provides good value

I doubt it.

There are multiple people who wanted it, but as far as I can recall,
I did not get the sense that they had the same semantics in mind.

> * There is not consensus on what precise format that should take

Format is one thing, but what it means is much more important.  When
is it inherited?  What happens when you split a single commit into
three pieces, which piece, if any, among the resulting three will
inherit the parent's?  Should rebase, cherry-pick, and replay
behave the same way (IIRC, rebase and cherry-pick behave
differently while propagating notes)?  Etc., etc.




* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  4:09 ` Junio C Hamano
@ 2026-04-07  4:58   ` Nico Williams
  2026-04-07  5:02     ` Nico Williams
                       ` (2 more replies)
  2026-04-07  9:41   ` Phillip Wood
  1 sibling, 3 replies; 12+ messages in thread
From: Nico Williams @ 2026-04-07  4:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matt Stark, git, ps, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

On Mon, Apr 06, 2026 at 09:09:54PM -0700, Junio C Hamano wrote:
> Matt Stark <msta@google.com> writes:
> 
> > In the discussions on
> > https://lore.kernel.org/git/Z_OGMb-1oV0Ex05e@pks.im/T/#m038be849b9b4020c16c562d810cf77bad91a2c87,
> > it seems to be that:
> > * There is consensus that a `change-id` header provides good value
> 
> I doubt it.
> 
> There are multiple people who wanted it, but as far as I can recall,
> I did not get the sense that they had the same semantics in mind.

The less semantics it has, the more acceptable it might be :)
But then what would need patching?  So it needs _some_ semantics.

Finding the minimal acceptable semantics for this header is the trick to
pull off.

> > * There is not consensus on what precise format that should take
> 
> Format is one thing, but what it means is much more important.  When
> is it inherited?  What happens when you split a single commit into
> three pieces, which piece, if any, among the resulting three will
> inherit the parent's?  Should rebase, cherry-pick, and replay
> behave the same way (IIRC, rebase and cherry-pick behave
> differently while propagating notes)?  Etc., etc.

Exactly.  I remember I argued that cherry-pick and rebase should have
the same behavior given that rebase is logically a script of
cherry-picks, but others had strong arguments that the two should not
have the same behavior (something which is not hard to implement if you
make the inheritance / non-inheritance an option to cherry-pick that
has different defaults for cherry-pick than for rebase).

That the value of this header should not have a format imposed -- that
much is certainly the case as far as consensus goes, I think.  Basically
it should be site-local, for some definition of site.  But the tooling
can just treat it as opaque, perhaps with hooks to do any interpretation
of those values.

Maybe that's the trick: local configuration for determining the
copy-or-drop semantic for different operations, and maybe hooks for
altering when copying.  Thus for example splitting a commit (something
jj supports directly but Git doesn't, unless I missed something) could
derive or create new change-id values from the original using hooks.  A
hook might do things like create child or sibling problem tickets, or
might only qualify the original with some qualifier.  A hook might even
interact with the user to create new change-ids as needed.

The risk here is that this could yield too much configuration and be
more annoying than useful, but I think that wouldn't turn out to be the
case.

Nico
-- 


* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  4:58   ` Nico Williams
@ 2026-04-07  5:02     ` Nico Williams
  2026-04-07 14:33       ` Junio C Hamano
  2026-04-07  9:55     ` Phillip Wood
  2026-04-07 14:42     ` Junio C Hamano
  2 siblings, 1 reply; 12+ messages in thread
From: Nico Williams @ 2026-04-07  5:02 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matt Stark, git, ps, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

On Mon, Apr 06, 2026 at 11:58:19PM -0500, Nico Williams wrote:
> Maybe that's the trick: local configuration for determining the
> copy-or-drop semantic for different operations, and maybe hooks for
> altering when copying.  [...]

I should add that I would want an original-change-id header that could
be used (again, optionally) to relate commits that get cherry-picked or
rebased but end up having different change-ids.


* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  4:09 ` Junio C Hamano
  2026-04-07  4:58   ` Nico Williams
@ 2026-04-07  9:41   ` Phillip Wood
  1 sibling, 0 replies; 12+ messages in thread
From: Phillip Wood @ 2026-04-07  9:41 UTC (permalink / raw)
  To: Junio C Hamano, Matt Stark
  Cc: git, ps, phillip.wood, Martin von Zweigbergk, remo, Edwin Kempin,
	schacon, philipmetzger, konstantin, newren, tytso, nico,
	rikingcoding

On 07/04/2026 05:09, Junio C Hamano wrote:
> Matt Stark <msta@google.com> writes:
> 
>> In the discussions on
>> https://lore.kernel.org/git/Z_OGMb-1oV0Ex05e@pks.im/T/#m038be849b9b4020c16c562d810cf77bad91a2c87,
>> it seems to be that:
>> * There is consensus that a `change-id` header provides good value
> 
> I doubt it.
> 
> There are multiple people who wanted it, but as far as I can recall,
> I did not get the sense that they had the same semantics in mind.
> 
>> * There is not consensus on what precise format that should take
> 
> Format is one thing, but what it means is much more important.  When
> is it inherited?  What happens when you split a single commit into
> three pieces, which piece, if any, among the resulting three will
> inherit the parent's?  Should rebase, cherry-pick, and replay
> behave the same way (IIRC, rebase and cherry-pick behave
> differently while propagating notes)?  Etc., etc.

Indeed, copying the header is easy (though the patch does not support 
copying the header when the commit message is edited), but agreeing on 
the semantics seems to be much harder.

Thanks

Phillip




* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  4:58   ` Nico Williams
  2026-04-07  5:02     ` Nico Williams
@ 2026-04-07  9:55     ` Phillip Wood
  2026-04-07 15:52       ` Nico Williams
  2026-04-07 14:42     ` Junio C Hamano
  2 siblings, 1 reply; 12+ messages in thread
From: Phillip Wood @ 2026-04-07  9:55 UTC (permalink / raw)
  To: Nico Williams, Junio C Hamano
  Cc: Matt Stark, git, ps, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

On 07/04/2026 05:58, Nico Williams wrote:
> 
> Maybe that's the trick: local configuration for determining the
> copy-or-drop semantic for different operations, and maybe hooks for
> altering when copying.

I think the danger with making it configurable is that you cannot rely 
on the semantics because they vary between commits created by different 
authors. If we could get agreement on

  - Should cherry-pick copy the header?

  - What to do with the header when a commit is split. Three options
    spring to mind (1) create new change-ids for all the new commits (2)
    create new change-ids but also copy the old one (3) allow the user to
    specify which new commit should copy the existing change-id and
    create new change-ids for the other commits.

  - What to do when commits are squashed - should the new commit copy all
    of the change-ids? Should it have a new change-id?

Then I think it'd be much clearer what the implementation should do.

Thanks

Phillip
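
The three split options listed above can be written down as a small sketch, which may help when comparing them. Everything here is hypothetical: the enum, the struct, and assign_split_ids() are illustrative names, fresh change-ids are assumed to be supplied by the caller, and option (2) is read as "each piece gets a new change-id and also records the old one", which is only one possible interpretation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three options discussed in the message. */
enum split_policy {
	SPLIT_ALL_NEW,		/* (1) every piece gets a fresh change-id */
	SPLIT_NEW_KEEP_OLD,	/* (2) fresh change-id, old one recorded too */
	SPLIT_USER_PICKS_HEIR	/* (3) one chosen piece keeps the old change-id */
};

struct piece_ids {
	const char *change_id;		/* id the piece carries */
	const char *previous_id;	/* old id kept for reference, or NULL */
};

/*
 * Fill out[0..n-1] for the n pieces of a split commit. fresh_ids must
 * hold n caller-generated ids; heir is only consulted for option (3).
 */
void assign_split_ids(enum split_policy policy, const char *old_id,
		      const char *const *fresh_ids, size_t heir,
		      struct piece_ids *out, size_t n)
{
	size_t i;

	assert(old_id && fresh_ids && out);
	for (i = 0; i < n; i++) {
		out[i].change_id = fresh_ids[i];
		out[i].previous_id = NULL;
		if (policy == SPLIT_NEW_KEEP_OLD)
			out[i].previous_id = old_id;
		else if (policy == SPLIT_USER_PICKS_HEIR && i == heir)
			out[i].change_id = old_id;
	}
}
```

Seen this way, the options differ only in what each resulting commit records, which is the part that would need agreement before an implementation could be judged correct.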



* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  5:02     ` Nico Williams
@ 2026-04-07 14:33       ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-04-07 14:33 UTC (permalink / raw)
  To: Nico Williams
  Cc: Matt Stark, git, ps, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

Nico Williams <nico@cryptonector.com> writes:

> On Mon, Apr 06, 2026 at 11:58:19PM -0500, Nico Williams wrote:
>> Maybe that's the trick: local configuration for determining the
>> copy-or-drop semantic for different operations, and maybe hooks for
>> altering when copying.  [...]
>
> I should add that I would want an original-change-id header that could
> be used (again, optionally) to relate commits that get cherry-picked or
> rebased but end up having different change-ids.

With all these (possibly just slightly) different wants that
different projects may have, wouldn't it work to record this kind of
random piece of information either in notes (the benefit being that
it can be corrected without having to rewrite history) or in
trailers?



* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  4:58   ` Nico Williams
  2026-04-07  5:02     ` Nico Williams
  2026-04-07  9:55     ` Phillip Wood
@ 2026-04-07 14:42     ` Junio C Hamano
  2 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-04-07 14:42 UTC (permalink / raw)
  To: Nico Williams
  Cc: Matt Stark, git, ps, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

Nico Williams <nico@cryptonector.com> writes:

>> Format is one thing, but what it means is much more important.  When
>> is it inherited?  What happens when you split a single commit into
>> three pieces, which piece, if any, among the resulting three will
>> inherit the parent's?  Should rebase, cherry-pick, and replay
>> behave the same way (IIRC, rebase and cherry-pick behave
>> differently while propagating notes)?  Etc., etc.
>
> Exactly.  I remember I argued that cherry-pick and rebase should have
> the same behavior given that rebase is logically a script of
> cherry-picks, but others had strong arguments that the two should not
> have the same behavior (something which is not hard to implement if you
> make the inheritance / non-inheritance an option to cherry-pick that
> has different defaults for cherry-pick than for rebase).

Yes.  Even though I often feel irritated when I use cherry-pick and
see the "amlog" note not propagate when I should have used rebase, I
think it makes sense to allow cherry-pick and rebase to behave
differently.  This is because rebase is a rewriting operation, where
the old incarnation of the topic is discarded (other than that it
can be resurrected from the reflog of the branch for the topic) and
only the new incarnation will stay in the history, while cherry-pick
is a duplicating operation, where the new copy is an adaptation of
the original commit into a different context and both of them will
stay in the history serving different purposes.

> That the value of this header should not have a format imposed -- that
> much is certainly the case as far as consensus goes, I think.  Basically
> it should be site-local, for some definition of site.  But the tooling
> can just treat it as opaque, perhaps with hooks to do any interpretation
> of those values.

And there is nothing to prevent us from doing all of the above (and
more) with trailers.  The existing interpret-trailers mechanism may
be lacking, but hopefully it gives enough framework to build on top
to allow projects to customize what they want them to mean and how
they behave.


* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  9:55     ` Phillip Wood
@ 2026-04-07 15:52       ` Nico Williams
  2026-04-07 16:20         ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Nico Williams @ 2026-04-07 15:52 UTC (permalink / raw)
  To: phillip.wood
  Cc: Junio C Hamano, Matt Stark, git, ps, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

On Tue, Apr 07, 2026 at 10:55:00AM +0100, Phillip Wood wrote:
> On 07/04/2026 05:58, Nico Williams wrote:
> > 
> > Maybe that's the trick: local configuration for determining the
> > copy-or-drop semantic for different operations, and maybe hooks for
> > altering when copying.
> 
> I think the danger with making it configurable is that you cannot rely on
> the semantics because they vary between commits created by different
> authors. [...]

Well, I said "site-local" and "for some definition of site", and the one
I had in mind is that the upstream provides this [default] configuration
for clones.  Sure, authors could override this locally, but presumably
they wouldn't, and presumably upstreams would check for adherence to
their rules.

>   [...]. If we could get agreement on

That's proven difficult to do.

Nico
-- 


* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07 15:52       ` Nico Williams
@ 2026-04-07 16:20         ` Junio C Hamano
  2026-04-07 20:13           ` Nico Williams
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2026-04-07 16:20 UTC (permalink / raw)
  To: Nico Williams
  Cc: phillip.wood, Matt Stark, git, ps, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

Nico Williams <nico@cryptonector.com> writes:

> On Tue, Apr 07, 2026 at 10:55:00AM +0100, Phillip Wood wrote:
>> On 07/04/2026 05:58, Nico Williams wrote:
>> > 
>> > Maybe that's the trick: local configuration for determining the
>> > copy-or-drop semantic for different operations, and maybe hooks for
>> > altering when copying.
>> 
>> I think the danger with making it configurable is that you cannot rely on
>> the semantics because they vary between commits created by different
>> authors. [...]
>
> Well, I said "site-local" and "for some definition of site", and the one
> I had in mind is that the upstream provides this [default] configuration
> for clones.  Sure, authors could override this locally, but presumably
> they wouldn't, and presumably upstreams would check for adherence to
> their rules.

This does sound quite sensible.  What you called "site", I called
"project" in my earlier responses.

Some projects do already check that the changes are signed off with
the "Signed-off-by" trailers.  If change-id or original-change-id or
whatnot are deemed essential to a project, and are expected to be
formatted in certain ways, the project will certainly validate them.

None of that requires us to hide this information in the commit
object header, by the way.  And indeed, it is easier to validate
what is in the "git log" output (where optional header elements like
"encoding" are not shown).




* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07 16:20         ` Junio C Hamano
@ 2026-04-07 20:13           ` Nico Williams
  0 siblings, 0 replies; 12+ messages in thread
From: Nico Williams @ 2026-04-07 20:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: phillip.wood, Matt Stark, git, ps, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	rikingcoding

On Tue, Apr 07, 2026 at 09:20:35AM -0700, Junio C Hamano wrote:
> Nico Williams <nico@cryptonector.com> writes:
> > Well, I said "site-local" and "for some definition of site", and the one
> > I had in mind is that the upstream provides this [default] configuration
> > for clones.  Sure, authors could override this locally, but presumably
> > they wouldn't, and presumably upstreams would check for adherence to
> > their rules.
> 
> This does sound quite sensible.  What you called "site", I called
> "project" in my earlier responses.
> 
> Some projects do already check that the changes are signed off with
> the "Signed-off-by" trailers.  If change-id or original-change-id or
> whatnot are deemed essential to a project, and are expected to be
> formatted in certain ways, the project will certainly validate them.

Cool!  Maybe we can achieve consensus.  Here's a strawman:

 - upstreams publish (where?) a set of policies for

    - change-id
    - original-change-id

   regarding:

    - commit splits
    - commit squashes
    - cherry-picks
    - rebases

 - these policies should reference named hooks that have to be locally
   installed in the clone (that way the upstream can't just cause
   arbitrary remote execution clone-side) -- hooks that can transform
   change IDs

We should probably also have options for cherry-pick and rebase that a
user can use to provide useful context such as "this is a backport to
...", or "this is for <ticket>" (adds change-id).

Hooks could do things like create child tickets, etc.

Punting all semantics to hooks and upstream policies leaves only generic
things to decide, namely: what operations call what hooks.  And that
should leave us nothing to argue passionately over.
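
To make the strawman a little more concrete, here is a minimal sketch of what "operations call hooks according to a published policy" might look like. All names (rewrite_op, id_action, id_policy, apply_id_policy, qualify_hook) are hypothetical, the in-process function pointer merely stands in for a locally installed hook, and nothing here corresponds to an existing git interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum rewrite_op { OP_SPLIT, OP_SQUASH, OP_CHERRY_PICK, OP_REBASE, OP_COUNT };
enum id_action { ID_DROP, ID_COPY, ID_HOOK };

/* Stand-in for a locally installed hook; returns 0 on success. */
typedef int (*id_hook_fn)(const char *old_id, char *out, size_t out_size);

struct id_policy {
	enum id_action action[OP_COUNT];	/* one entry per operation */
	id_hook_fn hook;			/* used when action is ID_HOOK */
};

/* Example hook: qualify the original id instead of replacing it. */
int qualify_hook(const char *old_id, char *out, size_t out_size)
{
	int n = snprintf(out, out_size, "%s/backport", old_id);

	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/*
 * Returns 1 when the rewritten commit carries an id (written to out),
 * 0 when the policy drops it, and -1 on error.
 */
int apply_id_policy(const struct id_policy *p, enum rewrite_op op,
		    const char *old_id, char *out, size_t out_size)
{
	int n;

	assert(p && old_id && out && op < OP_COUNT);
	switch (p->action[op]) {
	case ID_DROP:
		return 0;
	case ID_COPY:
		n = snprintf(out, out_size, "%s", old_id);
		return (n < 0 || (size_t)n >= out_size) ? -1 : 1;
	case ID_HOOK:
		if (!p->hook || p->hook(old_id, out, out_size))
			return -1;
		return 1;
	}
	return -1;
}
```

In this shape the only generic decision left is which operation consults which table entry; what a hook does with the value stays a matter for the upstream's policy.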

> None of that requires us to hide this information in the commit
> object header, by the way.  And indeed, it is easier to validate
> what is in the "git log" output (where optional header elements like
> "encoding" are not shown).

Yes, for sure, this could just be commit message formatting practices
enforced by hooks.  In this case there should be a hook for extracting
change ID(s) from a commit message.
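
As a rough illustration of such an extraction step, the sketch below pulls a "Key: value" line out of the last paragraph of a commit message, in the style of the Change-Id trailer that Gerrit uses. It is a deliberately simplified stand-in and not git's trailer parser: the function name is hypothetical, key matching is case-sensitive, and the trailer block is assumed to be the final paragraph of a message that has at least one blank line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: find "<key>: <value>" in
 * the last paragraph of msg. Returns a pointer to the value (not
 * NUL-terminated) and stores its length in *out_len, or returns NULL.
 */
const char *sketch_find_trailer(const char *msg, const char *key,
				size_t *out_len)
{
	const char *block = NULL, *p, *line;
	size_t key_len;

	assert(msg && key && out_len);
	key_len = strlen(key);

	/* Remember the start of the last non-empty paragraph. */
	for (p = strstr(msg, "\n\n"); p; p = strstr(p + 1, "\n\n")) {
		const char *next = p + strspn(p, "\n");

		if (*next)
			block = next;
	}
	if (!block)
		return NULL;	/* subject only: no trailer block */

	for (line = block; *line; ) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len > key_len + 1 && !strncmp(line, key, key_len) &&
		    line[key_len] == ':' && line[key_len + 1] == ' ') {
			*out_len = len - key_len - 2;
			return line + key_len + 2;
		}
		if (!eol)
			break;
		line = eol + 1;
	}
	return NULL;
}
```

A project-specific hook could apply its own format check to the returned value, which matches the point made earlier in the thread that validating what is visible in the message is easier than validating hidden headers.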

Nico
-- 


* Re: [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick.
  2026-04-07  3:13 [PATCH] headers: Preserve 'change-id' header in rebase / cherry-pick Matt Stark
  2026-04-07  4:09 ` Junio C Hamano
@ 2026-04-07 23:28 ` brian m. carlson
  1 sibling, 0 replies; 12+ messages in thread
From: brian m. carlson @ 2026-04-07 23:28 UTC (permalink / raw)
  To: Matt Stark
  Cc: git, ps, gitster, phillip.wood, Martin von Zweigbergk, remo,
	Edwin Kempin, schacon, philipmetzger, konstantin, newren, tytso,
	nico, rikingcoding

On 2026-04-07 at 03:13:18, Matt Stark wrote:
> In the discussions on
> https://lore.kernel.org/git/Z_OGMb-1oV0Ex05e@pks.im/T/#m038be849b9b4020c16c562d810cf77bad91a2c87,
> it seems to be that:
> * There is consensus that a `change-id` header provides good value

I'm not sure I agree.

Absent some well-defined documentation describing what it means, I don't
see how it could provide good value.  It sounds like you're saying a
persistent commit ID is generally useful, but I don't see the value and
I associate persistent IDs with online tracking and advertisements,
which are neither useful to me nor particularly ethical.  Since nobody
has explained the compelling reasons in documentation, I am left to
speculate on them myself and have come up empty.

> * There is not consensus on what precise format that should take

I think stabilizing this before a format is defined is a mistake.

Even if, for the sake of argument, we agree that this is a generally
useful thing to have, we'd want to have a standard format (ideally
produced in a deterministic way for the reproducibility of the testsuite
and downstream projects), which we don't have, before we persist this.
We would probably want to have `git fsck` verify that the format is
correct and this is not being used as a way to store random information
as part of the initial change.  I assure you that users will very much
try to shovel random, arbitrary, malformed information in there
otherwise, since I've seen this in the author and committer headers[0].

> This commit, rather than attempting to standardize the format, simply
> preserves the change-id header in whatever format it used previously.
> 
> If we so choose, we can later decide on a standardized format, but since
> git only preserves existing headers, this should not create backwards
> incompatibility.

As I mentioned before in other threads, this needs to be off by default
or configurable.  This kind of ID provides tracking of commits, which is
useful in some situations but may also be undesirable for privacy or
other reasons.  Unlike other headers in commits, it is not easily
visible (one can easily tell if a commit is signed, for instance, or
what its tree is) and so therefore has potential privacy implications.

This is especially true since historically a great deal of information
has been automatically rewritten when rebasing or cherry-picking
(leaving only author and message alone), so users will have come to
expect this.

This is also a great way to leak information, such as secret keys.  I
can shovel sensitive keys or IDs into a commit (in a possibly encrypted
form), push them somewhere I have access to, and then exploit them.
Nobody will ever notice since corporate firewalls don't actually see the
raw object information, only the compressed and deltified packfile.  I
can even have my colleague rebase my commit with --reset-author and push
it so I have plausible deniability.

As an example of a problematic situation, say user A creates a commit
and publishes it somewhere on a remote.  It doesn't get picked up into
the main branch.  A year later, user A changes their name (because they
transition, marry, acquire a new citizenship[1], or for any other good
and valuable reason) and suddenly goes by the name B.  Six months later,
they rebase the patch on the current main branch and, because the
project has advanced quite a bit, it looks completely different (so `git
cherry` will no longer identify it in any meaningful way).  They adjust
the message substantially due to the change and sign it off as user B
and submit it.

The user in this case may not have wanted the two commits to be
associated (especially so if they transitioned), so this poses a
substantial risk of unintended disclosure.  The fact that Git makes this
a problem already is not a good excuse for making it worse here; to the
contrary, we should be making the situation better, not piling on.

[0] For instance, some people want to provide timestamps that are larger
than 2^64, despite the fact that it is remarkably unlikely that humans
will still exist 5×10^11 years in the future, let alone that Git will
still be in use.  Unsurprisingly, most programming languages don't
appreciate these timestamps, so problems ensue.
[1] Some countries require that citizens have a name which can decline
grammatically in the native language or otherwise meets linguistic or
cultural norms in that country.
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA



end of thread, other threads:[~2026-04-07 23:28 UTC | newest]
