From mboxrd@z Thu Jan 1 00:00:00 1970
X-Mailing-List: git@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Date: Wed, 16 Apr 2025 00:30:22 +0200
Message-Id:
Cc: "Nico Williams", "Martin von Zweigbergk", "Git Mailing List",
 "Edwin Kempin", "Scott Chacon", "philipmetzger@bluewin.ch"
Subject: Re: Semantics of change IDs (Re: Gerrit, GitButler, and Jujutsu
 projects collaborating on change-id commit footer)
From: "Remo Senekowitsch"
To: "Junio C Hamano", "Theodore Ts'o"
References: <20250408125521.GA17892@mit.edu> <20250409121924.GA148735@mit.edu>
 <20250410134426.GB13132@mit.edu> <20250411154839.GC648081@mit.edu>
 <20250412231318.GG13132@mit.edu>
In-Reply-To:

On Mon Apr 14, 2025 at 5:13 PM CEST, Junio C Hamano wrote:
> "Theodore Ts'o" writes:
>
>> On Fri, Apr 11, 2025 at 10:44:43AM -0700, Junio C Hamano wrote:
>>>
>>> The submitting contributor must make a conscious arrangement to give
>>> a "patch set ID" shared among the messages in a single iteration,
>>> and everybody who is responding must make sure they do not add the
>>> same ID to the messages they throw at the thread in response. Those
>>> who use format-patch and send-email can do that with convention and
>>> automation and there is no reason to rely on In-Reply-To: header
>>> (which may confuse the automated recipient of manually created
>>> follow-up messages).
>>
>> So it all depends on how the patch set ID is implemented. Here's one
>> way that I had in mind.
>> The reason why I like this over the
>> Change-ID approach is that the semantics can be very clearly defined,
>> and the only thing we rely on is the user saying "this new commit is
>> part of patch series which I'm putting together".
>>
>> By default when creating a new commit, the field is empty (in which
>> case the patch set ID is presumed to be the same as the commit ID), or
>> if the user gives a command-line flag say, "git commit --series"
>> which indicates that it is part of a patch series in which case the
>> patch set ID of the commit is set to the patch set ID of the current
>> commit (i.e., eventually, its parent commit).
>>
>> Whenever the commit is amended or rebased or cherry picked, if the
>> patch series ID is NULL, then it is set to the original commit ID.
>> Otherwise, the existing patch set ID is preserved.
>>
>> The patch set ID will be output by git format-patch (perhaps as "Patch
>> Series ID: sha hash" immediately after the --- line). And if it is
>> present, "git am" will import that patch series ID into the git commit
>> which it creates when it sucks in the e-mail.
>>
>> The net effect of this is that for new versions of git which implement
>> the Patch Set ID, all new commits are treated as patch series of
>> length 1, unless a subsequent commit is created using "git commit
>> --series". And the Patch Set ID will be preserved across
>> cherry-picks, rebase operations, and git send-email/git apply-message
>> operations.
>>
>> So if someone replies to an existing e-mail thread with a new commit,
>> git format-patch will give it a different patch set ID, so we can
>> distinguish it from an amended copy of a patch in the patch series.
>>
>> It also means that for singleton commits, the patch set ID effectively
>> acts much like the traditional Change-ID. For multi-commit patch series,
>> all of the commits will have the same patch set ID.
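
To check that I read the rules above correctly, here is a minimal
sketch of them in Python. Everything in it is an assumption for
illustration: the names Commit, commit() and rewrite() are made up,
nothing like "git commit --series" exists in Git today, and amend,
rebase and cherry-pick are collapsed into a single rewrite() step.

```python
import itertools
from dataclasses import dataclass
from typing import Optional

_ids = itertools.count(1)  # stand-in for real commit hashes


def _new_id() -> str:
    return f"c{next(_ids)}"


@dataclass(frozen=True)
class Commit:
    commit_id: str
    patch_set_id: Optional[str] = None  # empty by default

    def effective_patch_set_id(self) -> str:
        # An empty field is presumed to mean "same as the commit ID".
        return self.patch_set_id or self.commit_id


def commit(parent: Optional[Commit], series: bool = False) -> Commit:
    """Hypothetical model of "git commit" and "git commit --series"."""
    if series and parent is not None:
        # --series: inherit the patch set ID of the parent commit.
        return Commit(_new_id(), parent.effective_patch_set_id())
    return Commit(_new_id())


def rewrite(original: Commit) -> Commit:
    """Hypothetical model of amend, rebase and cherry-pick."""
    # An empty ID becomes the original commit ID, otherwise it is kept.
    return Commit(_new_id(), original.effective_patch_set_id())


series_a_first = commit(None)
series_a_second = commit(series_a_first, series=True)
series_b_first = commit(None)
# Cherry-pick a commit of series A on top of series B.
picked = rewrite(series_a_second)
```

Under this reading, picked keeps the patch set ID of series A even
though it now sits on top of series B.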
>
> Yeah, I like that aspect the best---the case for single commit
> series falling out as a natural degenerate case of the more general
> case to support multi-commit series is a good sign that the design
> got something right ;-)
>
> I am still not sure what to think about the lack of explicit
> evolution history of one patch set that shares the same patch set ID.
>
> When we have 10 commits that share the same patch set ID, I can
> imagine that we can easily tell 3 are from one iteration, and 3 and
> 4 among the rest are from another two iterations by noticing that
> there are three strands of pearls, having 3, 3, and 4 commits on them.
> And we can identify the initial round by noticing that one of the
> commits has its name as the patch set ID, but I am not sure if we
> should be OK by not having anything but the committer timestamp to
> tell which one among the other two iterations is earlier, and we
> cannot tell anything about these two other iterations if they are
> independent rewrites of the original round.
>
> But other than that, I like something with clearly defined semantics
> (and the definition coming naturally out of the structure, not out
> of some arbitrary convention that forces us to bring in some
> semantics), and what you outlined above looks reasonably clean and
> easy to use.

Doesn't a patch set ID suffer from the same kind of ambiguity the
change-id supposedly does? Patch sets can be split and merged, and a
commit from one patch set can be cherry-picked into another. What patch
set ID should such a cherry-picked commit have?

And I think the argument that a change-id for a singleton patch set
naturally falls out of the patch set ID can easily be reversed.
Admittedly, I don't have the most experience with the mailing list
workflow, but a multi-commit patch set usually comes with a cover
letter, right? And people like to track their cover letter in a commit?
IIUC, b4 is designed around that too.
In that case, the cover letter has its own change-id like any other
commit, which will naturally remain stable across every version of the
patch set. It would be nonsensical to squash, split or cherry-pick the
cover letter commit. Sounds like a great candidate for the patch set ID.
So the patch set ID can just as naturally flow out of the change-id.

I can see two concrete disadvantages of the patch set ID:

* It's strictly less powerful. As explained, the change-id can do
  everything the patch set ID can via the cover letter. But the patch
  set ID cannot help you track how individual commits within the patch
  set evolved.

* It's more complicated. While many Git users work with patch sets
  every day, it's not a concept in Git itself. Git only knows about
  commits. The patch set ID would introduce a new concept into Git
  unnecessarily, while the change-id naturally extends the language Git
  already speaks, that of commits.

Remo
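
P.S. A minimal sketch of the cover letter argument, in Python. It
assumes that every commit carries a change-id and that the first commit
of a series is the cover letter commit; the Commit type, the helper
names and the sample IDs are all made up for illustration and are not
taken from Git, Gerrit, Jujutsu or b4.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    change_id: str  # assumed to stay stable across rewrites
    subject: str


def patch_set_id(series: List[Commit]) -> str:
    # Assumption: series[0] is the cover letter commit.
    return series[0].change_id


def evolution(
    old: List[Commit], new: List[Commit]
) -> List[Tuple[Optional[Commit], Commit]]:
    """Pair each commit of the new version with its predecessor, if any."""
    old_by_id = {c.change_id: c for c in old}
    return [(old_by_id.get(c.change_id), c) for c in new]


v1 = [
    Commit("k1", "cover letter"),
    Commit("k2", "add parser"),
    Commit("k3", "add tests"),
]
# v2 reorders the two patches, rewords one and adds a new one.
v2 = [
    Commit("k1", "cover letter"),
    Commit("k3", "add tests"),
    Commit("k2", "add parser, fix leak"),
    Commit("k4", "add docs"),
]

pairs = evolution(v1, v2)
```

The cover letter's change-id identifies the series in both versions,
and the same field pairs up the individual commits. With a shared patch
set ID alone, all commits of the series carry the same value, so there
is nothing to pair them by.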