From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from michel.telenet-ops.be (michel.telenet-ops.be [195.130.137.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DEAD321D591 for ; Thu, 5 Dec 2024 18:16:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.130.137.88 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733422616; cv=none; b=oG/otEuIATbd8HvLkWK+1CwkmVIIWRGLczG6GptB5j3rGd+znN5N6VEC1wINs15FziBhXp+KbXt1PLZuqah1K6SL80qIQnFHYi7JsqU9DlgzlxihNrKB4JGIKSNBwgMgxor+dSIMvLInEi+pvLt3QxX4dl+JVMXWQYaCkr9ogko= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733422616; c=relaxed/simple; bh=tOE7FoZbj2of2iSrM9k6O7Bi5zwn703dH6gxug1gaGk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cXr06Co7TihqP/EtYOuCUyqo9GaFYoFWj9v0EJqc/W3XJ0mBpjIqeCjCyG52L2AN9l5WsYcixZ3muwKgIFHaozO9xL2bgaCi8YraHS9mzOSBY7j4SBiJJsOk33ntRtYW2J7G7YjaKBlw5+gpYVM24jO3g79QIXEh0Y0e0hwgtKY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=glider.be; spf=none smtp.mailfrom=linux-m68k.org; arc=none smtp.client-ip=195.130.137.88 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=glider.be Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux-m68k.org Received: from ramsan.of.borg ([IPv6:2a02:1810:ac12:ed80:b16a:6561:fa1:2b32]) by michel.telenet-ops.be with cmsmtp id l6Gf2D00D3EEtj2066Gf3k; Thu, 05 Dec 2024 19:16:46 +0100 Received: from rox.of.borg ([192.168.97.57]) by ramsan.of.borg with esmtp (Exim 4.95) (envelope-from ) id 1tJGP0-000LwV-9e; Thu, 05 Dec 2024 19:16:39 +0100 Received: from geert by rox.of.borg with local (Exim 4.95) (envelope-from ) id 1tJGP1-00EUTT-16; Thu, 05 Dec 2024 19:16:39 +0100 From: Geert Uytterhoeven To: Dwaipayan Ray , Lukas Bulwahn , Joe 
Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, =?UTF-8?q?Niklas=20S=C3=B6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Geert Uytterhoeven <geert+renesas@glider.be>
Subject: [PATCH v2 1/2] Align git commit ID abbreviation guidelines and checks
Date: Thu, 5 Dec 2024 19:16:34 +0100
Message-Id: <1c244040bf6ce304656e31036e5178b4b9dfb719.1733421037.git.geert+renesas@glider.be>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <cover.1733421037.git.geert+renesas@glider.be>
References: <cover.1733421037.git.geert+renesas@glider.be>
Precedence: bulk
X-Mailing-List: linux-doc@vger.kernel.org
List-Id: <linux-doc.vger.kernel.org>
List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The guidelines for git commit ID abbreviation are inconsistent: some places say to use exactly 12 characters, while other places recommend 12 characters or more. The same issue is present in the checkpatch.pl script.

E.g. Documentation/dev-tools/checkpatch.rst says:

    **GIT_COMMIT_ID**
      The proper way to reference a commit id is:
      commit <12+ chars of sha1> ("<title line>")

However, scripts/checkpatch.pl has two different checks: one warning check accepting exactly 12 characters:

    # Check Fixes: styles is correct
    Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")'

and a second error check accepting 12-40 characters:

    # Check for git id commit length and improperly formed commit descriptions
    # A correctly formed commit description is:
    #    commit <SHA-1 hash length 12+ chars> ("Complete commit subject")
    Please use git commit description style 'commit <12+ chars of sha1>

Hence patches containing commit IDs with more than 12 characters are flagged by checkpatch, and sometimes rejected by maintainers or reviewers. This is becoming more important with the growth of the repository, as git may decide to use more characters in case of local conflicts.

Fix this by settling on at least 12 characters, in both the documentation and in the checkpatch.pl script.
Fixes: bd17e036b495bebb ("checkpatch: warn for non-standard fixes tag style") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> --- v2: - Rebase on top of commit 2f07b652384969f5 ("checkpatch: always parse orig_commit in fixes tag") in v6.13-rc1, - Update documentation, too. --- Documentation/process/maintainer-tip.rst | 2 +- Documentation/process/submitting-patches.rst | 8 ++++---- scripts/checkpatch.pl | 4 ++-- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/process/maintainer-tip.rst b/Documentation/process/maintainer-tip.rst index e374b67b3277ac54..41d5855700cd4f83 100644 --- a/Documentation/process/maintainer-tip.rst +++ b/Documentation/process/maintainer-tip.rst @@ -270,7 +270,7 @@ Ordering of commit tags To have a uniform view of the commit tags, the tip maintainers use the following tag ordering scheme: - - Fixes: 12char-SHA1 ("sub/sys: Original subject line") + - Fixes: 12+char-SHA1 ("sub/sys: Original subject line") A Fixes tag should be added even for changes which do not need to be backported to stable kernels, i.e. when addressing a recently introduced diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst index 1518bd57adab501f..f3508b5aa4ebab96 100644 --- a/Documentation/process/submitting-patches.rst +++ b/Documentation/process/submitting-patches.rst @@ -143,10 +143,10 @@ also track such tags and take certain actions. Private bug trackers and invalid URLs are forbidden. If your patch fixes a bug in a specific commit, e.g. you found an issue using -``git bisect``, please use the 'Fixes:' tag with the first 12 characters of -the SHA-1 ID, and the one line summary. Do not split the tag across multiple -lines, tags are exempt from the "wrap at 75 columns" rule in order to simplify -parsing scripts. For example:: +``git bisect``, please use the 'Fixes:' tag with at least the first 12 +characters of the SHA-1 ID, and the one line summary. 
Do not split the tag +across multiple lines, tags are exempt from the "wrap at 75 columns" rule in +order to simplify parsing scripts. For example:: Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed") diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index dbb9c3c6fe30f906..5b57f0306a50046d 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -3230,7 +3230,7 @@ sub process { my $tag_case = not ($tag eq "Fixes:"); my $tag_space = not ($line =~ /^fixes:? [0-9a-f]{5,40} ($balanced_parens)/i); - my $id_length = not ($orig_commit =~ /^[0-9a-f]{12}$/i); + my $id_length = not ($orig_commit =~ /^[0-9a-f]{12,40}$/i); my $id_case = not ($orig_commit !~ /[A-F]/); my $id = "0123456789ab"; @@ -3240,7 +3240,7 @@ sub process { if ($ctitle ne $title || $tag_case || $tag_space || $id_length || $id_case || !$title_has_quotes) { if (WARN("BAD_FIXES_TAG", - "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) && + "Please use correct Fixes: style 'Fixes: <12+ chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . 
$herecurr) && $fix) { $fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")"; } -- 2.34.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from albert.telenet-ops.be (albert.telenet-ops.be [195.130.137.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA8C1221461 for <linux-doc@vger.kernel.org>; Thu, 5 Dec 2024 18:16:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.130.137.90 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733422621; cv=none; b=V3OYkrUZeZen8WJgpx3HMBjicwGwPlmBq89dmMFaVDnSDlYy0f2FBfaW2UTqFJQ8Txkr3AhK5YMc36k1yWN1EMAY3OInJi+SGkWVfW0VuObaKfeYYhTSDOv6eb7KpFG9m+l3apuPJ6jeevdm11WMLuzS3tX1zZ8kwcDINsda0sM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733422621; c=relaxed/simple; bh=war1YBA12ktdBAB8QK/R71NCwgWsaH853s9u8/FO7Ug=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mRWwtPtggbp0mIaNBZ8+FeQ1NC6W0ApgzAMqcA1X+i5z2QqaKPjzZf0tzVrtaPx3ubhm6CXz/3JH5DmhMXNa2Ops3HnduwOB+PnnF5pRJJuSBUPrtrKmjX67/0tmXsZ+AJIp9ZSn0hEL9jrxZ/+pEghsDFyOeMFKH8vbQLb3uNw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=glider.be; spf=none smtp.mailfrom=linux-m68k.org; arc=none smtp.client-ip=195.130.137.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=glider.be Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux-m68k.org Received: from ramsan.of.borg ([IPv6:2a02:1810:ac12:ed80:b16a:6561:fa1:2b32]) by albert.telenet-ops.be with cmsmtp id l6Gf2D00C3EEtj2066GftB; Thu, 05 Dec 2024 19:16:47 +0100 Received: from rox.of.borg ([192.168.97.57]) by ramsan.of.borg with esmtp (Exim 4.95) (envelope-from <geert@linux-m68k.org>) id 1tJGP0-000LwZ-AJ; Thu, 05 Dec 2024 19:16:39 +0100 Received: from geert by rox.of.borg with local (Exim 4.95) 
(envelope-from <geert@linux-m68k.org>) id 1tJGP1-00EUTW-1t; Thu, 05 Dec 2024 19:16:39 +0100
From: Geert Uytterhoeven <geert+renesas@glider.be>
To: Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, =?UTF-8?q?Niklas=20S=C3=B6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Geert Uytterhoeven <geert+renesas@glider.be>
Subject: [PATCH v2 2/2] Increase minimum git commit ID abbreviation to 16 characters
Date: Thu, 5 Dec 2024 19:16:35 +0100
Message-Id: <46b320b91b8d86fade3c1b1c72ef94da85b45d0d.1733421037.git.geert+renesas@glider.be>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <cover.1733421037.git.geert+renesas@glider.be>
References: <cover.1733421037.git.geert+renesas@glider.be>
Precedence: bulk
X-Mailing-List: linux-doc@vger.kernel.org
List-Id: <linux-doc.vger.kernel.org>
List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

As of v6.13-rc1, a git repository with all upstream and stable versions of the Linux kernel sources contains more than 13 million objects. Local development trees contain many more, approaching or surpassing sqrt(16^12) = 16777216 objects. Hence according to the Birthday Paradox, collisions of 12-character git commit IDs are imminent, or already happening. Indeed, patches with 13-character Fixes-tags have already been seen in the wild, due to git automatically increasing the size of the abbreviation when needed.
Note that this need is based on the current repository of the creator, not on the (future) repository of the receiver. Decrease the probability of collisions by increasing the recommended minimum number of characters from 12 to 16. Update the guidelines and the examples in the documentation, and all checks in the checkpatch.pl script. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> --- v2: - New. --- Documentation/dev-tools/checkpatch.rst | 2 +- Documentation/doc-guide/sphinx.rst | 4 +-- Documentation/process/5.Posting.rst | 2 +- .../process/handling-regressions.rst | 6 ++-- Documentation/process/maintainer-tip.rst | 6 ++-- Documentation/process/submitting-patches.rst | 12 ++++---- scripts/checkpatch.pl | 28 +++++++++---------- 7 files changed, 30 insertions(+), 30 deletions(-) diff --git a/Documentation/dev-tools/checkpatch.rst b/Documentation/dev-tools/checkpatch.rst index abb3ff6820766ee0..592812c6eabad43a 100644 --- a/Documentation/dev-tools/checkpatch.rst +++ b/Documentation/dev-tools/checkpatch.rst @@ -599,7 +599,7 @@ Commit message **GIT_COMMIT_ID** The proper way to reference a commit id is: - commit <12+ chars of sha1> ("<title line>") + commit <16+ chars of sha1> ("<title line>") An example may be:: diff --git a/Documentation/doc-guide/sphinx.rst b/Documentation/doc-guide/sphinx.rst index 8081ebfe48bc045f..7e4ea14686e107be 100644 --- a/Documentation/doc-guide/sphinx.rst +++ b/Documentation/doc-guide/sphinx.rst @@ -441,8 +441,8 @@ Referencing commits References to git commits are automatically hyperlinked given that they are written in one of these formats:: - commit 72bf4f1767f0 - commit 72bf4f1767f0 ("net: do not leave an empty skb in write queue") + commit 72bf4f1767f03869 + commit 72bf4f1767f03869 ("net: do not leave an empty skb in write queue") .. 
_sphinx_kfigure: diff --git a/Documentation/process/5.Posting.rst b/Documentation/process/5.Posting.rst index b3eff03ea2491c88..d396d0f051b9c1e3 100644 --- a/Documentation/process/5.Posting.rst +++ b/Documentation/process/5.Posting.rst @@ -199,7 +199,7 @@ document; what follows here is a brief summary. One tag is used to refer to earlier commits which introduced problems fixed by the patch:: - Fixes: 1f2e3d4c5b6a ("The first line of the commit specified by the first 12 characters of its SHA-1 ID") + Fixes: 1f2e3d4c5b6a7980 ("The first line of the commit specified by the first 16 characters of its SHA-1 ID") Another tag is used for linking web pages with additional backgrounds or details, for example an earlier discussion which leads to the patch or a diff --git a/Documentation/process/handling-regressions.rst b/Documentation/process/handling-regressions.rst index 1f5ab49c48a480d2..cde4c663b379f567 100644 --- a/Documentation/process/handling-regressions.rst +++ b/Documentation/process/handling-regressions.rst @@ -31,7 +31,7 @@ The important bits (aka "The TL;DR") list in CC) containing a paragraph like the following, which tells regzbot when the issue started to happen:: - #regzbot ^introduced: 1f2e3d4c5b6a + #regzbot ^introduced: 1f2e3d4c5b6a7980 * When forwarding reports from a bug tracker to the regressions list (see above), include a paragraph like the following:: @@ -82,7 +82,7 @@ When doing either, consider making the Linux kernel regression tracking bot "regzbot" immediately start tracking the issue: * For mailed reports, check if the reporter included a "regzbot command" like - ``#regzbot introduced: 1f2e3d4c5b6a``. If not, send a reply (with the + ``#regzbot introduced: 1f2e3d4c5b6a7980``. 
If not, send a reply (with the regressions list in CC) with a paragraph like the following::: #regzbot ^introduced: v5.13..v5.14-rc1 @@ -100,7 +100,7 @@ When doing either, consider making the Linux kernel regression tracking bot * When forwarding a regression reported to a bug tracker, include a paragraph with these regzbot commands:: - #regzbot introduced: 1f2e3d4c5b6a + #regzbot introduced: 1f2e3d4c5b6a7980 #regzbot from: Some N. Ice Human <some.human@example.com> #regzbot monitor: http://some.bugtracker.example.com/ticket?id=123456789 diff --git a/Documentation/process/maintainer-tip.rst b/Documentation/process/maintainer-tip.rst index 41d5855700cd4f83..12edc8e06367e3f3 100644 --- a/Documentation/process/maintainer-tip.rst +++ b/Documentation/process/maintainer-tip.rst @@ -270,7 +270,7 @@ Ordering of commit tags To have a uniform view of the commit tags, the tip maintainers use the following tag ordering scheme: - - Fixes: 12+char-SHA1 ("sub/sys: Original subject line") + - Fixes: 16+char-SHA1 ("sub/sys: Original subject line") A Fixes tag should be added even for changes which do not need to be backported to stable kernels, i.e. when addressing a recently introduced @@ -284,7 +284,7 @@ following tag ordering scheme: Commit - abcdef012345678 ("x86/xxx: Replace foo with bar") + abcdef0123456789 ("x86/xxx: Replace foo with bar") left an unused instance of variable foo around. Remove it. @@ -295,7 +295,7 @@ following tag ordering scheme: The recent replacement of foo with bar left an unused instance of variable foo around. Remove it. 
- Fixes: abcdef012345678 ("x86/xxx: Replace foo with bar") + Fixes: abcdef0123456789 ("x86/xxx: Replace foo with bar") Signed-off-by: J.Dev <j.dev@mail> The latter puts the information about the patch into the focus and diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst index f3508b5aa4ebab96..4c8e0f9c8fbbd83c 100644 --- a/Documentation/process/submitting-patches.rst +++ b/Documentation/process/submitting-patches.rst @@ -106,7 +106,7 @@ Example:: platform_set_drvdata(), but left the variable "dev" unused, delete it. -You should also be sure to use at least the first twelve characters of the +You should also be sure to use at least the first sixteen characters of the SHA-1 ID. The kernel repository holds a *lot* of objects, making collisions with shorter IDs a real possibility. Bear in mind that, even if there is no collision with your six-character ID now, that condition may @@ -143,25 +143,25 @@ also track such tags and take certain actions. Private bug trackers and invalid URLs are forbidden. If your patch fixes a bug in a specific commit, e.g. you found an issue using -``git bisect``, please use the 'Fixes:' tag with at least the first 12 +``git bisect``, please use the 'Fixes:' tag with at least the first 16 characters of the SHA-1 ID, and the one line summary. Do not split the tag across multiple lines, tags are exempt from the "wrap at 75 columns" rule in order to simplify parsing scripts. 
For example:: - Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed") + Fixes: 54a4f0239f2e98bc ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed") The following ``git config`` settings can be used to add a pretty format for outputting the above style in the ``git log`` or ``git show`` commands:: [core] - abbrev = 12 + abbrev = 16 [pretty] fixes = Fixes: %h (\"%s\") An example call:: - $ git log -1 --pretty=fixes 54a4f0239f2e - Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed") + $ git log -1 --pretty=fixes 54a4f0239f2e98bc + Fixes: 54a4f0239f2e98bc ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed") .. _split_changes: diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 5b57f0306a50046d..80cde0aa1a3115e9 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -1246,13 +1246,13 @@ sub git_commit_info { # git rev-list --remotes | grep -i "^$1" | # while read line ; do # git log --format='%H %s' -1 $line | -# echo "commit $(cut -c 1-12,41-)" +# echo "commit $(cut -c 1-16,41-)" # done } elsif ($lines[0] =~ /^fatal: ambiguous argument '$commit': unknown revision or path not in the working tree\./ || $lines[0] =~ /^fatal: bad object $commit/) { $id = undef; } else { - $id = substr($lines[0], 0, 12); + $id = substr($lines[0], 0, 16); $desc = substr($lines[0], 41); } @@ -1320,7 +1320,7 @@ for my $filename (@ARGV) { if ($filename eq '-') { $vname = 'Your patch'; } elsif ($git) { - $vname = "Commit " . substr($filename, 0, 12) . ' ("' . $git_commits{$filename} . '")'; + $vname = "Commit " . substr($filename, 0, 16) . ' ("' . $git_commits{$filename} . '")'; } else { $vname = $filename; } @@ -3230,17 +3230,17 @@ sub process { my $tag_case = not ($tag eq "Fixes:"); my $tag_space = not ($line =~ /^fixes:? 
[0-9a-f]{5,40} ($balanced_parens)/i); - my $id_length = not ($orig_commit =~ /^[0-9a-f]{12,40}$/i); + my $id_length = not ($orig_commit =~ /^[0-9a-f]{16,40}$/i); my $id_case = not ($orig_commit !~ /[A-F]/); - my $id = "0123456789ab"; + my $id = "0123456789abcdef"; my ($cid, $ctitle) = git_commit_info($orig_commit, $id, $title); if ($ctitle ne $title || $tag_case || $tag_space || $id_length || $id_case || !$title_has_quotes) { if (WARN("BAD_FIXES_TAG", - "Please use correct Fixes: style 'Fixes: <12+ chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) && + "Please use correct Fixes: style 'Fixes: <16+ chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) && $fix) { $fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")"; } @@ -3330,7 +3330,7 @@ sub process { # Check for git id commit length and improperly formed commit descriptions # A correctly formed commit description is: -# commit <SHA-1 hash length 12+ chars> ("Complete commit subject") +# commit <SHA-1 hash length 16+ chars> ("Complete commit subject") # with the commit subject '("' prefix and '")' suffix # This is a fairly compilicated block as it tests for what appears to be # bare SHA-1 hash with minimum length of 5. 
It also avoids several types of @@ -3343,16 +3343,16 @@ sub process { $line !~ /^This reverts commit [0-9a-f]{7,40}/ && (($line =~ /\bcommit\s+[0-9a-f]{5,}\b/i || ($line =~ /\bcommit\s*$/i && defined($rawlines[$linenr]) && $rawlines[$linenr] =~ /^\s*[0-9a-f]{5,}\b/i)) || - ($line =~ /(?:\s|^)[0-9a-f]{12,40}(?:[\s"'\(\[]|$)/i && - $line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i && - $line !~ /\bfixes:\s*[0-9a-f]{12,40}/i))) { + ($line =~ /(?:\s|^)[0-9a-f]{16,40}(?:[\s"'\(\[]|$)/i && + $line !~ /[\<\[][0-9a-f]{16,40}[\>\]]/i && + $line !~ /\bfixes:\s*[0-9a-f]{16,40}/i))) { my $init_char = "c"; my $orig_commit = ""; my $short = 1; my $long = 0; my $case = 1; my $space = 1; - my $id = '0123456789ab'; + my $id = '0123456789abcdef'; my $orig_desc = "commit description"; my $description = ""; my $herectx = $herecurr; @@ -3383,11 +3383,11 @@ sub process { if ($input =~ /\b(c)ommit\s+([0-9a-f]{5,})\b/i) { $init_char = $1; $orig_commit = lc($2); - $short = 0 if ($input =~ /\bcommit\s+[0-9a-f]{12,40}/i); + $short = 0 if ($input =~ /\bcommit\s+[0-9a-f]{16,40}/i); $long = 1 if ($input =~ /\bcommit\s+[0-9a-f]{41,}/i); $space = 0 if ($input =~ /\bcommit [0-9a-f]/i); $case = 0 if ($input =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/); - } elsif ($input =~ /\b([0-9a-f]{12,40})\b/i) { + } elsif ($input =~ /\b([0-9a-f]{16,40})\b/i) { $orig_commit = lc($1); } @@ -3398,7 +3398,7 @@ sub process { ($short || $long || $space || $case || ($orig_desc ne $description) || !$has_quotes) && $last_git_commit_id_linenr != $linenr - 1) { ERROR("GIT_COMMIT_ID", - "Please use git commit description style 'commit <12+ chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . $herectx); + "Please use git commit description style 'commit <16+ chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . 
$herectx); } #don't report the next line if this line ends in commit and the sha1 hash is the next line $last_git_commit_id_linenr = $linenr if ($line =~ /\bcommit\s*$/i); -- 2.34.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from albert.telenet-ops.be (albert.telenet-ops.be [195.130.137.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CAA01225776 for <linux-doc@vger.kernel.org>; Thu, 5 Dec 2024 18:16:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.130.137.90 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733422621; cv=none; b=dBLdyTmNfTP4BhGx8EvLoBcuT6kXh66RWQLn9EBx+Yqqid2rythETJgKork0vPB0wh1IQzB/hyy9nAxkVXhEiP7LmQANa9V+kCQzTZM5koEACsMp+elYxhlThX0u1aWRz3f3v4bn96OknohYy4U6UErTE06vvk4mS4/3TIqAi9Y= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733422621; c=relaxed/simple; bh=HU3fKEdq+X4pMf6Fx9ukcIrk60Bkr1nGBpu5sND+6i8=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=ZTiyO1kzHNzwibLsIZdxfPea13fZdJxDnG6RZLGI65fdxIiJMoZdINsLgEgT6zj5OwgFHcYuvrlnd1V1vtZsgg4606/JZCN85Pcx2k14DQF3QxwnVBx3ciJJbVn/8QA60aoCSg6koo6asp5d365RpJSuk5SA+xBaGhjOiuHGEX0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=glider.be; spf=none smtp.mailfrom=linux-m68k.org; arc=none smtp.client-ip=195.130.137.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=glider.be Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux-m68k.org Received: from ramsan.of.borg ([IPv6:2a02:1810:ac12:ed80:b16a:6561:fa1:2b32]) by albert.telenet-ops.be with cmsmtp id l6Gf2D00B3EEtj2066GftC; Thu, 05 Dec 2024 19:16:47 +0100 Received: from rox.of.borg ([192.168.97.57]) by ramsan.of.borg with esmtp (Exim 4.95) (envelope-from <geert@linux-m68k.org>) id 1tJGP0-000LwW-9d; Thu, 05 Dec 2024 
19:16:39 +0100
Received: from geert by rox.of.borg with local (Exim 4.95) (envelope-from <geert@linux-m68k.org>) id 1tJGP1-00EUTQ-0L; Thu, 05 Dec 2024 19:16:39 +0100
From: Geert Uytterhoeven <geert+renesas@glider.be>
To: Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, =?UTF-8?q?Niklas=20S=C3=B6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Linus Torvalds <torvalds@linux-foundation.org>
Cc: Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Geert Uytterhoeven <geert+renesas@glider.be>
Subject: [PATCH v2 0/2] Align and increase git commit ID abbreviation guidelines and checks
Date: Thu, 5 Dec 2024 19:16:33 +0100
Message-Id: <cover.1733421037.git.geert+renesas@glider.be>
X-Mailer: git-send-email 2.34.1
Precedence: bulk
X-Mailing-List: linux-doc@vger.kernel.org
List-Id: <linux-doc.vger.kernel.org>
List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

	Hi all,

This patch series:
  - Aligns the documentation and checks to settle on a minimum of 12 characters for git commit ID abbreviations, so larger abbreviations are no longer flagged as a warning by checkpatch.pl,
  - Increases the minimum recommended abbreviation to 16 characters.

The Linux kernel repository is growing fast.
Running git-unique-abbrev[1] on a tree containing v6.13-rc1 and all stable releases gives:

    13021585 objects
     4: 13021585 / 65536
     5: 13021519 / 1048507
     6: 7028382 / 3064402
     7: 616474 / 305777
     8: 39566 / 19775
     9: 2452 / 1226
    10: 186 / 93
    11: 12 / 6
    12: 0 / 0
    21cf4d54d3c702ac20c6747fa6d4f64dee07dd11
    21cf4d54d3ced8a3e752030e483d72997721076d
    8a048bbf89528d45c604aed68f7e0f0ef957067d
    8a048bbf895b1359e4a33b779ea6d7386cfe4de2
    c8db0fc796454553647e11a055eed4e46676e3ed
    c8db0fc7964a2c9394c17722f30e4f1420aaa8e0
    d3ac4e475103c4364ecb47a6a55c114d7c42a014
    d3ac4e47510ec0753ebe1e418a334ad202784aa8
    d597639e2036f04f0226761e2d818b31f2db7820
    d597639e203a100156501df8a0756fd09573e2de
    ef91b6e893a00d903400f8e1303efc4d52b710af
    ef91b6e893afc4c4ca488453ea9f19ced5fa5861

13021585 is still smaller than sqrt(16^12) = 16777216, but the safety margin is getting smaller. E.g. my main work repo already contains over 19M objects. Hence the Birthday Paradox states that collisions of 12-character commit IDs are imminent.

Fortunately, git is smart: when the number of configured characters for abbreviations (using core.abbrev, or --abbrev) is too small, git automatically prints a larger hash. Obviously this only takes into account the repository of the creator, and not the (possibly much larger) repository of the receiver. Due to this, patches with 13-char Fixes tags have already been seen in the wild[2]. Unfortunately such patches are currently flagged by checkpatch.pl (and sometimes even rejected), despite some parts of the documentation stating that "at least 12 characters" is fine. FTR, I've been using 16-character commit IDs for quite a while.

Hence the first patch settles on "at least 12 chars" everywhere. The second patch increases the minimum to 16 characters, to reduce the risk of conflicts, now and in the near future.
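
As a rough sanity check of the birthday-bound reasoning above, here is a minimal Python sketch (not part of the series). It estimates how many pairs of objects would share an abbreviation of a given length if IDs were uniformly distributed. The function names are illustrative, and the uniform-distribution model is an assumption. For 13021585 objects it gives an estimate in the same ballpark as the 1226 reported above for 9 characters, and well below one pair at 12 characters.

```python
from math import comb, isqrt


def expected_colliding_pairs(num_objects: int, abbrev_len: int) -> float:
    """Expected number of object pairs sharing the same abbrev_len-digit
    hex prefix, assuming uniformly distributed IDs (an idealisation)."""
    # Each of the C(n, 2) pairs collides with probability 1 / 16^abbrev_len.
    return comb(num_objects, 2) / 16 ** abbrev_len


def birthday_threshold(abbrev_len: int) -> int:
    """sqrt(16^abbrev_len): the object count at which collisions of
    abbrev_len-digit prefixes start to become likely."""
    return isqrt(16 ** abbrev_len)


# Object count taken from the git-unique-abbrev output quoted above.
objects = 13021585
print("threshold for 12 digits:", birthday_threshold(12))
for digits in (9, 10, 11, 12, 16):
    estimate = expected_colliding_pairs(objects, digits)
    print(f"{digits:2d} digits: ~{estimate:.3g} colliding pairs expected")
```

The estimate falls by a factor of 16 per extra digit, which is why four more characters buy so much headroom.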
Note that we standardized on 12 chars in commit d311cd44545f2f69 ("checkpatch: add test for commit id formatting style in commit log") in v3.17, when the repo had just surpassed 4M objects. Going to 16 chars should provide enough headroom until after my official retirement ;-)

Note that I did not update Documentation/translations/.

Changes compared to v1[3]:
  - Rebase on top of commit 2f07b652384969f5 ("checkpatch: always parse orig_commit in fixes tag") in v6.13-rc1,
  - Update documentation, too,
  - New patch to increase the minimum to 16 characters.

Thanks for your comments!

[1] https://blog.cuviper.com/2013/11/10/how-short-can-git-abbreviate/
[2] https://lore.kernel.org/all/20241009-tamale-revisit-7d2606c5fdf3@spud
[3] "[PATCH] checkpatch: Also accept commit ids with 13-40 chars of sha1"
    https://lore.kernel.org/all/62f82b0308de05f5aab913392049af15d53c777d.1701804489.git.geert+renesas@glider.be/

Geert Uytterhoeven (2):
  Align git commit ID abbreviation guidelines and checks
  Increase minimum git commit ID abbreviation to 16 characters

 Documentation/dev-tools/checkpatch.rst        |  2 +-
 Documentation/doc-guide/sphinx.rst            |  4 +--
 Documentation/process/5.Posting.rst           |  2 +-
 .../process/handling-regressions.rst          |  6 ++--
 Documentation/process/maintainer-tip.rst      |  6 ++--
 Documentation/process/submitting-patches.rst  | 18 ++++++------
 scripts/checkpatch.pl                         | 28 +++++++++----------
 7 files changed, 33 insertions(+), 33 deletions(-)

-- 
2.34.1

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-ed1-f47.google.com (mail-ed1-f47.google.com [209.85.208.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0063D224B0F for <linux-doc@vger.kernel.org>; Thu, 5 Dec 2024 19:19:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.208.47 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733426380; cv=none; b=UoFStJTNeR19PBm7D9BkLOSm6ZJ4OXo5BS/pYgWU6Y7gMe/rUGROmYnUn3P1iwPFvnXy3ccATm0v7eO3D9Z0aM7xINPy2poQxPsMOhpjWWKOwimnux8iqvh1u/Q6KLIUIHvBaxF5upXxRhBJ9YhuUUfy51mKPztrRUY2BaV7E9I= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733426380; c=relaxed/simple; bh=N7Q2i/te3fJSR9GXmZAlQH1YXHaD/cG2+guiEYfirWs=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=muH3t/f3zRUMTdrAUBA0Evr9ccBt7oNwjqrZbheP+86+v8/ozNccd7VaASwgDibByUOO7dIA9wmGgKKD39u+jqe+VNreuEQxdwpqbvD8TlcC+hcdGbzCzYOeH6h41paHYtqk4W+L+gBTOTmvFseCrSQq+F682qkAl/oPseTjcz0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org; spf=pass smtp.mailfrom=linuxfoundation.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b=FdVq//hM; arc=none smtp.client-ip=209.85.208.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linuxfoundation.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="FdVq//hM" Received: by mail-ed1-f47.google.com with SMTP id 4fb4d7f45d1cf-5d0d4a2da4dso2069658a12.1 for <linux-doc@vger.kernel.org>; Thu, 05 Dec 2024 11:19:37 -0800 (PST) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; t=1733426376; x=1734031176; darn=vger.kernel.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=DuB4RBng6UYOUJZnydBy5hrATh6bqjUq7A/wd6mOafg=; b=FdVq//hM5UpBn7kbH7x9osHum3getUCzrgapLYCNwgl0u52inmhgsjRe/ZO9hTuL6S 6bxykc0ArPlbugzu3Dj/My4ZrWJHfHodkHDoP9SUNRuc6q6jxykFQ3mBizwVDkPBvtR+ dYk8y02YiaUx1y4BwMSthOR98UQAhY7QnCnC0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733426376; x=1734031176; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=DuB4RBng6UYOUJZnydBy5hrATh6bqjUq7A/wd6mOafg=; b=eJkywOx5WbgLzdLK6zFDj13PfvQliyZVWb6+CWrA2HH0AGBK7oaPz4/u1eiN1VRQ2u I6eMBlQNbWYrGqL3ZtQ+C8xVBqFGlGIJF+E2XzOrmuhdXSCKy5RkL9PLGHxVFbDih8qh bbtFXauAjKXeHs1Yw/gSpA2ixwJUgNaWlv0dp3aMMMZ7Yrjt0oC822YhunbFvBef28a1 5RpkyWO7W0ToFT6xPxVdJu5xgBxnHiynH6WcTqobV0IbO11u3MbZcWahGq9K1890G2GZ +mjlO3CdMoaq7SqjuYDI8a6O4q3AJgYv8GWz+BXWBVp9GWTihpoiMgcDRHD6HQzyDGm+ 7RBA== X-Forwarded-Encrypted: i=1; AJvYcCWHLIWeNEbp6+mfBHfSWSrUSFH8nA55JzmJXqOEZsMuXLy5k1eXtVMG6Krd1rkVWmjDPj4Enmj5IfY=@vger.kernel.org X-Gm-Message-State: AOJu0Yxn8GDwlIxHjTKxq0yxqxbiLKLjdUHTUGW85gz8hrxbrLlQKfMB t6v0hNhpnbJ9YZ8EQcXy3JAqkzvjcR1fgYv3+cLyYmzKRjDqSXHg2KAJ51ju+JLiOUlp0Gxl4eD r802M2A== X-Gm-Gg: ASbGncuXDkytfn1Q7CiK/aFC1zrOt75UV+9MaSAmPArOBhJ1Fb5AuHchDK1mcESuFKf ZBpROTRqpNis1W/FQxg1tN5AxEfChj5Aww46xfQoHZ+KpzqGa91I5MqLQx8VqEwcsRqce9xhxPC UkHPGsqYUokbrQd36GvtRQ95UJGskNTDOnsroQXaUDU0EBkpycX7NVsaYOfKn2O1HmtAV7Gi+5V UOCZZoX35EfFTj5zFRzgmY6R9EzjFnglgtBRVZ4HTqJyDvQ/yel4J+9lV51Mt+i77ujNF5Kw+CS FNyKvKYFWkG8JbybHM79Do9e X-Google-Smtp-Source: AGHT+IEIVU3tqeGb0eT9qDA4GGapTJ6Q63eLP/65exLrzTjkDy1Vi7o6/OomwScUmWUNmCoFGSMm7Q== X-Received: by 2002:a17:907:1de1:b0:aa6:1871:2b98 with SMTP id a640c23a62f3a-aa63a270121mr807266b.60.1733426375923; Thu, 05 Dec 2024 11:19:35 -0800 
(PST) Received: from mail-ej1-f49.google.com (mail-ej1-f49.google.com. [209.85.218.49]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-aa6260e1346sm129896266b.176.2024.12.05.11.19.35 for <linux-doc@vger.kernel.org> (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 05 Dec 2024 11:19:35 -0800 (PST) Received: by mail-ej1-f49.google.com with SMTP id a640c23a62f3a-a9e8522445dso235051466b.1 for <linux-doc@vger.kernel.org>; Thu, 05 Dec 2024 11:19:35 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCV2ZjJZRxR5jzCrkNoJld6WS3vUL8MAaDRYDazkzpgMGVOLcTut5L7cpz345JKzhz4/SECg3Re/Hh8=@vger.kernel.org X-Received: by 2002:a17:906:18b2:b0:aa6:23ba:d8c5 with SMTP id a640c23a62f3a-aa639fbda5cmr3997966b.10.1733426374829; Thu, 05 Dec 2024 11:19:34 -0800 (PST) Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 References: <cover.1733421037.git.geert+renesas@glider.be> <46b320b91b8d86fade3c1b1c72ef94da85b45d0d.1733421037.git.geert+renesas@glider.be> In-Reply-To: <46b320b91b8d86fade3c1b1c72ef94da85b45d0d.1733421037.git.geert+renesas@glider.be> From: Linus Torvalds <torvalds@linux-foundation.org> Date: Thu, 5 Dec 2024 11:19:18 -0800 X-Gmail-Original-Message-ID: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> Message-ID: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> Subject: Re: [PATCH v2 2/2] Increase minimum git commit ID abbreviation to 16 characters To: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, =?UTF-8?Q?Niklas_S=C3=B6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley 
<conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" On Thu, 5 Dec 2024 at 10:16, Geert Uytterhoeven <geert+renesas@glider.be> wrote: > > Hence according to the Birthday Paradox, collisions of 12-character > git commit IDs are imminent, or already happening. Note that ambiguous commit IDs are not even remotely as scary as this implies. Yes, the current kernel tree has over ten million objects, and when you look at stable trees etc, you can easily see more. But commits are only a fraction (about 1/8th) of the total objects. My tree is at about 1.3M commits, so we're basically an order of magnitude off the point where collisions start being an issue wrt commit IDs. Can you find collisions by looking at all objects? Yes. Git will do that for you, and tell you their types. But to take one recent example, let's do the 6.12 commit: adc218676eef25575469234709c2d87185ca223a. To get an ambiguous ID, you have to go down to 6 characters, and even then git will tell you there's only one object that is a commit, ie $ git show adc218 results in error: short object ID adc218 is ambiguous hint: The candidates are: hint: adc218676eef commit 2024-11-17 - Linux 6.12 hint: adc2184009c5 blob so right now you have a collision in six digits for that commit, but even then it's actually still entirely unambiguous once you know you're talking about a commit. Are there worse cases? Yup. With just 7 characters, you get commits like 95b861a that actually have three ambiguous commit IDs. And you still get ambiguous results with 9 characters. With 10 characters, there are no collisions. So the "we're an order of magnitude off" seems about right - you get slightly more than one order of magnitude for each two digits. And remember: we're an order of magnitude off *AFTER 20 YEARS OF GIT HISTORY*.
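The "order of magnitude" estimate can be cross-checked with some rough birthday-bound arithmetic. What follows is only a back-of-the-envelope sketch: it assumes commit IDs behave like uniformly random hex strings, it sticks to integer math, and the helper names birthday_scale and expected_pairs_milli are hypothetical.

```shell
#!/bin/sh
# Rough birthday-bound arithmetic for abbreviated IDs (integer math only).
# Assumption: IDs behave like uniformly random hex strings.

# Object count at which collisions among d-hex-digit prefixes become
# likely: sqrt(16^d) = 2^(2d).
birthday_scale() {
    echo $((1 << (2 * $1)))
}

# Expected number of colliding pairs, multiplied by 1000, for n objects
# and d hex digits: 1000 * n^2 / (2 * 16^d).
expected_pairs_milli() {
    n=$1
    d=$2
    echo $((n * n * 1000 / (2 * (1 << (4 * d)))))
}

birthday_scale 12                 # scale for 12 hex digits
expected_pairs_milli 1300000 12   # 1.3M commits, 12 digits
expected_pairs_milli 1300000 10   # 1.3M commits, 10 digits
```

Under these assumptions the 12-digit scale sits near 16.7 million objects, roughly an order of magnitude above 1.3M commits, and the expected number of colliding commit pairs at 10 digits is still below one.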
Furthermore, the "in the future" argument is bogus. Yes, there will be more commits in the future, but it's not going to suddenly make old SHA ID's somehow more ambiguous, since you can also take history into account - and when quoting the short format it should always be accompanied by the first line of the commit message too. Why do I care? Because long git commit IDs are actually detrimental to legibility. I try to make commit messages legible, and that very much is the *point* of the short format. It's for people, not machinery. Yes, the basic git machinery doesn't do object type disambiguation (and if you do "git show", you can give it blob IDs etc, so git itself may not know about the proper type to use to disambiguate at all). And git also doesn't know about the whole "we also put the first line of the commit message" thing. But honestly, I'm claiming that something like Fixes: 48bcda684823 ("tracing: Remove definition of trace_*_rcuidle()") (to pick a random recent commit) is completely unambiguous for the intended audience, and will remain so forever within the context that it is in. And I think the "intended audience" here is important. 12 characters is already line noise, and causes occasional odd line wrapping (you don't see that in things like the "Fixes:" tags, but you do see it in the better commit messages that refer to the commits they fix). I think we should accept that it's not the full SHA1, and also accept what that really means. Final note: personally, I find that the SHA1 - shortened or not - is often *less* descriptive than the shortlog, for the simple reason that rebasing happens, and people refer to other commits with stale commit IDs. That's an issue that I personally hit regularly, and it has a fairly simple solution in the form of git log --grep="..one-liner goes here.." and my point here is that if you rely too much on the SHA1, your workflow is *ALREADY* broken, and it has nothing to do with the shortening.
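For tooling that only cares about commits, one possible approach is to filter git's candidate list by object type. This is a minimal sketch, assuming the hint format shown in the "git show adc218" output above stays as it is; commit_candidates is a hypothetical helper name, not a git command.

```shell
#!/bin/sh
# Hypothetical helper: keep only the commit candidates from git's
# "short object ID ... is ambiguous" hint lines read on stdin.
commit_candidates() {
    awk '$1 == "hint:" && $3 == "commit" { print $2 }'
}

# Sample hint lines copied from the example earlier in this message.
printf '%s\n' \
    'hint: The candidates are:' \
    'hint:   adc218676eef commit 2024-11-17 - Linux 6.12' \
    'hint:   adc2184009c5 blob' | commit_candidates
```

With the sample input only one candidate is left, which is the sense in which the six-digit ID is still unambiguous once you know you are looking for a commit.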
Put another way: if you have particular tooling that you worry about, I think you should look at the tooling. You can find real examples of much shorter commit IDs in the kernel, and real examples of the MUCH MORE REAL issue of wrong commit ID's right now. See for example: 0a1336c8c935 ("IB/ipath: Fix IRQ for PCI Express HCAs") which refers to commit 51f65ebc ("IB/ipath - program intconfig register using new HT irq hook"), which is still perfectly unique, but then look at 2e61c646edfa ("mlx4_core: Use mmiowb() to avoid firmware commands getting jumbled up") which refers to commit 66547550 ("IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up"). That commit doesn't exist at all - it's not ambiguous due to being short, it's ambiguous due to being *wrong* (presumably due to a rebase). The real commit ID? 76d7cc0345a0. Easily found using the human-readable shortlog. So here's the meat of the argument: you are barking up the wrong tree. We have real and present issues that have been going on since at least 2007, and they have *nothing* to do with the short SHA1s. I don't want to make the short SHA1's worse, when the real and present problems are elsewhere. Make the tools deal with the cases we already have, and you'll find that the shortening is a complete non-issue.
Linus From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-vs1-f45.google.com (mail-vs1-f45.google.com [209.85.217.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 481511EF0B6; Fri, 13 Dec 2024 19:41:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.217.45 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734118894; cv=none; b=MHGsFD7CeZI11pSPKxbH0F0hcE3lKXPfb7QBnCYTcc7Fsnu8MfL+dg7dUd8+lW5Ve0Nyq/NvyZS+lArxiHHEOZCt2jvpfP9oXVVaV0JGfUmDbG8sEcn5454NLNyTGUEBeJSYUBQ5ZWSAU5V5HNInRhm94I2TC8uSEuqWoh8FAEE= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734118894; c=relaxed/simple; bh=WcmpwK8L+1rdyFXu6+ygLABocvmTT9mSI/qvmMxq1vM=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=mZ0L3IOTlDbNgI4TxvuM9UdLXD+3wBmubqxq4xp3YXCtGGD1GX9lnQfI/PgPMED2/G8WWcQydFvi/wfHWYlh0wUJ0mQyfx8JZAWdZAcAMyxMN8KMRxUXZCk1ZzIMvdDvUCukkSGZbw5djrW4k//x200n4LP7I9UvzPD8PJQnovI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=linux-m68k.org; spf=pass smtp.mailfrom=gmail.com; arc=none smtp.client-ip=209.85.217.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=linux-m68k.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-vs1-f45.google.com with SMTP id ada2fe7eead31-4afe7429d37so566307137.2; Fri, 13 Dec 2024 11:41:31 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734118890; x=1734723690; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+NNjRUZHM5uucEv6fy+3ktXfQNKuRR8mDXFkC5X+CBM=; 
b=RFnmQNynqw2NkXPjp2Oxh1PFKpHWp4l5XbMt9oNEqlFGAKRMCmjKvp0x74wC8iU4TY 8I+wTmdvLuRg/anGQroRUDd/QVpod4Sk/9cYQG/hcUS55WWfjjCAr+CIJoo2o8VLiPbv ikuuZaFGyeG025b/4TGDyJq9JOUYCVkRLcWxwsW9jRCX7mUnb46o6a63BXLFtZyzrW66 Zr0GM+6q01glgeR8BQ/hmo396+Ch/6dD3YcaSqxFs+hEACZHOOqWleGruFIqcZOw9hdn sFOT9xgqnNwvijaHuHwbYbqxWEXIrjcj+shQZJgImvE30AocBDrruVWGvjEFkZB2FQHZ mY7Q== X-Forwarded-Encrypted: i=1; AJvYcCVU5OmT03P4fKTcllopWB6R+ywd4bLrVFPdhFo5n8ssO8DC32Oq0sI0dPxz61Gmm4m/lHyfObV0K1s=@vger.kernel.org, AJvYcCXJvhrbkvALItaYbFrPhAVbFZQYxWLZkVdd+xAG1reqV8ciuMweZUPNvrgE2SLqEowwAR0eq/IbvmiNJ2tG@vger.kernel.org, AJvYcCXM/5sp5xsWOt6/g+MfTGQEyiH/RfqVR6HhFAoD9sVq/TKAeHGAvaiRtdGjF38LKOqnJmT8AJGJ8o27@vger.kernel.org X-Gm-Message-State: AOJu0YzyO+l9R4zlpoQXXFujI97DqHq/LLT+d6D8Jg/LyQ+63xUoDpct HHbk8ZxskbLfs5S67FuX3KNkAdTfUm7eGIKk3jopC+jA55QEPJi7EPZa9t5H X-Gm-Gg: ASbGncvqFJZR4Dm5mVDoiXroNppDDsWJC0NMoZ2jzmyXD5VwhuzalT5ORxqeVtT7O4r shw/17bjpy3Rt+YUc0QYdHpoZC6L12jxDep+cbz8jYqbSHYcNxre3hgH7W/sxDU3e/EkEBHkIP6 XcQMjwwGW4taFBmXwD+hzjyUe6y9RxjUwjYoJRSFbRhc6XPqyC3FIl4rDylv8c6zWsjtrx1mhHB 1Ep47kq7ihsnn/GWGxHvK2IB32GaUTN6TYc1Dap/F1Q8jnVl6CE/tkA11Db8uQlwcYBQV6rKFCe ch8hhbp1I5abJDcIB/wwBUA= X-Google-Smtp-Source: AGHT+IE3FUP7NwkHryvGs8lB8qPl/CWB38xrqMRXss3js+pZWDS09s6TBZKB8TtM0W2KSNW6VLFQug== X-Received: by 2002:a05:6102:2082:b0:4b2:5de5:faea with SMTP id ada2fe7eead31-4b25de60878mr4448666137.0.1734118890041; Fri, 13 Dec 2024 11:41:30 -0800 (PST) Received: from mail-vk1-f175.google.com (mail-vk1-f175.google.com. 
[209.85.221.175]) by smtp.gmail.com with ESMTPSA id a1e0cc1a2514c-860ab71683csm9134241.34.2024.12.13.11.41.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 13 Dec 2024 11:41:29 -0800 (PST) Received: by mail-vk1-f175.google.com with SMTP id 71dfb90a1353d-5187f6f7bcaso568486e0c.3; Fri, 13 Dec 2024 11:41:29 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCVdrqF02k7dB2jqHrlU6h4Y3Wh5Sj/DkJc8XEfmdH/IsA30Kt/AVAVgR9mnM3uTIrNANp67Rzl6AJU=@vger.kernel.org, AJvYcCVxeJf3PpGKIQrecpYuQZ911/cXI8KXUwLFE7tWXsF41oSla6bGX0yGJmYEJi6Yufcfk01k+huH8KUi@vger.kernel.org, AJvYcCX4+/fbcBQNy+KUnXD9xzyOgo+tEojo6ONUC18DiCxJ7Uomx/STtC9OG2xuAg/B9wAIJNDs4uPixwIzh7YL@vger.kernel.org X-Received: by 2002:a05:6122:3290:b0:517:4fb0:74bc with SMTP id 71dfb90a1353d-518ca369a45mr5040949e0c.3.1734118889186; Fri, 13 Dec 2024 11:41:29 -0800 (PST) Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 References: <cover.1733421037.git.geert+renesas@glider.be> <46b320b91b8d86fade3c1b1c72ef94da85b45d0d.1733421037.git.geert+renesas@glider.be> <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> In-Reply-To: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> From: Geert Uytterhoeven <geert@linux-m68k.org> Date: Fri, 13 Dec 2024 20:41:17 +0100 X-Gmail-Original-Message-ID: <CAMuHMdXxqRRePJ_HHo---6ayjRnQcDRE--mx0kUDg0ceDELG9g@mail.gmail.com> Message-ID: <CAMuHMdXxqRRePJ_HHo---6ayjRnQcDRE--mx0kUDg0ceDELG9g@mail.gmail.com> Subject: Re: [PATCH v2 2/2] Increase minimum git commit ID abbreviation to 16 characters To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, 
=?UTF-8?Q?Niklas_S=C3=B6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Linus, On Thu, Dec 5, 2024 at 8:19 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, 5 Dec 2024 at 10:16, Geert Uytterhoeven <geert+renesas@glider.be> wrote: > > Hence according to the Birthday Paradox, collisions of 12-character > > git commit IDs are imminent, or already happening. > > Note that ambiguous commit IDs are not even remotely as scary as this implies. > > Yes, the current kernel tree has over ten million objects, and when > you look at stable trees etc, you can easily see more. > > But commits are only a fraction (about 1/8th) of the total objects. My > tree is at about 1.3M commits, so we're basically an order of > magnitude off the point where collisions start being an issue wrt > commit IDs. > > Can you find collisions by looking at all objects? Yes. Git will do > that for you, and tell you their types. But to take one recent > example, let's do the 6.12 commit: > adc218676eef25575469234709c2d87185ca223a. To get an ambiguous ID, you > have to go down to 6 characters, and even then git will tell you > there's only one object that is a commit, ie > > $ git show adc218 > > results in > > error: short object ID adc218 is ambiguous > hint: The candidates are: > hint: adc218676eef commit 2024-11-17 - Linux 6.12 > hint: adc2184009c5 blob > > so right now you have a collision in six digits for that commit, but > even then it's actually still entirely unambiguous once you know > you're talking about a commit. That's true for the basic command line tools...
> Make the tools deal with the cases we already have, and you'll find > that the shortening is a complete non-issue. FTR, cgit can use some improvements, as https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=adc218 just tells you "Bad object id: adc218". Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F6CC522F; Sat, 14 Dec 2024 16:03:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=90.155.50.34 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734192218; cv=none; b=Egjuoxldot7Qf5a6XJo7Y20fW19kwnc2oR3NPZrn0P5AmjwGz3/9PQQ9wHMlQFRgiZWyBKsyNsLuDAcoRV1V4VLCNIRTODOyZJi4Gm7IN8F9UIPp7PcG3ap4MWz1qIsOVAzCSGFwx/re4z0Tbkv/hRpUFqwdAKrjFyjNN0+UuZc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734192218; c=relaxed/simple; bh=xxXLyGOyZnGpaJ4mvshI9zCtsAP5miyU2OJKnkGTl/Y=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=j8hmUvctRZu0W64PEFItcdE2RNphsVfVw6IgsCnPmpOImW7QvEm9t/FZIxU1xriGqnXwLrfKGHK26LvpSPTmQlgWQrfSaFBZLgIQGGDto+n1dMETeJ556lZF9cK1CF0hv7yimcZeVS+UG0nlZwIMkgiscOGTKbyp7OyiuByJqxQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=IGfeNfO/; arc=none smtp.client-ip=90.155.50.34 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="IGfeNfO/" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=T1A5Dyhd0EpXiTSnjGvcbnHEdmDZiOrAmCbzr/JLCnI=; b=IGfeNfO/3qoDbUQQYKjPvkjxLU 1yHvQ52/YlqK1M5KJVUSL4qFTXj5/vO8qtpBp2QYTwrXPAeysrDYW9U8EWVyHSyivZgyDFgSm06V8 C7vnSyKZjmALsfLPoQXvJFtlFnpPzRJApwvEQgdTdwgRS/7ChtNju5ZKY2p6Si9WoCzNckrmgMMIu XHAxY11W2HVz7w1zKsTf0+nPG1JZAk5Hm73ZC/+DpspbJ+218n54P+Y/73EwnHPlfNsPuipGne/BE 79uyK9UtnDr4fyzJglMmXW2AYb0EefKCd5yniukpxZgNPL68r2FQw8lQb+cq2+BLxQU7bn6HTw1C4 C24vAs0w==; Received: from willy by casper.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tMUbx-00000003cyB-3EVu; Sat, 14 Dec 2024 16:03:21 +0000 Date: Sat, 14 Dec 2024 16:03:21 +0000 From: Matthew Wilcox <willy@infradead.org> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be>, Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, Niklas =?iso-8859-1?Q?S=F6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 2/2] Increase minimum git commit ID abbreviation to 16 characters Message-ID: <Z12sScjRHpB1d0nO@casper.infradead.org> References: 
<cover.1733421037.git.geert+renesas@glider.be> <46b320b91b8d86fade3c1b1c72ef94da85b45d0d.1733421037.git.geert+renesas@glider.be> <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> On Thu, Dec 05, 2024 at 11:19:18AM -0800, Linus Torvalds wrote: > Why do I care? Because long git commit IDs are actually detrimental to > legibility. I try to make commit messages legible, and that very much > is the *point* of the short format. It's for people, not machinery. > > Yes, the basic git machinery doesn't do object type disambiguation > (and if you do "git show", you can give it blob IDs etc, so git itself > may not know about the proper type to use disambiguate at all). And > git also doesn't know about the whole "we also put the first line of > the commit message" thing. > > But honestly, I'm claiming that something like > > Fixes: 48bcda684823 ("tracing: Remove definition of trace_*_rcuidle()") I have wondered about using a different encoding for the sha1. Classic Ascii85 encoding is no good; it uses characters like '"\< which interact poorly with every shell. RFC1924 is somewhat better, but still uses characters that interact poorly with shell. Base36 (ie 0-9a-z) would take 10 characters to represent as many bits as 12 characters in base16. Base62 (0-9a-zA-Z) gives us 8 characters to represent _almost_ 48 bits. We could do Base64 (RFC 4648) which uses + and /, and is common enough. Good enough to be worth the additional complexity? 
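The character counts above can be checked with a few lines of integer arithmetic. This is a small sketch rather than anything git provides: chars_needed is a hypothetical helper that returns the smallest number of characters in a given base that covers a given number of bits.

```shell
#!/bin/sh
# Smallest number of base-$1 characters that can represent $2 bits,
# i.e. the smallest k with base^k >= 2^bits (integer math, bits < 62).
chars_needed() {
    base=$1
    target=$((1 << $2))
    k=0
    span=1
    while [ "$span" -lt "$target" ]; do
        span=$((span * base))
        k=$((k + 1))
    done
    echo "$k"
}

# 48 bits is what 12 hex characters carry.
for b in 16 36 62 64; do
    echo "base$b: $(chars_needed "$b" 48) characters for 48 bits"
done
```

Base36 comes out at 10 characters and Base64 at 8. Base62 needs a ninth character to cover all 48 bits, which matches the "almost 48 bits" caveat for 8 characters.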
From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-ej1-f50.google.com (mail-ej1-f50.google.com [209.85.218.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83B3B38DE0 for <linux-doc@vger.kernel.org>; Sat, 14 Dec 2024 16:31:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.218.50 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734193906; cv=none; b=dHSkbx0Oj5Tfr1uQOa4MrXXUWstemzEgbQ7plmcWNtIYvmQGMHsg44fu+FdTqZ+42PhZEOUQQKQlMeTkPfd+4gIKqYe65Aobo2EjlB8XPv2BpWfZdkpLITDK+MpWO4hOCIsf4hrPNEyLihNWsXxkKCmO2luv+7/SYQmZhYkuI+Y= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734193906; c=relaxed/simple; bh=aNfapG6I/CUOP9Y8AfR2pHR5FECAZ7Mbd4YYU7awi0w=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=euk0L+NZRJDI36UMId1qzH/C+Ih1Bhh0MoesGgKQ7RDWnx/FAjTj4eVdM0qmo8lQfCuDi2s7SFXMoQDvrI0g+0Zjp7CyZN6jh0rzTuSkd/j8pFD+CKJ/Lio8SX7t2zlIrK+NzDeuMjSDdU6/MUY02oQfcSbsNqf2njGPMdn0Lds= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org; spf=pass smtp.mailfrom=linuxfoundation.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b=WveFCrZo; arc=none smtp.client-ip=209.85.218.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linuxfoundation.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="WveFCrZo" Received: by mail-ej1-f50.google.com with SMTP id a640c23a62f3a-aab9e281bc0so85644966b.3 for <linux-doc@vger.kernel.org>; Sat, 14 Dec 2024 08:31:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linux-foundation.org; s=google; t=1734193903; x=1734798703; darn=vger.kernel.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=YSMPYzUxCS1+M94lq1DhBfG9d5bon4YtVfF19FFr/ZQ=; b=WveFCrZoF9Gp2U+TTwShRyPEcCMrb5hoq1ovdTJLuE3ktPK76XTpBdb6EILAfwPE6o PHXOLPeHgU41P+UKj1YkUrQSzdKOf5bzuvknlnrEcRfxOGeD6Q+/LiQulBM0uvGKF3tg prVNN5ImfP04zbgnpXz384K8HmvSp5UDjCHFA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734193903; x=1734798703; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=YSMPYzUxCS1+M94lq1DhBfG9d5bon4YtVfF19FFr/ZQ=; b=uJ2taVSbgx8KEtw4WLUL7dKyEZu+qnJ7RGudmwKgzRUH7Ju5uQSzpt7NkOn2Ir7tgc mAPmCuinGlSEfItK+QraGxSi9ZZ1uprXKShhqPON91TU+fW89yVA8diJqIU/XcXTdWim dGav5XvkQuA60C1g7plvJwJLl/y15L70zgkX2soZTeYeK30qMHPbOa9WsTq6TH3XJ41p eCuli29Vi81sQKUbJTA2Odwp1zr1vg7lf3rO+2HVtWaYd5/hhJKmEHFTMkiiBXBRnCV7 Iw+NRAmDS7k87e+QaTwidqHxD5NmU+ve34uMQGUumynVGyngvR1E1pnN6MustDpVivxv UrHA== X-Forwarded-Encrypted: i=1; AJvYcCV6wnon48EK89cQHbYSn8hxUcCthKty7M9MzM8TSiiblUx6sdw7ft2EWQuTMAFxuVTsxViMiDD8h1A=@vger.kernel.org X-Gm-Message-State: AOJu0YyYUosem+auGJmkQP0pkbNrpttwV1nDrARGLWucp/dIEZ1RQD5R kXsZFcNbPFOdIrlVCew94MONeXOyUkxdfAgWHksaR/nPCrFfeQX9aX8BXk/SBPHFkJvzOOPkhfy wDz8= X-Gm-Gg: ASbGncv9khJgnDPeamAzTELIWCE9pSY+usx45Y1tOZgtPMojVa5ST+ok2GL3tGomtmt q+ly8HrOXuG8QtaxAmaUiGjM92Lz0715O79vWRjt9+JMScYmXJsB0CPtm8YX3aBkMab6VWYiIsn 6T8zAYGhD8ZfusbaNv4mZ68xs/oSXyTaMo0BZqZuXZ9sBScjpjGtSui8Ir8iLPtByPXJBuMwHIE qwYyDB4PmuHSV1gPCZIRX7SbwtnpTOsE6SIJPaXxNSC6AX5B75GKzNKQ7SDqBeIA52WU+QLmxMI 5UEYSXndoXYJKjwaRo92xk6pFlQv0+8= X-Google-Smtp-Source: AGHT+IGp3BdpXk1G4xbMkrrcL/jzF50FQd6543NwBo8kHyafPj2XlEqyayrjVlEwngD6f47UpowbRQ== X-Received: by 2002:a17:907:1c25:b0:aa6:6ca2:b772 with SMTP id a640c23a62f3a-aab778d9deamr704103866b.10.1734193902723; Sat, 14 Dec 2024 08:31:42 -0800 (PST) Received: 
from mail-ed1-f41.google.com (mail-ed1-f41.google.com. [209.85.208.41]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-aab96394804sm109544466b.173.2024.12.14.08.31.41 for <linux-doc@vger.kernel.org> (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 14 Dec 2024 08:31:41 -0800 (PST) Received: by mail-ed1-f41.google.com with SMTP id 4fb4d7f45d1cf-5d3cf094768so4203801a12.0 for <linux-doc@vger.kernel.org>; Sat, 14 Dec 2024 08:31:41 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCUqtT9UGdu8qiTTqOfQ+g+9xTjcNpx5wPUVJp8JKMY9hownm1QEtHSebTmS831rx/imTlIT8cb09Uw=@vger.kernel.org X-Received: by 2002:a17:907:2d26:b0:aa6:691f:20a9 with SMTP id a640c23a62f3a-aab778d9db3mr565846766b.4.1734193900694; Sat, 14 Dec 2024 08:31:40 -0800 (PST) Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 References: <cover.1733421037.git.geert+renesas@glider.be> <46b320b91b8d86fade3c1b1c72ef94da85b45d0d.1733421037.git.geert+renesas@glider.be> <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> <Z12sScjRHpB1d0nO@casper.infradead.org> In-Reply-To: <Z12sScjRHpB1d0nO@casper.infradead.org> From: Linus Torvalds <torvalds@linux-foundation.org> Date: Sat, 14 Dec 2024 08:31:24 -0800 X-Gmail-Original-Message-ID: <CAHk-=wgrc4zvZg+Sz_aLmMbaJ6ZHYaJBQ7nzByj2pMZBbh6www@mail.gmail.com> Message-ID: <CAHk-=wgrc4zvZg+Sz_aLmMbaJ6ZHYaJBQ7nzByj2pMZBbh6www@mail.gmail.com> Subject: Re: [PATCH v2 2/2] Increase minimum git commit ID abbreviation to 16 characters To: Matthew Wilcox <willy@infradead.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be>, Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Jonathan Corbet <corbet@lwn.net>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, =?UTF-8?Q?Niklas_S=C3=B6derlund?= 
<niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" On Sat, 14 Dec 2024 at 08:03, Matthew Wilcox <willy@infradead.org> wrote: > > I have wondered about using a different encoding for the sha1. > Classic Ascii85 encoding is no good; it uses characters like '"\< > which interact poorly with every shell. RFC1924 is somewhat better, > but still uses characters that interact poorly with shell. I suspect that the pain would much outweigh the gain. You'd need to teach all tools about the new format, and you'd also need to add some additional format specifying character just to make it unambiguous *which* format you use, since if you just extend the character set you'll have lots of hashes that could be either. And you could disambiguate by testing both and seeing which one works better, but at that point, you're much better off disambiguating the current regular hex format by being a bit smarter about the objects. Using base36 doesn't add enough bits to then make up for such a disambiguation character in practice (ie 11 characters vs 12 - not really noticeable). base62 would be better, but christ does *that* really result in an unreadable jumble. At that point I'd rather see 16-character hex than the complete line noise that is base62. Also, I bet people would start looking for shorthand formats that spell rude words. You are kind of limited with hex, and sometimes that's an advantage. 
Linus From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA58035956; Wed, 18 Dec 2024 23:36:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734564997; cv=none; b=PmouoYug8EOIWWsKH4AQxP6Tj7XHqFXutX/JXI9ESl6b/KhBQ+pGfhMX36sWY6ZK84EI0GYR3nGGPjDtWJUfik1/DKSelWQvw0hbsYqLhGpIEBHbq2tbSgsADfsRctqXKx0L3XpeDoCu8ncRBQWCmMplyhIUpdXyh75y65uZwyg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734564997; c=relaxed/simple; bh=zBI3FaL6zIRdIHdBt5nTkhdL7dh2m7MZsR0WBVjYNhs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=nlJjgfEqxrGF5fgpnbN+BOoe/OSQeSx+FPTFEgzP2xjQvucrL78+CHfHcTKkUn2AZl1XVCUJeIz4qlur1GjdhAIuGMgM8+Ct2qbVa+Hu7KRDUx80RE7Gp8BcphFA/nBg5vaLPNbdO7RJzIpg4BIsnaMS6u74OVQHIQTvc1mohiw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=U1IAPvu2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="U1IAPvu2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6F7E6C4CECD; Wed, 18 Dec 2024 23:36:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734564997; bh=zBI3FaL6zIRdIHdBt5nTkhdL7dh2m7MZsR0WBVjYNhs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=U1IAPvu2abOxV3XX1UP+HFcDbm/zHErp4nhzBGACK/6X3MSU+KuNv1WVaMN4puWwz OsTCo7u9YpKw6fVo6hbTcJ7ddyGTxMySzAG5LPAZMPwVWW3viC+oqjltAqRYpCxaJy d42FbZpjqcXsVagNq7h96aH9/NGqob3/vsmJaBSoBuGQx7iK/eNhEdF8hZCMk/OUvF 6PnMxWE+WzxJOiKgfS/HILExKQalnNZD/8HgrsFAjQ+vNEBmaJg79KjIfWf1572Fl8 
f4tlnsoCaYjGL9N20R3z1jXTQSLTMSvpnDzcd8T8LvsHF1I4hzsCN9nOAZ1So2/7Ma dwEFdnQWEB1xg== From: Sasha Levin <sashal@kernel.org> To: torvalds@linux-foundation.org Cc: apw@canonical.com, conor@kernel.org, corbet@lwn.net, dwaipayanray1@gmail.com, geert+renesas@glider.be, gitster@pobox.com, horms@kernel.org, joe@perches.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux@leemhuis.info, lukas.bulwahn@gmail.com, miguel.ojeda.sandonis@gmail.com, niklas.soderlund@corigine.com, workflows@vger.kernel.org, Sasha Levin <sashal@kernel.org> Subject: [RFC] git-disambiguate: disambiguate shorthand git ids Date: Wed, 18 Dec 2024 18:36:13 -0500 Message-Id: <20241218233613.219345-1-sashal@kernel.org> X-Mailer: git-send-email 2.39.5 In-Reply-To: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> References: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sometimes long commit hashes can be ambiguous even when providing several digits from its abbreviation. Add a script that resolves such ambiguity by also considering the commit subject line. This also allows users to use shorter commit ID prefixes than normally required, since we can correctly identify the intended commit using the subject line as additional context. In force mode (--force), you can even omit a valid commit ID prefix entirely - the script will try to find a commit matching just the subject line. 
Signed-off-by: Sasha Levin <sashal@kernel.org> --- scripts/git-disambiguate.sh | 163 ++++++++++++++++++++++++++++++++++++ 1 file changed, 163 insertions(+) create mode 100755 scripts/git-disambiguate.sh diff --git a/scripts/git-disambiguate.sh b/scripts/git-disambiguate.sh new file mode 100755 index 000000000000..86063dd0fd2c --- /dev/null +++ b/scripts/git-disambiguate.sh @@ -0,0 +1,163 @@ +#!/bin/bash + +usage() { + echo "Usage: $(basename "$0") [--selftest] [--force] <commit-id> [commit-subject]" + echo "Disambiguates a short git commit ID to its full SHA-1 hash." + echo "" + echo "Arguments:" + echo " --selftest Run self-tests" + echo " --force Try to find commit by subject if ID lookup fails" + echo " commit-id Short git commit ID to disambiguate" + echo " commit-subject Optional commit subject to help disambiguate between multiple matches" + exit 1 +} + +git_full_id() { + local force=0 + if [ "$1" = "--force" ]; then + force=1 + shift + fi + + # Split input into commit ID and subject + local input="$*" + local commit_id="${input%% *}" + local subject="" + + # Extract subject if present (everything after the first space) + if [[ "$input" == *" "* ]]; then + subject="${input#* }" + # Strip the ("...") quotes if present + subject="${subject#*(\"}" + subject="${subject%\")*}" + fi + + # Get all possible matching commit IDs + local matches + readarray -t matches < <(git rev-parse --disambiguate="$commit_id" 2>/dev/null) + + # Return immediately if we have exactly one match + if [ ${#matches[@]} -eq 1 ]; then + echo "${matches[0]}" + return 0 + fi + + # If no matches and not in force mode, return failure + if [ ${#matches[@]} -eq 0 ] && [ $force -eq 0 ]; then + return 1 + fi + + # If we have a subject, try to find a match with that subject + if [ -n "$subject" ]; then + # In force mode, search all commits if no ID matches found + if [ ${#matches[@]} -eq 0 ]; then + local match + match=$(git log --format="%H %s" --grep="$subject" --fixed-strings | grep -F -m 1 " 
$subject" | cut -d' ' -f1) + if [ -n "$match" ]; then + echo "$match" + return 0 + fi + else + # Normal subject matching for existing matches + for match in "${matches[@]}"; do + if [ "$(git log -1 --format="%s" "$match")" = "$subject" ]; then + echo "$match" + return 0 + fi + done + fi + fi + + # No match found + return 1 +} + +run_selftest() { + local test_cases=( + '00250b5 ("MAINTAINERS: add new Rockchip SoC list")' + '0037727 ("KVM: selftests: Convert xen_shinfo_test away from VCPU_ID")' + 'ffef737 ("net/tls: Fix skb memory leak when running kTLS traffic")' + '12345678' # Non-existent commit + '12345 ("I'\''m a dummy commit")' # Valid prefix but wrong subject + '--force 99999999 ("net/tls: Fix skb memory leak when running kTLS traffic")' # Force mode with non-existent ID but valid subject + ) + + local expected=( + "00250b529313d6262bb0ebbd6bdf0a88c809f6f0" + "0037727b3989c3fe1929c89a9a1dfe289ad86f58" + "ffef737fd0372ca462b5be3e7a592a8929a82752" + "" # Expect empty output for non-existent commit + "" # Expect empty output for wrong subject + "ffef737fd0372ca462b5be3e7a592a8929a82752" # Should find commit by subject in force mode + ) + + local expected_exit_codes=( + 0 + 0 + 0 + 1 # Expect failure for non-existent commit + 1 # Expect failure for wrong subject + 0 # Should succeed in force mode + ) + + local failed=0 + + echo "Running self-tests..." + for i in "${!test_cases[@]}"; do + # Capture both output and exit code + local result + result=$(git_full_id ${test_cases[$i]}) # Removed quotes to allow --force to be parsed + local exit_code=$? 
+ + # Check both output and exit code + if [ "$result" != "${expected[$i]}" ] || [ $exit_code != ${expected_exit_codes[$i]} ]; then + echo "Test case $((i+1)) FAILED" + echo "Input: ${test_cases[$i]}" + echo "Expected output: '${expected[$i]}'" + echo "Got output: '$result'" + echo "Expected exit code: ${expected_exit_codes[$i]}" + echo "Got exit code: $exit_code" + failed=1 + else + echo "Test case $((i+1)) PASSED" + fi + done + + if [ $failed -eq 0 ]; then + echo "All tests passed!" + exit 0 + else + echo "Some tests failed!" + exit 1 + fi +} + +# Check for selftest +if [ "$1" = "--selftest" ]; then + run_selftest + exit $? +fi + +# Handle --force flag +force="" +if [ "$1" = "--force" ]; then + force="--force" + shift +fi + +# Verify arguments +if [ $# -eq 0 ]; then + usage +fi + +# Skip validation in force mode +if [ -z "$force" ]; then + # Validate that the first argument matches at least one git commit + if [ $(git rev-parse --disambiguate="$1" 2>/dev/null | wc -l) -eq 0 ]; then + echo "Error: '$1' does not match any git commit" + exit 1 + fi +fi + +git_full_id $force "$@" +exit $? 
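For readers following the parsing step in git_full_id(), here is a minimal standalone sketch of how the 'ID ("subject")' input is split using only bash parameter expansion. The helper name parse_ref is hypothetical and not part of the patch; the expansions themselves mirror the ones in the script above. No git repository is needed to run it.

```shell
#!/bin/bash
# Sketch only: parse_ref is a hypothetical name used for illustration.
# It reproduces the splitting logic from git_full_id() in the patch.
parse_ref() {
    local input="$*"
    local commit_id="${input%% *}"   # everything before the first space
    local subject=""

    if [[ "$input" == *" "* ]]; then
        subject="${input#* }"        # everything after the first space
        subject="${subject#*(\"}"    # strip the leading ("
        subject="${subject%\")*}"    # strip the trailing ")
    fi

    # Print the ID on the first line and the subject (possibly empty) on the second
    printf '%s\n%s\n' "$commit_id" "$subject"
}

parse_ref 'ffef737 ("net/tls: Fix skb memory leak when running kTLS traffic")'
parse_ref 12345678
```

When the input has no space, the subject stays empty. In the script that case is resolved purely by what git rev-parse --disambiguate returns for the prefix.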
-- 2.39.5 From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D84D670809; Thu, 19 Dec 2024 01:42:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734572522; cv=none; b=ixxiQJJsyfA4RnSTm6sbuiZ7FffNIvhY/zuZmnOAgqH7JO5ykQ/M6HoUHw4YchTsZoY59UPl8/RwfjHs+TUIepdBPq4AW2RhgowNJLS0An2fvdrH9dizMjvGdFx5OHQ3SY6prqX4OZmsz4vhDEz2bANVbA6ojDvd4GjRV2CZTXQ= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734572522; c=relaxed/simple; bh=aWRuniCSqOPimypDT24mnho4Pd9irB2YTppR8yLwUc4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=TRixFjuPzg2+2tRDxFO31ZTQYE5R9j420eNCaiIDk8WBfVela3ut7SVf37+/imwv4roZv539b1kebWytAApvrSCfNlxplb/7xGDGvPqDTAlh7CliqLQJCDP+JF29TxE1PwUz3cKkmVZdXkhMpkGeipbwmiRCKLVk1/LQzqBy5Io= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=bONZwgPj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="bONZwgPj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6248DC4CEDD; Thu, 19 Dec 2024 01:42:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734572521; bh=aWRuniCSqOPimypDT24mnho4Pd9irB2YTppR8yLwUc4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=bONZwgPj5rxYnVbHbFKZxkyofo5xrwR1J/ordzmMsqJuytvCn3xR2Lh2Q433pYvdr Tz0D3cL4cRjGVFX2o0MGxUp5uzir5KYf9MNR4mm3BTXsk/LMxvwHYtH4FIrH/UwV6i ZMJB76SUAbS6LYjK2cDr0OVaQoLCfJXuSoJmV24pnIpL8MEJdCDvtaHU2iTgj9etzV 
zsE642dp6xMr0UZoXPS+T4xsXwbh6PguMG6jI2ggn+vIw3uTmuod5nib4YnRiMUjFw UNUMKi4MBG5hYSXLVIU5eWBMXnFLF4QTnSPB9spw+fBEcPCx3GlkEA3zJeM4QzRqb0 WN5Ju+sNjmXyg== Date: Wed, 18 Dec 2024 17:41:58 -0800 From: Kees Cook <kees@kernel.org> To: Sasha Levin <sashal@kernel.org> Cc: torvalds@linux-foundation.org, apw@canonical.com, conor@kernel.org, corbet@lwn.net, dwaipayanray1@gmail.com, geert+renesas@glider.be, gitster@pobox.com, horms@kernel.org, joe@perches.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux@leemhuis.info, lukas.bulwahn@gmail.com, miguel.ojeda.sandonis@gmail.com, niklas.soderlund@corigine.com, workflows@vger.kernel.org Subject: Re: [RFC] git-disambiguate: disambiguate shorthand git ids Message-ID: <202412181739.0170E86E58@keescook> References: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> <20241218233613.219345-1-sashal@kernel.org> Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20241218233613.219345-1-sashal@kernel.org> On Wed, Dec 18, 2024 at 06:36:13PM -0500, Sasha Levin wrote: > Sometimes long commit hashes can be ambiguous even when providing > several digits from its abbreviation. For testing, please see: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=dev/collide/v6.13-rc2/12-char > scripts/git-disambiguate.sh | 163 ++++++++++++++++++++++++++++++++++++ sfr has a bunch of logic in his "check_fixes" script that we might want to consolidate into here. 
I have an updated copy here: https://github.com/kees/kernel-tools/blob/trunk/helpers/check_fixes -Kees -- Kees Cook From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB51019F13B; Thu, 19 Dec 2024 20:35:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734640538; cv=none; b=hiqH9saUMn45JDwYpxCSUjBPAZLfvHv5g4VoadxZBgQON37xV0SjOZSwCZ3808l5TtJkqhdB6EM+IxkCDkr4x5Vl/h+0bMU2IQZWltGChb5Y1lBqKjqbkG4vn46dwWt77P8p6WJ5Tsx86NeO4k9rm5FkwyfwgLMuM9Lpuu2F+pc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734640538; c=relaxed/simple; bh=nsChvgsBvN5PC9wB5iyc+9GEU9Dq/vGvGTgheNyhyS4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=jsHHLXISbYaB5uNKpCNjWGrQByZl82wRo63m6gAZpTcSbV3J6+NKMLa28rCFTvVnuJa7Estbs4XTZ9/0g4ZDnnXVtlmqq7D+yaCNrqQ6Yzg3bmLLJFvsYaLMCF7c9HtCebD5GMKh/gsz26kr8av2dBq1AWzJZf+7AbsrmsFAqSY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dvRNMeht; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dvRNMeht" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 37DADC4CED0; Thu, 19 Dec 2024 20:35:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734640537; bh=nsChvgsBvN5PC9wB5iyc+9GEU9Dq/vGvGTgheNyhyS4=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=dvRNMehtn0Sgs7wNk38il/udSjkYlgaNr7gKA4d1JFXgWr1lwogFsJOJZnD71+8gv 
E1hsmvjMz5QCXcBAmn/z5WkGz6zUXLGHcsmbrWVS5W7B/VCoUQ9uJ9XQozq7ZcYpy8 8l+/NkyEyLqUcR+vbrv8cg33amWkv15rKj+K9FNep6o+B30F7Nry6WhwZpLbVGz/D/ bqjbqISwUd3EZDZWi6Jkw/sBDAnlJFo7DhAziZD3tyqLZLV6tyjfpgrNeYnykxv/yS /7XN9d20gxGGVBl/q2y1eJhJa4uW2lRqqqBIO1QgCTm/AyWbt4rQT/SAy8RR9jtokM ZpiHQt62hovqQ== Date: Thu, 19 Dec 2024 15:35:35 -0500 From: Sasha Levin <sashal@kernel.org> To: Kees Cook <kees@kernel.org> Cc: torvalds@linux-foundation.org, apw@canonical.com, conor@kernel.org, corbet@lwn.net, dwaipayanray1@gmail.com, geert+renesas@glider.be, gitster@pobox.com, horms@kernel.org, joe@perches.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux@leemhuis.info, lukas.bulwahn@gmail.com, miguel.ojeda.sandonis@gmail.com, niklas.soderlund@corigine.com, workflows@vger.kernel.org Subject: Re: [RFC] git-disambiguate: disambiguate shorthand git ids Message-ID: <Z2SDl423NkL1QCIS@lappy> References: <CAHk-=wiwAz3UgPOWK3RdGXDnTRHcwVbxpuxCQt_0SoAJC-oGXQ@mail.gmail.com> <20241218233613.219345-1-sashal@kernel.org> <202412181739.0170E86E58@keescook> Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <202412181739.0170E86E58@keescook> On Wed, Dec 18, 2024 at 05:41:58PM -0800, Kees Cook wrote: >On Wed, Dec 18, 2024 at 06:36:13PM -0500, Sasha Levin wrote: >> Sometimes long commit hashes can be ambiguous even when providing >> several digits from its abbreviation. > >For testing, please see: >https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=dev/collide/v6.13-rc2/12-char > >> scripts/git-disambiguate.sh | 163 ++++++++++++++++++++++++++++++++++++ > >sfr has a bunch of logic in his "check_fixes" script that we might want >to consolidate into here. 
I have an updated copy here: >https://github.com/kees/kernel-tools/blob/trunk/helpers/check_fixes Thanks! I'll look into it. -- Thanks, Sasha From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from ms.lwn.net (ms.lwn.net [45.79.88.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA1CDEAD0; Mon, 30 Dec 2024 18:43:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.79.88.28 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735584221; cv=none; b=NumVuAFdyv7ubPRnNuC7SYuFrSVgAyyhNM+yQA58JeKDmK9ev22g4DtugwFl2CJN+h8pBmWKvdmtx85lpmxbTbOeGzFbSyxYFXS/LE69cWWyIgfGIb7dJEnD2PEGLv01Pjx6NiQNddv/T2ddy9OUzRwKdaPiUIW3OnGVs55XwPw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735584221; c=relaxed/simple; bh=Are0vYEs7wt/pMB5GHUqTxitsZRF+9fSy3nbp6/VI6A=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=FRUnhSifK8dIscOLX3g57TIKm1ONMzVF2ZdoOudU6bAyiU+LF8sORRJY+ENAsELaNDBHdKStth7lymA651CX/o/eMdYpLGjUo5J6t7l928oqunor2gwmkJy4+z38KetNLP/O+qoMEgoOLn5d99PlKFCcplgk986lRwROmuXTiZE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=lwn.net; spf=pass smtp.mailfrom=lwn.net; dkim=pass (2048-bit key) header.d=lwn.net header.i=@lwn.net header.b=flnJP/SD; arc=none smtp.client-ip=45.79.88.28 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=lwn.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=lwn.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=lwn.net header.i=@lwn.net header.b="flnJP/SD" DKIM-Filter: OpenDKIM Filter v2.11.0 ms.lwn.net CA833404EF DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=lwn.net; s=20201203; t=1735584219; bh=VgeiIkZFH7IwsgLy59oFNWjYw6Iz4wAMcSAfEkw8z/Y=; 
h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=flnJP/SDqVmvCvN5NB7Td5fuLKxbre9fJRcOpQuayWKfPYlqOS5fJyRjdx35Cbi4o YHeBA0zKWWmbYTBoMyfRCxHLxTQW9ZX4OJKdY4EiVdW5H+fGT8r7ZANfH6xXPsuJjv x5KaVsOgpnD2hhrZ5DFIkQL/Ka+SVAggFNtqkN+8r+/Ka2DFWfIXkE4Hj7r9/FL31E 6f6Ru7gOoesBijYh96xkGLnWrd0ACZRFgwSVVUcyr1ApEZbBWJ7JpysXvsP8rPS3p/ gTLYTavH5MT0HEzCENBTA9KYxc44KOG19bQSWceu0ktHeFCLpNB0HTd1Y+lvwQbPws bX+i30qrNZfYg== Received: from localhost (unknown [IPv6:2601:280:5e00:625::1fe]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ms.lwn.net (Postfix) with ESMTPSA id CA833404EF; Mon, 30 Dec 2024 18:43:38 +0000 (UTC) From: Jonathan Corbet <corbet@lwn.net> To: Geert Uytterhoeven <geert+renesas@glider.be>, Dwaipayan Ray <dwaipayanray1@gmail.com>, Lukas Bulwahn <lukas.bulwahn@gmail.com>, Joe Perches <joe@perches.com>, Thorsten Leemhuis <linux@leemhuis.info>, Andy Whitcroft <apw@canonical.com>, Niklas =?utf-8?Q?S=C3=B6derlund?= <niklas.soderlund@corigine.com>, Simon Horman <horms@kernel.org>, Conor Dooley <conor@kernel.org>, Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>, Linus Torvalds <torvalds@linux-foundation.org> Cc: Junio C Hamano <gitster@pobox.com>, workflows@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Geert Uytterhoeven <geert+renesas@glider.be> Subject: Re: [PATCH v2 1/2] Align git commit ID abbreviation guidelines and checks In-Reply-To: <1c244040bf6ce304656e31036e5178b4b9dfb719.1733421037.git.geert+renesas@glider.be> References: <cover.1733421037.git.geert+renesas@glider.be> <1c244040bf6ce304656e31036e5178b4b9dfb719.1733421037.git.geert+renesas@glider.be> Date: Mon, 30 Dec 2024 11:43:37 -0700 Message-ID: <87msgd106e.fsf@trenco.lwn.net> Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: <linux-doc.vger.kernel.org> List-Subscribe: <mailto:linux-doc+subscribe@vger.kernel.org> List-Unsubscribe: 
<mailto:linux-doc+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Type: text/plain Geert Uytterhoeven <geert+renesas@glider.be> writes: > The guidelines for git commit ID abbreviation are inconsistent: some > places state to use 12 characters exactly, while other places recommend > 12 characters or more. The same issue is present in the checkpatch.pl > script. > > E.g. Documentation/dev-tools/checkpatch.rst says: > > **GIT_COMMIT_ID** > The proper way to reference a commit id is: > commit <12+ chars of sha1> ("<title line>") > > However, scripts/checkpatch.pl has two different checks: one warning > check accepting 12 characters exactly: > > # Check Fixes: styles is correct > Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' > > and a second error check accepting 12-40 characters: > > # Check for git id commit length and improperly formed commit descriptions > # A correctly formed commit description is: > # commit <SHA-1 hash length 12+ chars> ("Complete commit subject") > Please use git commit description style 'commit <12+ chars of sha1> > > Hence patches containing commit IDs with more than 12 characters are > flagged by checkpatch, and sometimes rejected by maintainers or > reviewers. This is becoming more important with the growth of the > repository, as git may decide to use more characters in case of local > conflicts. > > Fix this by settling on at least 12 characters, in both the > documentation and in the checkpatch.pl script. > > Fixes: bd17e036b495bebb ("checkpatch: warn for non-standard fixes tag style") > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> > --- > v2: > - Rebase on top of commit 2f07b652384969f5 ("checkpatch: always parse > orig_commit in fixes tag") in v6.13-rc1, > - Update documentation, too. 
> --- > Documentation/process/maintainer-tip.rst | 2 +- > Documentation/process/submitting-patches.rst | 8 ++++---- > scripts/checkpatch.pl | 4 ++-- > 3 files changed, 7 insertions(+), 7 deletions(-) So, while the other patch in this series raised some eyebrows, nobody has complained about this one. Consistency and clarity are good, so I've applied this one, thanks. jon
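As an illustration of the "at least 12 characters" rule settled on in the quoted patch, the sketch below checks a Fixes: line against a 12-to-40 hex-digit ID followed by a quoted title. This is an assumption-laden simplification and not the actual check in scripts/checkpatch.pl (which is written in Perl and handles more cases); the function name check_fixes_tag and the exact regular expression are hypothetical.

```shell
#!/bin/bash
# Sketch only: check_fixes_tag and this regex are illustrative assumptions,
# not the logic used by scripts/checkpatch.pl.
check_fixes_tag() {
    local line="$1"
    # 12 to 40 lowercase hex digits, then ("<title line>")
    local re='^Fixes: [0-9a-f]{12,40} \(".*"\)$'
    [[ "$line" =~ $re ]]
}

if check_fixes_tag 'Fixes: bd17e036b495bebb ("checkpatch: warn for non-standard fixes tag style")'; then
    echo "accepted: 12 or more characters"
fi

if ! check_fixes_tag 'Fixes: bd17e03 ("checkpatch: warn for non-standard fixes tag style")'; then
    echo "rejected: fewer than 12 characters"
fi
```

The 16-character ID from the patch's own Fixes: tag passes under this rule, whereas a check that required exactly 12 characters would have flagged it.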