git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH v2 4/5] patch-id: rewrite code that detects the beginning of a patch
Date: Mon, 29 Jul 2024 18:17:37 -0700	[thread overview]
Message-ID: <20240730011738.4032377-5-gitster@pobox.com> (raw)
In-Reply-To: <20240730011738.4032377-1-gitster@pobox.com>

The get_one_patchid() function reads input lines until it finds a
patch header (the line that begins a patch), whose beginning is one
of:

 (1) an "<object name>", which is what "git diff-tree --stdin" shows;
 (2) "commit <object name>", which is what "git log" shows; or
 (3) "From <object name>",  which is what "git log --format=email" shows.

When it finds such a line, it returns to the caller, reporting the
<object name> it found, and the size of the "patch" it processed.

The caller then calls the function again, which then ignores the
commit log message, and then processes the lines in the patch part
until it hits another "beginning of a patch".

The above logic was fairly easy to see until 2bb73ae8 (patch-id: use
starts_with() and skip_prefix(), 2016-05-28) reorganized the code,
which made another logic that has nothing to do with the "where does
the next patch begin?" logic, which came from 2485eab5
(git-patch-id: do not trip over "no newline" markers, 2011-02-17)
that ignores the "\ No newline at the end", rolled into the same
single if() statement.

Let's split it out.  The "\ No newline at the end" marker is part of
the patch, should not appear before we start reading the patch part,
and does not belong to the detection of patch header.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/patch-id.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index 1d3b7ff12b..a2fdc0505d 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -85,16 +85,19 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result,
 		const char *p = line;
 		int len;
 
-		/* Possibly skip over the prefix added by "log" or "format-patch" */
-		if (!skip_prefix(line, "commit ", &p) &&
-		    !skip_prefix(line, "From ", &p) &&
-		    starts_with(line, "\\ ") && 12 < strlen(line)) {
+		/*
+		 * If we see a line that begins with "<object name>",
+		 * "commit <object name>" or "From <object name>", it is
+		 * the beginning of a patch.  Return to the caller, as
+		 * we are done with the one we have been processing.
+		 */
+		if (skip_prefix(line, "commit ", &p))
+			;
+		else if (skip_prefix(line, "From ", &p))
+			;
+		if (!get_oid_hex(p, next_oid)) {
 			if (verbatim)
 				the_hash_algo->update_fn(&ctx, line, strlen(line));
-			continue;
-		}
-
-		if (!get_oid_hex(p, next_oid)) {
 			found_next = 1;
 			break;
 		}
@@ -135,6 +138,16 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result,
 				break;
 		}
 
+		/*
+		 * A hunk about an incomplete line may have this
+		 * marker at the end, which should just be ignored.
+		 */
+		if (starts_with(line, "\\ ") && 12 < strlen(line)) {
+			if (verbatim)
+				the_hash_algo->update_fn(&ctx, line, strlen(line));
+			continue;
+		}
+
 		if (diff_is_binary) {
 			if (starts_with(line, "diff ")) {
 				diff_is_binary = 0;
-- 
2.46.0-69-gd0749fd195


  parent reply	other threads:[~2024-07-30  1:17 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-21 23:18 [PATCH 0/5] Tighten patch header parsing in patch-id Junio C Hamano
2024-06-21 23:18 ` [PATCH 1/5] t4204: patch-id supports various input format Junio C Hamano
2024-06-21 23:18 ` [PATCH 2/5] patch-id: call flush_current_id() only when needed Junio C Hamano
2024-06-21 23:18 ` [PATCH 3/5] patch-id: make get_one_patchid() more extensible Junio C Hamano
2024-07-29 12:02   ` Patrick Steinhardt
2024-07-29 20:03     ` Junio C Hamano
2024-06-21 23:18 ` [PATCH 4/5] patch-id: rewrite code that detects the beginning of a patch Junio C Hamano
2024-07-29 12:03   ` Patrick Steinhardt
2024-06-21 23:18 ` [PATCH 5/5] patch-id: tighten code to detect the patch header Junio C Hamano
2024-07-29 12:07   ` Patrick Steinhardt
2024-07-29 20:12     ` Junio C Hamano
2024-07-30  4:55       ` Patrick Steinhardt
2024-07-30  5:12         ` Patrick Steinhardt
2024-07-30  1:17 ` [PATCH v2 0/5] Tighten patch header parsing in patch-id Junio C Hamano
2024-07-30  1:17   ` [PATCH v2 1/5] t4204: patch-id supports various input format Junio C Hamano
2024-07-30  1:17   ` [PATCH v2 2/5] patch-id: call flush_current_id() only when needed Junio C Hamano
2024-07-30  1:17   ` [PATCH v2 3/5] patch-id: make get_one_patchid() more extensible Junio C Hamano
2024-07-30  1:17   ` Junio C Hamano [this message]
2024-07-30  1:17   ` [PATCH v2 5/5] patch-id: tighten code to detect the patch header Junio C Hamano
2024-07-30  5:12   ` [PATCH v2 0/5] Tighten patch header parsing in patch-id Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240730011738.4032377-5-gitster@pobox.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).