From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH 5/5] patch-id: tighten code to detect the patch header
Date: Fri, 21 Jun 2024 16:18:26 -0700
Message-ID: <20240621231826.3280338-6-gitster@pobox.com>
In-Reply-To: <20240621231826.3280338-1-gitster@pobox.com>

The get_one_patchid() function unconditionally takes a line that
matches the patch header (namely, a line that begins with a full
object name, possibly prefixed by "commit" or "From" plus a space)
as the beginning of a patch, even when it is *not* looking for one
(namely, when the previous call found the patch header and returned,
and then we are called again to skip the log message and process the
patch whose header was found by the previous invocation).

As a consequence, a line in the commit log message that begins with
one of these patterns can be mistaken for the start of another patch,
with the current message entirely skipped (because we haven't even
reached the patch at all).

Allow the caller to tell us if it called us already and saw the
patch header (in which case we shouldn't be looking for another one
until we see the "diff" part of the patch; instead we should simply
be skipping these lines as part of the commit log message), and skip
the header processing logic when that is the case. The helper
function also needs to flip this "are we looking for a header?" bit
once it has finished skipping the commit log message and started
processing the patches, as the patch header of the _next_ message is
the only clue in the input that the current patch is done.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
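The flag handling described in the log message can be illustrated with a
small standalone sketch. This is not the code in builtin/patch-id.c: it
does no hashing, it assumes 40-hex-digit object names, and every name in
it (parse_header(), scan_one(), first_patch_owner(), check_sample(),
OID_A, OID_B, the "tight" switch) is hypothetical. It only shows which
object name ends up attributed to a patch when the log message contains
a line that looks like a patch header.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define HEXSZ 40 /* assumption: SHA-1 sized hex object names */
#define OID_A "0123456789" "0123456789" "0123456789" "0123456789"
#define OID_B "abcdef0123" "abcdef0123" "abcdef0123" "abcdef0123"

/* Accept "<oid>", "commit <oid>" or "From <oid>"; copy the oid out. */
static int parse_header(const char *line, char *oid)
{
	size_t i;

	if (!strncmp(line, "commit ", 7))
		line += 7;
	else if (!strncmp(line, "From ", 5))
		line += 5;
	for (i = 0; i < HEXSZ; i++)
		if (!isxdigit((unsigned char)line[i]))
			return 0; /* also stops at the terminating NUL */
	memcpy(oid, line, HEXSZ);
	oid[HEXSZ] = '\0';
	return 1;
}

/* Scan one patch; return the number of diff lines consumed. */
static int scan_one(const char **lines, int n, int *pos,
		    int find_header, char *next_oid)
{
	int patchlen = 0;

	while (*pos < n) {
		const char *line = lines[(*pos)++];

		if (find_header && parse_header(line, next_oid))
			break; /* next patch begins here */
		/* Still in the log message: skip until the diff starts. */
		if (!patchlen && strncmp(line, "diff ", 5))
			continue;
		find_header = 1; /* past the log message */
		patchlen++;
	}
	return patchlen;
}

/* Store in "owner" the oid that the first non-empty patch is credited to. */
static int first_patch_owner(const char **lines, int n, int tight, char *owner)
{
	char next[HEXSZ + 1] = "";
	int pos = 0, find_header = 1;

	owner[0] = '\0';
	while (pos < n) {
		if (scan_one(lines, n, &pos, find_header, next))
			return 1;
		strcpy(owner, next);
		if (tight)
			find_header = 0; /* header already seen by caller */
	}
	return 0;
}

static const char *sample[] = {
	"From " OID_A " Mon Sep 17 00:00:00 2001",
	"Subject: [PATCH] demo",
	"",
	"commit " OID_B " is bad",
	"---",
	"diff --git a/f b/f",
	"+hello",
};

static void check_sample(void)
{
	char owner[HEXSZ + 1];

	/* Old behaviour: the log line steals the patch. */
	assert(first_patch_owner(sample, 7, 0, owner) && !strcmp(owner, OID_B));
	/* Tightened behaviour: the real header keeps it. */
	assert(first_patch_owner(sample, 7, 1, owner) && !strcmp(owner, OID_A));
}
```

Calling check_sample() exercises both modes on the same input. With
tight set to 0 the "commit ... is bad" line in the log message is taken
as a new header, so the diff is credited to OID_B; with tight set to 1
that line is skipped as part of the log message and the diff stays with
OID_A.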
 builtin/patch-id.c  | 43 ++++++++++++++++++++++++++++++-------------
 t/t4204-patch-id.sh | 17 +++++++++++++++++
 2 files changed, 47 insertions(+), 13 deletions(-)
diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index a649966f31..0e6aab1ca2 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -60,12 +60,14 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after)

 #define GOPID_STABLE 01
 #define GOPID_VERBATIM 02
+#define GOPID_FIND_HEADER 04

 static int get_one_patchid(struct object_id *next_oid, struct object_id *result,
 			   struct strbuf *line_buf, unsigned flags)
 {
 	int stable = flags & GOPID_STABLE;
 	int verbatim = flags & GOPID_VERBATIM;
+	int find_header = flags & GOPID_FIND_HEADER;
 	int patchlen = 0, found_next = 0;
 	int before = -1, after = -1;
 	int diff_is_binary = 0;
@@ -81,26 +83,39 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result,
 		int len;

 		/*
-		 * If we see a line that begins with "<object name>",
-		 * "commit <object name>" or "From <object name>", it is
-		 * the beginning of a patch. Return to the caller, as
-		 * we are done with the one we have been processing.
+		 * The caller hasn't seen us find a patch header and
+		 * return to it, or we have started processing patch
+		 * and may encounter the beginning of the next patch.
 		 */
-		if (skip_prefix(line, "commit ", &p))
-			;
-		else if (skip_prefix(line, "From ", &p))
-			;
-		if (!get_oid_hex(p, next_oid)) {
-			if (verbatim)
-				the_hash_algo->update_fn(&ctx, line, strlen(line));
-			found_next = 1;
-			break;
+		if (find_header) {
+			/*
+			 * If we see a line that begins with "<object name>",
+			 * "commit <object name>" or "From <object name>", it is
+			 * the beginning of a patch. Return to the caller, as
+			 * we are done with the one we have been processing.
+			 */
+			if (skip_prefix(line, "commit ", &p))
+				;
+			else if (skip_prefix(line, "From ", &p))
+				;
+			if (!get_oid_hex(p, next_oid)) {
+				if (verbatim)
+					the_hash_algo->update_fn(&ctx, line, strlen(line));
+				found_next = 1;
+				break;
+			}
 		}

 		/* Ignore commit comments */
 		if (!patchlen && !starts_with(line, "diff "))
 			continue;

+		/*
+		 * We are past the commit log message. Prepare to
+		 * stop at the beginning of the next patch header.
+		 */
+		find_header = 1;
+
 		/* Parsing diff header? */
 		if (before == -1) {
 			if (starts_with(line, "GIT binary patch") ||
@@ -196,11 +211,13 @@ static void generate_id_list(unsigned flags)
 	struct strbuf line_buf = STRBUF_INIT;

 	oidclr(&oid);
+	flags |= GOPID_FIND_HEADER;
 	while (!feof(stdin)) {
 		patchlen = get_one_patchid(&n, &result, &line_buf, flags);
 		if (patchlen)
 			flush_current_id(&oid, &result);
 		oidcpy(&oid, &n);
+		flags &= ~GOPID_FIND_HEADER;
 	}
 	strbuf_release(&line_buf);
 }
diff --git a/t/t4204-patch-id.sh b/t/t4204-patch-id.sh
index 1627fdda1b..b1d98d4110 100755
--- a/t/t4204-patch-id.sh
+++ b/t/t4204-patch-id.sh
@@ -137,6 +137,23 @@ test_expect_success 'patch-id computes the same for various formats' '
 	test_cmp actual expect
 '

+hash=$(git rev-parse same:)
+for cruft in "$hash" "commit $hash is bad" "From $hash status"
+do
+	test_expect_success "patch-id with <$cruft> in log message" '
+		git format-patch -1 --stdout same >patch-0 &&
+		git patch-id <patch-0 >expect &&
+
+		{
+			sed -e "/^$/q" patch-0 &&
+			printf "random message\n%s\n\n" "$cruft" &&
+			sed -e "1,/^$/d" patch-0
+		} >patch-cruft &&
+		git patch-id <patch-cruft >actual &&
+		test_cmp actual expect
+	'
+done
+
 test_expect_success 'whitespace is irrelevant in footer' '
 	get_patch_id main &&
 	git checkout same &&
--
2.45.2-786-g49444cbe9a