All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jerry Zhang via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Jerry Zhang <jerry@skydio.com>
Subject: [PATCH v3 0/7] patch-id fixes and improvements
Date: Fri, 14 Oct 2022 08:56:37 +0000	[thread overview]
Message-ID: <pull.1359.v3.git.1665737804.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1359.v2.git.1663654859.gitgitgadget@gmail.com>

These patches add fixes and features to the "git patch-id" command, mostly
discovered through our usage of patch-id in the revup project
(https://github.com/Skydio/revup). On top of that I've tried to make general
cleanup changes where I can.

Summary:

1: Fixed a bug in the combination of --stable with binary files and
header-only, and expanded the test to cover both binary and non-binary
files.

2: Switch internal usage of patch-id in rebase / cherry-pick to use the
stable variant to reduce the number of code paths and improve testing for
bugs like above.

3: Fixed bugs with patch-id and binary diffs. Previously patch-id did not
behave correctly for binary diffs regardless of whether "--binary" was given
to "diff".

4: Fixed bugs with patch-id and mode changes. Previously mode changes were
incorrectly excluded from the patch-id.

5: Add a new "--include-whitespace" mode to patch-id that prevents
whitespace from being stripped during id calculation. Also add a config
option for the same behavior.

6: Remove unused prefix from patch-id logic.

7: Update format-patch doc to specify when patch-ids are going to be equal
to those generated by "git patch-id".

V1->V2: Fixed comment style V2->V3: The ---/+++ lines no longer get added to
the patch-id of binary diffs. Also added patches 3-7 in the series.

Signed-off-by: Jerry Zhang jerry@skydio.com

Jerry Zhang (7):
  patch-id: fix stable patch id for binary / header-only
  patch-id: use stable patch-id for rebases
  builtin: patch-id: fix patch-id with binary diffs
  patch-id: fix patch-id for mode changes
  builtin: patch-id: add --include-whitespace as a command mode
  builtin: patch-id: remove unused diff-tree prefix
  documentation: format-patch: clarify requirements for patch-ids to
    match

 Documentation/git-format-patch.txt |   4 +-
 Documentation/git-patch-id.txt     |  25 +++++--
 builtin/log.c                      |   2 +-
 builtin/patch-id.c                 | 114 +++++++++++++++++++++--------
 diff.c                             |  75 +++++++++----------
 diff.h                             |   2 +-
 patch-ids.c                        |  10 +--
 patch-ids.h                        |   2 +-
 t/t3419-rebase-patch-id.sh         |  63 +++++++++++++---
 t/t4204-patch-id.sh                |  95 ++++++++++++++++++++++--
 10 files changed, 291 insertions(+), 101 deletions(-)


base-commit: d420dda0576340909c3faff364cfbd1485f70376
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1359%2Fjerry-skydio%2Fjerry%2Frevup%2Fmaster%2Fpatch_ids-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1359/jerry-skydio/jerry/revup/master/patch_ids-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1359

Range-diff vs v2:

 1:  945508df7b6 ! 1:  7d4c2e91ce0 patch-id: fix stable patch id for binary / header-only
     @@ Commit message
          populates the object ids, and normal which populates the text
          diff. All branches will end up flushing the hunk.
      
     +    Don't populate the ---a/ and +++b/ lines for binary diffs, to correspond
     +    to those lines not being present in the "git diff" text output.
     +    This is necessary because we advertise that the patch-id calculated
     +    internally and used in format-patch is the same that what the
     +    builtin "git patch-id" would produce when piped from a diff.
     +
          Update the test to run on both binary and normal files.
      
          Signed-off-by: Jerry Zhang <jerry@skydio.com>
      
       ## diff.c ##
      @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object_id *oid
     - 			the_hash_algo->update_fn(&ctx, p->two->path, len2);
     + 		if (p->one->mode == 0) {
     + 			patch_id_add_string(&ctx, "newfilemode");
     + 			patch_id_add_mode(&ctx, p->two->mode);
     +-			patch_id_add_string(&ctx, "---/dev/null");
     +-			patch_id_add_string(&ctx, "+++b/");
     +-			the_hash_algo->update_fn(&ctx, p->two->path, len2);
     + 		} else if (p->two->mode == 0) {
     + 			patch_id_add_string(&ctx, "deletedfilemode");
     + 			patch_id_add_mode(&ctx, p->one->mode);
     +-			patch_id_add_string(&ctx, "---a/");
     +-			the_hash_algo->update_fn(&ctx, p->one->path, len1);
     +-			patch_id_add_string(&ctx, "+++/dev/null");
     +-		} else {
     +-			patch_id_add_string(&ctx, "---a/");
     +-			the_hash_algo->update_fn(&ctx, p->one->path, len1);
     +-			patch_id_add_string(&ctx, "+++b/");
     +-			the_hash_algo->update_fn(&ctx, p->two->path, len2);
       		}
       
      -		if (diff_header_only)
     @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object
       			the_hash_algo->update_fn(&ctx, oid_to_hex(&p->two->oid),
       					the_hash_algo->hexsz);
      -			continue;
     +-		}
     +-
     +-		xpp.flags = 0;
     +-		xecfg.ctxlen = 3;
     +-		xecfg.flags = XDL_EMIT_NO_HUNK_HDR;
     +-		if (xdi_diff_outf(&mf1, &mf2, NULL,
     +-				  patch_id_consume, &data, &xpp, &xecfg))
     +-			return error("unable to generate patch-id diff for %s",
     +-				     p->one->path);
      +		} else {
     ++			if (p->one->mode == 0) {
     ++				patch_id_add_string(&ctx, "---/dev/null");
     ++				patch_id_add_string(&ctx, "+++b/");
     ++				the_hash_algo->update_fn(&ctx, p->two->path, len2);
     ++			} else if (p->two->mode == 0) {
     ++				patch_id_add_string(&ctx, "---a/");
     ++				the_hash_algo->update_fn(&ctx, p->one->path, len1);
     ++				patch_id_add_string(&ctx, "+++/dev/null");
     ++			} else {
     ++				patch_id_add_string(&ctx, "---a/");
     ++				the_hash_algo->update_fn(&ctx, p->one->path, len1);
     ++				patch_id_add_string(&ctx, "+++b/");
     ++				the_hash_algo->update_fn(&ctx, p->two->path, len2);
     ++			}
     + 
      +			if (fill_mmfile(options->repo, &mf1, p->one) < 0 ||
      +			    fill_mmfile(options->repo, &mf2, p->two) < 0)
      +				return error("unable to read files to diff");
     @@ diff.c: static int diff_get_patch_id(struct diff_options *options, struct object
      +					  patch_id_consume, &data, &xpp, &xecfg))
      +				return error("unable to generate patch-id diff for %s",
      +					     p->one->path);
     - 		}
     --
     --		xpp.flags = 0;
     --		xecfg.ctxlen = 3;
     --		xecfg.flags = XDL_EMIT_NO_HUNK_HDR;
     --		if (xdi_diff_outf(&mf1, &mf2, NULL,
     --				  patch_id_consume, &data, &xpp, &xecfg))
     --			return error("unable to generate patch-id diff for %s",
     --				     p->one->path);
     --
     ++		}
       		if (stable)
       			flush_one_hunk(oid, &ctx);
       	}
      
       ## t/t3419-rebase-patch-id.sh ##
      @@ t/t3419-rebase-patch-id.sh: test_expect_success 'setup: 500 lines' '
     - 	git cherry-pick main >/dev/null 2>&1
     - '
     + 	git add newfile &&
     + 	git commit -q -m "add small file" &&
     + 
     +-	git cherry-pick main >/dev/null 2>&1
     +-'
     ++	git cherry-pick main >/dev/null 2>&1 &&
       
      -test_expect_success 'setup attributes' '
      -	echo "file binary" >.gitattributes
     --'
     --
     ++	git branch -f squashed main &&
     ++	git checkout -q -f squashed &&
     ++	git reset -q --soft HEAD~2 &&
     ++	git commit -q -m squashed
     + '
     + 
       test_expect_success 'detect upstream patch' '
     - 	git checkout -q main &&
     +-	git checkout -q main &&
     ++	git checkout -q main^{} &&
       	scramble file &&
     + 	git add file &&
     + 	git commit -q -m "change big file again" &&
      @@ t/t3419-rebase-patch-id.sh: test_expect_success 'detect upstream patch' '
     - 	git checkout -q other^{} &&
     - 	git rebase main &&
     - 	git rev-list main...HEAD~ >revs &&
     --	test_must_be_empty revs
     -+	test_must_be_empty revs &&
     + 	test_must_be_empty revs
     + '
     + 
     ++test_expect_success 'detect upstream patch binary' '
      +	echo "file binary" >.gitattributes &&
      +	git checkout -q other^{} &&
      +	git rebase main &&
      +	git rev-list main...HEAD~ >revs &&
      +	test_must_be_empty revs &&
     -+	rm .gitattributes
     - '
     - 
     ++	test_when_finished "rm .gitattributes"
     ++'
     ++
       test_expect_success 'do not drop patch' '
     -@@ t/t3419-rebase-patch-id.sh: test_expect_success 'do not drop patch' '
     - 	git commit -q -m squashed &&
     +-	git branch -f squashed main &&
     +-	git checkout -q -f squashed &&
     +-	git reset -q --soft HEAD~2 &&
     +-	git commit -q -m squashed &&
       	git checkout -q other^{} &&
       	test_must_fail git rebase squashed &&
      -	git rebase --quit
     -+	git rebase --abort &&
     ++	test_when_finished "git rebase --abort"
     ++'
     ++
     ++test_expect_success 'do not drop patch binary' '
      +	echo "file binary" >.gitattributes &&
      +	git checkout -q other^{} &&
      +	test_must_fail git rebase squashed &&
     -+	git rebase --abort &&
     -+	rm .gitattributes
     ++	test_when_finished "git rebase --abort" &&
     ++	test_when_finished "rm .gitattributes"
       '
       
       test_done
 2:  30ec43cd129 ! 2:  25e28b7dab3 patch-id: use stable patch-id for rebases
     @@ Commit message
          patch-id: use stable patch-id for rebases
      
          Git doesn't persist patch-ids during the rebase process, so there is
     -    no need to specifically invoke the unstable variant.
     -
     -    This allows the legacy unstable id logic to be cleaned up.
     +    no need to specifically invoke the unstable variant. Use the stable
     +    logic for all internal patch-id calculations to minimize the number of
     +    code paths and improve test coverage.
      
          Signed-off-by: Jerry Zhang <jerry@skydio.com>
      
 -:  ----------- > 3:  21642128927 builtin: patch-id: fix patch-id with binary diffs
 -:  ----------- > 4:  6e07cfd5691 patch-id: fix patch-id for mode changes
 -:  ----------- > 5:  bbaa2425ad0 builtin: patch-id: add --include-whitespace as a command mode
 -:  ----------- > 6:  a1f6f36d487 builtin: patch-id: remove unused diff-tree prefix
 -:  ----------- > 7:  69440797f30 documentation: format-patch: clarify requirements for patch-ids to match

-- 
gitgitgadget

  parent reply	other threads:[~2022-10-14  8:56 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-20  5:58 [PATCH 0/2] update internal patch-id to use "stable" algorithm Jerry Zhang via GitGitGadget
2022-09-20  5:58 ` [PATCH 1/2] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-09-20  5:58 ` [PATCH 2/2] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-09-20  6:20 ` [PATCH v2 0/2] update internal patch-id to use "stable" algorithm Jerry Zhang via GitGitGadget
2022-09-20  6:20   ` [PATCH v2 1/2] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-09-20  6:20   ` [PATCH v2 2/2] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-14  8:56   ` Jerry Zhang via GitGitGadget [this message]
2022-10-14  8:56     ` [PATCH v3 1/7] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-10-14  8:56     ` [PATCH v3 2/7] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-14  8:56     ` [PATCH v3 3/7] builtin: patch-id: fix patch-id with binary diffs Jerry Zhang via GitGitGadget
2022-10-14 17:13       ` Junio C Hamano
2022-10-14 22:33         ` Jerry Zhang
2022-10-14 21:12       ` Junio C Hamano
2022-10-14 22:34         ` Jerry Zhang
2022-10-17 15:23           ` Junio C Hamano
2022-10-14  8:56     ` [PATCH v3 4/7] patch-id: fix patch-id for mode changes Jerry Zhang via GitGitGadget
2022-10-14 21:17       ` Junio C Hamano
2022-10-14  8:56     ` [PATCH v3 5/7] builtin: patch-id: add --include-whitespace as a command mode Jerry Zhang via GitGitGadget
2022-10-14 21:24       ` Junio C Hamano
2022-10-14 22:55         ` Jerry Zhang
2022-10-17 15:38         ` Junio C Hamano
2022-10-18 22:12           ` Jerry Zhang
2022-10-14  8:56     ` [PATCH v3 6/7] builtin: patch-id: remove unused diff-tree prefix Jerry Zhang via GitGitGadget
2022-10-14 22:03       ` Junio C Hamano
2022-10-14  8:56     ` [PATCH v3 7/7] documentation: format-patch: clarify requirements for patch-ids to match Jerry Zhang via GitGitGadget
2022-10-17 15:18       ` Junio C Hamano
2022-10-18 21:57         ` Jerry Zhang
2022-10-19 16:19           ` Junio C Hamano
2022-10-20 23:16     ` [PATCH v4 0/6] patch-id fixes and improvements Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 1/6] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 2/6] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 3/6] builtin: patch-id: fix patch-id with binary diffs Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 4/6] patch-id: fix patch-id for mode changes Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 5/6] builtin: patch-id: add --verbatim as a command mode Jerry Zhang via GitGitGadget
2022-10-20 23:16       ` [PATCH v4 6/6] builtin: patch-id: remove unused diff-tree prefix Jerry Zhang via GitGitGadget
2022-10-21  9:33         ` Junio C Hamano
2022-10-24 20:07       ` [PATCH v5 0/6] patch-id fixes and improvements Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 1/6] patch-id: fix stable patch id for binary / header-only Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 2/6] patch-id: use stable patch-id for rebases Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 3/6] builtin: patch-id: fix patch-id with binary diffs Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 4/6] patch-id: fix patch-id for mode changes Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 5/6] builtin: patch-id: add --verbatim as a command mode Jerry Zhang via GitGitGadget
2022-10-24 20:07         ` [PATCH v5 6/6] builtin: patch-id: remove unused diff-tree prefix Jerry Zhang via GitGitGadget
2022-10-24 22:55         ` [PATCH v5 0/6] patch-id fixes and improvements Junio C Hamano
2022-09-21 19:16 ` [PATCH 0/2] update internal patch-id to use "stable" algorithm Junio C Hamano
2022-09-21 20:59   ` Jerry Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1359.v3.git.1665737804.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jerry@skydio.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.