From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, Derrick Stolee <stolee@gmail.com>
Subject: [PATCH v3 0/4] git for-each-ref: is-base atom and base branches
Date: Wed, 14 Aug 2024 10:31:26 +0000
Message-ID: <pull.1768.v3.git.1723631490.gitgitgadget@gmail.com>
In-Reply-To: <pull.1768.v2.git.1723397687.gitgitgadget@gmail.com>

This change introduces a new 'git for-each-ref' atom, 'is-base', in a very
similar way to the 'ahead-behind' atom. As detailed carefully in the first
change, this is motivated by the need to detect the concept of a "base
branch" in a repository with multiple long-lived branches.

This change is motivated by a third-party tool that performs this
detection with the same optimization mechanism, but with a much slower
technique because the Git CLI does not present this information. The
existing algorithm runs git rev-list --first-parent -<N> in batches over
the collection of considered references, compares those lists, and
increases <N> as needed until it finds a collision. This new use of 'git
for-each-ref' allows computing the same result within a single process
while walking a minimal number of commits.
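
For readers unfamiliar with the batched approach, here is a rough sketch of
what such a loop might look like. It is not the third-party tool's code:
the helper name find_base_batched, the starting window size, and the
doubling policy are all assumptions made for illustration. Only 'git
rev-list --first-parent -n <N>' and 'git rev-list --count --all' are real
Git invocations.

```shell
#!/bin/sh
# Sketch of the batched first-parent comparison described above.
# find_base_batched is a hypothetical helper name, not part of Git.
# Usage: find_base_batched <tip> <candidate-ref>...
find_base_batched () {
	tip=$1
	shift
	n=8	# assumed initial window size
	total=$(git rev-list --count --all) || return 1
	tmp=$(mktemp -d) || return 1

	while :
	do
		# First-parent window for each candidate, one file per ref.
		i=0
		for ref in "$@"
		do
			i=$((i + 1))
			git rev-list --first-parent -n "$n" "$ref" >"$tmp/cand.$i"
		done

		# Walk the tip's window from newest to oldest and report the
		# first candidate whose window contains that commit.
		for oid in $(git rev-list --first-parent -n "$n" "$tip")
		do
			i=0
			for ref in "$@"
			do
				i=$((i + 1))
				if grep -qxF "$oid" "$tmp/cand.$i"
				then
					echo "$ref"
					rm -rf "$tmp"
					return 0
				fi
			done
		done

		# No collision yet: widen the window, or give up once it
		# already covers every commit reachable from the refs.
		if test "$n" -ge "$total"
		then
			rm -rf "$tmp"
			return 1
		fi
		n=$((n * 2))
	done
}
```

A call such as 'find_base_batched HEAD refs/heads/main refs/heads/release'
spawns one rev-list process per reference per round, which is the cost the
new atom avoids. Note that in this sketch the answer can depend on the
window size: a candidate whose collision lies deeper than <N> in its own
first-parent history is not seen in that round.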

There are benefits to users on both the client side and the server side. In an
internal monorepo, this base branch detection algorithm is used to determine
a long-lived branch based on the HEAD commit, mapping to a group within the
organizational structure of the repository, which determines a set of
projects that the user will likely need to build; this leads to
automatically selecting an initial sparse-checkout definition based on the
build dependencies required. An upcoming feature in Azure Repos will use
this algorithm to automatically create a pull request against the correct
target branch, reducing user pain from needing to select a different branch
after a large commit diff is rendered against the default branch. This atom
unlocks that ability for Git hosting services that use Git in their backend.

Thanks, -Stolee


Updates in v2
=============

 * I had forgotten to include a documentation change in v1. My attempt to
   create a succinct doc change in a follow-up hunk continued to be
   confusing. This version includes a more expanded version of the
   documentation blurb for the is-base token.


Updates in v3
=============

 * Corrected some grammar in a commit message.
 * Fixed (and tested for) a bug where the source branch is equal to a
   candidate ref.
 * Added a test in t6300-for-each-ref.sh to cover some non-commit refs and
   some broken objects.
 * Motivated by the test in t6300, added a new patch that adds a ..._gently()
   method to reduce error noise for non-commit refs.

Derrick Stolee (4):
  commit-reach: add get_branch_base_for_tip
  commit: add gentle reference lookup method
  for-each-ref: add 'is-base' token
  p1500: add is-base performance tests

 Documentation/git-for-each-ref.txt |  42 ++++++++++
 commit-reach.c                     | 126 +++++++++++++++++++++++++++++
 commit-reach.h                     |  17 ++++
 commit.c                           |   8 +-
 commit.h                           |   2 +
 ref-filter.c                       |  77 +++++++++++++++++-
 ref-filter.h                       |  15 ++++
 t/helper/test-reach.c              |   2 +
 t/perf/p1500-graph-walks.sh        |  31 +++++++
 t/t6300-for-each-ref.sh            |   9 +++
 t/t6600-test-reach.sh              | 121 +++++++++++++++++++++++++++
 11 files changed, 448 insertions(+), 2 deletions(-)


base-commit: bea9ecd24b0c3bf06cab4a851694fe09e7e51408
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1768%2Fderrickstolee%2Ftarget-ref-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1768/derrickstolee/target-ref-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1768

Range-diff vs v2:

 1:  580026f910d ! 1:  f93d642c8d9 commit-reach: add get_branch_base_for_tip
     @@ Commit message
          which branch was used as the starting point for a given commit. Add focused
          tests using the 'test-tool reach' command.
      
     -    Repositories that use pull requests (or merge requests) to advance one or
     +    In repositories that use pull requests (or merge requests) to advance one or
          more "protected" branches, the history of that reference can be recovered by
          following the first-parent history in most cases. Most are completed using
          no-fast-forward merges, though squash merges are quite common. Less common
     @@ commit-reach.c: done:
      + */
      +define_commit_slab(best_branch_base, int);
      +static struct best_branch_base best_branch_base;
     -+#define get_best(c) (*best_branch_base_at(&best_branch_base, c))
     -+#define set_best(c,v) (*best_branch_base_at(&best_branch_base, c) = v)
     ++#define get_best(c) (*best_branch_base_at(&best_branch_base, (c)))
     ++#define set_best(c,v) (*best_branch_base_at(&best_branch_base, (c)) = (v))
      +
      +int get_branch_base_for_tip(struct repository *r,
      +			    struct commit *tip,
     @@ commit-reach.c: done:
      +
      +	for (size_t i = 0; i < bases_nr; i++) {
      +		struct commit *c = bases[i];
     ++		int best = get_best(c);
      +
      +		/* Has this already been marked as best by another commit? */
     -+		if (get_best(c))
     ++		if (best) {
     ++			if (best == -1) {
     ++				/* We agree at this position. Stop now. */
     ++				best_index = i + 1;
     ++				goto cleanup;
     ++			}
      +			continue;
     ++		}
      +
      +		set_best(c, i + 1);
      +		prio_queue_put(&queue, c);
     @@ commit-reach.c: done:
      +		branch_point = parent;
      +	}
      +
     ++cleanup:
      +	clear_best_branch_base(&best_branch_base);
      +	clear_prio_queue(&queue);
      +	return best_index > 0 ? best_index - 1 : -1;
     @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref merged:none' '
      +	test_all_modes get_branch_base_for_tip
      +'
      +
     ++test_expect_success 'get_branch_base_for_tip: equal to tip' '
     ++	# (2,3) branched from the first tip (i,4) in X with i > 2
     ++	cat >input <<-\EOF &&
     ++		A:commit-8-4
     ++		X:commit-1-2
     ++		X:commit-1-4
     ++		X:commit-4-4
     ++		X:commit-8-4
     ++		X:commit-10-4
     ++	EOF
     ++	echo "get_branch_base_for_tip(A,X):3" >expect &&
     ++	test_all_modes get_branch_base_for_tip
     ++'
     ++
      +test_expect_success 'get_branch_base_for_tip: all reach tip' '
      +	# (2,3) branched from the first tip (i,4) in X with i > 2
      +	cat >input <<-\EOF &&
 -:  ----------- > 2:  5240c2a7b32 commit: add gentle reference lookup method
 2:  13341e7e512 ! 3:  df05cee6003 for-each-ref: add 'is-base' token
     @@ ref-filter.c: static int populate_value(struct ref_array_item *ref, struct strbu
      +				v->s = xstrfmt("(%s)", ref->is_base[is_base_atoms]);
      +				free(ref->is_base[is_base_atoms]);
      +			} else {
     -+				/* Not a commit. */
      +				v->s = xstrdup("");
      +			}
      +			is_base_atoms++;
     @@ ref-filter.c: void filter_ahead_behind(struct repository *r,
      +
      +	for (size_t i = 0; i < array->nr; i++) {
      +		const char *name = array->items[i]->refname;
     -+		struct commit *c = lookup_commit_reference_by_name(name);
     ++		struct commit *c = lookup_commit_reference_by_name_gently(name, 1);
      +
      +		CALLOC_ARRAY(array->items[i]->is_base, format->is_base_tips.nr);
      +
     @@ ref-filter.h: void filter_ahead_behind(struct repository *r,
       void ref_filter_clear(struct ref_filter *filter);
       
      
     + ## t/t6300-for-each-ref.sh ##
     +@@ t/t6300-for-each-ref.sh: test_expect_success 'git for-each-ref with nested tags' '
     + 	test_cmp expect actual
     + '
     + 
     ++test_expect_success 'is-base atom with non-commits' '
     ++	git for-each-ref --format="%(is-base:HEAD) %(refname)" >out 2>err &&
     ++	grep "(HEAD) refs/heads/main" out &&
     ++
     ++	test_line_count = 2 err &&
     ++	grep "error: object .* is a commit, not a blob" err &&
     ++	grep "error: bad tag pointer to" err
     ++'
     ++
     + GRADE_FORMAT="%(signature:grade)%0a%(signature:key)%0a%(signature:signer)%0a%(signature:fingerprint)%0a%(signature:primarykeyfingerprint)"
     + TRUSTLEVEL_FORMAT="%(signature:trustlevel)%0a%(signature:key)%0a%(signature:signer)%0a%(signature:fingerprint)%0a%(signature:primarykeyfingerprint)"
     + 
     +
       ## t/t6600-test-reach.sh ##
      @@ t/t6600-test-reach.sh: test_expect_success 'get_branch_base_for_tip: all reach tip' '
       	test_all_modes get_branch_base_for_tip
     @@ t/t6600-test-reach.sh: test_expect_success 'get_branch_base_for_tip: all reach t
      +		--format="%(refname):%(is-base:commit-4-1)" --stdin
      +'
      +
     ++test_expect_success 'for-each-ref is-base: equal to tip' '
     ++	cat >input <<-\EOF &&
     ++	refs/heads/commit-4-2
     ++	refs/heads/commit-5-1
     ++	EOF
     ++	cat >expect <<-\EOF &&
     ++	refs/heads/commit-4-2:(commit-4-2)
     ++	refs/heads/commit-5-1:
     ++	EOF
     ++	run_all_modes git for-each-ref \
     ++		--format="%(refname):%(is-base:commit-4-2)" --stdin
     ++'
     ++
      +test_expect_success 'for-each-ref is-base:multiple' '
      +	cat >input <<-\EOF &&
      +	refs/heads/commit-1-1
 3:  757c20090db = 4:  cce9921bbd8 p1500: add is-base performance tests

-- 
gitgitgadget
