public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] revision: add --maximal option
@ 2026-01-18  2:34 Derrick Stolee via GitGitGadget
  2026-01-18  9:05 ` Johannes Sixt
  2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget
  0 siblings, 2 replies; 19+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-01-18  2:34 UTC (permalink / raw)
  To: git; +Cc: gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <stolee@gmail.com>

When inspecting a range of commits from some set of starting references, it
is sometimes useful to learn which commits are not reachable from any other
commits in the selected range.

One such application is in the creation of a sequence of bundles for the
bundle URI feature. Creating a stack of bundles representing different
slices of time includes defining which references to include. If all
references are used, then this may be overwhelming or redundant. Instead,
selecting commits that are maximal to the range could help defining a
smaller reference set to use in the bundle header.

Add a new '--maximal' option to restrict the output of a revision range to
be only the commits that are not reachable from any other commit in the
range, based on the reachability definition of the walk.

This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is
set as we walk. This does extend the bit range in object.h, but using an
earlier bit may collide with another feature.

The tests demonstrate the behavior of the feature with a positive-only
range, ranges with negative references, and walk-modifying flags like
--first-parent and --exclude-first-parent-only.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
    revision: add --maximal option
    
    My motivation for this feature is very similar to the bundle URI
    application. I can get around it by creating a tool that uses git
    rev-list --parents and then uses a hashset to collect the parent list
    and filter out any commits that ever appear as parents. It would be more
    efficient to use Git's native revision-walking feature.
    
    This does bring the object struct up to a 32-bit boundary with 28 flag
    bits, 3 type bits, and a parsed bit. That's the biggest concern I have
    about this update adding a new flag bit. I would understand if this
    feature is not worth running out of room for extensions there.
    
    I considered looking through the earlier bit positions to see the impact
    of an overlap, but they certainly looked potentially risky to reuse.
    
    I wonder if anyone else has thought about this as a useful technique.
    For instance, it could be part of a strategy for choosing commits for
    reachability bitmaps.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2032%2Fderrickstolee%2Fmaximal-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2032/derrickstolee/maximal-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2032

 Documentation/rev-list-options.adoc |  4 ++
 object.h                            |  4 +-
 revision.c                          |  9 +++-
 revision.h                          |  5 +-
 t/t6600-test-reach.sh               | 75 +++++++++++++++++++++++++++++
 5 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
index 453ec59057..f0d2ab32a9 100644
--- a/Documentation/rev-list-options.adoc
+++ b/Documentation/rev-list-options.adoc
@@ -444,6 +444,10 @@ The following options affect the way the simplification is performed:
 	times; if so, a commit is included if it is any of the commits
 	given or if it is an ancestor or descendant of one of them.
 
+`--maximal`::
+	Restrict the output commits to be those that are not reachable
+	from any other commits in the revision range.
+
 A more detailed explanation follows.
 
 Suppose you specified `foo` as the _<paths>_.  We shall call commits
diff --git a/object.h b/object.h
index 4bca957b8d..dfe7a1f0ea 100644
--- a/object.h
+++ b/object.h
@@ -64,7 +64,7 @@ void object_array_init(struct object_array *array);
 
 /*
  * object flag allocation:
- * revision.h:               0---------10         15               23------27
+ * revision.h:               0---------10         15               23--------28
  * fetch-pack.c:             01    67
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
@@ -86,7 +86,7 @@ void object_array_init(struct object_array *array);
  * builtin/unpack-objects.c:                                 2021
  * pack-bitmap.h:                                              2122
  */
-#define FLAG_BITS  28
+#define FLAG_BITS  29
 
 #define TYPE_BITS 3
 
diff --git a/revision.c b/revision.c
index 1858e093ee..29864426d6 100644
--- a/revision.c
+++ b/revision.c
@@ -1150,7 +1150,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			struct commit *p = parent->item;
 			parent = parent->next;
 			if (p)
-				p->object.flags |= UNINTERESTING;
+				p->object.flags |= UNINTERESTING |
+						   CHILD_VISITED;
 			if (repo_parse_commit_gently(revs->repo, p, 1) < 0)
 				continue;
 			if (p->parents)
@@ -1204,7 +1205,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			if (!*slot)
 				*slot = *revision_sources_at(revs->sources, commit);
 		}
-		p->object.flags |= pass_flags;
+		p->object.flags |= pass_flags | CHILD_VISITED;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= (SEEN | NOT_USER_GIVEN);
 			if (list)
@@ -2381,6 +2382,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->first_parent_only = 1;
 	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
 		revs->exclude_first_parent_only = 1;
+	} else if (!strcmp(arg, "--maximal")) {
+		revs->maximal = 1;
 	} else if (!strcmp(arg, "--ancestry-path")) {
 		revs->ancestry_path = 1;
 		revs->simplify_history = 0;
@@ -4125,6 +4128,8 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 {
 	if (commit->object.flags & SHOWN)
 		return commit_ignore;
+	if (revs->maximal && (commit->object.flags & CHILD_VISITED))
+		return commit_ignore;
 	if (revs->unpacked && has_object_pack(revs->repo, &commit->object.oid))
 		return commit_ignore;
 	if (revs->no_kept_objects) {
diff --git a/revision.h b/revision.h
index b36acfc2d9..e5c2c82145 100644
--- a/revision.h
+++ b/revision.h
@@ -52,7 +52,9 @@
 #define NOT_USER_GIVEN	(1u<<25)
 #define TRACK_LINEAR	(1u<<26)
 #define ANCESTRY_PATH	(1u<<27)
-#define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
+#define CHILD_VISITED	(1u<<28)
+#define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR \
+				      | PULL_MERGE | CHILD_VISITED)
 
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2
@@ -198,6 +200,7 @@ struct rev_info {
 			cherry_mark:1,
 			bisect:1,
 			ancestry_path:1,
+			maximal:1,
 
 			/* True if --ancestry-path was specified without an
 			 * argument. The bottom revisions are implicitly
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 6638d1aa1d..a759409756 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -762,4 +762,79 @@ test_expect_success 'for-each-ref is-base: --sort' '
 		--sort=refname --sort=-is-base:commit-2-3
 '
 
+test_expect_success 'rev-list --maximal (all positive)' '
+	# Only one maximal.
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-4-2
+	refs/heads/commit-4-4
+	refs/heads/commit-8-4
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-8-4)
+	EOF
+	run_all_modes git rev-list --maximal --stdin &&
+
+	# All maximal.
+	cat >input <<-\EOF &&
+	refs/heads/commit-5-2
+	refs/heads/commit-4-3
+	refs/heads/commit-3-4
+	refs/heads/commit-2-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-5-2)
+	$(git rev-parse refs/heads/commit-4-3)
+	$(git rev-parse refs/heads/commit-3-4)
+	$(git rev-parse refs/heads/commit-2-5)
+	EOF
+	run_all_modes git rev-list --maximal --stdin &&
+
+	# Mix of both.
+	cat >input <<-\EOF &&
+	refs/heads/commit-5-2
+	refs/heads/commit-3-2
+	refs/heads/commit-2-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-5-2)
+	$(git rev-parse refs/heads/commit-2-5)
+	EOF
+	run_all_modes git rev-list --maximal --stdin
+'
+
+test_expect_success 'rev-list --maximal (range)' '
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-2-5
+	refs/heads/commit-6-4
+	^refs/heads/commit-4-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-6-4)
+	EOF
+	run_all_modes git rev-list --maximal --stdin &&
+
+	# first-parent changes reachability: the first parent
+	# reduces the second coordinate to 1 before reducing the
+	# first coordinate.
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-2-5
+	refs/heads/commit-6-4
+	^refs/heads/commit-4-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-6-4)
+	$(git rev-parse refs/heads/commit-2-5)
+	EOF
+	run_all_modes git rev-list --maximal --stdin \
+		--first-parent --exclude-first-parent-only
+'
+
 test_done

base-commit: b5c409c40f1595e3e590760c6f14a16b6683e22c
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-18  2:34 [PATCH] revision: add --maximal option Derrick Stolee via GitGitGadget
@ 2026-01-18  9:05 ` Johannes Sixt
  2026-01-18 18:27   ` Derrick Stolee
  2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget
  1 sibling, 1 reply; 19+ messages in thread
From: Johannes Sixt @ 2026-01-18  9:05 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: gitster, git, Derrick Stolee via GitGitGadget

Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget:
> diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
> index 453ec59057..f0d2ab32a9 100644
> --- a/Documentation/rev-list-options.adoc
> +++ b/Documentation/rev-list-options.adoc
> @@ -444,6 +444,10 @@ The following options affect the way the simplification is performed:
>  	times; if so, a commit is included if it is any of the commits
>  	given or if it is an ancestor or descendant of one of them.
>  
> +`--maximal`::
> +	Restrict the output commits to be those that are not reachable
> +	from any other commits in the revision range.

I had to read this sentence three times to understand what it wants to
say, and that even though I had a rough idea what it was supposed to
mean. I tried to come up with a better wording, but found it to be
really hard.

	Restrict output to the commits at the tips of the
	revision range.

is all I could do, but this isn't a lot better, I am afraid.

The option name is too generic IMHO. How about "--starting-point",
"--topmost-only"?  It's function is somewhat parallel to --boundary, but
at the positive end of the revision range. Perhaps we can use that as
inspiration.

The option is listed among options that affect the way the
simplification is performed. But is this true? Isn't it just an option
that changes what output is produced?

-- Hannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-18  9:05 ` Johannes Sixt
@ 2026-01-18 18:27   ` Derrick Stolee
  2026-01-19 11:15     ` Johannes Sixt
  0 siblings, 1 reply; 19+ messages in thread
From: Derrick Stolee @ 2026-01-18 18:27 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: gitster, git, Derrick Stolee via GitGitGadget

On 1/18/26 4:05 AM, Johannes Sixt wrote:
> Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget:
>> diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
>> index 453ec59057..f0d2ab32a9 100644
>> --- a/Documentation/rev-list-options.adoc
>> +++ b/Documentation/rev-list-options.adoc
>> @@ -444,6 +444,10 @@ The following options affect the way the simplification is performed:
>>   	times; if so, a commit is included if it is any of the commits
>>   	given or if it is an ancestor or descendant of one of them.
>>   
>> +`--maximal`::
>> +	Restrict the output commits to be those that are not reachable
>> +	from any other commits in the revision range.
> 
> I had to read this sentence three times to understand what it wants to
> say, and that even though I had a rough idea what it was supposed to
> mean. I tried to come up with a better wording, but found it to be
> really hard.
> 
> 	Restrict output to the commits at the tips of the
> 	revision range.
> 
> is all I could do, but this isn't a lot better, I am afraid.
 > > The option name is too generic IMHO. How about "--starting-point",
> "--topmost-only"?  It's function is somewhat parallel to --boundary, but
> at the positive end of the revision range. Perhaps we can use that as
> inspiration.

My perspective is skewed, because "maximal" is a concrete term in the
world of partially-ordered sets (such as commit history ordered by
reachability across child-to-parent relationships). It's important to
distinguish from "starting points" because the inputs to the command
are a list of starting points, not all of which are maximal within the
set. In fact, if some positive starting points are reachable from the
negative starting points, then they are already excluded.

My familiarity with this term is skewed by my experience working with
such terms, so I'm very open to new names for this option.

Your comparison to --boundary is interesting, because --boundary _adds_
commits to the range by selecting the commits from the negative range
that are reachable from the output commits. --maximal as defined here
_restricts_ to the output of commits in the range. It's interaction with
--boundary is trivial because no boundary commits would be included as
they are necessarily reachable from a maximal commit.

> The option is listed among options that affect the way the
> simplification is performed. But is this true? Isn't it just an option
> that changes what output is produced?

You're right that this is poorly placed. I'll put it in a better location
in v2.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-18 18:27   ` Derrick Stolee
@ 2026-01-19 11:15     ` Johannes Sixt
  2026-01-19 16:44       ` Derrick Stolee
  2026-01-20  0:22       ` Junio C Hamano
  0 siblings, 2 replies; 19+ messages in thread
From: Johannes Sixt @ 2026-01-19 11:15 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: gitster, git, Derrick Stolee via GitGitGadget

Am 18.01.26 um 19:27 schrieb Derrick Stolee:
> On 1/18/26 4:05 AM, Johannes Sixt wrote:
>> Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget:
>> > The option name is too generic IMHO. How about "--starting-point",
>> "--topmost-only"?  It's function is somewhat parallel to --boundary, but
>> at the positive end of the revision range. Perhaps we can use that as
>> inspiration.
> 
> My perspective is skewed, because "maximal" is a concrete term in the
> world of partially-ordered sets (such as commit history ordered by
> reachability across child-to-parent relationships). It's important to
> distinguish from "starting points" because the inputs to the command
> are a list of starting points, not all of which are maximal within the
> set. In fact, if some positive starting points are reachable from the
> negative starting points, then they are already excluded.

AFAICS, we don't have options named after graph- or set-theoretical
terms, but tend to stick to terms established in the Git ecosystem. I
assume that "maximal" isn't a meaning that an average Git user would
associate with the operation that is performed here.

But even if we decide to use "maximal", the option must be named
something other than *just* "--maximal"; this is simply too generic.
Perhaps "--only-maximal" or "--maximal-only".

Other ideas:
- --hide-reachable
- --range-head
- --range-head-only
- --most-recent
- --most-recent-only

> [--maximal]'s interaction with
> --boundary is trivial because no boundary commits would be included as
> they are necessarily reachable from a maximal commit.

So, --boundary --maximal shows only the maximal commits? That sounds
unexpected. Boundary commits are shown with additional mark-up; they
don't need to be suppressed. But in a first iteration it's probably
better to just make the two options incompatible.

-- Hannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-19 11:15     ` Johannes Sixt
@ 2026-01-19 16:44       ` Derrick Stolee
  2026-01-19 19:05         ` Johannes Sixt
  2026-01-20  0:22       ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Derrick Stolee @ 2026-01-19 16:44 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: gitster, git, Derrick Stolee via GitGitGadget

On 1/19/2026 6:15 AM, Johannes Sixt wrote:
> Am 18.01.26 um 19:27 schrieb Derrick Stolee:
>> On 1/18/26 4:05 AM, Johannes Sixt wrote:
>>> Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget:
>>>> The option name is too generic IMHO. How about "--starting-point",
>>> "--topmost-only"?  It's function is somewhat parallel to --boundary, but
>>> at the positive end of the revision range. Perhaps we can use that as
>>> inspiration.
>>
>> My perspective is skewed, because "maximal" is a concrete term in the
>> world of partially-ordered sets (such as commit history ordered by
>> reachability across child-to-parent relationships). It's important to
>> distinguish from "starting points" because the inputs to the command
>> are a list of starting points, not all of which are maximal within the
>> set. In fact, if some positive starting points are reachable from the
>> negative starting points, then they are already excluded.
> 
> AFAICS, we don't have options named after graph- or set-theoretical
> terms, but tend to stick to terms established in the Git ecosystem. I
> assume that "maximal" isn't a meaning that an average Git user would
> associate with the operation that is performed here.

My mindset is usually "all words are made up by somebody" and since
there isn't an established term for this in the existing Git ecosystem,
it is up to us to create a term. Borrowing one that exists elsewhere is
a valuable way to build upon any context that term brings with it.

It is also helpful that the term has an explicit technical definition
that means exactly what we're using it for here. It explicitly
differentiates from any "maximum" or confusion with a collapse to a
total order (such as Git's --date-order or --topo-order apply).

> But even if we decide to use "maximal", the option must be named
> something other than *just* "--maximal"; this is simply too generic.
> Perhaps "--only-maximal" or "--maximal-only".

When the argument is moved in the documentation into the set of
filters, then the fact that --maximal restricts the set of commits
makes any modifier such as "only" redundant.
 
> Other ideas:
> - --hide-reachable
> - --range-head
> - --range-head-only
> - --most-recent
> - --most-recent-only

These all have issues, such as being technically wrong (maximal commits
are reachable) or imply total orders, date orders, or generally only a
single result.

>> [--maximal]'s interaction with
>> --boundary is trivial because no boundary commits would be included as
>> they are necessarily reachable from a maximal commit.
> 
> So, --boundary --maximal shows only the maximal commits? That sounds
> unexpected. Boundary commits are shown with additional mark-up; they
> don't need to be suppressed. But in a first iteration it's probably
> better to just make the two options incompatible.

Sure. But I'd like to counter that filters like --author also restrict
the set, including not showing boundary commits that don't fit the
--author pattern. It just happens that no boundary commits are also
maximal by definition.

I do sense that a lot of this is a matter of taste, and that you and I
differ greatly in our tastes on this topic. I look forward to more
opinions that can lead us towards one side or another (or in a new
direction).

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-19 16:44       ` Derrick Stolee
@ 2026-01-19 19:05         ` Johannes Sixt
  0 siblings, 0 replies; 19+ messages in thread
From: Johannes Sixt @ 2026-01-19 19:05 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: gitster, git, Derrick Stolee via GitGitGadget

Am 19.01.26 um 17:44 schrieb Derrick Stolee:
> When the argument is moved in the documentation into the set of
> filters, then the fact that --maximal restricts the set of commits
> makes any modifier such as "only" redundant.

Keep in mind that rev-list arguments are also understood by git-log.
Then one could expect --maximal to be somehow the opposite of --minimal.
(Which is badly named for the same reason, but that ship has sailed.)

-- Hannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-19 11:15     ` Johannes Sixt
  2026-01-19 16:44       ` Derrick Stolee
@ 2026-01-20  0:22       ` Junio C Hamano
  2026-01-22 15:08         ` Derrick Stolee
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-01-20  0:22 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Derrick Stolee, git, Derrick Stolee via GitGitGadget

Johannes Sixt <j6t@kdbg.org> writes:

Johannes Sixt <j6t@kdbg.org> writes:

> But even if we decide to use "maximal", the option must be named
> something other than *just* "--maximal"; this is simply too generic.
> Perhaps "--only-maximal" or "--maximal-only".
>
> Other ideas:
> - --hide-reachable
> - --range-head
> - --range-head-only
> - --most-recent
> - --most-recent-only
>
>> [--maximal]'s interaction with
>> --boundary is trivial because no boundary commits would be included as
>> they are necessarily reachable from a maximal commit.
>
> So, --boundary --maximal shows only the maximal commits? That sounds
> unexpected. Boundary commits are shown with additional mark-up; they
> don't need to be suppressed. But in a first iteration it's probably
> better to just make the two options incompatible.

If I am reading the answer to "what is minimal/maximal elements in
partially ordered set?" correctly, our "--boundary" essentially is
to show direct parents of those commits that would be shown with the
(nonexistent) "--minimal-only" option.  So I agree with you that it
makes perfect sense to make "--boundary" and "--maximal-only"
incompatible (it is like asking for both "--minimal-only" and
"--maximal-only" at the same time).



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] revision: add --maximal option
  2026-01-20  0:22       ` Junio C Hamano
@ 2026-01-22 15:08         ` Derrick Stolee
  0 siblings, 0 replies; 19+ messages in thread
From: Derrick Stolee @ 2026-01-22 15:08 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Sixt; +Cc: git, Derrick Stolee via GitGitGadget

On 1/19/26 7:22 PM, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
>> But even if we decide to use "maximal", the option must be named
>> something other than *just* "--maximal"; this is simply too generic.
>> Perhaps "--only-maximal" or "--maximal-only".
>>
>> Other ideas:
>> - --hide-reachable
>> - --range-head
>> - --range-head-only
>> - --most-recent
>> - --most-recent-only
>>
>>> [--maximal]'s interaction with
>>> --boundary is trivial because no boundary commits would be included as
>>> they are necessarily reachable from a maximal commit.
>>
>> So, --boundary --maximal shows only the maximal commits? That sounds
>> unexpected. Boundary commits are shown with additional mark-up; they
>> don't need to be suppressed. But in a first iteration it's probably
>> better to just make the two options incompatible.
> 
> If I am reading the answer to "what is minimal/maximal elements in
> partially ordered set?" correctly, our "--boundary" essentially is
> to show direct parents of those commits that would be shown with the
> (nonexistent) "--minimal-only" option.  So I agree with you that it
> makes perfect sense to make "--boundary" and "--maximal-only"
> incompatible (it is like asking for both "--minimal-only" and
> "--maximal-only" at the same time).

The existence of a --minimal option that doesn't match the mirror
of my suggested --maximal option convinces me to move to
--maximal-only. I will send a v2 shortly that updates this and
moves the documentation next to other filtering options.

I'll also mark --boundary and --maximal-only as incompatible to
avoid confusion.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2] revision: add --maximal-only option
  2026-01-18  2:34 [PATCH] revision: add --maximal option Derrick Stolee via GitGitGadget
  2026-01-18  9:05 ` Johannes Sixt
@ 2026-01-22 16:05 ` Derrick Stolee via GitGitGadget
  2026-01-22 21:44   ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-01-22 16:05 UTC (permalink / raw)
  To: git; +Cc: gitster, Johannes Sixt, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <stolee@gmail.com>

When inspecting a range of commits from some set of starting references, it
is sometimes useful to learn which commits are not reachable from any other
commits in the selected range.

One such application is in the creation of a sequence of bundles for the
bundle URI feature. Creating a stack of bundles representing different
slices of time includes defining which references to include. If all
references are used, then this may be overwhelming or redundant. Instead,
selecting commits that are maximal to the range could help defining a
smaller reference set to use in the bundle header.

Add a new '--maximal-only' option to restrict the output of a revision range
to be only the commits that are not reachable from any other commit in the
range, based on the reachability definition of the walk.

This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is
set as we walk. This does extend the bit range in object.h, but using an
earlier bit may collide with another feature.

The tests demonstrate the behavior of the feature with a positive-only
range, ranges with negative references, and walk-modifying flags like
--first-parent and --exclude-first-parent-only.

Since the --boundary option would not increase any results when used with
the --maximal-only option, mark them as incompatible.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
    revision: add --maximal-only option
    
    My motivation for this feature is very similar to the bundle URI
    application. I can get around it by creating a tool that uses git
    rev-list --parents and then uses a hashset to collect the parent list
    and filter out any commits that ever appear as parents. It would be more
    efficient to use Git's native revision-walking feature.
    
    This does bring the object struct up to a 32-bit boundary with 28 flag
    bits, 3 type bits, and a parsed bit. That's the biggest concern I have
    about this update adding a new flag bit. I would understand if this
    feature is not worth running out of room for extensions there.
    
    I considered looking through the earlier bit positions to see the impact
    of an overlap, but they certainly looked potentially risky to reuse.
    
    I wonder if anyone else has thought about this as a useful technique.
    For instance, it could be part of a strategy for choosing commits for
    reachability bitmaps.
    
    
    Updates in v2
    =============
    
     * option is now called --maximal-only.
     * Documentation is moved within the commit-filtering options not the
       walk-altering options.
     * --boundary and --maximal-only are marked as incompatible.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2032%2Fderrickstolee%2Fmaximal-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2032/derrickstolee/maximal-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2032

Range-diff vs v1:

 1:  889a2737bc ! 1:  54fbf36a1f revision: add --maximal option
     @@ Metadata
      Author: Derrick Stolee <stolee@gmail.com>
      
       ## Commit message ##
     -    revision: add --maximal option
     +    revision: add --maximal-only option
      
          When inspecting a range of commits from some set of starting references, it
          is sometimes useful to learn which commits are not reachable from any other
     @@ Commit message
          selecting commits that are maximal to the range could help defining a
          smaller reference set to use in the bundle header.
      
     -    Add a new '--maximal' option to restrict the output of a revision range to
     -    be only the commits that are not reachable from any other commit in the
     +    Add a new '--maximal-only' option to restrict the output of a revision range
     +    to be only the commits that are not reachable from any other commit in the
          range, based on the reachability definition of the walk.
      
          This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is
     @@ Commit message
          range, ranges with negative references, and walk-modifying flags like
          --first-parent and --exclude-first-parent-only.
      
     +    Since the --boundary option would not increase any results when used with
     +    the --maximal-only option, mark them as incompatible.
     +
          Signed-off-by: Derrick Stolee <stolee@gmail.com>
      
       ## Documentation/rev-list-options.adoc ##
     -@@ Documentation/rev-list-options.adoc: The following options affect the way the simplification is performed:
     - 	times; if so, a commit is included if it is any of the commits
     - 	given or if it is an ancestor or descendant of one of them.
     +@@ Documentation/rev-list-options.adoc: endif::git-log[]
     + 	from the point where it diverged from the remote branch, given
     + 	that arbitrary merges can be valid topic branch changes.
       
     -+`--maximal`::
     ++`--maximal-only`::
      +	Restrict the output commits to be those that are not reachable
      +	from any other commits in the revision range.
      +
     - A more detailed explanation follows.
     - 
     - Suppose you specified `foo` as the _<paths>_.  We shall call commits
     + `--not`::
     + 	Reverses the meaning of the '{caret}' prefix (or lack thereof)
     + 	for all following revision specifiers, up to the next `--not`.
      
       ## object.h ##
      @@ object.h: void object_array_init(struct object_array *array);
     @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *com
       			p->object.flags |= (SEEN | NOT_USER_GIVEN);
       			if (list)
      @@ revision.c: static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
     + 	} else if ((argcount = parse_long_opt("until", argv, &optarg))) {
     + 		revs->min_age = approxidate(optarg);
     + 		return argcount;
     ++	} else if (!strcmp(arg, "--maximal-only")) {
     ++		revs->maximal_only = 1;
     + 	} else if (!strcmp(arg, "--first-parent")) {
       		revs->first_parent_only = 1;
       	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
     - 		revs->exclude_first_parent_only = 1;
     -+	} else if (!strcmp(arg, "--maximal")) {
     -+		revs->maximal = 1;
     - 	} else if (!strcmp(arg, "--ancestry-path")) {
     - 		revs->ancestry_path = 1;
     - 		revs->simplify_history = 0;
     +@@ revision.c: int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
     + 				  !!revs->reverse, "--reverse",
     + 				  !!revs->reflog_info, "--walk-reflogs");
     + 
     ++	die_for_incompatible_opt2(!!revs->boundary, "--boundary",
     ++				  !!revs->maximal_only, "--maximal-only");
     ++
     + 	if (revs->no_walk && revs->graph)
     + 		die(_("options '%s' and '%s' cannot be used together"), "--no-walk", "--graph");
     + 	if (!revs->reflog_info && revs->grep_filter.use_reflog_filter)
      @@ revision.c: enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
       {
       	if (commit->object.flags & SHOWN)
       		return commit_ignore;
     -+	if (revs->maximal && (commit->object.flags & CHILD_VISITED))
     ++	if (revs->maximal_only && (commit->object.flags & CHILD_VISITED))
      +		return commit_ignore;
       	if (revs->unpacked && has_object_pack(revs->repo, &commit->object.oid))
       		return commit_ignore;
     @@ revision.h
       #define DECORATE_SHORT_REFS	1
       #define DECORATE_FULL_REFS	2
      @@ revision.h: struct rev_info {
     - 			cherry_mark:1,
     - 			bisect:1,
     - 			ancestry_path:1,
     -+			maximal:1,
     + 			left_right:1,
     + 			left_only:1,
     + 			right_only:1,
     ++			maximal_only:1,
     + 			rewrite_parents:1,
     + 			print_parents:1,
     + 			show_decorations:1,
     +
     + ## t/t6000-rev-list-misc.sh ##
     +@@ t/t6000-rev-list-misc.sh: test_expect_success 'rev-list -z --boundary' '
     + 	test_cmp expect actual
     + '
       
     - 			/* True if --ancestry-path was specified without an
     - 			 * argument. The bottom revisions are implicitly
     ++test_expect_success 'rev-list --boundary incompatible with --maximal-only' '
     ++	test_when_finished rm -rf repo &&
     ++
     ++	git init repo &&
     ++	test_commit -C repo 1 &&
     ++	test_commit -C repo 2 &&
     ++
     ++	oid1=$(git -C repo rev-parse HEAD~) &&
     ++	oid2=$(git -C repo rev-parse HEAD) &&
     ++
     ++	test_must_fail git -C repo rev-list --boundary --maximal-only \
     ++		HEAD~1..HEAD 2>err &&
     ++	test_grep "cannot be used together" err
     ++'
     ++
     + test_done
      
       ## t/t6600-test-reach.sh ##
      @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' '
       		--sort=refname --sort=-is-base:commit-2-3
       '
       
     -+test_expect_success 'rev-list --maximal (all positive)' '
     ++test_expect_success 'rev-list --maximal-only (all positive)' '
      +	# Only one maximal.
      +	cat >input <<-\EOF &&
      +	refs/heads/commit-1-1
     @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' '
      +	cat >expect <<-EOF &&
      +	$(git rev-parse refs/heads/commit-8-4)
      +	EOF
     -+	run_all_modes git rev-list --maximal --stdin &&
     ++	run_all_modes git rev-list --maximal-only --stdin &&
      +
      +	# All maximal.
      +	cat >input <<-\EOF &&
     @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' '
      +	$(git rev-parse refs/heads/commit-3-4)
      +	$(git rev-parse refs/heads/commit-2-5)
      +	EOF
     -+	run_all_modes git rev-list --maximal --stdin &&
     ++	run_all_modes git rev-list --maximal-only --stdin &&
      +
      +	# Mix of both.
      +	cat >input <<-\EOF &&
     @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' '
      +	$(git rev-parse refs/heads/commit-5-2)
      +	$(git rev-parse refs/heads/commit-2-5)
      +	EOF
     -+	run_all_modes git rev-list --maximal --stdin
     ++	run_all_modes git rev-list --maximal-only --stdin
      +'
      +
     -+test_expect_success 'rev-list --maximal (range)' '
     ++test_expect_success 'rev-list --maximal-only (range)' '
      +	cat >input <<-\EOF &&
      +	refs/heads/commit-1-1
      +	refs/heads/commit-2-5
     @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' '
      +	cat >expect <<-EOF &&
      +	$(git rev-parse refs/heads/commit-6-4)
      +	EOF
     -+	run_all_modes git rev-list --maximal --stdin &&
     ++	run_all_modes git rev-list --maximal-only --stdin &&
      +
      +	# first-parent changes reachability: the first parent
      +	# reduces the second coordinate to 1 before reducing the
     @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' '
      +	$(git rev-parse refs/heads/commit-6-4)
      +	$(git rev-parse refs/heads/commit-2-5)
      +	EOF
     -+	run_all_modes git rev-list --maximal --stdin \
     ++	run_all_modes git rev-list --maximal-only --stdin \
      +		--first-parent --exclude-first-parent-only
      +'
      +


 Documentation/rev-list-options.adoc |  4 ++
 object.h                            |  4 +-
 revision.c                          | 12 ++++-
 revision.h                          |  5 +-
 t/t6000-rev-list-misc.sh            | 15 ++++++
 t/t6600-test-reach.sh               | 75 +++++++++++++++++++++++++++++
 6 files changed, 110 insertions(+), 5 deletions(-)

diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
index 453ec59057..a39cf88bbc 100644
--- a/Documentation/rev-list-options.adoc
+++ b/Documentation/rev-list-options.adoc
@@ -148,6 +148,10 @@ endif::git-log[]
 	from the point where it diverged from the remote branch, given
 	that arbitrary merges can be valid topic branch changes.
 
+`--maximal-only`::
+	Restrict the output commits to be those that are not reachable
+	from any other commits in the revision range.
+
 `--not`::
 	Reverses the meaning of the '{caret}' prefix (or lack thereof)
 	for all following revision specifiers, up to the next `--not`.
diff --git a/object.h b/object.h
index 4bca957b8d..dfe7a1f0ea 100644
--- a/object.h
+++ b/object.h
@@ -64,7 +64,7 @@ void object_array_init(struct object_array *array);
 
 /*
  * object flag allocation:
- * revision.h:               0---------10         15               23------27
+ * revision.h:               0---------10         15               23--------28
  * fetch-pack.c:             01    67
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
@@ -86,7 +86,7 @@ void object_array_init(struct object_array *array);
  * builtin/unpack-objects.c:                                 2021
  * pack-bitmap.h:                                              2122
  */
-#define FLAG_BITS  28
+#define FLAG_BITS  29
 
 #define TYPE_BITS 3
 
diff --git a/revision.c b/revision.c
index 1858e093ee..2dee78b838 100644
--- a/revision.c
+++ b/revision.c
@@ -1150,7 +1150,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			struct commit *p = parent->item;
 			parent = parent->next;
 			if (p)
-				p->object.flags |= UNINTERESTING;
+				p->object.flags |= UNINTERESTING |
+						   CHILD_VISITED;
 			if (repo_parse_commit_gently(revs->repo, p, 1) < 0)
 				continue;
 			if (p->parents)
@@ -1204,7 +1205,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 			if (!*slot)
 				*slot = *revision_sources_at(revs->sources, commit);
 		}
-		p->object.flags |= pass_flags;
+		p->object.flags |= pass_flags | CHILD_VISITED;
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= (SEEN | NOT_USER_GIVEN);
 			if (list)
@@ -2377,6 +2378,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if ((argcount = parse_long_opt("until", argv, &optarg))) {
 		revs->min_age = approxidate(optarg);
 		return argcount;
+	} else if (!strcmp(arg, "--maximal-only")) {
+		revs->maximal_only = 1;
 	} else if (!strcmp(arg, "--first-parent")) {
 		revs->first_parent_only = 1;
 	} else if (!strcmp(arg, "--exclude-first-parent-only")) {
@@ -3147,6 +3150,9 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 				  !!revs->reverse, "--reverse",
 				  !!revs->reflog_info, "--walk-reflogs");
 
+	die_for_incompatible_opt2(!!revs->boundary, "--boundary",
+				  !!revs->maximal_only, "--maximal-only");
+
 	if (revs->no_walk && revs->graph)
 		die(_("options '%s' and '%s' cannot be used together"), "--no-walk", "--graph");
 	if (!revs->reflog_info && revs->grep_filter.use_reflog_filter)
@@ -4125,6 +4131,8 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
 {
 	if (commit->object.flags & SHOWN)
 		return commit_ignore;
+	if (revs->maximal_only && (commit->object.flags & CHILD_VISITED))
+		return commit_ignore;
 	if (revs->unpacked && has_object_pack(revs->repo, &commit->object.oid))
 		return commit_ignore;
 	if (revs->no_kept_objects) {
diff --git a/revision.h b/revision.h
index b36acfc2d9..69242ecb18 100644
--- a/revision.h
+++ b/revision.h
@@ -52,7 +52,9 @@
 #define NOT_USER_GIVEN	(1u<<25)
 #define TRACK_LINEAR	(1u<<26)
 #define ANCESTRY_PATH	(1u<<27)
-#define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
+#define CHILD_VISITED	(1u<<28)
+#define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR \
+				      | PULL_MERGE | CHILD_VISITED)
 
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2
@@ -189,6 +191,7 @@ struct rev_info {
 			left_right:1,
 			left_only:1,
 			right_only:1,
+			maximal_only:1,
 			rewrite_parents:1,
 			print_parents:1,
 			show_decorations:1,
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index fec16448cf..d0a2a86610 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -248,4 +248,19 @@ test_expect_success 'rev-list -z --boundary' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list --boundary incompatible with --maximal-only' '
+	test_when_finished rm -rf repo &&
+
+	git init repo &&
+	test_commit -C repo 1 &&
+	test_commit -C repo 2 &&
+
+	oid1=$(git -C repo rev-parse HEAD~) &&
+	oid2=$(git -C repo rev-parse HEAD) &&
+
+	test_must_fail git -C repo rev-list --boundary --maximal-only \
+		HEAD~1..HEAD 2>err &&
+	test_grep "cannot be used together" err
+'
+
 test_done
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 6638d1aa1d..2613075894 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -762,4 +762,79 @@ test_expect_success 'for-each-ref is-base: --sort' '
 		--sort=refname --sort=-is-base:commit-2-3
 '
 
+test_expect_success 'rev-list --maximal-only (all positive)' '
+	# Only one maximal.
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-4-2
+	refs/heads/commit-4-4
+	refs/heads/commit-8-4
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-8-4)
+	EOF
+	run_all_modes git rev-list --maximal-only --stdin &&
+
+	# All maximal.
+	cat >input <<-\EOF &&
+	refs/heads/commit-5-2
+	refs/heads/commit-4-3
+	refs/heads/commit-3-4
+	refs/heads/commit-2-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-5-2)
+	$(git rev-parse refs/heads/commit-4-3)
+	$(git rev-parse refs/heads/commit-3-4)
+	$(git rev-parse refs/heads/commit-2-5)
+	EOF
+	run_all_modes git rev-list --maximal-only --stdin &&
+
+	# Mix of both.
+	cat >input <<-\EOF &&
+	refs/heads/commit-5-2
+	refs/heads/commit-3-2
+	refs/heads/commit-2-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-5-2)
+	$(git rev-parse refs/heads/commit-2-5)
+	EOF
+	run_all_modes git rev-list --maximal-only --stdin
+'
+
+test_expect_success 'rev-list --maximal-only (range)' '
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-2-5
+	refs/heads/commit-6-4
+	^refs/heads/commit-4-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-6-4)
+	EOF
+	run_all_modes git rev-list --maximal-only --stdin &&
+
+	# first-parent changes reachability: the first parent
+	# reduces the second coordinate to 1 before reducing the
+	# first coordinate.
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-2-5
+	refs/heads/commit-6-4
+	^refs/heads/commit-4-5
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/commit-6-4)
+	$(git rev-parse refs/heads/commit-2-5)
+	EOF
+	run_all_modes git rev-list --maximal-only --stdin \
+		--first-parent --exclude-first-parent-only
+'
+
 test_done

base-commit: b5c409c40f1595e3e590760c6f14a16b6683e22c
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget
@ 2026-01-22 21:44   ` Junio C Hamano
  2026-01-22 22:15     ` Derrick Stolee
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-01-22 21:44 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, Johannes Sixt, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

>     My motivation for this feature is very similar to the bundle URI
>     application. I can get around it by creating a tool that uses git
>     rev-list --parents and then uses a hashset to collect the parent list
>     and filter out any commits that ever appear as parents. It would be more
>     efficient to use Git's native revision-walking feature.

How does this relate to "git merge-base --independent", or do they
compute completely different things?



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-22 21:44   ` Junio C Hamano
@ 2026-01-22 22:15     ` Derrick Stolee
  2026-01-22 23:11       ` Junio C Hamano
  2026-01-23  6:38       ` Johannes Sixt
  0 siblings, 2 replies; 19+ messages in thread
From: Derrick Stolee @ 2026-01-22 22:15 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget; +Cc: git, Johannes Sixt

On 1/22/2026 4:44 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>     My motivation for this feature is very similar to the bundle URI
>>     application. I can get around it by creating a tool that uses git
>>     rev-list --parents and then uses a hashset to collect the parent list
>>     and filter out any commits that ever appear as parents. It would be more
>>     efficient to use Git's native revision-walking feature.
> 
> How does this relate to "git merge-base --independent", or do they
> compute completely different things?

This is the same idea, where among the merge bases, the ones that are
"maximal" to that set are the --independent ones. 
 
The documentation for --independent even uses similar language here:

  In other words, among the commits given, list those which cannot
  be reached from any other.

Unfortunately, it also says "print a minimal subset" which in some
sense is correct by "it cannot be made smaller without losing
information" but we actually choose the maximal set there, not a
minimal set.

The merge-base --independent calculation is basically asking for
the maximal set among commits in the intersection of two (or more)
commit histories. One trick the merge-base calculation does is
that it first looks for the --boundary commits, and then reduces
from within that set. This avoid walking further into the history
than necessary.

You are presenting interesting overlaps of terminology and needs.
One thing that is different about 'git rev-list --maximal-only' with
a list of starting commits is that it wants the maximal set from
the _union_ of the histories, instead of the _intersection_ like
'git merge-base --independent' does.

There is potential for a performance improvement if we converted
the search to be like a merge-base algorithm and checked the
priority queue to see if all elements have the CHILD_VISITED flag.
I think the cost here is that we would need more new logic and
would lose some expressiveness of 'git rev-list'.

For example, one of the applications I mentioned will require a
range request including negative refs, such as

  git rev-list --maximal-only --stdin <<-\EOF
  refs/heads/branch1
  refs/heads/branch2
  ^refs/heads/main
  ^refs/heads/release
  EOF

And this would likely return the tips of the two branches, but
also will detect if one already reaches the other or if one of
'main' or 'release' reaches one or both of them, excluding it
from the maximal set.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-22 22:15     ` Derrick Stolee
@ 2026-01-22 23:11       ` Junio C Hamano
  2026-01-23  6:38       ` Johannes Sixt
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2026-01-22 23:11 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Derrick Stolee via GitGitGadget, git, Johannes Sixt

Derrick Stolee <stolee@gmail.com> writes:

> The merge-base --independent calculation is basically asking for
> the maximal set among commits in the intersection of two (or more)
> commit histories. One trick the merge-base calculation does is
> that it first looks for the --boundary commits, and then reduces
> from within that set. This avoid walking further into the history
> than necessary.
> ...
> And this would likely return the tips of the two branches, but
> also will detect if one already reaches the other or if one of
> 'main' or 'release' reaches one or both of them, excluding it
> from the maximal set.

Thanks for your thoughts.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-22 22:15     ` Derrick Stolee
  2026-01-22 23:11       ` Junio C Hamano
@ 2026-01-23  6:38       ` Johannes Sixt
  2026-01-23 15:58         ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Johannes Sixt @ 2026-01-23  6:38 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Derrick Stolee via GitGitGadget, Junio C Hamano

Am 22.01.26 um 23:15 schrieb Derrick Stolee:
> Unfortunately, it also says "print a minimal subset" which in some
> sense is correct by "it cannot be made smaller without losing
> information" but we actually choose the maximal set there, not a
> minimal set.
> ...
> You are presenting interesting overlaps of terminology and needs.
> One thing that is different about 'git rev-list --maximal-only' with
> a list of starting commits is that it wants the maximal set from
> the _union_ of the histories, instead of the _intersection_ like
> 'git merge-base --independent' does.

I don't quite understand how a union or intersection come into play
here. The difference between the two is that `git rev-list
--maximal-only` permits negative revisions as input, but `git merge-base
--independent` does not. In the case where the input is only positive
revisions, the result of --maximal-only should always be exactly
identical to --independent, right? Even if the revisions are on
disconnected histories?

-- Hannes


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-23  6:38       ` Johannes Sixt
@ 2026-01-23 15:58         ` Junio C Hamano
  2026-01-23 16:55           ` Derrick Stolee
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-01-23 15:58 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Derrick Stolee, git, Derrick Stolee via GitGitGadget

Johannes Sixt <j6t@kdbg.org> writes:

> Am 22.01.26 um 23:15 schrieb Derrick Stolee:
>> Unfortunately, it also says "print a minimal subset" which in some
>> sense is correct by "it cannot be made smaller without losing
>> information" but we actually choose the maximal set there, not a
>> minimal set.
>> ...
>> You are presenting interesting overlaps of terminology and needs.
>> One thing that is different about 'git rev-list --maximal-only' with
>> a list of starting commits is that it wants the maximal set from
>> the _union_ of the histories, instead of the _intersection_ like
>> 'git merge-base --independent' does.
>
> I don't quite understand how a union or intersection come into play
> here. The difference between the two is that `git rev-list
> --maximal-only` permits negative revisions as input, but `git merge-base
> --independent` does not. In the case where the input is only positive
> revisions, the result of --maximal-only should always be exactly
> identical to --independent, right? Even if the revisions are on
> disconnected histories?

Ahh, it is an ancient history that I forgot how the command worked.
"merge-base --independent A B C" does not do any "merge-base"
computation over the commits A B C and shows the ones that cannot be
reached from any other.  If it were to compute merge bases across
these commits and then find commits, among the computed merge bases,
that cannot be reached from any other merge bases, "intersection"
might come into play, but I do not think that is what the command
does.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-23 15:58         ` Junio C Hamano
@ 2026-01-23 16:55           ` Derrick Stolee
  2026-01-23 18:08             ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Derrick Stolee @ 2026-01-23 16:55 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Sixt; +Cc: git, Derrick Stolee via GitGitGadget

On 1/23/2026 10:58 AM, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
> 
>> Am 22.01.26 um 23:15 schrieb Derrick Stolee:
>>> Unfortunately, it also says "print a minimal subset" which in some
>>> sense is correct by "it cannot be made smaller without losing
>>> information" but we actually choose the maximal set there, not a
>>> minimal set.
>>> ...
>>> You are presenting interesting overlaps of terminology and needs.
>>> One thing that is different about 'git rev-list --maximal-only' with
>>> a list of starting commits is that it wants the maximal set from
>>> the _union_ of the histories, instead of the _intersection_ like
>>> 'git merge-base --independent' does.
>>
>> I don't quite understand how a union or intersection come into play
>> here. The difference between the two is that `git rev-list
>> --maximal-only` permits negative revisions as input, but `git merge-base
>> --independent` does not. In the case where the input is only positive
>> revisions, the result of --maximal-only should always be exactly
>> identical to --independent, right? Even if the revisions are on
>> disconnected histories?
> 
> Ahh, it is an ancient history that I forgot how the command worked.
> "merge-base --independent A B C" does not do any "merge-base"
> computation over the commits A B C and shows the ones that cannot be
> reached from any other.  If it were to compute merge bases across
> these commits and then find commits, among the computed merge bases,
> that cannot be reached from any other merge bases, "intersection"
> might come into play, but I do not think that is what the command
> does.

Interesting. Thanks for the correction. So we _do_ have a way to
get this information for a range that doesn't have negative refs
or other custom walk modifiers (and this implementation would be
faster for this case).

My patch includes test cases that are not covered by the
merge-base command. I don't think it would be valuable to extend
the merge-base command with even more cases that don't actually
output merge-bases / intersections.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-23 16:55           ` Derrick Stolee
@ 2026-01-23 18:08             ` Junio C Hamano
  2026-01-28 14:28               ` Derrick Stolee
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-01-23 18:08 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget

Derrick Stolee <stolee@gmail.com> writes:

> Interesting. Thanks for the correction. So we _do_ have a way to
> get this information for a range that doesn't have negative refs
> or other custom walk modifiers (and this implementation would be
> faster for this case).

Perhaps.  If so, perhaps we can improve --maximal-only (and possibly
rename it to --independent?  I dunno about this part) by special
casing the logic, and then steer people to use the new implementation
that can use negative ends, deprecating "merge-base --independent"
(which was written to be a better "show-branch --independent")?

> My patch includes test cases that are not covered by the
> merge-base command. I don't think it would be valuable to extend
> the merge-base command with even more cases that don't actually
> output merge-bases / intersections.

Yup, I do not think show-branch nor merge-base were good home for
the feature.  We only needed to make reduce_heads_replace()
available somewhere, and "git show --maximal-only A B C" might be a
much better way to express "show only the independent ones", as it
would allow using all kinds of output options the "log" family of
commands support.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-23 18:08             ` Junio C Hamano
@ 2026-01-28 14:28               ` Derrick Stolee
  2026-01-29  0:14                 ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Derrick Stolee @ 2026-01-28 14:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget

On 1/23/26 1:08 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>> Interesting. Thanks for the correction. So we _do_ have a way to
>> get this information for a range that doesn't have negative refs
>> or other custom walk modifiers (and this implementation would be
>> faster for this case).
> 
> Perhaps.  If so, perhaps we can improve --maximal-only (and possibly
> rename it to --independent?  I dunno about this part) by special
> casing the logic, and then steer people to use the new implementation
> that can use negative ends, deprecating "merge-base --independent"
> (which was written to be a better "show-branch --independent")?
> 
>> My patch includes test cases that are not covered by the
>> merge-base command. I don't think it would be valuable to extend
>> the merge-base command with even more cases that don't actually
>> output merge-bases / intersections.
> 
> Yup, I do not think show-branch nor merge-base were good home for
> the feature.  We only needed to make reduce_heads_replace()
> available somewhere, and "git show --maximal-only A B C" might be a
> much better way to express "show only the independent ones", as it
> would allow using all kinds of output options the "log" family of
> commands support.

I explored some of these directions, and I see the value of allowing
a --maximal-only option to them in the future. I have some concerns
about them not solving the needs I have that this 'git rev-list'
implementation provides. I believe that you're suggesting that these
are other places where a user could benefit from such an option, and
I agree.

Can we delay such extensions to another series?

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-28 14:28               ` Derrick Stolee
@ 2026-01-29  0:14                 ` Junio C Hamano
  2026-01-29 14:57                   ` Derrick Stolee
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2026-01-29  0:14 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget

Derrick Stolee <stolee@gmail.com> writes:

>> Yup, I do not think show-branch nor merge-base were good home for
>> the feature.  We only needed to make reduce_heads_replace()
>> available somewhere, and "git show --maximal-only A B C" might be a
>> much better way to express "show only the independent ones", as it
>> would allow using all kinds of output options the "log" family of
>> commands support.
>
> I explored some of these directions, and I see the value of allowing
> a --maximal-only option to them in the future. I have some concerns
> about them not solving the needs I have that this 'git rev-list'
> implementation provides. I believe that you're suggesting that these
> are other places where a user could benefit from such an option, and
> I agree.
>
> Can we delay such extensions to another series?

Absolutely, as long as we all agree on what the longer term
direction is, which includes educating existing users of
"show-branch --independent" and "merge-base --independent" that
"rev-list --maximal-only" is the future even for their "positive end
only" use cases and it also can work on a history bounded by both
positive and negative ends.

The only small thing we need to decide here in the above is that
"--maximal-only" is understandable as an appropriate name for a
superset of "--independent" by those who are used to what the
latter has been doing for the past 15 years or so.

As long as with such understanding, it can be left to the future to
even advertise this option as a better alternative for existing
"--independent" option in the manual pages of these other two
commands.

THanks.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] revision: add --maximal-only option
  2026-01-29  0:14                 ` Junio C Hamano
@ 2026-01-29 14:57                   ` Derrick Stolee
  0 siblings, 0 replies; 19+ messages in thread
From: Derrick Stolee @ 2026-01-29 14:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget

On 1/28/2026 7:14 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>>> Yup, I do not think show-branch nor merge-base were good home for
>>> the feature.  We only needed to make reduce_heads_replace()
>>> available somewhere, and "git show --maximal-only A B C" might be a
>>> much better way to express "show only the independent ones", as it
>>> would allow using all kinds of output options the "log" family of
>>> commands support.
>>
>> I explored some of these directions, and I see the value of allowing
>> a --maximal-only option to them in the future. I have some concerns
>> about them not solving the needs I have that this 'git rev-list'
>> implementation provides. I believe that you're suggesting that these
>> are other places where a user could benefit from such an option, and
>> I agree.
>>
>> Can we delay such extensions to another series?
> 
> Absolutely, as long as we all agree on what the longer term
> direction is, which includes educating existing users of
> "show-branch --independent" and "merge-base --independent" that
> "rev-list --maximal-only" is the future even for their "positive end
> only" use cases and it also can work on a history bounded by both
> positive and negative ends.

I agree to this direction and have some drafts in this direction.

> The only small thing we need to decide here in the above is that
> "--maximal-only" is understandable as an appropriate name for a
> superset of "--independent" by those who are used to what the
> latter has been doing for the past 15 years or so.

My strong preference for using the word "maximal" somewhere over
continued reliance on "independent" is that a set of commits can
be "mutually independent" without any of the commits actually
being maximal within the range.

Here's an example:

    A       B
    |\     /|
    D E   F G
     \ \ / /
      H I J
       \|/
        K

In this commit history graph, each row is an "independent" set of
commits:

	{ A, B }
	{ D, E, F, G }
	{ H, I, J }
	{ K }

and some sets like { A, F, J } are also independent.

Only A and B are "maximal" commits within the history.

I describe this through an example mostly because I don't feel that
I've adequately described this distinction in this thread and
would not feel satisfied in my arguments without it.

With my reasoning more completely described, I am more ready for
someone to overrule my opinion with the argument that "independent"
has enough historical context to mean "a maximal independent set".

> As long as with such understanding, it can be left to the future to
> even advertise this option as a better alternative for existing
> "--independent" option in the manual pages of these other two
> commands.

The other, more complicated, task is to have the rev-list command
use the algorithm that backs 'git merge-base --independent' when
the input range and options is appropriate for that purpose. This
performance-only feature will require more careful construction
and review.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2026-01-29 14:57 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-18  2:34 [PATCH] revision: add --maximal option Derrick Stolee via GitGitGadget
2026-01-18  9:05 ` Johannes Sixt
2026-01-18 18:27   ` Derrick Stolee
2026-01-19 11:15     ` Johannes Sixt
2026-01-19 16:44       ` Derrick Stolee
2026-01-19 19:05         ` Johannes Sixt
2026-01-20  0:22       ` Junio C Hamano
2026-01-22 15:08         ` Derrick Stolee
2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget
2026-01-22 21:44   ` Junio C Hamano
2026-01-22 22:15     ` Derrick Stolee
2026-01-22 23:11       ` Junio C Hamano
2026-01-23  6:38       ` Johannes Sixt
2026-01-23 15:58         ` Junio C Hamano
2026-01-23 16:55           ` Derrick Stolee
2026-01-23 18:08             ` Junio C Hamano
2026-01-28 14:28               ` Derrick Stolee
2026-01-29  0:14                 ` Junio C Hamano
2026-01-29 14:57                   ` Derrick Stolee

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox