* [PATCH] revision: add --maximal option
@ 2026-01-18 2:34 Derrick Stolee via GitGitGadget
2026-01-18 9:05 ` Johannes Sixt
2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget
0 siblings, 2 replies; 19+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2026-01-18 2:34 UTC (permalink / raw)
To: git; +Cc: gitster, Derrick Stolee, Derrick Stolee
From: Derrick Stolee <stolee@gmail.com>
When inspecting a range of commits from some set of starting references, it
is sometimes useful to learn which commits are not reachable from any other
commits in the selected range.
One such application is in the creation of a sequence of bundles for the
bundle URI feature. Creating a stack of bundles representing different
slices of time includes defining which references to include. If all
references are used, then this may be overwhelming or redundant. Instead,
selecting commits that are maximal to the range could help defining a
smaller reference set to use in the bundle header.
Add a new '--maximal' option to restrict the output of a revision range to
be only the commits that are not reachable from any other commit in the
range, based on the reachability definition of the walk.
This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is
set as we walk. This does extend the bit range in object.h, but using an
earlier bit may collide with another feature.
The tests demonstrate the behavior of the feature with a positive-only
range, ranges with negative references, and walk-modifying flags like
--first-parent and --exclude-first-parent-only.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
revision: add --maximal option
My motivation for this feature is very similar to the bundle URI
application. I can get around it by creating a tool that uses git
rev-list --parents and then uses a hashset to collect the parent list
and filter out any commits that ever appear as parents. It would be more
efficient to use Git's native revision-walking feature.
This does bring the object struct up to a 32-bit boundary with 28 flag
bits, 3 type bits, and a parsed bit. That's the biggest concern I have
about this update adding a new flag bit. I would understand if this
feature is not worth running out of room for extensions there.
I considered looking through the earlier bit positions to see the impact
of an overlap, but they certainly looked potentially risky to reuse.
I wonder if anyone else has thought about this as a useful technique.
For instance, it could be part of a strategy for choosing commits for
reachability bitmaps.
Thanks, -Stolee
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2032%2Fderrickstolee%2Fmaximal-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2032/derrickstolee/maximal-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2032
Documentation/rev-list-options.adoc | 4 ++
object.h | 4 +-
revision.c | 9 +++-
revision.h | 5 +-
t/t6600-test-reach.sh | 75 +++++++++++++++++++++++++++++
5 files changed, 92 insertions(+), 5 deletions(-)
diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
index 453ec59057..f0d2ab32a9 100644
--- a/Documentation/rev-list-options.adoc
+++ b/Documentation/rev-list-options.adoc
@@ -444,6 +444,10 @@ The following options affect the way the simplification is performed:
times; if so, a commit is included if it is any of the commits
given or if it is an ancestor or descendant of one of them.
+`--maximal`::
+ Restrict the output commits to be those that are not reachable
+ from any other commits in the revision range.
+
A more detailed explanation follows.
Suppose you specified `foo` as the _<paths>_. We shall call commits
diff --git a/object.h b/object.h
index 4bca957b8d..dfe7a1f0ea 100644
--- a/object.h
+++ b/object.h
@@ -64,7 +64,7 @@ void object_array_init(struct object_array *array);
/*
* object flag allocation:
- * revision.h: 0---------10 15 23------27
+ * revision.h: 0---------10 15 23--------28
* fetch-pack.c: 01 67
* negotiator/default.c: 2--5
* walker.c: 0-2
@@ -86,7 +86,7 @@ void object_array_init(struct object_array *array);
* builtin/unpack-objects.c: 2021
* pack-bitmap.h: 2122
*/
-#define FLAG_BITS 28
+#define FLAG_BITS 29
#define TYPE_BITS 3
diff --git a/revision.c b/revision.c
index 1858e093ee..29864426d6 100644
--- a/revision.c
+++ b/revision.c
@@ -1150,7 +1150,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
struct commit *p = parent->item;
parent = parent->next;
if (p)
- p->object.flags |= UNINTERESTING;
+ p->object.flags |= UNINTERESTING |
+ CHILD_VISITED;
if (repo_parse_commit_gently(revs->repo, p, 1) < 0)
continue;
if (p->parents)
@@ -1204,7 +1205,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
if (!*slot)
*slot = *revision_sources_at(revs->sources, commit);
}
- p->object.flags |= pass_flags;
+ p->object.flags |= pass_flags | CHILD_VISITED;
if (!(p->object.flags & SEEN)) {
p->object.flags |= (SEEN | NOT_USER_GIVEN);
if (list)
@@ -2381,6 +2382,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
revs->first_parent_only = 1;
} else if (!strcmp(arg, "--exclude-first-parent-only")) {
revs->exclude_first_parent_only = 1;
+ } else if (!strcmp(arg, "--maximal")) {
+ revs->maximal = 1;
} else if (!strcmp(arg, "--ancestry-path")) {
revs->ancestry_path = 1;
revs->simplify_history = 0;
@@ -4125,6 +4128,8 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
{
if (commit->object.flags & SHOWN)
return commit_ignore;
+ if (revs->maximal && (commit->object.flags & CHILD_VISITED))
+ return commit_ignore;
if (revs->unpacked && has_object_pack(revs->repo, &commit->object.oid))
return commit_ignore;
if (revs->no_kept_objects) {
diff --git a/revision.h b/revision.h
index b36acfc2d9..e5c2c82145 100644
--- a/revision.h
+++ b/revision.h
@@ -52,7 +52,9 @@
#define NOT_USER_GIVEN (1u<<25)
#define TRACK_LINEAR (1u<<26)
#define ANCESTRY_PATH (1u<<27)
-#define ALL_REV_FLAGS (((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
+#define CHILD_VISITED (1u<<28)
+#define ALL_REV_FLAGS (((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR \
+ | PULL_MERGE | CHILD_VISITED)
#define DECORATE_SHORT_REFS 1
#define DECORATE_FULL_REFS 2
@@ -198,6 +200,7 @@ struct rev_info {
cherry_mark:1,
bisect:1,
ancestry_path:1,
+ maximal:1,
/* True if --ancestry-path was specified without an
* argument. The bottom revisions are implicitly
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 6638d1aa1d..a759409756 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -762,4 +762,79 @@ test_expect_success 'for-each-ref is-base: --sort' '
--sort=refname --sort=-is-base:commit-2-3
'
+test_expect_success 'rev-list --maximal (all positive)' '
+ # Only one maximal.
+ cat >input <<-\EOF &&
+ refs/heads/commit-1-1
+ refs/heads/commit-4-2
+ refs/heads/commit-4-4
+ refs/heads/commit-8-4
+ EOF
+
+ cat >expect <<-EOF &&
+ $(git rev-parse refs/heads/commit-8-4)
+ EOF
+ run_all_modes git rev-list --maximal --stdin &&
+
+ # All maximal.
+ cat >input <<-\EOF &&
+ refs/heads/commit-5-2
+ refs/heads/commit-4-3
+ refs/heads/commit-3-4
+ refs/heads/commit-2-5
+ EOF
+
+ cat >expect <<-EOF &&
+ $(git rev-parse refs/heads/commit-5-2)
+ $(git rev-parse refs/heads/commit-4-3)
+ $(git rev-parse refs/heads/commit-3-4)
+ $(git rev-parse refs/heads/commit-2-5)
+ EOF
+ run_all_modes git rev-list --maximal --stdin &&
+
+ # Mix of both.
+ cat >input <<-\EOF &&
+ refs/heads/commit-5-2
+ refs/heads/commit-3-2
+ refs/heads/commit-2-5
+ EOF
+
+ cat >expect <<-EOF &&
+ $(git rev-parse refs/heads/commit-5-2)
+ $(git rev-parse refs/heads/commit-2-5)
+ EOF
+ run_all_modes git rev-list --maximal --stdin
+'
+
+test_expect_success 'rev-list --maximal (range)' '
+ cat >input <<-\EOF &&
+ refs/heads/commit-1-1
+ refs/heads/commit-2-5
+ refs/heads/commit-6-4
+ ^refs/heads/commit-4-5
+ EOF
+
+ cat >expect <<-EOF &&
+ $(git rev-parse refs/heads/commit-6-4)
+ EOF
+ run_all_modes git rev-list --maximal --stdin &&
+
+ # first-parent changes reachability: the first parent
+ # reduces the second coordinate to 1 before reducing the
+ # first coordinate.
+ cat >input <<-\EOF &&
+ refs/heads/commit-1-1
+ refs/heads/commit-2-5
+ refs/heads/commit-6-4
+ ^refs/heads/commit-4-5
+ EOF
+
+ cat >expect <<-EOF &&
+ $(git rev-parse refs/heads/commit-6-4)
+ $(git rev-parse refs/heads/commit-2-5)
+ EOF
+ run_all_modes git rev-list --maximal --stdin \
+ --first-parent --exclude-first-parent-only
+'
+
test_done
base-commit: b5c409c40f1595e3e590760c6f14a16b6683e22c
--
gitgitgadget
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [PATCH] revision: add --maximal option 2026-01-18 2:34 [PATCH] revision: add --maximal option Derrick Stolee via GitGitGadget @ 2026-01-18 9:05 ` Johannes Sixt 2026-01-18 18:27 ` Derrick Stolee 2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget 1 sibling, 1 reply; 19+ messages in thread From: Johannes Sixt @ 2026-01-18 9:05 UTC (permalink / raw) To: Derrick Stolee; +Cc: gitster, git, Derrick Stolee via GitGitGadget Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget: > diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc > index 453ec59057..f0d2ab32a9 100644 > --- a/Documentation/rev-list-options.adoc > +++ b/Documentation/rev-list-options.adoc > @@ -444,6 +444,10 @@ The following options affect the way the simplification is performed: > times; if so, a commit is included if it is any of the commits > given or if it is an ancestor or descendant of one of them. > > +`--maximal`:: > + Restrict the output commits to be those that are not reachable > + from any other commits in the revision range. I had to read this sentence three times to understand what it wants to say, and that even though I had a rough idea what it was supposed to mean. I tried to come up with a better wording, but found it to be really hard. Restrict output to the commits at the tips of the revision range. is all I could do, but this isn't a lot better, I am afraid. The option name is too generic IMHO. How about "--starting-point", "--topmost-only"? It's function is somewhat parallel to --boundary, but at the positive end of the revision range. Perhaps we can use that as inspiration. The option is listed among options that affect the way the simplification is performed. But is this true? Isn't it just an option that changes what output is produced? -- Hannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] revision: add --maximal option 2026-01-18 9:05 ` Johannes Sixt @ 2026-01-18 18:27 ` Derrick Stolee 2026-01-19 11:15 ` Johannes Sixt 0 siblings, 1 reply; 19+ messages in thread From: Derrick Stolee @ 2026-01-18 18:27 UTC (permalink / raw) To: Johannes Sixt; +Cc: gitster, git, Derrick Stolee via GitGitGadget On 1/18/26 4:05 AM, Johannes Sixt wrote: > Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget: >> diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc >> index 453ec59057..f0d2ab32a9 100644 >> --- a/Documentation/rev-list-options.adoc >> +++ b/Documentation/rev-list-options.adoc >> @@ -444,6 +444,10 @@ The following options affect the way the simplification is performed: >> times; if so, a commit is included if it is any of the commits >> given or if it is an ancestor or descendant of one of them. >> >> +`--maximal`:: >> + Restrict the output commits to be those that are not reachable >> + from any other commits in the revision range. > > I had to read this sentence three times to understand what it wants to > say, and that even though I had a rough idea what it was supposed to > mean. I tried to come up with a better wording, but found it to be > really hard. > > Restrict output to the commits at the tips of the > revision range. > > is all I could do, but this isn't a lot better, I am afraid. > > The option name is too generic IMHO. How about "--starting-point", > "--topmost-only"? It's function is somewhat parallel to --boundary, but > at the positive end of the revision range. Perhaps we can use that as > inspiration. My perspective is skewed, because "maximal" is a concrete term in the world of partially-ordered sets (such as commit history ordered by reachability across child-to-parent relationships). It's important to distinguish from "starting points" because the inputs to the command are a list of starting points, not all of which are maximal within the set. In fact, if some positive starting points are reachable from the negative starting points, then they are already excluded. My familiarity with this term is skewed by my experience working with such terms, so I'm very open to new names for this option. Your comparison to --boundary is interesting, because --boundary _adds_ commits to the range by selecting the commits from the negative range that are reachable from the output commits. --maximal as defined here _restricts_ to the output of commits in the range. It's interaction with --boundary is trivial because no boundary commits would be included as they are necessarily reachable from a maximal commit. > The option is listed among options that affect the way the > simplification is performed. But is this true? Isn't it just an option > that changes what output is produced? You're right that this is poorly placed. I'll put it in a better location in v2. Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] revision: add --maximal option 2026-01-18 18:27 ` Derrick Stolee @ 2026-01-19 11:15 ` Johannes Sixt 2026-01-19 16:44 ` Derrick Stolee 2026-01-20 0:22 ` Junio C Hamano 0 siblings, 2 replies; 19+ messages in thread From: Johannes Sixt @ 2026-01-19 11:15 UTC (permalink / raw) To: Derrick Stolee; +Cc: gitster, git, Derrick Stolee via GitGitGadget Am 18.01.26 um 19:27 schrieb Derrick Stolee: > On 1/18/26 4:05 AM, Johannes Sixt wrote: >> Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget: >> > The option name is too generic IMHO. How about "--starting-point", >> "--topmost-only"? It's function is somewhat parallel to --boundary, but >> at the positive end of the revision range. Perhaps we can use that as >> inspiration. > > My perspective is skewed, because "maximal" is a concrete term in the > world of partially-ordered sets (such as commit history ordered by > reachability across child-to-parent relationships). It's important to > distinguish from "starting points" because the inputs to the command > are a list of starting points, not all of which are maximal within the > set. In fact, if some positive starting points are reachable from the > negative starting points, then they are already excluded. AFAICS, we don't have options named after graph- or set-theoretical terms, but tend to stick to terms established in the Git ecosystem. I assume that "maximal" isn't a meaning that an average Git user would associate with the operation that is performed here. But even if we decide to use "maximal", the option must be named something other than *just* "--maximal"; this is simply too generic. Perhaps "--only-maximal" or "--maximal-only". Other ideas: - --hide-reachable - --range-head - --range-head-only - --most-recent - --most-recent-only > [--maximal]'s interaction with > --boundary is trivial because no boundary commits would be included as > they are necessarily reachable from a maximal commit. So, --boundary --maximal shows only the maximal commits? That sounds unexpected. Boundary commits are shown with additional mark-up; they don't need to be suppressed. But in a first iteration it's probably better to just make the two options incompatible. -- Hannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] revision: add --maximal option 2026-01-19 11:15 ` Johannes Sixt @ 2026-01-19 16:44 ` Derrick Stolee 2026-01-19 19:05 ` Johannes Sixt 2026-01-20 0:22 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Derrick Stolee @ 2026-01-19 16:44 UTC (permalink / raw) To: Johannes Sixt; +Cc: gitster, git, Derrick Stolee via GitGitGadget On 1/19/2026 6:15 AM, Johannes Sixt wrote: > Am 18.01.26 um 19:27 schrieb Derrick Stolee: >> On 1/18/26 4:05 AM, Johannes Sixt wrote: >>> Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget: >>>> The option name is too generic IMHO. How about "--starting-point", >>> "--topmost-only"? It's function is somewhat parallel to --boundary, but >>> at the positive end of the revision range. Perhaps we can use that as >>> inspiration. >> >> My perspective is skewed, because "maximal" is a concrete term in the >> world of partially-ordered sets (such as commit history ordered by >> reachability across child-to-parent relationships). It's important to >> distinguish from "starting points" because the inputs to the command >> are a list of starting points, not all of which are maximal within the >> set. In fact, if some positive starting points are reachable from the >> negative starting points, then they are already excluded. > > AFAICS, we don't have options named after graph- or set-theoretical > terms, but tend to stick to terms established in the Git ecosystem. I > assume that "maximal" isn't a meaning that an average Git user would > associate with the operation that is performed here. My mindset is usually "all words are made up by somebody" and since there isn't an established term for this in the existing Git ecosystem, it is up to us to create a term. Borrowing one that exists elsewhere is a valuable way to build upon any context that term brings with it. It is also helpful that the term has an explicit technical definition that means exactly what we're using it for here. It explicitly differentiates from any "maximum" or confusion with a collapse to a total order (such as Git's --date-order or --topo-order apply). > But even if we decide to use "maximal", the option must be named > something other than *just* "--maximal"; this is simply too generic. > Perhaps "--only-maximal" or "--maximal-only". When the argument is moved in the documentation into the set of filters, then the fact that --maximal restricts the set of commits makes any modifier such as "only" redundant. > Other ideas: > - --hide-reachable > - --range-head > - --range-head-only > - --most-recent > - --most-recent-only These all have issues, such as being technically wrong (maximal commits are reachable) or imply total orders, date orders, or generally only a single result. >> [--maximal]'s interaction with >> --boundary is trivial because no boundary commits would be included as >> they are necessarily reachable from a maximal commit. > > So, --boundary --maximal shows only the maximal commits? That sounds > unexpected. Boundary commits are shown with additional mark-up; they > don't need to be suppressed. But in a first iteration it's probably > better to just make the two options incompatible. Sure. But I'd like to counter that filters like --author also restrict the set, including not showing boundary commits that don't fit the --author pattern. It just happens that no boundary commits are also maximal by definition. I do sense that a lot of this is a matter of taste, and that you and I differ greatly in our tastes on this topic. I look forward to more opinions that can lead us towards one side or another (or in a new direction). Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] revision: add --maximal option 2026-01-19 16:44 ` Derrick Stolee @ 2026-01-19 19:05 ` Johannes Sixt 0 siblings, 0 replies; 19+ messages in thread From: Johannes Sixt @ 2026-01-19 19:05 UTC (permalink / raw) To: Derrick Stolee; +Cc: gitster, git, Derrick Stolee via GitGitGadget Am 19.01.26 um 17:44 schrieb Derrick Stolee: > When the argument is moved in the documentation into the set of > filters, then the fact that --maximal restricts the set of commits > makes any modifier such as "only" redundant. Keep in mind that rev-list arguments are also understood by git-log. Then one could expect --maximal to be somehow the opposite of --minimal. (Which is badly named for the same reason, but that ship has sailed.) -- Hannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] revision: add --maximal option 2026-01-19 11:15 ` Johannes Sixt 2026-01-19 16:44 ` Derrick Stolee @ 2026-01-20 0:22 ` Junio C Hamano 2026-01-22 15:08 ` Derrick Stolee 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2026-01-20 0:22 UTC (permalink / raw) To: Johannes Sixt; +Cc: Derrick Stolee, git, Derrick Stolee via GitGitGadget Johannes Sixt <j6t@kdbg.org> writes: Johannes Sixt <j6t@kdbg.org> writes: > But even if we decide to use "maximal", the option must be named > something other than *just* "--maximal"; this is simply too generic. > Perhaps "--only-maximal" or "--maximal-only". > > Other ideas: > - --hide-reachable > - --range-head > - --range-head-only > - --most-recent > - --most-recent-only > >> [--maximal]'s interaction with >> --boundary is trivial because no boundary commits would be included as >> they are necessarily reachable from a maximal commit. > > So, --boundary --maximal shows only the maximal commits? That sounds > unexpected. Boundary commits are shown with additional mark-up; they > don't need to be suppressed. But in a first iteration it's probably > better to just make the two options incompatible. If I am reading the answer to "what is minimal/maximal elements in partially ordered set?" correctly, our "--boundary" essentially is to show direct parents of those commits that would be shown with the (nonexistent) "--minimal-only" option. So I agree with you that it makes perfect sense to make "--boundary" and "--maximal-only" incompatible (it is like asking for both "--minimal-only" and "--maximal-only" at the same time). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] revision: add --maximal option 2026-01-20 0:22 ` Junio C Hamano @ 2026-01-22 15:08 ` Derrick Stolee 0 siblings, 0 replies; 19+ messages in thread From: Derrick Stolee @ 2026-01-22 15:08 UTC (permalink / raw) To: Junio C Hamano, Johannes Sixt; +Cc: git, Derrick Stolee via GitGitGadget On 1/19/26 7:22 PM, Junio C Hamano wrote: > Johannes Sixt <j6t@kdbg.org> writes: >> But even if we decide to use "maximal", the option must be named >> something other than *just* "--maximal"; this is simply too generic. >> Perhaps "--only-maximal" or "--maximal-only". >> >> Other ideas: >> - --hide-reachable >> - --range-head >> - --range-head-only >> - --most-recent >> - --most-recent-only >> >>> [--maximal]'s interaction with >>> --boundary is trivial because no boundary commits would be included as >>> they are necessarily reachable from a maximal commit. >> >> So, --boundary --maximal shows only the maximal commits? That sounds >> unexpected. Boundary commits are shown with additional mark-up; they >> don't need to be suppressed. But in a first iteration it's probably >> better to just make the two options incompatible. > > If I am reading the answer to "what is minimal/maximal elements in > partially ordered set?" correctly, our "--boundary" essentially is > to show direct parents of those commits that would be shown with the > (nonexistent) "--minimal-only" option. So I agree with you that it > makes perfect sense to make "--boundary" and "--maximal-only" > incompatible (it is like asking for both "--minimal-only" and > "--maximal-only" at the same time). The existence of a --minimal option that doesn't match the mirror of my suggested --maximal option convinces me to move to --maximal-only. I will send a v2 shortly that updates this and moves the documentation next to other filtering options. I'll also mark --boundary and --maximal-only as incompatible to avoid confusion. Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] revision: add --maximal-only option 2026-01-18 2:34 [PATCH] revision: add --maximal option Derrick Stolee via GitGitGadget 2026-01-18 9:05 ` Johannes Sixt @ 2026-01-22 16:05 ` Derrick Stolee via GitGitGadget 2026-01-22 21:44 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Derrick Stolee via GitGitGadget @ 2026-01-22 16:05 UTC (permalink / raw) To: git; +Cc: gitster, Johannes Sixt, Derrick Stolee, Derrick Stolee From: Derrick Stolee <stolee@gmail.com> When inspecting a range of commits from some set of starting references, it is sometimes useful to learn which commits are not reachable from any other commits in the selected range. One such application is in the creation of a sequence of bundles for the bundle URI feature. Creating a stack of bundles representing different slices of time includes defining which references to include. If all references are used, then this may be overwhelming or redundant. Instead, selecting commits that are maximal to the range could help defining a smaller reference set to use in the bundle header. Add a new '--maximal-only' option to restrict the output of a revision range to be only the commits that are not reachable from any other commit in the range, based on the reachability definition of the walk. This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is set as we walk. This does extend the bit range in object.h, but using an earlier bit may collide with another feature. The tests demonstrate the behavior of the feature with a positive-only range, ranges with negative references, and walk-modifying flags like --first-parent and --exclude-first-parent-only. Since the --boundary option would not increase any results when used with the --maximal-only option, mark them as incompatible. Signed-off-by: Derrick Stolee <stolee@gmail.com> --- revision: add --maximal-only option My motivation for this feature is very similar to the bundle URI application. I can get around it by creating a tool that uses git rev-list --parents and then uses a hashset to collect the parent list and filter out any commits that ever appear as parents. It would be more efficient to use Git's native revision-walking feature. This does bring the object struct up to a 32-bit boundary with 28 flag bits, 3 type bits, and a parsed bit. That's the biggest concern I have about this update adding a new flag bit. I would understand if this feature is not worth running out of room for extensions there. I considered looking through the earlier bit positions to see the impact of an overlap, but they certainly looked potentially risky to reuse. I wonder if anyone else has thought about this as a useful technique. For instance, it could be part of a strategy for choosing commits for reachability bitmaps. Updates in v2 ============= * option is now called --maximal-only. * Documentation is moved within the commit-filtering options not the walk-altering options. * --boundary and --maximal-only are marked as incompatible. Thanks, -Stolee Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2032%2Fderrickstolee%2Fmaximal-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2032/derrickstolee/maximal-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/2032 Range-diff vs v1: 1: 889a2737bc ! 1: 54fbf36a1f revision: add --maximal option @@ Metadata Author: Derrick Stolee <stolee@gmail.com> ## Commit message ## - revision: add --maximal option + revision: add --maximal-only option When inspecting a range of commits from some set of starting references, it is sometimes useful to learn which commits are not reachable from any other @@ Commit message selecting commits that are maximal to the range could help defining a smaller reference set to use in the bundle header. - Add a new '--maximal' option to restrict the output of a revision range to - be only the commits that are not reachable from any other commit in the + Add a new '--maximal-only' option to restrict the output of a revision range + to be only the commits that are not reachable from any other commit in the range, based on the reachability definition of the walk. This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is @@ Commit message range, ranges with negative references, and walk-modifying flags like --first-parent and --exclude-first-parent-only. + Since the --boundary option would not increase any results when used with + the --maximal-only option, mark them as incompatible. + Signed-off-by: Derrick Stolee <stolee@gmail.com> ## Documentation/rev-list-options.adoc ## -@@ Documentation/rev-list-options.adoc: The following options affect the way the simplification is performed: - times; if so, a commit is included if it is any of the commits - given or if it is an ancestor or descendant of one of them. +@@ Documentation/rev-list-options.adoc: endif::git-log[] + from the point where it diverged from the remote branch, given + that arbitrary merges can be valid topic branch changes. -+`--maximal`:: ++`--maximal-only`:: + Restrict the output commits to be those that are not reachable + from any other commits in the revision range. + - A more detailed explanation follows. - - Suppose you specified `foo` as the _<paths>_. We shall call commits + `--not`:: + Reverses the meaning of the '{caret}' prefix (or lack thereof) + for all following revision specifiers, up to the next `--not`. ## object.h ## @@ object.h: void object_array_init(struct object_array *array); @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *com p->object.flags |= (SEEN | NOT_USER_GIVEN); if (list) @@ revision.c: static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg + } else if ((argcount = parse_long_opt("until", argv, &optarg))) { + revs->min_age = approxidate(optarg); + return argcount; ++ } else if (!strcmp(arg, "--maximal-only")) { ++ revs->maximal_only = 1; + } else if (!strcmp(arg, "--first-parent")) { revs->first_parent_only = 1; } else if (!strcmp(arg, "--exclude-first-parent-only")) { - revs->exclude_first_parent_only = 1; -+ } else if (!strcmp(arg, "--maximal")) { -+ revs->maximal = 1; - } else if (!strcmp(arg, "--ancestry-path")) { - revs->ancestry_path = 1; - revs->simplify_history = 0; +@@ revision.c: int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s + !!revs->reverse, "--reverse", + !!revs->reflog_info, "--walk-reflogs"); + ++ die_for_incompatible_opt2(!!revs->boundary, "--boundary", ++ !!revs->maximal_only, "--maximal-only"); ++ + if (revs->no_walk && revs->graph) + die(_("options '%s' and '%s' cannot be used together"), "--no-walk", "--graph"); + if (!revs->reflog_info && revs->grep_filter.use_reflog_filter) @@ revision.c: enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi { if (commit->object.flags & SHOWN) return commit_ignore; -+ if (revs->maximal && (commit->object.flags & CHILD_VISITED)) ++ if (revs->maximal_only && (commit->object.flags & CHILD_VISITED)) + return commit_ignore; if (revs->unpacked && has_object_pack(revs->repo, &commit->object.oid)) return commit_ignore; @@ revision.h #define DECORATE_SHORT_REFS 1 #define DECORATE_FULL_REFS 2 @@ revision.h: struct rev_info { - cherry_mark:1, - bisect:1, - ancestry_path:1, -+ maximal:1, + left_right:1, + left_only:1, + right_only:1, ++ maximal_only:1, + rewrite_parents:1, + print_parents:1, + show_decorations:1, + + ## t/t6000-rev-list-misc.sh ## +@@ t/t6000-rev-list-misc.sh: test_expect_success 'rev-list -z --boundary' ' + test_cmp expect actual + ' - /* True if --ancestry-path was specified without an - * argument. The bottom revisions are implicitly ++test_expect_success 'rev-list --boundary incompatible with --maximal-only' ' ++ test_when_finished rm -rf repo && ++ ++ git init repo && ++ test_commit -C repo 1 && ++ test_commit -C repo 2 && ++ ++ oid1=$(git -C repo rev-parse HEAD~) && ++ oid2=$(git -C repo rev-parse HEAD) && ++ ++ test_must_fail git -C repo rev-list --boundary --maximal-only \ ++ HEAD~1..HEAD 2>err && ++ test_grep "cannot be used together" err ++' ++ + test_done ## t/t6600-test-reach.sh ## @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' ' --sort=refname --sort=-is-base:commit-2-3 ' -+test_expect_success 'rev-list --maximal (all positive)' ' ++test_expect_success 'rev-list --maximal-only (all positive)' ' + # Only one maximal. + cat >input <<-\EOF && + refs/heads/commit-1-1 @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' ' + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-8-4) + EOF -+ run_all_modes git rev-list --maximal --stdin && ++ run_all_modes git rev-list --maximal-only --stdin && + + # All maximal. + cat >input <<-\EOF && @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' ' + $(git rev-parse refs/heads/commit-3-4) + $(git rev-parse refs/heads/commit-2-5) + EOF -+ run_all_modes git rev-list --maximal --stdin && ++ run_all_modes git rev-list --maximal-only --stdin && + + # Mix of both. + cat >input <<-\EOF && @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' ' + $(git rev-parse refs/heads/commit-5-2) + $(git rev-parse refs/heads/commit-2-5) + EOF -+ run_all_modes git rev-list --maximal --stdin ++ run_all_modes git rev-list --maximal-only --stdin +' + -+test_expect_success 'rev-list --maximal (range)' ' ++test_expect_success 'rev-list --maximal-only (range)' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-2-5 @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' ' + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-6-4) + EOF -+ run_all_modes git rev-list --maximal --stdin && ++ run_all_modes git rev-list --maximal-only --stdin && + + # first-parent changes reachability: the first parent + # reduces the second coordinate to 1 before reducing the @@ t/t6600-test-reach.sh: test_expect_success 'for-each-ref is-base: --sort' ' + $(git rev-parse refs/heads/commit-6-4) + $(git rev-parse refs/heads/commit-2-5) + EOF -+ run_all_modes git rev-list --maximal --stdin \ ++ run_all_modes git rev-list --maximal-only --stdin \ + --first-parent --exclude-first-parent-only +' + Documentation/rev-list-options.adoc | 4 ++ object.h | 4 +- revision.c | 12 ++++- revision.h | 5 +- t/t6000-rev-list-misc.sh | 15 ++++++ t/t6600-test-reach.sh | 75 +++++++++++++++++++++++++++++ 6 files changed, 110 insertions(+), 5 deletions(-) diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc index 453ec59057..a39cf88bbc 100644 --- a/Documentation/rev-list-options.adoc +++ b/Documentation/rev-list-options.adoc @@ -148,6 +148,10 @@ endif::git-log[] from the point where it diverged from the remote branch, given that arbitrary merges can be valid topic branch changes. +`--maximal-only`:: + Restrict the output commits to be those that are not reachable + from any other commits in the revision range. + `--not`:: Reverses the meaning of the '{caret}' prefix (or lack thereof) for all following revision specifiers, up to the next `--not`. diff --git a/object.h b/object.h index 4bca957b8d..dfe7a1f0ea 100644 --- a/object.h +++ b/object.h @@ -64,7 +64,7 @@ void object_array_init(struct object_array *array); /* * object flag allocation: - * revision.h: 0---------10 15 23------27 + * revision.h: 0---------10 15 23--------28 * fetch-pack.c: 01 67 * negotiator/default.c: 2--5 * walker.c: 0-2 @@ -86,7 +86,7 @@ void object_array_init(struct object_array *array); * builtin/unpack-objects.c: 2021 * pack-bitmap.h: 2122 */ -#define FLAG_BITS 28 +#define FLAG_BITS 29 #define TYPE_BITS 3 diff --git a/revision.c b/revision.c index 1858e093ee..2dee78b838 100644 --- a/revision.c +++ b/revision.c @@ -1150,7 +1150,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit, struct commit *p = parent->item; parent = parent->next; if (p) - p->object.flags |= UNINTERESTING; + p->object.flags |= UNINTERESTING | + CHILD_VISITED; if (repo_parse_commit_gently(revs->repo, p, 1) < 0) continue; if (p->parents) @@ -1204,7 +1205,7 @@ static int process_parents(struct rev_info *revs, struct commit *commit, if (!*slot) *slot = *revision_sources_at(revs->sources, commit); } - p->object.flags |= pass_flags; + p->object.flags |= pass_flags | CHILD_VISITED; if (!(p->object.flags & SEEN)) { p->object.flags |= (SEEN | NOT_USER_GIVEN); if (list) @@ -2377,6 +2378,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } else if ((argcount = parse_long_opt("until", argv, &optarg))) { revs->min_age = approxidate(optarg); return argcount; + } else if (!strcmp(arg, "--maximal-only")) { + revs->maximal_only = 1; } else if (!strcmp(arg, "--first-parent")) { revs->first_parent_only = 1; } else if (!strcmp(arg, "--exclude-first-parent-only")) { @@ -3147,6 +3150,9 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s !!revs->reverse, "--reverse", !!revs->reflog_info, "--walk-reflogs"); + die_for_incompatible_opt2(!!revs->boundary, "--boundary", + !!revs->maximal_only, "--maximal-only"); + if (revs->no_walk && revs->graph) die(_("options '%s' and '%s' cannot be used together"), "--no-walk", "--graph"); if (!revs->reflog_info && revs->grep_filter.use_reflog_filter) @@ -4125,6 +4131,8 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi { if (commit->object.flags & SHOWN) return commit_ignore; + if (revs->maximal_only && (commit->object.flags & CHILD_VISITED)) + return commit_ignore; if (revs->unpacked && has_object_pack(revs->repo, &commit->object.oid)) return commit_ignore; if (revs->no_kept_objects) { diff --git a/revision.h b/revision.h index b36acfc2d9..69242ecb18 100644 --- a/revision.h +++ b/revision.h @@ -52,7 +52,9 @@ #define NOT_USER_GIVEN (1u<<25) #define TRACK_LINEAR (1u<<26) #define ANCESTRY_PATH (1u<<27) -#define ALL_REV_FLAGS (((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE) +#define CHILD_VISITED (1u<<28) +#define ALL_REV_FLAGS (((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR \ + | PULL_MERGE | CHILD_VISITED) #define DECORATE_SHORT_REFS 1 #define DECORATE_FULL_REFS 2 @@ -189,6 +191,7 @@ struct rev_info { left_right:1, left_only:1, right_only:1, + maximal_only:1, rewrite_parents:1, print_parents:1, show_decorations:1, diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh index fec16448cf..d0a2a86610 100755 --- a/t/t6000-rev-list-misc.sh +++ b/t/t6000-rev-list-misc.sh @@ -248,4 +248,19 @@ test_expect_success 'rev-list -z --boundary' ' test_cmp expect actual ' +test_expect_success 'rev-list --boundary incompatible with --maximal-only' ' + test_when_finished rm -rf repo && + + git init repo && + test_commit -C repo 1 && + test_commit -C repo 2 && + + oid1=$(git -C repo rev-parse HEAD~) && + oid2=$(git -C repo rev-parse HEAD) && + + test_must_fail git -C repo rev-list --boundary --maximal-only \ + HEAD~1..HEAD 2>err && + test_grep "cannot be used together" err +' + test_done diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index 6638d1aa1d..2613075894 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -762,4 +762,79 @@ test_expect_success 'for-each-ref is-base: --sort' ' --sort=refname --sort=-is-base:commit-2-3 ' +test_expect_success 'rev-list --maximal-only (all positive)' ' + # Only one maximal. + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-4-2 + refs/heads/commit-4-4 + refs/heads/commit-8-4 + EOF + + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-8-4) + EOF + run_all_modes git rev-list --maximal-only --stdin && + + # All maximal. + cat >input <<-\EOF && + refs/heads/commit-5-2 + refs/heads/commit-4-3 + refs/heads/commit-3-4 + refs/heads/commit-2-5 + EOF + + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-5-2) + $(git rev-parse refs/heads/commit-4-3) + $(git rev-parse refs/heads/commit-3-4) + $(git rev-parse refs/heads/commit-2-5) + EOF + run_all_modes git rev-list --maximal-only --stdin && + + # Mix of both. + cat >input <<-\EOF && + refs/heads/commit-5-2 + refs/heads/commit-3-2 + refs/heads/commit-2-5 + EOF + + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-5-2) + $(git rev-parse refs/heads/commit-2-5) + EOF + run_all_modes git rev-list --maximal-only --stdin +' + +test_expect_success 'rev-list --maximal-only (range)' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-2-5 + refs/heads/commit-6-4 + ^refs/heads/commit-4-5 + EOF + + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-6-4) + EOF + run_all_modes git rev-list --maximal-only --stdin && + + # first-parent changes reachability: the first parent + # reduces the second coordinate to 1 before reducing the + # first coordinate. + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-2-5 + refs/heads/commit-6-4 + ^refs/heads/commit-4-5 + EOF + + cat >expect <<-EOF && + $(git rev-parse refs/heads/commit-6-4) + $(git rev-parse refs/heads/commit-2-5) + EOF + run_all_modes git rev-list --maximal-only --stdin \ + --first-parent --exclude-first-parent-only +' + test_done base-commit: b5c409c40f1595e3e590760c6f14a16b6683e22c -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget @ 2026-01-22 21:44 ` Junio C Hamano 2026-01-22 22:15 ` Derrick Stolee 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2026-01-22 21:44 UTC (permalink / raw) To: Derrick Stolee via GitGitGadget; +Cc: git, Johannes Sixt, Derrick Stolee "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > My motivation for this feature is very similar to the bundle URI > application. I can get around it by creating a tool that uses git > rev-list --parents and then uses a hashset to collect the parent list > and filter out any commits that ever appear as parents. It would be more > efficient to use Git's native revision-walking feature. How does this relate to "git merge-base --independent", or do they compute completely different things? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-22 21:44 ` Junio C Hamano @ 2026-01-22 22:15 ` Derrick Stolee 2026-01-22 23:11 ` Junio C Hamano 2026-01-23 6:38 ` Johannes Sixt 0 siblings, 2 replies; 19+ messages in thread From: Derrick Stolee @ 2026-01-22 22:15 UTC (permalink / raw) To: Junio C Hamano, Derrick Stolee via GitGitGadget; +Cc: git, Johannes Sixt On 1/22/2026 4:44 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> My motivation for this feature is very similar to the bundle URI >> application. I can get around it by creating a tool that uses git >> rev-list --parents and then uses a hashset to collect the parent list >> and filter out any commits that ever appear as parents. It would be more >> efficient to use Git's native revision-walking feature. > > How does this relate to "git merge-base --independent", or do they > compute completely different things? This is the same idea, where among the merge bases, the ones that are "maximal" to that set are the --independent ones. The documentation for --independent even uses similar language here: In other words, among the commits given, list those which cannot be reached from any other. Unfortunately, it also says "print a minimal subset" which in some sense is correct by "it cannot be made smaller without losing information" but we actually choose the maximal set there, not a minimal set. The merge-base --independent calculation is basically asking for the maximal set among commits in the intersection of two (or more) commit histories. One trick the merge-base calculation does is that it first looks for the --boundary commits, and then reduces from within that set. This avoid walking further into the history than necessary. You are presenting interesting overlaps of terminology and needs. One thing that is different about 'git rev-list --maximal-only' with a list of starting commits is that it wants the maximal set from the _union_ of the histories, instead of the _intersection_ like 'git merge-base --independent' does. There is potential for a performance improvement if we converted the search to be like a merge-base algorithm and checked the priority queue to see if all elements have the CHILD_VISITED flag. I think the cost here is that we would need more new logic and would lose some expressiveness of 'git rev-list'. For example, one of the applications I mentioned will require a range request including negative refs, such as git rev-list --maximal-only --stdin <<-\EOF refs/heads/branch1 refs/heads/branch2 ^refs/heads/main ^refs/heads/release EOF And this would likely return the tips of the two branches, but also will detect if one already reaches the other or if one of 'main' or 'release' reaches one or both of them, excluding it from the maximal set. Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-22 22:15 ` Derrick Stolee @ 2026-01-22 23:11 ` Junio C Hamano 2026-01-23 6:38 ` Johannes Sixt 1 sibling, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2026-01-22 23:11 UTC (permalink / raw) To: Derrick Stolee; +Cc: Derrick Stolee via GitGitGadget, git, Johannes Sixt Derrick Stolee <stolee@gmail.com> writes: > The merge-base --independent calculation is basically asking for > the maximal set among commits in the intersection of two (or more) > commit histories. One trick the merge-base calculation does is > that it first looks for the --boundary commits, and then reduces > from within that set. This avoid walking further into the history > than necessary. > ... > And this would likely return the tips of the two branches, but > also will detect if one already reaches the other or if one of > 'main' or 'release' reaches one or both of them, excluding it > from the maximal set. Thanks for your thoughts. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-22 22:15 ` Derrick Stolee 2026-01-22 23:11 ` Junio C Hamano @ 2026-01-23 6:38 ` Johannes Sixt 2026-01-23 15:58 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Johannes Sixt @ 2026-01-23 6:38 UTC (permalink / raw) To: Derrick Stolee; +Cc: git, Derrick Stolee via GitGitGadget, Junio C Hamano Am 22.01.26 um 23:15 schrieb Derrick Stolee: > Unfortunately, it also says "print a minimal subset" which in some > sense is correct by "it cannot be made smaller without losing > information" but we actually choose the maximal set there, not a > minimal set. > ... > You are presenting interesting overlaps of terminology and needs. > One thing that is different about 'git rev-list --maximal-only' with > a list of starting commits is that it wants the maximal set from > the _union_ of the histories, instead of the _intersection_ like > 'git merge-base --independent' does. I don't quite understand how a union or intersection come into play here. The difference between the two is that `git rev-list --maximal-only` permits negative revisions as input, but `git merge-base --independent` does not. In the case where the input is only positive revisions, the result of --maximal-only should always be exactly identical to --independent, right? Even if the revisions are on disconnected histories? -- Hannes ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-23 6:38 ` Johannes Sixt @ 2026-01-23 15:58 ` Junio C Hamano 2026-01-23 16:55 ` Derrick Stolee 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2026-01-23 15:58 UTC (permalink / raw) To: Johannes Sixt; +Cc: Derrick Stolee, git, Derrick Stolee via GitGitGadget Johannes Sixt <j6t@kdbg.org> writes: > Am 22.01.26 um 23:15 schrieb Derrick Stolee: >> Unfortunately, it also says "print a minimal subset" which in some >> sense is correct by "it cannot be made smaller without losing >> information" but we actually choose the maximal set there, not a >> minimal set. >> ... >> You are presenting interesting overlaps of terminology and needs. >> One thing that is different about 'git rev-list --maximal-only' with >> a list of starting commits is that it wants the maximal set from >> the _union_ of the histories, instead of the _intersection_ like >> 'git merge-base --independent' does. > > I don't quite understand how a union or intersection come into play > here. The difference between the two is that `git rev-list > --maximal-only` permits negative revisions as input, but `git merge-base > --independent` does not. In the case where the input is only positive > revisions, the result of --maximal-only should always be exactly > identical to --independent, right? Even if the revisions are on > disconnected histories? Ahh, it is an ancient history that I forgot how the command worked. "merge-base --independent A B C" does not do any "merge-base" computation over the commits A B C and shows the ones that cannot be reached from any other. If it were to compute merge bases across these commits and then find commits, among the computed merge bases, that cannot be reached from any other merge bases, "intersection" might come into play, but I do not think that is what the command does. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-23 15:58 ` Junio C Hamano @ 2026-01-23 16:55 ` Derrick Stolee 2026-01-23 18:08 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Derrick Stolee @ 2026-01-23 16:55 UTC (permalink / raw) To: Junio C Hamano, Johannes Sixt; +Cc: git, Derrick Stolee via GitGitGadget On 1/23/2026 10:58 AM, Junio C Hamano wrote: > Johannes Sixt <j6t@kdbg.org> writes: > >> Am 22.01.26 um 23:15 schrieb Derrick Stolee: >>> Unfortunately, it also says "print a minimal subset" which in some >>> sense is correct by "it cannot be made smaller without losing >>> information" but we actually choose the maximal set there, not a >>> minimal set. >>> ... >>> You are presenting interesting overlaps of terminology and needs. >>> One thing that is different about 'git rev-list --maximal-only' with >>> a list of starting commits is that it wants the maximal set from >>> the _union_ of the histories, instead of the _intersection_ like >>> 'git merge-base --independent' does. >> >> I don't quite understand how a union or intersection come into play >> here. The difference between the two is that `git rev-list >> --maximal-only` permits negative revisions as input, but `git merge-base >> --independent` does not. In the case where the input is only positive >> revisions, the result of --maximal-only should always be exactly >> identical to --independent, right? Even if the revisions are on >> disconnected histories? > > Ahh, it is an ancient history that I forgot how the command worked. > "merge-base --independent A B C" does not do any "merge-base" > computation over the commits A B C and shows the ones that cannot be > reached from any other. If it were to compute merge bases across > these commits and then find commits, among the computed merge bases, > that cannot be reached from any other merge bases, "intersection" > might come into play, but I do not think that is what the command > does. Interesting. Thanks for the correction. So we _do_ have a way to get this information for a range that doesn't have negative refs or other custom walk modifiers (and this implementation would be faster for this case). My patch includes test cases that are not covered by the merge-base command. I don't think it would be valuable to extend the merge-base command with even more cases that don't actually output merge-bases / intersections. Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-23 16:55 ` Derrick Stolee @ 2026-01-23 18:08 ` Junio C Hamano 2026-01-28 14:28 ` Derrick Stolee 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2026-01-23 18:08 UTC (permalink / raw) To: Derrick Stolee; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget Derrick Stolee <stolee@gmail.com> writes: > Interesting. Thanks for the correction. So we _do_ have a way to > get this information for a range that doesn't have negative refs > or other custom walk modifiers (and this implementation would be > faster for this case). Perhaps. If so, perhaps we can improve --maximal-only (and possibly rename it to --independent? I dunno about this part) by special casing the logic, and then steer people to use the new implementation that can use negative ends, deprecating "merge-base --independent" (which was written to be a better "show-branch --independent")? > My patch includes test cases that are not covered by the > merge-base command. I don't think it would be valuable to extend > the merge-base command with even more cases that don't actually > output merge-bases / intersections. Yup, I do not think show-branch nor merge-base were good home for the feature. We only needed to make reduce_heads_replace() available somewhere, and "git show --maximal-only A B C" might be a much better way to express "show only the independent ones", as it would allow using all kinds of output options the "log" family of commands support. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-23 18:08 ` Junio C Hamano @ 2026-01-28 14:28 ` Derrick Stolee 2026-01-29 0:14 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Derrick Stolee @ 2026-01-28 14:28 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget On 1/23/26 1:08 PM, Junio C Hamano wrote: > Derrick Stolee <stolee@gmail.com> writes: > >> Interesting. Thanks for the correction. So we _do_ have a way to >> get this information for a range that doesn't have negative refs >> or other custom walk modifiers (and this implementation would be >> faster for this case). > > Perhaps. If so, perhaps we can improve --maximal-only (and possibly > rename it to --independent? I dunno about this part) by special > casing the logic, and then steer people to use the new implementation > that can use negative ends, deprecating "merge-base --independent" > (which was written to be a better "show-branch --independent")? > >> My patch includes test cases that are not covered by the >> merge-base command. I don't think it would be valuable to extend >> the merge-base command with even more cases that don't actually >> output merge-bases / intersections. > > Yup, I do not think show-branch nor merge-base were good home for > the feature. We only needed to make reduce_heads_replace() > available somewhere, and "git show --maximal-only A B C" might be a > much better way to express "show only the independent ones", as it > would allow using all kinds of output options the "log" family of > commands support. I explored some of these directions, and I see the value of allowing a --maximal-only option to them in the future. I have some concerns about them not solving the needs I have that this 'git rev-list' implementation provides. I believe that you're suggesting that these are other places where a user could benefit from such an option, and I agree. Can we delay such extensions to another series? Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-28 14:28 ` Derrick Stolee @ 2026-01-29 0:14 ` Junio C Hamano 2026-01-29 14:57 ` Derrick Stolee 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2026-01-29 0:14 UTC (permalink / raw) To: Derrick Stolee; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget Derrick Stolee <stolee@gmail.com> writes: >> Yup, I do not think show-branch nor merge-base were good home for >> the feature. We only needed to make reduce_heads_replace() >> available somewhere, and "git show --maximal-only A B C" might be a >> much better way to express "show only the independent ones", as it >> would allow using all kinds of output options the "log" family of >> commands support. > > I explored some of these directions, and I see the value of allowing > a --maximal-only option to them in the future. I have some concerns > about them not solving the needs I have that this 'git rev-list' > implementation provides. I believe that you're suggesting that these > are other places where a user could benefit from such an option, and > I agree. > > Can we delay such extensions to another series? Absolutely, as long as we all agree on what the longer term direction is, which includes educating existing users of "show-branch --independent" and "merge-base --independent" that "rev-list --maximal-only" is the future even for their "positive end only" use cases and it also can work on a history bounded by both positive and negative ends. The only small thing we need to decide here in the above is that "--maximal-only" is understandable as an appropriate name for a superset of "--independent" by those who are used to what the latter has been doing for the past 15 years or so. As long as with such understanding, it can be left to the future to even advertise this option as a better alternative for existing "--independent" option in the manual pages of these other two commands. THanks. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] revision: add --maximal-only option 2026-01-29 0:14 ` Junio C Hamano @ 2026-01-29 14:57 ` Derrick Stolee 0 siblings, 0 replies; 19+ messages in thread From: Derrick Stolee @ 2026-01-29 14:57 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Sixt, git, Derrick Stolee via GitGitGadget On 1/28/2026 7:14 PM, Junio C Hamano wrote: > Derrick Stolee <stolee@gmail.com> writes: > >>> Yup, I do not think show-branch nor merge-base were good home for >>> the feature. We only needed to make reduce_heads_replace() >>> available somewhere, and "git show --maximal-only A B C" might be a >>> much better way to express "show only the independent ones", as it >>> would allow using all kinds of output options the "log" family of >>> commands support. >> >> I explored some of these directions, and I see the value of allowing >> a --maximal-only option to them in the future. I have some concerns >> about them not solving the needs I have that this 'git rev-list' >> implementation provides. I believe that you're suggesting that these >> are other places where a user could benefit from such an option, and >> I agree. >> >> Can we delay such extensions to another series? > > Absolutely, as long as we all agree on what the longer term > direction is, which includes educating existing users of > "show-branch --independent" and "merge-base --independent" that > "rev-list --maximal-only" is the future even for their "positive end > only" use cases and it also can work on a history bounded by both > positive and negative ends. I agree to this direction and have some drafts in this direction. > The only small thing we need to decide here in the above is that > "--maximal-only" is understandable as an appropriate name for a > superset of "--independent" by those who are used to what the > latter has been doing for the past 15 years or so. My strong preference for using the word "maximal" somewhere over continued reliance on "independent" is that a set of commits can be "mutually independent" without any of the commits actually being maximal within the range. Here's an example: A B |\ /| D E F G \ \ / / H I J \|/ K In this commit history graph, each row is an "independent" set of commits: { A, B } { D, E, F, G } { H, I, J } { K } and some sets like { A, F, J } are also independent. Only A and B are "maximal" commits within the history. I describe this through an example mostly because I don't feel that I've adequately described this distinction in this thread and would not feel satisfied in my arguments without it. With my reasoning more completely described, I am more ready for someone to overrule my opinion with the argument that "independent" has enough historical context to mean "a maximal independent set". > As long as with such understanding, it can be left to the future to > even advertise this option as a better alternative for existing > "--independent" option in the manual pages of these other two > commands. The other, more complicated, task is to have the rev-list command use the algorithm that backs 'git merge-base --independent' when the input range and options is appropriate for that purpose. This performance-only feature will require more careful construction and review. Thanks, -Stolee ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2026-01-29 14:57 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-01-18 2:34 [PATCH] revision: add --maximal option Derrick Stolee via GitGitGadget 2026-01-18 9:05 ` Johannes Sixt 2026-01-18 18:27 ` Derrick Stolee 2026-01-19 11:15 ` Johannes Sixt 2026-01-19 16:44 ` Derrick Stolee 2026-01-19 19:05 ` Johannes Sixt 2026-01-20 0:22 ` Junio C Hamano 2026-01-22 15:08 ` Derrick Stolee 2026-01-22 16:05 ` [PATCH v2] revision: add --maximal-only option Derrick Stolee via GitGitGadget 2026-01-22 21:44 ` Junio C Hamano 2026-01-22 22:15 ` Derrick Stolee 2026-01-22 23:11 ` Junio C Hamano 2026-01-23 6:38 ` Johannes Sixt 2026-01-23 15:58 ` Junio C Hamano 2026-01-23 16:55 ` Derrick Stolee 2026-01-23 18:08 ` Junio C Hamano 2026-01-28 14:28 ` Derrick Stolee 2026-01-29 0:14 ` Junio C Hamano 2026-01-29 14:57 ` Derrick Stolee
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox