From: Mirko Faina <mroik@delayed.space>
To: "D. Ben Knoble" <ben.knoble@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Patrick Steinhardt" <ps@pks.im>,
"Jean-Noël Avila" <jn.avila@free.fr>, "Jeff King" <peff@peff.net>,
"Mirko Faina" <mroik@delayed.space>
Subject: Re: [PATCH] revision.c: implement --reverse=before for walks
Date: Sun, 19 Apr 2026 22:31:37 +0200 [thread overview]
Message-ID: <aeUqSltEWIWaPDh3@exploit> (raw)
In-Reply-To: <CALnO6CACfSyzyguX4623Dk3y+QEM_Dbmfko8dTyM1p3JxBjZFg@mail.gmail.com>
On Sun, Apr 19, 2026 at 03:12:27PM -0400, D. Ben Knoble wrote:
> On Sun, Apr 19, 2026 at 2:11 PM Mirko Faina <mroik@delayed.space> wrote:
> >
> > On Sun, Apr 19, 2026 at 08:06:24AM -0400, Ben Knoble wrote:
> > > The original handles multiple reverse options inverting each other…
> > >
> > > > + } else if (starts_with(arg, "--reverse")) {
> > > > + if (!skip_prefix(arg, "--reverse=", &optarg)) {
> > > > + if (argc < 2) {
> > > > + revs->reverse = 1;
> > > > + return 1;
> > > > + } else {
> > > > + optarg = argv[1];
> > > > + }
> > > > + }
> > > > +
> > > > + if (!strcmp(optarg, "after")) {
> > > > + revs->reverse = 1;
> > > > + } else if (!strcmp(optarg, "before")) {
> > > > + revs->reverse = 2;
> > > > + } else {
> > > > + revs->reverse = 1;
> > > > + return 1;
> > > > + }
> > > > +
> > > > + return optarg == argv[1] ? 2 : 1;
> > >
> > > …which I don’t see here.
> > >
> > > I’m not familiar with this parsing code though so I can’t add much about the test other than to say it is a bit hard to follow :/
> >
> > Given that it is no longer binary handling multiple reverse can't simply
> > be inverting bits, it wouldn't make sense. This is done before the walk
> > itself, so even from the POV of the user it wouldn't make much sense to
> > reverse multiple times as the order of the applied options before this
> > patch (commit limiting options then reverse) doesn't change.
> >
> > This doesn't break any tests so I assumed it was fine.
>
> I think I mean that
>
> git log --reverse --reverse
>
> shows commits in the same order as "git log"; what should
>
> git log --reverse=after --reverse
>
> do? Or what about preserving the behavior of the original "git log
> --reverse --reverse," which I don't think is done here?
Yes, this is what I was getting at. Since it is no longer binary what
would a double reverse mean? What if "--reverse=after --reverse=before"?
How should that be handled?
> Granted, I don't see this ability documented, and I cannot tell how
> many may scream if we change this behavior, so it's a bit
> hypothetical. But there is an argument for backward compatibility as a
> default, which I think we'd need to justify changing. Perhaps in the
> proposed log message?
>
> (The original seems nonsensical to type, but of course you can imagine
> alias.A=log --reverse <other-stuff>, and then sometimes you want to do
> "git A --reverse" to un-reverse the commits.)
Will do in v2.
> > > I haven’t looked, but it would be nice if we could use an enum instead. Unfortunately that would probably take up more space in the struct, and I suppose the bit-packing is done intentionally for performance.
> >
> > Could define new macros so that the readers don't have to mentally keep
> > track of which value rapresents what. I didn't think that was
> > necessary, should I change it?
>
> Yeah, a few `#define`d constants would make things more readable to
> me, at least, since we can't use the enum without space concerns
> (unless there's a way to bit-pack the enum to only 2 bits?).
Will do in v2.
> > > > if (revs->reverse_output_stage) {
> > > > + if (revs->reverse == 2 && revs->max_count == 0)
> > > > + return NULL;
> > > > +
>
> PS: something I spotted on a second read. [Ignoring reverse=after
> mode] This hunk looks to me like a nice little optimization (return
> nothing if we know max_count says we yield no commits). Of course, I
> could see that being viable early in the function, right? When asking
> get_revision for commits, if max_count is 0, just return NULL.
>
> For reverse=after mode, this condition is only true if the max_count
> was 0 in the previous conditional, also, since we use max_count=-1
> before iterating get_revision_internal. That means the original
> max_count isn't touched. At any rate, it _seems_ to me that the whole
> function could benefit from this optimization… but I wonder if it is
> _necessary_ for correctness of reverse=after in some way that I'm not
> seeing? Since the current version doesn't need the early bailout, why
> does reverse=after?
Just to clarify, "reverse = 2" is "--reverse=before" and not
"--reverse=after".
With "reverse = 2", the snippet of code you're referencing is not an
optimization but a requirement for correctness. With "reverse = 1" we
just keep the max_count as is and it's used by get_revision_internal()
to stop if that limit is reached. What we find in 'reversed' are already
just the commits we need to return.
With "reverse = 2", we first set max_count to -1 and then retrieve the
whole history, then we set max_count to its original value. Then we
return the commits on each call of get_revision(). Now, unlike with
"reverse = 1", we have the whole history in 'reversed', because of that
we need to know when to stop. That's the reason we decrement max_count
only for "reverse = 2" and why "max_count == 0" is checked only for
"reverse = 2".
> > > > c = pop_commit(&revs->commits);
> > > > + if (revs->reverse == 2)
> > > > + revs->max_count--;
> > >
> > > Hm. Why do we decrement here? Again, not an area I’m familiar with, but a bit surprising.
> >
> > get_revision() (in revision.c) handles the reverse option and updates
> > the "struct git_graph". get_revision() then calls
> > get_revision_internal(), which handles commit boundaries and max_count,
> > here is where it gets decreased. Since max_count gets decreased
> > everytime get_revision_internal() is called, if we were to leave
> > max_count as is before the walk (in get_revision() at line 4558), the
> > walk would stop before reaching the root commit. This is why the current
> > --reverse option is applied only after commit limiting options. So
> > instead we set max_count at -1 walking the whole history and storing it
> > in 'reversed'. Now we're in "reverse_output_stage = 1", and in this
> > state we never call get_revision_internal() again, instead we pop
> > commits from 'reversed'. Because of this we have to handle max_count
> > outside get_revision_internal(), so we decrement it in the snippet of
> > code you referenced.
> >
> > A bit verbose but hopefully it'll get my point across.
>
> I don't 100% follow, but I'm out of my depth :)
>
> I think I see that get_revision() effectively has 2 modes pertaining
> to reverse: reverse and reverse output stage (the former falls
> directly into the latter, though).
>
> After some setup, the reverse mode calls get_revision_internal() as
> you said. That decrements max_count as a way of counting how many
> commits we've seen through the loop, so if we asked for 5 we'd only
> process 5 commits.
>
> Then we fall into the output stage mode, which pops a commit [1].
>
> With this patch, in reverse=after we disable max_count in the first
> (reverse) mode, as you said. Ok: we get the whole (filtered) history
> then, at which point we can now shrink. That makes sense.
>
> Then in the reverse output stage mode, we pretend to have one less
> max_count. That's what I can't figure out. Is it because of the
> pop_commit()? I guess I'm not totally seeing how that interacted with
> the max_count in the original code: does the current code yield one
> extra commit in get_revision_internal() ?
I'm not sure I understand what you're referencing with "Then in the
reverse output stage mode, we pretend to have one less max_count".
If you're referring to line 4573, then...
> You wrote that "we never call get_revision_internal() again," but I
> don't see why that's true with this patch and not true before it.
>
> I do agree that _somebody_ has to handle max_count after
> get_revision() returns with reverse=after. I'm just not sure what
>
> if (revs->reverse == 2)
> revs->max_count--;
>
> is doing.
...we're not pretending we have fewer commits. Every subsequent call to
get_revision() after the first call will never enter the branch at line
4548 and will only enter the branch at 4568. Everytime we pop a commit
from 'reversed' we decrease max_count so we can limit only to the amount
of commits the user wants.
So, to recap, with "reverse = 2", on the first call to get_revision() we
walk the whole history and store it in 'reversed' in reversed order and
return the first commit.
On subsequent calls to get_revision() we do not walk the history again,
we simply return the commits that have been stored in 'reversed'.
Everytime we pop a commit we have to decrease max_count, and we check
againts max_count to know if we shouldn't return anymore commits (by
returning NULL).
> Of course if I'm the only one confused and others make sense of it,
> that's ok, too.
No, I completely understand. I did have to retouch the function a few
times after writing the tests :P
> [1]: I traced this to 498bcd3159 (rev-list: fix --reverse interaction
> with --parents, 2008-08-29), but I can't fathom what the pop is doing
> there.
It's pretty much doing the same thing it does now, it's returning stored
commits. In both versions, the initial setup when "revs->reverse" is
true, becomes "dead code" after the first call.
Thank you
next prev parent reply other threads:[~2026-04-19 20:31 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-18 16:47 [PATCH] revision.c: implement --reverse=before for walks Mirko Faina
2026-04-18 18:20 ` Tian Yuchen
2026-04-18 18:42 ` Mirko Faina
2026-04-18 18:51 ` Mirko Faina
2026-04-20 16:06 ` Junio C Hamano
2026-04-20 17:08 ` Tian Yuchen
2026-04-20 23:50 ` Mirko Faina
2026-04-19 12:06 ` Ben Knoble
2026-04-19 18:11 ` Mirko Faina
2026-04-19 19:12 ` D. Ben Knoble
2026-04-19 20:31 ` Mirko Faina [this message]
2026-04-20 0:21 ` Jeff King
2026-04-20 9:33 ` Mirko Faina
2026-04-20 10:30 ` Mirko Faina
2026-04-21 3:48 ` Jeff King
2026-04-22 18:24 ` D. Ben Knoble
2026-04-22 19:42 ` Mirko Faina
2026-04-20 0:04 ` Jeff King
2026-04-20 9:22 ` Mirko Faina
2026-04-22 0:28 ` [PATCH v2 0/2] " Mirko Faina
2026-04-22 0:30 ` Mirko Faina
2026-04-23 22:51 ` [PATCH v3 " Mirko Faina
2026-04-23 22:51 ` [PATCH v3 1/2] " Mirko Faina
2026-04-28 1:45 ` Junio C Hamano
2026-04-23 22:52 ` [PATCH v3 2/2] revision.c: reduce memory usage on reverse before Mirko Faina
2026-04-27 0:24 ` [PATCH v4 0/2] revision.c: implement --reverse=before for walks Mirko Faina
2026-04-27 0:24 ` [PATCH v4 1/2] " Mirko Faina
2026-04-27 6:45 ` Junio C Hamano
2026-04-27 7:33 ` Johannes Sixt
2026-04-27 12:30 ` Junio C Hamano
2026-04-27 13:58 ` Chris Torek
2026-04-27 16:48 ` [PATCH v4 1/2] revision.c: implement -b-reverse=before " Mirko Faina
2026-04-28 1:46 ` Junio C Hamano
2026-04-28 1:45 ` [PATCH v4 1/2] revision.c: implement --reverse=before " Junio C Hamano
2026-04-27 0:24 ` [PATCH v4 2/2] revision.c: reduce memory usage on reverse before Mirko Faina
2026-04-28 1:46 ` Junio C Hamano
2026-04-30 19:52 ` [PATCH v5] revision.c: implement --max-count-oldest Mirko Faina
2026-05-04 5:19 ` Junio C Hamano
2026-05-04 13:08 ` Mirko Faina
2026-05-05 21:54 ` [PATCH v6] " Mirko Faina
2026-05-06 6:45 ` Johannes Sixt
2026-05-06 12:54 ` Mirko Faina
2026-05-07 9:20 ` Junio C Hamano
2026-05-08 0:09 ` Mirko Faina
2026-05-09 12:46 ` Jean-Noël AVILA
2026-05-10 0:41 ` Mirko Faina
2026-05-09 21:01 ` Junio C Hamano
2026-05-10 0:48 ` Mirko Faina
2026-05-15 23:29 ` [PATCH v7] " Mirko Faina
2026-05-09 11:01 ` [PATCH v5] " Junio C Hamano
2026-05-10 0:36 ` Mirko Faina
2026-04-22 0:28 ` [PATCH v2 1/2] revision.c: implement --reverse=before for walks Mirko Faina
2026-04-22 22:44 ` Jeff King
2026-04-22 22:53 ` Mirko Faina
2026-04-22 0:28 ` [PATCH v2 2/2] revision.c: reduce memory usage on reverse before Mirko Faina
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aeUqSltEWIWaPDh3@exploit \
--to=mroik@delayed.space \
--cc=ben.knoble@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jn.avila@free.fr \
--cc=peff@peff.net \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox