All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mirko Faina <mroik@delayed.space>
To: "D. Ben Knoble" <ben.knoble@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Patrick Steinhardt" <ps@pks.im>,
	"Jean-Noël Avila" <jn.avila@free.fr>, "Jeff King" <peff@peff.net>,
	"Mirko Faina" <mroik@delayed.space>
Subject: Re: [PATCH] revision.c: implement --reverse=before for walks
Date: Sun, 19 Apr 2026 22:31:37 +0200	[thread overview]
Message-ID: <aeUqSltEWIWaPDh3@exploit> (raw)
In-Reply-To: <CALnO6CACfSyzyguX4623Dk3y+QEM_Dbmfko8dTyM1p3JxBjZFg@mail.gmail.com>

On Sun, Apr 19, 2026 at 03:12:27PM -0400, D. Ben Knoble wrote:
> On Sun, Apr 19, 2026 at 2:11 PM Mirko Faina <mroik@delayed.space> wrote:
> >
> > On Sun, Apr 19, 2026 at 08:06:24AM -0400, Ben Knoble wrote:
> > > The original handles multiple reverse options inverting each other…
> > >
> > > > +    } else if (starts_with(arg, "--reverse")) {
> > > > +        if (!skip_prefix(arg, "--reverse=", &optarg)) {
> > > > +            if (argc < 2) {
> > > > +                revs->reverse = 1;
> > > > +                return 1;
> > > > +            } else {
> > > > +                optarg = argv[1];
> > > > +            }
> > > > +        }
> > > > +
> > > > +        if (!strcmp(optarg, "after")) {
> > > > +            revs->reverse = 1;
> > > > +        } else if (!strcmp(optarg, "before")) {
> > > > +            revs->reverse = 2;
> > > > +        } else {
> > > > +            revs->reverse = 1;
> > > > +            return 1;
> > > > +        }
> > > > +
> > > > +        return optarg == argv[1] ? 2 : 1;
> > >
> > > …which I don’t see here.
> > >
> > > I’m not familiar with this parsing code though so I can’t add much about the test other than to say it is a bit hard to follow :/
> >
> > Given that it is no longer binary handling multiple reverse can't simply
> > be inverting bits, it wouldn't make sense. This is done before the walk
> > itself, so even from the POV of the user it wouldn't make much sense to
> > reverse multiple times as the order of the applied options before this
> > patch (commit limiting options then reverse) doesn't change.
> >
> > This doesn't break any tests so I assumed it was fine.
> 
> I think I mean that
> 
>     git log --reverse --reverse
> 
> shows commits in the same order as "git log"; what should
> 
>     git log --reverse=after --reverse
> 
> do? Or what about preserving the behavior of the original "git log
> --reverse --reverse," which I don't think is done here?

Yes, this is what I was getting at. Since it is no longer binary what
would a double reverse mean? What if "--reverse=after --reverse=before"?
How should that be handled?

> Granted, I don't see this ability documented, and I cannot tell how
> many may scream if we change this behavior, so it's a bit
> hypothetical. But there is an argument for backward compatibility as a
> default, which I think we'd need to justify changing. Perhaps in the
> proposed log message?
> 
> (The original seems nonsensical to type, but of course you can imagine
> alias.A=log --reverse <other-stuff>, and then sometimes you want to do
> "git A --reverse" to un-reverse the commits.)

Will do in v2.

> > > I haven’t looked, but it would be nice if we could use an enum instead. Unfortunately that would probably take up more space in the struct, and I suppose the bit-packing is done intentionally for performance.
> >
> > Could define new macros so that the readers don't have to mentally keep
> > track of which value rapresents what. I didn't think that was
> > necessary, should I change it?
> 
> Yeah, a few `#define`d constants would make things more readable to
> me, at least, since we can't use the enum without space concerns
> (unless there's a way to bit-pack the enum to only 2 bits?).

Will do in v2.

> > > >    if (revs->reverse_output_stage) {
> > > > +        if (revs->reverse == 2 && revs->max_count == 0)
> > > > +            return NULL;
> > > > +
> 
> PS: something I spotted on a second read. [Ignoring reverse=after
> mode] This hunk looks to me like a nice little optimization (return
> nothing if we know max_count says we yield no commits). Of course, I
> could see that being viable early in the function, right? When asking
> get_revision for commits, if max_count is 0, just return NULL.
> 
> For reverse=after mode, this condition is only true if the max_count
> was 0 in the previous conditional, also, since we use max_count=-1
> before iterating get_revision_internal. That means the original
> max_count isn't touched. At any rate, it _seems_ to me that the whole
> function could benefit from this optimization… but I wonder if it is
> _necessary_ for correctness of reverse=after in some way that I'm not
> seeing? Since the current version doesn't need the early bailout, why
> does reverse=after?

Just to clarify, "reverse = 2" is "--reverse=before" and not
"--reverse=after".

With "reverse = 2", the snippet of code you're referencing is not an
optimization but a requirement for correctness. With "reverse = 1" we
just keep the max_count as is and it's used by get_revision_internal()
to stop if that limit is reached. What we find in 'reversed' are already
just the commits we need to return.

With "reverse = 2", we first set max_count to -1 and then retrieve the
whole history, then we set max_count to its original value. Then we
return the commits on each call of get_revision(). Now, unlike with
"reverse = 1", we have the whole history in 'reversed', because of that
we need to know when to stop. That's the reason we decrement max_count
only for "reverse = 2" and why "max_count == 0" is checked only for
"reverse = 2".

> > > >        c = pop_commit(&revs->commits);
> > > > +        if (revs->reverse == 2)
> > > > +            revs->max_count--;
> > >
> > > Hm. Why do we decrement here? Again, not an area I’m familiar with, but a bit surprising.
> >
> > get_revision() (in revision.c) handles the reverse option and updates
> > the "struct git_graph". get_revision() then calls
> > get_revision_internal(), which handles commit boundaries and max_count,
> > here is where it gets decreased. Since max_count gets decreased
> > everytime get_revision_internal() is called, if we were to leave
> > max_count as is before the walk (in get_revision() at line 4558), the
> > walk would stop before reaching the root commit. This is why the current
> > --reverse option is applied only after commit limiting options. So
> > instead we set max_count at -1 walking the whole history and storing it
> > in 'reversed'. Now we're in "reverse_output_stage = 1", and in this
> > state we never call get_revision_internal() again, instead we pop
> > commits from 'reversed'. Because of this we have to handle max_count
> > outside get_revision_internal(), so we decrement it in the snippet of
> > code you referenced.
> >
> > A bit verbose but hopefully it'll get my point across.
> 
> I don't 100% follow, but I'm out of my depth :)
> 
> I think I see that get_revision() effectively has 2 modes pertaining
> to reverse: reverse and reverse output stage (the former falls
> directly into the latter, though).
> 
> After some setup, the reverse mode calls get_revision_internal() as
> you said. That decrements max_count as a way of counting how many
> commits we've seen through the loop, so if we asked for 5 we'd only
> process 5 commits.
> 
> Then we fall into the output stage mode, which pops a commit [1].
> 
> With this patch, in reverse=after we disable max_count in the first
> (reverse) mode, as you said. Ok: we get the whole (filtered) history
> then, at which point we can now shrink. That makes sense.
> 
> Then in the reverse output stage mode, we pretend to have one less
> max_count. That's what I can't figure out. Is it because of the
> pop_commit()? I guess I'm not totally seeing how that interacted with
> the max_count in the original code: does the current code yield one
> extra commit in get_revision_internal() ?

I'm not sure I understand what you're referencing with "Then in the
reverse output stage mode, we pretend to have one less max_count".

If you're referring to line 4573, then...

> You wrote that "we never call get_revision_internal() again," but I
> don't see why that's true with this patch and not true before it.
> 
> I do agree that _somebody_ has to handle max_count after
> get_revision() returns with reverse=after. I'm just not sure what
> 
>     if (revs->reverse == 2)
>         revs->max_count--;
> 
> is doing.

...we're not pretending we have fewer commits. Every subsequent call to
get_revision() after the first call will never enter the branch at line
4548 and will only enter the branch at 4568. Everytime we pop a commit
from 'reversed' we decrease max_count so we can limit only to the amount
of commits the user wants.

So, to recap, with "reverse = 2", on the first call to get_revision() we
walk the whole history and store it in 'reversed' in reversed order and
return the first commit.
On subsequent calls to get_revision() we do not walk the history again,
we simply return the commits that have been stored in 'reversed'.
Everytime we pop a commit we have to decrease max_count, and we check
againts max_count to know if we shouldn't return anymore commits (by
returning NULL).

> Of course if I'm the only one confused and others make sense of it,
> that's ok, too.

No, I completely understand. I did have to retouch the function a few
times after writing the tests :P

> [1]: I traced this to 498bcd3159 (rev-list: fix --reverse interaction
> with --parents, 2008-08-29), but I can't fathom what the pop is doing
> there.

It's pretty much doing the same thing it does now, it's returning stored
commits. In both versions, the initial setup when "revs->reverse" is
true, becomes "dead code" after the first call.

Thank you

  reply	other threads:[~2026-04-19 20:31 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-18 16:47 [PATCH] revision.c: implement --reverse=before for walks Mirko Faina
2026-04-18 18:20 ` Tian Yuchen
2026-04-18 18:42   ` Mirko Faina
2026-04-18 18:51     ` Mirko Faina
2026-04-20 16:06   ` Junio C Hamano
2026-04-20 17:08     ` Tian Yuchen
2026-04-20 23:50     ` Mirko Faina
2026-04-19 12:06 ` Ben Knoble
2026-04-19 18:11   ` Mirko Faina
2026-04-19 19:12     ` D. Ben Knoble
2026-04-19 20:31       ` Mirko Faina [this message]
2026-04-20  0:21         ` Jeff King
2026-04-20  9:33           ` Mirko Faina
2026-04-20 10:30             ` Mirko Faina
2026-04-21  3:48             ` Jeff King
2026-04-22 18:24         ` D. Ben Knoble
2026-04-22 19:42           ` Mirko Faina
2026-04-20  0:04 ` Jeff King
2026-04-20  9:22   ` Mirko Faina
2026-04-22  0:28 ` [PATCH v2 0/2] " Mirko Faina
2026-04-22  0:30   ` Mirko Faina
2026-04-23 22:51   ` [PATCH v3 " Mirko Faina
2026-04-23 22:51     ` [PATCH v3 1/2] " Mirko Faina
2026-04-28  1:45       ` Junio C Hamano
2026-04-23 22:52     ` [PATCH v3 2/2] revision.c: reduce memory usage on reverse before Mirko Faina
2026-04-27  0:24     ` [PATCH v4 0/2] revision.c: implement --reverse=before for walks Mirko Faina
2026-04-27  0:24       ` [PATCH v4 1/2] " Mirko Faina
2026-04-27  6:45         ` Junio C Hamano
2026-04-27  7:33           ` Johannes Sixt
2026-04-27 12:30             ` Junio C Hamano
2026-04-27 13:58               ` Chris Torek
2026-04-27 16:48           ` [PATCH v4 1/2] revision.c: implement -b-reverse=before " Mirko Faina
2026-04-28  1:46             ` Junio C Hamano
2026-04-28  1:45         ` [PATCH v4 1/2] revision.c: implement --reverse=before " Junio C Hamano
2026-04-27  0:24       ` [PATCH v4 2/2] revision.c: reduce memory usage on reverse before Mirko Faina
2026-04-28  1:46         ` Junio C Hamano
2026-04-30 19:52       ` [PATCH v5] revision.c: implement --max-count-oldest Mirko Faina
2026-05-04  5:19         ` Junio C Hamano
2026-05-04 13:08           ` Mirko Faina
2026-05-05 21:54         ` [PATCH v6] " Mirko Faina
2026-05-06  6:45           ` Johannes Sixt
2026-05-06 12:54             ` Mirko Faina
2026-05-07  9:20               ` Junio C Hamano
2026-05-08  0:09                 ` Mirko Faina
2026-05-09 12:46           ` Jean-Noël AVILA
2026-05-10  0:41             ` Mirko Faina
2026-05-09 21:01           ` Junio C Hamano
2026-05-10  0:48             ` Mirko Faina
2026-05-09 11:01         ` [PATCH v5] " Junio C Hamano
2026-05-10  0:36           ` Mirko Faina
2026-04-22  0:28 ` [PATCH v2 1/2] revision.c: implement --reverse=before for walks Mirko Faina
2026-04-22 22:44   ` Jeff King
2026-04-22 22:53     ` Mirko Faina
2026-04-22  0:28 ` [PATCH v2 2/2] revision.c: reduce memory usage on reverse before Mirko Faina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aeUqSltEWIWaPDh3@exploit \
    --to=mroik@delayed.space \
    --cc=ben.knoble@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jn.avila@free.fr \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.