All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Victoria Dye <vdye@github.com>
Subject: Re: [PATCH 0/9] for-each-ref optimizations & usability improvements
Date: Tue, 07 Nov 2023 11:36:56 +0900	[thread overview]
Message-ID: <xmqqo7g69tmf.fsf@gitster.g> (raw)
In-Reply-To: <pull.1609.git.1699320361.gitgitgadget@gmail.com> (Victoria Dye via GitGitGadget's message of "Tue, 07 Nov 2023 01:25:52 +0000")

"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This series is a bit of an informal follow-up to [1], adding some more
> substantial optimizations and usability fixes around ref
> filtering/formatting. Some of the changes here affect user-facing behavior,
> some are internal-only, but they're all interdependent enough to warrant
> putting them together in one series.
>
> [1]
> https://lore.kernel.org/git/pull.1594.v2.git.1696888736.gitgitgadget@gmail.com/
>
> Patch 1 changes the behavior of the '--no-sort' option in 'for-each-ref',
> 'tag', and 'branch'. Currently, it just removes previous sort keys and, if
> no further keys are specified, falls back on ascending refname sort (which,
> IMO, makes the name '--no-sort' somewhat misleading).

We can read it changes the behaviour and what the current behaviour
is, but I presume that the untold new behaviour with --no-sort is to
show the output in an unspecified order of implementation's
convenience?  I think it makes quite a lot of sense if that is what
is done.

> Patch 2 updates the 'for-each-ref' docs to clearly state what happens if you
> use '--omit-empty' and '--count' together. I based the explanation on what
> the current behavior is (i.e., refs omitted with '--omit-empty' do count
> towards the total limited by '--count').

OK.

> Patches 3-7 incrementally refactor various parts of the ref
> filtering/formatting workflows in order to create a
> 'filter_and_format_refs()' function. If certain conditions are met (sorting
> disabled, no reachability filtering or ahead-behind formatting), ref
> filtering & formatting is done within a single 'for_each_fullref_in'
> callback. Especially in large repositories, this makes a huge difference in
> memory usage & runtime for certain usages of 'for-each-ref', since it's no
> longer writing everything to a 'struct ref_array' then repeatedly whittling
> down/updating its contents.

OK.  I was wondering if you are going threaded implementation, until
I read into 6th line ;-)

> Patch 8 introduces a new option to 'for-each-ref' called '--full-deref'.
> When provided, any format fields for the dereferenced value of a tag (e.g.
> "%(*objectname)") will be populated with the fully peeled target of the tag;
> right now, those fields are populated with the immediate target of a tag
> (which can be another tag). This avoids the need to pipe 'for-each-ref'
> results to 'cat-file --batch-check' to get fully-peeled tag information. It
> also benefits from the 'filter_and_format_refs()' single-iteration
> optimization, since 'peel_iterated_oid()' may be able to read the
> pre-computed peeled OID from a packed ref. A couple notes on this one:
>
>  * I went with a command line option for '--full-deref' rather than another
>    format specifier (like ** instead of *) because it seems unlikely that a
>    user is going to want to perform a shallow dereference and a full
>    dereference in the same 'for-each-ref'. There's also a NEEDSWORK going
>    all the way back to the introduction of 'for-each-ref' in 9f613ddd21c
>    (Add git-for-each-ref: helper for language bindings, 2006-09-15) that (to
>    me) implies different dereferencing behavior corresponds to different use
>    cases/user needs.

Makes quite a lot of sense.

>  * I'm not attached to '--full-deref' as a name - if someone has an idea for
>    a more descriptive name, please suggest it!

Another candidate verb may be "to peel", and I have no strong
opinion between it and "to dereference".  But I have a mild aversion
to an abbreviation that is not strongly established.

> Finally, patch 9 adds performance tests for 'for-each-ref', showing the
> effects of optimizations made throughout the series. Here are some sample
> results from my Ubuntu VM (test names shortened for space):

Nice.

  parent reply	other threads:[~2023-11-07  2:37 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-07  1:25 [PATCH 0/9] for-each-ref optimizations & usability improvements Victoria Dye via GitGitGadget
2023-11-07  1:25 ` [PATCH 1/9] ref-filter.c: really don't sort when using --no-sort Victoria Dye via GitGitGadget
2023-11-07 10:49   ` Patrick Steinhardt
2023-11-07 18:13     ` Victoria Dye
2023-11-07  1:25 ` [PATCH 2/9] for-each-ref: clarify interaction of --omit-empty & --count Victoria Dye via GitGitGadget
2023-11-07 19:23   ` Øystein Walle
2023-11-07 19:30     ` Victoria Dye
2023-11-08  7:53       ` Øystein Walle
2023-11-08 10:00         ` Kristoffer Haugsbakk
2023-11-07  1:25 ` [PATCH 3/9] ref-filter.h: add max_count and omit_empty to ref_format Victoria Dye via GitGitGadget
2023-11-07  1:25 ` [PATCH 4/9] ref-filter.h: move contains caches into filter Victoria Dye via GitGitGadget
2023-11-07 10:49   ` Patrick Steinhardt
2023-11-07  1:25 ` [PATCH 5/9] ref-filter.h: add functions for filter/format & format-only Victoria Dye via GitGitGadget
2023-11-07  1:25 ` [PATCH 6/9] ref-filter.c: refactor to create common helper functions Victoria Dye via GitGitGadget
2023-11-07 10:49   ` Patrick Steinhardt
2023-11-07 18:41     ` Victoria Dye
2023-11-07  1:25 ` [PATCH 7/9] ref-filter.c: filter & format refs in the same callback Victoria Dye via GitGitGadget
2023-11-07 10:49   ` Patrick Steinhardt
2023-11-07 19:45     ` Victoria Dye
2023-11-07  1:26 ` [PATCH 8/9] for-each-ref: add option to fully dereference tags Victoria Dye via GitGitGadget
2023-11-07 10:50   ` Patrick Steinhardt
2023-11-08  1:13     ` Victoria Dye
2023-11-08  3:14       ` Junio C Hamano
2023-11-08  7:19         ` Patrick Steinhardt
2023-11-08 18:02         ` Victoria Dye
2023-11-09  1:22           ` Junio C Hamano
2023-11-09  1:23           ` Junio C Hamano
2023-11-09  1:32             ` Junio C Hamano
2023-11-07  1:26 ` [PATCH 9/9] t/perf: add perf tests for for-each-ref Victoria Dye via GitGitGadget
2023-11-07  2:36 ` Junio C Hamano [this message]
2023-11-07  2:48   ` [PATCH 0/9] for-each-ref optimizations & usability improvements Victoria Dye
2023-11-07  3:04     ` Junio C Hamano
2023-11-07 10:49     ` Patrick Steinhardt
2023-11-08  1:31       ` Victoria Dye
2023-11-14 19:53 ` [PATCH v2 00/10] " Victoria Dye via GitGitGadget
2023-11-14 19:53   ` [PATCH v2 01/10] ref-filter.c: really don't sort when using --no-sort Victoria Dye via GitGitGadget
2023-11-16  5:29     ` Junio C Hamano
2023-11-14 19:53   ` [PATCH v2 02/10] ref-filter.h: add max_count and omit_empty to ref_format Victoria Dye via GitGitGadget
2023-11-16 12:06     ` Øystein Walle
2023-11-14 19:53   ` [PATCH v2 03/10] ref-filter.h: move contains caches into filter Victoria Dye via GitGitGadget
2023-11-14 19:53   ` [PATCH v2 04/10] ref-filter.h: add functions for filter/format & format-only Victoria Dye via GitGitGadget
2023-11-16  5:39     ` Junio C Hamano
2023-11-14 19:53   ` [PATCH v2 05/10] ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()' Victoria Dye via GitGitGadget
2023-11-14 19:53   ` [PATCH v2 06/10] ref-filter.c: refactor to create common helper functions Victoria Dye via GitGitGadget
2023-11-14 19:53   ` [PATCH v2 07/10] ref-filter.c: filter & format refs in the same callback Victoria Dye via GitGitGadget
2023-11-14 19:53   ` [PATCH v2 08/10] for-each-ref: clean up documentation of --format Victoria Dye via GitGitGadget
2023-11-14 19:53   ` [PATCH v2 09/10] ref-filter.c: use peeled tag for '*' format fields Victoria Dye via GitGitGadget
2023-11-16  5:48     ` Junio C Hamano
2023-11-14 19:53   ` [PATCH v2 10/10] t/perf: add perf tests for for-each-ref Victoria Dye via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqo7g69tmf.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.