From: Justin Tobler <jltobler@gmail.com>
To: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Cc: git@vger.kernel.org, ps@pks.im, gitster@pobox.com,
kristofferhaugsbakk@fastmail.com, eslam.reda.div@gmail.com
Subject: Re: [PATCH v2 0/5] builtin/repo: include largest object information
Date: Sun, 1 Mar 2026 13:22:33 -0600
Message-ID: <aaR6a7o4omOIWJSe@denethor>
In-Reply-To: <EB04AA40-87BA-41D9-B2DC-92E87FACEB54@gmail.com>
On 26/02/28 08:43PM, Lucas Seiki Oshiro wrote:
> I was trying this patch series and I noticed that it took
> more time to run than before. In my machine, I tested it
> with the Git repository itself and it took 6s to run, while
> it took 3s to run in the current master [1].
Yes, now that objects are being parsed to fetch additional
commit/tree information, we incur some additional overhead when
collecting metrics.
With git-repo-structure, the goal is to give the user an overview of
size- and structure-related statistics that may reveal problems in a
given repository; it is directly inspired by git-sizer [1]. As it
currently stands, the implementation of git-repo-structure is still
incomplete, and as we collect additional metrics in subsequent series
the performance characteristics may still change.
> I understand the reason and I don't think we could avoid
> that, but I'm wondering if wouldn't be nice to have some
> way to only retrieve the "lighter" data (perhaps a flag,
> or something like the keys in git-repo-info).
If the main motivation is to allow the user to reduce the time spent by
selecting only a subset of metrics, I don't think using keys like
git-repo-info would be a good fit. Most of the collected metrics pull
from the same data sources so including/excluding any given metric may
not have any bearing on actual performance. For example: if the user
wants to collect largest object info which is a more expensive check, we
still have to collect the underlying data used by the other metrics
regardless of whether they are shown. Furthermore, it would likely not
be obvious to users which categories of metrics would be more expensive
than others.
I could maybe see something akin to a `--[no-]extended` option that
breaks metrics into cheap/expensive categories and computes/displays the
metrics accordingly, but it would be important that the default set of
metrics collected satisfy the repository overview this command aims to
provide.
If we are more interested in adding a mechanism to filter
git-repo-structure results independent of performance considerations,
maybe we could eventually explore adding something like the
git-repo-info keys or a `--filter` option to restrict the output to a
specified subset. At the same time though, it is probably easy enough
for git-repo-structure users to filter the machine-parsable output
themselves if they wish to do so. For now I think this should be fine,
but an included result filtering option is still something we could
explore in the future. :)
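To illustrate the kind of user-side filtering meant here, below is a
minimal sketch. It assumes the machine-parsable output is one `key=value`
pair per line. The helper name `filter_metrics` and the sample key names
are hypothetical and chosen only for illustration; they are not taken
from the actual git-repo-structure output.

```python
def filter_metrics(lines, prefixes):
    """Keep only key=value lines whose key starts with one of the prefixes.

    Assumes one "key=value" pair per line; lines without "=" are skipped.
    """
    wanted = tuple(prefixes)
    selected = {}
    for line in lines:
        line = line.rstrip("\n")
        if "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, value = line.split("=", 1)
        if key.startswith(wanted):
            selected[key] = value
    return selected


# Hypothetical sample input; real key names may differ.
sample = [
    "references.branches.count=12",
    "objects.commits.count=80000",
    "objects.blobs.inflated_size=123456",
]

result = filter_metrics(sample, ["objects.commits."])
for key, value in result.items():
    print(f"{key}={value}")
```

In practice the `lines` argument could simply be `sys.stdin`, with the
command's output piped into the script.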
Thanks,
-Justin
[1]: https://github.com/github/git-sizer