From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Kyle Meyer <kyle@kyleam.com>,
Eric Sunshine <sunshine@sunshineco.com>,
Taylor Blau <me@ttaylorr.com>
Subject: Re: [PATCH v2] rev-list --disk-usage
Date: Wed, 10 Feb 2021 08:31:08 -0800
Message-ID: <xmqq1rdn51gz.fsf@gitster.c.googlers.com>
In-Reply-To: <YCOu70m5SKU7L4CS@coredump.intra.peff.net> (Jeff King's message of "Wed, 10 Feb 2021 05:01:19 -0500")
Jeff King <peff@peff.net> writes:
> But in practice, we've found this kind of naive --disk-usage useful for
> answering questions like:
>
> - do I need all of these objects? Comparing "rev-list --disk-usage
> --objects --all", "rev-list --disk-usage --objects --all --reflog",
> and "du objects/pack/*.pack" will tell you if a prune/repack might
> help, and whether expiring reflogs makes a difference.
>
> - the size of the shared alternates repo for a set of forks has
> jumped. Comparing "rev-list --disk-usage --objects --remotes=$base
> --not --remotes=$fork" will tell you what's reachable from a fork
> but not from the base (we use "refs/remotes/$id/*" to keep track of
> fork refs in our alternates repo). This can be junk like somebody
> forking git/git and then uploading a bunch of pirated video files.
> :)
>
> - likewise, the size of cloning a single repo may jump. Comparing
> "rev-list --disk-usage --objects HEAD..$branch" for each branch
> might show that one branch is an outlier (e.g., because somebody
> accidentally committed a bunch of build artifacts).
>
> In those kinds of cases, it's not usually "oh, this version is twice as
> big as this other one". It's more like "wow, this branch is 100x as big
> as the other branches", and little decisions like delta direction are
> just noise. I imagine that in those cases the uncompressed object sizes
> would probably produce similar patterns and answers. But it's actually
> faster to produce the on-disk sizes. :)
Thanks.
I kind of feel sad to have a nice write-up like this only in the
list archive. Is there a section in our documentation set to keep
a collection of such real-life use cases? Perhaps the examples
section of manpages is the closest thing, but it looks a bit too
narrowly scoped for the examples section of the "rev-list" manpage.
Thanks.