From: Justin Tobler <jltobler@gmail.com>
To: git@vger.kernel.org
Cc: ps@pks.im, gitster@pobox.com, worldhello.net@gmail.com,
Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v5 0/7] builtin/repo: add object size info to structure output
Date: Wed, 17 Dec 2025 11:53:57 -0600
Message-ID: <20251217175404.37963-1-jltobler@gmail.com>
In-Reply-To: <20251216173842.3357832-1-jltobler@gmail.com>
Greetings,
This patch series extends the recently introduced "structure" subcommand
for git-repo(1) to collect object size information. More specifically,
it shows total inflated and disk sizes of objects by object type. The
aim is to provide additional insight that may be useful to users regarding
the structure of a repository.
In addition to this change, this series also updates the table output
format to downscale larger values and display them with the appropriate
unit prefix. This is done to make table output more human friendly. The
keyvalue and nul output formats are left the same since they are
intended more for machine parsing.
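As a rough illustration of the downscaling described above, the sketch
below splits a byte count into a value string and a unit string. It is
not the code from this series: sketch_humanise_bytes() is a hypothetical
name, truncating to two decimals is an assumption, and the real helpers
in strbuf.c may round and choose units differently.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper for illustration only. Writes the downscaled
 * number into `value` and points `*unit` at a constant unit string.
 */
static void sketch_humanise_bytes(uint64_t bytes, char *value, size_t len,
				  const char **unit)
{
	static const struct {
		uint64_t div;
		const char *unit;
	} scales[] = {
		{ UINT64_C(1) << 30, "GiB" },
		{ UINT64_C(1) << 20, "MiB" },
		{ UINT64_C(1) << 10, "KiB" },
	};

	for (size_t i = 0; i < sizeof(scales) / sizeof(scales[0]); i++) {
		if (bytes >= scales[i].div) {
			uint64_t whole = bytes / scales[i].div;
			/* Hundredths of the unit, truncated (assumption). */
			uint64_t frac = (bytes % scales[i].div) * 100 /
					scales[i].div;

			snprintf(value, len, "%" PRIu64 ".%02" PRIu64,
				 whole, frac);
			*unit = scales[i].unit;
			return;
		}
	}

	/* Small values are left unscaled. */
	snprintf(value, len, "%" PRIu64, bytes);
	*unit = "B";
}
```

With this sketch, 1536 bytes yields the value "1.50" with the unit
"KiB", while 512 bytes stays as "512" with the unit "B".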
Changes in V5:
- Small updates to some comments and log messages to improve
correctness.
- Adjusted spacing in builtin/repo.c:count_objects().
Changes in V4:
- Unmark "byte" string in "t/helper/test-simple-ipc.c" for translation
to avoid conflict with translated plural "byte/bytes" string.
- Remove some unnecessary translations and add comments to clarify some
of the added translations.
- Some small changes to the tests in patch 7.
Changes in V3:
- Address potential localization regression by making the downscaled
number format string also translatable. Also make the format string
for how the values and unit prefixes are displayed via
`strbuf_humanise_{bytes,rate}()` translatable to be more flexible.
- `strbuf_humanise_{bytes,count}_value()` has been renamed to
`humanise_{bytes,count}()` and updated to provide both the value and
unit prefix as separate strings.
- Unit prefix strings are no longer allocated and instead constant.
- The humanise flags are now defined in an enum.
- Instead of using `OBJECT_INFO_FOR_PREFETCH`,
`OBJECT_INFO_SKIP_FETCH_OBJECT` and `OBJECT_INFO_QUICK` are used
explicitly.
- Tests now use git-rev-list(1) to verify disk size info.
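To show why returning the value and the unit as separate strings helps
with the table output, here is a minimal sketch that pads each part to
its own column width. sketch_format_cell() and the width parameters are
hypothetical and only assume that values are right-aligned and units
left-aligned; the actual table code in builtin/repo.c may lay out its
cells differently.

```c
#include <stdio.h>

/*
 * Hypothetical helper for illustration only. Right-aligns the number
 * and left-aligns the unit so that both line up across table rows.
 * Returns the number of characters that snprintf() would write.
 */
static int sketch_format_cell(char *out, size_t len, const char *value,
			      const char *unit, int value_width,
			      int unit_width)
{
	return snprintf(out, len, "%*s %-*s", value_width, value,
			unit_width, unit);
}
```

With a value width of 6 and a unit width of 3, the pairs ("1.50",
"KiB") and ("512", "B") both produce ten-character cells with the
numbers ending in the same column.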
Changes in V2:
- Factor out and reuse existing logic from strbuf_humanise() to handle
downscaling values and determining the appropriate unit prefix
separately. This enables more control over how exactly the values are
written to the structure output table which is useful for alignment
reasons. I'm not sure about the interface used in patch 2. Feedback is
most welcome.
- In the previous version, when checking object size on a missing object
we would die. Instead we now ignore missing objects. This allows the
structure command to work on partial clones.
- disk/inflated keyvalue names renamed to disk_size/inflated_size.
- Unit prefixes are marked for translation.
- The tests for keyvalue disk size values are updated to check against
real expected values instead of skipping. Table output tests still
skip verifying human-readable values though.
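The behaviour for missing objects can be pictured with the following
self-contained sketch, which sums inflated and disk sizes per object
type and skips entries that are not available locally instead of
failing. Every name here (sketch_object, sketch_totals, sketch_tally,
the `present` field) is hypothetical; the series itself queries the
object database, which this sketch only models with a plain array.

```c
#include <stddef.h>
#include <stdint.h>

enum sketch_type {
	SKETCH_COMMIT,
	SKETCH_TREE,
	SKETCH_BLOB,
	SKETCH_TAG,
	SKETCH_TYPE_NR
};

/* Hypothetical stand-in for the result of an object info lookup. */
struct sketch_object {
	enum sketch_type type;
	int present;       /* 0 models an object missing in a partial clone */
	uint64_t inflated; /* size after decompression */
	uint64_t disk;     /* size as stored on disk */
};

struct sketch_totals {
	uint64_t inflated_size[SKETCH_TYPE_NR];
	uint64_t disk_size[SKETCH_TYPE_NR];
};

/* Accumulate per-type totals, ignoring objects that cannot be read. */
static void sketch_tally(const struct sketch_object *objs, size_t nr,
			 struct sketch_totals *totals)
{
	for (size_t i = 0; i < nr; i++) {
		if (!objs[i].present)
			continue; /* skip rather than die */

		totals->inflated_size[objs[i].type] += objs[i].inflated;
		totals->disk_size[objs[i].type] += objs[i].disk;
	}
}
```

The caller is expected to zero-initialise the totals before the first
call so that repeated calls keep accumulating.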
Thanks,
-Justin
Justin Tobler (7):
builtin/repo: group per-type object values into struct
strbuf: split out logic to humanise byte values
builtin/repo: humanise count values in structure output
builtin/repo: add inflated object info to keyvalue structure output
builtin/repo: add inflated object info to structure table
builtin/repo: add disk size info to keyvalue stucture output
builtin/repo: add object disk size info to structure table
Documentation/git-repo.adoc | 2 +
builtin/repo.c | 175 ++++++++++++++++++++++++++++++------
strbuf.c | 102 ++++++++++++++-------
strbuf.h | 25 ++++++
t/helper/test-simple-ipc.c | 7 +-
t/t1901-repo-structure.sh | 118 ++++++++++++++++--------
6 files changed, 331 insertions(+), 98 deletions(-)
Range-diff against v4:
1: be14de68f6 = 1: be14de68f6 builtin/repo: group per-type object values into struct
2: 0a145cfeec ! 2: 61cff22afa strbuf: split out logic to humanise byte values
@@ Commit message
In a subsequent commit, byte size values displayed in table output for
the git-repo(1) "structure" subcommand will be shown in a more
human-readable format with the appropriate unit prefixes. For this
- usecase, the downscaled values and unit prefixes must be handled
+ usecase, the downscaled values and unit strings must be handled
separately to ensure proper column alignment.
Split out logic from strbuf_humanise() to downscale byte values and
@@ strbuf.c: void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
+
+ /*
+ * TRANSLATORS: The first argument is the number string. The second
-+ * argument is the unit prefix string (i.e. "12.34 MiB/s").
++ * argument is the unit string (i.e. "12.34 MiB/s").
+ */
+ strbuf_addf(buf, _("%s %s"), value, unit);
+ free(value);
@@ strbuf.h: void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbu
+enum humanise_flags {
+ /*
-+ * Use rate based unit prefixes for humanised values.
++ * Use rate based units for humanised values.
+ */
+ HUMANISE_RATE = (1 << 0),
+};
+
+/**
+ * Converts the given byte size into a downscaled human-readable value and
-+ * corresponding unit prefix as two separate strings.
++ * corresponding unit as two separate strings.
+ */
+void humanise_bytes(off_t bytes, char **value, const char **unit,
+ unsigned flags);
3: eebf0d917b ! 3: 0b575738c2 builtin/repo: humanise count values in structure output
@@ strbuf.h: enum humanise_flags {
+/**
+ * Converts the given count into a downscaled human-readable value and
-+ * corresponding unit prefix as two separate strings.
++ * corresponding unit as two separate strings.
+ */
+void humanise_count(size_t count, char **value, const char **unit);
+
4: 37f71cc1bc ! 4: e2c79c8759 builtin/repo: add inflated object info to keyvalue structure output
@@ builtin/repo.c: static int count_objects(const char *path UNUSED, struct oid_arr
+
+ if (odb_read_object_info_extended(data->odb, &oids->oid[i], &oi,
+ OBJECT_INFO_SKIP_FETCH_OBJECT |
-+ OBJECT_INFO_QUICK) < 0)
++ OBJECT_INFO_QUICK) < 0)
+ continue;
+
+ inflated_total += inflated;
5: 40edf4c20b ! 5: 03219630cc builtin/repo: add inflated object info to structure table
@@ strbuf.c: void humanise_bytes(off_t bytes, char **value, const char **unit,
## strbuf.h ##
@@ strbuf.h: enum humanise_flags {
- * Use rate based unit prefixes for humanised values.
+ * Use rate based units for humanised values.
*/
HUMANISE_RATE = (1 << 0),
+ /*
-+ * Use compact "B" unit prefixes instead of "byte/bytes" for humanised
++ * Use compact "B" unit symbol instead of "byte/bytes" for humanised
+ * values.
+ */
+ HUMANISE_COMPACT = (1 << 1),
6: ba861f37c9 = 6: 7d8862a064 builtin/repo: add disk size info to keyvalue stucture output
7: 3118c17ae3 ! 7: 3e2d5c20f8 builtin/repo: add object disk size info to structure table
@@ t/t1901-repo-structure.sh: test_expect_success SHA1 'repository with references
- cat >expect <<-\EOF &&
+ # The tags disk size is handled specially due to the
+ # git-rev-list(1) --disk-usage=human option printing the full
-+ # "byte/bytes" unit prefix instead of just "B".
++ # "byte/bytes" unit string instead of just "B".
+ cat >expect <<-EOF &&
| Repository structure | Value |
| -------------------- | ---------- |
base-commit: e85ae279b0d58edc2f4c3fd5ac391b51e1223985
--
2.52.0.209.ge85ae279b0
Thread overview: 80+ messages
2025-12-09 22:58 [PATCH 0/6] builtin/repo: add object size info to structure output Justin Tobler
2025-12-09 22:58 ` [PATCH 1/6] builtin/repo: group per-type object values into struct Justin Tobler
2025-12-09 22:58 ` [PATCH 2/6] builtin/repo: humanise count values in structure output Justin Tobler
2025-12-10 6:28 ` Patrick Steinhardt
2025-12-10 15:10 ` Justin Tobler
2025-12-11 2:57 ` Junio C Hamano
2025-12-12 16:46 ` Justin Tobler
2025-12-09 22:58 ` [PATCH 3/6] builtin/repo: add inflated object info to keyvalue " Justin Tobler
2025-12-09 22:58 ` [PATCH 4/6] builtin/repo: add inflated object info to structure table Justin Tobler
2025-12-10 6:28 ` Patrick Steinhardt
2025-12-10 15:21 ` Justin Tobler
2025-12-09 22:58 ` [PATCH 5/6] builtin/repo: add disk size info to keyvalue stucture output Justin Tobler
2025-12-10 6:28 ` Patrick Steinhardt
2025-12-10 15:24 ` Justin Tobler
2025-12-12 20:40 ` Justin Tobler
2025-12-15 5:33 ` Patrick Steinhardt
2025-12-15 16:24 ` Justin Tobler
2025-12-10 14:58 ` Junio C Hamano
2025-12-10 19:09 ` Lucas Seiki Oshiro
2025-12-12 22:36 ` Justin Tobler
2025-12-12 23:58 ` Junio C Hamano
2025-12-09 22:58 ` [PATCH 6/6] builtin/repo: add object disk size info to structure table Justin Tobler
2025-12-10 6:28 ` Patrick Steinhardt
2025-12-10 15:24 ` Justin Tobler
2025-12-12 22:36 ` [PATCH v2 0/7] builtin/repo: add object size info to structure output Justin Tobler
2025-12-12 22:36 ` [PATCH v2 1/7] builtin/repo: group per-type object values into struct Justin Tobler
2025-12-12 22:36 ` [PATCH v2 2/7] strbuf: split out logic to humanise byte values Justin Tobler
2025-12-15 5:33 ` Patrick Steinhardt
2025-12-15 16:26 ` Justin Tobler
2025-12-15 8:21 ` Junio C Hamano
2025-12-15 16:47 ` Justin Tobler
2025-12-16 2:26 ` Jiang Xin
2025-12-16 4:37 ` Junio C Hamano
2025-12-16 6:18 ` Jiang Xin
2025-12-16 14:41 ` Justin Tobler
2025-12-12 22:36 ` [PATCH v2 3/7] builtin/repo: humanise count values in structure output Justin Tobler
2025-12-15 5:33 ` Patrick Steinhardt
2025-12-12 22:36 ` [PATCH v2 4/7] builtin/repo: add inflated object info to keyvalue " Justin Tobler
2025-12-15 5:33 ` Patrick Steinhardt
2025-12-15 16:48 ` Justin Tobler
2025-12-12 22:36 ` [PATCH v2 5/7] builtin/repo: add inflated object info to structure table Justin Tobler
2025-12-12 22:36 ` [PATCH v2 6/7] builtin/repo: add disk size info to keyvalue stucture output Justin Tobler
2025-12-15 5:33 ` Patrick Steinhardt
2025-12-12 22:36 ` [PATCH v2 7/7] builtin/repo: add object disk size info to structure table Justin Tobler
2025-12-15 20:56 ` [PATCH v3 0/7] builtin/repo: add object size info to structure output Justin Tobler
2025-12-15 20:56 ` [PATCH v3 1/7] builtin/repo: group per-type object values into struct Justin Tobler
2025-12-15 20:56 ` [PATCH v3 2/7] strbuf: split out logic to humanise byte values Justin Tobler
2025-12-16 1:19 ` Junio C Hamano
2025-12-16 1:36 ` Justin Tobler
2025-12-15 20:56 ` [PATCH v3 3/7] builtin/repo: humanise count values in structure output Justin Tobler
2025-12-16 8:25 ` Patrick Steinhardt
2025-12-15 20:56 ` [PATCH v3 4/7] builtin/repo: add inflated object info to keyvalue " Justin Tobler
2025-12-15 20:56 ` [PATCH v3 5/7] builtin/repo: add inflated object info to structure table Justin Tobler
2025-12-15 20:56 ` [PATCH v3 6/7] builtin/repo: add disk size info to keyvalue stucture output Justin Tobler
2025-12-15 20:56 ` [PATCH v3 7/7] builtin/repo: add object disk size info to structure table Justin Tobler
2025-12-16 8:25 ` Patrick Steinhardt
2025-12-16 14:48 ` Justin Tobler
2025-12-16 17:38 ` [PATCH v4 0/7] builtin/repo: add object size info to structure output Justin Tobler
2025-12-16 17:38 ` [PATCH v4 1/7] builtin/repo: group per-type object values into struct Justin Tobler
2025-12-16 17:38 ` [PATCH v4 2/7] strbuf: split out logic to humanise byte values Justin Tobler
2025-12-16 18:59 ` Junio C Hamano
2025-12-16 19:39 ` Justin Tobler
2025-12-16 17:38 ` [PATCH v4 3/7] builtin/repo: humanise count values in structure output Justin Tobler
2025-12-16 17:38 ` [PATCH v4 4/7] builtin/repo: add inflated object info to keyvalue " Justin Tobler
2025-12-17 7:03 ` Patrick Steinhardt
2025-12-17 16:10 ` Justin Tobler
2025-12-16 17:38 ` [PATCH v4 5/7] builtin/repo: add inflated object info to structure table Justin Tobler
2025-12-16 17:38 ` [PATCH v4 6/7] builtin/repo: add disk size info to keyvalue stucture output Justin Tobler
2025-12-16 17:38 ` [PATCH v4 7/7] builtin/repo: add object disk size info to structure table Justin Tobler
2025-12-17 7:03 ` [PATCH v4 0/7] builtin/repo: add object size info to structure output Patrick Steinhardt
2025-12-17 17:49 ` Justin Tobler
2025-12-17 17:53 ` Justin Tobler [this message]
2025-12-17 17:53 ` [PATCH v5 1/7] builtin/repo: group per-type object values into struct Justin Tobler
2025-12-17 17:53 ` [PATCH v5 2/7] strbuf: split out logic to humanise byte values Justin Tobler
2025-12-17 17:54 ` [PATCH v5 3/7] builtin/repo: humanise count values in structure output Justin Tobler
2025-12-17 17:54 ` [PATCH v5 4/7] builtin/repo: add inflated object info to keyvalue " Justin Tobler
2025-12-17 17:54 ` [PATCH v5 5/7] builtin/repo: add inflated object info to structure table Justin Tobler
2025-12-17 17:54 ` [PATCH v5 6/7] builtin/repo: add disk size info to keyvalue stucture output Justin Tobler
2025-12-17 17:54 ` [PATCH v5 7/7] builtin/repo: add object disk size info to structure table Justin Tobler
2025-12-18 6:32 ` [PATCH v5 0/7] builtin/repo: add object size info to structure output Patrick Steinhardt