git.vger.kernel.org archive mirror
From: Justin Tobler <jltobler@gmail.com>
To: git@vger.kernel.org
Cc: ps@pks.im, gitster@pobox.com, worldhello.net@gmail.com,
	Justin Tobler <jltobler@gmail.com>
Subject: [PATCH v5 0/7] builtin/repo: add object size info to structure output
Date: Wed, 17 Dec 2025 11:53:57 -0600
Message-ID: <20251217175404.37963-1-jltobler@gmail.com>
In-Reply-To: <20251216173842.3357832-1-jltobler@gmail.com>

Greetings,

This patch series extends the recently introduced "structure" subcommand
for git-repo(1) to collect object size information. More specifically,
it shows total inflated and disk sizes of objects by object type. The
aim is to provide additional insight into the structure of a repository
that may be useful to users.

In addition to this change, this series also updates the table output
format to downscale larger output values along with the appropriate unit
prefix. This is done to make table output more human friendly. The
keyvalue and nul output formats are left the same since they are
intended more for machine parsing.
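
As a rough illustration of the difference between the two kinds of
output (a standalone sketch, not the code from this series; the
function names are hypothetical and the fraction is truncated rather
than rounded), downscaling for the table versus leaving the raw value
for machine parsing could look like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper for illustration only; not the code from this series.
 * Table output: downscale by powers of 1024 and append a binary unit.
 * The fraction is truncated to two digits rather than rounded.
 */
static void sketch_table_value(uint64_t bytes, char *out, size_t len)
{
	static const char *const units[] = { "B", "KiB", "MiB", "GiB" };
	uint64_t whole = bytes, rem = 0;
	size_t idx = 0;

	while (whole >= 1024 && idx < 3) {
		rem = whole % 1024;
		whole /= 1024;
		idx++;
	}

	if (!idx)
		snprintf(out, len, "%llu %s", (unsigned long long)whole,
			 units[idx]);
	else
		snprintf(out, len, "%llu.%02llu %s", (unsigned long long)whole,
			 (unsigned long long)(rem * 100 / 1024), units[idx]);
}

/* Machine-oriented output: the raw byte count, never downscaled. */
static void sketch_keyvalue_value(uint64_t bytes, char *out, size_t len)
{
	snprintf(out, len, "%llu", (unsigned long long)bytes);
}
```

With this sketch, 1536 bytes would be written as "1.50 KiB" for the
table and as "1536" for the keyvalue and nul formats.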

Changes in V5:
- Small updates to some comments and log messages to improve
  correctness.
- Adjusted spacing in builtin/repo.c:count_objects().

Changes in V4:
- Unmark "byte" string in "t/helper/test-simple-ipc.c" for translation
  to avoid conflict with translated plural "byte/bytes" string.
- Remove some unnecessary translations and add comments to clarify some
  of the added translations.
- Some small changes to the tests in patch 7.

Changes in V3:
- Address potential localization regression by making the downscaled
  number format string also translatable. Also make the format string
  for how the values and unit prefixes are displayed via
  `strbuf_humanise_{bytes,rate}()` translatable to be more flexible.
- `strbuf_humanise_{bytes,count}_value()` has been renamed to
  `humanise_{bytes,count}()` and updated to provide both the value and
  unit prefix as separate strings.
- Unit prefix strings are no longer allocated and instead constant.
- The humanise flags are now defined in an enum.
- Instead of using `OBJECT_INFO_FOR_PREFETCH`,
  `OBJECT_INFO_SKIP_FETCH_OBJECT` and `OBJECT_INFO_QUICK` are used
  explicitly.
- Tests now use git-rev-list(1) to verify disk size info.
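
To illustrate why returning the value and the unit as two separate
strings helps with column alignment, here is a minimal standalone
sketch. It is not the implementation in strbuf.c: the names
`sketch_humanise()`, `sketch_row()` and the `SKETCH_*` flags are
hypothetical, the column widths are arbitrary, and the fraction is
truncated rather than rounded.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flags, loosely modelled on the enum described above. */
enum sketch_flags {
	SKETCH_RATE = (1 << 0),    /* use per-second units */
	SKETCH_COMPACT = (1 << 1), /* "B" instead of "byte"/"bytes" */
};

/*
 * Illustration only: writes the downscaled number into `value` and points
 * `unit` at a constant string, so the caller never frees the unit.
 */
static void sketch_humanise(uint64_t bytes, char *value, size_t len,
			    const char **unit, unsigned flags)
{
	static const char *const size_units[] = { "KiB", "MiB", "GiB" };
	static const char *const rate_units[] = { "KiB/s", "MiB/s", "GiB/s" };
	uint64_t whole = bytes, rem = 0;
	int idx = -1;

	while (whole >= 1024 && idx < 2) {
		rem = whole % 1024;
		whole /= 1024;
		idx++;
	}

	if (idx < 0) {
		snprintf(value, len, "%llu", (unsigned long long)whole);
		if (flags & SKETCH_RATE)
			*unit = (flags & SKETCH_COMPACT) ? "B/s" : "bytes/s";
		else if (flags & SKETCH_COMPACT)
			*unit = "B";
		else
			*unit = whole == 1 ? "byte" : "bytes";
		return;
	}

	snprintf(value, len, "%llu.%02llu", (unsigned long long)whole,
		 (unsigned long long)(rem * 100 / 1024));
	*unit = (flags & SKETCH_RATE) ? rate_units[idx] : size_units[idx];
}

/* Right-align the number and left-align the unit so columns line up. */
static void sketch_row(char *row, size_t len, const char *name,
		       uint64_t bytes)
{
	char value[32];
	const char *unit;

	sketch_humanise(bytes, value, sizeof(value), &unit, SKETCH_COMPACT);
	snprintf(row, len, "| %-8s | %7s %-3s |", name, value, unit);
}
```

Because the number and the unit are padded independently, rows with
"512 B" and "1.50 KiB" keep their digits and units in the same
columns, which a single pre-joined string would not allow.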

Changes in V2:
- Factor out and reuse existing logic from strbuf_humanise() to handle
  downscaling values and determining the appropriate unit prefix
  separately. This enables more control over how exactly the values are
  written to the structure output table which is useful for alignment
  reasons. I'm not sure about the interface used in patch 2. Feedback is
  most welcome.
- In the previous version, when checking object size on a missing object
  we would die. Instead we now ignore missing objects. This allows the
  structure command to work on partial clones.
- disk/inflated keyvalue names renamed to disk_size/inflated_size.
- Unit prefixes are marked for translation.
- The tests for keyvalue disk size values are updated to check against
  real expected values instead of skipping. Table output tests still
  skip verifying human-readable values though.
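
The "ignore missing objects" behaviour can be pictured with the
following standalone sketch. It does not use Git's object database
API: `struct sketch_object`, `sketch_object_info()` and
`sketch_sum_sizes()` are hypothetical stand-ins, with a negative
return value playing the role of a failed lookup on an object that is
absent from a partial clone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object that may be absent locally. */
struct sketch_object {
	int present;
	uint64_t inflated_size;
	uint64_t disk_size;
};

struct sketch_totals {
	uint64_t inflated_size;
	uint64_t disk_size;
};

/* Returns a negative value when the object is missing. */
static int sketch_object_info(const struct sketch_object *obj,
			      uint64_t *inflated, uint64_t *disk)
{
	if (!obj->present)
		return -1;
	*inflated = obj->inflated_size;
	*disk = obj->disk_size;
	return 0;
}

/* Sum the sizes, skipping missing objects instead of aborting. */
static void sketch_sum_sizes(const struct sketch_object *objs, size_t nr,
			     struct sketch_totals *totals)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		uint64_t inflated, disk;

		if (sketch_object_info(&objs[i], &inflated, &disk) < 0)
			continue;

		totals->inflated_size += inflated;
		totals->disk_size += disk;
	}
}
```

In this sketch a missing object simply contributes nothing to either
total, so the reported sizes cover only the objects that are actually
available locally.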

Thanks,
-Justin

Justin Tobler (7):
  builtin/repo: group per-type object values into struct
  strbuf: split out logic to humanise byte values
  builtin/repo: humanise count values in structure output
  builtin/repo: add inflated object info to keyvalue structure output
  builtin/repo: add inflated object info to structure table
  builtin/repo: add disk size info to keyvalue stucture output
  builtin/repo: add object disk size info to structure table

 Documentation/git-repo.adoc |   2 +
 builtin/repo.c              | 175 ++++++++++++++++++++++++++++++------
 strbuf.c                    | 102 ++++++++++++++-------
 strbuf.h                    |  25 ++++++
 t/helper/test-simple-ipc.c  |   7 +-
 t/t1901-repo-structure.sh   | 118 ++++++++++++++++--------
 6 files changed, 331 insertions(+), 98 deletions(-)

Range-diff against v4:
1:  be14de68f6 = 1:  be14de68f6 builtin/repo: group per-type object values into struct
2:  0a145cfeec ! 2:  61cff22afa strbuf: split out logic to humanise byte values
    @@ Commit message
         In a subsequent commit, byte size values displayed in table output for
         the git-repo(1) "structure" subcommand will be shown in a more
         human-readable format with the appropriate unit prefixes. For this
    -    usecase, the downscaled values and unit prefixes must be handled
    +    usecase, the downscaled values and unit strings must be handled
         separately to ensure proper column alignment.
     
         Split out logic from strbuf_humanise() to downscale byte values and
    @@ strbuf.c: void strbuf_addstr_urlencode(struct strbuf *sb, const char *s,
     +
     +	/*
     +	 * TRANSLATORS: The first argument is the number string. The second
    -+	 * argument is the unit prefix string (i.e. "12.34 MiB/s").
    ++	 * argument is the unit string (i.e. "12.34 MiB/s").
     +	 */
     +	strbuf_addf(buf, _("%s %s"), value, unit);
     +	free(value);
    @@ strbuf.h: void strbuf_addbuf_percentquote(struct strbuf *dst, const struct strbu
      
     +enum humanise_flags {
     +	/*
    -+	 * Use rate based unit prefixes for humanised values.
    ++	 * Use rate based units for humanised values.
     +	 */
     +	HUMANISE_RATE = (1 << 0),
     +};
     +
     +/**
     + * Converts the given byte size into a downscaled human-readable value and
    -+ * corresponding unit prefix as two separate strings.
    ++ * corresponding unit as two separate strings.
     + */
     +void humanise_bytes(off_t bytes, char **value, const char **unit,
     +		    unsigned flags);
3:  eebf0d917b ! 3:  0b575738c2 builtin/repo: humanise count values in structure output
    @@ strbuf.h: enum humanise_flags {
      
     +/**
     + * Converts the given count into a downscaled human-readable value and
    -+ * corresponding unit prefix as two separate strings.
    ++ * corresponding unit as two separate strings.
     + */
     +void humanise_count(size_t count, char **value, const char **unit);
     +
4:  37f71cc1bc ! 4:  e2c79c8759 builtin/repo: add inflated object info to keyvalue structure output
    @@ builtin/repo.c: static int count_objects(const char *path UNUSED, struct oid_arr
     +
     +		if (odb_read_object_info_extended(data->odb, &oids->oid[i], &oi,
     +						  OBJECT_INFO_SKIP_FETCH_OBJECT |
    -+							  OBJECT_INFO_QUICK) < 0)
    ++						  OBJECT_INFO_QUICK) < 0)
     +			continue;
     +
     +		inflated_total += inflated;
5:  40edf4c20b ! 5:  03219630cc builtin/repo: add inflated object info to structure table
    @@ strbuf.c: void humanise_bytes(off_t bytes, char **value, const char **unit,
     
      ## strbuf.h ##
     @@ strbuf.h: enum humanise_flags {
    - 	 * Use rate based unit prefixes for humanised values.
    + 	 * Use rate based units for humanised values.
      	 */
      	HUMANISE_RATE = (1 << 0),
     +	/*
    -+	 * Use compact "B" unit prefixes instead of "byte/bytes" for humanised
    ++	 * Use compact "B" unit symbol instead of "byte/bytes" for humanised
     +	 * values.
     +	 */
     +	HUMANISE_COMPACT = (1 << 1),
6:  ba861f37c9 = 6:  7d8862a064 builtin/repo: add disk size info to keyvalue stucture output
7:  3118c17ae3 ! 7:  3e2d5c20f8 builtin/repo: add object disk size info to structure table
    @@ t/t1901-repo-structure.sh: test_expect_success SHA1 'repository with references
     -		cat >expect <<-\EOF &&
     +		# The tags disk size is handled specially due to the
     +		# git-rev-list(1) --disk-usage=human option printing the full
    -+		# "byte/bytes" unit prefix instead of just "B".
    ++		# "byte/bytes" unit string instead of just "B".
     +		cat >expect <<-EOF &&
      		| Repository structure | Value      |
      		| -------------------- | ---------- |

base-commit: e85ae279b0d58edc2f4c3fd5ac391b51e1223985
-- 
2.52.0.209.ge85ae279b0



