From: Siddharth Asthana <siddharthasthana31@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, chriscool@tuxfamily.org, toon@iotcl.com,
karthik.188@gmail.com, justin@parity.io
Subject: Re: [PATCH v1 1/1] rev-list: add --missing=print-only mode
Date: Mon, 20 Apr 2026 16:03:36 +0530
Message-ID: <3bb52232-ebe4-4c50-a674-943a175e983d@gmail.com>
In-Reply-To: <aeXZOAtILSr638LG@pks.im>
On 20/04/26 13:13, Patrick Steinhardt wrote:
> On Sun, Apr 19, 2026 at 02:18:40PM +0530, Siddharth Asthana wrote:
>> When working with partial clones, it's common to want just the list of
>> missing objects. The current --missing=print mode does this but mixes
>> present and missing objects together, with missing ones prefixed by '?'.
>> Getting only the missing OIDs requires an extra pipe:
>>
>> git rev-list --objects --all --missing=print | perl -ne 'print if s/^[?]//'
>>
>> Add --missing=print-only which outputs only the missing object OIDs, one
>> per line, without any prefix. This makes the above one-liner unnecessary
>> and the output directly usable by downstream tools.
>
> Naming is a bit tough, as "print-only" sounds as if we're only printing
> them without doing anything else, but it doesn't quite convey the
The name came from Christian's original suggestion in the issue [1],
but agreed, it's ambiguous. Phillip's --missing-only approach solves this.
[1] https://gitlab.com/gitlab-org/git/-/work_items/80#note_464005298
> relation to non-missing objects. I don't really have a better suggestion
> though -- "print-exclusively" may convey the meaning a tiny bit better,
> but still suffers kind of the same issue.
>
>> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
>> index 8f63003709..ba7e3e3919 100644
>> --- a/builtin/rev-list.c
>> +++ b/builtin/rev-list.c
>> @@ -104,14 +104,22 @@ static void missing_objects_map_entry_free(void *e)
>>
>> static struct oidmap missing_objects;
>> enum missing_action {
>> - MA_ERROR = 0, /* fail if any missing objects are encountered */
>> - MA_ALLOW_ANY, /* silently allow ALL missing objects */
>> - MA_PRINT, /* print ALL missing objects in special section */
>> - MA_PRINT_INFO, /* same as MA_PRINT but also prints missing object info */
>> + MA_ERROR = 0, /* fail if any missing objects are encountered */
>> + MA_ALLOW_ANY, /* silently allow ALL missing objects */
>> + MA_PRINT, /* print ALL missing objects in special section */
>> + MA_PRINT_INFO, /* same as MA_PRINT but also prints missing object info */
>> + MA_PRINT_ONLY, /* print ONLY missing objects, without the "?" prefix */
>
> Makes me wonder whether we'll eventually also want to have
> `MA_PRINT_INFO_ONLY`.
Right - that's the strongest argument for --missing-only as a separate
flag. Gets us that for free.
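Roughly what I have in mind, as a sketch only: `missing_only` and
`hide_present_objects()` are hypothetical names, and the exact semantics
of --missing-only are still to be settled on the list. With "only" as a
separate boolean, it composes with both print modes and the enum needs
no new values:

```c
/* Existing modes from builtin/rev-list.c; no MA_PRINT_ONLY needed. */
enum missing_action {
	MA_ERROR = 0,
	MA_ALLOW_ANY,
	MA_PRINT,
	MA_PRINT_INFO,
	MA_ALLOW_PROMISOR,
};

/*
 * Hypothetical helper: `missing_only` stands in for a separate
 * --missing-only flag. Present objects are hidden only when that flag
 * is combined with a mode that prints missing objects (assumption).
 */
static int hide_present_objects(enum missing_action action, int missing_only)
{
	if (!missing_only)
		return 0;
	return action == MA_PRINT || action == MA_PRINT_INFO;
}
```

So `--missing=print-info` plus the flag would behave like the
`MA_PRINT_INFO_ONLY` you mention, without a dedicated enum value.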
>
>> MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
>> };
>> static enum missing_action arg_missing_action;
>>
>> +static inline int missing_action_prints(void)
>
> How about naming this `should_print_missing_object()` instead? That
> gives the reader a bit more context.
The function is a predicate on the mode, not on a specific object, so
that name would be slightly misleading. But with --missing-only this
helper might change shape anyway. WDYT?
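To illustrate what I mean by "predicate on the mode": the helper looks
only at the global mode and never at an object. The body below is an
assumed reconstruction for illustration, since the hunk with the real
body is not quoted in this mail:

```c
enum missing_action {
	MA_ERROR = 0,
	MA_ALLOW_ANY,
	MA_PRINT,
	MA_PRINT_INFO,
	MA_PRINT_ONLY,
	MA_ALLOW_PROMISOR,
};
static enum missing_action arg_missing_action;

/*
 * Assumed body: true for every mode that prints missing objects.
 * Takes no object argument, so it cannot answer "should this
 * particular object be printed?".
 */
static inline int missing_action_prints(void)
{
	switch (arg_missing_action) {
	case MA_PRINT:
	case MA_PRINT_INFO:
	case MA_PRINT_ONLY:
		return 1;
	default:
		return 0;
	}
}
```

A name like `should_print_missing_object()` would suggest a per-object
decision, which this helper does not make.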
>
>> @@ -1011,7 +1036,7 @@ int cmd_rev_list(int argc,
>>
>> stop_progress(&progress);
>>
>> - if (revs.count) {
>> + if (revs.count && arg_missing_action != MA_PRINT_ONLY) {
>> if (revs.left_right && revs.cherry_mark)
>> printf("%d\t%d\t%d\n", revs.count_left, revs.count_right, revs.count_same);
>> else if (revs.left_right)
>
> Not a fault of your patch, but I really feel like git-rev-list(1) is
> becoming more and more tangled. The fact that we have to add this check
> to so many different sites doesn't inspire confidence that we have
> indeed catched all of them that need this check.
>
> It would be great if this was reworked a bit to become more obvious, but
> that's probably outside the scope of this patch series.
>
>> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
>> index 08e92dd002..105560ad21 100755
>> --- a/t/t6022-rev-list-missing.sh
>> +++ b/t/t6022-rev-list-missing.sh
>> @@ -198,6 +198,32 @@ do
>> '
>> done
>>
>> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>> +do
>> + test_expect_success "rev-list --missing=print-only with missing $obj" '
>> + oid="$(git rev-parse $obj)" &&
>> + path=".git/objects/$(test_oid_to_path $oid)" &&
>> +
>> + # Capture present OIDs before hiding anything.
>> + git rev-list --objects --no-object-names HEAD ^$obj >present.raw &&
>> +
>> + mv "$path" "$path.hidden" &&
>> + test_when_finished "mv $path.hidden $path" &&
>> +
>> + git rev-list --missing=print-only --objects --no-object-names \
>> + HEAD >actual &&
>> +
>> + # Only the missing OID should appear, without the "?" prefix.
>> + grep "^$oid$" actual &&
>> +
>> + # Present objects must NOT appear in the output.
>> + while read present_oid
>> + do
>> + ! grep "^$present_oid$" actual || return 1
>> + done <present.raw
>
> How many present object IDs do we have? I'm a bit worried that we now
> execute grep(1) hundreds of times. Can we maybe do some tricks with
> comm(1) instead?
Phillip's test_cmp approach is simpler: since we hide only one object,
the output should be exactly that OID. Will use that.
Thanks,
Asthana
>
> Thanks!
>
> Patrick