Git development
 help / color / mirror / Atom feed
From: Siddharth Asthana <siddharthasthana31@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, chriscool@tuxfamily.org, toon@iotcl.com,
	karthik.188@gmail.com, justin@parity.io
Subject: Re: [PATCH v1 1/1] rev-list: add --missing=print-only mode
Date: Mon, 20 Apr 2026 16:03:36 +0530	[thread overview]
Message-ID: <3bb52232-ebe4-4c50-a674-943a175e983d@gmail.com> (raw)
In-Reply-To: <aeXZOAtILSr638LG@pks.im>



On 20/04/26 13:13, Patrick Steinhardt wrote:
> On Sun, Apr 19, 2026 at 02:18:40PM +0530, Siddharth Asthana wrote:
>> When working with partial clones, it's common to want just the list of
>> missing objects. The current --missing=print mode does this but mixes
>> present and missing objects together, with missing ones prefixed by '?'.
>> Getting only the missing OIDs requires an extra pipe:
>>
>>    git rev-list --objects --all --missing=print | perl -ne 'print if s/^[?]//'
>>
>> Add --missing=print-only which outputs only the missing object OIDs, one
>> per line, without any prefix. This makes the above one-liner unnecessary
>> and the output directly usable by downstream tools.
> 
> Naming is a bit tough, as "print-only" sounds as if we're only printing
> them without doing anything else, but it doesn't quite convey the


The name came from Christian's original suggesttion in the issue [1], 
but agreed it's ambiguous. Phillip's --missing-only approach solve this.

[1] https://gitlab.com/gitlab-org/git/-/work_items/80#note_464005298


> relation to non-missing objects. I don't really have a better suggestion
> though -- "print-exclusively" may convey the meaning a tiny bit better,
> but still suffers kind of the same issue.
> 
>> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
>> index 8f63003709..ba7e3e3919 100644
>> --- a/builtin/rev-list.c
>> +++ b/builtin/rev-list.c
>> @@ -104,14 +104,22 @@ static void missing_objects_map_entry_free(void *e)
>>   
>>   static struct oidmap missing_objects;
>>   enum missing_action {
>> -	MA_ERROR = 0,    /* fail if any missing objects are encountered */
>> -	MA_ALLOW_ANY,    /* silently allow ALL missing objects */
>> -	MA_PRINT,        /* print ALL missing objects in special section */
>> -	MA_PRINT_INFO,   /* same as MA_PRINT but also prints missing object info */
>> +	MA_ERROR = 0, /* fail if any missing objects are encountered */
>> +	MA_ALLOW_ANY, /* silently allow ALL missing objects */
>> +	MA_PRINT, /* print ALL missing objects in special section */
>> +	MA_PRINT_INFO, /* same as MA_PRINT but also prints missing object info */
>> +	MA_PRINT_ONLY, /* print ONLY missing objects, without the "?" prefix */
> 
> Makes me wonder whether we'll eventually also want to have
> `MA_PRINT_INFO_ONLY`.



Right - that's the strongest argument for --missing-only as a separate 
flag. Gets us that for free.


> 
>>   	MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
>>   };
>>   static enum missing_action arg_missing_action;
>>   
>> +static inline int missing_action_prints(void)
> 
> How about naming this `should_print_missing_object()` instead? That
> gives the reader a bit more context.


The function is a predicate on the mode, not on a specific object, so 
that name would be slightly misleading. But with --missing-only this 
helper might change shape anyway. WDYT?



> 
>> @@ -1011,7 +1036,7 @@ int cmd_rev_list(int argc,
>>   
>>   	stop_progress(&progress);
>>   
>> -	if (revs.count) {
>> +	if (revs.count && arg_missing_action != MA_PRINT_ONLY) {
>>   		if (revs.left_right && revs.cherry_mark)
>>   			printf("%d\t%d\t%d\n", revs.count_left, revs.count_right, revs.count_same);
>>   		else if (revs.left_right)
> 
> Not a fault of your patch, but I really feel like git-rev-list(1) is
> becoming more and more tangled. The fact that we have to add this check
> to so many different sites doesn't inspire confidence that we have
> indeed catched all of them that need this check.
> 
> It would be great if this was reworked a bit to become more obvious, but
> that's probably outside the scope of this patch series.
> 
>> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
>> index 08e92dd002..105560ad21 100755
>> --- a/t/t6022-rev-list-missing.sh
>> +++ b/t/t6022-rev-list-missing.sh
>> @@ -198,6 +198,32 @@ do
>>   	'
>>   done
>>   
>> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>> +do
>> +	test_expect_success "rev-list --missing=print-only with missing $obj" '
>> +		oid="$(git rev-parse $obj)" &&
>> +		path=".git/objects/$(test_oid_to_path $oid)" &&
>> +
>> +		# Capture present OIDs before hiding anything.
>> +		git rev-list --objects --no-object-names HEAD ^$obj >present.raw &&
>> +
>> +		mv "$path" "$path.hidden" &&
>> +		test_when_finished "mv $path.hidden $path" &&
>> +
>> +		git rev-list --missing=print-only --objects --no-object-names \
>> +			HEAD >actual &&
>> +
>> +		# Only the missing OID should appear, without the "?" prefix.
>> +		grep "^$oid$" actual &&
>> +
>> +		# Present objects must NOT appear in the output.
>> +		while read present_oid
>> +		do
>> +			! grep "^$present_oid$" actual || return 1
>> +		done <present.raw
> 
> How many present object IDs do we have? I'm a bit worried that we now
> execute grep(1) hundreds of times. Can we maybe do some tricks with
> comm(1) instead?


Phillip's test_cmp approach is simpler, since we hide one object, the 
output should be exactly that OID. Will use that.


Thanks,
Asthana


> 
> Thanks!
> 
> Patrick


      parent reply	other threads:[~2026-04-20 10:33 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-19  8:48 [PATCH v1 0/1] rev-list: add --missing=print-only mode Siddharth Asthana
2026-04-19  8:48 ` [PATCH v1 1/1] " Siddharth Asthana
2026-04-19 22:36   ` Derrick Stolee
2026-04-20 10:24     ` Siddharth Asthana
2026-04-20 11:44       ` Derrick Stolee
2026-04-20  7:43   ` Patrick Steinhardt
2026-04-20  8:57     ` Phillip Wood
2026-04-20  9:55       ` Patrick Steinhardt
2026-04-20 10:37       ` Siddharth Asthana
2026-04-20 11:00       ` Kristoffer Haugsbakk
2026-04-20 10:33     ` Siddharth Asthana [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3bb52232-ebe4-4c50-a674-943a175e983d@gmail.com \
    --to=siddharthasthana31@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=justin@parity.io \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox