All of lore.kernel.org
 help / color / mirror / Atom feed
From: Siddharth Asthana <siddharthasthana31@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, chriscool@tuxfamily.org, toon@iotcl.com,
	karthik.188@gmail.com, justin@parity.io
Subject: Re: [PATCH v1 1/1] rev-list: add --missing=print-only mode
Date: Mon, 20 Apr 2026 16:03:36 +0530	[thread overview]
Message-ID: <3bb52232-ebe4-4c50-a674-943a175e983d@gmail.com> (raw)
In-Reply-To: <aeXZOAtILSr638LG@pks.im>



On 20/04/26 13:13, Patrick Steinhardt wrote:
> On Sun, Apr 19, 2026 at 02:18:40PM +0530, Siddharth Asthana wrote:
>> When working with partial clones, it's common to want just the list of
>> missing objects. The current --missing=print mode does this but mixes
>> present and missing objects together, with missing ones prefixed by '?'.
>> Getting only the missing OIDs requires an extra pipe:
>>
>>    git rev-list --objects --all --missing=print | perl -ne 'print if s/^[?]//'
>>
>> Add --missing=print-only which outputs only the missing object OIDs, one
>> per line, without any prefix. This makes the above one-liner unnecessary
>> and the output directly usable by downstream tools.
> 
> Naming is a bit tough, as "print-only" sounds as if we're only printing
> them without doing anything else, but it doesn't quite convey the


The name came from Christian's original suggesttion in the issue [1], 
but agreed it's ambiguous. Phillip's --missing-only approach solve this.

[1] https://gitlab.com/gitlab-org/git/-/work_items/80#note_464005298


> relation to non-missing objects. I don't really have a better suggestion
> though -- "print-exclusively" may convey the meaning a tiny bit better,
> but still suffers kind of the same issue.
> 
>> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
>> index 8f63003709..ba7e3e3919 100644
>> --- a/builtin/rev-list.c
>> +++ b/builtin/rev-list.c
>> @@ -104,14 +104,22 @@ static void missing_objects_map_entry_free(void *e)
>>   
>>   static struct oidmap missing_objects;
>>   enum missing_action {
>> -	MA_ERROR = 0,    /* fail if any missing objects are encountered */
>> -	MA_ALLOW_ANY,    /* silently allow ALL missing objects */
>> -	MA_PRINT,        /* print ALL missing objects in special section */
>> -	MA_PRINT_INFO,   /* same as MA_PRINT but also prints missing object info */
>> +	MA_ERROR = 0, /* fail if any missing objects are encountered */
>> +	MA_ALLOW_ANY, /* silently allow ALL missing objects */
>> +	MA_PRINT, /* print ALL missing objects in special section */
>> +	MA_PRINT_INFO, /* same as MA_PRINT but also prints missing object info */
>> +	MA_PRINT_ONLY, /* print ONLY missing objects, without the "?" prefix */
> 
> Makes me wonder whether we'll eventually also want to have
> `MA_PRINT_INFO_ONLY`.



Right - that's the strongest argument for --missing-only as a separate 
flag. Gets us that for free.


> 
>>   	MA_ALLOW_PROMISOR, /* silently allow all missing PROMISOR objects */
>>   };
>>   static enum missing_action arg_missing_action;
>>   
>> +static inline int missing_action_prints(void)
> 
> How about naming this `should_print_missing_object()` instead? That
> gives the reader a bit more context.


The function is a predicate on the mode, not on a specific object, so 
that name would be slightly misleading. But with --missing-only this 
helper might change shape anyway. WDYT?



> 
>> @@ -1011,7 +1036,7 @@ int cmd_rev_list(int argc,
>>   
>>   	stop_progress(&progress);
>>   
>> -	if (revs.count) {
>> +	if (revs.count && arg_missing_action != MA_PRINT_ONLY) {
>>   		if (revs.left_right && revs.cherry_mark)
>>   			printf("%d\t%d\t%d\n", revs.count_left, revs.count_right, revs.count_same);
>>   		else if (revs.left_right)
> 
> Not a fault of your patch, but I really feel like git-rev-list(1) is
> becoming more and more tangled. The fact that we have to add this check
> to so many different sites doesn't inspire confidence that we have
> indeed catched all of them that need this check.
> 
> It would be great if this was reworked a bit to become more obvious, but
> that's probably outside the scope of this patch series.
> 
>> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
>> index 08e92dd002..105560ad21 100755
>> --- a/t/t6022-rev-list-missing.sh
>> +++ b/t/t6022-rev-list-missing.sh
>> @@ -198,6 +198,32 @@ do
>>   	'
>>   done
>>   
>> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>> +do
>> +	test_expect_success "rev-list --missing=print-only with missing $obj" '
>> +		oid="$(git rev-parse $obj)" &&
>> +		path=".git/objects/$(test_oid_to_path $oid)" &&
>> +
>> +		# Capture present OIDs before hiding anything.
>> +		git rev-list --objects --no-object-names HEAD ^$obj >present.raw &&
>> +
>> +		mv "$path" "$path.hidden" &&
>> +		test_when_finished "mv $path.hidden $path" &&
>> +
>> +		git rev-list --missing=print-only --objects --no-object-names \
>> +			HEAD >actual &&
>> +
>> +		# Only the missing OID should appear, without the "?" prefix.
>> +		grep "^$oid$" actual &&
>> +
>> +		# Present objects must NOT appear in the output.
>> +		while read present_oid
>> +		do
>> +			! grep "^$present_oid$" actual || return 1
>> +		done <present.raw
> 
> How many present object IDs do we have? I'm a bit worried that we now
> execute grep(1) hundreds of times. Can we maybe do some tricks with
> comm(1) instead?


Phillip's test_cmp approach is simpler, since we hide one object, the 
output should be exactly that OID. Will use that.


Thanks,
Asthana


> 
> Thanks!
> 
> Patrick


      parent reply	other threads:[~2026-04-20 10:33 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-19  8:48 [PATCH v1 0/1] rev-list: add --missing=print-only mode Siddharth Asthana
2026-04-19  8:48 ` [PATCH v1 1/1] " Siddharth Asthana
2026-04-19 22:36   ` Derrick Stolee
2026-04-20 10:24     ` Siddharth Asthana
2026-04-20 11:44       ` Derrick Stolee
2026-04-20  7:43   ` Patrick Steinhardt
2026-04-20  8:57     ` Phillip Wood
2026-04-20  9:55       ` Patrick Steinhardt
2026-04-20 10:37       ` Siddharth Asthana
2026-04-20 11:00       ` Kristoffer Haugsbakk
2026-04-20 10:33     ` Siddharth Asthana [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3bb52232-ebe4-4c50-a674-943a175e983d@gmail.com \
    --to=siddharthasthana31@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=justin@parity.io \
    --cc=karthik.188@gmail.com \
    --cc=ps@pks.im \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.