All of lore.kernel.org
 help / color / mirror / Atom feed
From: Toon Claes <toon@iotcl.com>
To: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Cc: Karthik Nayak <karthik.188@gmail.com>,
	Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 02/10] builtin/cat-file: wire up an option to filter objects
Date: Tue, 01 Apr 2025 13:45:46 +0200	[thread overview]
Message-ID: <87r02cf6l1.fsf@iotcl.com> (raw)
In-Reply-To: <20250327-pks-cat-file-object-type-filter-v2-2-4bbc7085d7c5@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> In batch mode, git-cat-file(1) enumerates all objects and prints them
> by iterating through both loose and packed objects. This works without
> considering their reachability at all, and consequently most options to
> filter objects as they exist in e.g. git-rev-list(1) are not applicable.
> In some situations it may still be useful though to filter objects based
> on properties that are inherent to them. This includes the object size
> as well as its type.
>
> Such a filter already exists in git-rev-list(1) with the `--filter=`
> command line option. While this option supports a couple of filters that
> are not applicable to our usecase, some of them are quite a neat fit.
>
> Wire up the filter as an option for git-cat-file(1). This allows us to
> reuse the same syntax as in git-rev-list(1) so that we don't have to
> reinvent the wheel. For now, we die when any of the filter options has
> been passed by the user, but they will be wired up in subsequent
> commits.
>
> Further note that the filters that we are about to introduce don't
> significantly speed up the runtime of git-cat-file(1). While we can skip
> emitting a lot of objects in case they are uninteresting to us, the
> majority of time is spent reading the packfile, which is bottlenecked by
> I/O and not the processor. This will change though once we start to make
> use of bitmaps, which will allow us to skip reading the whole packfile.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  Documentation/git-cat-file.adoc |  6 ++++++
>  builtin/cat-file.c              | 37 +++++++++++++++++++++++++++++++++----
>  t/t1006-cat-file.sh             | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
> index d5890ae3686..f7f57b7f538 100644
> --- a/Documentation/git-cat-file.adoc
> +++ b/Documentation/git-cat-file.adoc
> @@ -81,6 +81,12 @@ OPTIONS
>  	end-of-line conversion, etc). In this case, `<object>` has to be of
>  	the form `<tree-ish>:<path>`, or `:<path>`.
>  
> +--filter=<filter-spec>::
> +--no-filter::
> +	Omit objects from the list of printed objects. This can only be used in
> +	combination with one of the batched modes. The '<filter-spec>' may be
> +	one of the following:
> +
>  --path=<path>::
>  	For use with `--textconv` or `--filters`, to allow specifying an object
>  	name and a path separately, e.g. when it is difficult to figure out
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 8e40016dd24..940900d92ad 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -936,10 +946,13 @@ int cmd_cat_file(int argc,
>  	int opt_cw = 0;
>  	int opt_epts = 0;
>  	const char *exp_type = NULL, *obj_name = NULL;
> -	struct batch_options batch = {0};
> +	struct batch_options batch = {
> +		.objects_filter = LIST_OBJECTS_FILTER_INIT,
> +	};
>  	int unknown_type = 0;
>  	int input_nul_terminated = 0;
>  	int nul_terminated = 0;
> +	int ret;
>  
>  	const char * const builtin_catfile_usage[] = {
>  		N_("git cat-file <type> <object>"),
> @@ -1000,6 +1013,8 @@ int cmd_cat_file(int argc,
>  			    N_("run filters on object's content"), 'w'),
>  		OPT_STRING(0, "path", &force_path, N_("blob|tree"),
>  			   N_("use a <path> for (--textconv | --filters); Not with 'batch'")),
> +		OPT_CALLBACK(0, "filter", &batch.objects_filter, N_("args"),
> +			     N_("object filtering"), opt_parse_list_objects_filter),

Because we've decided on `--filter` we can use
`OPT_PARSE_LIST_OBJECTS_FILTER` here now.

--
Toon

  reply	other threads:[~2025-04-01 11:46 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-21  7:47 [PATCH 0/9] builtin/cat-file: allow filtering objects in batch mode Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 1/9] builtin/cat-file: rename variable that tracks usage Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 2/9] builtin/cat-file: wire up an option to filter objects Patrick Steinhardt
2025-02-26 15:20   ` Toon Claes
2025-02-28 10:51     ` Patrick Steinhardt
2025-02-28 17:44       ` Junio C Hamano
2025-03-03 10:40         ` Patrick Steinhardt
2025-02-27 11:20   ` Karthik Nayak
2025-02-21  7:47 ` [PATCH 3/9] builtin/cat-file: support "blob:none" objects filter Patrick Steinhardt
2025-02-26 15:22   ` Toon Claes
2025-02-27 11:26   ` Karthik Nayak
2025-02-21  7:47 ` [PATCH 4/9] builtin/cat-file: support "blob:limit=" " Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 5/9] builtin/cat-file: support "object:type=" " Patrick Steinhardt
2025-02-26 15:23   ` Toon Claes
2025-02-28 10:51     ` Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 6/9] pack-bitmap: expose function to iterate over bitmapped objects Patrick Steinhardt
2025-02-24 18:05   ` Junio C Hamano
2025-02-25  6:59     ` Patrick Steinhardt
2025-02-25 16:59       ` Junio C Hamano
2025-02-27 23:26       ` Taylor Blau
2025-02-28 10:54         ` Patrick Steinhardt
2025-02-27 23:23     ` Taylor Blau
2025-02-27 23:32       ` Junio C Hamano
2025-02-27 23:39         ` Taylor Blau
2025-02-21  7:47 ` [PATCH 7/9] pack-bitmap: introduce function to check whether a pack is bitmapped Patrick Steinhardt
2025-02-27 23:33   ` Taylor Blau
2025-02-21  7:47 ` [PATCH 8/9] builtin/cat-file: deduplicate logic to iterate over all objects Patrick Steinhardt
2025-02-21  7:47 ` [PATCH 9/9] builtin/cat-file: use bitmaps to efficiently filter by object type Patrick Steinhardt
2025-02-27 11:38   ` Karthik Nayak
2025-02-27 23:48   ` Taylor Blau
2025-03-27  9:43 ` [PATCH v2 00/10] builtin/cat-file: allow filtering objects in batch mode Patrick Steinhardt
2025-03-27  9:43   ` [PATCH v2 01/10] builtin/cat-file: rename variable that tracks usage Patrick Steinhardt
2025-04-01  9:51     ` Karthik Nayak
2025-04-02 11:13       ` Patrick Steinhardt
2025-04-07 20:25         ` Junio C Hamano
2025-03-27  9:43   ` [PATCH v2 02/10] builtin/cat-file: wire up an option to filter objects Patrick Steinhardt
2025-04-01 11:45     ` Toon Claes [this message]
2025-04-02 11:13       ` Patrick Steinhardt
2025-04-01 12:05     ` Karthik Nayak
2025-04-02 11:13       ` Patrick Steinhardt
2025-03-27  9:43   ` [PATCH v2 03/10] builtin/cat-file: support "blob:none" objects filter Patrick Steinhardt
2025-04-01 12:22     ` Karthik Nayak
2025-04-01 12:31       ` Karthik Nayak
2025-04-02 11:13         ` Patrick Steinhardt
2025-03-27  9:43   ` [PATCH v2 04/10] builtin/cat-file: support "blob:limit=" " Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 05/10] builtin/cat-file: support "object:type=" " Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 06/10] pack-bitmap: allow passing payloads to `show_reachable_fn()` Patrick Steinhardt
2025-04-01 12:17     ` Toon Claes
2025-04-02 11:13       ` Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 07/10] pack-bitmap: add function to iterate over filtered bitmapped objects Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 08/10] pack-bitmap: introduce function to check whether a pack is bitmapped Patrick Steinhardt
2025-04-01 11:46     ` Toon Claes
2025-04-02 11:13       ` Patrick Steinhardt
2025-03-27  9:44   ` [PATCH v2 09/10] builtin/cat-file: deduplicate logic to iterate over all objects Patrick Steinhardt
2025-04-01 12:13     ` Toon Claes
2025-04-02 11:13       ` Patrick Steinhardt
2025-04-03 18:24         ` Toon Claes
2025-03-27  9:44   ` [PATCH v2 10/10] builtin/cat-file: use bitmaps to efficiently filter by object type Patrick Steinhardt
2025-04-02 11:13 ` [PATCH v3 00/11] builtin/cat-file: allow filtering objects in batch mode Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 01/11] builtin/cat-file: rename variable that tracks usage Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 02/11] builtin/cat-file: introduce function to report object status Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 03/11] builtin/cat-file: wire up an option to filter objects Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 04/11] builtin/cat-file: support "blob:none" objects filter Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 05/11] builtin/cat-file: support "blob:limit=" " Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 06/11] builtin/cat-file: support "object:type=" " Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 07/11] pack-bitmap: allow passing payloads to `show_reachable_fn()` Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 08/11] pack-bitmap: add function to iterate over filtered bitmapped objects Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 09/11] pack-bitmap: introduce function to check whether a pack is bitmapped Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 10/11] builtin/cat-file: deduplicate logic to iterate over all objects Patrick Steinhardt
2025-04-02 11:13   ` [PATCH v3 11/11] builtin/cat-file: use bitmaps to efficiently filter by object type Patrick Steinhardt
2025-04-03  8:17   ` [PATCH v3 00/11] builtin/cat-file: allow filtering objects in batch mode Karthik Nayak
2025-04-08  0:32     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r02cf6l1.fsf@iotcl.com \
    --to=toon@iotcl.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=karthik.188@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.