From: "Andrew Olsen via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Andrew Olsen <andrew232@gmail.com>,
Andrew Olsen <andrew.olsen@koordinates.com>
Subject: [PATCH 4/4] Documentation for list-object-filter extensions
Date: Sun, 05 Sep 2021 23:51:39 +0000
Message-ID: <22475475cae2d32dfa664709f283e4c6e8f60e83.1630885899.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.1031.git.1630885899.gitgitgadget@gmail.com>
From: Andrew Olsen <andrew.olsen@koordinates.com>
Explains how to develop a custom extension for list-objects-filter
behavior, and how to compile it into a custom build of Git using the
FILTER_EXTENSIONS Makefile argument.
Signed-off-by: Andrew Olsen <andrew.olsen@koordinates.com>
---
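As a rough illustration of the lifecycle that the README below describes
(`init_fn` once, `filter_object_fn` per object, `free_fn` once), here is a
standalone sketch. Everything in it is an assumption made for illustration:
the `sketch_*` types and the callback signatures are simplified stand-ins,
not the declarations from list-objects-filter-extensions.h, and the "keep"
filter uses a deterministic every-Nth rule rather than the random percentage
that `rand` uses, so that the result is reproducible.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real Git declarations. */
enum sketch_type { SKETCH_BLOB, SKETCH_TREE };

struct sketch_extension {
	const char *name;
	int (*init_fn)(const char *arg, void **context);
	void (*filter_object_fn)(enum sketch_type type, int *omit, void *context);
	void (*free_fn)(void *context);
};

/* Per-request state, handed back to every callback as a void pointer. */
struct keep_ctx {
	long keep_every;	/* keep one blob out of every N */
	long blobs_seen;
};

/* Parse the filter argument; return non-zero if it is unusable. */
static int keep_init(const char *arg, void **context)
{
	struct keep_ctx *ctx;
	long n = arg ? strtol(arg, NULL, 10) : 0;

	if (n <= 0)
		return -1;
	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -1;
	ctx->keep_every = n;
	*context = ctx;
	return 0;
}

/* Trees are always kept; blobs are kept when their index is a multiple of N. */
static void keep_filter_object(enum sketch_type type, int *omit, void *context)
{
	struct keep_ctx *ctx = context;

	if (type == SKETCH_TREE) {
		*omit = 0;
		return;
	}
	*omit = (ctx->blobs_seen++ % ctx->keep_every) != 0;
}

static void keep_free(void *context)
{
	free(context);
}

/* Mirrors the "name plus three callbacks" shape shown in the README. */
static const struct sketch_extension sketch_extension_keep = {
	"keep",
	keep_init,
	keep_filter_object,
	keep_free,
};

/* Drive one request; return the number of omitted objects, or -1 on init failure. */
static int sketch_run(const struct sketch_extension *ext, const char *arg,
		      const enum sketch_type *objects, int nr)
{
	void *context = NULL;
	int i, omitted = 0;

	assert(ext->init_fn && ext->filter_object_fn && ext->free_fn);
	if (ext->init_fn(arg, &context))
		return -1;
	for (i = 0; i < nr; i++) {
		int omit = 0;

		ext->filter_object_fn(objects[i], &omit, context);
		omitted += omit;
	}
	ext->free_fn(context);
	return omitted;
}
```

Calling `sketch_run(&sketch_extension_keep, "2", ...)` over two trees and five
blobs omits the second and fourth blob, so it returns 2. An argument that does
not parse to a positive number makes `keep_init` fail and the run return -1.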
Documentation/config/uploadpack.txt | 7 +-
Documentation/rev-list-options.txt | 4 +
contrib/filter-extensions/README.txt | 153 +++++++++++++++++++++++++++
3 files changed, 161 insertions(+), 3 deletions(-)
create mode 100644 contrib/filter-extensions/README.txt
diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..b2ef2421a6d 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -66,9 +66,10 @@ uploadpackfilter.allow::
uploadpackfilter.<filter>.allow::
Explicitly allow or ban the object filter corresponding to
`<filter>`, where `<filter>` may be one of: `blob:none`,
- `blob:limit`, `object:type`, `tree`, `sparse:oid`, or `combine`.
- If using combined filters, both `combine` and all of the nested
- filter kinds must be allowed. Defaults to `uploadpackfilter.allow`.
+ `blob:limit`, `object:type`, `tree`, `sparse:oid`, `combine`, or a named
+ filter extension `extension:<name>`. If using combined filters, both
+ `combine` and all of the nested filter kinds must be allowed. Defaults to
+ `uploadpackfilter.allow`.
uploadpackfilter.tree.maxDepth::
Only allow `--filter=tree:<n>` when `<n>` is no more than the value of
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index b7bd27e1713..d7a317f0aa1 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -914,6 +914,10 @@ Note that the form '--filter=sparse:path=<path>' that wants to read
from an arbitrary path on the filesystem has been dropped for security
reasons.
+
+The form '--filter=extension:<name>[=<arg>]' uses a compile-time extension
+to implement a named filter. Filter extensions may take an argument string
+which is passed via `<arg>`.
++
Multiple '--filter=' flags can be specified to combine filters. Only
objects which are accepted by every filter are included.
+
diff --git a/contrib/filter-extensions/README.txt b/contrib/filter-extensions/README.txt
new file mode 100644
index 00000000000..3d5921cda9b
--- /dev/null
+++ b/contrib/filter-extensions/README.txt
@@ -0,0 +1,153 @@
+= List-Objects-Filter Extensions API
+:pp: {plus}{plus}
+
+This API can be used to develop filter extensions that provide custom filtering
+behaviour for `git-upload-pack` and `git-rev-list`. The API is declared in
+link:../../list-objects-filter-extensions.h[list-objects-filter-extensions.h]
+and consists of three functions that together implement a filter operation.
+
+NOTE: Each filter implementing this API must be compiled into Git as a
+static library. There is some plumbing in the Makefile to help with this
+via `FILTER_EXTENSIONS`.
+
+== Overview
+
+. You write a filter and compile it into your custom build of Git.
+. A filter request is received that specifically names the filter extension
+that you have written, i.e. `--filter=extension:<name>[=<arg>]`.
+. The `init_fn` function of your filter is called.
+. The `filter_object_fn` function of your filter is called at least once
+for each object.
+. The `free_fn` function of your filter is called.
+
+== Examples
+
+*link:./rand/[`rand`]* is a filter that matches all trees and a random
+percentage of blobs, where the percentage is parsed from the filter arg. It
+imports and uses the `oid_to_hex()` and `trace_key_printf()` functions from the
+Git API.
+
+Build via:
+
+[,console]
+----
+$ make FILTER_EXTENSIONS=contrib/filter-extensions/rand/rand.a
+ ...
+ SUBDIR contrib/filter-extensions/rand
+ ...
+----
+
+We can then run it against Git's own repository:
+
+[,console]
+----
+$ ./git rev-list refs/heads/master --objects --max-count 1 --filter=extension:rand=3 --filter-print-omitted | grep -c '^~'
+filter-rand: matching 3%
+filter-rand: done: count=4068 (blob=3866 tree=202) matched=117 elapsed=0.005017s rate=810843.1/s average=1.2us
+3749 # number of omitted blobs = 3866 - 117
+----
+
+== Development
+
+See the examples for a basic implementation. The comments in
+link:../../list-objects-filter.h[`list-objects-filter.h`] and the built-in
+filter implementations in
+link:../../list-objects-filter.c[`list-objects-filter.c`] are important to
+understand how filters are implemented - `filter_blobs_limit()` provides a
+simple example, and `filter_sparse()` is more complex.
+
+The filter extension API differs from the built-in filters in two ways:
+
+. Filter extensions don't handle ``omitset``s directly; they set `omit` instead.
+. Filter extensions receive a void pointer they can use for context.
+
+== Building
+
+The Git Makefile supports this via `FILTER_EXTENSIONS`. Set it to the
+space-separated paths of the filter extension static libraries that should be
+compiled into Git.
+For example:
+
+[,console]
+----
+make FILTER_EXTENSIONS=contrib/filter-extensions/rand/rand.a
+----
+
+Filter extensions don't need to be within the Git source tree. A filter
+extension static library should either exist at the given path (i.e. `rand.a`
+should exist), or there should be a Makefile in that directory which will create
+it when `make rand.a` is run. (Such a Makefile should also have a `clean` target
+which deletes all object files and brings the directory back to its initial
+state.)
+
+The static library should define a struct of type `filter_extension` called
+`filter_extension_NAME`, where `NAME` is the name of your extension (i.e. `rand`
+for `rand.a`). See
+link:../../list-objects-filter-extensions.h[list-objects-filter-extensions.h].
+
+The definition should follow this pattern:
+
+[,C]
+----
+#include "list-objects-filter-extensions.h"
+
+/* Definitions of rand_init, rand_filter_object, rand_free ... */
+
+const struct filter_extension filter_extension_rand = {
+ "rand",
+ &rand_init,
+ &rand_filter_object,
+ &rand_free,
+};
+----
+
+(The names of your `init_fn`, `filter_object_fn` and `free_fn` are not
+important, but the string literal must again be the name of your extension:
+`"rand"` for the filter extension in `rand.a`.)
+
+You may use library functions from Git if you include the relevant Git headers,
+since the filter extensions and Git itself will be linked together into a single
+binary.
+
+You may depend on other libraries if you indicate that they are to be linked
+into the Git binary using `LDFLAGS`. See the C{pp} example below.
+
+== Developing in C{pp} (and other languages)
+
+You can develop filter extensions in C{pp}, but many Git header files are not
+compatible with modern C{pp}, so you won't be able to use Git library
+functions directly. However, you can use them if you create wrapper functions
+in C that delegate to the Git library functions you need and that are also
+C{pp} compatible. See link:./rand_cpp/[`rand_cpp`] for a simple example. A
+similar solution is to implement the extension itself in C and have it perform
+any operations that require Git library functions, while delegating the
+domain-specific work to a C wrapper API that you add to a C{pp} library that
+already contains the operations you need. In either case, remember to wrap
+any functions that must be C-compatible in `extern "C"` when declaring or
+defining them from within C{pp}.
+
+To build the C{pp} example:
+
+[,console]
+----
+make FILTER_EXTENSIONS=contrib/filter-extensions/rand_cpp/rand_cpp.a \
+ LDFLAGS=-lstdc++
+----
+
+For other languages you'll either need to port definitions of some internal Git
+structs (at a minimum, `object`, `object_id`, `repository`, and `hash_algo`),
+or, again, you could write the extension in C but have it delegate to a
+domain-specific library in the language of your choice that has a C-compatible
+API. Extra libraries can be linked in using `LDFLAGS`.
+
+== Linking more than one filter extension
+
+To link in more than one extension, set `FILTER_EXTENSIONS` to the
+space-separated paths of all the extensions you want linked. For example, to
+link in both example filters at once:
+
+[,console]
+----
+make FILTER_EXTENSIONS="contrib/filter-extensions/rand/rand.a contrib/filter-extensions/rand_cpp/rand_cpp.a" \
+ LDFLAGS=-lstdc++
+----
--
gitgitgadget