* Re: [PATCH 2/5] strbuf: factor out strbuf_expand_step()
From: René Scharfe @ 2023-06-27 16:21 UTC (permalink / raw)
To: Jeff King; +Cc: Git List
In-Reply-To: <20230627082622.GH1226768@coredump.intra.peff.net>
Am 27.06.23 um 10:26 schrieb Jeff King:
> On Sat, Jun 17, 2023 at 10:41:44PM +0200, René Scharfe wrote:
>
>> Extract the part of strbuf_expand that finds the next placeholder into a
>> new function. It allows to build parsers without callback functions and
>> the overhead imposed by them.
>>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> builtin/branch.c | 13 ++-----------
>> strbuf.c | 28 ++++++++++++++--------------
>> strbuf.h | 8 ++++++++
>> 3 files changed, 24 insertions(+), 25 deletions(-)
>
> I was a little surprised to see the change in branch.c here (as well as
> the one in strbuf_addftime), since the commit message just says
> "extract..." and doesn't mention them.
Fair point, the other extraction sources should have been mentioned as
well somehow.
> They do serve as examples of how it can be used, so I think it's OK. But
> it made me wonder: is this all of the sites? Or are these just examples?
These are all that I could find.
René
^ permalink raw reply
* Re: [PATCH 3/5] replace strbuf_expand_dict_cb() with strbuf_expand_step()
From: René Scharfe @ 2023-06-27 16:24 UTC (permalink / raw)
To: Jeff King; +Cc: Git List
In-Reply-To: <20230627083526.GA1273865@coredump.intra.peff.net>
Am 27.06.23 um 10:35 schrieb Jeff King:
> On Tue, Jun 27, 2023 at 04:29:02AM -0400, Jeff King wrote:
>
>> Your comment above does make me wonder if strbuf_expand_step() should be
>> quietly handling "%%" itself. But I guess strbuf_expand() doesn't, and
>> your branch.c quote-literal example probably would not want that
>> behavior.
>
> Er, nope, strbuf_expand() does handle "%%" itself. It really feels like
> we'd want strbuf_expand_step() to do so, too, then. Even if we had two
> variants (a "raw" one which doesn't handle "%%" so that branch.c could
> use it, and then another that wrapped it to handle "%%").
strbuf_expand() handles %% and unrecognized placeholders.
We could do that, or move the %% handling to strbuf_expand_literal or
something else. E.g. initially I had a strbuf_expand_default that
handled %% and unrecognized placeholders.
I think it's a good idea to first get used to the loop paradigm by
eschewing the sugary automatic handling and having everything spelled
out explicitly. It's just one more branch. Then we can come up with
a better factoring after a while.
René
^ permalink raw reply
* Re: [PATCH 4/5] replace strbuf_expand() with strbuf_expand_step()
From: René Scharfe @ 2023-06-27 16:31 UTC (permalink / raw)
To: Jeff King; +Cc: Git List
In-Reply-To: <20230627085422.GJ1226768@coredump.intra.peff.net>
Am 27.06.23 um 10:54 schrieb Jeff King:
> On Sat, Jun 17, 2023 at 10:43:17PM +0200, René Scharfe wrote:
>
>> Avoid the overhead of passing context to a callback function of
>> strbuf_expand() by using strbuf_expand_step() in a loop instead. It
>> requires explicit handling of %% and unrecognized placeholders, but is
>> simpler, more direct and avoids void pointers.
>
> I like this. I don't care that much about the runtime overhead of
> passing the context around, but if you meant by "overhead" the
> programmer time and confusion in stuffing everything into context
> structs, then I agree this is much better. :)
Indeed, I meant the burden of being forced to define a struct and
filling in all necessary context. Bureaucratic overhead.
> It does still feel like we should be handling "%%" on behalf of the
> callers.
I feel the same, but restrained myself from doing that, so that we
can see all the pieces for now. It allows us to recombine them in
better ways than before.
>> builtin/cat-file.c | 35 +++++++--------
>> builtin/ls-files.c | 109 +++++++++++++++++++--------------------------
>> builtin/ls-tree.c | 107 +++++++++++++++++---------------------------
>> daemon.c | 61 ++++++++-----------------
>> pretty.c | 72 ++++++++++++++++++------------
>> strbuf.c | 20 ---------
>> strbuf.h | 37 ++-------------
>> 7 files changed, 169 insertions(+), 272 deletions(-)
>
> The changes mostly looked OK to me (and the diffstat is certainly
> pleasing). The old callbacks returned a "consumed" length, and we need
> each "step" caller to now do "format += consumed" themselves. At first
> glance I thought there were cases where you didn't, but then I realized
> that you are relying on skip_prefix() to do that incrementing. Which
> makes sense and the resulting code looks nice, but it took me a minute
> to realize what was going on.
*nod* The "returns consumed length" style is still used by
strbuf_expand_literal, though, so we have a bit of a mix. Which
works, but a uniform convention might be better.
>> @@ -1894,7 +1880,26 @@ void userformat_find_requirements(const char *fmt, struct userformat_want *w)
>> return;
>> fmt = user_format;
>> }
>> - strbuf_expand(&dummy, fmt, userformat_want_item, w);
>> + while (strbuf_expand_step(&dummy, &fmt)) {
>> + if (skip_prefix(fmt, "%", &fmt))
>> + continue;
>> +
>> + if (*fmt == '+' || *fmt == '-' || *fmt == ' ')
>> + fmt++;
>> +
>> + switch (*fmt) {
>> + case 'N':
>> + w->notes = 1;
>> + break;
>> + case 'S':
>> + w->source = 1;
>> + break;
>> + case 'd':
>> + case 'D':
>> + w->decorate = 1;
>> + break;
>> + }
>> + }
>> strbuf_release(&dummy);
>> }
>
> This one actually doesn't increment the format (so we restart the
> expansion on "N" or whatever). But neither did the original! It always
> returned 0:
>
>> -static size_t userformat_want_item(struct strbuf *sb UNUSED,
>> - const char *placeholder,
>> - void *context)
>> -{
>> - struct userformat_want *w = context;
>> -
>> - if (*placeholder == '+' || *placeholder == '-' || *placeholder == ' ')
>> - placeholder++;
>> -
>> - switch (*placeholder) {
>> - case 'N':
>> - w->notes = 1;
>> - break;
>> - case 'S':
>> - w->source = 1;
>> - break;
>> - case 'd':
>> - case 'D':
>> - w->decorate = 1;
>> - break;
>> - }
>> - return 0;
>> -}
>
> So probably OK, though a little funny.
>
> It also feels like this whole function would be just as happy using
> "strchr()", since it throws away the expanded result anyway. But that
> can be for another time. :)
Good idea! And the conversion to a loop brings us halfway there.
>
>> @@ -1912,7 +1917,16 @@ void repo_format_commit_message(struct repository *r,
>> const char *output_enc = pretty_ctx->output_encoding;
>> const char *utf8 = "UTF-8";
>>
>> - strbuf_expand(sb, format, format_commit_item, &context);
>> + while (strbuf_expand_step(sb, &format)) {
>> + size_t len;
>> +
>> + if (skip_prefix(format, "%", &format))
>> + strbuf_addch(sb, '%');
>> + else if ((len = format_commit_item(sb, format, &context)))
>> + format += len;
>> + else
>> + strbuf_addch(sb, '%');
>> + }
>
> This one doesn't advance the format for a not-understood placeholder.
> But that's OK, because we know it isn't "%", so starting the search from
> there again is correct.
Right. This is copied from strbuf_expand.
>
>> @@ -1842,7 +1852,7 @@ static size_t format_commit_item(struct strbuf *sb, /* in UTF-8 */
>> }
>>
>> orig_len = sb->len;
>> - if (((struct format_commit_context *)context)->flush_type != no_flush)
>> + if ((context)->flush_type != no_flush)
>> consumed = format_and_pad_commit(sb, placeholder, context);
>> else
>> consumed = format_commit_one(sb, placeholder, context);
>
> Since we're no longer casting, the extra parentheses seem redundant now.
> I.e., this can just be context->flush_type.
Indeed.
René
^ permalink raw reply
* Re: [PATCH 5/5] strbuf: simplify strbuf_expand_literal_cb()
From: René Scharfe @ 2023-06-27 16:32 UTC (permalink / raw)
To: Jeff King; +Cc: Git List
In-Reply-To: <20230627085701.GK1226768@coredump.intra.peff.net>
Am 27.06.23 um 10:57 schrieb Jeff King:
> On Sat, Jun 17, 2023 at 10:44:00PM +0200, René Scharfe wrote:
>
>> Now that strbuf_expand_literal_cb() is no longer used as a callback,
>> drop its "_cb" name suffix and unused context parameter.
>
> Makes sense.
>
> Since most callers just call "format += len", it kind of feels like the
> appropriate interface might be more like:
>
> strbuf_expand_literal(&sb, &format);
>
> to auto-advance the format. But I guess that gets weird with this
> caller:
>
>> @@ -1395,7 +1395,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
>> char **slot;
>>
>> /* these are independent of the commit */
>> - res = strbuf_expand_literal_cb(sb, placeholder, NULL);
>> + res = strbuf_expand_literal(sb, placeholder);
>> if (res)
>> return res;
>
> which is still in the "return the length" mentality (OTOH, if it
> advanced the local copy of the placeholder pointer nobody would mind).
Yes, I only grabbed the two low-hanging fruits here.
A format-advancing version would also look a bit weird in an if/else
cascade:
else if (strbuf_expand_literal(&sb, &format))
; /* nothing */
else ...
> I dunno. It is a little thing, and I am OK with it either way (I would
> not even think of changing it if you were not already touching the
> function).
Unsure. Should I overcome my horror vacui and let the /* nothing */ in?
René
^ permalink raw reply
* Re: [PATCH v1 0/4] Remove obsolete Cygwin support from git-gui
From: Junio C Hamano @ 2023-06-27 17:52 UTC (permalink / raw)
To: Mark Levedahl; +Cc: mdl123, git, adam, me, johannes.schindelin, sunshine
In-Reply-To: <20230626165305.37488-1-mlevedahl@gmail.com>
Mark Levedahl <mlevedahl@gmail.com> writes:
> === This is an update, incorporating responses to Junio's and Eric's
> comments:
> -- clarified what the "upstream" git-gui branch is
> -- Removed some changes from patch 2 as requested by Junio, reducing
> changes in patch 3 and patch 4
> All code is fixed only after applying patch 4
> Differences in patch 3 and 4 are minimimized
> -- updated comments to clarify G4w dedicated code.
> -- updated all comments to (hopefully) clarify points of confusion
> ===
> ...
> Mark Levedahl (4):
> git gui Makefile - remove Cygwin modifications
> git-gui - remove obsolete Cygwin specific code
> git-gui - use cygstart to browse on Cygwin
> git-gui - use mkshortcut on Cygwin
>
> Makefile | 21 +------
> git-gui.sh | 118 +++-----------------------------------
> lib/choose_repository.tcl | 27 +--------
> lib/shortcut.tcl | 31 +++++-----
> 4 files changed, 27 insertions(+), 170 deletions(-)
OK, Dscho says v1 looks good, and I have no further comments.
Pratyush, can I expect that you take further comments and usher
these patches to your tree, and eventually tell me to pull from your
repository?
Thanks, all.
^ permalink raw reply
* Re: [PATCH v2 6/7] var: add attributes files locations
From: Junio C Hamano @ 2023-06-27 17:56 UTC (permalink / raw)
To: brian m. carlson; +Cc: Jeff King, git, Eric Sunshine, Derrick Stolee
In-Reply-To: <ZJsKa6ZNq7nnh0gf@tapette.crustytoothpaste.net>
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
>> So I dunno. I'm mostly giving a heads-up, and seeing if you (or other
>> reviewers in the thread) have an idea to make this "flag" thing less
>> awful. I'm also happy to just do my topic separately, and then
>> eventually circle back after yours is merged.
>
> I've picked up your patch as the first patch in the series and will send
> it out in v3 in just a few minutes. Since I plan to have v3 be the last
> round of this series, I'll let you send out any further changes as
> fixups on top of that.
Sounds like a sensible way to go, at least to me.
Thanks for working well together. I wish all conflicting topics
worked nicely with each other like these ;-)
^ permalink raw reply
* Send out for squad
From: ross thomas @ 2023-06-27 18:07 UTC (permalink / raw)
To: git@vger.kernel.org
Send battle squad for Joseph nickolas Thomas Joseph manual Thomas, Sherrie Lynn Thomas Joyce Thomas, violet rose, ray rose, dalton rose this is Ross Nicholas oneil thomas, octocat
Sent from my iPhone
^ permalink raw reply
* Re: [PATCH 2/2] for-each-ref: add --count-matches option
From: Junio C Hamano @ 2023-06-27 18:22 UTC (permalink / raw)
To: Phillip Wood
Cc: Jeff King, Derrick Stolee via GitGitGadget, git, vdye, me,
mjcheetham, Derrick Stolee
In-Reply-To: <f6fd39bc-65d4-76e3-94b4-9163194c89dd@gmail.com>
Phillip Wood <phillip.wood123@gmail.com> writes:
> I'm a bit suspicious of the massive speed up I'm seeing by avoiding
> the default format but it appears to be repeatable.
This is interesting. The default format over just 'x' is to
concatenate the refname with hexadecimal form of the object name,
while (at least as the code originally was intended):
* the cost of parsing the "--format=<format>" string into series of
atoms is one-time O(1), and there is nothing specially tricky in
there.
* the cost of computing these atoms should be miniscule, as
in the default format, the per-ref cost come from these:
- populate_value() -> get_refname() -> show_ref() for the refname
would merely be a xstrdup() of the refname.
- populate_value() -> grab_oid() -> do_grab_oid() should be using
O_FULL so there shouldn't be any find_unique_abbrev() cost.
Just binary-to-hex.
- populate_value() -> get_object() -> oid_object_info_extended() ->
grab_common_values() asks for ATOM_OBJECTTYPE that should be a
single xstrdup(), but oid_object_info_extended() needs to parse
the object, but .info.contentp ought to be NULL and we should
not be calling parse_object_buffer().
So it might be worth looking into where the time is going (didn't
Peff or somebody do that a few years ago, though?) when using the
default format and optimize or special case the codepath, but the
responses I have seen from two of you makes me suspect that the new
option is not the best general approach.
Thanks.
^ permalink raw reply
* Re: [PATCH] ll-merge: killing the external merge driver aborts the merge
From: Junio C Hamano @ 2023-06-27 19:08 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, Elijah Newren, Joshua Hudson
In-Reply-To: <59b7a582-be68-3f7b-a06f-3bd662582a1d@gmx.de>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> On Fri, 23 Jun 2023, Junio C Hamano wrote:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>> > Elijah Newren <newren@gmail.com> writes:
>> >
>> >> Reviewed-by: Elijah Newren <newren@gmail.com>
>> >
>> >
>> > Thanks for a quick review.
>>
>> Unfortunately Windows does not seem to correctly detect the aborting
Sorry, I did not mean "abort(3)" literally. What I meant was that
an external merge driver that gets spawned via the run_command()
interface may not die by calling exit()---like "killed by signal"
(including "segfaulting"). The new test script piece added in the
patch did "kill -9 $$" to kill the external merge driver itself,
which gets reported as "killed by signal" from run_command() by
returning the signal number + 128, but that did not pass Windows CI.
^ permalink raw reply
* Re: [PATCH] ll-merge: killing the external merge driver aborts the merge
From: Joshua Hudson @ 2023-06-27 19:10 UTC (permalink / raw)
To: Junio C Hamano, Johannes Schindelin; +Cc: git, Elijah Newren
In-Reply-To: <xmqqedlwhgf9.fsf@gitster.g>
On 6/27/2023 12:08 PM, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> On Fri, 23 Jun 2023, Junio C Hamano wrote:
>>
>>> Junio C Hamano <gitster@pobox.com> writes:
>>>
>>>> Elijah Newren <newren@gmail.com> writes:
>>>>
>>>>> Reviewed-by: Elijah Newren <newren@gmail.com>
>>>>
>>>> Thanks for a quick review.
>>> Unfortunately Windows does not seem to correctly detect the aborting
> Sorry, I did not mean "abort(3)" literally. What I meant was that
> an external merge driver that gets spawned via the run_command()
> interface may not die by calling exit()---like "killed by signal"
> (including "segfaulting"). The new test script piece added in the
> patch did "kill -9 $$" to kill the external merge driver itself,
> which gets reported as "killed by signal" from run_command() by
> returning the signal number + 128, but that did not pass Windows CI.
>
Do you need me to provide a windows test harness?
^ permalink raw reply
* Re: [PATCH 3/3] diff --no-index: support reading from named pipes
From: Junio C Hamano @ 2023-06-27 19:44 UTC (permalink / raw)
To: Phillip Wood
Cc: git, Jeff King, Taylor Blau, brian m. carlson,
Thomas Guyot-Sionnest
In-Reply-To: <990e71882bfdc697285c5b04b92c290679ca22ab.1687874975.git.phillip.wood@dunelm.org.uk>
Phillip Wood <phillip.wood123@gmail.com> writes:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> In some shells, such as bash and zsh, it's possible to use a command
> substitution to provide the output of a command as a file argument to
> another process, like so:
>
> diff -u <(printf "a\nb\n") <(printf "a\nc\n")
>
> However, this syntax does not produce useful results with "git diff
> --no-index". On macOS, the arguments to the command are named pipes
> under /dev/fd, and git diff doesn't know how to handle a named pipe. On
> Linux, the arguments are symlinks to pipes, so git diff "helpfully"
> diffs these symlinks, comparing their targets like "pipe:[1234]" and
> "pipe:[5678]".
>
> To address this "diff --no-index" is changed so that if a path given on
> the commandline is a named pipe or a symbolic link that resolves to a
> named pipe then we read the data to diff from that pipe. This is
> implemented by generalizing the code that already exists to handle
> reading from stdin when the user passes the path "-".
>
> As process substitution is not support by POSIX this change is tested by
> using a pipe and a symbolic link to a pipe.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> diff-no-index.c | 80 ++++++++++++++++++++++++----------------
> t/t4053-diff-no-index.sh | 25 +++++++++++++
> 2 files changed, 73 insertions(+), 32 deletions(-)
This looks good, if a bit invasive, to a cursory read, at least to
me. It is very focused to the real problem at hand, and shows that
the way we split the "no-index" mode out to its own implementation
of filespec population code does make sense.
> -static void populate_from_stdin(struct diff_filespec *s)
> +static void populate_from_pipe(struct diff_filespec *s, int is_stdin)
> {
> struct strbuf buf = STRBUF_INIT;
> size_t size = 0;
> + int fd = 0;
>
> - if (strbuf_read(&buf, 0, 0) < 0)
> + if (!is_stdin)
> + fd = xopen(s->path, O_RDONLY);
> + if (strbuf_read(&buf, fd, 0) < 0)
> die_errno("error while reading from stdin");
> + if (!is_stdin)
> + close(fd);
Given that the error message explicitly says "stdin", and there are
many "if ([!]is_stdin)" sprinkled in the code, I actually suspect
that there should be two separate helpers, one for stdin and one for
non-stdin pipe. It is especially true since there is only one
caller that does this:
> + if (is_pipe)
> + populate_from_pipe(s, name == file_from_standard_input);
which can be
if (is_pipe) {
if (name == file_from_standard_input)
populate_from_stdin(s);
else
populate_from_pipe(s);
}
without losing clarity. The code that you are sharing by forcing
them to be a single helper to wrap up a handful of members in the s
structure can become its own helper that is called from these two
helper functions.
> static int queue_diff(struct diff_options *o,
> - const char *name1, const char *name2)
> + const char *name1, int is_pipe1,
> + const char *name2, int is_pipe2)
> {
> int mode1 = 0, mode2 = 0;
>
> - if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
> + if (get_mode(name1, is_pipe1, &mode1) ||
> + get_mode(name2, is_pipe2, &mode2))
> return -1;
Makes me wonder why the caller of queue_diff() even needs to know if
these two names are pipes; we are calling get_mode() which would run
stat(2) anyway, and the result from stat(2) is what you use (in the
caller) to determine the values of is_pipeN. Wouldn't it be more
appropriate to leave the caller oblivious of special casing of the
pipes and let get_mode() handle this? After all, that is how the
existing code special cases the standard input so there is a strong
precedence.
If we go that route, it may make sense to further isolate the
"address comparison" trick used for the standard input mode.
Perhaps we can and do something like
static int get_mode(const char *path, int *mode, int *special)
{
struct stat st;
+ *special = 0; /* default - nothing special */
...
else if (path == file_from_standard_input) {
*mode = create_ce_mode(0666);
+ *pipe_kind = 1; /* STDIN */
+ } else if (stat(path, &st)) {
+ ... error ...
+ } else if (S_ISFIFO(st.st_mode)) {
+ *mode = create_ce_mode(0666);
+ *pipe_kind = 2; /* FIFO */
} else if (lstat(path, &st)) {
... error ...
} else {
*mode = st.st_mode;
}
and have the caller act on "special" to choose among calling
populate_from_stdin(), populate_from_pipe(), or do nothing for
the regular files?
Side note: this has an added benefit of highlighting that we do
stat() and lstat() because of dereferencing. What I suspect is
that "git diff --no-index" mode was primarily to give Git
niceties like rename detection and diff algorithms to those who
wanted to use in contexts (i.e. contents not tracked by Git)
they use "diff" by other people like GNU, without bothering to
update "diff" by other people. I further suspect that "compare
the readlink contents", which is very much necessary within the
Git context, may not fall into the "Git niceties" when they
invoke "--no-index" mode. Which leads me to imagine a future
direction where we only use stat() and not lstat() in the
"--no-index" codepath. Having everything including these
lstat() and stat() calls inside get_mode() will allow such a
future transition hopefully simpler.
I do not quite see why you decided to move the "is_dir" processing
up and made the caller responsible. Specifically,
> - fixup_paths(paths, &replacement);
> + if (!is_pipe[0] && !is_pipe[1])
> + fixup_paths(paths, is_dir, &replacement);
this seems fishy when one side is pipe and the other one is not.
When the user says
$ git diff --no-index <(command) path
fixup_paths() are bypassed because one of them is pipe. It makes me
suspect that it should be an error if "path" is a directory. I do
not know if fixup_paths() is the best place for doing such checking,
but somebody should be doing that, no?
^ permalink raw reply
* [RFC PATCH 0/8] Introduce Git Standard Library
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
Introduction / Pre-reading
================
The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.
This cover letter will explain the rationale behind having a root
dependency that encompasses many files in the form of a standard library
rather than many root dependencies/libraries of those files. This does
not mean that the Git Standard Library will be the only possible root
dependency in the future, but rather the most significant and widely
used one. I will also explain why each file was chosen to be a part of
Git Standard Library v1. I will not explain entirely why we would like
to libify parts of Git -- see here[1] for that context.
Before looking at this series, it probably makes sense to look at the
other series that this is built on top of since that is the state I will
be referring to in this cover letter:
- Elijah's final cache.h cleanup series[2]
- my strbuf cleanup series[3]
- my git-compat-util cleanup series[4]
Most importantly, in the git-compat-util series, the declarations for
functions implemented in wrapper.c and usage.c have been moved to their
respective header files, wrapper.h and usage.h, from git-compat-util.h.
Also config.[ch] had its general parsing code moved to parse.[ch].
Dependency graph in libified Git
================
If you look in the Git Makefile, all of the objects defined in the Git
library are compiled and archived into a singular file, libgit.a, which
is linked against by common-main.o with other external dependencies and
turned into the Git executable. In other words, the Git executable has
dependencies on libgit.a and a couple of external libraries. While our
efforts to libify Git will not affect this current build flow, it will
provide an alternate method for building Git.
With our current method of building Git, we can imagine the dependency
graph as such:
Git
/\
/ \
/ \
libgit.a ext deps
In libifying parts of Git, we want to shrink the dependency graph to
only the minimal set of dependencies, so libraries should not use
libgit.a. Instead, it would look like:
Git
/\
/ \
/ \
libgit.a ext deps
/\
/ \
/ \
object-store.a (other lib)
| /
| /
| /
config.a /
| /
| /
| /
git-std-lib.a
Instead of containing all of the objects in Git, libgit.a would contain
objects that are not built by libraries it links against. Consequently,
if someone wanted their own custom build of Git with their own custom
implementation of the object store, they would only have to swap out
object-store.a rather than do a hard fork of Git.
Rationale behind Git Standard Library
================
The rationale behind Git Standard Library essentially is the result of
two observations within the Git codebase: every file includes
git-compat-util.h which defines functions in a couple of different
files, and wrapper.c + usage.c have difficult-to-separate circular
dependencies with each other and other files.
Ubiquity of git-compat-util.h and circular dependencies
========
Every file in the Git codebase includes git-compat-util.h. It serves as
"a compatibility aid that isolates the knowledge of platform specific
inclusion order and what feature macros to define before including which
system header" (Junio[5]). Since every file includes git-compat-util.h, and
git-compat-util.h includes wrapper.h and usage.h, it would make sense
for wrapper.c and usage.c to be a part of the root library. They have
difficult to separate circular dependencies with each other so they
can't be independent libraries. Wrapper.c has dependencies on parse.c,
abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
wrapper.c -- more circular dependencies.
Tradeoff between swappability and refactoring
========
From the above dependency graph, we can see that git-std-lib.a could be
many smaller libraries rather than a singular library. So why choose a
singular library when multiple libraries can be individually easier to
swap and are more modular? A singular library requires less work to
separate out circular dependencies within itself so it becomes a
tradeoff question between work and reward. While there may be a point in
the future where a file like usage.c would want its own library so that
someone can have custom die() or error(), the work required to refactor
out the circular dependencies in some files would be enormous due to
their ubiquity so therefore I believe it is not worth the tradeoff
currently. Additionally, we can in the future choose to do this refactor
and change the API for the library if there becomes enough of a reason
to do so (remember we are avoiding promising stability of the interfaces
of those libraries).
Reuse of compatibility functions in git-compat-util.h
========
Most functions defined in git-compat-util.h are implemented in compat/
and have dependencies limited to strbuf.h and wrapper.h so they can be
easily included in git-std-lib.a, which as a root dependency means that
higher level libraries do not have to worry about compatibility files in
compat/. The rest of the functions defined in git-compat-util.h are
implemented in top level files and, in this patch set, are hidden behind
an #ifdef if their implementation is not in git-std-lib.a.
Rationale summary
========
The Git Standard Library allows us to get the libification ball rolling
with other libraries in Git (such as Glen's removal of global state from
config iteration[6] prepares a config library). By not spending many
more months attempting to refactor difficult circular dependencies and
instead spending that time getting to a state where we can test out
swapping a library out such as config or object store, we can prove the
viability of Git libification on a much faster time scale. Additionally
the code cleanups that have happened so far have been minor and
beneficial for the codebase. It is probable that making large movements
would negatively affect code clarity.
Git Standard Library boundary
================
While I have described above some useful heuristics for identifying
potential candidates for git-std-lib.a, a standard library should not
have a shaky definition for what belongs in it.
- Low-level files (aka operates only on other primitive types) that are
used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
- Dependencies that are low-level and widely used
(abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
- low-level git/* files with functions defined in git-compat-util.h
(ctype.c)
- compat/*
There are other files that might fit this definition, but that does not
mean it should belong in git-std-lib.a. Those files should start as
their own separate library since any file added to git-std-lib.a loses
its flexibility of being easily swappable.
Files inside of Git Standard Library
================
The initial set of files in git-std-lib.a are:
abspath.c
ctype.c
date.c
hex-ll.c
parse.c
strbuf.c
usage.c
utf8.c
wrapper.c
relevant compat/ files
Pitfalls
================
In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
certain function headers. As other parts of Git are libified, if we
have to use more ifdefs for each different library, then the codebase
will become uglier and harder to understand.
There are a small amount of files under compat/* that have dependencies
not inside of git-std-lib.a. While those functions are not called on
Linux, other OSes might call those problematic functions. I don't see
this as a major problem, just moreso an observation that libification in
general may also require some minor compatibility work in the future.
Testing
================
Patch 8 introduces a temporary test file which will be replaced with
unit tests once a unit testing framework is decided upon[7]. It simply
proves that all of the functions in git-std-lib.a do not have any
missing dependencies and can stand up by itself.
I have not yet tested building Git with git-std-lib.a yet (basically
removing the objects in git-std-lib.a from LIB_OBJS and linking against
git-std-lib.a instead), but I intend on testing this in a future version
of this patch. As an RFC, I want to showcase git-std-lib.a as an
experimental dependency that other executables can include in order to
use Git binaries. Internally we have tested building and calling
functions in git-std-lib.a from other programs.
Unit tests should catch any breakages caused by changes to files in
git-std-lib.a (i.e. introduction of a out of scope dependency) and new
functions introduced to git-std-lib.a will require unit tests written
for them.
Series structure
================
While my strbuf and git-compat-util series can stand alone, they also
function as preparatory patches for this series. There are more cleanup
patches in this series, but since most of them have marginal benefits
probably not worth the churn on its own, I decided not to split them
into a separate series like with strbuf and git-compat-util. As an RFC,
I am looking for comments on whether the rationale behind git-std-lib
makes sense as well as whether there are better ways to build and enable
git-std-lib in patch 7, specifically regarding Makefile rules and the
usage of ifdef's to stub out certain functions and headers.
The patch series is structured as follows:
Patches 1-6 are cleanup patches to remove the last few extraneous
dependencies from git-std-lib.a. Here's a short summary of the
dependencies that are specifically removed from git-std-lib.a since some
of the commit messages and diffs showcase dependency cleanups for other
files not directly related to git-std-lib.a:
- Patch 1 removes trace2.h and repository.h dependencies from wrapper.c
- Patch 2 removes the repository.h dependency from strbuf.c inherited from
hex.c by separating it into hex-ll.c and hex.c
- Patch 3 removes the object.h dependency from wrapper.c
- Patch 4 is a bug fix that sets up the next patch. This importantly
removes the git_config_bool() call from git_env_bool() so that env
parsing can go in a separate file
- Patch 5 removes the config.h dependency from wrapper.c and swaps it
with a dependency to parse.h, which doesn't have extraneous
dependencies to files outside of git-std-lib.a
- Patch 6 removes the pager.h dependency from date.c
Patch 7 introduces Git standard library.
Patch 8 introduces a temporary test file for Git standard library. The
test file directly or indirectly calls all functions in git-std-lib.a to
showcase that the functions don't reference missing objects and that
git-std-lib.a can stand on its own.
[1] https://lore.kernel.org/git/CAJoAoZ=Cig_kLocxKGax31sU7Xe4==BGzC__Bg2_pr7krNq6MA@mail.gmail.com/
[2] https://lore.kernel.org/git/pull.1525.v3.git.1684218848.gitgitgadget@gmail.com/
[3] https://lore.kernel.org/git/20230606194720.2053551-1-calvinwan@google.com/
[4] https://lore.kernel.org/git/20230606170711.912972-1-calvinwan@google.com/
[5] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
[6] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com/
[7] https://lore.kernel.org/git/8afdb215d7e10ca16a2ce8226b4127b3d8a2d971.1686352386.git.steadmon@google.com/
Calvin Wan (8):
trace2: log fsync stats in trace2 rather than wrapper
hex-ll: split out functionality from hex
object: move function to object.c
config: correct bad boolean env value error message
parse: create new library for parsing strings and env values
pager: remove pager_in_use()
git-std-lib: introduce git standard library
git-std-lib: add test file to call git-std-lib.a functions
Documentation/technical/git-std-lib.txt | 182 ++++++++++++++++++
Makefile | 30 ++-
attr.c | 2 +-
builtin/log.c | 2 +-
color.c | 4 +-
column.c | 2 +-
config.c | 173 +----------------
config.h | 14 +-
date.c | 4 +-
git-compat-util.h | 7 +-
git.c | 2 +-
hex-ll.c | 49 +++++
hex-ll.h | 27 +++
hex.c | 47 -----
hex.h | 24 +--
mailinfo.c | 2 +-
object.c | 5 +
object.h | 6 +
pack-objects.c | 2 +-
pack-revindex.c | 2 +-
pager.c | 5 -
pager.h | 1 -
parse-options.c | 3 +-
parse.c | 182 ++++++++++++++++++
parse.h | 20 ++
pathspec.c | 2 +-
preload-index.c | 2 +-
progress.c | 2 +-
prompt.c | 2 +-
rebase.c | 2 +-
strbuf.c | 2 +-
symlinks.c | 2 +
t/Makefile | 4 +
t/helper/test-env-helper.c | 2 +-
t/stdlib-test.c | 239 ++++++++++++++++++++++++
trace2.c | 13 ++
trace2.h | 5 +
unpack-trees.c | 2 +-
url.c | 2 +-
urlmatch.c | 2 +-
usage.c | 8 +
wrapper.c | 25 +--
wrapper.h | 9 +-
write-or-die.c | 2 +-
44 files changed, 813 insertions(+), 311 deletions(-)
create mode 100644 Documentation/technical/git-std-lib.txt
create mode 100644 hex-ll.c
create mode 100644 hex-ll.h
create mode 100644 parse.c
create mode 100644 parse.h
create mode 100644 t/stdlib-test.c
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply
* [RFC PATCH 1/8] trace2: log fsync stats in trace2 rather than wrapper
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
As a library boundary, wrapper.c should not directly log trace2
statistics, but instead provide those statistics upon
request. Therefore, move the trace2 logging code to trace2.[ch.]. This
also allows wrapper.c to not be dependent on trace2.h and repository.h.
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
trace2.c | 13 +++++++++++++
trace2.h | 5 +++++
wrapper.c | 17 ++++++-----------
wrapper.h | 4 ++--
4 files changed, 26 insertions(+), 13 deletions(-)
diff --git a/trace2.c b/trace2.c
index 0efc4e7b95..f367a1ce31 100644
--- a/trace2.c
+++ b/trace2.c
@@ -915,3 +915,16 @@ const char *trace2_session_id(void)
{
return tr2_sid_get();
}
+
+static void log_trace_fsync_if(const char *key)
+{
+ intmax_t value = get_trace_git_fsync_stats(key);
+ if (value)
+ trace2_data_intmax("fsync", the_repository, key, value);
+}
+
+void trace_git_fsync_stats(void)
+{
+ log_trace_fsync_if("fsync/writeout-only");
+ log_trace_fsync_if("fsync/hardware-flush");
+}
diff --git a/trace2.h b/trace2.h
index 4ced30c0db..689e9a4027 100644
--- a/trace2.h
+++ b/trace2.h
@@ -581,4 +581,9 @@ void trace2_collect_process_info(enum trace2_process_info_reason reason);
const char *trace2_session_id(void);
+/*
+ * Writes out trace statistics for fsync
+ */
+void trace_git_fsync_stats(void);
+
#endif /* TRACE2_H */
diff --git a/wrapper.c b/wrapper.c
index 22be9812a7..bd7f0a9752 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -6,9 +6,7 @@
#include "config.h"
#include "gettext.h"
#include "object.h"
-#include "repository.h"
#include "strbuf.h"
-#include "trace2.h"
static intmax_t count_fsync_writeout_only;
static intmax_t count_fsync_hardware_flush;
@@ -600,16 +598,13 @@ int git_fsync(int fd, enum fsync_action action)
}
}
-static void log_trace_fsync_if(const char *key, intmax_t value)
+intmax_t get_trace_git_fsync_stats(const char *key)
{
- if (value)
- trace2_data_intmax("fsync", the_repository, key, value);
-}
-
-void trace_git_fsync_stats(void)
-{
- log_trace_fsync_if("fsync/writeout-only", count_fsync_writeout_only);
- log_trace_fsync_if("fsync/hardware-flush", count_fsync_hardware_flush);
+ if (!strcmp(key, "fsync/writeout-only"))
+ return count_fsync_writeout_only;
+ if (!strcmp(key, "fsync/hardware-flush"))
+ return count_fsync_hardware_flush;
+ return 0;
}
static int warn_if_unremovable(const char *op, const char *file, int rc)
diff --git a/wrapper.h b/wrapper.h
index c85b1328d1..db1bc109ed 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -88,9 +88,9 @@ enum fsync_action {
int git_fsync(int fd, enum fsync_action action);
/*
- * Writes out trace statistics for fsync using the trace2 API.
+ * Returns trace statistics for fsync using the trace2 API.
*/
-void trace_git_fsync_stats(void);
+intmax_t get_trace_git_fsync_stats(const char *key);
/*
* Preserves errno, prints a message, but gives no warning for ENOENT.
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 2/8] hex-ll: split out functionality from hex
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
Separate out hex functionality that doesn't require a hash algo into
hex-ll.[ch]. Since the hash algo is currently a global that sits in
repository, this separation removes that dependency for files that only
need basic hex manipulation functions.
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
Makefile | 1 +
color.c | 2 +-
hex-ll.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
hex-ll.h | 27 +++++++++++++++++++++++++++
hex.c | 47 -----------------------------------------------
hex.h | 24 +-----------------------
mailinfo.c | 2 +-
strbuf.c | 2 +-
url.c | 2 +-
urlmatch.c | 2 +-
10 files changed, 83 insertions(+), 75 deletions(-)
create mode 100644 hex-ll.c
create mode 100644 hex-ll.h
diff --git a/Makefile b/Makefile
index 045e2187c4..83b385b0be 100644
--- a/Makefile
+++ b/Makefile
@@ -1040,6 +1040,7 @@ LIB_OBJS += hash-lookup.o
LIB_OBJS += hashmap.o
LIB_OBJS += help.o
LIB_OBJS += hex.o
+LIB_OBJS += hex-ll.o
LIB_OBJS += hook.o
LIB_OBJS += ident.o
LIB_OBJS += json-writer.o
diff --git a/color.c b/color.c
index 83abb11eda..f3c0a4659b 100644
--- a/color.c
+++ b/color.c
@@ -3,7 +3,7 @@
#include "color.h"
#include "editor.h"
#include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
#include "pager.h"
#include "strbuf.h"
diff --git a/hex-ll.c b/hex-ll.c
new file mode 100644
index 0000000000..4d7ece1de5
--- /dev/null
+++ b/hex-ll.c
@@ -0,0 +1,49 @@
+#include "git-compat-util.h"
+#include "hex-ll.h"
+
+const signed char hexval_table[256] = {
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 00-07 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 08-0f */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 10-17 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 18-1f */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 20-27 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 28-2f */
+ 0, 1, 2, 3, 4, 5, 6, 7, /* 30-37 */
+ 8, 9, -1, -1, -1, -1, -1, -1, /* 38-3f */
+ -1, 10, 11, 12, 13, 14, 15, -1, /* 40-47 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 48-4f */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 50-57 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 58-5f */
+ -1, 10, 11, 12, 13, 14, 15, -1, /* 60-67 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 68-67 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 70-77 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 78-7f */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 80-87 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 88-8f */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 90-97 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* 98-9f */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* a0-a7 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* a8-af */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* b0-b7 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* b8-bf */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* c0-c7 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* c8-cf */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* d0-d7 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* d8-df */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* e0-e7 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* e8-ef */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* f0-f7 */
+ -1, -1, -1, -1, -1, -1, -1, -1, /* f8-ff */
+};
+
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
+{
+ for (; len; len--, hex += 2) {
+ unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
+
+ if (val & ~0xff)
+ return -1;
+ *binary++ = val;
+ }
+ return 0;
+}
diff --git a/hex-ll.h b/hex-ll.h
new file mode 100644
index 0000000000..a381fa8556
--- /dev/null
+++ b/hex-ll.h
@@ -0,0 +1,27 @@
+#ifndef HEX_LL_H
+#define HEX_LL_H
+
+extern const signed char hexval_table[256];
+static inline unsigned int hexval(unsigned char c)
+{
+ return hexval_table[c];
+}
+
+/*
+ * Convert two consecutive hexadecimal digits into a char. Return a
+ * negative value on error. Don't run over the end of short strings.
+ */
+static inline int hex2chr(const char *s)
+{
+ unsigned int val = hexval(s[0]);
+ return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
+}
+
+/*
+ * Read `len` pairs of hexadecimal digits from `hex` and write the
+ * values to `binary` as `len` bytes. Return 0 on success, or -1 if
+ * the input does not consist of hex digits).
+ */
+int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
+
+#endif
diff --git a/hex.c b/hex.c
index 7bb440e794..03e55841ed 100644
--- a/hex.c
+++ b/hex.c
@@ -2,53 +2,6 @@
#include "hash.h"
#include "hex.h"
-const signed char hexval_table[256] = {
- -1, -1, -1, -1, -1, -1, -1, -1, /* 00-07 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 08-0f */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 10-17 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 18-1f */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 20-27 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 28-2f */
- 0, 1, 2, 3, 4, 5, 6, 7, /* 30-37 */
- 8, 9, -1, -1, -1, -1, -1, -1, /* 38-3f */
- -1, 10, 11, 12, 13, 14, 15, -1, /* 40-47 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 48-4f */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 50-57 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 58-5f */
- -1, 10, 11, 12, 13, 14, 15, -1, /* 60-67 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 68-67 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 70-77 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 78-7f */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 80-87 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 88-8f */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 90-97 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* 98-9f */
- -1, -1, -1, -1, -1, -1, -1, -1, /* a0-a7 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* a8-af */
- -1, -1, -1, -1, -1, -1, -1, -1, /* b0-b7 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* b8-bf */
- -1, -1, -1, -1, -1, -1, -1, -1, /* c0-c7 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* c8-cf */
- -1, -1, -1, -1, -1, -1, -1, -1, /* d0-d7 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* d8-df */
- -1, -1, -1, -1, -1, -1, -1, -1, /* e0-e7 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* e8-ef */
- -1, -1, -1, -1, -1, -1, -1, -1, /* f0-f7 */
- -1, -1, -1, -1, -1, -1, -1, -1, /* f8-ff */
-};
-
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len)
-{
- for (; len; len--, hex += 2) {
- unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-
- if (val & ~0xff)
- return -1;
- *binary++ = val;
- }
- return 0;
-}
-
static int get_hash_hex_algop(const char *hex, unsigned char *hash,
const struct git_hash_algo *algop)
{
diff --git a/hex.h b/hex.h
index 7df4b3c460..c07c8b34c2 100644
--- a/hex.h
+++ b/hex.h
@@ -2,22 +2,7 @@
#define HEX_H
#include "hash-ll.h"
-
-extern const signed char hexval_table[256];
-static inline unsigned int hexval(unsigned char c)
-{
- return hexval_table[c];
-}
-
-/*
- * Convert two consecutive hexadecimal digits into a char. Return a
- * negative value on error. Don't run over the end of short strings.
- */
-static inline int hex2chr(const char *s)
-{
- unsigned int val = hexval(s[0]);
- return (val & ~0xf) ? val : (val << 4) | hexval(s[1]);
-}
+#include "hex-ll.h"
/*
* Try to read a SHA1 in hexadecimal format from the 40 characters
@@ -32,13 +17,6 @@ int get_oid_hex(const char *hex, struct object_id *sha1);
/* Like get_oid_hex, but for an arbitrary hash algorithm. */
int get_oid_hex_algop(const char *hex, struct object_id *oid, const struct git_hash_algo *algop);
-/*
- * Read `len` pairs of hexadecimal digits from `hex` and write the
- * values to `binary` as `len` bytes. Return 0 on success, or -1 if
- * the input does not consist of hex digits).
- */
-int hex_to_bytes(unsigned char *binary, const char *hex, size_t len);
-
/*
* Convert a binary hash in "unsigned char []" or an object name in
* "struct object_id *" to its hex equivalent. The `_r` variant is reentrant,
diff --git a/mailinfo.c b/mailinfo.c
index 2aeb20e5e6..eb34c30be7 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1,7 +1,7 @@
#include "git-compat-util.h"
#include "config.h"
#include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
#include "utf8.h"
#include "strbuf.h"
#include "mailinfo.h"
diff --git a/strbuf.c b/strbuf.c
index 8dac52b919..a2a05fe168 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,6 @@
#include "git-compat-util.h"
#include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
#include "strbuf.h"
#include "string-list.h"
#include "utf8.h"
diff --git a/url.c b/url.c
index 2e1a9f6fee..282b12495a 100644
--- a/url.c
+++ b/url.c
@@ -1,5 +1,5 @@
#include "git-compat-util.h"
-#include "hex.h"
+#include "hex-ll.h"
#include "strbuf.h"
#include "url.h"
diff --git a/urlmatch.c b/urlmatch.c
index eba0bdd77f..f1aa87d1dd 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -1,6 +1,6 @@
#include "git-compat-util.h"
#include "gettext.h"
-#include "hex.h"
+#include "hex-ll.h"
#include "strbuf.h"
#include "urlmatch.h"
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 3/8] object: move function to object.c
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
While remove_or_warn() is a simple ternary operator to call two other
wrapper functions, it creates an unnecessary dependency to object.h in
wrapper.c. Therefore move the function to object.[ch] where the concept
of GITLINKs is first defined.
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
object.c | 5 +++++
object.h | 6 ++++++
wrapper.c | 6 ------
wrapper.h | 5 -----
4 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/object.c b/object.c
index 60f954194f..cb29fcc304 100644
--- a/object.c
+++ b/object.c
@@ -617,3 +617,8 @@ void parsed_object_pool_clear(struct parsed_object_pool *o)
FREE_AND_NULL(o->object_state);
FREE_AND_NULL(o->shallow_stat);
}
+
+int remove_or_warn(unsigned int mode, const char *file)
+{
+ return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
+}
diff --git a/object.h b/object.h
index 5871615fee..e908ef6515 100644
--- a/object.h
+++ b/object.h
@@ -284,4 +284,10 @@ void clear_object_flags(unsigned flags);
*/
void repo_clear_commit_marks(struct repository *r, unsigned int flags);
+/*
+ * Calls the correct function out of {unlink,rmdir}_or_warn based on
+ * the supplied file mode.
+ */
+int remove_or_warn(unsigned int mode, const char *path);
+
#endif /* OBJECT_H */
diff --git a/wrapper.c b/wrapper.c
index bd7f0a9752..62c04aeb17 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -5,7 +5,6 @@
#include "abspath.h"
#include "config.h"
#include "gettext.h"
-#include "object.h"
#include "strbuf.h"
static intmax_t count_fsync_writeout_only;
@@ -642,11 +641,6 @@ int rmdir_or_warn(const char *file)
return warn_if_unremovable("rmdir", file, rmdir(file));
}
-int remove_or_warn(unsigned int mode, const char *file)
-{
- return S_ISGITLINK(mode) ? rmdir_or_warn(file) : unlink_or_warn(file);
-}
-
static int access_error_is_ok(int err, unsigned flag)
{
return (is_missing_file_error(err) ||
diff --git a/wrapper.h b/wrapper.h
index db1bc109ed..166740ae60 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -111,11 +111,6 @@ int unlink_or_msg(const char *file, struct strbuf *err);
* not exist.
*/
int rmdir_or_warn(const char *path);
-/*
- * Calls the correct function out of {unlink,rmdir}_or_warn based on
- * the supplied file mode.
- */
-int remove_or_warn(unsigned int mode, const char *path);
/*
* Call access(2), but warn for any error except "missing file"
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 4/8] config: correct bad boolean env value error message
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
An incorrectly defined boolean environment value would result in the
following error message:
bad boolean config value '%s' for '%s'
This is a misnomer since environment value != config value. Instead of
calling git_config_bool() to parse the environment value, mimic the
functionality inside of git_config_bool() but with the correct error
message.
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
config.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/config.c b/config.c
index 09851a6909..5b71ef1624 100644
--- a/config.c
+++ b/config.c
@@ -2172,7 +2172,14 @@ void git_global_config(char **user_out, char **xdg_out)
int git_env_bool(const char *k, int def)
{
const char *v = getenv(k);
- return v ? git_config_bool(k, v) : def;
+ int val;
+ if (!v)
+ return def;
+ val = git_parse_maybe_bool(v);
+ if (val < 0)
+ die(_("bad boolean environment value '%s' for '%s'"),
+ v, k);
+ return val;
}
/*
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 5/8] parse: create new library for parsing strings and env values
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
While string and environment value parsing is mainly consumed by
config.c, there are other files that only need parsing functionality and
not config functionality. By separating out string and environment value
parsing from config, those files can instead be dependent on parse,
which has a much smaller dependency chain than config.
Move general string and env parsing functions from config.[ch] to
parse.[ch].
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
Makefile | 1 +
attr.c | 2 +-
config.c | 180 +-----------------------------------
config.h | 14 +--
pack-objects.c | 2 +-
pack-revindex.c | 2 +-
parse-options.c | 3 +-
parse.c | 182 +++++++++++++++++++++++++++++++++++++
parse.h | 20 ++++
pathspec.c | 2 +-
preload-index.c | 2 +-
progress.c | 2 +-
prompt.c | 2 +-
rebase.c | 2 +-
t/helper/test-env-helper.c | 2 +-
unpack-trees.c | 2 +-
wrapper.c | 2 +-
write-or-die.c | 2 +-
18 files changed, 219 insertions(+), 205 deletions(-)
create mode 100644 parse.c
create mode 100644 parse.h
diff --git a/Makefile b/Makefile
index 83b385b0be..e9ad9f9ef1 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ LIB_OBJS += pack-write.o
LIB_OBJS += packfile.o
LIB_OBJS += pager.o
LIB_OBJS += parallel-checkout.o
+LIB_OBJS += parse.o
LIB_OBJS += parse-options-cb.o
LIB_OBJS += parse-options.o
LIB_OBJS += patch-delta.o
diff --git a/attr.c b/attr.c
index e9c81b6e07..cb047b4618 100644
--- a/attr.c
+++ b/attr.c
@@ -7,7 +7,7 @@
*/
#include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
#include "environment.h"
#include "exec-cmd.h"
#include "attr.h"
diff --git a/config.c b/config.c
index 5b71ef1624..cdd70999aa 100644
--- a/config.c
+++ b/config.c
@@ -11,6 +11,7 @@
#include "date.h"
#include "branch.h"
#include "config.h"
+#include "parse.h"
#include "convert.h"
#include "environment.h"
#include "gettext.h"
@@ -1204,129 +1205,6 @@ static int git_parse_source(struct config_source *cs, config_fn_t fn,
return error_return;
}
-static uintmax_t get_unit_factor(const char *end)
-{
- if (!*end)
- return 1;
- else if (!strcasecmp(end, "k"))
- return 1024;
- else if (!strcasecmp(end, "m"))
- return 1024 * 1024;
- else if (!strcasecmp(end, "g"))
- return 1024 * 1024 * 1024;
- return 0;
-}
-
-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
-{
- if (value && *value) {
- char *end;
- intmax_t val;
- intmax_t factor;
-
- if (max < 0)
- BUG("max must be a positive integer");
-
- errno = 0;
- val = strtoimax(value, &end, 0);
- if (errno == ERANGE)
- return 0;
- if (end == value) {
- errno = EINVAL;
- return 0;
- }
- factor = get_unit_factor(end);
- if (!factor) {
- errno = EINVAL;
- return 0;
- }
- if ((val < 0 && -max / factor > val) ||
- (val > 0 && max / factor < val)) {
- errno = ERANGE;
- return 0;
- }
- val *= factor;
- *ret = val;
- return 1;
- }
- errno = EINVAL;
- return 0;
-}
-
-static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
-{
- if (value && *value) {
- char *end;
- uintmax_t val;
- uintmax_t factor;
-
- /* negative values would be accepted by strtoumax */
- if (strchr(value, '-')) {
- errno = EINVAL;
- return 0;
- }
- errno = 0;
- val = strtoumax(value, &end, 0);
- if (errno == ERANGE)
- return 0;
- if (end == value) {
- errno = EINVAL;
- return 0;
- }
- factor = get_unit_factor(end);
- if (!factor) {
- errno = EINVAL;
- return 0;
- }
- if (unsigned_mult_overflows(factor, val) ||
- factor * val > max) {
- errno = ERANGE;
- return 0;
- }
- val *= factor;
- *ret = val;
- return 1;
- }
- errno = EINVAL;
- return 0;
-}
-
-int git_parse_int(const char *value, int *ret)
-{
- intmax_t tmp;
- if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
- return 0;
- *ret = tmp;
- return 1;
-}
-
-static int git_parse_int64(const char *value, int64_t *ret)
-{
- intmax_t tmp;
- if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
- return 0;
- *ret = tmp;
- return 1;
-}
-
-int git_parse_ulong(const char *value, unsigned long *ret)
-{
- uintmax_t tmp;
- if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
- return 0;
- *ret = tmp;
- return 1;
-}
-
-int git_parse_ssize_t(const char *value, ssize_t *ret)
-{
- intmax_t tmp;
- if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
- return 0;
- *ret = tmp;
- return 1;
-}
-
static int reader_config_name(struct config_reader *reader, const char **out);
static int reader_origin_type(struct config_reader *reader,
enum config_origin_type *type);
@@ -1404,23 +1282,6 @@ ssize_t git_config_ssize_t(const char *name, const char *value)
return ret;
}
-static int git_parse_maybe_bool_text(const char *value)
-{
- if (!value)
- return 1;
- if (!*value)
- return 0;
- if (!strcasecmp(value, "true")
- || !strcasecmp(value, "yes")
- || !strcasecmp(value, "on"))
- return 1;
- if (!strcasecmp(value, "false")
- || !strcasecmp(value, "no")
- || !strcasecmp(value, "off"))
- return 0;
- return -1;
-}
-
static const struct fsync_component_name {
const char *name;
enum fsync_component component_bits;
@@ -1495,16 +1356,6 @@ static enum fsync_component parse_fsync_components(const char *var, const char *
return (current & ~negative) | positive;
}
-int git_parse_maybe_bool(const char *value)
-{
- int v = git_parse_maybe_bool_text(value);
- if (0 <= v)
- return v;
- if (git_parse_int(value, &v))
- return !!v;
- return -1;
-}
-
int git_config_bool_or_int(const char *name, const char *value, int *is_bool)
{
int v = git_parse_maybe_bool_text(value);
@@ -2165,35 +2016,6 @@ void git_global_config(char **user_out, char **xdg_out)
*xdg_out = xdg_config;
}
-/*
- * Parse environment variable 'k' as a boolean (in various
- * possible spellings); if missing, use the default value 'def'.
- */
-int git_env_bool(const char *k, int def)
-{
- const char *v = getenv(k);
- int val;
- if (!v)
- return def;
- val = git_parse_maybe_bool(v);
- if (val < 0)
- die(_("bad boolean environment value '%s' for '%s'"),
- v, k);
- return val;
-}
-
-/*
- * Parse environment variable 'k' as ulong with possibly a unit
- * suffix; if missing, use the default value 'val'.
- */
-unsigned long git_env_ulong(const char *k, unsigned long val)
-{
- const char *v = getenv(k);
- if (v && !git_parse_ulong(v, &val))
- die(_("failed to parse %s"), k);
- return val;
-}
-
int git_config_system(void)
{
return !git_env_bool("GIT_CONFIG_NOSYSTEM", 0);
diff --git a/config.h b/config.h
index 247b572b37..7a7f53e503 100644
--- a/config.h
+++ b/config.h
@@ -3,7 +3,7 @@
#include "hashmap.h"
#include "string-list.h"
-
+#include "parse.h"
/**
* The config API gives callers a way to access Git configuration files
@@ -205,16 +205,6 @@ int config_with_options(config_fn_t fn, void *,
* The following helper functions aid in parsing string values
*/
-int git_parse_ssize_t(const char *, ssize_t *);
-int git_parse_ulong(const char *, unsigned long *);
-int git_parse_int(const char *value, int *ret);
-
-/**
- * Same as `git_config_bool`, except that it returns -1 on error rather
- * than dying.
- */
-int git_parse_maybe_bool(const char *);
-
/**
* Parse the string to an integer, including unit factors. Dies on error;
* otherwise, returns the parsed result.
@@ -343,8 +333,6 @@ int git_config_rename_section(const char *, const char *);
int git_config_rename_section_in_file(const char *, const char *, const char *);
int git_config_copy_section(const char *, const char *);
int git_config_copy_section_in_file(const char *, const char *, const char *);
-int git_env_bool(const char *, int);
-unsigned long git_env_ulong(const char *, unsigned long);
int git_config_system(void);
int config_error_nonbool(const char *);
#if defined(__GNUC__)
diff --git a/pack-objects.c b/pack-objects.c
index 1b8052bece..f403ca6986 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -3,7 +3,7 @@
#include "pack.h"
#include "pack-objects.h"
#include "packfile.h"
-#include "config.h"
+#include "parse.h"
static uint32_t locate_object_entry_hash(struct packing_data *pdata,
const struct object_id *oid,
diff --git a/pack-revindex.c b/pack-revindex.c
index 7fffcad912..a01a2a4640 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -6,7 +6,7 @@
#include "packfile.h"
#include "strbuf.h"
#include "trace2.h"
-#include "config.h"
+#include "parse.h"
#include "midx.h"
#include "csum-file.h"
diff --git a/parse-options.c b/parse-options.c
index f8a155ee13..9f542950a7 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1,11 +1,12 @@
#include "git-compat-util.h"
#include "parse-options.h"
#include "abspath.h"
-#include "config.h"
+#include "parse.h"
#include "commit.h"
#include "color.h"
#include "gettext.h"
#include "strbuf.h"
+#include "string-list.h"
#include "utf8.h"
static int disallow_abbreviated_options;
diff --git a/parse.c b/parse.c
new file mode 100644
index 0000000000..42d691a0fb
--- /dev/null
+++ b/parse.c
@@ -0,0 +1,182 @@
+#include "git-compat-util.h"
+#include "gettext.h"
+#include "parse.h"
+
+static uintmax_t get_unit_factor(const char *end)
+{
+ if (!*end)
+ return 1;
+ else if (!strcasecmp(end, "k"))
+ return 1024;
+ else if (!strcasecmp(end, "m"))
+ return 1024 * 1024;
+ else if (!strcasecmp(end, "g"))
+ return 1024 * 1024 * 1024;
+ return 0;
+}
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+{
+ if (value && *value) {
+ char *end;
+ intmax_t val;
+ intmax_t factor;
+
+ if (max < 0)
+ BUG("max must be a positive integer");
+
+ errno = 0;
+ val = strtoimax(value, &end, 0);
+ if (errno == ERANGE)
+ return 0;
+ if (end == value) {
+ errno = EINVAL;
+ return 0;
+ }
+ factor = get_unit_factor(end);
+ if (!factor) {
+ errno = EINVAL;
+ return 0;
+ }
+ if ((val < 0 && -max / factor > val) ||
+ (val > 0 && max / factor < val)) {
+ errno = ERANGE;
+ return 0;
+ }
+ val *= factor;
+ *ret = val;
+ return 1;
+ }
+ errno = EINVAL;
+ return 0;
+}
+
+static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
+{
+ if (value && *value) {
+ char *end;
+ uintmax_t val;
+ uintmax_t factor;
+
+ /* negative values would be accepted by strtoumax */
+ if (strchr(value, '-')) {
+ errno = EINVAL;
+ return 0;
+ }
+ errno = 0;
+ val = strtoumax(value, &end, 0);
+ if (errno == ERANGE)
+ return 0;
+ if (end == value) {
+ errno = EINVAL;
+ return 0;
+ }
+ factor = get_unit_factor(end);
+ if (!factor) {
+ errno = EINVAL;
+ return 0;
+ }
+ if (unsigned_mult_overflows(factor, val) ||
+ factor * val > max) {
+ errno = ERANGE;
+ return 0;
+ }
+ val *= factor;
+ *ret = val;
+ return 1;
+ }
+ errno = EINVAL;
+ return 0;
+}
+
+int git_parse_int(const char *value, int *ret)
+{
+ intmax_t tmp;
+ if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+ return 0;
+ *ret = tmp;
+ return 1;
+}
+
+int git_parse_int64(const char *value, int64_t *ret)
+{
+ intmax_t tmp;
+ if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+ return 0;
+ *ret = tmp;
+ return 1;
+}
+
+int git_parse_ulong(const char *value, unsigned long *ret)
+{
+ uintmax_t tmp;
+ if (!git_parse_unsigned(value, &tmp, maximum_unsigned_value_of_type(long)))
+ return 0;
+ *ret = tmp;
+ return 1;
+}
+
+int git_parse_ssize_t(const char *value, ssize_t *ret)
+{
+ intmax_t tmp;
+ if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+ return 0;
+ *ret = tmp;
+ return 1;
+}
+
+int git_parse_maybe_bool_text(const char *value)
+{
+ if (!value)
+ return 1;
+ if (!*value)
+ return 0;
+ if (!strcasecmp(value, "true")
+ || !strcasecmp(value, "yes")
+ || !strcasecmp(value, "on"))
+ return 1;
+ if (!strcasecmp(value, "false")
+ || !strcasecmp(value, "no")
+ || !strcasecmp(value, "off"))
+ return 0;
+ return -1;
+}
+
+int git_parse_maybe_bool(const char *value)
+{
+ int v = git_parse_maybe_bool_text(value);
+ if (0 <= v)
+ return v;
+ if (git_parse_int(value, &v))
+ return !!v;
+ return -1;
+}
+
+/*
+ * Parse environment variable 'k' as a boolean (in various
+ * possible spellings); if missing, use the default value 'def'.
+ */
+int git_env_bool(const char *k, int def)
+{
+ const char *v = getenv(k);
+ int val;
+ if (!v)
+ return def;
+ val = git_parse_maybe_bool(v);
+ if (val < 0)
+ die(_("bad boolean environment value '%s' for '%s'"),
+ v, k);
+ return val;
+}
+
+/*
+ * Parse environment variable 'k' as ulong with possibly a unit
+ * suffix; if missing, use the default value 'val'.
+ */
+unsigned long git_env_ulong(const char *k, unsigned long val)
+{
+ const char *v = getenv(k);
+ if (v && !git_parse_ulong(v, &val))
+ die(_("failed to parse %s"), k);
+ return val;
+}
diff --git a/parse.h b/parse.h
new file mode 100644
index 0000000000..07d2193d69
--- /dev/null
+++ b/parse.h
@@ -0,0 +1,20 @@
+#ifndef PARSE_H
+#define PARSE_H
+
+int git_parse_signed(const char *value, intmax_t *ret, intmax_t max);
+int git_parse_ssize_t(const char *, ssize_t *);
+int git_parse_ulong(const char *, unsigned long *);
+int git_parse_int(const char *value, int *ret);
+int git_parse_int64(const char *value, int64_t *ret);
+
+/**
+ * Same as `git_config_bool`, except that it returns -1 on error rather
+ * than dying.
+ */
+int git_parse_maybe_bool(const char *);
+int git_parse_maybe_bool_text(const char *value);
+
+int git_env_bool(const char *, int);
+unsigned long git_env_ulong(const char *, unsigned long);
+
+#endif /* PARSE_H */
diff --git a/pathspec.c b/pathspec.c
index 4991455281..39337999d4 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -1,6 +1,6 @@
#include "git-compat-util.h"
#include "abspath.h"
-#include "config.h"
+#include "parse.h"
#include "dir.h"
#include "environment.h"
#include "gettext.h"
diff --git a/preload-index.c b/preload-index.c
index e44530c80c..63fd35d64b 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -7,7 +7,7 @@
#include "environment.h"
#include "fsmonitor.h"
#include "gettext.h"
-#include "config.h"
+#include "parse.h"
#include "preload-index.h"
#include "progress.h"
#include "read-cache.h"
diff --git a/progress.c b/progress.c
index f695798aca..c83cb60bf1 100644
--- a/progress.c
+++ b/progress.c
@@ -17,7 +17,7 @@
#include "trace.h"
#include "trace2.h"
#include "utf8.h"
-#include "config.h"
+#include "parse.h"
#define TP_IDX_MAX 8
diff --git a/prompt.c b/prompt.c
index 3baa33f63d..8935fe4dfb 100644
--- a/prompt.c
+++ b/prompt.c
@@ -1,5 +1,5 @@
#include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
#include "environment.h"
#include "run-command.h"
#include "strbuf.h"
diff --git a/rebase.c b/rebase.c
index 17a570f1ff..69a1822da3 100644
--- a/rebase.c
+++ b/rebase.c
@@ -1,6 +1,6 @@
#include "git-compat-util.h"
#include "rebase.h"
-#include "config.h"
+#include "parse.h"
#include "gettext.h"
/*
diff --git a/t/helper/test-env-helper.c b/t/helper/test-env-helper.c
index 66c88b8ff3..1c486888a4 100644
--- a/t/helper/test-env-helper.c
+++ b/t/helper/test-env-helper.c
@@ -1,5 +1,5 @@
#include "test-tool.h"
-#include "config.h"
+#include "parse.h"
#include "parse-options.h"
static char const * const env__helper_usage[] = {
diff --git a/unpack-trees.c b/unpack-trees.c
index 87517364dc..761562a96e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2,7 +2,7 @@
#include "advice.h"
#include "strvec.h"
#include "repository.h"
-#include "config.h"
+#include "parse.h"
#include "dir.h"
#include "environment.h"
#include "gettext.h"
diff --git a/wrapper.c b/wrapper.c
index 62c04aeb17..3e554f50c6 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -3,7 +3,7 @@
*/
#include "git-compat-util.h"
#include "abspath.h"
-#include "config.h"
+#include "parse.h"
#include "gettext.h"
#include "strbuf.h"
diff --git a/write-or-die.c b/write-or-die.c
index d8355c0c3e..42a2dc73cd 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -1,5 +1,5 @@
#include "git-compat-util.h"
-#include "config.h"
+#include "parse.h"
#include "run-command.h"
#include "write-or-die.h"
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 6/8] pager: remove pager_in_use()
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
pager_in_use() is simply a wrapper around
git_env_bool("GIT_PAGER_IN_USE", 0). Other places that call
git_env_bool() in this fashion also do not have a wrapper function
around it. By removing pager_in_use(), we can also get rid of the
pager.h dependency from a few files.
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
builtin/log.c | 2 +-
color.c | 2 +-
column.c | 2 +-
date.c | 4 ++--
git.c | 2 +-
| 5 -----
| 1 -
7 files changed, 6 insertions(+), 12 deletions(-)
diff --git a/builtin/log.c b/builtin/log.c
index 03954fb749..d5e979932f 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -82,7 +82,7 @@ struct line_opt_callback_data {
static int session_is_interactive(void)
{
- return isatty(1) || pager_in_use();
+ return isatty(1) || git_env_bool("GIT_PAGER_IN_USE", 0);
}
static int auto_decoration_style(void)
diff --git a/color.c b/color.c
index f3c0a4659b..dd6f26b8db 100644
--- a/color.c
+++ b/color.c
@@ -388,7 +388,7 @@ static int check_auto_color(int fd)
int *is_tty_p = fd == 1 ? &color_stdout_is_tty : &color_stderr_is_tty;
if (*is_tty_p < 0)
*is_tty_p = isatty(fd);
- if (*is_tty_p || (fd == 1 && pager_in_use() && pager_use_color)) {
+ if (*is_tty_p || (fd == 1 && git_env_bool("GIT_PAGER_IN_USE", 0) && pager_use_color)) {
if (!is_terminal_dumb())
return 1;
}
diff --git a/column.c b/column.c
index ff2f0abf39..e15ca70f36 100644
--- a/column.c
+++ b/column.c
@@ -214,7 +214,7 @@ int finalize_colopts(unsigned int *colopts, int stdout_is_tty)
if (stdout_is_tty < 0)
stdout_is_tty = isatty(1);
*colopts &= ~COL_ENABLE_MASK;
- if (stdout_is_tty || pager_in_use())
+ if (stdout_is_tty || git_env_bool("GIT_PAGER_IN_USE", 0))
*colopts |= COL_ENABLED;
}
return 0;
diff --git a/date.c b/date.c
index 619ada5b20..95c0f568ba 100644
--- a/date.c
+++ b/date.c
@@ -7,7 +7,7 @@
#include "git-compat-util.h"
#include "date.h"
#include "gettext.h"
-#include "pager.h"
+#include "parse.h"
#include "strbuf.h"
/*
@@ -1009,7 +1009,7 @@ void parse_date_format(const char *format, struct date_mode *mode)
/* "auto:foo" is "if tty/pager, then foo, otherwise normal" */
if (skip_prefix(format, "auto:", &p)) {
- if (isatty(1) || pager_in_use())
+ if (isatty(1) || git_env_bool("GIT_PAGER_IN_USE", 0))
format = p;
else
format = "default";
diff --git a/git.c b/git.c
index eb69f4f997..3bfb673a4c 100644
--- a/git.c
+++ b/git.c
@@ -131,7 +131,7 @@ static void commit_pager_choice(void)
void setup_auto_pager(const char *cmd, int def)
{
- if (use_pager != -1 || pager_in_use())
+ if (use_pager != -1 || git_env_bool("GIT_PAGER_IN_USE", 0))
return;
use_pager = check_pager_config(cmd);
if (use_pager == -1)
--git a/pager.c b/pager.c
index 63055d0873..9b392622d2 100644
--- a/pager.c
+++ b/pager.c
@@ -149,11 +149,6 @@ void setup_pager(void)
atexit(wait_for_pager_atexit);
}
-int pager_in_use(void)
-{
- return git_env_bool("GIT_PAGER_IN_USE", 0);
-}
-
/*
* Return cached value (if set) or $COLUMNS environment variable (if
* set and positive) or ioctl(1, TIOCGWINSZ).ws_col (if positive),
--git a/pager.h b/pager.h
index b77433026d..6832c6168d 100644
--- a/pager.h
+++ b/pager.h
@@ -5,7 +5,6 @@ struct child_process;
const char *git_pager(int stdout_is_tty);
void setup_pager(void);
-int pager_in_use(void);
int term_columns(void);
void term_clear_line(void);
int decimal_width(uintmax_t);
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 8/8] git-std-lib: add test file to call git-std-lib.a functions
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
Add test file that directly or indirectly calls all functions defined in
git-std-lib.a object files to showcase that they do not reference
missing objects and that git-std-lib.a can stand on its own.
Certain functions that cause the program to exit or are already called
by other functions are commented out.
TODO: replace with unit tests
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
t/Makefile | 4 +
t/stdlib-test.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 243 insertions(+)
create mode 100644 t/stdlib-test.c
diff --git a/t/Makefile b/t/Makefile
index 3e00cdd801..b6d0bc9daa 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -150,3 +150,7 @@ perf:
.PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
check-chainlint clean-chainlint test-chainlint
+
+test-git-std-lib:
+ cc -It -o stdlib-test stdlib-test.c -L. -l:../git-std-lib.a
+ ./stdlib-test
diff --git a/t/stdlib-test.c b/t/stdlib-test.c
new file mode 100644
index 0000000000..0e4f6d5807
--- /dev/null
+++ b/t/stdlib-test.c
@@ -0,0 +1,239 @@
+#include "../git-compat-util.h"
+#include "../abspath.h"
+#include "../hex-ll.h"
+#include "../parse.h"
+#include "../strbuf.h"
+#include "../string-list.h"
+
+/*
+ * Calls all functions from git-std-lib
+ * Some inline/trivial functions are skipped
+ */
+
+void abspath_funcs(void) {
+ struct strbuf sb = STRBUF_INIT;
+
+ fprintf(stderr, "calling abspath functions\n");
+ is_directory("foo");
+ strbuf_realpath(&sb, "foo", 0);
+ strbuf_realpath_forgiving(&sb, "foo", 0);
+ real_pathdup("foo", 0);
+ absolute_path("foo");
+ absolute_pathdup("foo");
+ prefix_filename("foo/", "bar");
+ prefix_filename_except_for_dash("foo/", "bar");
+ is_absolute_path("foo");
+ strbuf_add_absolute_path(&sb, "foo");
+ strbuf_add_real_path(&sb, "foo");
+}
+
+void hex_ll_funcs(void) {
+ unsigned char c;
+
+ fprintf(stderr, "calling hex-ll functions\n");
+
+ hexval('c');
+ hex2chr("A1");
+ hex_to_bytes(&c, "A1", 2);
+}
+
+void parse_funcs(void) {
+ intmax_t foo;
+ ssize_t foo1 = -1;
+ unsigned long foo2;
+ int foo3;
+ int64_t foo4;
+
+ fprintf(stderr, "calling parse functions\n");
+
+ git_parse_signed("42", &foo, maximum_signed_value_of_type(int));
+ git_parse_ssize_t("42", &foo1);
+ git_parse_ulong("42", &foo2);
+ git_parse_int("42", &foo3);
+ git_parse_int64("42", &foo4);
+ git_parse_maybe_bool("foo");
+ git_parse_maybe_bool_text("foo");
+ git_env_bool("foo", 1);
+ git_env_ulong("foo", 1);
+}
+
+static int allow_unencoded_fn(char ch) {
+ return 0;
+}
+
+void strbuf_funcs(void) {
+ struct strbuf *sb = xmalloc(sizeof(void*));
+ struct strbuf *sb2 = xmalloc(sizeof(void*));
+ struct strbuf sb3 = STRBUF_INIT;
+ struct string_list list = STRING_LIST_INIT_NODUP;
+ char *buf = "foo";
+ struct strbuf_expand_dict_entry dict[] = {
+ { "foo", NULL, },
+ { "bar", NULL, },
+ };
+ int fd = open("/dev/null", O_RDONLY);
+
+ fprintf(stderr, "calling strbuf functions\n");
+
+ starts_with("foo", "bar");
+ istarts_with("foo", "bar");
+ // skip_to_optional_arg_default(const char *str, const char *prefix,
+ // const char **arg, const char *def)
+ strbuf_init(sb, 0);
+ strbuf_init(sb2, 0);
+ strbuf_release(sb);
+ strbuf_attach(sb, strbuf_detach(sb, NULL), 0, 0); // calls strbuf_grow
+ strbuf_swap(sb, sb2);
+ strbuf_setlen(sb, 0);
+ strbuf_trim(sb); // calls strbuf_rtrim, strbuf_ltrim
+ // strbuf_rtrim() called by strbuf_trim()
+ // strbuf_ltrim() called by strbuf_trim()
+ strbuf_trim_trailing_dir_sep(sb);
+ strbuf_trim_trailing_newline(sb);
+ strbuf_reencode(sb, "foo", "bar");
+ strbuf_tolower(sb);
+ strbuf_add_separated_string_list(sb, " ", &list);
+ strbuf_list_free(strbuf_split_buf("foo bar", 8, ' ', -1));
+ strbuf_cmp(sb, sb2);
+ strbuf_addch(sb, 1);
+ strbuf_splice(sb, 0, 1, "foo", 3);
+ strbuf_insert(sb, 0, "foo", 3);
+ // strbuf_vinsertf() called by strbuf_insertf
+ strbuf_insertf(sb, 0, "%s", "foo");
+ strbuf_remove(sb, 0, 1);
+ strbuf_add(sb, "foo", 3);
+ strbuf_addbuf(sb, sb2);
+ strbuf_join_argv(sb, 0, NULL, ' ');
+ strbuf_addchars(sb, 1, 1);
+ strbuf_addf(sb, "%s", "foo");
+ strbuf_add_commented_lines(sb, "foo", 3, '#');
+ strbuf_commented_addf(sb, '#', "%s", "foo");
+ // strbuf_vaddf() called by strbuf_addf()
+ strbuf_expand(sb, "%s", strbuf_expand_literal_cb, NULL);
+ strbuf_expand(sb, "%s", strbuf_expand_dict_cb, &dict);
+ // strbuf_expand_literal_cb() called by strbuf_expand()
+ // strbuf_expand_dict_cb() called by strbuf_expand()
+ strbuf_addbuf_percentquote(sb, &sb3);
+ strbuf_add_percentencode(sb, "foo", STRBUF_ENCODE_SLASH);
+ strbuf_fread(sb, 0, stdin);
+ strbuf_read(sb, fd, 0);
+ strbuf_read_once(sb, fd, 0);
+ strbuf_write(sb, stderr);
+ strbuf_readlink(sb, "/dev/null", 0);
+ strbuf_getcwd(sb);
+ strbuf_getwholeline(sb, stderr, '\n');
+ strbuf_appendwholeline(sb, stderr, '\n');
+ strbuf_getline(sb, stderr);
+ strbuf_getline_lf(sb, stderr);
+ strbuf_getline_nul(sb, stderr);
+ strbuf_getwholeline_fd(sb, fd, '\n');
+ strbuf_read_file(sb, "/dev/null", 0);
+ strbuf_add_lines(sb, "foo", "bar", 0);
+ strbuf_addstr_xml_quoted(sb, "foo");
+ strbuf_addstr_urlencode(sb, "foo", allow_unencoded_fn);
+ strbuf_humanise_bytes(sb, 42);
+ strbuf_humanise_rate(sb, 42);
+ printf_ln("%s", sb);
+ fprintf_ln(stderr, "%s", sb);
+ xstrdup_tolower("foo");
+ xstrdup_toupper("foo");
+ // xstrvfmt() called by xstrfmt()
+ xstrfmt("%s", "foo");
+ // strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm *tm,
+ // int tz_offset, int suppress_tz_name)
+ // strbuf_stripspace(struct strbuf *sb, char comment_line_char)
+ // strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
+ // strbuf_strip_file_from_path(struct strbuf *sb)
+}
+
+static void error_builtin(const char *err, va_list params) {}
+static void warn_builtin(const char *err, va_list params) {}
+
+static report_fn error_routine = error_builtin;
+static report_fn warn_routine = warn_builtin;
+
+void usage_funcs(void) {
+ fprintf(stderr, "calling usage functions\n");
+ // Functions that call exit() are commented out
+
+ // usage()
+ // usagef()
+ // die()
+ // die_errno();
+ error("foo");
+ error_errno("foo");
+ die_message("foo");
+ die_message_errno("foo");
+ warning("foo");
+ warning_errno("foo");
+
+ // set_die_routine();
+ get_die_message_routine();
+ set_error_routine(error_builtin);
+ get_error_routine();
+ set_warn_routine(warn_builtin);
+ get_warn_routine();
+ // set_die_is_recursing_routine();
+}
+
+void wrapper_funcs(void) {
+ void *ptr = xmalloc(1);
+ int fd = open("/dev/null", O_RDONLY);
+ struct strbuf sb = STRBUF_INIT;
+ int mode = 0444;
+ char host[PATH_MAX], path[PATH_MAX], path1[PATH_MAX];
+ xsnprintf(path, sizeof(path), "out-XXXXXX");
+ xsnprintf(path1, sizeof(path1), "out-XXXXXX");
+ int tmp;
+
+ fprintf(stderr, "calling wrapper functions\n");
+
+ xstrdup("foo");
+ xmalloc(1);
+ xmallocz(1);
+ xmallocz_gently(1);
+ xmemdupz("foo", 3);
+ xstrndup("foo", 3);
+ xrealloc(ptr, 2);
+ xcalloc(1, 1);
+ xsetenv("foo", "bar", 0);
+ xopen("/dev/null", O_RDONLY);
+ xread(fd, &sb, 1);
+ xwrite(fd, &sb, 1);
+ xpread(fd, &sb, 1, 0);
+ xdup(fd);
+ xfopen("/dev/null", "r");
+ xfdopen(fd, "r");
+ tmp = xmkstemp(path);
+ close(tmp);
+ unlink(path);
+ tmp = xmkstemp_mode(path1, mode);
+ close(tmp);
+ unlink(path1);
+ xgetcwd();
+ fopen_for_writing(path);
+ fopen_or_warn(path, "r");
+ xstrncmpz("foo", "bar", 3);
+ // xsnprintf() called above
+ xgethostname(host, 3);
+ tmp = git_mkstemps_mode(path, 1, mode);
+ close(tmp);
+ unlink(path);
+ tmp = git_mkstemp_mode(path, mode);
+ close(tmp);
+ unlink(path);
+ read_in_full(fd, &sb, 1);
+ write_in_full(fd, &sb, 1);
+ pread_in_full(fd, &sb, 1, 0);
+}
+
+int main() {
+ abspath_funcs();
+ hex_ll_funcs();
+ parse_funcs();
+ strbuf_funcs();
+ usage_funcs();
+ wrapper_funcs();
+ fprintf(stderr, "all git-std-lib functions finished calling\n");
+ return 0;
+}
\ No newline at end of file
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* [RFC PATCH 7/8] git-std-lib: introduce git standard library
From: Calvin Wan @ 2023-06-27 19:52 UTC (permalink / raw)
To: git; +Cc: Calvin Wan, nasamuffin, chooglen, johnathantanmy
In-Reply-To: <20230627195251.1973421-1-calvinwan@google.com>
The Git Standard Library intends to serve as the foundational library
and root dependency that other libraries in Git will be built off of.
That is to say, suppose we have libraries X and Y; a user that wants to
use X and Y would need to include X, Y, and this Git Standard Library.
Add Documentation/technical/git-std-lib.txt to further explain the
design and rationale.
Signed-off-by: Calvin Wan <calvinwan@google.com>
---
Documentation/technical/git-std-lib.txt | 182 ++++++++++++++++++++++++
Makefile | 28 +++-
git-compat-util.h | 7 +-
symlinks.c | 2 +
usage.c | 8 ++
5 files changed, 225 insertions(+), 2 deletions(-)
create mode 100644 Documentation/technical/git-std-lib.txt
diff --git a/Documentation/technical/git-std-lib.txt b/Documentation/technical/git-std-lib.txt
new file mode 100644
index 0000000000..3dce36c9f9
--- /dev/null
+++ b/Documentation/technical/git-std-lib.txt
@@ -0,0 +1,182 @@
+Git Standard Library
+================
+
+The Git Standard Library intends to serve as the foundational library
+and root dependency that other libraries in Git will be built off of.
+That is to say, suppose we have libraries X and Y; a user that wants to
+use X and Y would need to include X, Y, and this Git Standard Library.
+This does not mean that the Git Standard Library will be the only
+possible root dependency in the future, but rather the most significant
+and widely used one.
+
+Dependency graph in libified Git
+================
+
+If you look in the Git Makefile, all of the objects defined in the Git
+library are compiled and archived into a singular file, libgit.a, which
+is linked against by common-main.o with other external dependencies and
+turned into the Git executable. In other words, the Git executable has
+dependencies on libgit.a and a couple of external libraries. The
+libfication of Git will not affect this current build flow, but instead
+will provide an alternate method for building Git.
+
+With our current method of building Git, we can imagine the dependency
+graph as such:
+
+ Git
+ /\
+ / \
+ / \
+ libgit.a ext deps
+
+In libifying parts of Git, we want to shrink the dependency graph to
+only the minimal set of dependencies, so libraries should not use
+libgit.a. Instead, it would look like:
+
+ Git
+ /\
+ / \
+ / \
+ libgit.a ext deps
+ /\
+ / \
+ / \
+object-store.a (other lib)
+ | /
+ | /
+ | /
+ config.a /
+ | /
+ | /
+ | /
+git-std-lib.a
+
+Instead of containing all of the objects in Git, libgit.a would contain
+objects that are not built by libraries it links against. Consequently,
+if someone wanted their own custom build of Git with their own custom
+implementation of the object store, they would only have to swap out
+object-store.a rather than do a hard fork of Git.
+
+Rationale behind Git Standard Library
+================
+
+The rationale behind Git Standard Library essentially is the result of
+two observations within the Git codebase: every file includes
+git-compat-util.h which defines functions in a couple of different
+files, and wrapper.c + usage.c have difficult-to-separate circular
+dependencies with each other and other files.
+
+Ubiquity of git-compat-util.h and circular dependencies
+========
+
+Every file in the Git codebase includes git-compat-util.h. It serves as
+"a compatibility aid that isolates the knowledge of platform specific
+inclusion order and what feature macros to define before including which
+system header" (Junio[1]). Since every file includes git-compat-util.h, and
+git-compat-util.h includes wrapper.h and usage.h, it would make sense
+for wrapper.c and usage.c to be a part of the root library. They have
+difficult to separate circular dependencies with each other so they
+can't be independent libraries. Wrapper.c has dependencies on parse.c,
+abspath.c, strbuf.c, which in turn also have dependencies on usage.c and
+wrapper.c -- more circular dependencies.
+
+Tradeoff between swappability and refactoring
+========
+
+From the above dependency graph, we can see that git-std-lib.a could be
+many smaller libraries rather than a singular library. So why choose a
+singular library when multiple libraries can be individually easier to
+swap and are more modular? A singular library requires less work to
+separate out circular dependencies within itself so it becomes a
+tradeoff question between work and reward. While there may be a point in
+the future where a file like usage.c would want its own library so that
+someone can have custom die() or error(), the work required to refactor
+out the circular dependencies in some files would be enormous due to
+their ubiquity so therefore I believe it is not worth the tradeoff
+currently. Additionally, we can in the future choose to do this refactor
+and change the API for the library if there becomes enough of a reason
+to do so (remember we are avoiding promising stability of the interfaces
+of those libraries).
+
+Reuse of compatibility functions in git-compat-util.h
+========
+
+Most functions defined in git-compat-util.h are implemented in compat/
+and have dependencies limited to strbuf.h and wrapper.h so they can be
+easily included in git-std-lib.a, which as a root dependency means that
+higher level libraries do not have to worry about compatibility files in
+compat/. The rest of the functions defined in git-compat-util.h are
+implemented in top level files and, in this patch set, are hidden behind
+an #ifdef if their implementation is not in git-std-lib.a.
+
+Rationale summary
+========
+
+The Git Standard Library allows us to get the libification ball rolling
+with other libraries in Git. By not spending many
+more months attempting to refactor difficult circular dependencies and
+instead spending that time getting to a state where we can test out
+swapping a library out such as config or object store, we can prove the
+viability of Git libification on a much faster time scale. Additionally
+the code cleanups that have happened so far have been minor and
+beneficial for the codebase. It is probable that making large movements
+would negatively affect code clarity.
+
+Git Standard Library boundary
+================
+
+While I have described above some useful heuristics for identifying
+potential candidates for git-std-lib.a, a standard library should not
+have a shaky definition for what belongs in it.
+
+ - Low-level files (aka operates only on other primitive types) that are
+ used everywhere within the codebase (wrapper.c, usage.c, strbuf.c)
+ - Dependencies that are low-level and widely used
+ (abspath.c, date.c, hex-ll.c, parse.c, utf8.c)
+ - low-level git/* files with functions defined in git-compat-util.h
+ (ctype.c)
+ - compat/*
+
+There are other files that might fit this definition, but that does not
+mean it should belong in git-std-lib.a. Those files should start as
+their own separate library since any file added to git-std-lib.a loses
+its flexibility of being easily swappable.
+
+Files inside of Git Standard Library
+================
+
+The initial set of files in git-std-lib.a are:
+abspath.c
+ctype.c
+date.c
+hex-ll.c
+parse.c
+strbuf.c
+usage.c
+utf8.c
+wrapper.c
+relevant compat/ files
+
+Pitfalls
+================
+
+In patch 7, I use #ifdef GIT_STD_LIB to both stub out code and hide
+certain function headers. As other parts of Git are libified, if we
+have to use more ifdefs for each different library, then the codebase
+will become uglier and harder to understand.
+
+There are a small amount of files under compat/* that have dependencies
+not inside of git-std-lib.a. While those functions are not called on
+Linux, other OSes might call those problematic functions. I don't see
+this as a major problem, just moreso an observation that libification in
+general may also require some minor compatibility work in the future.
+
+Testing
+================
+
+Unit tests should catch any breakages caused by changes to files in
+git-std-lib.a (i.e. introduction of a out of scope dependency) and new
+functions introduced to git-std-lib.a will require unit tests written
+for them.
+
+[1] https://lore.kernel.org/git/xmqqwn17sydw.fsf@gitster.g/
\ No newline at end of file
diff --git a/Makefile b/Makefile
index e9ad9f9ef1..255bd10b82 100644
--- a/Makefile
+++ b/Makefile
@@ -2162,6 +2162,11 @@ ifdef FSMONITOR_OS_SETTINGS
COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
endif
+ifdef GIT_STD_LIB
+ BASIC_CFLAGS += -DGIT_STD_LIB
+ BASIC_CFLAGS += -DNO_GETTEXT
+endif
+
ifeq ($(TCLTK_PATH),)
NO_TCLTK = NoThanks
endif
@@ -3654,7 +3659,7 @@ clean: profile-clean coverage-clean cocciclean
$(RM) po/git.pot po/git-core.pot
$(RM) git.res
$(RM) $(OBJECTS)
- $(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
+ $(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB) $(STD_LIB_FILE)
$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
$(RM) $(TEST_PROGRAMS)
$(RM) $(FUZZ_PROGRAMS)
@@ -3834,3 +3839,24 @@ $(FUZZ_PROGRAMS): all
$(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
fuzz-all: $(FUZZ_PROGRAMS)
+
+### Libified Git rules
+
+# git-std-lib
+# `make git-std-lib GIT_STD_LIB=YesPlease`
+STD_LIB = git-std-lib.a
+
+GIT_STD_LIB_OBJS += abspath.o
+GIT_STD_LIB_OBJS += ctype.o
+GIT_STD_LIB_OBJS += date.o
+GIT_STD_LIB_OBJS += hex-ll.o
+GIT_STD_LIB_OBJS += parse.o
+GIT_STD_LIB_OBJS += strbuf.o
+GIT_STD_LIB_OBJS += usage.o
+GIT_STD_LIB_OBJS += utf8.o
+GIT_STD_LIB_OBJS += wrapper.o
+
+$(STD_LIB): $(GIT_STD_LIB_OBJS) $(COMPAT_OBJS)
+ $(QUIET_AR)$(RM) $@ && $(AR) $(ARFLAGS) $@ $^
+
+git-std-lib: $(STD_LIB)
diff --git a/git-compat-util.h b/git-compat-util.h
index 481dac22b0..75aa9b263e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -396,8 +396,8 @@ static inline int noop_core_config(const char *var UNUSED,
#define platform_core_config noop_core_config
#endif
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
int lstat_cache_aware_rmdir(const char *path);
-#if !defined(__MINGW32__) && !defined(_MSC_VER)
#define rmdir lstat_cache_aware_rmdir
#endif
@@ -787,9 +787,11 @@ const char *inet_ntop(int af, const void *src, char *dst, size_t size);
#endif
#ifdef NO_PTHREADS
+#ifdef GIT_STD_LIB
#define atexit git_atexit
int git_atexit(void (*handler)(void));
#endif
+#endif
/*
* Limit size of IO chunks, because huge chunks only cause pain. OS X
@@ -951,14 +953,17 @@ int git_access(const char *path, int mode);
# endif
#endif
+#ifndef GIT_STD_LIB
int cmd_main(int, const char **);
/*
* Intercept all calls to exit() and route them to trace2 to
* optionally emit a message before calling the real exit().
*/
+
int common_exit(const char *file, int line, int code);
#define exit(code) exit(common_exit(__FILE__, __LINE__, (code)))
+#endif
/*
* You can mark a stack variable with UNLEAK(var) to avoid it being
diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..bced721a0c 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -337,6 +337,7 @@ void invalidate_lstat_cache(void)
reset_lstat_cache(&default_cache);
}
+#if !defined(__MINGW32__) && !defined(_MSC_VER) && !defined(GIT_STD_LIB)
#undef rmdir
int lstat_cache_aware_rmdir(const char *path)
{
@@ -348,3 +349,4 @@ int lstat_cache_aware_rmdir(const char *path)
return ret;
}
+#endif
diff --git a/usage.c b/usage.c
index 09f0ed509b..58994e0d5c 100644
--- a/usage.c
+++ b/usage.c
@@ -5,7 +5,15 @@
*/
#include "git-compat-util.h"
#include "gettext.h"
+
+#ifdef GIT_STD_LIB
+#undef trace2_cmd_name
+#undef trace2_cmd_error_va
+#define trace2_cmd_name(x)
+#define trace2_cmd_error_va(x, y)
+#else
#include "trace2.h"
+#endif
static void vreportf(const char *prefix, const char *err, va_list params)
{
--
2.41.0.162.gfafddb0af9-goog
^ permalink raw reply related
* Re: [PATCH 2/2] for-each-ref: add --count-matches option
From: Jeff King @ 2023-06-27 19:59 UTC (permalink / raw)
To: Junio C Hamano
Cc: Phillip Wood, Derrick Stolee via GitGitGadget, git, vdye, me,
mjcheetham, Derrick Stolee
In-Reply-To: <xmqqmt0khik5.fsf@gitster.g>
On Tue, Jun 27, 2023 at 11:22:02AM -0700, Junio C Hamano wrote:
> - populate_value() -> get_object() -> oid_object_info_extended() ->
> grab_common_values() asks for ATOM_OBJECTTYPE that should be a
> single xstrdup(), but oid_object_info_extended() needs to parse
> the object, but .info.contentp ought to be NULL and we should
> not be calling parse_object_buffer().
This is the atom that I expect to give most of the cost. We should
definitely be computing that just with a pack lookup (and chasing deltas
through the pack, though commits usually aren't deltas). But without
that atom, I think we would not even mmap the idx or pack file at all.
The ref-filter code is also pretty keen to malloc() little strings. When
we're measuring 5ms total runtimes and there's just not that much actual
work to do, those little things start to matter more.
> So it might be worth looking into where the time is going (didn't
> Peff or somebody do that a few years ago, though?) when using the
> default format and optimize or special case the codepath,[...]
Running:
perf record -g git for-each-ref --format='%(refname) %(objecttype)' >/dev/null
perf script report flamegraph
in linux.git shows that we spend a lot of time faulting in .idx pages.
A lot of the time is attributed to ref_array_sort(), but I think that's
a red herring. It lazy-loads the atom values, so that's where most of
the work is happening (though I would have thought it would just do the
refname atom, not all of them; but I guess it doesn't really matter
since we'll eventually need them to print anyway). Commenting out the
sort call just shows the same time spent in format_ref_array_item().
The patches you're thinking of are probably:
https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@coredump.intra.peff.net/
which are mostly about micro-optimizing out those mallocs. They do still
help a little in the '%(refname)' case, but not nearly as much as
dropping '%(objecttype)' does.
(Using --format=x in theory drops out some mallocs, but I wasn't able to
measure any actual speedup).
> [...]but the
> responses I have seen from two of you makes me suspect that the new
> option is not the best general approach.
Yeah, my feeling is that with the reduced format, eliminating the
pipe/wc overhead is a pretty small improvement.
-Peff
^ permalink raw reply
* Re: [PATCH] ll-merge: killing the external merge driver aborts the merge
From: Junio C Hamano @ 2023-06-27 20:04 UTC (permalink / raw)
To: Joshua Hudson; +Cc: Johannes Schindelin, git, Elijah Newren
In-Reply-To: <4f28a9c4-b422-69b7-ccc1-2661d756d876@cedaron.com>
Joshua Hudson <jhudson@cedaron.com> writes:
> On 6/27/2023 12:08 PM, Junio C Hamano wrote:
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>>
>>> On Fri, 23 Jun 2023, Junio C Hamano wrote:
>>>
>>>> Junio C Hamano <gitster@pobox.com> writes:
>>>>
>>>>> Elijah Newren <newren@gmail.com> writes:
>>>>>
>>>>>> Reviewed-by: Elijah Newren <newren@gmail.com>
>>>>>
>>>>> Thanks for a quick review.
>>>> Unfortunately Windows does not seem to correctly detect the aborting
>> Sorry, I did not mean "abort(3)" literally. What I meant was that
>> an external merge driver that gets spawned via the run_command()
>> interface may not die by calling exit()---like "killed by signal"
>> (including "segfaulting"). The new test script piece added in the
>> patch did "kill -9 $$" to kill the external merge driver itself,
>> which gets reported as "killed by signal" from run_command() by
>> returning the signal number + 128, but that did not pass Windows CI.
>>
> Do you need me to provide a windows test harness?
Sorry, I do not understand the question.
FWIW how "external merge driver that kills itself by sending a
signal to itself does not get noticed on Windows" appears in our
tests can be seen at
https://github.com/git/git/actions/runs/5360824580/jobs/9727137272
The job is "win test(0)", part of our standard Windows test harness
implemented as part of our GitHub Actions CI test.
Thanks.
^ permalink raw reply
* Re: [PATCH v2 6/7] var: add attributes files locations
From: Jeff King @ 2023-06-27 20:16 UTC (permalink / raw)
To: brian m. carlson; +Cc: git, Junio C Hamano, Eric Sunshine, Derrick Stolee
In-Reply-To: <ZJsKa6ZNq7nnh0gf@tapette.crustytoothpaste.net>
On Tue, Jun 27, 2023 at 04:12:27PM +0000, brian m. carlson wrote:
> > So I dunno. I'm mostly giving a heads-up, and seeing if you (or other
> > reviewers in the thread) have an idea to make this "flag" thing less
> > awful. I'm also happy to just do my topic separately, and then
> > eventually circle back after yours is merged.
>
> I've picked up your patch as the first patch in the series and will send
> it out in v3 in just a few minutes. Since I plan to have v3 be the last
> round of this series, I'll let you send out any further changes as
> fixups on top of that.
Thanks, that is optimal from my perspective. :)
The resulting series looks good to me, both the changes you integrated
from me, but also the whole thing overall. It is a little funny to stuff
the GIT_CONFIG_GLOBAL values into a string only to re-split them (versus
passing around a string_list itself), but I think it's a reasonable
compromise given the function interface.
-Peff
^ permalink raw reply
* Re: [PATCH 4/5] replace strbuf_expand() with strbuf_expand_step()
From: Jeff King @ 2023-06-27 20:19 UTC (permalink / raw)
To: René Scharfe; +Cc: Git List
In-Reply-To: <8d2223b8-ab33-be5a-20ea-ad60f6cbcc75@web.de>
On Tue, Jun 27, 2023 at 06:31:55PM +0200, René Scharfe wrote:
> > It does still feel like we should be handling "%%" on behalf of the
> > callers.
>
> I feel the same, but restrained myself from doing that, so that we
> can see all the pieces for now. It allows us to recombine them in
> better ways than before.
Yeah, since you did the work to handle "%%" in each caller, I'm OK with
proceeding and letting a later refactor shrink it back down if we
choose.
-Peff
^ permalink raw reply
* Re: [PATCH 5/5] strbuf: simplify strbuf_expand_literal_cb()
From: Jeff King @ 2023-06-27 20:21 UTC (permalink / raw)
To: René Scharfe; +Cc: Git List
In-Reply-To: <3654f95d-5709-28f6-eeb2-ca62a1ee789c@web.de>
On Tue, Jun 27, 2023 at 06:32:22PM +0200, René Scharfe wrote:
> A format-advancing version would also look a bit weird in an if/else
> cascade:
>
> else if (strbuf_expand_literal(&sb, &format))
> ; /* nothing */
> else ...
>
> > I dunno. It is a little thing, and I am OK with it either way (I would
> > not even think of changing it if you were not already touching the
> > function).
>
> Unsure. Should I overcome my horror vacui and let the /* nothing */ in?
Heh. It is a little funny to have an empty block. But at the same time,
it aligns the conditional with all of the skip_prefix() calls, which are
advancing "format" as a side effect of matching.
So I could go either way (and we can always change it on top).
I think based on your responses that there's nothing that would require
a re-roll. The only thing left from my review is the extra parentheses
in format_commit_item, which could possibly be fixed up while applying.
-Peff
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox