From: Junio C Hamano <gitster@pobox.com>
To: "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, rsbecker@nexbridge.com,
bagasdotme@gmail.com, newren@gmail.com, avarab@gmail.com,
nksingh85@gmail.com, ps@pks.im, sandals@crustytoothpaste.net,
"Neeraj K. Singh" <neerajsi@microsoft.com>
Subject: Re: [PATCH v5 3/5] core.fsync: introduce granular fsync control
Date: Wed, 09 Mar 2022 16:21:08 -0800
Message-ID: <xmqqo82eirnv.fsf@gitster.g>
In-Reply-To: <e31886717b42837f4e1538a13c8954aa07865af5.1646866998.git.gitgitgadget@gmail.com> (Neeraj Singh via GitGitGadget's message of "Wed, 09 Mar 2022 23:03:16 +0000")
"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
> +/*
> + * These values are used to help identify parts of a repository to fsync.
> + * FSYNC_COMPONENT_NONE identifies data that will not be a persistent part of the
> + * repository and so shouldn't be fsynced.
> + */
> +enum fsync_component {
> + FSYNC_COMPONENT_NONE,
> + FSYNC_COMPONENT_LOOSE_OBJECT = 1 << 0,
> + FSYNC_COMPONENT_PACK = 1 << 1,
> + FSYNC_COMPONENT_PACK_METADATA = 1 << 2,
> + FSYNC_COMPONENT_COMMIT_GRAPH = 1 << 3,
> +};
OK, so the idea is that Patrick's "we need to fsync refs" will be
done by adding a new component to this list, and sprinkling a call
to fsync_component_or_die() in the code of ref-files backend?
I am wondering if the fsync_or_die() interface is abstracted well
enough, or whether we need support for things like "the fd is inside
this directory; in addition to doing the fsync of the fd, please sync
the parent directory as well" before we start adding more components
(if there is such a need, perhaps it comes before this step).
> +#define FSYNC_COMPONENTS_DEFAULT (FSYNC_COMPONENT_PACK | \
> + FSYNC_COMPONENT_PACK_METADATA | \
> + FSYNC_COMPONENT_COMMIT_GRAPH)
IOW, everything other than loose object, which already has a
separate core.fsyncObjectFiles knob to loosen. Everything else we
currently sync unconditionally and the default keeps that
arrangement?
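As a small illustration of that reading, the sketch below repeats the enum and the default mask from the patch and adds a predicate, component_enabled(), which is a hypothetical name used only here. Under the default mask, every component except the loose object one is enabled.

```c
#include <stdbool.h>

enum fsync_component {
	FSYNC_COMPONENT_NONE,
	FSYNC_COMPONENT_LOOSE_OBJECT = 1 << 0,
	FSYNC_COMPONENT_PACK = 1 << 1,
	FSYNC_COMPONENT_PACK_METADATA = 1 << 2,
	FSYNC_COMPONENT_COMMIT_GRAPH = 1 << 3,
};

#define FSYNC_COMPONENTS_DEFAULT (FSYNC_COMPONENT_PACK | \
				  FSYNC_COMPONENT_PACK_METADATA | \
				  FSYNC_COMPONENT_COMMIT_GRAPH)

/* Hypothetical predicate: is `component` selected by `mask`? */
bool component_enabled(unsigned mask, enum fsync_component component)
{
	return (mask & (unsigned)component) != 0;
}
```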
> +static inline void fsync_component_or_die(enum fsync_component component, int fd, const char *msg)
> +{
> + if (fsync_components & component)
> + fsync_or_die(fd, msg);
> +}
Do we have a compelling reason to have this as a static inline
function? We are talking about concluding an I/O operation and
I doubt there is a good performance argument for it.
> +static const struct fsync_component_entry {
> + const char *name;
> + enum fsync_component component_bits;
> +} fsync_component_table[] = {
thing[] is an array of "thing" (and thing[4] is the "fourth" such
thing), but this is not an array of a table (it is a name-to-bit
mapping).
I wonder if this array works without "_table" suffix in its name.
> + { "loose-object", FSYNC_COMPONENT_LOOSE_OBJECT },
> + { "pack", FSYNC_COMPONENT_PACK },
> + { "pack-metadata", FSYNC_COMPONENT_PACK_METADATA },
> + { "commit-graph", FSYNC_COMPONENT_COMMIT_GRAPH },
> +};
> +
> +static enum fsync_component parse_fsync_components(const char *var, const char *string)
> +{
> + enum fsync_component output = 0;
> +
> + if (!strcmp(string, "none"))
> + return FSYNC_COMPONENT_NONE;
> +
> + while (string) {
> + int i;
> + size_t len;
> + const char *ep;
> + int negated = 0;
> + int found = 0;
> +
> + string = string + strspn(string, ", \t\n\r");
> + ep = strchrnul(string, ',');
> + len = ep - string;
> +
> + if (*string == '-') {
> + negated = 1;
> + string++;
> + len--;
> + if (!len)
> + warning(_("invalid value for variable %s"), var);
> + }
> +
> + if (!len)
> + break;
> +
> + for (i = 0; i < ARRAY_SIZE(fsync_component_table); ++i) {
> + const struct fsync_component_entry *entry = &fsync_component_table[i];
> +
> + if (strncmp(entry->name, string, len))
> + continue;
> +
> + found = 1;
> + if (negated)
> + output &= ~entry->component_bits;
> + else
> + output |= entry->component_bits;
> + }
> +
> + if (!found) {
> + char *component = xstrndup(string, len);
> + warning(_("ignoring unknown core.fsync component '%s'"), component);
> + free(component);
> + }
> +
> + string = ep;
> + }
> +
> + return output;
> +}
Hmph. I would have expected, with built-in default of
pack,pack-metadata,commit-graph,
- "none,pack" would choose only "pack" by first clearing the
built-in default (or whatever was set in configuration files that
are lower precedence than what we are reading) and then OR'ing
the "pack" bit in.
- "-pack" would choose "pack-metadata,commit-graph" by first
starting from the built-in default and then CLR'ing the "pack"
bit out. If there were already changes made by the lower
precedence configuration files like /etc/gitconfig, the result
might be different and the only definite thing we can say is that
the pack bit is cleared.
- "loose-object" would choose all of the bits by first starting
from the built-in default and then OR'ing the "loose-object" bit
in.
Otherwise, parsing "none" is more or less pointless, as the above
parser always starts from 0 and OR's in or CLR's out the named bit.
Whoever writes "none" can just write an empty string, no?
I wonder if you'd rather want to do it this way?
	parse_fsync_components(var, string, current) {
		enum fsync_component positive = 0, negative = 0;

		while (string) {
			int negated = 0;
			enum fsync_component bits;

			parse out a single component into <negated, bits>;

			if (bits == 0) { /* "none" given */
				current = 0;
			} else if (negated) {
				negative |= bits;
			} else {
				positive |= bits;
			}
			advance <string> pointer;
		}
		return (current | positive) & ~negative;
	}
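For reference, the outline above can be fleshed out into a compilable sketch. This is only one possible reading of it, under these assumptions: the "var" parameter is dropped because no warning is emitted here, unknown tokens are silently skipped (the patch warns instead), the mask is carried as unsigned, and names are matched on their exact length rather than with a prefix strncmp(). The lookup helper and the array name fsync_component_names are illustrative, not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

enum fsync_component {
	FSYNC_COMPONENT_NONE,
	FSYNC_COMPONENT_LOOSE_OBJECT = 1 << 0,
	FSYNC_COMPONENT_PACK = 1 << 1,
	FSYNC_COMPONENT_PACK_METADATA = 1 << 2,
	FSYNC_COMPONENT_COMMIT_GRAPH = 1 << 3,
};

#define FSYNC_COMPONENTS_DEFAULT (FSYNC_COMPONENT_PACK | \
				  FSYNC_COMPONENT_PACK_METADATA | \
				  FSYNC_COMPONENT_COMMIT_GRAPH)

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

static const struct {
	const char *name;
	unsigned bits;
} fsync_component_names[] = {
	{ "loose-object", FSYNC_COMPONENT_LOOSE_OBJECT },
	{ "pack", FSYNC_COMPONENT_PACK },
	{ "pack-metadata", FSYNC_COMPONENT_PACK_METADATA },
	{ "commit-graph", FSYNC_COMPONENT_COMMIT_GRAPH },
};

/* Look up a token of exactly `len` bytes; returns 0 if it is unknown. */
static unsigned lookup_component(const char *token, size_t len)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(fsync_component_names); i++) {
		const char *name = fsync_component_names[i].name;

		if (strlen(name) == len && !memcmp(name, token, len))
			return fsync_component_names[i].bits;
	}
	return 0;
}

/*
 * Parse a comma-separated list, starting from `current` instead of 0.
 * "none" resets `current`, "-name" clears a bit, "name" sets a bit.
 */
unsigned parse_fsync_components(const char *string, unsigned current)
{
	unsigned positive = 0, negative = 0;

	while (*string) {
		size_t len;

		/* Skip separators, then measure the next token. */
		string += strspn(string, ", \t\n\r");
		len = strcspn(string, ",");
		if (!len)
			break;

		if (len == 4 && !memcmp(string, "none", 4)) {
			current = 0;
		} else if (*string == '-') {
			negative |= lookup_component(string + 1, len - 1);
		} else {
			positive |= lookup_component(string, len);
		}
		string += len;
	}
	return (current | positive) & ~negative;
}
```

With the built-in default as the starting value, "none,pack" yields only the pack bit, "-pack" yields pack-metadata and commit-graph, and "loose-object" yields all four bits. Feeding the result of one call in as `current` for the next models configuration files of increasing precedence.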
And then ...
> + if (!strcmp(var, "core.fsync")) {
> + if (!value)
> + return config_error_nonbool(var);
> + fsync_components = parse_fsync_components(var, value);
> + return 0;
> + }
> +
... this part would pass the current value of fsync_components as
the third parameter to the parse_fsync_components(). The variable
would be initialized to the FSYNC_COMPONENTS_DEFAULT we saw earlier.
> @@ -1613,7 +1684,7 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
> }
>
> if (!strcmp(var, "core.fsyncobjectfiles")) {
> - fsync_object_files = git_config_bool(var, value);
> + warning(_("core.fsyncobjectfiles is deprecated; use core.fsync instead"));
This is not deprecating but removing the support, which I am not
sure is a sensible thing to do. Rather we should pretend that
core.fsync = "loose-object" (or "-loose-object") were found in the
configuration, shouldn't we?
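A minimal sketch of that mapping, assuming the mask is carried as unsigned and the boolean has already been parsed (in Git that would be the result of git_config_bool()). The helper name apply_legacy_fsync_object_files() is hypothetical and only shows the intended effect on the bits.

```c
#include <stdbool.h>

/* Same bit value as in the patch's enum fsync_component. */
enum { FSYNC_COMPONENT_LOOSE_OBJECT = 1 << 0 };

/*
 * Hypothetical helper: treat core.fsyncObjectFiles=<bool> as if
 * core.fsync had named "loose-object" (true) or "-loose-object" (false),
 * leaving every other component bit untouched.
 */
unsigned apply_legacy_fsync_object_files(unsigned current, bool enabled)
{
	if (enabled)
		return current | (unsigned)FSYNC_COMPONENT_LOOSE_OBJECT;
	return current & ~(unsigned)FSYNC_COMPONENT_LOOSE_OBJECT;
}
```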