From: Patrick Steinhardt <ps@pks.im>
To: Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Jeff Hostetler <jeffhostetler@github.com>
Subject: Re: [PATCH 08/12] fsmonitor: support case-insensitive directory events
Date: Thu, 15 Feb 2024 10:32:43 +0100 [thread overview]
Message-ID: <Zc3aO95bvuXXENLj@tanuki> (raw)
In-Reply-To: <e0029a2aad68a60f672e74368a384f68a343e21c.1707857541.git.gitgitgadget@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 8603 bytes --]
On Tue, Feb 13, 2024 at 08:52:17PM +0000, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhostetler@github.com>
>
> Teach fsmonitor_refresh_callback() to handle case-insensitive
> lookups if case-sensitive lookups fail on case-insensitive systems.
> This can cause 'git status' to report stale status for files if there
> are case issues/errors in the worktree.
>
> The FSMonitor daemon sends FSEvents using the observed spelling
> of each pathname. On case-insensitive file systems this may be
> different than the expected case spelling.
>
> The existing code uses index_name_pos() to find the cache-entry for
> the pathname in the FSEvent and clear the CE_FSMONITOR_VALID bit so
> that the worktree scan/index refresh will revisit and revalidate the
> path.
>
> On a case-insensitive file system, the exact match lookup may fail
> to find the associated cache-entry. This causes status to think that
> the cached CE flags are correct and skip over the file.
>
> Update the handling of directory-style FSEvents (ones containing a
> path with a trailing slash) to optionally use the name-hash if the
> case-correct search does not find a match.
>
> (The FSMonitor daemon can send directory FSEvents if the OS provides
> that information.)
>
> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
> ---
> fsmonitor.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 120 insertions(+), 2 deletions(-)
>
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 14585b6c516..73e6ac82af7 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -5,6 +5,7 @@
> #include "ewah/ewok.h"
> #include "fsmonitor.h"
> #include "fsmonitor-ipc.h"
> +#include "name-hash.h"
> #include "run-command.h"
> #include "strbuf.h"
> #include "trace2.h"
> @@ -183,6 +184,9 @@ static int query_fsmonitor_hook(struct repository *r,
> return result;
> }
>
> +static int fsmonitor_refresh_callback_slash(
> + struct index_state *istate, const char *name, int len, int pos);
> +
> /*
> * Invalidate the untracked cache for the given pathname. Copy the
> * buffer to a proper null-terminated string (since the untracked
> @@ -205,6 +209,84 @@ static void my_invalidate_untracked_cache(
> strbuf_release(&work_path);
> }
>
> +/*
> + * Use the name-hash to lookup the pathname.
> + *
> + * Returns the number of cache-entries that we invalidated.
> + */
The function not only looks up the path name, but also invalidates the
corresponding cache entry. You imply this with the second sentence, but
this could be a bit more explicit.
> +static int my_callback_name_hash(
> + struct index_state *istate, const char *name, int len)
I find the naming conventions here to be weird with the `my_` prefix.
> +{
> + struct cache_entry *ce = NULL;
> +
> + ce = index_file_exists(istate, name, len, 1);
> + if (!ce)
> + return 0;
Okay, `index_file_exists()` is called with `icase == 1` here. But is
that the correct thing to do on case-sensitive platforms? I would have
expected us to honor `core.ignoreCase` here.
Turns out, we only end up calling this function when `ignore_case` is
set, so we already do. I think this can be clarified both by giving the
function a better name and by documenting this in the comment. Also,
neither of this or the next function really are callbacks -- they only
happen to be called by a callback function.
I'd think something like `lookup_and_invalidate_path_icase()` and
`lookup_and_invalidate_dir_icase()` could help to clarify intent.
> + /*
> + * The index contains a case-insensitive match for the pathname.
> + * This could either be a regular file or a sparse-index directory.
> + *
> + * We should not have seen FSEvents for a sparse-index directory,
> + * but we handle it just in case.
> + *
> + * Either way, we know that there are not any cache-entries for
> + * children inside the cone of the directory, so we don't need to
> + * do the usual scan.
> + */
> + trace_printf_key(&trace_fsmonitor,
> + "fsmonitor_refresh_callback map '%s' '%s'",
> + name, ce->name);
> +
> + my_invalidate_untracked_cache(istate, ce->name, ce->ce_namelen);
> +
> + ce->ce_flags &= ~CE_FSMONITOR_VALID;
> + return 1;
> +}
> +
> +/*
> + * Use the directory name-hash to find the correct-case spelling
> + * of the directory. Use the canonical spelling to invalidate all
> + * of the cache-entries within the matching cone.
> + *
> + * The pathname MUST NOT have a trailing slash.
> + *
> + * Returns the number of cache-entries that we invalidated.
> + */
> +static int my_callback_dir_name_hash(
> + struct index_state *istate, const char *name, int len)
> +{
> + struct strbuf canonical_path = STRBUF_INIT;
> + int pos;
> + int nr_in_cone;
> +
> + if (!index_dir_exists2(istate, name, len, &canonical_path))
> + return 0; /* name is untracked */
> + if (!memcmp(name, canonical_path.buf, len)) {
> + strbuf_release(&canonical_path);
> + return 0; /* should not happen */
> + }
So in other words, this function should only be called when we know that
casing differs, and thus the passed-in name and the canonical name
should never be the same? If this case shouldn't ever happen, shouldn't
we report this as an error or use `BUG()` instead of silently ignoring
this mismatch of expectations?
Patrick
> + trace_printf_key(&trace_fsmonitor,
> + "fsmonitor_refresh_callback map '%s' '%s'",
> + name, canonical_path.buf);
> +
> + /*
> + * The directory name-hash only tells us the corrected
> + * spelling of the prefix. We have to use this canonical
> + * path to do a lookup in the cache-entry array so that we
> + * we repeat the original search using the case-corrected
> + * spelling.
> + */
> + strbuf_addch(&canonical_path, '/');
> + pos = index_name_pos(istate, canonical_path.buf,
> + canonical_path.len);
> + nr_in_cone = fsmonitor_refresh_callback_slash(
> + istate, canonical_path.buf, canonical_path.len, pos);
> + strbuf_release(&canonical_path);
> + return nr_in_cone;
> +}
> +
> static void fsmonitor_refresh_callback_unqualified(
> struct index_state *istate, const char *name, int len, int pos)
> {
> @@ -269,7 +351,10 @@ static void fsmonitor_refresh_callback_unqualified(
> *
> * Return the number of cache-entries that we invalidated. We will
> * use this later to determine if we need to attempt a second
> - * case-insensitive search.
> + * case-insensitive search. That is, if a observed-case search yields
> + * any results, we assume the prefix is case-correct. If there are
> + * no matches, we still don't know if the observed path is simply
> + * untracked or case-incorrect.
> */
> static int fsmonitor_refresh_callback_slash(
> struct index_state *istate, const char *name, int len, int pos)
> @@ -293,17 +378,50 @@ static int fsmonitor_refresh_callback_slash(
> return nr_in_cone;
> }
>
> +/*
> + * On a case-insensitive FS, use the name-hash and directory name-hash
> + * to map the case of the observed path to the canonical case expected
> + * by the index.
> + *
> + * The given pathname includes the trailing slash.
> + *
> + * Return the number of cache-entries that we invalidated.
> + */
> +static int fsmonitor_refresh_callback_slash_icase(
> + struct index_state *istate, const char *name, int len)
> +{
> + int nr_in_cone;
> +
> + /*
> + * Look for a case-incorrect sparse-index directory.
> + */
> + nr_in_cone = my_callback_name_hash(istate, name, len);
> + if (nr_in_cone)
> + return nr_in_cone;
> +
> + /*
> + * (len-1) because we do not include the trailing slash in the
> + * pathname.
> + */
> + nr_in_cone = my_callback_dir_name_hash(istate, name, len-1);
> + return nr_in_cone;
> +}
> +
> static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
> {
> int len = strlen(name);
> int pos = index_name_pos(istate, name, len);
> + int nr_in_cone;
> +
>
> trace_printf_key(&trace_fsmonitor,
> "fsmonitor_refresh_callback '%s' (pos %d)",
> name, pos);
>
> if (name[len - 1] == '/') {
> - fsmonitor_refresh_callback_slash(istate, name, len, pos);
> + nr_in_cone = fsmonitor_refresh_callback_slash(istate, name, len, pos);
> + if (ignore_case && !nr_in_cone)
> + fsmonitor_refresh_callback_slash_icase(istate, name, len);
> } else {
> fsmonitor_refresh_callback_unqualified(istate, name, len, pos);
> }
> --
> gitgitgadget
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2024-02-15 9:32 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-13 20:52 [PATCH 00/12] FSMonitor edge cases on case-insensitive file systems Jeff Hostetler via GitGitGadget
2024-02-13 20:52 ` [PATCH 01/12] sparse-index: pass string length to index_file_exists() Jeff Hostetler via GitGitGadget
2024-02-13 22:07 ` Junio C Hamano
2024-02-20 17:34 ` Jeff Hostetler
2024-02-13 20:52 ` [PATCH 02/12] name-hash: add index_dir_exists2() Jeff Hostetler via GitGitGadget
2024-02-13 21:43 ` Junio C Hamano
2024-02-20 17:38 ` Jeff Hostetler
2024-02-20 19:34 ` Junio C Hamano
2024-02-15 9:31 ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 03/12] t7527: add case-insensitve test for FSMonitor Jeff Hostetler via GitGitGadget
2024-02-13 20:52 ` [PATCH 04/12] fsmonitor: refactor refresh callback on directory events Jeff Hostetler via GitGitGadget
2024-02-15 9:32 ` Patrick Steinhardt
2024-02-20 18:54 ` Jeff Hostetler
2024-02-21 12:54 ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 05/12] fsmonitor: refactor refresh callback for non-directory events Jeff Hostetler via GitGitGadget
2024-02-14 1:34 ` Junio C Hamano
2024-02-15 9:32 ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 06/12] fsmonitor: clarify handling of directory events in callback Jeff Hostetler via GitGitGadget
2024-02-14 7:47 ` Junio C Hamano
2024-02-20 18:56 ` Jeff Hostetler
2024-02-20 19:24 ` Junio C Hamano
2024-02-15 9:32 ` Patrick Steinhardt
2024-02-20 19:10 ` Jeff Hostetler
2024-02-13 20:52 ` [PATCH 07/12] fsmonitor: refactor untracked-cache invalidation Jeff Hostetler via GitGitGadget
2024-02-14 16:46 ` Junio C Hamano
2024-02-15 9:32 ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 08/12] fsmonitor: support case-insensitive directory events Jeff Hostetler via GitGitGadget
2024-02-15 9:32 ` Patrick Steinhardt [this message]
2024-02-13 20:52 ` [PATCH 09/12] fsmonitor: refactor non-directory callback Jeff Hostetler via GitGitGadget
2024-02-15 9:32 ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 10/12] fsmonitor: support case-insensitive non-directory events Jeff Hostetler via GitGitGadget
2024-02-13 20:52 ` [PATCH 11/12] fsmonitor: refactor bit invalidation in refresh callback Jeff Hostetler via GitGitGadget
2024-02-15 9:32 ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 12/12] t7527: update case-insenstive fsmonitor test Jeff Hostetler via GitGitGadget
2024-02-23 3:18 ` [PATCH v2 00/16] FSMonitor edge cases on case-insensitive file systems Jeff Hostetler via GitGitGadget
2024-02-23 3:18 ` [PATCH v2 01/16] name-hash: add index_dir_find() Jeff Hostetler via GitGitGadget
2024-02-23 6:37 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 02/16] t7527: add case-insensitve test for FSMonitor Jeff Hostetler via GitGitGadget
2024-02-23 3:18 ` [PATCH v2 03/16] t7527: temporarily disable case-insensitive tests Jeff Hostetler via GitGitGadget
2024-02-23 8:17 ` Junio C Hamano
2024-02-26 17:12 ` Jeff Hostetler
2024-02-23 3:18 ` [PATCH v2 04/16] fsmonitor: refactor refresh callback on directory events Jeff Hostetler via GitGitGadget
2024-02-23 8:18 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 05/16] fsmonitor: clarify handling of directory events in callback helper Jeff Hostetler via GitGitGadget
2024-02-23 3:18 ` [PATCH v2 06/16] fsmonitor: refactor refresh callback for non-directory events Jeff Hostetler via GitGitGadget
2024-02-23 8:18 ` Junio C Hamano
2024-02-25 12:30 ` Torsten Bögershausen
2024-02-25 17:24 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 07/16] dir: create untracked_cache_invalidate_trimmed_path() Jeff Hostetler via GitGitGadget
2024-02-25 12:35 ` Torsten Bögershausen
2024-02-23 3:18 ` [PATCH v2 08/16] fsmonitor: refactor untracked-cache invalidation Jeff Hostetler via GitGitGadget
2024-02-23 3:18 ` [PATCH v2 09/16] fsmonitor: move untracked invalidation into helper functions Jeff Hostetler via GitGitGadget
2024-02-23 17:36 ` Junio C Hamano
2024-02-26 18:45 ` Jeff Hostetler
2024-02-23 3:18 ` [PATCH v2 10/16] fsmonitor: return invalidated cache-entry count on directory event Jeff Hostetler via GitGitGadget
2024-02-23 3:18 ` [PATCH v2 11/16] fsmonitor: remove custom loop from non-directory path handler Jeff Hostetler via GitGitGadget
2024-02-23 17:47 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 12/16] fsmonitor: return invalided cache-entry count on non-directory event Jeff Hostetler via GitGitGadget
2024-02-23 17:51 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 13/16] fsmonitor: trace the new invalidated cache-entry count Jeff Hostetler via GitGitGadget
2024-02-23 17:53 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 14/16] fsmonitor: support case-insensitive events Jeff Hostetler via GitGitGadget
2024-02-23 18:14 ` Junio C Hamano
2024-02-26 20:41 ` Jeff Hostetler
2024-02-26 21:18 ` Junio C Hamano
2024-02-25 13:10 ` Torsten Bögershausen
2024-02-26 20:47 ` Jeff Hostetler
2024-02-23 3:18 ` [PATCH v2 15/16] fsmonitor: refactor bit invalidation in refresh callback Jeff Hostetler via GitGitGadget
2024-02-23 18:18 ` Junio C Hamano
2024-02-23 3:18 ` [PATCH v2 16/16] t7527: update case-insenstive fsmonitor test Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 00/14] FSMonitor edge cases on case-insensitive file systems Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 01/14] name-hash: add index_dir_find() Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 02/14] t7527: add case-insensitve test for FSMonitor Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 03/14] fsmonitor: refactor refresh callback on directory events Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 04/14] fsmonitor: clarify handling of directory events in callback helper Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 05/14] fsmonitor: refactor refresh callback for non-directory events Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 06/14] dir: create untracked_cache_invalidate_trimmed_path() Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 07/14] fsmonitor: refactor untracked-cache invalidation Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 08/14] fsmonitor: move untracked-cache invalidation into helper functions Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 09/14] fsmonitor: return invalidated cache-entry count on directory event Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 10/14] fsmonitor: remove custom loop from non-directory path handler Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 11/14] fsmonitor: return invalided cache-entry count on non-directory event Jeff Hostetler via GitGitGadget
2024-03-06 12:58 ` Patrick Steinhardt
2024-02-26 21:39 ` [PATCH v3 12/14] fsmonitor: trace the new invalidated cache-entry count Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 13/14] fsmonitor: refactor bit invalidation in refresh callback Jeff Hostetler via GitGitGadget
2024-02-26 21:39 ` [PATCH v3 14/14] fsmonitor: support case-insensitive events Jeff Hostetler via GitGitGadget
2024-03-06 12:58 ` Patrick Steinhardt
2024-02-27 1:40 ` [PATCH v3 00/14] FSMonitor edge cases on case-insensitive file systems Junio C Hamano
2024-03-06 12:58 ` Patrick Steinhardt
2024-03-06 17:09 ` Junio C Hamano
2024-03-06 18:10 ` Jeff Hostetler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zc3aO95bvuXXENLj@tanuki \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=jeffhostetler@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.