All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Patrick Steinhardt" <ps@pks.im>,
	"Jeff Hostetler" <git@jeffhostetler.com>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"Jeff Hostetler" <jeffhostetler@github.com>,
	"Jeff Hostetler" <jeffhostetler@github.com>
Subject: [PATCH v3 02/14] t7527: add case-insensitve test for FSMonitor
Date: Mon, 26 Feb 2024 21:39:13 +0000	[thread overview]
Message-ID: <beeebf5596385cf72f7aa566fd277da738e627fd.1708983566.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1662.v3.git.1708983565.gitgitgadget@gmail.com>

From: Jeff Hostetler <jeffhostetler@github.com>

The FSMonitor client code trusts the spelling of the pathnames in the
FSEvents received from the FSMonitor daemon.  On case-insensitive file
systems, these OBSERVED pathnames may be spelled differently than the
EXPECTED pathnames listed in the .git/index.  This causes a miss when
using `index_name_pos()` which expects the given case to be correct.

When this happens, the FSMonitor client code does not update the state
of the CE_FSMONITOR_VALID bit when refreshing the index (and before
starting to scan the worktree).

This results in modified files NOT being reported by `git status` when
there is a discrepancy in the case-spelling of a tracked file's
pathname.

This commit contains a (rather contrived) test case to demonstrate
this.  A later commit in this series will update the FSMonitor client
code to recognize these discrepancies and update the CE_ bit accordingly.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
---
 t/t7527-builtin-fsmonitor.sh | 217 +++++++++++++++++++++++++++++++++++
 1 file changed, 217 insertions(+)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 363f9dc0e41..830f2d9de33 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -1037,4 +1037,221 @@ test_expect_success 'split-index and FSMonitor work well together' '
 	)
 '
 
+# The FSMonitor daemon reports the OBSERVED pathname of modified files
+# and thus contains the OBSERVED spelling on case-insensitive file
+# systems.  The daemon does not (and should not) load the .git/index
+# file and therefore does not know the expected case-spelling.  Since
+# it is possible for the user to create files/subdirectories with the
+# incorrect case, a modified file event for a tracked will not have
+# the EXPECTED case. This can cause `index_name_pos()` to incorrectly
+# report that the file is untracked. This causes the client to fail to
+# mark the file as possibly dirty (keeping the CE_FSMONITOR_VALID bit
+# set) so that `git status` will avoid inspecting it and thus not
+# present in the status output.
+#
+# The setup is a little contrived.
+#
+test_expect_failure CASE_INSENSITIVE_FS 'fsmonitor subdir case wrong on disk' '
+	test_when_finished "stop_daemon_delete_repo subdir_case_wrong" &&
+
+	git init subdir_case_wrong &&
+	(
+		cd subdir_case_wrong &&
+		echo x >AAA &&
+		echo x >BBB &&
+
+		mkdir dir1 &&
+		echo x >dir1/file1 &&
+		mkdir dir1/dir2 &&
+		echo x >dir1/dir2/file2 &&
+		mkdir dir1/dir2/dir3 &&
+		echo x >dir1/dir2/dir3/file3 &&
+
+		echo x >yyy &&
+		echo x >zzz &&
+		git add . &&
+		git commit -m "data" &&
+
+		# This will cause "dir1/" and everything under it
+		# to be deleted.
+		git sparse-checkout set --cone --sparse-index &&
+
+		# Create dir2 with the wrong case and then let Git
+		# repopulate dir3 -- it will not correct the spelling
+		# of dir2.
+		mkdir dir1 &&
+		mkdir dir1/DIR2 &&
+		git sparse-checkout add dir1/dir2/dir3
+	) &&
+
+	start_daemon -C subdir_case_wrong --tf "$PWD/subdir_case_wrong.trace" &&
+
+	# Enable FSMonitor in the client. Run enough commands for
+	# the .git/index to sync up with the daemon with everything
+	# marked clean.
+	git -C subdir_case_wrong config core.fsmonitor true &&
+	git -C subdir_case_wrong update-index --fsmonitor &&
+	git -C subdir_case_wrong status &&
+
+	# Make some files dirty so that FSMonitor gets FSEvents for
+	# each of them.
+	echo xx >>subdir_case_wrong/AAA &&
+	echo xx >>subdir_case_wrong/dir1/DIR2/dir3/file3 &&
+	echo xx >>subdir_case_wrong/zzz &&
+
+	GIT_TRACE_FSMONITOR="$PWD/subdir_case_wrong.log" \
+		git -C subdir_case_wrong --no-optional-locks status --short \
+			>"$PWD/subdir_case_wrong.out" &&
+
+	# "git status" should have gotten file events for each of
+	# the 3 files.
+	#
+	# "dir2" should be in the observed case on disk.
+	grep "fsmonitor_refresh_callback" \
+		<"$PWD/subdir_case_wrong.log" \
+		>"$PWD/subdir_case_wrong.log1" &&
+
+	grep -q "AAA.*pos 0" "$PWD/subdir_case_wrong.log1" &&
+	grep -q "zzz.*pos 6" "$PWD/subdir_case_wrong.log1" &&
+
+	grep -q "dir1/DIR2/dir3/file3.*pos -3" "$PWD/subdir_case_wrong.log1" &&
+
+	# The refresh-callbacks should have caused "git status" to clear
+	# the CE_FSMONITOR_VALID bit on each of those files and caused
+	# the worktree scan to visit them and mark them as modified.
+	grep -q " M AAA" "$PWD/subdir_case_wrong.out" &&
+	grep -q " M zzz" "$PWD/subdir_case_wrong.out" &&
+
+	# Expect Breakage: with the case confusion, the "(pos -3)" causes
+	# the client to not clear the CE_FSMONITOR_VALID bit and therefore
+	# status will not rescan the file and therefore not report it as dirty.
+	grep -q " M dir1/dir2/dir3/file3" "$PWD/subdir_case_wrong.out"
+'
+
+test_expect_failure CASE_INSENSITIVE_FS 'fsmonitor file case wrong on disk' '
+	test_when_finished "stop_daemon_delete_repo file_case_wrong" &&
+
+	git init file_case_wrong &&
+	(
+		cd file_case_wrong &&
+		echo x >AAA &&
+		echo x >BBB &&
+
+		mkdir dir1 &&
+		mkdir dir1/dir2 &&
+		mkdir dir1/dir2/dir3 &&
+		echo x >dir1/dir2/dir3/FILE-3-B &&
+		echo x >dir1/dir2/dir3/XXXX-3-X &&
+		echo x >dir1/dir2/dir3/file-3-a &&
+		echo x >dir1/dir2/dir3/yyyy-3-y &&
+		mkdir dir1/dir2/dir4 &&
+		echo x >dir1/dir2/dir4/FILE-4-A &&
+		echo x >dir1/dir2/dir4/XXXX-4-X &&
+		echo x >dir1/dir2/dir4/file-4-b &&
+		echo x >dir1/dir2/dir4/yyyy-4-y &&
+
+		echo x >yyy &&
+		echo x >zzz &&
+		git add . &&
+		git commit -m "data"
+	) &&
+
+	start_daemon -C file_case_wrong --tf "$PWD/file_case_wrong.trace" &&
+
+	# Enable FSMonitor in the client. Run enough commands for
+	# the .git/index to sync up with the daemon with everything
+	# marked clean.
+	git -C file_case_wrong config core.fsmonitor true &&
+	git -C file_case_wrong update-index --fsmonitor &&
+	git -C file_case_wrong status &&
+
+	# Make some files dirty so that FSMonitor gets FSEvents for
+	# each of them.
+	echo xx >>file_case_wrong/AAA &&
+	echo xx >>file_case_wrong/zzz &&
+
+	# Rename some files so that FSMonitor sees a create and delete
+	# FSEvent for each.  (A simple "mv foo FOO" is not portable
+	# between macOS and Windows. It works on both platforms, but makes
+	# the test messy, since (1) one platform updates "ctime" on the
+	# moved file and one does not and (2) it causes a directory event
+	# on one platform and not on the other which causes additional
+	# scanning during "git status" which causes a "H" vs "h" discrepancy
+	# in "git ls-files -f".)  So old-school it and move it out of the
+	# way and copy it to the case-incorrect name so that we get fresh
+	# "ctime" and "mtime" values.
+
+	mv file_case_wrong/dir1/dir2/dir3/file-3-a file_case_wrong/dir1/dir2/dir3/ORIG &&
+	cp file_case_wrong/dir1/dir2/dir3/ORIG     file_case_wrong/dir1/dir2/dir3/FILE-3-A &&
+	rm file_case_wrong/dir1/dir2/dir3/ORIG &&
+	mv file_case_wrong/dir1/dir2/dir4/FILE-4-A file_case_wrong/dir1/dir2/dir4/ORIG &&
+	cp file_case_wrong/dir1/dir2/dir4/ORIG     file_case_wrong/dir1/dir2/dir4/file-4-a &&
+	rm file_case_wrong/dir1/dir2/dir4/ORIG &&
+
+	# Run status enough times to fully sync.
+	#
+	# The first instance should get the create and delete FSEvents
+	# for each pair.  Status should update the index with a new FSM
+	# token (so the next invocation will not see data for these
+	# events).
+
+	GIT_TRACE_FSMONITOR="$PWD/file_case_wrong-try1.log" \
+		git -C file_case_wrong status --short \
+			>"$PWD/file_case_wrong-try1.out" &&
+	grep -q "fsmonitor_refresh_callback.*FILE-3-A.*pos -3" "$PWD/file_case_wrong-try1.log" &&
+	grep -q "fsmonitor_refresh_callback.*file-3-a.*pos 4"  "$PWD/file_case_wrong-try1.log" &&
+	grep -q "fsmonitor_refresh_callback.*FILE-4-A.*pos 6"  "$PWD/file_case_wrong-try1.log" &&
+	grep -q "fsmonitor_refresh_callback.*file-4-a.*pos -9" "$PWD/file_case_wrong-try1.log" &&
+
+	# FSM refresh will have invalidated the FSM bit and cause a regular
+	# (real) scan of these tracked files, so they should have "H" status.
+	# (We will not see a "h" status until the next refresh (on the next
+	# command).)
+
+	git -C file_case_wrong ls-files -f >"$PWD/file_case_wrong-lsf1.out" &&
+	grep -q "H dir1/dir2/dir3/file-3-a" "$PWD/file_case_wrong-lsf1.out" &&
+	grep -q "H dir1/dir2/dir4/FILE-4-A" "$PWD/file_case_wrong-lsf1.out" &&
+
+
+	# Try the status again. We assume that the above status command
+	# advanced the token so that the next one will not see those events.
+
+	GIT_TRACE_FSMONITOR="$PWD/file_case_wrong-try2.log" \
+		git -C file_case_wrong status --short \
+			>"$PWD/file_case_wrong-try2.out" &&
+	! grep -q "fsmonitor_refresh_callback.*FILE-3-A.*pos" "$PWD/file_case_wrong-try2.log" &&
+	! grep -q "fsmonitor_refresh_callback.*file-3-a.*pos" "$PWD/file_case_wrong-try2.log" &&
+	! grep -q "fsmonitor_refresh_callback.*FILE-4-A.*pos" "$PWD/file_case_wrong-try2.log" &&
+	! grep -q "fsmonitor_refresh_callback.*file-4-a.*pos" "$PWD/file_case_wrong-try2.log" &&
+
+	# FSM refresh saw nothing, so it will mark all files as valid,
+	# so they should now have "h" status.
+
+	git -C file_case_wrong ls-files -f >"$PWD/file_case_wrong-lsf2.out" &&
+	grep -q "h dir1/dir2/dir3/file-3-a" "$PWD/file_case_wrong-lsf2.out" &&
+	grep -q "h dir1/dir2/dir4/FILE-4-A" "$PWD/file_case_wrong-lsf2.out" &&
+
+
+	# We now have files with clean content, but with case-incorrect
+	# file names.  Modify them to see if status properly reports
+	# them.
+
+	echo xx >>file_case_wrong/dir1/dir2/dir3/FILE-3-A &&
+	echo xx >>file_case_wrong/dir1/dir2/dir4/file-4-a &&
+
+	GIT_TRACE_FSMONITOR="$PWD/file_case_wrong-try3.log" \
+		git -C file_case_wrong --no-optional-locks status --short \
+			>"$PWD/file_case_wrong-try3.out" &&
+	# FSEvents are in observed case.
+	grep -q "fsmonitor_refresh_callback.*FILE-3-A.*pos -3" "$PWD/file_case_wrong-try3.log" &&
+	grep -q "fsmonitor_refresh_callback.*file-4-a.*pos -9" "$PWD/file_case_wrong-try3.log" &&
+
+	# Expect Breakage: with the case confusion, the "(pos-3)" and
+	# "(pos -9)" causes the client to not clear the CE_FSMONITOR_VALID
+	# bit and therefore status will not rescan the files and therefore
+	# not report them as dirty.
+	grep -q " M dir1/dir2/dir3/file-3-a" "$PWD/file_case_wrong-try3.out" &&
+	grep -q " M dir1/dir2/dir4/FILE-4-A" "$PWD/file_case_wrong-try3.out"
+'
+
 test_done
-- 
gitgitgadget


  parent reply	other threads:[~2024-02-26 21:39 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13 20:52 [PATCH 00/12] FSMonitor edge cases on case-insensitive file systems Jeff Hostetler via GitGitGadget
2024-02-13 20:52 ` [PATCH 01/12] sparse-index: pass string length to index_file_exists() Jeff Hostetler via GitGitGadget
2024-02-13 22:07   ` Junio C Hamano
2024-02-20 17:34     ` Jeff Hostetler
2024-02-13 20:52 ` [PATCH 02/12] name-hash: add index_dir_exists2() Jeff Hostetler via GitGitGadget
2024-02-13 21:43   ` Junio C Hamano
2024-02-20 17:38     ` Jeff Hostetler
2024-02-20 19:34       ` Junio C Hamano
2024-02-15  9:31   ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 03/12] t7527: add case-insensitve test for FSMonitor Jeff Hostetler via GitGitGadget
2024-02-13 20:52 ` [PATCH 04/12] fsmonitor: refactor refresh callback on directory events Jeff Hostetler via GitGitGadget
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-20 18:54     ` Jeff Hostetler
2024-02-21 12:54       ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 05/12] fsmonitor: refactor refresh callback for non-directory events Jeff Hostetler via GitGitGadget
2024-02-14  1:34   ` Junio C Hamano
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 06/12] fsmonitor: clarify handling of directory events in callback Jeff Hostetler via GitGitGadget
2024-02-14  7:47   ` Junio C Hamano
2024-02-20 18:56     ` Jeff Hostetler
2024-02-20 19:24       ` Junio C Hamano
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-20 19:10     ` Jeff Hostetler
2024-02-13 20:52 ` [PATCH 07/12] fsmonitor: refactor untracked-cache invalidation Jeff Hostetler via GitGitGadget
2024-02-14 16:46   ` Junio C Hamano
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 08/12] fsmonitor: support case-insensitive directory events Jeff Hostetler via GitGitGadget
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 09/12] fsmonitor: refactor non-directory callback Jeff Hostetler via GitGitGadget
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 10/12] fsmonitor: support case-insensitive non-directory events Jeff Hostetler via GitGitGadget
2024-02-13 20:52 ` [PATCH 11/12] fsmonitor: refactor bit invalidation in refresh callback Jeff Hostetler via GitGitGadget
2024-02-15  9:32   ` Patrick Steinhardt
2024-02-13 20:52 ` [PATCH 12/12] t7527: update case-insenstive fsmonitor test Jeff Hostetler via GitGitGadget
2024-02-23  3:18 ` [PATCH v2 00/16] FSMonitor edge cases on case-insensitive file systems Jeff Hostetler via GitGitGadget
2024-02-23  3:18   ` [PATCH v2 01/16] name-hash: add index_dir_find() Jeff Hostetler via GitGitGadget
2024-02-23  6:37     ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 02/16] t7527: add case-insensitve test for FSMonitor Jeff Hostetler via GitGitGadget
2024-02-23  3:18   ` [PATCH v2 03/16] t7527: temporarily disable case-insensitive tests Jeff Hostetler via GitGitGadget
2024-02-23  8:17     ` Junio C Hamano
2024-02-26 17:12       ` Jeff Hostetler
2024-02-23  3:18   ` [PATCH v2 04/16] fsmonitor: refactor refresh callback on directory events Jeff Hostetler via GitGitGadget
2024-02-23  8:18     ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 05/16] fsmonitor: clarify handling of directory events in callback helper Jeff Hostetler via GitGitGadget
2024-02-23  3:18   ` [PATCH v2 06/16] fsmonitor: refactor refresh callback for non-directory events Jeff Hostetler via GitGitGadget
2024-02-23  8:18     ` Junio C Hamano
2024-02-25 12:30     ` Torsten Bögershausen
2024-02-25 17:24       ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 07/16] dir: create untracked_cache_invalidate_trimmed_path() Jeff Hostetler via GitGitGadget
2024-02-25 12:35     ` Torsten Bögershausen
2024-02-23  3:18   ` [PATCH v2 08/16] fsmonitor: refactor untracked-cache invalidation Jeff Hostetler via GitGitGadget
2024-02-23  3:18   ` [PATCH v2 09/16] fsmonitor: move untracked invalidation into helper functions Jeff Hostetler via GitGitGadget
2024-02-23 17:36     ` Junio C Hamano
2024-02-26 18:45       ` Jeff Hostetler
2024-02-23  3:18   ` [PATCH v2 10/16] fsmonitor: return invalidated cache-entry count on directory event Jeff Hostetler via GitGitGadget
2024-02-23  3:18   ` [PATCH v2 11/16] fsmonitor: remove custom loop from non-directory path handler Jeff Hostetler via GitGitGadget
2024-02-23 17:47     ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 12/16] fsmonitor: return invalided cache-entry count on non-directory event Jeff Hostetler via GitGitGadget
2024-02-23 17:51     ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 13/16] fsmonitor: trace the new invalidated cache-entry count Jeff Hostetler via GitGitGadget
2024-02-23 17:53     ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 14/16] fsmonitor: support case-insensitive events Jeff Hostetler via GitGitGadget
2024-02-23 18:14     ` Junio C Hamano
2024-02-26 20:41       ` Jeff Hostetler
2024-02-26 21:18         ` Junio C Hamano
2024-02-25 13:10     ` Torsten Bögershausen
2024-02-26 20:47       ` Jeff Hostetler
2024-02-23  3:18   ` [PATCH v2 15/16] fsmonitor: refactor bit invalidation in refresh callback Jeff Hostetler via GitGitGadget
2024-02-23 18:18     ` Junio C Hamano
2024-02-23  3:18   ` [PATCH v2 16/16] t7527: update case-insenstive fsmonitor test Jeff Hostetler via GitGitGadget
2024-02-26 21:39   ` [PATCH v3 00/14] FSMonitor edge cases on case-insensitive file systems Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 01/14] name-hash: add index_dir_find() Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` Jeff Hostetler via GitGitGadget [this message]
2024-02-26 21:39     ` [PATCH v3 03/14] fsmonitor: refactor refresh callback on directory events Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 04/14] fsmonitor: clarify handling of directory events in callback helper Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 05/14] fsmonitor: refactor refresh callback for non-directory events Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 06/14] dir: create untracked_cache_invalidate_trimmed_path() Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 07/14] fsmonitor: refactor untracked-cache invalidation Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 08/14] fsmonitor: move untracked-cache invalidation into helper functions Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 09/14] fsmonitor: return invalidated cache-entry count on directory event Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 10/14] fsmonitor: remove custom loop from non-directory path handler Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 11/14] fsmonitor: return invalided cache-entry count on non-directory event Jeff Hostetler via GitGitGadget
2024-03-06 12:58       ` Patrick Steinhardt
2024-02-26 21:39     ` [PATCH v3 12/14] fsmonitor: trace the new invalidated cache-entry count Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 13/14] fsmonitor: refactor bit invalidation in refresh callback Jeff Hostetler via GitGitGadget
2024-02-26 21:39     ` [PATCH v3 14/14] fsmonitor: support case-insensitive events Jeff Hostetler via GitGitGadget
2024-03-06 12:58       ` Patrick Steinhardt
2024-02-27  1:40     ` [PATCH v3 00/14] FSMonitor edge cases on case-insensitive file systems Junio C Hamano
2024-03-06 12:58     ` Patrick Steinhardt
2024-03-06 17:09       ` Junio C Hamano
2024-03-06 18:10       ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=beeebf5596385cf72f7aa566fd277da738e627fd.1708983566.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhostetler@github.com \
    --cc=ps@pks.im \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.