git.vger.kernel.org archive mirror
* Committing crimes with NTFS-3G
From: Roman Sandu @ 2024-08-29 20:43 UTC (permalink / raw)
  To: git

Good day!

I have a decently sized (80K files) monorepo on an NTFS drive that I 
have been working with for a while under Windows via git-for-windows. 
Recently, I had to (temporarily) switch to Ubuntu (24.04) via dual-boot 
for irrelevant reasons, and I decided that simply mounting my NTFS drive 
and using the monorepo from Ubuntu is a great idea, actually, as NTFS-3G 
allows for seamless interop with NTFS via UserMapping. And so that is 
exactly what I did and It Just Works!

Except it kind of does not. Every time I run `git status` it takes 8 
seconds, which is very painful when doing tricky history rewriting.

To diagnose the problem, I ran git status with GIT_TRACE_PERFORMANCE 
enabled, and what I see is that the "refresh index" region is taking up 
99% of the time. Digging further, `strace -fc git status` tells me that 
99% of the time is spent on newfstatat'ing files. Okay, makes sense, 
stat'ing files through FUSE is not all that quick. But how many files 
are we talking about? My repository has `feature.manyFiles` enabled in 
git, so I would expect `core.untrackedCache` to make it so that `git 
status` skips basically everything except for the root folder which 
contains, what, 20 subfolders? But it actually does >96K stat calls! 
Which is more than the total number of files in the repository. 
Briefly looking at the output of `strace -f git status`, I see that git 
indeed goes through basically all of the repository, even things that 
have not changed for years, as if `core.untrackedCache` is not actually 
enabled. Manually enabling it on top of `feature.manyFiles` does not 
help. Note that `git update-index --test-untracked-cache` tells me that 
mtime does indeed work, and I've also manually stat'ed some folders 
which `git status` re-stats on every run and I see that the modify time 
is indeed a couple of hours ago, yet even when running `git status` 
several times in a row it re-scans the entire folder every time.
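
To see where those stat calls land, the saved `strace -f` output can be tallied per top-level folder. The following is a minimal sketch, not part of the original report: the function name `tally_stat_calls`, the `depth` parameter, and the log file name in the comment are all hypothetical, and the regular expression assumes the usual `newfstatat(dirfd, "path", ...)` line layout that strace prints.

```python
import re
import sys
from collections import Counter

# Matches lines such as:
#   1234 newfstatat(AT_FDCWD, "src/lib/a.c", {...}, AT_SYMLINK_NOFOLLOW) = 0
# The path is the second argument, printed in double quotes.
STAT_CALL = re.compile(r'\bnewfstatat\([^,]+,\s*"((?:[^"\\]|\\.)*)"')


def tally_stat_calls(lines, depth=1):
    """Count newfstatat calls, grouped by the first `depth` path components."""
    counts = Counter()
    for line in lines:
        match = STAT_CALL.search(line)
        if match is None:
            continue  # some other syscall, or an unfinished/resumed line
        path = match.group(1)
        parts = [p for p in path.split("/") if p not in ("", ".")]
        # Paths with no usable components ("", ".", "/") are kept as-is.
        key = "/".join(parts[:depth]) if parts else (path or "<empty>")
        counts[key] += 1
    return counts


# Hypothetical usage:
#   strace -f -o status.strace git status
#   python3 tally_stat.py status.strace
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for prefix, n in tally_stat_calls(fh).most_common(20):
            print(f"{n:8d}  {prefix}")
```

Comparing the per-folder counts with the number of tracked files in each folder would show whether the calls are concentrated on tracked files or spread across everything on disk.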

So, what do I do about this? It honestly looks like a git bug to me: 
maybe it silently fails to update the index with new timestamps because 
it was initially created on Windows? But I have no clue how to narrow 
this issue down further, so any ideas or suggestions would be appreciated!
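
One possible way to narrow this down is to check whether the stat data reported through the NTFS-3G mount stays stable between runs, since git decides what to re-examine by comparing stat fields against cached values. The sketch below is only an illustration under that assumption: `snapshot`, `unstable_fields`, and the `FIELDS` tuple are hypothetical names, and the exact set of fields git compares depends on settings such as `core.trustctime` and `core.checkStat`.

```python
import os

# Stat fields that git may compare against cached values; the exact set
# depends on configuration, so this tuple is an assumption for illustration.
FIELDS = ("st_mtime_ns", "st_ctime_ns", "st_ino", "st_dev",
          "st_uid", "st_gid", "st_size")


def snapshot(paths):
    """Return {path: {field: value}} using lstat, so symlinks are not followed."""
    result = {}
    for path in paths:
        st = os.lstat(path)
        result[path] = {name: getattr(st, name) for name in FIELDS}
    return result


def unstable_fields(before, after):
    """Return {path: [fields whose value differs between the two snapshots]}."""
    changed = {}
    for path, old in before.items():
        new = after.get(path)
        if new is None:
            continue  # path missing from the second snapshot
        diff = [name for name in FIELDS if old[name] != new[name]]
        if diff:
            changed[path] = diff
    return changed
```

Taking one snapshot of a few untouched files and folders, running `git status` (or remounting the drive), and then taking a second snapshot would show whether any field changes without the content being modified.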

Regards,
Roman Sandu


Thread overview (14+ messages):
2024-08-29 20:43 Committing crimes with NTFS-3G Roman Sandu
2024-08-30  0:47 ` brian m. carlson
2024-08-30 12:52   ` Roman Sandu
2024-08-30 15:02     ` brian m. carlson
2024-08-30 19:25       ` Roman Sandu
2024-08-30 15:55         ` brian m. carlson
2024-08-30 22:00           ` Roman Sandu
2024-08-30  4:18 ` Vito Caputo
2024-08-30  4:58 ` Johannes Sixt
2024-08-30 12:41   ` Roman Sandu
2024-08-30 16:28   ` Junio C Hamano
2024-09-03 15:58     ` Torsten Bögershausen
2024-09-03 17:30       ` Roman Sandu
2024-09-05 19:51         ` Torsten Bögershausen
