* [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 @ 2008-07-21 17:35 Alex Riesen 2008-07-21 18:20 ` Johannes Schindelin 2008-07-22 7:17 ` Johannes Sixt 0 siblings, 2 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-21 17:35 UTC (permalink / raw) To: git; +Cc: Junio C Hamano For example - Cygwin. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- Can MSys folks please try it? I noticed it when the test t2103-update-index-ignore-missing.sh (the 5th case) started failing. read-cache.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/read-cache.c b/read-cache.c index a50a851..eb30c20 100644 --- a/read-cache.c +++ b/read-cache.c @@ -281,7 +281,7 @@ int ie_modified(const struct index_state *istate, * the length field is zero. For other cases the ce_size * should match the SHA1 recorded in the index entry. */ - if ((changed & DATA_CHANGED) && ce->ce_size != 0) + if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode))) return changed; changed_fs = ce_modified_check_fs(ce, st); -- 1.5.6.4.452.g7b2a ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-21 17:35 [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 Alex Riesen @ 2008-07-21 18:20 ` Johannes Schindelin 2008-07-21 19:43 ` Alex Riesen 2008-07-22 7:17 ` Johannes Sixt 1 sibling, 1 reply; 41+ messages in thread From: Johannes Schindelin @ 2008-07-21 18:20 UTC (permalink / raw) To: Alex Riesen; +Cc: git, Junio C Hamano Hi, On Mon, 21 Jul 2008, Alex Riesen wrote: > For example - Cygwin. Please enhance: your oneline is too long, and your commit message body too short. > Can MSys folks please try it? I noticed it when the test > t2103-update-index-ignore-missing.sh (the 5th case) started failing. Since M$' documentation says "This member does not have a meaning for directories." about the member nFileSizeLow of WIN32_FILE_ATTRIBUTE_DATA which we use to implement a sane "lstat()", I think this bug hits MinGW (not MSys) as well. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-21 18:20 ` Johannes Schindelin @ 2008-07-21 19:43 ` Alex Riesen 2008-07-21 23:26 ` Johannes Schindelin 2008-07-22 8:07 ` Junio C Hamano 0 siblings, 2 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-21 19:43 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Junio C Hamano Johannes Schindelin, Mon, Jul 21, 2008 20:20:43 +0200: > Hi, > > On Mon, 21 Jul 2008, Alex Riesen wrote: > > > For example - Cygwin. > > Please enhance: your oneline is too long, and your commit message body too > short. Well, I'm really not sure. I just found this difference between linux and cygwin (st_stat is 0 for dirs on cygwin). Than I noticed that the routine where I made the change explicitely checks for st_size not being 0. I must admit I can't make much out of comment, and hope this discussion will help to clear the check up. > > Can MSys folks please try it? I noticed it when the test > > t2103-update-index-ignore-missing.sh (the 5th case) started failing. > > Since M$' documentation says "This member does not have a meaning for > directories." about the member nFileSizeLow of WIN32_FILE_ATTRIBUTE_DATA > which we use to implement a sane "lstat()", I think this bug hits MinGW > (not MSys) as well. Could you, just for completeness, try? I don't have mingw at hand and SUSv3 (http://www.opengroup.org/onlinepubs/009695399/toc.htm) does not tells much too. No UNIX system I know about has it 0 for directories. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-21 19:43 ` Alex Riesen @ 2008-07-21 23:26 ` Johannes Schindelin 2008-07-22 8:07 ` Junio C Hamano 1 sibling, 0 replies; 41+ messages in thread From: Johannes Schindelin @ 2008-07-21 23:26 UTC (permalink / raw) To: Alex Riesen; +Cc: git, Junio C Hamano Hi, On Mon, 21 Jul 2008, Alex Riesen wrote: > Johannes Schindelin, Mon, Jul 21, 2008 20:20:43 +0200: > > > Alex wrote: > > > > > Can MSys folks please try it? I noticed it when the test > > > t2103-update-index-ignore-missing.sh (the 5th case) started failing. > > > > Since M$' documentation says "This member does not have a meaning for > > directories." about the member nFileSizeLow of > > WIN32_FILE_ATTRIBUTE_DATA which we use to implement a sane "lstat()", > > I think this bug hits MinGW (not MSys) as well. > > Could you, just for completeness, try? No, I cannot. I have no Windows machine. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-21 19:43 ` Alex Riesen 2008-07-21 23:26 ` Johannes Schindelin @ 2008-07-22 8:07 ` Junio C Hamano 2008-07-22 16:49 ` Alex Riesen 1 sibling, 1 reply; 41+ messages in thread From: Junio C Hamano @ 2008-07-22 8:07 UTC (permalink / raw) To: Alex Riesen; +Cc: Johannes Schindelin, git Alex Riesen <raa.lkml@gmail.com> writes: > Johannes Schindelin, Mon, Jul 21, 2008 20:20:43 +0200: >> Hi, >> >> On Mon, 21 Jul 2008, Alex Riesen wrote: >> >> > For example - Cygwin. >> >> Please enhance: your oneline is too long, and your commit message body too >> short. > > Well, I'm really not sure. I just found this difference between linux > and cygwin (st_stat is 0 for dirs on cygwin). Than I noticed that the > routine where I made the change explicitely checks for st_size not > being 0. I must admit I can't make much out of comment, and hope this > discussion will help to clear the check up. The cached stat information in the index is used to speed up comparison between "the last staged data" and what is in the working tree. ie_match_stat() compares ce_xxx fields with the result from lstat(2) we just did, and if there are differences, we take it as a sign that what's in the working tree is different from what we saw when we updated the index entry. But there is a twist. Ordinarily, when an entry enteres the index, the hash of the blob contents goes along with the lstat(2) information taken from the file that supplied the contents. However there are some cases we populate the index without lstat(2). update-index --cacheinfo, update-index --index-info are two examples, and when they add index entries, they leave ce_size field to zero. ie_match_stat() will compare that zero ce_size with the size information obtained from the working tree, and declare (falsely) that "what's in the working tree is different -- it can never match, and there is no point trying to re-index to see if they actually match", even though the reason ce_size is zero is *not* because we observed the size of the working tree file *was* zero when we indexed it the last time (it is zero merely because we haven't looked at it yet). The ce_modified_check_fs() call is there to deal with this "we cannot trust the ce_xxx fields" case. I however have to wonder if you also need to touch the end of ce_match_stat_basic() that checks for zero sized cache entry. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-22 8:07 ` Junio C Hamano @ 2008-07-22 16:49 ` Alex Riesen 0 siblings, 0 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-22 16:49 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, git Junio C Hamano, Tue, Jul 22, 2008 10:07:49 +0200: > I however have to wonder if you also need to touch the end of > ce_match_stat_basic() that checks for zero sized cache entry. I frankly don't know. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-21 17:35 [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 Alex Riesen 2008-07-21 18:20 ` Johannes Schindelin @ 2008-07-22 7:17 ` Johannes Sixt 2008-07-22 16:46 ` Alex Riesen 2008-07-22 17:28 ` Junio C Hamano 1 sibling, 2 replies; 41+ messages in thread From: Johannes Sixt @ 2008-07-22 7:17 UTC (permalink / raw) To: Alex Riesen, Junio C Hamano; +Cc: git Alex Riesen schrieb: > Can MSys folks please try it? I noticed it when the test > t2103-update-index-ignore-missing.sh (the 5th case) started failing. I tested it. mingw.git does suffer from the problem, and this fixes it. But! > + if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode))) Does this mean that ce->ce_size is non-zero for gitlinks, at least on Unix? Is this value useful in anyway? I don't think so. Then it shouldn't be a random value that lstat() happens to return. -- Hannes ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-22 7:17 ` Johannes Sixt @ 2008-07-22 16:46 ` Alex Riesen 2008-07-22 16:59 ` Johannes Sixt 2008-07-22 17:28 ` Junio C Hamano 1 sibling, 1 reply; 41+ messages in thread From: Alex Riesen @ 2008-07-22 16:46 UTC (permalink / raw) To: Johannes Sixt; +Cc: Junio C Hamano, git Johannes Sixt, Tue, Jul 22, 2008 09:17:16 +0200: > Alex Riesen schrieb: > > Can MSys folks please try it? I noticed it when the test > > t2103-update-index-ignore-missing.sh (the 5th case) started failing. > > I tested it. mingw.git does suffer from the problem, and this fixes it. > Yes, I did too (at work). > > + if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode))) > > Does this mean that ce->ce_size is non-zero for gitlinks, at least on > Unix? It is non-zero for directories (which is what gitlinks are in working directories) on UNIX operating systems I met. > Is this value useful in anyway? Sometimes it is (the size a directory takes on storage) > I don't think so. Then it shouldn't be a random value that lstat() > happens to return. The problem is: it is not random. I even suspect that Windows is the ONLY system which has st_size 0 for directories. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-22 16:46 ` Alex Riesen @ 2008-07-22 16:59 ` Johannes Sixt 0 siblings, 0 replies; 41+ messages in thread From: Johannes Sixt @ 2008-07-22 16:59 UTC (permalink / raw) To: Alex Riesen; +Cc: Junio C Hamano, git Alex Riesen schrieb: > Johannes Sixt, Tue, Jul 22, 2008 09:17:16 +0200: >> Alex Riesen schrieb: >>> + if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode))) >> Does this mean that ce->ce_size is non-zero for gitlinks, at least on >> Unix? > > It is non-zero for directories (which is what gitlinks are in working > directories) on UNIX operating systems I met. > >> Is this value useful in anyway? > > Sometimes it is (the size a directory takes on storage) Sure; but is ce->ce_size of gitlinks useful? -- Hannes ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 2008-07-22 7:17 ` Johannes Sixt 2008-07-22 16:46 ` Alex Riesen @ 2008-07-22 17:28 ` Junio C Hamano 2008-07-22 19:39 ` [PATCH] Build configuration to skip ctime for modification test Alex Riesen 1 sibling, 1 reply; 41+ messages in thread From: Junio C Hamano @ 2008-07-22 17:28 UTC (permalink / raw) To: Johannes Sixt; +Cc: Alex Riesen, git Johannes Sixt <j.sixt@viscovery.net> writes: >> + if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode))) > > Does this mean that ce->ce_size is non-zero for gitlinks, at least on > Unix? Is this value useful in anyway? I don't think so. Then it shouldn't > be a random value that lstat() happens to return. These ce_xxx fields are the values we read from lstat(2) when the user told us to stage that working tree entity, be it a regular file, a symlink, or a directory that is a submodule. The only thing required for them is that they are stable (i.e. if you haven't touched the working tree entity, the value stays the same), and changes across modification. The value itself does not have to "mean" anything. When trying to see if the user has changes in the working tree entity since the last such staging of the path, we compare that value with what comes back from lstat(2), before actually comparing the contents. If the filesize changed, they cannot be the same and the code says you have modified it without having to look at the contents. Side note. This is why you need to be careful after modifying autocrlf related configuration and attributes. If you had CRLF contents in the working tree that was incorrectly staged as-is, then switch autocrlf-on, and "git add" to fix the staged copy to be LF-terminated, we say "it's unchanged and we do not bother rehashing" by comparing the ce_xxx fields without looking at the contents (this is an absolutely necessary optimization to make "git add ." usable), because ce_size records the size of CRLF version you have in the working tree, and you haven't changed the working tree file in this sequence above. Removing the file and checking things out would be the most straightforward solution in such a case. We used to include ce_dev (taken from struct stat.st_dev) in the list of fields to cache and compare to detect changes, but that is now excluded because it is not stable (see comments in read-cache.c). If the directory size is unstable, perhaps it would be better to force it to some fixed magic value so that it is not used by this "quick change detection" check. If you network-mount the same directory from POSIX and windows, the former may give "storage size of the directory" while the latter may give 0. This would mean that you would need a "update-index --refresh" when you switch between such machines. ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] Build configuration to skip ctime for modification test 2008-07-22 17:28 ` Junio C Hamano @ 2008-07-22 19:39 ` Alex Riesen 2008-07-22 20:17 ` Johannes Schindelin 0 siblings, 1 reply; 41+ messages in thread From: Alex Riesen @ 2008-07-22 19:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Sixt, git On Windows, the only file attribute we need (executable) cannot be used, so ctime can be ignored as well. Change time is updated when file attributes were changed (or it is written to, but in this case, mtime is updated as well). Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- Junio C Hamano, Tue, Jul 22, 2008 19:28:46 +0200: > Johannes Sixt <j.sixt@viscovery.net> writes: > > >> + if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode))) > > > > Does this mean that ce->ce_size is non-zero for gitlinks, at least on > > Unix? Is this value useful in anyway? I don't think so. Then it shouldn't > > be a random value that lstat() happens to return. > > These ce_xxx fields are the values we read from lstat(2) when the user > told us to stage that working tree entity, be it a regular file, a > symlink, or a directory that is a submodule. The only thing required for > them is that they are stable (i.e. if you haven't touched the working tree > entity, the value stays the same), and changes across modification. The > value itself does not have to "mean" anything. This reminds me... We can't use the only file attribute we care about on Windows, so we can as well skip check for ctime. Besides, Google Desktop Search keeps changing ctime when crawling files (ok, GDS is a major usability nuance anyway, but the point is - we don't use the file attribute). read-cache.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/read-cache.c b/read-cache.c index a50a851..c4f2718 100644 --- a/read-cache.c +++ b/read-cache.c @@ -181,8 +181,10 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) } if (ce->ce_mtime != (unsigned int) st->st_mtime) changed |= MTIME_CHANGED; +#ifndef NO_TRUSTABLE_FILEMODE if (ce->ce_ctime != (unsigned int) st->st_ctime) changed |= CTIME_CHANGED; +#endif if (ce->ce_uid != (unsigned int) st->st_uid || ce->ce_gid != (unsigned int) st->st_gid) -- 1.6.0.rc0.41.g70446 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-22 19:39 ` [PATCH] Build configuration to skip ctime for modification test Alex Riesen @ 2008-07-22 20:17 ` Johannes Schindelin 2008-07-22 20:31 ` Alex Riesen 0 siblings, 1 reply; 41+ messages in thread From: Johannes Schindelin @ 2008-07-22 20:17 UTC (permalink / raw) To: Alex Riesen; +Cc: Junio C Hamano, Johannes Sixt, git Hi, On Tue, 22 Jul 2008, Alex Riesen wrote: > +#ifndef NO_TRUSTABLE_FILEMODE > if (ce->ce_ctime != (unsigned int) st->st_ctime) > changed |= CTIME_CHANGED; > +#endif Surely you meant trust_executable_bit instead, right? Otherwise, if you really want to tell at compile time,I think for clarity you have to introduce another #define, since NO_TRUSTABLE_FILEMODE definitely says something different than CTIME_IS_USELESS. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-22 20:17 ` Johannes Schindelin @ 2008-07-22 20:31 ` Alex Riesen 2008-07-23 0:12 ` Junio C Hamano 2008-07-23 16:00 ` git svn throws locale related error when built from source Anton Mostovoy 0 siblings, 2 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-22 20:31 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Junio C Hamano, Johannes Sixt, git Johannes Schindelin, Tue, Jul 22, 2008 22:17:21 +0200: > Hi, > > On Tue, 22 Jul 2008, Alex Riesen wrote: > > > +#ifndef NO_TRUSTABLE_FILEMODE > > if (ce->ce_ctime != (unsigned int) st->st_ctime) > > changed |= CTIME_CHANGED; > > +#endif > > Surely you meant trust_executable_bit instead, right? No. Just what I said: we don't have filemode (like "at all") - so no ctime as well. But maybe you're right, and trust_executable_bit is more flexible. Or maybe both (the #ifdef _and_ trust_executable_bit) and must be used... > Otherwise, if you really want to tell at compile time,I think for clarity > you have to introduce another #define, since NO_TRUSTABLE_FILEMODE > definitely says something different than CTIME_IS_USELESS. I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than deemed it excessive. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-22 20:31 ` Alex Riesen @ 2008-07-23 0:12 ` Junio C Hamano 2008-07-23 16:46 ` Alex Riesen 2008-07-24 19:00 ` [PATCH] Do not use ctime if file mode is not used Alex Riesen 2008-07-23 16:00 ` git svn throws locale related error when built from source Anton Mostovoy 1 sibling, 2 replies; 41+ messages in thread From: Junio C Hamano @ 2008-07-23 0:12 UTC (permalink / raw) To: Alex Riesen; +Cc: Johannes Schindelin, Johannes Sixt, git Alex Riesen <raa.lkml@gmail.com> writes: > Johannes Schindelin, Tue, Jul 22, 2008 22:17:21 +0200: >> Hi, >> >> On Tue, 22 Jul 2008, Alex Riesen wrote: >> >> > +#ifndef NO_TRUSTABLE_FILEMODE >> > if (ce->ce_ctime != (unsigned int) st->st_ctime) >> > changed |= CTIME_CHANGED; >> > +#endif >> >> Surely you meant trust_executable_bit instead, right? > > No. Just what I said: we don't have filemode (like "at all") - so no > ctime as well. But maybe you're right, and trust_executable_bit is > more flexible. Or maybe both (the #ifdef _and_ trust_executable_bit) > and must be used... > >> Otherwise, if you really want to tell at compile time,I think for clarity >> you have to introduce another #define, since NO_TRUSTABLE_FILEMODE >> definitely says something different than CTIME_IS_USELESS. > > I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than > deemed it excessive. Why is it excessive? My initial reaction was "what does trustable filemode nor trust_executable_bit has anything to do with ctime". Please explain. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-23 0:12 ` Junio C Hamano @ 2008-07-23 16:46 ` Alex Riesen 2008-07-23 16:59 ` Johannes Schindelin 2008-07-24 19:00 ` [PATCH] Do not use ctime if file mode is not used Alex Riesen 1 sibling, 1 reply; 41+ messages in thread From: Alex Riesen @ 2008-07-23 16:46 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, Johannes Sixt, git Junio C Hamano, Wed, Jul 23, 2008 02:12:50 +0200: > >> Otherwise, if you really want to tell at compile time,I think for clarity > >> you have to introduce another #define, since NO_TRUSTABLE_FILEMODE > >> definitely says something different than CTIME_IS_USELESS. > > > > I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than > > deemed it excessive. > > Why is it excessive? My initial reaction was "what does trustable > filemode nor trust_executable_bit has anything to do with ctime". Please > explain. Because exactly the file mode (the executable bit) is the reason for checking ctime. Otherwise there is no point: Git doesn't save any other data which when changed cause a ctime update. And exactly the file mode is completely broken on that cygwin thing. Which is precisely pointed by NO_TRUSTABLE_FILEMODE. Hence - just it. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-23 16:46 ` Alex Riesen @ 2008-07-23 16:59 ` Johannes Schindelin 2008-07-23 19:16 ` Alex Riesen 0 siblings, 1 reply; 41+ messages in thread From: Johannes Schindelin @ 2008-07-23 16:59 UTC (permalink / raw) To: Alex Riesen; +Cc: Junio C Hamano, Johannes Sixt, git Hi, On Wed, 23 Jul 2008, Alex Riesen wrote: > Because exactly the file mode (the executable bit) is the reason for > checking ctime. But ctime is broken on Windows. Because ctime is supposed to change whenever the _inode_ changes. You have to admit that saying "I ignore the ctime because the executable bit is broken" must leave the reader puzzled. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-23 16:59 ` Johannes Schindelin @ 2008-07-23 19:16 ` Alex Riesen 2008-07-25 2:00 ` Linus Torvalds 0 siblings, 1 reply; 41+ messages in thread From: Alex Riesen @ 2008-07-23 19:16 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Junio C Hamano, Johannes Sixt, git Johannes Schindelin, Wed, Jul 23, 2008 18:59:02 +0200: > On Wed, 23 Jul 2008, Alex Riesen wrote: > > > Because exactly the file mode (the executable bit) is the reason for > > checking ctime. > > But ctime is broken on Windows. Because ctime is supposed to change > whenever the _inode_ changes. It is not that it is broken. We just don't need it, because the st_mode is not used, and the rest of inode information is not used anyway. > You have to admit that saying "I ignore the ctime because the executable > bit is broken" must leave the reader puzzled. This is conclusion. I said "file mode" and "file attributes", which is how reason for ctime update is defined in SUSv3. man 2 stat says: The field st_ctime is changed by writing or by setting inode information (i.e., owner, group, link count, mode, etc.). Not just "inode" but "inode information". Only mode is used, and even that is ignored on Windows. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-23 19:16 ` Alex Riesen @ 2008-07-25 2:00 ` Linus Torvalds 2008-07-25 5:55 ` Alex Riesen 0 siblings, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2008-07-25 2:00 UTC (permalink / raw) To: Alex Riesen; +Cc: Johannes Schindelin, Junio C Hamano, Johannes Sixt, git On Wed, 23 Jul 2008, Alex Riesen wrote: > > It is not that it is broken. We just don't need it, because the st_mode > is not used, and the rest of inode information is not used anyway. That is NOT why git checks the ctime. Git checks the ctime not because it cares about the inode state being modified per se: since it can see that _directly_ - so why should it care about inode state like st_mode? No, git checks ctime because it in general tries to make it VERY VERY hard for people to try to "fake out" git and replace files from underneath it without git noticing. It's much easier and much more common for tools to restore 'mtime' when they do some operation on a file than it is for them to restore 'ctime'. For example, if you rsync files between two hosts, the '-t' flag will make rsync try to keep the mtimes in sync (and it's part of '-a', which is the common form that you'd use for rsync). So if you only look at mtime and size, you often miss the fact that the file has actually been messed with! Looking at ctime gets around a number of those things. Of course, it can cause git to be _too_ eager in thinking that a file is modified, and an example of that is the insane indexing that 'beagle' does, which actually modifies the files by adding inode extended attributes to them and thus changes ctime due to the indexing. But in general it's a lot better to be too careful than to miss the fact that somebody changed the file. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-25 2:00 ` Linus Torvalds @ 2008-07-25 5:55 ` Alex Riesen 2008-07-26 0:57 ` Johannes Schindelin 0 siblings, 1 reply; 41+ messages in thread From: Alex Riesen @ 2008-07-25 5:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: Johannes Schindelin, Junio C Hamano, Johannes Sixt, git Linus Torvalds, Fri, Jul 25, 2008 04:00:29 +0200: > On Wed, 23 Jul 2008, Alex Riesen wrote: > > > > It is not that it is broken. We just don't need it, because the st_mode > > is not used, and the rest of inode information is not used anyway. > > That is NOT why git checks the ctime. > > Git checks the ctime not because it cares about the inode state being > modified per se: since it can see that _directly_ - so why should it care > about inode state like st_mode? > > No, git checks ctime because it in general tries to make it VERY VERY hard > for people to try to "fake out" git and replace files from underneath it > without git noticing. > > It's much easier and much more common for tools to restore 'mtime' when > they do some operation on a file than it is for them to restore 'ctime'. > > For example, if you rsync files between two hosts, the '-t' flag will make > rsync try to keep the mtimes in sync (and it's part of '-a', which is the > common form that you'd use for rsync). So if you only look at mtime and > size, you often miss the fact that the file has actually been messed with! > > Looking at ctime gets around a number of those things. Of course, it can > cause git to be _too_ eager in thinking that a file is modified, and an > example of that is the insane indexing that 'beagle' does, which actually > modifies the files by adding inode extended attributes to them and thus > changes ctime due to the indexing. But in general it's a lot better to be > too careful than to miss the fact that somebody changed the file. > But, given the fact, that somewhere sometimes its very-very annoying to have even one (un)changed file, something must be done about it. Maybe just direct # my .gitconfig for Windows machines with GDS [core] filemode = false trustctime = false logallrefupdates = false [pack] threads = 1 etc... ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Build configuration to skip ctime for modification test 2008-07-25 5:55 ` Alex Riesen @ 2008-07-26 0:57 ` Johannes Schindelin 2008-07-26 15:38 ` [PATCH] Make use of stat.ctime configurable Alex Riesen 0 siblings, 1 reply; 41+ messages in thread From: Johannes Schindelin @ 2008-07-26 0:57 UTC (permalink / raw) To: Alex Riesen; +Cc: Linus Torvalds, Junio C Hamano, Johannes Sixt, git Hi, On Fri, 25 Jul 2008, Alex Riesen wrote: > But, given the fact, that somewhere sometimes its very-very annoying to > have even one (un)changed file, something must be done about it. Maybe > just direct > > [...] > trustctime = false ... which is all Junio and I were asking all along: a separate way to ask for ignoring ctime; not just DWIM it on top of the executable bit. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] Make use of stat.ctime configurable 2008-07-26 0:57 ` Johannes Schindelin @ 2008-07-26 15:38 ` Alex Riesen 2008-07-27 19:46 ` Junio C Hamano 2008-07-27 19:46 ` Junio C Hamano 0 siblings, 2 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-26 15:38 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, Junio C Hamano, Johannes Sixt, git because there are situations where it produces too much false positives. Like when file system crawlers keep changing it when scanning and using the ctime for marking scanned files. The default is to allow use of ctime. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- Johannes Schindelin, Sat, Jul 26, 2008 02:57:25 +0200: > On Fri, 25 Jul 2008, Alex Riesen wrote: > > But, given the fact, that somewhere sometimes its very-very annoying to > > have even one (un)changed file, something must be done about it. Maybe > > just direct > > > > [...] > > trustctime = false > > ... which is all Junio and I were asking all along: a separate way to ask > for ignoring ctime; not just DWIM it on top of the executable bit. Oh... cache.h | 1 + config.c | 4 ++++ environment.c | 1 + read-cache.c | 2 +- 4 files changed, 7 insertions(+), 1 deletions(-) diff --git a/cache.h b/cache.h index 38985aa..00d02d3 100644 --- a/cache.h +++ b/cache.h @@ -421,6 +421,7 @@ extern int delete_ref(const char *, const unsigned char *sha1); /* Environment bits from configuration mechanism */ extern int trust_executable_bit; +extern int trust_file_ctime; extern int quote_path_fully; extern int has_symlinks; extern int ignore_case; diff --git a/config.c b/config.c index 1e066c7..860e8ab 100644 --- a/config.c +++ b/config.c @@ -341,6 +341,10 @@ static int git_default_core_config(const char *var, const char *value) trust_executable_bit = git_config_bool(var, value); return 0; } + if (!strcmp(var, "core.filectime")) { + trust_file_ctime = git_config_bool(var, value); + return 0; + } if (!strcmp(var, "core.quotepath")) { quote_path_fully = git_config_bool(var, value); diff --git a/environment.c b/environment.c index 4a88a17..4982771 100644 --- a/environment.c +++ b/environment.c @@ -13,6 +13,7 @@ char git_default_email[MAX_GITNAME]; char git_default_name[MAX_GITNAME]; int user_ident_explicitly_given; int trust_executable_bit = 1; +int trust_file_ctime = 1; int has_symlinks = 1; int ignore_case; int assume_unchanged; diff --git a/read-cache.c b/read-cache.c index a50a851..00d39dc 100644 --- a/read-cache.c +++ b/read-cache.c @@ -181,7 +181,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) } if (ce->ce_mtime != (unsigned int) st->st_mtime) changed |= MTIME_CHANGED; - if (ce->ce_ctime != (unsigned int) st->st_ctime) + if (trust_file_ctime && ce->ce_ctime != (unsigned int) st->st_ctime) changed |= CTIME_CHANGED; if (ce->ce_uid != (unsigned int) st->st_uid || -- 1.6.0.rc0.76.g581e ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-26 15:38 ` [PATCH] Make use of stat.ctime configurable Alex Riesen @ 2008-07-27 19:46 ` Junio C Hamano 2008-07-27 19:46 ` Junio C Hamano 1 sibling, 0 replies; 41+ messages in thread From: Junio C Hamano @ 2008-07-27 19:46 UTC (permalink / raw) To: Alex Riesen; +Cc: Johannes Schindelin, Linus Torvalds, Johannes Sixt, git Alex Riesen <raa.lkml@gmail.com> writes: > because there are situations where it produces too much false > positives. Like when file system crawlers keep changing it when > scanning and using the ctime for marking scanned files. This justification is good and I am very inclined to advocate for its inclusion in 1.6.0, but any new configuration needs to be in the documentation. It appears there is "gui.trustmtime"; shouldn't this be called "core.trustctime" or something? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-26 15:38 ` [PATCH] Make use of stat.ctime configurable Alex Riesen 2008-07-27 19:46 ` Junio C Hamano @ 2008-07-27 19:46 ` Junio C Hamano 2008-07-28 6:31 ` Alex Riesen 1 sibling, 1 reply; 41+ messages in thread From: Junio C Hamano @ 2008-07-27 19:46 UTC (permalink / raw) To: Alex Riesen; +Cc: Johannes Schindelin, Linus Torvalds, Johannes Sixt, git Alex Riesen <raa.lkml@gmail.com> writes: > because there are situations where it produces too much false > positives. Like when file system crawlers keep changing it when > scanning and using the ctime for marking scanned files. This justification is good and I am very inclined to advocate for its inclusion in 1.6.0, but any new configuration needs to be in the documentation. It appears there is "gui.trustmtime"; shouldn't this be called "core.trustctime" or something? ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] Make use of stat.ctime configurable 2008-07-27 19:46 ` Junio C Hamano @ 2008-07-28 6:31 ` Alex Riesen 2008-07-28 16:04 ` David Brown 2008-07-28 16:20 ` Petr Baudis 0 siblings, 2 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-28 6:31 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, Linus Torvalds, Johannes Sixt, git because there are situations where it produces too much false positives. Like when file system crawlers keep changing it when scanning and using the ctime for marking scanned files. The default is to allow use of ctime. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- Junio C Hamano, Sun, Jul 27, 2008 21:46:42 +0200: > Alex Riesen <raa.lkml@gmail.com> writes: > > > because there are situations where it produces too much false > > positives. Like when file system crawlers keep changing it when > > scanning and using the ctime for marking scanned files. > > This justification is good and I am very inclined to advocate for its > inclusion in 1.6.0, but any new configuration needs to be in the > documentation. Done. > It appears there is "gui.trustmtime"; shouldn't this be called > "core.trustctime" or something? Getting old... I even called the global flag trust_file_ctime! Corrected. Changed trust_file_ctime to trust_ctime. Documentation/config.txt | 7 +++++++ Documentation/git-update-index.txt | 5 +++++ cache.h | 1 + config.c | 4 ++++ environment.c | 1 + read-cache.c | 2 +- 6 files changed, 19 insertions(+), 1 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 1a13abc..552c134 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -149,6 +149,13 @@ core.safecrlf:: `core.autocrlf`, git will reject the file. The variable can be set to "warn", in which case git will only warn about an irreversible conversion but continue the operation. + +core.trustctime:: + If false, the ctime differences between the index and the + working copy are ignored; useful when the inode change time + is regularly modified by something outside Git (file system + crawlers and some backup systems). + See linkgit:git-update-index[1]. True by default. + CRLF conversion bears a slight chance of corrupting data. autocrlf=true will convert CRLF to LF during commit and LF to diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt index 6b930bc..1d9d81a 100644 --- a/Documentation/git-update-index.txt +++ b/Documentation/git-update-index.txt @@ -323,6 +323,11 @@ from symbolic link to regular file. The command looks at `core.ignorestat` configuration variable. See 'Using "assume unchanged" bit' section above. +The command also looks at `core.trustctime` configuration variable. +It can be useful when the inode change time is regularly modified by +something outside Git (file system crawlers and backup systems use +ctime for marking files processed) (see linkgit:git-config[1]). + SEE ALSO -------- diff --git a/cache.h b/cache.h index 4b6c0a6..2475de9 100644 --- a/cache.h +++ b/cache.h @@ -423,6 +423,7 @@ extern int delete_ref(const char *, const unsigned char *sha1); /* Environment bits from configuration mechanism */ extern int trust_executable_bit; +extern int trust_ctime; extern int quote_path_fully; extern int has_symlinks; extern int ignore_case; diff --git a/config.c b/config.c index 1e066c7..53f04a0 100644 --- a/config.c +++ b/config.c @@ -341,6 +341,10 @@ static int git_default_core_config(const char *var, const char *value) trust_executable_bit = git_config_bool(var, value); return 0; } + if (!strcmp(var, "core.trustctime")) { + trust_ctime = git_config_bool(var, value); + return 0; + } if (!strcmp(var, "core.quotepath")) { quote_path_fully = git_config_bool(var, value); diff --git a/environment.c b/environment.c index 4a88a17..0c6d11f 100644 --- a/environment.c +++ b/environment.c @@ -13,6 +13,7 @@ char git_default_email[MAX_GITNAME]; char git_default_name[MAX_GITNAME]; int user_ident_explicitly_given; int trust_executable_bit = 1; +int trust_ctime = 1; int has_symlinks = 1; int ignore_case; int assume_unchanged; diff --git a/read-cache.c b/read-cache.c index 6c08803..1cae361 100644 --- a/read-cache.c +++ b/read-cache.c @@ -197,7 +197,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) } if (ce->ce_mtime != (unsigned int) st->st_mtime) changed |= MTIME_CHANGED; - if (ce->ce_ctime != (unsigned int) st->st_ctime) + if (trust_ctime && ce->ce_ctime != (unsigned int) st->st_ctime) changed |= CTIME_CHANGED; if (ce->ce_uid != (unsigned int) st->st_uid || -- 1.6.0.rc0.76.g581e ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-28 6:31 ` Alex Riesen @ 2008-07-28 16:04 ` David Brown 2008-07-28 16:09 ` Linus Torvalds 2008-07-28 16:20 ` Petr Baudis 1 sibling, 1 reply; 41+ messages in thread From: David Brown @ 2008-07-28 16:04 UTC (permalink / raw) To: Alex Riesen Cc: Junio C Hamano, Johannes Schindelin, Linus Torvalds, Johannes Sixt, git On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote: >because there are situations where it produces too much false >positives. Like when file system crawlers keep changing it when >scanning and using the ctime for marking scanned files. That's interesting, since most backup software uses the ctime to determine file changes. David ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-28 16:04 ` David Brown @ 2008-07-28 16:09 ` Linus Torvalds 2008-07-28 21:49 ` Alex Riesen 2008-07-29 1:16 ` Junio C Hamano 0 siblings, 2 replies; 41+ messages in thread From: Linus Torvalds @ 2008-07-28 16:09 UTC (permalink / raw) To: David Brown Cc: Alex Riesen, Junio C Hamano, Johannes Schindelin, Johannes Sixt, git On Mon, 28 Jul 2008, David Brown wrote: > On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote: > > > because there are situations where it produces too much false > > positives. Like when file system crawlers keep changing it when > > scanning and using the ctime for marking scanned files. > > That's interesting, since most backup software uses the ctime to determine > file changes. It really is just Beagle that is (was? I can dream) a piece of unbelievable crap. Anybody who uses extended attributes as part of a indexing scheme is just insane. Modifying the file you are indexing is not just fundamentally wrong to begin with, but it will then also be incredibly inefficient to read those entries one at a time. And no other sane model would ever touch 'ctime'. Oh, well. Making ctime configurable is not wrong per se. But if it's Beagle that triggers this, the fix is sadly in the wrong place. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-28 16:09 ` Linus Torvalds @ 2008-07-28 21:49 ` Alex Riesen 2008-07-29 1:16 ` Junio C Hamano 1 sibling, 0 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-28 21:49 UTC (permalink / raw) To: Linus Torvalds Cc: David Brown, Junio C Hamano, Johannes Schindelin, Johannes Sixt, git Linus Torvalds, Mon, Jul 28, 2008 18:09:32 +0200: > It really is just Beagle that is (was? I can dream) a piece of > unbelievable crap. > > Anybody who uses extended attributes as part of a indexing scheme is just > insane. Modifying the file you are indexing is not just fundamentally > wrong to begin with, but it will then also be incredibly inefficient to > read those entries one at a time. > > And no other sane model would ever touch 'ctime'. Beagle is not alone. Google Desktop Search was mentioned before. > Oh, well. Making ctime configurable is not wrong per se. But if it's > Beagle that triggers this, the fix is sadly in the wrong place. Never said it was a fix. Same as CRLF conversion is not a feature. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-28 16:09 ` Linus Torvalds 2008-07-28 21:49 ` Alex Riesen @ 2008-07-29 1:16 ` Junio C Hamano 2008-07-29 1:23 ` Linus Torvalds 1 sibling, 1 reply; 41+ messages in thread From: Junio C Hamano @ 2008-07-29 1:16 UTC (permalink / raw) To: Linus Torvalds Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Mon, 28 Jul 2008, David Brown wrote: > >> On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote: >> >> > because there are situations where it produces too much false >> > positives. Like when file system crawlers keep changing it when >> > scanning and using the ctime for marking scanned files. >> >> That's interesting, since most backup software uses the ctime to determine >> file changes. > > It really is just Beagle that is (was? I can dream) a piece of > unbelievable crap. > > Anybody who uses extended attributes as part of a indexing scheme is just > insane. Modifying the file you are indexing is not just fundamentally > wrong to begin with, but it will then also be incredibly inefficient to > read those entries one at a time. It's a typo and you are saying it _is_ fundamentally wrong, aren't you? If you are prepared to pick up new files, you need to crawl everywhere anyway, so if the xattr is used to leave a mark "The last time I looked at this file was this" in the file itself, it does not sound too bad to me. It would be irritating that it touches ctime, though, but I do not use it so it is not my problem ;-) http://beagle-project.org/FAQ "Do I really need extended attributes?" talks about BEAGLE_DISABLE_XATTR environment variable and interestingly it says disabling use of xattr would slow you down. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:16 ` Junio C Hamano @ 2008-07-29 1:23 ` Linus Torvalds 2008-07-29 1:31 ` Junio C Hamano 2008-07-29 10:49 ` Johannes Schindelin 0 siblings, 2 replies; 41+ messages in thread From: Linus Torvalds @ 2008-07-29 1:23 UTC (permalink / raw) To: Junio C Hamano Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git On Mon, 28 Jul 2008, Junio C Hamano wrote: > > > > Anybody who uses extended attributes as part of a indexing scheme is just > > insane. Modifying the file you are indexing is not just fundamentally > > wrong to begin with, but it will then also be incredibly inefficient to > > read those entries one at a time. > > It's a typo and you are saying it _is_ fundamentally wrong, aren't you? Not a typo, and I'm sayin that "it's not _just_ fundamentally wrong" So yes, it's fundamentally wrong, but it's worse than that. It's fundamentally wrong _and_ it's inefficient as hell. > If you are prepared to pick up new files, you need to crawl everywhere > anyway, so if the xattr is used to leave a mark "The last time I looked at > this file was this" in the file itself, it does not sound too bad to me. It's absolutely horrible. It means that you have another extra indirection and accompanying disk seek to check the thing. It's a total performance nightmare. Trust me, anybody who uses extended attributes like this simply does not know what he is doing. > It would be irritating that it touches ctime, though, but I do not use it > http://beagle-project.org/FAQ "Do I really need extended attributes?" > talks about BEAGLE_DISABLE_XATTR environment variable and interestingly > it says disabling use of xattr would slow you down. They don't have a clue. They say that, but it's simply not true. Of course, the fact that they think it is probably implies that they did something EVEN MORE STUPID for the non-xattr case. That wouldn't surprise me at all. If I had to guess, I'd guess that they used an SQL database and query language, and did all their tests with hot caches too. The kernel does caching really well, and the kernel is fast as hell, so _of_course_ when you benchmark, using kernel data structures looks good, especially if you benchmark against code that isn't well written for the particular usage case. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:23 ` Linus Torvalds @ 2008-07-29 1:31 ` Junio C Hamano 2008-07-29 1:41 ` David Brown 2008-07-29 1:55 ` Linus Torvalds 2008-07-29 10:49 ` Johannes Schindelin 1 sibling, 2 replies; 41+ messages in thread From: Junio C Hamano @ 2008-07-29 1:31 UTC (permalink / raw) To: Linus Torvalds Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git Linus Torvalds <torvalds@linux-foundation.org> writes: > The kernel does caching really well, and the kernel is fast as hell, so > _of_course_ when you benchmark, using kernel data structures looks good, > especially if you benchmark against code that isn't well written for the > particular usage case. Ok. While I have your attention on st_ctime, let me ask you a stupid question. Why does "rename(old, new)" change st_ctime when you move a regular file? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:31 ` Junio C Hamano @ 2008-07-29 1:41 ` David Brown 2008-07-29 2:49 ` Junio C Hamano 2008-07-29 10:45 ` Johannes Schindelin 2008-07-29 1:55 ` Linus Torvalds 1 sibling, 2 replies; 41+ messages in thread From: David Brown @ 2008-07-29 1:41 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, Alex Riesen, Johannes Schindelin, Johannes Sixt, git On Mon, Jul 28, 2008 at 06:31:24PM -0700, Junio C Hamano wrote: >Linus Torvalds <torvalds@linux-foundation.org> writes: > >> The kernel does caching really well, and the kernel is fast as hell, so >> _of_course_ when you benchmark, using kernel data structures looks good, >> especially if you benchmark against code that isn't well written for the >> particular usage case. > >Ok. While I have your attention on st_ctime, let me ask you a stupid >question. Why does "rename(old, new)" change st_ctime when you move a >regular file? A simple answer might be that posix requires it. But, from the point of view of backup software, not updating the ctime on rename would be horrible, because you'd never know when files got renamed. David ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:41 ` David Brown @ 2008-07-29 2:49 ` Junio C Hamano 2008-07-29 10:45 ` Johannes Schindelin 1 sibling, 0 replies; 41+ messages in thread From: Junio C Hamano @ 2008-07-29 2:49 UTC (permalink / raw) To: David Brown Cc: Linus Torvalds, Alex Riesen, Johannes Schindelin, Johannes Sixt, git David Brown <git@davidb.org> writes: > On Mon, Jul 28, 2008 at 06:31:24PM -0700, Junio C Hamano wrote: >>Linus Torvalds <torvalds@linux-foundation.org> writes: >> >>> The kernel does caching really well, and the kernel is fast as >>> hell, so _of_course_ when you benchmark, using kernel data >>> structures looks good, especially if you benchmark against code >>> that isn't well written for the particular usage case. >> >>Ok. While I have your attention on st_ctime, let me ask you a stupid >>question. Why does "rename(old, new)" change st_ctime when you move a >>regular file? > > A simple answer might be that posix requires it. I would understand that an obvious implementation would be to link to new and then unlink the old, and the link count of the moved entity needs to change (although in the end, the increment and decrement would cancel out) in each step, so it would be convenient for the implementation to update ctime in both steps; however my reading of POSIX does not seem to require it. The only mention of ctime I find is about updating the parent directories of old and new, as contents of both change so do their mtime and ctime obviously need to change. But it does not talk about ctime of the entity being moved. Additionally, the only way rename(2) is described to fail with EMLINK is when renaming an directory and the parent of the new location cannot have any more links; which implies that it does not have to fail if the first step of link+unlink overflows the link count of old. Ah, I found in the informative Application Usage section that this is implementation dependent. Sorry for the noise. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:41 ` David Brown 2008-07-29 2:49 ` Junio C Hamano @ 2008-07-29 10:45 ` Johannes Schindelin 1 sibling, 0 replies; 41+ messages in thread From: Johannes Schindelin @ 2008-07-29 10:45 UTC (permalink / raw) To: David Brown Cc: Junio C Hamano, Linus Torvalds, Alex Riesen, Johannes Sixt, git Hi, On Mon, 28 Jul 2008, David Brown wrote: > On Mon, Jul 28, 2008 at 06:31:24PM -0700, Junio C Hamano wrote: > > > Why does "rename(old, new)" change st_ctime when you move a regular > >file? > > But, from the point of view of backup software, not updating the ctime > on rename would be horrible, because you'd never know when files got > renamed. Any backup software that does not discover that there is a new _filename_ is not worth the label "software". Rather "daftware" or somesuch. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:31 ` Junio C Hamano 2008-07-29 1:41 ` David Brown @ 2008-07-29 1:55 ` Linus Torvalds 2008-07-29 2:01 ` Linus Torvalds 1 sibling, 1 reply; 41+ messages in thread From: Linus Torvalds @ 2008-07-29 1:55 UTC (permalink / raw) To: Junio C Hamano Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git On Mon, 28 Jul 2008, Junio C Hamano wrote: > > Ok. While I have your attention on st_ctime, let me ask you a stupid > question. Why does "rename(old, new)" change st_ctime when you move a > regular file? Hmm. I think that's just a plain POSIX oddity. There's no real "reason" for it, except the historical one: in really old UNIX terms, rename used to be a "link+unlink". And that "link+unlink" updated ctime because the 'nlink' part of the inode changed. Never mind that it got changed right back ;) Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:55 ` Linus Torvalds @ 2008-07-29 2:01 ` Linus Torvalds 0 siblings, 0 replies; 41+ messages in thread From: Linus Torvalds @ 2008-07-29 2:01 UTC (permalink / raw) To: Junio C Hamano Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git On Mon, 28 Jul 2008, Linus Torvalds wrote: > > Hmm. I think that's just a plain POSIX oddity. There's no real "reason" > for it, except the historical one: in really old UNIX terms, rename used > to be a "link+unlink". Side note: a lot of the mtime/ctime/atime rules are really pretty arbitrary. They've grown over time, and have various historic reasons. 'ctime' in particular is more arbitrary than most, and I don't at all guarantee that all Unixes will work exactly the same wrt ctime and rename. In fact, I -can- guarantee that some older versions of Linux haven't always updated ctime on renames, for example, and it's probably still per-filesystem. Linus ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-29 1:23 ` Linus Torvalds 2008-07-29 1:31 ` Junio C Hamano @ 2008-07-29 10:49 ` Johannes Schindelin 1 sibling, 0 replies; 41+ messages in thread From: Johannes Schindelin @ 2008-07-29 10:49 UTC (permalink / raw) To: Linus Torvalds Cc: Junio C Hamano, David Brown, Alex Riesen, Johannes Sixt, git Hi, On Mon, 28 Jul 2008, Linus Torvalds wrote: > On Mon, 28 Jul 2008, Junio C Hamano wrote: > > > > > > Anybody who uses extended attributes as part of a indexing scheme is > > > just insane. Modifying the file you are indexing is not just > > > fundamentally wrong to begin with, but it will then also be > > > incredibly inefficient to read those entries one at a time. > > > > It's a typo and you are saying it _is_ fundamentally wrong, aren't > > you? > > Not a typo, and I'm sayin that "it's not _just_ fundamentally wrong" > > So yes, it's fundamentally wrong, but it's worse than that. It's > fundamentally wrong _and_ it's inefficient as hell. I haven't looked at Beagle's source code either, but as a _user_ I can say that it really became horribly, horribly slow after half a year of normal usage. And yes, uninstalling Beagle, backing up the files, reformatting and putting the files back (to really get rid of the extended attributes already in the file system) helped. So the first thing I did, back when I still used openSUSE, was to uninstall Beagle after the system install. Ciao, Dscho ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] Make use of stat.ctime configurable 2008-07-28 6:31 ` Alex Riesen 2008-07-28 16:04 ` David Brown @ 2008-07-28 16:20 ` Petr Baudis 2008-07-28 21:47 ` [PATCH] Improve the placement of core.trustctime in the documentation Alex Riesen 1 sibling, 1 reply; 41+ messages in thread From: Petr Baudis @ 2008-07-28 16:20 UTC (permalink / raw) To: Alex Riesen Cc: Junio C Hamano, Johannes Schindelin, Linus Torvalds, Johannes Sixt, git On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote: > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 1a13abc..552c134 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -149,6 +149,13 @@ core.safecrlf:: > `core.autocrlf`, git will reject the file. The variable can > be set to "warn", in which case git will only warn about an > irreversible conversion but continue the operation. > + > +core.trustctime:: > + If false, the ctime differences between the index and the > + working copy are ignored; useful when the inode change time > + is regularly modified by something outside Git (file system > + crawlers and some backup systems). > + See linkgit:git-update-index[1]. True by default. > + > CRLF conversion bears a slight chance of corrupting data. > autocrlf=true will convert CRLF to LF during commit and LF to Somehow, this particular position of the new hunk does not feel like the best choice. ;-) Petr "Pasky" Baudis ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] Improve the placement of core.trustctime in the documentation 2008-07-28 16:20 ` Petr Baudis @ 2008-07-28 21:47 ` Alex Riesen 2008-07-29 6:23 ` Junio C Hamano 0 siblings, 1 reply; 41+ messages in thread From: Alex Riesen @ 2008-07-28 21:47 UTC (permalink / raw) To: Petr Baudis Cc: Junio C Hamano, Johannes Schindelin, Linus Torvalds, Johannes Sixt, git Accidentally, it split a _chapter_ about a file data corrup... conversion for a weird, but common operating system. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- Petr Baudis, Mon, Jul 28, 2008 18:20:43 +0200: > On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote: > > diff --git a/Documentation/config.txt b/Documentation/config.txt > > index 1a13abc..552c134 100644 > > --- a/Documentation/config.txt > > +++ b/Documentation/config.txt > > @@ -149,6 +149,13 @@ core.safecrlf:: > > `core.autocrlf`, git will reject the file. The variable can > > be set to "warn", in which case git will only warn about an > > irreversible conversion but continue the operation. > > + > > +core.trustctime:: > > + If false, the ctime differences between the index and the > > + working copy are ignored; useful when the inode change time > > + is regularly modified by something outside Git (file system > > + crawlers and some backup systems). > > + See linkgit:git-update-index[1]. True by default. > > + > > CRLF conversion bears a slight chance of corrupting data. > > autocrlf=true will convert CRLF to LF during commit and LF to > > Somehow, this particular position of the new hunk does not feel like the > best choice. ;-) > It's alphabetical. Why? Oh, shit... Screw alphabetical Documentation/config.txt | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 552c134..61c3760 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -117,6 +117,13 @@ core.fileMode:: the working copy are ignored; useful on broken filesystems like FAT. See linkgit:git-update-index[1]. True by default. +core.trustctime:: + If false, the ctime differences between the index and the + working copy are ignored; useful when the inode change time + is regularly modified by something outside Git (file system + crawlers and some backup systems). + See linkgit:git-update-index[1]. True by default. + core.quotepath:: The commands that output paths (e.g. 'ls-files', 'diff'), when not given the `-z` option, will quote @@ -149,13 +156,6 @@ core.safecrlf:: `core.autocrlf`, git will reject the file. The variable can be set to "warn", in which case git will only warn about an irreversible conversion but continue the operation. - -core.trustctime:: - If false, the ctime differences between the index and the - working copy are ignored; useful when the inode change time - is regularly modified by something outside Git (file system - crawlers and some backup systems). - See linkgit:git-update-index[1]. True by default. + CRLF conversion bears a slight chance of corrupting data. autocrlf=true will convert CRLF to LF during commit and LF to -- 1.6.0.rc0.76.g581e ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] Improve the placement of core.trustctime in the documentation 2008-07-28 21:47 ` [PATCH] Improve the placement of core.trustctime in the documentation Alex Riesen @ 2008-07-29 6:23 ` Junio C Hamano 0 siblings, 0 replies; 41+ messages in thread From: Junio C Hamano @ 2008-07-29 6:23 UTC (permalink / raw) To: Alex Riesen Cc: Petr Baudis, Johannes Schindelin, Linus Torvalds, Johannes Sixt, git Alex Riesen <raa.lkml@gmail.com> writes: > Accidentally, it split a _chapter_ about a file data corrup... > conversion for a weird, but common operating system. > > Signed-off-by: Alex Riesen <raa.lkml@gmail.com> > --- > > Petr Baudis, Mon, Jul 28, 2008 18:20:43 +0200: >> On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote: >> > diff --git a/Documentation/config.txt b/Documentation/config.txt >> > index 1a13abc..552c134 100644 >> > --- a/Documentation/config.txt >> > +++ b/Documentation/config.txt >> > @@ -149,6 +149,13 @@ core.safecrlf:: >> > `core.autocrlf`, git will reject the file. The variable can >> > be set to "warn", in which case git will only warn about an >> > irreversible conversion but continue the operation. >> > + >> > +core.trustctime:: >> > + If false, the ctime differences between the index and the >> > + working copy are ignored; useful when the inode change time >> > + is regularly modified by something outside Git (file system >> > + crawlers and some backup systems). >> > + See linkgit:git-update-index[1]. True by default. >> > + >> > CRLF conversion bears a slight chance of corrupting data. >> > autocrlf=true will convert CRLF to LF during commit and LF to >> >> Somehow, this particular position of the new hunk does not feel like the >> best choice. ;-) >> > > It's alphabetical. Why? Oh, shit... Screw alphabetical Yeah, I think it makes sense to put this after core.filemode. ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] Do not use ctime if file mode is not used 2008-07-23 0:12 ` Junio C Hamano 2008-07-23 16:46 ` Alex Riesen @ 2008-07-24 19:00 ` Alex Riesen 1 sibling, 0 replies; 41+ messages in thread From: Alex Riesen @ 2008-07-24 19:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, Johannes Sixt, git On some file systems, the only part of inode information we need (executable) cannot be used, so ctime can be ignored as well. Change time is updated when file attributes were changed (or it is written to, but in this case, mtime is updated as well). Signed-off-by: Alex Riesen <raa.lkml@gmail.com> --- Junio C Hamano, Wed, Jul 23, 2008 02:12:50 +0200: > > I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than > > deemed it excessive. > > Why is it excessive? My initial reaction was "what does trustable > filemode nor trust_executable_bit has anything to do with ctime". Please > explain. You know, you have a good point... (and I'm sometimes really stupid) Of course it depends on the underlying filesystem! The updated patch is untested yet, but should be obviously correct. BTW, any idea how to check if all callers of ce_match_stat_basic have read the configuration? It is not that essential to have trust_executable_bit set correctly, though. In worst case an index entry will be marked not up-to-date. read-cache.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/read-cache.c b/read-cache.c index a50a851..f2fa0d9 100644 --- a/read-cache.c +++ b/read-cache.c @@ -181,7 +181,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st) } if (ce->ce_mtime != (unsigned int) st->st_mtime) changed |= MTIME_CHANGED; - if (ce->ce_ctime != (unsigned int) st->st_ctime) + if (trust_executable_bit && ce->ce_ctime != (unsigned int) st->st_ctime) changed |= CTIME_CHANGED; if (ce->ce_uid != (unsigned int) st->st_uid || -- 1.6.0.rc0.70.g5aae9 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* git svn throws locale related error when built from source 2008-07-22 20:31 ` Alex Riesen 2008-07-23 0:12 ` Junio C Hamano @ 2008-07-23 16:00 ` Anton Mostovoy 1 sibling, 0 replies; 41+ messages in thread From: Anton Mostovoy @ 2008-07-23 16:00 UTC (permalink / raw) To: git Hi I built git 1.5.6.3 manually (no package with 1.5.4+ on gutsy), and I am getting the error below when running 'git svn rebase'. svn works fine on its own though. The default locale is set to en_US.UTF-8 svn: error: cannot set LC_ALL locale svn: error: environment variable LANG is en_US.UTF-8 svn: error: please check that your locale name is correct I found a workaround that works, but it would be nice to have it work properly. $> LANG= git svn rebase" Thanks in advance. -Anton ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2008-07-29 10:49 UTC | newest] Thread overview: 41+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-07-21 17:35 [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 Alex Riesen 2008-07-21 18:20 ` Johannes Schindelin 2008-07-21 19:43 ` Alex Riesen 2008-07-21 23:26 ` Johannes Schindelin 2008-07-22 8:07 ` Junio C Hamano 2008-07-22 16:49 ` Alex Riesen 2008-07-22 7:17 ` Johannes Sixt 2008-07-22 16:46 ` Alex Riesen 2008-07-22 16:59 ` Johannes Sixt 2008-07-22 17:28 ` Junio C Hamano 2008-07-22 19:39 ` [PATCH] Build configuration to skip ctime for modification test Alex Riesen 2008-07-22 20:17 ` Johannes Schindelin 2008-07-22 20:31 ` Alex Riesen 2008-07-23 0:12 ` Junio C Hamano 2008-07-23 16:46 ` Alex Riesen 2008-07-23 16:59 ` Johannes Schindelin 2008-07-23 19:16 ` Alex Riesen 2008-07-25 2:00 ` Linus Torvalds 2008-07-25 5:55 ` Alex Riesen 2008-07-26 0:57 ` Johannes Schindelin 2008-07-26 15:38 ` [PATCH] Make use of stat.ctime configurable Alex Riesen 2008-07-27 19:46 ` Junio C Hamano 2008-07-27 19:46 ` Junio C Hamano 2008-07-28 6:31 ` Alex Riesen 2008-07-28 16:04 ` David Brown 2008-07-28 16:09 ` Linus Torvalds 2008-07-28 21:49 ` Alex Riesen 2008-07-29 1:16 ` Junio C Hamano 2008-07-29 1:23 ` Linus Torvalds 2008-07-29 1:31 ` Junio C Hamano 2008-07-29 1:41 ` David Brown 2008-07-29 2:49 ` Junio C Hamano 2008-07-29 10:45 ` Johannes Schindelin 2008-07-29 1:55 ` Linus Torvalds 2008-07-29 2:01 ` Linus Torvalds 2008-07-29 10:49 ` Johannes Schindelin 2008-07-28 16:20 ` Petr Baudis 2008-07-28 21:47 ` [PATCH] Improve the placement of core.trustctime in the documentation Alex Riesen 2008-07-29 6:23 ` Junio C Hamano 2008-07-24 19:00 ` [PATCH] Do not use ctime if file mode is not used Alex Riesen 2008-07-23 16:00 ` git svn throws locale related error when built from source Anton Mostovoy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).