git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
@ 2008-07-21 17:35 Alex Riesen
  2008-07-21 18:20 ` Johannes Schindelin
  2008-07-22  7:17 ` Johannes Sixt
  0 siblings, 2 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-21 17:35 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

For example - Cygwin.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Can MSys folks please try it? I noticed it when the test
t2103-update-index-ignore-missing.sh (the 5th case) started failing.

 read-cache.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index a50a851..eb30c20 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -281,7 +281,7 @@ int ie_modified(const struct index_state *istate,
 	 * the length field is zero.  For other cases the ce_size
 	 * should match the SHA1 recorded in the index entry.
 	 */
-	if ((changed & DATA_CHANGED) && ce->ce_size != 0)
+	if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode)))
 		return changed;
 
 	changed_fs = ce_modified_check_fs(ce, st);
-- 
1.5.6.4.452.g7b2a

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-21 17:35 [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 Alex Riesen
@ 2008-07-21 18:20 ` Johannes Schindelin
  2008-07-21 19:43   ` Alex Riesen
  2008-07-22  7:17 ` Johannes Sixt
  1 sibling, 1 reply; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-21 18:20 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Junio C Hamano

Hi,

On Mon, 21 Jul 2008, Alex Riesen wrote:

> For example - Cygwin.

Please enhance: your oneline is too long, and your commit message body too 
short.

> Can MSys folks please try it? I noticed it when the test
> t2103-update-index-ignore-missing.sh (the 5th case) started failing.

Since M$' documentation says "This member does not have a meaning for 
directories." about the member nFileSizeLow of WIN32_FILE_ATTRIBUTE_DATA 
which we use to implement a sane "lstat()", I think this bug hits MinGW 
(not MSys) as well.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-21 18:20 ` Johannes Schindelin
@ 2008-07-21 19:43   ` Alex Riesen
  2008-07-21 23:26     ` Johannes Schindelin
  2008-07-22  8:07     ` Junio C Hamano
  0 siblings, 2 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-21 19:43 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano

Johannes Schindelin, Mon, Jul 21, 2008 20:20:43 +0200:
> Hi,
> 
> On Mon, 21 Jul 2008, Alex Riesen wrote:
> 
> > For example - Cygwin.
> 
> Please enhance: your oneline is too long, and your commit message body too 
> short.

Well, I'm really not sure. I just found this difference between linux
and cygwin (st_stat is 0 for dirs on cygwin). Than I noticed that the
routine where I made the change explicitely checks for st_size not
being 0. I must admit I can't make much out of comment, and hope this
discussion will help to clear the check up.

> > Can MSys folks please try it? I noticed it when the test
> > t2103-update-index-ignore-missing.sh (the 5th case) started failing.
> 
> Since M$' documentation says "This member does not have a meaning for 
> directories." about the member nFileSizeLow of WIN32_FILE_ATTRIBUTE_DATA 
> which we use to implement a sane "lstat()", I think this bug hits MinGW 
> (not MSys) as well.

Could you, just for completeness, try? I don't have mingw at hand
and SUSv3 (http://www.opengroup.org/onlinepubs/009695399/toc.htm)
does not tells much too. No UNIX system I know about has it 0 for
directories.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-21 19:43   ` Alex Riesen
@ 2008-07-21 23:26     ` Johannes Schindelin
  2008-07-22  8:07     ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-21 23:26 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Junio C Hamano

Hi,

On Mon, 21 Jul 2008, Alex Riesen wrote:

> Johannes Schindelin, Mon, Jul 21, 2008 20:20:43 +0200:
>
> > Alex wrote:
> > 
> > > Can MSys folks please try it? I noticed it when the test 
> > > t2103-update-index-ignore-missing.sh (the 5th case) started failing.
> > 
> > Since M$' documentation says "This member does not have a meaning for 
> > directories." about the member nFileSizeLow of 
> > WIN32_FILE_ATTRIBUTE_DATA which we use to implement a sane "lstat()", 
> > I think this bug hits MinGW (not MSys) as well.
> 
> Could you, just for completeness, try?

No, I cannot.  I have no Windows machine.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-21 17:35 [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 Alex Riesen
  2008-07-21 18:20 ` Johannes Schindelin
@ 2008-07-22  7:17 ` Johannes Sixt
  2008-07-22 16:46   ` Alex Riesen
  2008-07-22 17:28   ` Junio C Hamano
  1 sibling, 2 replies; 41+ messages in thread
From: Johannes Sixt @ 2008-07-22  7:17 UTC (permalink / raw)
  To: Alex Riesen, Junio C Hamano; +Cc: git

Alex Riesen schrieb:
> Can MSys folks please try it? I noticed it when the test
> t2103-update-index-ignore-missing.sh (the 5th case) started failing.

I tested it. mingw.git does suffer from the problem, and this fixes it.

But!

> +	if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode)))

Does this mean that ce->ce_size is non-zero for gitlinks, at least on
Unix? Is this value useful in anyway? I don't think so. Then it shouldn't
be a random value that lstat() happens to return.

-- Hannes

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-21 19:43   ` Alex Riesen
  2008-07-21 23:26     ` Johannes Schindelin
@ 2008-07-22  8:07     ` Junio C Hamano
  2008-07-22 16:49       ` Alex Riesen
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2008-07-22  8:07 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Johannes Schindelin, git

Alex Riesen <raa.lkml@gmail.com> writes:

> Johannes Schindelin, Mon, Jul 21, 2008 20:20:43 +0200:
>> Hi,
>> 
>> On Mon, 21 Jul 2008, Alex Riesen wrote:
>> 
>> > For example - Cygwin.
>> 
>> Please enhance: your oneline is too long, and your commit message body too 
>> short.
>
> Well, I'm really not sure. I just found this difference between linux
> and cygwin (st_stat is 0 for dirs on cygwin). Than I noticed that the
> routine where I made the change explicitely checks for st_size not
> being 0. I must admit I can't make much out of comment, and hope this
> discussion will help to clear the check up.

The cached stat information in the index is used to speed up comparison
between "the last staged data" and what is in the working tree.
ie_match_stat() compares ce_xxx fields with the result from lstat(2) we
just did, and if there are differences, we take it as a sign that what's
in the working tree is different from what we saw when we updated the
index entry.

But there is a twist.

Ordinarily, when an entry enteres the index, the hash of the blob contents
goes along with the lstat(2) information taken from the file that supplied
the contents.  However there are some cases we populate the index without
lstat(2).  update-index --cacheinfo, update-index --index-info are two
examples, and when they add index entries, they leave ce_size field to
zero.  ie_match_stat() will compare that zero ce_size with the size
information obtained from the working tree, and declare (falsely) that
"what's in the working tree is different -- it can never match, and there
is no point trying to re-index to see if they actually match", even though
the reason ce_size is zero is *not* because we observed the size of the
working tree file *was* zero when we indexed it the last time (it is zero
merely because we haven't looked at it yet).  The ce_modified_check_fs()
call is there to deal with this "we cannot trust the ce_xxx fields" case.

I however have to wonder if you also need to touch the end of
ce_match_stat_basic() that checks for zero sized cache entry.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-22  7:17 ` Johannes Sixt
@ 2008-07-22 16:46   ` Alex Riesen
  2008-07-22 16:59     ` Johannes Sixt
  2008-07-22 17:28   ` Junio C Hamano
  1 sibling, 1 reply; 41+ messages in thread
From: Alex Riesen @ 2008-07-22 16:46 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git

Johannes Sixt, Tue, Jul 22, 2008 09:17:16 +0200:
> Alex Riesen schrieb:
> > Can MSys folks please try it? I noticed it when the test
> > t2103-update-index-ignore-missing.sh (the 5th case) started failing.
> 
> I tested it. mingw.git does suffer from the problem, and this fixes it.
> 

Yes, I did too (at work).

> > +	if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode)))
> 
> Does this mean that ce->ce_size is non-zero for gitlinks, at least on
> Unix?

It is non-zero for directories (which is what gitlinks are in working
directories) on UNIX operating systems I met.

> Is this value useful in anyway?

Sometimes it is (the size a directory takes on storage)

> I don't think so. Then it shouldn't be a random value that lstat()
> happens to return.

The problem is: it is not random. I even suspect that Windows is the
ONLY system which has st_size 0 for directories.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-22  8:07     ` Junio C Hamano
@ 2008-07-22 16:49       ` Alex Riesen
  0 siblings, 0 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-22 16:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git

Junio C Hamano, Tue, Jul 22, 2008 10:07:49 +0200:
> I however have to wonder if you also need to touch the end of
> ce_match_stat_basic() that checks for zero sized cache entry.

I frankly don't know.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-22 16:46   ` Alex Riesen
@ 2008-07-22 16:59     ` Johannes Sixt
  0 siblings, 0 replies; 41+ messages in thread
From: Johannes Sixt @ 2008-07-22 16:59 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Junio C Hamano, git

Alex Riesen schrieb:
> Johannes Sixt, Tue, Jul 22, 2008 09:17:16 +0200:
>> Alex Riesen schrieb:
>>> +	if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode)))
>> Does this mean that ce->ce_size is non-zero for gitlinks, at least on
>> Unix?
> 
> It is non-zero for directories (which is what gitlinks are in working
> directories) on UNIX operating systems I met.
> 
>> Is this value useful in anyway?
> 
> Sometimes it is (the size a directory takes on storage)

Sure; but is ce->ce_size of gitlinks useful?

-- Hannes

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0
  2008-07-22  7:17 ` Johannes Sixt
  2008-07-22 16:46   ` Alex Riesen
@ 2008-07-22 17:28   ` Junio C Hamano
  2008-07-22 19:39     ` [PATCH] Build configuration to skip ctime for modification test Alex Riesen
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2008-07-22 17:28 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Alex Riesen, git

Johannes Sixt <j.sixt@viscovery.net> writes:

>> +	if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode)))
>
> Does this mean that ce->ce_size is non-zero for gitlinks, at least on
> Unix? Is this value useful in anyway? I don't think so. Then it shouldn't
> be a random value that lstat() happens to return.

These ce_xxx fields are the values we read from lstat(2) when the user
told us to stage that working tree entity, be it a regular file, a
symlink, or a directory that is a submodule.  The only thing required for
them is that they are stable (i.e. if you haven't touched the working tree
entity, the value stays the same), and changes across modification.  The
value itself does not have to "mean" anything.

When trying to see if the user has changes in the working tree entity
since the last such staging of the path, we compare that value with what
comes back from lstat(2), before actually comparing the contents.  If the
filesize changed, they cannot be the same and the code says you have
modified it without having to look at the contents.

	Side note.  This is why you need to be careful after modifying
	autocrlf related configuration and attributes.  If you had CRLF
	contents in the working tree that was incorrectly staged as-is,
	then switch autocrlf-on, and "git add" to fix the staged copy to
	be LF-terminated, we say "it's unchanged and we do not bother
	rehashing" by comparing the ce_xxx fields without looking at the
	contents (this is an absolutely necessary optimization to make
	"git add ."  usable), because ce_size records the size of CRLF
	version you have in the working tree, and you haven't changed the
	working tree file in this sequence above.

        Removing the file and checking things out would be the most
	straightforward solution in such a case.

We used to include ce_dev (taken from struct stat.st_dev) in the list of
fields to cache and compare to detect changes, but that is now excluded
because it is not stable (see comments in read-cache.c).  If the directory
size is unstable, perhaps it would be better to force it to some fixed
magic value so that it is not used by this "quick change detection" check.

If you network-mount the same directory from POSIX and windows, the former
may give "storage size of the directory" while the latter may give 0.
This would mean that you would need a "update-index --refresh" when you
switch between such machines.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] Build configuration to skip ctime for modification test
  2008-07-22 17:28   ` Junio C Hamano
@ 2008-07-22 19:39     ` Alex Riesen
  2008-07-22 20:17       ` Johannes Schindelin
  0 siblings, 1 reply; 41+ messages in thread
From: Alex Riesen @ 2008-07-22 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git

On Windows, the only file attribute we need (executable) cannot be
used, so ctime can be ignored as well. Change time is updated when
file attributes were changed (or it is written to, but in this case,
mtime is updated as well).

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Junio C Hamano, Tue, Jul 22, 2008 19:28:46 +0200:
> Johannes Sixt <j.sixt@viscovery.net> writes:
> 
> >> +	if ((changed & DATA_CHANGED) && (ce->ce_size != 0 || S_ISGITLINK(ce->ce_mode)))
> >
> > Does this mean that ce->ce_size is non-zero for gitlinks, at least on
> > Unix? Is this value useful in anyway? I don't think so. Then it shouldn't
> > be a random value that lstat() happens to return.
> 
> These ce_xxx fields are the values we read from lstat(2) when the user
> told us to stage that working tree entity, be it a regular file, a
> symlink, or a directory that is a submodule.  The only thing required for
> them is that they are stable (i.e. if you haven't touched the working tree
> entity, the value stays the same), and changes across modification.  The
> value itself does not have to "mean" anything.

This reminds me... We can't use the only file attribute we care about
on Windows, so we can as well skip check for ctime. Besides, Google
Desktop Search keeps changing ctime when crawling files (ok, GDS is a
major usability nuance anyway, but the point is - we don't use the
file attribute).

 read-cache.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index a50a851..c4f2718 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -181,8 +181,10 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	}
 	if (ce->ce_mtime != (unsigned int) st->st_mtime)
 		changed |= MTIME_CHANGED;
+#ifndef NO_TRUSTABLE_FILEMODE
 	if (ce->ce_ctime != (unsigned int) st->st_ctime)
 		changed |= CTIME_CHANGED;
+#endif
 
 	if (ce->ce_uid != (unsigned int) st->st_uid ||
 	    ce->ce_gid != (unsigned int) st->st_gid)
-- 
1.6.0.rc0.41.g70446

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-22 19:39     ` [PATCH] Build configuration to skip ctime for modification test Alex Riesen
@ 2008-07-22 20:17       ` Johannes Schindelin
  2008-07-22 20:31         ` Alex Riesen
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-22 20:17 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Junio C Hamano, Johannes Sixt, git

Hi,

On Tue, 22 Jul 2008, Alex Riesen wrote:

> +#ifndef NO_TRUSTABLE_FILEMODE
>  	if (ce->ce_ctime != (unsigned int) st->st_ctime)
>  		changed |= CTIME_CHANGED;
> +#endif

Surely you meant trust_executable_bit instead, right?

Otherwise, if you really want to tell at compile time,I think for clarity 
you have to introduce another #define, since NO_TRUSTABLE_FILEMODE 
definitely says something different than CTIME_IS_USELESS.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-22 20:17       ` Johannes Schindelin
@ 2008-07-22 20:31         ` Alex Riesen
  2008-07-23  0:12           ` Junio C Hamano
  2008-07-23 16:00           ` git svn throws locale related error when built from source Anton Mostovoy
  0 siblings, 2 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-22 20:31 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Johannes Sixt, git

Johannes Schindelin, Tue, Jul 22, 2008 22:17:21 +0200:
> Hi,
> 
> On Tue, 22 Jul 2008, Alex Riesen wrote:
> 
> > +#ifndef NO_TRUSTABLE_FILEMODE
> >  	if (ce->ce_ctime != (unsigned int) st->st_ctime)
> >  		changed |= CTIME_CHANGED;
> > +#endif
> 
> Surely you meant trust_executable_bit instead, right?

No. Just what I said: we don't have filemode (like "at all") - so no
ctime as well. But maybe you're right, and trust_executable_bit is
more flexible. Or maybe both (the #ifdef _and_ trust_executable_bit)
and must be used...

> Otherwise, if you really want to tell at compile time,I think for clarity 
> you have to introduce another #define, since NO_TRUSTABLE_FILEMODE 
> definitely says something different than CTIME_IS_USELESS.

I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than
deemed it excessive.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-22 20:31         ` Alex Riesen
@ 2008-07-23  0:12           ` Junio C Hamano
  2008-07-23 16:46             ` Alex Riesen
  2008-07-24 19:00             ` [PATCH] Do not use ctime if file mode is not used Alex Riesen
  2008-07-23 16:00           ` git svn throws locale related error when built from source Anton Mostovoy
  1 sibling, 2 replies; 41+ messages in thread
From: Junio C Hamano @ 2008-07-23  0:12 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Johannes Schindelin, Johannes Sixt, git

Alex Riesen <raa.lkml@gmail.com> writes:

> Johannes Schindelin, Tue, Jul 22, 2008 22:17:21 +0200:
>> Hi,
>> 
>> On Tue, 22 Jul 2008, Alex Riesen wrote:
>> 
>> > +#ifndef NO_TRUSTABLE_FILEMODE
>> >  	if (ce->ce_ctime != (unsigned int) st->st_ctime)
>> >  		changed |= CTIME_CHANGED;
>> > +#endif
>> 
>> Surely you meant trust_executable_bit instead, right?
>
> No. Just what I said: we don't have filemode (like "at all") - so no
> ctime as well. But maybe you're right, and trust_executable_bit is
> more flexible. Or maybe both (the #ifdef _and_ trust_executable_bit)
> and must be used...
>
>> Otherwise, if you really want to tell at compile time,I think for clarity 
>> you have to introduce another #define, since NO_TRUSTABLE_FILEMODE 
>> definitely says something different than CTIME_IS_USELESS.
>
> I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than
> deemed it excessive.

Why is it excessive?  My initial reaction was "what does trustable
filemode nor trust_executable_bit has anything to do with ctime".  Please
explain.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* git svn throws locale related error when built from source
  2008-07-22 20:31         ` Alex Riesen
  2008-07-23  0:12           ` Junio C Hamano
@ 2008-07-23 16:00           ` Anton Mostovoy
  1 sibling, 0 replies; 41+ messages in thread
From: Anton Mostovoy @ 2008-07-23 16:00 UTC (permalink / raw)
  To: git

Hi

I built git 1.5.6.3 manually (no package with 1.5.4+ on gutsy), and I am
getting the error below when running 'git svn rebase'. 
svn works fine on its own though.  
The default locale is set to en_US.UTF-8

svn: error: cannot set LC_ALL locale
svn: error: environment variable LANG is en_US.UTF-8
svn: error: please check that your locale name is correct

I found a workaround that works, but it would be nice to have it work
properly.
$> LANG= git svn rebase"

Thanks in advance.
-Anton

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-23  0:12           ` Junio C Hamano
@ 2008-07-23 16:46             ` Alex Riesen
  2008-07-23 16:59               ` Johannes Schindelin
  2008-07-24 19:00             ` [PATCH] Do not use ctime if file mode is not used Alex Riesen
  1 sibling, 1 reply; 41+ messages in thread
From: Alex Riesen @ 2008-07-23 16:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Johannes Sixt, git

Junio C Hamano, Wed, Jul 23, 2008 02:12:50 +0200:
> >> Otherwise, if you really want to tell at compile time,I think for clarity 
> >> you have to introduce another #define, since NO_TRUSTABLE_FILEMODE 
> >> definitely says something different than CTIME_IS_USELESS.
> >
> > I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than
> > deemed it excessive.
> 
> Why is it excessive?  My initial reaction was "what does trustable
> filemode nor trust_executable_bit has anything to do with ctime".  Please
> explain.

Because exactly the file mode (the executable bit) is the reason for
checking ctime. Otherwise there is no point: Git doesn't save any
other data which when changed cause a ctime update. And exactly the
file mode is completely broken on that cygwin thing. Which is
precisely pointed by NO_TRUSTABLE_FILEMODE. Hence - just it.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-23 16:46             ` Alex Riesen
@ 2008-07-23 16:59               ` Johannes Schindelin
  2008-07-23 19:16                 ` Alex Riesen
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-23 16:59 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Junio C Hamano, Johannes Sixt, git

Hi,

On Wed, 23 Jul 2008, Alex Riesen wrote:

> Because exactly the file mode (the executable bit) is the reason for
> checking ctime.

But ctime is broken on Windows.  Because ctime is supposed to change 
whenever the _inode_ changes.

You have to admit that saying "I ignore the ctime because the executable 
bit is broken" must leave the reader puzzled.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-23 16:59               ` Johannes Schindelin
@ 2008-07-23 19:16                 ` Alex Riesen
  2008-07-25  2:00                   ` Linus Torvalds
  0 siblings, 1 reply; 41+ messages in thread
From: Alex Riesen @ 2008-07-23 19:16 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Johannes Sixt, git

Johannes Schindelin, Wed, Jul 23, 2008 18:59:02 +0200:
> On Wed, 23 Jul 2008, Alex Riesen wrote:
> 
> > Because exactly the file mode (the executable bit) is the reason for
> > checking ctime.
> 
> But ctime is broken on Windows.  Because ctime is supposed to change 
> whenever the _inode_ changes.

It is not that it is broken. We just don't need it, because the st_mode
is not used, and the rest of inode information is not used anyway.

> You have to admit that saying "I ignore the ctime because the executable 
> bit is broken" must leave the reader puzzled.

This is conclusion. I said "file mode" and "file attributes", which
is how reason for ctime update is defined in SUSv3. man 2 stat says:

       The field st_ctime is changed by writing or by setting  inode
       information (i.e., owner, group, link count, mode, etc.).

Not just "inode" but "inode information". Only mode is used, and even
that is ignored on Windows.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] Do not use ctime if file mode is not used
  2008-07-23  0:12           ` Junio C Hamano
  2008-07-23 16:46             ` Alex Riesen
@ 2008-07-24 19:00             ` Alex Riesen
  1 sibling, 0 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-24 19:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Johannes Sixt, git

On some file systems, the only part of inode information we need
(executable) cannot be used, so ctime can be ignored as well. Change
time is updated when file attributes were changed (or it is written
to, but in this case, mtime is updated as well).

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Junio C Hamano, Wed, Jul 23, 2008 02:12:50 +0200:
> > I had that at first (NO_DEPENDABLE_CTIME, than IGNORE_CTIME), than
> > deemed it excessive.
> 
> Why is it excessive?  My initial reaction was "what does trustable
> filemode nor trust_executable_bit has anything to do with ctime".  Please
> explain.

You know, you have a good point... (and I'm sometimes really stupid)
Of course it depends on the underlying filesystem!

The updated patch is untested yet, but should be obviously correct.

BTW, any idea how to check if all callers of ce_match_stat_basic have
read the configuration? It is not that essential to have
trust_executable_bit set correctly, though. In worst case an index
entry will be marked not up-to-date.

 read-cache.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index a50a851..f2fa0d9 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -181,7 +181,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	}
 	if (ce->ce_mtime != (unsigned int) st->st_mtime)
 		changed |= MTIME_CHANGED;
-	if (ce->ce_ctime != (unsigned int) st->st_ctime)
+	if (trust_executable_bit && ce->ce_ctime != (unsigned int) st->st_ctime)
 		changed |= CTIME_CHANGED;
 
 	if (ce->ce_uid != (unsigned int) st->st_uid ||
-- 
1.6.0.rc0.70.g5aae9

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-23 19:16                 ` Alex Riesen
@ 2008-07-25  2:00                   ` Linus Torvalds
  2008-07-25  5:55                     ` Alex Riesen
  0 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2008-07-25  2:00 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Johannes Schindelin, Junio C Hamano, Johannes Sixt, git



On Wed, 23 Jul 2008, Alex Riesen wrote:
> 
> It is not that it is broken. We just don't need it, because the st_mode
> is not used, and the rest of inode information is not used anyway.

That is NOT why git checks the ctime.

Git checks the ctime not because it cares about the inode state being 
modified per se: since it can see that _directly_ - so why should it care 
about inode state like st_mode?

No, git checks ctime because it in general tries to make it VERY VERY hard 
for people to try to "fake out" git and replace files from underneath it 
without git noticing.

It's much easier and much more common for tools to restore 'mtime' when 
they do some operation on a file than it is for them to restore 'ctime'.

For example, if you rsync files between two hosts, the '-t' flag will make 
rsync try to keep the mtimes in sync (and it's part of '-a', which is the 
common form that you'd use for rsync). So if you only look at mtime and 
size, you often miss the fact that the file has actually been messed with!

Looking at ctime gets around a number of those things. Of course, it can 
cause git to be _too_ eager in thinking that a file is modified, and an 
example of that is the insane indexing that 'beagle' does, which actually 
modifies the files by adding inode extended attributes to them and thus 
changes ctime due to the indexing. But in general it's a lot better to be 
too careful than to miss the fact that somebody changed the file.

		Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-25  2:00                   ` Linus Torvalds
@ 2008-07-25  5:55                     ` Alex Riesen
  2008-07-26  0:57                       ` Johannes Schindelin
  0 siblings, 1 reply; 41+ messages in thread
From: Alex Riesen @ 2008-07-25  5:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Johannes Schindelin, Junio C Hamano, Johannes Sixt, git

Linus Torvalds, Fri, Jul 25, 2008 04:00:29 +0200:
> On Wed, 23 Jul 2008, Alex Riesen wrote:
> > 
> > It is not that it is broken. We just don't need it, because the st_mode
> > is not used, and the rest of inode information is not used anyway.
> 
> That is NOT why git checks the ctime.
> 
> Git checks the ctime not because it cares about the inode state being 
> modified per se: since it can see that _directly_ - so why should it care 
> about inode state like st_mode?
> 
> No, git checks ctime because it in general tries to make it VERY VERY hard 
> for people to try to "fake out" git and replace files from underneath it 
> without git noticing.
> 
> It's much easier and much more common for tools to restore 'mtime' when 
> they do some operation on a file than it is for them to restore 'ctime'.
> 
> For example, if you rsync files between two hosts, the '-t' flag will make 
> rsync try to keep the mtimes in sync (and it's part of '-a', which is the 
> common form that you'd use for rsync). So if you only look at mtime and 
> size, you often miss the fact that the file has actually been messed with!
> 
> Looking at ctime gets around a number of those things. Of course, it can 
> cause git to be _too_ eager in thinking that a file is modified, and an 
> example of that is the insane indexing that 'beagle' does, which actually 
> modifies the files by adding inode extended attributes to them and thus 
> changes ctime due to the indexing. But in general it's a lot better to be 
> too careful than to miss the fact that somebody changed the file.
> 

But, given the fact, that somewhere sometimes its very-very annoying
to have even one (un)changed file, something must be done about it.
Maybe just direct

    # my .gitconfig for Windows machines with GDS
    [core]
	filemode = false
	trustctime = false
	logallrefupdates = false
    [pack]
	threads = 1

    etc...

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Build configuration to skip ctime for modification test
  2008-07-25  5:55                     ` Alex Riesen
@ 2008-07-26  0:57                       ` Johannes Schindelin
  2008-07-26 15:38                         ` [PATCH] Make use of stat.ctime configurable Alex Riesen
  0 siblings, 1 reply; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-26  0:57 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Linus Torvalds, Junio C Hamano, Johannes Sixt, git

Hi,

On Fri, 25 Jul 2008, Alex Riesen wrote:

> But, given the fact, that somewhere sometimes its very-very annoying to 
> have even one (un)changed file, something must be done about it. Maybe 
> just direct
> 
> [...]
> 	trustctime = false

... which is all Junio and I were asking all along: a separate way to ask 
for ignoring ctime; not just DWIM it on top of the executable bit.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] Make use of stat.ctime configurable
  2008-07-26  0:57                       ` Johannes Schindelin
@ 2008-07-26 15:38                         ` Alex Riesen
  2008-07-27 19:46                           ` Junio C Hamano
  2008-07-27 19:46                           ` Junio C Hamano
  0 siblings, 2 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-26 15:38 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, Junio C Hamano, Johannes Sixt, git

because there are situations where it produces too much false
positives. Like when file system crawlers keep changing it when
scanning and using the ctime for marking scanned files.

The default is to allow use of ctime.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---
Johannes Schindelin, Sat, Jul 26, 2008 02:57:25 +0200:
> On Fri, 25 Jul 2008, Alex Riesen wrote:
> > But, given the fact, that somewhere sometimes its very-very annoying to 
> > have even one (un)changed file, something must be done about it. Maybe 
> > just direct
> > 
> > [...]
> > 	trustctime = false
> 
> ... which is all Junio and I were asking all along: a separate way to ask 
> for ignoring ctime; not just DWIM it on top of the executable bit.

Oh...

 cache.h       |    1 +
 config.c      |    4 ++++
 environment.c |    1 +
 read-cache.c  |    2 +-
 4 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/cache.h b/cache.h
index 38985aa..00d02d3 100644
--- a/cache.h
+++ b/cache.h
@@ -421,6 +421,7 @@ extern int delete_ref(const char *, const unsigned char *sha1);
 
 /* Environment bits from configuration mechanism */
 extern int trust_executable_bit;
+extern int trust_file_ctime;
 extern int quote_path_fully;
 extern int has_symlinks;
 extern int ignore_case;
diff --git a/config.c b/config.c
index 1e066c7..860e8ab 100644
--- a/config.c
+++ b/config.c
@@ -341,6 +341,10 @@ static int git_default_core_config(const char *var, const char *value)
 		trust_executable_bit = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "core.filectime")) {
+		trust_file_ctime = git_config_bool(var, value);
+		return 0;
+	}
 
 	if (!strcmp(var, "core.quotepath")) {
 		quote_path_fully = git_config_bool(var, value);
diff --git a/environment.c b/environment.c
index 4a88a17..4982771 100644
--- a/environment.c
+++ b/environment.c
@@ -13,6 +13,7 @@ char git_default_email[MAX_GITNAME];
 char git_default_name[MAX_GITNAME];
 int user_ident_explicitly_given;
 int trust_executable_bit = 1;
+int trust_file_ctime = 1;
 int has_symlinks = 1;
 int ignore_case;
 int assume_unchanged;
diff --git a/read-cache.c b/read-cache.c
index a50a851..00d39dc 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -181,7 +181,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	}
 	if (ce->ce_mtime != (unsigned int) st->st_mtime)
 		changed |= MTIME_CHANGED;
-	if (ce->ce_ctime != (unsigned int) st->st_ctime)
+	if (trust_file_ctime && ce->ce_ctime != (unsigned int) st->st_ctime)
 		changed |= CTIME_CHANGED;
 
 	if (ce->ce_uid != (unsigned int) st->st_uid ||
-- 
1.6.0.rc0.76.g581e

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-26 15:38                         ` [PATCH] Make use of stat.ctime configurable Alex Riesen
@ 2008-07-27 19:46                           ` Junio C Hamano
  2008-07-27 19:46                           ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2008-07-27 19:46 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Johannes Schindelin, Linus Torvalds, Johannes Sixt, git

Alex Riesen <raa.lkml@gmail.com> writes:

> because there are situations where it produces too much false
> positives. Like when file system crawlers keep changing it when
> scanning and using the ctime for marking scanned files.

This justification is good and I am very inclined to advocate for its
inclusion in 1.6.0, but any new configuration needs to be in the
documentation.

It appears there is "gui.trustmtime"; shouldn't this be called
"core.trustctime" or something?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-26 15:38                         ` [PATCH] Make use of stat.ctime configurable Alex Riesen
  2008-07-27 19:46                           ` Junio C Hamano
@ 2008-07-27 19:46                           ` Junio C Hamano
  2008-07-28  6:31                             ` Alex Riesen
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2008-07-27 19:46 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Johannes Schindelin, Linus Torvalds, Johannes Sixt, git

Alex Riesen <raa.lkml@gmail.com> writes:

> because there are situations where it produces too much false
> positives. Like when file system crawlers keep changing it when
> scanning and using the ctime for marking scanned files.

This justification is good and I am very inclined to advocate for its
inclusion in 1.6.0, but any new configuration needs to be in the
documentation.

It appears there is "gui.trustmtime"; shouldn't this be called
"core.trustctime" or something?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] Make use of stat.ctime configurable
  2008-07-27 19:46                           ` Junio C Hamano
@ 2008-07-28  6:31                             ` Alex Riesen
  2008-07-28 16:04                               ` David Brown
  2008-07-28 16:20                               ` Petr Baudis
  0 siblings, 2 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-28  6:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Linus Torvalds, Johannes Sixt, git

because there are situations where it produces too much false
positives. Like when file system crawlers keep changing it when
scanning and using the ctime for marking scanned files.

The default is to allow use of ctime.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---
Junio C Hamano, Sun, Jul 27, 2008 21:46:42 +0200:
> Alex Riesen <raa.lkml@gmail.com> writes:
> 
> > because there are situations where it produces too much false
> > positives. Like when file system crawlers keep changing it when
> > scanning and using the ctime for marking scanned files.
> 
> This justification is good and I am very inclined to advocate for its
> inclusion in 1.6.0, but any new configuration needs to be in the
> documentation.

Done.

> It appears there is "gui.trustmtime"; shouldn't this be called
> "core.trustctime" or something?

Getting old... I even called the global flag trust_file_ctime!
Corrected. Changed trust_file_ctime to trust_ctime.

 Documentation/config.txt           |    7 +++++++
 Documentation/git-update-index.txt |    5 +++++
 cache.h                            |    1 +
 config.c                           |    4 ++++
 environment.c                      |    1 +
 read-cache.c                       |    2 +-
 6 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1a13abc..552c134 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -149,6 +149,13 @@ core.safecrlf::
 	`core.autocrlf`, git will reject the file.  The variable can
 	be set to "warn", in which case git will only warn about an
 	irreversible conversion but continue the operation.
+
+core.trustctime::
+	If false, the ctime differences between the index and the
+	working copy are ignored; useful when the inode change time
+	is regularly modified by something outside Git (file system
+	crawlers and some backup systems).
+	See linkgit:git-update-index[1]. True by default.
 +
 CRLF conversion bears a slight chance of corrupting data.
 autocrlf=true will convert CRLF to LF during commit and LF to
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 6b930bc..1d9d81a 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -323,6 +323,11 @@ from symbolic link to regular file.
 The command looks at `core.ignorestat` configuration variable.  See
 'Using "assume unchanged" bit' section above.
 
+The command also looks at `core.trustctime` configuration variable.
+It can be useful when the inode change time is regularly modified by
+something outside Git (file system crawlers and backup systems use
+ctime for marking files processed) (see linkgit:git-config[1]).
+
 
 SEE ALSO
 --------
diff --git a/cache.h b/cache.h
index 4b6c0a6..2475de9 100644
--- a/cache.h
+++ b/cache.h
@@ -423,6 +423,7 @@ extern int delete_ref(const char *, const unsigned char *sha1);
 
 /* Environment bits from configuration mechanism */
 extern int trust_executable_bit;
+extern int trust_ctime;
 extern int quote_path_fully;
 extern int has_symlinks;
 extern int ignore_case;
diff --git a/config.c b/config.c
index 1e066c7..53f04a0 100644
--- a/config.c
+++ b/config.c
@@ -341,6 +341,10 @@ static int git_default_core_config(const char *var, const char *value)
 		trust_executable_bit = git_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "core.trustctime")) {
+		trust_ctime = git_config_bool(var, value);
+		return 0;
+	}
 
 	if (!strcmp(var, "core.quotepath")) {
 		quote_path_fully = git_config_bool(var, value);
diff --git a/environment.c b/environment.c
index 4a88a17..0c6d11f 100644
--- a/environment.c
+++ b/environment.c
@@ -13,6 +13,7 @@ char git_default_email[MAX_GITNAME];
 char git_default_name[MAX_GITNAME];
 int user_ident_explicitly_given;
 int trust_executable_bit = 1;
+int trust_ctime = 1;
 int has_symlinks = 1;
 int ignore_case;
 int assume_unchanged;
diff --git a/read-cache.c b/read-cache.c
index 6c08803..1cae361 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -197,7 +197,7 @@ static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)
 	}
 	if (ce->ce_mtime != (unsigned int) st->st_mtime)
 		changed |= MTIME_CHANGED;
-	if (ce->ce_ctime != (unsigned int) st->st_ctime)
+	if (trust_ctime && ce->ce_ctime != (unsigned int) st->st_ctime)
 		changed |= CTIME_CHANGED;
 
 	if (ce->ce_uid != (unsigned int) st->st_uid ||
-- 
1.6.0.rc0.76.g581e

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-28  6:31                             ` Alex Riesen
@ 2008-07-28 16:04                               ` David Brown
  2008-07-28 16:09                                 ` Linus Torvalds
  2008-07-28 16:20                               ` Petr Baudis
  1 sibling, 1 reply; 41+ messages in thread
From: David Brown @ 2008-07-28 16:04 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, Johannes Schindelin, Linus Torvalds,
	Johannes Sixt, git

On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote:

>because there are situations where it produces too much false
>positives. Like when file system crawlers keep changing it when
>scanning and using the ctime for marking scanned files.

That's interesting, since most backup software uses the ctime to determine
file changes.

David

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-28 16:04                               ` David Brown
@ 2008-07-28 16:09                                 ` Linus Torvalds
  2008-07-28 21:49                                   ` Alex Riesen
  2008-07-29  1:16                                   ` Junio C Hamano
  0 siblings, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-07-28 16:09 UTC (permalink / raw)
  To: David Brown
  Cc: Alex Riesen, Junio C Hamano, Johannes Schindelin, Johannes Sixt,
	git



On Mon, 28 Jul 2008, David Brown wrote:

> On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote:
> 
> > because there are situations where it produces too much false
> > positives. Like when file system crawlers keep changing it when
> > scanning and using the ctime for marking scanned files.
> 
> That's interesting, since most backup software uses the ctime to determine
> file changes.

It really is just Beagle that is (was? I can dream) a piece of 
unbelievable crap.

Anybody who uses extended attributes as part of a indexing scheme is just 
insane. Modifying the file you are indexing is not just fundamentally 
wrong to begin with, but it will then also be incredibly inefficient to 
read those entries one at a time.

And no other sane model would ever touch 'ctime'.

Oh, well. Making ctime configurable is not wrong per se. But if it's 
Beagle that triggers this, the fix is sadly in the wrong place.

		Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-28  6:31                             ` Alex Riesen
  2008-07-28 16:04                               ` David Brown
@ 2008-07-28 16:20                               ` Petr Baudis
  2008-07-28 21:47                                 ` [PATCH] Improve the placement of core.trustctime in the documentation Alex Riesen
  1 sibling, 1 reply; 41+ messages in thread
From: Petr Baudis @ 2008-07-28 16:20 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, Johannes Schindelin, Linus Torvalds,
	Johannes Sixt, git

On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote:
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 1a13abc..552c134 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -149,6 +149,13 @@ core.safecrlf::
>  	`core.autocrlf`, git will reject the file.  The variable can
>  	be set to "warn", in which case git will only warn about an
>  	irreversible conversion but continue the operation.
> +
> +core.trustctime::
> +	If false, the ctime differences between the index and the
> +	working copy are ignored; useful when the inode change time
> +	is regularly modified by something outside Git (file system
> +	crawlers and some backup systems).
> +	See linkgit:git-update-index[1]. True by default.
>  +
>  CRLF conversion bears a slight chance of corrupting data.
>  autocrlf=true will convert CRLF to LF during commit and LF to

Somehow, this particular position of the new hunk does not feel like the
best choice. ;-)

				Petr "Pasky" Baudis

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH] Improve the placement of core.trustctime in the documentation
  2008-07-28 16:20                               ` Petr Baudis
@ 2008-07-28 21:47                                 ` Alex Riesen
  2008-07-29  6:23                                   ` Junio C Hamano
  0 siblings, 1 reply; 41+ messages in thread
From: Alex Riesen @ 2008-07-28 21:47 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Junio C Hamano, Johannes Schindelin, Linus Torvalds,
	Johannes Sixt, git

Accidentally, it split a _chapter_ about a file data corrup...
conversion for a weird, but common operating system.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
---

Petr Baudis, Mon, Jul 28, 2008 18:20:43 +0200:
> On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote:
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index 1a13abc..552c134 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -149,6 +149,13 @@ core.safecrlf::
> >  	`core.autocrlf`, git will reject the file.  The variable can
> >  	be set to "warn", in which case git will only warn about an
> >  	irreversible conversion but continue the operation.
> > +
> > +core.trustctime::
> > +	If false, the ctime differences between the index and the
> > +	working copy are ignored; useful when the inode change time
> > +	is regularly modified by something outside Git (file system
> > +	crawlers and some backup systems).
> > +	See linkgit:git-update-index[1]. True by default.
> >  +
> >  CRLF conversion bears a slight chance of corrupting data.
> >  autocrlf=true will convert CRLF to LF during commit and LF to
> 
> Somehow, this particular position of the new hunk does not feel like the
> best choice. ;-)
> 

It's alphabetical. Why? Oh, shit... Screw alphabetical

 Documentation/config.txt |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 552c134..61c3760 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -117,6 +117,13 @@ core.fileMode::
 	the working copy are ignored; useful on broken filesystems like FAT.
 	See linkgit:git-update-index[1]. True by default.
 
+core.trustctime::
+	If false, the ctime differences between the index and the
+	working copy are ignored; useful when the inode change time
+	is regularly modified by something outside Git (file system
+	crawlers and some backup systems).
+	See linkgit:git-update-index[1]. True by default.
+
 core.quotepath::
 	The commands that output paths (e.g. 'ls-files',
 	'diff'), when not given the `-z` option, will quote
@@ -149,13 +156,6 @@ core.safecrlf::
 	`core.autocrlf`, git will reject the file.  The variable can
 	be set to "warn", in which case git will only warn about an
 	irreversible conversion but continue the operation.
-
-core.trustctime::
-	If false, the ctime differences between the index and the
-	working copy are ignored; useful when the inode change time
-	is regularly modified by something outside Git (file system
-	crawlers and some backup systems).
-	See linkgit:git-update-index[1]. True by default.
 +
 CRLF conversion bears a slight chance of corrupting data.
 autocrlf=true will convert CRLF to LF during commit and LF to
-- 
1.6.0.rc0.76.g581e

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-28 16:09                                 ` Linus Torvalds
@ 2008-07-28 21:49                                   ` Alex Riesen
  2008-07-29  1:16                                   ` Junio C Hamano
  1 sibling, 0 replies; 41+ messages in thread
From: Alex Riesen @ 2008-07-28 21:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Brown, Junio C Hamano, Johannes Schindelin, Johannes Sixt,
	git

Linus Torvalds, Mon, Jul 28, 2008 18:09:32 +0200:
> It really is just Beagle that is (was? I can dream) a piece of 
> unbelievable crap.
> 
> Anybody who uses extended attributes as part of a indexing scheme is just 
> insane. Modifying the file you are indexing is not just fundamentally 
> wrong to begin with, but it will then also be incredibly inefficient to 
> read those entries one at a time.
> 
> And no other sane model would ever touch 'ctime'.

Beagle is not alone. Google Desktop Search was mentioned before.

> Oh, well. Making ctime configurable is not wrong per se. But if it's 
> Beagle that triggers this, the fix is sadly in the wrong place.

Never said it was a fix. Same as CRLF conversion is not a feature.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-28 16:09                                 ` Linus Torvalds
  2008-07-28 21:49                                   ` Alex Riesen
@ 2008-07-29  1:16                                   ` Junio C Hamano
  2008-07-29  1:23                                     ` Linus Torvalds
  1 sibling, 1 reply; 41+ messages in thread
From: Junio C Hamano @ 2008-07-29  1:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, 28 Jul 2008, David Brown wrote:
>
>> On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote:
>> 
>> > because there are situations where it produces too much false
>> > positives. Like when file system crawlers keep changing it when
>> > scanning and using the ctime for marking scanned files.
>> 
>> That's interesting, since most backup software uses the ctime to determine
>> file changes.
>
> It really is just Beagle that is (was? I can dream) a piece of 
> unbelievable crap.
>
> Anybody who uses extended attributes as part of a indexing scheme is just 
> insane. Modifying the file you are indexing is not just fundamentally 
> wrong to begin with, but it will then also be incredibly inefficient to 
> read those entries one at a time.

It's a typo and you are saying it _is_ fundamentally wrong, aren't you?

If you are prepared to pick up new files, you need to crawl everywhere
anyway, so if the xattr is used to leave a mark "The last time I looked at
this file was this" in the file itself, it does not sound too bad to me.
It would be irritating that it touches ctime, though, but I do not use it
so it is not my problem ;-)

http://beagle-project.org/FAQ "Do I really need extended attributes?"
talks about BEAGLE_DISABLE_XATTR environment variable and interestingly
it says disabling use of xattr would slow you down.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:16                                   ` Junio C Hamano
@ 2008-07-29  1:23                                     ` Linus Torvalds
  2008-07-29  1:31                                       ` Junio C Hamano
  2008-07-29 10:49                                       ` Johannes Schindelin
  0 siblings, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-07-29  1:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git



On Mon, 28 Jul 2008, Junio C Hamano wrote:
> >
> > Anybody who uses extended attributes as part of a indexing scheme is just 
> > insane. Modifying the file you are indexing is not just fundamentally 
> > wrong to begin with, but it will then also be incredibly inefficient to 
> > read those entries one at a time.
> 
> It's a typo and you are saying it _is_ fundamentally wrong, aren't you?

Not a typo, and I'm sayin that "it's not _just_ fundamentally wrong"

So yes, it's fundamentally wrong, but it's worse than that. It's 
fundamentally wrong _and_ it's inefficient as hell.

> If you are prepared to pick up new files, you need to crawl everywhere
> anyway, so if the xattr is used to leave a mark "The last time I looked at
> this file was this" in the file itself, it does not sound too bad to me.

It's absolutely horrible. 

It means that you have another extra indirection and accompanying disk 
seek to check the thing. It's a total performance nightmare. Trust me, 
anybody who uses extended attributes like this simply does not know what 
he is doing.

> It would be irritating that it touches ctime, though, but I do not use it
> http://beagle-project.org/FAQ "Do I really need extended attributes?"
> talks about BEAGLE_DISABLE_XATTR environment variable and interestingly
> it says disabling use of xattr would slow you down.

They don't have a clue. They say that, but it's simply not true. 

Of course, the fact that they think it is probably implies that they did 
something EVEN MORE STUPID for the non-xattr case. That wouldn't surprise 
me at all. If I had to guess, I'd guess that they used an SQL database and 
query language, and did all their tests with hot caches too.

The kernel does caching really well, and the kernel is fast as hell, so 
_of_course_ when you benchmark, using kernel data structures looks good, 
especially if you benchmark against code that isn't well written for the 
particular usage case.

			Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:23                                     ` Linus Torvalds
@ 2008-07-29  1:31                                       ` Junio C Hamano
  2008-07-29  1:41                                         ` David Brown
  2008-07-29  1:55                                         ` Linus Torvalds
  2008-07-29 10:49                                       ` Johannes Schindelin
  1 sibling, 2 replies; 41+ messages in thread
From: Junio C Hamano @ 2008-07-29  1:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> The kernel does caching really well, and the kernel is fast as hell, so 
> _of_course_ when you benchmark, using kernel data structures looks good, 
> especially if you benchmark against code that isn't well written for the 
> particular usage case.

Ok.  While I have your attention on st_ctime, let me ask you a stupid
question.  Why does "rename(old, new)" change st_ctime when you move a
regular file?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:31                                       ` Junio C Hamano
@ 2008-07-29  1:41                                         ` David Brown
  2008-07-29  2:49                                           ` Junio C Hamano
  2008-07-29 10:45                                           ` Johannes Schindelin
  2008-07-29  1:55                                         ` Linus Torvalds
  1 sibling, 2 replies; 41+ messages in thread
From: David Brown @ 2008-07-29  1:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Alex Riesen, Johannes Schindelin, Johannes Sixt,
	git

On Mon, Jul 28, 2008 at 06:31:24PM -0700, Junio C Hamano wrote:
>Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> The kernel does caching really well, and the kernel is fast as hell, so 
>> _of_course_ when you benchmark, using kernel data structures looks good, 
>> especially if you benchmark against code that isn't well written for the 
>> particular usage case.
>
>Ok.  While I have your attention on st_ctime, let me ask you a stupid
>question.  Why does "rename(old, new)" change st_ctime when you move a
>regular file?

A simple answer might be that posix requires it.  But, from the point of
view of backup software, not updating the ctime on rename would be
horrible, because you'd never know when files got renamed.

David

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:31                                       ` Junio C Hamano
  2008-07-29  1:41                                         ` David Brown
@ 2008-07-29  1:55                                         ` Linus Torvalds
  2008-07-29  2:01                                           ` Linus Torvalds
  1 sibling, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2008-07-29  1:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git



On Mon, 28 Jul 2008, Junio C Hamano wrote:
> 
> Ok.  While I have your attention on st_ctime, let me ask you a stupid
> question.  Why does "rename(old, new)" change st_ctime when you move a
> regular file?

Hmm. I think that's just a plain POSIX oddity. There's no real "reason" 
for it, except the historical one: in really old UNIX terms, rename used 
to be a "link+unlink".

And that "link+unlink" updated ctime because the 'nlink' part of the inode 
changed. Never mind that it got changed right back ;)

			Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:55                                         ` Linus Torvalds
@ 2008-07-29  2:01                                           ` Linus Torvalds
  0 siblings, 0 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-07-29  2:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: David Brown, Alex Riesen, Johannes Schindelin, Johannes Sixt, git



On Mon, 28 Jul 2008, Linus Torvalds wrote:
> 
> Hmm. I think that's just a plain POSIX oddity. There's no real "reason" 
> for it, except the historical one: in really old UNIX terms, rename used 
> to be a "link+unlink".

Side note: a lot of the mtime/ctime/atime rules are really pretty 
arbitrary. They've grown over time, and have various historic reasons.
'ctime' in particular is more arbitrary than most, and I don't at all 
guarantee that all Unixes will work exactly the same wrt ctime and rename. 

In fact, I -can- guarantee that some older versions of Linux haven't 
always updated ctime on renames, for example, and it's probably still 
per-filesystem.

			Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:41                                         ` David Brown
@ 2008-07-29  2:49                                           ` Junio C Hamano
  2008-07-29 10:45                                           ` Johannes Schindelin
  1 sibling, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2008-07-29  2:49 UTC (permalink / raw)
  To: David Brown
  Cc: Linus Torvalds, Alex Riesen, Johannes Schindelin, Johannes Sixt,
	git

David Brown <git@davidb.org> writes:

> On Mon, Jul 28, 2008 at 06:31:24PM -0700, Junio C Hamano wrote:
>>Linus Torvalds <torvalds@linux-foundation.org> writes:
>>
>>> The kernel does caching really well, and the kernel is fast as
>>> hell, so _of_course_ when you benchmark, using kernel data
>>> structures looks good, especially if you benchmark against code
>>> that isn't well written for the particular usage case.
>>
>>Ok.  While I have your attention on st_ctime, let me ask you a stupid
>>question.  Why does "rename(old, new)" change st_ctime when you move a
>>regular file?
>
> A simple answer might be that posix requires it.

I would understand that an obvious implementation would be to link to new
and then unlink the old, and the link count of the moved entity needs to
change (although in the end, the increment and decrement would cancel out)
in each step, so it would be convenient for the implementation to update
ctime in both steps; however my reading of POSIX does not seem to require
it.

The only mention of ctime I find is about updating the parent directories
of old and new, as contents of both change so do their mtime and ctime
obviously need to change.  But it does not talk about ctime of the entity
being moved.

Additionally, the only way rename(2) is described to fail with EMLINK is
when renaming an directory and the parent of the new location cannot have
any more links; which implies that it does not have to fail if the first
step of link+unlink overflows the link count of old.

Ah, I found in the informative Application Usage section that this is
implementation dependent.

Sorry for the noise.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Improve the placement of core.trustctime in the documentation
  2008-07-28 21:47                                 ` [PATCH] Improve the placement of core.trustctime in the documentation Alex Riesen
@ 2008-07-29  6:23                                   ` Junio C Hamano
  0 siblings, 0 replies; 41+ messages in thread
From: Junio C Hamano @ 2008-07-29  6:23 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Petr Baudis, Johannes Schindelin, Linus Torvalds, Johannes Sixt,
	git

Alex Riesen <raa.lkml@gmail.com> writes:

> Accidentally, it split a _chapter_ about a file data corrup...
> conversion for a weird, but common operating system.
>
> Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
> ---
>
> Petr Baudis, Mon, Jul 28, 2008 18:20:43 +0200:
>> On Mon, Jul 28, 2008 at 08:31:28AM +0200, Alex Riesen wrote:
>> > diff --git a/Documentation/config.txt b/Documentation/config.txt
>> > index 1a13abc..552c134 100644
>> > --- a/Documentation/config.txt
>> > +++ b/Documentation/config.txt
>> > @@ -149,6 +149,13 @@ core.safecrlf::
>> >  	`core.autocrlf`, git will reject the file.  The variable can
>> >  	be set to "warn", in which case git will only warn about an
>> >  	irreversible conversion but continue the operation.
>> > +
>> > +core.trustctime::
>> > +	If false, the ctime differences between the index and the
>> > +	working copy are ignored; useful when the inode change time
>> > +	is regularly modified by something outside Git (file system
>> > +	crawlers and some backup systems).
>> > +	See linkgit:git-update-index[1]. True by default.
>> >  +
>> >  CRLF conversion bears a slight chance of corrupting data.
>> >  autocrlf=true will convert CRLF to LF during commit and LF to
>> 
>> Somehow, this particular position of the new hunk does not feel like the
>> best choice. ;-)
>> 
>
> It's alphabetical. Why? Oh, shit... Screw alphabetical

Yeah, I think it makes sense to put this after core.filemode.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:41                                         ` David Brown
  2008-07-29  2:49                                           ` Junio C Hamano
@ 2008-07-29 10:45                                           ` Johannes Schindelin
  1 sibling, 0 replies; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-29 10:45 UTC (permalink / raw)
  To: David Brown
  Cc: Junio C Hamano, Linus Torvalds, Alex Riesen, Johannes Sixt, git

Hi,

On Mon, 28 Jul 2008, David Brown wrote:

> On Mon, Jul 28, 2008 at 06:31:24PM -0700, Junio C Hamano wrote:
>
> > Why does "rename(old, new)" change st_ctime when you move a regular 
> >file?
> 
> But, from the point of view of backup software, not updating the ctime 
> on rename would be horrible, because you'd never know when files got 
> renamed.

Any backup software that does not discover that there is a new _filename_ 
is not worth the label "software".  Rather "daftware" or somesuch.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] Make use of stat.ctime configurable
  2008-07-29  1:23                                     ` Linus Torvalds
  2008-07-29  1:31                                       ` Junio C Hamano
@ 2008-07-29 10:49                                       ` Johannes Schindelin
  1 sibling, 0 replies; 41+ messages in thread
From: Johannes Schindelin @ 2008-07-29 10:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, David Brown, Alex Riesen, Johannes Sixt, git

Hi,

On Mon, 28 Jul 2008, Linus Torvalds wrote:

> On Mon, 28 Jul 2008, Junio C Hamano wrote:
> > >
> > > Anybody who uses extended attributes as part of a indexing scheme is 
> > > just insane. Modifying the file you are indexing is not just 
> > > fundamentally wrong to begin with, but it will then also be 
> > > incredibly inefficient to read those entries one at a time.
> > 
> > It's a typo and you are saying it _is_ fundamentally wrong, aren't 
> > you?
> 
> Not a typo, and I'm sayin that "it's not _just_ fundamentally wrong"
> 
> So yes, it's fundamentally wrong, but it's worse than that. It's 
> fundamentally wrong _and_ it's inefficient as hell.

I haven't looked at Beagle's source code either, but as a _user_ I can say 
that it really became horribly, horribly slow after half a year of normal 
usage.

And yes, uninstalling Beagle, backing up the files, reformatting and 
putting the files back (to really get rid of the extended attributes 
already in the file system) helped.

So the first thing I did, back when I still used openSUSE, was to 
uninstall Beagle after the system install.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2008-07-29 10:49 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-21 17:35 [PATCH] Fix update-index --refresh for submodules if stat(2) returns st_size 0 Alex Riesen
2008-07-21 18:20 ` Johannes Schindelin
2008-07-21 19:43   ` Alex Riesen
2008-07-21 23:26     ` Johannes Schindelin
2008-07-22  8:07     ` Junio C Hamano
2008-07-22 16:49       ` Alex Riesen
2008-07-22  7:17 ` Johannes Sixt
2008-07-22 16:46   ` Alex Riesen
2008-07-22 16:59     ` Johannes Sixt
2008-07-22 17:28   ` Junio C Hamano
2008-07-22 19:39     ` [PATCH] Build configuration to skip ctime for modification test Alex Riesen
2008-07-22 20:17       ` Johannes Schindelin
2008-07-22 20:31         ` Alex Riesen
2008-07-23  0:12           ` Junio C Hamano
2008-07-23 16:46             ` Alex Riesen
2008-07-23 16:59               ` Johannes Schindelin
2008-07-23 19:16                 ` Alex Riesen
2008-07-25  2:00                   ` Linus Torvalds
2008-07-25  5:55                     ` Alex Riesen
2008-07-26  0:57                       ` Johannes Schindelin
2008-07-26 15:38                         ` [PATCH] Make use of stat.ctime configurable Alex Riesen
2008-07-27 19:46                           ` Junio C Hamano
2008-07-27 19:46                           ` Junio C Hamano
2008-07-28  6:31                             ` Alex Riesen
2008-07-28 16:04                               ` David Brown
2008-07-28 16:09                                 ` Linus Torvalds
2008-07-28 21:49                                   ` Alex Riesen
2008-07-29  1:16                                   ` Junio C Hamano
2008-07-29  1:23                                     ` Linus Torvalds
2008-07-29  1:31                                       ` Junio C Hamano
2008-07-29  1:41                                         ` David Brown
2008-07-29  2:49                                           ` Junio C Hamano
2008-07-29 10:45                                           ` Johannes Schindelin
2008-07-29  1:55                                         ` Linus Torvalds
2008-07-29  2:01                                           ` Linus Torvalds
2008-07-29 10:49                                       ` Johannes Schindelin
2008-07-28 16:20                               ` Petr Baudis
2008-07-28 21:47                                 ` [PATCH] Improve the placement of core.trustctime in the documentation Alex Riesen
2008-07-29  6:23                                   ` Junio C Hamano
2008-07-24 19:00             ` [PATCH] Do not use ctime if file mode is not used Alex Riesen
2008-07-23 16:00           ` git svn throws locale related error when built from source Anton Mostovoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).