* running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
@ 2005-05-22 12:28 Thomas Glanzmann
2005-05-22 19:09 ` Linus Torvalds
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 12:28 UTC (permalink / raw)
To: GIT
Hello,
I wonder why 'git-update-cache --refresh' running in the same directory
shared via NFS ends up in reindexing the whole files when running on
different machines on a NFS share.
Is there a reason for this or can it easily be fixes. I also wonder if
the locking which is used to lock the cache is 'nfs safe'.
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
2005-05-22 12:28 running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste Thomas Glanzmann
@ 2005-05-22 19:09 ` Linus Torvalds
2005-05-22 19:27 ` Thomas Glanzmann
0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2005-05-22 19:09 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: GIT
On Sun, 22 May 2005, Thomas Glanzmann wrote:
>
> I wonder why 'git-update-cache --refresh' running in the same directory
> shared via NFS ends up in reindexing the whole files when running on
> different machines on a NFS share.
It does?
Can you check what
ls -li --time=atime
shows on the different clients? Also, try "ctime".
> Is there a reason for this or can it easily be fixes. I also wonder if
> the locking which is used to lock the cache is 'nfs safe'.
It _should_ be safe. It does the old lockfile thing, with a "link()" that
should guarantee atomicity. No fcntl locking or similar that can have
problems with networked filesystems and different UNIXes.
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
2005-05-22 19:09 ` Linus Torvalds
@ 2005-05-22 19:27 ` Thomas Glanzmann
2005-05-22 20:43 ` Linus Torvalds
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 19:27 UTC (permalink / raw)
To: Linus Torvalds; +Cc: GIT
Hello,
> It does?
Not for me at the moment:
faui03 -> NFS Server (Solaris 2.9)
faui04a -> NFS Client (Solaris 2.9)
faui01 -> NFS Client (Linux 2.4.30)
(faui03) [~/work/blastwave] date; time git-update-cache --refresh
Sun May 22 21:09:33 CEST 2005
real 1m6.362s
user 0m12.550s
sys 0m9.200s
(faui04a) [~/work/blastwave] date; time git-update-cache --refresh
Sun May 22 21:10:56 CEST 2005
real 1m20.097s
user 0m12.270s
sys 0m8.930s
(faui01) [~/work/blastwave] date; time git-update-cache --refresh;
Sun May 22 21:17:22 CEST 2005
real 0m30.617s
user 0m2.340s
sys 0m7.970s
> Can you check what
> ls -li --time=atime
> shows on the different clients? Also, try "ctime".
atime is different of course different.
(faui01) [~/work/blastwave] (ls -Rli --time=atime; ls -lRi --time=ctime) > ~/faui01
(faui03) [~/work/blastwave] (ls -Rli --time=atime; ls -lRi --time=ctime) > ~/faui03
(faui04a) [~/work/blastwave] (ls -Rli --time=atime; ls -lRi --time=ctime) > ~/faui04a
(faui01) [~/work/blastwave] md5sum ~/faui0{1,3,4a}
a2c2cdb38537a54fb74613d1cf6537f0 /home/cip/adm/sithglan/faui01
67aee985bfb7514900a0a1d2c629cec9 /home/cip/adm/sithglan/faui03
67aee985bfb7514900a0a1d2c629cec9 /home/cip/adm/sithglan/faui04a
(faui01) [~/work/blastwave] diff -b -u ~/faui01 ~/faui03
--- /home/cip/adm/sithglan/faui01 2005-05-22 21:24:02.000000000 +0200
+++ /home/cip/adm/sithglan/faui03 2005-05-22 21:23:54.000000000 +0200
@@ -1,11 +1,11 @@
.:
total 15
5483033 -rw-r--r-- 1 sithglan icipguru 391 May 22 21:14 Makefile
-1842682 drwxr-xr-x 2 sithglan icipguru 512 May 22 21:23 packages/
-5541351 drwxr-xr-x 2 sithglan icipguru 512 May 22 21:23 public_html/
-5541339 drwxr-xr-x 2 sithglan icipguru 512 May 22 21:23 scripts/
-5482949 drwxr-xr-x 2 sithglan icipguru 8704 May 22 21:23 sources/
-5482985 drwxr-xr-x 2 sithglan icipguru 2048 May 22 21:23 specs/
+1842682 drwxr-xr-x 2 sithglan icipguru 512 May 22 21:19 packages/
+5541351 drwxr-xr-x 2 sithglan icipguru 512 May 22 21:19 public_html/
+5541339 drwxr-xr-x 2 sithglan icipguru 512 May 22 21:19 scripts/
+5482949 drwxr-xr-x 2 sithglan icipguru 8704 May 22 21:19 sources/
+5482985 drwxr-xr-x 2 sithglan icipguru 2048 May 22 21:19 specs/
./packages:
total 0
If you need the files:
http://wwwcip.informatik.uni-erlangen.de/~sithglan/faui01 (58k)
http://wwwcip.informatik.uni-erlangen.de/~sithglan/faui03 (61k)
http://wwwcip.informatik.uni-erlangen.de/~sithglan/faui04a (61k)
> It _should_ be safe. It does the old lockfile thing, with a "link()" that
> should guarantee atomicity. No fcntl locking or similar that can have
> problems with networked filesystems and different UNIXes.
Is link() NFS safe? I thought only mkdir() for nfs?
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
2005-05-22 19:27 ` Thomas Glanzmann
@ 2005-05-22 20:43 ` Linus Torvalds
2005-05-22 21:23 ` [PATCH] Don't include devicenumber into INODE_CHANGED test [WAS: Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste] Thomas Glanzmann
0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2005-05-22 20:43 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: GIT
On Sun, 22 May 2005, Thomas Glanzmann wrote:
>
> Is link() NFS safe? I thought only mkdir() for nfs?
Sorry, I meant "rename", not "link", and yes, it should be NFS-safe. It's
how all the mailers do things too, afaik.
As to your update-cache problem, it seems to be just due to NFS stat
caching. You generally should _not_ work on two machines at the same time,
but it probably does the right thing in the end.
In general, I would suggest using separate GIT repositories over sharing
them over NFS. As far as I'm concerned, I think NFS should work in the
sense that you can work from different clients at _different_times_, and
I'm certainly not going to guarantee that two different clients that work
at the same time against the same repository will get sane results.
For example, if you do a "git-checkout-cache -f -a" at the same time, I
won't guarantee that things won't race on the working files. Don't do it.
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Don't include devicenumber into INODE_CHANGED test [WAS: Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste]
2005-05-22 20:43 ` Linus Torvalds
@ 2005-05-22 21:23 ` Thomas Glanzmann
2005-05-22 21:41 ` Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS Thomas Glanzmann
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 21:23 UTC (permalink / raw)
To: GIT; +Cc: Linus Torvalds
Hello,
> Sorry, I meant "rename", not "link", and yes, it should be NFS-safe. It's
> how all the mailers do things too, afaik.
okay. I will doublecheck that and come back.
> As to your update-cache problem, it seems to be just due to NFS stat
> caching. You generally should _not_ work on two machines at the same time,
> but it probably does the right thing in the end.
I added some debugging output (see attached patch) and saw that the
reason for the invalid thing is that the inode has changed:
...
name: pull.h 0x00000010
name: read-cache.c 0x00000010
...
#define INODE_CHANGED 0x0010
Same problem tla had. It looked at the device number. And of course the
device number for NFS shares isn't the same on all machines. So I
attached a little patch which fixes the issue for me (and others).
> In general, I would suggest using separate GIT repositories over sharing
> them over NFS. As far as I'm concerned, I think NFS should work in the
> sense that you can work from different clients at _different_times_, and
> I'm certainly not going to guarantee that two different clients that work
> at the same time against the same repository will get sane results.
It is more like that I don't remember on which machine I worked last and
working accidently on my next free window in screen (and I have a lot of
windows). And getting 370 Mbyte over NFS hits my nerves. ;-)
> For example, if you do a "git-checkout-cache -f -a" at the same time, I
> won't guarantee that things won't race on the working files. Don't do it.
I will not do that. And I will add locking for such operations in my frontend
anyway.
Thomas
CRAP CRAP CRAP: This is just the patch which showed me the debugging
output:
diff --git a/update-cache.c b/update-cache.c
--- a/update-cache.c
+++ b/update-cache.c
@@ -174,6 +174,8 @@ static struct cache_entry *refresh_entry
if (!changed)
return ce;
+ fprintf(stderr, "name: %s 0x%08x\n", ce->name, changed);
+
/*
* If the mode or type has changed, there's no point in trying
* to refresh the entry - it's not going to match
Here is the real patch:
[PATCH] Don't include devicenumber into INODE_CHANGED test
This fixes the problem that git-update-cache --refresh rebuilds the
cache stat information everytime it is started on a different host while
working in the same NFS shared repository.
Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de>
diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -65,8 +65,7 @@ int ce_match_stat(struct cache_entry *ce
if (ce->ce_uid != htonl(st->st_uid) ||
ce->ce_gid != htonl(st->st_gid))
changed |= OWNER_CHANGED;
- if (ce->ce_dev != htonl(st->st_dev) ||
- ce->ce_ino != htonl(st->st_ino))
+ if (ce->ce_ino != htonl(st->st_ino))
changed |= INODE_CHANGED;
if (ce->ce_size != htonl(st->st_size))
changed |= DATA_CHANGED;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS
2005-05-22 21:23 ` [PATCH] Don't include devicenumber into INODE_CHANGED test [WAS: Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste] Thomas Glanzmann
@ 2005-05-22 21:41 ` Thomas Glanzmann
2005-05-22 21:58 ` Linus Torvalds
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 21:41 UTC (permalink / raw)
To: GIT, Linus Torvalds
Hello,
* Thomas Glanzmann <sithglan@stud.uni-erlangen.de> [050522 23:24]:
> Hello,
> > Sorry, I meant "rename", not "link", and yes, it should be NFS-safe. It's
> > how all the mailers do things too, afaik.
> okay. I will doublecheck that and come back.
yes, you're right.
While reading liblockfile I saw the following:
/*
* See if the directory where is certain file is in
* is located on an NFS mounted volume.
*/
static int is_nfs(const char *file)
{
char dir[1024];
char *s;
struct stat st;
strncpy(dir, file, sizeof(dir));
if ((s = strrchr(dir, '/')) != NULL)
*s = 0;
else
strcpy(dir, ".");
if (stat(dir, &st) < 0)
return 0;
return ((st.st_dev & 0xFF00) == 0);
}
So here comes an alternate patch if you like to verify the st_dev for non
NFS stuff. Also tested.
[PATCH] Don't include device number in cache invalidation when running on NFS
This patches includes the device number only in the cache invalidation
process when not running on a NFS volume.
Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de>
diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -65,8 +65,11 @@ int ce_match_stat(struct cache_entry *ce
if (ce->ce_uid != htonl(st->st_uid) ||
ce->ce_gid != htonl(st->st_gid))
changed |= OWNER_CHANGED;
- if (ce->ce_dev != htonl(st->st_dev) ||
- ce->ce_ino != htonl(st->st_ino))
+ /* Only include device number if not running on NFS */
+ if (ce->ce_dev != htonl(st->st_dev) &&
+ ((st->st_dev & 0xFF00) == 0))
+ changed |= INODE_CHANGED;
+ if (ce->ce_ino != htonl(st->st_ino))
changed |= INODE_CHANGED;
if (ce->ce_size != htonl(st->st_size))
changed |= DATA_CHANGED;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS
2005-05-22 21:41 ` Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS Thomas Glanzmann
@ 2005-05-22 21:58 ` Linus Torvalds
2005-05-22 22:07 ` Thomas Glanzmann
0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2005-05-22 21:58 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: GIT
On Sun, 22 May 2005, Thomas Glanzmann wrote:
>
> While reading liblockfile I saw the following:
This is _really_ Linux-specific afaik. Which is ok for git, but at the
same time it really makes me go "Ewww". It's testing that the major number
is 0, and it would be a lot more cleaner to use
if (!major(st.st_dev))
but even that is very Linux-specific.
> [PATCH] Don't include device number in cache invalidation when running on NFS
I'll have to think about it. Maybe I should just remove the st_dev check.
I guess inode/size/mtime/ctime should be plenty safe enough in practice.
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS
2005-05-22 21:58 ` Linus Torvalds
@ 2005-05-22 22:07 ` Thomas Glanzmann
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Glanzmann @ 2005-05-22 22:07 UTC (permalink / raw)
To: GIT
Hello,
> This is _really_ Linux-specific afaik. Which is ok for git, but at the
> same time it really makes me go "Ewww". It's testing that the major number
> is 0, and it would be a lot more cleaner to use
> if (!major(st.st_dev))
> but even that is very Linux-specific.
I see.
> I'll have to think about it. Maybe I should just remove the st_dev check.
> I guess inode/size/mtime/ctime should be plenty safe enough in practice.
I think so. At least I kick this one out because it is just getting on
my nerves. :-)
Thomas
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2005-05-22 22:06 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-22 12:28 running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste Thomas Glanzmann
2005-05-22 19:09 ` Linus Torvalds
2005-05-22 19:27 ` Thomas Glanzmann
2005-05-22 20:43 ` Linus Torvalds
2005-05-22 21:23 ` [PATCH] Don't include devicenumber into INODE_CHANGED test [WAS: Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste] Thomas Glanzmann
2005-05-22 21:41 ` Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS Thomas Glanzmann
2005-05-22 21:58 ` Linus Torvalds
2005-05-22 22:07 ` Thomas Glanzmann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).