* running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
From: Thomas Glanzmann @ 2005-05-22 12:28 UTC
To: GIT

Hello,
I wonder why 'git-update-cache --refresh' running in the same directory
shared via NFS ends up reindexing all the files when running on
different machines on an NFS share.

Is there a reason for this or can it easily be fixed? I also wonder if
the locking which is used to lock the cache is 'nfs safe'.

	Thomas
* Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
From: Linus Torvalds @ 2005-05-22 19:09 UTC
To: Thomas Glanzmann; +Cc: GIT

On Sun, 22 May 2005, Thomas Glanzmann wrote:
>
> I wonder why 'git-update-cache --refresh' running in the same directory
> shared via NFS ends up reindexing all the files when running on
> different machines on an NFS share.

It does?

Can you check what

	ls -li --time=atime

shows on the different clients? Also, try "ctime".

> Is there a reason for this or can it easily be fixed? I also wonder if
> the locking which is used to lock the cache is 'nfs safe'.

It _should_ be safe. It does the old lockfile thing, with a "link()" that
should guarantee atomicity. No fcntl locking or similar that can have
problems with networked filesystems and different UNIXes.

		Linus
* Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
From: Thomas Glanzmann @ 2005-05-22 19:27 UTC
To: Linus Torvalds; +Cc: GIT

Hello,

> It does?

Not for me at the moment:

faui03  -> NFS Server (Solaris 2.9)
faui04a -> NFS Client (Solaris 2.9)
faui01  -> NFS Client (Linux 2.4.30)

(faui03) [~/work/blastwave] date; time git-update-cache --refresh
Sun May 22 21:09:33 CEST 2005

real    1m6.362s
user    0m12.550s
sys     0m9.200s

(faui04a) [~/work/blastwave] date; time git-update-cache --refresh
Sun May 22 21:10:56 CEST 2005

real    1m20.097s
user    0m12.270s
sys     0m8.930s

(faui01) [~/work/blastwave] date; time git-update-cache --refresh;
Sun May 22 21:17:22 CEST 2005

real    0m30.617s
user    0m2.340s
sys     0m7.970s

> Can you check what
>	ls -li --time=atime
> shows on the different clients? Also, try "ctime".

atime is of course different.

(faui01) [~/work/blastwave] (ls -Rli --time=atime; ls -lRi --time=ctime) > ~/faui01
(faui03) [~/work/blastwave] (ls -Rli --time=atime; ls -lRi --time=ctime) > ~/faui03
(faui04a) [~/work/blastwave] (ls -Rli --time=atime; ls -lRi --time=ctime) > ~/faui04a

(faui01) [~/work/blastwave] md5sum ~/faui0{1,3,4a}
a2c2cdb38537a54fb74613d1cf6537f0  /home/cip/adm/sithglan/faui01
67aee985bfb7514900a0a1d2c629cec9  /home/cip/adm/sithglan/faui03
67aee985bfb7514900a0a1d2c629cec9  /home/cip/adm/sithglan/faui04a

(faui01) [~/work/blastwave] diff -b -u ~/faui01 ~/faui03
--- /home/cip/adm/sithglan/faui01       2005-05-22 21:24:02.000000000 +0200
+++ /home/cip/adm/sithglan/faui03       2005-05-22 21:23:54.000000000 +0200
@@ -1,11 +1,11 @@
 .:
 total 15
 5483033 -rw-r--r--    1 sithglan icipguru      391 May 22 21:14 Makefile
-1842682 drwxr-xr-x    2 sithglan icipguru      512 May 22 21:23 packages/
-5541351 drwxr-xr-x    2 sithglan icipguru      512 May 22 21:23 public_html/
-5541339 drwxr-xr-x    2 sithglan icipguru      512 May 22 21:23 scripts/
-5482949 drwxr-xr-x    2 sithglan icipguru     8704 May 22 21:23 sources/
-5482985 drwxr-xr-x    2 sithglan icipguru     2048 May 22 21:23 specs/
+1842682 drwxr-xr-x    2 sithglan icipguru      512 May 22 21:19 packages/
+5541351 drwxr-xr-x    2 sithglan icipguru      512 May 22 21:19 public_html/
+5541339 drwxr-xr-x    2 sithglan icipguru      512 May 22 21:19 scripts/
+5482949 drwxr-xr-x    2 sithglan icipguru     8704 May 22 21:19 sources/
+5482985 drwxr-xr-x    2 sithglan icipguru     2048 May 22 21:19 specs/
 
 ./packages:
 total 0

If you need the files:

http://wwwcip.informatik.uni-erlangen.de/~sithglan/faui01 (58k)
http://wwwcip.informatik.uni-erlangen.de/~sithglan/faui03 (61k)
http://wwwcip.informatik.uni-erlangen.de/~sithglan/faui04a (61k)

> It _should_ be safe. It does the old lockfile thing, with a "link()" that
> should guarantee atomicity. No fcntl locking or similar that can have
> problems with networked filesystems and different UNIXes.

Is link() NFS safe? I thought only mkdir() for nfs?

	Thomas
* Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste
From: Linus Torvalds @ 2005-05-22 20:43 UTC
To: Thomas Glanzmann; +Cc: GIT

On Sun, 22 May 2005, Thomas Glanzmann wrote:
>
> Is link() NFS safe? I thought only mkdir() for nfs?

Sorry, I meant "rename", not "link", and yes, it should be NFS-safe. It's
how all the mailers do things too, afaik.

As to your update-cache problem, it seems to be just due to NFS stat
caching. You generally should _not_ work on two machines at the same time,
but it probably does the right thing in the end.

In general, I would suggest using separate GIT repositories over sharing
them over NFS. As far as I'm concerned, I think NFS should work in the
sense that you can work from different clients at _different_times_, and
I'm certainly not going to guarantee that two different clients that work
at the same time against the same repository will get sane results.

For example, if you do a "git-checkout-cache -f -a" at the same time, I
won't guarantee that things won't race on the working files. Don't do it.

		Linus
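The rename-based update Linus mentions can be sketched roughly as follows. This is a minimal illustration and not git's actual code: the helper name `write_then_rename`, the lock-file path argument, and the use of `O_CREAT | O_EXCL` to create the lock file are all assumptions made for the example. The point it shows is that readers only ever see the old file or the complete new one, because rename() swaps the name in a single step.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from git: write `len` bytes from `buf`
 * to `lock_path`, then rename() it over `path`. Returns 0 on success and
 * -1 on failure (including when `lock_path` already exists).
 */
static int write_then_rename(const char *path, const char *lock_path,
			     const char *buf, size_t len)
{
	size_t done = 0;
	int fd;

	assert(path != NULL && lock_path != NULL && buf != NULL);

	/* O_EXCL makes creation fail if the lock file is already there. */
	fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0)
		return -1;

	/* Write everything; a short or failed write abandons the update. */
	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);
		if (n <= 0) {
			close(fd);
			unlink(lock_path);
			return -1;
		}
		done += (size_t)n;
	}

	/* Only a fully written file is moved into place. */
	if (close(fd) < 0 || rename(lock_path, path) < 0) {
		unlink(lock_path);
		return -1;
	}
	return 0;
}
```

A caller would use it as `write_then_rename("index", "index.lock", data, size)` (file names chosen for illustration); a second writer that finds the lock file present gets -1 and leaves the existing file untouched.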
* [PATCH] Don't include devicenumber into INODE_CHANGED test [WAS: Re: running git-update-cache --refresh on different machines on a NFS share always ends up in a lot of io/cpu/time waste]
From: Thomas Glanzmann @ 2005-05-22 21:23 UTC
To: GIT; +Cc: Linus Torvalds

Hello,

> Sorry, I meant "rename", not "link", and yes, it should be NFS-safe. It's
> how all the mailers do things too, afaik.

okay. I will doublecheck that and come back.

> As to your update-cache problem, it seems to be just due to NFS stat
> caching. You generally should _not_ work on two machines at the same time,
> but it probably does the right thing in the end.

I added some debugging output (see attached patch) and saw that the
reason for the invalidation is that the inode is reported as changed:

...
name: pull.h 0x00000010
name: read-cache.c 0x00000010
...

#define INODE_CHANGED   0x0010

Same problem tla had. It looked at the device number. And of course the
device number for NFS shares isn't the same on all machines. So I
attached a little patch which fixes the issue for me (and others).

> In general, I would suggest using separate GIT repositories over sharing
> them over NFS. As far as I'm concerned, I think NFS should work in the
> sense that you can work from different clients at _different_times_, and
> I'm certainly not going to guarantee that two different clients that work
> at the same time against the same repository will get sane results.

It is more that I don't remember which machine I worked on last and
accidentally continue in the next free window in screen (and I have a lot
of windows). And pulling 370 Mbyte over NFS gets on my nerves. ;-)

> For example, if you do a "git-checkout-cache -f -a" at the same time, I
> won't guarantee that things won't race on the working files. Don't do it.

I will not do that. And I will add locking for such operations in my
frontend anyway.

	Thomas

CRAP CRAP CRAP: This is just the patch which showed me the debugging output:

diff --git a/update-cache.c b/update-cache.c
--- a/update-cache.c
+++ b/update-cache.c
@@ -174,6 +174,8 @@ static struct cache_entry *refresh_entry
 	if (!changed)
 		return ce;
 
+	fprintf(stderr, "name: %s 0x%08x\n", ce->name, changed);
+
 	/*
 	 * If the mode or type has changed, there's no point in trying
 	 * to refresh the entry - it's not going to match

Here is the real patch:

[PATCH] Don't include devicenumber into INODE_CHANGED test

This fixes the problem that git-update-cache --refresh rebuilds the cache
stat information every time it is started on a different host while
working in the same NFS shared repository.

Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de>

diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -65,8 +65,7 @@ int ce_match_stat(struct cache_entry *ce
 	if (ce->ce_uid != htonl(st->st_uid) ||
 	    ce->ce_gid != htonl(st->st_gid))
 		changed |= OWNER_CHANGED;
-	if (ce->ce_dev != htonl(st->st_dev) ||
-	    ce->ce_ino != htonl(st->st_ino))
+	if (ce->ce_ino != htonl(st->st_ino))
 		changed |= INODE_CHANGED;
 	if (ce->ce_size != htonl(st->st_size))
 		changed |= DATA_CHANGED;
* Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS
From: Thomas Glanzmann @ 2005-05-22 21:41 UTC
To: GIT, Linus Torvalds

Hello,

* Thomas Glanzmann <sithglan@stud.uni-erlangen.de> [050522 23:24]:
> Hello,

> > Sorry, I meant "rename", not "link", and yes, it should be NFS-safe. It's
> > how all the mailers do things too, afaik.

> okay. I will doublecheck that and come back.

yes, you're right. While reading liblockfile I saw the following:

/*
 *	See if the directory where is certain file is in
 *	is located on an NFS mounted volume.
 */
static int is_nfs(const char *file)
{
	char dir[1024];
	char *s;
	struct stat st;

	strncpy(dir, file, sizeof(dir));
	if ((s = strrchr(dir, '/')) != NULL)
		*s = 0;
	else
		strcpy(dir, ".");
	if (stat(dir, &st) < 0)
		return 0;
	return ((st.st_dev & 0xFF00) == 0);
}

So here comes an alternate patch in case you would like to keep verifying
st_dev for non-NFS filesystems. Also tested.

[PATCH] Don't include device number in cache invalidation when running on NFS

This patch includes the device number in the cache invalidation check
only when not running on an NFS volume.

Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de>

diff --git a/read-cache.c b/read-cache.c
--- a/read-cache.c
+++ b/read-cache.c
@@ -65,8 +65,11 @@ int ce_match_stat(struct cache_entry *ce
 	if (ce->ce_uid != htonl(st->st_uid) ||
 	    ce->ce_gid != htonl(st->st_gid))
 		changed |= OWNER_CHANGED;
-	if (ce->ce_dev != htonl(st->st_dev) ||
-	    ce->ce_ino != htonl(st->st_ino))
+	/* Only include device number if not running on NFS */
+	if (ce->ce_dev != htonl(st->st_dev) &&
+	    ((st->st_dev & 0xFF00) == 0))
+		changed |= INODE_CHANGED;
+	if (ce->ce_ino != htonl(st->st_ino))
 		changed |= INODE_CHANGED;
 	if (ce->ce_size != htonl(st->st_size))
 		changed |= DATA_CHANGED;
* Re: Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS
From: Linus Torvalds @ 2005-05-22 21:58 UTC
To: Thomas Glanzmann; +Cc: GIT

On Sun, 22 May 2005, Thomas Glanzmann wrote:
>
> While reading liblockfile I saw the following:

This is _really_ Linux-specific afaik. Which is ok for git, but at the
same time it really makes me go "Ewww". It's testing that the major number
is 0, and it would be a lot cleaner to use

	if (!major(st.st_dev))

but even that is very Linux-specific.

> [PATCH] Don't include device number in cache invalidation when running on NFS

I'll have to think about it. Maybe I should just remove the st_dev check.
I guess inode/size/mtime/ctime should be plenty safe enough in practice.

		Linus
* Re: Alternate Patch: [PATCH] Don't include device number in cache invalidation when running on NFS
From: Thomas Glanzmann @ 2005-05-22 22:07 UTC
To: GIT

Hello,

> This is _really_ Linux-specific afaik. Which is ok for git, but at the
> same time it really makes me go "Ewww". It's testing that the major number
> is 0, and it would be a lot cleaner to use

>	if (!major(st.st_dev))

> but even that is very Linux-specific.

I see.

> I'll have to think about it. Maybe I should just remove the st_dev check.
> I guess inode/size/mtime/ctime should be plenty safe enough in practice.

I think so. At least I am kicking this check out because it is just
getting on my nerves. :-)

	Thomas