* pread() over NFS (again) [1.5.5.4] @ 2008-06-26 16:40 Christian Holtje 2008-06-26 20:46 ` Shawn O. Pearce 0 siblings, 1 reply; 16+ messages in thread From: Christian Holtje @ 2008-06-26 16:40 UTC (permalink / raw) To: git I have read all the threads on git having trouble with pread() and I didn't see anything to help. Situation: git commands run on system "dev2" which is Linux 2.6.9-42.0.8.ELsmp on an NFS mounted directory. The NFS server is "dev1" which is Linux 2.4.28 I get the following output from git when doing a fetch: remote: Counting objects: 406, done. remote: Compressing objects: 100% (198/198), done. remote: Total 253 (delta 127), reused 150 (delta 55) Receiving objects: 100% (253/253), 5.27 MiB | 9136 KiB/s, done. fatal: cannot pread pack file: No such file or directory fatal: index-pack failed The end of the strace looks like so: pread(3, "", 205, 1373) = 0 write(2, "fatal: cannot pread pack file: N"..., 57) = 57 I have ran strace -o /somedir/q -ff git fetch and have the straces available at: http://docwhat.gerf.org/files/tmp/git-strace-20080626.tgz I have worked around the problem by running the fetch from the nfs server. It looks like I can recreate this at will and I'm willing to help figure this out. Thanks! Ciao! ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 16:40 pread() over NFS (again) [1.5.5.4] Christian Holtje @ 2008-06-26 20:46 ` Shawn O. Pearce 2008-06-26 20:56 ` Junio C Hamano 0 siblings, 1 reply; 16+ messages in thread From: Shawn O. Pearce @ 2008-06-26 20:46 UTC (permalink / raw) To: Christian Holtje; +Cc: git Christian Holtje <docwhat@gmail.com> wrote: > I have read all the threads on git having trouble with pread() and I > didn't see anything to help. ... > Receiving objects: 100% (253/253), 5.27 MiB | 9136 KiB/s, done. > fatal: cannot pread pack file: No such file or directory > fatal: index-pack failed > > The end of the strace looks like so: > pread(3, "", 205, 1373) = 0 > write(2, "fatal: cannot pread pack file: N"..., 57) = 57 Hmmph. So pread for a length of 205 can return 0 on NFS? Is this a transient error? If so, perhaps a patch like this might help: diff --git a/index-pack.c b/index-pack.c index 5ac91ba..737f757 100644 --- a/index-pack.c +++ b/index-pack.c @@ -309,14 +309,19 @@ static void *get_data_from_pack(struct object_entry *obj) unsigned char *src, *data; z_stream stream; int st; + int attempts = 0; src = xmalloc(len); data = src; do { ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); - if (n <= 0) + if (n <= 0) { + if (n == 0 && ++attempts < 10) + continue; die("cannot pread pack file: %s", strerror(errno)); + } rdy += n; + attempts = 0; } while (rdy < len); data = xmalloc(obj->size); memset(&stream, 0, sizeof(stream)); The file shouldn't be short unless someone truncated it, or there is a bug in index-pack. Neither is very likely, but I don't think we would want to retry pread'ing the same block forever. -- Shawn. ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 20:46 ` Shawn O. Pearce @ 2008-06-26 20:56 ` Junio C Hamano 2008-06-26 21:05 ` Shawn O. Pearce 2008-06-26 23:36 ` logank 0 siblings, 2 replies; 16+ messages in thread From: Junio C Hamano @ 2008-06-26 20:56 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Christian Holtje, git "Shawn O. Pearce" <spearce@spearce.org> writes: > Christian Holtje <docwhat@gmail.com> wrote: >> I have read all the threads on git having trouble with pread() and I >> didn't see anything to help. > ... >> Receiving objects: 100% (253/253), 5.27 MiB | 9136 KiB/s, done. >> fatal: cannot pread pack file: No such file or directory >> fatal: index-pack failed >> >> The end of the strace looks like so: >> pread(3, "", 205, 1373) = 0 >> write(2, "fatal: cannot pread pack file: N"..., 57) = 57 > > Hmmph. So pread for a length of 205 can return 0 on NFS? Is this > a transient error? If so, perhaps a patch like this might help: > > diff --git a/index-pack.c b/index-pack.c > index 5ac91ba..737f757 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -309,14 +309,19 @@ static void *get_data_from_pack(struct object_entry *obj) > unsigned char *src, *data; > z_stream stream; > int st; > + int attempts = 0; > > src = xmalloc(len); > data = src; > do { > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > - if (n <= 0) > + if (n <= 0) { > + if (n == 0 && ++attempts < 10) > + continue; > die("cannot pread pack file: %s", strerror(errno)); > + } > rdy += n; > + attempts = 0; > } while (rdy < len); > data = xmalloc(obj->size); > memset(&stream, 0, sizeof(stream)); > > > The file shouldn't be short unless someone truncated it, or there > is a bug in index-pack. Neither is very likely, but I don't think > we would want to retry pread'ing the same block forever. I don't think we would want to retry even once. Return value of 0 from pread is defined to be an EOF, isn't it? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 20:56 ` Junio C Hamano @ 2008-06-26 21:05 ` Shawn O. Pearce 2008-06-26 21:36 ` Christian Holtje 2008-06-26 22:04 ` Junio C Hamano 2008-06-26 23:36 ` logank 1 sibling, 2 replies; 16+ messages in thread From: Shawn O. Pearce @ 2008-06-26 21:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: Christian Holtje, git Junio C Hamano <gitster@pobox.com> wrote: > "Shawn O. Pearce" <spearce@spearce.org> writes: > > Christian Holtje <docwhat@gmail.com> wrote: > >> I have read all the threads on git having trouble with pread() and I > >> didn't see anything to help. > > ... > >> Receiving objects: 100% (253/253), 5.27 MiB | 9136 KiB/s, done. > >> fatal: cannot pread pack file: No such file or directory > >> fatal: index-pack failed > >> > >> The end of the strace looks like so: > >> pread(3, "", 205, 1373) = 0 > >> write(2, "fatal: cannot pread pack file: N"..., 57) = 57 > > > > Hmmph. So pread for a length of 205 can return 0 on NFS? Is this > > a transient error? If so, perhaps a patch like this might help: ... > > The file shouldn't be short unless someone truncated it, or there > > is a bug in index-pack. Neither is very likely, but I don't think > > we would want to retry pread'ing the same block forever. > > I don't think we would want to retry even once. Return value of 0 from > pread is defined to be an EOF, isn't it? Indeed, it is defined to be EOF, but EOF here makes no sense. We have a file position we saw once before as the start of a delta. We wrote it down to disk. We want to go back and open it up, as we have the base decompressed and in memory and need to compute the SHA-1 of the object that resides at this offset. And *wham* we get an EOF. Where there should be data. Where we know there is data. I'm open to the idea that index-pack has a bug, but I doubt it. We shovel hundreds of megabytes through that on a daily basis across all of the git users, and nobody ever sees it crash out with an EOF in the middle of an object it knows to be present. Except poor Christian on NFS. Actually, I think the last time someone reported something like this in Git it turned out to be an NFS kernel bug. I didn't quote it in my reply to him, but I think he did say this was a linux client, linux server. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 21:05 ` Shawn O. Pearce @ 2008-06-26 21:36 ` Christian Holtje 2008-06-26 22:04 ` Junio C Hamano 1 sibling, 0 replies; 16+ messages in thread From: Christian Holtje @ 2008-06-26 21:36 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Junio C Hamano, git On Jun 26, 2008, at 5:05 PM, Shawn O. Pearce wrote: > Junio C Hamano <gitster@pobox.com> wrote: >> "Shawn O. Pearce" <spearce@spearce.org> writes: >>> Christian Holtje <docwhat@gmail.com> wrote: >>>> I have read all the threads on git having trouble with pread() >>>> and I >>>> didn't see anything to help. >>> ... >>>> Receiving objects: 100% (253/253), 5.27 MiB | 9136 KiB/s, done. >>>> fatal: cannot pread pack file: No such file or directory >>>> fatal: index-pack failed >>>> >>>> The end of the strace looks like so: >>>> pread(3, "", 205, 1373) = 0 >>>> write(2, "fatal: cannot pread pack file: N"..., 57) = 57 >>> >>> Hmmph. So pread for a length of 205 can return 0 on NFS? Is this >>> a transient error? If so, perhaps a patch like this might help: > ... >>> The file shouldn't be short unless someone truncated it, or there >>> is a bug in index-pack. Neither is very likely, but I don't think >>> we would want to retry pread'ing the same block forever. >> >> I don't think we would want to retry even once. Return value of 0 >> from >> pread is defined to be an EOF, isn't it? > > Indeed, it is defined to be EOF, but EOF here makes no sense. > > We have a file position we saw once before as the start of a delta. > We wrote it down to disk. We want to go back and open it up, as > we have the base decompressed and in memory and need to compute > the SHA-1 of the object that resides at this offset. > > And *wham* we get an EOF. Where there should be data. Where we > know there is data. > > I'm open to the idea that index-pack has a bug, but I doubt it. > We shovel hundreds of megabytes through that on a daily basis > across all of the git users, and nobody ever sees it crash out > with an EOF in the middle of an object it knows to be present. > Except poor Christian on NFS. > > Actually, I think the last time someone reported something like this > in Git it turned out to be an NFS kernel bug. I didn't quote it > in my reply to him, but I think he did say this was a linux client, > linux server. I did see the threads that talked about NFS, but I couldn't find any matching messages in the linux kernel mailing list. If someone can give me a pointer to where this picked up on other mailing lists (say, via a URL) then I'll take that back to IT as a reason to upgrade. Would it be a bug in the client or server? I'd assume client... The other NFS/pread() email thread was also a client of linux 2.6.9.... Ciao! ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 21:05 ` Shawn O. Pearce 2008-06-26 21:36 ` Christian Holtje @ 2008-06-26 22:04 ` Junio C Hamano 2008-06-26 22:07 ` Shawn O. Pearce 1 sibling, 1 reply; 16+ messages in thread From: Junio C Hamano @ 2008-06-26 22:04 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Christian Holtje, git "Shawn O. Pearce" <spearce@spearce.org> writes: > We have a file position we saw once before as the start of a delta. > We wrote it down to disk. We want to go back and open it up, as > we have the base decompressed and in memory and need to compute > the SHA-1 of the object that resides at this offset. > > And *wham* we get an EOF. Where there should be data. Where we > know there is data. We have written that earlier in the same process? Are we playing games with mixed mmap() and pread()? Is fsync() or msync() or unmap/remap needed? > Actually, I think the last time someone reported something like this > in Git it turned out to be an NFS kernel bug. I didn't quote it > in my reply to him, but I think he did say this was a linux client, > linux server. This is getting into the area Linus would immediately know the answer to, but he is away for this week. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 22:04 ` Junio C Hamano @ 2008-06-26 22:07 ` Shawn O. Pearce 0 siblings, 0 replies; 16+ messages in thread From: Shawn O. Pearce @ 2008-06-26 22:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: Christian Holtje, git Junio C Hamano <gitster@pobox.com> wrote: > "Shawn O. Pearce" <spearce@spearce.org> writes: > > > We have a file position we saw once before as the start of a delta. > > We wrote it down to disk. We want to go back and open it up, as > > we have the base decompressed and in memory and need to compute > > the SHA-1 of the object that resides at this offset. > > > > And *wham* we get an EOF. Where there should be data. Where we > > know there is data. > > We have written that earlier in the same process? Are we playing games > with mixed mmap() and pread()? Is fsync() or msync() or unmap/remap > needed? Yes, we wrote the very same file earlier in the process. index-pack _used_ to use a mixed write/mmap game. Today it uses write, followed by later pread. No mmap. We had major performance issues on Mac OS X with mmap, not to mention the coherency issues that can arise from using it. So index-pack is mmap free[*1*]. > > Actually, I think the last time someone reported something like this > > in Git it turned out to be an NFS kernel bug. I didn't quote it > > in my reply to him, but I think he did say this was a linux client, > > linux server. > > This is getting into the area Linus would immediately know the answer to, > but he is away for this week. Yea, I was expecting a reply from him, but that explains it. *1* There is usage of mmap in index-pack, but only to access _existing_ objects and packs from the object store. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 20:56 ` Junio C Hamano 2008-06-26 21:05 ` Shawn O. Pearce @ 2008-06-26 23:36 ` logank 2008-06-26 23:38 ` Junio C Hamano ` (2 more replies) 1 sibling, 3 replies; 16+ messages in thread From: logank @ 2008-06-26 23:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Shawn O. Pearce, Christian Holtje, git On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote: >> "The file shouldn't be short unless someone truncated it, or there >> is a bug in index-pack. Neither is very likely, but I don't think >> we would want to retry pread'ing the same block forever. > > I don't think we would want to retry even once. Return value of 0 > from > pread is defined to be an EOF, isn't it? No, it seems to be a simple error-out in this case. We have 2.4.20 systems with nfs-utils 0.3.3 and used to frequently get the same error while pushing. I made a similar change back in February and haven't had a problem since: diff --git a/index-pack.c b/index-pack.c index 5ac91ba..85c8bdb 100644 --- a/index-pack.c +++ b/index-pack.c @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct object_entry *obj) src = xmalloc(len); data = src; do { + // It appears that if multiple threads read across NFS, the + // second read will fail. I know this is awful, but we wait for + // a little bit and try again. ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); + if (n <= 0) { + sleep(1); + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); + } if (n <= 0) die("cannot pread pack file: %s", strerror(errno)); rdy += n; I use a sleep request since it seems less likely that the other thread will have an outstanding request after a second of waiting. -- Logan Kennelly ,,, (. .) --ooO-(_)-Ooo-- ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 23:36 ` logank @ 2008-06-26 23:38 ` Junio C Hamano 2008-06-27 2:57 ` J. Bruce Fields 2008-06-27 2:54 ` J. Bruce Fields 2008-06-27 13:54 ` Christian Holtje 2 siblings, 1 reply; 16+ messages in thread From: Junio C Hamano @ 2008-06-26 23:38 UTC (permalink / raw) To: logank; +Cc: J. Bruce Fields, Shawn O. Pearce, Christian Holtje, git logank@sent.com writes: > On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote: > >>> "The file shouldn't be short unless someone truncated it, or there >>> is a bug in index-pack. Neither is very likely, but I don't think >>> we would want to retry pread'ing the same block forever. >> >> I don't think we would want to retry even once. Return value of 0 >> from >> pread is defined to be an EOF, isn't it? > > No, it seems to be a simple error-out in this case. We have 2.4.20 > systems with nfs-utils 0.3.3 and used to frequently get the same error > while pushing. I made a similar change back in February and haven't > had a problem since: > > diff --git a/index-pack.c b/index-pack.c > index 5ac91ba..85c8bdb 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct > object_entry *obj) > src = xmalloc(len); > data = src; > do { > + // It appears that if multiple threads read across NFS, the > + // second read will fail. I know this is awful, but we wait for > + // a little bit and try again. > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > + if (n <= 0) { > + sleep(1); > + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > + } > if (n <= 0) > die("cannot pread pack file: %s", strerror(errno)); > rdy += n; > > I use a sleep request since it seems less likely that the other thread > will have an outstanding request after a second of waiting. Gaah. Don't we have NFS experts in house? Bruce, perhaps? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 23:38 ` Junio C Hamano @ 2008-06-27 2:57 ` J. Bruce Fields 2008-06-27 14:50 ` Trond Myklebust 0 siblings, 1 reply; 16+ messages in thread From: J. Bruce Fields @ 2008-06-27 2:57 UTC (permalink / raw) To: Junio C Hamano Cc: logank, Shawn O. Pearce, Christian Holtje, git, Trond Myklebust On Thu, Jun 26, 2008 at 04:38:40PM -0700, Junio C Hamano wrote: > logank@sent.com writes: > > > On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote: > > > >>> "The file shouldn't be short unless someone truncated it, or there > >>> is a bug in index-pack. Neither is very likely, but I don't think > >>> we would want to retry pread'ing the same block forever. > >> > >> I don't think we would want to retry even once. Return value of 0 > >> from > >> pread is defined to be an EOF, isn't it? > > > > No, it seems to be a simple error-out in this case. We have 2.4.20 > > systems with nfs-utils 0.3.3 and used to frequently get the same error > > while pushing. I made a similar change back in February and haven't > > had a problem since: > > > > diff --git a/index-pack.c b/index-pack.c > > index 5ac91ba..85c8bdb 100644 > > --- a/index-pack.c > > +++ b/index-pack.c > > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct > > object_entry *obj) > > src = xmalloc(len); > > data = src; > > do { > > + // It appears that if multiple threads read across NFS, the > > + // second read will fail. I know this is awful, but we wait for > > + // a little bit and try again. > > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > > + if (n <= 0) { > > + sleep(1); > > + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > > + } > > if (n <= 0) > > die("cannot pread pack file: %s", strerror(errno)); > > rdy += n; > > > > I use a sleep request since it seems less likely that the other thread > > will have an outstanding request after a second of waiting. > > Gaah. Don't we have NFS experts in house? Bruce, perhaps? Trond, you don't have any idea why a 2.6.9-42.0.8.ELsmp client (2.4.28 server) might be returning spurious 0's from pread()? Seems like everything is happening from that one client--the file isn't being simultaneously accessed from the server or from another client. --b. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-27 2:57 ` J. Bruce Fields @ 2008-06-27 14:50 ` Trond Myklebust 2008-06-30 0:32 ` Shawn O. Pearce 0 siblings, 1 reply; 16+ messages in thread From: Trond Myklebust @ 2008-06-27 14:50 UTC (permalink / raw) To: J. Bruce Fields Cc: Junio C Hamano, logank, Shawn O. Pearce, Christian Holtje, git, Trond Myklebust On Thu, 2008-06-26 at 22:57 -0400, J. Bruce Fields wrote: > On Thu, Jun 26, 2008 at 04:38:40PM -0700, Junio C Hamano wrote: > > logank@sent.com writes: > > > > > On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote: > > > > > >>> "The file shouldn't be short unless someone truncated it, or there > > >>> is a bug in index-pack. Neither is very likely, but I don't think > > >>> we would want to retry pread'ing the same block forever. > > >> > > >> I don't think we would want to retry even once. Return value of 0 > > >> from > > >> pread is defined to be an EOF, isn't it? > > > > > > No, it seems to be a simple error-out in this case. We have 2.4.20 > > > systems with nfs-utils 0.3.3 and used to frequently get the same error > > > while pushing. I made a similar change back in February and haven't > > > had a problem since: > > > > > > diff --git a/index-pack.c b/index-pack.c > > > index 5ac91ba..85c8bdb 100644 > > > --- a/index-pack.c > > > +++ b/index-pack.c > > > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct > > > object_entry *obj) > > > src = xmalloc(len); > > > data = src; > > > do { > > > + // It appears that if multiple threads read across NFS, the > > > + // second read will fail. I know this is awful, but we wait for > > > + // a little bit and try again. > > > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > > > + if (n <= 0) { > > > + sleep(1); > > > + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > > > + } > > > if (n <= 0) > > > die("cannot pread pack file: %s", strerror(errno)); > > > rdy += n; > > > > > > I use a sleep request since it seems less likely that the other thread > > > will have an outstanding request after a second of waiting. > > > > Gaah. Don't we have NFS experts in house? Bruce, perhaps? > > Trond, you don't have any idea why a 2.6.9-42.0.8.ELsmp client (2.4.28 > server) might be returning spurious 0's from pread()? > > Seems like everything is happening from that one client--the file isn't > being simultaneously accessed from the server or from another client. Is the file only being read, or could there be a simultaneous write to the same file? I'm surmising this could be an effect resulting from simultaneous cache invalidations: prior to Linux 2.6.20 or so, we weren't rigorously following the VFS/VM rules for page locking, and so page cache invalidation in particular could have some curious side-effects. Cheers Trond -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@netapp.com www.netapp.com ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-27 14:50 ` Trond Myklebust @ 2008-06-30 0:32 ` Shawn O. Pearce 2008-06-30 19:09 ` Nicolas Pitre 0 siblings, 1 reply; 16+ messages in thread From: Shawn O. Pearce @ 2008-06-30 0:32 UTC (permalink / raw) To: Trond Myklebust Cc: J. Bruce Fields, Junio C Hamano, logank, Christian Holtje, git, Trond Myklebust Trond Myklebust <Trond.Myklebust@netapp.com> wrote: > On Thu, 2008-06-26 at 22:57 -0400, J. Bruce Fields wrote: > > On Thu, Jun 26, 2008 at 04:38:40PM -0700, Junio C Hamano wrote: > > > logank@sent.com writes: > > > > > > > On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote: > > > > > > > >>> "The file shouldn't be short unless someone truncated it, or there > > > >>> is a bug in index-pack. Neither is very likely, but I don't think > > > >>> we would want to retry pread'ing the same block forever. > > > >> > > > >> I don't think we would want to retry even once. Return value of 0 > > > >> from > > > >> pread is defined to be an EOF, isn't it? > > > > > > > > No, it seems to be a simple error-out in this case. We have 2.4.20 > > > > systems with nfs-utils 0.3.3 and used to frequently get the same error > > > > while pushing. I made a similar change back in February and haven't > > > > had a problem since: > > > > > > > > diff --git a/index-pack.c b/index-pack.c > > > > index 5ac91ba..85c8bdb 100644 > > > > --- a/index-pack.c > > > > +++ b/index-pack.c > > > > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct > > > > object_entry *obj) > > > > src = xmalloc(len); > > > > data = src; > > > > do { > > > > + // It appears that if multiple threads read across NFS, the > > > > + // second read will fail. I know this is awful, but we wait for > > > > + // a little bit and try again. > > > > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > > > > + if (n <= 0) { > > > > + sleep(1); > > > > + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > > > > + } > > > > if (n <= 0) > > > > die("cannot pread pack file: %s", strerror(errno)); > > > > rdy += n; > > > > > > > > I use a sleep request since it seems less likely that the other thread > > > > will have an outstanding request after a second of waiting. > > > > > > Gaah. Don't we have NFS experts in house? Bruce, perhaps? > > > > Trond, you don't have any idea why a 2.6.9-42.0.8.ELsmp client (2.4.28 > > server) might be returning spurious 0's from pread()? > > > > Seems like everything is happening from that one client--the file isn't > > being simultaneously accessed from the server or from another client. > > Is the file only being read, or could there be a simultaneous write to > the same file? I'm surmising this could be an effect resulting from > simultaneous cache invalidations: prior to Linux 2.6.20 or so, we > weren't rigorously following the VFS/VM rules for page locking, and so > page cache invalidation in particular could have some curious > side-effects. The file was created and opened O_CREAT|O_EXCL|O_RDWR, by this process, written linearly using write(2), without any lseeks. We kept the file descriptor open and starting issuing pread(2) calls for earlier offsets we had alread written. One of those kicks back EOF far too early (and results in this bug report). Note the only accesses we are using is write(2) and pread(2), and once we start reading we don't ever go back to writing. The pread(2) calls are typically issued in ascending offsets, and we read each position only once. This is to try and take advantage of any read-ahead the kernel may be able to do. The pread(2) calls are rarely (if ever) on a block/page boundary. Nobody else should know about this file. Its written to a temporary name and no other well behaved processes would attempt to read the file until it gets closed and renamed to its final destination. We haven't reached that far in the processing when we get this error, so there should be only one file descriptor open on the inode, and its the same one that wrote the data. -- Shawn. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-30 0:32 ` Shawn O. Pearce @ 2008-06-30 19:09 ` Nicolas Pitre 0 siblings, 0 replies; 16+ messages in thread From: Nicolas Pitre @ 2008-06-30 19:09 UTC (permalink / raw) To: Shawn O. Pearce Cc: Trond Myklebust, J. Bruce Fields, Junio C Hamano, logank, Christian Holtje, git, Trond Myklebust On Sun, 29 Jun 2008, Shawn O. Pearce wrote: > Trond Myklebust <Trond.Myklebust@netapp.com> wrote: > > Is the file only being read, or could there be a simultaneous write to > > the same file? I'm surmising this could be an effect resulting from > > simultaneous cache invalidations: prior to Linux 2.6.20 or so, we > > weren't rigorously following the VFS/VM rules for page locking, and so > > page cache invalidation in particular could have some curious > > side-effects. > > The file was created and opened O_CREAT|O_EXCL|O_RDWR, by this > process, written linearly using write(2), without any lseeks. > We kept the file descriptor open and starting issuing pread(2) > calls for earlier offsets we had alread written. One of those > kicks back EOF far too early (and results in this bug report). > > Note the only accesses we are using is write(2) and pread(2), and > once we start reading we don't ever go back to writing. That's not exact. With a thin pack, we continue appending data to the file after a bunch of pread() have occurred. And only after those pread()'s do we know that we actually have a thin pack. > The pread(2) > calls are typically issued in ascending offsets, and we read each > position only once. This is to try and take advantage of any > read-ahead the kernel may be able to do. That's not exact. The pread() calls are done when resolving deltas, hence a base object is read and every deltas based on it are recursively resolved to find their SHA1 signature. Then another base is picked up and the same process repeated. And in practice all those delta chains are all interleaced in the pack file due to the fact that objects are stored so to optimize access to recent commits. Therefore they're more or less random. Nicolas ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 23:36 ` logank 2008-06-26 23:38 ` Junio C Hamano @ 2008-06-27 2:54 ` J. Bruce Fields 2008-06-27 13:44 ` Christian Holtje 2008-06-27 13:54 ` Christian Holtje 2 siblings, 1 reply; 16+ messages in thread From: J. Bruce Fields @ 2008-06-27 2:54 UTC (permalink / raw) To: logank; +Cc: Junio C Hamano, Shawn O. Pearce, Christian Holtje, git On Thu, Jun 26, 2008 at 04:36:27PM -0700, logank@sent.com wrote: > On Jun 26, 2008, at 1:56 PM, Junio C Hamano wrote: > >>> "The file shouldn't be short unless someone truncated it, or there >>> is a bug in index-pack. Neither is very likely, but I don't think >>> we would want to retry pread'ing the same block forever. >> >> I don't think we would want to retry even once. Return value of 0 >> from >> pread is defined to be an EOF, isn't it? > > No, it seems to be a simple error-out in this case. We have 2.4.20 > systems That version's for the client or the server? --b. > with nfs-utils 0.3.3 and used to frequently get the same error > while pushing. I made a similar change back in February and haven't had a > problem since: > > diff --git a/index-pack.c b/index-pack.c > index 5ac91ba..85c8bdb 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct object_entry > *obj) > src = xmalloc(len); > data = src; > do { > + // It appears that if multiple threads read across NFS, the > + // second read will fail. I know this is awful, but we wait for > + // a little bit and try again. > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > + if (n <= 0) { > + sleep(1); > + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > + } > if (n <= 0) > die("cannot pread pack file: %s", strerror(errno)); > rdy += n; > > I use a sleep request since it seems less likely that the other thread > will have an outstanding request after a second of waiting. > > -- > Logan Kennelly > ,,, > (. .) > --ooO-(_)-Ooo-- > > > > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-27 2:54 ` J. Bruce Fields @ 2008-06-27 13:44 ` Christian Holtje 0 siblings, 0 replies; 16+ messages in thread From: Christian Holtje @ 2008-06-27 13:44 UTC (permalink / raw) To: J. Bruce Fields; +Cc: logank, Junio C Hamano, Shawn O. Pearce, git On Jun 26, 2008, at 10:54 PM, J. Bruce Fields wrote: > On Thu, Jun 26, 2008 at 04:36:27PM -0700, logank@sent.com wrote: >> No, it seems to be a simple error-out in this case. We have 2.4.20 >> systems > > That version's for the client or the server? For the bug I sent the information is: Server: Linux dev1 2.4.28 #3 SMP Tue Jan 18 14:59:40 EST 2005 i686 i686 i386 GNU/Linux Red Hat Linux release 9 (Shrike) Client: Linux dev2 2.6.9-42.0.8.ELsmp #1 SMP Tue Jan 30 12:18:01 EST 2007 x86_64 x86_64 x86_64 GNU/Linux CentOS release 4.4 (Final) Ciao! ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: pread() over NFS (again) [1.5.5.4] 2008-06-26 23:36 ` logank 2008-06-26 23:38 ` Junio C Hamano 2008-06-27 2:54 ` J. Bruce Fields @ 2008-06-27 13:54 ` Christian Holtje 2 siblings, 0 replies; 16+ messages in thread From: Christian Holtje @ 2008-06-27 13:54 UTC (permalink / raw) To: logank; +Cc: Junio C Hamano, Shawn O. Pearce, git On Jun 26, 2008, at 7:36 PM, logank@sent.com wrote: > diff --git a/index-pack.c b/index-pack.c > index 5ac91ba..85c8bdb 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -313,7 +313,14 @@ static void *get_data_from_pack(struct > object_entry *obj) > src = xmalloc(len); > data = src; > do { > + // It appears that if multiple threads read across NFS, the > + // second read will fail. I know this is awful, but we wait for > + // a little bit and try again. > ssize_t n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > + if (n <= 0) { > + sleep(1); > + n = pread(pack_fd, data + rdy, len - rdy, from + rdy); > + } > if (n <= 0) > die("cannot pread pack file: %s", strerror(errno)); > rdy += n; This does work. But the "unpacking objects" phase becomes very slow. I had 86 objects and I could see every time it did a sleep because the counter would wait a second (or more) before the next object was unpacked. Ciao! ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2008-06-30 19:10 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-06-26 16:40 pread() over NFS (again) [1.5.5.4] Christian Holtje 2008-06-26 20:46 ` Shawn O. Pearce 2008-06-26 20:56 ` Junio C Hamano 2008-06-26 21:05 ` Shawn O. Pearce 2008-06-26 21:36 ` Christian Holtje 2008-06-26 22:04 ` Junio C Hamano 2008-06-26 22:07 ` Shawn O. Pearce 2008-06-26 23:36 ` logank 2008-06-26 23:38 ` Junio C Hamano 2008-06-27 2:57 ` J. Bruce Fields 2008-06-27 14:50 ` Trond Myklebust 2008-06-30 0:32 ` Shawn O. Pearce 2008-06-30 19:09 ` Nicolas Pitre 2008-06-27 2:54 ` J. Bruce Fields 2008-06-27 13:44 ` Christian Holtje 2008-06-27 13:54 ` Christian Holtje
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).