* upload-pack timing issue on windows? @ 2010-02-05 23:51 Erik Faye-Lund 2010-02-06 10:06 ` Johannes Sixt 0 siblings, 1 reply; 11+ messages in thread From: Erik Faye-Lund @ 2010-02-05 23:51 UTC (permalink / raw) To: Git Mailing List, msysGit As some of you might know, I've been working on porting git-daemon to Windows for quite some time now. As it stands now, there's really only one known issue that is blocking on my end here: Something weird happens *sometimes* when upload-pack is exiting, leading to a client dying with a "fatal: read error: Invalid argument\nfatal: early EOF"-error. If I place a sleep(1) at some place after exiting the while(1)-loop in create_pack() in upload-pack.c, the symptom goes away. create_pack() contains some async-code, but this doesn't seem to be triggered in my minimal case at all. I've tried flushing stdout and stderr explicitly, no luck. How often the issue triggers seems to depend on two things, the size of the repo and the connection speed. If I clone from localhost, I can't get it to trigger at all. If the repo is of some size, it triggers rarely. However if I have a repo with only one commit, it seems to trigger every single time for me. I've noticed that one of the last things that happens is a call to poll with nfds=1. This triggers a special case in our poll-emulation on Windows; but removing that special case hasn't given me any positive results. Does anyone have a hunch about what might trigger this issue? -- Erik "kusma" Faye-Lund ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: upload-pack timing issue on windows? 2010-02-05 23:51 upload-pack timing issue on windows? Erik Faye-Lund @ 2010-02-06 10:06 ` Johannes Sixt 2010-02-06 12:01 ` [msysGit] " Erik Faye-Lund 0 siblings, 1 reply; 11+ messages in thread From: Johannes Sixt @ 2010-02-06 10:06 UTC (permalink / raw) To: kusmabite; +Cc: msysgit, Git Mailing List On Samstag, 6. Februar 2010, Erik Faye-Lund wrote: > As some of you might know, I've been working on porting git-daemon to > Windows for quite some time now. As it stands now, there's really only > one known issue that is blocking on my end here: > > Something weird happens *sometimes* when upload-pack is exiting, > leading to a client dying with a "fatal: read error: Invalid > argument\nfatal: early EOF"-error. If I place a sleep(1) at some place > after exiting the while(1)-loop in create_pack() in upload-pack.c, the > symptom goes away. create_pack() contains some async-code, but this > doesn't seem to be triggered in my minimal case at all. I've tried > flushing stdout and stderr explicitly, no luck. I've observed timing related issues in upload-pack as well, but only in the case where the die() is called from the async thread. This is the reason why t5530 does not pass. But your case seems to be different - i.e. there is no die() involved. Sorry, can't help more... Perhaps use Procmon to analyse differences among the different successful and failing cases. Try hacking fetch-pack so that it does not announce side-band(-64k). Perhaps it makes a difference. -- Hannes ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-02-06 10:06 ` Johannes Sixt @ 2010-02-06 12:01 ` Erik Faye-Lund 2010-02-06 22:18 ` Johannes Sixt 0 siblings, 1 reply; 11+ messages in thread From: Erik Faye-Lund @ 2010-02-06 12:01 UTC (permalink / raw) To: Johannes Sixt; +Cc: msysgit, Git Mailing List On Sat, Feb 6, 2010 at 11:06 AM, Johannes Sixt <j6t@kdbg.org> wrote: > On Samstag, 6. Februar 2010, Erik Faye-Lund wrote: >> As some of you might know, I've been working on porting git-daemon to >> Windows for quite some time now. As it stands now, there's really only >> one known issue that is blocking on my end here: >> >> Something weird happens *sometimes* when upload-pack is exiting, >> leading to a client dying with a "fatal: read error: Invalid >> argument\nfatal: early EOF"-error. If I place a sleep(1) at some place >> after exiting the while(1)-loop in create_pack() in upload-pack.c, the >> symptom goes away. create_pack() contains some async-code, but this >> doesn't seem to be triggered in my minimal case at all. I've tried >> flushing stdout and stderr explicitly, no luck. > > I've observed timing related issues in upload-pack as well, but only in the > case where the die() is called from the async thread. This is the reason why > t5530 does not pass. > > But your case seems to be different - i.e. there is no die() involved. Sorry, > can't help more... > Yeah, it's probably not the same case, but I certainly do find it interesting that we seemingly have two separate timing-related around here somewhere... > Perhaps use Procmon to analyse differences among the different successful and > failing cases. > I'm not entirely sure what to look for, but I do see that there's difference. There's about 3.5k lines of logging from git.exe, git-daemon.exe and git-upload-pack.exe for the failure case versus 2.5k for the successful case. And the last sequence of TCP Send in the success case is a send of 8 bytes, followed by a send of 212 bytes, followed again by a send of 1 byte. In the failure case, there's only a send of 8 bytes in the end. This sequence is reported as sent by git-daemon.exe. In fact, all TCP actions are reported from git-daemon.exe, and apart from the last sequence the lengths are reported as identical. > Try hacking fetch-pack so that it does not announce side-band(-64k). Perhaps > it makes a difference. > This didn't make any difference. I removed "side-band" and "side-band-64k" from capabilities in send_ref() in upload-pack.c, as well as the "if (server_supports("side-band<...>"-lines in builtin-fetch-pack.c. While I was at it, I also tried to disable all other capabilities; no luck. However, I have tracked down a bit of what goes on in the client. There's a call to read_in_full, called from pack-write.c, line 246 that fails in the failure-case, but not in the success-case. This is where the client expects "pack\tSHA-1" or "keep\tSHA-1". There "fatal: early EOF"-messages seems to originate from index-pack.c, line 197. This is the first line of code in parse_pack_header(), it's also AFAICT the first call-site for any read(0, <...>) (though fill()). -- Erik "kusma" Faye-Lund ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-02-06 12:01 ` [msysGit] " Erik Faye-Lund @ 2010-02-06 22:18 ` Johannes Sixt 2010-02-08 11:18 ` Erik Faye-Lund 0 siblings, 1 reply; 11+ messages in thread From: Johannes Sixt @ 2010-02-06 22:18 UTC (permalink / raw) To: kusmabite; +Cc: msysgit, Git Mailing List On Samstag, 6. Februar 2010, Erik Faye-Lund wrote: > However, I have tracked down a bit of what goes on in the client. > There's a call to read_in_full, called from pack-write.c, line 246 > that fails in the failure-case, but not in the success-case. This is > where the client expects "pack\tSHA-1" or "keep\tSHA-1". There "fatal: > early EOF"-messages seems to originate from index-pack.c, line 197. > This is the first line of code in parse_pack_header(), it's also > AFAICT the first call-site for any read(0, <...>) (though fill()). This looks like upload-pack died without sending enough to fill a pack header. Try merging this branch: git://repo.or.cz/git/mingw/j6t.git async-in-thread It contains your changes to start_async plus a refinement of die() when it is called from the async procedure (it passes t5530, for example). It is also converted to pthreads, and therefore also works on Unix. The new implementation of start_async is more careful about the file handles, though not so much on Windows. If there's no change for you, then you could look into implementing fcntl(F_GETFD/SETFD, FD_CLOEXEC), which are currently ignored, on top of this branch, using Get/SetHandleInformation(). Background: On Unix, we need FD_CLOEXEC so that the fds that are meant for the async thread do not remain open in an unrelated child process; on Windows, we are just lucky and can get away without FD_CLOEXEC because our pipe()s are non-inheritable and async only work with pipes. But once we pass other fds to the async procedure, we need a working FD_CLOEXEC. Perhaps something in this direction is related to your problem. You could push out your current state of the git-daemon and a recipe to reproduce the problem. Perhaps I find some time to look into it. -- Hannes ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-02-06 22:18 ` Johannes Sixt @ 2010-02-08 11:18 ` Erik Faye-Lund 2010-02-10 20:41 ` Jay Soffian 2010-08-22 23:27 ` Erik Faye-Lund 0 siblings, 2 replies; 11+ messages in thread From: Erik Faye-Lund @ 2010-02-08 11:18 UTC (permalink / raw) To: Johannes Sixt; +Cc: msysgit, Git Mailing List On Sat, Feb 6, 2010 at 11:18 PM, Johannes Sixt <j6t@kdbg.org> wrote: > On Samstag, 6. Februar 2010, Erik Faye-Lund wrote: >> However, I have tracked down a bit of what goes on in the client. >> There's a call to read_in_full, called from pack-write.c, line 246 >> that fails in the failure-case, but not in the success-case. This is >> where the client expects "pack\tSHA-1" or "keep\tSHA-1". There "fatal: >> early EOF"-messages seems to originate from index-pack.c, line 197. >> This is the first line of code in parse_pack_header(), it's also >> AFAICT the first call-site for any read(0, <...>) (though fill()). > > This looks like upload-pack died without sending enough to fill a pack header. > > Try merging this branch: > > git://repo.or.cz/git/mingw/j6t.git async-in-thread > > It contains your changes to start_async plus a refinement of die() when it is > called from the async procedure (it passes t5530, for example). It is also > converted to pthreads, and therefore also works on Unix. The new > implementation of start_async is more careful about the file handles, though > not so much on Windows. > > If there's no change for you, then you could look into implementing > fcntl(F_GETFD/SETFD, FD_CLOEXEC), which are currently ignored, on top of this > branch, using Get/SetHandleInformation(). > Thanks a lot. I tried merging it, but the issue still pops up. I also tried to implement fcntl(F_GETFD/SETFD, FD_CLOEXEC), still no dice. I'm not entirely sure if I did it correctly, though. > Background: On Unix, we need FD_CLOEXEC so that the fds that are meant for the > async thread do not remain open in an unrelated child process; on Windows, we > are just lucky and can get away without FD_CLOEXEC because our pipe()s are > non-inheritable and async only work with pipes. But once we pass other fds to > the async procedure, we need a working FD_CLOEXEC. Perhaps something in this > direction is related to your problem. > > You could push out your current state of the git-daemon and a recipe to > reproduce the problem. Perhaps I find some time to look into it. > Sure. You can find my current version at git://repo.or.cz/git/kusma.git work/daemon-fcntl This branch includes your branch and my fcntl-attempt, as well as an almost-fixed-up version of the last daemon-win32 series I sent out (still lacking critical sections when saving process ids, as you suggested). -- Erik "kusma" Faye-Lund ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-02-08 11:18 ` Erik Faye-Lund @ 2010-02-10 20:41 ` Jay Soffian 2010-08-22 23:27 ` Erik Faye-Lund 1 sibling, 0 replies; 11+ messages in thread From: Jay Soffian @ 2010-02-10 20:41 UTC (permalink / raw) To: kusmabite; +Cc: Johannes Sixt, msysgit, Git Mailing List On Mon, Feb 8, 2010 at 6:18 AM, Erik Faye-Lund <kusmabite@googlemail.com> wrote: > On Sat, Feb 6, 2010 at 11:18 PM, Johannes Sixt <j6t@kdbg.org> wrote: >> On Samstag, 6. Februar 2010, Erik Faye-Lund wrote: >>> However, I have tracked down a bit of what goes on in the client. >>> There's a call to read_in_full, called from pack-write.c, line 246 >>> that fails in the failure-case, but not in the success-case. This is >>> where the client expects "pack\tSHA-1" or "keep\tSHA-1". There "fatal: >>> early EOF"-messages seems to originate from index-pack.c, line 197. >>> This is the first line of code in parse_pack_header(), it's also >>> AFAICT the first call-site for any read(0, <...>) (though fill()). >> >> This looks like upload-pack died without sending enough to fill a pack header. >> >> Try merging this branch: >> >> git://repo.or.cz/git/mingw/j6t.git async-in-thread >> >> It contains your changes to start_async plus a refinement of die() when it is >> called from the async procedure (it passes t5530, for example). It is also >> converted to pthreads, and therefore also works on Unix. The new >> implementation of start_async is more careful about the file handles, though >> not so much on Windows. >> >> If there's no change for you, then you could look into implementing >> fcntl(F_GETFD/SETFD, FD_CLOEXEC), which are currently ignored, on top of this >> branch, using Get/SetHandleInformation(). >> > > Thanks a lot. I tried merging it, but the issue still pops up. I also > tried to implement fcntl(F_GETFD/SETFD, FD_CLOEXEC), still no dice. > I'm not entirely sure if I did it correctly, though. I have no idea if it's related, but a similar thing seems to happen with git under cygwin-1.7.1. http://article.gmane.org/gmane.os.cygwin/114032 This is when cloning/fetching over ssh. I've not personally seen the problem, but I compile git from source. Coworkers who are using the cygwin-1.7.1 provided git see the problem consistently. j. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-02-08 11:18 ` Erik Faye-Lund 2010-02-10 20:41 ` Jay Soffian @ 2010-08-22 23:27 ` Erik Faye-Lund 2010-08-24 19:24 ` Johannes Sixt 1 sibling, 1 reply; 11+ messages in thread From: Erik Faye-Lund @ 2010-08-22 23:27 UTC (permalink / raw) To: kusmabite; +Cc: Johannes Sixt, msysgit, Git Mailing List On Mon, Feb 8, 2010 at 1:18 PM, Erik Faye-Lund <kusmabite@googlemail.com> wrote: > On Sat, Feb 6, 2010 at 11:18 PM, Johannes Sixt <j6t@kdbg.org> wrote: >> On Samstag, 6. Februar 2010, Erik Faye-Lund wrote: >>> However, I have tracked down a bit of what goes on in the client. >>> There's a call to read_in_full, called from pack-write.c, line 246 >>> that fails in the failure-case, but not in the success-case. This is >>> where the client expects "pack\tSHA-1" or "keep\tSHA-1". There "fatal: >>> early EOF"-messages seems to originate from index-pack.c, line 197. >>> This is the first line of code in parse_pack_header(), it's also >>> AFAICT the first call-site for any read(0, <...>) (though fill()). >> >> This looks like upload-pack died without sending enough to fill a pack header. >> >> Try merging this branch: >> >> git://repo.or.cz/git/mingw/j6t.git async-in-thread >> >> It contains your changes to start_async plus a refinement of die() when it is >> called from the async procedure (it passes t5530, for example). It is also >> converted to pthreads, and therefore also works on Unix. The new >> implementation of start_async is more careful about the file handles, though >> not so much on Windows. >> >> If there's no change for you, then you could look into implementing >> fcntl(F_GETFD/SETFD, FD_CLOEXEC), which are currently ignored, on top of this >> branch, using Get/SetHandleInformation(). >> > > Thanks a lot. I tried merging it, but the issue still pops up. I also > tried to implement fcntl(F_GETFD/SETFD, FD_CLOEXEC), still no dice. > I'm not entirely sure if I did it correctly, though. > More than 6 months later, and I've finally bothered to debug this further: - The culprit seems to be our poll-emulation. My understanding is that poll() was called by create_pack_file() in upload-pack.c with nfds=1 (it's 2 until one of the fds are closed) when there's no data available in the pipe. Since our poll() always returns POLLIN when nfds=1, the check for xread(...) == 0 further down in create_pack_file() cause the fd to be closed, leading to an error on the client-side. - Just removing the nfds=1-hack works for me, but I'm suspecting the nfds=1-hack is there for some socket-reason. So instead I've replaced our poll-emulation with gnulib's in my branch (with a couple of patches on top), and it seems to do the trick for me. I still haven't tested it heavily, though. - The easiest way I've found to debug the issue, is to use git-fetch from localhost with a repo with a single commit with a single, empty file. Doing this triggers the bug every time for me, and doesn't hide what's going on as much as git-clone does. The latest version of my branch can be found here: http://repo.or.cz/w/git/kusma.git/shortlog/refs/heads/work/daemon-win32-process ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-08-22 23:27 ` Erik Faye-Lund @ 2010-08-24 19:24 ` Johannes Sixt 2010-08-25 17:40 ` Erik Faye-Lund 0 siblings, 1 reply; 11+ messages in thread From: Johannes Sixt @ 2010-08-24 19:24 UTC (permalink / raw) To: kusmabite; +Cc: msysgit, Git Mailing List On Montag, 23. August 2010, Erik Faye-Lund wrote: > - The culprit seems to be our poll-emulation. My understanding is that > poll() was called by create_pack_file() in upload-pack.c with nfds=1 > (it's 2 until one of the fds are closed) when there's no data > available in the pipe. Since our poll() always returns POLLIN when > nfds=1, the check for xread(...) == 0 further down in > create_pack_file() cause the fd to be closed, leading to an error on > the client-side. > - Just removing the nfds=1-hack works for me, but I'm suspecting the > nfds=1-hack is there for some socket-reason. So instead I've replaced > our poll-emulation with gnulib's in my branch (with a couple of > patches on top), and it seems to do the trick for me. I still haven't > tested it heavily, though. The nfds == 1 hack is an "optimization": When only one channel must be observed, then we can let (x)read() wait for data instead of doing it inside poll() in some way. I'm not happy with our poll emulation because it contains a busy-loop. Gnulib's version looks quite capable, but I haven't studied it in detail. Until then, I trust that it does the right thing. --Hannes ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-08-24 19:24 ` Johannes Sixt @ 2010-08-25 17:40 ` Erik Faye-Lund 2010-08-25 20:53 ` Johannes Sixt 0 siblings, 1 reply; 11+ messages in thread From: Erik Faye-Lund @ 2010-08-25 17:40 UTC (permalink / raw) To: Johannes Sixt; +Cc: msysgit, Git Mailing List On Tue, Aug 24, 2010 at 9:24 PM, Johannes Sixt <j6t@kdbg.org> wrote: > On Montag, 23. August 2010, Erik Faye-Lund wrote: >> - The culprit seems to be our poll-emulation. My understanding is that >> poll() was called by create_pack_file() in upload-pack.c with nfds=1 >> (it's 2 until one of the fds are closed) when there's no data >> available in the pipe. Since our poll() always returns POLLIN when >> nfds=1, the check for xread(...) == 0 further down in >> create_pack_file() cause the fd to be closed, leading to an error on >> the client-side. >> - Just removing the nfds=1-hack works for me, but I'm suspecting the >> nfds=1-hack is there for some socket-reason. So instead I've replaced >> our poll-emulation with gnulib's in my branch (with a couple of >> patches on top), and it seems to do the trick for me. I still haven't >> tested it heavily, though. > > The nfds == 1 hack is an "optimization": When only one channel must be > observed, then we can let (x)read() wait for data instead of doing it inside > poll() in some way. > OK, thanks for the clarification. Unfortunately, this optimization breaks down in some cases (like the one I described). > I'm not happy with our poll emulation because it contains a busy-loop. > Gnulib's version looks quite capable, but I haven't studied it in detail. > Until then, I trust that it does the right thing. > Well, I've found (and supplied patches for) a couple of bugs in it, but apart from that it seems quite sane. So yeah, I think it might be the best way forward. But I'm curious, what's the best way of import a couple of foreign source files, while maintaining a couple of patches on top of them? I'm thinking that perhaps a import-commit followed by the patches would make it easier to merge in changes than to just import the patched version, but I'm not entirely sure how to do such a merge without merging a full subtree... ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-08-25 17:40 ` Erik Faye-Lund @ 2010-08-25 20:53 ` Johannes Sixt 2010-08-25 20:57 ` Erik Faye-Lund 0 siblings, 1 reply; 11+ messages in thread From: Johannes Sixt @ 2010-08-25 20:53 UTC (permalink / raw) To: kusmabite; +Cc: msysgit, Git Mailing List On Mittwoch, 25. August 2010, Erik Faye-Lund wrote: > But I'm curious, what's the best way of import a couple of foreign > source files, while maintaining a couple of patches on top of them? > I'm thinking that perhaps a import-commit followed by the patches > would make it easier to merge in changes than to just import the > patched version, but I'm not entirely sure how to do such a merge > without merging a full subtree... This is about only two files. When a new version is available from upstream, just branch from the import-commit, apply the two new files, and merge the result. -- Hannes ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [msysGit] upload-pack timing issue on windows? 2010-08-25 20:53 ` Johannes Sixt @ 2010-08-25 20:57 ` Erik Faye-Lund 0 siblings, 0 replies; 11+ messages in thread From: Erik Faye-Lund @ 2010-08-25 20:57 UTC (permalink / raw) To: Johannes Sixt; +Cc: msysgit, Git Mailing List On Wed, Aug 25, 2010 at 10:53 PM, Johannes Sixt <j6t@kdbg.org> wrote: > On Mittwoch, 25. August 2010, Erik Faye-Lund wrote: >> But I'm curious, what's the best way of import a couple of foreign >> source files, while maintaining a couple of patches on top of them? >> I'm thinking that perhaps a import-commit followed by the patches >> would make it easier to merge in changes than to just import the >> patched version, but I'm not entirely sure how to do such a merge >> without merging a full subtree... > > This is about only two files. When a new version is available from upstream, > just branch from the import-commit, apply the two new files, and merge the > result. > Thanks, makes sense. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-08-25 20:58 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-02-05 23:51 upload-pack timing issue on windows? Erik Faye-Lund 2010-02-06 10:06 ` Johannes Sixt 2010-02-06 12:01 ` [msysGit] " Erik Faye-Lund 2010-02-06 22:18 ` Johannes Sixt 2010-02-08 11:18 ` Erik Faye-Lund 2010-02-10 20:41 ` Jay Soffian 2010-08-22 23:27 ` Erik Faye-Lund 2010-08-24 19:24 ` Johannes Sixt 2010-08-25 17:40 ` Erik Faye-Lund 2010-08-25 20:53 ` Johannes Sixt 2010-08-25 20:57 ` Erik Faye-Lund
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).