* Clone fails on a repo with too many heads/tags @ 2012-03-18 8:14 Ivan Todoroski 2012-03-18 11:37 ` Ivan Todoroski 2012-03-18 16:36 ` Jakub Narebski 0 siblings, 2 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-18 8:14 UTC (permalink / raw) To: git I recently tried cloning a fresh copy of a large repo (converted from CVS, nearly 10 years of history) and to my surprise "git clone" failed with the following message: error: cannot spawn git: No such file or directory The problem is only reproduced using the Smart HTTP transport. I used msysGit on Windows so my first instinct was to contact them, but after some poking around I discovered that the problem is present in the Linux version too, although harder to trigger. Try executing this script: ------------------------------- git init too-many-refs cd too-many-refs echo bla > bla.txt git add . git commit -m test sha=$(git rev-parse HEAD) for ((i=0; i<100000; i++)); do echo $sha refs/tags/artificially-long-tag-name-to-more-easily- demonstrate-the-problem-$i >> .git/packed-refs done ------------------------------- Now share this repo using the Smart HTTP transport (git-http-backend) and then try cloning it in a different directory. This is what you would get: $ git clone http://localhost/.../too-many-refs/.git Cloning into 'too-many-refs'... fatal: cannot exec 'fetch-pack': Argument list too long So we come to the real reason for the failure: somewhere inside Git a subcommand is invoked with all the tags/heads on the command line and if you have enough of them it overflows the command line length limit of the OS. Obviously the number of tags in the "too-many-refs" repo above is absurd (100k) because the cmdline length in Linux is much more generous, but on Windows the clone fails with as little as 500 tags in the above loop! I am already hitting this problem with msysGit on real repos, not just artificial test cases. I tracked down the problem to remote-curl.c:fetch_git(). That's where the "fetch-pack" command line is being constructed with all the refs on one line: git fetch-pack --stateless-rpc --lock-pack ...<all the refs>... The solution is conceptually simple: if the list of refs results in a too long command line, split the refs in batches and call fetch-pack multiple times such that each call is under the cmdline limit: git fetch-pack --stateless-rpc --lock-pack ...<first batch of refs>... git fetch-pack --stateless-rpc --lock-pack ...<second batch of refs>... ... git fetch-pack --stateless-rpc --lock-pack ...<last batch of refs>... ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 8:14 Clone fails on a repo with too many heads/tags Ivan Todoroski @ 2012-03-18 11:37 ` Ivan Todoroski 2012-03-18 12:04 ` Nguyen Thai Ngoc Duy 2012-03-18 16:36 ` Jakub Narebski 1 sibling, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-18 11:37 UTC (permalink / raw) To: git Ivan Todoroski <grnch_lists <at> gmx.net> writes: > Now share this repo using the Smart HTTP transport (git-http-backend) and then > try cloning it in a different directory. This is what you would get: > > $ git clone http://localhost/.../too-many-refs/.git > Cloning into 'too-many-refs'... > fatal: cannot exec 'fetch-pack': Argument list too long > > [...] > > The solution is conceptually simple: if the list of refs results in a too long > command line, split the refs in batches and call fetch-pack multiple times such > that each call is under the cmdline limit: > > git fetch-pack --stateless-rpc --lock-pack ...<first batch of refs>... > git fetch-pack --stateless-rpc --lock-pack ...<second batch of refs>... > ... > git fetch-pack --stateless-rpc --lock-pack ...<last batch of refs>... BTW, I didn't want to sound like I am expecting or demanding a fix. If the experienced Git devs lack the time or inclination to work on this bug (understandable), I am certainly willing to try it myself. My C skills are a bit rusty and I'm not very familiar with the Git codebase, but I will do my best to follow Documentation/SubmittingPatches as well as the existing code structure. I will need a few pointers to get me started in the right direction though... 1) Is splitting the cmdline in batches and executing fetch-pack multiple times the right approach? If you have another solution please suggest. 2) Should I add the test case for this bug to existing scripts like t/t5551- http-fetch.sh and t/t5561-http-backend.sh, or should I create a new test script under t/ following their example? There will probably be only one test case for this bug, basically the script I pasted in the original email to reproduce it. 3) What would be the most portable way to get the cmdline length limit between POSIX and Windows? Would something like this be acceptable: #ifder _WIN32 int cmdline_limit = 32767; #else int cmdline_limit = sysconf(_SC_ARG_MAX); #endif I couldn't actually find a Windows API to get the cmdline limit, but this blog post by one of the Windows people tells the value: http://blogs.msdn.com/b/oldnewthing/archive/2003/12/10/56028.aspx 4) Should this problem be fixed only in remote-curl.c:fetch_git() or should it be solved more generally in run-command.c:start_command(), which is used by fetch_git() for the actual invocation? If this is fixed only in remote-curl:fetch_git(), then the same logic would need to be open coded in any other such place that might be found. Are you aware of any other internal sub-commands that put all refs on the command line and could be susceptible to the same issue? If it's fixed at a lower level in run-command.c:start_command(), the logic would become available to any other sub-command that needs it. However, this would mean that struct child_process as well as struct rpc_state would need an additional field that would tell whether the command is safe to execute in multiple batches and how many of the arguments at the beginning of child_process.argv must be preserved on every invocation (the switches and such). Something like child_process.split_after, which if non-zero would mean that start_command() is free to invoke the command multiple times when argv exceeds the cmdline limit, by grouping any arguments after argv[split_after] in smaller batches for each invocation. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 11:37 ` Ivan Todoroski @ 2012-03-18 12:04 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 54+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-03-18 12:04 UTC (permalink / raw) To: Ivan Todoroski; +Cc: git, Shawn O. Pearce I had a look but could not come up with a simple solution for this. CC Shawn who knows smart-http really well. On Sun, Mar 18, 2012 at 6:37 PM, Ivan Todoroski <grnch_lists@gmx.net> wrote: > Ivan Todoroski <grnch_lists <at> gmx.net> writes: >> Now share this repo using the Smart HTTP transport (git-http-backend) and > then >> try cloning it in a different directory. This is what you would get: >> >> $ git clone http://localhost/.../too-many-refs/.git >> Cloning into 'too-many-refs'... >> fatal: cannot exec 'fetch-pack': Argument list too long >> >> [...] >> >> The solution is conceptually simple: if the list of refs results in a too > long >> command line, split the refs in batches and call fetch-pack multiple times > such >> that each call is under the cmdline limit: >> >> git fetch-pack --stateless-rpc --lock-pack ...<first batch of refs>... >> git fetch-pack --stateless-rpc --lock-pack ...<second batch of refs>... >> ... >> git fetch-pack --stateless-rpc --lock-pack ...<last batch of refs>... > > > BTW, I didn't want to sound like I am expecting or demanding a fix. If the > experienced Git devs lack the time or inclination to work on this bug > (understandable), I am certainly willing to try it myself. My C skills are a > bit rusty and I'm not very familiar with the Git codebase, but I will do my > best to follow Documentation/SubmittingPatches as well as the existing code > structure. > > I will need a few pointers to get me started in the right direction though... > > > 1) Is splitting the cmdline in batches and executing fetch-pack multiple times > the right approach? If you have another solution please suggest. > > > 2) Should I add the test case for this bug to existing scripts like t/t5551- > http-fetch.sh and t/t5561-http-backend.sh, or should I create a new test script > under t/ following their example? There will probably be only one test case for > this bug, basically the script I pasted in the original email to reproduce it. > > > 3) What would be the most portable way to get the cmdline length limit between > POSIX and Windows? Would something like this be acceptable: > > #ifder _WIN32 > int cmdline_limit = 32767; > #else > int cmdline_limit = sysconf(_SC_ARG_MAX); > #endif > > I couldn't actually find a Windows API to get the cmdline limit, but this blog > post by one of the Windows people tells the value: > > http://blogs.msdn.com/b/oldnewthing/archive/2003/12/10/56028.aspx > > > 4) Should this problem be fixed only in remote-curl.c:fetch_git() or should it > be solved more generally in run-command.c:start_command(), which is used by > fetch_git() for the actual invocation? > > If this is fixed only in remote-curl:fetch_git(), then the same logic would > need to be open coded in any other such place that might be found. Are you > aware of any other internal sub-commands that put all refs on the command line > and could be susceptible to the same issue? > > > If it's fixed at a lower level in run-command.c:start_command(), the logic > would become available to any other sub-command that needs it. > > However, this would mean that struct child_process as well as struct rpc_state > would need an additional field that would tell whether the command is safe to > execute in multiple batches and how many of the arguments at the beginning of > child_process.argv must be preserved on every invocation (the switches and > such). > > Something like child_process.split_after, which if non-zero would mean that > start_command() is free to invoke the command multiple times when argv exceeds > the cmdline limit, by grouping any arguments after argv[split_after] in smaller > batches for each invocation. > > > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Duy ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 8:14 Clone fails on a repo with too many heads/tags Ivan Todoroski 2012-03-18 11:37 ` Ivan Todoroski @ 2012-03-18 16:36 ` Jakub Narebski 2012-03-18 19:07 ` Jeff King 1 sibling, 1 reply; 54+ messages in thread From: Jakub Narebski @ 2012-03-18 16:36 UTC (permalink / raw) To: Ivan Todoroski; +Cc: git Ivan Todoroski <grnch_lists@gmx.net> writes: > I recently tried cloning a fresh copy of a large repo (converted from CVS, > nearly 10 years of history) and to my surprise "git clone" failed with the > following message: > > error: cannot spawn git: No such file or directory > > The problem is only reproduced using the Smart HTTP transport. [...] > I tracked down the problem to remote-curl.c:fetch_git(). That's where the > "fetch-pack" command line is being constructed with all the refs on one line: > > git fetch-pack --stateless-rpc --lock-pack ...<all the refs>... > > The solution is conceptually simple: if the list of refs results in a too long > command line, split the refs in batches and call fetch-pack multiple times such > that each call is under the cmdline limit: > > git fetch-pack --stateless-rpc --lock-pack ...<first batch of refs>... > git fetch-pack --stateless-rpc --lock-pack ...<second batch of refs>... > ... > git fetch-pack --stateless-rpc --lock-pack ...<last batch of refs>... That, or implement --stdin / --revs in git-fetch-pach (perhaps following git-pack-objects that implements --revs). -- Jakub Narebski ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 16:36 ` Jakub Narebski @ 2012-03-18 19:07 ` Jeff King 2012-03-18 22:07 ` Jakub Narebski ` (2 more replies) 0 siblings, 3 replies; 54+ messages in thread From: Jeff King @ 2012-03-18 19:07 UTC (permalink / raw) To: Jakub Narebski; +Cc: Ivan Todoroski, git, Nguyen Thai Ngoc Duy, Shawn O. Pearce On Sun, Mar 18, 2012 at 09:36:24AM -0700, Jakub Narebski wrote: > > The solution is conceptually simple: if the list of refs results in a too long > > command line, split the refs in batches and call fetch-pack multiple times such > > that each call is under the cmdline limit: > > > > git fetch-pack --stateless-rpc --lock-pack ...<first batch of refs>... > > git fetch-pack --stateless-rpc --lock-pack ...<second batch of refs>... > > ... > > git fetch-pack --stateless-rpc --lock-pack ...<last batch of refs>... > > That, or implement --stdin / --revs in git-fetch-pach (perhaps > following git-pack-objects that implements --revs). I don't think that will work, as stateless-rpc fetch-pack already uses stdin to receive the list of advertised refs from the remote. Nor would you want to have multiple invocations of fetch-pack, since that would mean multiple http requests and multiple pack responses (which could not delta between themselves). And you can't condense the list in the general case. It is the set of refs that we actually want to fetch. We could try passing just the original refspecs (not the expansion) and letting fetch-pack try to do the expansion, but in the worst case, you might really just have a gigantic list of refs. I think the only sane solution is to write the values to a temporary file, and do something like: git fetch-pack --stateless-rpc --refs-from=$tmpfile Even if you put the tmpfile in $GIT_DIR, I don't think this should run afoul of any read-only repositories, since by definition you are fetching into the repository (but you could also just put it in /tmp). -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 19:07 ` Jeff King @ 2012-03-18 22:07 ` Jakub Narebski 2012-03-19 2:32 ` Jeff King 2012-03-19 1:05 ` Ivan Todoroski 2012-03-19 1:30 ` Nguyen Thai Ngoc Duy 2 siblings, 1 reply; 54+ messages in thread From: Jakub Narebski @ 2012-03-18 22:07 UTC (permalink / raw) To: Jeff King; +Cc: Ivan Todoroski, git, Nguyen Thai Ngoc Duy, Shawn O. Pearce Jeff King wrote: [...] > I think the only sane solution is to write the values to a temporary > file, and do something like: > > git fetch-pack --stateless-rpc --refs-from=$tmpfile > > Even if you put the tmpfile in $GIT_DIR, I don't think this should run > afoul of any read-only repositories, since by definition you are > fetching into the repository (but you could also just put it in /tmp). Or git fetch-pack --stateless-rpc --refs-fd=$n and there would be no need for temporary file. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 22:07 ` Jakub Narebski @ 2012-03-19 2:32 ` Jeff King 2012-03-19 2:43 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 54+ messages in thread From: Jeff King @ 2012-03-19 2:32 UTC (permalink / raw) To: Jakub Narebski; +Cc: Ivan Todoroski, git, Nguyen Thai Ngoc Duy, Shawn O. Pearce On Sun, Mar 18, 2012 at 11:07:42PM +0100, Jakub Narebski wrote: > > I think the only sane solution is to write the values to a temporary > > file, and do something like: > > > > git fetch-pack --stateless-rpc --refs-from=$tmpfile > > > > Even if you put the tmpfile in $GIT_DIR, I don't think this should run > > afoul of any read-only repositories, since by definition you are > > fetching into the repository (but you could also just put it in /tmp). > > git fetch-pack --stateless-rpc --refs-fd=$n > > and there would be no need for temporary file. Yeah. That is a slightly more awkward interface for command-line users, but this is really meant to be an internal-to-git thing (just like stateless-rpc is in the first place). And avoiding a tempfile is a good thing. However, I think you would need to teach the run-command API how to open extra pipes to a child beyond stdout/stdin/stderr. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-19 2:32 ` Jeff King @ 2012-03-19 2:43 ` Nguyen Thai Ngoc Duy 2012-03-19 2:45 ` Jeff King 0 siblings, 1 reply; 54+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-03-19 2:43 UTC (permalink / raw) To: Jeff King Cc: Jakub Narebski, Ivan Todoroski, git, Shawn O. Pearce, Johannes Sixt On Mon, Mar 19, 2012 at 9:32 AM, Jeff King <peff@peff.net> wrote: > However, I think you would need to teach the run-command API how to open > extra pipes to a child beyond stdout/stdin/stderr. That might be a problem on Windows. I think Windows exec* has special support for stdin/stdout/stderr and only those. -- Duy ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-19 2:43 ` Nguyen Thai Ngoc Duy @ 2012-03-19 2:45 ` Jeff King 0 siblings, 0 replies; 54+ messages in thread From: Jeff King @ 2012-03-19 2:45 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy Cc: Jakub Narebski, Ivan Todoroski, git, Shawn O. Pearce, Johannes Sixt On Mon, Mar 19, 2012 at 09:43:05AM +0700, Nguyen Thai Ngoc Duy wrote: > On Mon, Mar 19, 2012 at 9:32 AM, Jeff King <peff@peff.net> wrote: > > However, I think you would need to teach the run-command API how to open > > extra pipes to a child beyond stdout/stdin/stderr. > > That might be a problem on Windows. I think Windows exec* has special > support for stdin/stdout/stderr and only those. Then your sneak-it-across-stdin trick is probably the best bet. It's a little ugly, but this is for stateless-rpc, so it's not like anything outside of git is going to actually see this. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 19:07 ` Jeff King 2012-03-18 22:07 ` Jakub Narebski @ 2012-03-19 1:05 ` Ivan Todoroski 2012-03-19 1:30 ` Nguyen Thai Ngoc Duy 2 siblings, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-19 1:05 UTC (permalink / raw) To: Jeff King; +Cc: Jakub Narebski, git, Nguyen Thai Ngoc Duy, Shawn O. Pearce On 18.03.2012 20:07, Jeff King wrote: > Nor would you want to have multiple invocations of fetch-pack, since > that would mean multiple http requests and multiple pack responses > (which could not delta between themselves). Sure, but if fetch-pack is called with so many refs that it overflows the command line, it's looking at a lot of work ahead of it anyway, it's not going to be a fast operation. So the overhead of multiple HTTP requests and lack of delta compression between huge batches of refs wouldn't be all that significant in practice. That said, I really like your temp file idea below. It's the best solution so far. Simple, efficient, non-intrusive. > I think the only sane solution is to write the values to a temporary > file, and do something like: > > git fetch-pack --stateless-rpc --refs-from=$tmpfile > > Even if you put the tmpfile in $GIT_DIR, I don't think this should > run afoul of any read-only repositories, since by definition you are > fetching into the repository (but you could also just put it in > /tmp). OK, if nobody beats me to it I will try to code this next weekend. Maybe sooner, but I don't usually have much free time over the week (day job, family, etc.). ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-18 19:07 ` Jeff King 2012-03-18 22:07 ` Jakub Narebski 2012-03-19 1:05 ` Ivan Todoroski @ 2012-03-19 1:30 ` Nguyen Thai Ngoc Duy 2012-03-19 2:44 ` Jeff King 2 siblings, 1 reply; 54+ messages in thread From: Nguyen Thai Ngoc Duy @ 2012-03-19 1:30 UTC (permalink / raw) To: Jeff King; +Cc: Jakub Narebski, Ivan Todoroski, git, Shawn O. Pearce On Mon, Mar 19, 2012 at 2:07 AM, Jeff King <peff@peff.net> wrote: > I don't think that will work, as stateless-rpc fetch-pack already uses > stdin to receive the list of advertised refs from the remote. Nor would > you want to have multiple invocations of fetch-pack, since that would > mean multiple http requests and multiple pack responses (which could not > delta between themselves). remote-curl functions as middle man between http client and fetch-pack. Can we just send ref list over fetch-pack.stdin first, followed by maybe empty line, then normal stuff that remote-curl feeds fetch-pack? -- Duy ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-19 1:30 ` Nguyen Thai Ngoc Duy @ 2012-03-19 2:44 ` Jeff King 2012-03-21 11:05 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Jeff King @ 2012-03-19 2:44 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Jakub Narebski, Ivan Todoroski, git, Shawn O. Pearce On Mon, Mar 19, 2012 at 08:30:38AM +0700, Nguyen Thai Ngoc Duy wrote: > On Mon, Mar 19, 2012 at 2:07 AM, Jeff King <peff@peff.net> wrote: > > I don't think that will work, as stateless-rpc fetch-pack already uses > > stdin to receive the list of advertised refs from the remote. Nor would > > you want to have multiple invocations of fetch-pack, since that would > > mean multiple http requests and multiple pack responses (which could not > > delta between themselves). > > remote-curl functions as middle man between http client and > fetch-pack. Can we just send ref list over fetch-pack.stdin first, > followed by maybe empty line, then normal stuff that remote-curl feeds > fetch-pack? Yes, I think that would work. You'd just have to take care to pass the residual buffer (i.e., what is left in your input buffer after you notice that reading the list of wanted refs is finished) along to the git-protocol code. So it would require a little refactoring of get_remote_heads, I think. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-19 2:44 ` Jeff King @ 2012-03-21 11:05 ` Ivan Todoroski 2012-03-21 14:28 ` Shawn Pearce 0 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-21 11:05 UTC (permalink / raw) To: Jeff King; +Cc: Nguyen Thai Ngoc Duy, Jakub Narebski, git, Shawn O. Pearce On 19.03.2012 03:44, Jeff King wrote: > On Mon, Mar 19, 2012 at 08:30:38AM +0700, Nguyen Thai Ngoc Duy wrote: > >> On Mon, Mar 19, 2012 at 2:07 AM, Jeff King <peff@peff.net> wrote: >>> I don't think that will work, as stateless-rpc fetch-pack already uses >>> stdin to receive the list of advertised refs from the remote. Nor would >>> you want to have multiple invocations of fetch-pack, since that would >>> mean multiple http requests and multiple pack responses (which could not >>> delta between themselves). >> remote-curl functions as middle man between http client and >> fetch-pack. Can we just send ref list over fetch-pack.stdin first, >> followed by maybe empty line, then normal stuff that remote-curl feeds >> fetch-pack? > > Yes, I think that would work. You'd just have to take care to pass the > residual buffer (i.e., what is left in your input buffer after you > notice that reading the list of wanted refs is finished) along to > the git-protocol code. So it would require a little refactoring of > get_remote_heads, I think. Would it be OK for fetch-pack.c to use the packetized format (pkt-line.h) for reading the list of refs from stdin? This way we can read() the exact number of bytes needed for the refs from the fd and there would be no need to refactor get_remote_heads() to pass a residual buffer, it could just continue reading straight from the same fd. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-21 11:05 ` Ivan Todoroski @ 2012-03-21 14:28 ` Shawn Pearce 2012-03-21 17:14 ` Jeff King 0 siblings, 1 reply; 54+ messages in thread From: Shawn Pearce @ 2012-03-21 14:28 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Jeff King, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Wed, Mar 21, 2012 at 04:05, Ivan Todoroski <grnch_lists@gmx.net> wrote: > On 19.03.2012 03:44, Jeff King wrote: >> Yes, I think that would work. You'd just have to take care to pass the >> residual buffer (i.e., what is left in your input buffer after you >> notice that reading the list of wanted refs is finished) along to >> the git-protocol code. So it would require a little refactoring of >> get_remote_heads, I think. > > > Would it be OK for fetch-pack.c to use the packetized format (pkt-line.h) > for reading the list of refs from stdin? This is probably the easiest way to implement the sneak-into-stdin patch. Use a pkt-line for each argument that should have been in the argv array from the command line, and a flush pkt to terminate the list. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-21 14:28 ` Shawn Pearce @ 2012-03-21 17:14 ` Jeff King 2012-03-21 17:59 ` Jakub Narebski ` (2 more replies) 0 siblings, 3 replies; 54+ messages in thread From: Jeff King @ 2012-03-21 17:14 UTC (permalink / raw) To: Shawn Pearce; +Cc: Ivan Todoroski, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Wed, Mar 21, 2012 at 07:28:24AM -0700, Shawn O. Pearce wrote: > > Would it be OK for fetch-pack.c to use the packetized format (pkt-line.h) > > for reading the list of refs from stdin? > > This is probably the easiest way to implement the sneak-into-stdin > patch. Use a pkt-line for each argument that should have been in the > argv array from the command line, and a flush pkt to terminate the > list. Something in me feels slightly uncomfortable with that, just because simple newline-delimited formats make it easy for people to hack on the tool and feed input from unexpected sources. But stateless-rpc is already a pretty deep internal interface anyway, and the format is already weird (the second half is already packetized input from a remote anyway). So it's probably not worth caring about hackability here. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-21 17:14 ` Jeff King @ 2012-03-21 17:59 ` Jakub Narebski 2012-03-21 20:02 ` Ivan Todoroski 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski 2 siblings, 0 replies; 54+ messages in thread From: Jakub Narebski @ 2012-03-21 17:59 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Ivan Todoroski, Nguyen Thai Ngoc Duy, git Jeff King wrote: > On Wed, Mar 21, 2012 at 07:28:24AM -0700, Shawn O. Pearce wrote: > > > > Would it be OK for fetch-pack.c to use the packetized format (pkt-line.h) > > > for reading the list of refs from stdin? > > > > This is probably the easiest way to implement the sneak-into-stdin > > patch. Use a pkt-line for each argument that should have been in the > > argv array from the command line, and a flush pkt to terminate the > > list. > > Something in me feels slightly uncomfortable with that, just because > simple newline-delimited formats make it easy for people to hack on the > tool and feed input from unexpected sources. Well, give people "pkt-line" command line tool which would transform ordinary textual output on input into pkt-line (which is almost pure text anyway) output and vice versa. > But stateless-rpc is already a pretty deep internal interface anyway, > and the format is already weird (the second half is already packetized > input from a remote anyway). So it's probably not worth caring about > hackability here. Right. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-21 17:14 ` Jeff King 2012-03-21 17:59 ` Jakub Narebski @ 2012-03-21 20:02 ` Ivan Todoroski 2012-03-21 20:17 ` Jeff King 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski 2 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-21 20:02 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 21.03.2012 18:14, Jeff King wrote: > On Wed, Mar 21, 2012 at 07:28:24AM -0700, Shawn O. Pearce wrote: > >>> Would it be OK for fetch-pack.c to use the packetized format (pkt-line.h) >>> for reading the list of refs from stdin? >> This is probably the easiest way to implement the sneak-into-stdin >> patch. Use a pkt-line for each argument that should have been in the >> argv array from the command line, and a flush pkt to terminate the >> list. > > Something in me feels slightly uncomfortable with that, just because > simple newline-delimited formats make it easy for people to hack on the > tool and feed input from unexpected sources. I understand what you mean. How about this: If both --stdin and --stateless-rpc are specified to fetch-pack, it will use pkt-line to read the refs from stdin before handing off stdin to get_remote_heads(). However, if only --stdin is specified, it will read refs from stdin in a script-friendly newline delimited format, one ref per line. This is okay because when --stateless-rpc is not specified get_remote_heads() reads from an fd different from stdin so there is no issue with residual buffers in this case. This way you preserve scriptability for any other callers who don't use --stateless-rpc. How does this sound? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-21 20:02 ` Ivan Todoroski @ 2012-03-21 20:17 ` Jeff King 2012-03-24 20:49 ` Ivan Todoroski ` (2 more replies) 0 siblings, 3 replies; 54+ messages in thread From: Jeff King @ 2012-03-21 20:17 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Wed, Mar 21, 2012 at 09:02:13PM +0100, Ivan Todoroski wrote: > >Something in me feels slightly uncomfortable with that, just because > >simple newline-delimited formats make it easy for people to hack on the > >tool and feed input from unexpected sources. > > I understand what you mean. How about this: > > If both --stdin and --stateless-rpc are specified to fetch-pack, it > will use pkt-line to read the refs from stdin before handing off > stdin to get_remote_heads(). > > However, if only --stdin is specified, it will read refs from stdin > in a script-friendly newline delimited format, one ref per line. This > is okay because when --stateless-rpc is not specified > get_remote_heads() reads from an fd different from stdin so there is > no issue with residual buffers in this case. > > This way you preserve scriptability for any other callers who don't > use --stateless-rpc. > > How does this sound? I think that sounds quite reasonable, and shouldn't be more than a few extra lines to implement. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-21 20:17 ` Jeff King @ 2012-03-24 20:49 ` Ivan Todoroski 2012-03-25 1:06 ` Jeff King 2012-03-24 20:53 ` [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin Ivan Todoroski 2012-03-24 20:54 ` [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski 2 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-24 20:49 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git [-- Attachment #1: Type: text/plain, Size: 2474 bytes --] On 21.03.2012 21:17, Jeff King wrote: > On Wed, Mar 21, 2012 at 09:02:13PM +0100, Ivan Todoroski wrote: > >>> Something in me feels slightly uncomfortable with that, just because >>> simple newline-delimited formats make it easy for people to hack on the >>> tool and feed input from unexpected sources. >> I understand what you mean. How about this: >> >> If both --stdin and --stateless-rpc are specified to fetch-pack, it >> will use pkt-line to read the refs from stdin before handing off >> stdin to get_remote_heads(). >> >> However, if only --stdin is specified, it will read refs from stdin >> in a script-friendly newline delimited format, one ref per line. This >> is okay because when --stateless-rpc is not specified >> get_remote_heads() reads from an fd different from stdin so there is >> no issue with residual buffers in this case. >> >> This way you preserve scriptability for any other callers who don't >> use --stateless-rpc. >> >> How does this sound? > > I think that sounds quite reasonable, and shouldn't be more than a few > extra lines to implement. > > -Peff I wrote the code for this and went to write the test cases, but the test suite is not cooperating. :( I'll send just the code changes for comments while I'm figuring out the test suite. I'll include the test cases along with better commit messages in the next version of the patches. So, back to the test suite problem. To get familiar with it I first ran the test suite against the vanilla "maint" branch without any of my changes, just to see what I should expect on properly running code. Unfortunately it failed, and to make matters worse it failed exactly on the parts dealing with smart HTTP, which is what I need to test in the first place. Talk about Murphy's law... Is it failing for anyone else on the vanilla "maint" branch? I would appreciate any help I could get here. My machine is CentOS 5.8 with latest updates and I have the httpd package installed. Here is the command that is failing (you can find the full output in the attachment): $ GIT_TEST_HTTPD=yes ./t5551-http-fetch.sh -i -v [... skip successful tests ...] ok 5 - used upload-pack service expecting success: git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p error: RPC failed; result=22, HTTP code = 405 fatal: The remote end hung up unexpectedly not ok - 6 follow redirects (301) # # git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p # [-- Attachment #2: http-fetch.log --] [-- Type: text/plain, Size: 2412 bytes --] Initialized empty Git repository in /home/itodoroski/git/t/trash directory.t5551-http-fetch/.git/ expecting success: echo content >file && git add file && git commit -m one [master (root-commit) ba36540] one Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+) create mode 100644 file ok 1 - setup repository expecting success: mkdir "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && (cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && git --bare init ) && git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && git push public master:master Initialized empty Git repository in /home/itodoroski/git/t/trash directory.t5551-http-fetch/httpd/www/repo.git/ To /home/itodoroski/git/t/trash directory.t5551-http-fetch/httpd/www/repo.git * [new branch] master -> master ok 2 - create http-accessible bare repository expecting success: GIT_CURL_VERBOSE=1 git clone --quiet $HTTPD_URL/smart/repo.git clone 2>err && test_cmp file clone/file && tr '\015' Q <err | sed -e " s/Q\$// /^[*] /d /^$/d /^< $/d /^[^><]/{ s/^/> / } /^> User-Agent: /d /^> Host: /d /^> POST /,$ { /^> Accept: [*]\\/[*]/d } s/^> Content-Length: .*/> Content-Length: xxx/ /^> 00..want /d /^> 00.*done/d /^< Server: /d /^< Expires: /d /^< Date: /d /^< Content-Length: /d /^< Transfer-Encoding: /d " >act && test_cmp exp act ok 3 - clone http repository expecting success: echo content >>file && git commit -a -m two && git push public (cd clone && git pull) && test_cmp file clone/file [master ace4881] two Author: A U Thor <author@example.com> 1 file changed, 1 insertion(+) To /home/itodoroski/git/t/trash directory.t5551-http-fetch/httpd/www/repo.git ba36540..ace4881 master -> master From http://127.0.0.1:5551/smart/repo ba36540..ace4881 master -> origin/master Updating ba36540..ace4881 Fast-forward file | 1 + 1 file changed, 1 insertion(+) ok 4 - fetch changes via http expecting success: sed -e " s/^.* \"// s/\"// s/ [1-9][0-9]*\$// s/^GET /GET / " >act <"$HTTPD_ROOT_PATH"/access.log && test_cmp exp act ok 5 - used upload-pack service expecting success: git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p error: RPC failed; result=22, HTTP code = 405 fatal: The remote end hung up unexpectedly not ok - 6 follow redirects (301) # # git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p # ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-24 20:49 ` Ivan Todoroski @ 2012-03-25 1:06 ` Jeff King 2012-03-25 2:32 ` Jeff King 2012-03-25 15:30 ` Ivan Todoroski 0 siblings, 2 replies; 54+ messages in thread From: Jeff King @ 2012-03-25 1:06 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sat, Mar 24, 2012 at 09:49:55PM +0100, Ivan Todoroski wrote: > Unfortunately it failed, and to make matters worse it failed exactly > on the parts dealing with smart HTTP, which is what I need to test in > the first place. Talk about Murphy's law... > > Is it failing for anyone else on the vanilla "maint" branch? I would > appreciate any help I could get here. No, it passes fine here (Debian unstable). > expecting success: > git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p > > error: RPC failed; result=22, HTTP code = 405 > fatal: The remote end hung up unexpectedly > not ok - 6 follow redirects (301) That's weird. 405 is "Method Not Allowed". Clone shouldn't be doing anything more exotic than GET and POST. And the prior tests pass, so it means that it's working in general. The only thing different about this test is that apache is configured to use mod_rewrite to issue a redirect. Does your machine have mod_rewrite installed and enabled? I would think apache would complain at startup if it wasn't. I wonder if there's something non-portable in the minimal apache config we ship. Does httpd/error.log in the trash directory say anything interesting? -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-25 1:06 ` Jeff King @ 2012-03-25 2:32 ` Jeff King 2012-03-25 17:33 ` Ivan Todoroski 2012-03-25 15:30 ` Ivan Todoroski 1 sibling, 1 reply; 54+ messages in thread From: Jeff King @ 2012-03-25 2:32 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sat, Mar 24, 2012 at 09:06:09PM -0400, Jeff King wrote: > That's weird. 405 is "Method Not Allowed". Clone shouldn't be doing > anything more exotic than GET and POST. And the prior tests pass, so it > means that it's working in general. The only thing different about this > test is that apache is configured to use mod_rewrite to issue a > redirect. > > Does your machine have mod_rewrite installed and enabled? I would think > apache would complain at startup if it wasn't. I wonder if there's > something non-portable in the minimal apache config we ship. > > Does httpd/error.log in the trash directory say anything interesting? Also, does it work any better with this patch? diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf index 3c12b05..714760d 100644 --- a/t/lib-httpd/apache.conf +++ b/t/lib-httpd/apache.conf @@ -61,9 +61,8 @@ ScriptAlias /smart_noexport/ ${GIT_EXEC_PATH}/git-http-backend/ Options ExecCGI </Files> -RewriteEngine on -RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301] -RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302] +RedirectMatch 301 ^/smart-redir-perm/(.*)$ /smart/$1 +RedirectMatch 302 ^/smart-redir-temp/(.*)$ /smart/$1 <IfDefine SSL> LoadModule ssl_module modules/mod_ssl.so ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-25 2:32 ` Jeff King @ 2012-03-25 17:33 ` Ivan Todoroski 2012-03-25 17:54 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 17:33 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 04:32, Jeff King wrote: > On Sat, Mar 24, 2012 at 09:06:09PM -0400, Jeff King wrote: > >> That's weird. 405 is "Method Not Allowed". Clone shouldn't be doing >> anything more exotic than GET and POST. And the prior tests pass, so it >> means that it's working in general. The only thing different about this >> test is that apache is configured to use mod_rewrite to issue a >> redirect. >> >> Does your machine have mod_rewrite installed and enabled? I would think >> apache would complain at startup if it wasn't. I wonder if there's >> something non-portable in the minimal apache config we ship. >> >> Does httpd/error.log in the trash directory say anything interesting? > > Also, does it work any better with this patch? > > diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf > index 3c12b05..714760d 100644 > --- a/t/lib-httpd/apache.conf > +++ b/t/lib-httpd/apache.conf > @@ -61,9 +61,8 @@ ScriptAlias /smart_noexport/ ${GIT_EXEC_PATH}/git-http-backend/ > Options ExecCGI > </Files> > > -RewriteEngine on > -RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301] > -RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302] > +RedirectMatch 301 ^/smart-redir-perm/(.*)$ /smart/$1 > +RedirectMatch 302 ^/smart-redir-temp/(.*)$ /smart/$1 > > <IfDefine SSL> > LoadModule ssl_module modules/mod_ssl.so Yes! That patch did it, t5551-http-fetch.sh now passes fully. Thank you. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-25 17:33 ` Ivan Todoroski @ 2012-03-25 17:54 ` Ivan Todoroski 2012-03-26 17:33 ` Jeff King 0 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 17:54 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 19:33, Ivan Todoroski wrote: > On 25.03.2012 04:32, Jeff King wrote: >> On Sat, Mar 24, 2012 at 09:06:09PM -0400, Jeff King wrote: >> >>> That's weird. 405 is "Method Not Allowed". Clone shouldn't be doing >>> anything more exotic than GET and POST. And the prior tests pass, so it >>> means that it's working in general. The only thing different about this >>> test is that apache is configured to use mod_rewrite to issue a >>> redirect. >>> >>> Does your machine have mod_rewrite installed and enabled? I would think >>> apache would complain at startup if it wasn't. I wonder if there's >>> something non-portable in the minimal apache config we ship. >>> >>> Does httpd/error.log in the trash directory say anything interesting? >> >> Also, does it work any better with this patch? >> >> diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf >> index 3c12b05..714760d 100644 >> --- a/t/lib-httpd/apache.conf >> +++ b/t/lib-httpd/apache.conf >> @@ -61,9 +61,8 @@ ScriptAlias /smart_noexport/ >> ${GIT_EXEC_PATH}/git-http-backend/ >> Options ExecCGI >> </Files> >> >> -RewriteEngine on >> -RewriteRule ^/smart-redir-perm/(.*)$ /smart/$1 [R=301] >> -RewriteRule ^/smart-redir-temp/(.*)$ /smart/$1 [R=302] >> +RedirectMatch 301 ^/smart-redir-perm/(.*)$ /smart/$1 >> +RedirectMatch 302 ^/smart-redir-temp/(.*)$ /smart/$1 >> >> <IfDefine SSL> >> LoadModule ssl_module modules/mod_ssl.so > > > Yes! That patch did it, t5551-http-fetch.sh now passes fully. Thank you. Ah, no... I spoke too soon. I didn't look closely enough, I wasn't logged in at the right machine, sorry. :) It's still failing on CentOS 5.8 with Apache 2.2.3. I will just find another machine where the test suite works. If you'd still like to debug the problem on CentOS 5.8 I can run any commands or test any patches for you. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-25 17:54 ` Ivan Todoroski @ 2012-03-26 17:33 ` Jeff King 2012-03-27 7:07 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Jeff King @ 2012-03-26 17:33 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sun, Mar 25, 2012 at 07:54:19PM +0200, Ivan Todoroski wrote: > >>>That's weird. 405 is "Method Not Allowed". Clone shouldn't be doing > >>>anything more exotic than GET and POST. And the prior tests pass, so it > [...] > It's still failing on CentOS 5.8 with Apache 2.2.3. I will just find > another machine where the test suite works. I'm running the tests with apache 2.2.22. So I wonder if there is some config syntax or behavior that is simply different in the much older version. > If you'd still like to debug the problem on CentOS 5.8 I can run any > commands or test any patches for you. A few more things to try: - check httpd/access.log to confirm that apache really is returning a 405 (I can't imagine that curl would not be reporting it accurately, but it's worth a shot) - try running with GIT_CURL_VERBOSE=1 to get debug output from curl - try adding these to the apache config: RewriteLog "httpd/rewrite.log" RewriteLogLevel 9 which might yield more information. I have no idea what you're looking for in any of those, but maybe something useful will be obvious. I don't have a lot of apache experience, so my next step in your shoes would be just trying to get more information on what's happening. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-26 17:33 ` Jeff King @ 2012-03-27 7:07 ` Ivan Todoroski 0 siblings, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 7:07 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 26.03.2012 19:33, Jeff King wrote: > On Sun, Mar 25, 2012 at 07:54:19PM +0200, Ivan Todoroski wrote: > >>>>> That's weird. 405 is "Method Not Allowed". Clone shouldn't be doing >>>>> anything more exotic than GET and POST. And the prior tests pass, so it >> [...] >> It's still failing on CentOS 5.8 with Apache 2.2.3. I will just find >> another machine where the test suite works. > > I'm running the tests with apache 2.2.22. So I wonder if there is some > config syntax or behavior that is simply different in the much older > version. Most likely. >> If you'd still like to debug the problem on CentOS 5.8 I can run any >> commands or test any patches for you. > > A few more things to try: > > - check httpd/access.log to confirm that apache really is returning a > 405 (I can't imagine that curl would not be reporting it accurately, > but it's worth a shot) Yeah, it has 405 in access log too. > - try running with GIT_CURL_VERBOSE=1 to get debug output from curl > > - try adding these to the apache config: > > RewriteLog "httpd/rewrite.log" > RewriteLogLevel 9 > > which might yield more information. > > I have no idea what you're looking for in any of those, but maybe > something useful will be obvious. I don't have a lot of apache > experience, so my next step in your shoes would be just trying to get > more information on what's happening. Thanks for the tips. I'll keep tinkering with this as time allows and post a patch if I find something. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: Clone fails on a repo with too many heads/tags 2012-03-25 1:06 ` Jeff King 2012-03-25 2:32 ` Jeff King @ 2012-03-25 15:30 ` Ivan Todoroski 1 sibling, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 15:30 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 03:06, Jeff King wrote: > On Sat, Mar 24, 2012 at 09:49:55PM +0100, Ivan Todoroski wrote: > >> Is it failing for anyone else on the vanilla "maint" branch? I would >> appreciate any help I could get here. > > No, it passes fine here (Debian unstable). OK, thanks for checking, it's good to know. > Does your machine have mod_rewrite installed and enabled? I would think > apache would complain at startup if it wasn't. I wonder if there's > something non-portable in the minimal apache config we ship. Looks like it comes by default with the RPM. Maybe the Apache version is too old? $ rpm -q httpd httpd-2.2.3-63.el5.centos.1 $ rpm -ql httpd | grep rewrite /usr/lib64/httpd/modules/mod_rewrite.so $ ls -l /usr/lib64/httpd/modules/mod_rewrite.so -rwxr-xr-x 1 root root 60384 Feb 23 19:23 /usr/lib64/httpd/modules/mod_rewrite.so $ ls -l trash\ directory.t5551-http-fetch/httpd/modules/mod_rewrite.so -rwxr-xr-x 1 root root 60384 Feb 23 19:23 trash directory.t5551-http-fetch/httpd/modules/mod_rewrite.so Also, mod_rewrite is enabled in both the master config file /etc/httpd/conf/httpd.conf, as well as the minimal git/t/lib-httpd/apache.conf. > Does httpd/error.log in the trash directory say anything interesting? Nothing, it just gives the startup and shutdown message: $ cat trash\ directory.t5551-http-fetch/httpd/error.log [Sun Mar 25 15:27:12 2012] [notice] Apache/2.2.3 (CentOS) configured -- resuming normal operations [Sun Mar 25 15:27:12 2012] [notice] caught SIGTERM, shutting down I will try your patch from the other message a bit later and let you know. ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-21 20:17 ` Jeff King 2012-03-24 20:49 ` Ivan Todoroski @ 2012-03-24 20:53 ` Ivan Todoroski 2012-03-25 1:19 ` Jeff King 2012-03-24 20:54 ` [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski 2 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-24 20:53 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git >From c4bb55f9f27569faa368d823ca6fe4b236e37cd6 Mon Sep 17 00:00:00 2001 From: Ivan Todoroski <grnch@gmx.net> Date: Sat, 24 Mar 2012 15:13:05 +0100 Subject: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin --- Documentation/git-fetch-pack.txt | 9 ++++++++ builtin/fetch-pack.c | 44 ++++++++++++++++++++++++++++++++++++-- fetch-pack.h | 3 ++- 3 files changed, 53 insertions(+), 3 deletions(-) diff --git a/Documentation/git-fetch-pack.txt b/Documentation/git-fetch-pack.txt index ed1bdaacd1..7cd7c785bc 100644 --- a/Documentation/git-fetch-pack.txt +++ b/Documentation/git-fetch-pack.txt @@ -32,6 +32,15 @@ OPTIONS --all:: Fetch all remote refs. +--stdin:: + Take the list of refs from stdin, one per line. If there + are refs specified on the command line in addition to this + option, then the refs from stdin are processed after those + on the command line. + If '--stateless-rpc' is specified together with this option + then the list of refs must be in packet format (pkt-line) + with a flush packet terminating the list. + -q:: --quiet:: Pass '-q' flag to 'git unpack-objects'; this makes the diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index a4d3e90a86..3c4c193e45 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -23,7 +23,7 @@ static struct fetch_pack_args args = { }; static const char fetch_pack_usage[] = -"git fetch-pack [--all] [--quiet|-q] [--keep|-k] [--thin] [--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] [--no-progress] [-v] [<host>:]<directory> [<refs>...]"; +"git fetch-pack [--all] [--stdin] [--quiet|-q] [--keep|-k] [--thin] [--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] [--no-progress] [-v] [<host>:]<directory> [<refs>...]"; #define COMPLETE (1U << 0) #define COMMON (1U << 1) @@ -896,7 +896,7 @@ static void fetch_pack_setup(void) int cmd_fetch_pack(int argc, const char **argv, const char *prefix) { - int i, ret, nr_heads; + int i, ret, nr_heads, alloc_heads; struct ref *ref = NULL; char *dest = NULL, **heads; int fd[2]; @@ -941,6 +941,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.fetch_all = 1; continue; } + if (!strcmp("--stdin", arg)) { + args.refs_from_stdin = 1; + continue; + } if (!strcmp("-v", arg)) { args.verbose = 1; continue; @@ -972,6 +976,42 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) if (!dest) usage(fetch_pack_usage); + if (args.refs_from_stdin) { + char ref[1000]; + /* copy refs from cmdline to new growable list */ + int size = nr_heads * sizeof(*heads); + heads = memcpy(xmalloc(size), heads, size); + alloc_heads = nr_heads; + /* append refs from stdin to ones from cmdline */ + for (;;) { + if (args.stateless_rpc) { + /* read using pkt-line in stateless RPC mode, + flush packet signifies end of refs */ + int n = packet_read_line(0, ref, sizeof(ref)); + if (!n) + break; + if (ref[n-1] == '\n') + ref[n-1] = '\0'; + } + else { + int n; + if (!fgets(ref, sizeof(ref), stdin)) { + if (ferror(stdin)) + die_errno("cannot read ref"); + if (feof(stdin)) + break; + } + n = strlen(ref); + if (ref[n-1] == '\n') + ref[n-1] = '\0'; + else if (!feof(stdin)) + die("ref too long on stdin"); + } + ALLOC_GROW(heads, nr_heads + 1, alloc_heads); + heads[nr_heads++] = xstrdup(ref); + } + } + if (args.stateless_rpc) { conn = NULL; fd[0] = 0; diff --git a/fetch-pack.h b/fetch-pack.h index 0608edae3f..292d69389e 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -13,7 +13,8 @@ struct fetch_pack_args { verbose:1, no_progress:1, include_tag:1, - stateless_rpc:1; + stateless_rpc:1, + refs_from_stdin:1; }; struct ref *fetch_pack(struct fetch_pack_args *args, -- 1.7.9.4.16.gd24fa.dirty ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-24 20:53 ` [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin Ivan Todoroski @ 2012-03-25 1:19 ` Jeff King 2012-03-25 9:39 ` Ivan Todoroski 2012-03-25 20:00 ` Ivan Todoroski 0 siblings, 2 replies; 54+ messages in thread From: Jeff King @ 2012-03-25 1:19 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sat, Mar 24, 2012 at 09:53:26PM +0100, Ivan Todoroski wrote: > From c4bb55f9f27569faa368d823ca6fe4b236e37cd6 Mon Sep 17 00:00:00 2001 > From: Ivan Todoroski <grnch@gmx.net> > Date: Sat, 24 Mar 2012 15:13:05 +0100 > Subject: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin You can drop these lines; they are redundant with the actual email headers. > --- > Documentation/git-fetch-pack.txt | 9 ++++++++ > builtin/fetch-pack.c | 44 ++++++++++++++++++++++++++++++++++++-- > fetch-pack.h | 3 ++- > 3 files changed, 53 insertions(+), 3 deletions(-) Give more of a commit message. Why is this option useful (I know, of course, from our previous discussion. But keep in mind the audience of developers reading "git log" a year from now). > +--stdin:: > + Take the list of refs from stdin, one per line. If there > + are refs specified on the command line in addition to this > + option, then the refs from stdin are processed after those > + on the command line. > + If '--stateless-rpc' is specified together with this option > + then the list of refs must be in packet format (pkt-line) > + with a flush packet terminating the list. Nice. Thanks for taking the time not just to solve your problem, but to give sane semantics in the non-stateless-rpc case. I think there is a minor formatting bug in the above. Asciidoc will make your two paragraphs into a single one, won't it? I think you need to do the (horribly ugly): --stdin:: First paragraph. + Second paragraph. to appease asciidoc. > --- a/builtin/fetch-pack.c > +++ b/builtin/fetch-pack.c > @@ -23,7 +23,7 @@ static struct fetch_pack_args args = { > }; > > static const char fetch_pack_usage[] = > -"git fetch-pack [--all] [--quiet|-q] [--keep|-k] [--thin] [--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] [--no-progress] [-v] [<host>:]<directory> [<refs>...]"; > +"git fetch-pack [--all] [--stdin] [--quiet|-q] [--keep|-k] [--thin] [--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] [--no-progress] [-v] [<host>:]<directory> [<refs>...]"; Not a problem you introduced, but maybe it is time for us to cut down this gigantic line (it's 180 characters even before your patch!). Even breaking it across lines at 80 columns would help. > @@ -972,6 +976,42 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) > if (!dest) > usage(fetch_pack_usage); > > + if (args.refs_from_stdin) { > + char ref[1000]; Ick. Is there any reason not to use a strbuf here? 1000 is probably plenty, but we are generally moving towards removing such limits where possible. You'd also get to use strbuf_getline and strbuf_trim in the newline-delimited case. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-25 1:19 ` Jeff King @ 2012-03-25 9:39 ` Ivan Todoroski 2012-03-25 15:15 ` Ivan Todoroski 2012-03-25 20:00 ` Ivan Todoroski 1 sibling, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 9:39 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 03:19, Jeff King wrote: > On Sat, Mar 24, 2012 at 09:53:26PM +0100, Ivan Todoroski wrote: > >> --- >> Documentation/git-fetch-pack.txt | 9 ++++++++ >> builtin/fetch-pack.c | 44 ++++++++++++++++++++++++++++++++++++-- >> fetch-pack.h | 3 ++- >> 3 files changed, 53 insertions(+), 3 deletions(-) > > Give more of a commit message. Why is this option useful (I know, of > course, from our previous discussion. But keep in mind the audience of > developers reading "git log" a year from now). Definitely. This was just work in progress I posted while I figure out the test suite problem. I will add full commit messages when I complete the test cases. Thanks for reviewing, I will incorporate all your suggestions in the next version. >> @@ -972,6 +976,42 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) >> if (!dest) >> usage(fetch_pack_usage); >> >> + if (args.refs_from_stdin) { >> + char ref[1000]; > > Ick. Is there any reason not to use a strbuf here? 1000 is probably > plenty, but we are generally moving towards removing such limits where > possible. > > You'd also get to use strbuf_getline and strbuf_trim in the > newline-delimited case. Right now that "char ref[1000]" code is rejecting refs on stdin if they are longer than 1000 chars or if they contain an ASCII NUL char in them. When I change this to strbuf_getline() should I be doing any similar checks, or do I just pass on whatever I read? ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-25 9:39 ` Ivan Todoroski @ 2012-03-25 15:15 ` Ivan Todoroski 0 siblings, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 15:15 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 11:39, Ivan Todoroski wrote: > On 25.03.2012 03:19, Jeff King wrote: >> On Sat, Mar 24, 2012 at 09:53:26PM +0100, Ivan Todoroski wrote: >>> @@ -972,6 +976,42 @@ int cmd_fetch_pack(int argc, const char **argv, >>> const char *prefix) >>> if (!dest) >>> usage(fetch_pack_usage); >>> >>> + if (args.refs_from_stdin) { >>> + char ref[1000]; >> >> Ick. Is there any reason not to use a strbuf here? 1000 is probably >> plenty, but we are generally moving towards removing such limits where >> possible. >> >> You'd also get to use strbuf_getline and strbuf_trim in the >> newline-delimited case. > > Right now that "char ref[1000]" code is rejecting refs on stdin if they > are longer than 1000 chars or if they contain an ASCII NUL char in them. > When I change this to strbuf_getline() should I be doing any similar > checks, or do I just pass on whatever I read? Never mind, there is no checking necessary. I just saw that strbuf_getline() grows the buffer automatically so there is no chance of buffer overflows, plus it doesn't mind if NUL chars are present in the middle of the string. The subsequent code that reads the refs will stop at the first NUL char anyway so all is well. I didn't know about this cool strbuf infrastructure, thanks for the tip. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-25 1:19 ` Jeff King 2012-03-25 9:39 ` Ivan Todoroski @ 2012-03-25 20:00 ` Ivan Todoroski 2012-03-26 17:21 ` Jeff King 1 sibling, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 20:00 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 03:19, Jeff King wrote: > On Sat, Mar 24, 2012 at 09:53:26PM +0100, Ivan Todoroski wrote: > I think there is a minor formatting bug in the above. Asciidoc will make > your two paragraphs into a single one, won't it? I think you need to do > the (horribly ugly): > > --stdin:: > First paragraph. > + > Second paragraph. Apparently this works too (i.e. indent the "+" too): --stdin:: First paragraph. + Second paragraph. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-25 20:00 ` Ivan Todoroski @ 2012-03-26 17:21 ` Jeff King 2012-03-26 17:49 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Jeff King @ 2012-03-26 17:21 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sun, Mar 25, 2012 at 10:00:01PM +0200, Ivan Todoroski wrote: > On 25.03.2012 03:19, Jeff King wrote: > >On Sat, Mar 24, 2012 at 09:53:26PM +0100, Ivan Todoroski wrote: > >I think there is a minor formatting bug in the above. Asciidoc will make > >your two paragraphs into a single one, won't it? I think you need to do > >the (horribly ugly): > > > > --stdin:: > > First paragraph. > > + > > Second paragraph. > > Apparently this works too (i.e. indent the "+" too): > > --stdin:: > First paragraph. > + > Second paragraph. Sadly, it's not quite the same (because I consider the source of your version much more readable). The diff of the resulting HTML between my version and yours is: --- no-indent.html 2012-03-26 13:19:26.206728013 -0400 +++ indent.html 2012-03-26 13:19:04.270727319 -0400 @@ -5,7 +5,8 @@ <dd> <p> First paragraph. + <br /> + Second paragraph. </p> -<div class="paragraph"><p>Second paragraph.</p></div> </dd> </dl></div> So in your case it is putting in a line break, but not actually starting a new paragraph. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-26 17:21 ` Jeff King @ 2012-03-26 17:49 ` Ivan Todoroski 2012-03-26 17:51 ` Jeff King 0 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-26 17:49 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 26.03.2012 19:21, Jeff King wrote: > On Sun, Mar 25, 2012 at 10:00:01PM +0200, Ivan Todoroski wrote: > >> On 25.03.2012 03:19, Jeff King wrote: >>> On Sat, Mar 24, 2012 at 09:53:26PM +0100, Ivan Todoroski wrote: >>> I think there is a minor formatting bug in the above. Asciidoc will make >>> your two paragraphs into a single one, won't it? I think you need to do >>> the (horribly ugly): >>> >>> --stdin:: >>> First paragraph. >>> + >>> Second paragraph. >> Apparently this works too (i.e. indent the "+" too): >> >> --stdin:: >> First paragraph. >> + >> Second paragraph. > > Sadly, it's not quite the same (because I consider the source of your > version much more readable). The diff of the resulting HTML between my > version and yours is: > > --- no-indent.html 2012-03-26 13:19:26.206728013 -0400 > +++ indent.html 2012-03-26 13:19:04.270727319 -0400 > @@ -5,7 +5,8 @@ > <dd> > <p> > First paragraph. > + <br /> > + Second paragraph. > </p> > -<div class="paragraph"><p>Second paragraph.</p></div> > </dd> > </dl></div> > > So in your case it is putting in a line break, but not actually starting > a new paragraph. > > -Peff Ah, that's too bad. OK, I guess we have two options: 1) Unindent the second paragraph - looks ugly in source but good in final output. 2) Just merge the two paragraphs into a single paragraph. It's not that much text anyway, it doesn't really *have* to be in two paragraphs. Let me know which option you prefer. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin 2012-03-26 17:49 ` Ivan Todoroski @ 2012-03-26 17:51 ` Jeff King 0 siblings, 0 replies; 54+ messages in thread From: Jeff King @ 2012-03-26 17:51 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Mon, Mar 26, 2012 at 07:49:59PM +0200, Ivan Todoroski wrote: > >So in your case it is putting in a line break, but not actually starting > >a new paragraph. > > > Ah, that's too bad. OK, I guess we have two options: > > 1) Unindent the second paragraph - looks ugly in source but good in > final output. > > 2) Just merge the two paragraphs into a single paragraph. It's not > that much text anyway, it doesn't really *have* to be in two > paragraphs. > > Let me know which option you prefer. I would do (1). While I (and I suspect other git devs) do just read the source straight from Documentation/*.txt, the formatted output is read by many more people, and should be considered the final product. So optimize for that case. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin 2012-03-21 20:17 ` Jeff King 2012-03-24 20:49 ` Ivan Todoroski 2012-03-24 20:53 ` [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin Ivan Todoroski @ 2012-03-24 20:54 ` Ivan Todoroski 2012-03-25 1:24 ` Jeff King 2 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-24 20:54 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git >From 723a561946824ee367f57f0d9b9d336a6bc28d13 Mon Sep 17 00:00:00 2001 From: Ivan Todoroski <grnch@gmx.net> Date: Sat, 24 Mar 2012 15:53:36 +0100 Subject: [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin --- remote-curl.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/remote-curl.c b/remote-curl.c index d159fe7f34..9fdfca9f32 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -531,12 +531,13 @@ static int post_rpc(struct rpc_state *rpc) return err; } -static int rpc_service(struct rpc_state *rpc, struct discovery *heads) +static int rpc_service(struct rpc_state *rpc, struct discovery *heads, + int nr_fetch, struct ref **to_fetch) { const char *svc = rpc->service_name; struct strbuf buf = STRBUF_INIT; struct child_process client; - int err = 0; + int err = 0, i; memset(&client, 0, sizeof(client)); client.in = -1; @@ -545,6 +546,13 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads) client.argv = rpc->argv; if (start_command(&client)) exit(1); + + /* push the refs to fetch-pack via stdin, if requested */ + if (to_fetch) { + for (i = 0; i < nr_fetch; i++) + packet_write(client.in, "%s\n", to_fetch[i]->name); + packet_flush(client.in); + } if (heads) write_or_die(client.in, heads->buf, heads->len); @@ -633,6 +641,7 @@ static int fetch_git(struct discovery *heads, argv = xmalloc((15 + nr_heads) * sizeof(char*)); argv[argc++] = "fetch-pack"; argv[argc++] = "--stateless-rpc"; + argv[argc++] = "--stdin"; argv[argc++] = "--lock-pack"; if (options.followtags) argv[argc++] = "--include-tag"; @@ -655,7 +664,6 @@ static int fetch_git(struct discovery *heads, struct ref *ref = to_fetch[i]; if (!ref->name || !*ref->name) die("cannot fetch by sha1 over smart http"); - argv[argc++] = ref->name; } argv[argc++] = NULL; @@ -664,7 +672,7 @@ static int fetch_git(struct discovery *heads, rpc.argv = argv; rpc.gzip_request = 1; - err = rpc_service(&rpc, heads); + err = rpc_service(&rpc, heads, nr_heads, to_fetch); if (rpc.result.len) safe_write(1, rpc.result.buf, rpc.result.len); strbuf_release(&rpc.result); @@ -783,7 +791,7 @@ static int push_git(struct discovery *heads, int nr_spec, char **specs) rpc.service_name = "git-receive-pack", rpc.argv = argv; - err = rpc_service(&rpc, heads); + err = rpc_service(&rpc, heads, 0, NULL); if (rpc.result.len) safe_write(1, rpc.result.buf, rpc.result.len); strbuf_release(&rpc.result); -- 1.7.9.4.16.gd24fa.dirty ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin 2012-03-24 20:54 ` [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski @ 2012-03-25 1:24 ` Jeff King 2012-03-25 9:52 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Jeff King @ 2012-03-25 1:24 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sat, Mar 24, 2012 at 09:54:16PM +0100, Ivan Todoroski wrote: > From 723a561946824ee367f57f0d9b9d336a6bc28d13 Mon Sep 17 00:00:00 2001 > From: Ivan Todoroski <grnch@gmx.net> > Date: Sat, 24 Mar 2012 15:53:36 +0100 > Subject: [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin > > --- Same comments as patch 1. Drop this stuff, and add in more commit message. :) > remote-curl.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) Overall the code looks sane to me. One minor comment: > -static int rpc_service(struct rpc_state *rpc, struct discovery *heads) > +static int rpc_service(struct rpc_state *rpc, struct discovery *heads, > + int nr_fetch, struct ref **to_fetch) I was curious why you needed to add new arguments for this, since we surely must be passing the information into rpc_service already (since it has to put them on the command line). And the answer is that they are already in the argv member of the struct rpc_state. I wonder if it would fit the existing style of the code to pass the arguments through a member in the rpc_state in the same way. I don't feel strongly about it, though. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin 2012-03-25 1:24 ` Jeff King @ 2012-03-25 9:52 ` Ivan Todoroski 2012-03-26 17:24 ` Jeff King 0 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-25 9:52 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 25.03.2012 03:24, Jeff King wrote: > On Sat, Mar 24, 2012 at 09:54:16PM +0100, Ivan Todoroski wrote: >> -static int rpc_service(struct rpc_state *rpc, struct discovery *heads) >> +static int rpc_service(struct rpc_state *rpc, struct discovery *heads, >> + int nr_fetch, struct ref **to_fetch) > > I was curious why you needed to add new arguments for this, since we > surely must be passing the information into rpc_service already (since > it has to put them on the command line). And the answer is that they are > already in the argv member of the struct rpc_state. I wonder if it would > fit the existing style of the code to pass the arguments through a > member in the rpc_state in the same way. I don't feel strongly about it, > though. Yeah, I realized this was ugly after I sent it.... not only are new arguments added to rpc_service(), but they seem completely arbitrary and specific to fetch-pack. There is no reason other callers of rpc_service() would want args in the exact same format. On the other hand I can't reuse rpc_state.argv because that is already passed to client.argv for start_command(). Would it be OK if I add a new memeber in rpc_state, e.g. "struct strbuf *stdin_preamble"? If non-NULL, it would contain any data the caller of rpc_service() wants shoved into the stdin of the sub-command before anything else. That way the caller is free to format this data in whatever format they need, not only in the pkt-line format needed by fetch-pack. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin 2012-03-25 9:52 ` Ivan Todoroski @ 2012-03-26 17:24 ` Jeff King 0 siblings, 0 replies; 54+ messages in thread From: Jeff King @ 2012-03-26 17:24 UTC (permalink / raw) To: Ivan Todoroski; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On Sun, Mar 25, 2012 at 11:52:35AM +0200, Ivan Todoroski wrote: > >>-static int rpc_service(struct rpc_state *rpc, struct discovery *heads) > >>+static int rpc_service(struct rpc_state *rpc, struct discovery *heads, > >>+ int nr_fetch, struct ref **to_fetch) > [...] > Yeah, I realized this was ugly after I sent it.... not only are new > arguments added to rpc_service(), but they seem completely arbitrary > and specific to fetch-pack. There is no reason other callers of > rpc_service() would want args in the exact same format. Yeah, I think that is what was bugging me about it. > On the other hand I can't reuse rpc_state.argv because that is > already passed to client.argv for start_command(). Right. You'd need a new member. > Would it be OK if I add a new memeber in rpc_state, e.g. "struct > strbuf *stdin_preamble"? If non-NULL, it would contain any data the > caller of rpc_service() wants shoved into the stdin of the > sub-command before anything else. That way the caller is free to > format this data in whatever format they need, not only in the > pkt-line format needed by fetch-pack. I think that is the cleanest solution. It's a little less efficient, in that we build the whole buffer in memory instead of sending each packet as we form it. But I think we are talking about a few kilobytes at most. -Peff ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone 2012-03-21 17:14 ` Jeff King 2012-03-21 17:59 ` Jakub Narebski 2012-03-21 20:02 ` Ivan Todoroski @ 2012-03-27 6:23 ` Ivan Todoroski 2012-03-27 6:25 ` [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin Ivan Todoroski ` (3 more replies) 2 siblings, 4 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 6:23 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git This patch series is against the latest "maint" branch. Please let me know if I need to rebase this on top of some other branch. The problem we are fixing is described in the first patch in the series. Changes since the original patch: * add test cases * add full commit messages * fix formatting problem in --stdin doc * split overly long fetch_pack_usage line * use strbuf_getline() instead of fgets() for reading refs from stdin * minor optimization of the pkt-line reading loop, it was using xstrdup() even though the string length was already known, use xmemdupz() instead * rework the remote-curl.c patch to not add new parameters to rpc_service(), instead add a new strbuf member to rpc_state to pass the info around Ivan Todoroski (4): fetch-pack: new --stdin option to read refs from stdin remote-curl: send the refs to fetch-pack on stdin fetch-pack: test cases for the new --stdin option remote-curl: main test case for the OS command line overflow Documentation/git-fetch-pack.txt | 10 ++++ builtin/fetch-pack.c | 45 ++++++++++++++++- fetch-pack.h | 3 +- remote-curl.c | 15 +++++- t/t5500-fetch-pack.sh | 100 ++++++++++++++++++++++++++++++++++++++ t/t5551-http-fetch.sh | 32 ++++++++++++ 6 files changed, 201 insertions(+), 4 deletions(-) -- 1.7.9.5.4.g4f508 ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski @ 2012-03-27 6:25 ` Ivan Todoroski 2012-03-27 16:59 ` Junio C Hamano 2012-03-27 6:26 ` [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski ` (2 subsequent siblings) 3 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 6:25 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git If a remote repo has too many tags (or branches), cloning it over the smart HTTP transport can fail because remote-curl.c puts all the refs from the remote repo on the fetch-pack command line. This can make the command line longer than the global OS command line limit, causing fetch-pack to fail. This is especially a problem on Windows where the command line limit is orders of magnitude shorter than Linux. There are already real repos out there that msysGit cannot clone over smart HTTP due to this problem. Here is an easy way to trigger this problem: git init too-many-refs cd too-many-refs echo bla > bla.txt git add . git commit -m test sha=$(git rev-parse HEAD) for ((i=0; i<50000; i++)); do echo $sha refs/tags/artificially-long-tag-name-to-more-easily-\ demonstrate-the-problem-$i >> .git/packed-refs done Then share this repo over the smart HTTP protocol and try cloning it: $ git clone http://localhost/.../too-many-refs/.git Cloning into 'too-many-refs'... fatal: cannot exec 'fetch-pack': Argument list too long 50k tags is obviously an absurd number, but it is required to demonstrate the problem on Linux because it has a much more generous command line limit. On Windows the clone fails with as little as 500 tags in the above loop, which is getting uncomfortably close to the number of tags you might see in real long lived repos. This is not just theoretical, msysGit is already failing to clone our company repo due to this. It's a large repo converted from CVS, nearly 10 years of history. Four possible solutions were discussed on the Git mailing list (in no particular order): 1) Call fetch-pack multiple times with smaller batches of refs. This was dismissed as inefficient and inelegant. 2) Add option --refs-fd=$n to pass a an fd from where to read the refs. This was rejected because inheriting descriptors other than stdin/stdout/stderr through exec() is apparently problematic on Windows, plus it would require changes to the run-command API to open extra pipes. 3) Add option --refs-from=$tmpfile to pass the refs using a temp file. This was not favored because of the temp file requirement. 4) Add option --stdin to pass the refs on stdin, one per line. In the end this option was chosen as the most efficient and most desirable from scripting perspective. There was however a small complication when using stdin to pass refs to fetch-pack. The --stateless-rpc option to fetch-pack also uses stdin for communication with the remote server. If we were going to sneak refs on stdin line by line in the presence of --stateless-rpc it would have to be done very carefully, because when reading refs one by line we might buffer too much data ahead and eat some of the remote protocol data also coming on stdin. One way to solve this would be refactor get_remote_heads() in fetch-pack.c to accept a residual buffer from our stdin line parsing above, but this function is used in several places so other callers would be burdened by this residual buffer interface even when most of them don't need it. In the end we settled on the following solution: If --stdin is specified without --stateless-rpc, fetch-pack would read the refs from stdin one per line, in a script friendly format. However if --stdin is specified together with --stateless-rpc, fetch-pack would read the refs from stdin in packetized format (pkt-line) with a flush packet terminating the list of refs. This way we can read the exact number of bytes that we need from stdin, and then get_remote_heads() can continue reading from the same fd without losing a single byte of remote protocol data. This way the --stdin option only loses generality and scriptability when used together with --stateless-rpc, which is not easily scriptable anyway because it also uses pkt-line when talking to the remote server. --- Documentation/git-fetch-pack.txt | 10 +++++++++ builtin/fetch-pack.c | 45 +++++++++++++++++++++++++++++++++++++- fetch-pack.h | 3 ++- 3 files changed, 56 insertions(+), 2 deletions(-) diff --git a/Documentation/git-fetch-pack.txt b/Documentation/git-fetch-pack.txt index ed1bdaacd1..1dd44fd348 100644 --- a/Documentation/git-fetch-pack.txt +++ b/Documentation/git-fetch-pack.txt @@ -32,6 +32,16 @@ OPTIONS --all:: Fetch all remote refs. +--stdin:: + Take the list of refs from stdin, one per line. If there + are refs specified on the command line in addition to this + option, then the refs from stdin are processed after those + on the command line. ++ +If '--stateless-rpc' is specified together with this option then +the list of refs must be in packet format (pkt-line) with a flush +packet terminating the list. + -q:: --quiet:: Pass '-q' flag to 'git unpack-objects'; this makes the diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c index a4d3e90a86..1a90fa852f 100644 --- a/builtin/fetch-pack.c +++ b/builtin/fetch-pack.c @@ -23,7 +23,9 @@ static struct fetch_pack_args args = { }; static const char fetch_pack_usage[] = -"git fetch-pack [--all] [--quiet|-q] [--keep|-k] [--thin] [--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] [--no-progress] [-v] [<host>:]<directory> [<refs>...]"; +"git fetch-pack [--all] [--stdin] [--quiet|-q] [--keep|-k] [--thin] " +"[--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] " +"[--no-progress] [-v] [<host>:]<directory> [<refs>...]"; #define COMPLETE (1U << 0) #define COMMON (1U << 1) @@ -941,6 +943,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) args.fetch_all = 1; continue; } + if (!strcmp("--stdin", arg)) { + args.refs_from_stdin = 1; + continue; + } if (!strcmp("-v", arg)) { args.verbose = 1; continue; @@ -972,6 +978,43 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) if (!dest) usage(fetch_pack_usage); + if (args.refs_from_stdin) { + /* copy refs from cmdline to new growable list, + then append the refs from stdin */ + int alloc_heads = nr_heads; + int size = nr_heads * sizeof(*heads); + heads = memcpy(xmalloc(size), heads, size); + if (args.stateless_rpc) { + /* in stateless RPC mode we use pkt-line to read + from stdin, until we get a flush packet */ + static char line[1000]; + for (;;) { + int n = packet_read_line(0, line, sizeof(line)); + if (!n) + break; + if (line[n-1] == '\n') + line[--n] = '\0'; + ALLOC_GROW(heads, nr_heads + 1, alloc_heads); + heads[nr_heads++] = xmemdupz(line, n); + } + } + else { + /* read from stdin one ref per line, until EOF */ + struct strbuf line; + strbuf_init(&line, 0); + for (;;) { + if (strbuf_getline(&line, stdin, '\n') == EOF) + break; + strbuf_trim(&line); + if (!line.len) + continue; /* skip empty lines */ + ALLOC_GROW(heads, nr_heads + 1, alloc_heads); + heads[nr_heads++] = strbuf_detach(&line, NULL); + } + strbuf_release(&line); + } + } + if (args.stateless_rpc) { conn = NULL; fd[0] = 0; diff --git a/fetch-pack.h b/fetch-pack.h index 0608edae3f..292d69389e 100644 --- a/fetch-pack.h +++ b/fetch-pack.h @@ -13,7 +13,8 @@ struct fetch_pack_args { verbose:1, no_progress:1, include_tag:1, - stateless_rpc:1; + stateless_rpc:1, + refs_from_stdin:1; }; struct ref *fetch_pack(struct fetch_pack_args *args, -- 1.7.9.5.4.g4f508 ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin 2012-03-27 6:25 ` [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin Ivan Todoroski @ 2012-03-27 16:59 ` Junio C Hamano 2012-03-27 23:18 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Junio C Hamano @ 2012-03-27 16:59 UTC (permalink / raw) To: Ivan Todoroski Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git Ivan Todoroski <grnch@gmx.net> writes: > sha=$(git rev-parse HEAD) > for ((i=0; i<50000; i++)); do > echo $sha refs/tags/artificially-long-tag-name-to-more-easily-\ > demonstrate-the-problem-$i >> .git/packed-refs > done This comment does not have much to the issue, but the above is better written as commit=$(git rev-parse HEAD) i=0 while test $i -le 50000 do echo $commit refs/tags/....-$i done >>.git/packed.refs to stay within the usual Bourne shell. > 4) Add option --stdin to pass the refs on stdin, one per line. > ... > In the end we settled on the following solution: > > If --stdin is specified without --stateless-rpc, fetch-pack would read > the refs from stdin one per line, in a script friendly format. > > However if --stdin is specified together with --stateless-rpc, > fetch-pack would read the refs from stdin in packetized format > (pkt-line) with a flush packet terminating the list of refs. This way we > can read the exact number of bytes that we need from stdin, and then > get_remote_heads() can continue reading from the same fd without losing > a single byte of remote protocol data. That sounds like the right way to use pkt-line machinery. Send stuff you need to send, and mark it with a flush to tell the receiver that you have finished giving one logical collection of information. > This way the --stdin option only loses generality and scriptability when > used together with --stateless-rpc, which is not easily scriptable > anyway because it also uses pkt-line when talking to the remote server. > --- Just a gentle reminder; the final submission will need your sign off here. > diff --git a/Documentation/git-fetch-pack.txt b/Documentation/git-fetch-pack.txt > index ed1bdaacd1..1dd44fd348 100644 > --- a/Documentation/git-fetch-pack.txt > +++ b/Documentation/git-fetch-pack.txt > @@ -32,6 +32,16 @@ OPTIONS > --all:: > Fetch all remote refs. > > +--stdin:: > + Take the list of refs from stdin, one per line. If there > + are refs specified on the command line in addition to this > + option, then the refs from stdin are processed after those > + on the command line. > ++ > +If '--stateless-rpc' is specified together with this option then > +the list of refs must be in packet format (pkt-line) with a flush > +packet terminating the list. > + It is not clear from this description alone if this is a single (possibly giant) packet with multiple lines, each of which describes a ref, or a series of one packet per ref with a flush at the end of the sequence. > diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c > index a4d3e90a86..1a90fa852f 100644 > --- a/builtin/fetch-pack.c > +++ b/builtin/fetch-pack.c > @@ -972,6 +978,43 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix) > if (!dest) > usage(fetch_pack_usage); > > + if (args.refs_from_stdin) { > + /* copy refs from cmdline to new growable list, > + then append the refs from stdin */ /* * We tend to format our multi-line * comments like this */ > + int alloc_heads = nr_heads; > + int size = nr_heads * sizeof(*heads); > + heads = memcpy(xmalloc(size), heads, size); > + if (args.stateless_rpc) { > + /* in stateless RPC mode we use pkt-line to read > + from stdin, until we get a flush packet */ > + static char line[1000]; We will never have a refname that is longer than this limit? > + for (;;) { > + int n = packet_read_line(0, line, sizeof(line)); > + if (!n) > + break; > + if (line[n-1] == '\n') > + line[--n] = '\0'; > + ALLOC_GROW(heads, nr_heads + 1, alloc_heads); > + heads[nr_heads++] = xmemdupz(line, n); Micronit. The use of xmemdupz() here means you do not have to replace LF with NUL above; decrementing 'n' should be sufficient. > + } > + } > + else { > + /* read from stdin one ref per line, until EOF */ > + struct strbuf line; > + strbuf_init(&line, 0); > + for (;;) { > + if (strbuf_getline(&line, stdin, '\n') == EOF) > + break; > + strbuf_trim(&line); > + if (!line.len) > + continue; /* skip empty lines */ Curious. "stop at EOF", "trim" and "skip empty" imply that you are catering to people who debug this from the terminal by typing (or copy pasting). Is that the expected use case? Otherwise we may want to tighten this part a bit to forbid cruft (e.g. a line with leading or trailing whitespaces). > + ALLOC_GROW(heads, nr_heads + 1, alloc_heads); > + heads[nr_heads++] = strbuf_detach(&line, NULL); > + } > + strbuf_release(&line); > + } > + } > + > if (args.stateless_rpc) { > conn = NULL; > fd[0] = 0; > diff --git a/fetch-pack.h b/fetch-pack.h > index 0608edae3f..292d69389e 100644 > --- a/fetch-pack.h > +++ b/fetch-pack.h > @@ -13,7 +13,8 @@ struct fetch_pack_args { > verbose:1, > no_progress:1, > include_tag:1, > - stateless_rpc:1; > + stateless_rpc:1, > + refs_from_stdin:1; > }; > > struct ref *fetch_pack(struct fetch_pack_args *args, ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin 2012-03-27 16:59 ` Junio C Hamano @ 2012-03-27 23:18 ` Ivan Todoroski 2012-03-27 23:26 ` Junio C Hamano 0 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 23:18 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 27.03.2012 18:59, Junio C Hamano wrote: > Ivan Todoroski <grnch@gmx.net> writes: > >> + int alloc_heads = nr_heads; >> + int size = nr_heads * sizeof(*heads); >> + heads = memcpy(xmalloc(size), heads, size); >> + if (args.stateless_rpc) { >> + /* in stateless RPC mode we use pkt-line to read >> + from stdin, until we get a flush packet */ >> + static char line[1000]; > > We will never have a refname that is longer than this limit? I don't know. I grepped the code for existing usages of packet_read_line and that seemed to be a common idiom everywhere. Should I just bump up the size or is there some accepted way to read arbitrary length packets? >> + } >> + } >> + else { >> + /* read from stdin one ref per line, until EOF */ >> + struct strbuf line; >> + strbuf_init(&line, 0); >> + for (;;) { >> + if (strbuf_getline(&line, stdin, '\n') == EOF) >> + break; >> + strbuf_trim(&line); >> + if (!line.len) >> + continue; /* skip empty lines */ > > Curious. "stop at EOF", "trim" and "skip empty" imply that you are > catering to people who debug this from the terminal by typing (or copy > pasting). Is that the expected use case? The expected use case is people using this from shell scripts that could be getting refs by slicing and dicing output of other commands with regexps and what not which could leave some whitespace here and there, so a more liberal interface might be more friendly to such script writers. Currently you would pass a list of generated refs to fetch-pack using something like this: generate-refs | xargs fetch-pack or this: fetch-pack $(generate-refs) Both of these commands will ignore any extra whitespace produced by "generate-refs". Since --stdin is meant to be a spiritual replacement for the two commands above, I thought it should behave in a similar spirit. At least that was the reasoning... if you're not swayed by it I can just remove those lines and not tolerate any extra whitespace. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin 2012-03-27 23:18 ` Ivan Todoroski @ 2012-03-27 23:26 ` Junio C Hamano 2012-03-27 23:48 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Junio C Hamano @ 2012-03-27 23:26 UTC (permalink / raw) To: Ivan Todoroski Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git Ivan Todoroski <grnch@gmx.net> writes: > On 27.03.2012 18:59, Junio C Hamano wrote: >> Ivan Todoroski <grnch@gmx.net> writes: >> >>> + int alloc_heads = nr_heads; >>> + int size = nr_heads * sizeof(*heads); >>> + heads = memcpy(xmalloc(size), heads, size); >>> + if (args.stateless_rpc) { >>> + /* in stateless RPC mode we use pkt-line to read >>> + from stdin, until we get a flush packet */ >>> + static char line[1000]; >> >> We will never have a refname that is longer than this limit? > > I don't know. I grepped the code for existing usages of packet_read_line > and that seemed to be a common idiom everywhere. Should I just bump up > the size or is there some accepted way to read arbitrary length packets? As you may have already guessed, you can have up to (64k - slop) bytes in a packet. But it probably does not matter, as a loong ref that bust your limit will not fit in the packet used in the normal codepath anyway. >> Curious. "stop at EOF", "trim" and "skip empty" imply that you are >> catering to people who debug this from the terminal by typing (or copy >> pasting). Is that the expected use case? > > The expected use case is people using this from shell scripts that > could be getting refs by slicing and dicing output of other commands > with regexps and what not That kind of use would not have leading or trailing whitespaces, and if the target audience is scripts, I would prefer to force them to be strict. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin 2012-03-27 23:26 ` Junio C Hamano @ 2012-03-27 23:48 ` Ivan Todoroski 0 siblings, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 23:48 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 28.03.2012 01:26, Junio C Hamano wrote: > Ivan Todoroski <grnch@gmx.net> writes: > >>> Curious. "stop at EOF", "trim" and "skip empty" imply that you are >>> catering to people who debug this from the terminal by typing (or copy >>> pasting). Is that the expected use case? >> The expected use case is people using this from shell scripts that >> could be getting refs by slicing and dicing output of other commands >> with regexps and what not > > That kind of use would not have leading or trailing whitespaces, and if > the target audience is scripts, I would prefer to force them to be > strict. Depends on how good the script writer is. :) But you're right, they can always change the script to tighten up what it generates. Also, I just checked what other commands with --stdin flags do (like rev-list, checkout-index, etc). They all expect an exact format like you said and they don't tolerate extra whitespace, making my patch the odd one out. It is important that similar flags behave consistently across different commands. Sorry for not checking for precedent before deciding to tolerate whitespace. I will tighten it up. ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski 2012-03-27 6:25 ` [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin Ivan Todoroski @ 2012-03-27 6:26 ` Ivan Todoroski 2012-03-27 17:18 ` Junio C Hamano 2012-03-27 6:27 ` [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option Ivan Todoroski 2012-03-27 6:28 ` [PATCH/RFC v2 4/4] remote-curl: main test case for the OS command line overflow Ivan Todoroski 3 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 6:26 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git If a remote repo has too many tags (or branches), cloning it over the smart HTTP transport can fail because remote-curl.c puts all the refs from the remote repo on the fetch-pack command line. This can make the command line longer than the global OS command line limit, causing fetch-pack to fail. This is especially a problem on Windows where the command line limit is orders of magnitude shorter than Linux. There are already real repos out there that msysGit cannot clone over smart HTTP due to this problem. To solve this problem we teach remote-curl.c to pipe the refs to fetch-pack using the new --stdin option, instead of on the fetch-pack command line. For a more detailed discussion of the problem see the parent of this commit, titled "fetch-pack: new --stdin option to read refs from stdin". --- remote-curl.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/remote-curl.c b/remote-curl.c index d159fe7f34..52c21433c7 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -290,6 +290,7 @@ static void output_refs(struct ref *refs) struct rpc_state { const char *service_name; const char **argv; + struct strbuf *stdin_preamble; char *service_url; char *hdr_content_type; char *hdr_accept; @@ -535,6 +536,7 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads) { const char *svc = rpc->service_name; struct strbuf buf = STRBUF_INIT; + struct strbuf *preamble = rpc->stdin_preamble; struct child_process client; int err = 0; @@ -545,6 +547,8 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads) client.argv = rpc->argv; if (start_command(&client)) exit(1); + if (preamble) + write_or_die(client.in, preamble->buf, preamble->len); if (heads) write_or_die(client.in, heads->buf, heads->len); @@ -626,6 +630,7 @@ static int fetch_git(struct discovery *heads, int nr_heads, struct ref **to_fetch) { struct rpc_state rpc; + struct strbuf preamble; char *depth_arg = NULL; const char **argv; int argc = 0, i, err; @@ -633,6 +638,7 @@ static int fetch_git(struct discovery *heads, argv = xmalloc((15 + nr_heads) * sizeof(char*)); argv[argc++] = "fetch-pack"; argv[argc++] = "--stateless-rpc"; + argv[argc++] = "--stdin"; argv[argc++] = "--lock-pack"; if (options.followtags) argv[argc++] = "--include-tag"; @@ -651,23 +657,28 @@ static int fetch_git(struct discovery *heads, argv[argc++] = depth_arg; } argv[argc++] = url; + argv[argc++] = NULL; + + strbuf_init(&preamble, 4); for (i = 0; i < nr_heads; i++) { struct ref *ref = to_fetch[i]; if (!ref->name || !*ref->name) die("cannot fetch by sha1 over smart http"); - argv[argc++] = ref->name; + packet_buf_write(&preamble, "%s\n", ref->name); } - argv[argc++] = NULL; + packet_buf_flush(&preamble); memset(&rpc, 0, sizeof(rpc)); rpc.service_name = "git-upload-pack", rpc.argv = argv; + rpc.stdin_preamble = &preamble; rpc.gzip_request = 1; err = rpc_service(&rpc, heads); if (rpc.result.len) safe_write(1, rpc.result.buf, rpc.result.len); strbuf_release(&rpc.result); + strbuf_release(&preamble); free(argv); free(depth_arg); return err; -- 1.7.9.5.4.g4f508 ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin 2012-03-27 6:26 ` [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski @ 2012-03-27 17:18 ` Junio C Hamano 2012-03-27 23:20 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Junio C Hamano @ 2012-03-27 17:18 UTC (permalink / raw) To: Ivan Todoroski Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git Ivan Todoroski <grnch@gmx.net> writes: > If a remote repo has too many tags (or branches), cloning it over the > smart HTTP transport can fail because remote-curl.c puts all the refs > from the remote repo on the fetch-pack command line. This can make the > command line longer than the global OS command line limit, causing > fetch-pack to fail. > > This is especially a problem on Windows where the command line limit is > orders of magnitude shorter than Linux. There are already real repos out > there that msysGit cannot clone over smart HTTP due to this problem. > > To solve this problem we teach remote-curl.c to pipe the refs to > fetch-pack using the new --stdin option, instead of on the fetch-pack > command line. > > For a more detailed discussion of the problem see the parent of this > commit, titled "fetch-pack: new --stdin option to read refs from stdin". Thanks. As there is no way for this single commit to be applied without the previous step, you could have just said: Now we can throw arbitrary number of refs at fetch-pack using its --stdin option, use it in remote-curl helper to lift the command line length limit. or something ;-). > @@ -626,6 +630,7 @@ static int fetch_git(struct discovery *heads, > int nr_heads, struct ref **to_fetch) > { > struct rpc_state rpc; > + struct strbuf preamble; > char *depth_arg = NULL; > const char **argv; > int argc = 0, i, err; > @@ -633,6 +638,7 @@ static int fetch_git(struct discovery *heads, > argv = xmalloc((15 + nr_heads) * sizeof(char*)); > argv[argc++] = "fetch-pack"; > argv[argc++] = "--stateless-rpc"; > + argv[argc++] = "--stdin"; > argv[argc++] = "--lock-pack"; > if (options.followtags) > argv[argc++] = "--include-tag"; > @@ -651,23 +657,28 @@ static int fetch_git(struct discovery *heads, > argv[argc++] = depth_arg; > } > argv[argc++] = url; > + argv[argc++] = NULL; > + > + strbuf_init(&preamble, 4); Curious. If "4" does not really matter, I would drop this and use STRBUF_INIT at the beginning instead. > for (i = 0; i < nr_heads; i++) { > struct ref *ref = to_fetch[i]; > if (!ref->name || !*ref->name) > die("cannot fetch by sha1 over smart http"); > - argv[argc++] = ref->name; > + packet_buf_write(&preamble, "%s\n", ref->name); > } > - argv[argc++] = NULL; > + packet_buf_flush(&preamble); > > memset(&rpc, 0, sizeof(rpc)); > rpc.service_name = "git-upload-pack", > rpc.argv = argv; > + rpc.stdin_preamble = &preamble; > rpc.gzip_request = 1; > > err = rpc_service(&rpc, heads); > if (rpc.result.len) > safe_write(1, rpc.result.buf, rpc.result.len); > strbuf_release(&rpc.result); > + strbuf_release(&preamble); > free(argv); > free(depth_arg); > return err; ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin 2012-03-27 17:18 ` Junio C Hamano @ 2012-03-27 23:20 ` Ivan Todoroski 0 siblings, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 23:20 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 27.03.2012 19:18, Junio C Hamano wrote: > Ivan Todoroski <grnch@gmx.net> writes: > >> + strbuf_init(&preamble, 4); > > Curious. > > If "4" does not really matter, I would drop this and use STRBUF_INIT at > the beginning instead. I only put 4 because I know the strbuf will always have at least 4 bytes, because it always includes a flush packet. It doesn't matter at all otherwise, I will do as you suggest. ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski 2012-03-27 6:25 ` [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin Ivan Todoroski 2012-03-27 6:26 ` [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski @ 2012-03-27 6:27 ` Ivan Todoroski 2012-03-27 17:40 ` Junio C Hamano 2012-03-27 6:28 ` [PATCH/RFC v2 4/4] remote-curl: main test case for the OS command line overflow Ivan Todoroski 3 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 6:27 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git These test cases focus only on testing the parsing of refs on stdin, without bothering with the rest of the fetch-pack machinery. We pass in the refs using different combinations of command line and stdin and then we watch fetch-pack's stdout to see whether it prints all the refs we specified and in the same order as we specified them. For the --stateless-rpc tests we cannot easily execute fetch-pack to the end because that would require simulating the remote protocol. We settle for only checking 2 cases: 1) Whether fetch-pack correctly fails to parse the refs if they are not terminated by a flush packet 2) Whether fetch-pack finishes parsing the refs without error when they are correctly terminated by a flush packet The fetch-pack invocation fails in both cases due to the missing remote side of the protocol, but it fails in different ways which allows us to determine how the refs parsing ended by inspecting the different error messages. --- t/t5500-fetch-pack.sh | 100 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 100 insertions(+) diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 9bf69e9a0f..f4de7d07c1 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -248,4 +248,104 @@ test_expect_success 'clone shallow object count' ' grep "^count: 52" count.shallow ' + +cat >stdin.exp <<EOF +refs/heads/C +refs/heads/A +refs/heads/D +refs/tags/C +refs/heads/B +refs/tags/A +refs/heads/E +refs/tags/B +refs/tags/E +refs/tags/D +EOF + +test_expect_success 'setup tests for --stdin' ' + for head in C D E F; do + add $head + done && + for head in A B C D E F; do + git tag $head $head + done +' + +test_expect_success 'fetch refs from cmdline, make sure it still works OK' ' + cd client && + git fetch-pack --no-progress .. $(cat ../stdin.exp) | + cut -d " " -f 2 > ../stdin.act && + cd .. && + test_cmp stdin.exp stdin.act +' + +test_expect_success 'fetch refs from stdin' ' + cd client && + cat ../stdin.exp | + git fetch-pack --stdin --no-progress .. | + cut -d " " -f 2 > ../stdin.act && + cd .. && + test_cmp stdin.exp stdin.act +' + +test_expect_success 'fetch mixed refs from cmdline and stdin in right order' ' + cd client && + tail -n +5 ../stdin.exp | + git fetch-pack --stdin --no-progress .. $(head -n 4 ../stdin.exp) | + cut -d " " -f 2 > ../stdin.act && + cd .. && + test_cmp stdin.exp stdin.act +' + + +# remove final newline and insert random spaces, NULs, and empty lines +head -c -1 stdin.exp | sed -e ' + 1i + 2s/^\|$/ /g + 4s/^/ / + 6s/$/ / + 8s/$/\n / + 9s/$/Ztrailing garbage/ + 9s/^\|$/ /g +' | tr "Z" "\000" > stdin.spaces + +test_expect_success 'ignore leading/trailing spaces, empty lines and NULs' ' + cd client && + cat ../stdin.spaces | + git fetch-pack --stdin --no-progress .. | + cut -d " " -f 2 > ../stdin.act && + cd .. && + test_cmp stdin.exp stdin.act +' + + +packet_write() { printf "%04x%s\n" $((${#1} + 5)) "$1"; } +packet_flush() { printf "0000"; } + +cat stdin.exp | while read ref; do + packet_write $ref +done | tee stdin.pkt.bad | +(packet_flush; packet_write garbage) > stdin.pkt.good + +echo "fatal: protocol error: expected sha/ref, got 'garbage'" > stdin.pkt.good.exp +echo "fatal: The remote end hung up unexpectedly" > stdin.pkt.bad.exp + +test_expect_success 'refs on stdin in packetized format' ' + cd client && + cat ../stdin.pkt.good | + test_must_fail git fetch-pack --stdin --stateless-rpc .. \ + 2> ../stdin.pkt.good.act && + cd .. && + test_cmp stdin.pkt.good.exp stdin.pkt.good.act +' + +test_expect_success 'fail if flush packet is missing' ' + cd client && + cat ../stdin.pkt.bad | + test_must_fail git fetch-pack --stdin --stateless-rpc .. \ + 2> ../stdin.pkt.bad.act && + cd .. && + test_cmp stdin.pkt.bad.exp stdin.pkt.bad.act +' + test_done -- 1.7.9.5.4.g4f508 ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option 2012-03-27 6:27 ` [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option Ivan Todoroski @ 2012-03-27 17:40 ` Junio C Hamano 2012-03-27 23:36 ` Ivan Todoroski 0 siblings, 1 reply; 54+ messages in thread From: Junio C Hamano @ 2012-03-27 17:40 UTC (permalink / raw) To: Ivan Todoroski Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git Ivan Todoroski <grnch@gmx.net> writes: > These test cases focus only on testing the parsing of refs on stdin, > without bothering with the rest of the fetch-pack machinery. We pass in > the refs using different combinations of command line and stdin and then > we watch fetch-pack's stdout to see whether it prints all the refs we > specified and in the same order as we specified them. Thanks for writing what it does concisely and clearly. It makes it very pleasant to review the patch. It is sensible to expect that we see all the refs we told it to fetch, but I do not think it is sensible to require they come in the same order as we have given them to the command. > For the --stateless-rpc tests we cannot easily execute fetch-pack to the > end because that would require simulating the remote protocol. We settle > for only checking 2 cases: > > 1) Whether fetch-pack correctly fails to parse the refs if they are not > terminated by a flush packet > > 2) Whether fetch-pack finishes parsing the refs without error when they > are correctly terminated by a flush packet > > The fetch-pack invocation fails in both cases due to the missing remote > side of the protocol, but it fails in different ways which allows us to > determine how the refs parsing ended by inspecting the different error > messages. Ick. > --- > t/t5500-fetch-pack.sh | 100 +++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 100 insertions(+) > > diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh > index 9bf69e9a0f..f4de7d07c1 100755 > --- a/t/t5500-fetch-pack.sh > +++ b/t/t5500-fetch-pack.sh > @@ -248,4 +248,104 @@ test_expect_success 'clone shallow object count' ' > grep "^count: 52" count.shallow > ' > > + > +cat >stdin.exp <<EOF > +refs/heads/C > +refs/heads/A > +refs/heads/D > +refs/tags/C > +refs/heads/B > +refs/tags/A > +refs/heads/E > +refs/tags/B > +refs/tags/E > +refs/tags/D > +EOF > + > +test_expect_success 'setup tests for --stdin' ' > + for head in C D E F; do > + add $head > + done && > + for head in A B C D E F; do > + git tag $head $head > + done > +' Drop ";" and write "do" on its own line with proper indentation. > + > +test_expect_success 'fetch refs from cmdline, make sure it still works OK' ' > + cd client && > + git fetch-pack --no-progress .. $(cat ../stdin.exp) | > + cut -d " " -f 2 > ../stdin.act && > + cd .. && > + test_cmp stdin.exp stdin.act > +' - Do not chdir around without being in a subprocess (); - Do not place the command you are testing that might crash on the upstream of the pipe; - style; i.e. ( cd client && git fetch-pack --no-progress .. $(cat ../stdin.exp) >../stdin.raw ) && cut -d " " -f 2 <stdin.raw | sort >stdin.act && test_cmp stdin.exp stdin.act Note that I lifted the "in the same order" requirement, which should not be there. You may need to adjust the hardcoded stdin.exp to be sorted. > +test_expect_success 'fetch refs from stdin' ' > + cd client && > + cat ../stdin.exp | > + git fetch-pack --stdin --no-progress .. | > + cut -d " " -f 2 > ../stdin.act && > + cd .. && > + test_cmp stdin.exp stdin.act > +' In addition to the comments on the previous one: - Do not pipe output from cat. i.e. ( cd client && git fetch-pack ... <../stdin.exp >stdin.raw ) && cut -d " " -f 2 <stdin.raw | sort >stdin.act && test_cmp stdin.exp stdin.act By the way, why are these not called "expect" and "actual" like most other tests? > +test_expect_success 'fetch mixed refs from cmdline and stdin in right order' ' > + cd client && > + tail -n +5 ../stdin.exp | > + git fetch-pack --stdin --no-progress .. $(head -n 4 ../stdin.exp) | > + cut -d " " -f 2 > ../stdin.act && > + cd .. && > + test_cmp stdin.exp stdin.act > +' Ditto. Do we want to have a separate test to see what happens when there are dups in the input? > +# remove final newline and insert random spaces, NULs, and empty lines > +head -c -1 stdin.exp | sed -e ' > + 1i > + 2s/^\|$/ /g > + 4s/^/ / > + 6s/$/ / > + 8s/$/\n / > + 9s/$/Ztrailing garbage/ > + 9s/^\|$/ /g > +' | tr "Z" "\000" > stdin.spaces Somebody may want to try this sed expression on Macs and BSDs, but more importantly... > +test_expect_success 'ignore leading/trailing spaces, empty lines and NULs' ' > + cd client && > + cat ../stdin.spaces | > + git fetch-pack --stdin --no-progress .. | > + cut -d " " -f 2 > ../stdin.act && > + cd .. && > + test_cmp stdin.exp stdin.act > +' ... it is unclear why it is a good thing to let the input with garbage produce a result instead of erroring out. The rest of the patch skimmed but not thoroughly reviewed; seems to have similar issues as already have been pointed out above. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option 2012-03-27 17:40 ` Junio C Hamano @ 2012-03-27 23:36 ` Ivan Todoroski 2012-03-27 23:43 ` Junio C Hamano 2012-03-28 0:14 ` Ivan Todoroski 0 siblings, 2 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 23:36 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 27.03.2012 19:40, Junio C Hamano wrote: > > It is sensible to expect that we see all the refs we told it to fetch, but > I do not think it is sensible to require they come in the same order as we > have given them to the command. Oh, OK. I was not aware that fetch-pack is free to reorder commits (at least in principle). I will adjust the tests to be order-independent. >> For the --stateless-rpc tests we cannot easily execute fetch-pack to the >> end because that would require simulating the remote protocol. We settle >> for only checking 2 cases: >> >> 1) Whether fetch-pack correctly fails to parse the refs if they are not >> terminated by a flush packet >> >> 2) Whether fetch-pack finishes parsing the refs without error when they >> are correctly terminated by a flush packet >> >> The fetch-pack invocation fails in both cases due to the missing remote >> side of the protocol, but it fails in different ways which allows us to >> determine how the refs parsing ended by inspecting the different error >> messages. > > Ick. Yeah... I couldn't figure out a way to do an isolated test of the packetized version of --stdin when --stateless-rpc is also in effect. Any guidance here would be welcome. On second thought, maybe we can just drop these two --stateless-rpc tests from this patch? The "git clone" test in the next patch also exercises the packetized refs in --stateless-rpc mode and if there was anything wrong with them it would fail. >> + >> +test_expect_success 'fetch refs from cmdline, make sure it still works OK' ' >> + cd client && >> + git fetch-pack --no-progress .. $(cat ../stdin.exp) | >> + cut -d " " -f 2 > ../stdin.act && >> + cd .. && >> + test_cmp stdin.exp stdin.act >> +' > > - Do not chdir around without being in a subprocess (); Sorry, I didn't realize the tests were eval-ed in the current environment. I will correct all such problems in the next version. > - Do not place the command you are testing that might crash on the > upstream of the pipe; > > - style; Noted. > ( > cd client && > git fetch-pack ... <../stdin.exp >stdin.raw > ) && > cut -d " " -f 2 <stdin.raw | sort >stdin.act && > test_cmp stdin.exp stdin.act > > By the way, why are these not called "expect" and "actual" like most other > tests? The test files I worked with used the shorter exp/act convention so I followed that. Or are you wondering about the "stdin." prefix I added? That was because I didn't want to overwrite any exp/act files that might be created by the previous tests, or is that not a concern? Sorry, my inexperience with the Git test framework is showing... > Do we want to have a separate test to see what happens when there are dups > in the input? Good idea. I will check what happens in the current fetch-pack if duplicate refs are passed on the command line and will write a test to ensure that the --stdin version does the same thing. P.S. It goes without saying that I implicitly accept all the corrections that I did not reply to and will include them in the next version. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option 2012-03-27 23:36 ` Ivan Todoroski @ 2012-03-27 23:43 ` Junio C Hamano 2012-03-28 0:14 ` Ivan Todoroski 1 sibling, 0 replies; 54+ messages in thread From: Junio C Hamano @ 2012-03-27 23:43 UTC (permalink / raw) To: Ivan Todoroski Cc: Junio C Hamano, Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git Ivan Todoroski <grnch@gmx.net> writes: > On second thought, maybe we can just drop these two --stateless-rpc > tests from this patch? The "git clone" test in the next patch also > exercises the packetized refs in --stateless-rpc mode and if there was > anything wrong with them it would fail. Yeah, I was thinking about the same. >>> + >>> +test_expect_success 'fetch refs from cmdline, make sure it still works OK' ' >>> + cd client && >>> + git fetch-pack --no-progress .. $(cat ../stdin.exp) | >>> + cut -d " " -f 2 > ../stdin.act && >>> + cd .. && >>> + test_cmp stdin.exp stdin.act >>> +' >> >> - Do not chdir around without being in a subprocess (); > > Sorry, I didn't realize the tests were eval-ed in the current > environment. I will correct all such problems in the next version. > >> - Do not place the command you are testing that might crash on the >> upstream of the pipe; >> >> - style; > > Noted. > >> ( >> cd client && >> git fetch-pack ... <../stdin.exp >stdin.raw >> ) && >> cut -d " " -f 2 <stdin.raw | sort >stdin.act && >> test_cmp stdin.exp stdin.act >> >> By the way, why are these not called "expect" and "actual" like most other >> tests? > > The test files I worked with used the shorter exp/act convention so I > followed that. > > Or are you wondering about the "stdin." prefix I added? No. I was referring to just that these two files were not literally named "expect"/"actual", and I was lazy to look beyond what was in the patch context ;-). If the surrounding tests uses exp/act, mimicking them in this patch is a good idea (we may want to fix them later but that is a separate topic, and should not be done in this patch). Thanks. ^ permalink raw reply [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option 2012-03-27 23:36 ` Ivan Todoroski 2012-03-27 23:43 ` Junio C Hamano @ 2012-03-28 0:14 ` Ivan Todoroski 1 sibling, 0 replies; 54+ messages in thread From: Ivan Todoroski @ 2012-03-28 0:14 UTC (permalink / raw) To: Junio C Hamano Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git On 28.03.2012 01:36, Ivan Todoroski wrote: > On 27.03.2012 19:40, Junio C Hamano wrote: >> >> It is sensible to expect that we see all the refs we told it to >> fetch, but I do not think it is sensible to require they come in >> the same order as we have given them to the command. > > Oh, OK. I was not aware that fetch-pack is free to reorder commits > (at least in principle). I will adjust the tests to be > order-independent. s/reorder commits/reorder refs for fetching/ I didn't mean actually reordering commits in history, of course. ^ permalink raw reply [flat|nested] 54+ messages in thread
* [PATCH/RFC v2 4/4] remote-curl: main test case for the OS command line overflow 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski ` (2 preceding siblings ...) 2012-03-27 6:27 ` [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option Ivan Todoroski @ 2012-03-27 6:28 ` Ivan Todoroski 2012-03-27 17:43 ` Junio C Hamano 3 siblings, 1 reply; 54+ messages in thread From: Ivan Todoroski @ 2012-03-27 6:28 UTC (permalink / raw) To: Jeff King; +Cc: Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git This is main test case for the original problem that triggered this patch series. We create a repo with 50k tags and then test whether git-clone over the smart HTTP protocol succeeds. Note that we construct the repo in a slightly different way than the original script used to reproduce the problem. This is because the original script just created 50k tags all pointing to the same commit, so if there was a bug where remote-curl.c was not passing all the refs to fetch-pack we wouldn't know. The clone would succed even if only one tag was passed, because all the other tags were pointing at the same SHA and would be considered present. Instead we create a repo with 50k independent (dangling) commits and then tag each of those commits with a unique tag. This way if one of the tags is not given to fetch-pack, later stages of the clone would complain about it. This allows us to test both that the command line overflow was fixed, as well as that it was fixed in a way that doesn't leave out any of the refs. --- t/t5551-http-fetch.sh | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/t/t5551-http-fetch.sh b/t/t5551-http-fetch.sh index 26d355725f..fb970bda22 100755 --- a/t/t5551-http-fetch.sh +++ b/t/t5551-http-fetch.sh @@ -109,5 +109,37 @@ test_expect_success 'follow redirects (302)' ' git clone $HTTPD_URL/smart-redir-temp/repo.git --quiet repo-t ' + +test -n "$GIT_TEST_LONG" && test_set_prereq EXPENSIVE + +test_expect_success EXPENSIVE 'create 50,000 tags in the repo' ' + ( + cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && + N=50000 && + for ((i=1; i<=$N; i++)); do + echo "commit refs/heads/too-many-refs" + echo "mark :$i" + echo "committer git <git@example.com> $i +0000" + echo "data 0" + echo "M 644 inline bla.txt" + echo "data 4" + echo "bla" + # make every commit dangling by always + # rewinding the branch after each commit + echo "reset refs/heads/too-many-refs" + echo "from :1" + done | git fast-import --export-marks=marks && + + # now assign tags to all the dangling commits we created above + export tag=refs/tags/artificially-long-tag-name-to-more-easily-demonstrate-the-problem && + perl -nle '\''/:(.+) (.+)/; print "$2 $ENV{tag}-$1"'\'' < marks >> packed-refs + ) +' + +test_expect_success EXPENSIVE 'clone the 50,000 tag repo to check OS command line overflow' ' + git clone $HTTPD_URL/smart/repo.git too-many-refs 2> too-many-refs.err && + test_line_count = 0 too-many-refs.err +' + stop_httpd test_done -- 1.7.9.5.4.g4f508 ^ permalink raw reply related [flat|nested] 54+ messages in thread
* Re: [PATCH/RFC v2 4/4] remote-curl: main test case for the OS command line overflow 2012-03-27 6:28 ` [PATCH/RFC v2 4/4] remote-curl: main test case for the OS command line overflow Ivan Todoroski @ 2012-03-27 17:43 ` Junio C Hamano 0 siblings, 0 replies; 54+ messages in thread From: Junio C Hamano @ 2012-03-27 17:43 UTC (permalink / raw) To: Ivan Todoroski Cc: Jeff King, Shawn Pearce, Nguyen Thai Ngoc Duy, Jakub Narebski, git Ivan Todoroski <grnch@gmx.net> writes: > +test_expect_success EXPENSIVE 'create 50,000 tags in the repo' ' > + ( > + cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && > + N=50000 && > + for ((i=1; i<=$N; i++)); do That's bash-ism isn't it? N=50000 I=1 while test $I -le $N do ... I=$(( $I + 1 )) done The body of the test itself looked reasonable. Thanks. ^ permalink raw reply [flat|nested] 54+ messages in thread
end of thread, other threads:[~2012-03-28 0:14 UTC | newest] Thread overview: 54+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-03-18 8:14 Clone fails on a repo with too many heads/tags Ivan Todoroski 2012-03-18 11:37 ` Ivan Todoroski 2012-03-18 12:04 ` Nguyen Thai Ngoc Duy 2012-03-18 16:36 ` Jakub Narebski 2012-03-18 19:07 ` Jeff King 2012-03-18 22:07 ` Jakub Narebski 2012-03-19 2:32 ` Jeff King 2012-03-19 2:43 ` Nguyen Thai Ngoc Duy 2012-03-19 2:45 ` Jeff King 2012-03-19 1:05 ` Ivan Todoroski 2012-03-19 1:30 ` Nguyen Thai Ngoc Duy 2012-03-19 2:44 ` Jeff King 2012-03-21 11:05 ` Ivan Todoroski 2012-03-21 14:28 ` Shawn Pearce 2012-03-21 17:14 ` Jeff King 2012-03-21 17:59 ` Jakub Narebski 2012-03-21 20:02 ` Ivan Todoroski 2012-03-21 20:17 ` Jeff King 2012-03-24 20:49 ` Ivan Todoroski 2012-03-25 1:06 ` Jeff King 2012-03-25 2:32 ` Jeff King 2012-03-25 17:33 ` Ivan Todoroski 2012-03-25 17:54 ` Ivan Todoroski 2012-03-26 17:33 ` Jeff King 2012-03-27 7:07 ` Ivan Todoroski 2012-03-25 15:30 ` Ivan Todoroski 2012-03-24 20:53 ` [PATCH/RFC 1/2] fetch-pack: new option to read refs from stdin Ivan Todoroski 2012-03-25 1:19 ` Jeff King 2012-03-25 9:39 ` Ivan Todoroski 2012-03-25 15:15 ` Ivan Todoroski 2012-03-25 20:00 ` Ivan Todoroski 2012-03-26 17:21 ` Jeff King 2012-03-26 17:49 ` Ivan Todoroski 2012-03-26 17:51 ` Jeff King 2012-03-24 20:54 ` [PATCH/RFC 2/2] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski 2012-03-25 1:24 ` Jeff King 2012-03-25 9:52 ` Ivan Todoroski 2012-03-26 17:24 ` Jeff King 2012-03-27 6:23 ` [PATCH/RFC v2 0/4] Fix fetch-pack command line overflow during clone Ivan Todoroski 2012-03-27 6:25 ` [PATCH/RFC v2 1/4] fetch-pack: new --stdin option to read refs from stdin Ivan Todoroski 2012-03-27 16:59 ` Junio C Hamano 2012-03-27 23:18 ` Ivan Todoroski 2012-03-27 23:26 ` Junio C Hamano 2012-03-27 23:48 ` Ivan Todoroski 2012-03-27 6:26 ` [PATCH/RFC v2 2/4] remote-curl: send the refs to fetch-pack on stdin Ivan Todoroski 2012-03-27 17:18 ` Junio C Hamano 2012-03-27 23:20 ` Ivan Todoroski 2012-03-27 6:27 ` [PATCH/RFC v2 3/4] fetch-pack: test cases for the new --stdin option Ivan Todoroski 2012-03-27 17:40 ` Junio C Hamano 2012-03-27 23:36 ` Ivan Todoroski 2012-03-27 23:43 ` Junio C Hamano 2012-03-28 0:14 ` Ivan Todoroski 2012-03-27 6:28 ` [PATCH/RFC v2 4/4] remote-curl: main test case for the OS command line overflow Ivan Todoroski 2012-03-27 17:43 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).