git.vger.kernel.org archive mirror
* problem cloning via http since v1.6.6-rc0
@ 2010-01-21  0:47 Yaroslav Halchenko
  2010-01-21  1:34 ` Tay Ray Chuan
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Yaroslav Halchenko @ 2010-01-21  0:47 UTC (permalink / raw)
  To: git

Dear Git Developers,

Some users of our project recently started to complain that they could not
clone the repository via http. git:// wasn't an option for them because of
heavy firewalling, and http:// is the protocol some distributions (e.g.
macports) use to fetch sources.

Cloning of the repository works fine with v1.6.5.7 but fails with v1.6.6-rc0.
I haven't done a full bisection since the repository is relatively bulky and
the poor server is quite loaded anyway, so I hoped you could get a clue
without going brute-force.  Here are the details: the failing operation
fails immediately:

$> GIT_TRACE=2 ./git clone http://git.debian.org/git/pkg-exppsy/pymvpa.git
trace: built-in: git 'clone' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
warning: templates not found /home/yoh/share/git-core/templates
Initialized empty Git repository in /home/yoh/proj/misc/git/pymvpa/.git/
trace: run_command: 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
trace: exec: 'git' 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
trace: exec: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
trace: run_command: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

On the server, git 1.6.3.3 was used to run git
update-server-info.

Thanks in advance
-- 
Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  0:47 problem cloning via http since v1.6.6-rc0 Yaroslav Halchenko
@ 2010-01-21  1:34 ` Tay Ray Chuan
  2010-01-21  1:36 ` Tay Ray Chuan
  2010-01-21  5:08 ` Ilari Liusvaara
  2 siblings, 0 replies; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21  1:34 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: git

Hi,

On Thu, Jan 21, 2010 at 8:47 AM, Yaroslav Halchenko
<debian@onerussian.com> wrote:
> Cloning of the repository works fine with v1.6.5.7 but fails with v1.6.6-rc0.

This sounds like it is around the time the smart http protocol was introduced.

> fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?
>
> on the server, 1.6.3.3 version of git was used to run git
> update-server-info.

hmm, are you using the WebDAV-flavour or the smart http protocol to
host the repository?

-- 
Cheers,
Ray Chuan


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  0:47 problem cloning via http since v1.6.6-rc0 Yaroslav Halchenko
  2010-01-21  1:34 ` Tay Ray Chuan
@ 2010-01-21  1:36 ` Tay Ray Chuan
  2010-01-21  2:33   ` Yaroslav Halchenko
  2010-01-21  5:08 ` Ilari Liusvaara
  2 siblings, 1 reply; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21  1:36 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: git

On Thu, Jan 21, 2010 at 8:47 AM, Yaroslav Halchenko
<debian@onerussian.com> wrote:
> $> GIT_TRACE=2 ./git clone http://git.debian.org/git/pkg-exppsy/pymvpa.git
> trace: built-in: git 'clone' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> warning: templates not found /home/yoh/share/git-core/templates
> Initialized empty Git repository in /home/yoh/proj/misc/git/pymvpa/.git/
> trace: run_command: 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> trace: exec: 'git' 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> trace: exec: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> trace: run_command: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

oh, and by the way, could you also run this again with GIT_CURL_VERBOSE=1?

-- 
Cheers,
Ray Chuan


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  1:36 ` Tay Ray Chuan
@ 2010-01-21  2:33   ` Yaroslav Halchenko
  2010-01-21  4:01     ` Tay Ray Chuan
  0 siblings, 1 reply; 28+ messages in thread
From: Yaroslav Halchenko @ 2010-01-21  2:33 UTC (permalink / raw)
  To: Tay Ray Chuan; +Cc: git

$> GIT_CURL_VERBOSE=1 GIT_TRACE=2 ./git clone http://git.debian.org/git/pkg-exppsy/pymvpa.git 
trace: built-in: git 'clone' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
warning: templates not found /home/yoh/share/git-core/templates
Initialized empty Git repository in /home/yoh/proj/misc/git/pymvpa/.git/
trace: run_command: 'remote-http' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
trace: exec: 'git' 'remote-http' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
trace: exec: 'git-remote-http' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
trace: run_command: 'git-remote-http' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
* Couldn't find host git.debian.org in the .netrc file; using defaults
* About to connect() to git.debian.org port 80 (#0)
*   Trying 217.196.43.134... * Connected to git.debian.org (217.196.43.134) port 80 (#0)
> GET /git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack HTTP/1.1
User-Agent: git/1.6.6.267.g5b159
Host: git.debian.org
Accept: */*
Pragma: no-cache

* The requested URL returned error: 404
* Closing connection #0
fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

As for smart vs DAV: I don't see any smart alias handling in the Apache
configuration. (I had no clue about smart http in git; I just looked at the
Apache template and saw the smart aliases there. Is there something else to
check within the webserver config?)

On Thu, 21 Jan 2010, Tay Ray Chuan wrote:

> On Thu, Jan 21, 2010 at 8:47 AM, Yaroslav Halchenko
> <debian@onerussian.com> wrote:
> > $> GIT_TRACE=2 ./git clone http://git.debian.org/git/pkg-exppsy/pymvpa.git
> > trace: built-in: git 'clone' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> > warning: templates not found /home/yoh/share/git-core/templates
> > Initialized empty Git repository in /home/yoh/proj/misc/git/pymvpa/.git/
> > trace: run_command: 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> > trace: exec: 'git' 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> > trace: exec: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> > trace: run_command: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> > fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

> oh, and by the way, could you also run this again with GIT_CURL_VERBOSE=1?
-- 
Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  2:33   ` Yaroslav Halchenko
@ 2010-01-21  4:01     ` Tay Ray Chuan
  2010-01-21  4:38       ` Yaroslav Halchenko
  0 siblings, 1 reply; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21  4:01 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: git

Hi,

On Thu, Jan 21, 2010 at 10:33 AM, Yaroslav Halchenko
<debian@onerussian.com> wrote:
> *   Trying 217.196.43.134... * Connected to git.debian.org (217.196.43.134) port 80 (#0)
>> GET /git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack HTTP/1.1
> User-Agent: git/1.6.6.267.g5b159
> Host: git.debian.org
> Accept: */*
> Pragma: no-cache
>
> * The requested URL returned error: 404
> * Closing connection #0
> fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

I don't think git's at fault here, as we're getting a 404 Not Found.
Could you check that the repository (the one the url points to, after
taking into account any url rewriting) is a bare one, i.e. has the structure

  pkg-exppsy/
  |-pymvpa.git
    |-objects/
    |-info/
    |-refs/
    |-...

rather than the non-bare

  pkg-exppsy/
  |-pymvpa.git
    |-.git
      |-objects/
      |-info/
      |-refs/
      |-...

?

> as for smart vs DAV -- don't see any smart alias handling in apache
> configuration (I have/had no clue about some smart http in git, just looked at
> apache template and saw smart aliases -- is there smth else to check within
> webserver config?)

If that's the case, I don't think it's related to your problem. (Btw,
"smart" refers to the http protocol that git can use to sync your
repo, via a CGI program on the server, instead of WebDAV. See
git-http-backend(1) for details.)

-- 
Cheers,
Ray Chuan


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  4:01     ` Tay Ray Chuan
@ 2010-01-21  4:38       ` Yaroslav Halchenko
  0 siblings, 0 replies; 28+ messages in thread
From: Yaroslav Halchenko @ 2010-01-21  4:38 UTC (permalink / raw)
  To: Tay Ray Chuan; +Cc: git


On Thu, 21 Jan 2010, Tay Ray Chuan wrote:
> > fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?
> I don't think git's at fault here, as we're getting a 404 Not Found.
khe khe, pardon me, but if git can't talk to its nephew (i.e. its own
repository, which was created with an earlier version), whereas another
nephew (i.e. git of an earlier version) can talk to it, I do consider it to
be git's fault ;)

let me zoom in on the difference in communication between the two versions:

*   Trying 217.196.43.134... * Connected to git.debian.org (217.196.43.134) port 80 (#0)
> GET /git/pkg-exppsy/pymvpa.git/info/refs HTTP/1.1
User-Agent: git/1.6.5
Host: git.debian.org
Accept: */*
Pragma: no-cache

< HTTP/1.1 200 OK

whereas, once again, for 1.6.6 it looked much shorter:

*   Trying 217.196.43.134... * Connected to git.debian.org (217.196.43.134) port 80 (#0)
> GET /git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack HTTP/1.1
User-Agent: git/1.6.6.267.g5b159
Host: git.debian.org
Accept: */*
Pragma: no-cache

* The requested URL returned error: 404
* Closing connection #0
fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?


> Could you check that the repository (the one the url points to, after
> taking into any url rewriting) is a bare one, ie. has structure
yes - it is bare... I was 101% sure (isn't it a convention to have a
.git suffix for directories with bare repositories?), but just to make
sure:

$> cd /srv/git.debian.org/git/pkg-exppsy/pymvpa.git/
total 76
 8 branches/   8 config   8 description   8 HEAD   8 hooks/   8 info/   8 objects/  12 packed-refs   8 refs/


> If that's the case, I don't think it's related to your problem. (Btw,
> "smart" refers to the http protocol that git can use to sync your
> repo, via a CGI program on the server, instead of WebDAV. See
> git-http-backend(1) for details.)
thanks for the info, I didn't know about this beastie.

I see no references to git-http-backend in apache config -- so indeed
should not be the case... but from the git-http-backend description:

,---
| By default, only the `upload-pack` service is enabled, which serves
| 'git-fetch-pack' and 'git-ls-remote' clients, which are invoked from
| 'git-fetch', 'git-pull', and 'git-clone'.
`---

so, it looks like 1.6.6 for some reason assumes that the server speaks "smart"
http even when it does not?  is that the case here?

-- 
Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  0:47 problem cloning via http since v1.6.6-rc0 Yaroslav Halchenko
  2010-01-21  1:34 ` Tay Ray Chuan
  2010-01-21  1:36 ` Tay Ray Chuan
@ 2010-01-21  5:08 ` Ilari Liusvaara
  2010-01-21  6:47   ` Tay Ray Chuan
  2 siblings, 1 reply; 28+ messages in thread
From: Ilari Liusvaara @ 2010-01-21  5:08 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: git, Shawn O. Pearce

On Wed, Jan 20, 2010 at 07:47:56PM -0500, Yaroslav Halchenko wrote:

Added spearce to cc.

> Dear Git Developers,
> 
> Some users of our project started recently to complain that they could not
> clone the repository via http (git:// wasn't a choice due to heavy firewalling)
> and because http:// was used as a protocol to get sources in some distributions
> (e.g. macports).
> 
> Cloning of the repository works fine with v1.6.5.7 but fails with v1.6.6-rc0.
> I haven't done full bisection since that repository is relatively bulky and
> poor server is quite loaded anyways, so I thought you just would get a clue
> without going brute-force.  But here are the details:  in case of failing
> operation, I immediately get failure:
> 
> $> GIT_TRACE=2 ./git clone http://git.debian.org/git/pkg-exppsy/pymvpa.git
> trace: built-in: git 'clone' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> warning: templates not found /home/yoh/share/git-core/templates
> Initialized empty Git repository in /home/yoh/proj/misc/git/pymvpa/.git/
> trace: run_command: 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> trace: exec: 'git' 'remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> trace: exec: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> trace: run_command: 'git-remote-curl' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git' 'http://git.debian.org/git/pkg-exppsy/pymvpa.git'
> fatal: http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack not found: did you run git update-server-info on the server?

Looks like remote-curl (which handles http) issues request for:

'.../info/refs?service=git-upload-pack'

and expects that, if there is no smart HTTP server there, the request will be
interpreted as:

'.../info/refs'

(i.e. webserver would ignore the query). This isn't true for git.debian.org.
Requesting the latter works (and the data formatting looks sane), but the
former is 404. This causes the fetch to fail.
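
To make the two requests concrete, here is a minimal C sketch of how the two
discovery URLs differ. It is not git's actual code: build_refs_url is a
hypothetical helper, and passing NULL as the service is only this sketch's way
of asking for the plain form.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): write the ref-discovery URL for
 * `base` into `out`.  With a non-NULL `service` a query is appended, using
 * '&' instead of '?' when the base URL already contains a query.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int build_refs_url(char *out, size_t outsz,
                          const char *base, const char *service)
{
    int n;

    if (service == NULL)
        n = snprintf(out, outsz, "%s/info/refs", base);
    else
        n = snprintf(out, outsz, "%s/info/refs%cservice=%s", base,
                     strchr(base, '?') ? '&' : '?', service);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Called with the repository URL from this thread and "git-upload-pack", it
yields the URL that returns 404; called with NULL, it yields the one that
works.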

-Ilari


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  5:08 ` Ilari Liusvaara
@ 2010-01-21  6:47   ` Tay Ray Chuan
  2010-01-21  7:51     ` Tay Ray Chuan
  2010-01-21 10:35     ` problem cloning via http since v1.6.6-rc0 Ilari Liusvaara
  0 siblings, 2 replies; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21  6:47 UTC (permalink / raw)
  To: Ilari Liusvaara; +Cc: Yaroslav Halchenko, git, Shawn O. Pearce

Hi,

On Thu, Jan 21, 2010 at 1:08 PM, Ilari Liusvaara
<ilari.liusvaara@elisanet.fi> wrote:
> Looks like remote-curl (which handles http) issues request for:
>
> '.../info/refs?service=git-upload-pack'
>
> And expects that if there is no smart HTTP server there for the request to be
> interpretted as:
>
> '.../info/refs'
>
> (i.e. webserver would ignore the query). This isn't true for git.debian.org.
> Requesting the latter works (and the data formatting looks sane), but the
> former is 404. This causes the fetch to fail.

afaik, putting a "?var1=val1&var2=...." still makes it a normal GET
request, even if the url requested is just a plain file and not some
cgi handler that uses those variables/values.

--
Cheers,
Ray Chuan


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  6:47   ` Tay Ray Chuan
@ 2010-01-21  7:51     ` Tay Ray Chuan
  2010-01-21 14:00       ` Yaroslav Halchenko
  2010-01-21 10:35     ` problem cloning via http since v1.6.6-rc0 Ilari Liusvaara
  1 sibling, 1 reply; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21  7:51 UTC (permalink / raw)
  To: Yaroslav Halchenko; +Cc: Ilari Liusvaara, Shawn O. Pearce, git

Hi,

On Thu, 21 Jan 2010 14:47:37 +0800
Tay Ray Chuan <rctay89@gmail.com> wrote:
> On Thu, Jan 21, 2010 at 1:08 PM, Ilari Liusvaara
> <ilari.liusvaara@elisanet.fi> wrote:
> > Looks like remote-curl (which handles http) issues request for:
> >
> > '.../info/refs?service=git-upload-pack'
> >
> > And expects that if there is no smart HTTP server there for the request to be
> > interpretted as:
> >
> > '.../info/refs'
> >
> > (i.e. webserver would ignore the query). This isn't true for git.debian.org.
> > Requesting the latter works (and the data formatting looks sane), but the
> > former is 404. This causes the fetch to fail.
> 
> afaik, putting a "?var1=val1&var2=...." still makes it a normal GET
> request, even if the url requested is just a plain file and not some
> cgi handler that uses those variables/values.

Yaroslav, sorry for making you run in circles - it really is git's
fault (sorta).

In recent versions of git, we send the GET request for
info/refs with a query string (?service=<service name>). I'm not sure
why, but your server is not playing nice when the query string is
appended.

Could you try this patch and see if it solves the issue? I managed to
clone your repo successfully with it.

-- 
Cheers,
Ray Chuan

-->8--
Subject: [PATCH] http/remote-curl: coddle picky servers

When "info/refs" is a static file and not behind a CGI handler, some
servers may not handle a GET request for it with a query string
appended (eg. "?foo=bar") properly.

If such a request fails, retry it sans the query string, and also
discount the possibility of using the "smart" protocol (since no
service is specified with "?service=<service name>").

Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
---
 remote-curl.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 1361006..a904164 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -102,7 +102,7 @@ static struct discovery* discover_refs(const char *service)
 	struct strbuf buffer = STRBUF_INIT;
 	struct discovery *last = last_discovery;
 	char *refs_url;
-	int http_ret, is_http = 0;
+	int http_ret, is_http = 0, proto_git_candidate = 1;
 
 	if (last && !strcmp(service, last->service))
 		return last;
@@ -121,6 +121,19 @@ static struct discovery* discover_refs(const char *service)
 
 	init_walker();
 	http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
+
+	/* try again with "plain" url (no ? or & appended) */
+	if (http_ret != HTTP_OK) {
+		free(refs_url);
+		strbuf_reset(&buffer);
+
+		proto_git_candidate = 0;
+		strbuf_addf(&buffer, "%s/info/refs", url);
+		refs_url = strbuf_detach(&buffer, NULL);
+
+		http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
+	}
+
 	switch (http_ret) {
 	case HTTP_OK:
 		break;
@@ -137,7 +150,8 @@ static struct discovery* discover_refs(const char *service)
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-	if (is_http && 5 <= last->len && last->buf[4] == '#') {
+	if (is_http && proto_git_candidate
+		&& 5 <= last->len && last->buf[4] == '#') {
 		/* smart HTTP response; validate that the service
 		 * pkt-line matches our request.
 		 */
-- 
1.6.6.1.337.g96bc8
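
The retry logic in the patch can be modelled outside of git roughly as
follows. This is a simplified sketch under stated assumptions, not the patched
function: discover, fetch_fn, picky_server and lenient_server are hypothetical
names, the callbacks stand in for a real HTTP request, and any status other
than 200 is treated as a failure.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns the HTTP status code for a GET of `url`. */
typedef int (*fetch_fn)(const char *url);

struct discovery_result {
    int status;          /* status of the last request that was made */
    int smart_candidate; /* 1 only if the request with the query succeeded */
};

static struct discovery_result discover(const char *base, const char *service,
                                        fetch_fn fetch)
{
    struct discovery_result r = { 0, 1 };
    char url[1024];

    /* First attempt: info/refs with the service query appended. */
    snprintf(url, sizeof url, "%s/info/refs?service=%s", base, service);
    r.status = fetch(url);

    if (r.status != 200) {
        /* Retry with the plain URL.  No service was named in this request,
         * so the reply must not be treated as a "smart" one. */
        r.smart_candidate = 0;
        snprintf(url, sizeof url, "%s/info/refs", base);
        r.status = fetch(url);
    }
    return r;
}

/* Stand-in for a server that rejects any URL carrying a query string. */
static int picky_server(const char *url)
{
    return strchr(url, '?') ? 404 : 200;
}

/* Stand-in for a server that ignores the query on a static file. */
static int lenient_server(const char *url)
{
    (void)url;
    return 200;
}
```

Against picky_server the second request succeeds and smart_candidate ends up
0; against lenient_server only one request is made and smart_candidate stays
1. The cost of the fallback is one extra failed request per discovery on picky
servers.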


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  6:47   ` Tay Ray Chuan
  2010-01-21  7:51     ` Tay Ray Chuan
@ 2010-01-21 10:35     ` Ilari Liusvaara
  2010-01-21 11:36       ` Tay Ray Chuan
  1 sibling, 1 reply; 28+ messages in thread
From: Ilari Liusvaara @ 2010-01-21 10:35 UTC (permalink / raw)
  To: Tay Ray Chuan; +Cc: Yaroslav Halchenko, git, Shawn O. Pearce

On Thu, Jan 21, 2010 at 02:47:37PM +0800, Tay Ray Chuan wrote:
> Hi,
> 
> On Thu, Jan 21, 2010 at 1:08 PM, Ilari Liusvaara
> >
> > (i.e. webserver would ignore the query). This isn't true for git.debian.org.
> > Requesting the latter works (and the data formatting looks sane), but the
> > former is 404. This causes the fetch to fail.
> 
> afaik, putting a "?var1=val1&var2=...." still makes it a normal GET
> request, even if the url requested is just a plain file and not some
> cgi handler that uses those variables/values.

Yes, it is a normal GET (POST would be something else). And whether it is CGI
doesn't come into play for the request, since the client decides whether to
send GET or POST and whether to include a query or not.

The query is just the technical name for the part between ? and # (or the end
of the HTTP URL), and can be present in any type of request that accepts an
http:// URL.

As said, code expects query part to be ignored if target is regular file
but broke when it didn't get ignored.
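
As a small illustration of that definition, the following C sketch locates the
query component of a URL string. url_query is a hypothetical helper written
for this note; it does plain string scanning and makes no attempt at full URL
parsing or validation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the query component of `url`, i.e. the text
 * after the first '?' and before '#' (or the end of the string).
 * Stores a pointer to its first byte in *start and returns its length.
 * If there is no '?' before any '#', *start is NULL and 0 is returned.
 */
static size_t url_query(const char *url, const char **start)
{
    size_t end = strcspn(url, "#");          /* offset of '#' or of the NUL */
    const char *q = memchr(url, '?', end);   /* only look before the '#' */

    if (q == NULL) {
        *start = NULL;
        return 0;
    }
    *start = q + 1;
    return end - (size_t)(q + 1 - url);
}
```

For the failing request in this thread the query is
"service=git-upload-pack"; for the plain ".../info/refs" URL there is none.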

-Ilari


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21 10:35     ` problem cloning via http since v1.6.6-rc0 Ilari Liusvaara
@ 2010-01-21 11:36       ` Tay Ray Chuan
  0 siblings, 0 replies; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21 11:36 UTC (permalink / raw)
  To: Ilari Liusvaara; +Cc: Yaroslav Halchenko, git, Shawn O. Pearce

Hi,

On Thu, Jan 21, 2010 at 6:35 PM, Ilari Liusvaara
<ilari.liusvaara@elisanet.fi> wrote:
> On Thu, Jan 21, 2010 at 02:47:37PM +0800, Tay Ray Chuan wrote:
>> Hi,
>>
>> On Thu, Jan 21, 2010 at 1:08 PM, Ilari Liusvaara
>> >
>> > (i.e. webserver would ignore the query). This isn't true for git.debian.org.
>> > Requesting the latter works (and the data formatting looks sane), but the
>> > former is 404. This causes the fetch to fail.
>>
>> afaik, putting a "?var1=val1&var2=...." still makes it a normal GET
>> request, even if the url requested is just a plain file and not some
>> cgi handler that uses those variables/values.
>
> Yes, it is normal GET (POST would be something else). And wheither it is CGI
> doesn't come into play for request since client decides wheither to send GET
> or POST and wheither to include query or not.
>
> Query is just technical name for part between ? and # (or end of HTTP URL),
> and can be present in any type of request that accepts http:// URL.

yes, indeed, I misread your message. Your idea of the query string
affecting the server response didn't strike me then.

-- 
Cheers,
Ray Chuan


* Re: problem cloning via http since v1.6.6-rc0
  2010-01-21  7:51     ` Tay Ray Chuan
@ 2010-01-21 14:00       ` Yaroslav Halchenko
  2010-01-21 14:41         ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
  0 siblings, 1 reply; 28+ messages in thread
From: Yaroslav Halchenko @ 2010-01-21 14:00 UTC (permalink / raw)
  To: Tay Ray Chuan; +Cc: Ilari Liusvaara, Shawn O. Pearce, git

Hi Tay Ray,

That patch works fine for me ;) I only hope it gets accepted into the
bugfix and next dev releases.  (I guess it might annoy some Apache
admins a bit by growing their error logs even for well
maintained repositories, but well -- that is life ;-) )

Thanks!
Yarik

On Thu, 21 Jan 2010, Tay Ray Chuan wrote:
> > afaik, putting a "?var1=val1&var2=...." still makes it a normal GET
> > request, even if the url requested is just a plain file and not some
> > cgi handler that uses those variables/values.

> Yaroslav, sorry for making you run in circles - it really is git's
> fault (sorta).

> In recent versions of git, we were sending out the GET request for
> info/refs with a query string (?serivce=<service name>). I'm not sure
> why, but your server is not playing nice when the query string is
> appended.

> Could you try this patch and see if it solves the issue? I manage to
> clone your repo successfully with it.
-- 
                                  .-.
=------------------------------   /v\  ----------------------------=
Keep in touch                    // \\     (yoh@|www.)onerussian.com
Yaroslav Halchenko              /(   )\               ICQ#: 60653192
                   Linux User    ^^-^^    [175555]


* [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 14:00       ` Yaroslav Halchenko
@ 2010-01-21 14:41         ` Tay Ray Chuan
  2010-01-21 15:56           ` Shawn O. Pearce
  0 siblings, 1 reply; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21 14:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Yaroslav Halchenko, Ilari Liusvaara,
	Shawn O. Pearce

When "info/refs" is a static file and not behind a CGI handler, some
servers may not handle a GET request for it with a query string
appended (eg. "?foo=bar") properly.

If such a request fails, retry it sans the query string. In addition,
ensure that the "smart" http protocol is not used (a service has to be
specified with "?service=<service name>" to be conformant).

Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Reported-and-tested-by: Yaroslav Halchenko <debian@onerussian.com>
---
 remote-curl.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 1361006..a904164 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -102,7 +102,7 @@ static struct discovery* discover_refs(const char *service)
 	struct strbuf buffer = STRBUF_INIT;
 	struct discovery *last = last_discovery;
 	char *refs_url;
-	int http_ret, is_http = 0;
+	int http_ret, is_http = 0, proto_git_candidate = 1;

 	if (last && !strcmp(service, last->service))
 		return last;
@@ -121,6 +121,19 @@ static struct discovery* discover_refs(const char *service)

 	init_walker();
 	http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
+
+	/* try again with "plain" url (no ? or & appended) */
+	if (http_ret != HTTP_OK) {
+		free(refs_url);
+		strbuf_reset(&buffer);
+
+		proto_git_candidate = 0;
+		strbuf_addf(&buffer, "%s/info/refs", url);
+		refs_url = strbuf_detach(&buffer, NULL);
+
+		http_ret = http_get_strbuf(refs_url, &buffer, HTTP_NO_CACHE);
+	}
+
 	switch (http_ret) {
 	case HTTP_OK:
 		break;
@@ -137,7 +150,8 @@ static struct discovery* discover_refs(const char *service)
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;

-	if (is_http && 5 <= last->len && last->buf[4] == '#') {
+	if (is_http && proto_git_candidate
+		&& 5 <= last->len && last->buf[4] == '#') {
 		/* smart HTTP response; validate that the service
 		 * pkt-line matches our request.
 		 */
--
1.6.6.1.337.g96bc8
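
The condition changed in the last hunk can be read as a small predicate. The
sketch below is an illustration only, not code from remote-curl.c: looks_smart
is a hypothetical name, and the sample buffers mentioned afterwards are
made-up inputs in the shape of a smart reply (a pkt-line whose payload starts
with "# service=") and of a static info/refs file (which starts with a hex
object name).

```c
#include <stddef.h>

/*
 * Hypothetical predicate mirroring the patched condition: a reply is only
 * considered "smart" if the request that produced it carried the service
 * query (proto_git_candidate) and the fifth byte, i.e. the first payload
 * byte after the 4-character pkt-line length, is '#'.
 */
static int looks_smart(const char *buf, size_t len, int proto_git_candidate)
{
    return proto_git_candidate && len >= 5 && buf[4] == '#';
}
```

With a buffer beginning "001e# service=git-upload-pack" the predicate is true
only when proto_git_candidate is set; with a buffer beginning with forty hex
digits it is false either way, since the fifth byte is a hex digit.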


* Re: [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 14:41         ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
@ 2010-01-21 15:56           ` Shawn O. Pearce
  2010-01-21 16:07             ` Mike Hommey
  0 siblings, 1 reply; 28+ messages in thread
From: Shawn O. Pearce @ 2010-01-21 15:56 UTC (permalink / raw)
  To: Tay Ray Chuan; +Cc: git, Junio C Hamano, Yaroslav Halchenko, Ilari Liusvaara

Tay Ray Chuan <rctay89@gmail.com> wrote:
> When "info/refs" is a static file and not behind a CGI handler, some
> servers may not handle a GET request for it with a query string
> appended (eg. "?foo=bar") properly.
> 
> If such a request fails, retry it sans the query string. In addition,
> ensure that the "smart" http protocol is not used (a service has to be
> specified with "?service=<service name>" to be conformant).
> 
> Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
> Reported-and-tested-by: Yaroslav Halchenko <debian@onerussian.com>

*grumble* stupid Apache *grumble*

Acked-by: Shawn O. Pearce <spearce@spearce.org>

-- 
Shawn.


* Re: [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 15:56           ` Shawn O. Pearce
@ 2010-01-21 16:07             ` Mike Hommey
  2010-01-21 16:10               ` git fetch -v not at all verbose? Michael S. Tsirkin
  2010-01-21 16:20               ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
  0 siblings, 2 replies; 28+ messages in thread
From: Mike Hommey @ 2010-01-21 16:07 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Tay Ray Chuan, git, Junio C Hamano, Yaroslav Halchenko,
	Ilari Liusvaara

On Thu, Jan 21, 2010 at 07:56:37AM -0800, Shawn O. Pearce wrote:
> Tay Ray Chuan <rctay89@gmail.com> wrote:
> > When "info/refs" is a static file and not behind a CGI handler, some
> > servers may not handle a GET request for it with a query string
> > appended (eg. "?foo=bar") properly.
> > 
> > If such a request fails, retry it sans the query string. In addition,
> > ensure that the "smart" http protocol is not used (a service has to be
> > specified with "?service=<service name>" to be conformant).
> > 
> > Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
> > Reported-and-tested-by: Yaroslav Halchenko <debian@onerussian.com>
> 
> *grumble* stupid Apache *grumble*

stupid Apache... configuration.

Check the error message you get on
http://git.debian.org/git/pkg-exppsy/pymvpa.git/info/refs?service=git-upload-pack:

The requested URL /gitweb.cgigit/pkg-exppsy/pymvpa.git/info/refs was not
found on this server.

Look closely at the start of the requested URL: /gitweb.cgi...
It comes from this rule:

RewriteCond %{QUERY_STRING} ^(.+)$
RewriteRule ^/(.*)$ /gitweb.cgi$1 [L,PT]

which is global to the virtual host.

Anyways, while git.debian.org can certainly be fixed for that, other
servers may want to do some different things with urls with parameters.
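
The effect of that rule on the request path can be mimicked in a few lines of
C. This is only a model of the single rule quoted above, assuming it is the
only one that applies; rewrite_path is a hypothetical helper and none of
mod_rewrite's other behaviour is reproduced.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of the quoted rule: when the query string is non-empty,
 * "/<rest>" becomes "/gitweb.cgi<rest>" (the leading slash is consumed by
 * the pattern, so no separator is left between "gitweb.cgi" and <rest>).
 * Without a query string the path is left alone.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int rewrite_path(char *out, size_t outsz,
                        const char *path, const char *query)
{
    int n;

    if (query != NULL && query[0] != '\0' && path[0] == '/')
        n = snprintf(out, outsz, "/gitweb.cgi%s", path + 1);
    else
        n = snprintf(out, outsz, "%s", path);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Feeding it "/git/pkg-exppsy/pymvpa.git/info/refs" with the query
"service=git-upload-pack" gives "/gitweb.cgigit/pkg-exppsy/pymvpa.git/info/refs",
the path from the error message; with an empty query the path is unchanged,
which is why the older client succeeds.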

Mike


* git fetch -v not at all verbose?
  2010-01-21 16:07             ` Mike Hommey
@ 2010-01-21 16:10               ` Michael S. Tsirkin
  2010-01-21 16:18                 ` Shawn O. Pearce
  2010-01-21 16:20               ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
  1 sibling, 1 reply; 28+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 16:10 UTC (permalink / raw)
  To: git, Junio C Hamano

Hi!
On many of my trees (with linux kernel), git fetch is slower than git clone.
Even more annoyingly, it would hang sometimes for tens of minutes without any
output, even if -v is supplied.

stracing it shows a ton of lines like the following:
16324 read(10, "ACK 4bbdfe65d23014f539fec4227260"..., 51) = 51
16324 read(10, "0037", 4)               = 4
16324 read(10, "ACK 322c06560fa314b04a6302ea03c0"..., 51) = 51
16324 read(10, "0037", 4)               = 4
16324 read(10, "ACK 848ea2043b128b5947851866a114"..., 51) = 51
16324 read(10, "0037", 4)               = 4

Is there some way to make git fetch show progress at this stage,
or even better, can it be made faster somehow?

Thanks!

-- 
MST

* Re: git fetch -v not at all verbose?
  2010-01-21 16:10               ` git fetch -v not at all verbose? Michael S. Tsirkin
@ 2010-01-21 16:18                 ` Shawn O. Pearce
  2010-01-21 16:35                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 28+ messages in thread
From: Shawn O. Pearce @ 2010-01-21 16:18 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: git, Junio C Hamano

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On many of my trees (with linux kernel), git fetch is slower than git clone.
> Even more annoyingly, it would hang sometimes for tens of minutes without any
> output, even if -v is supplied.

Ouch.  I think -v -v boosts the output to be even more verbose,
and might actually show you something.
 
> stracing it shows a ton of lines like the following:
> 16324 read(10, "ACK 4bbdfe65d23014f539fec4227260"..., 51) = 51
> 16324 read(10, "0037", 4)               = 4
> 16324 read(10, "ACK 322c06560fa314b04a6302ea03c0"..., 51) = 51
> 16324 read(10, "0037", 4)               = 4
> 16324 read(10, "ACK 848ea2043b128b5947851866a114"..., 51) = 51
> 16324 read(10, "0037", 4)               = 4

That's the peers trying to determine a common base.
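
As background for the strace output: each 4-byte read is a pkt-line length header.  "0037" is hexadecimal for 55, and that length counts the four header bytes themselves, which leaves the 51-byte payload seen in the following read.  A minimal Python sketch of splitting such a stream, with a made-up helper name and a placeholder payload rather than real ACK lines:

```python
from typing import List


def split_pkt_lines(stream: bytes) -> List[bytes]:
    """Split a byte stream into pkt-line payloads (b"" marks a flush-pkt)."""
    payloads = []
    pos = 0
    while pos < len(stream):
        length = int(stream[pos:pos + 4], 16)   # four hex digits
        if length == 0:                          # "0000" is a flush-pkt
            payloads.append(b"")
            pos += 4
            continue
        if length < 4:
            raise ValueError("invalid pkt-line length")
        # The length counts the four header bytes as well as the payload.
        payloads.append(stream[pos + 4:pos + length])
        pos += length
    return payloads


# Placeholder payload of 51 bytes; the real ACK lines are truncated in the
# strace output, so they are not reproduced here.
payload = b"A" * 51
stream = b"0037" + payload + b"0000"
print([len(p) for p in split_pkt_lines(stream)])
```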
 
> Is there some way to make git fetch show progress at this stage,
> or even better, can it be made faster somehow?

We shouldn't need to show progress here, we should just be faster.

Given the symptom, it sounds to me like your local repository
is some 1,000s of commits ahead of the remote repository you are
fetching from.  Is that true?

Are you fetching from a configured remote that has tracking branches,
or are you fetching through a one-shot URL pasted onto the command
line?

-- 
Shawn.

* Re: [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 16:07             ` Mike Hommey
  2010-01-21 16:10               ` git fetch -v not at all verbose? Michael S. Tsirkin
@ 2010-01-21 16:20               ` Tay Ray Chuan
  2010-01-21 16:24                 ` Shawn O. Pearce
  2010-01-21 16:34                 ` Mike Hommey
  1 sibling, 2 replies; 28+ messages in thread
From: Tay Ray Chuan @ 2010-01-21 16:20 UTC (permalink / raw)
  To: Mike Hommey
  Cc: John 'Warthog9' Hawley, Shawn O. Pearce, git,
	Junio C Hamano, Yaroslav Halchenko, Ilari Liusvaara

Hi,

On Fri, Jan 22, 2010 at 12:07 AM, Mike Hommey <mh@glandium.org> wrote:
> Look closely at the start of the requested URL: /gitweb.cgi...
> It comes from this rule:
>
> RewriteCond %{QUERY_STRING} ^(.+)$
> RewriteRule ^/(.*)$ /gitweb.cgi$1 [L,PT]
>
> which is global to the virtual host.
>
> Anyways, while git.debian.org can certainly be fixed for that, other
> servers may want to do some different things with urls with parameters.

heh, I was suspecting some URL rewriting was going on.

Is this an issue that should be fixed in gitweb?

(added John 'Warthog9' Hawley to the Cc list, perhaps he might know.)

-- 
Cheers,
Ray Chuan

* Re: [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 16:20               ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
@ 2010-01-21 16:24                 ` Shawn O. Pearce
  2010-01-21 16:34                   ` Mike Hommey
  2010-01-21 16:34                 ` Mike Hommey
  1 sibling, 1 reply; 28+ messages in thread
From: Shawn O. Pearce @ 2010-01-21 16:24 UTC (permalink / raw)
  To: Tay Ray Chuan
  Cc: Mike Hommey, John 'Warthog9' Hawley, git, Junio C Hamano,
	Yaroslav Halchenko, Ilari Liusvaara

Tay Ray Chuan <rctay89@gmail.com> wrote:
> On Fri, Jan 22, 2010 at 12:07 AM, Mike Hommey <mh@glandium.org> wrote:
> > Look closely at the start of the requested URL: /gitweb.cgi...
> > It comes from this rule:
> >
> > RewriteCond %{QUERY_STRING} ^(.+)$
> > RewriteRule ^/(.*)$ /gitweb.cgi$1 [L,PT]
> >
> > which is global to the virtual host.
> >
> > Anyways, while git.debian.org can certainly be fixed for that, other
> > servers may want to do some different things with urls with parameters.
> 
> heh, I was suspecting some URL rewriting was going on.
> 
> Is this an issue that should be fixed in gitweb?

I don't see why it should be.  gitweb isn't a service CGI.  I find
it odd that someone would configure their website to route anything
with a query string into gitweb.  WTF?

-- 
Shawn.

* Re: [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 16:20               ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
  2010-01-21 16:24                 ` Shawn O. Pearce
@ 2010-01-21 16:34                 ` Mike Hommey
  1 sibling, 0 replies; 28+ messages in thread
From: Mike Hommey @ 2010-01-21 16:34 UTC (permalink / raw)
  To: Tay Ray Chuan
  Cc: John 'Warthog9' Hawley, Shawn O. Pearce, git,
	Junio C Hamano, Yaroslav Halchenko, Ilari Liusvaara

On Fri, Jan 22, 2010 at 12:20:26AM +0800, Tay Ray Chuan wrote:
> Hi,
> 
> On Fri, Jan 22, 2010 at 12:07 AM, Mike Hommey <mh@glandium.org> wrote:
> > Look closely at the start of the requested URL: /gitweb.cgi...
> > It comes from this rule:
> >
> > RewriteCond %{QUERY_STRING} ^(.+)$
> > RewriteRule ^/(.*)$ /gitweb.cgi$1 [L,PT]
> >
> > which is global to the virtual host.
> >
> > Anyways, while git.debian.org can certainly be fixed for that, other
> > servers may want to do some different things with urls with parameters.
> 
> heh, I was suspecting some URL rewriting was going on.
> 
> Is this an issue that should be fixed in gitweb?

Nah, that's just an issue with the config at git.debian.org.

It's fixed already.

Mike

* Re: [PATCH] http/remote-curl: coddle picky servers
  2010-01-21 16:24                 ` Shawn O. Pearce
@ 2010-01-21 16:34                   ` Mike Hommey
  0 siblings, 0 replies; 28+ messages in thread
From: Mike Hommey @ 2010-01-21 16:34 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Tay Ray Chuan, John 'Warthog9' Hawley, git,
	Junio C Hamano, Yaroslav Halchenko, Ilari Liusvaara

On Thu, Jan 21, 2010 at 08:24:02AM -0800, Shawn O. Pearce wrote:
> Tay Ray Chuan <rctay89@gmail.com> wrote:
> > On Fri, Jan 22, 2010 at 12:07 AM, Mike Hommey <mh@glandium.org> wrote:
> > > Look closely at the start of the requested URL: /gitweb.cgi...
> > > It comes from this rule:
> > >
> > > RewriteCond %{QUERY_STRING} ^(.+)$
> > > RewriteRule ^/(.*)$ /gitweb.cgi$1 [L,PT]
> > >
> > > which is global to the virtual host.
> > >
> > > Anyways, while git.debian.org can certainly be fixed for that, other
> > > servers may want to do some different things with urls with parameters.
> > 
> > heh, I was suspecting some URL rewriting was going on.
> > 
> > Is this an issue that should be fixed in gitweb?
> 
> I don't see why it should be.  gitweb isn't a service CGI.  I find
> it odd that someone would configure their website to route anything
> with a query string into gitweb.  WTF?

There was a good reason for it, but the implementation was too broad:
The main gitweb list (http://git.debian.org/) is made statically,
because it is too long for gitweb to create it in a timely fashion.

So while the main page is made to be a static file, when the request has
a query string, which means it's not the main gitweb list, the cgi is
used.

Except this rule was also used for unrelated urls.

But as I said in my reply to Tay Ray Chuan, it's fixed.

Mike

* Re: git fetch -v not at all verbose?
  2010-01-21 16:18                 ` Shawn O. Pearce
@ 2010-01-21 16:35                   ` Michael S. Tsirkin
  2010-01-21 16:57                     ` Shawn O. Pearce
  0 siblings, 1 reply; 28+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 16:35 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git, Junio C Hamano

On Thu, Jan 21, 2010 at 08:18:58AM -0800, Shawn O. Pearce wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On many of my trees (with linux kernel), git fetch is slower than git clone.
> > Even more annoyingly, it would hang sometimes for tens of minutes without any
> > output, even if -v is supplied.
> 
> Ouch.  I think -v -v boosts the output to be even more verbose,
> and might actually show you something.
>  
> > stracing it shows a ton of lines like the following:
> > 16324 read(10, "ACK 4bbdfe65d23014f539fec4227260"..., 51) = 51
> > 16324 read(10, "0037", 4)               = 4
> > 16324 read(10, "ACK 322c06560fa314b04a6302ea03c0"..., 51) = 51
> > 16324 read(10, "0037", 4)               = 4
> > 16324 read(10, "ACK 848ea2043b128b5947851866a114"..., 51) = 51
> > 16324 read(10, "0037", 4)               = 4
> 
> That's the peers trying to determine a common base.
>  
> > Is there some way to make git fetch show progress at this stage,
> > or even better, can it be made faster somehow?
> 
> We shouldn't need to show progress here, we should just be faster.
> 
> Given the symptom, it sounds to me like your local repository
> is some 1,000s of commits ahead of the remote repository you are
> fetching from.  Is that true?

Hmm, no, but what is true is that I fetched several remotes
that diverged significantly into the same local repository.
Would that have the same effect?

> Are you fetching from a configured remote that has tracking branches,
> or are you fetching through a one-shot URL pasted onto the command
> line?

Configured remote.

> -- 
> Shawn.

* Re: git fetch -v not at all verbose?
  2010-01-21 16:35                   ` Michael S. Tsirkin
@ 2010-01-21 16:57                     ` Shawn O. Pearce
  2010-01-21 17:30                       ` Michael S. Tsirkin
  2010-01-21 17:42                       ` Junio C Hamano
  0 siblings, 2 replies; 28+ messages in thread
From: Shawn O. Pearce @ 2010-01-21 16:57 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: git, Junio C Hamano

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Jan 21, 2010 at 08:18:58AM -0800, Shawn O. Pearce wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On many of my trees (with linux kernel), git fetch is slower than git clone.
> > > Even more annoyingly, it would hang sometimes for tens of minutes without any
> > > output, even if -v is supplied.
...
> > Given the symptom, it sounds to me like your local repository
> > is some 1,000s of commits ahead of the remote repository you are
> > fetching from.  Is that true?
> 
> Hmm, no, but what is true is that I fetched several remotes
> that diverged significantly into the same local repository.
> Would that have the same effect?

Yes.

> > Are you fetching from a configured remote that has tracking branches,
> > or are you fetching through a one-shot URL pasted onto the command
> > line?
> 
> Configured remote.

Hmm.  I wonder if we should try to shortcut the commit walking in
a case like this and just feed the tracking branches we already have.

-- 
Shawn.

* Re: git fetch -v not at all verbose?
  2010-01-21 16:57                     ` Shawn O. Pearce
@ 2010-01-21 17:30                       ` Michael S. Tsirkin
  2010-01-21 17:47                         ` Thomas Rast
  2010-01-21 17:42                       ` Junio C Hamano
  1 sibling, 1 reply; 28+ messages in thread
From: Michael S. Tsirkin @ 2010-01-21 17:30 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git, Junio C Hamano

On Thu, Jan 21, 2010 at 08:57:37AM -0800, Shawn O. Pearce wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Thu, Jan 21, 2010 at 08:18:58AM -0800, Shawn O. Pearce wrote:
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On many of my trees (with linux kernel), git fetch is slower than git clone.
> > > > Even more annoyingly, it would hang sometimes for tens of minutes without any
> > > > output, even if -v is supplied.
> ...
> > > Given the symptom, it sounds to me like your local repository
> > > is some 1,000s of commits ahead of the remote repository you are
> > > fetching from.  Is that true?
> > 
> > Hmm, no, but what is true is that I fetched several remotes
> > that diverged significantly into the same local repository.
> > Would that have the same effect?
> 
> Yes.
> 
> > > Are you fetching from a configured remote that has tracking branches,
> > > or are you fetching through a one-shot URL pasted onto the command
> > > line?
> > 
> > Configured remote.
> 
> Hmm.  I wonder if we should try to shortcut the commit walking in
> a case like this and just feed the tracking branches we already have.

Or for the case of 1,000s of commits ahead, git could try to implement a
heuristic to reduce the number of commits sent. Currently all commits
are sent in order, correct?  How about binary search like what git
bisect does?

> -- 
> Shawn.

* Re: git fetch -v not at all verbose?
  2010-01-21 16:57                     ` Shawn O. Pearce
  2010-01-21 17:30                       ` Michael S. Tsirkin
@ 2010-01-21 17:42                       ` Junio C Hamano
  2010-11-03  9:52                         ` Michael S. Tsirkin
  1 sibling, 1 reply; 28+ messages in thread
From: Junio C Hamano @ 2010-01-21 17:42 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Michael S. Tsirkin, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

>> > Are you fetching from a configured remote that has tracking branches,
>> > or are you fetching through a one-shot URL pasted onto the command
>> > line?
>> 
>> Configured remote.
>
> Hmm.  I wonder if we should try to shortcut the commit walking in
> a case like this and just feed the tracking branches we already have.

You mean that the main culprit is the presence of thousands of commits
that fetcher has obtained through the other remotes (and his own) that the
uploader makes fetcher walk all the way, in the false hope that there
might be a commit among them that is closer to the commits being fetched
than the ones at the tip of tracking branch the fetcher has for this
uploader currently?

And the solution might be to tell only about the tips of remote tracking
branches fetcher has obtained from this particular uploader, not about
other remote tracking branches it got from others or his own local
branches (which may have merged from other remotes)?

It is a clever idea but I suspect it may not work well in practice.  For
example, suppose a project is two-tier, say, with top-level and subsystem
repositories, the former of which regularly merges from the latter, and you
are a participant primarily working on the subsystem.  You fetch daily
from the subsystem repository, but weekly from the top-level.

Now, when you fetch from the top-level, the remote tracking refs you have
for it are much more stale than your other refs.  The top-level would have
acquired a lot more commits from the same subsystem repository since you
fetched from there the last time, and you already have many of them
through your daily fetch from the subsystem repository.  To minimize the
transfer in such a case, the fetcher does want to tell the uploader that
it has those commits from the same subsystem repository, so that the
commit walker can stop at a recent merge into the top-level from the
subsystem repository.

There was a discussion about updating the commit walk exchange to bisect
the history (skip and try a much older one to see if it is reachable, but
to avoid overshooting, step back and see if a newer one is still common).
It would be a lot more work and needs to be implemented as a new protocol
capability, but I think it is the right way to go in the longer term.

* Re: git fetch -v not at all verbose?
  2010-01-21 17:30                       ` Michael S. Tsirkin
@ 2010-01-21 17:47                         ` Thomas Rast
  0 siblings, 0 replies; 28+ messages in thread
From: Thomas Rast @ 2010-01-21 17:47 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Shawn O. Pearce, git, Junio C Hamano

On Thursday 21 January 2010 18:30:10 Michael S. Tsirkin wrote:
> On Thu, Jan 21, 2010 at 08:57:37AM -0800, Shawn O. Pearce wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > Hmm, no, but what is true is that I fetched several remotes
> > > that diverged significantly into the same local repository.
> > > Would that have the same effect?
[...]
> > Hmm.  I wonder if we should try to shortcut the commit walking in
> > a case like this and just feed the tracking branches we already have.
> 
> Or for the case of 1,000s of commits ahead, git could try to implement a
> heuristic to reduce the number of commits sent. Currently all commits
> are sent in order, correct?  How about binary search like what git
> bisect does?

I had a patch for this ages ago (that combines exponential-stride
backwards search and later bisection), but it was shot down on grounds
of not working at times and code convolution and I forgot about it...

I can give this another shot, but it seems most of the code has moved
due to the transport handlers changes, so I'll first have to read into
it again.
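
For illustration only, and not the patch mentioned above (which is not shown in this thread), the general idea of an exponential-stride backwards search followed by bisection can be sketched in Python.  The sketch assumes a strictly linear history, so that once a commit is common every older one is too; `newest_common` and the `is_common` oracle are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple


def newest_common(
    history: List[str], is_common: Callable[[str], bool]
) -> Tuple[Optional[int], int]:
    """history is ordered tip first; return (index of newest common, probes)."""
    probes = 0

    def ask(i: int) -> bool:
        nonlocal probes
        probes += 1
        return is_common(history[i])

    n = len(history)
    # Phase 1: walk back from the tip, doubling the stride after every miss.
    lo, hi, step = -1, 0, 1      # lo: newest index known NOT to be common
    while hi < n and not ask(hi):
        lo = hi
        hi += step
        step *= 2
    if hi >= n:
        # Ran past the root: check the root itself unless already asked.
        hi = n - 1
        if lo == hi or not ask(hi):
            return None, probes
    # Phase 2: bisect between the last miss (lo) and the first hit (hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if ask(mid):
            hi = mid
        else:
            lo = mid
    return hi, probes


# Hypothetical linear history of 10000 commits, tip first; the other side
# has everything from index 5000 back to the root.
history = ["c%d" % i for i in range(10000)]
common = set(history[5000:])
index, probes = newest_common(history, lambda c: c in common)
print(index, probes)
```

A plain tip-to-root walk would need thousands of questions in this scenario; the sketch needs only a few dozen.  Real histories are not linear, which is part of why the discussion here is more involved than the sketch suggests.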

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

* Re: git fetch -v not at all verbose?
  2010-01-21 17:42                       ` Junio C Hamano
@ 2010-11-03  9:52                         ` Michael S. Tsirkin
  2010-11-03 16:14                           ` Junio C Hamano
  0 siblings, 1 reply; 28+ messages in thread
From: Michael S. Tsirkin @ 2010-11-03  9:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, git

On Thu, Jan 21, 2010 at 09:42:36AM -0800, Junio C Hamano wrote:
> "Shawn O. Pearce" <spearce@spearce.org> writes:
> 
> >> > Are you fetching from a configured remote that has tracking branches,
> >> > or are you fetching through a one-shot URL pasted onto the command
> >> > line?
> >> 
> >> Configured remote.
> >
> > Hmm.  I wonder if we should try to shortcut the commit walking in
> > a case like this and just feed the tracking branches we already have.
> 
> You mean that the main culprit is the presence of thousands of commits
> that fetcher has obtained through the other remotes (and his own) that the
> uploader makes fetcher walk all the way, in the false hope that there
> might be a commit among them that is closer to the commits being fetched
> than the ones at the tip of tracking branch the fetcher has for this
> uploader currently?
> 
> And the solution might be to tell only about the tips of remote tracking
> branches fetcher has obtained from this particular uploader, not about
> other remote tracking branches it got from others or his own local
> branches (which may have merged from other remotes)?
> 
> It is a clever idea but I suspect it may not work well in practice.  For
> example, suppose a project is two-tier, say, with top-level and subsystem
> repositories, the former of which regularly merges from the latter, and you
> are a participant primarily working on the subsystem.  You fetch daily
> from the subsystem repository, but weekly from the top-level.
> 
> Now, when you fetch from the top-level, the remote tracking refs you have
> for it are much more stale than your other refs.  The top-level would have
> acquired a lot more commits from the same subsystem repository since you
> fetched from there the last time, and you already have many of them
> through your daily fetch from the subsystem repository.  To minimize the
> transfer in such a case, the fetcher does want to tell the uploader that
> it has those commits from the same subsystem repository, so that the
> commit walker can stop at a recent merge into the top-level from the
> subsystem repository.
> 
> There was a discussion about updating the commit walk exchange to bisect
> the history (skip and try a much older one to see if it is reachable, but
> to avoid overshooting, step back and see if a newer one is still common).
> It would be a lot more work and needs to be implemented as a new protocol
> capability, but I think it is the right way to go in the longer term.

I thought about this some more: it seems that nothing in
pack-protocol.txt dictates that client has to send have
lines in order. The whole logic would be on client side.

So a new capability would be there just in case we find a use for a
server-side optimization later on; we don't need the client to behave
differently in any way when this capability is enabled/disabled.
Right?

-- 
MST

* Re: git fetch -v not at all verbose?
  2010-11-03  9:52                         ` Michael S. Tsirkin
@ 2010-11-03 16:14                           ` Junio C Hamano
  0 siblings, 0 replies; 28+ messages in thread
From: Junio C Hamano @ 2010-11-03 16:14 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Shawn O. Pearce, git

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Thu, Jan 21, 2010 at 09:42:36AM -0800, Junio C Hamano wrote:
> ...
>> There was a discussion about updating the commit walk exchange to bisect
>> the history (skip and try a much older one to see if it is reachable, but
>> to avoid overshooting, step back and see if a newer one is still common).
>> It would be a lot more work and needs to be implemented as a new protocol
>> capability, but I think it is the right way to go in the longer term.
>
> I thought about this some more: it seems that nothing in
> pack-protocol.txt dictates that client has to send have
> lines in order. The whole logic would be on client side.

The current protocol may not require any order for it to function
correctly in the sense that the sent pack will contain everything that is
necessary, but it does require commits on a lineage to be sent from
near tip to near root if you want a _good_ common ancestor to be
found.

If the downloader sends older commits first without sending some new ones,
the uploader can say "Ok, I know about that old one you told me you have,
so we could use that as a common commit" [*1*].  But there is no way for
it to continue the sentence with "... but I cannot tell if other ones you
told me you have that I know nothing about are all directly connected to
that common one we just found (in which case that common one is the best
we can do), or you have newer ones than the common commit that I also have
but you omitted from the listing (in other words, if you didn't omit them,
we could have found a better common commit).  Could you please back up a
bit and let us see if we can do better with newer ones?" with the current
protocol exchange.

The downloader _could_, upon seeing an ACK to a commit that is an ancestor
of commits that it skipped, try sending these skipped commits, without
telling the uploader what it is doing.  But the uploader will
unilaterally decide when it thinks it has heard enough, after giving an
ACK back in the original protocol, or after finding enough common
ancestors to cover all the tips requested with WANTs, so I suspect that
you may not have a chance to play such a game without an explicit protocol
extension.


[Footnote]

*1* That is what an ACK means.  In a multi-ack exchange, it also tells
the downloader there is no point to give any ancestors of that commit, but
allows the downloader to continue sending commits from other lineage.
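
A toy Python model may make the ordering argument concrete.  It assumes a linear history and an uploader that stops listening at the first commit it recognises, as in the original (non-multi-ack) exchange described above; all names and commit labels are hypothetical.

```python
from typing import Iterable, Optional, Set


def first_ack(haves: Iterable[str], uploader_has: Set[str]) -> Optional[str]:
    """Toy uploader: ACK the first have it recognises, then stop listening."""
    for commit in haves:
        if commit in uploader_has:
            return commit
    return None


# Linear history, oldest first; the labels are hypothetical.
history = ["c%d" % i for i in range(10)]
uploader_has = set(history[:7])      # c0..c6
downloader_has = history[:9]         # c0..c8 (c7 and c8 are local only)

# Near tip to near root: the first recognised commit is the newest common one.
tip_first = list(reversed(downloader_has))
# Jump to an old commit first, then send the rest: the uploader settles early.
old_first = [downloader_has[2]] + tip_first

print(first_ack(tip_first, uploader_has))
print(first_ack(old_first, uploader_has))
```

In the second ordering the uploader settles on an old commit and would pack everything after it, even though the downloader already has most of those commits.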

Thread overview: 28+ messages
2010-01-21  0:47 problem cloning via http since v1.6.6-rc0 Yaroslav Halchenko
2010-01-21  1:34 ` Tay Ray Chuan
2010-01-21  1:36 ` Tay Ray Chuan
2010-01-21  2:33   ` Yaroslav Halchenko
2010-01-21  4:01     ` Tay Ray Chuan
2010-01-21  4:38       ` Yaroslav Halchenko
2010-01-21  5:08 ` Ilari Liusvaara
2010-01-21  6:47   ` Tay Ray Chuan
2010-01-21  7:51     ` Tay Ray Chuan
2010-01-21 14:00       ` Yaroslav Halchenko
2010-01-21 14:41         ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
2010-01-21 15:56           ` Shawn O. Pearce
2010-01-21 16:07             ` Mike Hommey
2010-01-21 16:10               ` git fetch -v not at all verbose? Michael S. Tsirkin
2010-01-21 16:18                 ` Shawn O. Pearce
2010-01-21 16:35                   ` Michael S. Tsirkin
2010-01-21 16:57                     ` Shawn O. Pearce
2010-01-21 17:30                       ` Michael S. Tsirkin
2010-01-21 17:47                         ` Thomas Rast
2010-01-21 17:42                       ` Junio C Hamano
2010-11-03  9:52                         ` Michael S. Tsirkin
2010-11-03 16:14                           ` Junio C Hamano
2010-01-21 16:20               ` [PATCH] http/remote-curl: coddle picky servers Tay Ray Chuan
2010-01-21 16:24                 ` Shawn O. Pearce
2010-01-21 16:34                   ` Mike Hommey
2010-01-21 16:34                 ` Mike Hommey
2010-01-21 10:35     ` problem cloning via http since v1.6.6-rc0 Ilari Liusvaara
2010-01-21 11:36       ` Tay Ray Chuan
