Git development
 help / color / mirror / Atom feed
* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Junio C Hamano @ 2011-11-02 18:58 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Linus Torvalds, git, James Bottomley, Jeff Garzik, Andrew Morton,
	linux-ide, LKML
In-Reply-To: <4EB12122.7010803@drmicha.warpmail.net>

Michael J Gruber <git@drmicha.warpmail.net> writes:

> The advantage of tags is that they can be added without rewriting the
> commit, of course.

And you did neither think about the downsides of tags, nor read what
others already explained for you?

^ permalink raw reply

* Re: [PATCH] Escape file:// URL's to meet subversion SVN::Ra requirements
From: Ben Walton @ 2011-11-02 19:05 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: normalperson, git
In-Reply-To: <20111102182015.GA11401@elie.hsd1.il.comcast.net>

Excerpts from Jonathan Nieder's message of Wed Nov 02 14:20:38 -0400 2011:

> Thanks for your work on this!  I'm not really sure how one can
> decide that the problem is not in svn --- some existing functions
> changed ABI in such a way as to break existing applications and
> require code changes and a recompile.  It would be better for
> Subversion to silently fix up paths provided by bad callers, or at
> least to return a sensible error code.

Yes, my apologies.  I wrote this commit message before fully realizing
just how wrong the response on the svn list was.  It core dumps, after
all...If the patch is useful, I'll resubmit with a better commit
message.

> So the problem is that nobody who cared was testing prereleases of
> subversion and reporting bugs early enough for it to get this fixed
> before the 1.7 release.  But yes, that's water under the bridge and
> git-svn (and libsvn-perl, and pysvn, and ...) should just adjust to
> the new world order.
> 
> > [1] http://news.gmane.org/gmane.comp.version-control.subversion.devel
> 
> Do you mean
> http://thread.gmane.org/gmane.comp.version-control.subversion.devel/132250
> ?

No, the link should have been:
http://article.gmane.org/gmane.comp.version-control.subversion.devel/132227

I'm not sure how it got mangled like that.

> | # failed 1 among 2 test(s)
> | 1..2
> | make[3]: *** [t9145-git-svn-master-branch.sh] Error 1
> 
> Does it work for you?  This is with a merge of git 1.7.8-rc0 and
> 1.7.7.2.

Yikes...I had svn tests turned off in my global build script still and
only validated this against t9134.  My apologies.  I'll submit
something proper later this evening.  (Assuming I get a working patch
that passes the whole suite.)

Sorry for the clumsy patch.

Thanks
-Ben


--
Ben Walton
Systems Programmer - CHASS
University of Toronto
C:416.407.5610 | W:416.978.4302

^ permalink raw reply

* Re: [ANNOUNCE] Git 1.7.8.rc0
From: Junio C Hamano @ 2011-11-02 19:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Naewe, Michael J Gruber, git
In-Reply-To: <20111102181041.GA5366@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> So the ideal logic is:
>
>   1. look in netrc
>
>   2. If we have a username and no password, ask for password
>
>   3. Otherwise, try it and see if we get a 401.
>
> But we can't do that, because (1) and (3) happen atomically inside of
> curl.
>
> The simplest thing is to just drop the behavior in (2), and let it drop
> to a 401. The extra round trip probably isn't that big a deal.

That is essentially what Stefan's fix is about.

The cases we have "extra" roundtrip are:

 - when you have username@ in URL but no password is stored in .netrc;
 - when you have username@ in URL and no $HOME/.netrc file.

and in such a case using URL without username@ in it as a workaround would
save the roundtrip but forces you to type your username@ over and over
again, which is _not_ a real workaround.

A workaround for people who want ultimate convenience is to use .netrc to
have both username:password, but that is at the cost of potentially
reduced security. Having username@ in URL and typing password
interactively, if it worked properly, would have been the best of both
worlds.

> The other option is to start parsing netrc ourselves, or do the extra
> round trip if we detect ~/.netrc or something. But that last one is
> getting pretty hackish.

I tend to agree that we wouldn't want to parse netrc ourselves (that is
what library support e.g. CURLOPT_NETRC is for). The latter is hackish but
on the other hand it is a cheap, simple and useful hack.

How would the upcoming keystore support fit in this picture, by the way?

^ permalink raw reply

* Re: [PATCH] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Daniel Stenberg @ 2011-11-02 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Mika Fischer, git
In-Reply-To: <7v62j2js3p.fsf@alter.siamese.dyndns.org>

On Wed, 2 Nov 2011, Junio C Hamano wrote:

I'm totally fine with the patch's approach with respect to how it uses libcurl 
and it should be an improvement compared to the previous way.

>> + curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
>
> I couldn't find in http://curl.haxx.se/libcurl/c/curl_multi_fdset.html
> what the version requirement for using this function is, but the same
> comment as above applies here.

That function has been around for as long as the multi interface has, so it 
should be safe to use it just as widely as curl_multi_perform().

> By the way, I think I saw Daniel posting a link to a nicely formatted table 
> that lists each and every functions and CURLOPT_* symbol with ranges of 
> version it is usable, but I seem to be unable to find it.

Right, the document is a bit hard to find and I should figure out a more 
prominent place to link to it. But it can be found here:

https://github.com/bagder/curl/blob/master/docs/libcurl/symbols-in-versions

-- 

  / daniel.haxx.se

^ permalink raw reply

* Re: [PATCH] t3200: add test case for 'branch -m'
From: Junio C Hamano @ 2011-11-02 19:43 UTC (permalink / raw)
  To: Stefan Naewe; +Cc: git, Tay Ray Chuan
In-Reply-To: <1320246425-2141-1-git-send-email-stefan.naewe@gmail.com>

Thanks, both.

^ permalink raw reply

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Linus Torvalds @ 2011-11-02 20:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, James Bottomley, Jeff Garzik, Andrew Morton, linux-ide, LKML
In-Reply-To: <7vk47jld5s.fsf@alter.siamese.dyndns.org>

On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> But on the other hand, in many ways, publishing your commit to the outside
> world, not necessarily for getting pulled into the final destination
> (i.e. your tree) but merely for other people to try it out, is the point
> of no return (aka "don't rewind or rebase once you publish").  "pushing
> out" might be less special than "please pull", but it still is special.

So I really think that signing the top commit itself is fundamentally wrong.

That commit may not even be *yours*. You may have pulled it from a
sub-lieutenant as a fast-forward, or similar. Amending it later would
be actively very very *wrong*.

So quite frankly, I think the stuff in pu (or next?) is completely
mis-designed. Doing it in the commit is wrong for fundamental reasons,
which all boil down to a simple issue:

 - you absolutely *need* to add the signature later. You *cannot* do
it at "git commit" time.

That's a fundamental issue both from a "workflow model" issue (ie you
want to sign stuff after it has passed testing etc, but you may need
to commit it in order to *get* testing), as well as from a
"fundamental git datastructures" issue (ie you would want to sign
commits that aren't yours.

"git commit --amend" is not the answer - that destroys the fundamental
concept of history being immutable, and while it works for your local
commits, it doesn't work for anybody elses commits, or for stuff you
already pushed out.

And "add a fake empty commit just for the signature" is not the answer
either - because that is clearly inferior to the tags we already had.

I dunno. Did I miss something? As far as I can tell, the signed tags
that we've had since day one are *clearly* much better in very
fundamental ways.

                             Linus

^ permalink raw reply

* Re: Q: "git diff" using tag names
From: Alexey Shumkin @ 2011-11-02 20:08 UTC (permalink / raw)
  To: Frans Klaver; +Cc: Ulrich Windl, git
In-Reply-To: <CAH6sp9OdjNZ6_j-eSFMpecUcxW_y6fpkDZ0cRds62wOrc12x-Q@mail.gmail.com>

thanks )

> On Wed, Nov 2, 2011 at 10:29 AM, Alexey Shumkin
> <alex.crezoff@gmail.com> wrote:
> 
> >> Also it seems that both syntaxes work:
> >> git diff v0.4..v0.5
> >> git diff v0.4 v0.5
> > honestly, I do not know the difference (at the moment :))
> > may be gurus or manual will help to discover it
> 
> As per the git-diff documentation, these two versions behave equally
> -- i.e. no differences.
> 
> Comparing branches
> $ git diff topic master    <1>
> $ git diff topic..master   <2>
> $ git diff topic...master  <3>
> ‪1.‬ Changes between the tips of the topic and the master branches.
> ‪2.‬ Same as above.
> ‪3.‬ Changes that occurred on the master branch since when the topic
> branch was started off it.
> 
> Cheers,
> Frans

^ permalink raw reply

* Re: Q: "git diff" using tag names
From: Alexey Shumkin @ 2011-11-02 20:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ulrich Windl, git
In-Reply-To: <7vty6mkgsg.fsf@alter.siamese.dyndns.org>

Oh, I see

> Alexey Shumkin <alex.crezoff@gmail.com> writes:
> 
> >> Also it seems that both syntaxes work:
> >> git diff v0.4..v0.5
> >> git diff v0.4 v0.5
> > honestly, I do not know the difference (at the moment :))
> > may be gurus or manual will help to discover it
> 
> The latter is the kosher version, as diff is about two "endpoints"
> and not about "ranges". The only reason the former is parsed without
> erroring out is because too many people are used to type .. between
> two things without thinking, learned the notation from "git log",
> which _is_ about ranges.

^ permalink raw reply

* Re: [ANNOUNCE] Git 1.7.8.rc0
From: Jeff King @ 2011-11-02 20:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Naewe, Michael J Gruber, git
In-Reply-To: <7vwrbiibgz.fsf@alter.siamese.dyndns.org>

On Wed, Nov 02, 2011 at 12:13:48PM -0700, Junio C Hamano wrote:

> > The simplest thing is to just drop the behavior in (2), and let it drop
> > to a 401. The extra round trip probably isn't that big a deal.
> 
> That is essentially what Stefan's fix is about.

Right. I think it may be the sanest thing to do.

> The cases we have "extra" roundtrip are:
> 
>  - when you have username@ in URL but no password is stored in .netrc;
>  - when you have username@ in URL and no $HOME/.netrc file.
> 
> and in such a case using URL without username@ in it as a workaround would
> save the roundtrip but forces you to type your username@ over and over
> again, which is _not_ a real workaround.

Yeah. There's no way for us to know before we hand off to curl what you
have in netrc. So these netrc cases will always be at odds with the
no-netrc case.

Normally I would say to implement in favor of the no-netrc case, as it
is probably more common (and will hopefully be more so after the auth
helpers are finished). But the problem is that the penalties are
different. On the one hand, we have the extra http round-trip. Which is
annoying, but mostly invisible to the user. But on the other, we have
git prompting the user unnecessarily, which is just awful.

> > The other option is to start parsing netrc ourselves, or do the extra
> > round trip if we detect ~/.netrc or something. But that last one is
> > getting pretty hackish.
> 
> I tend to agree that we wouldn't want to parse netrc ourselves (that is
> what library support e.g. CURLOPT_NETRC is for). The latter is hackish but
> on the other hand it is a cheap, simple and useful hack.

Note that it's not always right, of course. You might have a .netrc but
no entry for that host. But at least it lets the common case people
(i.e., people who never heard of or touched netrc) to avoid the round
trip.

> How would the upcoming keystore support fit in this picture, by the way?

Any time we would call getpass(), we ask the helper for the credential.
So for user@host, we would call out to the helper for the password
proactively, and otherwise wait for a 401.

We _could_ be proactive and actually ask the helpers for a username and
password even for "https://host/repo", which would save a round-trip to
get the 401 in some cases. But that assumes that asking the helper is
cheap. It might actually require the user inputting a password to unlock
the keystore, which would be annoying if the remote doesn't require
auth at all.

We could try to be clever and use a heuristic that fetch probably
doesn't need auth, but push does. Then fetch gets the extra round-trip
but push doesn't. But that just seems needlessly complex to save one
http round-trip on push.

-Peff

^ permalink raw reply

* [PATCHv2] Improve use of select in http backend
From: Mika Fischer @ 2011-11-02 20:21 UTC (permalink / raw)
  To: git; +Cc: gitster, daniel

I've split my previous patch into two, since the two parts actually do
different things.

The first patch uses curl_multi_fdset to use the file descriptors of the http
connections in the call to select, in effect waking up immediately when new
data arrives.

The second patch uses curl_multi_timeout (if available) to get a better
recommendation for how long to sleep from curl instead of always sleeping
for 50ms.

^ permalink raw reply

* [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Mika Fischer @ 2011-11-02 20:21 UTC (permalink / raw)
  To: git; +Cc: gitster, daniel, Mika Fischer
In-Reply-To: <1320265288-12647-1-git-send-email-mika.fischer@zoopnet.de>

Instead of sleeping unconditionally for a 50ms, when no data can be read
from the http connection(s), use curl_multi_fdset to obtain the actual
file descriptors of the open connections and use them in the select call.
This way, the 50ms sleep is interrupted when new data arrives.
---
 http.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/http.c b/http.c
index a4bc770..ae92318 100644
--- a/http.c
+++ b/http.c
@@ -664,14 +664,14 @@ void run_active_slot(struct active_request_slot *slot)
 		}
 
 		if (slot->in_use && !data_received) {
-			max_fd = 0;
+			max_fd = -1;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
 			FD_ZERO(&excfds);
+			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
 			select_timeout.tv_sec = 0;
 			select_timeout.tv_usec = 50000;
-			select(max_fd, &readfds, &writefds,
-			       &excfds, &select_timeout);
+			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
 #else
-- 
1.7.8.rc0.33.g09c6f.dirty

^ permalink raw reply related

* [PATCH 2/2] http.c: Use timeout suggested by curl instead of fixed 50ms timeout
From: Mika Fischer @ 2011-11-02 20:21 UTC (permalink / raw)
  To: git; +Cc: gitster, daniel, Mika Fischer
In-Reply-To: <1320265288-12647-1-git-send-email-mika.fischer@zoopnet.de>

Recent versions of curl can suggest a period of time the library user
should sleep and try again, when curl is blocked on reading or writing
(or connecting). Use this timeout instead of always sleeping for 50ms.
---
 http.c |   22 ++++++++++++++++++++--
 1 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index ae92318..e91a2ab 100644
--- a/http.c
+++ b/http.c
@@ -649,6 +649,9 @@ void run_active_slot(struct active_request_slot *slot)
 	fd_set excfds;
 	int max_fd;
 	struct timeval select_timeout;
+#if LIBCURL_VERSION_NUM >= 0x070f04
+	long curl_timeout;
+#endif
 	int finished = 0;
 
 	slot->finished = &finished;
@@ -664,13 +667,28 @@ void run_active_slot(struct active_request_slot *slot)
 		}
 
 		if (slot->in_use && !data_received) {
+#if LIBCURL_VERSION_NUM >= 0x070f04
+			curl_multi_timeout(curlm, &curl_timeout);
+			if (curl_timeout == 0) {
+				continue;
+			} else if (curl_timeout == -1) {
+				select_timeout.tv_sec  = 0;
+				select_timeout.tv_usec = 50000;
+			} else {
+				select_timeout.tv_sec  =  curl_timeout / 1000;
+				select_timeout.tv_usec = (curl_timeout % 1000) * 1000;
+			}
+#else
+			select_timeout.tv_sec  = 0;
+			select_timeout.tv_usec = 50000;
+#endif
+
 			max_fd = -1;
 			FD_ZERO(&readfds);
 			FD_ZERO(&writefds);
 			FD_ZERO(&excfds);
 			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
-			select_timeout.tv_sec = 0;
-			select_timeout.tv_usec = 50000;
+
 			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
 		}
 	}
-- 
1.7.8.rc0.33.g09c6f.dirty

^ permalink raw reply related

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Jeff King @ 2011-11-02 20:32 UTC (permalink / raw)
  To: Mika Fischer; +Cc: git, gitster, daniel
In-Reply-To: <1320265288-12647-2-git-send-email-mika.fischer@zoopnet.de>

On Wed, Nov 02, 2011 at 09:21:27PM +0100, Mika Fischer wrote:

> Instead of sleeping unconditionally for a 50ms, when no data can be read
> from the http connection(s), use curl_multi_fdset to obtain the actual
> file descriptors of the open connections and use them in the select call.
> This way, the 50ms sleep is interrupted when new data arrives.
> ---
>  http.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/http.c b/http.c
> index a4bc770..ae92318 100644
> --- a/http.c
> +++ b/http.c
> @@ -664,14 +664,14 @@ void run_active_slot(struct active_request_slot *slot)
>  		}
>  
>  		if (slot->in_use && !data_received) {
> -			max_fd = 0;
> +			max_fd = -1;
>  			FD_ZERO(&readfds);
>  			FD_ZERO(&writefds);
>  			FD_ZERO(&excfds);
> +			curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
>  			select_timeout.tv_sec = 0;
>  			select_timeout.tv_usec = 50000;
> -			select(max_fd, &readfds, &writefds,
> -			       &excfds, &select_timeout);
> +			select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
>  		}
>  	}

Do we still need to care about data_received?

My understanding was that the code was originally trying to do:

  1. Call curl, maybe get some data.

  2. If we got data, then ask curl against immediately for some data.

  3. Otherwise, sleep 50ms and then ask curl again.

But now that we are actually selecting on the proper descriptors, it
should now be safe to just do:

  1. Call curl, maybe get some data.

  2. Call select, which will wake immediately if curl is going to get
     data.

At least that's my reading. I am working on unrelated patches that clean
up the handling of data_received, but if it could go away altogether,
that would be even simpler.

-Peff

^ permalink raw reply

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Jeff King @ 2011-11-02 20:35 UTC (permalink / raw)
  To: Mika Fischer; +Cc: git, gitster, daniel
In-Reply-To: <20111102203221.GB5628@sigill.intra.peff.net>

On Wed, Nov 02, 2011 at 04:32:21PM -0400, Jeff King wrote:

> At least that's my reading. I am working on unrelated patches that clean
> up the handling of data_received, but if it could go away altogether,
> that would be even simpler.

That patch, btw, looks like this:

-- >8 --
Subject: [PATCH] http: remove "local" member from slot struct

The curl-multi http code does something like this:

  while (!finished) {
	  try_to_read_from_slots();
	  if (!data_received)
		  wait_for_50_ms();
  }

Which is horrible enough in itself, because of the
hard-coded 50ms wait.

But there's some additional complexity: the method for
finding whether we received data is to actually run ftell()
on the file we are writing to to see if it got anything. So
the "local" member of the slot struct contains the FILE
pointer. Except that sometimes we don't have a file, because
we're writing to a strbuf. In that case, since curl calls
our custom callback, we just increment the data_received
flag when curl gives data to our callback.

Let's do the same thing for the write-to-file case as we do
for the write-to-strbuf case: use a thin wrapper callback
and increment the received flag. This makes both methods
consistent with each other, and saves us from managing the
"local" struct member at all, reducing the code size.

Signed-off-by: Jeff King <peff@peff.net>
---
 http.c |   23 +++++++----------------
 http.h |    1 -
 2 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/http.c b/http.c
index a4bc770..99386ef 100644
--- a/http.c
+++ b/http.c
@@ -93,6 +93,12 @@ curlioerr ioctl_buffer(CURL *handle, int cmd, void *clientp)
 }
 #endif
 
+size_t fwrite_file(char *ptr, size_t eltsize, size_t nmemb, void *handle)
+{
+	data_received++;
+	return fwrite(ptr, eltsize, nmemb, handle);
+}
+
 size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
@@ -538,7 +544,6 @@ struct active_request_slot *get_active_slot(void)
 
 	active_requests++;
 	slot->in_use = 1;
-	slot->local = NULL;
 	slot->results = NULL;
 	slot->finished = NULL;
 	slot->callback_data = NULL;
@@ -642,8 +647,6 @@ void step_active_slots(void)
 void run_active_slot(struct active_request_slot *slot)
 {
 #ifdef USE_CURL_MULTI
-	long last_pos = 0;
-	long current_pos;
 	fd_set readfds;
 	fd_set writefds;
 	fd_set excfds;
@@ -656,13 +659,6 @@ void run_active_slot(struct active_request_slot *slot)
 		data_received = 0;
 		step_active_slots();
 
-		if (!data_received && slot->local != NULL) {
-			current_pos = ftell(slot->local);
-			if (current_pos > last_pos)
-				data_received++;
-			last_pos = current_pos;
-		}
-
 		if (slot->in_use && !data_received) {
 			max_fd = 0;
 			FD_ZERO(&readfds);
@@ -818,13 +814,12 @@ static int http_request(const char *url, void *result, int target, int options)
 		if (target == HTTP_REQUEST_FILE) {
 			long posn = ftell(result);
 			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
-					 fwrite);
+					 fwrite_file);
 			if (posn > 0) {
 				strbuf_addf(&buf, "Range: bytes=%ld-", posn);
 				headers = curl_slist_append(headers, buf.buf);
 				strbuf_reset(&buf);
 			}
-			slot->local = result;
 		} else
 			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
 					 fwrite_buffer);
@@ -871,7 +866,6 @@ static int http_request(const char *url, void *result, int target, int options)
 		ret = HTTP_START_FAILED;
 	}
 
-	slot->local = NULL;
 	curl_slist_free_all(headers);
 	strbuf_release(&buf);
 
@@ -1066,7 +1060,6 @@ void release_http_pack_request(struct http_pack_request *preq)
 	if (preq->packfile != NULL) {
 		fclose(preq->packfile);
 		preq->packfile = NULL;
-		preq->slot->local = NULL;
 	}
 	if (preq->range_header != NULL) {
 		curl_slist_free_all(preq->range_header);
@@ -1088,7 +1081,6 @@ int finish_http_pack_request(struct http_pack_request *preq)
 
 	fclose(preq->packfile);
 	preq->packfile = NULL;
-	preq->slot->local = NULL;
 
 	lst = preq->lst;
 	while (*lst != p)
@@ -1157,7 +1149,6 @@ struct http_pack_request *new_http_pack_request(
 	}
 
 	preq->slot = get_active_slot();
-	preq->slot->local = preq->packfile;
 	curl_easy_setopt(preq->slot->curl, CURLOPT_FILE, preq->packfile);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
 	curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
diff --git a/http.h b/http.h
index 3c332a9..7429381 100644
--- a/http.h
+++ b/http.h
@@ -49,7 +49,6 @@ struct slot_results {
 
 struct active_request_slot {
 	CURL *curl;
-	FILE *local;
 	int in_use;
 	CURLcode curl_result;
 	long http_code;
-- 
1.7.7.rc3.12.g571d67

^ permalink raw reply related

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Michael J Gruber @ 2011-11-02 21:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, git, James Bottomley, Jeff Garzik, Andrew Morton,
	linux-ide, LKML
In-Reply-To: <7v1utqjqre.fsf@alter.siamese.dyndns.org>

Junio C Hamano venit, vidit, dixit 02.11.2011 19:58:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> The advantage of tags is that they can be added without rewriting the
>> commit, of course.
> 
> And you did neither think about the downsides of tags, nor read what
> others already explained for you?

We're just weighing things differently here, and no accusations of
"misinformation" or "not thinking" will change this.

^ permalink raw reply

* Re: [git patches] libata updates, GPG signed (but see admin notes)
From: Junio C Hamano @ 2011-11-02 21:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: git, James Bottomley, Jeff Garzik, Andrew Morton, linux-ide, LKML
In-Reply-To: <CA+55aFz7TeQQH3D4Tpp31cZYZoQKeK37jouo+2Kh61Wa07knfw@mail.gmail.com>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> And "add a fake empty commit just for the signature" is not the answer
> either - because that is clearly inferior to the tags we already had.
>
> I dunno. Did I miss something? As far as I can tell, the signed tags
> that we've had since day one are *clearly* much better in very
> fundamental ways.

Ok, back to the drawing board (which is not a loss as I wasn't expecting
this to be in the official release in upcoming 1.7.8 anyway).

^ permalink raw reply

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Junio C Hamano @ 2011-11-02 21:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Mika Fischer, git, gitster, daniel
In-Reply-To: <20111102203543.GC5628@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Wed, Nov 02, 2011 at 04:32:21PM -0400, Jeff King wrote:
>
>> At least that's my reading. I am working on unrelated patches that clean
>> up the handling of data_received, but if it could go away altogether,
>> that would be even simpler.
>
> That patch, btw, looks like this:
>
> -- >8 --
> Subject: [PATCH] http: remove "local" member from slot struct
>
> The curl-multi http code does something like this:
>
>   while (!finished) {
> 	  try_to_read_from_slots();
> 	  if (!data_received)
> 		  wait_for_50_ms();
>   }
>
> ...
> Let's do the same thing for the write-to-file case as we do
> for the write-to-strbuf case: use a thin wrapper callback
> and increment the received flag. This makes both methods
> consistent with each other, and saves us from managing the
> "local" struct member at all, reducing the code size.

Looks very sensible.

^ permalink raw reply

* Re: long fsck time
From: Jeff King @ 2011-11-02 21:33 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Git Mailing List
In-Reply-To: <CACsJy8B=5mEWoOBkrTfmJ+p7HxqJM97zdG-k71oW81-3XxuO_Q@mail.gmail.com>

On Wed, Nov 02, 2011 at 07:10:26PM +0700, Nguyen Thai Ngoc Duy wrote:

> On Wed, Nov 2, 2011 at 7:06 PM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> > On git.git
> >
> > $ /usr/bin/time git fsck
> > 333.25user 4.28system 5:37.59elapsed 99%CPU (0avgtext+0avgdata
> > 420080maxresident)k
> > 0inputs+0outputs (0major+726560minor)pagefaults 0swaps
> >
> > That's really long time, perhaps we should print progress so users
> > know it's still running?
> 
> Ahh.. --verbose. Sorry for the noise. Still good to show the number of
> checked objects though.

fsck --verbose is _really_ verbose. It could probably stand to have some
progress meters sprinkled throughout. The patch below produces this on
my git.git repo:

  $ git fsck
  Checking object directories: 100% (256/256), done.
  Verifying packs: 100% (7/7), done.
  Checking objects (pack 1/7): 100% (241/241), done.
  Checking objects (pack 2/7): 100% (176/176), done.
  Checking objects (pack 3/7): 100% (312/312), done.
  Checking objects (pack 4/7): 100% (252/252), done.
  Checking objects (pack 5/7): 100% (353/353), done.
  Checking objects (pack 6/7): 100% (375/375), done.
  Checking objects (pack 7/7): 100% (171079/171079), done.

which gives reasonably smooth progress. The longest hang is that
"Verifying pack" 7 is slow (I believe it's doing a sha1 over the whole
thing). If you really wanted to get fancy, you could probably do a
throughput meter as we sha1 the whole contents.

Patch is below. It would need --{no-,}progress support on the command
line, and to check isatty(2) before it would be acceptable.

---
diff --git a/builtin/fsck.c b/builtin/fsck.c
index df1a88b..481de4e 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -11,6 +11,7 @@
 #include "fsck.h"
 #include "parse-options.h"
 #include "dir.h"
+#include "progress.h"
 
 #define REACHABLE 0x0001
 #define SEEN      0x0002
@@ -512,15 +513,19 @@ static void get_default_heads(void)
 static void fsck_object_dir(const char *path)
 {
 	int i;
+	struct progress *progress;
 
 	if (verbose)
 		fprintf(stderr, "Checking object directory\n");
 
+	progress = start_progress("Checking object directories", 256);
 	for (i = 0; i < 256; i++) {
 		static char dir[4096];
 		sprintf(dir, "%s/%02x", path, i);
 		fsck_dir(i, dir);
+		display_progress(progress, i+1);
 	}
+	stop_progress(&progress);
 	fsck_sha1_list();
 }
 
@@ -622,19 +627,36 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	if (check_full) {
 		struct packed_git *p;
+		int i, nr_packs = 0;
+		struct progress *progress;
 
 		prepare_packed_git();
 		for (p = packed_git; p; p = p->next)
+			nr_packs++;
+
+		progress = start_progress("Verifying packs", nr_packs);
+		for (i = 1, p = packed_git; p; p = p->next, i++) {
 			/* verify gives error messages itself */
 			verify_pack(p);
+			display_progress(progress, i);
+		}
+		stop_progress(&progress);
 
-		for (p = packed_git; p; p = p->next) {
+		for (i = 1, p = packed_git; p; p = p->next, i++) {
+			char buf[32];
 			uint32_t j, num;
 			if (open_pack_index(p))
 				continue;
 			num = p->num_objects;
-			for (j = 0; j < num; j++)
+
+			snprintf(buf, sizeof(buf), "Checking objects (pack %d/%d)",
+				 i, nr_packs);
+			progress = start_progress(buf, num);
+			for (j = 0; j < num; j++) {
 				fsck_sha1(nth_packed_object_sha1(p, j));
+				display_progress(progress, j+1);
+			}
+			stop_progress(&progress);
 		}
 	}
 

^ permalink raw reply related

* Re: [ANNOUNCE] Git 1.7.7.2
From: Stefan Roas @ 2011-11-02 21:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linux Kernel
In-Reply-To: <7v7h3jl3kw.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 531 bytes --]

Hi Junio,

is it possible that you forgot to update the GIT-VERSION-GEN with the
release of 1.7.7.2? I stll get version 1.7.7.1 from the tarball on
http://git-scm.com/ and when building from the git repository itself.

Regards,
  Stefan

-- 
Stefan Roas                                         sroas@roath.org
Joh.-Seb.-Bach-Str. 4                            D-91083 Baiersdorf
-------------------------------------------------------------------
Key fingerprint = 557C 99BE 865B 1463 2A44  7936 C662 8970 4DA5 50B8


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: New Feature wanted: Is it possible to let git clone continue last break point?
From: Jeff King @ 2011-11-02 22:06 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: netroby, Git Mail List, Tomas Carnecky
In-Reply-To: <20111031090717.GA24978@elie.hsd1.il.comcast.net>

On Mon, Oct 31, 2011 at 04:07:18AM -0500, Jonathan Nieder wrote:

> Something like Jeff's "priming the well with a server-specified
> bundle" proposal[2] might be a good way to make the same trick
> transparent to clients in the future.

Yes, that is one of the use cases I hope to address. But it will require
the publisher specifying a mirror location (it's possible we could add
some kind of automagic "hit a bundler service first" config option,
though I fear that the existing small-time bundler services would
crumble under the load).

So in the general case (and in the meantime), you may have to learn to
manually prime the repo using a bundle.

I haven't started on the patches for communicating mirror sites between
the server and client, but I did just write some patches to handle "git
fetch http://host/path/to/file.bundle" automatically, which is the first
step. They need a few finishing touches and some testing, though.

> Even with that, later fetches, which grab a pack generated on the fly
> to only contain the objects not already fetched, are generally not
> resumable.  Overcoming that would presumably require larger protocol
> changes, and I don't know of anyone working on it.  (My workaround
> when in a setup where this mattered was to use the old-fashioned
> "dumb" http protocol.  It worked fine.)

My goal was for the mirror communication between client and server to be
something like:

  - if you don't have object XXXXXX, then prime with URL
    http://host/bundle1

  - if you don't have object YYYYYY, then prime with URL
    http://host/bundle2

and so forth. A cloning client would grab the first bundle, then the
second, and then hit the real repo via the git protocol. A client who
had previously cloned might have XXX, but would now grab bundle2, and
then hit the real repo.

So depending on how often the server side feels like creating new
bundles, you would get most of the changes via bundles, and then only
be getting a small number of objects via git.

The downside of cumulative fetching is that the bundles can only serve
well-known checkpoints. So if you have a timeline like this:

  t0: server publishes bundle/mirror config with one line (the XXX bit
      above)

  t1: you clone, getting the whole bundle. No waste, because you had
      nothing in the first place, and you needed everything.

  t2: you fetch again, getting N commits worth of history via the git
      protocol

  t3: server decides a lot of new objects (let's say M commits worth)
      have accumulated, and generates a new line (the YYY line).

  t4: you fetch, see that you don't yet have YYY, and grab the second
      bundle

But in t4 you grabbed a bundle containing M commits, when you already
had the first N of them. So you actually wasted bandwidth getting
objects you already had. The only benefit is that you grabbed a static
file, which is resumable.

So I suspect there is some black magic involved in deciding when to
create a new bundle, and at what tip. If you create a bundle once a
month, but include only commits up to a week ago, then people pulling
weekly will never grab the bundle, but people pulling less frequently
will get the whole month as a bundle.

A secondary issue is also that in a scheme like this, your mirror list
will grow without bound. So you'd want to periodically repack everything
into a single bundle. But then people who are fetching wouldn't want
that, as it is just an exacerbated version of the same problem above.

Which is all a roundabout way of saying that the git protocol is really
the sane way to do efficient transfers. An alternative, much simpler
scheme would be for the server to just say:

  - if you have nothing, then prime with URL http://host/bundle

And then _only_ clone would bother with checking mirrors. People doing
fetch would be expected to do it often enough that not being resumable
isn't a big deal.

-Peff

^ permalink raw reply

* Re: [PATCH] Escape file:// URL's to meet subversion SVN::Ra requirements
From: Eric Wong @ 2011-11-02 22:09 UTC (permalink / raw)
  To: Ben Walton; +Cc: Jonathan Nieder, git
In-Reply-To: <1320260449-sup-479@pinkfloyd.chass.utoronto.ca>

Ben Walton <bwalton@artsci.utoronto.ca> wrote:
> Sorry for the clumsy patch.

I don't have much time to help you fix it, but I got numerous errors on
SVN 1.6.x (svn 1.6.12).  Can you make sure things continue to work on
1.6 and earlier, also?

Maybe just enable the escaping for file:// on >= SVN 1.7

Here are the tests that failed for me:

make[1]: *** [t9100-git-svn-basic.sh] Error 1
make[1]: *** [t9103-git-svn-tracked-directory-removed.sh] Error 1
make[1]: *** [t9104-git-svn-follow-parent.sh] Error 1
make[1]: *** [t9105-git-svn-commit-diff.sh] Error 1
make[1]: *** [t9107-git-svn-migrate.sh] Error 1
make[1]: *** [t9108-git-svn-glob.sh] Error 1
make[1]: *** [t9109-git-svn-multi-glob.sh] Error 1
make[1]: *** [t9110-git-svn-use-svm-props.sh] Error 1
make[1]: *** [t9111-git-svn-use-svnsync-props.sh] Error 1
make[1]: *** [t9114-git-svn-dcommit-merge.sh] Error 1
make[1]: *** [t9116-git-svn-log.sh] Error 1
make[1]: *** [t9117-git-svn-init-clone.sh] Error 1
make[1]: *** [t9118-git-svn-funky-branch-names.sh] Error 1
make[1]: *** [t9120-git-svn-clone-with-percent-escapes.sh] Error 1
make[1]: *** [t9125-git-svn-multi-glob-branch-names.sh] Error 1
make[1]: *** [t9127-git-svn-partial-rebuild.sh] Error 1
make[1]: *** [t9128-git-svn-cmd-branch.sh] Error 1
make[1]: *** [t9130-git-svn-authors-file.sh] Error 1
make[1]: *** [t9135-git-svn-moved-branch-empty-file.sh] Error 1
make[1]: *** [t9136-git-svn-recreated-branch-empty-file.sh] Error 1
make[1]: *** [t9141-git-svn-multiple-branches.sh] Error 1
make[1]: *** [t9145-git-svn-master-branch.sh] Error 1
make[1]: *** [t9146-git-svn-empty-dirs.sh] Error 1
make[1]: *** [t9150-svk-mergetickets.sh] Error 1
make[1]: *** [t9151-svn-mergeinfo.sh] Error 1
make[1]: *** [t9153-git-svn-rewrite-uuid.sh] Error 1
make[1]: *** [t9154-git-svn-fancy-glob.sh] Error 1
make[1]: *** [t9155-git-svn-fetch-deleted-tag.sh] Error 1
make[1]: *** [t9156-git-svn-fetch-deleted-tag-2.sh] Error 1
make[1]: *** [t9157-git-svn-fetch-merge.sh] Error 1
make[1]: *** [t9159-git-svn-no-parent-mergeinfo.sh] Error 1
make[1]: *** [t9161-git-svn-mergeinfo-push.sh] Error 1

^ permalink raw reply

* Re: [ANNOUNCE] Git 1.7.7.2
From: Junio C Hamano @ 2011-11-02 22:30 UTC (permalink / raw)
  To: Stefan Roas; +Cc: git, Linux Kernel
In-Reply-To: <20111102214725.GA2860@geminga.roas.networks.roath.org>

Stefan Roas <sroas@roath.org> writes:

> is it possible that you forgot to update the GIT-VERSION-GEN with the
> release of 1.7.7.2? I stll get version 1.7.7.1 from the tarball on
> http://git-scm.com/ and when building from the git repository itself.

Probably.

^ permalink raw reply

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Mika Fischer @ 2011-11-02 22:22 UTC (permalink / raw)
  To: Jeff King; +Cc: git, gitster, daniel
In-Reply-To: <20111102203221.GB5628@sigill.intra.peff.net>

On Wed, Nov 2, 2011 at 21:32, Jeff King <peff@peff.net> wrote:
> Do we still need to care about data_received?
>
> My understanding was that the code was originally trying to do:
>
>  1. Call curl, maybe get some data.
>
>  2. If we got data, then ask curl against immediately for some data.
>
>  3. Otherwise, sleep 50ms and then ask curl again.

Yes, that's exactly what it did.

> But now that we are actually selecting on the proper descriptors, it
> should now be safe to just do:
>
>  1. Call curl, maybe get some data.
>
>  2. Call select, which will wake immediately if curl is going to get
>     data.

The only problem I can see is that curl_multi_fdset is not guaranteed
to return any fds. So in theory it could be possible that we don't get
fds, but we're actually reading stuff. In this case things would get
slow, because we would sleep for 50ms after every read...

However, I don't know if this is a case that actually comes up in the
real world. Maybe Daniel has some advice on this.

Best,
 Mika

^ permalink raw reply

* Re: [PATCH 1/2] http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
From: Daniel Stenberg @ 2011-11-02 22:40 UTC (permalink / raw)
  To: Mika Fischer; +Cc: Jeff King, git, gitster
In-Reply-To: <CAOs=hR+QqUpYuth8Uvi2o7pm1LO8ogO2pN7nrMchYj96Cutmww@mail.gmail.com>

On Wed, 2 Nov 2011, Mika Fischer wrote:

> The only problem I can see is that curl_multi_fdset is not guaranteed to 
> return any fds. So in theory it could be possible that we don't get fds, but 
> we're actually reading stuff. In this case things would get slow, because we 
> would sleep for 50ms after every read...
>
> However, I don't know if this is a case that actually comes up in the real 
> world. Maybe Daniel has some advice on this.

It doesn't really happen so it should be safe.

The case where no fds are returned is when libcurl cannot return a socket to 
wait for during name resolving (if your particular libcurl is built to use 
such a resolver backend - libcurl has several different ones). And during name 
resolving there won't be any data to read for the libcurl-app anyway.

-- 

  / daniel.haxx.se

^ permalink raw reply

* Re: New Feature wanted: Is it possible to let git clone continue last break point?
From: Junio C Hamano @ 2011-11-02 22:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, netroby, Git Mail List, Tomas Carnecky
In-Reply-To: <20111102220614.GB14108@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Which is all a roundabout way of saying that the git protocol is really
> the sane way to do efficient transfers. An alternative, much simpler
> scheme would be for the server to just say:
>
>   - if you have nothing, then prime with URL http://host/bundle
>
> And then _only_ clone would bother with checking mirrors. People doing
> fetch would be expected to do it often enough that not being resumable
> isn't a big deal.

I think that is a sensible place to start.

A more fancy conditional "If you have X then fetch this, if you have Y
fetch that, ..." sounds nice but depending on what branch you are fetching
the answer has to be different. If we were to do that, the natural place
for the server to give the redirect instruction to the client is after the
client finishes saying "want", and before the client starts saying "have".

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox