Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH 2/2] Makefile: optionally exclude code that needs Unix sockets
From: Jeff King @ 2011-12-13  0:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Sixt, git
In-Reply-To: <7v1us91i5b.fsf@alter.siamese.dyndns.org>

On Mon, Dec 12, 2011 at 03:31:44PM -0800, Junio C Hamano wrote:

> > In theory we should also disable the documentation for credential-cache.
> > But that means surgery not only to Documentation/Makefile, but figuring
> > out how to pass the flags down to the actual asciidoc process (since
> > gitcredentials(7) mentions it in the text). Certainly possible, but I
> > don't know if it's worth the effort or not.
> 
> I do not think it matters that much. We've been shipping documentation for
> stuff like remote archiver and daemon without conditional compilation, no?

True. Let's not worry about it, then.

> I'll queue a single patch that is a squash between 2/2 and Peff's test
> updates between "credentials: add "cache" helper" and "strbuf: add
> strbuf_add*_urlencode" in the series.

That's perfect. Thanks.

-Peff

^ permalink raw reply

* Re: [PATCH 3/4] fetch-pack: match refs exactly
From: Jeff King @ 2011-12-13  0:54 UTC (permalink / raw)
  To: git; +Cc: Kevin Sawicki
In-Reply-To: <20111213004808.GC3699@sigill.intra.peff.net>

On Mon, Dec 12, 2011 at 07:48:08PM -0500, Jeff King wrote:

> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 46688dc..6207ecd 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -556,11 +556,16 @@ static void filter_refs(struct ref **refs, int nr_match, char **match)
>  			continue;
>  		}
>  		else {
> -			int order = path_match(ref->name, nr_match, match);
> -			if (order) {
> -				return_refs[order-1] = ref;
> -				continue; /* we will link it later */
> +			int i;
> +			for (i = 0; i < nr_match; i++) {
> +				if (!strcmp(ref->name, match[i])) {
> +					match[i][0] = '\0';
> +					return_refs[i] = ref;
> +					break;
> +				}
>  			}
> +			if (i < nr_match)
> +				continue; /* we will link it later */

Astute readers may notice that our matching scheme is now something
like:

  for ref in refs
    for match in matches
      strcmp(ref, match)

If you are fetching everything, that's O(n^2) strcmps (where n is the
number of remote refs). If we sorted the match list (the refs list is
already sorted), we could turn it into an O(n lg n) sort+merge.  This is
an optimization we couldn't do with the old suffix-matching rule.

I doubt it matters in any practical case. Even for our crazy 100K ref
cases at GitHub, we don't tend to actually fetch all of those at one
time. So it is more like O(n * m), where m is probably in the dozens at
most.

-Peff

^ permalink raw reply

* [PATCH 4/4] connect.c: drop path_match function
From: Jeff King @ 2011-12-13  0:49 UTC (permalink / raw)
  To: git; +Cc: Kevin Sawicki
In-Reply-To: <20111213003925.GA28403@sigill.intra.peff.net>

This function was used for comparing local and remote ref
names during fetch (which makes it a candidate for "most
confusingly named function of the year").

It no longer has any callers, so let's get rid of it.

Signed-off-by: Jeff King <peff@peff.net>
---
I just did the exact-string match inline in the previous patch. I could
also have modified path_match to do it. But really, I can't think of a
worse name for a global function in a system which is all about
managing content in paths. Unless, you know, it actually matched paths.
Which it doesn't.

 cache.h   |    1 -
 connect.c |   21 ---------------------
 2 files changed, 0 insertions(+), 22 deletions(-)

diff --git a/cache.h b/cache.h
index 408e880..2ad063f 100644
--- a/cache.h
+++ b/cache.h
@@ -1029,7 +1029,6 @@ struct ref {
 extern struct child_process *git_connect(int fd[2], const char *url, const char *prog, int flags);
 extern int finish_connect(struct child_process *conn);
 extern int git_connection_is_socket(struct child_process *conn);
-extern int path_match(const char *path, int nr, char **match);
 struct extra_have_objects {
 	int nr, alloc;
 	unsigned char (*array)[20];
diff --git a/connect.c b/connect.c
index 48df90b..2a0a040 100644
--- a/connect.c
+++ b/connect.c
@@ -105,27 +105,6 @@ int server_supports(const char *feature)
 		strstr(server_capabilities, feature) != NULL;
 }
 
-int path_match(const char *path, int nr, char **match)
-{
-	int i;
-	int pathlen = strlen(path);
-
-	for (i = 0; i < nr; i++) {
-		char *s = match[i];
-		int len = strlen(s);
-
-		if (!len || len > pathlen)
-			continue;
-		if (memcmp(path + pathlen - len, s, len))
-			continue;
-		if (pathlen > len && path[pathlen - len - 1] != '/')
-			continue;
-		*s = 0;
-		return (i + 1);
-	}
-	return 0;
-}
-
 enum protocol {
 	PROTO_LOCAL = 1,
 	PROTO_SSH,
-- 
1.7.8.13.g74677

^ permalink raw reply related

* [PATCH 3/4] fetch-pack: match refs exactly
From: Jeff King @ 2011-12-13  0:48 UTC (permalink / raw)
  To: git; +Cc: Kevin Sawicki
In-Reply-To: <20111213003925.GA28403@sigill.intra.peff.net>

When we are determining the list of refs to fetch via
fetch-pack, we have two sets of refs to compare: those on
the remote side, and a "match" list of things we want to
fetch. We iterate through the remote refs alphabetically,
seeing if each one is wanted by the "match" list.

Since def88e9 (Commit first cut at "git-fetch-pack",
2005-07-04), we have used the "path_match" function to do a
suffix match, where a remote ref is considered wanted if
any of the "match" elements is a suffix of the remote
refname.

This enables callers of fetch-pack to specify unqualified
refs and have them matched up with remote refs (e.g., ask
for "A" and get remote's "refs/heads/A"). However, if you
provide a fully qualified ref, then there are corner cases
where we provide the wrong answer. For example, given a
remote with two refs:

   refs/foo/refs/heads/master
   refs/heads/master

asking for "refs/heads/master" will first match
"refs/foo/refs/heads/master" by the suffix rule, and we will
erroneously fetch it instead of refs/heads/master.

As it turns out, all callers of fetch_pack do provide
fully-qualified refs for the match list. There are two ways
fetch_pack can get match lists:

  1. Through the transport code (i.e., via git-fetch)

  2. On the command-line of git-fetch-pack

In the first case, we will always be providing the names of
fully-qualified refs from "struct ref" objects. We will have
pre-matched those ref objects already (since we have to
handle more advanced matching, like wildcard refspecs), and
are just providing a list of the refs whose objects we need.

In the second case, users could in theory be providing
non-qualified refs on the command-line. However, the
fetch-pack documentation claims that refs should be fully
qualified (and has always done so since it was written in
2005).

Let's change this path_match call to simply check for string
equality, matching what the callers of fetch_pack are
expecting.

Signed-off-by: Jeff King <peff@peff.net>
---
This is obviously the one that can break existing fetch-pack users. I
doubt they exist. If they do, there are a few alternatives:

  1. Come up with some more sane rules for path_match (e.g., try full
     strings first, use full-string matching for things starting with
     "refs/", etc).

  2. Leave the matching in-place for git-fetch-pack, but use exact
     matching for internal users that will always provide qualified refs
     (i.e., "git fetch").

I don't personally think it's worth the trouble.

 builtin/fetch-pack.c      |   13 +++++++++----
 t/t5527-fetch-odd-refs.sh |   29 +++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100755 t/t5527-fetch-odd-refs.sh

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 46688dc..6207ecd 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -556,11 +556,16 @@ static void filter_refs(struct ref **refs, int nr_match, char **match)
 			continue;
 		}
 		else {
-			int order = path_match(ref->name, nr_match, match);
-			if (order) {
-				return_refs[order-1] = ref;
-				continue; /* we will link it later */
+			int i;
+			for (i = 0; i < nr_match; i++) {
+				if (!strcmp(ref->name, match[i])) {
+					match[i][0] = '\0';
+					return_refs[i] = ref;
+					break;
+				}
 			}
+			if (i < nr_match)
+				continue; /* we will link it later */
 		}
 		free(ref);
 	}
diff --git a/t/t5527-fetch-odd-refs.sh b/t/t5527-fetch-odd-refs.sh
new file mode 100755
index 0000000..edea9f9
--- /dev/null
+++ b/t/t5527-fetch-odd-refs.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+
+test_description='test fetching of oddly-named refs'
+. ./test-lib.sh
+
+# afterwards we will have:
+#  HEAD - two
+#  refs/for/refs/heads/master - one
+#  refs/heads/master - three
+test_expect_success 'setup repo with odd suffix ref' '
+	echo content >file &&
+	git add . &&
+	git commit -m one &&
+	git update-ref refs/for/refs/heads/master HEAD &&
+	echo content >>file &&
+	git commit -a -m two &&
+	echo content >>file &&
+	git commit -a -m three &&
+	git checkout HEAD^
+'
+
+test_expect_success 'suffix ref is ignored during fetch' '
+	git clone --bare file://"$PWD" suffix &&
+	echo three >expect &&
+	git --git-dir=suffix log -1 --format=%s refs/heads/master >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
1.7.8.13.g74677

^ permalink raw reply related

* Re: [PATCH 2/2] Makefile: optionally exclude code that needs Unix sockets
From: Junio C Hamano @ 2011-12-13  0:45 UTC (permalink / raw)
  To: Johannes Sixt, Jeff King; +Cc: git
In-Reply-To: <20111212213951.GB9754@sigill.intra.peff.net>

I'll queue a single patch that is a squash between 2/2 and Peff's test
updates between "credentials: add "cache" helper" and "strbuf: add
strbuf_add*_urlencode" in the series.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 28/51] refs.c: rename ref_array -> ref_dir
From: Junio C Hamano @ 2011-12-13  0:45 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-29-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
> This purely textual change is in preparation for storing references
> hierarchically, when the old ref_array structure will represent one
> "directory" of references.  Rename functions that deal with this
> structure analogously, and also rename the structure's "refs" member
> to "entries".
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs.c |  166 ++++++++++++++++++++++++++++++++--------------------------------
>  1 files changed, 83 insertions(+), 83 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index fe6d657..b74ef80 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -106,9 +106,9 @@ struct ref_value {
>  	unsigned char peeled[20];
>  };
>  
> -struct ref_array {
> +struct ref_dir {
>  	int nr, alloc;
> -	struct ref_entry **refs;
> +	struct ref_entry **entries;
>  };

The s/refs/entries/ renaming is a sane thing to do; on the other hand, I
somehow find the s/ref_array/ref_dir/ renaming is a short-sighted change
and undesirable, as you are essentially declaring that "if you use this
structure, the contents you store there are expected to be named
hierarchically", forbidding users that want to use a simple flat array.

BUT. That was an observation before I continued reading the remainder of
the series.

I think the above observation primarily come from my worries around the
extra ref stuff, which by nature does not fit well with the hierarchical
naming (they do not even have any meaningful names). Sorting or requiring
uniqueness among them do not make any sense, let alone cutting their names
in hierarchical boundaries.

As an alternative, it _might_ make sense to get rid of "add_extra_ref()"
API from refs.c and make it the responsibility for its users to add their
extra anchor points where they use for_each_ref() to find out all anchor
points in the history from the refs.c subsystem. If we go that route, I
fully agree that "s/ref_array/ref_dir/" renaming is the right thing to do,
as refs.c subsystem will _only_ handle the hierarchical ref namespace and
nothing else after such a change.

A bit of background refresher.

"git clone --reference=../another-repo.git $origin" is a typical user of
the add_extra_ref() API. It runs an equivalent of ls-remote against the
reference repository, and uses add_extra_ref() to cause its later call to
for_each_ref() to return the names of the objects at the tips of refs in
that reference repository.

During the object transfer (i.e. the equivalent of "git fetch") from the
origin, "clone" and "upload-pack" negotiate what objects need to be sent,
and this is done by the receiving end telling what it has to the sending
end, and the sending end uses this information to subtract what the
receiving end claims to already have from what is going to be sent.

And the enumeration of what the receiving end, especially when there is no
"reference", is done using for_each_ref(), as object transfer considers
everything reachable from the refs complete. add_extra_ref() is a _hack_
to cause for_each_ref() to include objects that actually do _not_ sit at
the tip of any of our refs.

When cloning, we do not have any ref, and after clone is done, we do not
want the refs the repository we are borrowing its objects from to be our
ref (after all, we may be using a local copy of linux-3.0 repository as
our reference but cloning from linux-2.4 history), and that is why this
hack was invented (in the old scripted version, we tentatively created
real refs in our ref namespace and removed them after clone finished).

In the case of "clone", these extra refs are registered with names taken
from the repository we are borrowing from, so they may be unique and
without conflict among them if you are borrowing from one repository, but
if you borrow from more than one repositories, it is likely that all of
them have "refs/heads/master" and there is no reason to expect that the
extra refs added with add_extra_ref() API have unique names. Worse, in
"receive-pack" that runs on the receiving end when you "git push" your
history, "refs" that exist in repositories that the receiving repository
is borrowing from (i.e. it has $GIT_DIR/objects/info/alternates) are
advertised with ".have" in their names (not using anything refs/* is to
avoid the pusher from mistakenly think there is such a ref that is
eligible for "matching push" logic), and these ".have" entries obviously
are not unique. The code also uses the same add_extra_ref() hack to cause
for_each_ref() to report them.

The removal of this hack, taking receive-pack as an example, would be to
stop using add_extra_ref() but instead to keep a local list of object
names that the code currently uses add_extra_ref() to keep track of, and
then modify write_head_info() to feed show_ref_cb() with these object
names in addition to the current for_each_ref() callback.

It may make the resulting code cleaner if we go that route, and if we were
to do so, I think the right place to do so in the series would be either
at the very beginning of the series (as part of the preliminary clean-up),
or immediately before "do_for_each_ref: correctly terminate while
processing extra_refs" (making that commit unnecessary, as after such a
fix, refs.c API won't have to worry about the extra_refs hack anymore).

What do you think?

^ permalink raw reply

* [PATCH 2/4] t5500: give fully-qualified refs to fetch-pack
From: Jeff King @ 2011-12-13  0:44 UTC (permalink / raw)
  To: git; +Cc: Kevin Sawicki
In-Reply-To: <20111213003925.GA28403@sigill.intra.peff.net>

The fetch-pack documentation is very clear that refs given
on the command line are to be full refs:

  <refs>...::
          The remote heads to update from. This is relative to
          $GIT_DIR (e.g. "HEAD", "refs/heads/master").  When
          unspecified, update from all heads the remote side has.

and this has been the case since fetch-pack was originally documented in
8b3d9dc ([PATCH] Documentation: clone/fetch/upload., 2005-07-14).

Let's follow our own documentation to set a good example,
and to avoid breaking when this restriction is enforced in
the next patch.

Signed-off-by: Jeff King <peff@peff.net>
---
This patch by itself isn't a big deal for backwards compatibility But
if we are violating the constraints in the documentation, then I worry
that others are, too. On the other hand, I seriously doubt anyone is
actually using fetch-pack directly anymore at all.

 t/t5500-fetch-pack.sh |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index bafcca7..9bf69e9 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -97,7 +97,7 @@ test_expect_success 'setup' '
 	git symbolic-ref HEAD refs/heads/B
 '
 
-pull_to_client 1st "B A" $((11*3))
+pull_to_client 1st "refs/heads/B refs/heads/A" $((11*3))
 
 test_expect_success 'post 1st pull setup' '
 	add A11 $A10 &&
@@ -110,9 +110,9 @@ test_expect_success 'post 1st pull setup' '
 	done
 '
 
-pull_to_client 2nd "B" $((64*3))
+pull_to_client 2nd "refs/heads/B" $((64*3))
 
-pull_to_client 3rd "A" $((1*3))
+pull_to_client 3rd "refs/heads/A" $((1*3))
 
 test_expect_success 'clone shallow' '
 	git clone --depth 2 "file://$(pwd)/." shallow
-- 
1.7.8.13.g74677

^ permalink raw reply related

* [PATCH 1/4] drop "match" parameter from get_remote_heads
From: Jeff King @ 2011-12-13  0:41 UTC (permalink / raw)
  To: git; +Cc: Kevin Sawicki
In-Reply-To: <20111213003925.GA28403@sigill.intra.peff.net>

The get_remote_heads function reads the list of remote refs
during git protocol session. It dates all the way back to
def88e9 (Commit first cut at "git-fetch-pack", 2005-07-04).
At that time, the idea was to come up with a list of refs we
were interested in, and then filter the list as we got it
from the remote side.

Later, 1baaae5 (Make maximal use of the remote refs,
2005-10-28) stopped filtering at the get_remote_heads layer,
letting us use the non-matching refs to find common history.

As a result, all callers now simply pass an empty match
list (and any future callers will want to do the same). So
let's drop these now-useless parameters.

Signed-off-by: Jeff King <peff@peff.net>
---
This one isn't necessary for the bugfix, but since it is the only other
caller of path_match besides fetch-pack, it gives us freedom to modify
or get rid of path_match later.

 builtin/fetch-pack.c |    2 +-
 builtin/send-pack.c  |    3 +--
 cache.h              |    2 +-
 connect.c            |    3 ---
 remote-curl.c        |    2 +-
 transport.c          |    7 +++----
 6 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index c6bc8eb..46688dc 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -976,7 +976,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 				   args.verbose ? CONNECT_VERBOSE : 0);
 	}
 
-	get_remote_heads(fd[0], &ref, 0, NULL, 0, NULL);
+	get_remote_heads(fd[0], &ref, 0, NULL);
 
 	ref = fetch_pack(&args, fd, conn, ref, dest,
 		nr_heads, heads, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index e0b8030..cd1115f 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -494,8 +494,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 
 	memset(&extra_have, 0, sizeof(extra_have));
 
-	get_remote_heads(fd[0], &remote_refs, 0, NULL, REF_NORMAL,
-			 &extra_have);
+	get_remote_heads(fd[0], &remote_refs, REF_NORMAL, &extra_have);
 
 	transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/cache.h b/cache.h
index 8c98d05..408e880 100644
--- a/cache.h
+++ b/cache.h
@@ -1034,7 +1034,7 @@ struct extra_have_objects {
 	int nr, alloc;
 	unsigned char (*array)[20];
 };
-extern struct ref **get_remote_heads(int in, struct ref **list, int nr_match, char **match, unsigned int flags, struct extra_have_objects *);
+extern struct ref **get_remote_heads(int in, struct ref **list, unsigned int flags, struct extra_have_objects *);
 extern int server_supports(const char *feature);
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path);
diff --git a/connect.c b/connect.c
index 51990fa..48df90b 100644
--- a/connect.c
+++ b/connect.c
@@ -53,7 +53,6 @@ static void add_extra_have(struct extra_have_objects *extra, unsigned char *sha1
  * Read all the refs from the other end
  */
 struct ref **get_remote_heads(int in, struct ref **list,
-			      int nr_match, char **match,
 			      unsigned int flags,
 			      struct extra_have_objects *extra_have)
 {
@@ -92,8 +91,6 @@ struct ref **get_remote_heads(int in, struct ref **list,
 
 		if (!check_ref(name, name_len, flags))
 			continue;
-		if (nr_match && !path_match(name, nr_match, match))
-			continue;
 		ref = alloc_ref(buffer + 41);
 		hashcpy(ref->old_sha1, old_sha1);
 		*list = ref;
diff --git a/remote-curl.c b/remote-curl.c
index 0e720ee..94dc488 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -200,7 +200,7 @@ static struct ref *parse_git_refs(struct discovery *heads)
 
 	if (start_async(&async))
 		die("cannot start thread to parse advertised refs");
-	get_remote_heads(async.out, &list, 0, NULL, 0, NULL);
+	get_remote_heads(async.out, &list, 0, NULL);
 	close(async.out);
 	if (finish_async(&async))
 		die("ref parsing thread failed");
diff --git a/transport.c b/transport.c
index 51814b5..c2245d4 100644
--- a/transport.c
+++ b/transport.c
@@ -502,7 +502,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	struct ref *refs;
 
 	connect_setup(transport, for_push, 0);
-	get_remote_heads(data->fd[0], &refs, 0, NULL,
+	get_remote_heads(data->fd[0], &refs,
 			 for_push ? REF_NORMAL : 0, &data->extra_have);
 	data->got_remote_heads = 1;
 
@@ -537,7 +537,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	if (!data->got_remote_heads) {
 		connect_setup(transport, 0, 0);
-		get_remote_heads(data->fd[0], &refs_tmp, 0, NULL, 0, NULL);
+		get_remote_heads(data->fd[0], &refs_tmp, 0, NULL);
 		data->got_remote_heads = 1;
 	}
 
@@ -772,8 +772,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		struct ref *tmp_refs;
 		connect_setup(transport, 1, 0);
 
-		get_remote_heads(data->fd[0], &tmp_refs, 0, NULL, REF_NORMAL,
-				 NULL);
+		get_remote_heads(data->fd[0], &tmp_refs, REF_NORMAL, NULL);
 		data->got_remote_heads = 1;
 	}
 
-- 
1.7.8.13.g74677

^ permalink raw reply related

* [PATCH 0/4] exact ref-matching for fetch-pack
From: Jeff King @ 2011-12-13  0:39 UTC (permalink / raw)
  To: git; +Cc: Kevin Sawicki

I was passed a bug report of clone failing on a repo with an oddly named
ref in it: "refs/for/refs/heads/master". This is probably an error
somebody made while pushing and should simply be deleted. However, it's
curious that "clone" would fail on this, since it should be ignoring
everything outside the "refs/heads/*" refspec.

It turns out that during the fetch-pack phase, we accidentally ask for
the sha1 of "refs/for/refs/heads/master" instead of that of
"refs/heads/master". This can cause "clone" to fail, as we may not have
asked for the object pointed to by "refs/heads/master" at all. This
behavior is due to some questionable suffix-matching deep inside
fetch-pack. The code in question dates back to the beginning of git; I
think it simply hasn't triggered much because the refname you need to
have is so specific (you must be asking to fetch a ref "refs/X", have
"refs/Y/X" on the remote, and "Y" must come alphabetically before "X",
since we match refs in alphabetical order).

As I said, this is probably just a silly one-off error. But I could
imagine this happening legitimately. Here are two possible scenarios:

  1. You are mirroring some meta-information about your refs in a
     parallel namespace. E.g., "refs/descriptions/refs/heads/master"
     contains information about "refs/heads/master".

  2. One of the things we've discussed for future git is mirroring the
     remote ref namespaces more literally. E.g., having something like
     "refs/remotes/origin/refs/heads/master". That won't actually
     trigger this bug because "heads" is alphabetically before
     "remotes". But "tags", for example, is not.

This could possibly be considered a behavior regression for users of the
fetch-pack command line. See patches 2 and 3 for details.

  [1/4]: drop "match" parameter from get_remote_heads
  [2/4]: t5500: give fully-qualified refs to fetch-pack
  [3/4]: fetch-pack: match refs exactly
  [4/4]: connect.c: drop path_match function

-Peff

^ permalink raw reply

* Re: [PATCH v2 30/51] sort_ref_dir(): do not sort if already sorted
From: Junio C Hamano @ 2011-12-12 23:26 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-31-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
> Keep track of how many entries in a ref_dir are already sorted.  In
> sort_ref_dir(), only call qsort() if the dir contains unsorted
> entries.
>
> We could store a binary "sorted" value instead of an integer, but
> storing the number of sorted entries leaves the way open for a couple
> of possible future optimizations:
>
> * In sort_ref_dir(), sort *only* the unsorted entries, then merge them
>   with the sorted entries.  This should be faster if most of the
>   entries are already sorted.
>
> * Teach search_ref_dir() to do a binary search of any sorted entries,
>   and if unsuccessful do a linear search of any unsorted entries.
>   This would avoid the need to sort the list every time that
>   search_ref_dir() is called, and (given some intelligence about how
>   often to sort) could significantly improve the speed in certain
>   hypothetical usage patterns.

The elegance (which I think is the right word for this case as it is both
simple and clever) of the above strategy is not fully appreciated unless
you explain that your plan for "lazily add new entries and keep the
directory unsorted" case is to append them at the end (hence preserving
the property that entries[0..sorted] is always sorted).

If you reword the comment "How many of the entries..." to something like
"entries at 0 up to this index in the array are sorted", you could express
that idea without changing the log message that much, I guess.

> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs.c |   29 ++++++++++++++++++++++++-----
>  1 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index ccd2806..ce141ea 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -108,6 +108,10 @@ struct ref_value {
>  
>  struct ref_dir {
>  	int nr, alloc;
> +
> +	/* How many of the entries in this directory are sorted? */
> +	int sorted;
> +
>  	struct ref_entry **entries;
>  };

^ permalink raw reply

* Re: [PATCH v2 20/51] repack_without_ref(): reimplement using do_for_each_ref_in_array()
From: Junio C Hamano @ 2011-12-12 22:44 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-21-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> +static int repack_without_ref_fn(const char *refname, const unsigned char *sha1,
> +				 int flags, void *cb_data)
> +{
> +	struct repack_without_ref_sb *data = cb_data;
> +	char line[PATH_MAX + 100];
> +	int len;
> +
> +	if (!strcmp(data->refname, refname))
> +		return 0;
> +	len = snprintf(line, sizeof(line), "%s %s\n",
> +		       sha1_to_hex(sha1), refname);
> +	/* this should not happen but just being defensive */
> +	if (len > sizeof(line))
> +		die("too long a refname '%s'", refname);

Perhaps strbuf would be easier to use for things like this.

^ permalink raw reply

* Re: [PATCH v2 17/51] do_for_each_ref(): correctly terminate while processesing extra_refs
From: Junio C Hamano @ 2011-12-12 22:41 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-18-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> From: Michael Haggerty <mhagger@alum.mit.edu>
>
> If the user-supplied function returns a nonzero value while processing
> extra_refs, terminate without processing the rest of the list.
>
> This probably has no practical importance, but makes the handling of
> extra_refs a little bit more consistent with the handling of other
> refs.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  refs.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 579e4c3..fb6fe84 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -711,8 +711,11 @@ static int do_for_each_ref(const char *submodule, const char *base, each_ref_fn
>  
>  	struct ref_array *extra = &extra_refs;
>  
> -	for (i = 0; i < extra->nr; i++)
> +	for (i = 0; i < extra->nr; i++) {
>  		retval = do_one_ref(base, fn, trim, flags, cb_data, extra->refs[i]);
> +		if (retval)
> +			goto end_each;
> +	}

Makes sense and everything up to this point looks sane.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 08/51] is_dup_ref(): extract function from sort_ref_array()
From: Junio C Hamano @ 2011-12-12 22:33 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-9-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> +/*
> + * Emit a warning and return true iff ref1 and ref2 have the same name
> + * and the same sha1.  Die if they have the same name but different
> + * sha1s.
> + */
> +static int is_dup_ref(const struct ref_entry *ref1, const struct ref_entry *ref2)
> +{
> +	if (!strcmp(ref1->name, ref2->name)) {
> +		/* Duplicate name; make sure that the SHA1s match: */
> +		if (hashcmp(ref1->sha1, ref2->sha1))
> +			die("Duplicated ref, and SHA1s don't match: %s",
> +			    ref1->name);
> +		warning("Duplicated ref: %s", ref1->name);
> +		return 1;
> +	} else {
> +		return 0;
> +	}
> +}

At this step, is_dup_ref() is only called from sort_ref_array() which in
turn is only called on either a collection of loose or packed refs, so
warning is warranted. IOW I do not see anything wrong with _this_ patch.

Later in the series, however, the collection of extra refs seems to be
shoved into a phoney "direntry" and made to go through the same add_ref()
machinery that uses find_containing_direntry() which in turn calls
search_ref_dir() that smells like a definite no-no.

^ permalink raw reply

* Re: [PATCH v2 02/51] refs: rename "refname" variables
From: Junio C Hamano @ 2011-12-13  0:37 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-3-git-send-email-mhagger@alum.mit.edu>

This added otherwise unnecessary conflicts with topics in flight, but the
conflicts were nothing Git and a human couldn't manage, and overall it is
not a bad readability change for the longer-term code health.

Thanks.

^ permalink raw reply

* Re: [PATCH 3/4] Guard memory overwriting in resolve_ref's static buffer
From: Junio C Hamano @ 2011-12-13  0:37 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Jonathan Nieder
In-Reply-To: <1323688832-32016-3-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:

> diff --git a/cache.h b/cache.h
> index 4887a3e..ba5e911 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -865,7 +865,8 @@ extern int read_ref(const char *filename, unsigned char *sha1);
>   *
>   * errno is sometimes set on errors, but not always.
>   */
> -extern const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *flag);
> +#define resolve_ref(ref, sha1, reading, flag) resolve_ref_real(ref, sha1, reading, flag, __FILE__, __LINE__)
> +extern const char *resolve_ref_real(const char *ref, unsigned char *sha1, int reading, int *flag, const char *file, int line);
>  extern char *resolve_refdup(const char *ref, unsigned char *sha1, int reading, int *flag);
>  
>  extern int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref);

Eek.

> diff --git a/wrapper.c b/wrapper.c
> index 85f09df..02b6c81 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -60,6 +60,33 @@ void *xmallocz(size_t size)
>  	return ret;
>  }
>  
> +void *xmalloc_mmap(size_t size, const char *file, int line)
> +{
> +	struct alloc_header *block;
> +	size += offsetof(struct alloc_header,buf);
> +	block = mmap(NULL, size, PROT_READ | PROT_WRITE,
> +		   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> +	if (block == (struct alloc_header*)-1)
> +		die_errno("unable to mmap %lu bytes anonymously",
> +			  (unsigned long)size);
> +
> +	block->file = file;
> +	block->line = line;
> +	block->size = size;
> +	return block->buf;
> +}
> +

Double eek. A refname is ordinarily way too small than a page and we spend
a full page every time resolve_ref_unsafe() is called. That is acceptable
only for debugging build, but then having to patch the main codepath like
the following, renaming the "real" implementation of a rather important
function is not acceptable in a non-debugging build.

> diff --git a/refs.c b/refs.c
> index 8ffb32f..cf8dfcc 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -497,12 +497,21 @@ static int get_packed_ref(const char *ref, unsigned char *sha1)
>  	return -1;
>  }
>  
> -const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *flag)
> +const char *resolve_ref_real(const char *ref, unsigned char *sha1,
> +			     int reading, int *flag, const char *file, int line)
>  {
>  	int depth = MAXDEPTH;
>  	ssize_t len;
>  	char buffer[256];
> -	static char ref_buffer[256];
> +	static char real_ref_buffer[256];
> +	static char *ref_buffer;
> +
> +	if (!ref_buffer && !getenv("GIT_DEBUG_MEMCHECK"))
> +		ref_buffer = real_ref_buffer;
> +	if (ref_buffer != real_ref_buffer) {
> +		xfree_mmap(ref_buffer);
> +		ref_buffer = xmalloc_mmap(256, file, line);
> +	}

I'll drop 3/4 from the series, adjust 4/4, and queue the result as a
three-patch series for now.

Thanks.

^ permalink raw reply

* Re: Debugging git-commit slowness on a large repo
From: Joshua Redstone @ 2011-12-13  0:15 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy, Carlos Martín Nieto, Tomas Carnecky,
	Junio C Hamano
  Cc: git@vger.kernel.org
In-Reply-To: <CB069308.2C9DD%joshua.redstone@fb.com>

Sorry for the poor formatting of the stack trace.

I've written two scripts to reproduce the slow commit behavior that I see.
 I've posted both to:
   https://gist.github.com/1469760

To repro, first create a dir with lots of files (it defaults to creating 1
million files in 1000 dirs):

$ loadGen.py --baseDir=./bigdir

then, run the simulator scripts to generate and commit a series of small
changes to the repo:

$ git reset --hard HEAD && simulate.py ./bigdir git

The git reset is to clean up any cruft left over from a previous partial
invocation of simulate.py

Note that loadGen.py defaults to creating 1 million files and committing
them in one commit.  With a flash drive this took < 30 min, and subsequent
small commits in simulate.py took about 6 seconds.  With a hard-drive,
it's taking > 1hr (still waiting for it to finish).

Cheers,
Josh


On 12/8/11 4:17 PM, "Joshua Redstone" <joshua.redstone@fb.com> wrote:

>Btw, I also tried doing some very poor-man's profiling on "git commit"
>without any of the readtree/writetree/updateindex commands.
>
>Around 50% of the time was in (bottom few frames may have varied)
>
>#1  0x00000000004c467e in find_pack_entry (sha1=0x1475a44 ,
>e=0x7fff2621f070) at sha1_file.c:2027
>#2  0x00000000004c57b0 in has_sha1_file (sha1=0x7fe2cd9c7900 "00") at
>sha1_file.c:2567  
>                  
>                 
>#3  0x000000000046e4af in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:333        
>                  
>                  
>            
>#4  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#5  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#6  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#7  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#8  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#9  0x000000000046e869 in cache_tree_update (it=<value optimized out>,
>cache=<value optimized out>, entries=dwarf2_read_address: Corrupted DWARF
>expression.       
>                 
>) at cache-tree.c:379
>                  
>                  
>            
>#10 0x000000000041cade in prepare_to_commit (index_file=0x781740
>".git/index", prefix=<value optimized out>, current_head=<value optimized
>out>, s=0x7fff26220d00, author_ident=<value optimized out>) at
>builtin/commit.c:866
>#11 0x000000000041d891 in cmd_commit (argc=0, argv=0x7fff262213a0,
>prefix=0x0) at builtin/commit.c:1407
>                  
>                  
>#12 0x0000000000404bf7 in handle_internal_command (argc=4,
>argv=0x7fff262213a0) at git.c:308
>                  
>                  
>#13 0x0000000000404e2f in main (argc=4, argv=0x7fff262213a0) at git.c:512
>                  
>                  
>            
> 
>
>
>And 30% of the time was in:
>
>#0  0x00000034af2c34a5 in _lxstat () from /lib64/libc.so.6
>                  
>                  
>            
>#1  0x00000000004abe0f in refresh_cache_ent (istate=0x780940,
>ce=0x7f8462a34e40, options=0, err=0x7fff6dd9f588) at
>/usr/include/sys/stat.h:443
>                  
>#2  0x00000000004ac1a0 in refresh_index (istate=0x780940, flags=<value
>optimized out>, pathspec=<value optimized out>, seen=<value optimized
>out>, header_msg=0x0) at read-cache.c:1133
>                  
>#3  0x000000000041b60a in refresh_cache_or_die (refresh_flags=<value
>optimized out>) at builtin/commit.c:331
>                  
>                  
>#4  0x000000000041bc39 in prepare_index (argc=0, argv=0x7fff6dda0310,
>prefix=0x0, current_head=<value optimized out>, is_status=<value optimized
>out>) at builtin/commit.c:414
>                 
>#5  0x000000000041d878 in cmd_commit (argc=0, argv=0x7fff6dda0310,
>prefix=0x0) at builtin/commit.c:1403
>                  
>                  
>  
>
>
>Josh
>
>
>On 12/8/11 4:09 PM, "Joshua Redstone" <joshua.redstone@fb.com> wrote:
>
>>On 12/7/11 5:39 PM, "Nguyen Thai Ngoc Duy" <pclouds@gmail.com> wrote:
>>
>>>On Thu, Dec 8, 2011 at 5:48 AM, Joshua Redstone <joshua.redstone@fb.com>
>>>wrote:
>>>> Hi Duy,
>>>> Thanks for the documentation link.
>>>>
>>>> git ls-files shows 100k files, which matches # of files in the working
>>>> tree ('find . -type f -print | wc -l').
>>>
>>>Any chance you can split it into smaller repositories, or remove files
>>>from working directory (e.g. if you store logs, you don't have to keep
>>>logs from all time in working directory, they can be retrieved from
>>>history).
>>
>>It's not really feasible to split it into smaller repositories.  In fact,
>>we're expecting it to grow between 3x and 5x in number of files and
>>number
>>of commits.
>>
>>>
>>>> I added a 'git read-tree HEAD' before the git-add, and a 'git
>>>>write-tree'
>>>> after the add.  With that, the commit time slowed down to 8 seconds
>>>>per
>>>> commit, plus 4 more seconds for the read-tree/add/write-tree ops.  The
>>>> read-tree/add/write-tree each took about a second.
>>>
>>>read-tree destroys stat info in index, refreshing 100k entries in
>>>index in this case may take some time. Try this to see if commit time
>>>reduces and how much time update-index takes
>>>
>>>read-tree HEAD
>>>update-index --refresh
>>>add ....
>>>write-tree
>>>commit -q
>>
>>I added the "update-index --refresh" and the time for commit became more
>>like 0.6 seconds.
>>In this setup: read-tree takes ~2 seconds, update-index takes ~8 seconds,
>>git-add takes 1 to 4 seconds, and write-tree takes less than 1 second.
>>
>>>
>>>> As an experiment, I also tried removing the 'git read-tree' and just
>>>> having the git-write-tree.  That sped up commits to 0.6 seconds, but
>>>>the
>>>> overall time for add/write-tree/commit was still 3 to 6 seconds.
>>>
>>>overall time is not really important because we duplicate work here
>>>(write-tree is done as part of commit again). What I'm trying to do is
>>>to determine how much time each operation in commit may take.
>>>-- 
>>>Duy
>>
>

^ permalink raw reply

* Re: [RFC/PATCH] add update to branch support for "floating submodules"
From: Brandon Casey @ 2011-12-13  0:12 UTC (permalink / raw)
  To: Leif Gruenwoldt; +Cc: Junio C Hamano, git
In-Reply-To: <CALFF=ZQKRgx_AodBQH17T9cSe_JFtoKie7DoMMfkTXCyCFospw@mail.gmail.com>

On Sat, Dec 10, 2011 at 9:27 AM, Leif Gruenwoldt <leifer@gmail.com> wrote:
> On Sat, Dec 10, 2011 at 1:30 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> So that use case does not sound like a good rationale to require addition
>> of floating submodules.
>
> Ok I will try another scenario :)
>
> Imagine again products A, B and C and a common library. The products are in
> a stable state of development and track a stable branch of the common lib.
> Then imagine an important security fix gets made to the common library. On
> the next pull of products A, B, and C they get this fix for free
> because they were
> floating. They didn't need to communicate with the maintainer of the common
> repo to know this. In fact they don't really care. They just want the
> latest stable
> code for that release branch.
>
> This is how package management on many linux systems works. Dependencies
> get updated and all products reap the benefit (or catastrophe) automatically.

What happens if the update to the floating submodule introduces a bug
that prevents A, B, and/or C from building/running correctly?  If the
submodule states are not recorded, how would the previously working
submodule version be restored so that development on A, B, and C,
could proceed?  I guess each developer could manually checkout @{1} in
the submodule in their working directory, though that wouldn't work
for a new clone, and it's not very elegant.

I presume that if A, B, and C, do not care to know exactly what was
fixed in the common library, they probably do not care to investigate
the breakage in that repo either.  Or, they may not have the
expertise.

-Brandon

^ permalink raw reply

* [PATCH] Update documentation for stripspace
From: Conrad Irwin @ 2011-12-12 23:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Conrad Irwin
In-Reply-To: <7vwra1z7bg.fsf@alter.siamese.dyndns.org>

2011/12/12 Junio C Hamano <gitster@pobox.com>:
> Conrad Irwin <conrad.irwin@gmail.com> writes:
>> I've moved the *NOTE* into a SEE ALSO section where I think it reads
>> less opinionatedly ― is that better?
>
> I think it looks a lot worse.
[snip]
>
> The new example section looks good. Perhaps we can just drop the extra SEE
> ALSO and resurrect the *NOTE* from your v1 patch.
>
> Thanks.

Done.

Conrad

---8<---

Tell the user what this command is intended for, and expand the
description of what it does.

Signed-off-by: Conrad Irwin <conrad.irwin@gmail.com>
---
 Documentation/git-stripspace.txt |   69 ++++++++++++++++++++++++++++++++++---
 builtin/stripspace.c             |    2 +-
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-stripspace.txt b/Documentation/git-stripspace.txt
index b78f031..a548444 100644
--- a/Documentation/git-stripspace.txt
+++ b/Documentation/git-stripspace.txt
@@ -3,26 +3,83 @@ git-stripspace(1)
 
 NAME
 ----
-git-stripspace - Filter out empty lines
+git-stripspace - Remove unnecessary whitespace
 
 
 SYNOPSIS
 --------
 [verse]
-'git stripspace' [-s | --strip-comments] < <stream>
+'git stripspace' [-s | --strip-comments] < input
 
 DESCRIPTION
 -----------
-Remove multiple empty lines, and empty lines at beginning and end.
+
+Clean the input in the manner used by 'git' for text such as commit
+messages, notes, tags and branch descriptions.
+
+With no arguments, this will:
+
+- remove trailing whitespace from all lines
+- collapse multiple consecutive empty lines into one empty line
+- remove empty lines from the beginning and end of the input
+- add a missing '\n' to the last line if necessary.
+
+In the case where the input consists entirely of whitespace characters, no
+output will be produced.
+
+*NOTE*: This is intended for cleaning metadata, prefer the `--whitespace=fix`
+mode of linkgit:git-apply[1] for correcting whitespace of patches or files in
+the repository.
 
 OPTIONS
 -------
 -s::
 --strip-comments::
-	In addition to empty lines, also strip lines starting with '#'.
+	Skip and remove all lines starting with '#'.
+
+EXAMPLES
+--------
+
+Given the following noisy input with '$' indicating the end of a line:
 
-<stream>::
-	Byte stream to act on.
+--------
+|A brief introduction   $
+|   $
+|$
+|A new paragraph$
+|# with a commented-out line    $
+|explaining lots of stuff.$
+|$
+|# An old paragraph, also commented-out. $
+|      $
+|The end.$
+|  $
+---------
+
+Use 'git stripspace' with no arguments to obtain:
+
+--------
+|A brief introduction$
+|$
+|A new paragraph$
+|# with a commented-out line$
+|explaining lots of stuff.$
+|$
+|# An old paragraph, also commented-out.$
+|$
+|The end.$
+---------
+
+Use 'git stripspace --strip-comments' to obtain:
+
+--------
+|A brief introduction$
+|$
+|A new paragraph$
+|explaining lots of stuff.$
+|$
+|The end.$
+---------
 
 GIT
 ---
diff --git a/builtin/stripspace.c b/builtin/stripspace.c
index 1288ffc..f16986c 100644
--- a/builtin/stripspace.c
+++ b/builtin/stripspace.c
@@ -75,7 +75,7 @@ int cmd_stripspace(int argc, const char **argv, const char *prefix)
 				!strcmp(argv[1], "--strip-comments")))
 		strip_comments = 1;
 	else if (argc > 1)
-		usage("git stripspace [-s | --strip-comments] < <stream>");
+		usage("git stripspace [-s | --strip-comments] < input");
 
 	if (strbuf_read(&buf, 0, 1024) < 0)
 		die_errno("could not read the input");
-- 
1.7.8.164.g00d7e

^ permalink raw reply related

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
From: Phil Hord @ 2011-12-12 23:50 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git
In-Reply-To: <4EE6805D.7020708@web.de>

On Mon, Dec 12, 2011 at 5:29 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 12.12.2011 22:16, schrieb Phil Hord:
>> On Tue, Oct 18, 2011 at 4:58 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>>> Am 18.10.2011 00:33, schrieb Junio C Hamano:
>>>> We could even declare that the gitlink for such a
>>>> submodule should record 0{40} SHA-1 in the superproject, but I do not
>>>> think that is necessary.
>>>
>>> Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
>>> should do nicely. And that would avoid referencing a non-existing commit
>>> in case you later want to turn a floating submodule into an exact one.
>>
>>
>> I'm sorry I missed this comment before.
>>
>> I hope we can allow storing the actual gitlink in the superproject for
>> each commit even when we're using floating submodules.
>
> I think you misread my statement, I was just talking about the initial
> commit containing the newly added submodule, not any subsequent ones.
> Floating makes differences between the original SHA-1 and the current
> tip of the branch invisible, so there is nothing to commit.
>
>>  I thought-experimented with this a bit last year and came to the
>> conclusion that I should be able to 'float' to tips (developer
>> convenience) and also to store the SHA-1 of each gitlink through
>> history (automated maybe; as-needed).
>
> Which means that after "git submodule update" floated a submodule branch
> further, you would have to commit that in the superproject.

Sadly, yes.  Currently I have my CI-server do this for me after it
verifies each new submodule commit is able to build successfully.

>> The problem with "float-only" is that it loses history so, for
>> example, git-bisect doesn't work.
>
> Yep. And different developers can have the same superproject commit
> checked out but their submodules can be quite different.

>> The problem with "float + gitlinks", of course, is that it looks like
>> "not floating" to the developers (git-status is dirty unless
>> overridden, etc.)
>
> Yeah. But what if each "git submodule update" would update the tip of
> the submodule branch and add that to the superproject? You could follow
> a tip but still produce reproducible trees.

Yes, and that's what I want.

Not what it sounded like was being suggested before, which (to my
eyes) implied that the submodule gitlinks were useless noise.

Phil

^ permalink raw reply

* Re: [PATCH v3 0/3] grep attributes and multithreading
From: Junio C Hamano @ 2011-12-12 23:44 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, René Scharfe, Jeff King, Eric Herman
In-Reply-To: <cover.1323723759.git.trast@student.ethz.ch>

Thanks; sign-off?

^ permalink raw reply

* Re: Auto update submodules after merge and reset
From: Phil Hord @ 2011-12-12 23:43 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Andreas T.Auer, git
In-Reply-To: <4EE682A3.8070704@web.de>

On Mon, Dec 12, 2011 at 5:39 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 11.12.2011 22:15, schrieb Andreas T.Auer:
>> In http://thread.gmane.org/gmane.comp.version-control.git/183837 was discussed whether the gitlink in the superproject should be set to all-zero if updates follow the tip or maybe use the SHA1 of the commit when the submodule was added. I think the gitlink should be updated everytime when a new commit in the superproject is created.
>
> Nope, only when "git submodule update" is run. Otherwise you'll spray the
> history with submodule updates totally unrelated to the commits in the
> superproject, which is rather confusing.


And this is why my superproject is a makefile, a .gitmodules file and
a bunch of gitlinks.  We only use it to track the advancement of
submodule activity.

So yes, I want my superproject's gitlinks to update automatically.  I
just don't know a smart way to make that happen.

Phil

^ permalink raw reply

* Re: [PATCH v2] Update documentation for stripspace
From: Junio C Hamano @ 2011-12-12 23:41 UTC (permalink / raw)
  To: Conrad Irwin; +Cc: Junio C Hamano, git
In-Reply-To: <1323728909-7847-1-git-send-email-conrad.irwin@gmail.com>

Conrad Irwin <conrad.irwin@gmail.com> writes:

> ...
> I've moved the *NOTE* into a SEE ALSO section where I think it reads
> less opinionatedly ― is that better?

I think it looks a lot worse.

At least your original hinted that some people may confuse the two and the
NOTE was there for such people; other people who would not even dream of
such a confusion won't find the existence of the note a "Huh?". But the
updated patch with a link in SEE ALSO section without any explanation
would be a definite "Huh?" for those of us who find that stripspace does
not have anything to do with what "apply --whitespace=fix" does.

The new example section looks good. Perhaps we can just drop the extra SEE
ALSO and resurrect the *NOTE* from your v1 patch.

Thanks.

^ permalink raw reply

* Re: Breakage (?) in configure and git_vsnprintf()
From: Brandon Casey @ 2011-12-12 23:25 UTC (permalink / raw)
  To: Jeff King
  Cc: Andreas Schwab, Michael Haggerty, Junio C Hamano,
	git discussion list, Michal Rokos
In-Reply-To: <20111212142550.GA23875@sigill.intra.peff.net>


FYI: all tests passed on IRIX (not that I suspected otherwise).

-Brandon

On 12/12/2011 08:25 AM, Jeff King wrote:
> On Mon, Dec 12, 2011 at 10:30:27AM +0100, Andreas Schwab wrote:
> 
>> Jeff King <peff@peff.net> writes:
>>
>>>  	if (maxsize > 0) {
>>> -		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, ap);
>>> +		va_copy(cp, ap);
>>> +		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, cp);
>>
>> You also need to call va_end on the copy.
> 
> Silly me. Thanks for catching.
> 
> Junio, here's a corrected version.
> 
> -- >8 --
> Subject: [PATCH] compat/snprintf: don't look at va_list twice
> 
> If you define SNPRINTF_RETURNS_BOGUS, we use a special
> git_vsnprintf wrapper assumes that vsnprintf returns "-1"
> instead of the number of characters that you would need to
> store the result.
> 
> To do this, it invokes vsnprintf multiple times, growing a
> heap buffer until we have enough space to hold the result.
> However, this means we evaluate the va_list parameter
> multiple times, which is generally a bad thing (it may be
> modified by calls to vsnprintf, yielding undefined
> behavior).
> 
> Instead, we must va_copy it and hand the copy to vsnprintf,
> so we always have a pristine va_list.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  compat/snprintf.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/compat/snprintf.c b/compat/snprintf.c
> index e1e0e75..42ea1ac 100644
> --- a/compat/snprintf.c
> +++ b/compat/snprintf.c
> @@ -19,11 +19,14 @@
>  #undef vsnprintf
>  int git_vsnprintf(char *str, size_t maxsize, const char *format, va_list ap)
>  {
> +	va_list cp;
>  	char *s;
>  	int ret = -1;
>  
>  	if (maxsize > 0) {
> -		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, ap);
> +		va_copy(cp, ap);
> +		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, cp);
> +		va_end(cp);
>  		if (ret == maxsize-1)
>  			ret = -1;
>  		/* Windows does not NUL-terminate if result fills buffer */
> @@ -42,7 +45,9 @@ int git_vsnprintf(char *str, size_t maxsize, const char *format, va_list ap)
>  		if (! str)
>  			break;
>  		s = str;
> -		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, ap);
> +		va_copy(cp, ap);
> +		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, cp);
> +		va_end(cp);
>  		if (ret == maxsize-1)
>  			ret = -1;
>  	}

^ permalink raw reply

* Re: [PATCH 2/2] Makefile: optionally exclude code that needs Unix sockets
From: Junio C Hamano @ 2011-12-12 23:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, git
In-Reply-To: <20111212213951.GB9754@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> In theory we should also disable the documentation for credential-cache.
> But that means surgery not only to Documentation/Makefile, but figuring
> out how to pass the flags down to the actual asciidoc process (since
> gitcredentials(7) mentions it in the text). Certainly possible, but I
> don't know if it's worth the effort or not.

I do not think it matters that much. We've been shipping documentation for
stuff like remote archiver and daemon without conditional compilation, no?

^ permalink raw reply

* Re: [PATCH 1/3] test-terminal: give the child an empty stdin TTY
From: Jonathan Nieder @ 2011-12-12 23:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Thomas Rast, git, Michael Haggerty
In-Reply-To: <20111212191602.GA14061@sigill.intra.peff.net>

Jeff King wrote:

> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -198,6 +198,9 @@ else
>  	exec 4>/dev/null 3>/dev/null
>  fi
>
> +exec 6<&0
> +exec 0</dev/null
> +
>  test_failure=0
>  test_count=0
>  test_fixed=0

Independently of the changes to explicitly pass HEAD to "git shortlog"
in t7006 (we should) and make test_terminal make stdin into a tty, too
(if tests still make sure "git log --stdin" launches a pager with
stdin not a tty, then it should be a fine change), this looks good to
me.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox