Git development
 help / color / mirror / Atom feed
* Re: [PATCH] add post-fetch hook
From: Junio C Hamano @ 2011-12-25  3:13 UTC (permalink / raw)
  To: Joey Hess; +Cc: git
In-Reply-To: <20111224234212.GA21533@gnu.kitenet.net>

Joey Hess <joey@kitenet.net> writes:

> The post-fetch hook is fed on its stdin all refs that were newly fetched.
> It is not allowed to abort the fetch (or pull), but can modify what
> was fetched or take other actions.
>
> One example use of this hook is to automatically merge certain remote
> branches into a local branch. Another is to update a local cache
> (such as a search index) with the fetched refs. No other hook is run
> near fetch time, except for post-merge, which doesn't always run after a
> fetch, which is why this additional hook is useful.
>
> Signed-off-by: Joey Hess <joey@kitenet.net>

A typical "'git pull' invokes 'git fetch' and then lets 'git merge' (or
'git rebase') integrate the work on the current branch with what was
fetched" sequence goes like this:

 - 'git fetch':
  . grabs the necessary objects from the remote;
  . decides what remote tracking branches are updated to point
    at what objects, and what updates are to be denied;
  . updates remote tracking branches accordingly; and
  . writes $GIT_DIR/FETCH_HEAD to communicate what have been fetched
    and what are to be merged.

 - 'git merge':
  . reads $GIT_DIR/FETCH_HEAD to learn what commits to be merged; and
  . merges the commits to the current branch.

Even though we do not add arbitrary hooks on the client side that could
easily be implemented by wrapping the client side commands (i.e. you could
implement "git myfetch" that runs "git fetch" followed by whatever script
that mucks with the result of the fetch) in general, I can see that it
would be useful to have a hook that can tweak the result of the fetch run
inside of the "git pull", because you cannot tell "git pull" to run "git
myfetch" instead of "git fetch".

Because the sequence of "git fetch" followed by "git merge", both commands
issued by the end user, should be equivalent to "git pull" from an end
user's point of view, the hook must be called from near the end of "git
fetch" if we were to have such a hook that tweaks the result of the fetch
inside "pull". IOW, the implementation, even though logically it belongs
to "pull", has to be inside "fetch", not "pull".

In that sense, I am not fundamentally opposed to the idea of adding a post
fetch hook that allows tweaking of the result.

*HOWEVER*

If we _were_ to sanction the use of the hook to tweak the result, I do not
want to see it implemented as an ad-hoc hack that tells the hook writers
that it is _entirely_ their responsiblity to update the remote tracking
branches from what it fetched, and also update $GIT_DIR/FETCH_HEAD to
maintain consistency between these two places.

A very cursory look at the patch tells me that there are a few problems
with it.  It does not seem to affect what will go to $GIT_DIR/FETCH_HEAD
at all, and hence it does not have any way to affect the result of the
fetch that does not store it to any of our remote tracking branches.

> The #1 point of confusion for git-annex users is the need to run
> "git annex merge" after fetching. That does a union merge of newly
> fetched remote git-annex branches into the local git-annex branch.

That use case sounds like that "git fetch" is called as a first class UI,
which is covered by "git myfetch" (you can call it "git annex fetch")
wrapper approach, the canonical example of a hook that we explicitly do
not want to add. It also does not seem to call for mucking with the result
of the fetch at all.

Perhaps the two concepts should be separated into different hooks?

^ permalink raw reply

* Re: [PATCH v2 3/3] grep: disable threading in all but worktree case
From: Nguyen Thai Ngoc Duy @ 2011-12-25  3:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð, Thomas Rast, René Scharfe,
	Eric Herman, git, Junio C Hamano, Fredrik Kuivinen
In-Reply-To: <20111224070715.GA32267@sigill.intra.peff.net>

On Sat, Dec 24, 2011 at 2:07 PM, Jeff King <peff@peff.net> wrote:
> The case where we would most expect the setup cost to be drowned out is
> using a more complex regex, grepping tree objects. There we have a
> baseline of:
>
>  $ time git grep 'a.*c' HEAD >/dev/null
>  real    0m5.684s
>  user    0m5.472s
>  sys     0m0.196s
>
>  $ time git ls-tree --name-only -r HEAD |
>      xargs git grep 'a.*c' HEAD -- >/dev/null
>  real    0m10.906s
>  user    0m10.725s
>  sys     0m0.240s
>
> Here, we still almost double our time. It looks like we don't use the
> same pathspec matching code in this case. But we do waste a lot of extra
> time zlib-inflating the trees in "ls-tree", only to do it separately in
> "grep".

Or you could pass blob SHA-1 to git grep to avoid reinflating trees

$ time git ls-tree -r HEAD|cut -c 13-52|xargs git grep 'a.*c' >/dev/null

Doing it in parallel does not seem to save time for me though.
-- 
Duy

^ permalink raw reply

* Re: [PATCH] add post-fetch hook
From: Joey Hess @ 2011-12-25  3:50 UTC (permalink / raw)
  To: git
In-Reply-To: <7v4nwpbaxq.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 1852 bytes --]

Junio C Hamano wrote:
> If we _were_ to sanction the use of the hook to tweak the result, I do not
> want to see it implemented as an ad-hoc hack that tells the hook writers
> that it is _entirely_ their responsiblity to update the remote tracking
> branches from what it fetched, and also update $GIT_DIR/FETCH_HEAD to
> maintain consistency between these two places.
> 
> A very cursory look at the patch tells me that there are a few problems
> with it.  It does not seem to affect what will go to $GIT_DIR/FETCH_HEAD
> at all, and hence it does not have any way to affect the result of the
> fetch that does not store it to any of our remote tracking branches.

True, it does not update FETCH_HEAD. I had not considered using the hook
that way.

I suppose that after running the hook, fetch could check each remote
tracking branch for a new value, and only then write to FETCH_HEAD.

> > The #1 point of confusion for git-annex users is the need to run
> > "git annex merge" after fetching. That does a union merge of newly
> > fetched remote git-annex branches into the local git-annex branch.
> 
> That use case sounds like that "git fetch" is called as a first class UI,
> which is covered by "git myfetch" (you can call it "git annex fetch")
> wrapper approach, the canonical example of a hook that we explicitly do
> not want to add. It also does not seem to call for mucking with the result
> of the fetch at all.

Most users are fetching by calling git pull as part of their normal
workflow. I would like to avoid git-annex needing its own special pull
command. For one thing, there can be many programs that use git branches
in similar ways (another one is pristine-tar), and a user shouldn't have
to run multiple wrapped versions of git fetch or pull when using
multiple such programs.

-- 
see shy jo

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* [PATCH] Support wrapping commit messages when you read them
From: Sidney San Martín @ 2011-12-25  5:05 UTC (permalink / raw)
  To: git

Fairly simpleminded line wrapping that makes commit messages
readable if they weren’t wrapped by the committer.

- Use strbuf_add_wrapped_text() to do the dirty work
- Detect simple lists which begin with "+ ", "* ", or "- " and indent
  them appropriately (like this line)
- Print lines which begin with whitespace as-is (e.g. code samples)

Add --wrap[=<width>] and --no-wrap to commands that pretty-print commit
messages, and add log.wrap and log.wrap.width configuration options.

log.wrap defaults to never, and can be set to never/false, auto/true,
or always. If auto, hijack want_color() to decide whether it’s
appropriate to use line wrapping. (This is a little hacky, but as far
as I can tell the conditions for auto color and auto wrapping are the
same. Maybe it would make sense to rename want_color()?)

log.wrap.width defaults to 80.

Signed-off-by: Sidney San Martín <s@sidneysm.com>
---

I hope I’m doing this right!

Consider this a first draft of the new feature I brought up a few weeks ago.


 Documentation/config.txt         |    8 ++++
 Documentation/pretty-options.txt |   14 +++++++
 commit.h                         |    6 +++
 log-tree.c                       |    1 +
 pretty.c                         |   71 +++++++++++++++++++++++++++++++++++++-
 revision.c                       |   10 +++++
 revision.h                       |    3 ++
 7 files changed, 112 insertions(+), 1 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e63b59..b8c1a81 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1403,6 +1403,14 @@ log.showroot::
 	Tools like linkgit:git-log[1] or linkgit:git-whatchanged[1], which
 	normally hide the root commit will now show it. True by default.
 
+log.wrap::
+	Commit message wrapping. May be set to always, false (or never) or
+	auto (or true), in which case commit messages are only wrapped when
+	the output is to a terminal. Defaults to false. See linkgit:git-log[1].
+
+log.wrap.width::
+	Line width for commit message wrapping. Defaults to 80 characters.
+
 mailmap.file::
 	The location of an augmenting mailmap file. The default
 	mailmap, located in the root of the repository, is loaded
diff --git a/Documentation/pretty-options.txt b/Documentation/pretty-options.txt
index 2a3dc86..7601a43 100644
--- a/Documentation/pretty-options.txt
+++ b/Documentation/pretty-options.txt
@@ -10,6 +10,20 @@
 Note: you can specify the default pretty format in the repository
 configuration (see linkgit:git-config[1]).
 
+--wrap[=<width>]::
+	Word-wrap commit messages to the specified width, 80 characters
+	by default. Lines beginning with +, *, or - and a space, or with a
+	number followed by a period and a space, are interpreted as lists
+	and wrapped with a hanging indent. Lines beginning with whitespace
+	(e.g. code samples) are left as-is.
++
+Note: you can specify the default wrapping behavior in the repository
+configuration (see linkgit:git-config[1]).
+
+--no-wrap::
+	Turn off commit message wrapping. This is the default (unless
+	otherwise specified in the repository configuration).
+
 --abbrev-commit::
 	Instead of showing the full 40-byte hexadecimal commit object
 	name, show only a partial prefix.  Non default number of
diff --git a/commit.h b/commit.h
index 4df3978..1321666 100644
--- a/commit.h
+++ b/commit.h
@@ -86,6 +86,12 @@ struct pretty_print_context {
 	int show_notes;
 	struct reflog_walk_info *reflog_info;
 	const char *output_encoding;
+	struct wrap_options {
+		unsigned int
+			width,
+			wrap:1,
+			wrap_given:1;
+	} wrap;
 };
 
 struct userformat_want {
diff --git a/log-tree.c b/log-tree.c
index 319bd31..3dfa944 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -414,6 +414,7 @@ void show_log(struct rev_info *opt)
 
 	opt->loginfo = NULL;
 	ctx.show_notes = opt->show_notes;
+	ctx.wrap = opt->wrap;
 	if (!opt->verbose_header) {
 		graph_show_commit(opt->graph);
 
diff --git a/pretty.c b/pretty.c
index 1580299..133bc53 100644
--- a/pretty.c
+++ b/pretty.c
@@ -23,6 +23,10 @@ static size_t commit_formats_len;
 static size_t commit_formats_alloc;
 static struct cmt_fmt_map *find_commit_format(const char *sought);
 
+static unsigned int wrap_configured = 0;
+static unsigned int wrap = 0;
+static unsigned int wrap_width = 80;
+
 static void save_user_format(struct rev_info *rev, const char *cp, int is_tformat)
 {
 	free(user_format);
@@ -172,6 +176,63 @@ void get_commit_format(const char *arg, struct rev_info *rev)
 	}
 }
 
+static int git_wrap_config(const char *name, const char *value, void *cb)
+{
+	wrap_configured = 1;
+
+	if (prefixcmp(name, "log.wrap"))
+		return 0;
+
+	if (name[8] == '\0') {
+		if (value && !strcmp(value, "always")) {
+			wrap = 1;
+		} else if (value && !strcmp(value, "never")) {
+			wrap = 0;
+		} else if (!value || !strcmp(value, "auto") || git_config_bool(name, value)) {
+			wrap = want_color(GIT_COLOR_AUTO);
+		}
+	} else if (name[8] == '.') {
+		name += 9;
+		if (!strcmp(name, "width")) {
+			wrap_width = git_config_int(name, value);
+		}
+	}
+	return 0;
+}
+
+static unsigned int want_wrap(struct wrap_options options)
+{
+	if(!wrap_configured)
+		git_config(git_wrap_config, NULL);
+	return (options.wrap_given ? options.wrap : wrap);
+}
+static unsigned int get_wrap_width(struct wrap_options options)
+{
+	if(!wrap_configured)
+		git_config(git_wrap_config, NULL);
+	return options.width ? options.width : wrap_width;
+}
+
+static int line_list_prefix(const char *line, int len)
+{
+	unsigned int numberLength = 0;
+	const char *pos = line;
+	if (len < 3) {
+		return 0;
+	} else if ((line[0] == '*' || line[0] == '+' || line[0] == '-') && line[1] == ' ') {
+		return 2;
+	} else {
+		while (pos - line < len && pos[0] >= '0' && pos[0] <= '9'){
+			numberLength++;
+			pos++;
+		}
+		if (numberLength && pos - line + 1 < len && pos[0] == '.' && pos[1] == ' ') {
+			return numberLength + 2;
+		}
+	}
+	return 0;
+}
+
 /*
  * Generic support for pretty-printing the header
  */
@@ -1246,6 +1307,8 @@ void pp_remainder(const struct pretty_print_context *pp,
 		  struct strbuf *sb,
 		  int indent)
 {
+	unsigned int wrap = want_wrap(pp->wrap);
+	unsigned int width = get_wrap_width(pp->wrap);
 	int first = 1;
 	for (;;) {
 		const char *line = *msg_p;
@@ -1268,7 +1331,13 @@ void pp_remainder(const struct pretty_print_context *pp,
 			memset(sb->buf + sb->len, ' ', indent);
 			strbuf_setlen(sb, sb->len + indent);
 		}
-		strbuf_add(sb, line, linelen);
+		if (wrap && linelen && line[0] != ' ' && line[0] != '\t') {
+			struct strbuf wrapped = STRBUF_INIT;
+			strbuf_add(&wrapped, line, linelen);
+			strbuf_add_wrapped_text(sb, wrapped.buf, 0, indent + line_list_prefix(line, linelen), width - indent);
+		} else {
+			strbuf_add(sb, line, linelen);
+		}
 		strbuf_addch(sb, '\n');
 	}
 }
diff --git a/revision.c b/revision.c
index 8764dde..ca4b386 100644
--- a/revision.c
+++ b/revision.c
@@ -1465,6 +1465,16 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->verbose_header = 1;
 		revs->pretty_given = 1;
 		get_commit_format(arg+9, revs);
+	} else if (!strcmp(arg, "--wrap")) {
+		revs->wrap.wrap = 1;
+		revs->wrap.wrap_given = 1;
+	} else if (!prefixcmp(arg, "--wrap=")) {
+		revs->wrap.wrap = 1;
+		revs->wrap.wrap_given = 1;
+		revs->wrap.width = atoi(arg+7);
+	} else if (!prefixcmp(arg, "--no-wrap")) {
+		revs->wrap.wrap = 0;
+		revs->wrap.wrap_given = 1;
 	} else if (!strcmp(arg, "--show-notes") || !strcmp(arg, "--notes")) {
 		revs->show_notes = 1;
 		revs->show_notes_given = 1;
diff --git a/revision.h b/revision.h
index 6aa53d1..f812685 100644
--- a/revision.h
+++ b/revision.h
@@ -117,6 +117,9 @@ struct rev_info {
 			missing_newline:1,
 			date_mode_explicit:1,
 			preserve_subject:1;
+
+	struct wrap_options wrap;
+
 	unsigned int	disable_stdin:1;
 	unsigned int	leak_pending:1;
 
-- 
1.7.8.1

^ permalink raw reply related

* Re: [PATCH] add post-fetch hook
From: Junio C Hamano @ 2011-12-25  9:30 UTC (permalink / raw)
  To: Joey Hess; +Cc: git
In-Reply-To: <20111225035059.GA29852@gnu.kitenet.net>

Joey Hess <joey@kitenet.net> writes:

> Junio C Hamano wrote:
>> If we _were_ to sanction the use of the hook to tweak the result, I do not
>> want to see it implemented as an ad-hoc hack that tells the hook writers
>> that it is _entirely_ their responsiblity to update the remote tracking
>> branches from what it fetched, and also update $GIT_DIR/FETCH_HEAD to
>> maintain consistency between these two places.
>> 
>> A very cursory look at the patch tells me that there are a few problems
>> with it.  It does not seem to affect what will go to $GIT_DIR/FETCH_HEAD
>> at all, and hence it does not have any way to affect the result of the
>> fetch that does not store it to any of our remote tracking branches.
>
> True, it does not update FETCH_HEAD. I had not considered using the hook
> that way.

Perhaps I misread what you wrote in the commit log message then.  I
somehow got an impression that one of the advertised way for the hook to
be used was to lie which commits the remote tracking refs point at by
letting it run "update-ref refs/*/*" and the lie is later picked up by
re-reading them, but I wasn't reading the patch very carefully.

If your use case does not involve updating the remote tracking refs nor
FETCH_HEAD (updating only one and not the other is a no-starter), then we
should explicitly forbid it in the documentation, as allowing updates will
invite inconsistencies.

Although I do not deeply care between such a "trigger to only notify, no
touching" hook and a full-blown "allow hook writers to easily lie about
what happened in the fetch" hook, I was hoping that we would get this
right and useful if we were spending our brain bandwidth on it. I am not
very fond of an easier "trigger to only notify" hook because people are
bound to misuse the interface and try updating the refs anyway, making it
easy to introduce inconsistencies between refs and FETCH_HEAD that will
confuse the later "merge" step.

As hook writers are more prone to write such an incorrect code than people
who implement the mechanism to call hooks on the git-core side, the more
the hook interface helps hook writers to avoid such mistakes, the better.

So if we were to allow the hook to lie what commits were fetched and store
something different from what we fetched in the remote tracking refs, I
think the correct place to do so would be in store_updated_refs(),
immediately before we call check_everything_connected().

 - Feed the contents of the ref_map to the hook. For each entry, the hook
   would get (at least):
   . the object name;
   . the refname at the remote;
   . the refname at the local (which could be empty when we are not
     copying it to any of our local ref); and
   . if the entry is to be used for merge.

 - The hook must read _everything_ from its standard input, and then
   return the
   re-written result in the same format as its input. The hook could
   . update the object name (i.e. re-point the remote tracking ref);
   . update the local refname (i.e. use different remote tracking ref);
   . change "merge" flag between true/false; and/or
   . add or remove entries

 - You read from the hook and replace the ref_map list that is fed to
   check_everything_connected(). This ref_map list is what is used in the
   next for() loop that calls update_local_ref() to update the remote
   tracking ref, records the entry in FETCH_HEAD, and produces the report.

This way, the hook cannot screw up, as what it tells us will consistently
be written by us to where it should go.

^ permalink raw reply

* Re: [PATCH] Support wrapping commit messages when you read them
From: Junio C Hamano @ 2011-12-25  9:57 UTC (permalink / raw)
  To: Sidney San Martín; +Cc: git
In-Reply-To: <8DE6E894-B50D-4F7E-AE18-C10E7E40A550@sidneysm.com>

Sidney San Martín <s@sidneysm.com> writes:

> Fairly simpleminded line wrapping that makes commit messages
> readable if they weren’t wrapped by the committer.

This does not say anything useful, other than "this is a naïve
implementation of message wrapper" and invites "So what?".

The most simple-minded solution is to reject such commits with crappy log
message.

After all, SCM is merely a method to help communication between
developers, and sticking to the common denominator is a proven good way to
make sure everybody involved in the project can use what is recorded in
the repository. This is not limited only to the log message, but equally
applies to filenames (e.g. don't create xt_tcpmss.c and xt_TCPMSS.c in the
same directory if you want your project extractable on case insensitive
filesystems) and even to the sources.

You need to justify the cause a bit better. Why is such a new logic
justified?

> - Use strbuf_add_wrapped_text() to do the dirty work
> - Detect simple lists which begin with "+ ", "* ", or "- " and indent
>   them appropriately (like this line)
> - Print lines which begin with whitespace as-is (e.g. code samples)

I suspect the above would make it more palatable than format=flowed
brought up in earlier discussions, which is unsuitable for nothing but
straight text.

> Add --wrap[=<width>] and --no-wrap to commands that pretty-print commit
> messages, and add log.wrap and log.wrap.width configuration options.

Why do you need two separate options and configurations that look as if
they are independent but in reality not?  If you say "no wrap", there is
no room for you to say "wrap width is 72".

I would expect something like:

 --log-message-wrap, --log-message-wrap=72, --log-message-wrap=no 

with --log-message-wrap=yes as a synonym for --log-message-wrap to give
consistency. The corresponding configuraiton would be log.messageWrap
whose values could be the usual bool-or-int.

> log.wrap defaults to never, and can be set to never/false, auto/true,
> or always. If auto, hijack want_color() to decide whether it’s
> appropriate to use line wrapping. (This is a little hacky, but as far
> as I can tell the conditions for auto color and auto wrapping are the
> same.

Why does coloring have _anything_ to do with line wrapping? Maybe your
personaly preference might be "wrap and color if interactive terminal" but
that is conflating two unrelated concepts. A user may not expect coloring
on a dumb interactive terminal, but wrapping may still be useful.

> log.wrap.width defaults to 80.

This does not deserve a comment as I already rejected the "two
configuration" approach, but do not use three-level names this way. We try
to reserve three-level names only for cases where the second level is used
for an unbound collection (e.g. "remote.$name.url", "branch.$name.merge").
that is user-specified.

^ permalink raw reply

* Re: [PATCH] add post-fetch hook
From: Jakub Narebski @ 2011-12-25 16:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joey Hess, git
In-Reply-To: <7vsjk99exw.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> [...] if we were to allow the hook to lie what commits were fetched and store
> something different from what we fetched in the remote tracking refs, I
> think the correct place to do so would be in store_updated_refs(),
> immediately before we call check_everything_connected().
> 
>  - Feed the contents of the ref_map to the hook. For each entry, the hook
>    would get (at least):
>    . the object name;
>    . the refname at the remote;
>    . the refname at the local (which could be empty when we are not
>      copying it to any of our local ref); and
>    . if the entry is to be used for merge.
> 
>  - The hook must read _everything_ from its standard input, and then
>    return the
>    re-written result in the same format as its input. The hook could
>    . update the object name (i.e. re-point the remote tracking ref);
>    . update the local refname (i.e. use different remote tracking ref);
>    . change "merge" flag between true/false; and/or
>    . add or remove entries

This is a very nice idea, IMHO, both because it makes it simple to
implement no-op (example) hook by just using "cat", and beause it
makes possible to stack such hooks (e.g. one from git-annex with the
one from pristine-tar etc.).

One thing that needs to be specified is what should happen if the hook
changes "the refname at the remote" part...
 
>  - You read from the hook and replace the ref_map list that is fed to
>    check_everything_connected(). This ref_map list is what is used in the
>    next for() loop that calls update_local_ref() to update the remote
>    tracking ref, records the entry in FETCH_HEAD, and produces the report.
> 
> This way, the hook cannot screw up, as what it tells us will consistently
> be written by us to where it should go.

-- 
Jakub Narebski

^ permalink raw reply

* Re: [RFC PATCH] Allow cloning branches selectively
From: Jakub Narebski @ 2011-12-25 16:28 UTC (permalink / raw)
  To: Carlos Martín Nieto; +Cc: git
In-Reply-To: <1324671199-7074-1-git-send-email-cmn@elego.de>

Carlos Martín Nieto <cmn@elego.de> writes:

> Sometimes it's useful to clone only a subset of branches from a remote
> we're cloning. Teach clone the --fetch option to select which branches
> should get fetched.
> 
> Each --fetch sets up a fetch refspec for that branch. Previously this
> was only possible by initializing a repo and manually setting up the
> config.

I haven't read the code, but I guess it should be possible to share
code with "git remote add", which allows to select which branches to
track, and which branch is to be 'main' on remote.

  git remote add [-t <branch>] [-m <master>] [-f] ...

-- 
Jakub Narebski

^ permalink raw reply

* FETCH_HEAD documentation vs reality
From: Joey Hess @ 2011-12-25 17:39 UTC (permalink / raw)
  To: git


[-- Attachment #1.1: Type: text/plain, Size: 2428 bytes --]

While trying to find some documentation of the format of .git/FETCH_HEAD,
I found this example in git-read-tree.txt, which I think will no longer
work. Probably when this was written, .git/FETCH_HEAD contained only a single
SHA; it's much more complicated now.

        $ JC=`git rev-parse --verify "HEAD^0"`
        $ git checkout-index -f -u -a $JC
        ...
        $ git fetch git://.... linus
        $ LT=`cat .git/FETCH_HEAD`
        ...
        $ git read-tree -m -u `git merge-base $JC $LT` $JC $LT

It's also common for the first line of .git/FETCH_HEAD to be an
arbitrary branch that was fetched (as part of an unqualified "git
pull"), marked not-for-merge. So using "FETCH_HEAD" as a refname will
refer to such a branch unintentionally. There are several places in the
docs that seem to expect FETCH_HEAD to always refer to the one that was
fetched and will be merged (ie, master):

revisions.txt:

	'FETCH_HEAD' records the branch which you fetched from a remote repository
	with your last `git fetch` invocation.

git-pull.txt:

	In its default mode, `git pull` is shorthand for
	`git fetch` followed by `git merge FETCH_HEAD`.

gittutorial.txt:

	alice$ git log -p HEAD..FETCH_HEAD
	$ gitk HEAD..FETCH_HEAD

howto/rebase-from-internal-branch.txt:

	You fetch from upstream, but not merge.
	
	    $ git fetch upstream
	
	This leaves the updated upstream head in .git/FETCH_HEAD but
	does not touch your .git/HEAD nor .git/refs/heads/master.
	You run "git rebase" now.
	
	    $ git rebase FETCH_HEAD master

All this documentation could be changed, or resolve_ref_unsafe in refs.c
could be changed to have a special case parser for .git/FETCH_HEAD,
that finds the first branch that is marked for merge, where it now has
this minor special case for it:

        /* Please note that FETCH_HEAD has a second line containing other data. */
        if (get_sha1_hex(buffer, sha1) || (buffer[40] != '\0' && !isspace(buffer[40]))) {

Or yet another way to fix it would be to make git fetch always write the
intended FETCH_HEAD first into .git/FETCH_HEAD. (When not in --append mode.)
This seems like perhaps the best fix, although it does mean that if a
fetch is done of only not-for-merge refs, without --append, FETCH_HEAD
will still refer to one of them. 

I've attached a minimal proof-of-concept patch implementing this last
option.

-- 
see shy jo

[-- Attachment #1.2: patch --]
[-- Type: text/plain, Size: 1221 bytes --]

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 33ad3aa..e2f2c69 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -376,6 +376,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	struct strbuf note = STRBUF_INIT;
 	const char *what, *kind;
 	struct ref *rm;
+	int top = 1;
 	char *url, *filename = dry_run ? "/dev/null" : git_path("FETCH_HEAD");
 
 	fp = fopen(filename, "a");
@@ -393,6 +394,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 		goto abort;
 	}
 
+ write:
 	for (rm = ref_map; rm; rm = rm->next) {
 		struct ref *ref = NULL;
 
@@ -408,6 +410,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 		if (!commit)
 			rm->merge = 0;
 
+		if (top != rm->merge)
+			continue;
+
 		if (!strcmp(rm->name, "HEAD")) {
 			kind = "";
 			what = "";
@@ -474,6 +479,11 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 		}
 	}
 
+	if (top) {
+		top = 0;
+		goto write;
+	}
+
 	if (rc & STORE_REF_ERROR_DF_CONFLICT)
 		error(_("some local refs could not be updated; try running\n"
 		      " 'git remote prune %s' to remove any old, conflicting "

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related

* Re: [RFC PATCH] Allow cloning branches selectively
From: Carlos Martín Nieto @ 2011-12-25 17:43 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <m3hb0ofwem.fsf@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 838 bytes --]

On Sun, Dec 25, 2011 at 08:28:39AM -0800, Jakub Narebski wrote:
> Carlos Martín Nieto <cmn@elego.de> writes:
> 
> > Sometimes it's useful to clone only a subset of branches from a remote
> > we're cloning. Teach clone the --fetch option to select which branches
> > should get fetched.
> > 
> > Each --fetch sets up a fetch refspec for that branch. Previously this
> > was only possible by initializing a repo and manually setting up the
> > config.
> 
> I haven't read the code, but I guess it should be possible to share
> code with "git remote add", which allows to select which branches to
> track, and which branch is to be 'main' on remote.
> 
>   git remote add [-t <branch>] [-m <master>] [-f] ...

Ah, very nice. I didn't know about those options. Hopefully I can use
it. Thanks for pointing it out.

   cmn

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH] add post-fetch hook
From: Joey Hess @ 2011-12-25 17:54 UTC (permalink / raw)
  To: git
In-Reply-To: <7vsjk99exw.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 2987 bytes --]

Junio C Hamano wrote:
> So if we were to allow the hook to lie what commits were fetched and store
> something different from what we fetched in the remote tracking refs, I
> think the correct place to do so would be in store_updated_refs(),
> immediately before we call check_everything_connected().
> 
>  - Feed the contents of the ref_map to the hook. For each entry, the hook
>    would get (at least):
>    . the object name;
>    . the refname at the remote;
>    . the refname at the local (which could be empty when we are not
>      copying it to any of our local ref); and
>    . if the entry is to be used for merge.
> 
>  - The hook must read _everything_ from its standard input, and then
>    return the
>    re-written result in the same format as its input. The hook could
>    . update the object name (i.e. re-point the remote tracking ref);
>    . update the local refname (i.e. use different remote tracking ref);
>    . change "merge" flag between true/false; and/or
>    . add or remove entries
> 
>  - You read from the hook and replace the ref_map list that is fed to
>    check_everything_connected(). This ref_map list is what is used in the
>    next for() loop that calls update_local_ref() to update the remote
>    tracking ref, records the entry in FETCH_HEAD, and produces the report.
> 
> This way, the hook cannot screw up, as what it tells us will consistently
> be written by us to where it should go.

This is a good plan, the only problem I see with it is that
store_updated_refs is potentially called twice in a fetch, when the
automated tag following is done. I don't see that as a large problem,
perhaps it could even be optimised away.

The format of .git/FETCH_HEAD does not seem appropriate for this hook
to use (it's not documented, and it doesn't quite have all the necessary
fields, particularly missing the local refname). Instead, how about this,
for the hook's input/output format?

<sha1> SP <not-for-merge|merge> SP <remote-refname> SP <local-refname> LF

Example:

5d6dfc7cb140a6eb90138334fab2245b69bc8bc4 merge refs/heads/master refs/remotes/origin/master
f95247ea15bc62a2dab0f6ae3cd247267a0639b8 not-for-merge refs/heads/pu refs/remotes/origin/pu
2ce0edcd786b790fed580e7df56291619834d276 not-for-merge refs/tags/v1.7.8.1 refs/tags/v1.7.8.1

Allowing the hook to change the merge flag does open up some other
interesting uses of the hook. I can now think of three use cases for it:

1. Only accepting tags that meet some criteria, such as being signed
   by a trusted signature.
2. Causing branches that would not normally be merged to get merged.
   For example, a hook could set the merge flag on a branch when it was
   pulled from a remote other than branch.master.remote. This could be useful
   when using git without a single central origin and with a number of
   repositories that are always wanted to be kept merged.
3. My git annex merge case.

-- 
see shy jo

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: [PATCH] add post-fetch hook
From: Joey Hess @ 2011-12-25 18:06 UTC (permalink / raw)
  To: git
In-Reply-To: <m3liq0fwkz.fsf@localhost.localdomain>

[-- Attachment #1: Type: text/plain, Size: 952 bytes --]

Jakub Narebski wrote:
> This is a very nice idea, IMHO, both because it makes it simple to
> implement no-op (example) hook by just using "cat", and beause it
> makes possible to stack such hooks (e.g. one from git-annex with the
> one from pristine-tar etc.).

Indeed.

> One thing that needs to be specified is what should happen if the hook
> changes "the refname at the remote" part...

I'm not sure what the use case is for including the remote's refname is yet.
Changing it would allow changing what's written into FETCH_HEAD though.
For example:

-2ce0edcd786b790fed580e7df56291619834d276        not-for-merge   branch 'maint' of git://git.kernel.org/pub/scm/git/git
+2ce0edcd786b790fed580e7df56291619834d276        not-for-merge   branch 'jc-maint' of git://git.kernel.org/pub/scm/git/git

And that would in turn be used by a few things that consume that
information. Whether that's useful, I don't know.

-- 
see shy jo

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: [RFC PATCH] Allow cloning branches selectively
From: Carlos Martín Nieto @ 2011-12-25 19:00 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git
In-Reply-To: <CACsJy8ANZvkY+na-tM95prHEfXD+N2OT8+3NLeccycGL1BmbCg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 596 bytes --]

On Sat, Dec 24, 2011 at 11:31:51AM +0700, Nguyen Thai Ngoc Duy wrote:
> On Sat, Dec 24, 2011 at 3:13 AM, Carlos Martín Nieto <cmn@elego.de> wrote:
> > Sometimes it's useful to clone only a subset of branches from a remote
> > we're cloning. Teach clone the --fetch option to select which branches
> > should get fetched.
> 
> What about tags? Are all tags fetched or only ones part of the
> selected branches?

I haven't really touched that part of the code, so I think it'll fetch
every tag, as that's what clone does by default. Certainly something
that should get fixed.

   cmn

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: Gitk: shortcut to jump to the current HEAD (yellow spot)?
From: Dirk Süsserott @ 2011-12-25 19:33 UTC (permalink / raw)
  To: Martin von Zweigbergk; +Cc: Pat Thoyts, Git Mailing List
In-Reply-To: <CAOeW2eGCKxYW1TT-HPoSCO0_PsQPX5C-bcHLUy73MTd7=CsqRA@mail.gmail.com>

Am 24.12.2011 05:22 schrieb Martin von Zweigbergk:
> 2011/12/23 Dirk Süsserott <newsletter@dirk.my1.cc>:
>>
>> That's because gitk behaves odd (at least to me) when not run from the
>> top-level directory. E.g. the "touching paths" box won't find files in
>> the top dir if you don't prefix them with a slash.
> 
> This should be fixed in c332f44 (gitk: Fix file highlight when run in
> subdirectory, 2011-04-04), which is in the current master and thus, I
> believe, to be released in Git 1.7.9.
> 
> Martin

Ahh, cool. I wouldn't have noticed because I'm so used to my "cd $TOP &&
gitk". I thought it was by intention because it just behaves like "git
log": When run from subdirs it doesn't know about topdir files: Assume
README.txt is in the topdir and current dir is some subdir:

$ git log -- README.txt    # fails
$ git log -- ../README.txt # works

My alias (or function) was just a helper to avoid remembering where I
started gitk from.

Cheers,
    Dirk

BTW, Merry X-Mas to you and all others on the list :-)

^ permalink raw reply

* Re: [RFC PATCH] Allow cloning branches selectively
From: Carlos Martín Nieto @ 2011-12-25 20:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vmxajaswj.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 3455 bytes --]

On Fri, Dec 23, 2011 at 01:18:36PM -0800, Junio C Hamano wrote:
> Carlos Martín Nieto <cmn@elego.de> writes:
> 
> > Sometimes it's useful to clone only a subset of branches from a remote
> > we're cloning. Teach clone the --fetch option to select which branches
> > should get fetched.
> 
> This is just a knee-jerk reaction without reading the patch text (which I
> won't be doing over the holiday weekend anyway), but is the workflow of
> the primarly intended audience to clone "a subset of branches" or "a
> single branch"?
> 
> I have a slight suspicion that this started out as "I often want to create
> a clone to _track_ a single branch, but because I am mucking with the code
> related to cloning anyway, I might as well allow more than one to be
> fetched, even though I do not have any need for that, somebody might find
> it useful". And that is why it is important to answer the first question.

The main usefulness is indeed to track single branches. Limiting it to
one looked to me to be an arbitrary limitation and I don't like
those. Tracking between two and all branches is probably quite a niche
however, so I'll go with allowing one -t/--track option and if enough
people want to extend it, we'll see what we do then.

> 
> If the primary motivation is for a single branch, I suspect supporting
> only a single branch and advertising the feature as "tracking only one
> branch" might make it much easier to understand to the end users.
> 
> Upon "git clone --track cn/single-clone $there x.git", you would do
> something like:
> 
>   it=cn/single-clone &&
>   git init x.git &&
>   cd x.git &&
> 
>   # configure "git fetch origin" to only get branch $it
>   git config remote.origin.url "$there" &&
>   git config remote.origin.fetch refs/heads/$it:refs/remotes/origin/$it 
> 
>   # declare that the primary branch at origin is $it as far as we are concerned
>   git symbolic-ref -m clone refs/remotes/origin/HEAD refs/remotes/origin/$it &&

As Jakub pointed out, remote already has this option, so you can
replace these calls with

    git remote add origin $there -t $it -m $it

> 
>   # Git aware prompt reminds us that this repository is to track branch $it
>   git symbolic-ref -m clone HEAD refs/heads/$it &&
> 
>   # And Go!
>   git fetch origin &&
>   git reset --hard remotes/origin/$it &&
>   git config branch.$it.remote origin &&
>   git config branch.$it.merge $it

A lazier man (or one who doesn't work with internals every day) might do

    git checkout -t origin/$it

which the documentation claims would do the same thing. So it boils
down to doing three commands and it feels like it'd be much easier to
just write a small script than to modify the clone code. Putting it in
C will hopefully make the code a bit cleaner, if I use the code that
already exists in remote.

> 
> Of course you _could_ support more than one pretty easily, but the point
> is that it is unclear how you explain to the end user what the feature
> does and what it is for in easily understoodd terms, once you start doing
> so. It will no longer be "this new clone is to track that branch", but
> something else, and I do not know what that something else is.

Looking at the explanation for the -t (--track) option in git-remote,
it's certainly not very friendly unless you already know exactly what
a -t option would do if you were to implement it.

   cmn

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* [PATCH] add post-fetch hook
From: Joey Hess @ 2011-12-26  2:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vsjk99exw.fsf@alter.siamese.dyndns.org>

The post-fetch hook is fed lines on stdin for all refs that were fetched, and
outputs on stdout possibly modified lines. Its output is parsed and used
when git fetch updates the remote tracking refs, records the entries in
FETCH_HEAD, and produces its report.

---

Not quite ready to sign off on this yet, but it does work.
Comments and code review appreciated.

Demo:

joey@gnu:~/tmp/demo>cat .git/hooks/post-fetch
#!/bin/sh
# Rename branches, and block all tags.
sed 's!foo!bar!g' | grep -v refs/tags/
joey@gnu:~/tmp/demo>chmod +x .git/hooks/post-fetch
joey@gnu:~/tmp/demo>git pull
From /home/joey/tmp/a
 * [new branch]      bar        -> origin/bar
Already up-to-date.
joey@gnu:~/tmp/demo>chmod -x .git/hooks/post-fetch
joey@gnu:~/tmp/demo>git pull
From /home/joey/tmp/a
 * [new branch]      foo        -> origin/foo
 * [new tag]         v1.0       -> v1.0
Already up-to-date.
joey@gnu:~/tmp/demo>chmod +x .git/hooks/post-fetch
joey@gnu:~/tmp/demo>git remote update --prune
Fetching origin
 x [deleted]         (none)     -> origin/foo


 Documentation/githooks.txt |   29 ++++++
 builtin/fetch.c            |  228 ++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 238 insertions(+), 19 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 28edefa..9c2b6bf 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -162,6 +162,35 @@ This hook can be used to perform repository validity checks, auto-display
 differences from the previous HEAD if different, or set working dir metadata
 properties.
 
+post-fetch
+~~~~~~~~~~
+
+This hook is invoked by 'git fetch' (commonly called by 'git pull'), after
+refs have been fetched from the remote repository. It is not executed, if
+nothing was fetched.
+
+It takes no arguments, but is fed a line of the following format on
+its standard input for each ref that was fetched.
+
+  <sha1> SP not-for-merge|merge SP <remote-refname> SP <local-refname> LF
+
+Where the "not-for-merge" flag indicates the ref is not to be merged into the
+current branch, and the "merge" flag indicates that 'git merge' should
+later merge it. The `<remote-refname>` is the remote's name for the ref
+that was pulled, and `<local-refname>` is a name of a remote-tracking branch,
+like "refs/remotes/origin/master", or can be empty if the fetched ref is not
+being stored in a local refname.
+
+The hook must consume all of its standard input, and output back lines
+of the same format. It can modify its input as desired, including
+adding or removing lines, updating the sha1 (i.e. re-point the
+remote-tracking branch), changing the merge flag, and changing the
+`<local-refname>` (i.e. use different remote-tracking branch).
+
+The output of the hook is used to update the remote-tracking branches, and
+`.git/FETCH_HEAD`, in preparation for for a later merge operation done by
+'git merge'.
+
 post-merge
 ~~~~~~~~~~
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 33ad3aa..aa401b2 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -89,6 +89,188 @@ static struct option builtin_fetch_options[] = {
 	OPT_END()
 };
 
+static int add_existing(const char *refname, const unsigned char *sha1,
+			int flag, void *cbdata)
+{
+	struct string_list *list = (struct string_list *)cbdata;
+	struct string_list_item *item = string_list_insert(list, refname);
+	item->util = (void *)sha1;
+	return 0;
+}
+	
+static const char post_fetch_hook[] = "post-fetch";
+struct ref *post_fetch_hook_refs;
+
+int feed_post_fetch_hook (int in, int out, void *data)
+{
+	struct ref *ref;
+	struct strbuf buf = STRBUF_INIT;
+	int ret;
+
+	for (ref = post_fetch_hook_refs; ref; ref = ref->next) {
+		strbuf_addstr(&buf, sha1_to_hex(ref->old_sha1));
+		strbuf_addch(&buf, ' ');
+		strbuf_addstr(&buf, ref->merge ? "merge" : "not-for-merge");
+		strbuf_addch(&buf, ' ');
+		if (ref->name)
+			strbuf_addstr(&buf, ref->name);
+		strbuf_addch(&buf, ' ');
+		if (ref->peer_ref && ref->peer_ref->name)
+			strbuf_addstr(&buf, ref->peer_ref->name);
+		strbuf_addch(&buf, '\n');
+	}
+
+	ret = write_in_full(out, buf.buf, buf.len) != buf.len;
+	if (ret)
+		warning("%s hook failed to consume all its input",
+				post_fetch_hook);
+	close(out);
+	strbuf_release(&buf);
+	return ret;
+}
+	
+struct ref *parse_post_fetch_hook_line (char *l, 
+		struct string_list *existing_refs)
+{
+	struct ref *ref = NULL, *peer_ref = NULL;
+	struct string_list_item *peer_item = NULL;
+	char *words[4];
+	int i, word=0;
+	char *problem;
+
+	for (i=0; l[i]; i++) {
+		if (isspace(l[i])) {
+			l[i]='\0';
+			words[word]=l;
+			l+=i+1;
+			i=0;
+			word++;
+			if (word > 3) {
+				problem="too many words";
+				goto unparsable;
+			}
+		}
+	}
+	if (word < 3) {
+		problem="not enough words";
+		goto unparsable;
+	}
+	
+	ref = alloc_ref(words[2]);
+	peer_ref = ref->peer_ref = alloc_ref(l);
+	ref->peer_ref->force=1;
+
+	if (get_sha1_hex(words[0], ref->old_sha1)) {
+		problem="bad sha1";
+		goto unparsable;
+	}
+
+	if (strcmp(words[1], "merge") == 0) {
+		ref->merge=1;
+	}
+	else if (strcmp(words[1], "not-for-merge") != 0) {
+		problem="bad merge flag";
+		goto unparsable;
+	}
+
+	peer_item = string_list_lookup(existing_refs, peer_ref->name);
+	if (peer_item)
+		hashcpy(peer_ref->old_sha1, peer_item->util);
+
+	return ref;
+
+ unparsable:
+	warning("%s hook output a wrongly formed line: %s",
+			post_fetch_hook, problem);
+	free(ref);
+	free(peer_ref);
+	return NULL;
+}
+
+/* The hook is fed lines of the form:
+ * <sha1> SP <not-for-merge|merge> SP <remote-refname> SP <local-refname> LF
+ * And should output rewritten lines of the same form.
+ */
+struct ref *run_post_fetch_hook (struct ref *fetched_refs)
+{
+	struct ref *new_refs = NULL;
+	struct string_list existing_refs = STRING_LIST_INIT_NODUP;
+	struct child_process hook;
+	struct async async;
+	const char *argv[2];
+	FILE *f;
+	struct strbuf buf;
+	struct ref *ref, *prevref=NULL;
+	int ok = 1;
+
+	if (! fetched_refs)
+		return fetched_refs;
+
+	argv[0] = git_path("hooks/%s", post_fetch_hook);
+	if (access(argv[0], X_OK) < 0)
+		return fetched_refs;
+	argv[1] = NULL;
+
+	memset(&hook, 0, sizeof(hook));
+	hook.argv = argv;
+	hook.in = -1;
+	hook.out = -1;
+	if (start_command(&hook) != 0)
+		return fetched_refs;
+
+	/* Use an async writer to feed the hook process.
+	 * This allows the hook to read and write a line at
+	 * a time without blocking. */
+	memset(&async, 0, sizeof(async));
+	async.proc = feed_post_fetch_hook;
+	post_fetch_hook_refs = fetched_refs;
+	async.out = hook.in;
+	if (start_async(&async))
+		goto failed_hook;
+
+	for_each_ref(add_existing, &existing_refs);
+
+	f = fdopen(hook.out, "r");
+	if (f == NULL)
+		goto failed_hook;
+	strbuf_init(&buf, 128);
+	while (strbuf_getline(&buf, f, '\n') != EOF) {
+		char *l = strbuf_detach(&buf, NULL);
+		ref = parse_post_fetch_hook_line(l, &existing_refs);
+		if (ref) {
+			if (prevref) {
+				prevref->next=ref;
+				prevref=ref;
+			}
+			else {
+				new_refs = prevref = ref;
+			}
+		}
+		else {
+			ok = 0; /* ignore the other output */
+		}
+		free(l);
+	}
+	strbuf_release(&buf);
+	fclose(f);
+
+	if (finish_async(&async))
+		goto failed_hook;
+	finish_command(&hook);
+
+	if (! ok)
+		return fetched_refs;
+	/* The new_refs are returned, to be used in place of fetched_refs,
+	 * so it is not needed anymore and can be freed here. */
+	free_refs(fetched_refs);
+	return new_refs;
+
+failed_hook:
+	close(hook.out);
+	finish_command(&hook);
+	return fetched_refs;
+}
+
 static void unlock_pack(void)
 {
 	if (transport)
@@ -507,17 +689,30 @@ static int quickfetch(struct ref *ref_map)
 	return check_everything_connected(iterate_ref_map, 1, &rm);
 }
 
-static int fetch_refs(struct transport *transport, struct ref *ref_map)
+struct fetch_refs_result {
+	struct ref *new_refs;
+	int status;
+};
+
+static struct fetch_refs_result fetch_refs(struct transport *transport,
+		struct ref *ref_map)
 {
-	int ret = quickfetch(ref_map);
-	if (ret)
-		ret = transport_fetch_refs(transport, ref_map);
-	if (!ret)
-		ret |= store_updated_refs(transport->url,
+	struct fetch_refs_result res;
+	res.status = quickfetch(ref_map);
+	if (res.status)
+		res.status = transport_fetch_refs(transport, ref_map);
+	if (!res.status) {
+		res.new_refs = run_post_fetch_hook(ref_map);
+
+		res.status |= store_updated_refs(transport->url,
 				transport->remote->name,
-				ref_map);
+				res.new_refs);
+	}
+	else {
+		res.new_refs = ref_map;
+	}
 	transport_unlock_pack(transport);
-	return ret;
+	return res;
 }
 
 static int prune_refs(struct refspec *refs, int ref_count, struct ref *ref_map)
@@ -542,15 +737,6 @@ static int prune_refs(struct refspec *refs, int ref_count, struct ref *ref_map)
 	return result;
 }
 
-static int add_existing(const char *refname, const unsigned char *sha1,
-			int flag, void *cbdata)
-{
-	struct string_list *list = (struct string_list *)cbdata;
-	struct string_list_item *item = string_list_insert(list, refname);
-	item->util = (void *)sha1;
-	return 0;
-}
-
 static int will_fetch(struct ref **head, const unsigned char *sha1)
 {
 	struct ref *rm = *head;
@@ -673,6 +859,7 @@ static int do_fetch(struct transport *transport,
 	struct string_list_item *peer_item = NULL;
 	struct ref *ref_map;
 	struct ref *rm;
+	struct fetch_refs_result res;
 	int autotags = (transport->remote->fetch_tags == 1);
 
 	for_each_ref(add_existing, &existing_refs);
@@ -710,7 +897,9 @@ static int do_fetch(struct transport *transport,
 
 	if (tags == TAGS_DEFAULT && autotags)
 		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
-	if (fetch_refs(transport, ref_map)) {
+	res = fetch_refs(transport, ref_map);
+	ref_map = res.new_refs;
+	if (res.status) {
 		free_refs(ref_map);
 		return 1;
 	}
@@ -750,7 +939,8 @@ static int do_fetch(struct transport *transport,
 		if (ref_map) {
 			transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
 			transport_set_option(transport, TRANS_OPT_DEPTH, "0");
-			fetch_refs(transport, ref_map);
+			res = fetch_refs(transport, ref_map);
+			ref_map = res.new_refs;
 		}
 		free_refs(ref_map);
 	}

-- 
1.7.7.3

^ permalink raw reply related

* Re: [PATCH] add post-fetch hook
From: Junio C Hamano @ 2011-12-26  7:59 UTC (permalink / raw)
  To: Joey Hess; +Cc: git
In-Reply-To: <20111226023154.GA3243@gnu.kitenet.net>

Joey Hess <joey@kitenet.net> writes:

> The post-fetch hook is fed lines on stdin for all refs that were fetched, and
> outputs on stdout possibly modified lines. Its output is parsed and used
> when git fetch updates the remote tracking refs, records the entries in
> FETCH_HEAD, and produces its report.
>
> ---
>
> Not quite ready to sign off on this yet, but it does work.
> Comments and code review appreciated.

Thanks. I do have a few comments.

This hook is no longer a "post" fetch hook. The mechanism lets the object
transfer phase does its work and then rewrites/tweaks the result before
fetch completes. To an outside observer, what the hook does is an integral
part of what "fetch" does, and not something that happens _after_ fetch
completes. I am bad at naming things, but something along the lines of
"tweak-fetch" that makes it clear that what happens in the hook is still
part of the fetching may be a more appropriate name, methinks.

I very much on purpose said that the hook "must read everything from its
standard input, *and* *then* return..." in my response. Your "Demo" hook
emits output as it reads its input with sed, but your main process invokes
the hook, drains everything with write_in_full() before starting to read a
single line, so I suspect that your hook will deadlock when its output
pipe buffer fills up without being read by the main process. Of course,
for this deadlock to actually happen, you need to be fetching quite a lot
of refs.

To make the life of hook writers easier, however, it would be good to
support a hook written in the style of your "Demo" hook, instead of
requiring it to read everything before emiting any output. I think you
could solve this by having a select(2) loop to avoid the deadlock on our
end (lets call the code that spawns a hook with run_command() and
interacts with it a "hook driver" in the rest of this message), but before
going in that direction, I would like to see us stepping back and bit and
think about the way hooks are called in the current code.

We seem to already have too many hook drivers, each of which hand-roll
similar logic using run-command API. At some point, we would want to have
a single "run_hook" helper function that takes:

 - the name of a hook;
 - the command line arguments to be fed;
 - a callback "generator" function to feed the standard input stream of
   the hook process;
 - a callback "consumer" function to receive the standard output stream of
   the hook process; and
 - set of environment tweaks while running the hook (e.g. run the hook
   while setting GIT_INDEX_FILE to a temporary file).

and hides away the complexity from hook drivers.  The command line
arguments, input and output callback functions are all optional depending
on the API between the hook driver and the hook (e.g. the "post-update"
hook takes arguments from its command line but does not interact with the
standard I/O stream, while the "post-receive" hook takes its input from
the standard input stream). Tweaking of the environment is also optional;
not many hooks need it.

By formalizing the hook driver API that way, any hook driver that drives a
tricky hook that may need a select(2) loop to avoid a deadlock in a way
similar to your patch do would not have to worry about the issue, as the
run_hook() helper would take care of it by reading from the hook's output
pipe and drain the pipe by calling the "consumer" callback before calling
the "generator" callback and feed more input to the hook to cause a
deadlock. We also could in the future do many other things if and when we
wanted to that the current code structure makes difficult. A few examples
that readily come to my mind are:

 - relocate where the hook scripts live, by limiting the hook driver API
   to take just the "name" of the hook. The current hook callsites know
   that the hooks live in git_path("hooks/$name") and call run_command()
   on their own, but the run_hook() helper could redirect it away to
   somewhere else, e.g. "/etc/git-core/hooks/$name".

 - run a set of hooks on the same triggering condition. You may want to
   have two "post-receive" hooks, one to feed an e-mail based notification
   system and another to drive an autobuilder, for example. For this to
   work, the "generator" and "consumer" callbacks need to have a way for
   us to tell "beginning of a new session" and "end of a new session", so
   that they can produce/consume the same set of values more than once.

 - perhaps ignore SIGPIPE if the hook chooses not to read any information
   the hook driver provides with it.

I have been wondering when would be the good time to refactor the hook
driver API.  We can add your patch, after polishing it enough to make it
ready for inclusion, independent of the hook API refactoring. But that
would mean that it would require more work when refactoring the hook API,
as we would have one more hand-rolled hook caller that is based on
run_command().

^ permalink raw reply

* Re: [PATCH] add post-fetch hook
From: Junio C Hamano @ 2011-12-26  8:09 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Joey Hess, git
In-Reply-To: <m3liq0fwkz.fsf@localhost.localdomain>

Jakub Narebski <jnareb@gmail.com> writes:

> One thing that needs to be specified is what should happen if the hook
> changes "the refname at the remote" part...

Hmm, good eyes.

The mechanism is to allow the hook to rewrite what happened during the
fetch. If we decided refs/heads/master on the remote that points at the
commit $X should update refs/remotes/origin/master, we tell that to the
hook, and the hook reads it. The hook may tell us to pretend that we
fetched refs/heads/next on the remote that points at the commit $Y should
update refs/remotes/origin/pu. In FETCH_HEAD we leave where the commit was
fetched from and hook will affect that information (which is used in the
resulting merge commit log message), but otherwise I do not think anything
unexpected would happen, as tracking refs do not record where the stuff
came from (perhaps they are in reflog? I didn't check).

^ permalink raw reply

* Re: FETCH_HEAD documentation vs reality
From: Junio C Hamano @ 2011-12-26  8:16 UTC (permalink / raw)
  To: Joey Hess; +Cc: git
In-Reply-To: <20111225173901.GA668@gnu.kitenet.net>

Joey Hess <joey@kitenet.net> writes:

> Or yet another way to fix it would be to make git fetch always write the
> intended FETCH_HEAD first into .git/FETCH_HEAD. (When not in --append mode.)
> This seems like perhaps the best fix,...

Sounds good, if you mean "the first one that is marked as for-merge" by
"intended FETCH_HEAD".

Thanks. Please make it so.

^ permalink raw reply

* [PATCH 1/3] Eliminate recursion in setting/clearing marks in commit list
From: Nguyễn Thái Ngọc Duy @ 2011-12-26 12:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Shawn O. Pearce, Nguyen Thai Ngoc Duy

From: Nguyen Thai Ngoc Duy <pclouds@gmail.com>

Recursion in a DAG is generally a bad idea because it could be very
deep. Be defensive and avoid recursion in mark_parents_uninteresting()
and clear_commit_marks().

mark_parents_uninteresting() learns a trick from clear_commit_marks()
to avoid malloc() in (dorminant) single-parent case.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 commit.c   |   13 +++++++++++--
 revision.c |   45 +++++++++++++++++++++++++++++----------------
 2 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/commit.c b/commit.c
index 73b7e00..628066c 100644
--- a/commit.c
+++ b/commit.c
@@ -421,7 +421,8 @@ struct commit *pop_most_recent_commit(struct commit_list **list,
 	return ret;
 }
 
-void clear_commit_marks(struct commit *commit, unsigned int mark)
+static void clear_commit_marks_1(struct commit_list **plist,
+				 struct commit *commit, unsigned int mark)
 {
 	while (commit) {
 		struct commit_list *parents;
@@ -436,12 +437,20 @@ void clear_commit_marks(struct commit *commit, unsigned int mark)
 			return;
 
 		while ((parents = parents->next))
-			clear_commit_marks(parents->item, mark);
+			commit_list_insert(parents->item, plist);
 
 		commit = commit->parents->item;
 	}
 }
 
+void clear_commit_marks(struct commit *commit, unsigned int mark)
+{
+	struct commit_list *list = NULL;
+	commit_list_insert(commit, &list);
+	while (list)
+		clear_commit_marks_1(&list, pop_commit(&list), mark);
+}
+
 void clear_commit_marks_for_object_array(struct object_array *a, unsigned mark)
 {
 	struct object *object;
diff --git a/revision.c b/revision.c
index 8764dde..7cc72fc 100644
--- a/revision.c
+++ b/revision.c
@@ -139,11 +139,32 @@ void mark_tree_uninteresting(struct tree *tree)
 
 void mark_parents_uninteresting(struct commit *commit)
 {
-	struct commit_list *parents = commit->parents;
+	struct commit_list *parents = NULL, *l;
+
+	for (l = commit->parents; l; l = l->next)
+		commit_list_insert(l->item, &parents);
 
 	while (parents) {
 		struct commit *commit = parents->item;
-		if (!(commit->object.flags & UNINTERESTING)) {
+		l = parents;
+		parents = parents->next;
+		free(l);
+
+		while (commit) {
+			/*
+			 * A missing commit is ok iff its parent is marked
+			 * uninteresting.
+			 *
+			 * We just mark such a thing parsed, so that when
+			 * it is popped next time around, we won't be trying
+			 * to parse it and get an error.
+			 */
+			if (!has_sha1_file(commit->object.sha1))
+				commit->object.parsed = 1;
+
+			if (commit->object.flags & UNINTERESTING)
+				break;
+
 			commit->object.flags |= UNINTERESTING;
 
 			/*
@@ -154,21 +175,13 @@ void mark_parents_uninteresting(struct commit *commit)
 			 * wasn't uninteresting), in which case we need
 			 * to mark its parents recursively too..
 			 */
-			if (commit->parents)
-				mark_parents_uninteresting(commit);
-		}
+			if (!commit->parents)
+				break;
 
-		/*
-		 * A missing commit is ok iff its parent is marked
-		 * uninteresting.
-		 *
-		 * We just mark such a thing parsed, so that when
-		 * it is popped next time around, we won't be trying
-		 * to parse it and get an error.
-		 */
-		if (!has_sha1_file(commit->object.sha1))
-			commit->object.parsed = 1;
-		parents = parents->next;
+			for (l = commit->parents->next; l; l = l->next)
+				commit_list_insert(l->item, &parents);
+			commit = commit->parents->item;
+		}
 	}
 }
 
-- 
1.7.8.36.g69ee2

^ permalink raw reply related

* [PATCH 2/3] index-pack: eliminate recursion in find_unresolved_deltas
From: Nguyễn Thái Ngọc Duy @ 2011-12-26 12:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Shawn O. Pearce,
	Nguyễn Thái Ngọc Duy
In-Reply-To: <1324901080-23215-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/index-pack.c |  112 +++++++++++++++++++++++++++++++------------------
 1 files changed, 71 insertions(+), 41 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 98025da..d311c05 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -34,6 +34,8 @@ struct base_data {
 	struct object_entry *obj;
 	void *data;
 	unsigned long size;
+	int ref_first, ref_last;
+	int ofs_first, ofs_last;
 };
 
 /*
@@ -221,6 +223,15 @@ static NORETURN void bad_object(unsigned long offset, const char *format, ...)
 	die("pack has bad object at offset %lu: %s", offset, buf);
 }
 
+static struct base_data *alloc_base_data()
+{
+	struct base_data *base = xmalloc(sizeof(struct base_data));
+	memset(base, 0, sizeof(*base));
+	base->ref_last = -1;
+	base->ofs_last = -1;
+	return base;
+}
+
 static void free_base_data(struct base_data *c)
 {
 	if (c->data) {
@@ -553,58 +564,77 @@ static void resolve_delta(struct object_entry *delta_obj,
 	nr_resolved_deltas++;
 }
 
-static void find_unresolved_deltas(struct base_data *base,
-				   struct base_data *prev_base)
+static struct base_data *find_unresolved_deltas_1(struct base_data *base,
+						  struct base_data *prev_base)
 {
-	int i, ref_first, ref_last, ofs_first, ofs_last;
-
-	/*
-	 * This is a recursive function. Those brackets should help reducing
-	 * stack usage by limiting the scope of the delta_base union.
-	 */
-	{
+	if (base->ref_last == -1 && base->ofs_last == -1) {
 		union delta_base base_spec;
 
 		hashcpy(base_spec.sha1, base->obj->idx.sha1);
 		find_delta_children(&base_spec,
-				    &ref_first, &ref_last, OBJ_REF_DELTA);
+				    &base->ref_first, &base->ref_last, OBJ_REF_DELTA);
 
 		memset(&base_spec, 0, sizeof(base_spec));
 		base_spec.offset = base->obj->idx.offset;
 		find_delta_children(&base_spec,
-				    &ofs_first, &ofs_last, OBJ_OFS_DELTA);
-	}
+				    &base->ofs_first, &base->ofs_last, OBJ_OFS_DELTA);
 
-	if (ref_last == -1 && ofs_last == -1) {
-		free(base->data);
-		return;
-	}
+		if (base->ref_last == -1 && base->ofs_last == -1) {
+			free(base->data);
+			return NULL;
+		}
 
-	link_base_data(prev_base, base);
+		link_base_data(prev_base, base);
+	}
 
-	for (i = ref_first; i <= ref_last; i++) {
-		struct object_entry *child = objects + deltas[i].obj_no;
-		struct base_data result;
+	if (base->ref_first <= base->ref_last) {
+		struct object_entry *child = objects + deltas[base->ref_first].obj_no;
+		struct base_data *result = alloc_base_data();
 
 		assert(child->real_type == OBJ_REF_DELTA);
-		resolve_delta(child, base, &result);
-		if (i == ref_last && ofs_last == -1)
+		resolve_delta(child, base, result);
+		if (base->ref_first == base->ref_last && base->ofs_last == -1)
 			free_base_data(base);
-		find_unresolved_deltas(&result, base);
+
+		base->ref_first++;
+		return result;
 	}
 
-	for (i = ofs_first; i <= ofs_last; i++) {
-		struct object_entry *child = objects + deltas[i].obj_no;
-		struct base_data result;
+	if (base->ofs_first <= base->ofs_last) {
+		struct object_entry *child = objects + deltas[base->ofs_first].obj_no;
+		struct base_data *result = alloc_base_data();
 
 		assert(child->real_type == OBJ_OFS_DELTA);
-		resolve_delta(child, base, &result);
-		if (i == ofs_last)
+		resolve_delta(child, base, result);
+		if (base->ofs_first == base->ofs_last)
 			free_base_data(base);
-		find_unresolved_deltas(&result, base);
+
+		base->ofs_first++;
+		return result;
 	}
 
 	unlink_base_data(base);
+	return NULL;
+}
+
+static void find_unresolved_deltas(struct base_data *base)
+{
+	struct base_data *new_base, *prev_base = NULL;
+	for (;;) {
+		new_base = find_unresolved_deltas_1(base, prev_base);
+
+		if (new_base) {
+			prev_base = base;
+			base = new_base;
+		}
+		else {
+			free(base);
+			base = prev_base;
+			if (!base)
+				return;
+			prev_base = base->base;
+		}
+	}
 }
 
 static int compare_delta_entry(const void *a, const void *b)
@@ -684,13 +714,13 @@ static void parse_pack_objects(unsigned char *sha1)
 		progress = start_progress("Resolving deltas", nr_deltas);
 	for (i = 0; i < nr_objects; i++) {
 		struct object_entry *obj = &objects[i];
-		struct base_data base_obj;
+		struct base_data *base_obj = alloc_base_data();
 
 		if (is_delta_type(obj->type))
 			continue;
-		base_obj.obj = obj;
-		base_obj.data = NULL;
-		find_unresolved_deltas(&base_obj, NULL);
+		base_obj->obj = obj;
+		base_obj->data = NULL;
+		find_unresolved_deltas(base_obj);
 		display_progress(progress, nr_resolved_deltas);
 	}
 }
@@ -783,20 +813,20 @@ static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
 	for (i = 0; i < n; i++) {
 		struct delta_entry *d = sorted_by_pos[i];
 		enum object_type type;
-		struct base_data base_obj;
+		struct base_data *base_obj = alloc_base_data();
 
 		if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
 			continue;
-		base_obj.data = read_sha1_file(d->base.sha1, &type, &base_obj.size);
-		if (!base_obj.data)
+		base_obj->data = read_sha1_file(d->base.sha1, &type, &base_obj->size);
+		if (!base_obj->data)
 			continue;
 
-		if (check_sha1_signature(d->base.sha1, base_obj.data,
-				base_obj.size, typename(type)))
+		if (check_sha1_signature(d->base.sha1, base_obj->data,
+				base_obj->size, typename(type)))
 			die("local object %s is corrupt", sha1_to_hex(d->base.sha1));
-		base_obj.obj = append_obj_to_pack(f, d->base.sha1,
-					base_obj.data, base_obj.size, type);
-		find_unresolved_deltas(&base_obj, NULL);
+		base_obj->obj = append_obj_to_pack(f, d->base.sha1,
+					base_obj->data, base_obj->size, type);
+		find_unresolved_deltas(base_obj);
 		display_progress(progress, nr_resolved_deltas);
 	}
 	free(sorted_by_pos);
-- 
1.7.8.36.g69ee2

^ permalink raw reply related

* [PATCH 3/3] index-pack: eliminate unlimited recursion in get_delta_base()
From: Nguyễn Thái Ngọc Duy @ 2011-12-26 12:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Shawn O. Pearce,
	Nguyễn Thái Ngọc Duy
In-Reply-To: <1324901080-23215-1-git-send-email-pclouds@gmail.com>

Revert the order of delta applying so that by the time a delta is
applied, its base is either non-delta or already inflated.
get_delta_base() is still recursive, but because base's data is always
ready, the inner get_delta_base() call never has any chance to call
itself again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/index-pack.c |   30 +++++++++++++++++++++---------
 1 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index d311c05..e8801bb 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -519,10 +519,25 @@ static void *get_base_data(struct base_data *c)
 {
 	if (!c->data) {
 		struct object_entry *obj = c->obj;
+		struct base_data **delta = NULL;
+		int delta_nr = 0, delta_alloc = 0;
 
-		if (is_delta_type(obj->type)) {
-			void *base = get_base_data(c->base);
-			void *raw = get_data_from_pack(obj);
+		for (; is_delta_type(c->obj->type); c = c->base) {
+			ALLOC_GROW(delta, delta_nr + 1, delta_alloc);
+			delta[delta_nr++] = c;
+		}
+		if (!delta_nr) {
+			c->data = get_data_from_pack(obj);
+			c->size = obj->size;
+			base_cache_used += c->size;
+			prune_base_data(c);
+		}
+		for (; delta_nr > 0; delta_nr--) {
+			void *base, *raw;
+			c = delta[delta_nr - 1];
+			obj = c->obj;
+			base = get_base_data(c->base);
+			raw = get_data_from_pack(obj);
 			c->data = patch_delta(
 				base, c->base->size,
 				raw, obj->size,
@@ -530,13 +545,10 @@ static void *get_base_data(struct base_data *c)
 			free(raw);
 			if (!c->data)
 				bad_object(obj->idx.offset, "failed to apply delta");
-		} else {
-			c->data = get_data_from_pack(obj);
-			c->size = obj->size;
+			base_cache_used += c->size;
+			prune_base_data(c);
 		}
-
-		base_cache_used += c->size;
-		prune_base_data(c);
+		free(delta);
 	}
 	return c->data;
 }
-- 
1.7.8.36.g69ee2

^ permalink raw reply related

* Re: [PATCH] add post-fetch hook
From: Joey Hess @ 2011-12-26 15:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vlipz930t.fsf@alter.siamese.dyndns.org>


[-- Attachment #1.1: Type: text/plain, Size: 4024 bytes --]

Junio C Hamano wrote:
> Thanks. I do have a few comments.
> 
> This hook is no longer a "post" fetch hook. The mechanism lets the object
> transfer phase does its work and then rewrites/tweaks the result before
> fetch completes. To an outside observer, what the hook does is an integral
> part of what "fetch" does, and not something that happens _after_ fetch
> completes. I am bad at naming things, but something along the lines of
> "tweak-fetch" that makes it clear that what happens in the hook is still
> part of the fetching may be a more appropriate name, methinks.

I'd be happy to call it tweak-fetch.

> I very much on purpose said that the hook "must read everything from its
> standard input, *and* *then* return..." in my response. Your "Demo" hook
> emits output as it reads its input with sed, but your main process invokes
> the hook, drains everything with write_in_full() before starting to read a
> single line, so I suspect that your hook will deadlock when its output
> pipe buffer fills up without being read by the main process. Of course,
> for this deadlock to actually happen, you need to be fetching quite a lot
> of refs.

Doesn't having the hook be fed by the async process/thread avoid this
problem? That's exactly why I did that, being familiar with the problem
from stupid perl scripts like this one:

#!/usr/bin/perl
use FileHandle;
use IPC::Open2;
$pid = open2(*Reader, *Writer, "cat");
print Writer "stuff $_\n" for 1..100000; # blocks forever
close Writer;
print "got: $_" while <Reader>;

I've tested feeding large quantities of data through my demo hook
(just feed it all the lines repeated 100 thousand times) and have
not seen it block. And other code in git uses an async feeder similarly,
see for example convert.c's apply_filter(). So I think this is ok..?

> We seem to already have too many hook drivers, each of which hand-roll
> similar logic using run-command API. At some point, we would want to have
> a single "run_hook" helper function that takes:

This was also my feeling as I looked around the code.

> By formalizing the hook driver API that way, any hook driver that drives a
> tricky hook that may need a select(2) loop to avoid a deadlock in a way
> similar to your patch do would not have to worry about the issue, as the
> run_hook() helper would take care of it by reading from the hook's output
> pipe and drain the pipe by calling the "consumer" callback before calling
> the "generator" callback and feed more input to the hook to cause a
> deadlock.

Actually, run_hook would need to either have a select loop, or an async
feeder like I've used. The method you describe is itself prone to deadlock!
Consider a hook that *does* consume all its input before outputting anything.
The first call to the "consumer" callback would block.

>  - run a set of hooks on the same triggering condition. You may want to
>    have two "post-receive" hooks, one to feed an e-mail based notification
>    system and another to drive an autobuilder

Something I've always wanted, in fact..

> I have been wondering when would be the good time to refactor the hook
> driver API.  We can add your patch, after polishing it enough to make it
> ready for inclusion, independent of the hook API refactoring. But that
> would mean that it would require more work when refactoring the hook API,
> as we would have one more hand-rolled hook caller that is based on
> run_command().

On the other hand, I have uses for this hook that are blocked on it
getting into git, so would rather not see it hung up in a general
refactoring. If it helps, I'd be happy to help with a hook refactoring
later.

The latest version of my patch (attached) already lifts reading from the
hook out into a helper function, so should need not many changes in such
a refactoring -- most of my run_tweak_fetch_hook would simply go away
then. I've also cleaned it up a lot and and pretty happy with it now.

-- 
see shy jo

[-- Attachment #1.2: 0001-preparations-for-tweak-fetch-hook.patch --]
[-- Type: text/x-diff, Size: 4094 bytes --]

From a06b5ef9908f56692a07fb37f5794c4122f10491 Mon Sep 17 00:00:00 2001
From: Joey Hess <joey@kitenet.net>
Date: Mon, 26 Dec 2011 10:44:55 -0400
Subject: [PATCH 1/2] preparations for tweak-fetch hook

No behavior changes yet, only some groundwork for the next
change.

The refs_result structure combines a status code with a ref map,
which can be NULL even on success. This will be needed when
there's a tweak-fetch hook, because it can filter out all refs,
while still succeeding.

fetch_refs returns a refs_result, so that it can modify the ref_map.
---
 builtin/fetch.c |   54 +++++++++++++++++++++++++++++++++++-------------------
 1 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 33ad3aa..70b9f89 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -29,6 +29,11 @@ enum {
 	TAGS_SET = 2
 };
 
+struct refs_result {
+	struct ref *new_refs;
+	int status;
+};
+
 static int all, append, dry_run, force, keep, multiple, prune, update_head_ok, verbosity;
 static int progress, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int tags = TAGS_DEFAULT;
@@ -89,6 +94,15 @@ static struct option builtin_fetch_options[] = {
 	OPT_END()
 };
 
+static int add_existing(const char *refname, const unsigned char *sha1,
+			int flag, void *cbdata)
+{
+	struct string_list *list = (struct string_list *)cbdata;
+	struct string_list_item *item = string_list_insert(list, refname);
+	item->util = (void *)sha1;
+	return 0;
+}
+
 static void unlock_pack(void)
 {
 	if (transport)
@@ -507,17 +521,24 @@ static int quickfetch(struct ref *ref_map)
 	return check_everything_connected(iterate_ref_map, 1, &rm);
 }
 
-static int fetch_refs(struct transport *transport, struct ref *ref_map)
+static struct refs_result fetch_refs(struct transport *transport,
+		struct ref *ref_map)
 {
-	int ret = quickfetch(ref_map);
-	if (ret)
-		ret = transport_fetch_refs(transport, ref_map);
-	if (!ret)
-		ret |= store_updated_refs(transport->url,
+	struct refs_result res;
+	res.status = quickfetch(ref_map);
+	if (res.status)
+		res.status = transport_fetch_refs(transport, ref_map);
+	if (!res.status) {
+		res.new_refs = ref_map
+		res.status |= store_updated_refs(transport->url,
 				transport->remote->name,
-				ref_map);
+				res.new_refs);
+	}
+	else {
+		res.new_refs = ref_map;
+	}
 	transport_unlock_pack(transport);
-	return ret;
+	return res;
 }
 
 static int prune_refs(struct refspec *refs, int ref_count, struct ref *ref_map)
@@ -542,15 +563,6 @@ static int prune_refs(struct refspec *refs, int ref_count, struct ref *ref_map)
 	return result;
 }
 
-static int add_existing(const char *refname, const unsigned char *sha1,
-			int flag, void *cbdata)
-{
-	struct string_list *list = (struct string_list *)cbdata;
-	struct string_list_item *item = string_list_insert(list, refname);
-	item->util = (void *)sha1;
-	return 0;
-}
-
 static int will_fetch(struct ref **head, const unsigned char *sha1)
 {
 	struct ref *rm = *head;
@@ -673,6 +685,7 @@ static int do_fetch(struct transport *transport,
 	struct string_list_item *peer_item = NULL;
 	struct ref *ref_map;
 	struct ref *rm;
+	struct refs_result res;
 	int autotags = (transport->remote->fetch_tags == 1);
 
 	for_each_ref(add_existing, &existing_refs);
@@ -710,7 +723,9 @@ static int do_fetch(struct transport *transport,
 
 	if (tags == TAGS_DEFAULT && autotags)
 		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
-	if (fetch_refs(transport, ref_map)) {
+	res = fetch_refs(transport, ref_map);
+	ref_map = res.new_refs;
+	if (res.status) {
 		free_refs(ref_map);
 		return 1;
 	}
@@ -750,7 +765,8 @@ static int do_fetch(struct transport *transport,
 		if (ref_map) {
 			transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
 			transport_set_option(transport, TRANS_OPT_DEPTH, "0");
-			fetch_refs(transport, ref_map);
+			res = fetch_refs(transport, ref_map);
+			ref_map = res.new_refs;
 		}
 		free_refs(ref_map);
 	}
-- 
1.7.7.3


[-- Attachment #1.3: 0002-add-tweak-fetch-hook.patch --]
[-- Type: text/x-diff, Size: 7599 bytes --]

From 073b0921bb5988628e7af423924c410f522f403a Mon Sep 17 00:00:00 2001
From: Joey Hess <joey@kitenet.net>
Date: Mon, 26 Dec 2011 10:53:27 -0400
Subject: [PATCH 2/2] add tweak-fetch hook

The tweak-fetch hook is fed lines on stdin for all refs that were fetched,
and outputs on stdout possibly modified lines. Its output is parsed and
used when git fetch updates the remote tracking refs, records the entries
in FETCH_HEAD, and produces its report.
---
 Documentation/githooks.txt |   29 +++++++
 builtin/fetch.c            |  191 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 219 insertions(+), 1 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index 28edefa..be2624c 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -162,6 +162,35 @@ This hook can be used to perform repository validity checks, auto-display
 differences from the previous HEAD if different, or set working dir metadata
 properties.
 
+tweak-fetch
+~~~~~~~~~~
+
+This hook is invoked by 'git fetch' (commonly called by 'git pull'), after
+refs have been fetched from the remote repository. It is not executed, if
+nothing was fetched.
+
+The output of the hook is used to update the remote-tracking branches, and
+`.git/FETCH_HEAD`, in preparation for for a later merge operation done by
+'git merge'.
+
+It takes no arguments, but is fed a line of the following format on
+its standard input for each ref that was fetched.
+
+  <sha1> SP not-for-merge|merge SP <remote-refname> SP <local-refname> LF
+
+Where the "not-for-merge" flag indicates the ref is not to be merged into the
+current branch, and the "merge" flag indicates that 'git merge' should
+later merge it. The `<remote-refname>` is the remote's name for the ref
+that was pulled, and `<local-refname>` is a name of a remote-tracking branch,
+like "refs/remotes/origin/master", or can be empty if the fetched ref is not
+being stored in a local refname.
+
+The hook must consume all of its standard input, and output back lines
+of the same format. It can modify its input as desired, including
+adding or removing lines, updating the sha1 (i.e. re-point the
+remote-tracking branch), changing the merge flag, and changing the
+`<local-refname>` (i.e. use different remote-tracking branch).
+
 post-merge
 ~~~~~~~~~~
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 70b9f89..5434b6f 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -103,6 +103,194 @@ static int add_existing(const char *refname, const unsigned char *sha1,
 	return 0;
 }
 
+static const char tweak_fetch_hook[] = "tweak-fetch";
+
+int feed_tweak_fetch_hook (int in, int out, void *data)
+{
+	struct ref *ref;
+	struct strbuf buf = STRBUF_INIT;
+	int ret;
+
+	for (ref = data; ref; ref = ref->next) {
+		strbuf_addstr(&buf, sha1_to_hex(ref->old_sha1));
+		strbuf_addch(&buf, ' ');
+		strbuf_addstr(&buf, ref->merge ? "merge" : "not-for-merge");
+		strbuf_addch(&buf, ' ');
+		if (ref->name)
+			strbuf_addstr(&buf, ref->name);
+		strbuf_addch(&buf, ' ');
+		if (ref->peer_ref && ref->peer_ref->name)
+			strbuf_addstr(&buf, ref->peer_ref->name);
+		strbuf_addch(&buf, '\n');
+	}
+
+	ret = write_in_full(out, buf.buf, buf.len) != buf.len;
+	if (ret)
+		warning("%s hook failed to consume all its input",
+				tweak_fetch_hook);
+	close(out);
+	strbuf_release(&buf);
+	return ret;
+}
+	
+struct ref *parse_tweak_fetch_hook_line (char *l, 
+		struct string_list *existing_refs)
+{
+	struct ref *ref = NULL, *peer_ref = NULL;
+	struct string_list_item *peer_item = NULL;
+	char *words[4];
+	int i, word=0;
+	char *problem;
+
+	for (i=0; l[i]; i++) {
+		if (isspace(l[i])) {
+			l[i]='\0';
+			words[word]=l;
+			l+=i+1;
+			i=0;
+			word++;
+			if (word > 3) {
+				problem="too many words";
+				goto unparsable;
+			}
+		}
+	}
+	if (word < 3) {
+		problem="not enough words";
+		goto unparsable;
+	}
+	
+	ref = alloc_ref(words[2]);
+	peer_ref = ref->peer_ref = alloc_ref(l);
+	ref->peer_ref->force=1;
+
+	if (get_sha1_hex(words[0], ref->old_sha1)) {
+		problem="bad sha1";
+		goto unparsable;
+	}
+
+	if (strcmp(words[1], "merge") == 0) {
+		ref->merge=1;
+	}
+	else if (strcmp(words[1], "not-for-merge") != 0) {
+		problem="bad merge flag";
+		goto unparsable;
+	}
+
+	peer_item = string_list_lookup(existing_refs, peer_ref->name);
+	if (peer_item)
+		hashcpy(peer_ref->old_sha1, peer_item->util);
+
+	return ref;
+
+ unparsable:
+	warning("%s hook output a wrongly formed line: %s",
+			tweak_fetch_hook, problem);
+	free(ref);
+	free(peer_ref);
+	return NULL;
+}
+
+struct refs_result read_tweak_fetch_hook (int in) {
+	struct refs_result res;
+	FILE *f;
+	struct strbuf buf;
+	struct string_list existing_refs = STRING_LIST_INIT_NODUP;
+	struct ref *ref, *prevref=NULL;
+
+	res.status = 0;
+	res.new_refs = NULL;
+
+	f = fdopen(in, "r");
+	if (f == NULL) {
+		res.status = 1;
+		return res;
+	}
+
+	strbuf_init(&buf, 128);
+	for_each_ref(add_existing, &existing_refs);
+
+	while (strbuf_getline(&buf, f, '\n') != EOF) {
+		char *l = strbuf_detach(&buf, NULL);
+		ref = parse_tweak_fetch_hook_line(l, &existing_refs);
+		if (ref) {
+			if (prevref) {
+				prevref->next=ref;
+				prevref=ref;
+			}
+			else {
+				res.new_refs = prevref = ref;
+			}
+		}
+		else {
+			res.status = 1;
+		}
+		free(l);
+	}
+
+	string_list_clear(&existing_refs, 0);
+	strbuf_release(&buf);
+	fclose(f);
+	return res;
+}
+
+/* The hook is fed lines of the form:
+ * <sha1> SP <not-for-merge|merge> SP <remote-refname> SP <local-refname> LF
+ * And should output rewritten lines of the same form.
+ */
+struct ref *run_tweak_fetch_hook (struct ref *fetched_refs)
+{
+	struct child_process hook;
+	const char *argv[2];
+	struct async async;
+	struct refs_result res;
+
+	if (! fetched_refs)
+		return fetched_refs;
+
+	argv[0] = git_path("hooks/%s", tweak_fetch_hook);
+	if (access(argv[0], X_OK) < 0)
+		return fetched_refs;
+	argv[1] = NULL;
+
+	memset(&hook, 0, sizeof(hook));
+	hook.argv = argv;
+	hook.in = -1;
+	hook.out = -1;
+	if (start_command(&hook))
+		return fetched_refs;
+
+	/* Use an async writer to feed the hook process.
+	 * This allows the hook to read and write a line at
+	 * a time without blocking. */
+	memset(&async, 0, sizeof(async));
+	async.proc = feed_tweak_fetch_hook;
+	async.data = fetched_refs;
+	async.out = hook.in;
+	if (start_async(&async)) {
+		close(hook.in);
+		close(hook.out);
+		finish_command(&hook);
+		return fetched_refs;
+	}
+	res = read_tweak_fetch_hook(hook.out);
+	res.status |= finish_async(&async);
+	res.status |= finish_command(&hook);
+
+	if (res.status) {
+		warning("%s hook failed, ignoring its output", tweak_fetch_hook);
+		free(res.new_refs);
+		return fetched_refs;
+	}
+	else {
+		/* The new_refs are returned, to be used in place of
+		 * fetched_refs, so it is not needed anymore and can
+		 * be freed here. */
+		free_refs(fetched_refs);
+		return res.new_refs;
+	}
+}
+
 static void unlock_pack(void)
 {
 	if (transport)
@@ -529,7 +717,8 @@ static struct refs_result fetch_refs(struct transport *transport,
 	if (res.status)
 		res.status = transport_fetch_refs(transport, ref_map);
 	if (!res.status) {
-		res.new_refs = ref_map
+		res.new_refs = run_tweak_fetch_hook(ref_map);
+
 		res.status |= store_updated_refs(transport->url,
 				transport->remote->name,
 				res.new_refs);
-- 
1.7.7.3


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related

* [PATCH] write first for-merge ref to FETCH_HEAD first
From: Joey Hess @ 2011-12-26 16:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vd3bb929n.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 6263 bytes --]

The FETCH_HEAD refname is supposed to refer to the ref that was fetched
and should be merged. However all fetched refs are written to
.git/FETCH_HEAD in an arbitrary order, and resolve_ref_unsafe simply
takes the first ref as the FETCH_HEAD, which is often the wrong one,
when other branches were also fetched.

The solution is to write the for-merge ref(s) to FETCH_HEAD first.
Then, unless --append is used, the FETCH_HEAD refname behaves as intended.
If the user uses --append, they presumably are doing so in order to
preserve the old FETCH_HEAD.

Also included a fix to documentation that assumes FETCH_HEAD contains
only a single ref.
---
 Documentation/git-read-tree.txt |    2 +-
 builtin/fetch.c                 |  158 +++++++++++++++++++++------------------
 2 files changed, 85 insertions(+), 75 deletions(-)

diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index 5375549..2d3ff23 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -342,7 +342,7 @@ since you pulled from him:
 
 ----------------
 $ git fetch git://.... linus
-$ LT=`cat .git/FETCH_HEAD`
+$ LT=`git rev-parse FETCH_HEAD`
 ----------------
 
 Your work tree is still based on your HEAD ($JC), but you have
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 33ad3aa..db565cd 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -377,6 +377,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	const char *what, *kind;
 	struct ref *rm;
 	char *url, *filename = dry_run ? "/dev/null" : git_path("FETCH_HEAD");
+	signed int want_merge;
 
 	fp = fopen(filename, "a");
 	if (!fp)
@@ -393,84 +394,93 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 		goto abort;
 	}
 
-	for (rm = ref_map; rm; rm = rm->next) {
-		struct ref *ref = NULL;
-
-		if (rm->peer_ref) {
-			ref = xcalloc(1, sizeof(*ref) + strlen(rm->peer_ref->name) + 1);
-			strcpy(ref->name, rm->peer_ref->name);
-			hashcpy(ref->old_sha1, rm->peer_ref->old_sha1);
-			hashcpy(ref->new_sha1, rm->old_sha1);
-			ref->force = rm->peer_ref->force;
-		}
+	/* Two passes are made over the ref_map, to write merge refs
+	 * to FETCH_HEAD first. This allows using FETCH_HEAD as a refname
+	 * to refer to the ref to be merged. */
+	for (want_merge = 1; want_merge >= 0 ; want_merge--) {
+		for (rm = ref_map; rm; rm = rm->next) {
+			struct ref *ref = NULL;
+
+			commit = lookup_commit_reference_gently(rm->old_sha1, 1);
+			if (!commit)
+				rm->merge = 0;
+
+			if (rm->merge != want_merge)
+				continue;
+
+			if (rm->peer_ref) {
+				ref = xcalloc(1, sizeof(*ref) + strlen(rm->peer_ref->name) + 1);
+				strcpy(ref->name, rm->peer_ref->name);
+				hashcpy(ref->old_sha1, rm->peer_ref->old_sha1);
+				hashcpy(ref->new_sha1, rm->old_sha1);
+				ref->force = rm->peer_ref->force;
+			}
 
-		commit = lookup_commit_reference_gently(rm->old_sha1, 1);
-		if (!commit)
-			rm->merge = 0;
 
-		if (!strcmp(rm->name, "HEAD")) {
-			kind = "";
-			what = "";
-		}
-		else if (!prefixcmp(rm->name, "refs/heads/")) {
-			kind = "branch";
-			what = rm->name + 11;
-		}
-		else if (!prefixcmp(rm->name, "refs/tags/")) {
-			kind = "tag";
-			what = rm->name + 10;
-		}
-		else if (!prefixcmp(rm->name, "refs/remotes/")) {
-			kind = "remote-tracking branch";
-			what = rm->name + 13;
-		}
-		else {
-			kind = "";
-			what = rm->name;
-		}
+			if (!strcmp(rm->name, "HEAD")) {
+				kind = "";
+				what = "";
+			}
+			else if (!prefixcmp(rm->name, "refs/heads/")) {
+				kind = "branch";
+				what = rm->name + 11;
+			}
+			else if (!prefixcmp(rm->name, "refs/tags/")) {
+				kind = "tag";
+				what = rm->name + 10;
+			}
+			else if (!prefixcmp(rm->name, "refs/remotes/")) {
+				kind = "remote-tracking branch";
+				what = rm->name + 13;
+			}
+			else {
+				kind = "";
+				what = rm->name;
+			}
 
-		url_len = strlen(url);
-		for (i = url_len - 1; url[i] == '/' && 0 <= i; i--)
-			;
-		url_len = i + 1;
-		if (4 < i && !strncmp(".git", url + i - 3, 4))
-			url_len = i - 3;
-
-		strbuf_reset(&note);
-		if (*what) {
-			if (*kind)
-				strbuf_addf(&note, "%s ", kind);
-			strbuf_addf(&note, "'%s' of ", what);
-		}
-		fprintf(fp, "%s\t%s\t%s",
-			sha1_to_hex(rm->old_sha1),
-			rm->merge ? "" : "not-for-merge",
-			note.buf);
-		for (i = 0; i < url_len; ++i)
-			if ('\n' == url[i])
-				fputs("\\n", fp);
-			else
-				fputc(url[i], fp);
-		fputc('\n', fp);
-
-		strbuf_reset(&note);
-		if (ref) {
-			rc |= update_local_ref(ref, what, &note);
-			free(ref);
-		} else
-			strbuf_addf(&note, "* %-*s %-*s -> FETCH_HEAD",
-				    TRANSPORT_SUMMARY_WIDTH,
-				    *kind ? kind : "branch",
-				    REFCOL_WIDTH,
-				    *what ? what : "HEAD");
-		if (note.len) {
-			if (verbosity >= 0 && !shown_url) {
-				fprintf(stderr, _("From %.*s\n"),
-						url_len, url);
-				shown_url = 1;
+			url_len = strlen(url);
+			for (i = url_len - 1; url[i] == '/' && 0 <= i; i--)
+				;
+			url_len = i + 1;
+			if (4 < i && !strncmp(".git", url + i - 3, 4))
+				url_len = i - 3;
+
+			strbuf_reset(&note);
+			if (*what) {
+				if (*kind)
+					strbuf_addf(&note, "%s ", kind);
+				strbuf_addf(&note, "'%s' of ", what);
+			}
+			fprintf(fp, "%s\t%s\t%s",
+				sha1_to_hex(rm->old_sha1),
+				rm->merge ? "" : "not-for-merge",
+				note.buf);
+			for (i = 0; i < url_len; ++i)
+				if ('\n' == url[i])
+					fputs("\\n", fp);
+				else
+					fputc(url[i], fp);
+			fputc('\n', fp);
+
+			strbuf_reset(&note);
+			if (ref) {
+				rc |= update_local_ref(ref, what, &note);
+				free(ref);
+			} else
+				strbuf_addf(&note, "* %-*s %-*s -> FETCH_HEAD",
+					    TRANSPORT_SUMMARY_WIDTH,
+					    *kind ? kind : "branch",
+					    REFCOL_WIDTH,
+					    *what ? what : "HEAD");
+			if (note.len) {
+				if (verbosity >= 0 && !shown_url) {
+					fprintf(stderr, _("From %.*s\n"),
+							url_len, url);
+					shown_url = 1;
+				}
+				if (verbosity >= 0)
+					fprintf(stderr, " %s\n", note.buf);
 			}
-			if (verbosity >= 0)
-				fprintf(stderr, " %s\n", note.buf);
 		}
 	}
 
-- 
1.7.7.3


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related

* Re: [PATCH] honour GIT_ASKPASS for querying username in git-svn
From: Sven Strickroth @ 2011-12-26 23:49 UTC (permalink / raw)
  To: git; +Cc: Jeff King, gitster
In-Reply-To: <20111130064401.GC5317@sigill.intra.peff.net>

Hi,

Am 30.11.2011 07:44 schrieb Jeff King:
> That aside, I think this is an improvement over the current code.
>   1. Regular git will also respect SSH_ASKPASS
>   2. Regular git will ignore an askpass variable that is set but empty.
> Perhaps git-svn should be refactored to have a reusable "prompt"
> function that respects askpass and tries to behave like C git? It could
> even go into the Git perl module.

I honoured all your ideas. Hopefully the patches can be applied now. The new patches
follow (you can also pull from git://github.com/csware/git.git askpass-prompt):

>From b760546c59d1b9982296c19f8eaea6dc225b5a4f Mon Sep 17 00:00:00 2001
From: Sven Strickroth <email@cs-ware.de>
Date: Tue, 27 Dec 2011 00:33:46 +0100
Subject: [PATCH 1/4] add central method for prompting a user using
 GIT_ASKPASS or SSH_ASKPASS

Signed-off-by: Sven Strickroth <email@cs-ware.de>
---
 perl/Git.pm |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index f7ce511..8176d47 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -58,7 +58,7 @@ require Exporter;
                 command_output_pipe command_input_pipe command_close_pipe
                 command_bidi_pipe command_close_bidi_pipe
                 version exec_path html_path hash_object git_cmd_try
-                remote_refs
+                remote_refs prompt
                 temp_acquire temp_release temp_reset temp_path);


@@ -512,6 +512,35 @@ C<git --html-path>). Useful mostly only internally.
 sub html_path { command_oneline('--html-path') }


+=item prompt ( PROMPT)
+
+Checks if GIT_ASKPASS or SSH_ASKPASS is set, and if yes
+use it and return answer from user.
+
+=cut
+
+sub prompt {
+	my ($self, $prompt) = _maybe_self(@_);
+	if (exists $ENV{'GIT_ASKPASS'}) {
+		return _prompt($ENV{'GIT_ASKPASS'}, $prompt);
+	} elsif (exists $ENV{'SSH_ASKPASS'}) {
+		return _prompt($ENV{'SSH_ASKPASS'}, $prompt);
+	} else {
+		return undef;
+	}
+}
+
+sub _prompt {
+	my ($self, $askpass, $prompt) = _maybe_self(@_);
+	my $ret;
+	open(PH, "-|", $askpass, $prompt);
+	$ret = <PH>;
+	$ret =~ s/[\012\015]//g; # strip \n\r
+	close(PH);
+	return $ret;
+}
+
+
 =item repo_path ()

 Return path to the git repository. Must be called on a repository instance.
-- 
1.7.7.1.msysgit.0

>From ef4c6557d1b0e33440d13c64742d44b2a22143f3 Mon Sep 17 00:00:00 2001
From: Sven Strickroth <email@cs-ware.de>
Date: Tue, 27 Dec 2011 00:34:09 +0100
Subject: [PATCH 2/4] switch to central prompt method

Signed-off-by: Sven Strickroth <email@cs-ware.de>
---
 git-svn.perl |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index eeb83d3..4fd4eca 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -4415,13 +4415,8 @@ sub username {

 sub _read_password {
 	my ($prompt, $realm) = @_;
-	my $password = '';
-	if (exists $ENV{GIT_ASKPASS}) {
-		open(PH, "-|", $ENV{GIT_ASKPASS}, $prompt);
-		$password = <PH>;
-		$password =~ s/[\012\015]//; # \n\r
-		close(PH);
-	} else {
+	my $password = Git->prompt($prompt);;
+	if (!defined $password) {
 		print STDERR $prompt;
 		STDERR->flush;
 		require Term::ReadKey;
-- 
1.7.7.1.msysgit.0

>From d58f41d7b9b8e690c9839f6f7539774da88aa3a4 Mon Sep 17 00:00:00 2001
From: Sven Strickroth <email@cs-ware.de>
Date: Tue, 27 Dec 2011 00:37:43 +0100
Subject: [PATCH 3/4] honour *_ASKPASS for querying username and for querying
 further actions on unknown certificates

git-svn reads usernames (and answers for certificate errors) from an interactive terminal.
This behavior cause GUIs to hang waiting for git-svn to complete (http://code.google.com/p/tortoisegit/issues/detail?id=967).

Also see commit 56a853b62c0ae7ebaad0a7a0a704f5ef561eb795.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
---
 git-svn.perl |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 4fd4eca..b85a7de 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -4357,11 +4357,15 @@ sub ssl_server_trust {
 	                               issuer_dname fingerprint);
 	my $choice;
 prompt:
-	print STDERR $may_save ?
+	my $options = $may_save ?
 	      "(R)eject, accept (t)emporarily or accept (p)ermanently? " :
 	      "(R)eject or accept (t)emporarily? ";
-	STDERR->flush;
-	$choice = lc(substr(<STDIN> || 'R', 0, 1));
+	$choice = Git->prompt("Certificate unknown. " . $options);
+	if (!defined $choice) {
+		print STDERR $options;
+		STDERR->flush;
+		$choice = lc(substr(<STDIN> || 'R', 0, 1));
+	}
 	if ($choice =~ /^t$/i) {
 		$cred->may_save(undef);
 	} elsif ($choice =~ /^r$/i) {
@@ -4404,6 +4408,9 @@ sub username {
 	if (defined $_username) {
 		$username = $_username;
 	} else {
+		$username = Git->prompt("Username");
+	}
+	if (!defined $username) {
 		print STDERR "Username: ";
 		STDERR->flush;
 		chomp($username = <STDIN>);
-- 
1.7.7.1.msysgit.0

>From 2c1dbdae8024f28d17abfbdc7e45865a1277151a Mon Sep 17 00:00:00 2001
From: Sven Strickroth <email@cs-ware.de>
Date: Tue, 27 Dec 2011 00:42:07 +0100
Subject: [PATCH 4/4] ignore empty *_ASKPASS variables

Signed-off-by: Sven Strickroth <email@cs-ware.de>
---
 perl/Git.pm |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index 8176d47..fade617 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -532,6 +532,9 @@ sub prompt {

 sub _prompt {
 	my ($self, $askpass, $prompt) = _maybe_self(@_);
+	unless ($askpass) {
+		return undef;
+	}
 	my $ret;
 	open(PH, "-|", $askpass, $prompt);
 	$ret = <PH>;
-- 
1.7.7.1.msysgit.0

-- 
Best regards,
 Sven Strickroth
 ClamAV, a GPL anti-virus toolkit   http://www.clamav.net
 PGP key id F5A9D4C4 @ any key-server

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox