* Re: [PATCH v7 0/3] submodule update: add --remote for submodule's upstream changes
From: W. Trevor King @ 2012-12-12 15:24 UTC (permalink / raw)
To: Junio C Hamano
Cc: Git, Heiko Voigt, Jeff King, Phil Hord, Shawn Pearce,
Jens Lehmann, Nahor
In-Reply-To: <7vtxrr6d2f.fsf@alter.siamese.dyndns.org>
On Tue, Dec 11, 2012 at 09:42:48PM -0800, Junio C Hamano wrote:
> What branch did you base this series on?
Every version of this series has been based on v1.8.0.
> The preimage of git-submodule.sh in [2/3] does not seem to match
> anything I have (I could wiggle the patch, but in general I would
> rather prefer not having to).
From patch 1/3:
diff --git a/git-submodule.sh b/git-submodule.sh
index ab6b110..f969f28 100755
And from patch 2/3:
diff --git a/git-submodule.sh b/git-submodule.sh
index f969f28..1395079 100755
ab6b110 is in v1.8.0:
$ git ls-tree v1.8.0 git-submodule.sh
100755 blob ab6b1107b6090494f192f361471ed5748ffa7dc1 git-submodule.sh
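For anyone checking a preimage the same way: the abbreviated hashes on a diff's "index" line are blob object names, and a blob's name is the SHA-1 of the header "blob <size>\0" followed by the file contents. A minimal sketch of that computation, assuming a `sha1sum` utility is available; the helper name `blob_id` is made up for illustration and is not part of git (`git hash-object <file>` is the real command for this):

```shell
# blob_id FILE: print the object name git would assign to FILE's contents
# as a blob, i.e. the SHA-1 of "blob <size>\0" followed by the file bytes.
blob_id () {
	size=$(wc -c <"$1" | tr -d ' ')
	{ printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

# Example input: a file containing the single line "hello".
tmp=$(mktemp)
printf 'hello\n' >"$tmp"
blob_id "$tmp"    # prints ce013625030ba8dba906f756967f9e9ca394464a
rm -f "$tmp"
```

In a checkout of v1.8.0, and assuming no content filters are in play, `blob_id git-submodule.sh` should print a name starting with ab6b110, matching the ls-tree output above.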
I can reroll if necessary, but I'm not sure what I've done wrong…
Cheers,
Trevor
* Re: [PATCH v2] submodule: add 'deinit' command
From: Michael J Gruber @ 2012-12-12 15:08 UTC (permalink / raw)
To: Jens Lehmann
Cc: Junio C Hamano, Phil Hord, W. Trevor King, Git, Heiko Voigt,
Jeff King, Shawn Pearce, Nahor
In-Reply-To: <50BE6FB9.70301@web.de>
Jens Lehmann venit, vidit, dixit 04.12.2012 22:48:
> With "git submodule init" the user is able to tell git he cares about one
> or more submodules and wants to have it populated on the next call to "git
> submodule update". But currently there is no easy way he could tell git he
> does not care about a submodule anymore and wants to get rid of his local
> work tree (except he knows a lot about submodule internals and removes the
> "submodule.$name.url" setting from .git/config himself).
>
> Help those users by providing a 'deinit' command. This removes the whole
> submodule.<name> section from .git/config either for the given
> submodule(s) or for all those which have been initialized if none were
> given. Complain only when for a submodule given on the command line the
> url setting can't be found in .git/config.
Whoaaa, so why not have "git rm" remove everything unless I specify a
file to be removed?
I know I'm exaggerating a bit, but defaulting to "--all" for a
destructive operation seems to be a bit harsh, especially when the
command is targeted at "those" users that you mention.
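The alternative hinted at here, refusing to act unless something is named explicitly, could look roughly like the sketch below. It is not taken from the patch: `require_paths` and the `--all` flag are hypothetical names used only to illustrate the argument check.

```shell
# require_paths ARGS...: succeed only if the caller named at least one path
# or passed the (hypothetical) --all flag; otherwise complain and fail.
require_paths () {
	if test $# -eq 0
	then
		echo >&2 "fatal: no submodule paths given; use --all to act on every submodule"
		return 1
	fi
	return 0
}

require_paths 2>/dev/null || echo "refused: nothing named"
require_paths --all && echo "ok: --all given"
require_paths lib/foo && echo "ok: explicit path"
```

A command built this way can still offer the "everything" behaviour, but only when the user asks for it by name.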
> Add tests and link the man pages of "git submodule deinit" and "git rm"
> to assist the user in deciding whether removing or unregistering the
> submodule is the right thing to do for him.
>
> Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
> ---
>
> Am 03.12.2012 08:58, schrieb Junio C Hamano:
>> Jens Lehmann <Jens.Lehmann@web.de> writes:
>>
>>> Maybe the principle of least surprise is better followed when we
>>> nuke the whole section, as it might surprise the user more to have
>>> a setting resurrected he customized in the last life cycle of the
>>> submodule than seeing that after a deinit followed by an init all
>>> former customizations are consistently gone. So I tend to think now
>>> that removing the whole section would be the better solution here.
>>
>> I tend to agree; I suspect that a "deinit" would be mostly done
>> either to
>>
>> (1) correct mistakes the user made during a recent "init" and
>> perhaps "sync"; or
>>
>> (2) tell Git that the user has finished working with this particular
>> submodule and does not intend to use it for quite a while.
>>
>> For both (1) and (2), I think it would be easier to users if we gave
>> them a clean slate, the same state as the one the user who had never
>> run "init" on it would be in. A user in situation (1) is asking for
>> a clean slate, and a user in situation (2) is better served if he
>> does not have to worry about leftover entries in $GIT_DIR/config he
>> has long forgotten from many months ago (during which time the way
>> the project uses the particular submodule may well have changed)
>> giving non-standard experience different from what other project
>> participants would get.
>
> Changes in v2:
> - Remove the whole submodule section instead of only removing the
> "url" setting and explain why we do that in a comment
> - Reworded commit message and git-submodule.txt to reflect that
> - Extend the test to check that custom settings are removed
>
>
> Documentation/git-rm.txt | 4 ++++
> Documentation/git-submodule.txt | 12 ++++++++++
> git-submodule.sh | 52 ++++++++++++++++++++++++++++++++++++++++-
> t/t7400-submodule-basic.sh | 12 ++++++++++
> 4 files changed, 79 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
> index 262436b..ec42bf5 100644
> --- a/Documentation/git-rm.txt
> +++ b/Documentation/git-rm.txt
> @@ -149,6 +149,10 @@ files that aren't ignored are present in the submodules work tree.
> Ignored files are deemed expendable and won't stop a submodule's work
> tree from being removed.
>
> +If you only want to remove the local checkout of a submodule from your
> +work tree without committing that use `git submodule deinit` instead
> +(see linkgit:git-submodule[1]).
> +
> EXAMPLES
> --------
> `git rm Documentation/\*.txt`::
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index b1de3ba..08b55a7 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -13,6 +13,7 @@ SYNOPSIS
> [--reference <repository>] [--] <repository> [<path>]
> 'git submodule' [--quiet] status [--cached] [--recursive] [--] [<path>...]
> 'git submodule' [--quiet] init [--] [<path>...]
> +'git submodule' [--quiet] deinit [--] [<path>...]
> 'git submodule' [--quiet] update [--init] [-N|--no-fetch] [--rebase]
> [--reference <repository>] [--merge] [--recursive] [--] [<path>...]
> 'git submodule' [--quiet] summary [--cached|--files] [(-n|--summary-limit) <n>]
> @@ -134,6 +135,17 @@ init::
> the explicit 'init' step if you do not intend to customize
> any submodule locations.
>
> +deinit::
> + Unregister the submodules, i.e. remove the whole `submodule.$name`
> + section from .git/config. Further calls to `git submodule update`,
> + `git submodule foreach` and `git submodule sync` will skip any
> + unregistered submodules until they are initialized again, so use
> + this command if you don't want to have a local checkout of the
> + submodule in your work tree anymore (but note that this command
> + does not remove the submodule work tree). If you really want to
> + remove a submodule from the repository and commit that use
> + linkgit:git-rm[1] instead.
> +
> update::
> Update the registered submodules, i.e. clone missing submodules and
> checkout the commit specified in the index of the containing repository.
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 2365149..3f558ed 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -8,6 +8,7 @@ dashless=$(basename "$0" | sed -e 's/-/ /')
> USAGE="[--quiet] add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--] <repository> [<path>]
> or: $dashless [--quiet] status [--cached] [--recursive] [--] [<path>...]
> or: $dashless [--quiet] init [--] [<path>...]
> + or: $dashless [--quiet] deinit [--] [<path>...]
> or: $dashless [--quiet] update [--init] [-N|--no-fetch] [-f|--force] [--rebase] [--reference <repository>] [--merge] [--recursive] [--] [<path>...]
> or: $dashless [--quiet] summary [--cached|--files] [--summary-limit <n>] [commit] [--] [<path>...]
> or: $dashless [--quiet] foreach [--recursive] <command>
> @@ -516,6 +517,55 @@ cmd_init()
> }
>
> #
> +# Unregister submodules from .git/config
> +#
> +# $@ = requested paths (default to all)
> +#
> +cmd_deinit()
> +{
> + # parse $args after "submodule ... deinit".
> + while test $# -ne 0
> + do
> + case "$1" in
> + -q|--quiet)
> + GIT_QUIET=1
> + ;;
> + --)
> + shift
> + break
> + ;;
> + -*)
> + usage
> + ;;
> + *)
> + break
> + ;;
> + esac
> + shift
> + done
> +
> + module_list "$@" |
> + while read mode sha1 stage sm_path
> + do
> + die_if_unmatched "$mode"
> + name=$(module_name "$sm_path") || exit
> + url=$(git config submodule."$name".url)
> + if test -z "$url"
> + then
> + # Only mention uninitialized submodules when their
> + # paths have been specified
> + test "$#" != "0" &&
> + say "$(eval_gettext "No url found for submodule path '\$sm_path' in .git/config")"
> + continue
> + fi
> + # Remove the whole section so we have a clean state when the user
> + # later decides to init this submodule again
> + git config --remove-section submodule."$name" &&
> + say "$(eval_gettext "Submodule '\$name' (\$url) unregistered")"
> + done
> +}
> +
> +#
> # Update each submodule path to correct revision, using clone and checkout as needed
> #
> # $@ = requested paths (default to all)
> @@ -1108,7 +1158,7 @@ cmd_sync()
> while test $# != 0 && test -z "$command"
> do
> case "$1" in
> - add | foreach | init | update | status | summary | sync)
> + add | foreach | init | deinit | update | status | summary | sync)
> command=$1
> ;;
> -q|--quiet)
> diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> index de7d453..ee4f0ab 100755
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -756,4 +756,16 @@ test_expect_success 'submodule add with an existing name fails unless forced' '
> )
> '
>
> +test_expect_success 'submodule deinit should remove the whole submodule section from .git/config' '
> + git config submodule.example.foo bar &&
> + git submodule deinit &&
> + test -z "$(git config submodule.example.url)" &&
> + test -z "$(git config submodule.example.foo)"
> +'
> +
> +test_expect_success 'submodule deinit complains only when explicitly used on an uninitialized submodule' '
> + git submodule deinit &&
> + test_must_fail git submodule deinit example
> +'
> +
> test_done
>
* Re: How to avoid the ^M induced by Meld and Git
From: Michael J Gruber @ 2012-12-12 14:57 UTC (permalink / raw)
To: Karl Brand; +Cc: git
In-Reply-To: <50C72821.10908@erasmusmc.nl>
Karl Brand venit, vidit, dixit 11.12.2012 13:33:
> Esteemed Git users,
>
> What i do:
>
> 1. Create a script.r using Emacs/ESS.
> 2. Make some modifications to script.r with the nice diff gui, Meld
> 3. Commit these modifications using git commit -am "my message"
> 4. Reopen script.r in Emacs/ESS to continue working.
>
> The lines added (&/edited ?) using Meld all end with ^M which i
> certainly don't want. Lines not added/edited with Meld do NOT end with ^M.
What happens if you leave out step 3? If the same happens then Meld is
the culprit. (Unless you've set some special options, git does not
modify your file on commit, so this can't be git related.)
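To narrow this down, it may help to count carriage returns directly after each step rather than relying on how Emacs displays the file. A minimal sketch, assuming a POSIX shell with `tr` and `wc`; the helper name `count_cr` is made up for illustration:

```shell
# count_cr FILE: print the number of carriage-return bytes (shown as ^M by
# many editors) in FILE.
count_cr () {
	tr -cd '\r' <"$1" | wc -c | tr -d ' '
}

# Example: one line with a CRLF ending, one with a plain LF ending.
tmp=$(mktemp)
printf 'x <- 1\r\ny <- 2\n' >"$tmp"
count_cr "$tmp"    # prints 1
rm -f "$tmp"
```

Running `count_cr script.r` after step 2 and again after step 3 would show which step introduces the extra bytes.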
> There are plenty of posts around about these being line endings used for
> windows which can appear when working on a script under a *nix OS which
> has previously been edited in a Windows OS. This is not the case here -
> everything is taking place on Ubuntu 12.04.
>
> FWIW: the directory is being synced by dropbox; and in Meld, Preferences
> > Encoding tab, "utf8" is entered in the text box.
>
> Current work around is running in a terminal: dos2unix /path/to/script.r
> which strips the ^M's
>
> But this just shouldn't be necessary and I'd really appreciate the
> reflections & advice on how to stop inducing these ^M's !
>
> With thanks,
>
> Karl
>
> (re)posted here as suggested off topic at SO:
> http://stackoverflow.com/questions/13799631/create-script-r-in-emacs-modify-with-meld-git-commit-reopen-in-emacs-m
>
* Re: [PATCH] RFC Optionally handle symbolic links as copies
From: Michael J Gruber @ 2012-12-12 14:43 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: Junio C Hamano, git
In-Reply-To: <1622149333.19335600.1354756984435.JavaMail.root@dewire.com>
Robin Rosenberg venit, vidit, dixit 06.12.2012 02:23:
>
>
> ----- Original message -----
>> Robin Rosenberg <robin.rosenberg@dewire.com> writes:
>>
>>> If core.symlinks is set to copy then symbolic links in a git
>>> repository
>>> will be checked out as copies of the file it points to.
>>
>> That all sounds nice on surface when the primary thing you care
>> about is to fetch and check out other people's code and extract it
>> to the working tree, but how well would that work on the checkin
>> side? What happens if I check out a symlink that points at a file
>> (either in-tree or out-of-tree), make some changes that do not
>> involve the symlink, and before I make the commit, an unrelated
>> change is made to the file the symlink is pointing at?
>>
>>> - git status - when do we report a diff.
>>> - After checkout we should probably not
>>> - if the "linked" files change?
>>
>> Yeah, exactly.
>>
>>> - if a file in the copied directory changes
>>
>> That, too.
>>
>>> - if a file in the copied directory is added/removed
>>> - update, should we update the copied structure automatically
>>> when the link target changes
>
> Some of the questions have proposals in the included test script. This is a
> little more dangerous than having real symlinks, of course, but regardless
> of what one does with or without copied symlinks one can make mistakes,
> and I feel letting Git do the copying is way better than having real
> copies in the git repository. Another crappy scm which the users are
> converting from does this and it works. A difference to git is that
> it (ok clearcase) makes all files read-only so there are fewer ways
> of making mistakes with the copies.
>
>> I personally do not think this is worth it. It would be very useful
>> on the export/checkout side, so it may make sense to add it to "git
>> archive", though.
>
> It makes sense, but it does not solve the problem at hand.
>
> -- robin
>
Well, what is the problem at hand?
Your commit message begins in present tense as if it described the
current state of git, when in fact it describes what the patch is about
to achieve. This is confusing enough already.
I don't see any mention of the purpose, other than "content may be
used", which would be served perfectly by a copy-on-link feature on the
export side, as mentioned by Junio. Is git-archive|tar an option?
Michael
* Re: Bad URL passed to RA layer ('https')
From: Eugene @ 2012-12-12 14:36 UTC (permalink / raw)
To: git
In-Reply-To: <l2y5208b6091005040218t2890b871x1753a1788b67350b@mail.gmail.com>
Here I. Come <me.detected <at> gmail.com> writes:
> ------------------8<-----------------------
> $ git svn clone https://host/svn/myrepo
> Initialized empty Git repository in /tmp/myrepo/.git/
> Bad URL passed to RA layer: Unrecognized URL scheme for
> 'https://host/svn/myrepo' at /usr/libexec/git-core/git-svn line 1770
> ------------------8<-----------------------
Hi, I have run into the same problem. Did you find out how to resolve it?
* Re: [PATCH 4/5] pretty: Use mailmap to display username and email
From: Antoine Pelisse @ 2012-12-12 13:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Rich Midwinter
In-Reply-To: <7vehiw6wc1.fsf@alter.siamese.dyndns.org>
> Or it might be better to make those two strbufs output-only
> parameter, e.g.
>
> map_user(struct string_list *mailmap,
> const char *name, size_t namelen,
> const char *mail, size_t maillen,
> struct strbuf *name_out, struct strbuf *mail_out);
>
> then after split_ident_line(), this caller could feed two pointers
> into the original "line" as name and mail parameter, without making
> any copies (the callee has to make a copy but it has to be done
> anyway when the name/mail is mapped). I suspect it would make this
> caller simpler, but I do not know how invasive such changes are for
> other callers of map_user().
It makes a lot of sense.
blame.c::get_commit_info() hard-codes the length
shortlog.c::insert_one_record() hard-codes the length
pretty.c::format_person_part() hard-codes the length
I don't think it will be invasive.
> Such an update can be left outside of this series, of course.
I will try to make it at the beginning of the series. It will avoid unnecessary
conflicts.
>> + strbuf_addch(sb, ' ');
>> + strbuf_addch(sb, '<');
>> + strbuf_add(sb, person_mail, strlen(person_mail));
>> + strbuf_addch(sb, '>');
>> strbuf_addch(sb, '\n');
>
> Is that strbuf_addf(sb, " <%s>\n", person_mail)?
Of course ;) Fixed.
* [PATCH] index-format.txt: be more liberal on what can represent invalid cache tree
From: Nguyễn Thái Ngọc Duy @ 2012-12-12 12:44 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
We have been writing -1 as "invalid" since day 1. Since that same day we
have accepted all negative entry counts as "invalid". So in theory all C Git
versions out there would be happy to accept any negative numbers. JGit
seems to do exactly the same.
Correct the document to reflect the fact that -1 is not the only magic
number. At least one implementation, libgit2, is found to treat -1
this way.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
Documentation/technical/index-format.txt | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index 9d25b30..2028a49 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -161,8 +161,8 @@ GIT index format
this span of index as a tree.
An entry can be in an invalidated state and is represented by having
- -1 in the entry_count field. In this case, there is no object name
- and the next entry starts immediately after the newline.
+ a negative number in the entry_count field. In this case, there is no
+ object name and the next entry starts immediately after the newline.
The entries are written out in the top-down, depth-first order. The
first entry represents the root level of the repository, followed by the
--
1.8.0.rc2.23.g1fb49df
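A reader following the revised wording would treat any negative entry_count as "invalidated", not only -1. A minimal sketch of that check, assuming the entry_count field has already been extracted from the cache-tree entry as a decimal string; `cache_tree_state` is a hypothetical helper, not a git command:

```shell
# cache_tree_state ENTRY_COUNT: classify a cache-tree entry by the sign of
# its entry_count field. Any negative value means "invalidated".
cache_tree_state () {
	if test "$1" -lt 0
	then
		echo invalid
	else
		echo valid
	fi
}

cache_tree_state -1     # prints invalid
cache_tree_state -42    # prints invalid
cache_tree_state 3      # prints valid
```

A reader that compares against -1 only would misclassify the second case, which is the interoperability concern the patch documents.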
* Re: Python extension commands in git - request for policy change
From: Eric S. Raymond @ 2012-12-12 12:43 UTC (permalink / raw)
To: Patrick Donnelly
Cc: Sitaram Chamarty, Nguyen Thai Ngoc Duy, Michael Haggerty,
Felipe Contreras, git
In-Reply-To: <CACh33Fpk8_ZXw8Ladx83J+rmdRYf7ruYAMMkqOKcoH3OApKPJQ@mail.gmail.com>
Patrick Donnelly <batrick@batbytes.com>:
> On Tue, Dec 11, 2012 at 10:30 PM, Eric S. Raymond <esr@thyrsus.com> wrote:
> > It might be a good fit for extending git; I wouldn't be very surprised if
> > that worked. However, I do have concerns about the "Oh, we'll just
> > lash together a binding to C" attitude common among lua programmers; I
> > foresee maintainability problems and the possibility of slow death by
> > low-level details as that strategy tries to scale up.
>
> I think this is quite a prediction? Could you give an example
> scenario?
Everything old is new again. I'm going by experience with Tcl back in the day.
> How would another language (e.g. Python) mitigate this?
The way you mitigate this sort of problem is to have a good set of
high-level bindings for standard services (like socket I/O) built in
your extension language and using its abstractions, so you don't get a
proliferation of low-level semi-custom APIs for doing the same stuff.
I have elsewhere referred to this as "the harsh lesson of Perl", which
I do not love but which was the first scripting language to get this
right. There is a reason Tcl and a couple of earlier designs like csh
that we would now call "scripting languages" were displaced by Python
and Perl; this is it.
My favorite present-day example of getting this right is the Python bindings
for GTK. They're lovely. A work of art.
> I don't see how these languages are more appropriate based on your concerns.
Your previous exchange with Jeff King indicates that you don't
understand glue scripting very well. Your puzzlement here just
confirms that. Trust both of us on this, it's important. And
reread my previous three paragraphs.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: Python extension commands in git - request for policy change
From: Jeff King @ 2012-12-12 12:29 UTC (permalink / raw)
To: Eric S. Raymond
Cc: Sitaram Chamarty, Patrick Donnelly, Nguyen Thai Ngoc Duy,
Michael Haggerty, Felipe Contreras, git
In-Reply-To: <20121212122625.GB25981@thyrsus.com>
On Wed, Dec 12, 2012 at 07:26:25AM -0500, Eric S. Raymond wrote:
> Jeff King <peff@peff.net>:
> > I think there are really two separate use cases to consider:
> >
> > 1. Providing snippets of script to Git to get Turing-complete behavior
> > for existing Git features. For example, selecting commits during a
> > traversal (e.g., a better "log --grep"), formatting output (e.g., a
> > better "log --format" or "for-each-ref --format").
> >
> > 2. Writing whole new git commands in a language that is quicker or
> > easier to develop in than C.
>
> That's a good analysis. I agree with your use-case split; I guess I'm just not
> very aware of the places in git where (1) is important.
Yeah, I don't think (1) is your use case at all. But when people talk
about "Jeff's lua experiment", they are talking about some patches I had
to do (1), which covered "log --format" (but ultimately would need more
cleanup to be acceptable upstream). Maybe that clears up the discussion
a little bit.
-Peff
* Re: Python extension commands in git - request for policy change
From: Eric S. Raymond @ 2012-12-12 12:26 UTC (permalink / raw)
To: Jeff King
Cc: Sitaram Chamarty, Patrick Donnelly, Nguyen Thai Ngoc Duy,
Michael Haggerty, Felipe Contreras, git
In-Reply-To: <20121212063208.GA18322@sigill.intra.peff.net>
Jeff King <peff@peff.net>:
> I think there are really two separate use cases to consider:
>
> 1. Providing snippets of script to Git to get Turing-complete behavior
> for existing Git features. For example, selecting commits during a
> traversal (e.g., a better "log --grep"), formatting output (e.g., a
> better "log --format" or "for-each-ref --format").
>
> 2. Writing whole new git commands in a language that is quicker or
> easier to develop in than C.
That's a good analysis. I agree with your use-case split; I guess I'm just not
very aware of the places in git where (1) is important.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: [PATCH] git(1): remove a defunct link to "list of authors"
From: Jeff King @ 2012-12-12 12:24 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git
In-Reply-To: <CACsJy8Dg1a0siDbiHtk4m1RhjLt-XKiS8kOO7qPKjwRczLF9vA@mail.gmail.com>
On Mon, Dec 10, 2012 at 07:04:40PM +0700, Nguyen Thai Ngoc Duy wrote:
> > With or without "--no-merges", the big picture you can get out of
> > "git shortlog -s -n --since=1.year" does not change very much, but
> > the headline numbers give a wrong impression.
>
> These numbers are approximate anyway. Commit counts or the number of
> changed lines do not accurately reflect the effort in many cases. And
> about merges, in this particular case of Git where the maintainer imo
> has done an excellent job as a guard, I'd say it's the credit for
> reviewing, not simply merging.
I agree that commit count is approximate. But counting merges is really
quite a large factor of error (in git.git, it more than doubles Junio's
count, and represents over 20% of the total number of commits).
The GitHub contributors page counts merges _and_ fails to use mailmap.
Yuck. I'm working on fixing that now.
> But not using the link is fine too. We can wait for Jeff's patch to be
> merged.
After the discussion in the PR, I am inclined to think the site (and
possibly the manpage) should just point to some decent contributors
graph (either GitHub, ohloh, or something else; suggestions welcome).
Anything else is just recreating a crappy static version of something
that could be much more dynamic and explorable.
I find the ohloh one a little more informative than the GitHub graph. I
couldn't find any others (Google Code does not seem to have one,
kernel.org and other gitweb sites do not, and I can't think of anywhere
else that hosts a mirror).
-Peff
* Re: Python extension commands in git - request for policy change
From: Eric S. Raymond @ 2012-12-12 12:23 UTC (permalink / raw)
To: Joshua Jensen
Cc: Sitaram Chamarty, Patrick Donnelly, Nguyen Thai Ngoc Duy,
Michael Haggerty, Felipe Contreras, git
In-Reply-To: <50C811ED.4000600@workspacewhiz.com>
Joshua Jensen <jjensen@workspacewhiz.com>:
> Anyway, my preference is to allow scripts to run in-process within
> Git, because it is far, far faster on Windows. I imagine it is
> faster than forking processes on non-Windows machines, too, but I
> have no statistics to back that up.
>
> Python, Perl, or Ruby can be embedded, too, but Lua probably embeds
> the easiest and smallest out of those other 3 languages.
>
> And shell scripts tend to be the slowest on Windows due to the
> excessive numbers of process invocations needed to get anything
> reasonable done.
I don't think there's *any* dimension along which lua is not clearly
better than shell for this sort of thing, so no argument there.
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: [PATCH 5/5] log: Add --use-mailmap option
From: Antoine Pelisse @ 2012-12-12 11:58 UTC (permalink / raw)
To: git, Junio C Hamano; +Cc: Rich Midwinter, Antoine Pelisse
In-Reply-To: <1355264493-8298-6-git-send-email-apelisse@gmail.com>
On Tue, Dec 11, 2012 at 11:21 PM, Antoine Pelisse <apelisse@gmail.com> wrote:
> Add the --use-mailmap option to log commands. It allows
> to display names from mailmap file when displaying logs,
> whatever the format used.
The question is: which log commands, actually?
Shouldn't we put the option in revision.c::handle_revision_opt instead?
My opinion is that it belongs to 'Commit Formatting'.
It would also make sense to be able to use '--use-mailmap' when running
format-patch for example.
Also, I've noticed that my series breaks some tests (linked with
format-patch BTW).
I fixed that and re-ran all tests successfully. I will resubmit it later.
* [PATCH 5/5] contrib: update stats/mailmap script
From: Jeff King @ 2012-12-12 11:41 UTC (permalink / raw)
To: git
In-Reply-To: <20121212113036.GB19625@sigill.intra.peff.net>
This version changes quite a few things:
1. The original parsed the mailmap file itself, and it did
it wrong (it did not understand entries with an extra
email key).
Instead, this version uses git's "%aE" and "%aN"
formats to have git perform the mapping, meaning we do
not have to read .mailmap at all, but still operate on
the current state that git sees (and it also works
properly from subdirs).
2. The original would find multiple names for an email,
but not the other way around.
This version can do either or both. If we find multiple
emails for a name, the resolution is less obvious than
the other way around. However, it can still be a
starting point for a human to investigate.
3. The original would order only by count, not by recency.
This version can do either. Combined with showing the
counts, it can be easier to decide how to resolve.
4. This version shows similar entries in a blank-delimited
stanza, which makes it more clear which options you are
picking from.
Signed-off-by: Jeff King <peff@peff.net>
---
contrib/stats/mailmap.pl | 108 ++++++++++++++++++++++++++++++-----------------
1 file changed, 70 insertions(+), 38 deletions(-)
rewrite contrib/stats/mailmap.pl (97%)
diff --git a/contrib/stats/mailmap.pl b/contrib/stats/mailmap.pl
dissimilarity index 97%
index 4b852e2..9513f5e 100755
--- a/contrib/stats/mailmap.pl
+++ b/contrib/stats/mailmap.pl
@@ -1,38 +1,70 @@
-#!/usr/bin/perl -w
-my %mailmap = ();
-open I, "<", ".mailmap";
-while (<I>) {
- chomp;
- next if /^#/;
- if (my ($author, $mail) = /^(.*?)\s+<(.+)>$/) {
- $mailmap{$mail} = $author;
- }
-}
-close I;
-
-my %mail2author = ();
-open I, "git log --pretty='format:%ae %an' |";
-while (<I>) {
- chomp;
- my ($mail, $author) = split(/\t/, $_);
- next if exists $mailmap{$mail};
- $mail2author{$mail} ||= {};
- $mail2author{$mail}{$author} ||= 0;
- $mail2author{$mail}{$author}++;
-}
-close I;
-
-while (my ($mail, $authorcount) = each %mail2author) {
- # %$authorcount is ($author => $count);
- # sort and show the names from the most frequent ones.
- my @names = (map { $_->[0] }
- sort { $b->[1] <=> $a->[1] }
- map { [$_, $authorcount->{$_}] }
- keys %$authorcount);
- if (1 < @names) {
- for (@names) {
- print "$_ <$mail>\n";
- }
- }
-}
-
+#!/usr/bin/perl
+
+use warnings 'all';
+use strict;
+use Getopt::Long;
+
+my $match_emails;
+my $match_names;
+my $order_by = 'count';
+Getopt::Long::Configure(qw(bundling));
+GetOptions(
+ 'emails|e!' => \$match_emails,
+ 'names|n!' => \$match_names,
+ 'count|c' => sub { $order_by = 'count' },
+ 'time|t' => sub { $order_by = 'stamp' },
+) or exit 1;
+$match_emails = 1 unless $match_names;
+
+my $email = {};
+my $name = {};
+
+open(my $fh, '-|', "git log --format='%at <%aE> %aN'");
+while(<$fh>) {
+ my ($t, $e, $n) = /(\S+) <(\S+)> (.*)/;
+ mark($email, $e, $n, $t);
+ mark($name, $n, $e, $t);
+}
+close($fh);
+
+if ($match_emails) {
+ foreach my $e (dups($email)) {
+ foreach my $n (vals($email->{$e})) {
+ show($n, $e, $email->{$e}->{$n});
+ }
+ print "\n";
+ }
+}
+if ($match_names) {
+ foreach my $n (dups($name)) {
+ foreach my $e (vals($name->{$n})) {
+ show($n, $e, $name->{$n}->{$e});
+ }
+ print "\n";
+ }
+}
+exit 0;
+
+sub mark {
+ my ($h, $k, $v, $t) = @_;
+ my $e = $h->{$k}->{$v} ||= { count => 0, stamp => 0 };
+ $e->{count}++;
+ $e->{stamp} = $t unless $t < $e->{stamp};
+}
+
+sub dups {
+ my $h = shift;
+ return grep { keys(%{$h->{$_}}) > 1 } keys(%$h);
+}
+
+sub vals {
+ my $h = shift;
+ return sort {
+ $h->{$b}->{$order_by} <=> $h->{$a}->{$order_by}
+ } keys(%$h);
+}
+
+sub show {
+ my ($n, $e, $h) = @_;
+ print "$n <$e> ($h->{$order_by})\n";
+}
--
1.8.0.2.4.g59402aa
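The core idea in items 1 and 2 of the commit message above, letting git apply the mailmap and then looking for emails that still appear with more than one name, can be approximated without Perl. A rough sketch, assuming input lines of the form "<email> name" (for example from `git log --format='<%aE> %aN'`); the function name `dup_emails` is made up for illustration, and unlike the script it shows no counts or timestamps:

```shell
# dup_emails: read "<email> name" lines on stdin and print every email that
# appears with more than one distinct name, as "name <email>" lines in
# blank-line-delimited stanzas.
dup_emails () {
	sort -u |
	awk '
		{
			email = $1
			name = $0
			sub(/^[^ ]+ /, "", name)
			count[email]++
			pairs[email] = pairs[email] name " " email "\n"
		}
		END {
			for (email in count)
				if (count[email] > 1)
					printf "%s\n", pairs[email]
		}
	'
}

# Sample input: a@example.com appears under two names, b@example.com under one.
dup_emails <<\EOF
<a@example.com> Alice
<a@example.com> Alice A.
<b@example.com> Bob
<a@example.com> Alice
EOF
```

For the sample input only the a@example.com stanza is printed, since b@example.com has a single name. Swapping the roles of the two fields would give the name-to-emails direction described in item 2.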
* [PATCH 4/5] .mailmap: normalize emails for Linus Torvalds
From: Jeff King @ 2012-12-12 11:41 UTC (permalink / raw)
To: git; +Cc: Linus Torvalds
In-Reply-To: <20121212113036.GB19625@sigill.intra.peff.net>
Linus used a lot of different per-machine email addresses in
the early days. This means that "git shortlog -nse" does not
aggregate his counts, and he is listed well below where he
should be (8th instead of 3rd).
Signed-off-by: Jeff King <peff@peff.net>
---
Linus,
I recall you considered "email ident from random machine" as a feature
very early on in git's history, but you seem to have settled on using
the linux-foundation address pretty consistently these days. Please let
me know if you object to normalizing your entries in this way.
.mailmap | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/.mailmap b/.mailmap
index 4a27b7f..c7e8618 100644
--- a/.mailmap
+++ b/.mailmap
@@ -52,6 +52,12 @@ Li Hong <leehong@pku.edu.cn>
Lars Doelle <lars.doelle@on-line ! de>
Lars Doelle <lars.doelle@on-line.de>
Li Hong <leehong@pku.edu.cn>
+Linus Torvalds <torvalds@linux-foundation.org> <torvalds@woody.linux-foundation.org>
+Linus Torvalds <torvalds@linux-foundation.org> <torvalds@osdl.org>
+Linus Torvalds <torvalds@linux-foundation.org> <torvalds@g5.osdl.org>
+Linus Torvalds <torvalds@linux-foundation.org> <torvalds@evo.osdl.org>
+Linus Torvalds <torvalds@linux-foundation.org> <torvalds@ppc970.osdl.org>
+Linus Torvalds <torvalds@linux-foundation.org> <torvalds@ppc970.osdl.org.(none)>
Lukas Sandström <lukass@etek.chalmers.se>
Marc-André Lureau <marcandre.lureau@gmail.com>
Mark Rada <marada@uwaterloo.ca>
--
1.8.0.2.4.g59402aa
* [PATCH 3/5] .mailmap: normalize emails for Jeff King
From: Jeff King @ 2012-12-12 11:38 UTC (permalink / raw)
To: git
In-Reply-To: <20121212113036.GB19625@sigill.intra.peff.net>
I never meant anything special by using my @github.com
address; it is merely a mistake that it has sometimes bled
through to patches.
Signed-off-by: Jeff King <peff@peff.net>
---
.mailmap | 1 +
1 file changed, 1 insertion(+)
diff --git a/.mailmap b/.mailmap
index e370e86..4a27b7f 100644
--- a/.mailmap
+++ b/.mailmap
@@ -31,6 +31,7 @@ Jay Soffian <jaysoffian+git@gmail.com>
İsmail Dönmez <ismail@pardus.org.tr>
Jakub Narębski <jnareb@gmail.com>
Jay Soffian <jaysoffian+git@gmail.com>
+Jeff King <peff@peff.net> <peff@github.com>
Joachim Berdal Haga <cjhaga@fys.uio.no>
Johannes Sixt <j6t@kdbg.org> <johannes.sixt@telecom.at>
Johannes Sixt <j6t@kdbg.org> <j.sixt@viscovery.net>
--
1.8.0.2.4.g59402aa
^ permalink raw reply related
* [PATCH 2/5] .mailmap: fix broken entry for Martin Langhoff
From: Jeff King @ 2012-12-12 11:38 UTC (permalink / raw)
To: git; +Cc: Martin Langhoff
In-Reply-To: <20121212113036.GB19625@sigill.intra.peff.net>
Commit adc3192 (Martin Langhoff has a new e-mail address,
2010-10-05) added a mailmap entry, but forgot that both the
old and new email addresses need to appear for one to be
mapped to the other (i.e., we do not key mailmap emails by
name).
Signed-off-by: Jeff King <peff@peff.net>
---
.mailmap | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/.mailmap b/.mailmap
index 69301bd..e370e86 100644
--- a/.mailmap
+++ b/.mailmap
@@ -54,7 +54,7 @@ Mark Rada <marada@uwaterloo.ca>
Lukas Sandström <lukass@etek.chalmers.se>
Marc-André Lureau <marcandre.lureau@gmail.com>
Mark Rada <marada@uwaterloo.ca>
-Martin Langhoff <martin@laptop.org>
+Martin Langhoff <martin@laptop.org> <martin@catalyst.net.nz>
Martin von Zweigbergk <martinvonz@gmail.com> <martin.von.zweigbergk@gmail.com>
Michael Coleman <tutufan@gmail.com>
Michael J Gruber <git@drmicha.warpmail.net> <michaeljgruber+gmane@fastmail.fm>
--
1.8.0.2.4.g59402aa
^ permalink raw reply related
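A note on the fix above: mailmap lookups are keyed by the email recorded in the commit, so an entry that lists only the new address never matches commits made with the old one. The following is a minimal sketch of that behaviour under simplifying assumptions. The names struct map_entry and lookup_by_email() are hypothetical and are not part of git's mailmap.c, and the comparison here is an exact string match.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified mailmap entry (not git's real data structure). */
struct map_entry {
	const char *commit_email;	/* lookup key: email as recorded in the commit */
	const char *name;		/* canonical name to display */
	const char *email;		/* canonical email, or NULL to keep the commit's */
};

/*
 * Find the entry whose key matches the commit's email.
 * The author name plays no part in the lookup.
 */
const struct map_entry *lookup_by_email(const struct map_entry *entries,
					size_t nr, const char *commit_email)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(entries[i].commit_email, commit_email))
			return &entries[i];
	return NULL;
}
```

With only "Martin Langhoff <martin@laptop.org>" in the table, a commit authored as martin@catalyst.net.nz finds no entry. Once the old address is listed as the key, the same lookup yields the laptop.org identity.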
* [PATCH 1/5] .mailmap: match up some obvious names/emails
From: Jeff King @ 2012-12-12 11:36 UTC (permalink / raw)
To: git
Cc: Cheng Renquan, Dan Johnson, Eric S. Raymond,
Frédéric Heitzmann, Jakub Narębski, Kevin Leung,
Marc-André Lureau, Mark Rada, Robert Zeh, Tay Ray Chuan
In-Reply-To: <20121212113036.GB19625@sigill.intra.peff.net>
This patch updates git's .mailmap in cases where multiple
names are matched to a single email. The "master" name for
each email was chosen by:
1. If the only difference is in the presence or absence
of accented characters, the accented form is chosen
(under the assumption that it is the natural spelling,
and accents are sometimes stripped in email).
2. Otherwise, the most commonly used name is chosen.
3. If all names are equally common, the most recently used name is
chosen.
Signed-off-by: Jeff King <peff@peff.net>
---
I'm cc-ing all involved authors. If you object or want to normalize your
name in some other way, please let me know.
.mailmap | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/.mailmap b/.mailmap
index bcf4f87..69301bd 100644
--- a/.mailmap
+++ b/.mailmap
@@ -9,7 +9,9 @@ Chris Shoemaker <c.shoemaker@cox.net>
Alexander Gavrilov <angavrilov@gmail.com>
Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Brian M. Carlson <sandals@crustytoothpaste.ath.cx>
+Cheng Renquan <crquan@gmail.com>
Chris Shoemaker <c.shoemaker@cox.net>
+Dan Johnson <computerdruid@gmail.com>
Dana L. How <danahow@gmail.com>
Dana L. How <how@deathvalley.cswitch.com>
Daniel Barkalow <barkalow@iabervon.org>
@@ -18,13 +20,16 @@ Horst H. von Brand <vonbrand@inf.utfsm.cl>
David S. Miller <davem@davemloft.net>
Deskin Miller <deskinm@umich.edu>
Dirk Süsserott <newsletter@dirk.my1.cc>
+Eric S. Raymond <esr@thyrsus.com>
Erik Faye-Lund <kusmabite@gmail.com> <kusmabite@googlemail.com>
Fredrik Kuivinen <freku045@student.liu.se>
+Frédéric Heitzmann <frederic.heitzmann@gmail.com>
H. Peter Anvin <hpa@bonde.sc.orionmulti.com>
H. Peter Anvin <hpa@tazenda.sc.orionmulti.com>
H. Peter Anvin <hpa@trantor.hos.anvin.org>
Horst H. von Brand <vonbrand@inf.utfsm.cl>
İsmail Dönmez <ismail@pardus.org.tr>
+Jakub Narębski <jnareb@gmail.com>
Jay Soffian <jaysoffian+git@gmail.com>
Joachim Berdal Haga <cjhaga@fys.uio.no>
Johannes Sixt <j6t@kdbg.org> <johannes.sixt@telecom.at>
@@ -41,11 +46,14 @@ Lukas Sandström <lukass@etek.chalmers.se>
Junio C Hamano <gitster@pobox.com> <junio@kernel.org>
Junio C Hamano <gitster@pobox.com> <junkio@cox.net>
Karl Hasselström <kha@treskal.com>
+Kevin Leung <kevinlsk@gmail.com>
Kent Engstrom <kent@lysator.liu.se>
Lars Doelle <lars.doelle@on-line ! de>
Lars Doelle <lars.doelle@on-line.de>
Li Hong <leehong@pku.edu.cn>
Lukas Sandström <lukass@etek.chalmers.se>
+Marc-André Lureau <marcandre.lureau@gmail.com>
+Mark Rada <marada@uwaterloo.ca>
Martin Langhoff <martin@laptop.org>
Martin von Zweigbergk <martinvonz@gmail.com> <martin.von.zweigbergk@gmail.com>
Michael Coleman <tutufan@gmail.com>
@@ -63,11 +71,13 @@ Steven Grimm <koreth@midwinter.com>
Ramsay Allan Jones <ramsay@ramsay1.demon.co.uk>
René Scharfe <rene.scharfe@lsrfire.ath.cx>
Robert Fitzsimons <robfitz@273k.net>
+Robert Zeh <robert.a.zeh@gmail.com>
Sam Vilain <sam@vilain.net>
Santi Béjar <sbejar@gmail.com>
Sean Estabrooks <seanlkml@sympatico.ca>
Shawn O. Pearce <spearce@spearce.org>
Steven Grimm <koreth@midwinter.com>
+Tay Ray Chuan <rctay89@gmail.com>
Theodore Ts'o <tytso@mit.edu>
Thomas Rast <trast@inf.ethz.ch> <trast@student.ethz.ch>
Tony Luck <tony.luck@intel.com>
--
1.8.0.2.4.g59402aa
^ permalink raw reply related
* [PATCH 0/5] git.git .mailmap cleanups
From: Jeff King @ 2012-12-12 11:30 UTC (permalink / raw)
To: git
I noticed a few obvious problems in the output of "git shortlog -nse" on
git.git. So I wrote an analysis script to find more, and of course there
were lots.
This series tries to clean up the low-hanging fruit. The first two
commits fix multiple names matching a single email. Hopefully not too
contentious, but I'll cc all involved parties to confirm. The second has
a different root cause, so I've broken it out into its own commit.
[1/5]: .mailmap: match up some obvious names/emails
[2/5]: .mailmap: fix broken entry for Martin Langhoff
Next up are multiple emails which match a single name. There are over a
hundred of these, and they are much less obvious to fix. They really
need individuals to post patches to fix their own identities (and some
may not want fixing at all, if they used different emails to have
meaningful different identities).
So I've left these untouched except for:
[3/5]: .mailmap: normalize emails for Jeff King
I am allowed to fix my own. :)
[4/5]: .mailmap: normalize emails for Linus Torvalds
As the benevolent dictator, Linus has underlings to fix such things for
him.
Also, his entry was the original reason I started looking at the data.
He fares quite poorly in "shortlog -nse" because his commits are
scattered across many addresses.
[5/5]: contrib: update stats/mailmap script
This replaces the current mailmap script in contrib, which has a bug and
lacks some of the features of my new script.
-Peff
^ permalink raw reply
* [PATCH 3/2] mailmap: clean up read_mailmap error handling
From: Jeff King @ 2012-12-12 11:18 UTC (permalink / raw)
To: git
In-Reply-To: <20121212110404.GB19653@sigill.intra.peff.net>
On Wed, Dec 12, 2012 at 06:04:04AM -0500, Jeff King wrote:
> The error-return convention from read_mailmap is really wonky, but I
> didn't change it here. It will return "1" for error, and will do so only
> if no mailmap sources could be read (including if they simply don't
> exist). But it's perfectly OK not to have a mailmap at all. However,
> nobody actually seems to check the return code, so nobody has cared.
>
> A more sane convention would probably be:
>
> 1. If ENOENT (or no such blob), silently return success.
>
> 2. Otherwise, return -1 and print a message to stderr indicating that
> there is a mailmap file, but it is broken or otherwise could not be
> opened.
Maybe like this:
-- >8 --
Subject: [PATCH] mailmap: clean up read_mailmap error handling
The error handling for the read_mailmap function is odd. It
returns 1 on error, rather than -1. And it treats a
non-existent mailmap as an error, even though there is no
reason that one needs to exist. Unless some other mailmap
source loads successfully, in which case the original error
is completely masked.
This does not cause any bugs, however, because no caller
bothers to check the return value, anyway. Let's make this a
little more robust to real errors and less surprising for
future callers that do check the error code:
1. Return -1 on errors.
2. Treat a missing entry (e.g., no mailmap.file given),
ENOENT, or a non-existent blob (for mailmap.blob) as
"no error".
3. Complain loudly when a real error (e.g., a transient
I/O error, no permission to open the mailmap file,
missing or corrupted blob object, etc) occurs.
Signed-off-by: Jeff King <peff@peff.net>
---
mailmap.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/mailmap.c b/mailmap.c
index 2f9c691..5ffe48a 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -168,10 +168,19 @@ static int read_mailmap_file(struct string_list *map, const char *filename,
char **repo_abbrev)
{
char buffer[1024];
- FILE *f = (filename == NULL ? NULL : fopen(filename, "r"));
+ FILE *f;
+
+ if (!filename)
+ return 0;
+
+ f = fopen(filename, "r");
+ if (!f) {
+ if (errno == ENOENT)
+ return 0;
+ return error("unable to open mailmap at %s: %s",
+ filename, strerror(errno));
+ }
- if (f == NULL)
- return 1;
while (fgets(buffer, sizeof(buffer), f) != NULL)
read_mailmap_line(map, buffer, repo_abbrev);
fclose(f);
@@ -205,15 +214,15 @@ static int read_mailmap_blob(struct string_list *map,
enum object_type type;
if (!name)
- return 1;
+ return 0;
if (get_sha1(name, sha1) < 0)
- return 1;
+ return 0;
buf = read_sha1_file(sha1, &type, &size);
if (!buf)
- return 1;
+ return error("unable to read mailmap object at %s", name);
if (type != OBJ_BLOB)
- return 1;
+ return error("mailmap is not a blob: %s", name);
read_mailmap_buf(map, buf, size, repo_abbrev);
@@ -223,11 +232,12 @@ int read_mailmap(struct string_list *map, char **repo_abbrev)
int read_mailmap(struct string_list *map, char **repo_abbrev)
{
+ int err = 0;
map->strdup_strings = 1;
- /* each failure returns 1, so >2 means all calls failed */
- return read_mailmap_file(map, ".mailmap", repo_abbrev) +
- read_mailmap_blob(map, git_mailmap_blob, repo_abbrev) +
- read_mailmap_file(map, git_mailmap_file, repo_abbrev) > 2;
+ err |= read_mailmap_file(map, ".mailmap", repo_abbrev);
+ err |= read_mailmap_blob(map, git_mailmap_blob, repo_abbrev);
+ err |= read_mailmap_file(map, git_mailmap_file, repo_abbrev);
+ return err;
}
void clear_mailmap(struct string_list *map)
--
1.8.0.2.4.g59402aa
^ permalink raw reply related
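The convention proposed in the patch above (an unconfigured or missing source is not an error, anything else is reported and returns -1) can be sketched in isolation roughly as follows. This is an illustration only: open_optional() is a hypothetical helper, and it prints with fprintf() where the patch uses git's error().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open a file that is allowed to be absent.
 * Returns 0 with *out == NULL when there is nothing to read,
 * 0 with *out set when the file was opened, and -1 after
 * reporting any other failure to stderr.
 */
int open_optional(const char *path, FILE **out)
{
	*out = NULL;
	if (!path)
		return 0;	/* nothing configured: not an error */

	*out = fopen(path, "r");
	if (*out)
		return 0;
	if (errno == ENOENT)
		return 0;	/* file does not exist: not an error */

	fprintf(stderr, "error: unable to open %s: %s\n",
		path, strerror(errno));
	return -1;
}
```

A caller that reads several optional sources can then OR the return values together, as read_mailmap() does in the patch, and still get a nonzero result if any one of them hit a real error.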
* [PATCH 2/2] mailmap: support reading mailmap from blobs
From: Jeff King @ 2012-12-12 11:04 UTC (permalink / raw)
To: git
In-Reply-To: <20121212105822.GA15842@sigill.intra.peff.net>
In a bare repository, there isn't a simple way to respect an
in-tree mailmap without extracting it to a temporary file.
This patch provides a config variable, similar to
mailmap.file, which reads the mailmap from a blob in the
repository.
Signed-off-by: Jeff King <peff@peff.net>
---
The error-return convention from read_mailmap is really wonky, but I
didn't change it here. It will return "1" for error, and will do so only
if no mailmap sources could be read (including if they simply don't
exist). But it's perfectly OK not to have a mailmap at all. However,
nobody actually seems to check the return code, so nobody has cared.
A more sane convention would probably be:
1. If ENOENT (or no such blob), silently return success.
2. Otherwise, return -1 and print a message to stderr indicating that
there is a mailmap file, but it is broken or otherwise could not be
opened.
Documentation/config.txt | 6 ++++
cache.h | 1 +
config.c | 2 ++
mailmap.c | 49 ++++++++++++++++++++++++++++++--
t/t4203-mailmap.sh | 73 ++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 129 insertions(+), 2 deletions(-)
diff --git a/Documentation/config.txt b/Documentation/config.txt
index bf8f911..3760077 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,6 +1517,12 @@ mailmap.file::
subdirectory, or somewhere outside of the repository itself.
See linkgit:git-shortlog[1] and linkgit:git-blame[1].
+mailmap.blob::
+ Like `mailmap.file`, but consider the value as a reference to a
+ blob in the repository (e.g., `HEAD:.mailmap`). If both
+ `mailmap.file` and `mailmap.blob` are given, both are parsed,
+ with entries from `mailmap.file` taking precedence.
+
man.viewer::
Specify the programs that may be used to display help in the
'man' format. See linkgit:git-help[1].
diff --git a/cache.h b/cache.h
index 18fdd18..a65f6d1 100644
--- a/cache.h
+++ b/cache.h
@@ -1155,6 +1155,7 @@ extern const char *git_mailmap_file;
extern const char *git_commit_encoding;
extern const char *git_log_output_encoding;
extern const char *git_mailmap_file;
+extern const char *git_mailmap_blob;
/* IO helper functions */
extern void maybe_flush_or_die(FILE *, const char *);
diff --git a/config.c b/config.c
index fb3f868..97364c0 100644
--- a/config.c
+++ b/config.c
@@ -839,6 +839,8 @@ static int git_default_mailmap_config(const char *var, const char *value)
{
if (!strcmp(var, "mailmap.file"))
return git_config_string(&git_mailmap_file, var, value);
+ if (!strcmp(var, "mailmap.blob"))
+ return git_config_string(&git_mailmap_blob, var, value);
/* Add other config variables here and to Documentation/config.txt. */
return 0;
diff --git a/mailmap.c b/mailmap.c
index 89bc318..2f9c691 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -10,6 +10,7 @@ const char *git_mailmap_file;
#endif
const char *git_mailmap_file;
+const char *git_mailmap_blob;
struct mailmap_info {
char *name;
@@ -177,12 +178,56 @@ int read_mailmap(struct string_list *map, char **repo_abbrev)
return 0;
}
+static void read_mailmap_buf(struct string_list *map,
+ const char *buf, unsigned long len,
+ char **repo_abbrev)
+{
+ while (len) {
+ const char *end = strchrnul(buf, '\n');
+ unsigned long linelen = end - buf + 1;
+ char *line = xmemdupz(buf, linelen);
+
+ read_mailmap_line(map, line, repo_abbrev);
+
+ free(line);
+ buf += linelen;
+ len -= linelen;
+ }
+}
+
+static int read_mailmap_blob(struct string_list *map,
+ const char *name,
+ char **repo_abbrev)
+{
+ unsigned char sha1[20];
+ char *buf;
+ unsigned long size;
+ enum object_type type;
+
+ if (!name)
+ return 1;
+ if (get_sha1(name, sha1) < 0)
+ return 1;
+
+ buf = read_sha1_file(sha1, &type, &size);
+ if (!buf)
+ return 1;
+ if (type != OBJ_BLOB)
+ return 1;
+
+ read_mailmap_buf(map, buf, size, repo_abbrev);
+
+ free(buf);
+ return 0;
+}
+
int read_mailmap(struct string_list *map, char **repo_abbrev)
{
map->strdup_strings = 1;
- /* each failure returns 1, so >1 means both calls failed */
+ /* each failure returns 1, so >2 means all calls failed */
return read_mailmap_file(map, ".mailmap", repo_abbrev) +
- read_mailmap_file(map, git_mailmap_file, repo_abbrev) > 1;
+ read_mailmap_blob(map, git_mailmap_blob, repo_abbrev) +
+ read_mailmap_file(map, git_mailmap_file, repo_abbrev) > 2;
}
void clear_mailmap(struct string_list *map)
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 1f182f6..e7ea40c 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -149,6 +149,79 @@ test_expect_success 'No mailmap files, but configured' '
test_cmp expect actual
'
+test_expect_success 'setup mailmap blob tests' '
+ git checkout -b map &&
+ test_when_finished "git checkout master" &&
+ cat >just-bugs <<-\EOF &&
+ Blob Guy <bugs@company.xx>
+ EOF
+ cat >both <<-\EOF &&
+ Blob Guy <author@example.com>
+ Blob Guy <bugs@company.xx>
+ EOF
+ git add just-bugs both &&
+ git commit -m "my mailmaps" &&
+ echo "Repo Guy <author@example.com>" >.mailmap &&
+ echo "Internal Guy <author@example.com>" >internal.map
+'
+
+test_expect_success 'mailmap.blob set' '
+ cat >expect <<-\EOF &&
+ Blob Guy (1):
+ second
+
+ Repo Guy (1):
+ initial
+
+ EOF
+ git -c mailmap.blob=map:just-bugs shortlog HEAD >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'mailmap.blob overrides .mailmap' '
+ cat >expect <<-\EOF &&
+ Blob Guy (2):
+ initial
+ second
+
+ EOF
+ git -c mailmap.blob=map:both shortlog HEAD >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'mailmap.file overrides mailmap.blob' '
+ cat >expect <<-\EOF &&
+ Blob Guy (1):
+ second
+
+ Internal Guy (1):
+ initial
+
+ EOF
+ git \
+ -c mailmap.blob=map:both \
+ -c mailmap.file=internal.map \
+ shortlog HEAD >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'mailmap.blob can be missing' '
+ cat >expect <<-\EOF &&
+ Repo Guy (1):
+ initial
+
+ nick1 (1):
+ second
+
+ EOF
+ git -c mailmap.blob=map:nonexistent shortlog HEAD >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'cleanup after mailmap.blob tests' '
+ rm -f .mailmap
+'
+
# Extended mailmap configurations should give us the following output for shortlog
cat >expect <<\EOF
A U Thor <author@example.com> (1):
--
1.8.0.2.4.g59402aa
^ permalink raw reply related
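For readers following the buffer-parsing part of the patch above, here is a standalone sketch of splitting an in-memory buffer into lines. It is not the patch's code: it uses memchr() bounded by the remaining length instead of strchrnul(), so a final line without a trailing newline is handled without stepping past the end of the buffer. The names for_each_line() and line_fn are hypothetical; the callback stands in for read_mailmap_line().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type, standing in for read_mailmap_line(). */
typedef void (*line_fn)(char *line, void *data);

/*
 * Call fn once per line of buf (len bytes), passing a NUL-terminated
 * copy that includes the trailing newline when one is present.
 * Returns the number of lines seen.
 */
size_t for_each_line(const char *buf, size_t len, line_fn fn, void *data)
{
	size_t count = 0;

	while (len) {
		const char *nl = memchr(buf, '\n', len);
		/* Take the newline too if present, else the rest of the buffer. */
		size_t linelen = nl ? (size_t)(nl - buf) + 1 : len;
		char *line = malloc(linelen + 1);

		if (!line)
			return count;	/* out of memory: stop early */
		memcpy(line, buf, linelen);
		line[linelen] = '\0';
		if (fn)
			fn(line, data);
		free(line);

		buf += linelen;
		len -= linelen;
		count++;
	}
	return count;
}
```

Because linelen never exceeds the remaining length, the unsigned subtraction at the bottom of the loop cannot wrap around.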
* [PATCH 1/2] mailmap: refactor mailmap parsing for non-file sources
From: Jeff King @ 2012-12-12 10:59 UTC (permalink / raw)
To: git
In-Reply-To: <20121212105822.GA15842@sigill.intra.peff.net>
The read_single_mailmap function opens a mailmap file and
parses each line. In preparation for having non-file
mailmaps, let's pull out the line-parsing logic into its own
function (read_mailmap_line), and rename the file-parsing
function to match (read_mailmap_file).
Signed-off-by: Jeff King <peff@peff.net>
---
Cleanup for the next patch. It's mostly indentation changes, so "diff
-w" is much easier to review.
mailmap.c | 74 ++++++++++++++++++++++++++++++++++-----------------------------
1 file changed, 40 insertions(+), 34 deletions(-)
diff --git a/mailmap.c b/mailmap.c
index ea4b471..89bc318 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -129,44 +129,50 @@ static int read_single_mailmap(struct string_list *map, const char *filename, ch
return (*right == '\0' ? NULL : right);
}
-static int read_single_mailmap(struct string_list *map, const char *filename, char **repo_abbrev)
+static void read_mailmap_line(struct string_list *map, char *buffer,
+ char **repo_abbrev)
+{
+ char *name1 = NULL, *email1 = NULL, *name2 = NULL, *email2 = NULL;
+ if (buffer[0] == '#') {
+ static const char abbrev[] = "# repo-abbrev:";
+ int abblen = sizeof(abbrev) - 1;
+ int len = strlen(buffer);
+
+ if (!repo_abbrev)
+ return;
+
+ if (len && buffer[len - 1] == '\n')
+ buffer[--len] = 0;
+ if (!strncmp(buffer, abbrev, abblen)) {
+ char *cp;
+
+ if (repo_abbrev)
+ free(*repo_abbrev);
+ *repo_abbrev = xmalloc(len);
+
+ for (cp = buffer + abblen; isspace(*cp); cp++)
+ ; /* nothing */
+ strcpy(*repo_abbrev, cp);
+ }
+ return;
+ }
+ if ((name2 = parse_name_and_email(buffer, &name1, &email1, 0)) != NULL)
+ parse_name_and_email(name2, &name2, &email2, 1);
+
+ if (email1)
+ add_mapping(map, name1, email1, name2, email2);
+}
+
+static int read_mailmap_file(struct string_list *map, const char *filename,
+ char **repo_abbrev)
{
char buffer[1024];
FILE *f = (filename == NULL ? NULL : fopen(filename, "r"));
if (f == NULL)
return 1;
- while (fgets(buffer, sizeof(buffer), f) != NULL) {
- char *name1 = NULL, *email1 = NULL, *name2 = NULL, *email2 = NULL;
- if (buffer[0] == '#') {
- static const char abbrev[] = "# repo-abbrev:";
- int abblen = sizeof(abbrev) - 1;
- int len = strlen(buffer);
-
- if (!repo_abbrev)
- continue;
-
- if (len && buffer[len - 1] == '\n')
- buffer[--len] = 0;
- if (!strncmp(buffer, abbrev, abblen)) {
- char *cp;
-
- if (repo_abbrev)
- free(*repo_abbrev);
- *repo_abbrev = xmalloc(len);
-
- for (cp = buffer + abblen; isspace(*cp); cp++)
- ; /* nothing */
- strcpy(*repo_abbrev, cp);
- }
- continue;
- }
- if ((name2 = parse_name_and_email(buffer, &name1, &email1, 0)) != NULL)
- parse_name_and_email(name2, &name2, &email2, 1);
-
- if (email1)
- add_mapping(map, name1, email1, name2, email2);
- }
+ while (fgets(buffer, sizeof(buffer), f) != NULL)
+ read_mailmap_line(map, buffer, repo_abbrev);
fclose(f);
return 0;
}
@@ -175,8 +181,8 @@ int read_mailmap(struct string_list *map, char **repo_abbrev)
{
map->strdup_strings = 1;
/* each failure returns 1, so >1 means both calls failed */
- return read_single_mailmap(map, ".mailmap", repo_abbrev) +
- read_single_mailmap(map, git_mailmap_file, repo_abbrev) > 1;
+ return read_mailmap_file(map, ".mailmap", repo_abbrev) +
+ read_mailmap_file(map, git_mailmap_file, repo_abbrev) > 1;
}
void clear_mailmap(struct string_list *map)
--
1.8.0.2.4.g59402aa
^ permalink raw reply related
* [PATCH 0/2] mailmap from blobs
From: Jeff King @ 2012-12-12 10:58 UTC (permalink / raw)
To: git
I noticed recently that the GitHub contributions page for git.git did
not seem very accurate. The problem is that while it uses shortlog, it
does not respect .mailmap, because we do not have a working tree from
which to read the .mailmap.
This series adds a config option analogous to mailmap.file, but which
reads from a blob in the repository (so the obvious thing to set it to
is "HEAD:.mailmap" in a bare repo, and probably "master:.mailmap" if you
frequently want to traverse while on unrelated branches). The obvious
alternative is to checkout a temporary file of .mailmap and point
mailmap.file at it, but this is a bit more convenient.
A config option is perhaps not the most flexible way to access this. For
example, one could in theory want to pull the mailmap from the tip of
the history being traversed (e.g., because you have multiple unrelated
DAGs in a single repo). But that could also produce the _wrong_ results,
if you are looking at the shortlog of older history (e.g., when doing
"git shortlog v1.5.0..v1.5.5", you would still want to be using the
modern mailmap from "master").
By making it a config option, the simple, common case does the right
thing, and people with complex cases can use "git -c mailmap.blob=..."
to feed the appropriate map for the history they are traversing. If
somebody wants to do something fancier (like --mailmap-from-tip or
something), it would be easy to build on top later.
[1/2]: mailmap: refactor mailmap parsing for non-file sources
[2/2]: mailmap: support reading mailmap from blobs
-Peff
^ permalink raw reply
* Re: [BUG] Cannot push some grafted branches
From: Yann Dirson @ 2012-12-12 10:54 UTC (permalink / raw)
To: Yann Dirson; +Cc: Junio C Hamano, git list
In-Reply-To: <20121212094432.6e1e48c8@chalon.bertin.fr>
On Wed, 12 Dec 2012 09:44:32 +0100 Yann Dirson <dirson@bertin.fr> wrote:
> In fact, I even looked for a way to specify an alternate (or supplementary)
> grafts file for this drafting work, so only well-controlled git invocations
> would see them, whereas the others would just ignore them, and could not find
> any - nor could I identify an existing way of disabling the use of grafts by
> other means than moving it out of the way. In this respect, they seem to be
> lacking a few features, when compared to "replace" refs, but they have different
> uses, and just using the latter as a drafting area is just not adequate.
>
> I thought about adding support for a GIT_GRAFTS_FILE envvar, which would
> default to $GITDIR/info/grafts, or maybe with a more general addition of a
> GIT_EXTRA_GRAFT_FILES envvar, but I'm not sure the latter would be that useful.
My bad on this point: there *is* a GIT_GRAFT_FILE envvar, it is just undocumented.
In fact it is not the only one:
git.git$ for v in $(git grep define.*_ENVIRONMENT master -- cache.h | cut -d'"' -f2|grep ^GIT_); do git grep -q $v master -- Documentation || echo "missing $v"; done
missing GIT_GRAFT_FILE
missing GIT_CONFIG_PARAMETERS
--
Yann Dirson - Bertin Technologies
^ permalink raw reply
* Re: git-prompt.sh vs leading white space in __git_ps1()::printf_format
From: Simon Oosthoek @ 2012-12-12 8:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Piotr Krukowiecki, git
In-Reply-To: <7vy5h45e7b.fsf@alter.siamese.dyndns.org>
Hi Junio
This removes most of the ambiguities :-)
Ack from me!
I still have some minor nits, but I'll leave that for another time when I'm less busy.
BTW, I haven't tried this yet, but if you pass 2 arguments to __git_ps1 when called from command-substitution mode, I suppose it will think it's in PC mode and overwrite the PS1!
At some point, I'd like to see this code split off into "pc" and "cs" functions which call a common function to get the git status. But that's a major rewrite and it may involve more overhead, since each function should process the output of the common function in a different way.
Cheers
Simon
* Junio C Hamano <gitster@pobox.com> [2012-12-11 16:03:36 -0800]:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Perhaps like this?
>
> OK, this time with a log message.
>
> -- >8 --
> Subject: [PATCH] git-prompt.sh: update PROMPT_COMMAND documentation
>
> The description of __git_ps1 function operating in two-arg mode was
> not very clear. It said "set PROMPT_COMMAND=__git_ps1" which is not
> the right usage for this mode, followed by "To customize the prompt,
> do this", giving a false impression that those who do not want to
> customize it can get away with no-arg form, which was incorrect.
>
> Make it clear that this mode always takes two arguments, pre and
> post, with an example.
>
> The straight-forward one should be listed as the primary usage, and
> the confusing one should be an alternate for advanced users. Swap
> the order of these two.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> contrib/completion/git-prompt.sh | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
> index a8b53ba..9b074e1 100644
> --- a/contrib/completion/git-prompt.sh
> +++ b/contrib/completion/git-prompt.sh
> @@ -10,14 +10,20 @@
> # 1) Copy this file to somewhere (e.g. ~/.git-prompt.sh).
> # 2) Add the following line to your .bashrc/.zshrc:
> # source ~/.git-prompt.sh
> -# 3a) In ~/.bashrc set PROMPT_COMMAND=__git_ps1
> -# To customize the prompt, provide start/end arguments
> -# PROMPT_COMMAND='__git_ps1 "\u@\h:\w" "\\\$ "'
> -# 3b) Alternatively change your PS1 to call __git_ps1 as
> +# 3a) Change your PS1 to call __git_ps1 as
> # command-substitution:
> # Bash: PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
> # ZSH: PS1='[%n@%m %c$(__git_ps1 " (%s)")]\$ '
> -# the optional argument will be used as format string
> +# the optional argument will be used as format string.
> +# 3b) Alternatively, if you are using bash, __git_ps1 can be
> +# used for PROMPT_COMMAND with two parameters, <pre> and
> +# <post>, which are strings you would put in $PS1 before
> +# and after the status string generated by the git-prompt
> +# machinery. e.g.
> +# PROMPT_COMMAND='__git_ps1 "\u@\h:\w" "\\\$ "'
> +# will show username, at-sign, host, colon, cwd, then
> +# various status string, followed by dollar and SP, as
> +# your prompt.
> #
> # The argument to __git_ps1 will be displayed only if you are currently
> # in a git repository. The %s token will be the name of the current
> --
> 1.8.1.rc1.128.gd8d1528
>
^ permalink raw reply