Git development
 help / color / mirror / Atom feed
* Re: [PATCH 1/8] Use %B for Split Subject/Body
From: Junio C Hamano @ 2013-01-07 16:53 UTC (permalink / raw)
  To: 郑文辉(Techlive Zheng); +Cc: David A. Greene, git
In-Reply-To: <CAPYzjrT_8g26y-QrYvbQYoySWskGdn15jCX60rz04wQFQ2ikVw@mail.gmail.com>

"郑文辉(Techlive Zheng)"  <techlivezheng@gmail.com> writes:

> Though, this patch defintely should be merged, becuase no one expects
> his commit message be altered durging the splitting process.

Are you saying that after double-checking what was posted?  It said
something like this below, which does not look like 'definitely
should be' to me.

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 920c664..f2b6d4a 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -296,7 +296,12 @@ copy_commit()
 	# We're going to set some environment vars here, so
 	# do it in a subshell to get rid of them safely later
 	debug copy_commit "{$1}" "{$2}" "{$3}"
+	# Use %B rather than %s%n%n%b to handle the special case of a
+	# commit that only has a subject line.  We don't want to
+	# introduce a newline after the subject, causing generation of
+	# a new hash.
 	git log -1 --pretty=format:'%an%n%ae%n%ad%n%cn%n%ce%n%cd%n%s%n%n%b' "$1" |
+#	git log -1 --pretty=format:'%an%n%ae%n%ad%n%cn%n%ce%n%cd%n%B' "$1" |
 	(
 		read GIT_AUTHOR_NAME
 		read GIT_AUTHOR_EMAIL

^ permalink raw reply related

* Re: [PATCH v2] status: always report ignored tracked directories
From: Torsten Bögershausen @ 2013-01-07 17:21 UTC (permalink / raw)
  To: Antoine Pelisse, Nguyen Thai Ngoc Duy; +Cc: Jeff King, tboegi, git
In-Reply-To: <1357510179-22852-1-git-send-email-apelisse@gmail.com>

On 06.01.13 23:09, Antoine Pelisse wrote:
[snip]
> Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
> ---
>  dir.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 9b80348..f836590 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -672,7 +672,8 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
>
>  static struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len)
>  {
> -	if (cache_name_exists(pathname, len, ignore_case))
> +	if (!(dir->flags & DIR_SHOW_IGNORED) &&
> +	    cache_name_exists(pathname, len, ignore_case))
>  		return NULL;
>
>  	ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
> @@ -877,11 +878,7 @@ static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude,
>  	if (exclude)
>  		exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
>  	else if (dir->flags & DIR_SHOW_IGNORED) {
> -		/*
> -		 * Optimization:
> -		 * Don't spend time on indexed files, they won't be
> -		 * added to the list anyway
> -		 */
> +		/* Always exclude indexed files */
>  		struct cache_entry *ce = index_name_exists(&the_index,
>  		    path->buf, path->len, ignore_case);
>
> --
> 1.7.12.4.3.g90f5e2d
>
The bad news: the patch does not apply.
The good news: t7061 passes on pu,
and dir.c seems to be changes as needed:

commit 1f4e17c6c9833f17dc6bbf045f8a8d6378dcb417
Merge: dee1fa4 cc37e5b
Author: Junio C Hamano <gitster@pobox.com>
Date: Sun Jan 6 23:46:29 2013 -0800

Merge branch 'nd/parse-pathspec' into pu

which comes from Duy:

commit cc37e5bf18ca11d9a884bddfebcdff61df3e6279
Author: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Date: Sun Jan 6 13:21:08 2013 +0700

Convert more init_pathspec() to parse_pathspec()

^ permalink raw reply

* Re: [PATCH 1/4] test: Add target test-lint-shell-syntax
From: Torsten Bögershausen @ 2013-01-07 17:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Torsten Bögershausen, git
In-Reply-To: <7vpq1nyvp1.fsf@alter.siamese.dyndns.org>

On 03.01.13 01:08, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> I would actually not add this to TEST_LINT by default, especially
>> when "duplicates" and "executable" that are much simpler and less
>> likely to hit false positives are not on by default.
>>
>> At least, a change to add this to TEST_LINT by default must wait to
>> be merged to any integration branch until all the fix-up topics that
>> this test triggers in the current codebase graduate to the branch.
>>
>>>> +test-lint-shell-syntax:
>>>> +	$(PERL_PATH) check-non-portable-shell.pl $(T)
>>>
>>> This is wrong if $(PERL_PATH) contains spaces, no?
>>
>> You are correct; "harness" thing in the same Makefile gets this
>> wrong, too.  I think the right invocation is:
>>
>> 	@'$(PERL_PATH_SQ)' check-non-portable.shell.pl $(T)
>>
>> although I do not offhand know if that symbol is already exported by
>> the top-level Makefile.
> 
> I'll tentatively queue this instead.  The log message has also been
> cleaned up a bit.

Sorry for late answer, but there is a problem (both linux and Mac OS X) :-(
$ make test-lint does not do shel syntax check, neither
$ make test-lint-shell-syntax

In the Makefile the the line 
	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T)
doesn't seem to anything (?)

Replacing @'$(PERL_PATH_SQ)' with $(PERL_PATH) gives the following,
expected result: (a very long line starting like this:)

$ make test-lint-shell-syntax
/usr/bin/perl check-non-portable-shell.pl t0000-basic.sh ......

confused...
/Torsten

^ permalink raw reply

* Re: [PATCH v2] status: always report ignored tracked directories
From: Junio C Hamano @ 2013-01-07 17:50 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Antoine Pelisse, Nguyen Thai Ngoc Duy, Jeff King, git
In-Reply-To: <50EB0409.1090307@web.de>

Torsten Bögershausen <tboegi@web.de> writes:

> The bad news: the patch does not apply.

Huh, isn't this already queued as 22ccf86 (status: always report
ignored tracked directories, 2013-01-06)?

> The good news: t7061 passes on pu,
> and dir.c seems to be changes as needed:
>
> commit 1f4e17c6c9833f17dc6bbf045f8a8d6378dcb417
> Merge: dee1fa4 cc37e5b
> Author: Junio C Hamano <gitster@pobox.com>
> Date: Sun Jan 6 23:46:29 2013 -0800
>
> Merge branch 'nd/parse-pathspec' into pu
>
> which comes from Duy:
>
> commit cc37e5bf18ca11d9a884bddfebcdff61df3e6279
> Author: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> Date: Sun Jan 6 13:21:08 2013 +0700
>
> Convert more init_pathspec() to parse_pathspec()

Yes, it needs conflict resolution with other topics in flight, but
the thing to test is to see if the result of merging 22ccf86 into
the 'master' branch does what we want; the newer topic is still in
flux and we know it will be rerolled before it gets into testable
shape.

^ permalink raw reply

* [PATCH] Documentation on depth option in git clone.
From: Stefan Beller @ 2013-01-07 18:06 UTC (permalink / raw)
  To: schlotter, gitster, Ralf.Wildenhues, git; +Cc: Stefan Beller
In-Reply-To: <1357581996-17505-1-git-send-email-stefanbeller@googlemail.com>

---
 Documentation/git-clone.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 7fefdb0..e76aa50 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -186,7 +186,8 @@ objects from the source repository into a pack in the cloned repository.
 	it, nor push from nor into it), but is adequate if you
 	are only interested in the recent history of a large project
 	with a long history, and would want to send in fixes
-	as patches.
+	as patches. The depth should be at least 1. If it is 0 or
+	below, the cloned repository will not be shallow.
 
 --single-branch::
 	Clone only the history leading to the tip of a single branch,
-- 
1.8.1.80.g3e293fb.dirty

^ permalink raw reply related

* Re: [PATCH 1/4] test: Add target test-lint-shell-syntax
From: Junio C Hamano @ 2013-01-07 18:07 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Jeff King, git
In-Reply-To: <50EB0928.3090901@web.de>

Torsten Bögershausen <tboegi@web.de> writes:

> Sorry for late answer, but there is a problem (both linux and Mac OS X) :-(
> $ make test-lint does not do shel syntax check, neither
> $ make test-lint-shell-syntax

In which directory?

    $ make -C t test-lint-shell-syntax
    ... passes silently ...
    $ ed t/t0000-basic.sh
    /test_expect_success/
    a
            which sh
    .
    w
    q
    $ make -C t test-lint-shell-syntax
    t0000-basic.sh:28: error: which is not portable (please use type):      which sh
    make: *** [test-lint-shell-syntax] Error 1

If you edit out '@' (but nothing else) from this line:

> 	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T)

and run the above again, you would see that it is running this shell
command:

'/usr/bin/perl' check-non-portable-shell.pl t0000-basic.sh t0001-init.sh ...

If you introduce a Perl syntax error to check-non-portable-shell.pl,
like this, you will get:

    $ make -C t test-lint-shell-syntax
    syntax error at check-non-portable-shell.pl line 11, near "whoa

So... is your shell broken?  The above seems to work for dash, bash,
ksh and zsh.

^ permalink raw reply

* [PATCH] Documentation on depth option in git clone.
From: Stefan Beller @ 2013-01-07 18:10 UTC (permalink / raw)
  To: gitster, schlotter, Ralf.Wildenhues, git; +Cc: Stefan Beller

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
---
 Documentation/git-clone.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 7fefdb0..e76aa50 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -186,7 +186,8 @@ objects from the source repository into a pack in the cloned repository.
 	it, nor push from nor into it), but is adequate if you
 	are only interested in the recent history of a large project
 	with a long history, and would want to send in fixes
-	as patches.
+	as patches. The depth should be at least 1. If it is 0 or
+	below, the cloned repository will not be shallow.
 
 --single-branch::
 	Clone only the history leading to the tip of a single branch,
-- 
1.8.1.80.g3e293fb.dirty

^ permalink raw reply related

* Re: [PATCH] refs: do not use cached refs in repack_without_ref
From: Martin Fick @ 2013-01-07 18:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano
In-Reply-To: <201301071109.12086.mfick@codeaurora.org>

...[Sorry about the previous HTML reposts]

Jeff King <peff@peff.net> wrote:
>On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick 
wrote:
>
>> The general approach is to setup a transaction and
>> either commit or abort it. A transaction can be setup
>> by renaming an appropriately setup directory to the
>> "ref.lock" name. If the rename succeeds, the
>> transaction is begun. Any actor can abort the
>> transaction (up until it is committed) by simply
>> deleting the "ref.lock" directory, so it is not at
>> risk of going stale.
>
> Deleting a directory is not atomic, as you first have
> to remove the contents, putting it into a potentially
> inconsistent state. I'll assume you deal with that
> later...

Right, these simple single file transactions have at 
best 1 important file/directory in them, once deleted 
the transaction is aborted (can no longer complete).  
However to support multi file transactions, a better 
approach is likely to rename the uuid directory to have 
a .delete extension before deleting stuff in it.


> > One important piece of the transaction is the use
> > of uuids. The uuids provide a mechanism to tie the
> > atomic commit pieces to the transactions and thus to
> > prevent long sleeping process from inadvertently
> > performing actions which could be out of date when
> > they wake finally up.
> >
>Has this been a problem for you in practice?

No, but as you say, we don't currently hold locks for 
very long. I anticipate it being a problem in a 
clustered environment when transactions start spanning 
repos from java processes, with insane amounts of RAM, 
which can sometimes have unpredictable indeterminately 
long java GC cycles at inopportune times.. It would seem 
short sighted if Gerrit at least did not assume this 
will be a problem.

But, deletes today in git are not so short and Michael's 
fixes may make things worse? But, as you point out, that 
should perhaps be solved a different way.


> Avoiding this is one of the reasons that git does not
> take out long locks; instead, it takes the lock only
> at the moment it is ready to write, and aborts if it
> has been updated since the longer-term operation
> began. This has its own problems (you might do a lot
> of work only to have your operation aborted), but I
> am not sure that your proposal improves on that.

It does not, it might increase this.


> Git typically holds ref locks for a few syscalls. If
> you are conservative about leaving potentially stale
> locks in place (e.g., give them a few minutes to
> complete before assuming they are now bogus), you will
> not run into that problem.

In a distributed environment even a few minutes might 
not be enough, processes could be on a remote server 
with a temporarily split network, that could cause 
delays longer than your typical local expectations.

But there is also the other piece of this problem, how 
do you detect stale locks? How long will it be stale 
until a user figures it out and reports it? How many 
other users will simply have failed pushes and wonder 
why without reporting them?


> > In each case, the atomic commit piece is the
> > renaming of a file. For the create and update
> > pieces, a file is renamed from the "ref.lock" 
> > dir to the "ref" file resulting in an update to 
> > the sha for the ref.
>
> I think we've had problems with cross-directory
> renames on some filesystems, but I don't recall the
> details. I know that Coda does not like 
> cross-directory links, but cross-directory renames 
> are OK (and in fact we fall back to the latter when
> the former does not work).
>
> Ah, here we go: 5723fe7 (Avoid cross-directory renames
> and linking on object creation, 2008-06-14). Looks
> like NFS is the culprit.

If the renames fail we can fall back to regular file 
locking, the hard part to detect and deal with would be 
if the renames don't fail but become copies/mkdirs.


>> In the case of a delete, the actor may verify that
>> "ref" currently contains the sha to "prune" if it
>> needs to, and then renames the "ref" file to
>> "ref.lock/uuid/delete". On success, the ref was
>> deleted.
>>
>> Whether successful or not, the actor may now simply
>> delete the "ref.lock" directory, clearing the way for
>> a new transaction. Any other actor may delete this
>> directory at any time also, likely either on conflict
>> (if they are attempting to initiate a transaction),
>> or after a grace period just to cleanup the FS. Any
>> actor may also safely cleanup the tmp directories,
>> preferably also after a grace period.
>
> Hmm. So what happens to the "delete" file when the
> ref.lock directory is being deleted? Presumably
> deleting the ref.lock directory means doing it
> recursively (which is non-atomic). But then why are 
> we keeping the delete file at all, if we're just about
> to remove it?

We are not trying to keep it, but we need to ensure that 
our transaction has not yet been aborted: the rename 
does this.  If we just deleted the file, we may sleep 
and another transaction may abort our transaction and 
complete before we wake up and actually delete the file. 
But by using a rename we tie the delete atomically to 
the transaction, it cannot succeed if our transaction 
was aborted during a sleep since the directory we are 
renaming the file into would be gone! This is sort of 
the magic piece that makes the whole scheme special, 
safe deletes tend to be the hardest part.


>What happens if another process wants to cancel a
> transaction that is partially done? That is, the
> ref.lock directory is created, but it contains the
> uuid subdir? It sounds like it's OK to just delete
> from it, and the original process will then fail at
> its rename?

Yes, exactly.  But again, it might be better to cause a 
rename of the uuid dir before deleting it.


>> One neat part about this scheme is that I believe it
>> would be backwards compatible with the current
>> locking mechanism since the transaction directory
>> will simply appear to be a lock to older clients. 
>> And the old lock file should continue to lock
>> out these newer transactions.
>
>Yeah, I don't see anything that would prevent that. 
> The current code only cares about open("$ref.lock",
> O_EXCL). But that should be correct and atomic with
> respect to the renamed directories. You are depending
> on atomic directory renames. Those aren't used
> anywhere yet in git, as far as I know. So that may run
> into problems (but there's no reason this system
> couldn't be optional for filesystems that are more
> abled, and other systems could fall back to the
> straight-file locking).

Right.


> So in response to your question, no, I don't see any
> real showstoppers here. 

Alright, cool, thanks for the review and analysis, I 
appreciate it.


> And unlike the "all refs are files in a directory"
> scheme, it's confined to writing, which solves the
> readdir() atomicity questions I had there.


Right, I couldn't see a way around those, I think it was 
inherently flawed.


> I still can't say I'm super excited about it, just
> because it seems like a solution in search of a
> problem that I have not experienced myself.

It might be.  But at least a solution is out there now. 
If time indicates that it is needed, I feel better 
knowing there is a solution. 

Murphy has a way of making unlikely problems real 
annoyances in automated distributed environments. The 
problem is not usually one real annoying problem, those 
get fixed. The problem we deal mostly with is 1000 
infrequent problems, all different, but they add up to a 
system which needs constant maintenance. And the 
maintenance is hard since each problem is different, the 
analysis is difficult and requires expertise.


> Fixing the minor remaining races on ref packing would
> be nice, but I do not think those are a problem of the
> on-disk lock representation. The reason they are not
> fixed now is from an attempt to keep lock contention
> low between unrelated updates (something which we
> should probably give up on in favor of correctness;
> but that is orthogonal to whether we do it with the
> existing file locks or with a new system).

Agreed.


>A much stronger fix for that would be to record deletes
> in the loose ref storage so that we don't have to
> update the packed-refs file as often (because it is
> inherently a lock bottleneck, and because it is
> wasteful when you have a very large number of refs).
> But that means dealing with directory/file conflicts
> between deleted and existing branches, which is a
> pain.

Makes sense.

Thanks again for your thoughts,

-Martin

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply

* [PATCH] git clone depth of 0 not possible.
From: Stefan Beller @ 2013-01-07 18:06 UTC (permalink / raw)
  To: schlotter, gitster, Ralf.Wildenhues, git; +Cc: Stefan Beller

Currently it is not possible to have a shallow depth of
just 0, i.e. only one commit in that repository after cloning.
The minimum number of commits is 2, caused by depth=1.

I had no good idea how to add this behavior to git clone as
the depth variable in git_transport_options struct (file transport.h)
uses value 0 for another meaning, so it would have need changes at
all places, where the transport options depth is being used 
(e.g. fetch)

So I documented the current behavior, see attached patch.

Stefan Beller (1):
  Documentation on depth option in git clone.

 Documentation/git-clone.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.1.80.g3e293fb.dirty

^ permalink raw reply

* Re: [PATCH v2] status: always report ignored tracked directories
From: Junio C Hamano @ 2013-01-07 19:06 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: Jeff King, tboegi, git
In-Reply-To: <1357510179-22852-1-git-send-email-apelisse@gmail.com>

Antoine Pelisse <apelisse@gmail.com> writes:

First, thanks for working on this.

The explanation is a bit confusing, especially for people like me,
as it does not make it clear that there are two kinds of "paths in
the index".  Files can be added to the index ("git add" and then
shown via "ls-files") and these are what we usually call "files in
the index".  But there is a separate "name hash" that keeps track of
"paths the index knows about" in play, and that is what is discussed
in the description.

> Tracked directories (i.e. directories containing tracked files)
> that are ignored must be reported as ignored if they contain
> untracked files.
>
> Currently, files in the index can't be reported as ignored and are
> automatically dropped from the list:

"file in the index" will never be ignored, period.  Some paths the
index knows about, namely, directory names in name hash, may need to
be reported as ignored.

It is not clear what "list" the above "dropped from the list" refers
to.  The list of paths that becomes the result of "status"?

>  - When core.ignorecase is false, directories (which are not directly
>  tracked) are not listed as part of the index, and the directory can be
>  shown as ignored.
>
>  - When core.ignorecase is true on the other hand, directories are
>  reported as part of the index, and the directory is dropped, thus not
>  displayed as ignored.
>
> Fix that behavior by allowing indexed file to be added when looking for
> ignored files.
>
>  - Ignored tracked and untracked directories are treated the same way
>  when looking for ignored files, so we should not care about their index
>  status.
>  - Files are dismissed by treat_file() if they belong to the
>  index, so that step would have been a no-op anyway.

Here is my attempt...

    When enumerating paths that are ignored, paths the index knows
    about are not included in the result.  The "index knows about"
    check is done by consulting the name hash, not the actual
    contents of the index:

     - When core.ignorecase is false, directory names are not in the
       name hash, and ignored ones are shown as ignored (directories
       can never be tracked anyway).

     - When core.ignorecase is true, however, the name hash keeps
       track of the names of directories, in order to detect
       additions of the paths under different cases.  This causes
       ignored directories to be mistakenly excluded when
       enumerating ignored paths.

    Stop excluding directories that are in the name hash when
    looking for ignored files in dir_add_name(); the names that are
    actually in the index are excluded much earlier in the callchain
    in treat_file(), so this fix will not make them mistakenly
    identified as ignored.

    Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
    Reviewed-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

diff --git a/dir.c b/dir.c
index 9b80348..f836590 100644
--- a/dir.c
+++ b/dir.c
@@ -672,7 +672,8 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
 
 static struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len)
 {
-	if (cache_name_exists(pathname, len, ignore_case))
+	if (!(dir->flags & DIR_SHOW_IGNORED) &&
+	    cache_name_exists(pathname, len, ignore_case))
 		return NULL;
 
 	ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
@@ -877,11 +878,7 @@ static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude,
 	if (exclude)
 		exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
 	else if (dir->flags & DIR_SHOW_IGNORED) {
-		/*
-		 * Optimization:
-		 * Don't spend time on indexed files, they won't be
-		 * added to the list anyway
-		 */
+		/* Always exclude indexed files */
 		struct cache_entry *ce = index_name_exists(&the_index,
 		    path->buf, path->len, ignore_case);
 

^ permalink raw reply related

* [PATCH] Prevent space after directories in tcsh completion
From: Marc Khouzam @ 2013-01-07 19:07 UTC (permalink / raw)
  To: 'git@vger.kernel.org'
  Cc: 'Junio C Hamano', 'felipe.contreras@gmail.com',
	'szeder@ira.uka.de'

If git-completion.bash returns a single directory as a completion,
tcsh will automatically add a space after it, which is not what the
user wants.

This commit prevents tcsh from doing this.

Also, a check is added to make sure the tcsh version used is recent
enough to allow completion to work as expected.

Signed-off-by: Marc Khouzam <marc.khouzam@ericsson.com>
---

This update is meant to have tcsh completion work better if the
feature "git-completion.bash: add support for path completion"
is accepted.
See http://www.mail-archive.com/git@vger.kernel.org/msg14137.html
This commit does not depend on that other feature though and can
be applied right away.

Furthermore, based on feedback I received, some users are running
versions of tcsh that are over 5 years old and don't provide the
proper support for this script.  I've added a check to let the user
know of such (sad) situation nicely.

Thanks

Marc

 contrib/completion/git-completion.tcsh | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/contrib/completion/git-completion.tcsh b/contrib/completion/git-completion.tcsh
index 8aafb63..3e3889f 100644
--- a/contrib/completion/git-completion.tcsh
+++ b/contrib/completion/git-completion.tcsh
@@ -13,6 +13,7 @@
 #
 # To use this completion script:
 #
+#    0) You need tcsh 6.16.00 or newer.
 #    1) Copy both this file and the bash completion script to ${HOME}.
 #       You _must_ use the name ${HOME}/.git-completion.bash for the
 #       bash script.
@@ -24,6 +25,15 @@
 #        set autolist=ambiguous
 #       It will tell tcsh to list the possible completion choices.
 
+set __git_tcsh_completion_version = `\echo ${tcsh} | \sed 's/\./ /g'`
+if ( ${__git_tcsh_completion_version[1]} < 6 || \
+     ( ${__git_tcsh_completion_version[1]} == 6 && \
+       ${__git_tcsh_completion_version[2]} < 16 ) ) then
+	echo "git-completion.tcsh: Your version of tcsh is too old, you need version 6.16.00 or newer.  Git completion will not work."
+	exit
+endif
+unset __git_tcsh_completion_version
+
 set __git_tcsh_completion_original_script = ${HOME}/.git-completion.bash
 set __git_tcsh_completion_script = ${HOME}/.git-completion.tcsh.bash
 
@@ -64,9 +74,7 @@ fi
 _\${1}
 
 IFS=\$'\n'
-if [ \${#COMPREPLY[*]} -gt 0 ]; then
-	echo "\${COMPREPLY[*]}" | sort | uniq
-else
+if [ \${#COMPREPLY[*]} -eq 0 ]; then
 	# No completions suggested.  In this case, we want tcsh to perform
 	# standard file completion.  However, there does not seem to be way
 	# to tell tcsh to do that.  To help the user, we try to simulate
@@ -85,19 +93,20 @@ else
 		# We don't support ~ expansion: too tricky.
 		if [ "\${TO_COMPLETE:0:1}" != "~" ]; then
 			# Use ls so as to add the '/' at the end of directories.
-			RESULT=(\`ls -dp \${TO_COMPLETE}* 2> /dev/null\`)
-			echo \${RESULT[*]}
-
-			# If there is a single completion and it is a directory,
-			# we output it a second time to trick tcsh into not adding a space
-			# after it.
-			if [ \${#RESULT[*]} -eq 1 ] && [ "\${RESULT[0]: -1}" == "/" ]; then
-				echo \${RESULT[*]}
-			fi
+			COMPREPLY=(\`ls -dp \${TO_COMPLETE}* 2> /dev/null\`)
 		fi
 	fi
 fi
 
+# tcsh does not automatically remove duplicates, so we do it ourselves
+echo "\${COMPREPLY[*]}" | sort | uniq
+
+# If there is a single completion and it is a directory, we output it
+# a second time to trick tcsh into not adding a space after it.
+if [ \${#COMPREPLY[*]} -eq 1 ] && [ "\${COMPREPLY[0]: -1}" == "/" ]; then
+	echo "\${COMPREPLY[*]}"
+fi
+
 EOF
 
 # Don't need this variable anymore, so don't pollute the users environment
-- 
1.8.1.367.g8e14972.dirty

^ permalink raw reply related

* Re: [PATCH] Remove the suggestion to use parsecvs, which is currently broken.
From: Junio C Hamano @ 2013-01-07 19:21 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Eric S. Raymond, git
In-Reply-To: <20130106163420.GA3378@book-mint>

Heiko Voigt <hvoigt@hvoigt.net> writes:

> Hi,
>
> On Fri, Dec 28, 2012 at 11:42:00PM -0500, Eric S. Raymond wrote:
>> Heiko Voigt <hvoigt@hvoigt.net>:
>> > Maybe you could add that information to the parsecvs compile
>> > instructions? I think just because it takes some effort to compile does
>> > not justify to remove this useful pointer here. When I was converting a
>> > legacy cvs repository this pointer would have helped me a lot.
>> 
>> I'm parsecvs's maintainer now.  It's not in good shape; there is at
>> least one other known showstopper besides the build issue.  I would
>> strongly prefer to direct peoples' attention away from it until I
>> have time to fix it and cut a release.  This is not a distant 
>> prospect - two or three weeks out, maybe.
>
> So for this short amount of time you want to change gits documentation?
> Is this hint causing you trouble? Are there many people asking for
> support because of that?
>
> Even if it takes some effort to get parsecvs running I would like to
> keep the hint to a good and proven cvs importer.

I do not mind changing the documentation, but now I re-read the
change, I tend to agree that dropping the last (un)maintained
version of parsecvs may be detrimental.  Most people will not
download parsecvs from keith's page, but first will try the one
shipped with their distros, and the recent maintainer change for
that tool would not have any impact to these users.

But at the same time, by the time this change reaches these users
who may benefit by the mention of parsecvs via the distro route, the
situation may be different, so I do not think it is such a big deal.
In the longer term, if parsecvs is revived either by Eric or
somebody else, we will add the mention back to the documentation,
probably with an updated URL.

^ permalink raw reply

* The tip of 'next' has been rewound.
From: Junio C Hamano @ 2013-01-07 20:04 UTC (permalink / raw)
  To: git

Just to let you contributors know, your next "git fetch" will notice
that remotes/origin/next has been rewound.  Two topics have been
kicked back to 'pu', and the branch has been reordered somewhat.

I'll start merging bugfix topics that have graduated to master since
v1.8.1 to maint so that we can have v1.8.1.1 sometime next week.  We
haven't seen any real regression report for v1.8.1 since tagging it,
and it has been a week, so this may be the first 'maint' release
without "brown paper bag regression fix" for quite some time ;-)

^ permalink raw reply

* Re: [PATCH v2] status: always report ignored tracked directories
From: Antoine Pelisse @ 2013-01-07 20:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Torsten Bögershausen, git
In-Reply-To: <7vip78izir.fsf@alter.siamese.dyndns.org>

> Here is my attempt...
>
>     When enumerating paths that are ignored, paths the index knows
>     about are not included in the result.  The "index knows about"
>     check is done by consulting the name hash, not the actual
>     contents of the index:
>
>      - When core.ignorecase is false, directory names are not in the
>        name hash, and ignored ones are shown as ignored (directories
>        can never be tracked anyway).
>
>      - When core.ignorecase is true, however, the name hash keeps
>        track of the names of directories, in order to detect
>        additions of the paths under different cases.  This causes
>        ignored directories to be mistakenly excluded when
>        enumerating ignored paths.
>
>     Stop excluding directories that are in the name hash when
>     looking for ignored files in dir_add_name(); the names that are
>     actually in the index are excluded much earlier in the callchain
>     in treat_file(), so this fix will not make them mistakenly
>     identified as ignored.
>
>     Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
>     Reviewed-by: Jeff King <peff@peff.net>
>     Signed-off-by: Junio C Hamano <gitster@pobox.com>

I like it a lot, thanks to both of you

^ permalink raw reply

* Re: [PATCH] Remove the suggestion to use parsecvs, which is currently broken.
From: Eric S. Raymond @ 2013-01-07 20:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, git
In-Reply-To: <7vehhwiyt6.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com>:
> In the longer term, if parsecvs is revived either by Eric or
> somebody else, we will add the mention back to the documentation,
> probably with an updated URL.

I'm working on the revival right now. Repository generation is still
broken, and likely to remain so until I can make the export-stream stage
work, but just a few minutes ago I coaxed it into generating what looks 
like graphviz markup describing a commit graph on standard output.

Even though dot(1) barfs on the markup, this is encouraging. It almost
certainly means that the analysis and parsing stages aren't broken, and
by stubbing out enough functions I can figure out what is being passed
to the broken repository-maker well enough for my purposes.

Actually, I've already figured out how to generate blob and commit-header
markup.  The hard part is generating fileops; I don't quite understand
the generated data structures well enough to do that yet.  But I'm
making progress, and feeling more optimistic than I was yesterday.

In related news, I've sent Michael Haggerty patches that fix the visible
problems with cvs2git that I enumerated in previous mail.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

^ permalink raw reply

* Re: [PATCH 01/10] list_lookup: create case and length search
From: Junio C Hamano @ 2013-01-07 22:37 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: git
In-Reply-To: <7vmwwnp84p.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Antoine Pelisse <apelisse@gmail.com> writes:
>
>> Create a new function to look-up a string in a string_list, but:
>>  - add a new parameter to ignore case differences
>>  - add a length parameter to search for a substring
>>
>> The idea is to avoid several copies (lowering a string before searching
>> it when we just want to ignore case), or copying a substring of a bigger
>> string to search it in the string_list
>>
>> Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
>> ---
>
> I did not read beyond the log message and the function signature of
> the updated get_entry_index(), but it strikes me insane to build a
> sorted list using case sensitive full string comarison and then to
> look for an element in that list using a different comparison
> criteria (e.g. case insensitive comparison) and to expect the
> bisection search to still work.  Shouldn't the codepath that builds
> the string-list be sorting the list in case insensitive way from the
> beginning if this were to work correctly?
>
> It seems to suggest to me that this "are the keys case sensitive?"
> bit belongs to the entire struct string_list structure as its
> property (similar to the way "do the keys need to be strdup'ed?"
> bit), not something you would flip per invocation basis of the
> lookup function.
>
> Also isn't size_t an unsigned type?  What does it even mean to pass
> "-1" to it, and propagate it down to strncmp()?
>
> If you have a list sorted by a key, and if you want to query it with
> a partial prefix of a possibly valid key, I think you shouldn't have
> to add the "length search" at all. The existing look up function
> would return the location in the list that short key would have been
> inserted, which would come before the first entry that your short
> key is a prefix of, so the caller can iterate the list from there to
> find all entries.  In other words, if existing list has "aaa",
> "bba", and "bbc", and you want to grab entries that begin with "bb",
> you can ask for "bb" to the loop up function, which would say "the
> key does not exist in the list, but it would be inserted before this
> 'bba' entry".  Then you can discover that "bba" and "bbc" both
> matches the shortened key you have, no?

In the above, I mixed up which string is a prefix of which, but the
overall comment still stands, I think.

Your mailmap string list has "Ee@eee" (email part only) stored, while
the record from author/committer line has "NNNN <EE@EEE> TTTTTT sZZZZ",
to hold name, timestamp and zone.  You have a pointer email pointing
at the first "E" in that record, with maillen set to 6, and see if
the mailmap contains a string that match "EE@EEE".  So I mixed up
the prefix-ness of these strings.

Let's address the case sensitivety first.  The string_list_lookup()
and your string_list_lookup_extended() functions both work on a
string list that is built by add_mapping() function that asks
string_list_find_insert_index() to find an insertion point in the
sorted string list.  If this is not done using the same case
insensitive manner for a string list that is meant to be queried
case insensitively, the resulting list is still sorted case
sensitively, and look-up functions that work by binary search won't
find what you are looking for when they try to find the string using
case-insensitive comparison.  The change to implement it would
probably look like the attached patch.

Assuming that we can leave the case sensitivity behind us that way,
we still would need a new helper function

    string_list_lookup_prefix(struct string_list *, const char*, size_t)

so that you can find an existing "Ee@eee" entry in the string list
by email that points at "EE@EEE> TTTTTT sZZZZ" with maillen of 6.

I am wondering if we can do that without adding the length-limited
sort in the low-level get_entry_index() to do so, though.

You could first ask string_list_find_insert_index() to locate the
insertion point for "EE@EEE> TTTTTT sZZZZ".  That has to come after
"Ee@eee" if it exists (or it could be "De@eee" in which case you
know there is no match).

 1. Ask string_list_find_insert_index() to locate "EE@EEE> TTTTTT..."

    a. If it says there is an exact match, then that did not match
       "EE@EEE" list (you found "EE@EEE> TTTTTT sZZZZ").  You might
       still find "EE@EEE" before that record, though.  Also your
       key may have been "EE@EEE" exactly, in which case you already
       found what you were looking for.

    b. If it says there is not an exact match, the returned value is
       pointing at a record that comes after "EE@EEE> TTTTTT sZZZZ" if
       you were to insert it, which definitely is after "EE@EEE".

    So in either case, you know what you are looking for must come
    before string_list_find_insert_index() gave you.  Let's call
    that position "i".

    You still need to be careful that there could be an existing
    entry whose key is "EE@EEE" followed by a byte that sorts before
    ">" or " " (we do not want to limit this new generic string-list
    function to operate only on e-mail addresses), though.  So the
    above "must come before" is not "must come immediately before".

 2. Then compare map->items[i].string and email using strncasecmp()
    for maillen bytes, and make sure map->items[].string is exactly
    maillen bytes long.  If it matches, you found what you were
    looking for.  Otherwise, decrement "i" (e.g. the overlong key
    string you had was "EE@EEE> TTTTTT..." and the record at one
    before the insertion point was "EE@EEE\t" which sorts before it,
    but the string-list may have "EE@EEE" before that entry), see if
    the first maillen bytes still match, and continue.  Once you
    decrement i enough, the first maillen bytes won't match
    (e.g. you hit "De@EEE"), or "i" goes negative, and you can stop.

or something like that.  I'll cook up a patch and see how ugly it
ends up with.

-- >8 --
Subject: [PATCH] string-list: allow case-insensitive string list

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * At the very beginning of mailmap.c::read_mailmap(), you would say
   something like:

	int read_mailmap(struct string_list *map, char **repo_ident)
        {
        	map->strdup_strings = 1;
       +        map->cmp = strcasecmp;
                /* each failure returns 1, so >1 means both calls ...

   to make the mailmap string list a case insensitive one.

 string-list.c | 17 +++++++++++++----
 string-list.h |  4 ++++
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/string-list.c b/string-list.c
index 397e6cf..6f3d8cf 100644
--- a/string-list.c
+++ b/string-list.c
@@ -7,10 +7,11 @@ static int get_entry_index(const struct string_list *list, const char *string,
 		int *exact_match)
 {
 	int left = -1, right = list->nr;
+	compare_strings_fn cmp = list->cmp ? list->cmp : strcmp;
 
 	while (left + 1 < right) {
 		int middle = (left + right) / 2;
-		int compare = strcmp(string, list->items[middle].string);
+		int compare = cmp(string, list->items[middle].string);
 		if (compare < 0)
 			right = middle;
 		else if (compare > 0)
@@ -96,8 +97,9 @@ void string_list_remove_duplicates(struct string_list *list, int free_util)
 {
 	if (list->nr > 1) {
 		int src, dst;
+		compare_strings_fn cmp = list->cmp ? list->cmp : strcmp;
 		for (src = dst = 1; src < list->nr; src++) {
-			if (!strcmp(list->items[dst - 1].string, list->items[src].string)) {
+			if (!cmp(list->items[dst - 1].string, list->items[src].string)) {
 				if (list->strdup_strings)
 					free(list->items[src].string);
 				if (free_util)
@@ -230,15 +232,20 @@ struct string_list_item *string_list_append(struct string_list *list,
 			list->strdup_strings ? xstrdup(string) : (char *)string);
 }
 
+/* Yuck */
+static compare_strings_fn compare_for_qsort;
+
+/* Only call this from inside sort_string_list! */
 static int cmp_items(const void *a, const void *b)
 {
 	const struct string_list_item *one = a;
 	const struct string_list_item *two = b;
-	return strcmp(one->string, two->string);
+	return compare_for_qsort(one->string, two->string);
 }
 
 void sort_string_list(struct string_list *list)
 {
+	compare_for_qsort = list->cmp ? list->cmp : strcmp;
 	qsort(list->items, list->nr, sizeof(*list->items), cmp_items);
 }
 
@@ -246,8 +253,10 @@ struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 						     const char *string)
 {
 	int i;
+	compare_strings_fn cmp = list->cmp ? list->cmp : strcmp;
+
 	for (i = 0; i < list->nr; i++)
-		if (!strcmp(string, list->items[i].string))
+		if (!cmp(string, list->items[i].string))
 			return list->items + i;
 	return NULL;
 }
diff --git a/string-list.h b/string-list.h
index c50b0d0..446e79e 100644
--- a/string-list.h
+++ b/string-list.h
@@ -5,10 +5,14 @@ struct string_list_item {
 	char *string;
 	void *util;
 };
+
+typedef int (*compare_strings_fn)(const char *, const char *);
+
 struct string_list {
 	struct string_list_item *items;
 	unsigned int nr, alloc;
 	unsigned int strdup_strings:1;
+	compare_strings_fn cmp; /* NULL uses strcmp() */
 };
 
 #define STRING_LIST_INIT_NODUP { NULL, 0, 0, 0 }
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 01/10] string-list: allow case-insensitive string list
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

Some string list needs to be searched case insensitively, and for
that to work correctly, the string needs to be sorted case
insensitively from the beginning.

Allow a custom comparison function to be defined on a string list
instance and use it throughout in place of strcmp().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 17 +++++++++++++----
 string-list.h |  4 ++++
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/string-list.c b/string-list.c
index 397e6cf..6f3d8cf 100644
--- a/string-list.c
+++ b/string-list.c
@@ -7,10 +7,11 @@ static int get_entry_index(const struct string_list *list, const char *string,
 		int *exact_match)
 {
 	int left = -1, right = list->nr;
+	compare_strings_fn cmp = list->cmp ? list->cmp : strcmp;
 
 	while (left + 1 < right) {
 		int middle = (left + right) / 2;
-		int compare = strcmp(string, list->items[middle].string);
+		int compare = cmp(string, list->items[middle].string);
 		if (compare < 0)
 			right = middle;
 		else if (compare > 0)
@@ -96,8 +97,9 @@ void string_list_remove_duplicates(struct string_list *list, int free_util)
 {
 	if (list->nr > 1) {
 		int src, dst;
+		compare_strings_fn cmp = list->cmp ? list->cmp : strcmp;
 		for (src = dst = 1; src < list->nr; src++) {
-			if (!strcmp(list->items[dst - 1].string, list->items[src].string)) {
+			if (!cmp(list->items[dst - 1].string, list->items[src].string)) {
 				if (list->strdup_strings)
 					free(list->items[src].string);
 				if (free_util)
@@ -230,15 +232,20 @@ struct string_list_item *string_list_append(struct string_list *list,
 			list->strdup_strings ? xstrdup(string) : (char *)string);
 }
 
+/* Yuck */
+static compare_strings_fn compare_for_qsort;
+
+/* Only call this from inside sort_string_list! */
 static int cmp_items(const void *a, const void *b)
 {
 	const struct string_list_item *one = a;
 	const struct string_list_item *two = b;
-	return strcmp(one->string, two->string);
+	return compare_for_qsort(one->string, two->string);
 }
 
 void sort_string_list(struct string_list *list)
 {
+	compare_for_qsort = list->cmp ? list->cmp : strcmp;
 	qsort(list->items, list->nr, sizeof(*list->items), cmp_items);
 }
 
@@ -246,8 +253,10 @@ struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 						     const char *string)
 {
 	int i;
+	compare_strings_fn cmp = list->cmp ? list->cmp : strcmp;
+
 	for (i = 0; i < list->nr; i++)
-		if (!strcmp(string, list->items[i].string))
+		if (!cmp(string, list->items[i].string))
 			return list->items + i;
 	return NULL;
 }
diff --git a/string-list.h b/string-list.h
index c50b0d0..446e79e 100644
--- a/string-list.h
+++ b/string-list.h
@@ -5,10 +5,14 @@ struct string_list_item {
 	char *string;
 	void *util;
 };
+
+typedef int (*compare_strings_fn)(const char *, const char *);
+
 struct string_list {
 	struct string_list_item *items;
 	unsigned int nr, alloc;
 	unsigned int strdup_strings:1;
+	compare_strings_fn cmp; /* NULL uses strcmp() */
 };
 
 #define STRING_LIST_INIT_NODUP { NULL, 0, 0, 0 }
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 00/10] reroll of ap/log-mailmap
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse

This is a reroll of the previous series Antoine posted on Saturday.

A new patch "string-list: allow case-insensitive string list"
teaches the string-list API that some string lists can be sorted
case insensitively (actually, you can feed any custom two string
comparison functions).

The string_list_lookup_extended() function introduced by the
previous series has been discarded.  Instead, the third patch
"mailmap: remove email copy and length limitation" introduces a
helper function that takes a <char *, size_t> key that is not NUL
terminated to look for a matching item in a string list, and uses
that to update map_user() function, together with the fourth
patch "mailmap: simplify map_user() interface".

All other patches are unmodified from Antoine's series (modulo
wording tweaks here and there).

Antoine Pelisse (9):
  Use split_ident_line to parse author and committer
  mailmap: remove email copy and length limitation
  mailmap: simplify map_user() interface
  mailmap: add mailmap structure to rev_info and pp
  pretty: use mailmap to display username and email
  log: add --use-mailmap option
  test: add test for --use-mailmap option
  log: grep author/committer using mailmap
  log: add log.mailmap configuration option

Junio C Hamano (1):
  string-list: allow case-insensitive string list

 Documentation/config.txt  |   4 +
 Documentation/git-log.txt |   5 ++
 builtin/blame.c           | 183 ++++++++++++++++++++++------------------------
 builtin/log.c             |  16 +++-
 builtin/shortlog.c        |  54 ++++----------
 commit.h                  |   1 +
 log-tree.c                |   1 +
 mailmap.c                 |  94 +++++++++++++++---------
 mailmap.h                 |   4 +-
 pretty.c                  | 114 ++++++++++++++++-------------
 revision.c                |  54 ++++++++++++++
 revision.h                |   1 +
 string-list.c             |  17 ++++-
 string-list.h             |   4 +
 t/t4203-mailmap.sh        |  56 ++++++++++++++
 15 files changed, 379 insertions(+), 229 deletions(-)

-- 
1.8.1.304.gf036638

^ permalink raw reply

* [PATCH v2 02/10] Use split_ident_line to parse author and committer
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

From: Antoine Pelisse <apelisse@gmail.com>

Currently blame.c::get_acline(), pretty.c::pp_user_info() and
shortlog.c::insert_one_record() are parsing author name, email, time
and tz themselves.

Use ident.c::split_ident_line() for better code reuse.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/blame.c    | 59 +++++++++++++++++++-----------------------------------
 builtin/shortlog.c | 36 +++++++++------------------------
 pretty.c           | 35 +++++++++++++++++++-------------
 3 files changed, 52 insertions(+), 78 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index cfae569..dd4aff9 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1343,8 +1343,9 @@ static void get_ac_line(const char *inbuf, const char *what,
 			int mail_len, char *mail,
 			unsigned long *time, const char **tz)
 {
-	int len, tzlen, maillen;
-	char *tmp, *endp, *timepos, *mailpos;
+	struct ident_split ident;
+	int len, tzlen, maillen, namelen;
+	char *tmp, *endp, *mailpos;
 
 	tmp = strstr(inbuf, what);
 	if (!tmp)
@@ -1355,7 +1356,10 @@ static void get_ac_line(const char *inbuf, const char *what,
 		len = strlen(tmp);
 	else
 		len = endp - tmp;
-	if (person_len <= len) {
+	if (person_len <= len)
+		goto error_out;
+
+	if (split_ident_line(&ident, tmp, len)) {
 	error_out:
 		/* Ugh */
 		*tz = "(unknown)";
@@ -1364,47 +1368,26 @@ static void get_ac_line(const char *inbuf, const char *what,
 		*time = 0;
 		return;
 	}
-	memcpy(person, tmp, len);
 
-	tmp = person;
-	tmp += len;
-	*tmp = 0;
-	while (person < tmp && *tmp != ' ')
-		tmp--;
-	if (tmp <= person)
-		goto error_out;
-	*tz = tmp+1;
-	tzlen = (person+len)-(tmp+1);
+	namelen = ident.name_end - ident.name_begin;
+	memcpy(person, ident.name_begin, namelen);
+	person[namelen] = 0;
 
-	*tmp = 0;
-	while (person < tmp && *tmp != ' ')
-		tmp--;
-	if (tmp <= person)
-		goto error_out;
-	*time = strtoul(tmp, NULL, 10);
-	timepos = tmp;
+	maillen = ident.mail_end - ident.mail_begin + 2;
+	memcpy(mail, ident.mail_begin - 1, maillen);
+	mail[maillen] = 0;
 
-	*tmp = 0;
-	while (person < tmp && !(*tmp == ' ' && tmp[1] == '<'))
-		tmp--;
-	if (tmp <= person)
-		return;
-	mailpos = tmp + 1;
-	*tmp = 0;
-	maillen = timepos - tmp;
-	memcpy(mail, mailpos, maillen);
+	*time = strtoul(ident.date_begin, NULL, 10);
 
-	if (!mailmap.nr)
-		return;
+	tzlen = ident.tz_end - ident.tz_begin;
 
-	/*
-	 * mailmap expansion may make the name longer.
-	 * make room by pushing stuff down.
-	 */
-	tmp = person + person_len - (tzlen + 1);
-	memmove(tmp, *tz, tzlen);
+	/* Place tz at the end of person */
+	*tz = tmp = person + person_len - (tzlen + 1);
+	memcpy(tmp, ident.tz_begin, tzlen);
 	tmp[tzlen] = 0;
-	*tz = tmp;
+
+	if (!mailmap.nr)
+		return;
 
 	/*
 	 * Now, convert both name and e-mail using mailmap
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index b316cf3..03c6cd7 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -40,40 +40,24 @@ static void insert_one_record(struct shortlog *log,
 	char emailbuf[1024];
 	size_t len;
 	const char *eol;
-	const char *boemail, *eoemail;
 	struct strbuf subject = STRBUF_INIT;
+	struct ident_split ident;
 
-	boemail = strchr(author, '<');
-	if (!boemail)
-		return;
-	eoemail = strchr(boemail, '>');
-	if (!eoemail)
+	if (split_ident_line(&ident, author, strlen(author)))
 		return;
 
 	/* copy author name to namebuf, to support matching on both name and email */
-	memcpy(namebuf, author, boemail - author);
-	len = boemail - author;
-	while (len > 0 && isspace(namebuf[len-1]))
-		len--;
+	len = ident.name_end - ident.name_begin;
+	memcpy(namebuf, ident.name_begin, len);
 	namebuf[len] = 0;
 
 	/* copy email name to emailbuf, to allow email replacement as well */
-	memcpy(emailbuf, boemail+1, eoemail - boemail);
-	emailbuf[eoemail - boemail - 1] = 0;
-
-	if (!map_user(&log->mailmap, emailbuf, sizeof(emailbuf), namebuf, sizeof(namebuf))) {
-		while (author < boemail && isspace(*author))
-			author++;
-		for (len = 0;
-		     len < sizeof(namebuf) - 1 && author + len < boemail;
-		     len++)
-			namebuf[len] = author[len];
-		while (0 < len && isspace(namebuf[len-1]))
-			len--;
-		namebuf[len] = '\0';
-	}
-	else
-		len = strlen(namebuf);
+	len = ident.mail_end - ident.mail_begin;
+	memcpy(emailbuf, ident.mail_begin, len);
+	emailbuf[len] = 0;
+
+	map_user(&log->mailmap, emailbuf, sizeof(emailbuf), namebuf, sizeof(namebuf));
+	len = strlen(namebuf);
 
 	if (log->email) {
 		size_t room = sizeof(namebuf) - len - 1;
diff --git a/pretty.c b/pretty.c
index 5bdc2e7..350d1df 100644
--- a/pretty.c
+++ b/pretty.c
@@ -387,29 +387,36 @@ void pp_user_info(const struct pretty_print_context *pp,
 		  const char *what, struct strbuf *sb,
 		  const char *line, const char *encoding)
 {
+	struct ident_split ident;
+	int linelen, namelen;
+	char *line_end, *date;
 	int max_length = 78; /* per rfc2822 */
-	char *date;
-	int namelen;
 	unsigned long time;
 	int tz;
 
 	if (pp->fmt == CMIT_FMT_ONELINE)
 		return;
-	date = strchr(line, '>');
-	if (!date)
+
+	line_end = strchr(line, '\n');
+	if (!line_end) {
+		line_end = strchr(line, '\0');
+		if (!line_end)
+			return;
+	}
+
+	linelen = ++line_end - line;
+	if (split_ident_line(&ident, line, linelen))
 		return;
-	namelen = ++date - line;
-	time = strtoul(date, &date, 10);
+
+	namelen = ident.mail_end - ident.name_begin + 1;
+	time = strtoul(ident.date_begin, &date, 10);
 	tz = strtol(date, NULL, 10);
 
 	if (pp->fmt == CMIT_FMT_EMAIL) {
-		char *name_tail = strchr(line, '<');
 		int display_name_length;
-		if (!name_tail)
-			return;
-		while (line < name_tail && isspace(name_tail[-1]))
-			name_tail--;
-		display_name_length = name_tail - line;
+
+		display_name_length = ident.name_end - ident.name_begin;
+
 		strbuf_addstr(sb, "From: ");
 		if (needs_rfc2047_encoding(line, display_name_length, RFC2047_ADDRESS)) {
 			add_rfc2047(sb, line, display_name_length,
@@ -427,10 +434,10 @@ void pp_user_info(const struct pretty_print_context *pp,
 		}
 		if (namelen - display_name_length + last_line_length(sb) > max_length) {
 			strbuf_addch(sb, '\n');
-			if (!isspace(name_tail[0]))
+			if (!isspace(ident.name_end[0]))
 				strbuf_addch(sb, ' ');
 		}
-		strbuf_add(sb, name_tail, namelen - display_name_length);
+		strbuf_add(sb, ident.name_end, namelen - display_name_length);
 		strbuf_addch(sb, '\n');
 	} else {
 		strbuf_addf(sb, "%s: %.*s%.*s\n", what,
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 03/10] mailmap: remove email copy and length limitation
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

In map_user(), we have email pointer that points at the beginning of
an e-mail address, but the buffer is not terminated with a NUL after
the e-mail address.  It typically has ">" after the address, and it
could have even more if it comes from author/committer line in a
commit object.  Or it may not have ">" after it.

We used to copy the e-mail address proper into a temporary buffer
before asking the string-list API to find the e-mail address in the
mailmap, because string_list_lookup() function only takes a NUL
terminated full string.

Introduce a helper function lookup_prefix that takes the email
pointer and the length, and finds a matching entry in the string
list used for the mailmap, by doing the following:

 - First ask string_list_find_insert_index() where in its sorted
   list the e-mail address we have (including the possible trailing
   junk ">...") would be inserted.

 - It could find an exact match (e.g. we had a clean e-mail address
   without any trailing junk).  We can return the item in that case.

 - Or it could return the index of an item that sorts after the
   e-mail address we have.

 - If we did not find an exact match against a clean e-mail address,
   then the record we are looking for in the mailmap has to exist
   before the index returned by the function (i.e. "email>junk"
   always sorts later than "email").  Iterate, starting from that
   index, down the map->items[] array until we find the exact record
   we are looking for, or we see a record with a key that definitely
   sorts earlier than the e-mail we are looking for (i.e. when we
   are looking for "email" in "email>junk", a record in the mailmap
   that begins with "emaik" strictly sorts before "email", if such a
   key existed in the mailmap).

This, together with the earlier enhancement to support
case-insensitive sorting, allow us to remove an extra copy of email
buffer to downcase it.

A part of this is based on Antoine Pelisse's previous work.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 mailmap.c | 61 +++++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 16 deletions(-)

diff --git a/mailmap.c b/mailmap.c
index ea4b471..1a0b769 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -174,6 +174,7 @@ static int read_single_mailmap(struct string_list *map, const char *filename, ch
 int read_mailmap(struct string_list *map, char **repo_abbrev)
 {
 	map->strdup_strings = 1;
+	map->cmp = strcasecmp;
 	/* each failure returns 1, so >1 means both calls failed */
 	return read_single_mailmap(map, ".mailmap", repo_abbrev) +
 	       read_single_mailmap(map, git_mailmap_file, repo_abbrev) > 1;
@@ -187,14 +188,50 @@ void clear_mailmap(struct string_list *map)
 	debug_mm("mailmap: cleared\n");
 }
 
+static struct string_list_item *lookup_prefix(struct string_list *map,
+					      const char *string, size_t len)
+{
+	int i = string_list_find_insert_index(map, string, 1);
+	if (i < 0) {
+		/* exact match */
+		i = -1 - i;
+		/* does it match exactly? */
+		if (!map->items[i].string[len])
+			return &map->items[i];
+	}
+
+	/*
+	 * i is at the exact match to an overlong key, or
+	 * location the possibly overlong key would be inserted,
+	 * which must be after the real location of the key.
+	 */
+	while (0 <= --i && i < map->nr) {
+		int cmp = strncasecmp(map->items[i].string, string, len);
+		if (cmp < 0)
+			/*
+			 * "i" points at a key definitely below the prefix;
+			 * the map does not have string[0:len] in it.
+			 */
+			break;
+		else if (!cmp && !map->items[i].string[len])
+			/* found it */
+			return &map->items[i];
+		/*
+		 * otherwise, the string at "i" may be string[0:len]
+		 * followed by a string that sorts later than string[len:];
+		 * keep trying.
+		 */
+	}
+	return NULL;
+}
+
 int map_user(struct string_list *map,
 	     char *email, int maxlen_email, char *name, int maxlen_name)
 {
 	char *end_of_email;
 	struct string_list_item *item;
 	struct mailmap_entry *me;
-	char buf[1024], *mailbuf;
-	int i;
+	size_t maillen;
 
 	/* figure out space requirement for email */
 	end_of_email = strchr(email, '>');
@@ -204,18 +241,12 @@ int map_user(struct string_list *map,
 		if (!end_of_email)
 			return 0;
 	}
-	if (end_of_email - email + 1 < sizeof(buf))
-		mailbuf = buf;
-	else
-		mailbuf = xmalloc(end_of_email - email + 1);
-
-	/* downcase the email address */
-	for (i = 0; i < end_of_email - email; i++)
-		mailbuf[i] = tolower(email[i]);
-	mailbuf[i] = 0;
-
-	debug_mm("map_user: map '%s' <%s>\n", name, mailbuf);
-	item = string_list_lookup(map, mailbuf);
+
+	maillen = end_of_email - email;
+
+	debug_mm("map_user: map '%s' <%.*s>\n", name, maillen, email);
+
+	item = lookup_prefix(map, email, maillen);
 	if (item != NULL) {
 		me = (struct mailmap_entry *)item->util;
 		if (me->namemap.nr) {
@@ -226,8 +257,6 @@ int map_user(struct string_list *map,
 				item = subitem;
 		}
 	}
-	if (mailbuf != buf)
-		free(mailbuf);
 	if (item != NULL) {
 		struct mailmap_info *mi = (struct mailmap_info *)item->util;
 		if (mi->name == NULL && (mi->email == NULL || maxlen_email == 0)) {
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 07/10] log: add --use-mailmap option
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

From: Antoine Pelisse <apelisse@gmail.com>

Add the --use-mailmap option to log commands. It allows to display
names from mailmap file when displaying logs, whatever the format
used.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-log.txt | 5 +++++
 builtin/log.c             | 9 ++++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 585dac4..a99be97 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -47,6 +47,11 @@ OPTIONS
 	Print out the ref name given on the command line by which each
 	commit was reached.
 
+--use-mailmap::
+	Use mailmap file to map author and committer names and email
+	to canonical real names and email addresses. See
+	linkgit:git-shortlog[1].
+
 --full-diff::
 	Without this flag, "git log -p <path>..." shows commits that
 	touch the specified paths, and diffs about the same specified
diff --git a/builtin/log.c b/builtin/log.c
index e7b7db1..d2bd8ce 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -22,6 +22,7 @@
 #include "branch.h"
 #include "streaming.h"
 #include "version.h"
+#include "mailmap.h"
 
 /* Set a default date-time format for git log ("log.date" config variable) */
 static const char *default_date_mode = NULL;
@@ -94,11 +95,12 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 			 struct rev_info *rev, struct setup_revision_opt *opt)
 {
 	struct userformat_want w;
-	int quiet = 0, source = 0;
+	int quiet = 0, source = 0, mailmap = 0;
 
 	const struct option builtin_log_options[] = {
 		OPT_BOOLEAN(0, "quiet", &quiet, N_("suppress diff output")),
 		OPT_BOOLEAN(0, "source", &source, N_("show source")),
+		OPT_BOOLEAN(0, "use-mailmap", &mailmap, N_("Use mail map file")),
 		{ OPTION_CALLBACK, 0, "decorate", NULL, NULL, N_("decorate options"),
 		  PARSE_OPT_OPTARG, decorate_callback},
 		OPT_END()
@@ -136,6 +138,11 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 	if (source)
 		rev->show_source = 1;
 
+	if (mailmap) {
+		rev->mailmap = xcalloc(1, sizeof(struct string_list));
+		read_mailmap(rev->mailmap, NULL);
+	}
+
 	if (rev->pretty_given && rev->commit_format == CMIT_FMT_RAW) {
 		/*
 		 * "log --pretty=raw" is special; ignore UI oriented
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 04/10] mailmap: simplify map_user() interface
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

From: Antoine Pelisse <apelisse@gmail.com>

Simplify map_user(), mostly to avoid copies of string buffers. It
also simplifies caller functions.

map_user() directly receive pointers and length from the commit buffer
as mail and name. If mapping of the user and mail can be done, the
pointer is updated to a new location. Lengths are also updated if
necessary.

The caller of map_user() can then copy the new email and name if
necessary.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/blame.c    | 156 +++++++++++++++++++++++++++--------------------------
 builtin/shortlog.c |  32 +++++------
 mailmap.c          |  43 +++++++--------
 mailmap.h          |   4 +-
 pretty.c           |  35 +++++-------
 5 files changed, 126 insertions(+), 144 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index dd4aff9..86450e3 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1321,31 +1321,31 @@ static void pass_blame(struct scoreboard *sb, struct origin *origin, int opt)
  * Information on commits, used for output.
  */
 struct commit_info {
-	const char *author;
-	const char *author_mail;
+	struct strbuf author;
+	struct strbuf author_mail;
 	unsigned long author_time;
-	const char *author_tz;
+	struct strbuf author_tz;
 
 	/* filled only when asked for details */
-	const char *committer;
-	const char *committer_mail;
+	struct strbuf committer;
+	struct strbuf committer_mail;
 	unsigned long committer_time;
-	const char *committer_tz;
+	struct strbuf committer_tz;
 
-	const char *summary;
+	struct strbuf summary;
 };
 
 /*
  * Parse author/committer line in the commit object buffer
  */
 static void get_ac_line(const char *inbuf, const char *what,
-			int person_len, char *person,
-			int mail_len, char *mail,
-			unsigned long *time, const char **tz)
+	struct strbuf *name, struct strbuf *mail,
+	unsigned long *time, struct strbuf *tz)
 {
 	struct ident_split ident;
-	int len, tzlen, maillen, namelen;
-	char *tmp, *endp, *mailpos;
+	size_t len, maillen, namelen;
+	char *tmp, *endp;
+	const char *namebuf, *mailbuf;
 
 	tmp = strstr(inbuf, what);
 	if (!tmp)
@@ -1356,51 +1356,61 @@ static void get_ac_line(const char *inbuf, const char *what,
 		len = strlen(tmp);
 	else
 		len = endp - tmp;
-	if (person_len <= len)
-		goto error_out;
 
 	if (split_ident_line(&ident, tmp, len)) {
 	error_out:
 		/* Ugh */
-		*tz = "(unknown)";
-		strcpy(person, *tz);
-		strcpy(mail, *tz);
+		tmp = "(unknown)";
+		strbuf_addstr(name, tmp);
+		strbuf_addstr(mail, tmp);
+		strbuf_addstr(tz, tmp);
 		*time = 0;
 		return;
 	}
 
 	namelen = ident.name_end - ident.name_begin;
-	memcpy(person, ident.name_begin, namelen);
-	person[namelen] = 0;
+	namebuf = ident.name_begin;
 
-	maillen = ident.mail_end - ident.mail_begin + 2;
-	memcpy(mail, ident.mail_begin - 1, maillen);
-	mail[maillen] = 0;
+	maillen = ident.mail_end - ident.mail_begin;
+	mailbuf = ident.mail_begin;
 
 	*time = strtoul(ident.date_begin, NULL, 10);
 
-	tzlen = ident.tz_end - ident.tz_begin;
-
-	/* Place tz at the end of person */
-	*tz = tmp = person + person_len - (tzlen + 1);
-	memcpy(tmp, ident.tz_begin, tzlen);
-	tmp[tzlen] = 0;
-
-	if (!mailmap.nr)
-		return;
+	len = ident.tz_end - ident.tz_begin;
+	strbuf_add(tz, ident.tz_begin, len);
 
 	/*
 	 * Now, convert both name and e-mail using mailmap
 	 */
-	if (map_user(&mailmap, mail+1, mail_len-1, person, tmp-person-1)) {
-		/* Add a trailing '>' to email, since map_user returns plain emails
-		   Note: It already has '<', since we replace from mail+1 */
-		mailpos = memchr(mail, '\0', mail_len);
-		if (mailpos && mailpos-mail < mail_len - 1) {
-			*mailpos = '>';
-			*(mailpos+1) = '\0';
-		}
-	}
+	map_user(&mailmap, &mailbuf, &maillen,
+		 &namebuf, &namelen);
+
+	strbuf_addf(mail, "<%.*s>", (int)maillen, mailbuf);
+	strbuf_add(name, namebuf, namelen);
+}
+
+static void commit_info_init(struct commit_info *ci)
+{
+
+	strbuf_init(&ci->author, 0);
+	strbuf_init(&ci->author_mail, 0);
+	strbuf_init(&ci->author_tz, 0);
+	strbuf_init(&ci->committer, 0);
+	strbuf_init(&ci->committer_mail, 0);
+	strbuf_init(&ci->committer_tz, 0);
+	strbuf_init(&ci->summary, 0);
+}
+
+static void commit_info_destroy(struct commit_info *ci)
+{
+
+	strbuf_release(&ci->author);
+	strbuf_release(&ci->author_mail);
+	strbuf_release(&ci->author_tz);
+	strbuf_release(&ci->committer);
+	strbuf_release(&ci->committer_mail);
+	strbuf_release(&ci->committer_tz);
+	strbuf_release(&ci->summary);
 }
 
 static void get_commit_info(struct commit *commit,
@@ -1410,11 +1420,8 @@ static void get_commit_info(struct commit *commit,
 	int len;
 	const char *subject, *encoding;
 	char *reencoded, *message;
-	static char author_name[1024];
-	static char author_mail[1024];
-	static char committer_name[1024];
-	static char committer_mail[1024];
-	static char summary_buf[1024];
+
+	commit_info_init(ret);
 
 	/*
 	 * We've operated without save_commit_buffer, so
@@ -1432,11 +1439,8 @@ static void get_commit_info(struct commit *commit,
 	encoding = get_log_output_encoding();
 	reencoded = logmsg_reencode(commit, encoding);
 	message   = reencoded ? reencoded : commit->buffer;
-	ret->author = author_name;
-	ret->author_mail = author_mail;
 	get_ac_line(message, "\nauthor ",
-		    sizeof(author_name), author_name,
-		    sizeof(author_mail), author_mail,
+		    &ret->author, &ret->author_mail,
 		    &ret->author_time, &ret->author_tz);
 
 	if (!detailed) {
@@ -1444,21 +1448,16 @@ static void get_commit_info(struct commit *commit,
 		return;
 	}
 
-	ret->committer = committer_name;
-	ret->committer_mail = committer_mail;
 	get_ac_line(message, "\ncommitter ",
-		    sizeof(committer_name), committer_name,
-		    sizeof(committer_mail), committer_mail,
+		    &ret->committer, &ret->committer_mail,
 		    &ret->committer_time, &ret->committer_tz);
 
-	ret->summary = summary_buf;
 	len = find_commit_subject(message, &subject);
-	if (len && len < sizeof(summary_buf)) {
-		memcpy(summary_buf, subject, len);
-		summary_buf[len] = 0;
-	} else {
-		sprintf(summary_buf, "(%s)", sha1_to_hex(commit->object.sha1));
-	}
+	if (len)
+		strbuf_add(&ret->summary, subject, len);
+	else
+		strbuf_addf(&ret->summary, "(%s)", sha1_to_hex(commit->object.sha1));
+
 	free(reencoded);
 }
 
@@ -1487,15 +1486,15 @@ static int emit_one_suspect_detail(struct origin *suspect, int repeat)
 
 	suspect->commit->object.flags |= METAINFO_SHOWN;
 	get_commit_info(suspect->commit, &ci, 1);
-	printf("author %s\n", ci.author);
-	printf("author-mail %s\n", ci.author_mail);
+	printf("author %s\n", ci.author.buf);
+	printf("author-mail %s\n", ci.author_mail.buf);
 	printf("author-time %lu\n", ci.author_time);
-	printf("author-tz %s\n", ci.author_tz);
-	printf("committer %s\n", ci.committer);
-	printf("committer-mail %s\n", ci.committer_mail);
+	printf("author-tz %s\n", ci.author_tz.buf);
+	printf("committer %s\n", ci.committer.buf);
+	printf("committer-mail %s\n", ci.committer_mail.buf);
 	printf("committer-time %lu\n", ci.committer_time);
-	printf("committer-tz %s\n", ci.committer_tz);
-	printf("summary %s\n", ci.summary);
+	printf("committer-tz %s\n", ci.committer_tz.buf);
+	printf("summary %s\n", ci.summary.buf);
 	if (suspect->commit->object.flags & UNINTERESTING)
 		printf("boundary\n");
 	if (suspect->previous) {
@@ -1503,6 +1502,9 @@ static int emit_one_suspect_detail(struct origin *suspect, int repeat)
 		printf("previous %s ", sha1_to_hex(prev->commit->object.sha1));
 		write_name_quoted(prev->path, stdout, '\n');
 	}
+
+	commit_info_destroy(&ci);
+
 	return 1;
 }
 
@@ -1689,11 +1691,11 @@ static void emit_other(struct scoreboard *sb, struct blame_entry *ent, int opt)
 		if (opt & OUTPUT_ANNOTATE_COMPAT) {
 			const char *name;
 			if (opt & OUTPUT_SHOW_EMAIL)
-				name = ci.author_mail;
+				name = ci.author_mail.buf;
 			else
-				name = ci.author;
+				name = ci.author.buf;
 			printf("\t(%10s\t%10s\t%d)", name,
-			       format_time(ci.author_time, ci.author_tz,
+			       format_time(ci.author_time, ci.author_tz.buf,
 					   show_raw_time),
 			       ent->lno + 1 + cnt);
 		} else {
@@ -1712,14 +1714,14 @@ static void emit_other(struct scoreboard *sb, struct blame_entry *ent, int opt)
 				const char *name;
 				int pad;
 				if (opt & OUTPUT_SHOW_EMAIL)
-					name = ci.author_mail;
+					name = ci.author_mail.buf;
 				else
-					name = ci.author;
+					name = ci.author.buf;
 				pad = longest_author - utf8_strwidth(name);
 				printf(" (%s%*s %10s",
 				       name, pad, "",
 				       format_time(ci.author_time,
-						   ci.author_tz,
+						   ci.author_tz.buf,
 						   show_raw_time));
 			}
 			printf(" %*d) ",
@@ -1734,6 +1736,8 @@ static void emit_other(struct scoreboard *sb, struct blame_entry *ent, int opt)
 
 	if (sb->final_buf_size && cp[-1] != '\n')
 		putchar('\n');
+
+	commit_info_destroy(&ci);
 }
 
 static void output(struct scoreboard *sb, int option)
@@ -1858,9 +1862,9 @@ static void find_alignment(struct scoreboard *sb, int *option)
 			suspect->commit->object.flags |= METAINFO_SHOWN;
 			get_commit_info(suspect->commit, &ci, 1);
 			if (*option & OUTPUT_SHOW_EMAIL)
-				num = utf8_strwidth(ci.author_mail);
+				num = utf8_strwidth(ci.author_mail.buf);
 			else
-				num = utf8_strwidth(ci.author);
+				num = utf8_strwidth(ci.author.buf);
 			if (longest_author < num)
 				longest_author = num;
 		}
@@ -1872,6 +1876,8 @@ static void find_alignment(struct scoreboard *sb, int *option)
 			longest_dst_lines = num;
 		if (largest_score < ent_score(sb, e))
 			largest_score = ent_score(sb, e);
+
+		commit_info_destroy(&ci);
 	}
 	max_orig_digits = decimal_width(longest_src_lines);
 	max_digits = decimal_width(longest_dst_lines);
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 03c6cd7..1eeed0f 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -36,36 +36,28 @@ static void insert_one_record(struct shortlog *log,
 	const char *dot3 = log->common_repo_prefix;
 	char *buffer, *p;
 	struct string_list_item *item;
-	char namebuf[1024];
-	char emailbuf[1024];
-	size_t len;
+	const char *mailbuf, *namebuf;
+	size_t namelen, maillen;
 	const char *eol;
 	struct strbuf subject = STRBUF_INIT;
+	struct strbuf namemailbuf = STRBUF_INIT;
 	struct ident_split ident;
 
 	if (split_ident_line(&ident, author, strlen(author)))
 		return;
 
-	/* copy author name to namebuf, to support matching on both name and email */
-	len = ident.name_end - ident.name_begin;
-	memcpy(namebuf, ident.name_begin, len);
-	namebuf[len] = 0;
+	namebuf = ident.name_begin;
+	mailbuf = ident.mail_begin;
+	namelen = ident.name_end - ident.name_begin;
+	maillen = ident.mail_end - ident.mail_begin;
 
-	/* copy email name to emailbuf, to allow email replacement as well */
-	len = ident.mail_end - ident.mail_begin;
-	memcpy(emailbuf, ident.mail_begin, len);
-	emailbuf[len] = 0;
+	map_user(&log->mailmap, &mailbuf, &maillen, &namebuf, &namelen);
+	strbuf_add(&namemailbuf, namebuf, namelen);
 
-	map_user(&log->mailmap, emailbuf, sizeof(emailbuf), namebuf, sizeof(namebuf));
-	len = strlen(namebuf);
+	if (log->email)
+		strbuf_addf(&namemailbuf, " <%.*s>", (int)maillen, mailbuf);
 
-	if (log->email) {
-		size_t room = sizeof(namebuf) - len - 1;
-		int maillen = strlen(emailbuf);
-		snprintf(namebuf + len, room, " <%.*s>", maillen, emailbuf);
-	}
-
-	item = string_list_insert(&log->list, namebuf);
+	item = string_list_insert(&log->list, namemailbuf.buf);
 	if (item->util == NULL)
 		item->util = xcalloc(1, sizeof(struct string_list));
 
diff --git a/mailmap.c b/mailmap.c
index 1a0b769..5ba5c37 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -226,50 +226,43 @@ static struct string_list_item *lookup_prefix(struct string_list *map,
 }
 
 int map_user(struct string_list *map,
-	     char *email, int maxlen_email, char *name, int maxlen_name)
+			 const char **email, size_t *emaillen,
+			 const char **name, size_t *namelen)
 {
-	char *end_of_email;
 	struct string_list_item *item;
 	struct mailmap_entry *me;
-	size_t maillen;
-
-	/* figure out space requirement for email */
-	end_of_email = strchr(email, '>');
-	if (!end_of_email) {
-		/* email passed in might not be wrapped in <>, but end with a \0 */
-		end_of_email = memchr(email, '\0', maxlen_email);
-		if (!end_of_email)
-			return 0;
-	}
-
-	maillen = end_of_email - email;
 
-	debug_mm("map_user: map '%s' <%.*s>\n", name, maillen, email);
+	debug_mm("map_user: map '%.*s' <%.*s>\n",
+		 *name, *namelen, *emaillen, *email);
 
-	item = lookup_prefix(map, email, maillen);
+	item = lookup_prefix(map, *email, *emaillen);
 	if (item != NULL) {
 		me = (struct mailmap_entry *)item->util;
 		if (me->namemap.nr) {
 			/* The item has multiple items, so we'll look up on name too */
 			/* If the name is not found, we choose the simple entry      */
-			struct string_list_item *subitem = string_list_lookup(&me->namemap, name);
+			struct string_list_item *subitem;
+			subitem = lookup_prefix(&me->namemap, *name, *namelen);
 			if (subitem)
 				item = subitem;
 		}
 	}
 	if (item != NULL) {
 		struct mailmap_info *mi = (struct mailmap_info *)item->util;
-		if (mi->name == NULL && (mi->email == NULL || maxlen_email == 0)) {
+		if (mi->name == NULL && mi->email == NULL) {
 			debug_mm("map_user:  -- (no simple mapping)\n");
 			return 0;
 		}
-		if (maxlen_email && mi->email)
-			strlcpy(email, mi->email, maxlen_email);
-		else
-			*end_of_email = '\0';
-		if (maxlen_name && mi->name)
-			strlcpy(name, mi->name, maxlen_name);
-		debug_mm("map_user:  to '%s' <%s>\n", name, mi->email ? mi->email : "");
+		if (mi->email) {
+				*email = mi->email;
+				*emaillen = strlen(*email);
+		}
+		if (mi->name) {
+				*name = mi->name;
+				*namelen = strlen(*name);
+		}
+		debug_mm("map_user:  to '%.*s' <.*%s>\n", *namelen, *name,
+				 *emaillen, *email);
 		return 1;
 	}
 	debug_mm("map_user:  --\n");
diff --git a/mailmap.h b/mailmap.h
index d5c3664..ed7c93b 100644
--- a/mailmap.h
+++ b/mailmap.h
@@ -4,7 +4,7 @@
 int read_mailmap(struct string_list *map, char **repo_abbrev);
 void clear_mailmap(struct string_list *map);
 
-int map_user(struct string_list *mailmap,
-	     char *email, int maxlen_email, char *name, int maxlen_name);
+int map_user(struct string_list *map,
+			 const char **email, size_t *emaillen, const char **name, size_t *namelen);
 
 #endif
diff --git a/pretty.c b/pretty.c
index 350d1df..dffcade 100644
--- a/pretty.c
+++ b/pretty.c
@@ -593,7 +593,8 @@ char *logmsg_reencode(const struct commit *commit,
 	return out;
 }
 
-static int mailmap_name(char *email, int email_len, char *name, int name_len)
+static int mailmap_name(const char **email, size_t *email_len,
+			const char **name, size_t *name_len)
 {
 	static struct string_list *mail_map;
 	if (!mail_map) {
@@ -610,36 +611,26 @@ static size_t format_person_part(struct strbuf *sb, char part,
 	const int placeholder_len = 2;
 	int tz;
 	unsigned long date = 0;
-	char person_name[1024];
-	char person_mail[1024];
 	struct ident_split s;
-	const char *name_start, *name_end, *mail_start, *mail_end;
+	const char *name, *mail;
+	size_t maillen, namelen;
 
 	if (split_ident_line(&s, msg, len) < 0)
 		goto skip;
 
-	name_start = s.name_begin;
-	name_end = s.name_end;
-	mail_start = s.mail_begin;
-	mail_end = s.mail_end;
-
-	if (part == 'N' || part == 'E') { /* mailmap lookup */
-		snprintf(person_name, sizeof(person_name), "%.*s",
-			 (int)(name_end - name_start), name_start);
-		snprintf(person_mail, sizeof(person_mail), "%.*s",
-			 (int)(mail_end - mail_start), mail_start);
-		mailmap_name(person_mail, sizeof(person_mail), person_name, sizeof(person_name));
-		name_start = person_name;
-		name_end = name_start + strlen(person_name);
-		mail_start = person_mail;
-		mail_end = mail_start +  strlen(person_mail);
-	}
+	name = s.name_begin;
+	namelen = s.name_end - s.name_begin;
+	mail = s.mail_begin;
+	maillen = s.mail_end - s.mail_begin;
+
+	if (part == 'N' || part == 'E') /* mailmap lookup */
+		mailmap_name(&mail, &maillen, &name, &namelen);
 	if (part == 'n' || part == 'N') {	/* name */
-		strbuf_add(sb, name_start, name_end-name_start);
+		strbuf_add(sb, name, namelen);
 		return placeholder_len;
 	}
 	if (part == 'e' || part == 'E') {	/* email */
-		strbuf_add(sb, mail_start, mail_end-mail_start);
+		strbuf_add(sb, mail, maillen);
 		return placeholder_len;
 	}
 
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 05/10] mailmap: add mailmap structure to rev_info and pp
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

From: Antoine Pelisse <apelisse@gmail.com>

Pass a mailmap from rev_info to pretty_print_context to so that the
pretty printer can use rewritten name and email address when showing
commits.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 commit.h   | 1 +
 log-tree.c | 1 +
 revision.h | 1 +
 3 files changed, 3 insertions(+)

diff --git a/commit.h b/commit.h
index b6ad8f3..7f8f987 100644
--- a/commit.h
+++ b/commit.h
@@ -89,6 +89,7 @@ struct pretty_print_context {
 	char *notes_message;
 	struct reflog_walk_info *reflog_info;
 	const char *output_encoding;
+	struct string_list *mailmap;
 };
 
 struct userformat_want {
diff --git a/log-tree.c b/log-tree.c
index 4f86def..92254fd 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -671,6 +671,7 @@ void show_log(struct rev_info *opt)
 	ctx.preserve_subject = opt->preserve_subject;
 	ctx.reflog_info = opt->reflog_info;
 	ctx.fmt = opt->commit_format;
+	ctx.mailmap = opt->mailmap;
 	pretty_print_commit(&ctx, commit, &msgbuf);
 
 	if (opt->add_signoff)
diff --git a/revision.h b/revision.h
index 059bfff..83a79f5 100644
--- a/revision.h
+++ b/revision.h
@@ -143,6 +143,7 @@ struct rev_info {
 	const char	*subject_prefix;
 	int		no_inline;
 	int		show_log_size;
+	struct string_list *mailmap;
 
 	/* Filter by commit log message */
 	struct grep_opt	grep_filter;
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 10/10] log: add log.mailmap configuration option
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

From: Antoine Pelisse <apelisse@gmail.com>

Teach "log.mailmap" configuration variable to turn "--use-mailmap"
option on to "git log", "git show" and "git whatchanged".

The "--no-use-mailmap" option from the command line can countermand
the setting.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |  4 ++++
 builtin/log.c            |  7 +++++++
 t/t4203-mailmap.sh       | 24 ++++++++++++++++++++++++
 3 files changed, 35 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bf8f911..226362a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1509,6 +1509,10 @@ log.showroot::
 	Tools like linkgit:git-log[1] or linkgit:git-whatchanged[1], which
 	normally hide the root commit will now show it. True by default.
 
+log.mailmap::
+	If true, makes linkgit:git-log[1], linkgit:git-show[1], and
+	linkgit:git-whatchanged[1] assume `--use-mailmap`.
+
 mailmap.file::
 	The location of an augmenting mailmap file. The default
 	mailmap, located in the root of the repository, is loaded
diff --git a/builtin/log.c b/builtin/log.c
index d2bd8ce..16e6520 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -31,6 +31,7 @@ static int default_abbrev_commit;
 static int default_show_root = 1;
 static int decoration_style;
 static int decoration_given;
+static int use_mailmap_config;
 static const char *fmt_patch_subject_prefix = "PATCH";
 static const char *fmt_pretty;
 
@@ -106,6 +107,7 @@ static void cmd_log_init_finish(int argc, const char **argv, const char *prefix,
 		OPT_END()
 	};
 
+	mailmap = use_mailmap_config;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_log_options, builtin_log_usage,
 			     PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
@@ -358,6 +360,11 @@ static int git_log_config(const char *var, const char *value, void *cb)
 	}
 	if (!prefixcmp(var, "color.decorate."))
 		return parse_decorate_color_config(var, 15, value);
+	if (!strcmp(var, "log.mailmap")) {
+		use_mailmap_config = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (grep_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index e16187f..7d4d31c 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -255,6 +255,21 @@ test_expect_success 'Log output with --use-mailmap' '
 '
 
 cat >expect <<\EOF
+Author: CTO <cto@company.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Other Author <other@author.xx>
+Author: Other Author <other@author.xx>
+Author: Some Dude <some@dude.xx>
+Author: A U Thor <author@example.com>
+EOF
+
+test_expect_success 'Log output with log.mailmap' '
+	git -c log.mailmap=True log | grep Author >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<\EOF
 Author: Santa Claus <santa.claus@northpole.xx>
 Author: Santa Claus <santa.claus@northpole.xx>
 EOF
@@ -263,6 +278,15 @@ test_expect_success 'Grep author with --use-mailmap' '
 	git log --use-mailmap --author Santa | grep Author >actual &&
 	test_cmp expect actual
 '
+cat >expect <<\EOF
+Author: Santa Claus <santa.claus@northpole.xx>
+Author: Santa Claus <santa.claus@northpole.xx>
+EOF
+
+test_expect_success 'Grep author with log.mailmap' '
+	git -c log.mailmap=True log --author Santa | grep Author >actual &&
+	test_cmp expect actual
+'
 
 >expect
 
-- 
1.8.1.304.gf036638

^ permalink raw reply related

* [PATCH v2 06/10] pretty: use mailmap to display username and email
From: Junio C Hamano @ 2013-01-08  0:10 UTC (permalink / raw)
  To: git; +Cc: Antoine Pelisse
In-Reply-To: <1357603821-8647-1-git-send-email-gitster@pobox.com>

From: Antoine Pelisse <apelisse@gmail.com>

Use the mailmap information to display the rewritten
username and email address in all log commands.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 pretty.c | 58 +++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/pretty.c b/pretty.c
index dffcade..622275c 100644
--- a/pretty.c
+++ b/pretty.c
@@ -387,9 +387,13 @@ void pp_user_info(const struct pretty_print_context *pp,
 		  const char *what, struct strbuf *sb,
 		  const char *line, const char *encoding)
 {
+	struct strbuf name;
+	struct strbuf mail;
 	struct ident_split ident;
-	int linelen, namelen;
+	int linelen;
 	char *line_end, *date;
+	const char *mailbuf, *namebuf;
+	size_t namelen, maillen;
 	int max_length = 78; /* per rfc2822 */
 	unsigned long time;
 	int tz;
@@ -408,42 +412,54 @@ void pp_user_info(const struct pretty_print_context *pp,
 	if (split_ident_line(&ident, line, linelen))
 		return;
 
-	namelen = ident.mail_end - ident.name_begin + 1;
+
+	mailbuf = ident.mail_begin;
+	maillen = ident.mail_end - ident.mail_begin;
+	namebuf = ident.name_begin;
+	namelen = ident.name_end - ident.name_begin;
+
+	if (pp->mailmap)
+		map_user(pp->mailmap, &mailbuf, &maillen, &namebuf, &namelen);
+
+	strbuf_init(&mail, 0);
+	strbuf_init(&name, 0);
+
+	strbuf_add(&mail, mailbuf, maillen);
+	strbuf_add(&name, namebuf, namelen);
+
+	namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */
 	time = strtoul(ident.date_begin, &date, 10);
 	tz = strtol(date, NULL, 10);
 
 	if (pp->fmt == CMIT_FMT_EMAIL) {
-		int display_name_length;
-
-		display_name_length = ident.name_end - ident.name_begin;
-
 		strbuf_addstr(sb, "From: ");
-		if (needs_rfc2047_encoding(line, display_name_length, RFC2047_ADDRESS)) {
-			add_rfc2047(sb, line, display_name_length,
-						encoding, RFC2047_ADDRESS);
+		if (needs_rfc2047_encoding(name.buf, name.len, RFC2047_ADDRESS)) {
+			add_rfc2047(sb, name.buf, name.len,
+				    encoding, RFC2047_ADDRESS);
 			max_length = 76; /* per rfc2047 */
-		} else if (needs_rfc822_quoting(line, display_name_length)) {
+		} else if (needs_rfc822_quoting(name.buf, name.len)) {
 			struct strbuf quoted = STRBUF_INIT;
-			add_rfc822_quoted(&quoted, line, display_name_length);
+			add_rfc822_quoted(&quoted, name.buf, name.len);
 			strbuf_add_wrapped_bytes(sb, quoted.buf, quoted.len,
 							-6, 1, max_length);
 			strbuf_release(&quoted);
 		} else {
-			strbuf_add_wrapped_bytes(sb, line, display_name_length,
-							-6, 1, max_length);
+			strbuf_add_wrapped_bytes(sb, name.buf, name.len,
+						 -6, 1, max_length);
 		}
-		if (namelen - display_name_length + last_line_length(sb) > max_length) {
+		if (namelen - name.len + last_line_length(sb) > max_length)
 			strbuf_addch(sb, '\n');
-			if (!isspace(ident.name_end[0]))
-				strbuf_addch(sb, ' ');
-		}
-		strbuf_add(sb, ident.name_end, namelen - display_name_length);
-		strbuf_addch(sb, '\n');
+
+		strbuf_addf(sb, " <%s>\n", mail.buf);
 	} else {
-		strbuf_addf(sb, "%s: %.*s%.*s\n", what,
+		strbuf_addf(sb, "%s: %.*s%s <%s>\n", what,
 			      (pp->fmt == CMIT_FMT_FULLER) ? 4 : 0,
-			      "    ", namelen, line);
+			      "    ", name.buf, mail.buf);
 	}
+
+	strbuf_release(&mail);
+	strbuf_release(&name);
+
 	switch (pp->fmt) {
 	case CMIT_FMT_MEDIUM:
 		strbuf_addf(sb, "Date:   %s\n", show_date(time, tz, pp->date_mode));
-- 
1.8.1.304.gf036638

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox