From: Jakub Narebski <jnareb@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH (amend)] diff: Make numstat machine friendly also for renames (and copies)
Date: Wed, 12 Dec 2007 00:09:35 +0100 [thread overview]
Message-ID: <200712120009.36560.jnareb@gmail.com> (raw)
In-Reply-To: <7vprxehrlv.fsf@gitster.siamese.dyndns.org>
On Wed, 11 Dec 2007, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>> Junio C Hamano wrote:
>>> Jakub Narebski <jnareb@gmail.com> writes:
>>>
>>>> Previous version of this patch (from 7 May 2007) used instead of current
>>>> only "to_name" format similar to git-diff-tree raw format for renames:
>>>>
>>>> added deleted TAB path for "src" TAB path for "dst" LF
>
> That's perfectly fine (that is without -z).
>
>>>> ... Previous version
>>>> of patch used (or was to use actually, because of error in the code)
>>>>
>>>> added deleted TAB path for "src" NUL NUL path for "dst" NUL
>>>>
>>>> when -z option was used.
>
> I think your "previous" one:
>
> <added> <deleted> <src> NUL (no rename)
> <added> <deleted> <src> NUL NUL <dst> NUL (with rename)
>
> would be unambiguously parsable, but it would be cleaner and easier for
> the parser if the latter were like this instead:
>
> <added> <deleted> NUL <src> NUL <dst> NUL (with rename)
>
> The reader can expect to find a NUL terminated src, finds the length is
> zero and notices it cannot be src and then reads two paths from that
> point.
Very good idea.
> Without having the status letter, we cannot do "optional two paths"
> without breaking existing --numstat -z readers that do not care about
> renames and copies, and I agree that existing --numstat -z readers that
> pass -M or -C are already broken, so I think a format extension along
> the above line (your "previous" and the above clean-up proposal have the
> same power of expressiveness, I just think the latter is syntactically
> cleaner) would be reasonable. And at that point we probably would not
> need --numstat-enhanced at all ;-)
The previous patch is a fix (usability fix) for numstat output in the
presence of rename detection, making it truly machine friendly. Moreover
it should not break any scripts which parsed numstat output not
expecting to deal with renames (for example if repository has
diff.renames config option set), and might even fix them.
The proposed - and implemented in patch below - change could break some
scripts not expecting to deal with new numstat output. Adding status
letter (with similarity/dissimilarity percentage info) would
unfortunately require greater surgery... We can either allow scripts
(do there exist any?) to break, or make the full rename info shown
only for --numstat-extended; but I'd like to have this status letter
for --numstat-extended / --numstat-enhanced.
So possible solution are:
(a) Accept both patches, and accept remote possibility that some
scripts might break
(b) Add --numstat-extended option whoich would show rename info
as implemented in this patch (and of course accept previous)
(c) Implement --numstat-extended with status letter, as suggested,
accept first patch but not this one.
Note that gitweb would require --numstat with proper rename detection
support if we want to e.g. provide graphical diffstat in 'commit' view
for merge commits.
P.S. The numstat output format for renames should be probably described
in documentation, and not only in commit message, but I was not sure
where to put it (and even how it should be written).
-- >8 --
From: Jakub Narebski <jnareb@gmail.com>
Date: Tue, 11 Dec 2007 23:52:18 +0100
Subject: [RFC/PATCH] diff: Show rename info in numstat output, correctly
Make numstat output of git-diff show both source and destination
filename for renames (and copies) in easily parseable format, similar
to raw diff output format.
The numstat format for rename (or copy) is now
<added> <deleted> TAB <path for "src"> TAB <path for "dst"> LF
or if -z option is used
<added> <deleted> TAB NULL <path for "src"> NULL <path for "dst"> NULL
where <added> and <deleted> is number of added/deleted lines
(or '-' for binary file).
When -z option is not used, TAB, LF, and backslash characters in
pathnames are represented as \t, \n, and \\, respectively.
Idea for -z numstat rename format by Junio C Hamano.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
diff.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/diff.c b/diff.c
index 8039ac7..56cbf28 100644
--- a/diff.c
+++ b/diff.c
@@ -991,7 +991,14 @@ static void show_numstat(struct diffstat_t* data, struct diff_options *options)
printf("-\t-\t");
else
printf("%d\t%d\t", file->added, file->deleted);
- write_name_quoted(file->name, stdout, options->line_termination);
+ if (file->is_renamed) {
+ char sep = options->line_termination ? '\t' : '\0';
+ if (!options->line_termination)
+ putchar(options->line_termination);
+ write_name_quoted(file->from_name, stdout, sep);
+ write_name_quoted(file->name, stdout, options->line_termination);
+ } else
+ write_name_quoted(file->name, stdout, options->line_termination);
}
}
--
1.5.3.7
next prev parent reply other threads:[~2007-12-11 23:10 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-12-10 22:32 [PATCH] diff: Make numstat machine friendly also for renames (and copies) Jakub Narebski
2007-12-10 22:55 ` [PATCH (amend)] " Jakub Narebski
2007-12-11 1:06 ` Junio C Hamano
2007-12-11 1:26 ` Jakub Narebski
2007-12-11 2:50 ` Junio C Hamano
2007-12-11 23:09 ` Jakub Narebski [this message]
2007-12-12 7:46 ` Junio C Hamano
2007-12-12 10:21 ` Jakub Narebski
2007-12-12 10:31 ` Jakub Narebski
2007-12-12 19:07 ` Junio C Hamano
2007-12-10 23:00 ` [PATCH] " Junio C Hamano
2007-12-10 23:14 ` Jakub Narebski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200712120009.36560.jnareb@gmail.com \
--to=jnareb@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).