* Parsing diff --git lines
From: Simon Fraser @ 2008-03-09 1:48 UTC (permalink / raw)
To: git
I'm working on a GUI for git, and I'd like to be able to provide
some diff navigation tools. That requires that I can find the
file chunks in a diff, and parse out the file names.
However, I don't see a reliable way to identify the two files
from a "diff --git" line. Here's a (deliberately pathological)
example:
diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt
In this case, the repository contains directories called "a " and
" b" and the file names have spaces in.
What would make this possible would be either to always quote
file paths containing spaces, or use a character other than
a space (e.g. a \t) between the two file names.
Thanks
Simon
* Re: Parsing diff --git lines
From: Linus Torvalds @ 2008-03-09 4:04 UTC (permalink / raw)
To: Simon Fraser; +Cc: git
On Sat, 8 Mar 2008, Simon Fraser wrote:
>
> However, I don't see a reliable way to identify the two files
> from a "diff --git" line. Here's a (deliberately pathological)
> example:
See how "git-apply" does it.
The rule is:
- if the filenames are different, you should ignore the filenames on the
"diff --git" line, and trust the ones on the "rename from"/"rename to" lines
(which are unambiguous because they only have one filename per line)
- if the filenames aren't different, then you can unambiguously know how
to parse it by simply making sure they are the same.
So to take your example:
> diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt
Here, you can *know* that the filename is "a / b/file with spaces.txt",
because it must match the pattern "a/$filename b/$filename", and no other
split at a space would ever do that!
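
A minimal Python sketch of the same-name case described above, for illustration only. It is not git's implementation: the function name split_git_header is hypothetical, and the sketch assumes the default a/ and b/ prefixes and unquoted names. It tries each space as the separator and accepts a split only when both sides carry the same name.

```python
def split_git_header(line):
    """Return the shared path from a 'diff --git a/X b/X' line, or None.

    Hypothetical helper: assumes default a/ and b/ prefixes and
    unquoted names. Returns None when the two names differ (for
    example a rename), in which case the name must come from elsewhere.
    """
    prefix = "diff --git "
    if not line.startswith(prefix):
        return None
    rest = line[len(prefix):].rstrip("\n")
    # Try every space as the separator between the two names.
    for i, ch in enumerate(rest):
        if ch != " ":
            continue
        old, new = rest[:i], rest[i + 1:]
        # Accept only "a/$filename b/$filename" with identical $filename.
        if old.startswith("a/") and new.startswith("b/") and old[2:] == new[2:]:
            return old[2:]
    return None


if __name__ == "__main__":
    header = "diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt"
    print(repr(split_git_header(header)))
```

Because both halves must be identical apart from the one-letter prefix, at most one space can sit at the midpoint, so the loop can never find two different valid splits.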
See in particular "git_header_name()" in builtin-apply.c. The comments
at the top of the function and in the middle of it reflect the
above rule:
/*
* This is to extract the same name that appears on "diff --git"
* line. We do not find and return anything if it is a rename
* patch, and it is OK because we will find the name elsewhere.
* We need to reliably find name only when it is mode-change only,
* creation or deletion of an empty file. In any of these cases,
* both sides are the same name under a/ and b/ respectively.
*/
...
/*
* Accept a name only if it shows up twice, exactly the same
* form.
*/
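
For the rename case, where the comment above says the name is found elsewhere, here is a rough Python sketch of reading the one-name-per-line extended headers instead. The function name moved_names is hypothetical. The sketch assumes the caller passes only the header lines of a single file's diff, and that the names are unquoted. In git's output these lines read "rename from"/"rename to" (and "copy from"/"copy to" for copies).

```python
def moved_names(header_lines):
    """Return (old, new) names from rename/copy extended header lines.

    Hypothetical helper: either element is None if the matching line
    is absent. Quoted names are not handled in this sketch.
    """
    old = new = None
    for raw in header_lines:
        line = raw.rstrip("\n")
        for kind in ("rename", "copy"):
            if line.startswith(kind + " from "):
                # Everything after "<kind> from " is the name, spaces included.
                old = line[len(kind) + len(" from "):]
            elif line.startswith(kind + " to "):
                new = line[len(kind) + len(" to "):]
    return old, new


if __name__ == "__main__":
    headers = [
        "diff --git a/old name.txt b/new name.txt",
        "similarity index 100%",
        "rename from old name.txt",
        "rename to new name.txt",
    ]
    print(moved_names(headers))
```

Since each of these lines carries exactly one name, spaces inside the name need no disambiguation.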
> What would make this possible would be either to always quote
> file paths containing spaces, or use a character other than
> a space (e.g. a \t) between the two file names.
Well, git didn't originally ever quote filenames at all, and I actually
wanted the "diff --git" line to look as much like a regular "diff -urN"
line as possible (which has spaces between the names).
These days, we could quote, but hey, anybody who parses diff lines needs
to be able to handle the non-quoted form *anyway*, so quoting doesn't
really help anybody in the end.
Linus
* Re: Parsing diff --git lines
From: Jakub Narebski @ 2008-03-09 8:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Simon Fraser, git
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sat, 8 Mar 2008, Simon Fraser wrote:
> >
> > However, I don't see a reliable way to identify the two files
> > from a "diff --git" line. Here's a (deliberately pathological)
> > example:
>
> See how "git-apply" does it.
>
> The rule is:
>
> - if the filenames are different, you should ignore the filenames on the
> "diff --git" line, and trust the ones on the "rename from"/"rename to" lines
> (which are unambiguous because they only have one filename per line)
>
> - if the filenames aren't different, then you can unambiguously know how
> to parse it by simply making sure they are the same.
By the way, the default pre-commit hook behaves a bit strangely on
files which contain spaces[*1*] in filename (due to GNU-diff-uism):
*
* You have some suspicious patch lines:
*
* In b
* trailing whitespace (line 1)
b:1:++ b/ b
Foootnote:
==========
[*1*] If a filename contains '"', '\' or a control character,
it is quoted.
--
Jakub Narebski
Poland
ShadeHawk on #git
* Re: Parsing diff --git lines
From: Johannes Schindelin @ 2008-03-09 20:45 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Hi,
On Sun, 9 Mar 2008, Jakub Narebski wrote:
> Foootnote:
Alternatively, you can write "Bigfootnote" or "Sasquatchnote".
Ciao,
Dscho