* Parsing diff --git lines
From: Simon Fraser @ 2008-03-09 1:48 UTC
To: git

I'm working on a GUI for git, and I'd like to be able to provide
some diff navigation tools. That requires that I can find the
file chunks in a diff, and parse out the file names.

However, I don't see a reliable way to identify the two files
from a "diff --git" line. Here's a (deliberately pathological)
example:

diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt

In this case, the repository contains directories called "a " and
" b" and the file names have spaces in.

What would make this possible would be either to always quote
file paths containing spaces, or use a character other than
a space (e.g. a \t) between the two file names.

Thanks
Simon

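To make the ambiguity concrete, here is a small Python sketch that splits the example line at every " b/" boundary. The helper name candidate_splits is hypothetical, and the sketch assumes the default a/ and b/ prefixes and unquoted paths.

```python
def candidate_splits(line):
    """Return every (old, new) pair obtained by cutting the line at a ' b/'."""
    prefix = "diff --git "
    if not line.startswith(prefix):
        raise ValueError("not a 'diff --git' line")
    rest = line[len(prefix):]
    pairs = []
    start = 0
    while True:
        idx = rest.find(" b/", start)
        if idx == -1:
            break
        # Cut at the space: left side is the old name, right side the new one.
        pairs.append((rest[:idx], rest[idx + 1:]))
        start = idx + 1
    return pairs


line = "diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt"
for old, new in candidate_splits(line):
    print(repr(old), repr(new))
```

For the pathological line this yields three candidate pairs, and a parser that only looks for the separator has no way to pick among them.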
* Re: Parsing diff --git lines
From: Linus Torvalds @ 2008-03-09 4:04 UTC
To: Simon Fraser; +Cc: git

On Sat, 8 Mar 2008, Simon Fraser wrote:
>
> However, I don't see a reliable way to identify the two files
> from a "diff --git" line. Here's a (deliberately pathological)
> example:

See how "git-apply" does it.

The rule is:

 - if the filenames are different, you should ignore the filenames on the
   "diff --git" line, and trust the ones on the "renamed from/to" ones
   (which are unambiguous because they only have one filename per line)

 - if the filenames aren't different, then you can unambiguously know how
   to parse it by simply making sure they are the same.

So to take your example:

> diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt

Here, you can *know* that the filename is "a / b/file with spaces.txt",
because it must match the pattern "a/$filename b/$filename", and no other
split at a space would ever do that!

See in particular "git_header_name()" in builtin-apply.c. See the comments
both at the top of the function and there in the middle to reflect the
above rule:

	/*
	 * This is to extract the same name that appears on "diff --git"
	 * line.  We do not find and return anything if it is a rename
	 * patch, and it is OK because we will find the name elsewhere.
	 * We need to reliably find name only when it is mode-change only,
	 * creation or deletion of an empty file.  In any of these cases,
	 * both sides are the same name under a/ and b/ respectively.
	 */
	...
	/*
	 * Accept a name only if it shows up twice, exactly the same
	 * form.
	 */

> What would make this possible would be either to always quote
> file paths containing spaces, or use a character other than
> a space (e.g. a \t) between the two file names.

Well, git didn't originally ever quote filenames at all, and I actually
wanted the "diff --git" line to look as much like a regular "diff -urN"
line as possible (which has spaces between the names).

These days, we could quote, but hey, anybody who parses diff lines needs
to be able to handle the non-quoted form *anyway*, so quoting doesn't
really help anybody in the end.

		Linus

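The two-part rule described in this message can be sketched roughly as follows. This is an illustration in Python, not the code of git_header_name() in builtin-apply.c; the function names same_name_from_git_header and paths_from_header_block are invented for the example. It assumes the default a/ and b/ prefixes and unquoted paths, and it looks for the extended header lines in the form git emits them, "rename from <path>" and "rename to <path>".

```python
def same_name_from_git_header(line):
    """Return <p> if the line is exactly 'diff --git a/<p> b/<p>', else None."""
    prefix = "diff --git "
    if not line.startswith(prefix):
        return None
    rest = line[len(prefix):]
    # "a/" + p + " b/" + p has length 2 * len(p) + 5, so len(p) is forced.
    extra = len(rest) - 5
    if extra < 2 or extra % 2:
        return None
    n = extra // 2
    old, sep, new = rest[:2 + n], rest[2 + n], rest[3 + n:]
    # Accept the name only if it shows up twice in exactly the same form.
    if (sep == " " and old.startswith("a/") and new.startswith("b/")
            and old[2:] == new[2:]):
        return old[2:]
    return None


def paths_from_header_block(lines):
    """lines[0] is the 'diff --git' line, lines[1:] the header lines after it.

    Returns (old_path, new_path), or None if neither rule applies.
    """
    old = new = None
    for entry in lines[1:]:
        if entry.startswith("rename from "):
            old = entry[len("rename from "):]
        elif entry.startswith("rename to "):
            new = entry[len("rename to "):]
    # Names differ: trust the one-name-per-line rename headers.
    if old is not None and new is not None:
        return old, new
    # Names are the same: the "diff --git" line must be a/<p> b/<p>.
    name = same_name_from_git_header(lines[0])
    return (name, name) if name is not None else None


line = "diff --git a/a / b/file with spaces.txt b/a / b/file with spaces.txt"
print(same_name_from_git_header(line))
```

Because both halves must be identical, the length of the name is fixed by the length of the line, so only one split has to be checked. The sketch consults the rename lines first, since a sufficiently odd rename could also happen to fit the same-name pattern.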
* Re: Parsing diff --git lines
From: Jakub Narebski @ 2008-03-09 8:25 UTC
To: Linus Torvalds; +Cc: Simon Fraser, git

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sat, 8 Mar 2008, Simon Fraser wrote:
> >
> > However, I don't see a reliable way to identify the two files
> > from a "diff --git" line. Here's a (deliberately pathological)
> > example:
>
> See how "git-apply" does it.
>
> The rule is:
>
>  - if the filenames are different, you should ignore the filenames on the
>    "diff --git" line, and trust the ones on the "renamed from/to" ones
>    (which are unambiguous because they only have one filename per line)
>
>  - if the filenames aren't different, then you can unambiguously know how
>    to parse it by simply making sure they are the same.

By the way, the default pre-commit hook behaves a bit strangely on
files which contain spaces[*1*] in filename (due to GNU-diff-uism):

  *
  * You have some suspicious patch lines:
  *
  * In b
  * trailing whitespace (line 1)
  b:1:++ b/ b

Foootnote:
==========
[*1*] If file has '"', '\' or control character in filename, it is quoted.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

* Re: Parsing diff --git lines
From: Johannes Schindelin @ 2008-03-09 20:45 UTC
To: Jakub Narebski; +Cc: git

Hi,

On Sun, 9 Mar 2008, Jakub Narebski wrote:

> Foootnote:

Alternatively, you can write "Bigfootnote" or "Sasquatchnote".

Ciao,
Dscho