git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Gruenbacher <agruen@suse.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: Filename quoting / parsing problem
Date: Sat, 2 Jan 2010 21:48:47 +0100	[thread overview]
Message-ID: <201001022148.47841.agruen@suse.de> (raw)
In-Reply-To: <7v8wcge4kr.fsf@alter.siamese.dyndns.org>

On Saturday 02 January 2010 07:37:08 pm Junio C Hamano wrote:
> Andreas Gruenbacher <agruen@suse.de> writes:
> > On Friday 01 January 2010 09:01:19 pm Junio C Hamano wrote:
> >> > Both "b file" and "c file " are parsed by "git apply" perfectly fine.
> >
> > Right, the "diff --git" lines are technically still parseable when the
> > file name stays the same.  With renames, lines like "diff --git a/f a/f
> > b/f" or "diff --git a/f b/f b/f" are possible, but then there will also
> > be "renamed from" and "renamed to" headers which will disambiguate
> > things.  Still, it doesn't seem like a good idea to allow such
> > ambiguities in the first place.
> 
> You already realized that there is no ambiguity because "diff --git" lines
> are parsable and renames have explicit names.  Why do you still maintain
> that we are allowing such "ambiguities" when there is none?

Don't get so aroused ...

Right now, git generates lines like "diff --git a/f a/f b/f b/f" in some 
corner cases, and from such lines alone, it is not possible to tell what the 
two file names are (either "a/f a/f" and "b/f b/f", or "a/f a/f b/f" and 
"b/f").  I can only find that out by looking at the other header lines.

I would prefer a format which I can parse line by line without ambiguities in 
the first place, because this would keep things much simpler and easier to 
debug.  (Think of other implementations of the extended diff format which may 
not produce the exact same output as git.)

So I would be happy with either of this:

  * Also quote spaces in the "diff --git" line so that I can always reliably
    parse it, or

  * Add an additional extended header line with the file name in case there
    are no other header lines giving the file names away already (as for
    renames, copies, or when there are "---" and "+++" lines).

After our discussion so far, option two would probably be easier for everyone: 
you could add it without risking to break anything, and I could avoid parsing 
the "diff --git" line altogether.

> >> Having said all that, I don't think we would mind a change to treat a
> >> pathname with trailing SP a bit specially (iow, quoting "c file " in the
> >> above failed attempt to reproduce the issue).
> >
> > I would prefer quoting file names which contain spaces anywhere,...
> 
> The only reason I said I don't think we would mind changing the trailing
> SP case is because the reduced risk of getting our patches corrupted by
> MUA _might_ outweigh the benefit of not quoting to avoid an eyesore [*1*].
> 
> But what you said would add to eyesore of quoted names (which you omitted
> from your quote) without any justification other than "I would prefer".
> The pros-and-cons in such a change is quite different; as we have already
> established that there is no ambiguity, "disambuguation" is not a "pro" in
> this comparison.
>
> [Footnote]
> 
> *1* Strictly speaking, it is not just "an eyesore" that is an issue.  Our
> diff output without renames are designed to be grokkable by other people's
> patch implementations (e.g. GNU patch), and the quoted pathnames are not
> understandable by them.

GNU patch doesn't look at "diff --git" lines or extended header lines at all 
so far, so there are no compatibility issues yet.  Quoting spaces in "---" and 
"+++" lines would lead to problems with current GNU patch though.  (So does 
the quoting of several other characters like ", of course.)

> Even though our final version of quoted path format came from the GNU
> diff/patch maintainer (back then, at least):
> 
>     http://article.gmane.org/gmane.comp.version-control.git/10103
> 
> I don't think it happened in the GNU land yet, and you would be the person
> to know about it ;-).

I'm working on it ...

Thanks,
Andreas

  reply	other threads:[~2010-01-02 20:49 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-01 17:44 Filename quoting / parsing problem Andreas Gruenbacher
2010-01-01 19:50 ` Junio C Hamano
2010-01-01 20:01   ` Junio C Hamano
2010-01-02 11:36     ` Andreas Gruenbacher
2010-01-02 18:37       ` Junio C Hamano
2010-01-02 20:48         ` Andreas Gruenbacher [this message]
2010-01-06  0:04           ` Andreas Gruenbacher
2010-01-06  1:32             ` Junio C Hamano
2010-01-06  1:08           ` Junio C Hamano
2010-01-06 10:06             ` Andreas Schwab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201001022148.47841.agruen@suse.de \
    --to=agruen@suse.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).