git.vger.kernel.org archive mirror
From: Junio C Hamano <gitster@pobox.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: David Kastrup <dak@gnu.org>, git@vger.kernel.org
Subject: Re: Stupid quoting...
Date: Thu, 14 Jun 2007 01:49:27 -0700
Message-ID: <7vlkemapk8.fsf@assigned-by-dhcp.pobox.com>
In-Reply-To: <Pine.LNX.4.64.0706131316390.4059@racer.site> (Johannes Schindelin's message of "Wed, 13 Jun 2007 13:21:12 +0100 (BST)")

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Wed, 13 Jun 2007, David Kastrup wrote:
>
>> what is the point in quoting file names and their characters in
>> git-diff's output?  And what is the recommended way of undoing the
>> damage?
>
> The recommended way is not using spaces to begin with. I mean, does 
> "David" contain spaces? People seem not to see the problem, and fail to 
> blame Microsoft for all the damage they have done, introducing that 
> stupid, stupid concept of filenames containing spaces, and _enforcing_ it.

Why are you talking about spaces ;-)?

There are a few things to note, but the first thing is that mere
spaces do not trigger quoting.  A tab (HT) does, so do non ASCII
characters.  The second thing is that we do this quoting for
various good reasons, and it is not likely to change.
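
To make the first point concrete, here is a minimal Python sketch
of a "does this pathname get quoted?" check.  It encodes only what
is stated above (HT and non ASCII bytes trigger quoting, SP does
not).  Treating the remaining control bytes, the double quote and
the backslash the same way is an assumption, and the name
needs_quoting is made up for illustration; it is not taken from
git's source.

```python
def needs_quoting(path: bytes) -> bool:
    """Return True if this sketch would quote the pathname in textual output."""
    for byte in path:
        if byte < 0x20 or byte == 0x7F:
            # Control characters, including HT (0x09).
            return True
        if byte >= 0x80:
            # Any non ASCII byte, even inside valid UTF-8.
            return True
        if byte in b'"\\':
            # Assumption: the quote and escape characters themselves.
            return True
    # Plain SP (0x20) falls through and is left alone.
    return False


if __name__ == "__main__":
    for name in (b"my file.txt", b"tab\there.txt", "caf\u00e9.txt".encode("utf-8")):
        print(name, needs_quoting(name))
```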

As Alex mentions, the safest way for programs to read is to
read from the -z format.  However, even if you are capable of
doing so, it may be inconvenient in some languages (mainstream
languages like C and Perl are not among them).  Not quoting SP
is a conscious decision, as SP in filenames is rather common,
more common than non ASCII and much more common than HT.
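
As a rough illustration of reading the -z format, the Python
sketch below splits "ls-files -s -z" style output on NUL and takes
the pathname verbatim, with no unquoting step.  It assumes each
record looks like "<mode> SP <object> SP <stage> HT <path> NUL";
IndexEntry and parse_ls_files_z are hypothetical names used only
here.

```python
from typing import Iterator, NamedTuple


class IndexEntry(NamedTuple):
    mode: str
    object_id: str
    stage: int
    path: bytes  # kept as raw bytes; the encoding is the caller's business


def parse_ls_files_z(data: bytes) -> Iterator[IndexEntry]:
    """Parse NUL-terminated records in the assumed `ls-files -s -z` layout."""
    for record in data.split(b"\0"):
        if not record:
            # The trailing NUL leaves one empty piece after the split.
            continue
        # Only the first HT separates metadata from the pathname;
        # any later HT belongs to the name itself.
        meta, path = record.split(b"\t", 1)
        mode, object_id, stage = meta.decode("ascii").split(" ")
        yield IndexEntry(mode, object_id, int(stage), path)


if __name__ == "__main__":
    sample = (
        b"100644 " + b"a" * 40 + b" 0\tdir/my file.txt\0"
        b"100644 " + b"b" * 40 + b" 0\tweird\tname\0"
    )
    for entry in parse_ls_files_z(sample):
        print(entry.mode, entry.stage, entry.path)
```

In real use the bytes would come from running the command, for
example via subprocess.run(["git", "ls-files", "-s", "-z"],
capture_output=True, check=True).stdout.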

The "raw" formats "ls-files -s", "ls-tree" and "diff --raw"
produce are designed to put names at the end, and typically
delimited with a HT, so that "lazy" scripts can use cut (whose
default delimiter is a HT) to pick out pieces from its output.
And plumbing tools reading from the standard input (most
notably, "update-index --stdin") know how to unquote them.  In
practice, not many people use non ASCII in pathnames and expect
them to work sanely for everybody, so loosely written scripts, as
long as they cut at HT to pick out the pathname part, "mostly"
work.  (I think traditional core git scripts are safe; I suspect
some contributed ones shipped with git core may not be.  Cogito
used to be very unsafe, but it was audited and became much safer
before it got discontinued.)
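
For a script that does read the textual (non -z) output, the
"cut at HT, then unquote" approach could look roughly like the
Python sketch below.  It assumes the quoted form is a C-style
string in double quotes with backslash escapes and three-digit
octal for other bytes, and it only handles records with a single
pathname after the first HT (not, say, a rename line carrying two
names).  unquote_path and pathname_from_line are hypothetical
helpers, not git functions.

```python
import re

# Single-character escapes assumed to follow C string conventions.
_SIMPLE_ESCAPES = {
    b"a": b"\a", b"b": b"\b", b"f": b"\f", b"n": b"\n",
    b"r": b"\r", b"t": b"\t", b"v": b"\v", b'"': b'"', b"\\": b"\\",
}
# A backslash followed by three octal digits (value <= 0o377) or one escape letter.
_ESCAPE = re.compile(rb'\\([0-3][0-7]{2}|[abfnrtv"\\])')


def unquote_path(field: bytes) -> bytes:
    """Undo C-style quoting; return the field unchanged if it is not quoted."""
    if len(field) < 2 or not (field.startswith(b'"') and field.endswith(b'"')):
        return field

    def replace(match: "re.Match[bytes]") -> bytes:
        body = match.group(1)
        if len(body) == 3:
            return bytes([int(body, 8)])  # octal escape -> one raw byte
        return _SIMPLE_ESCAPES[body]

    return _ESCAPE.sub(replace, field[1:-1])


def pathname_from_line(line: bytes) -> bytes:
    """Take everything after the first HT, as `cut -f2-` would, and unquote it."""
    _, _, field = line.partition(b"\t")
    return unquote_path(field)


if __name__ == "__main__":
    line = b"100644 blob " + b"a" * 40 + b'\t"caf\\303\\251.txt"'
    print(pathname_from_line(line).decode("utf-8"))
```

The result is kept as bytes on purpose: which encoding the
pathname is in is a separate question from undoing the quoting.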

The pathname quoting rules in textual output were chosen
primarily to make diff output safer, as one of the most
important workflows git supports is e-mailable patches.

GNU patch treats HT on "+++ name"/"--- name" lines as the end of
name (and after HT comes timestamp), but the timestamp part is
treated as optional, which introduces ambiguities and confusion.
The issue was discussed some time ago (check the list archive
for discussion among me, Linus and Paul Eggert, the GNU diff
and patch maintainer) and the quoting rules we use now are
consistent with what diff and patch plan to use.  The update
on the GNU side may or may not have happened already.

When a patch appears in an e-mail, you would need to be aware
that not everybody has the luxury of living in a UTF-8 only world.
Your commit message and cover letter may be in one encoding, the
pathnames that appear in diff headers may be in your filesystem
encoding, and the patch text that appears as the diff payload may
be in another document specific encoding.  All three could be
different (worse, a patch that touches more than one file can
carry different encodings in the payload part), and mixing
character sets in a single piece of e-mail confuses people's MUA
and tends to mangle messages.  Quoting non ASCII characters in
pathnames, even if they are perfectly valid and ordinary UTF-8
strings, is to eliminate one element in the above three as a
possible source of worries.
