git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Torsten Bögershausen" <tboegi@web.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH][RFC] git on Mac OS and precomposed unicode
Date: Mon, 09 Jan 2012 11:29:58 -0800	[thread overview]
Message-ID: <7vty44eksp.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <4F0B196B.8010904@web.de> ("Torsten Bögershausen"'s message of "Mon, 09 Jan 2012 17:44:27 +0100")

Torsten Bögershausen <tboegi@web.de> writes:

> On 08.01.12 03:46, Junio C Hamano wrote:
> ...
>> That also sounds sensible, but...
>> 
>>> This is done in git.c by calling argv_precompose() for all commands
>>> except "git commit".
>> 
>> ... I think it generally is a bad idea to say "all except foo". There may
>> be a reason why "foo" happens to be special in today's code, but who says
>> there won't be another command "bar" that shares the same reason with
>> "foo" to be treated specially? Or depending on the options, perhaps some
>> codepath of "foo" may not want the special casing and want to go through
>> the argv_precompose(), no?
>> 
>> After all, "git commit -- pathspec" will have to get the pathspec from the
>> command line,...
>
> Thanks Junio for catching this.
> I added a new test case as well as fixed the code.

I think you are sidestepping the real issue I raised, which is:

    What is the reason why you do not want to feed the precompose helper
    with some arguments to 'git commit', while it is OK to pass all
    arguments to other commands through precomposition?

I admit it was my fault that I did not spell it out clearly in my
response.

I understand that arguments other than pathspec and revs could be left in
decomposed form, but is there any harm in canonicalizing any and all
command line parameters given in decomposed form consistently into
precomposed form? What problem are you trying to solve by special casing
"git commit"? That is the real question to be answered, as there may be
other commands some of whose arguments may not want to be canonicalized
due to the same reason, but you simply overlooked them. When other people
need to fix that oversight, they need a clearly written criterion what
kind of arguments should not be fixed and why.

And the reason cannot be a desire to pass the value to "--message"
argument intact [*1*]; it is not like osx cannot handle text in
precomposed form, right?

In general, I do not want to see ugly code that says "this one potentially
names a path so we add a call to fix it from decomposed form, but that
other one is not a path and we take it intact" sprinkled all over in the
codebase, without a good reason.

It may seem that one alternative to munging argv[] is to have the
precomposition [*2*] applied inside get_pathspec() and have it take effect
only on the pathspecs, which after all ought to be the only place where
this matters, but I doubt it would result in maintainable code. The names
of branches and tags taken from the command line that are used as revision
names will also be compared with results from readdir in $GIT_DIR/refs/
and need to be canonicalized, for example. So I tend to agree with your
"brute force" approach to canonicalize argv[] before any real part of git
sees them; that is where my suggestion to wrap main() to do so came from.

Also some commands (e.g. "rev-list --stdin") take pathspecs and revs from
their standard input stream, so you would need to be careful about them.


[Footnotes]

*1* Also as other commands like "git merge" also take textual message, and
you do pass the helper to canonicalize it.  No, I am not suggesting you to
special case "git merge".

*2* By the way, this may need a better name if the patch touches anywhere
outside compat/osx --- it is about "canonicalize pathname and pathspec
given from the command line into the form used internally by git", and
from an osx person's point of view, the only difference might be
decomposed vs precomposed, but on other odd systems it might be that
pathnames on the filesystem may be using a different encoding from what is
used for pathnames in the index).

  reply	other threads:[~2012-01-09 19:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-07 19:59 [PATCH][RFC] git on Mac OS and precomposed unicode Torsten Bögershausen
2012-01-08  2:46 ` Junio C Hamano
2012-01-09 16:44   ` Torsten Bögershausen
2012-01-09 19:29     ` Junio C Hamano [this message]
2012-01-09 20:47       ` Torsten Bögershausen
  -- strict thread matches above, loose matches on Subject: below --
2012-01-07 19:59 Torsten Bögershausen
2012-01-08  6:01 ` Miles Bader
2012-01-09 16:42   ` Torsten Bögershausen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vty44eksp.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).