From: Liu Yubao <yubao.liu@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Does GIT require property like Subversion?
Date: Mon, 09 Oct 2006 10:53:59 +0800
Message-ID: <4529B9C7.7000000@gmail.com>
In-Reply-To: <200610081752.10940.robin.rosenberg.lists@dewire.com>

Sorry, I forgot to reply all.

Robin Rosenberg wrote:
> On Sunday 08 October 2006 12:16, Jakub Narebski wrote:
>> File content encoding is something (if it is outside US-ASCII of course)
>> that you would want either to have some default convention, or have it
>> embedded in the file itself (like XML, HTML, or Emacs' file variables)
>> to be able to read file _outside_ SCM.
> Except for CR/LF, this is best solved outside of the SCM. There aren't that
> many tools/users to warrant the complexity or performance hit I imagine to
> solve it.
> 
>> Path name encoding is something that is global property of a repository,
>> I think. We have i18n.commitEncoding configuration variable; we could
>> add i18n.pathnameEncoding quite easily I think (and some way for Git to
>> detect current filesystem pathname encoding, if possible). Although
>> BTW I think that i18n.commitEncoding information should be made persistent,
>> and copied when cloning repository.
> 
> *I* think git should use UTF-8 internally. Always. Clients could then have
> the option to convert to local conventions.
> 
> Same for pathname. Internally all paths should be UTF-8 encoded. Encoding 
> commit info that way would make the i18n option obsolete also.
> 
I'm afraid converting file content to UTF-8 is not a good idea: GIT also
manages non-text files, and it is not safe for a VCS to silently modify
file content.
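
To illustrate the risk, here is a minimal Python sketch (not git code). The
blob is a hypothetical example, the 8-byte PNG signature, and the assumption
is a tool that blindly treats content as ISO-8859-1 text and stores it as
UTF-8:

```python
# Hypothetical blob: the 8-byte PNG signature, which is binary, not text.
blob = b"\x89PNG\r\n\x1a\n"

# Assumed (hypothetical) conversion: read as ISO-8859-1, store as UTF-8.
converted = blob.decode("iso-8859-1").encode("utf-8")

# The stored bytes no longer match the original file.
print(blob == converted)
print(len(blob), len(converted))

# The blob is not valid UTF-8 either, so it cannot simply be left "as is"
# by a tool that insists on decoding everything.
try:
    blob.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print(is_valid_utf8)
```

The byte 0x89 becomes a two-byte UTF-8 sequence, so the stored content
differs from what the user committed.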

But I agree with using UTF-8 for path names in tree objects, or alternatively
adding an encoding property (not a user-defined property) to the header of the
tree object, so that GIT won't do a useless enc -> UTF-8 -> same_enc
conversion. The second approach has a flaw: two tree objects with the same
content in different encodings have different SHA1 digests.
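
A minimal Python sketch of that flaw, assuming a tree with a single entry
for an empty file. The helper names (git_object_id, tree_id_for) and the
file name are hypothetical. The proposed encoding header is left out: the
path name bytes alone are enough to change the digest.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 of a git object: '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_id_for(name: str, encoding: str, blob_hex: str) -> str:
    """Id of a tree with one regular-file entry whose name uses `encoding`."""
    # A tree entry is '<mode> <name>' + NUL + the 20 raw bytes of the blob id.
    entry = b"100644 " + name.encode(encoding) + b"\x00" + bytes.fromhex(blob_hex)
    return git_object_id("tree", entry)


blob_hex = git_object_id("blob", b"")   # id of an empty file
name = "caf\u00e9.txt"                  # hypothetical non-ASCII file name

utf8_tree = tree_id_for(name, "utf-8", blob_hex)
latin1_tree = tree_id_for(name, "iso-8859-1", blob_hex)

# Same logical tree, different path name bytes, different object ids.
print(utf8_tree)
print(latin1_tree)
print(utf8_tree == latin1_tree)
```

Two repositories holding the same files would then stop sharing tree (and
therefore commit) ids, which is why a single canonical encoding is the
simpler option.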

> I have a patch for both these, but it's very ugly and probably has some memory 
> management problems, so I'll refrain from submitting for now. Knowing that it 
> exists may perhaps serve as starting point for discussion. It encodes 
> filenames in UTF-8 using LC_CTYPE as the local encoding, as well as commit 
> messages. An exception is when something looks like UTF-8, in which case it 
> will not convert input to git. When UTF-8 cannot be converted to the local 
> encoding on its way out of git, the data remains in UTF-8 format. Branch and
> tag names are not managed (yet, at least).
> 
Good. I hope GIT can also deal with path names that are in neither ISO 8859-1
nor UTF-8 encoding.
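
For discussion, the conversion rule Robin describes above could be sketched
roughly like this in Python. This is not his patch: the function names
to_git and from_git are hypothetical, and the local encoding is passed in
explicitly rather than read from LC_CTYPE.

```python
def to_git(raw: bytes, local_encoding: str) -> bytes:
    """Bytes entering git: keep anything that already looks like UTF-8."""
    try:
        raw.decode("utf-8")
        return raw                      # looks like UTF-8, do not convert
    except UnicodeDecodeError:
        # Assume the bytes are in the local encoding and convert them.
        return raw.decode(local_encoding).encode("utf-8")


def from_git(stored: bytes, local_encoding: str) -> bytes:
    """Bytes leaving git: convert to the local encoding when possible."""
    try:
        return stored.decode("utf-8").encode(local_encoding)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return stored                   # cannot convert, leave as stored


latin1_name = "caf\u00e9".encode("iso-8859-1")
stored = to_git(latin1_name, "iso-8859-1")
print(stored)
print(from_git(stored, "iso-8859-1"))

# A name with no ISO-8859-1 representation stays in UTF-8 on the way out.
cjk = "\u65e5\u672c".encode("utf-8")
print(from_git(cjk, "iso-8859-1") == cjk)
```

The "looks like UTF-8" test is only a heuristic: a name in some other
multi-byte encoding can happen to be valid UTF-8 and would then be stored
unconverted.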

