* Does GIT require property like Subversion?
From: Liu Yubao @ 2006-10-08 9:10 UTC (permalink / raw)
To: git
I want to know whether there is a plan to add this feature, or GIT doesn't
require it at all.
Properties like encoding (path name, file content), eol-style, mime-type
are useful for editing.
BTW: I don't mean Subversion is better than GIT :-)
* Re: Does GIT require property like Subversion?
From: Jan-Benedict Glaw @ 2006-10-08 9:19 UTC (permalink / raw)
To: Liu Yubao; +Cc: git
On Sun, 2006-10-08 17:10:51 +0800, Liu Yubao <yubao.liu@gmail.com> wrote:
> I want to know whether there is a plan to add this feature, or GIT doesn't
> require it at all.
>
> Properties like encoding (path name, file content), eol-style, mime-type
> are useful for editing.
GIT is a content tracker. It won't ever fiddle with your line
endings. You put data in there and it'll be conserved bit-by-bit. So
if you need to store file encodings, MIME types, automatic CR/CRLF/LF
conversion etc., you have to put this metadata into some additional
files, but GIT won't specifically handle that in any way.
Kind regards, JBG
--
Jan-Benedict Glaw jbglaw@lug-owl.de +49-172-7608481
Signature of: A Free Opinion in a Free Mind
the second : for a Free State full of Free Citizens.
* Re: Does GIT require property like Subversion?
From: Jakub Narebski @ 2006-10-08 10:16 UTC (permalink / raw)
To: git
Jan-Benedict Glaw wrote:
> On Sun, 2006-10-08 17:10:51 +0800, Liu Yubao <yubao.liu@gmail.com> wrote:
>> I want to know whether there is a plan to add this feature, or GIT doesn't
>> require it at all.
>>
>> Properties like encoding (path name, file content), eol-style, mime-type
>> are useful for editing.
>
> GIT is a content tracker. It won't ever fiddle with your line
> endings. You put data in there and it'll be conserved bit-by-bit. So
> if you need to store file encodings, MIME types, automatic CR/CRLF/LF
> conversion etc., you have to put this metadata into some additional
> files, but GIT won't specifically handle that in any way.
Mimetype has no place (I think) in an SCM. We could in principle "borrow"
Mercurial's idea of input/output filters
http://www.selenic.com/mercurial/wiki/index.cgi/EncodeDecodeFilter
which would (among other things) make it possible to use a constant eol-style
in the shared part of the repository, i.e. the object database, while using the
OS-native eol-style (UNIX vs. Microsoft Windows vs. MacOS) in the working
directory. That said, eol-style doesn't matter much: you can find good editors
for any OS nowadays that are able to handle any eol-style.
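To make the filter idea concrete, here is a minimal Python sketch of an encode/decode pair for line endings. It is only an illustration under stated assumptions: the function names `encode_eol` and `decode_eol` are hypothetical, neither Git nor Mercurial exposes them, and the sketch assumes the data is text (running it over binary content would corrupt it).

```python
import os

def encode_eol(data: bytes) -> bytes:
    """Working directory -> object database: normalize CRLF and lone CR to LF.

    Hypothetical helper for illustration; assumes `data` is text.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def decode_eol(data: bytes, native: bytes = os.linesep.encode("ascii")) -> bytes:
    """Object database -> working directory: expand LF to the native ending.

    Assumes `data` was stored with LF-only line endings by encode_eol.
    """
    if native == b"\n":
        return data
    return data.replace(b"\n", native)
```

With this pair the stored form is the same on every platform, and only the checked-out copy differs, e.g. `decode_eol(stored, b"\r\n")` on Windows.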
File content encoding is something (if it is outside US-ASCII of course)
that you would want either to have some default convention, or have it
embedded in the file itself (like XML, HTML, or Emacs' file variables)
to be able to read file _outside_ SCM.
Path name encoding is something that is global property of a repository,
I think. We have i18n.commitEncoding configuration variable; we could
add i18n.pathnameEncoding quite easily I think (and some way for Git to
detect current filesystem pathname encoding, if possible). Although
BTW I think that i18n.commitEncoding information should be made persistent,
and copied when cloning repository.
But in fact the philosophy of Git _prohibits_ I think property bits.
Unless we add ability (which can be done fairly easy even now, but will
not be automatic) to save some metainfo (ACL, extended attributes,
Subversion-like properties) along with the file (blob) and/or tree
(directory).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Does GIT require property like Subversion?
From: Jakub Narebski @ 2006-10-08 10:26 UTC (permalink / raw)
To: git
Jakub Narebski wrote:
> We could in principle "borrow" Mercurial idea of input/output filters
> http://www.selenic.com/mercurial/wiki/index.cgi/EncodeDecodeFilter
> which would (among others) enable to use constant eol-style in the shared
> part of repository i.e. object database, while using OS native eol-style
> (UNIX vs. Microsoft Windows vs. MacOS). eol-style doesn't matter much:
> you can find good editors which are able to use any eol-style for any OS
> nowadays.
More importantly, that would make it possible to store in the SCM files whose
format is an archive of a mix of _text_ and binary files, the archive itself
being compressed and binary. Examples include OpenDocument (ODF), Java Archive
(.jar), Mozilla extension (.xpi)... well, a good XML-aware diff would also be nice.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Does GIT require property like Subversion?
From: Robin Rosenberg @ 2006-10-08 15:52 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
> File content encoding is something (if it is outside US-ASCII of course)
> that you would want either to have some default convention, or have it
> embedded in the file itself (like XML, HTML, or Emacs' file variables)
> to be able to read file _outside_ SCM.
Except for CR/LF, this is best solved outside of the SCM. There aren't that
many tools/users to warrant the complexity or performance hit I imagine it
would take to solve it.
> Path name encoding is something that is global property of a repository,
> I think. We have i18n.commitEncoding configuration variable; we could
> add i18n.pathnameEncoding quite easily I think (and some way for Git to
> detect current filesystem pathname encoding, if possible). Although
> BTW I think that i18n.commitEncoding information should be made persistent,
> and copied when cloning repository.
*I* think git should use UTF-8 internally. Always. Clients could then have
the option to convert to local conventions.
Same for pathname. Internally all paths should be UTF-8 encoded. Encoding
commit info that way would make the i18n option obsolete also.
I have a patch for both these, but it's very ugly and probably has some memory
management problems, so I'll refrain from submitting for now. Knowing that it
exists may perhaps serve as starting point for discussion. It encodes
filenames in UTF-8 using LC_CTYPE as the local encoding, as well as commit
messages. An exception is when something looks like UTF-8, in which case it
will not convert input to git. When UTF-8 cannot be converted to the local
encoding on its way out of git, the data remains in UTF-8 format. Branch and
tag names are not managed (yet, at least).
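The conversion rules described above can be sketched roughly as follows. This is not the patch itself, only a Python illustration of the stated behaviour; the names `looks_like_utf8`, `to_git` and `from_git` are hypothetical, and the local encoding is passed in explicitly (a caller might derive it from the locale, e.g. with `locale.getpreferredencoding(False)`).

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Heuristic: treat input as UTF-8 if it decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def to_git(raw: bytes, local_encoding: str) -> bytes:
    """Local -> git: convert to UTF-8 unless the input already looks like UTF-8."""
    if looks_like_utf8(raw):
        return raw
    return raw.decode(local_encoding).encode("utf-8")

def from_git(stored: bytes, local_encoding: str) -> bytes:
    """Git -> local: convert from UTF-8, falling back to the stored bytes
    when they cannot be represented in the local encoding."""
    try:
        return stored.decode("utf-8").encode(local_encoding)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return stored
```

Note that the "looks like UTF-8" test is ambiguous by nature: some byte sequences are valid in both a legacy encoding and UTF-8, so a sketch like this can misclassify input.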
> But in fact the philosophy of Git _prohibits_ I think property bits.
I don't think so, but they aren't needed for the original purpose. Git already
does manage file permissions.
> Unless we add ability (which can be done fairly easy even now, but will
> not be automatic) to save some metainfo (ACL, extended attributes,
> Subversion-like properties) along with the file (blob) and/or tree
> (directory).
A problem with adding too much metadata is that there is a cost to this. We
like GIT much thanks to its performance. Git simply gets out of the way
thanks to this. ACLs aren't content at all. Extended attributes however are,
but who uses them?
-- robin
* Re: Does GIT require property like Subversion?
From: Petr Baudis @ 2006-10-08 16:24 UTC (permalink / raw)
To: Robin Rosenberg; +Cc: Jakub Narebski, git
Dear diary, on Sun, Oct 08, 2006 at 05:52:10PM CEST, I got a letter
where Robin Rosenberg <robin.rosenberg.lists@dewire.com> said that...
> *I* think git should use UTF-8 internally. Always. Clients could then have
> the option to convert to local conventions.
>
> Same for pathname. Internally all paths should be UTF-8 encoded. Encoding
> commit info that way would make the i18n option obsolete also.
There is a tradeoff here between keeping the data stored inside Git
independent of the system where you created it, and willingness to store any
awful garbage you feed it. It comes down to a policy decision, and Git
leaves it to the user, opting for garbage support, which gives it more
flexibility.
> On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
> > But in fact the philosophy of Git _prohibits_ I think property bits.
> I don't think so, but they aren't needed for the original purpose. Git already
> does manage file permissions.
Incorrect, Git manages only the execute bit.
> > Unless we add ability (which can be done fairly easy even now, but will
> > not be automatic) to save some metainfo (ACL, extended attributes,
> > Subversion-like properties) along with the file (blob) and/or tree
> > (directory).
>
> A problem with adding too much metadata is that there is a cost to this. We
> like GIT much thanks to its performance. Git simply gets out of the way
> thanks to this. ACLs aren't content at all. Extended attributes however are,
> but who uses them?
Execute bit isn't content at all either if you look at it this way. But
it's meaningless anyway since you can define content whichever way you
want (this is also why I consider the argument for not tracking
directories dubious).
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
* Re: Does GIT require property like Subversion?
From: Jakub Narebski @ 2006-10-08 16:40 UTC (permalink / raw)
To: git
Petr Baudis wrote:
> Dear diary, on Sun, Oct 08, 2006 at 05:52:10PM CEST, I got a letter
> where Robin Rosenberg <robin.rosenberg.lists@dewire.com> said that...
>> On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
>>> But in fact the philosophy of Git _prohibits_ I think property bits.
>> I don't think so, but they aren't needed for the original purpose. Git
>> already does manage file permissions.
>
> Incorrect, Git manages only the execute bit.
And symlinks.
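As a small illustration of how little file metadata that amounts to, the Python sketch below maps a full filesystem mode to the handful of entry modes Git records (written here as `git ls-tree` displays them). The helper name `git_mode` is hypothetical; the sketch only covers regular files, symlinks and directories.

```python
import stat

def git_mode(st_mode: int) -> str:
    """Collapse a filesystem st_mode to the mode Git would record.

    Hypothetical helper: everything except the file type and the
    owner-execute bit is discarded.
    """
    if stat.S_ISLNK(st_mode):
        return "120000"  # symbolic link
    if stat.S_ISDIR(st_mode):
        return "040000"  # tree (directory)
    if st_mode & stat.S_IXUSR:
        return "100755"  # executable regular file
    return "100644"      # non-executable regular file
```

So a file with mode 0600 and one with mode 0664 are recorded identically, which is why other permission bits, owners and ACLs do not survive a round trip through the repository.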
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Does GIT require property like Subversion?
From: Liu Yubao @ 2006-10-09 2:53 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Sorry, I forgot to reply all.
Robin Rosenberg wrote:
> On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
>> File content encoding is something (if it is outside US-ASCII of course)
>> that you would want either to have some default convention, or have it
>> embedded in the file itself (like XML, HTML, or Emacs' file variables)
>> to be able to read file _outside_ SCM.
> Except for CR/LF, this is best solved outside of the SCM. There aren't that
> many tools/users to warrant the complexity or performance hit I imagine to
> solve it.
>
>> Path name encoding is something that is global property of a repository,
>> I think. We have i18n.commitEncoding configuration variable; we could
>> add i18n.pathnameEncoding quite easily I think (and some way for Git to
>> detect current filesystem pathname encoding, if possible). Although
>> BTW I think that i18n.commitEncoding information should be made persistent,
>> and copied when cloning repository.
>
> *I* think git should use UTF-8 internally. Always. Clients could then have
> the option to convert to local conventions.
>
> Same for pathname. Internally all paths should be UTF-8 encoded. Encoding
> commit info that way would make the i18n option obsolete also.
>
I am afraid it's not a good idea to convert file content to UTF-8 encoding:
GIT can manage non-text files, and it's not safe for a VCS to modify file
content stealthily.
But I agree with using UTF-8 for path names in the tree object, or adding an
encoding property (not a user-defined property) to the head of the tree object,
so GIT won't do a useless enc -> UTF-8 -> same_enc conversion. The second way
has a fault: two tree objects with the same content in different encodings have
different SHA1 digests.
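The drawback is easy to demonstrate. The Python sketch below hashes a simplified tree object (header `tree <size>\0`, then one `<mode> <name>\0<20-byte id>` record per entry) for the same file name written in two encodings. It is an illustration only: `tree_id` is a hypothetical helper that handles flat lists of files and sorts names bytewise, which is a simplification of Git's real ordering rules.

```python
import hashlib

# Object id of an empty blob, computed the way Git hashes blobs.
EMPTY_BLOB = hashlib.sha1(b"blob 0\0").hexdigest()

def tree_id(entries):
    """Return the SHA-1 of a simplified tree object.

    `entries` is a list of (mode, name, blob_hex) with mode and name as bytes.
    Hypothetical helper for illustration, files only.
    """
    body = b"".join(
        mode + b" " + name + b"\0" + bytes.fromhex(blob_hex)
        for mode, name, blob_hex in sorted(entries, key=lambda e: e[1])
    )
    header = b"tree " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()

name = "r\u00e4k.txt"
utf8_tree = tree_id([(b"100644", name.encode("utf-8"), EMPTY_BLOB)])
latin1_tree = tree_id([(b"100644", name.encode("latin-1"), EMPTY_BLOB)])
print(utf8_tree != latin1_tree)  # the two ids differ
```

The names are byte strings inside the object, so the same logical name in two encodings is two different objects as far as the hash is concerned.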
> I have a patch for both these, but it's very ugly and probably has some memory
> management problems, so I'll refrain from submitting for now. Knowing that it
> exists may perhaps serve as starting point for discussion. It encodes
> filenames in UTF-8 using LC_CTYPE as the local encoding, as well as commit
> messages. An exception is when something looks like UTF-8, in which case it
> will not convert input to git. When UTF-8 cannot be converted to the local
> encoding on its way out of git, the data remains in UTF-8 format. Branch and
> tag names are not managed (yet, at least).
>
>
Good, hope GIT can deal with path names that are not in 8859_1 or UTF-8 encoding.