* Does GIT require property like Subversion?
From: Liu Yubao @ 2006-10-08 9:10 UTC
To: git

I want to know whether there is a plan to add this feature, or GIT doesn't
require it at all.

Properties like encoding (path name, file content), eol-style, mime-type
are useful for editing.

BTW: I don't mean Subversion is better than GIT :-)

* Re: Does GIT require property like Subversion?
From: Jan-Benedict Glaw @ 2006-10-08 9:19 UTC
To: Liu Yubao; +Cc: git

On Sun, 2006-10-08 17:10:51 +0800, Liu Yubao <yubao.liu@gmail.com> wrote:
> I want to know whether there is a plan to add this feature, or GIT doesn't
> require it at all.
>
> Properties like encoding (path name, file content), eol-style, mime-type
> are useful for editing.

GIT is a content tracker. It won't ever fiddle with your line
endings. You put data in there and it'll be conserved bit-by-bit. So
if you need to store file encodings, MIME types, automatic CR/CRLF/LF
conversion etc, you have to put this metadata into some additional
files, but GIT won't specifically handle that in any way.

Regards, JBG

--
Jan-Benedict Glaw

* Re: Does GIT require property like Subversion?
From: Jakub Narebski @ 2006-10-08 10:16 UTC
To: git

Jan-Benedict Glaw wrote:
> On Sun, 2006-10-08 17:10:51 +0800, Liu Yubao <yubao.liu@gmail.com> wrote:
>> I want to know whether there is a plan to add this feature, or GIT doesn't
>> require it at all.
>>
>> Properties like encoding (path name, file content), eol-style, mime-type
>> are useful for editing.
>
> GIT is a content tracker. It won't ever fiddle with your line
> endings. You put data in there and it'll be conserved bit-by-bit. So
> if you need to store file encodings, MIME types, automatic CR/CRLF/LF
> conversion etc, you have to put this metadata into some additional
> files, but GIT won't specifically handle that in any way.

Mimetype has no place (I think) in SCM.

We could in principle "borrow" the Mercurial idea of input/output filters
  http://www.selenic.com/mercurial/wiki/index.cgi/EncodeDecodeFilter
which would (among other things) make it possible to use a constant
eol-style in the shared part of the repository, i.e. the object database,
while using the OS-native eol-style (UNIX vs. Microsoft Windows vs. MacOS)
in the working copy. eol-style doesn't matter much: you can find good
editors which are able to use any eol-style on any OS nowadays.

File content encoding is something (if it is outside US-ASCII of course)
that you would want either to have some default convention, or have it
embedded in the file itself (like XML, HTML, or Emacs' file variables)
to be able to read file _outside_ SCM.

Path name encoding is something that is global property of a repository,
I think. We have i18n.commitEncoding configuration variable; we could
add i18n.pathnameEncoding quite easily I think (and some way for Git to
detect current filesystem pathname encoding, if possible). Although
BTW I think that i18n.commitEncoding information should be made persistent,
and copied when cloning repository.

But in fact the philosophy of Git _prohibits_ I think property bits.
Unless we add ability (which can be done fairly easily even now, but will
not be automatic) to save some metainfo (ACL, extended attributes,
Subversion-like properties) along with the file (blob) and/or tree
(directory).

--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

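The input/output filter idea mentioned in this message can be sketched roughly as follows. This is only a minimal Python illustration under stated assumptions, not code from Git or Mercurial: the names `encode_eol` and `decode_eol` are hypothetical, and the sketch assumes the file is known to be text (applying it to binary data would corrupt it).

```python
import os


def encode_eol(data: bytes) -> bytes:
    """Working copy -> repository: normalise CRLF and lone CR to LF.

    Hypothetical helper; assumes `data` is text, not binary.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def decode_eol(data: bytes, native: bytes = os.linesep.encode("ascii")) -> bytes:
    """Repository -> working copy: expand LF to the native line ending.

    Assumes `data` is already LF-only, as produced by encode_eol.
    """
    return data.replace(b"\n", native)


if __name__ == "__main__":
    stored = encode_eol(b"first\r\nsecond\rthird\n")
    print(stored)                       # LF-only form kept in the object database
    print(decode_eol(stored, b"\r\n"))  # form checked out on a CRLF system
```

The stored form is the same whichever platform wrote the file, so the object database stays consistent while each checkout sees its own convention.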
* Re: Does GIT require property like Subversion?
From: Jakub Narebski @ 2006-10-08 10:26 UTC
To: git

Jakub Narebski wrote:

> We could in principle "borrow" the Mercurial idea of input/output filters
>   http://www.selenic.com/mercurial/wiki/index.cgi/EncodeDecodeFilter
> which would (among other things) make it possible to use a constant
> eol-style in the shared part of the repository, i.e. the object database,
> while using the OS-native eol-style (UNIX vs. Microsoft Windows vs. MacOS)
> in the working copy. eol-style doesn't matter much: you can find good
> editors which are able to use any eol-style on any OS nowadays.

More importantly, that would make it possible to store in the SCM files
whose format is an archive of a mix of _text_ and binary files, the archive
itself being compressed and binary. Examples include OpenDocument (ODF),
Java Archive (.jar), Mozilla extension (.xpi)... well, a good XML-aware
diff would also be nice.

--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: Does GIT require property like Subversion?
From: Robin Rosenberg @ 2006-10-08 15:52 UTC
To: Jakub Narebski; +Cc: git

On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
> File content encoding is something (if it is outside US-ASCII of course)
> that you would want either to have some default convention, or have it
> embedded in the file itself (like XML, HTML, or Emacs' file variables)
> to be able to read file _outside_ SCM.
Except for CR/LF, this is best solved outside of the SCM. There aren't that
many tools/users to warrant the complexity or performance hit I imagine to
solve it.

> Path name encoding is something that is global property of a repository,
> I think. We have i18n.commitEncoding configuration variable; we could
> add i18n.pathnameEncoding quite easily I think (and some way for Git to
> detect current filesystem pathname encoding, if possible). Although
> BTW I think that i18n.commitEncoding information should be made persistent,
> and copied when cloning repository.

*I* think git should use UTF-8 internally. Always. Clients could then have
the option to convert to local conventions.

Same for pathname. Internally all paths should be UTF-8 encoded. Encoding
commit info that way would make the i18n option obsolete also.

I have a patch for both these, but it's very ugly and probably has some memory
management problems, so I'll refrain from submitting for now. Knowing that it
exists may perhaps serve as starting point for discussion. It encodes
filenames in UTF-8 using LC_CTYPE as the local encoding, as well as commit
messages. An exception is when something looks like UTF-8, in which case it
will not convert input to git. When UTF-8 cannot be converted to the local
encoding on its way out of git, the data remains in UTF-8 format. Branch and
tags names are not managed (yet, at least).

> But in fact the philosophy of Git _prohibits_ I think property bits.
I don't think so, but they aren't needed for the original purpose. Git already
does manage file permissions.

> Unless we add ability (which can be done fairly easily even now, but will
> not be automatic) to save some metainfo (ACL, extended attributes,
> Subversion-like properties) along with the file (blob) and/or tree
> (directory).

A problem with adding too much metadata is that there is a cost to this. We
like GIT much thanks to its performance. Git simply gets out of the way
thanks to this. ACL's aren't content at all. Extended attributes however are,
but who uses them?

-- robin

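The conversion rules described for the patch (convert to UTF-8 on the way in unless the input already looks like UTF-8; convert back on the way out, falling back to UTF-8 when the local encoding cannot represent the text) can be sketched as below. This is a hedged Python illustration, not the patch itself, which was never posted in this thread. The names `looks_like_utf8`, `to_repo` and `from_repo` are hypothetical, and the local encoding is passed in explicitly rather than read from LC_CTYPE.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Heuristic used in this sketch: the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_repo(raw: bytes, local_encoding: str) -> bytes:
    """Local bytes -> bytes stored in the repository (UTF-8).

    Input that already looks like UTF-8 is stored unchanged.
    """
    if looks_like_utf8(raw):
        return raw
    return raw.decode(local_encoding).encode("utf-8")


def from_repo(stored: bytes, local_encoding: str) -> bytes:
    """Stored bytes -> local bytes; falls back to the stored bytes unchanged
    when they cannot be expressed in the local encoding."""
    try:
        return stored.decode("utf-8").encode(local_encoding)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return stored


if __name__ == "__main__":
    latin1_name = "söndag.txt".encode("latin-1")
    stored = to_repo(latin1_name, "latin-1")
    print(stored)                         # UTF-8 bytes
    print(from_repo(stored, "latin-1"))   # back to the Latin-1 bytes
```

A caller wanting the locale-derived behaviour could pass `locale.getpreferredencoding(False)` as `local_encoding`. Note that the "looks like UTF-8" test is only a heuristic: some byte sequences are valid in both UTF-8 and a legacy encoding, and such input would be stored unconverted.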
* Re: Does GIT require property like Subversion?
From: Petr Baudis @ 2006-10-08 16:24 UTC
To: Robin Rosenberg; +Cc: Jakub Narebski, git

Dear diary, on Sun, Oct 08, 2006 at 05:52:10PM CEST, I got a letter
where Robin Rosenberg <robin.rosenberg.lists@dewire.com> said that...
> *I* think git should use UTF-8 internally. Always. Clients could then have
> the option to convert to local conventions.
>
> Same for pathname. Internally all paths should be UTF-8 encoded. Encoding
> commit info that way would make the i18n option obsolete also.

There is a tradeoff here between independence of the data stored inside
Git from the system where you created it, and willingness to store any
awful garbage you feed it. It comes down to a policy decision, and Git
leaves it to the user, opting for garbage support, which gives it more
flexibility.

> On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
> > But in fact the philosophy of Git _prohibits_ I think property bits.
> I don't think so, but they aren't needed for the original purpose. Git already
> does manage file permissions.

Incorrect, Git manages only the execute bit.

> > Unless we add ability (which can be done fairly easily even now, but will
> > not be automatic) to save some metainfo (ACL, extended attributes,
> > Subversion-like properties) along with the file (blob) and/or tree
> > (directory).
>
> A problem with adding too much metadata is that there is a cost to this. We
> like GIT much thanks to its performance. Git simply gets out of the way
> thanks to this. ACL's aren't content at all. Extended attributes however are,
> but who uses them?

Execute bit isn't content at all either if you look at it this way. But
it's meaningless anyway since you can define content whichever way you
want (this is also why I consider the argument for not tracking
directories dubious).

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/

* Re: Does GIT require property like Subversion?
From: Jakub Narebski @ 2006-10-08 16:40 UTC
To: git

Petr Baudis wrote:

> Dear diary, on Sun, Oct 08, 2006 at 05:52:10PM CEST, I got a letter
> where Robin Rosenberg <robin.rosenberg.lists@dewire.com> said that...

>> On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
>>> But in fact the philosophy of Git _prohibits_ I think property bits.
>> I don't think so, but they aren't needed for the original purpose. Git
>> already does manage file permissions.
>
> Incorrect, Git manages only the execute bit.

And symlinks.

--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

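To make the "only the execute bit, and symlinks" point concrete, here is a small Python sketch of how a filesystem mode collapses into the handful of modes Git records for tree entries. The function name `git_mode` is hypothetical and this is not Git's own code; the mode strings are those Git writes for regular files, executables and symlinks, and everything else (read/write bits, group and other permissions, ACLs) is simply not recorded.

```python
import stat


def git_mode(st_mode: int) -> str:
    """Map a filesystem st_mode to the mode string stored in a Git tree entry.

    Illustrative only: directories, submodules and special files are not
    handled in this sketch.
    """
    if stat.S_ISLNK(st_mode):
        return "120000"  # symbolic link
    if stat.S_ISREG(st_mode):
        # Only the owner's execute bit matters; all other bits are dropped.
        return "100755" if st_mode & stat.S_IXUSR else "100644"
    raise ValueError("only regular files and symlinks are handled in this sketch")


if __name__ == "__main__":
    for mode in (0o600, 0o644, 0o664, 0o700, 0o755):
        print(oct(mode), "->", git_mode(stat.S_IFREG | mode))
```

Files with permissions 0600, 0644 and 0664 all come out as `100644`, which is why permissions beyond the execute bit cannot round-trip through a repository.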
* Re: Does GIT require property like Subversion?
From: Liu Yubao @ 2006-10-09 2:53 UTC
To: Jakub Narebski; +Cc: git

Sorry, I forgot to reply all.

Robin Rosenberg wrote:
> On Sunday, 08 October 2006 12:16, Jakub Narebski wrote:
>> File content encoding is something (if it is outside US-ASCII of course)
>> that you would want either to have some default convention, or have it
>> embedded in the file itself (like XML, HTML, or Emacs' file variables)
>> to be able to read file _outside_ SCM.
> Except for CR/LF, this is best solved outside of the SCM. There aren't that
> many tools/users to warrant the complexity or performance hit I imagine to
> solve it.
>
>> Path name encoding is something that is global property of a repository,
>> I think. We have i18n.commitEncoding configuration variable; we could
>> add i18n.pathnameEncoding quite easily I think (and some way for Git to
>> detect current filesystem pathname encoding, if possible). Although
>> BTW I think that i18n.commitEncoding information should be made persistent,
>> and copied when cloning repository.
>
> *I* think git should use UTF-8 internally. Always. Clients could then have
> the option to convert to local conventions.
>
> Same for pathname. Internally all paths should be UTF-8 encoded. Encoding
> commit info that way would make the i18n option obsolete also.
>
I am afraid it's not a good idea to convert file content to UTF-8, since
GIT can manage non-text files; it's not safe for a VCS to modify file
content stealthily.

But I agree with using UTF-8 for path names in tree objects, or adding an
encoding property (not a user-defined property) to the head of the tree
object, so GIT won't do a useless enc -> UTF-8 -> same_enc conversion. The
second way has a flaw: two tree objects with the same content in different
encodings have different SHA1 digests.

> I have a patch for both these, but it's very ugly and probably has some memory
> management problems, so I'll refrain from submitting for now. Knowing that it
> exists may perhaps serve as starting point for discussion. It encodes
> filenames in UTF-8 using LC_CTYPE as the local encoding, as well as commit
> messages. An exception is when something looks like UTF-8, in which case it
> will not convert input to git. When UTF-8 cannot be converted to the local
> encoding on its way out of git, the data remains in UTF-8 format. Branch and
> tags names are not managed (yet, at least).
>
>
Good, I hope GIT can deal with path names that are not in 8859_1 or UTF-8
encoding.

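The digest problem raised in this message can be demonstrated with a short Python sketch that hashes a one-entry tree the way Git does: SHA-1 over a `<type> <size>\0` header followed by the body, where a tree body holds `<mode> <name>\0<20-byte blob id>` entries. The helper names `git_object_id` and `single_file_tree_id` are hypothetical, and the proposed per-tree encoding header is not modelled; the sketch only shows that the raw name bytes feed into the digest.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 of a Git object: '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def single_file_tree_id(name: bytes, content: bytes) -> str:
    """Id of a tree holding one non-executable file with the given name bytes."""
    blob_id = git_object_id("blob", content)
    entry = b"100644 " + name + b"\0" + bytes.fromhex(blob_id)
    return git_object_id("tree", entry)


if __name__ == "__main__":
    content = b"same content\n"
    name = "r\u00e9sum\u00e9.txt"
    print(single_file_tree_id(name.encode("utf-8"), content))
    print(single_file_tree_id(name.encode("latin-1"), content))
```

The two printed ids differ although the file content and the logical file name are identical, because only the name bytes changed. Storing names in a single canonical encoding avoids this, at the price of the conversion step discussed above.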