* .gitattributes glob matching broken @ 2008-11-02 16:33 Hannu Koivisto 2008-11-03 9:09 ` Jeff King 0 siblings, 1 reply; 9+ messages in thread From: Hannu Koivisto @ 2008-11-02 16:33 UTC (permalink / raw) To: git Greetings, It seems that, for example, glob pattern *.s matches files with .sh extension at least with checkout and reset --hard but git status thinks otherwise: mkdir test cd test git init echo -e "*.sh -crlf\n*.s crlf" > .gitattributes echo -e "foobar\nfoobar\nfoobar" > kala.s echo -e "foobar\nfoobar\nfoobar" > kala.sh git add .gitattributes kala.s kala.sh git commit -m "Foo." cd .. git clone -n test test2 cd test2 git config core.autocrlf true git checkout git status # On branch master # Changed but not updated: # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working # directory) # # modified: kala.sh # no changes added to commit (use "git add" and/or "git commit -a") file kala.s kala.sh kala.s: ASCII text, with CRLF line terminators kala.sh: ASCII text, with CRLF line terminators Tested in Linux with git 1.6.0.3.535.g933bb (master as of this writing) but also witnessed in Windows and with slightly older git versions. This makes git use in a Windows environment pretty much impossible if you don't want to / can't rely on git guessing "text" vs. "binary" files correctly so I hope a solution is found soon. It would also be good to document what kind of glob patterns git actually supports. I made the assumption that at least on Linux it supports whatever glob(7) says but even if that assumption is correct (which it may not be, of course) for example Windows users may not realize to look for such a manual page. -- Hannu ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: .gitattributes glob matching broken 2008-11-02 16:33 .gitattributes glob matching broken Hannu Koivisto @ 2008-11-03 9:09 ` Jeff King 2008-11-03 15:05 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Hannu Koivisto 0 siblings, 1 reply; 9+ messages in thread From: Jeff King @ 2008-11-03 9:09 UTC (permalink / raw) To: Hannu Koivisto; +Cc: git On Sun, Nov 02, 2008 at 06:33:51PM +0200, Hannu Koivisto wrote: > It seems that, for example, glob pattern *.s matches files with .sh > extension at least with checkout and reset --hard but git status > thinks otherwise: I think your analysis is incorrect. I will try to explain what is happening. > mkdir test > cd test > git init > echo -e "*.sh -crlf\n*.s crlf" > .gitattributes > echo -e "foobar\nfoobar\nfoobar" > kala.s > echo -e "foobar\nfoobar\nfoobar" > kala.sh > git add .gitattributes kala.s kala.sh > git commit -m "Foo." OK, so here we have two files, one of which we are telling git is text and one of which we are telling git is not text. Since we don't have autocrlf set at all, of course nothing happens here. > git clone -n test test2 And here we clone without checking out, so there are no files yet. > cd test2 > git config core.autocrlf true > git checkout And now we do check out the files, with autocrlf applied. But what are we left with? When I run this, _both_ files were detected as text and have CRLF line endings. So here I think is where git didn't do what you expected: kala.sh should not have had CRLF conversion applied. This is a known limitation of the attributes mechanism: it only reads from .gitattributes in the filesystem (or from .git/info/attributes), and not from the tree that is being checked out. This is something that should be addressed, but nobody has stepped up with a patch yet (though there has been some preliminary discussion). > git status > > # On branch master > # Changed but not updated: > # (use "git add <file>..." to update what will be committed) > # (use "git checkout -- <file>..." to discard changes in working > # directory) > # > # modified: kala.sh > # > no changes added to commit (use "git add" and/or "git commit -a") So yes, this status makes perfect sense, then. The file "kala.sh" has CRLFs in the filesystem, but we have told git that it is not a file which gets converted. So it looks like those CRs have been added. The problem, again, is that we have inconsistently applied the gitattributes. They were _not_ applied during checkout (because .gitattributes did not exist yet), but they _are_ being applied here. To "fix" this, you can then do a "git reset --hard" which will respect your .gitattributes (since it is now checked out). And further file creation and checkout should work OK. -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
* CRLF support bugs (was: Re: .gitattributes glob matching broken) 2008-11-03 9:09 ` Jeff King @ 2008-11-03 15:05 ` Hannu Koivisto 2008-11-03 15:25 ` CRLF support bugs Hannu Koivisto ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Hannu Koivisto @ 2008-11-03 15:05 UTC (permalink / raw) To: git; +Cc: Jeff King Jeff King <peff@peff.net> writes: > I think your analysis is incorrect. I will try to explain what is > happening. Yes, you are right. The behaviour I saw in my actual use case was so odd that I got completely confused. I suspect one part of that "oddness" was caused by git applying its heuristics in checkout as it doesn't use .gitattributes at that time. For example, it seems that it recognized some of my .sh files as text files and the rest as binary files. I suppose I was correct to assume that it would be stupid to rely on git guessing file type and the only sensible way is to use .gitattributes. If it was supported in checkout too, that is. I don't know what purpose the autodetection aims to serve but I'd add a big warning in the core.autocrlf documentation about it and instructions on how to configure things so that it is never applied but instead the types must always be specified explicitly. > The problem, again, is that we have inconsistently applied the > gitattributes. They were _not_ applied during checkout (because > .gitattributes did not exist yet), but they _are_ being applied here. > > To "fix" this, you can then do a "git reset --hard" which will respect > your .gitattributes (since it is now checked out). And further file > creation and checkout should work OK. Since I'm trying to launch git in a company environment, I think I can't rely on people remembering to do that. Actually, even if .gitattributes were applied in checkout, I think the whole CRLF support is broken by design because people will have to remember to use -n in clone, then enable core.autocrlf support and then checkout. This makes it unneccessarily complicated to create "quick local clones" as well. You might suggest that Windows users should enable core.autocrlf globally but it may not be the right thing to do for all projects/repositories either. I think CRLF conversion support should have some attribute (be it .gitattributes attribute or something else) that is somehow inherited from the parent repository. It would basically say that "you should use platform's native line end type for text files with this repository and its children". To go with that, one would maybe have a configuration option to tell what that platform default line end type is (just in case someone wants to pretend Cygwin is Unix or something like that). I also observed this problem: # Pretend someone does this on Unix mkdir test1 cd test1 git init echo "*.c crlf" > .gitattributes echo -en "foo\r\nfoo\r\nfoo\r\n" > kala.c git add .gitattributes kala.c git commit -m "* Initial checkin." cd .. # Pretend someone else does this on Windows git clone -n test1 test2 cd test2 git config core.autocrlf true git checkout git status ... # modified: kala.c ... git reset --hard git status ... # modified: kala.c ... Now, even if .gitattributes were obeyed by checkout, I suspect the end result would be the same(?) I'm sure someone argues that this makes sense. But try to put yourself in the position of a random Window user. I think it's far from obvious what is going on and what should be done in this situation. -- Hannu ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: CRLF support bugs 2008-11-03 15:05 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Hannu Koivisto @ 2008-11-03 15:25 ` Hannu Koivisto 2008-11-03 16:46 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Dmitry Potapov 2008-11-04 5:14 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Jeff King 2 siblings, 0 replies; 9+ messages in thread From: Hannu Koivisto @ 2008-11-03 15:25 UTC (permalink / raw) To: git; +Cc: Jeff King Hannu Koivisto <azure@iki.fi> writes: > Actually, even if .gitattributes were applied in checkout, I think > the whole CRLF support is broken by design because people will have > to remember to use -n in clone, then enable core.autocrlf support > and then checkout. This makes it unneccessarily complicated to I forgot one thing: so what if someone forgets to use -n or just imagines that you can set core.autocrlf afterwards? # Pretend someone does this on Unix mkdir test1 cd test1 git init echo "*.c crlf" > .gitattributes echo -e "foo\nfoo\nfoo" > kala.c git add .gitattributes kala.c git commit -m "Initial checkin." cd .. # Pretend test1 is not a local repository and someone else does this on Windows git clone test1 test2 cd test2 git config core.autocrlf true git status # On branch master nothing to commit (working directory clean) Now the user would have to know that even though git status claims everything is ok, that is not the case. The user would have to know to say (according to #git): rm .git/index git reset --hard Just for the record, when I started to learn git, one of the first questions I had was "how do I undo checkout?" It wasn't until now that I learned I need to remove .git/index (in addition to all files). -- Hannu ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: CRLF support bugs (was: Re: .gitattributes glob matching broken) 2008-11-03 15:05 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Hannu Koivisto 2008-11-03 15:25 ` CRLF support bugs Hannu Koivisto @ 2008-11-03 16:46 ` Dmitry Potapov 2008-11-03 22:24 ` CRLF support bugs Hannu Koivisto 2008-11-04 5:14 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Jeff King 2 siblings, 1 reply; 9+ messages in thread From: Dmitry Potapov @ 2008-11-03 16:46 UTC (permalink / raw) To: Hannu Koivisto; +Cc: git, Jeff King On Mon, Nov 03, 2008 at 05:05:24PM +0200, Hannu Koivisto wrote: > > Actually, even if .gitattributes were applied in checkout, I think > the whole CRLF support is broken by design because people will have > to remember to use -n in clone, then enable core.autocrlf support > and then checkout. This makes it unneccessarily complicated to > create "quick local clones" as well. You might suggest that > Windows users should enable core.autocrlf globally but it may not > be the right thing to do for all projects/repositories either. core.autocrlf was exactly meant to be set globally. Basically, it says what end-of-line should be on your system. It is strange to have it different for different repositories. > > I think CRLF conversion support should have some attribute (be it > .gitattributes attribute or something else) that is somehow > inherited from the parent repository. It would basically say that > "you should use platform's native line end type for text files with > this repository and its children". It is already so. All text files are treated accordingly to users preferences, and for those files where automatic heuristic produces undesirable result, it can be disabled using .gitattributes. (Alas, there is a bug with checkout that you encountered earlier). > To go with that, one would > maybe have a configuration option to tell what that platform > default line end type is (just in case someone wants to pretend > Cygwin is Unix or something like that). That is exactly what core.autocrlf is about. One user may want to have LF on Windows and another wants CRLF. So, core.autocrlf defines how text files should be treated. It is what each user may have in his/her own ~/.gitconfig. > > I also observed this problem: > > # Pretend someone does this on Unix > mkdir test1 > cd test1 > git init > echo "*.c crlf" > .gitattributes > echo -en "foo\r\nfoo\r\nfoo\r\n" > kala.c The 'crlf' attribute means that the file should be treated as 'text' without applying heuristic. The correct ending for text files on Unix is '\n', not '\r\n'. So, you put a text file with incorrect ending, not surprisingly it causes problems for Windows users later. Dmitry ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: CRLF support bugs 2008-11-03 16:46 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Dmitry Potapov @ 2008-11-03 22:24 ` Hannu Koivisto 0 siblings, 0 replies; 9+ messages in thread From: Hannu Koivisto @ 2008-11-03 22:24 UTC (permalink / raw) To: git; +Cc: Dmitry Potapov, Jeff King Dmitry Potapov <dpotapov@gmail.com> writes: > On Mon, Nov 03, 2008 at 05:05:24PM +0200, Hannu Koivisto wrote: > > core.autocrlf was exactly meant to be set globally. Basically, > it says what end-of-line should be on your system. It is strange > to have it different for different repositories. Maybe so from the point of view of what it was intended for, but since there is nothing else that could be used to control end of line conversion on a repository basis, it certainly doesn't feel strange to me to use it like that. When you clone & checkout, say, the official git repository on Windows, are you comfortable doing it with core.autocrlf globally set to true? Maybe you know it's fine to do that so you actually are but I'm not. How about some other random free software mainly-Unix project you would like to develop / build under Windows? Even if text vs. binary autodetection worked perfectly (and it doesn't), CRLF line ends may still be the wrong choice for some project. I recall one such project and while admittedly the situation with it may have changed since I last used it, that doesn't change the point. I certainly wouldn't want to have core.autocrlf globally set to true on Windows. No automatic conversion is a much safer default. I only want CRLF conversion to happen with projects that have actually considered such checkouts and if necessary have been carefully set up to support it by using .gitattributes. >> I also observed this problem: >> >> # Pretend someone does this on Unix >> mkdir test1 >> cd test1 >> git init >> echo "*.c crlf" > .gitattributes >> echo -en "foo\r\nfoo\r\nfoo\r\n" > kala.c > > The 'crlf' attribute means that the file should be treated as 'text' > without applying heuristic. The correct ending for text files on Unix > is '\n', not '\r\n'. So, you put a text file with incorrect ending, > not surprisingly it causes problems for Windows users later. It seems to me you are looking at this, too, from the technical point of view. Yes, given the way CRLF support is implemented, the end result was expected. But that doesn't mean it was ok from the user's point of view. Consider usability instead. A user makes a mistake and adds a file from a colleague who uses Windows without first converting it. Are you really saying "so he made a mistake, who cares if repository users face problems"? I think it's just very bad usability that by making such a small mistake you cause the system to end up in a state that doesn't make any sense, i.e. git claims you have modifications right after clone & checkout even though you haven't modified anything. -- Hannu ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: CRLF support bugs (was: Re: .gitattributes glob matching broken) 2008-11-03 15:05 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Hannu Koivisto 2008-11-03 15:25 ` CRLF support bugs Hannu Koivisto 2008-11-03 16:46 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Dmitry Potapov @ 2008-11-04 5:14 ` Jeff King 2008-11-04 12:37 ` CRLF support bugs (was: Re: .gitattributes glob matchingbroken) Kelly F. Hickel 2 siblings, 1 reply; 9+ messages in thread From: Jeff King @ 2008-11-04 5:14 UTC (permalink / raw) To: Hannu Koivisto; +Cc: git On Mon, Nov 03, 2008 at 05:05:24PM +0200, Hannu Koivisto wrote: > I suspect one part of that "oddness" was caused by git applying its > heuristics in checkout as it doesn't use .gitattributes at that > time. It _does_ apply them in checkout, you just didn't have a .gitattributes file yet. So it is part of the same problem. > For example, it seems that it recognized some of my .sh > files as text files and the rest as binary files. I suppose I was > correct to assume that it would be stupid to rely on git guessing > file type and the only sensible way is to use .gitattributes. If I think it depends on what's in your scripts, since many people have not had trouble with the auto-detection. Perhaps some are UTF-16 which contain NULs? If the auto-detection is not working, I am sure people would love to see samples of what fooled it (since it is, after all, just a guess, and we would like to make the guess more accurate). > > To "fix" this, you can then do a "git reset --hard" which will respect > > your .gitattributes (since it is now checked out). And further file > > creation and checkout should work OK. > > Since I'm trying to launch git in a company environment, I think I > can't rely on people remembering to do that. Oh, absolutely. I think this is a shortcoming in git. The reset is simply a workaround until it is actually fixed. > Actually, even if .gitattributes were applied in checkout, I think > the whole CRLF support is broken by design because people will have > to remember to use -n in clone, then enable core.autocrlf support > and then checkout. This makes it unneccessarily complicated to Yes, that is a little bit annoying. I think there are four options: - people set core.autocrlf in their global ~/.gitconfig. The downside, as you mentioned, is that you might not want it for all projects - clone should take an extra "options" parameter which can set this up after doing the 'init'. Like: git clone -O core.autocrlf=true /path/to/repo - after setting autocrlf, people need to tell git to re-checkout with the updated settings. I don't know of a straightforward way to tell git everything needs to be updated. So I would do: git ls-files | xargs touch git reset --hard which is not ideal. Probably some sort of "re-checkout" option to git-checkout would be better. - you could do this "re-checkout" automagically when core.autocrlf is set via "git config". There are two obvious problems with this magic, though: - that may not be what the user wants, if they have work in progress in the directory. And normally calling "git config" has no such side effects, so it is certainly unexpected. - we don't even know when the config is updated, since the user may simply edit the file behind git's back So that is a little too magic for my taste. > I think CRLF conversion support should have some attribute (be it > .gitattributes attribute or something else) that is somehow > inherited from the parent repository. It would basically say that > "you should use platform's native line end type for text files with > this repository and its children". To go with that, one would > maybe have a configuration option to tell what that platform > default line end type is (just in case someone wants to pretend > Cygwin is Unix or something like that). I think others have complained before about something like this, in that it really is a _local_ decision and not a _project_ decision to make. I am fortunate enough to work exclusively on platforms with sane line endings, so I don't know what is normal. But if you really wanted to do such a thing for some set of corporate users, maybe it would make sense to have a "clone" hook that runs after init and can set up any relevant config (e.g., by copying certain config values from the parent repo). -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: CRLF support bugs (was: Re: .gitattributes glob matchingbroken) 2008-11-04 5:14 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Jeff King @ 2008-11-04 12:37 ` Kelly F. Hickel 2008-11-05 3:07 ` Jeff King 0 siblings, 1 reply; 9+ messages in thread From: Kelly F. Hickel @ 2008-11-04 12:37 UTC (permalink / raw) To: Jeff King, Hannu Koivisto; +Cc: git > -----Original Message----- > From: git-owner@vger.kernel.org [mailto:git-owner@vger.kernel.org] On > Behalf Of Jeff King > Sent: Monday, November 03, 2008 11:15 PM > To: Hannu Koivisto > Cc: git@vger.kernel.org > Subject: Re: CRLF support bugs (was: Re: .gitattributes glob > matchingbroken) > > On Mon, Nov 03, 2008 at 05:05:24PM +0200, Hannu Koivisto wrote: > <snip> > > I think CRLF conversion support should have some attribute (be it > > .gitattributes attribute or something else) that is somehow > > inherited from the parent repository. It would basically say that > > "you should use platform's native line end type for text files with > > this repository and its children". To go with that, one would > > maybe have a configuration option to tell what that platform > > default line end type is (just in case someone wants to pretend > > Cygwin is Unix or something like that). > > I think others have complained before about something like this, in > that > it really is a _local_ decision and not a _project_ decision to make. I > am fortunate enough to work exclusively on platforms with sane line > endings, so I don't know what is normal. From my point of view, the factoid that a particular file should be subjected to having its line endings munged is a _project_ decision. Whether or not to munge them on any given platform is a _local_ decision. I work on various UNIXes, Linux, Windows, z/OS, etc, etc, and I want the tool to just do the right thing so that I don't have to think about it on a daily basis. My $0.02.... -Kelly > > But if you really wanted to do such a thing for some set of corporate > users, maybe it would make sense to have a "clone" hook that runs after > init and can set up any relevant config (e.g., by copying certain > config > values from the parent repo). > > -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: CRLF support bugs (was: Re: .gitattributes glob matchingbroken) 2008-11-04 12:37 ` CRLF support bugs (was: Re: .gitattributes glob matchingbroken) Kelly F. Hickel @ 2008-11-05 3:07 ` Jeff King 0 siblings, 0 replies; 9+ messages in thread From: Jeff King @ 2008-11-05 3:07 UTC (permalink / raw) To: Kelly F. Hickel; +Cc: Hannu Koivisto, git On Tue, Nov 04, 2008 at 06:37:27AM -0600, Kelly F. Hickel wrote: > From my point of view, the factoid that a particular file should be > subjected to having its line endings munged is a _project_ decision. > Whether or not to munge them on any given platform is a _local_ > decision. Now that you spell it out, I am reminded that that is the argument I remember having seen in past discussions. And of course, that argues for .gitattributes being carried by the project, but the core.autocrlf _config_ being a local decision. Which, perhaps not coincidentally, is how it is currently implemented. :) -Peff ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-11-05 3:08 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-11-02 16:33 .gitattributes glob matching broken Hannu Koivisto 2008-11-03 9:09 ` Jeff King 2008-11-03 15:05 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Hannu Koivisto 2008-11-03 15:25 ` CRLF support bugs Hannu Koivisto 2008-11-03 16:46 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Dmitry Potapov 2008-11-03 22:24 ` CRLF support bugs Hannu Koivisto 2008-11-04 5:14 ` CRLF support bugs (was: Re: .gitattributes glob matching broken) Jeff King 2008-11-04 12:37 ` CRLF support bugs (was: Re: .gitattributes glob matchingbroken) Kelly F. Hickel 2008-11-05 3:07 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox