* crlf with git-svn driving me nuts...
@ 2008-04-16 19:10 Nigel Magnay
  2008-04-16 20:01 ` Dmitry Potapov
  2008-04-16 20:03 ` Avery Pennarun
  0 siblings, 2 replies; 19+ messages in thread
From: Nigel Magnay @ 2008-04-16 19:10 UTC (permalink / raw)
  To: git

We've got projects with a mixed userbase of windows / *nix; I'm trying
to migrate some users onto git, whilst everyone else stays happy in
their SVN repo.

However, there's one issue that has been driving me slowly insane.
This is best illustrated thusly (on windows) :

$ git init
$ git config core.autocrlf false

-->Create a file with some text content on a few lines
$ notepad file.txt

$ git add file.txt
$ git commit -m "initial checkin"

$ git status
# On branch master
nothing to commit (working directory clean)
--> Yarp, what I wanted

$ git config core.autocrlf true
$ git status

# On branch master
nothing to commit (working directory clean)
--> Yarp, still all good

--> Simulate non-change happened by an editor opening file...
$ touch file.txt
$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

--> Oh Noes! I wonder what it could be
$ git diff file.txt
diff --git a/file.txt b/file.txt
index 7a2051f..31ca3a0 100644
--- a/file.txt
+++ b/file.txt
@@ -1,3 +1,3 @@
-<xml>
- wooot
-</xml>
+<xml>
+ wooot
+</xml>

--> Huh? ...
$ git diff -b file.txt
diff --git a/file.txt b/file.txt
index 7a2051f..31ca3a0 100644

--> Bah... don't care! get me back to the start...
$ git reset --hard
HEAD is now at 4762c31... initial checkin

$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   file.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

--> ARGH!
$ git config core.autocrlf false
$ git status
# On branch master
nothing to commit (working directory clean)

$ git config core.autocrlf true
$ git status
# On branch master
nothing to commit (working directory clean)

--> WtF?

Why does it think in this instance that there is a change? It's CRLF
in the repo, it's CRLF in the working tree, and the checkout in either
mode ought to be identical ??

Now this is further compounded by the fact that users then typically
tend to do a 'CRLF->LF conversion' checkin - *BUT* this will cause
merge conflicts if another user actually made a genuine change (I.E.
the removal of CR and the change are both treated as significant).

Additional fun is caused because some editors 'touch' files that
they actually haven't modified, leading to all these 'null' changes.

This is a bigger deal for us than it ought to be, because we're
pulling changes from a windows-based svn repo, which is always CRLF.
Should I set core.autocrlf=input when doing 'git svn fetch' (and would
it pay any attention)? Also is it possible to tell the diff / merge
machinery that it ought to just ignore text file line endings when
merging ?

Sorry if some of this is stupid-user territory, but there's probably a
few people out there also looking at trying to migrate away from
Windows+SVN that are likely to hit the same things...

^ permalink raw reply related [flat|nested] 19+ messages in thread
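[Editorial sketch] The session above can be modelled in a few lines of Python, assuming a simplified check-in conversion that only replaces CRLF with LF (real Git also decides whether a file is text before converting). The helper names `blob_id` and `clean_autocrlf` are hypothetical; only the blob hashing scheme (SHA-1 over a `blob <size>\0` header plus content) is Git's own. The point is that the file on disk is byte-identical to the stored blob, yet what Git would store on the next `git add` is not.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Git object id of a blob: SHA-1 over 'blob <size>' + NUL + content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def clean_autocrlf(data: bytes) -> bytes:
    """Simplified check-in conversion for core.autocrlf=true (assumption)."""
    return data.replace(b"\r\n", b"\n")


# Committed while core.autocrlf was false, so the CRs went into the repository.
stored = b"<xml>\r\n wooot\r\n</xml>\r\n"
# The file on disk has not been edited at all.
working = stored

# What would be stored if the file were added again with autocrlf=true.
would_store = clean_autocrlf(working)

print(blob_id(stored) == blob_id(working))      # True: bytes on disk match the blob
print(blob_id(stored) == blob_id(would_store))  # False: re-adding yields a new blob
```

Under this model the "modified" status after `touch` is a comparison between the stored blob and the converted working file, not between the stored blob and the raw bytes on disk.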
* Re: crlf with git-svn driving me nuts... 2008-04-16 19:10 crlf with git-svn driving me nuts Nigel Magnay @ 2008-04-16 20:01 ` Dmitry Potapov 2008-04-16 20:20 ` Avery Pennarun 2008-04-16 20:56 ` Martin Langhoff 2008-04-16 20:03 ` Avery Pennarun 1 sibling, 2 replies; 19+ messages in thread From: Dmitry Potapov @ 2008-04-16 20:01 UTC (permalink / raw) To: Nigel Magnay; +Cc: git On Wed, Apr 16, 2008 at 08:10:26PM +0100, Nigel Magnay wrote: > We've got projects with a mixed userbase of windows / *nix; I'm trying > to migrate some users onto git, whilst everyone else stays happy in > their SVN repo. > > However, there's one issue that has been driving me slowly insane. > This is best illustrated thusly (on windows) : > > $ git init > $ git config core.autocrlf false core.autocrlf=false is a bad choice for Windows. > > -->Create a file with some text content on a few lines > $ notepad file.txt > > $ git add file.txt > $ git commit -m "initial checkin" You added a file with the CRLF ending in the repository! You are going to have problems now... > > $ git status > # On branch master > nothing to commit (working directory clean) > --> Yarp, what I wanted > > $ git config core.autocrlf true > $ git status You should not change core.autocrlf during your work, or you are going to have some funny problems. If you really need to change it, it should be followed by "git reset --hard". In this case, you already have a file with the wrong ending, so file.txt will be shown as changed now, because if you commit it again then it will be commited with <LF>, which should have been done in the first place. > > # On branch master > nothing to commit (working directory clean) > --> Yarp, still all good > > --> Simulate non-change happened by an editor opening file... > $ touch file.txt > $ git status > # On branch master > # Changed but not updated: > # (use "git add <file>..." 
to update what will be committed) > # > # modified: file.txt > # > no changes added to commit (use "git add" and/or "git commit -a") > > --> Oh Noes! I wonder what it could be > $ git diff file.txt > diff --git a/file.txt b/file.txt > index 7a2051f..31ca3a0 100644 > --- a/file.txt > +++ b/file.txt > @@ -1,3 +1,3 @@ > -<xml> > - wooot > -</xml> > +<xml> > + wooot > +</xml> > > --> Huh? ... Actually, it is @@ -1,3 +1,3 @@ -<xml>^M - wooot^M -</xml>^M +<xml> + wooot +</xml> where ^M is <CR> > > --> WtF? > > Why does it think in this instance that there is a change? It's CRLF > in the repo, it's CRLF in the working tree, and the checkout in either > mode ought to be identical ?? If you do not want problems, you should use core.autocrlf=true on Windows. Then all text files will be stored in the repository with <LF>, but they will have <CR><LF> in your work tree. Users on *nix should set core.autocrlf=input or false, so they will have <LF> in their work tree. Dmitry ^ permalink raw reply [flat|nested] 19+ messages in thread
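[Editorial sketch] Dmitry's recommendation (autocrlf=true on Windows, input or false on *nix) can be pictured with a small model of the two conversion directions. This is a simplification under stated assumptions: the function names `to_repository` and `to_work_tree` are hypothetical, every file is treated as text, and Git's real conversion code has more special cases.

```python
def to_repository(data: bytes, autocrlf: str) -> bytes:
    """Check-in direction: 'true' and 'input' turn CRLF into LF, 'false' stores as-is."""
    if autocrlf in ("true", "input"):
        return data.replace(b"\r\n", b"\n")
    return data


def to_work_tree(data: bytes, autocrlf: str) -> bytes:
    """Checkout direction: only 'true' turns LF into CRLF."""
    if autocrlf == "true":
        # Normalise first so an existing CRLF does not become CR CR LF.
        return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return data


# A file edited on Windows and committed with autocrlf=true.
edited_on_windows = b"line one\r\nline two\r\n"
in_repo = to_repository(edited_on_windows, "true")

print(in_repo)                         # b'line one\nline two\n'
print(to_work_tree(in_repo, "input"))  # b'line one\nline two\n'
print(to_work_tree(in_repo, "true"))   # b'line one\r\nline two\r\n'
```

With these settings the repository only ever holds LF, and each platform sees its own line ending in the work tree.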
* Re: crlf with git-svn driving me nuts... 2008-04-16 20:01 ` Dmitry Potapov @ 2008-04-16 20:20 ` Avery Pennarun 2008-04-16 20:39 ` Dmitry Potapov 2008-04-16 20:56 ` Martin Langhoff 1 sibling, 1 reply; 19+ messages in thread From: Avery Pennarun @ 2008-04-16 20:20 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Nigel Magnay, git On 4/16/08, Dmitry Potapov <dpotapov@gmail.com> wrote: > In this case, you already have a file with the wrong ending, > so file.txt will be shown as changed now, because if you commit > it again then it will be commited with <LF>, which should have > been done in the first place. [...] > If you do not want problems, you should use core.autocrlf=true > on Windows. Then all text files will be stored in the repository > with <LF>, but they will have <CR><LF> in your work tree. > Users on *nix should set core.autocrlf=input or false, so they > will have <LF> in their work tree. Alas, the subject of this thread involves git-svn, and the typical git-svn user is someone who has no way of rewriting the existing history in their svn repositories. Thus, files *will* be in the repository that have the wrong line endings, and (as you noted) git just gets totally confused in that case. Nigel's example showed a few situations where git *thought* the file had changed when it hadn't, and yet is incapable of checking in the changes. If all I had to do was checkout (thus converting everything to LF), and then "git commit -a" to check in all the corrected files, then git-svn would make one giant, very rude checkin to svn, and my problems would be largely solved. However, this does not seem to be possible due to the problems you noted ("you are going to have problems now"). Have fun, Avery ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: crlf with git-svn driving me nuts... 2008-04-16 20:20 ` Avery Pennarun @ 2008-04-16 20:39 ` Dmitry Potapov 2008-04-16 21:56 ` Nigel Magnay [not found] ` <320075ff0804161447u25dfbb2bmcd36ea507224d835@mail.gmail.com> 0 siblings, 2 replies; 19+ messages in thread From: Dmitry Potapov @ 2008-04-16 20:39 UTC (permalink / raw) To: Avery Pennarun; +Cc: Nigel Magnay, git On Wed, Apr 16, 2008 at 04:20:27PM -0400, Avery Pennarun wrote: > On 4/16/08, Dmitry Potapov <dpotapov@gmail.com> wrote: > > In this case, you already have a file with the wrong ending, > > so file.txt will be shown as changed now, because if you commit > > it again then it will be commited with <LF>, which should have > > been done in the first place. > [...] > > If you do not want problems, you should use core.autocrlf=true > > on Windows. Then all text files will be stored in the repository > > with <LF>, but they will have <CR><LF> in your work tree. > > Users on *nix should set core.autocrlf=input or false, so they > > will have <LF> in their work tree. > > Alas, the subject of this thread involves git-svn, and the typical > git-svn user is someone who has no way of rewriting the existing > history in their svn repositories. Thus, files *will* be in the > repository that have the wrong line endings, and (as you noted) git > just gets totally confused in that case. Actually, what matters in what format files are in _Git_ repository. Maybe, there is a problem with git-svn and how it imports SVN commits to Git, but I have not encountered it. > Nigel's example showed a few situations where git *thought* the file > had changed when it hadn't, and yet is incapable of checking in the > changes. Incapable of checking in? I have not found a single example in his mail where it was impossible. The only quirk with autocrlf is that you need to re-checkout your work tree after changing it. There is no other problems with it as far as I know. Dmitry ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: crlf with git-svn driving me nuts...
From: Nigel Magnay @ 2008-04-16 21:56 UTC
  To: git

> > Nigel's example showed a few situations where git *thought* the file
> > had changed when it hadn't, and yet is incapable of checking in the
> > changes.
>
> Incapable of checking in? I have not found a single example in
> his mail where it was impossible. The only quirk with autocrlf
> is that you need to re-checkout your work tree after changing
> it. There is no other problems with it as far as I know.
>

My (initial) setting of core.autocrlf to false was because that's what
it was on all the windows clients (I know the default has now changed)
and to make it obvious in the later parts of the script that the file
in the repo had a CRLF ending, rather than having been converted to LF.
That's the situation we have, because they've all come from SVN.

The bit I really don't understand is why git thinks a file that has
just been touched has changed when it hasn't, and doing a 'git reset
--hard' actually doesn't help at all (but, bizarrely, git config
core.autocrlf false & git config core.autocrlf true *does* !). The
repo copy is CRLF, the working copy is CRLF, but git thinks it's
changed...
* Re: crlf with git-svn driving me nuts...
From: Nigel Magnay @ 2008-04-16 23:07 UTC
  To: Dmitry Potapov, git

> > The bit I really don't understand is why git thinks a file that has
> > just been touched has changed when it hasn't,
>
> Actually, it did change in the sense that if you try to commit this
> file now into the repository, you will have a different file in Git!
> So, it is more correct to say that Git did not notice this change until
> you touch this file, because this change is indirect (autocrlf causes
> a different interpretation of the file).
>

Okay - at the very least this behaviour is really, really confusing.
And I think there's actually a bug (it should *always* report that the
file is different), not magically after it's been touched.

But fixing that minor bug still leads to badness for the user. Doing
(on a core.autocrlf=true machine) a checkout of any revision
containing a file that is (currently) CRLF in the repository, and your
WC is *immediately* dirty. However technically correct that is, it
doesn't fit most people's user model of an SCM, because they haven't
made any modification. And if one person makes a change along with
their conversion, and the other 'just' does a CRLF->LF conversion,
their revisions will conflict at merge time. Blech. And because the svn
is mastered crlf (well, strictly speaking, it's ignorant of line
endings) this is gonna happen a lot.

Can't git be taught that if the WC is byte-identical to the revision
in the repository (regardless of autocrlf) then that ought not to be
regarded as a change?

Is there a way I can persuade the diff / merge mechanisms to normalise
before they operate? (e.g. if core.autocrlf does lf->crlf/crlf->lf,
then an equivalent that does crlf->lf/crlf->lf before doing the merge
)?

In a perfect world I'd be able to switch all files in the repo to LF,
but that's not going to happen any time soon because of the majority
of developers, still on svn, still on windows.
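[Editorial sketch] The merge problem Nigel describes, and the "normalise before merging" idea, can be illustrated with a deliberately crude file-level three-way merge. This is not how Git merges (it works on hunks), and `normalise` and `trivial_merge` are hypothetical names used only for this illustration.

```python
from typing import Optional


def normalise(data: bytes) -> bytes:
    """Convert CRLF to LF so that line endings do not count as changes."""
    return data.replace(b"\r\n", b"\n")


def trivial_merge(base: bytes, ours: bytes, theirs: bytes) -> Optional[bytes]:
    """File-level three-way merge: merged content, or None on conflict."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    return None


base = b"<xml>\r\n wooot\r\n</xml>\r\n"   # CRLF content imported from svn
ours = b"<xml>\n wooot\n</xml>\n"         # only a CRLF->LF conversion
theirs = b"<xml>\n w00t\n</xml>\n"        # conversion plus a genuine change

# Raw bytes: both sides differ from the base and from each other.
print(trivial_merge(base, ours, theirs))  # None

# Normalised: "ours" equals the base, so "theirs" wins cleanly.
print(trivial_merge(normalise(base), normalise(ours), normalise(theirs)))
```

Once all three versions are normalised, the conversion-only side no longer looks like a change, so only the genuine edit remains to be merged.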
* Re: crlf with git-svn driving me nuts...
From: Dmitry Potapov @ 2008-04-17 0:46 UTC
  To: Nigel Magnay; +Cc: git

On Thu, Apr 17, 2008 at 12:07:27AM +0100, Nigel Magnay wrote:
> > > The bit I really don't understand is why git thinks a file that has
> > > just been touched has changed when it hasn't,
> >
> > Actually, it did change in the sense that if you try to commit this
> > file now into the repository, you will have a different file in Git!
> > So, it is more correct to say that Git did not notice this change until
> > you touch this file, because this change is indirect (autocrlf causes
> > a different interpretation of the file).
> >
>
> Okay - at the very least this behaviour is really, really confusing.
> And I think there's actually a bug (it should *always* report that the
> file is different), not magically after it's been touched.

I don't think there is a simple way to correct that without penalizing
normal use cases. Usually, people do not change autocrlf during their
normal work. Besides, you can have your own input filters and they may
cause the same effect. So, Git works on the assumption that input
filters always produce the same results...

>
> But fixing that minor bug still leads to badness for the user. Doing
> (on a core.autocrlf=true machine) a checkout of any revision
> containing a file that is (currently) CRLF in the repository, and your
> WC is *immediately* dirty. However technically correct that is, it
> doesn't fit most people's user model of an SCM, because they haven't
> made any modification.

IMHO, the only sane way is never to store CRLF in the Git repository.
You can have whatever ending you like in your work tree, but inside
of Git, LF is the actual end-of-line marker.

> And if 1 person makes a change along with their
> conversion, and the other 'just' does a CRLF->LF conversion,

If you imported correctly into Git, it should not have CRLF for text
files. So, there is no conversion that a user does explicitly.

> And because the svn is
> mastered crlf (well, strictly speaking, it's ignorant of line endings)
> this is gonna happen a lot.

Not really. SVN has its own setting for EOL conversion. If you have
'svn:eol-style' set to 'native' for a text file then SVN will
check out that file according to your native EOL (you can specify
your native EOL using the --native-eol option when it is necessary).

> Can't git be taught that if the WC is byte-identical to the revision
> in the repository (regardless of autocrlf) then that ought not to be
> regarded as a change?

Why should it not? If a file is different as far as the Git repository
is concerned, then it *is* a change. Git compares files byte for byte
_after_ applying all specified filters (and you can have your own
filters, not only autocrlf).

> Is there a way I can persuade the diff / merge mechanisms to normalise
> before they operate? (e.g if core.autocrlf does lf->crlf/crlf->lf,
> then an equivalent that does crlf->lf/crlf->lf before doing the merge
> )?

I am not sure if there is a standard option for that, but it is
certainly possible to define your own merge strategy.

>
> In a perfect world I'd be able to switch all files int he repo to LF,
> but that's not going to happen any time soon because of the majority
> of developers, still on svn, still on windows.

Well, I don't see any problem here if everything is configured properly.
How files are stored inside and what you have in your work tree do
not have to be the same. So, storing everything inside with LF is
certainly possible. Actually, I believe it is exactly what CVS does
(unless you added a file with '-kb'), and people use CVS on Windows.
Importing files with CRLF into Git is like adding files as _binary_
in CVS.

Dmitry
* Re: crlf with git-svn driving me nuts... 2008-04-17 0:46 ` Dmitry Potapov @ 2008-04-17 1:44 ` Avery Pennarun 2008-04-17 7:07 ` Nigel Magnay 1 sibling, 0 replies; 19+ messages in thread From: Avery Pennarun @ 2008-04-17 1:44 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Nigel Magnay, git On 4/16/08, Dmitry Potapov <dpotapov@gmail.com> wrote: > On Thu, Apr 17, 2008 at 12:07:27AM +0100, Nigel Magnay wrote: > > Okay - at the very least this behaviour is really, really confusing. > > And I think there's actually a bug (it should *always* report that the > > file is different), not magically after it's been touched. > > I don't think there is a simple way to correct that without penalizing > normal use cases. Usually, people do not change autocrlf during their > normal work. Besides, you can have your own input filters and they may > cause the same effect. So, Git works in the assumption that input filters > always produce the same results... However, it doesn't check that before it marks the file as unmodified right after checkout. That is, the problem is hidden until the file's mtime changes. Is there a way to quickly check that every file in the repo is "sane", ie. the input filter is the proper inverse of the output filter and will put each file back in the repo? This is pretty important for anyone designing any kind of input filter, or bugs will go undetected until some later time when they're confusing. > If you imported correctly in Git, it should not have CRLF for text > files. So, there is no conversion that a user does expliciltly. Can you give a set of steps for how to import "correctly" using git-svn? Remember that a given svn repository might have long ago been configured to store CRLF (actually, to store files without changing their line endings), since that is the svn default. Also remember that the svn:eol-style flag may be set differently on various files in svn, and may have changed in different svn revisions over time. 
Thanks, Avery
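[Editorial sketch] Avery's question, whether one can check that the input filter is the proper inverse of the output filter for every file, amounts to a round-trip test. The sketch below assumes the simplified autocrlf=true conversions used elsewhere in these notes; `clean`, `smudge`, `round_trips` and `unsound_paths` are hypothetical names, and the thread does not say that Git offers such a check.

```python
def clean(data: bytes) -> bytes:
    """Work tree -> repository (simplified autocrlf=true check-in filter)."""
    return data.replace(b"\r\n", b"\n")


def smudge(data: bytes) -> bytes:
    """Repository -> work tree (simplified autocrlf=true checkout filter)."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def round_trips(blob: bytes) -> bool:
    """True if checking out the blob and adding it again reproduces the blob."""
    return clean(smudge(blob)) == blob


def unsound_paths(blobs: dict) -> list:
    """Paths whose stored content would change on a checkout/add round trip."""
    return sorted(path for path, blob in blobs.items() if not round_trips(blob))


# Hypothetical repository content: one LF file, one CRLF file imported from svn.
repo = {
    "README": b"hello\nworld\n",
    "file.txt": b"<xml>\r\n wooot\r\n</xml>\r\n",
}
print(unsound_paths(repo))  # ['file.txt']
```

Any path reported here is one that would show up as modified as soon as Git actually re-reads the working file.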
* Re: crlf with git-svn driving me nuts... 2008-04-17 0:46 ` Dmitry Potapov 2008-04-17 1:44 ` Avery Pennarun @ 2008-04-17 7:07 ` Nigel Magnay 2008-04-17 9:43 ` Dmitry Potapov 1 sibling, 1 reply; 19+ messages in thread From: Nigel Magnay @ 2008-04-17 7:07 UTC (permalink / raw) To: Dmitry Potapov; +Cc: git On Thu, Apr 17, 2008 at 1:46 AM, Dmitry Potapov <dpotapov@gmail.com> wrote: > On Thu, Apr 17, 2008 at 12:07:27AM +0100, Nigel Magnay wrote: > > > > The bit I really don't understand is why git thinks a file that has > > > > just been touched has chnaged when it hasn't, > > > > > > Actually, it did change in the sense that if you try to commit this > > > file now into the repository, you will have a different file in Git! > > > So, it is more correct to say that Git did not notice this change until > > > you touch this file, because this change is indirect (autocrlf causes > > > a different interpretation of the file). > > > > > > > Okay - at the very least this behaviour is really, really confusing. > > And I think there's actually a bug (it should *always* report that the > > file is different), not magically after it's been touched. > > I don't think there is a simple way to correct that without penalizing > normal use cases. Usually, people do not change autocrlf during their > normal work. Besides, you can have your own input filters and they may > cause the same effect. So, Git works in the assumption that input filters > always produce the same results... This has nothing to do with changing core.autocrlf after checkout - it's a problem with *any* repo with CRLF files, being checked out on a core.autocrlf=true machine, which basically is any windows machine. 
The current 'isDirty' check seems to be something like isDirty = ( wc.file.mtime > someValue ) && ( repository.file != filter(wc.file) ) I'm saying it ought to be something like isDirty = ( wc.file.mtime > someValue ) && (sha1(repository.file) != sha1(wc.file) ) && ( repository.file != filter(wc.file) ) > > > > > > But fixing that minor bug still leads to badness for the user. Doing > > (on a core.autocrlf=true machine) a checkout of any revision > > containing a file that is (currently) CRLF in the repository, and your > > WC is *immediately* dirty. However technically correct that is, it > > doesn't fit most people's user model of an SCM, because they haven't > > made any modification. > > IMHO, the only sane way is never store CRLF in the Git repository. > You can have whatever ending you like in your work tree, but inside > of Git, LF is the actually marker of the end-of-line. > Great. I'll go and argue with the team using svn, who don't even *notice* this issue, and try to get them to adjust the metadata on every single file in the repository. Then, for a bonus, I'll try the same with every OSS project that I'm tracking with git-svn. :-( I get that things are horribly broken if you get CRLF in your repository. But it's unreasonable to expect the ability to bend the rest of the world to what's convenient for me! Some of our windows coders probably even *like* svn:eol-style=CRLF ! > > > And if 1 person makes a change along with their > > conversion, and the other 'just' does a CRLF->LF conversion, > > If you imported correctly in Git, it should not have CRLF for text > files. So, there is no conversion that a user does expliciltly. > > > > And because the svn is > > mastered crlf (well, strictly speaking, it's ignorant of line endings) > > this is gonna happen a lot. > > Not really. SVN has its own setting for EOL conversion. 
If you have > 'svn:eol-style' set to 'native' for any text file then SVN will > checkout text files accordingly to your native EOL (you can specify > your native EOL using the --native-eol option when it is necessary). > Can I set this personally, without affecting the svn repo? If so, why isn't git-svn doing this anyway, and can I tell it to do so? > > > Can't git be taught that if the WC is byte-identical to the revision > > in the repository (regardless of autocrlf) then that ought not to be > > regarded as a change? > > Why should not it? If a file is different as long as Git repository is > concern then then it *is* a change. Git binary compare files _after_ > applying all specified filters (and you can have your own filters, not > only autocrlf). > See above. Unchanged (on disk, byte identical) files, if touched, get (sometimes) marked as dirty. > > > Is there a way I can persuade the diff / merge mechanisms to normalise > > before they operate? (e.g if core.autocrlf does lf->crlf/crlf->lf, > > then an equivalent that does crlf->lf/crlf->lf before doing the merge > > )? > > I am not sure if there is a standard option for that, but it is > certainly possible to define your own merge strategy. > Ok - I'll have a look into this - just a filter on each file before merging would be sufficient. Presumably people that do things like $Id$ expansion need something similar to avoid constant merge conflicts.. > > > > > In a perfect world I'd be able to switch all files int he repo to LF, > > but that's not going to happen any time soon because of the majority > > of developers, still on svn, still on windows. > > Well, I don't see any problem here if everything is configured properly. > How files are stored inside and what you have in your work tree does > not have to be the same. So, storing everything inside with LF is > certainly possible. Actually, I believe it is exactly what CVS does > (unless you added a file with '-kb'), and people use CVS on Windows. 
> Importing files with CRLF in Git, it is like putting files as _binary_ > in CVS. > > Dmitry > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: crlf with git-svn driving me nuts... 2008-04-17 7:07 ` Nigel Magnay @ 2008-04-17 9:43 ` Dmitry Potapov 2008-04-17 10:09 ` Nigel Magnay 0 siblings, 1 reply; 19+ messages in thread From: Dmitry Potapov @ 2008-04-17 9:43 UTC (permalink / raw) To: Nigel Magnay; +Cc: git On Thu, Apr 17, 2008 at 08:07:27AM +0100, Nigel Magnay wrote: > > This has nothing to do with changing core.autocrlf after checkout - > it's a problem with *any* repo with CRLF files, being checked out on a > core.autocrlf=true machine, which basically is any windows machine. > > The current 'isDirty' check seems to be something like > > isDirty = ( wc.file.mtime > someValue ) && ( repository.file != > filter(wc.file) ) Basically, yes. > > I'm saying it ought to be something like > > isDirty = ( wc.file.mtime > someValue ) && (sha1(repository.file) != > sha1(wc.file) ) && ( repository.file != filter(wc.file) ) I don't think it is reasonable. Files inside of the repository and in the work are not meant to be the same. What if I have $Id$ expansion or something else. What could make sense is to add an additional check: && convert_to_work_tree(repository.file) != wc.file but it should be optional, so it will not penalize those who do need or do not want this extra check. > > > > > > But fixing that minor bug still leads to badness for the user. Doing > > > (on a core.autocrlf=true machine) a checkout of any revision > > > containing a file that is (currently) CRLF in the repository, and your > > > WC is *immediately* dirty. However technically correct that is, it > > > doesn't fit most people's user model of an SCM, because they haven't > > > made any modification. > > > > IMHO, the only sane way is never store CRLF in the Git repository. > > You can have whatever ending you like in your work tree, but inside > > of Git, LF is the actually marker of the end-of-line. > > > > Great. 
I'll go and argue with the team using svn, who don't even > *notice* this issue, and try to get them to adjust the metadata on > every single file in the repository. Maybe, you can teach git-svn to be smarter... I mean storing text files in Git repo with CRLF is stupid, so, perhaps, git-svn can do a better job converting CRLF<->LF when it exports and imports from/to SVN. > > Then, for a bonus, I'll try the same with every OSS project that I'm > tracking with git-svn. :-( > > I get that things are horribly broken if you get CRLF in your > repository. But it's unreasonable to expect the ability to bend the > rest of the world to what's convenient for me! Some of our windows > coders probably even *like* svn:eol-style=CRLF ! You can use Git and have CRLF in your work tree. You just need to have autocrlf=true for that. _Inside_ of Git, only LF is the end of line. How you store in SVN, it is a separate issue with git-svn. I guess, git-svn needs improvement in this area... Dmitry ^ permalink raw reply [flat|nested] 19+ messages in thread
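[Editorial sketch] The `isDirty` pseudocode from the thread, together with the optional extra check Dmitry suggests (`convert_to_work_tree(repository.file) != wc.file`), can be written out as follows. This is a model of the discussion, not Git's implementation: the `Entry` class, its field names, and the simplified conversions are all assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    blob: bytes         # content stored in the repository
    work_file: bytes    # content currently on disk
    index_mtime: float  # mtime recorded at checkout ("someValue" in the thread)
    file_mtime: float   # current mtime of the file on disk


def to_repository(data: bytes) -> bytes:
    """Simplified check-in filter for core.autocrlf=true."""
    return data.replace(b"\r\n", b"\n")


def convert_to_work_tree(data: bytes) -> bytes:
    """Simplified checkout filter for core.autocrlf=true."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def is_dirty(e: Entry, extra_check: bool = False) -> bool:
    if e.file_mtime <= e.index_mtime:
        return False  # stat data unchanged, so the content is never examined
    dirty = e.blob != to_repository(e.work_file)
    if extra_check:
        # Optional: a file that is exactly what a checkout would write is clean.
        dirty = dirty and convert_to_work_tree(e.blob) != e.work_file
    return dirty


crlf = b"<xml>\r\n wooot\r\n</xml>\r\n"
fresh = Entry(blob=crlf, work_file=crlf, index_mtime=100.0, file_mtime=100.0)
touched = Entry(blob=crlf, work_file=crlf, index_mtime=100.0, file_mtime=200.0)

print(is_dirty(fresh))                      # False
print(is_dirty(touched))                    # True
print(is_dirty(touched, extra_check=True))  # False
```

The first two results mirror Nigel's session: nothing is reported until `touch` changes the mtime. The third shows how the optional check would suppress the report for an untouched CRLF file, at the cost of an extra conversion per examined file.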
* Re: crlf with git-svn driving me nuts... 2008-04-17 9:43 ` Dmitry Potapov @ 2008-04-17 10:09 ` Nigel Magnay 2008-04-17 18:53 ` Dmitry Potapov 0 siblings, 1 reply; 19+ messages in thread From: Nigel Magnay @ 2008-04-17 10:09 UTC (permalink / raw) To: Dmitry Potapov; +Cc: git > > > > I'm saying it ought to be something like > > > > isDirty = ( wc.file.mtime > someValue ) && (sha1(repository.file) != > > sha1(wc.file) ) && ( repository.file != filter(wc.file) ) > > I don't think it is reasonable. Files inside of the repository and > in the work are not meant to be the same. What if I have $Id$ expansion > or something else. What could make sense is to add an additional check: > && convert_to_work_tree(repository.file) != wc.file > but it should be optional, so it will not penalize those who do need > or do not want this extra check. > Ah, yes - you're right (I was only thinking about check-in filters, not check-out). I agree it ought to be optional; I suggest it ought to be turned on (be default) in the $Id$ expansion and the core.autocrlf=true scenarios (I.E when there's some filter in place). > > > > > ... > Maybe, you can teach git-svn to be smarter... I mean storing text files > in Git repo with CRLF is stupid, so, perhaps, git-svn can do a better > job converting CRLF<->LF when it exports and imports from/to SVN. > Yar - maybe there's some options there. Maybe it isn't so bad - all svn projects probably *ought* to be using eol=native, but it isn't default; so maybe it's just easier to coax those projects into fixing their svn repos (but of course it's not really an issue for them, so it might be a bit of a hard sell). I may add some detail to the wiki docs to point this out - if I'd done it up front to our local projects, my life would be easier! > ... > You can use Git and have CRLF in your work tree. You just need to > have autocrlf=true for that. _Inside_ of Git, only LF is the end > of line. How you store in SVN, it is a separate issue with git-svn. 
> I guess, git-svn needs improvement in this area... > Yes, in the sense that git is primarily a *nix tool, so it treats LF as canon and CRLF as somehow 'stupid' (I.E you could make an equally valid argument for the reverse position, it just depends on your perspective ;-)) ; but then again, it's only an issue because I'm now merging in git *waaay* more often and it's uncovering a problem that might actually be there already (modulo the fact that svn merging may ignore line endings anyway - but I don't know because all merges there seem to inevitably end up in conflicts anyway..). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: crlf with git-svn driving me nuts...
  2008-04-17 10:09 ` Nigel Magnay
@ 2008-04-17 18:53 ` Dmitry Potapov
  2008-04-17 22:03 ` Nigel Magnay
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Potapov @ 2008-04-17 18:53 UTC
  To: Nigel Magnay; +Cc: git

On Thu, Apr 17, 2008 at 11:09:12AM +0100, Nigel Magnay wrote:
>
> Maybe it isn't so bad - all
> svn projects probably *ought* to be using eol=native, but it isn't
> default;

If you want to have native EOL for each platform then you have to do
this conversion, but it should be applied only to text files. So, the
question is how a VCS can know which file is text and which is not.

CVS considers everything that you check in as text by default. If you
want to put in a binary file, you have to use the -kb flag, otherwise
your file may be damaged. People tend to be forgetful and some lose
their data in this way. So the SVN team decided to stay on the safe
side and store everything as is, because if you forget to set
eol=native, you do not lose anything and you can set eol=native later.
Unfortunately, SVN users now forget to set eol=native far too often.
So, IMHO, Git's approach based on a heuristic is much better when most
of the stored files are text.

> so maybe it's just easier to coax those projects into fixing
> their svn repos (but of course it's not really an issue for them, so
> it might be a bit of a hard sell).

If they care about supporting different platforms then it _is_ an issue
for them too. On the other hand, if everyone uses Windows with CRLF,
you can do that with Git too just by setting autocrlf=false.

>
> Yes, in the sense that git is primarily a *nix tool, so it treats LF
> as canon

and perhaps even more important, it is written in C, where LF has
always been considered the EOL since the first Hello-World program was
written in C:

   printf("Hello world!\n");
-----------------------^^

So, naturally LF is considered the EOL inside of Git. Actually, CVS does
so, and even SVN does if you set eol=native.

> and CRLF as somehow 'stupid' (i.e. you could make an equally
> valid argument for the reverse position, it just depends on your
> perspective ;-)) ;

There is no good technical reason to have two symbols as the end-of-line
marker instead of one. Most programs on Windows just remove CR when
reading from a file and then add it back before LF when writing it back.
So, CR is clearly redundant.

Dmitry
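To make the conversion discussed above concrete, here is a minimal Python sketch of the idea: guess whether content is text, store it with LF only, and expand LF back to CRLF on a Windows-style checkout. The function names and the NUL-byte test are assumptions chosen for illustration; this is a deliberate simplification and not Git's actual heuristic or code.

```python
def looks_like_text(data: bytes, sample_size: int = 8000) -> bool:
    """Crude text/binary guess: a NUL byte in the leading sample means binary.

    Illustrative simplification only; a real VCS heuristic is more involved.
    """
    return b"\0" not in data[:sample_size]


def to_repository(data: bytes) -> bytes:
    """Check-in direction: store text with LF only, leave binaries untouched."""
    if not looks_like_text(data):
        return data
    return data.replace(b"\r\n", b"\n")


def to_working_tree(data: bytes) -> bytes:
    """Checkout direction on a CRLF platform: expand LF to CRLF for text."""
    if not looks_like_text(data):
        return data
    # Collapse CRLF first so an existing CRLF does not become CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

With this model, a file that is pure CRLF in the working tree and pure LF in the repository converts cleanly in both directions, which is the situation the conversion is designed for.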
* Re: crlf with git-svn driving me nuts...
  2008-04-17 18:53 ` Dmitry Potapov
@ 2008-04-17 22:03 ` Nigel Magnay
  2008-04-17 22:42 ` Dmitry Potapov
  0 siblings, 1 reply; 19+ messages in thread
From: Nigel Magnay @ 2008-04-17 22:03 UTC
  To: Dmitry Potapov; +Cc: git

> lose anything and you can set eol=native later. Unfortunately, SVN
> users now forget to set eol=native far too often. So, IMHO, Git's
> approach based on a heuristic is much better when most of the stored
> files are text.
>

I agree - since the forgetful users include us!

>
> > so maybe it's just easier to coax those projects into fixing
> > their svn repos (but of course it's not really an issue for them, so
> > it might be a bit of a hard sell).
>
> If they care about supporting different platforms then it _is_ an issue
> for them too. On the other hand, if everyone uses Windows with CRLF,
> you can do that with Git too just by setting autocrlf=false.
>

Actually it seems to be less of an everyday issue - but I think it's
because the diff tools used by programs downstream are probably
stripping CRs anyway before presenting diffs, so it all 'appears' to
be right. Certainly I've been sharing via an svn repo through Eclipse
with Windows users for ages without it being a problem.

Either way, the touched/untouched-files problem caused most of my
confusion, as I wasn't expecting to find a bug and was assuming I was
doing something wrong...

> >
> > Yes, in the sense that git is primarily a *nix tool, so it treats LF
> > as canon
>
> and perhaps even more important, it is written in C, where LF has
> always been considered the EOL since the first Hello-World program was
> written in C:
>
>    printf("Hello world!\n");
> -----------------------^^
>
> So, naturally LF is considered the EOL inside of Git. Actually, CVS does
> so, and even SVN does if you set eol=native.
>
> > and CRLF as somehow 'stupid' (i.e. you could make an equally
> > valid argument for the reverse position, it just depends on your
> > perspective ;-)) ;
>
> There is no good technical reason to have two symbols as the end-of-line
> marker instead of one. Most programs on Windows just remove CR when
> reading from a file and then add it back before LF when writing it back.
> So, CR is clearly redundant.
>

Well.... Newline = LF vs CRLF (vs CR for early Mac.. erk) dates to
well before C and UNIX; back to the days of Baudot codes and
teletype printers that couldn't physically newline in the time taken
for 1 character to be processed. LF is meant to mean "Line Feed" and CR
is meant to mean "Carriage Return", so CRLF is in that sense quite
logical. But that's standards committees and backwards compatibility
for you :-/
* Re: crlf with git-svn driving me nuts...
  2008-04-17 22:03 ` Nigel Magnay
@ 2008-04-17 22:42 ` Dmitry Potapov
  0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-04-17 22:42 UTC
  To: Nigel Magnay; +Cc: git

On Thu, Apr 17, 2008 at 11:03:10PM +0100, Nigel Magnay wrote:
>
> Well.... Newline = LF vs CRLF (vs CR for early Mac.. erk) dates to
> well before C and UNIX; back to the days of Baudot codes and
> teletype printers that couldn't physically newline in the time taken
> for 1 character to be processed. LF is meant to mean "Line Feed" and CR
> is meant to mean "Carriage Return", so CRLF is in that sense quite
> logical. But that's standards committees and backwards compatibility
> for you :-/

CRLF is logical from the point of view of teletype printers, but when
we speak about text files it is more logical to consider them as a
list of lines. Which particular symbol is used as the line separator
does not really matter, but IMHO it is stupid to have two symbols for
that. So, LF vs CR is a matter of preference, but CRLF is just stupid ;-)

Dmitry
* Re: crlf with git-svn driving me nuts...
  2008-04-16 23:07 ` Nigel Magnay
  2008-04-17 0:46 ` Dmitry Potapov
@ 2008-04-17 5:43 ` Steffen Prohaska
  1 sibling, 0 replies; 19+ messages in thread
From: Steffen Prohaska @ 2008-04-17 5:43 UTC
  To: Nigel Magnay; +Cc: Dmitry Potapov, git

On Apr 17, 2008, at 1:07 AM, Nigel Magnay wrote:

> In a perfect world I'd be able to switch all files in the repo to LF,
> but that's not going to happen any time soon because of the majority
> of developers, still on svn, still on windows.

If you want Git's autocrlf to convert to the native line endings
on Windows and Unix, you need to convert everything to LF in the
repo. This is what we did and now everything runs smoothly.

I have no recommendation, though, on how to use svn and git together.
I do not use git-svn.

Steffen
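As a rough illustration of the "convert everything to LF" step mentioned above, the following Python sketch rewrites CRLF to LF in every file under a directory that does not look binary. The function name, the NUL-byte binary test, and the choice to skip the .git directory are all assumptions made for this example; it is not the procedure Steffen's team used, which the thread does not describe.

```python
from pathlib import Path


def normalize_tree_to_lf(root: Path) -> list[Path]:
    """Rewrite CRLF as LF in text-looking files under root.

    Returns the files that were changed, in sorted order.
    """
    changed = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the repository metadata.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        data = path.read_bytes()
        # Hypothetical binary test: a NUL byte near the start means "leave alone".
        if b"\0" in data[:8000]:
            continue
        converted = data.replace(b"\r\n", b"\n")
        if converted != data:
            path.write_bytes(converted)
            changed.append(path)
    return changed
```

Running it a second time on the same tree should report no changes, which is a quick way to confirm that the tree is already normalized before committing the result.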
* Re: crlf with git-svn driving me nuts...
  2008-04-16 20:01 ` Dmitry Potapov
  2008-04-16 20:20 ` Avery Pennarun
@ 2008-04-16 20:56 ` Martin Langhoff
  2008-04-16 21:02 ` Avery Pennarun
  2008-04-16 21:17 ` Dmitry Potapov
  1 sibling, 2 replies; 19+ messages in thread
From: Martin Langhoff @ 2008-04-16 20:56 UTC
  To: Dmitry Potapov; +Cc: Nigel Magnay, git

On Wed, Apr 16, 2008 at 3:01 PM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> core.autocrlf=false is a bad choice for Windows.
...
> If you do not want problems, you should use core.autocrlf=true
> on Windows.

If you are making the above statements about git in general, I
disagree. I have used msysgit a lot with unix-newlines projects, and
it works fantastically. I am careful to work with newline-smart editors,
but any half-decent editor will cope. The general hint is: avoid any
content-mangling options if possible, and git will do the right thing.

OTOH, you might be referring to git-svn on Windows, which I have no
experience with :-)

cheers,

martin
--
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
* Re: crlf with git-svn driving me nuts...
  2008-04-16 20:56 ` Martin Langhoff
@ 2008-04-16 21:02 ` Avery Pennarun
  2008-04-16 21:17 ` Dmitry Potapov
  1 sibling, 0 replies; 19+ messages in thread
From: Avery Pennarun @ 2008-04-16 21:02 UTC
  To: Martin Langhoff; +Cc: Dmitry Potapov, Nigel Magnay, git

On 4/16/08, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> On Wed, Apr 16, 2008 at 3:01 PM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> > If you do not want problems, you should use core.autocrlf=true
> > on Windows.
>
> If you are making the above statements about git in general, I
> disagree. I have used msysgit a lot with unix-newlines projects, and
> it works fantastically. I am careful to work with newline-smart editors,
> but any half-decent editor will cope. The general hint is: avoid any
> content-mangling options if possible, and git will do the right thing.

Various Windows IDEs (notably Delphi... and notepad :)) get confused
by non-CRLF files and either do random things to the file, fail to
compile, or "helpfully" change all the line endings back to CRLF.

I agree that any program that does any such thing is braindead, but
unfortunately, some people are stuck with such programs.

Avery
* Re: crlf with git-svn driving me nuts...
  2008-04-16 20:56 ` Martin Langhoff
  2008-04-16 21:02 ` Avery Pennarun
@ 2008-04-16 21:17 ` Dmitry Potapov
  1 sibling, 0 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-04-16 21:17 UTC
  To: Martin Langhoff; +Cc: Nigel Magnay, git

On Wed, Apr 16, 2008 at 03:56:18PM -0500, Martin Langhoff wrote:
> On Wed, Apr 16, 2008 at 3:01 PM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> > core.autocrlf=false is a bad choice for Windows.
> ...
> > If you do not want problems, you should use core.autocrlf=true
> > on Windows.
>
> If you are making the above statements about git in general, I
> disagree.

I stand corrected. It should be either core.autocrlf=true if you like
DOS line endings or core.autocrlf=input if you prefer unix newlines. In
both cases, your Git repository will have only LF, which is the Right
Thing.

The only argument for core.autocrlf=false was that the automatic
heuristic may incorrectly detect some binary file as text, and then your
file will be corrupted. So, the core.safecrlf option was introduced to
warn the user if an irreversible change happens. In fact, there are two
possible causes of an irreversible change: mixed line endings in a text
file (in this case normalization is desirable, so the warning can be
ignored), or (very unlikely) Git incorrectly detecting your binary file
as text. Then you need to use attributes to tell Git that this file is
binary.

I have not used git-svn on Windows for some time now, because now I
have a mirror running on Linux, so I clone directly from it.

Dmitry
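The notion of an "irreversible change" described above can be sketched as a round-trip test: convert as a check-in would, convert back as a checkout would, and compare with the original bytes. The Python below is only a model of that idea under the two autocrlf settings named in the message; the function name is hypothetical and this is not how Git implements core.safecrlf.

```python
def round_trip_is_lossless(data: bytes, autocrlf: str) -> bool:
    """Return True if check-in followed by checkout reproduces data exactly.

    autocrlf is "true" (CRLF -> LF on check-in, LF -> CRLF on checkout)
    or "input" (CRLF -> LF on check-in, no conversion on checkout).
    """
    stored = data.replace(b"\r\n", b"\n")
    if autocrlf == "true":
        restored = stored.replace(b"\n", b"\r\n")
    elif autocrlf == "input":
        restored = stored
    else:
        raise ValueError("autocrlf must be 'true' or 'input'")
    return restored == data
```

Under this model a file with mixed line endings fails the check, and so does a binary file that happens to contain both CRLF and a lone LF, such as one beginning with the PNG signature bytes. Those are the two cases the message distinguishes: the first is harmless normalization, the second is corruption.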
* Re: crlf with git-svn driving me nuts...
  2008-04-16 19:10 crlf with git-svn driving me nuts Nigel Magnay
  2008-04-16 20:01 ` Dmitry Potapov
@ 2008-04-16 20:03 ` Avery Pennarun
  1 sibling, 0 replies; 19+ messages in thread
From: Avery Pennarun @ 2008-04-16 20:03 UTC
  To: Nigel Magnay; +Cc: git

On 4/16/08, Nigel Magnay <nigel.magnay@gmail.com> wrote:
> Why does it think in this instance that there is a change? It's CRLF
> in the repo, it's CRLF in the working tree, and the checkout in either
> mode ought to be identical ??

We got quite confused by this here too. I'm pretty sure git's autocrlf
feature is buggy, as you've noticed. Combined with that, svn has its
*own* kind of autocrlf feature (svn:eol-style property on each file)
that acts completely differently.

As an added bonus, I don't know if you've run into this yet, but
cygwin's "patch" command seems to unconditionally strip CR from
patches *before* trying to apply them at all, *even if* the target
file is CRLF, so patches just never apply to CRLF files ever. Ha ha!

I managed to make the two systems stop stomping on each other, in our
case, by using svn:eol-style of "native" (which means when git-svn
checks out the file, it gets only LF, since it seems to always claim
to be Unix) and not using git's autocrlf at all. However, this isn't
optimal since then Windows git users end up with LF instead of CRLF in
their files, which confuses them. On the other hand, the conflicts
and the random-newline-changing diffs go away, as svn fixes things up
at checkin time no matter how badly they got mangled by the windows
user (most commonly, they run a program that resaves the whole file as
CRLF).

Obviously a working git autocrlf feature would be better, but I
haven't looked into it closely enough to say where the problem
actually lies.

Have fun,

Avery