* CRLF problems with Git on Win32 @ 2008-01-07 9:16 Peter Karlsson 2008-01-07 9:57 ` Steffen Prohaska 0 siblings, 1 reply; 113+ messages in thread From: Peter Karlsson @ 2008-01-07 9:16 UTC (permalink / raw) To: Git Mailing List Hi! When I clone git://git.debian.org/git/turqstat/turqstat.git using the msys-Windows version of git (1.5.4-rc2), some but not all the files get autoconverted to CRLF. Is it possible to set properties for the files that are text, to make sure they are converted properly? -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 9:16 CRLF problems with Git on Win32 Peter Karlsson @ 2008-01-07 9:57 ` Steffen Prohaska 2008-01-07 10:00 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-07 9:57 UTC (permalink / raw) To: Peter Karlsson; +Cc: Git Mailing List

On Jan 7, 2008, at 10:16 AM, Peter Karlsson wrote:

> When I clone git://git.debian.org/git/turqstat/turqstat.git using the
> msys-Windows version of git (1.5.4-rc2), some but not all the files get
> autoconverted to CRLF. Is it possible to set properties for the files
> that are text, to make sure they are converted properly?

Per default, CRLF conversion is disabled in msysgit. Git should not convert a single file. Does it really convert some?

You can verify that CRLF conversion is off by running

    git config core.autocrlf

which should just print an empty line.

You can enable automatic conversion for all text files by running

    git config core.autocrlf true

(this can be set on a per-repository basis, or you can set a default for your account by passing the '--global' option).

A difficulty you'll run into is that you need to set "core.autocrlf true" before you check out. But because git clone fuses git init, git fetch, and git checkout into a single operation, you can't use it as is if you want to enable CRLF conversion on a per-repository basis (it works if you set a global default). You can either use

    git clone -n URL   # -n tells clone to stop before checkout
    cd turqstat
    git config core.autocrlf true
    git checkout -b master origin/master

or you can manually do what clone would do for you, i.e.

    mkdir turqstat
    cd turqstat
    git init
    git config core.autocrlf true
    git remote add origin git://git.debian.org/git/turqstat/turqstat.git
    git fetch origin
    git checkout -b master origin/master

(this is what I typically do).

BTW, I think that git clone should be improved to avoid the workaround described above. Maybe it could ask the user if it should set up a specific line ending conversion before checkout. Unfortunately, I have not had time to write a patch yet.

	Steffen

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 9:57 ` Steffen Prohaska @ 2008-01-07 10:00 ` Junio C Hamano 2008-01-07 12:15 ` Steffen Prohaska 2008-01-07 10:12 ` Jeff King 2008-01-07 10:13 ` Peter Klavins 2 siblings, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-07 10:00 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Peter Karlsson, Git Mailing List Steffen Prohaska <prohaska@zib.de> writes: > Per default, CRLF conversion is disabled in msysgit. That's interesting, as core.autocrlf was invented _specifically_ for use on Windows. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 10:00 ` Junio C Hamano @ 2008-01-07 12:15 ` Steffen Prohaska 0 siblings, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-07 12:15 UTC (permalink / raw) To: Junio C Hamano; +Cc: Peter Karlsson, Git Mailing List

On Jan 7, 2008, at 11:00 AM, Junio C Hamano wrote:

> Steffen Prohaska <prohaska@zib.de> writes:
>
>> Per default, CRLF conversion is disabled in msysgit.
>
> That's interesting, as core.autocrlf was invented _specifically_
> for use on Windows.

My take on this is that it was invented for cross-platform projects. But if you have a Windows-only project it does not make sense to convert line endings. Line ending conversion only makes sense if you plan to work on multiple platforms.

The most conservative choice is to leave content unmodified. This is true for Windows, as it is for Unix. Therefore, msysgit does not modify your content unless requested otherwise.

	Steffen

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 9:57 ` Steffen Prohaska 2008-01-07 10:00 ` Junio C Hamano @ 2008-01-07 10:12 ` Jeff King 2008-01-07 18:47 ` Robin Rosenberg 2008-01-07 10:13 ` Peter Klavins 2 siblings, 1 reply; 113+ messages in thread From: Jeff King @ 2008-01-07 10:12 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Peter Karlsson, Git Mailing List On Mon, Jan 07, 2008 at 10:57:52AM +0100, Steffen Prohaska wrote: > or you can manually do what clone would do for you, i.e. > > mkdir turqstat > cd turqstat > git init > git config core.autocrlf true > git remote add origin git://git.debian.org/git/turqstat/turqstat.git > git fetch origin > git checkout -b master origin/master > > (this is what I typically do). > > BTW, I think that git clone should be improved to avoid the > workaround described above. Maybe it could ask the user if it > should set up a specific line ending conversion before checkout. > Unfortunately, I had no time to write a patch, yet. I don't know if there are other options that might impact how clone works, but something like the patch below might make sense. It would allow: git clone -c core.autocrlf=true ... Note that the patch should not be applied; it doesn't handle values with whitespace (and hopefully builtin clone will come soon after v1.5.4, which would make doing it right much simpler). 
---
diff --git a/git-clone.sh b/git-clone.sh
index b4e858c..a002550 100755
--- a/git-clone.sh
+++ b/git-clone.sh
@@ -23,6 +23,7 @@ reference=           reference repository
 o,origin=            use <name> instead of 'origin' to track upstream
 u,upload-pack=       path to git-upload-pack on the remote
 depth=               create a shallow clone of that depth
+c,config=            set a config option of the form key=value
 
 use-separate-remote  compatibility, do not use
 no-separate-remote   compatibility, do not use"
@@ -127,6 +128,7 @@ use_separate_remote=t
 depth=
 no_progress=
 local_explicitly_asked_for=
+config=
 test -t 1 || no_progress=--no-progress
 
 while test $# != 0
@@ -173,6 +175,9 @@ do
 	--depth)
 		shift
 		depth="--depth=$1" ;;
+	-c|--config)
+		shift
+		config="$config $1" ;;
 	--)
 		shift
 		break ;;
@@ -242,6 +247,12 @@ fi &&
 export GIT_DIR &&
 GIT_CONFIG="$GIT_DIR/config" git-init $quiet ${template+"$template"} || usage
 
+for i in $config; do
+	key=`echo $i | cut -d= -f1`
+	value=`echo $i | cut -d= -f2-`
+	git config $key $value
+done
+
 if test -n "$bare"
 then
 	GIT_CONFIG="$GIT_DIR/config" git config core.bare true

^ permalink raw reply related [flat|nested] 113+ messages in thread
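The patch above is flagged by its author as not handling values with whitespace, because it splits the accumulated `$config` string on spaces. A minimal sketch of one way around that, using only POSIX parameter expansion and quoting every expansion, might look like this. The helper names `config_key`, `config_value`, and `apply_config_entries` are hypothetical and not part of git-clone.sh, and collecting the entries as a newline-separated list is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of git-clone.sh.

# Print the part of "key=value" before the first '='.
config_key() {
    printf '%s\n' "${1%%=*}"
}

# Print the part of "key=value" after the first '='.
# The value may itself contain spaces or further '=' characters.
config_value() {
    printf '%s\n' "${1#*=}"
}

# Apply a newline-separated list of key=value entries with "git config".
# Assumption: entries are separated by newlines rather than spaces, so a
# value containing spaces survives as a single argument.
apply_config_entries() {
    printf '%s\n' "$1" | while IFS= read -r entry; do
        test -n "$entry" || continue
        git config "$(config_key "$entry")" "$(config_value "$entry")"
    done
}
```

With this shape, the option parser would append each `-c` argument followed by a newline instead of a space, and call `apply_config_entries "$config"` after `git-init` has run.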
* Re: CRLF problems with Git on Win32 2008-01-07 10:12 ` Jeff King @ 2008-01-07 18:47 ` Robin Rosenberg 2008-01-07 19:16 ` Johannes Schindelin 0 siblings, 1 reply; 113+ messages in thread From: Robin Rosenberg @ 2008-01-07 18:47 UTC (permalink / raw) To: Jeff King; +Cc: Steffen Prohaska, Peter Karlsson, Git Mailing List

On Monday, 7 January 2008, Jeff King wrote:
> On Mon, Jan 07, 2008 at 10:57:52AM +0100, Steffen Prohaska wrote:
>
> I don't know if there are other options that might impact how clone
> works, but something like the patch below might make sense. It would
> allow:
>
>   git clone -c core.autocrlf=true ...

You can also set the option globally. Maybe something for the installer or a first-time wizard.

But I do think git should have this option set right from the beginning. It could print out something to notify the user that some options (and which ones) are not set the same as on Unix.

-- robin

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 18:47 ` Robin Rosenberg @ 2008-01-07 19:16 ` Johannes Schindelin 2008-01-07 21:03 ` Robin Rosenberg 0 siblings, 1 reply; 113+ messages in thread From: Johannes Schindelin @ 2008-01-07 19:16 UTC (permalink / raw) To: Robin Rosenberg Cc: Jeff King, Steffen Prohaska, Peter Karlsson, Git Mailing List

Hi,

On Mon, 7 Jan 2008, Robin Rosenberg wrote:

> On Monday, 7 January 2008, Jeff King wrote:
> > On Mon, Jan 07, 2008 at 10:57:52AM +0100, Steffen Prohaska wrote:
> >
> > I don't know if there are other options that might impact how clone
> > works, but something like the patch below might make sense. It would
> > allow:
> >
> >   git clone -c core.autocrlf=true ...
>
> You can also set the option globally. Maybe something for the installer
> or a first time wizard.

We thought about that, too.

> But I do think git should have this option set right from the beginning.

Problem. There is not a single "right". It really depends on the project.

Ciao,
Dscho

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 19:16 ` Johannes Schindelin @ 2008-01-07 21:03 ` Robin Rosenberg 2008-01-07 21:18 ` Johannes Schindelin ` (2 more replies) 0 siblings, 3 replies; 113+ messages in thread From: Robin Rosenberg @ 2008-01-07 21:03 UTC (permalink / raw) To: Johannes Schindelin Cc: Jeff King, Steffen Prohaska, Peter Karlsson, Git Mailing List

On Monday, 7 January 2008, you wrote:
> Problem. There is not a single "right". It really depends on the
> project.

Indeed, but the most common SCMs detect binary files automatically, either by suffix or content analysis, so I think that is what users expect. It will be right for more projects than the current behaviour.

-- robin

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 21:03 ` Robin Rosenberg @ 2008-01-07 21:18 ` Johannes Schindelin 2008-01-07 21:40 ` Steffen Prohaska 2008-01-07 21:36 ` Linus Torvalds 2008-01-07 21:42 ` Thomas Neumann 2 siblings, 1 reply; 113+ messages in thread From: Johannes Schindelin @ 2008-01-07 21:18 UTC (permalink / raw) To: Robin Rosenberg Cc: Jeff King, Steffen Prohaska, Peter Karlsson, Git Mailing List, msysgit

Hi,

[msysGit Cc'ed, since it is massively concerned by this thread]

On Mon, 7 Jan 2008, Robin Rosenberg wrote:

> On Monday, 7 January 2008, you wrote:
> > Problem. There is not a single "right". It really depends on the
> > project.
>
> Indeed, but the most common SCMs detect binary files automatically,
> either by suffix or content analysis, so I think that is what users
> expect. It will be right for more projects than the current behaviour.

Steffen also fought for turning this on by default, but so far I resisted. For a good reason: the primary user of msysGit for the moment is... msysGit. And this project does not need CR for obvious reasons.

But I imagine that it makes sense for the Git installers. Colour me no-longer-resisting.

Ciao,
Dscho

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 21:18 ` Johannes Schindelin @ 2008-01-07 21:40 ` Steffen Prohaska [not found] ` <3B08AC4C-A807-4155-8AD7-DC6A6D0FE134-wjoc1KHpMeg@public.gmane.org> 0 siblings, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-07 21:40 UTC (permalink / raw) To: Johannes Schindelin Cc: Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit On Jan 7, 2008, at 10:18 PM, Johannes Schindelin wrote: > Hi, > > [msysGit Cc'ed, since it is massively concerned by this thread] > > On Mon, 7 Jan 2008, Robin Rosenberg wrote: > >> måndagen den 7 januari 2008 skrev du: >>> Problem. There is not a single "right". It really depends on the >>> project. >> >> Indeed, but the most common SCM's detect binary files automatically, >> either by suffix or content analysis, so I think that is what user's >> expect. It will be right for more projects than the current >> behaviour. > > Steffen also fought for turning this on by default, but so far I > resisted. > For a good reason: the primary user of msysGit for the moment is... > msysGit. And this project does not need CR for obvious reasons. > > But I imagine that it makes sense for the Git installers. Colour me > no-longer-resisting. Eventually I gave in and even voted for "Git does not modify content unless explicitly requested otherwise". Here's the full discussion: http://code.google.com/p/msysgit/issues/detail?id=21 I believe the main question is which type of projects we would like to support by our default. For real cross-platform projects that will be checked out on Windows and Unix we should choose "core.autocrlf true" as our default. But if our default are native Windows projects that will never be checked out on Unix, then we should not set core.autocrlf by default. I once fought for "real cross-platform", because this is what I need in my daily work. Note, however, that this setting bears the slight chance of git failing to correctly detect a binary file. 
In this case git would corrupt the file. So there is a tiny chance of data loss with "core.autocrlf true". The safest choice is to leave core.autocrlf unset. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <3B08AC4C-A807-4155-8AD7-DC6A6D0FE134-wjoc1KHpMeg@public.gmane.org> @ 2008-01-07 22:06 ` Junio C Hamano [not found] ` <7vzlvhxpda.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-08 17:29 ` J. Bruce Fields 1 sibling, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-07 22:06 UTC (permalink / raw) To: Steffen Prohaska Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit-/JYPxA39Uh5TLH3MbocFFw

Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes:

> I believe the main question is which type of projects we would like
> to support by our default. For real cross-platform projects that will
> be checked out on Windows and Unix we should choose
> "core.autocrlf true" as our default. But if our default are native
> Windows projects that will never be checked out on Unix, then we
> should not set core.autocrlf by default.

If the primary target is native Windows projects that want CRLF in the work tree, you could still set core.autocrlf. Your checkouts will be with CRLF. And someday perhaps somebody may offer to port that to UNIX, and his checkout will be without CR.

So wouldn't the categorization be more like this?

 - "real cross-platform" would want core.autocrlf = true;

 - "native Windows" can work either way;

 - "originated from UNIX" would be helped with core.autocrlf = true;

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <7vzlvhxpda.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-07 22:58 ` Linus Torvalds [not found] ` <alpine.LFD.1.00.0801071457040.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> 2008-01-08 7:02 ` Steffen Prohaska 1 sibling, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-07 22:58 UTC (permalink / raw) To: Junio C Hamano Cc: Steffen Prohaska, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit-/JYPxA39Uh5TLH3MbocFFw

On Mon, 7 Jan 2008, Junio C Hamano wrote:
>
> So wouldn't the categorization be more like this?

Well, one thing we could do is to add a new concept, namely

	core.autocrlf = warn

and make *that* the default.

It would do the check, but not actually convert anything, just warn about it.

Then, it's up to the user to set it explicitly to "true" or "false", unless they just like seeing that warning a million times ;)

That might be acceptable to most people.

		Linus

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LFD.1.00.0801071457040.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> @ 2008-01-07 23:46 ` Gregory Jefferis 2008-01-08 11:09 ` git and unicode Gonzalo Garramuño 2008-01-08 8:55 ` CRLF problems with Git on Win32 Marius Storm-Olsen 1 sibling, 1 reply; 113+ messages in thread From: Gregory Jefferis @ 2008-01-07 23:46 UTC (permalink / raw) To: Linus Torvalds, Junio C Hamano Cc: Steffen Prohaska, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit-/JYPxA39Uh5TLH3MbocFFw On 7/1/08 22:58, "Linus Torvalds" <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote: > Well, one thng we could do is to add a new concept, namely > > core.autocrlf = warn > > and make *that* the default. > > It would do the check, but not actually convert anything, just warn about > it. > I think this is the best option so far. Since this doesn't exist I was just writing a hook for myself which fails rather than warns when CRLFs are detected. Not modifying content by default is a good mantra. But getting a couple of CRLF files into the repository and then noticing n commits down the line and having to run a filter-branch session is a pain. So +1 for the default being a warning (that includes appropriate instructions) when committing a file that looks like text but has CRLF. And having a "fail" option might be nice while you(?)'re at it: core.autocrlf = fail # refuse to commit text files containing CRLF Greg. PS Of course none of this would have helped me with those old mac CR files that got into another repository ... ^ permalink raw reply [flat|nested] 113+ messages in thread
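The hook mentioned in the message above is not shown in the thread. A minimal sketch of what a pre-commit hook with the proposed "fail" behaviour could look like follows. Everything in it is an assumption for illustration: the function names are hypothetical, the NUL-byte test over the first 8000 bytes is only a stand-in for Git's own text detection, and it inspects the working tree copy of each staged file rather than the staged blob itself.

```shell
#!/bin/sh
# Hypothetical pre-commit hook sketch; not the hook from the thread.

# Succeed if the first 8000 bytes of file $1 contain a NUL byte.
# Assumption: a NUL byte is a good-enough hint that the file is binary.
looks_binary() {
    total=$(head -c 8000 "$1" | wc -c)
    without_nul=$(head -c 8000 "$1" | tr -d '\000' | wc -c)
    test "$total" -ne "$without_nul"
}

# Succeed if at least one line of file $1 ends in CR LF.
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# Check every added, copied, or modified staged path.
# Returns non-zero if any text-looking file has CRLF line endings.
# Limitation: paths containing newlines are not handled.
check_staged_files() {
    git diff --cached --name-only --diff-filter=ACM | (
        status=0
        while IFS= read -r path; do
            test -f "$path" || continue
            looks_binary "$path" && continue
            if has_crlf "$path"; then
                echo "error: CRLF line endings in text file: $path" >&2
                status=1
            fi
        done
        exit "$status"
    )
}

# In an actual hook (.git/hooks/pre-commit), finish with:
#   check_staged_files || exit 1
```

Replacing the final `exit 1` with a plain message would give the "warn" behaviour instead of "fail".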
* git and unicode 2008-01-07 23:46 ` Gregory Jefferis @ 2008-01-08 11:09 ` Gonzalo Garramuño 2008-01-08 15:09 ` Remi Vanicat 2008-01-08 20:36 ` Robin Rosenberg 0 siblings, 2 replies; 113+ messages in thread From: Gonzalo Garramuño @ 2008-01-08 11:09 UTC (permalink / raw) To: git Forking a little from the recent CR/LF thread, I was wondering how does git deal with unicode files? Most scripting languages (ruby, python, etc) are now allowing their source code to be written in unicode (UTF-8, usually). Will git incorrectly categorize those source files as "binary"? -- Gonzalo Garramuño ggarra@advancedsl.com.ar AMD4400 - ASUS48N-E GeForce7300GT Xubuntu Gutsy ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: git and unicode 2008-01-08 11:09 ` git and unicode Gonzalo Garramuño @ 2008-01-08 15:09 ` Remi Vanicat 2008-01-08 20:36 ` Robin Rosenberg 1 sibling, 0 replies; 113+ messages in thread From: Remi Vanicat @ 2008-01-08 15:09 UTC (permalink / raw) To: Gonzalo Garramuño; +Cc: git Gonzalo Garramuño <ggarra@advancedsl.com.ar> writes: > Forking a little from the recent CR/LF thread, I was wondering how > does git deal with unicode files? > > Most scripting languages (ruby, python, etc) are now allowing their > source code to be written in unicode (UTF-8, usually). Will git > incorrectly categorize those source files as "binary"? > > > -- > Gonzalo Garramuño > ggarra@advancedsl.com.ar > > AMD4400 - ASUS48N-E > GeForce7300GT > Xubuntu Gutsy -- Rémi Vanicat ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: git and unicode 2008-01-08 11:09 ` git and unicode Gonzalo Garramuño 2008-01-08 15:09 ` Remi Vanicat @ 2008-01-08 20:36 ` Robin Rosenberg 1 sibling, 0 replies; 113+ messages in thread From: Robin Rosenberg @ 2008-01-08 20:36 UTC (permalink / raw) To: Gonzalo Garramuño; +Cc: git

UTF-8: no
UTF-16: yes (actually "most likely" yes)

-- robin

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LFD.1.00.0801071457040.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> 2008-01-07 23:46 ` Gregory Jefferis @ 2008-01-08 8:55 ` Marius Storm-Olsen 1 sibling, 0 replies; 113+ messages in thread From: Marius Storm-Olsen @ 2008-01-08 8:55 UTC (permalink / raw) To: torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b Cc: Junio C Hamano, Steffen Prohaska, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit-/JYPxA39Uh5TLH3MbocFFw

Linus Torvalds said the following on 07.01.2008 23:58:
>
> On Mon, 7 Jan 2008, Junio C Hamano wrote:
>> So wouldn't the categorization be more like this?
>
> Well, one thing we could do is to add a new concept, namely
>
> 	core.autocrlf = warn
>
> and make *that* the default.
>
> It would do the check, but not actually convert anything, just warn about
> it.
>
> Then, it's up to the user to set it explicitly to "true" or "false",
> unless they just like seeing that warning a million times ;)
>
> That might be acceptable to most people.

I actually would want the default to be

    core.autocrlf = windows

Meaning, it would be true on Windows platforms, and warn on all others. That way it would work as expected 90% of the time, namely that files are added to the repo with Unix line endings.

We could then add the following warning when you try to add a CRLF file on non-Windows platforms:

    * CRLF line endings detected for text files:
          <foo>, <bar>, <baz>
      Consider adding the following to a .gitattributes file to
      maintain the CRLF line endings on all platforms:
          <foo> = -crlf
          <bar> = -crlf
          <baz> = -crlf

Maybe then we can lure non-Windows users into adding these required .gitattributes files for files that need to be CRLF on all platforms, instead of shoving the whole burden of maintaining a proper cross-platform repo onto the Windows users alone.

/me braces for the "it's your fault for working on Windows in the first place!" flood ;-)

Of course, setting core.autocrlf = true|false should not show the warning, for users who don't care about repo portability to Windows anyway.

--
.marius

^ permalink raw reply [flat|nested] 113+ messages in thread
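The warning text proposed above writes the hint as `<foo> = -crlf`. In the .gitattributes format as I understand it, each line is a path pattern followed by whitespace-separated attributes, with no `=` between pattern and attribute, so the equivalent entries would read `<foo> -crlf`. A small sketch that prints such lines for a list of paths is given below; the function name is hypothetical, and patterns containing whitespace are not handled.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not part of git.

# Print one .gitattributes line per argument, unsetting the "crlf"
# attribute so that no line-ending conversion is applied to that path.
gitattributes_no_crlf() {
    for path in "$@"; do
        printf '%s -crlf\n' "$path"
    done
}

# Possible use, appending to the repository's attribute file:
#   gitattributes_no_crlf project.dsp project.sln >> .gitattributes
```

The file names in the usage comment are placeholders, not files from the project discussed in the thread.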
* Re: CRLF problems with Git on Win32 [not found] ` <7vzlvhxpda.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-07 22:58 ` Linus Torvalds @ 2008-01-08 7:02 ` Steffen Prohaska 2008-01-08 7:29 ` Junio C Hamano 1 sibling, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 7:02 UTC (permalink / raw) To: Junio C Hamano, Linus Torvalds Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit

On Jan 7, 2008, at 11:06 PM, Junio C Hamano wrote:

> Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes:
>
>> I believe the main question is which type of projects we would like
>> to support by our default. For real cross-platform projects that will
>> be checked out on Windows and Unix we should choose
>> "core.autocrlf true" as our default. But if our default are native
>> Windows projects that will never be checked out on Unix, then we
>> should not set core.autocrlf by default.
>
> If the primary target is native Windows projects that wants CRLF
> in the work tree, you could still set core.autocrlf. Your
> checkouts will be with CRLF. And someday perhaps somebody may
> offer porting that to UNIX and his checkout will be without CR.
>
> So wouldn't the categorization be more like this?
>
> - "real cross-platform" would want core.autocrlf = true;
>
> - "native Windows" can work either way;

But core.autocrlf = true has a slight danger of data corruption. AFAIK, git's binary detection checks the first "few" bytes (with few = 8000). This may be sufficient in most cases, but I have already met a file that was wrongly classified. (A file format that starts with a large ASCII header and has chunks of binary data attached later.)

> - "originated from UNIX" would be helped with core.autocrlf = true;

I'd say "could be helped". For the msysgit development, for example, we do _not_ want to have core.autocrlf = true but prefer to preserve the Unix line endings even when working on Windows. We have only a few Windows-specific files that are committed with CRLF. We _know_ the problem and we explicitly handle it.

I believe it would be best if a line ending policy could be configured for a project. Then the decision could be made once for the project and enforced on all clones. But currently git has no concept for this.

A sound policy for "real cross-platform" is that CRLF must never enter the repository unless git detects a file as binary, or a file is explicitly listed in .gitattributes. It doesn't really matter if Windows users check out files with CRLF or LF. It only matters that they'll never commit a file with CRLF. Note that the same is true for Unix users. People could send code by email or copy source files from Windows to Unix machines. Then CRLF would enter the repo on Unix. So the least that should be set for this type of project on any OS is core.autocrlf = input. On Windows, core.autocrlf = true is probably more natural.

I like Linus' idea of "warn" or Gregory's "fail".

Would "warn/fail" be the default on Unix, too? Then Unix users would also be forced to make an explicit choice. Maybe some day they will want to check out their project on Windows, and they should be prepared now. For typical files, the warning (or error) would never trigger. But maybe one day they copy a file from a Windows machine and forget to run dos2unix. In this case, git would warn them unless they set "core.autocrlf = false".

I'm asking the last question because every Unix developer should think about the option, too. Neither Unix nor Windows is causing the problem alone. It's the combination in a cross-platform project. Git could ensure that any repository is in principle prepared for cross-platform use, unless explicitly told not to do so.

So, would you, as Linux developers, like to have (or accept) "warn/fail" as your default?

This would make things easy for the msysgit project: no Windows-specific configuration; just official git.

	Steffen

^ permalink raw reply [flat|nested] 113+ messages in thread
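The per-platform policy described in the message above (core.autocrlf = input on Unix, core.autocrlf = true on Windows) can be sketched as a small helper that picks the value from the output of `uname -s`. The function name is hypothetical, and the `MINGW*`, `MSYS*`, and `CYGWIN*` prefixes are assumptions about what the Windows ports report; nothing like this exists in git itself.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not part of git or msysgit.

# Map a "uname -s" string to the core.autocrlf value suggested in the
# thread: "true" on Windows ports, "input" everywhere else.
# Assumption: Windows ports report names starting with MINGW, MSYS or CYGWIN.
autocrlf_for_platform() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) echo true ;;
        *) echo input ;;
    esac
}

# Possible use inside a repository:
#   git config core.autocrlf "$(autocrlf_for_platform "$(uname -s)")"
```

Keeping the decision in a function that takes the platform string as an argument makes the mapping easy to check without running it on each system.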
* Re: CRLF problems with Git on Win32 2008-01-08 7:02 ` Steffen Prohaska @ 2008-01-08 7:29 ` Junio C Hamano 2008-01-08 10:08 ` Jeff King 0 siblings, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 7:29 UTC (permalink / raw) To: public-prohaska-wjoc1KHpMeg Cc: Linus Torvalds, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: > But core.autocrlf = true has a slight danger of data corruption. > AFAIK, git's binary detection checks the first "few" bytes (with > few = 8000). This may be sufficient in most case, but I already > met a file that was wrongly classified. (A File format that > starts with a large ASCII header and has chunks of binary data > attached later.) I presume that's where .gitattributes kicks in. > I like Linus' idea of "warn" or Gregory's "fail". Yeah, that feels like a sensible thing to do. > I'm asking the last question because every Unix developer should > think about the option, too. Neither Unix or Windows are causing > the problem alone. That's the logical conclusion. If you are introducing crlf = warn, that means you are declaring that CRLF should be treated as a disease, and that should apply everywhere, not just on Windows (which some people may consider a disease itself, but that is a separate topic). ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 7:29 ` Junio C Hamano @ 2008-01-08 10:08 ` Jeff King 2008-01-08 10:35 ` Junio C Hamano 2008-01-08 12:20 ` Gregory Jefferis 0 siblings, 2 replies; 113+ messages in thread From: Jeff King @ 2008-01-08 10:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Mon, Jan 07, 2008 at 11:29:30PM -0800, Junio C Hamano wrote: > Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: I'm not sure what's causing it, but all of the addresses in your message (including cc headers) got munged. > > I'm asking the last question because every Unix developer should > > think about the option, too. Neither Unix or Windows are causing > > the problem alone. > > That's the logical conclusion. > > If you are introducing crlf = warn, that means you are declaring > that CRLF should be treated as a disease, and that should apply > everywhere, not just on Windows (which some people may consider > a disease itself, but that is a separate topic). It's unclear to me: is such a warning only supposed to happen when we see CRLF _after_ we have determined that a file is not actually binary? Otherwise, it seems like we are punishing people on sane platforms who use binary files (although even with that check, I am slightly uncomfortable given reports of incorrect guessing). -Peff ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 10:08 ` Jeff King @ 2008-01-08 10:35 ` Junio C Hamano 2008-01-08 12:20 ` Gregory Jefferis 1 sibling, 0 replies; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 10:35 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > On Mon, Jan 07, 2008 at 11:29:30PM -0800, Junio C Hamano wrote: > >> Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: > > I'm not sure what's causing it, but all of the addresses in your message > (including cc headers) got munged. I think Steffen's original got munged (I just replied to it) by gmane's mail relaying interface. >> > I'm asking the last question because every Unix developer should >> > think about the option, too. Neither Unix or Windows are causing >> > the problem alone. >> >> That's the logical conclusion. >> >> If you are introducing crlf = warn, that means you are declaring >> that CRLF should be treated as a disease, and that should apply >> everywhere, not just on Windows (which some people may consider >> a disease itself, but that is a separate topic). > > It's unclear to me: is such a warning only supposed to happen when we > see CRLF _after_ we have determined that a file is not actually binary? Oh, I agree. I thought that was what Steffen was proposing. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 10:08 ` Jeff King 2008-01-08 10:35 ` Junio C Hamano @ 2008-01-08 12:20 ` Gregory Jefferis 1 sibling, 0 replies; 113+ messages in thread From: Gregory Jefferis @ 2008-01-08 12:20 UTC (permalink / raw) To: Jeff King, Junio C Hamano; +Cc: git On 8/1/08 10:08, "Jeff King" <peff@peff.net> wrote: >> If you are introducing crlf = warn, that means you are declaring >> that CRLF should be treated as a disease, and that should apply >> everywhere, not just on Windows (which some people may consider >> a disease itself, but that is a separate topic). > > It's unclear to me: is such a warning only supposed to happen when we > see CRLF _after_ we have determined that a file is not actually binary? > Otherwise, it seems like we are punishing people on sane platforms who > use binary files (although even with that check, I am slightly > uncomfortable given reports of incorrect guessing). In the context of EOL style, a warning or error should only be given if we think the file is text. Very occasionally we will be wrong about this, but if the default behaviour is warn then that will just be a minor annoyance. This annoyance can be overcome for a file or file type (with attributes), per project or globally. If the default behaviour were munge (e.g. autocrlf=true) then we could very occasionally damage something, so I think we can all agree that is a bad idea. Greg. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <3B08AC4C-A807-4155-8AD7-DC6A6D0FE134-wjoc1KHpMeg@public.gmane.org> 2008-01-07 22:06 ` Junio C Hamano @ 2008-01-08 17:29 ` J. Bruce Fields [not found] ` <20080108172957.GG22155-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 1 sibling, 1 reply; 113+ messages in thread From: J. Bruce Fields @ 2008-01-08 17:29 UTC (permalink / raw) To: Steffen Prohaska Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit-/JYPxA39Uh5TLH3MbocFFw On Mon, Jan 07, 2008 at 10:40:56PM +0100, Steffen Prohaska wrote: > Eventually I gave in and even voted for "Git does not modify > content unless explicitly requested otherwise". > > Here's the full discussion: > > http://code.google.com/p/msysgit/issues/detail?id=21 > > I believe the main question is which type of projects we would like > to support by our default. For real cross-platform projects that will > be checked out on Windows and Unix we should choose > "core.autocrlf true" as our default. But if our default are native > Windows projects that will never be checked out on Unix, then we > should not set core.autocrlf by default. If the policy really depends on the project, then surely the default behavior should be determined by information carried in the project itself (e.g., the .gitattributes)? For that reason it strikes me as a mistake to ignore the crlf attribute by default (assuming that is indeed the current behavior; apologies for not checking). If crlf is set then I think it should be assumed that crlf conversion should be done unless that has been explicitly turned off somehow. --b. ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <20080108172957.GG22155-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <20080108172957.GG22155-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2008-01-08 17:56 ` Steffen Prohaska 2008-01-08 18:07 ` Junio C Hamano 2008-01-08 18:07 ` Junio C Hamano 0 siblings, 2 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 17:56 UTC (permalink / raw) To: J. Bruce Fields Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysgit-/JYPxA39Uh5TLH3MbocFFw On Jan 8, 2008, at 6:29 PM, J. Bruce Fields wrote: > On Mon, Jan 07, 2008 at 10:40:56PM +0100, Steffen Prohaska wrote: >> Eventually I gave in and even voted for "Git does not modify >> content unless explicitly requested otherwise". >> >> Here's the full discussion: >> >> http://code.google.com/p/msysgit/issues/detail?id=21 >> >> I believe the main question is which type of projects we would like >> to support by our default. For real cross-platform projects that >> will >> be checked out on Windows and Unix we should choose >> "core.autocrlf true" as our default. But if our default are native >> Windows projects that will never be checked out on Unix, then we >> should not set core.autocrlf by default. > > If the policy really depends on the project, then surely the default > behavior should be determined by information carried in the project > itself (e.g., the .gitattributes)? Unfortunately it depends on the project _and_ the platform. A cross-platform project should have core.autocrlf=input on Unix and core.autocrlf=true on Windows. I don't think I can represent this with the current .gitattributes. Do you suggest to add this kind of magic to .gitattributes? Such as to have .gitattributes containing --- SNIP --- * crlf=autonative --- SNIP --- which would tell git to act as if core.autocrlf=input was set on Unix and core.autocrlf=true was set on Windows. 
> For that reason it strikes me as a mistake to ignore the crlf > attribute > by default (assuming that is indeed the current behavior; apologies > for > not checking). If crlf is set then I think it should be assumed that > crlf conversion should be done unless that has been explicitly turned > off somehow. I don't understand this comment. msysgit installs plain git. core.autocrlf is unset. Whatever plain git's default is, this is msysgit's default, too. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
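The `crlf=autonative` idea in the message above amounts to picking a `core.autocrlf` value from the platform. A minimal sketch, assuming the platform is identified by the string `uname -s` prints; `autonative` is only a proposal in this thread and `native_autocrlf` is a made-up name:

```shell
# Hypothetical: map a system name (as printed by `uname -s`) to the
# core.autocrlf value that the proposed "crlf=autonative" would imply.
native_autocrlf() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) echo true ;;   # Windows-like: CRLF in the work tree
        *)                    echo input ;;  # Unix-like: LF, clean up on check-in
    esac
}

# Show what this machine would get.
native_autocrlf "$(uname -s)"
```

The point of the proposal is that the project states the policy once in `.gitattributes` and each clone resolves it locally, instead of every user setting the config by hand.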
* Re: CRLF problems with Git on Win32 2008-01-08 17:56 ` Steffen Prohaska 2008-01-08 18:07 ` Junio C Hamano @ 2008-01-08 18:07 ` Junio C Hamano [not found] ` <7vmyrgry20.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-10 19:58 ` Gregory Jefferis 1 sibling, 2 replies; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 18:07 UTC (permalink / raw) To: public-prohaska-wjoc1KHpMeg Cc: J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, public-msysgit-/JYPxA39Uh5TLH3MbocFFw Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: > msysgit installs plain git. core.autocrlf is unset. Whatever plain > git's default is, this is msysgit's default, too. That sounds like a mistake if you are installing a port to a platform whose native line ending convention is different from where plain git natively runs on (i.e. UNIX). ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <7vmyrgry20.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <7vmyrgry20.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-08 18:58 ` Steffen Prohaska 2008-01-08 19:09 ` J. Bruce Fields ` (2 more replies) 0 siblings, 3 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 18:58 UTC (permalink / raw) To: Junio C Hamano Cc: J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 8, 2008, at 7:07 PM, Junio C Hamano wrote: > > > Steffen Prohaska <prohaska-wjoc1KHpMeg-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: > >> msysgit installs plain git. core.autocrlf is unset. Whatever plain >> git's default is, this is msysgit's default, too. > > That sounds like a mistake if you are installing a port to a > platform whose native line ending convention is different from > where plain git natively runs on (i.e. UNIX). We failed to agree on a better default and as the lengthy discussion documents, the best default isn't obvious. I don't think a solution will be found by declaring one platform native (UNIX) and all other platform non-native. The question to answer is how to support cross-platform projects. A valid solution should never corrupt data unless the user explicitly told git to do so. I don't believe it is a valid solution to set core.autocrlf=true on Windows and tell the users: "Well, in its default settings, git sometimes corrupts your data on Windows. Maybe you want to switch to Linux because this is the native platform where data corruption will never happen." I'd prefer the "warn/fail" proposal. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 18:58 ` Steffen Prohaska @ 2008-01-08 19:09 ` J. Bruce Fields 2008-01-08 19:47 ` Junio C Hamano 2008-01-08 19:59 ` Steffen Prohaska 2008-01-08 20:11 ` Junio C Hamano 2008-01-08 20:50 ` Dmitry Potapov 2 siblings, 2 replies; 113+ messages in thread From: J. Bruce Fields @ 2008-01-08 19:09 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Tue, Jan 08, 2008 at 07:58:57PM +0100, Steffen Prohaska wrote: > > On Jan 8, 2008, at 7:07 PM, Junio C Hamano wrote: > >> >> >> Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: >> >>> msysgit installs plain git. core.autocrlf is unset. Whatever plain >>> git's default is, this is msysgit's default, too. >> >> That sounds like a mistake if you are installing a port to a >> platform whose native line ending convention is different from >> where plain git natively runs on (i.e. UNIX). > > We failed to agree on a better default and as the lengthy > discussion documents, the best default isn't obvious. > > I don't think a solution will be found by declaring one platform > native (UNIX) and all other platform non-native. The question to > answer is how to support cross-platform projects. A valid > solution should never corrupt data unless the user explicitly > told git to do so. My only suggestion is that we consider allowing the user that "explicitly told git to do so" be the project maintainer. So if you echo * autodetectcrlf >.gitattributes git add .gitattributes git commit then users that clone your repo will get that default without having to be told to do something magic on clone. (And ideally I'd've hoped you could do that using the existing crlf attribute rather than having to invent something new, but maybe that doesn't work.) --b. 
> I don't believe it is a valid solution to set > core.autocrlf=true on Windows and tell the users: "Well, in its > default settings, git sometimes corrupts your data on Windows. > Maybe you want to switch to Linux because this is the native > platform where data corruption will never happen." > > I'd prefer the "warn/fail" proposal. > > Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 19:09 ` J. Bruce Fields @ 2008-01-08 19:47 ` Junio C Hamano [not found] ` <7vir24rtfp.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-08 19:59 ` Steffen Prohaska 1 sibling, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 19:47 UTC (permalink / raw) To: J. Bruce Fields Cc: Steffen Prohaska, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit "J. Bruce Fields" <bfields@fieldses.org> writes: > On Tue, Jan 08, 2008 at 07:58:57PM +0100, Steffen Prohaska wrote: >> ... >> I don't think a solution will be found by declaring one platform >> native (UNIX) and all other platform non-native. The question to >> answer is how to support cross-platform projects. A valid >> solution should never corrupt data unless the user explicitly >> told git to do so. > > My only suggestion is that we consider allowing the user that > "explicitly told git to do so" be the project maintainer. So if you > > echo * autodetectcrlf >.gitattributes > git add .gitattributes > git commit > > then users that clone your repo will get that default without having to > be told to do something magic on clone. > > (And ideally I'd've hoped you could do that using the existing crlf > attribute rather than having to invent something new, but maybe that > doesn't work.) I think the project can mark text files as text with attributes and if the port to the platform initialized core.autocrlf appropriately for the platform everything should work as you described. At least that is how I read the description of `crlf` in gitattributes(5). ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <7vir24rtfp.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <7vir24rtfp.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-08 20:02 ` Steffen Prohaska [not found] ` <B655B6FF-9377-434A-A979-2E758771B0FA-wjoc1KHpMeg@public.gmane.org> 2008-01-08 20:41 ` Linus Torvalds 1 sibling, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 20:02 UTC (permalink / raw) To: Junio C Hamano Cc: J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 8, 2008, at 8:47 PM, Junio C Hamano wrote: > > "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> writes: > >> On Tue, Jan 08, 2008 at 07:58:57PM +0100, Steffen Prohaska wrote: >>> ... >>> I don't think a solution will be found by declaring one platform >>> native (UNIX) and all other platform non-native. The question to >>> answer is how to support cross-platform projects. A valid >>> solution should never corrupt data unless the user explicitly >>> told git to do so. >> >> My only suggestion is that we consider allowing the user that >> "explicitly told git to do so" be the project maintainer. So if you >> >> echo * autodetectcrlf >.gitattributes >> git add .gitattributes >> git commit >> >> then users that clone your repo will get that default without >> having to >> be told to do something magic on clone. >> >> (And ideally I'd've hoped you could do that using the existing crlf >> attribute rather than having to invent something new, but maybe that >> doesn't work.) > > I think the project can mark text files as text with attributes > and if the port to the platform initialized core.autocrlf > appropriately for the platform everything should work as you > described. > > At least that is how I read the description of `crlf` in > gitattributes(5). But we do not want to mark a file as text but tell git to run its auto-detection and use the local default line endings. 
But for different projects we do not even want to run the auto-detection, but leave the files as is. See my separate mail that I just sent before I read yours. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <B655B6FF-9377-434A-A979-2E758771B0FA-wjoc1KHpMeg@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <B655B6FF-9377-434A-A979-2E758771B0FA-wjoc1KHpMeg@public.gmane.org> @ 2008-01-08 20:15 ` Junio C Hamano [not found] ` <7v3at8rs4b.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-09 11:03 ` Johannes Schindelin 1 sibling, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 20:15 UTC (permalink / raw) To: Steffen Prohaska Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Steffen Prohaska <prohaska-wjoc1KHpMeg-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: > On Jan 8, 2008, at 8:47 PM, Junio C Hamano wrote: >> >> "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: >> >>> My only suggestion is that we consider allowing the user that >>> "explicitly told git to do so" be the project maintainer. So if you >>> >>> echo * autodetectcrlf >.gitattributes >>> git add .gitattributes >>> git commit >>> >>> then users that clone your repo will get that default without >>> having to >>> be told to do something magic on clone. >>> >>> (And ideally I'd've hoped you could do that using the existing crlf >>> attribute rather than having to invent something new, but maybe that >>> doesn't work.) >> >> I think the project can mark text files as text with attributes >> and if the port to the platform initialized core.autocrlf >> appropriately for the platform everything should work as you >> described. >> >> At least that is how I read the description of `crlf` in >> gitattributes(5). > > > But we do not want to mark a file as text but tell git to run its > auto-detection and use the local default line endings. My reading of the description of `crlf` in gitattributes(5) is: `crlf` ^^^^^^ This attribute controls the line-ending convention. Set:: Setting the `crlf` attribute on a path is meant to mark the path as a "text" file. 
'core.autocrlf' conversion takes place without guessing the content type by inspection. Notice "without guessing". ^ permalink raw reply [flat|nested] 113+ messages in thread
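The interaction being discussed (attribute state versus content guessing) can be written down as a small decision function. This is a sketch of my reading of gitattributes(5), not git's code: only the "Set" case is quoted above, the "unset" and "unspecified" cases are filled in from the same manual page as I understand it, and it is an assumption that nothing is converted while `core.autocrlf` is off. The name `crlf_decision` is made up.

```shell
# Sketch only, under the assumptions stated in the lead-in.
#   $1: state of the crlf attribute: set | unset | unspecified
#   $2: core.autocrlf:               true | input | false
#   $3: result of content guessing:  text | binary
crlf_decision() {
    if [ "$2" = false ]; then
        echo none                 # assumed: no conversion when autocrlf is off
        return
    fi
    case "$1" in
        set)   echo convert ;;    # marked as text: convert "without guessing"
        unset) echo none ;;       # explicitly excluded from conversion
        *)                        # unspecified: fall back to guessing
            if [ "$3" = text ]; then echo convert; else echo none; fi ;;
    esac
}
```

For example, `crlf_decision set true binary` prints `convert`: the guess is ignored once the attribute is set, which is exactly the "without guessing" wording pointed out above.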
[parent not found: <7v3at8rs4b.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <7v3at8rs4b.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-08 20:39 ` Steffen Prohaska 0 siblings, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 20:39 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 8, 2008, at 9:15 PM, Junio C Hamano wrote: > > Steffen Prohaska <prohaska-wjoc1KHpMeg-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: > >> On Jan 8, 2008, at 8:47 PM, Junio C Hamano wrote: >>> >>> "J. Bruce Fields" <bfields- >>> uC3wQj2KruNg9hUCZPvPmw-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: >>> >>>> My only suggestion is that we consider allowing the user that >>>> "explicitly told git to do so" be the project maintainer. So if >>>> you >>>> >>>> echo * autodetectcrlf >.gitattributes >>>> git add .gitattributes >>>> git commit >>>> >>>> then users that clone your repo will get that default without >>>> having to >>>> be told to do something magic on clone. >>>> >>>> (And ideally I'd've hoped you could do that using the existing crlf >>>> attribute rather than having to invent something new, but maybe >>>> that >>>> doesn't work.) >>> >>> I think the project can mark text files as text with attributes >>> and if the port to the platform initialized core.autocrlf >>> appropriately for the platform everything should work as you >>> described. >>> >>> At least that is how I read the description of `crlf` in >>> gitattributes(5). >> >> >> But we do not want to mark a file as text but tell git to run its >> auto-detection and use the local default line endings. > > My reading of the description of `crlf` in gitattributes(5) is: > > `crlf` > ^^^^^^ > > This attribute controls the line-ending convention. > > Set:: > > Setting the `crlf` attribute on a path is meant to mark > the path as a "text" file. 
'core.autocrlf' conversion > takes place without guessing the content type by > inspection. > > > Notice "without guessing". This is exactly the problem. Some projects want guessing. A project needs a way to explicitly tell git that it should guess the file type and, if it finds "text", use the right line endings (that is, the locally preferred endings). If the project has control, the project maintainer is responsible for making the right choice: either he enables automatic detection of "text" files, or he explicitly tells git about the types without guessing. A different project may not want guessing at all, but want to leave all files as is. I believe this should be the default for all projects that do not explicitly choose otherwise. I'm still reluctant to enable guessing as a system-wide default. Someone may just want to use git to manage a few binary files locally on his machine. I'd be unhappy if "guessing" corrupted these files. The project needs control over whether guessing is activated. Right now we have no way for a project to tell git that it should guess, even if the default for other projects is not to guess. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <B655B6FF-9377-434A-A979-2E758771B0FA-wjoc1KHpMeg@public.gmane.org> 2008-01-08 20:15 ` Junio C Hamano @ 2008-01-09 11:03 ` Johannes Schindelin [not found] ` <alpine.LSU.1.00.0801091100401.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> 1 sibling, 1 reply; 113+ messages in thread From: Johannes Schindelin @ 2008-01-09 11:03 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, J. Bruce Fields, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Hi, On Tue, 8 Jan 2008, Steffen Prohaska wrote: > On Jan 8, 2008, at 8:47 PM, Junio C Hamano wrote: > > > I think the project can mark text files as text with attributes and if > > the port to the platform initialized core.autocrlf appropriately for > > the platform everything should work as you described. > > > > At least that is how I read the description of `crlf` in > > gitattributes(5). > > But we do not want to mark a file as text but tell git to run its > auto-detection and use the local default line endings. But for > different projects we do not even want to run the auto-detection, but > leave the files as is. Probably the best thing would be to default to crlf=true, and then have a .gitattributes file like this in your project: -- snip -- *.am -crlf -- snap -- (Did I guess right about the file extension? But why do you want to check in huge 3D stacks? Ah, of course, for test cases.) Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
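The suggestion above (conversion on by default, with an opt-out for one file type) comes down to a one-line attributes file. A minimal sketch that writes it into a scratch directory; the `*.am` pattern is the one from the message, while the scratch directory and the variable `work` are only there for illustration:

```shell
# Work in a scratch directory so nothing real is overwritten.
work=$(mktemp -d)
cd "$work" || exit 1

# Policy from the message above: conversion is on by default
# (via core.autocrlf), but *.am files must never be touched.
cat > .gitattributes <<'EOF'
*.am -crlf
EOF

cat .gitattributes
```

Inside a real repository, `git check-attr crlf -- <path>` reports which state of the attribute applies to a given path, which is a quick way to confirm that the pattern matches what was intended.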
[parent not found: <alpine.LSU.1.00.0801091100401.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LSU.1.00.0801091100401.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> @ 2008-01-09 12:45 ` Steffen Prohaska [not found] ` <019B1C82-27BF-4B6B-981D-5498D31B5DD3-wjoc1KHpMeg@public.gmane.org> 0 siblings, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-09 12:45 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, J. Bruce Fields, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 9, 2008, at 12:03 PM, Johannes Schindelin wrote: > On Tue, 8 Jan 2008, Steffen Prohaska wrote: > >> On Jan 8, 2008, at 8:47 PM, Junio C Hamano wrote: >> >>> I think the project can mark text files as text with attributes >>> and if >>> the port to the platform initialized core.autocrlf appropriately for >>> the platform everything should work as you described. >>> >>> At least that is how I read the description of `crlf` in >>> gitattributes(5). >> >> But we do not want to mark a file as text but tell git to run its >> auto-detection and use the local default line endings. But for >> different projects we do not even want to run the auto-detection, but >> leave the files as is. > > Probably the best thing would be to default to crlf=true, and then > have a > .gitattributes file like this in your project: > > -- snip -- > *.am -crlf > -- snap -- > > (Did I guess right about the file extension? But why do you want to > check > in huge 3D stacks? Ah, of course, for test cases.) Yes, thanks ;) For now, this is the right thing to do. However, our file format and the application do not depend on the extension. As a long-term solution, I'll fix our file format header to include '\0' if the file is binary. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <019B1C82-27BF-4B6B-981D-5498D31B5DD3-wjoc1KHpMeg@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <019B1C82-27BF-4B6B-981D-5498D31B5DD3-wjoc1KHpMeg@public.gmane.org> @ 2008-01-09 13:32 ` Johannes Schindelin 0 siblings, 0 replies; 113+ messages in thread From: Johannes Schindelin @ 2008-01-09 13:32 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, J. Bruce Fields, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Hi, On Wed, 9 Jan 2008, Steffen Prohaska wrote: > On Jan 9, 2008, at 12:03 PM, Johannes Schindelin wrote: > > > -- snip -- > > *.am -crlf > > -- snap -- > > > > (Did I guess right about the file extension? But why do you want to > > check in huge 3D stacks? Ah, of course, for test cases.) > > Yes, thanks ;) > > For now, this is the right thing to do. However, our file format and > the application do not depend on the extension. As a long-term > solution, I'll fix our file format header to include '\0' if the file is > binary. Of course, that would help the binary detection of both subversion and git, but it would break 3rd party readers of the format :-( Thanks for the heads-up, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <7vir24rtfp.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-08 20:02 ` Steffen Prohaska @ 2008-01-08 20:41 ` Linus Torvalds 2008-01-09 8:03 ` Junio C Hamano 1 sibling, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-08 20:41 UTC (permalink / raw) To: Junio C Hamano Cc: J. Bruce Fields, Steffen Prohaska, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Tue, 8 Jan 2008, Junio C Hamano wrote: > > I think the project can mark text files as text with attributes > and if the port to the platform initialized core.autocrlf > appropriately for the platform everything should work as you > described. Yes, I think core.autocrlf should default to "true" on Windows, since that is what it's about. The alternative is to have "fail"/"warn", to just make sure that nobody can do the wrong thing by mistake. We could just do something like this, although that probably does mean that the whole test-suite needs to be double-checked (ie now we really do behave differently on windows outside of any config options!)) People who really dislike it can always do the git config --global core.autocrlf false thing. (And no, I don't know if "#ifdef __WINDOWS__" is the right thing to do, it's almost certainly not. This is just a draft.) 
Linus

---
 environment.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/environment.c b/environment.c
index 18a1c4e..5766bee 100644
--- a/environment.c
+++ b/environment.c
@@ -34,9 +34,23 @@ char *pager_program;
 int pager_use_color = 1;
 char *editor_program;
 char *excludes_file;
-int auto_crlf = 0;	/* 1: both ways, -1: only when adding git objects */
 unsigned whitespace_rule_cfg = WS_DEFAULT_RULE;

+/*
+ * Automatic CRLF conversion on files that look like
+ * text:
+ *   0: none (unix)
+ *   1: convert to LF on check-in and to CRLF on check-out
+ *  -1: only on check-in (check-out with just LF)
+ */
+#ifdef __WINDOWS__
+  #define DEF_AUTOCRLF 1
+#else
+  #define DEF_AUTOCRLF 0
+#endif
+
+int auto_crlf = DEF_AUTOCRLF;
+
 /* This is set by setup_git_dir_gently() and/or git_default_config() */
 char *git_work_tree_cfg;
 static const char *work_tree;

^ permalink raw reply related [flat|nested] 113+ messages in thread
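The three values described in the comment of the draft patch (0, 1, -1) can be illustrated with two stream filters. This is only a sketch of the intended semantics, not how git implements conversion: the function names `to_repo` and `to_worktree` are made up, and unlike real conversion code the check-out filter does not look for lines that already end in CRLF.

```shell
# Sketch only: the mode values follow the comment in the draft patch
# (0: none, 1: both ways, -1: check-in only). Names are hypothetical.
cr=$(printf '\r')

to_repo() {         # check-in direction: filter stdin to stdout
    case "$1" in
        1|-1) sed "s/$cr\$//" ;;   # strip one trailing CR per line
        *)    cat ;;               # mode 0: leave content alone
    esac
}

to_worktree() {     # check-out direction: filter stdin to stdout
    case "$1" in
        1) sed "s/\$/$cr/" ;;      # add a CR before every line feed
        *) cat ;;                  # modes 0 and -1: check out with just LF
    esac
}
```

For instance, `printf 'a\r\nb\r\n' | to_repo -1 | od -c` shows LF-only output, while piping LF-only text through `to_worktree -1` leaves it unchanged, matching the "only on check-in" description.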
* Re: CRLF problems with Git on Win32 2008-01-08 20:41 ` Linus Torvalds @ 2008-01-09 8:03 ` Junio C Hamano [not found] ` <7vd4sbmnmz.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 0 siblings, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-09 8:03 UTC (permalink / raw) To: Linus Torvalds Cc: J. Bruce Fields, Steffen Prohaska, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Linus Torvalds <torvalds@linux-foundation.org> writes: > On Tue, 8 Jan 2008, Junio C Hamano wrote: >> >> I think the project can mark text files as text with attributes >> and if the port to the platform initialized core.autocrlf >> appropriately for the platform everything should work as you >> described. > > Yes, I think core.autocrlf should default to "true" on Windows, since > that is what it's about. The alternative is to have "fail"/"warn", to just > make sure that nobody can do the wrong thing by mistake. > > We could just do something like this, although that probably does mean > that the whole test-suite needs to be double-checked (ie now we really do > behave differently on windows outside of any config options!)) > > People who really dislike it can always do the > > git config --global core.autocrlf false > > thing. > > (And no, I don't know if "#ifdef __WINDOWS__" is the right thing to do, > it's almost certainly not. This is just a draft.) Perhaps we can do something similar to core.filemode? Create a file that we would need to create anyway in "text" mode, and read it back in "binary" mode to see what stdio did? ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <7vd4sbmnmz.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <7vd4sbmnmz.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-09 10:48 ` Johannes Schindelin 2008-01-09 20:25 ` Junio C Hamano [not found] ` <alpine.LSU.1.00.0801091041570.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> 0 siblings, 2 replies; 113+ messages in thread From: Johannes Schindelin @ 2008-01-09 10:48 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Hi, On Wed, 9 Jan 2008, Junio C Hamano wrote: > Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> writes: > > > On Tue, 8 Jan 2008, Junio C Hamano wrote: > >> > >> I think the project can mark text files as text with attributes and > >> if the port to the platform initialized core.autocrlf appropriately > >> for the platform everything should work as you described. > > > > Yes, I think core.autocrlf should default to "true" on Windows, since > > that is what it's about. The alternative is to have "fail"/"warn", to > > just make sure that nobody can do the wrong thing by mistake. > > > > We could just do something like this, although that probably does mean > > that the whole test-suite needs to be double-checked (ie now we really > > do behave differently on windows outside of any config options!)) > > > > People who really dislike it can always do the > > > > git config --global core.autocrlf false > > > > thing. > > > > (And no, I don't know if "#ifdef __WINDOWS__" is the right thing to > > do, it's almost certainly not. This is just a draft.) IMHO this is really not good. Better do it in the global /etc/gitconfig we install _anyway_ (it says core.symlinks=false). > Perhaps we can do something similar to core.filemode? Create a file > that we would need to create anyway in "text" mode, and read it back in > "binary" mode to see what stdio did? The problem is that MinGW behaves sanely, i.e. 
it does not output CRLF but only LF.

Besides, as I stated several times already, there _are_ projects on Windows where you do _not_ want crlf=true:

- Windows is already slow. So slow that it is not even funny. Granted, if you use Windows daily, git on MinGW seems snappy, but if you come from Linux, it is slow as hell. And CRLF conversion does not help that impression at all.

- Some tools ported to Windows from Unix do not like CRs.

- For git itself, I prefer to work without CRLF just because I do not need it.

But maybe I am the minority here, and we really should default to crlf=true on Windows, and provide a way to unset that.

My preference would be to have Peff's -c switch to clone, but _additionally_ a way to force a full re-checkout of files (for example after "git config core.autocrlf false").

Ciao, Dscho

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-09 10:48 ` Johannes Schindelin @ 2008-01-09 20:25 ` Junio C Hamano [not found] ` <7vmyrehhkd.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> [not found] ` <alpine.LSU.1.00.0801092047190. 31053@racer.site> [not found] ` <alpine.LSU.1.00.0801091041570.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> 1 sibling, 2 replies; 113+ messages in thread From: Junio C Hamano @ 2008-01-09 20:25 UTC (permalink / raw) To: Johannes Schindelin Cc: Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > IMHO this is really not good. Better do it in the global /etc/gitconfig > we install _anyway_ (it says core.symlinks=false). That sounds like a perfect place for a per-platform tweak like this than in the code, but I wonder if peoples' scripts have valid use case of GIT_CONFIG to bypass it (git-svn's use of the variable to access its private data seems to be Ok). >> Perhaps we can do something similar to core.filemode? Create a file >> that we would need to create anyway in "text" mode, and read it back in >> "binary" mode to see what stdio did? > > The problem is that MinGW behaves sanely, i.e. it does not output CRLF but > only LF. Won't that behaviour be viewed rather as "insanely" from majority of Windows users? > But maybe I am the minority here, and we really should default to > crlf=true on Windows, and provide a way to unset that. > > My preference would be to have Peff's -c switch to clone, but > _additionally_ a way to force a full re-checkout of files (for example > after "git config core.crlf false"). 
I have been hoping for a better way than that "clone -c" solution (simpler to use and, somewhat more importantly, harder to misuse by not being overly flexible), but that is an implementation issue (I think the tweak belongs to init rather than clone anyway, and the point of "-c" is that it is not easy to tweak the way the "init" used by "clone" behaves).

Switching core.autocrlf (or gitattributes to change filter -- in general, "affecting the way convert_to_working_tree() and convert_to_git() work") can be done for two opposite reasons.

(1) The repository is correct and the checkout is wrong. This wants re-checkout.

(2) The repository records a wrong convention by mistake and needs to be fixed. Re-checkout is obviously the wrong thing to do, and re-checkin (not necessarily commit, but updating the index) is necessary.

We need both.

^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <7vmyrehhkd.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <7vmyrehhkd.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-09 20:50 ` Johannes Schindelin 0 siblings, 0 replies; 113+ messages in thread From: Johannes Schindelin @ 2008-01-09 20:50 UTC (permalink / raw) To: Junio C Hamano Cc: Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Hi, On Wed, 9 Jan 2008, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin-Mmb7MZpHnFY@public.gmane.org> writes: > > > Junio wrote: > > > >> Perhaps we can do something similar to core.filemode? Create a file > >> that we would need to create anyway in "text" mode, and read it back > >> in "binary" mode to see what stdio did? > > > > The problem is that MinGW behaves sanely, i.e. it does not output CRLF > > but only LF. > > Won't that behaviour be viewed rather as "insanely" from majority of > Windows users? I think the truth is that CRLF was a mistake. Nobody wants to take the blame for it, obviously, but more and more Windows tools just grok LF-only text. The question is: what to do with those that cannot grok LF-only text. I imagine that the best compromise for now would be to have crlf=true, with poor souls like Steffen having to set the gitattributes accordingly. Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <alpine.LSU.1.00.0801092047190.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LSU.1.00.0801092047190.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> @ 2008-01-09 21:03 ` Steffen Prohaska 0 siblings, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-09 21:03 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Linus Torvalds, J. Bruce Fields, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 9, 2008, at 9:50 PM, Johannes Schindelin wrote: > Hi, > > On Wed, 9 Jan 2008, Junio C Hamano wrote: > >> Johannes Schindelin <Johannes.Schindelin-Mmb7MZpHnFY@public.gmane.org> writes: >> >>> Junio wrote: >>> >>>> Perhaps we can do something similar to core.filemode? Create a >>>> file >>>> that we would need to create anyway in "text" mode, and read it >>>> back >>>> in "binary" mode to see what stdio did? >>> >>> The problem is that MinGW behaves sanely, i.e. it does not output >>> CRLF >>> but only LF. >> >> Won't that behaviour be viewed rather as "insanely" from majority of >> Windows users? > > I think the truth is that CRLF was a mistake. Nobody wants to take > the > blame for it, obviously, but more and more Windows tools just grok > LF-only > text. > > The question is: what to do with those that cannot grok LF-only > text. I > imagine that the best compromise for now would be to have > crlf=true, with > poor souls like Steffen having to set the gitattributes accordingly. I could live with that but unfortunately this alone does not solve all of the real-world problems happening during cross- platform development. At least the problem of code copied from Windows to Unix and committed there should be addressed, too. Maybe the default on Unix should be crlf=input? I'm wondering what Linux developer would say about this? I am against changing the default of msysgit now. First I'd like to wait how the "crlf=safe" discussion evolves. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <alpine.LSU.1.00.0801091041570.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LSU.1.00.0801091041570.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> @ 2008-01-10 9:25 ` Peter Karlsson [not found] ` <Pine.LNX.4.64.0801101023380.11922-Hh8n7enkEC8qi7mQTfpNuw@public.gmane.org> [not found] ` <alpine.LSU.1.00.080110115 5140.31053@racer.site> 0 siblings, 2 replies; 113+ messages in thread From: Peter Karlsson @ 2008-01-10 9:25 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Git Mailing List, msysGit Johannes Schindelin: > The problem is that MinGW behaves sanely, i.e. it does not output > CRLF but only LF. Well, that is broken, since the convention on Windows is to use CRLF. > - Windows is already slow. So slow that it is not even funny. Granted, > if you use Windows daily, git on MinGW seems snappy, but if you come > from Linux, it is slow as hell. True. And I run git a lot on a Novell disk share, which doesn't exactly help improve the speed either :-) > - Some tools ported to Windows from Unix do not like CRs. Those tools are broken, for the same reason as above. Windows has CRLF line endings. Just deal with it. -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <Pine.LNX.4.64.0801101023380.11922-Hh8n7enkEC8qi7mQTfpNuw@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <Pine.LNX.4.64.0801101023380.11922-Hh8n7enkEC8qi7mQTfpNuw@public.gmane.org> @ 2008-01-10 11:57 ` Johannes Schindelin 2008-01-11 3:03 ` Miles Bader 2008-01-11 3:03 ` Miles Bader 0 siblings, 2 replies; 113+ messages in thread From: Johannes Schindelin @ 2008-01-10 11:57 UTC (permalink / raw) To: Peter Karlsson Cc: Junio C Hamano, Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Git Mailing List, msysGit Hi, On Thu, 10 Jan 2008, Peter Karlsson wrote: > Johannes Schindelin: > > > The problem is that MinGW behaves sanely, i.e. it does not output CRLF > > but only LF. > > Well, that is broken, since the convention on Windows is to use CRLF. I cannot help but wonder what exactly you wanted to achieve with this provably bogus statement, other than provoking flames. I hereby refuse to insult you for it. > > - Windows is already slow. So slow that it is not even funny. > > Granted, if you use Windows daily, git on MinGW seems snappy, but if > > you come from Linux, it is slow as hell. > > True. And I run git a lot on a Novell disk share, which doesn't exactly > help improve the speed either :-) Don't do that, then. I mean, git is _distributed_. Not like there is no way out because your "server" is on a disk share. > Windows has CRLF line endings. Just deal with it. No, I will not just deal with it. Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 11:57 ` Johannes Schindelin @ 2008-01-11 3:03 ` Miles Bader 2008-01-11 3:03 ` Miles Bader 1 sibling, 0 replies; 113+ messages in thread From: Miles Bader @ 2008-01-11 3:03 UTC (permalink / raw) To: public-Johannes.Schindelin-Mmb7MZpHnFY Cc: Peter Karlsson, Junio C Hamano, Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Git Mailing List, msysGit Johannes Schindelin <Johannes.Schindelin-Mmb7MZpHnFY@public.gmane.org> writes: >> Windows has CRLF line endings. Just deal with it. > > No, I will not just deal with it. Didn't Apple change their line-ending convention, moving to LF EOL with OSX? -Miles -- "I distrust a research person who is always obviously busy on a task." --Robert Frosch, VP, GM Research ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <alpine.LSU.1.00.0801101155140.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LSU.1.00.0801101155140.31053-OGWIkrnhIhzN0uC3ymp8PA@public.gmane.org> @ 2008-01-10 13:28 ` Peter Karlsson 2008-01-10 14:31 ` Peter Harris 0 siblings, 1 reply; 113+ messages in thread From: Peter Karlsson @ 2008-01-10 13:28 UTC (permalink / raw) To: Johannes Schindelin Cc: Junio C Hamano, Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Git Mailing List, msysGit Johannes Schindelin: > I cannot help but wonder what exactly you wanted to achieve with this > provably bogus statement, other than provoking flames. I hereby > refuse to insult you for it. I meant to say that any software that claims to be Windows software should handle, and produce, CRLF line breaks in text files. Whether it also supports Unix (LF) or old Mac (CR) line breaks is up to it, but if it is a Windows program, it should do CRLF, as that is the convention (inherited from MS-DOS, which inherited it from CP/M). > > True. And I run git a lot on a Novell disk share, which doesn't exactly > > help improve the speed either :-) > Don't do that, then. I have to. Otherwise the compile server can't see the files (this is not for the project mentioned at the start of the thread; this is what I use to work around my employer's choice of version control system, which could be better). > > Windows has CRLF line endings. Just deal with it. > No, I will not just deal with it. Me neither, that is why I expect the software to do it for me. Thinking of text files as a stream of bytes is so 1900s. In the 2000s we should think of text files as a stream of characters. How these characters are represented is up to each system that wants it. I see no problem with storing text files as UTF-32 internally (disk is cheap). -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 13:28 ` Peter Karlsson @ 2008-01-10 14:31 ` Peter Harris [not found] ` <eaa105840801100631p6b95ed86j153d70244d474b03-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 113+ messages in thread From: Peter Harris @ 2008-01-10 14:31 UTC (permalink / raw) To: Peter Karlsson Cc: Johannes Schindelin, Junio C Hamano, Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Git Mailing List, msysGit On Jan 10, 2008 8:28 AM, Peter Karlsson <peter@softwolves.pp.se> wrote: > I meant to say that any software that claims to be Windows software > should handle, and produce, CRLF line breaks in text files. Including zip/unzip? How about tar? rsync? NFS and SMB copies from network shares? I bet the Samba folks would just *love* to have this discussion for the hundredth time. Just because CVS and FTP got the defaults wrong (and modern FTP clients mostly default to automatically switching to binary, so basically just CVS) doesn't mean that Git has to get the default wrong, too. Git *does* handle and produce CRLF line breaks, as long as you tell it to. Please don't lose sight of that fact. I'm just glad that VMS is effectively dead. Line endings on VMS are stored outside the text body, IIRC... Peter Harris ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <eaa105840801100631p6b95ed86j153d70244d474b03-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <eaa105840801100631p6b95ed86j153d70244d474b03-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2008-01-11 13:12 ` Peter Karlsson 2008-01-11 15:39 ` Peter Harris 0 siblings, 1 reply; 113+ messages in thread From: Peter Karlsson @ 2008-01-11 13:12 UTC (permalink / raw) To: Peter Harris Cc: Johannes Schindelin, Junio C Hamano, Linus Torvalds, J. Bruce Fields, Steffen Prohaska, Robin Rosenberg, Jeff King, Git Mailing List, msysGit Peter Harris: > > I meant to say that any software that claims to be Windows software > > should handle, and produce, CRLF line breaks in text files. > Including zip/unzip? Yup (zip -l, unzip -a). > How about tar? rsync? Sure. > NFS and SMB copies from network shares? I'd say that might not be as obvious, but it would be nice to have, yes. A typed network file system that stores text as character streams and binary data as octet streams. -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 13:12 ` Peter Karlsson @ 2008-01-11 15:39 ` Peter Harris 0 siblings, 0 replies; 113+ messages in thread From: Peter Harris @ 2008-01-11 15:39 UTC (permalink / raw) To: Peter Karlsson; +Cc: Git Mailing List On Jan 11, 2008 8:12 AM, Peter Karlsson <peter@softwolves.pp.se> wrote: > > > I meant to say that any software that claims to be Windows software > > > should handle, and produce, CRLF line breaks in text files. > > > Including zip/unzip? > > Yup (zip -l, unzip -a). How is this any different from core.autocrlf? You get CRLF conversion if you ask for it, and not if you don't. Peter Harris ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 19:09 ` J. Bruce Fields 2008-01-08 19:47 ` Junio C Hamano @ 2008-01-08 19:59 ` Steffen Prohaska 1 sibling, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 19:59 UTC (permalink / raw) To: J. Bruce Fields Cc: Junio C Hamano, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 8, 2008, at 8:09 PM, J. Bruce Fields wrote: > On Tue, Jan 08, 2008 at 07:58:57PM +0100, Steffen Prohaska wrote: >> >> On Jan 8, 2008, at 7:07 PM, Junio C Hamano wrote: >> >>> >>> >>> Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: >>> >>>> msysgit installs plain git. core.autocrlf is unset. Whatever >>>> plain >>>> git's default is, this is msysgit's default, too. >>> >>> That sounds like a mistake if you are installing a port to a >>> platform whose native line ending convention is different from >>> where plain git natively runs on (i.e. UNIX). >> >> We failed to agree on a better default and as the lengthy >> discussion documents, the best default isn't obvious. >> >> I don't think a solution will be found by declaring one platform >> native (UNIX) and all other platform non-native. The question to >> answer is how to support cross-platform projects. A valid >> solution should never corrupt data unless the user explicitly >> told git to do so. > > My only suggestion is that we consider allowing the user that > "explicitly told git to do so" be the project maintainer. So if you > > echo * autodetectcrlf >.gitattributes > git add .gitattributes > git commit > > then users that clone your repo will get that default without > having to > be told to do something magic on clone. > > (And ideally I'd've hoped you could do that using the existing crlf > attribute rather than having to invent something new, but maybe that > doesn't work.) I like this idea. 
I think we need the following: - if "autodetectcrlf" is set, git should guarantee that files in the repository will always have LF-only. Otherwise the automatic conversion can't work. - git needs to support a way to select the preferred type of line endings based on the OS. Unix users want to see LF, while Windows users want to see CRLF for the same file. And we could implement it as follows: - We add a configuration variable that sets the preferred autocrlf conversion, for example core.defaultautocrlf. - We add a new string value "defaultauto" for crlf in .gitattributes. crlf=defaultauto is similar to setting crlf to "Unspecified", but also forces git to act as if core.autocrlf is set to $(core.defaultautocrlf). That means if the content looks like text it will be converted according to the local settings. - On Unix, core.defaultautocrlf defaults to "input". - On Windows, core.defaultautocrlf defaults to "crlf". This added layer of indirection gives us what we need: The file .gitattributes tells git to convert line endings; but the details of the conversion depend on the environment (OS or configuration). Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
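The layer of indirection proposed in this message can be sketched in C as follows. This is only an illustration of the proposal, not code from git: `core.defaultautocrlf` and `crlf=defaultauto` are the names suggested in the message, while `autocrlf_mode`, `platform_default_autocrlf()` and `effective_autocrlf()` are hypothetical names, and mapping the proposed Windows default "crlf" onto the existing `true` setting is an assumption.

```c
#include <stddef.h>

/* Hypothetical modes, mirroring core.autocrlf = false / input / true. */
enum autocrlf_mode { AUTOCRLF_FALSE, AUTOCRLF_INPUT, AUTOCRLF_TRUE };

/*
 * Proposed per-platform default for core.defaultautocrlf:
 * "input" on Unix, CRLF conversion on Windows (assumed here to be "true").
 */
static enum autocrlf_mode platform_default_autocrlf(void)
{
#ifdef _WIN32
    return AUTOCRLF_TRUE;
#else
    return AUTOCRLF_INPUT;
#endif
}

/*
 * attr_defaultauto:   non-zero if .gitattributes says crlf=defaultauto
 * core_autocrlf:      the user's ordinary core.autocrlf setting
 * configured_default: explicit core.defaultautocrlf, or NULL if unset
 */
static enum autocrlf_mode effective_autocrlf(int attr_defaultauto,
                                             enum autocrlf_mode core_autocrlf,
                                             const enum autocrlf_mode *configured_default)
{
    if (!attr_defaultauto)
        return core_autocrlf;           /* attribute absent: nothing changes */
    if (configured_default)
        return *configured_default;     /* explicit local configuration wins */
    return platform_default_autocrlf(); /* otherwise follow the OS convention */
}
```

With this shape the maintainer only commits the attribute, and each checkout resolves the actual conversion locally.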
* Re: CRLF problems with Git on Win32 2008-01-08 18:58 ` Steffen Prohaska 2008-01-08 19:09 ` J. Bruce Fields @ 2008-01-08 20:11 ` Junio C Hamano [not found] ` <7vbq7wrsb6.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> 2008-01-08 20:50 ` Dmitry Potapov 2 siblings, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 20:11 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: By the way, is it your setting that mangles people's e-mail addresses? If it is, is it possible for you to configure it to stop doing that? > I don't think a solution will be found by declaring one platform > native (UNIX) and all other platform non-native. The question to > answer is how to support cross-platform projects. A valid > solution should never corrupt data unless the user explicitly > told git to do so. I realize I might have misunderstood your intention. I did not use the word "native" to mean "native vs second class citizen". The "native" in my description should have said "a platform whose native line ending convention happens to match what the project chose to use as its canonical blob object representation". And on such a platform, there is no need to worry about accidental corruption because there is no autocrlf involved. I could have said "lucky platform" instead of "native". I have been assuming that both of us are assuming LF line ending is what a typical cross-platform project would pick as the canonical blob object representation. If you are assuming CRLF line ending as the canonical blob object representation for projects originating from Windows, UNIX is not native (or "lucky") in such a project. Of course, in such a project, setting "core.autocrlf = true" is absolutely a wrong thing to do on Windows. Perhaps you meant that, and I would agree with you. 
In such a project, setting "core.autocrlf = reversed" (which would convert work tree LF line endings to canonical CRLF when creating a blob, but we currently do not support it) on UNIX would become necessary, if you want the resulting mess to work reasonably without configuration. As long as "cross-platform" needs to support checkouts on platforms with different conventions, and if you do not give explicit marking about the binariness, platforms that are not "lucky" needs to guess. The alternative is to check out verbatim and effectively get it all wrong, from the platform convention's point of view. There is no avoiding that. HOWEVER. If you mean by "cross platform project" a project that picks LF line ending as the canonical representation, then you cannot deny the fact that Windows is not "lucky" and without explicit marking about the binariness, it needs to guess. Again, the alternative is to check out verbatim and get it all wrong. ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <7vbq7wrsb6.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <7vbq7wrsb6.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-08 20:20 ` Steffen Prohaska 0 siblings, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-08 20:20 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Jan 8, 2008, at 9:11 PM, Junio C Hamano wrote: > Steffen Prohaska <prohaska-wjoc1KHpMeg-XMD5yJDbdMReXY1tMh2IBg@public.gmane.org> writes: > > By the way, is it your setting that mangles people's > e-mail addresses? If it is, is it possible for you to > configure it to stop doing that? No. I have no idea where these mangled addresses come from. I have never seen such addresses before. I thought I had replaced all the mangled addresses in my last reply. Is this mail wrong, too? Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 18:58 ` Steffen Prohaska 2008-01-08 19:09 ` J. Bruce Fields 2008-01-08 20:11 ` Junio C Hamano @ 2008-01-08 20:50 ` Dmitry Potapov [not found] ` <20080108205054.GN6951-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org> 2 siblings, 1 reply; 113+ messages in thread From: Dmitry Potapov @ 2008-01-08 20:50 UTC (permalink / raw) To: Steffen Prohaska Cc: Junio C Hamano, J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Tue, Jan 08, 2008 at 07:58:57PM +0100, Steffen Prohaska wrote: > > I don't think a solution will be found by declaring one platform > native (UNIX) and all other platform non-native. The question to > answer is how to support cross-platform projects. A valid > solution should never corrupt data unless the user explicitly > told git to do so. I don't believe it is a valid solution to set > core.autocrlf=true on Windows and tell the users: "Well, in its > default settings, git sometimes corrupts your data on Windows. > Maybe you want to switch to Linux because this is the native > platform where data corruption will never happen." Maybe I am wrong but it seems to me that to guarantee that CRLF conversion is reversible (which means that you can always get exactly what you put into the repository), it is enough to check that the conversion is performed only if every LF is preceded by CR. If it is not so, error out and tell the user that the file should be either marked as binary or EOL in the text must be corrected. So, even in those rare cases where the heuristic went wrong, you will not lose your data. Most likely you will get the above error, but even if a binary file is checked in as text, it will affect only cross-platform projects, and it will be easy to correct the situation later by marking this file as binary and checking in again. So, it is an extremely rare event, and no data is lost. 
Perhaps, this option can be called core.autocrlf=safe IMHO, a _text_ file is not just some octets, it consists of lines. Even without CRLF conversion, Git has to be aware of lines to do some basic operations like diff and merge. So, it is natural to store lines in the repository with the same EOL marker regardless of the platform on which the file was created. So, having core.autocrlf=false on Windows is wrong. You may not notice it until you move to another platform, but the whole thing is already broken. It is not about one platform being more native than other. It is like in the C standard, LF is used to denote the end of line, because it is the only sane choice to mark it. Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
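The reversibility condition in this message (convert only if every LF is preceded by CR) can be written as a small standalone check. This is a minimal sketch, assuming a naive conversion that strips the CR from each CRLF on check-in and turns every LF back into CRLF on checkout; `crlf_conversion_is_reversible()` is a hypothetical helper, not a function from git's convert.c.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 when every LF in buf is immediately
 * preceded by a CR. Under that condition, CRLF -> LF on check-in followed
 * by LF -> CRLF on checkout reproduces buf byte for byte.
 */
static int crlf_conversion_is_reversible(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n' && (i == 0 || buf[i - 1] != '\r'))
            return 0; /* naked LF: checkout would turn it into CRLF */
    }
    return 1;
}
```

A caller following the proposal would refuse the file (rather than silently skip conversion) when this returns 0, and ask the user to mark it binary or fix its line endings.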
[parent not found: <20080108205054.GN6951-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <20080108205054.GN6951-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org> @ 2008-01-08 21:15 ` Junio C Hamano 2008-01-08 21:57 ` Robin Rosenberg 2008-01-08 21:31 ` Linus Torvalds 1 sibling, 1 reply; 113+ messages in thread From: Junio C Hamano @ 2008-01-08 21:15 UTC (permalink / raw) To: Dmitry Potapov Cc: Steffen Prohaska, J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit Dmitry Potapov <dpotapov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes: > On Tue, Jan 08, 2008 at 07:58:57PM +0100, Steffen Prohaska wrote: >> >> I don't think a solution will be found by declaring one platform >> native (UNIX) and all other platform non-native. The question to >> answer is how to support cross-platform projects. A valid >> solution should never corrupt data unless the user explicitly >> told git to do so. I don't believe it is a valid solution to set >> core.autocrlf=true on Windows and tell the users: "Well, in its >> default settings, git sometimes corrupts your data on Windows. >> Maybe you want to switch to Linux because this is the native >> platform where data corruption will never happen." > > Maybe I am wrong but it seems to me that to guarantee that > CRLF conversion is reversible (which means that you can > always get exactly what you put into the repository), it is > enough to check that the conversation is performed only if > every LF is preceded by CR. I've heard that before but I seem to recall convert.c already doing something similar if I am not mistaken.

    static int crlf_to_git(const char *path, const char *src, size_t len,
                           struct strbuf *buf, int action)
    {
        struct text_stat stats;
        char *dst;

        if ((action == CRLF_BINARY) || !auto_crlf || !len)
            return 0;

        gather_stats(src, len, &stats);
        /* No CR? Nothing to convert, regardless. */
        if (!stats.cr)
            return 0;

        if (action == CRLF_GUESS) {
            /*
             * We're currently not going to even try to convert stuff
             * that has bare CR characters. Does anybody do that crazy
             * stuff?
             */
            if (stats.cr != stats.crlf)
                return 0;

            /*
             * And add some heuristics for binary vs text, of course...
             */
            if (is_binary(len, &stats))
                return 0;
        }

It counts CR and CRLF and converts only when there are the same number of them. You probably only need to make it also count LF? ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 21:15 ` Junio C Hamano @ 2008-01-08 21:57 ` Robin Rosenberg 0 siblings, 0 replies; 113+ messages in thread From: Robin Rosenberg @ 2008-01-08 21:57 UTC (permalink / raw) To: Junio C Hamano Cc: Dmitry Potapov, Steffen Prohaska, J. Bruce Fields, Johannes Schindelin, Jeff King, Peter Karlsson, Git Mailing List, msysGit tisdagen den 8 januari 2008 skrev Junio C Hamano: > It counts CR and CRLF and converts only when there are the same > number of them. You probably only need to make it also count > LF? Strictly speaking yes, but probably no anyway. Some tools on Windows keep existing line endings for existing lines but add CRLF for new ones. I can only think of one right now, but that's at least one. -- robin ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <20080108205054.GN6951-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org> 2008-01-08 21:15 ` Junio C Hamano @ 2008-01-08 21:31 ` Linus Torvalds [not found] ` <alpine.LFD.1.00.0801081325010.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> 1 sibling, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-08 21:31 UTC (permalink / raw) To: Dmitry Potapov Cc: Steffen Prohaska, Junio C Hamano, J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Tue, 8 Jan 2008, Dmitry Potapov wrote: > > Perhaps, this option can be called core.autocrlf=safe We already do half of that:

    if (action == CRLF_GUESS) {
        /*
         * We're currently not going to even try to convert stuff
         * that has bare CR characters. Does anybody do that crazy
         * stuff?
         */
        if (stats.cr != stats.crlf)
            return 0;

but we don't check that there are no "naked" LF characters. So the only thing you'd need to add is to add a

    /* No naked LF's! */
    if (safecrlf && stats.lf)
        return 0;

to that sequence too, but the thing is, having mixed line-endings isn't actually all that unusual, so I think that kind of "autocrlf=safe" thing is actually almost useless - because when that thing triggers, you almost always *do* want to convert it to be just one way. I've seen it multiple times when people cooperate with windows files with unix tools, where unix editors often preserve old CRLF's, but write new lines with just LF. So "autocrlf=safe" would be trivial to add, but I suspect it would cause more confusion than it would fix. Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <alpine.LFD.1.00.0801081325010.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LFD.1.00.0801081325010.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> @ 2008-01-08 22:09 ` Sean 2008-01-08 22:51 ` Dmitry Potapov 2008-01-09 8:43 ` Abdelrazak Younes 2 siblings, 0 replies; 113+ messages in thread From: Sean @ 2008-01-08 22:09 UTC (permalink / raw) To: Linus Torvalds Cc: Dmitry Potapov, Steffen Prohaska, Junio C Hamano, J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Tue, 8 Jan 2008 13:31:57 -0800 (PST) Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote: > So the only thing you'd need to add is to add a > > /* No naked LF's! */ > if (safecrlf && stats.lf) > return 0; > > to that sequence too, but the thing is, having mixed line-endings isn't > actually all that unusual, so I think that kind of "autocrlf=safe" thing > is actually almost useless - because when that thing triggers, you almost > always *do* want to convert it to be just one way. > > I've seen it multiple times when people cooperate with windows files with > unix tools, where unix editors often preserve old CRLF's, but write new > lines with just LF. > > So "autocrlf=safe" would be trivial to add, but I suspect it would cause > more confusion than it would fix. But isn't the entire point of this exercise to ensure that you will never be in the situation on Linux where you checkout files that have CRLF endings? And conversely that on Windows you will never checkout files that have LF endings? If so, you don't have to worry about your tools creating mixed ending files. The only time the above rules should be broken, is when the user explicitly states that their tools will do-the-right-thing without such help. Sean ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LFD.1.00.0801081325010.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> 2008-01-08 22:09 ` Sean @ 2008-01-08 22:51 ` Dmitry Potapov [not found] ` <20080108225138.GA23240-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org> 2008-01-09 8:43 ` Abdelrazak Younes 2 siblings, 1 reply; 113+ messages in thread From: Dmitry Potapov @ 2008-01-08 22:51 UTC (permalink / raw) To: Linus Torvalds Cc: Steffen Prohaska, Junio C Hamano, J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Tue, Jan 08, 2008 at 01:31:57PM -0800, Linus Torvalds wrote: > > but we don't check that there are no "naked" LF characters. But my idea was about checking for "naked" LF, because if there is at least one naked LF, then you will get a _different_ file than you put into the repository. > > So the only thing you'd need to add is to add a > > /* No naked LF's! */ > if (safecrlf && stats.lf) It seems the check for naked LF should be: if (safecrlf && stats.lf == stats.crlf) > return 0; Unfortunately, you cannot return 0 here, because if there is no CRLF, the opposite conversion cannot tell apart the case when all CRLF were successfully converted to LF from the case when there was no conversion at all. So, the only thing to do here is to die() saying this file should be either marked as binary or EOL in the text must be corrected. > > to that sequence too, but the thing is, having mixed line-endings isn't > actually all that unusual, so I think that kind of "autocrlf=safe" thing > is actually almost useless - because when that thing triggers, you almost > always *do* want to convert it to be just one way. I agree that in most cases, you *do* want to convert, but the idea of the "safe" mode is to protect you from the possibility (however small it is) when you do not want to convert, because it is a _binary_ file, but the is_binary heuristic was wrong. 
> I've seen it multiple times when people cooperate with windows files with > unix tools, where unix editors often preserve old CRLF's, but write new > lines with just LF. > > So "autocrlf=safe" would be trivial to add, but I suspect it would cause > more confusion than it would fix. The idea of "autocrlf=safe" is always to be on the safe side. Those who prefer automatic correction of EOL can use "autocrlf=true". Besides, checking EOL is somewhat similar to checking whitespace. Git allows you to have either --whitespace=error or --whitespace=strip, so it is reasonable to have the same choice about EOL. I may choose either the "safe" mode, which will only error out, or I can have the "true" mode, which corrects EOLs on the fly. Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <20080108225138.GA23240-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org>]
* Re: CRLF problems with Git on Win32 [not found] ` <20080108225138.GA23240-EQL4cN526mwi5CQI31g/s0B+6BGkLq7r@public.gmane.org> @ 2008-01-09 0:01 ` Linus Torvalds 0 siblings, 0 replies; 113+ messages in thread From: Linus Torvalds @ 2008-01-09 0:01 UTC (permalink / raw) To: Dmitry Potapov Cc: Steffen Prohaska, Junio C Hamano, J. Bruce Fields, Johannes Schindelin, Robin Rosenberg, Jeff King, Peter Karlsson, Git Mailing List, msysGit On Wed, 9 Jan 2008, Dmitry Potapov wrote: > > It seems the check for named LF should be: > if (safecrlf && stats.lf == stats.crlf) No. If there was a crlf, then we won't increment the lf count. (Not that I tested it, but that's how it was supposed to work) Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
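The counting rule Linus describes (an LF that belongs to a CRLF pair is not counted as a naked LF) and the proposed "safe" check can be sketched as follows. This is only an illustration under stated assumptions: `struct eol_stats`, `count_eols` and `safe_to_convert` are hypothetical names, not git's actual code, and the real counters in git may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters; the names are illustrative, not git's struct. */
struct eol_stats {
	size_t cr;      /* every CR byte */
	size_t crlf;    /* CR immediately followed by LF */
	size_t lone_lf; /* LF not preceded by CR (a "naked" LF) */
};

static struct eol_stats count_eols(const char *buf, size_t len)
{
	struct eol_stats s = {0, 0, 0};
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (buf[i] == '\r') {
			s.cr++;
			if (i + 1 < len && buf[i + 1] == '\n') {
				s.crlf++;
				i++; /* skip the LF so it is not counted as naked */
			}
		} else if (buf[i] == '\n') {
			s.lone_lf++;
		}
	}
	return s;
}

/*
 * Returns 1 if the buffer uses CRLF consistently, so that converting
 * CRLF to LF on commit and LF to CRLF on checkout gives the same bytes.
 */
static int safe_to_convert(const char *buf, size_t len)
{
	struct eol_stats s = count_eols(buf, len);

	if (s.cr != s.crlf) /* bare CR: the quoted check declines to convert */
		return 0;
	if (s.lone_lf)      /* a naked LF would come back as CRLF */
		return 0;
	return 1;
}
```

With this counting, `safe_to_convert("a\r\nb\n", 5)` is 0 because the second line ends in a naked LF, which is the mixed-ending case the "safe" mode is meant to catch.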
* Re: CRLF problems with Git on Win32 [not found] ` <alpine.LFD.1.00.0801081325010.3148-5CScLwifNT1QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org> 2008-01-08 22:09 ` Sean 2008-01-08 22:51 ` Dmitry Potapov @ 2008-01-09 8:43 ` Abdelrazak Younes 2 siblings, 0 replies; 113+ messages in thread From: Abdelrazak Younes @ 2008-01-09 8:43 UTC (permalink / raw) To: msysgit-/JYPxA39Uh5TLH3MbocFFw; +Cc: git-u79uwXL29TY76Z2rM5mHXA Linus Torvalds wrote: > > > On Tue, 8 Jan 2008, Dmitry Potapov wrote: >> Perhaps, this option can be called core.autocrlf=safe > > We already do half of that: > > if (action == CRLF_GUESS) { > /* > * We're currently not going to even try to convert stuff > * that has bare CR characters. Does anybody do that crazy > * stuff? > */ > if (stats.cr != stats.crlf) > return 0; > > but we don't check that there are no "naked" LF characters. > > So the only thing you'd need to add is to add a > > /* No naked LF's! */ > if (safecrlf && stats.lf) > return 0; > > to that sequence too, but the thing is, having mixed line-endings isn't > actually all that unusual, so I think that kind of "autocrlf=safe" thing > is actually almost useless - because when that thing triggers, you almost > always *do* want to convert it to be just one way. Sorry for the intrusion in this discussion but as a potential git user for cross-platform development I'd like to share my experience/opinion, hope you don't mind. I am investigating the use of git for our cross-platform project which uses svn currently. In our project, we manually mark *all* source files (*.h and *.cpp) with 'eol-style=native'. This way, if some editor on Windows added some CRLF in such a marked file, svn will refuse to commit this file until you clean it up. This means that all C/C++/python files use LF eol exclusively on all platforms. I believe this is the only sane way to do cross-platform development. Now, marking any new file manually is cumbersome and some developers often forget to do it. 
I would like to be able to mark all files with a given extension (.c, .cpp, .h) with "LF only". This way, Windows only files (like visual studio projects) can stay with CRLF. It would be fantastic if git could do that. > > I've seen it multiple times when people cooperate with windows files with > unix tools, where unix editors often preserve old CRLF's, but write new > lines with just LF. Multiple versions of Visual studio do just this indeed. Abdel. ^ permalink raw reply [flat|nested] 113+ messages in thread
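The per-extension "LF only" policy requested above could look roughly like this. It is a sketch of the idea, not an existing git feature or API: the suffix table and the functions `wants_lf_only`, `contains_crlf` and `should_reject` are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy table: extensions that must be stored with LF only. */
static const char *const lf_only_suffixes[] = { ".c", ".cpp", ".h" };

static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path);
	size_t slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/* Returns 1 if the path falls under the "LF only" policy. */
static int wants_lf_only(const char *path)
{
	size_t i;

	assert(path != NULL);
	for (i = 0; i < sizeof(lf_only_suffixes) / sizeof(lf_only_suffixes[0]); i++)
		if (has_suffix(path, lf_only_suffixes[i]))
			return 1;
	return 0;
}

/* Returns 1 if the buffer contains a CR immediately followed by LF. */
static int contains_crlf(const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i++)
		if (buf[i] == '\r' && buf[i + 1] == '\n')
			return 1;
	return 0;
}

/* Returns 1 if committing this file should be refused under the policy. */
static int should_reject(const char *path, const char *buf, size_t len)
{
	return wants_lf_only(path) && contains_crlf(buf, len);
}
```

Under these assumptions a `.cpp` file containing CRLF is refused, while a Visual Studio project file with CRLF passes untouched, which matches the svn behaviour described in the message.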
* Re: CRLF problems with Git on Win32 2008-01-08 18:07 ` Junio C Hamano [not found] ` <7vmyrgry20.fsf-jO8aZxhGsIagbBziECNbOZn29agUkmeCHZ5vskTnxNA@public.gmane.org> @ 2008-01-10 19:58 ` Gregory Jefferis 2008-01-10 20:20 ` Linus Torvalds 2008-01-10 20:50 ` Rogan Dawes 1 sibling, 2 replies; 113+ messages in thread From: Gregory Jefferis @ 2008-01-10 19:58 UTC (permalink / raw) To: Junio C Hamano, Steffen Prohaska; +Cc: Git Mailing List On 8/1/08 18:07, "Junio C Hamano" <gitster@pobox.com> wrote: > Steffen Prohaska <prohaska-wjoc1KHpMeg@public.gmane.org> writes: > >> msysgit installs plain git. core.autocrlf is unset. Whatever plain >> git's default is, this is msysgit's default, too. > > That sounds like a mistake if you are installing a port to a > platform whose native line ending convention is different from > where plain git natively runs on (i.e. UNIX). I'm not sure that I understand the whole deal about platform default line endings. Isn't plain git functionally agnostic about line endings? You can check in CRLF text files to git and it doesn't care. You can diff, show etc just fine. I haven't yet found anything that breaks with CRLF files. In this sense plain git is already Windows ready. Maybe I'm missing something? Doesn't the problem only come if you try to diff a CRLF file with a new version that has LF only line endings? Then right now you have to use something like: git diff --ignore-space-at-eol Or if a Windows user clones a repository created on another system. For these cross-platform circumstances, it seems to me sensible to have an option (probably enabled by default on all platforms) that allows files to be munged on check in to whatever EOL style the repository creator preferred (probably stored in .gitattributes and could be different for different files in the repo - e.g. a windows vendor src dir on a cross-platform project). 
Note that this means that munging would only happen if someone actually asked for it - which would be a sensible thing to do as the administrator of a cross-platform project. Then there would be a separate option (probably not enabled by default) to check out with the platform's native line ending instead of whatever is in the repo. This would allow people to work with inflexible toolsets. Finally for people who want to work with native line endings that are different from repository line endings, then it might be necessary to improve the handling of diffs by providing a config var to make --ignore-space-at-eol the default (or perhaps more correctly --ignore-line-endings) for text files. From my preliminary reading of list history improving the inspection of content rather than trying to change content might be the more gitish thing to do. In conclusion all of these CRLF options are designed to help Windows users play nicely with others. But it seems to me naïve Windows users can be perfectly happy with plain git so long as they stay in their own Windows world. jm2c, corrections welcome and apologies to those suffering from eol exhaustion, Greg. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 19:58 ` Gregory Jefferis @ 2008-01-10 20:20 ` Linus Torvalds 2008-01-10 21:28 ` Gregory Jefferis 2008-01-10 20:50 ` Rogan Dawes 1 sibling, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-10 20:20 UTC (permalink / raw) To: Gregory Jefferis; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List On Thu, 10 Jan 2008, Gregory Jefferis wrote: > > I'm not sure that I understand the whole deal about platform default line > endings. Isn't plain git functionally agnostic about line endings? You can > check in CRLF text files to git and it doesn't care. You can diff, show etc > just fine. I haven't yet found anything that breaks with CRLF files. In > this sense plain git is already Windows ready. Maybe I'm missing something? If you work together with other people on other platforms, then CRLF is a major pain in the *ss. So you have various options: - only develop on unix-like platforms: lines end with LF, and nobody has any problems regardless of autocrlf behaviour. Might as well consider everything binary. - only develop on windows, using only one set of basic tools: lines normally end with CRLF, and nobody cares. Might as well consider everything binary. - Mixed windows/unix platforms, but the Windows people are constrained to use only tools that write text-files with LF. Might as well consider everything binary. Quite frankly, Johannes seems to argue that this is a viable alternative, but I seriously doubt that is really true. Yes, there are lots of Windows tools (pretty much all of them by now, I suspect) that *understand* LF-only line endings, but it's also undoubtedly the case that if you allow windows developers to use their normal tools, a number of them *will* write files with CRLF. 
- Mixed windows usage - either with other UNIX users, or even just *within* a windows environment if *some* of the tools are basically UNIX ports (ie MinGW or Cygwin without text-mounts) In this case, some tools will write files with CRLF, and others will write them with LF. Again, usually all tools can *read* either form, but the writing is mixed and depends on the tool (so if you work in a group where different people use different editors, you will literally switch back-and-forth between LF and CRLF, sometimes mixing the two in the same file!). This one - at the very least - basically requires "autocrlf=input". Anything else is just madness, because otherwise you'll get files that get partly or entirely rewritten in the object database just due to line ending changes. So in *most* of the situations, you probably don't need to worry about autocrlf. But the thing is, I'm almost 100% convinced that the moment you have even *one* windows developer, and any UNIX experience at all (whether due to people actually working on unix, or just using unixy tools under Windows), you will end up in that final case that really does want autocrlf. Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
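What "autocrlf=input" amounts to on the commit side can be sketched as an in-place normalization of CRLF to LF. The function name `crlf_to_lf` is an assumption for illustration and the sketch ignores the text/binary guess entirely; it is not git's implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rewrites buf in place so that every CR LF pair becomes a single LF.
 * Bare CR bytes and existing LF bytes are kept. Returns the new length.
 */
static size_t crlf_to_lf(char *buf, size_t len)
{
	size_t in, out = 0;

	assert(buf != NULL || len == 0);
	for (in = 0; in < len; in++) {
		if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
			continue; /* drop the CR; the LF is copied next round */
		buf[out++] = buf[in];
	}
	return out;
}
```

A file that switches between CRLF and LF from line to line, as in the mixed-editor case described above, comes out with LF throughout, so the stored object no longer changes just because a different editor touched the file.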
* Re: CRLF problems with Git on Win32 2008-01-10 20:20 ` Linus Torvalds @ 2008-01-10 21:28 ` Gregory Jefferis 2008-01-10 23:23 ` Dmitry Potapov 2008-01-11 0:02 ` Linus Torvalds 0 siblings, 2 replies; 113+ messages in thread From: Gregory Jefferis @ 2008-01-10 21:28 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List On 10/1/08 20:20, "Linus Torvalds" <torvalds@linux-foundation.org> wrote: > - Mixed windows usage - either with other UNIX users, or even just > *within* a windows environment if *some* of the tools are basically > UNIX ports (ie MinGW or Cygwin without text-mounts) > > In this case, some tools will write files with CRLF, and others will > write them with LF. Again, usually all tools can *read* either form, > but the writing is mixed and depends on the tool (so if you work in a > group where different people use different editors, you will literally > switch back-and-forth between LF and CRLF, sometimes mixing the two in > the same file!). > > This one - at the very least - basically requires "autocrlf=input". > Anything else is just madness, because otherwise you'll get files that > get partly or entirely rewritten in the object database just due to > line ending changes. So this is what has to be accommodated. But instead of having autocrlf always set on Windows and always converting to LF in the repository, why not do nothing by default unless the repository contains some information specifying that it wants some or all text files to have a particular kind of line ending (e.g. in gitattributes). Then the choice of line ending inside the repository is up to the people creating/maintaining the repo, which just seems right. Insisting that repos created on windows should have textfiles munged to LF by default doesn't seem right. Even using Dmitry's clever autocrlf=safe option on Windows would lead to inconvenience since all LF files have to be explicitly attributed as text. 
We should be helping Windows people to use LF files rather than hindering them! ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 21:28 ` Gregory Jefferis @ 2008-01-10 23:23 ` Dmitry Potapov 2008-01-11 0:02 ` Linus Torvalds 1 sibling, 0 replies; 113+ messages in thread From: Dmitry Potapov @ 2008-01-10 23:23 UTC (permalink / raw) To: Gregory Jefferis Cc: Linus Torvalds, Junio C Hamano, Steffen Prohaska, Git Mailing List On Thu, Jan 10, 2008 at 09:28:15PM +0000, Gregory Jefferis wrote: > > Insisting that repos created on windows should have textfiles munged to LF > by default doesn't seem right. Even using Dmitry's clever autocrlf=safe > option on Windows would lead to inconvenience since all LF files have to be > explicitly attributed as text. We should be helping Windows people to use > LF files rather than hindering them! I think people may have different preferences about that. Some people may want to have text files with CRLF but others with LF. Some trust Git's heuristic for detecting text files (which seems to work rather well for most commonly used formats) but others are paranoid about losing some data. Finally, there are some people who just want to store their messy files as is. Based on that, the following options are possible: 1. autocrlf=input for those who want LF and trust Git text heuristic 2. autocrlf=true is for those who want CRLF and trust Git text heuristic 3. autocrlf=fail for those who want LF but do not trust Git heuristic 4. autocrlf=safe for those who want CRLF but do not trust Git heuristic 5. autocrlf=false for those who like messy files with different EOLs All these options have been mentioned in this thread, and I don't think we are likely to come up with a better solution, because "better" depends on which category of people you fall into. IMHO, #5 is the least reasonable of all. Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 21:28 ` Gregory Jefferis 2008-01-10 23:23 ` Dmitry Potapov @ 2008-01-11 0:02 ` Linus Torvalds 2008-01-11 0:32 ` Junio C Hamano 2008-01-11 7:10 ` Steffen Prohaska 1 sibling, 2 replies; 113+ messages in thread From: Linus Torvalds @ 2008-01-11 0:02 UTC (permalink / raw) To: Gregory Jefferis; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List On Thu, 10 Jan 2008, Gregory Jefferis wrote: > > So this is what has to be accommodated. But instead of having autocrlf > always set on Windows and always converting to LF in the repository, why not > do nothing by default [ .. ] Why? You can screw yourself more, and much more easily (and much more subtly), by leaving CRLF alone on Windows. The thing is, 99.9% of all people will be *much* better off with autocrlf=true on Windows than with it defaulting to off (or even fail). Isn't *that* the whole point of having a default? Pick the thing that is the right thing for almost everybody? And no, "but think of the children.." is not a valid argument. Sure, you *can* corrupt binary images with CRLF conversion. But it's really quite hard, since the git heuristics for guessing are rather good. You really have to work at it, and if you do, you're pretty damn likely to know about the issue, so that 0.1% that really needs to not convert (and it's usually one specific file type!) would probably not even turn off CRLF, but rather add a .gitattributes entry for that one filetype! (Side note: if there are known filetype extensions that have problems with the git guessing, we sure as heck could take the filename into account when guessing! There's absolutely nothing that says that we only have to look at the contents when guessing about the text/binary thing!) Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 0:02 ` Linus Torvalds @ 2008-01-11 0:32 ` Junio C Hamano 2008-01-11 7:10 ` Steffen Prohaska 1 sibling, 0 replies; 113+ messages in thread From: Junio C Hamano @ 2008-01-11 0:32 UTC (permalink / raw) To: Linus Torvalds; +Cc: Gregory Jefferis, Steffen Prohaska, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > (Side note: if there are known filetype extensions that have problems with > the git guessing, we sure as heck could take the filename into account > when guessing! There's absolutely nothing that says that we only have to > look at the contents when guessing about the text/binary thing!) You do not have to yell. Instead, just give yourself a pat on the back for having the brilliant foresight to give "path" parameter when you did 6c510bee2013022fbce52f4b0ec0cc593fc0cc48 (Lazy man's auto-CRLF) to convert_to_git() function, even though the code originally did not use it back then ;-). ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 0:02 ` Linus Torvalds 2008-01-11 0:32 ` Junio C Hamano @ 2008-01-11 7:10 ` Steffen Prohaska 2008-01-11 15:58 ` Linus Torvalds 1 sibling, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-11 7:10 UTC (permalink / raw) To: Linus Torvalds; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Jan 11, 2008, at 1:02 AM, Linus Torvalds wrote: > > > On Thu, 10 Jan 2008, Gregory Jefferis wrote: >> >> So this is what has to be accommodated. But instead of having >> autocrlf >> always set on Windows and always converting to LF in the >> repository, why not >> do nothing by default [ .. ] > > Why? You can screw yourself more, and much more easily (and much more > subtly), by leaving CRLF alone on Windows. > > The thing is, 99.9% of all people will be *much* better off with > autocrlf=true on Windows than with it defaulting to off (or even > fail). > > Isn't *that* the whole point of having a default? Pick the thing > that is > the right thing for almost everybody? Are you also for "autocrlf=input" as the default on Unix? This is the second half of the solution to the cross-platform problem ... > And no, "but think of the children.." is not a valid argument. > Sure, you > *can* corrupt binary images with CRLF conversion. But it's really quite > hard, since the git heuristics for guessing are rather good. You > really > have to work at it, and if you do, you're pretty damn likely to > know about > the issue, so that 0.1% that really needs to not convert (and it's > usually > one specific file type!) would probably not even turn off CRLF, but > rather > add a .gitattributes entry for that one filetype! ... and then Windows and Unix users would have the same chance of data corruption. Which is very low, yes, but unfortunately it already hit me once and I didn't immediately recognize what happened. I guess that less experienced git users would have a harder time to understand. 
However, I don't have a test case at hand. I should probably better go and find one. So for now, you may just want to ignore this comment. Yet, I'm a bit paranoid about the potential data corruption. The way data would be corrupted during commit can't be easily fixed. You only have a chance for fixing this if you recognize the problem before you delete the file in your work tree. But because git is extremely good at preserving your data once you committed a file, I tend to feel _very_ safe after I committed and I am teaching all people that once they committed data to git they'll not lose it until the reflog expires (well and obviously they must not delete .git). > (Side note: if there are known filetype extensions that have > problems with > the git guessing, we sure as heck could take the filename into account > when guessing! There's absolutely nothing that says that we only > have to > look at the contents when guessing about the text/binary thing!) Looking at the content seems the right thing to do. The filetype extension could be misleading. Maybe a mechanism similar to the file command would be more valuable. I guess a stripped down variant should be sufficient. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
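A "stripped down variant" of a file(1)-style content check might look like the sketch below. It is deliberately crude and is not git's heuristic: `looks_binary` is a hypothetical name, the only magic number it knows is the 8-byte PNG signature (which happens to contain a CR LF pair, so CRLF conversion would damage it), and the NUL-byte fallback is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 8-byte PNG signature; note the CR LF pair in the middle. */
static const unsigned char png_magic[8] = {
	0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'
};

/* Returns 1 if the content should not go through line-ending conversion. */
static int looks_binary(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	/* Magic-number test, in the spirit of file(1). */
	if (len >= sizeof(png_magic) &&
	    memcmp(buf, png_magic, sizeof(png_magic)) == 0)
		return 1;

	/* Crude fallback: any NUL byte marks the content as binary. */
	if (len > 0 && memchr(buf, '\0', len) != NULL)
		return 1;

	return 0;
}
```

A real table would need many more signatures; the point is only that a content-based check can recognize a file type regardless of its extension.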
* Re: CRLF problems with Git on Win32 2008-01-11 7:10 ` Steffen Prohaska @ 2008-01-11 15:58 ` Linus Torvalds 2008-01-11 16:28 ` Steffen Prohaska 0 siblings, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-11 15:58 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, 11 Jan 2008, Steffen Prohaska wrote: > > Are you also for "autocrlf=input" as the default on Unix? No. What would it help? "autocrlf" on Windows actually helps you (big upside, very small downside). On Unix or other sane systems, it has zero upside, so while the risk is still very small, there is now no big upside to counteract it. Again, what is "default" supposed to be? I argue that it's supposed to be the thing that is right for 99.9% of all people. And that simply isn't true on Unix. Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 15:58 ` Linus Torvalds @ 2008-01-11 16:28 ` Steffen Prohaska 2008-01-11 17:25 ` Linus Torvalds 2008-01-11 19:00 ` Gregory Jefferis 0 siblings, 2 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-11 16:28 UTC (permalink / raw) To: Linus Torvalds; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Jan 11, 2008, at 4:58 PM, Linus Torvalds wrote: > > > On Fri, 11 Jan 2008, Steffen Prohaska wrote: >> >> Are you also for "autocrlf=input" as the default on Unix? > > No. What would it help? You may later decide that you want to check out your project on Windows. In this case your repository should not contain CRLF. autocrlf=input ensures this. So given the current options, autocrlf=input is the only reasonable default on Unix if git wants to support cross-platform development as its default. > "autocrlf" on Windows actually helps you (big upside, very small > downside). On Unix or other sane systems, it has zero upside, so > while the > risk is still very small, there is now no big upside to counteract it. autocrlf=input on Unix helps cross-platform development, too. > Again, what is "default" supposed to be? I argue that it's supposed > to be > the thing that is right for 99.9% of all people. And that simply isn't > true on Unix. autocrlf=input is true for the very same people that need autocrlf=true on Windows. Every developer who ever plans to check out his code on Windows and on Unix should have this default. I don't think the CRLF problem is a Windows vs. Unix discussion. In my view, the discussion is whether git will have real cross-platform support as its default or not. The current default is sane for native Unix or native Windows projects. For cross-platform projects the default needs to be changed in the way described above. Git needs to ensure that CRLF never enters the repository for text files. 
If you did not set autocrlf=true, copying source code from Windows to Unix would not be supported. But as you earlier mentioned, this seems to be a common operation and I am observing the same. So I recommend autocrlf=input on Unix if you plan to ever go cross-platform. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 16:28 ` Steffen Prohaska @ 2008-01-11 17:25 ` Linus Torvalds 2008-01-11 17:56 ` Steffen Prohaska 2008-01-11 19:00 ` Gregory Jefferis 1 sibling, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-11 17:25 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, 11 Jan 2008, Steffen Prohaska wrote: > > > > No. What would it help? > > You may later decide that you want to check out your project on Windows. > In this case your repository should not contain CRLF. autocrlf=input > ensures this. But under Unix, it would never do that *anyway*, unless the file for some reason really needs it (which I cannot imagine, but I've never seen anything so craptastically stupid that some crazy person hasn't done it) So your argument is bogus. Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 17:25 ` Linus Torvalds @ 2008-01-11 17:56 ` Steffen Prohaska 2008-01-11 18:10 ` Linus Torvalds 0 siblings, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-11 17:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Jan 11, 2008, at 6:25 PM, Linus Torvalds wrote: > > > On Fri, 11 Jan 2008, Steffen Prohaska wrote: >>> >>> No. What would it help? >> >> You may later decide that you want to check out your project on >> Windows. >> In this case your repository should not contain CRLF. autocrlf=input >> ensures this. > > But under Unix, it would never do that *anyway*, unless the file > for some > reason really needs it (which I cannot imagine, but I've never seen > anything so craptastically stupid that some crazy person hasn't > done it) > > So your argument is bogus. Ah sorry, I misunderstood you in [1]. I thought your last point "Mixed Windows usage" meant what I have in mind: A user working in a mixed Windows/Unix environment who creates a file using Windows tools and commits it in the Unix environment. In this case the CRLF file will be transferred from Windows to Unix without git being involved. The right thing for git on Unix is to remove CRLF during a commit but still write only LF during check out. So autocrlf=input is the right choice. [1] http://article.gmane.org/gmane.comp.version-control.git/70082 It happens that people working in a mixed environment do such things. They just copy files from Windows to Unix and commit there. Not very often, but it happens. So it would be nice if git would handle this situation and it actually can by setting autocrlf=input. My point is that perfect support for mixed environments requires that git removes CRLF from any input on any platform. However, git should behave differently during checkout. In this case the native line ending should be written (LF on Unix, CRLF on Windows). 
The difference happens during check out; commit should be handled identically. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
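The asymmetry described here (commit always normalizes to LF, checkout writes the platform's native ending) can be sketched for the checkout half as follows. `enum native_eol` and `checkout_convert` are hypothetical names, and the sketch assumes the stored content is already LF-only text.

```c
#include <assert.h>
#include <stddef.h>

enum native_eol { EOL_LF, EOL_CRLF }; /* hypothetical platform setting */

/*
 * Writes the checked-out form of a normalized (LF-only) buffer into out.
 * out must have room for 2 * len bytes. Returns the bytes written.
 */
static size_t checkout_convert(const char *in, size_t len, char *out,
			       enum native_eol eol)
{
	size_t i, n = 0;

	assert(in != NULL && out != NULL);
	for (i = 0; i < len; i++) {
		if (in[i] == '\n' && eol == EOL_CRLF)
			out[n++] = '\r'; /* expand LF to CR LF on Windows */
		out[n++] = in[i];
	}
	return n;
}
```

With `EOL_LF` the buffer is copied unchanged, which corresponds to "autocrlf=input" on Unix; with `EOL_CRLF` each LF is expanded, which corresponds to "autocrlf=true" on Windows.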
* Re: CRLF problems with Git on Win32 2008-01-11 17:56 ` Steffen Prohaska @ 2008-01-11 18:10 ` Linus Torvalds 2008-01-11 18:29 ` Steffen Prohaska ` (2 more replies) 0 siblings, 3 replies; 113+ messages in thread From: Linus Torvalds @ 2008-01-11 18:10 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, 11 Jan 2008, Steffen Prohaska wrote: > > Ah sorry, I misunderstood you in [1]. I thought your last point > "Mixed Windows usage" meant what I have in mind: A user working > in a mixed Windows/Unix environment who creates a file using > Windows tools and commits it in the Unix environment. In this > case the CRLF file will be transferred from Windows to Unix > without git being involved. The right thing for git on Unix is > to remove CRLF during a commit but still write only LF during > check out. So autocrlf=input is the right choice. Oh, ok, I didn't realize. But yes, if you use a network share across windows and Unix and actually *share* the working tree over it, then yes, you'd want "autocrlf=input" on the unix side. However, I think that falls under the "0.1%" case, not the "99.9%" case. I realize that people probably do that more often with centralized systems, but with a distributed thing, it probably makes a *ton* more sense to have separate trees. But I could kind of see having a shared development directory and accessing it from different types of machines too. I'd also bet that crlf behavior of git itself will be the *least* of your problems in that situation. You'd have all the *other* tools to worry about, and would probably be very aware indeed of any CRLF issues. So at that point, the "automatic" or default behaviour is probably not a big deal, because everything _else_ you do likely needs special effort too! Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 18:10 ` Linus Torvalds @ 2008-01-11 18:29 ` Steffen Prohaska 2008-01-11 19:16 ` Linus Torvalds 2008-01-11 19:53 ` CRLF problems with Git on Win32 Christer Weinigel 2008-01-14 9:41 ` David Kågedal 2 siblings, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-11 18:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Jan 11, 2008, at 7:10 PM, Linus Torvalds wrote: > > > On Fri, 11 Jan 2008, Steffen Prohaska wrote: >> >> Ah sorry, I misunderstood you in [1]. I thought your last point >> "Mixed Windows usage" meant what I have in mind: A user working >> in a mixed Windows/Unix environment who creates a file using >> Windows tools and commits it in the Unix environment. In this >> case the CRLF file will be transferred from Windows to Unix >> without git being involved. The right thing for git on Unix is >> to remove CRLF during a commit but still write only LF during >> check out. So autocrlf=input is the right choice. > > Oh, ok, I didn't realize. > > But yes, if you use a network share across windows and Unix and > actually > *share* the working tree over it, then yes, you'd want > "autocrlf=input" on > the unix side. > > However, I think that falls under the "0.1%" case, not the "99.9%" > case. > > I realize that people probably do that more often with centralized > systems, but with a distributed thing, it probably makes a *ton* more > sense to have separate trees. But I could kind of see having a shared > development directory and accessing it from different types of > machines > too. It just happened yesterday that I copied a file from Unix to Windows (lucky I am ;) for a quite simple reason. I fetched and merged and realized that another developer forgot to check in a new file. He had already left. So I just looked into his workspace and copied the file. This has nothing to do with centralized system or not. 
We're just working in a mixed OS environment with shared filesystems. I didn't even think about the line endings in this situation because everything just worked. Actually I like the idea that I do not need to think about the endings because git will care about them. Actually many other tools work well with CRLF. For example, vi just displays [dos] in its status bar; but besides this everything is just fine. > I'd also bet that crlf behavior of git itself will be the *least* > of your > problems in that situation. You'd have all the *other* tools to worry > about, and would probably be very aware indeed of any CRLF issues. > So at > that point, the "automatic" or default behaviour is probably not a big > deal, because everything _else_ you do likely needs special effort > too! I don't think so. In the setting I described above, the questions I receive are not about the other tools but about git. I already started to teach everyone the new "autocrlf=input" policy to avoid these questions. I don't care that much about potential file corruption (though I'd feel more comfortable if I knew git would have stronger guarantees). During the next checkout on Windows file corruption would happen anyway. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 18:29 ` Steffen Prohaska @ 2008-01-11 19:16 ` Linus Torvalds 2008-01-11 19:50 ` Sam Ravnborg 2008-01-12 17:54 ` [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions Steffen Prohaska 0 siblings, 2 replies; 113+ messages in thread From: Linus Torvalds @ 2008-01-11 19:16 UTC (permalink / raw) To: Steffen Prohaska; +Cc: Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, 11 Jan 2008, Steffen Prohaska wrote: > > I already started to teach everyone the new "autocrlf=input" policy to > avoid these questions. I certainly don't think "autocrlf=input" is wrong. It might even be a reasonable default on Unix, although I don't think it's nearly as obvious as the Windows case. I wouldn't mind using it myself, for example, although probably only because I know that for the stuff I work on it simply cannot possibly ever do the wrong thing. In fact, we had a case of bogus CRLF in one of the kernel documentation files for some reason that we ended up fixing by hand. "autocrlf=input" would have fixed it (except in that case it wouldn't have, since it came from the original kernel tree, long before crlf was an issue for git ;) So I'd say that autocrlf=input is quite possibly a good idea on Unix in general, but my gut feel is still that it's not a big enough issue to be actually worth making a default change over. But there's absolutely nothing wrong with having it as a policy at a company that has mixed Unix and Windows machines. (Every place I've ever been at, people who had a choice would never ever develop under Windows, so I've never seen any real mixing - even when some parts of the project were DOS/Windows stuff, there was a clear boundary between the stuff that was actually done under Windows) Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 19:16 ` Linus Torvalds @ 2008-01-11 19:50 ` Sam Ravnborg 2008-01-11 21:18 ` Johannes Schindelin 2008-01-12 15:08 ` Dmitry Potapov 2008-01-12 17:54 ` [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions Steffen Prohaska 1 sibling, 2 replies; 113+ messages in thread From: Sam Ravnborg @ 2008-01-11 19:50 UTC (permalink / raw) To: Linus Torvalds Cc: Steffen Prohaska, Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, Jan 11, 2008 at 11:16:02AM -0800, Linus Torvalds wrote: > > > On Fri, 11 Jan 2008, Steffen Prohaska wrote: > > > > I already started to teach everyone the new "autocrlf=input" policy to > > avoid these questions. > > I certainly don't think "autocrlf=input" is wrong. It might even be a > reasonable default on Unix, although I don't think it's nearly as obvious > as the Windows case. I wouldn't mind using it myself, for example, > although probably only because I know that for the stuff I work on it > simply cannot possibly ever do the wrong thing. > > In fact, we had a case of bogus CRLF in one of the kernel documentation > files for some reason that we ended up fixing by hand. "autocflf=input" > would have fixed it (except in that case it wouldn't have, since it came > from the original kernel tree, long before crlf was an issue for git ;) > > So I'd say that autocrlf=input is quite possibly a good idea on Unix in > general, but my gut feel is still that it's not a big enough issue to be > actually worth making a default change over. But there's absolutely > nothing wrong with having it as a policy at a company that has mixed Unix > and Windows machines. 
> > (Every place I've ever been at, people who had a choice would never ever > develop under Windows, so I've never seen any real mixing - even when some > parts of the project were DOS/Windows stuff, there was a clear boundary > between the stuff that was actually done under Windows) The reality I see is the other way around as common practice. For people who have never tried a Linux box the barrier is quite high, and they prefer to stick with Windows. Where I work today, and in several other places I know of, the default choice is to work on Windows and use a Linux box only for cross compilation. This is common practice in many smaller embedded companies, and it is also these companies that like to be able to build Linux on a Windows box. Sam ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 19:50 ` Sam Ravnborg @ 2008-01-11 21:18 ` Johannes Schindelin 2008-01-11 22:21 ` Sam Ravnborg 2008-01-12 15:08 ` Dmitry Potapov 1 sibling, 1 reply; 113+ messages in thread From: Johannes Schindelin @ 2008-01-11 21:18 UTC (permalink / raw) To: Sam Ravnborg; +Cc: Linus Torvalds, git Hi, On Fri, 11 Jan 2008, Sam Ravnborg wrote: > On Fri, Jan 11, 2008 at 11:16:02AM -0800, Linus Torvalds wrote: > > > (Every place I've ever been at, people who had a choice would never > > ever develop under Windows, so I've never seen any real mixing - even > > when some parts of the project were DOS/Windows stuff, there was a > > clear boundary between the stuff that was actually done under Windows) > > The reality I see is the other way around as common practice. Not in my world. I see a few people who are stuck to Windows, but they are so because they are lazy. They do not ever do something interesting with computers in their free time, and while working, they only do what they are told to do. That might sound cynical, but you will have to _show_ me different examples to make me reconsider. And no, my work with msysgit did a poor job to convince me otherwise. Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 21:18 ` Johannes Schindelin @ 2008-01-11 22:21 ` Sam Ravnborg 0 siblings, 0 replies; 113+ messages in thread From: Sam Ravnborg @ 2008-01-11 22:21 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Linus Torvalds, git On Fri, Jan 11, 2008 at 09:18:49PM +0000, Johannes Schindelin wrote: > Hi, > > On Fri, 11 Jan 2008, Sam Ravnborg wrote: > > > On Fri, Jan 11, 2008 at 11:16:02AM -0800, Linus Torvalds wrote: > > > > > (Every place I've ever been at, people who had a choice would never > > > ever develop under Windows, so I've never seen any real mixing - even > > > when some parts of the project were DOS/Windows stuff, there was a > > > clear boundary between the stuff that was actually done under Windows) > > > > The reality I see is the other way around as common practice. > > Not in my world. > > I see a few people who are stuck to Windows, but they are so because they > are lazy. They do not ever do something interesting with computers in > their free time, and while working, they only do what they are told to do. Some of the people I have in mind I would certainly not call lazy, but the other part of the description is a fine match. > That might sound cynical, but you will have to _show_ me different > examples to make me reconsider. I just wanted to say that things look different in some places of the world and for some types of development. I do not even know what I should try to make you reconsider, as I did not follow the full thread. I just stumbled over this statement. Sam ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 19:50 ` Sam Ravnborg 2008-01-11 21:18 ` Johannes Schindelin @ 2008-01-12 15:08 ` Dmitry Potapov 1 sibling, 0 replies; 113+ messages in thread From: Dmitry Potapov @ 2008-01-12 15:08 UTC (permalink / raw) To: Sam Ravnborg Cc: Linus Torvalds, Steffen Prohaska, Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, Jan 11, 2008 at 08:50:22PM +0100, Sam Ravnborg wrote: > On Fri, Jan 11, 2008 at 11:16:02AM -0800, Linus Torvalds wrote: > > > > (Every place I've ever been at, people who had a choice would never ever > > develop under Windows, so I've never seen any real mixing - even when some > > parts of the project were DOS/Windows stuff, there was a clear boundary > > between the stuff that was actually done under Windows) > > The reality I see is the other way around as common practice. > For people who have never tried a Linux box the barrier > is quite high and they prefer to stick with Windows. And for those who have never tried Windows, it would be a great learning barrier as well, and it is far from obvious which would be easier to learn for someone who has never had any experience with either of them before... Of course, most people who have used computers for some time could not escape having at least some experience with Windows, and, naturally, people prefer to stick to what they know, especially those who do not like learning new stuff or find it difficult. Based on my observation, I would say that those who found learning Linux difficult would also find it difficult to learn other new things (like a new programming language), and usually had more trouble dealing with novel situations or doing anything that required out-of-the-box thinking... Usually, they are good at only one thing -- doing what they were told. There are some exceptions, of course, but take a look at the number of open source projects (where people write for the fun of programming) and compare how many of them are done by *nix users and Windows users. 
Isn't it obvious what most people who like programming prefer to use? Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions 2008-01-11 19:16 ` Linus Torvalds 2008-01-11 19:50 ` Sam Ravnborg @ 2008-01-12 17:54 ` Steffen Prohaska 2008-01-12 19:14 ` Dmitry Potapov 1 sibling, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-12 17:54 UTC (permalink / raw) To: torvalds, dpotapov, git; +Cc: Steffen Prohaska I promised to think about the CRLF discussion and here is what I believe we could do: - Leave the current core.autocrlf mechanism as is. - Add a mechanism to warn the user if an irreversible conversion happens - After we have the mechanisms for configuring the conversion and for configuring the safety level, we can decide which defaults to use on the different platforms, namely Windows and Unix. I propose to set the following defaults: - Unix: core.autocrlf=input, core.safecrlf=warn - Windows: core.autocrlf=true, core.safecrlf=warn This patch is declared as WIP because tests and documentation are missing. I'm also not sure if calling warning() and die() is the right thing to do in this place. Interestingly, in some (all?) cases, crlf_to_git() is called two times for a path during git add, resulting in the warning being printed twice. I haven't yet analyzed why this happens. Maybe the warnings and errors printed should be more verbose? [ Linus, Dmitry was right about stats.lf. ] Steffen ---- snip snap --- CRLF conversion bears a slight chance of corrupting data. autocrlf=true will convert CRLF to LF during commit and LF to CRLF during checkout. A file that contains a mixture of LF and CRLF before the commit cannot be recreated by git. For text files this does not really matter because we do not care about the line endings anyway; but for binary files that are accidentally classified as text the conversion can result in corrupted data. If you notice such corruption at commit time you can easily fix it by setting the conversion type explicitly in .gitattributes. 
Right after committing you still have the original file in your work tree and this file is not yet corrupted. However, in mixed Windows/Unix environments text files can quite easily end up containing a mixture of CRLF and LF line endings and git should handle such situations gracefully. For example a user could copy a CRLF file from Windows to Unix and mix it with an existing LF file there. The result would contain both types of line endings. Unfortunately, the desired effect of cleaning up text files with mixed line endings and the undesired effect of corrupting binary files cannot be distinguished. In both cases CRLFs are removed in an irreversible way. For text files this is the right thing to do, while for binary files it corrupts data. In a sane environment committing and checking out the same file should not modify the original file in the work tree. For autocrlf=input the original file must not contain CRLF. For autocrlf=true the original file must not contain LF without preceding CR. Otherwise the conversion is irreversible. Note, git might be able to recreate the original file with different autocrlf settings, but in the current environment checking out will yield a file that differs from the file before the commit. This patch adds a mechanism that can either warn the user about an irreversible conversion or can even refuse to convert. The mechanism is controlled by the variable core.safecrlf, with the following values - false: disable safecrlf mechanism - warn: warn about irreversible conversions - true: refuse irreversible conversions The default is to warn. A concept of a safety check was originally proposed in a similar way by Linus Torvalds. 
Signed-off-by: Steffen Prohaska <prohaska@zib.de> --- cache.h | 8 ++++++++ config.c | 9 +++++++++ convert.c | 21 +++++++++++++++++++++ environment.c | 1 + 4 files changed, 39 insertions(+), 0 deletions(-) diff --git a/cache.h b/cache.h index 39331c2..4e03e3d 100644 --- a/cache.h +++ b/cache.h @@ -330,6 +330,14 @@ extern size_t packed_git_limit; extern size_t delta_base_cache_limit; extern int auto_crlf; +enum safe_crlf { + SAFE_CRLF_FALSE = 0, + SAFE_CRLF_FAIL = 1, + SAFE_CRLF_WARN = 2, +}; + +extern enum safe_crlf safe_crlf; + #define GIT_REPO_VERSION 0 extern int repository_format_version; extern int check_repository_format(void); diff --git a/config.c b/config.c index 857deb6..0a46046 100644 --- a/config.c +++ b/config.c @@ -407,6 +407,15 @@ int git_default_config(const char *var, const char *value) return 0; } + if (!strcmp(var, "core.safecrlf")) { + if (value && !strcasecmp(value, "warn")) { + safe_crlf = SAFE_CRLF_WARN; + return 0; + } + safe_crlf = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "user.name")) { strlcpy(git_default_name, value, sizeof(git_default_name)); return 0; diff --git a/convert.c b/convert.c index 4df7559..598cf0b 100644 --- a/convert.c +++ b/convert.c @@ -132,6 +132,27 @@ static int crlf_to_git(const char *path, const char *src, size_t len, *dst++ = c; } while (--len); } + if (safe_crlf) { + if ((action == CRLF_INPUT) || auto_crlf <= 0) { + /* autocrlf=input: check if we removed CRLFs */ + if (buf->len != dst - buf->buf) { + if (safe_crlf == SAFE_CRLF_WARN) + warning("Stripped CRLF from %s.", path); + else + die("Refusing to strip CRLF from %s.", path); + } + } else { + /* autocrlf=true: check if we had LFs (without CR) */ + if (stats.lf != stats.crlf) { + if (safe_crlf == SAFE_CRLF_WARN) + warning( + "Checkout will replace LFs with CRLF in %s", path); + else + die("Checkout would replace LFs with CRLF in %s", path); + } + } + } + strbuf_setlen(buf, dst - buf->buf); return 1; } diff --git a/environment.c 
b/environment.c index 18a1c4e..e351e99 100644 --- a/environment.c +++ b/environment.c @@ -35,6 +35,7 @@ int pager_use_color = 1; char *editor_program; char *excludes_file; int auto_crlf = 0; /* 1: both ways, -1: only when adding git objects */ +enum safe_crlf safe_crlf = SAFE_CRLF_WARN; unsigned whitespace_rule_cfg = WS_DEFAULT_RULE; /* This is set by setup_git_dir_gently() and/or git_default_config() */ -- 1.5.4.rc2.60.g46ee ^ permalink raw reply related [flat|nested] 113+ messages in thread
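The reversibility rule stated in the commit message above (no CRLF under autocrlf=input, no naked LF under autocrlf=true) can be sketched as a standalone C function. This is only an illustration under stated assumptions: `eol_mode`, `eol_stats`, `count_eols()` and `roundtrip_is_lossy()` are hypothetical names that do not exist in git, and the sketch ignores binary detection and .gitattributes entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not git's types. */
enum eol_mode { EOL_MODE_INPUT, EOL_MODE_TRUE };

struct eol_stats {
	size_t lf;   /* every LF byte, including the LF of a CRLF pair */
	size_t crlf; /* every CR that is directly followed by LF */
};

/* Count line-ending bytes in buf[0..len). */
static struct eol_stats count_eols(const char *buf, size_t len)
{
	struct eol_stats st = { 0, 0 };
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (buf[i] == '\n')
			st.lf++;
		else if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
			st.crlf++;
	}
	return st;
}

/*
 * Return 1 if committing buf and checking it out again under `mode`
 * would yield different bytes, 0 if the round trip is exact.
 */
static int roundtrip_is_lossy(enum eol_mode mode, const char *buf, size_t len)
{
	struct eol_stats st = count_eols(buf, len);

	if (mode == EOL_MODE_INPUT)
		return st.crlf != 0;  /* CRLFs are stripped and never restored */
	return st.lf != st.crlf;      /* naked LFs would come back as CRLF */
}
```

Because `lf` counts every LF, including those that belong to a CRLF pair, a file has a naked LF exactly when `lf != crlf`; an all-LF file is therefore reported as lossy under autocrlf=true even though it contains no CR at all.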
* Re: [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions 2008-01-12 17:54 ` [PATCH] [WIP] safecrlf: Add mechanism to warn about irreversible crlf conversions Steffen Prohaska @ 2008-01-12 19:14 ` Dmitry Potapov 2008-01-13 9:05 ` [WIP v2] " Steffen Prohaska 0 siblings, 1 reply; 113+ messages in thread From: Dmitry Potapov @ 2008-01-12 19:14 UTC (permalink / raw) To: Steffen Prohaska; +Cc: torvalds, git On Sat, Jan 12, 2008 at 06:54:13PM +0100, Steffen Prohaska wrote: > diff --git a/convert.c b/convert.c > index 4df7559..598cf0b 100644 > --- a/convert.c > +++ b/convert.c > @@ -132,6 +132,27 @@ static int crlf_to_git(const char *path, const char *src, size_t len, > *dst++ = c; > } while (--len); > } > + if (safe_crlf) { > + if ((action == CRLF_INPUT) || auto_crlf <= 0) { > + /* autocrlf=input: check if we removed CRLFs */ > + if (buf->len != dst - buf->buf) { > + if (safe_crlf == SAFE_CRLF_WARN) > + warning("Stripped CRLF from %s.", path); > + else > + die("Refusing to strip CRLF from %s.", path); > + } This check is okay, however > + } else { > + /* autocrlf=true: check if we had LFs (without CR) */ > + if (stats.lf != stats.crlf) { > + if (safe_crlf == SAFE_CRLF_WARN) > + warning( > + "Checkout will replace LFs with CRLF in %s", path); > + else > + die("Checkout would replace LFs with CRLF in %s", path); > + } > + } this is not, because if you really want to be sure that file will not be mangled by checkout, you should not allow a text file with naked LF when autocrlf=true. And the following lines after gather_stats() can cause: /* No CR? Nothing to convert, regardless. */ if (!stats.cr) return 0; So, I propose a slightly different patch for convert.c: diff --git a/convert.c b/convert.c index 4df7559..9fd88d9 100644 --- a/convert.c +++ b/convert.c @@ -90,9 +90,6 @@ static int crlf_to_git(const char *path, const char *src, size_t len, return 0; gather_stats(src, len, &stats); - /* No CR? Nothing to convert, regardless. 
*/ - if (!stats.cr) - return 0; if (action == CRLF_GUESS) { /* @@ -108,8 +105,23 @@ static int crlf_to_git(const char *path, const char *src, size_t len, */ if (is_binary(len, &stats)) return 0; + + if (safe_crlf) { + /* check if we have "naked" LFs */ + if (stats.lf != stats.crlf) { + if (safe_crlf == SAFE_CRLF_WARN) + warning( + "Checkout will replace LFs with CRLF in %s", path); + else + die("Checkout would replace LFs with CRLF in %s", path); + } + } } + /* No CR? Nothing to convert, regardless. */ + if (!stats.cr) + return 0; + /* only grow if not in place */ if (strbuf_avail(buf) + buf->len < len) strbuf_grow(buf, len - buf->len); @@ -131,6 +143,16 @@ static int crlf_to_git(const char *path, const char *src, size_t len, if (! (c == '\r' && (1 < len && *src == '\n'))) *dst++ = c; } while (--len); + + if (safe_crlf && (action == CRLF_INPUT || auto_crlf <= 0)) { + /* autocrlf=input: check if we removed CRLFs */ + if (buf->len != dst - buf->buf) { + if (safe_crlf == SAFE_CRLF_WARN) + warning("Stripped CRLF from %s.", path); + else + die("Refusing to strip CRLF from %s.", path); + } + } } strbuf_setlen(buf, dst - buf->buf); return 1; Dmitry ^ permalink raw reply related [flat|nested] 113+ messages in thread
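The autocrlf=input check in the patches above compares the buffer length before and after the stripping loop (`buf->len != dst - buf->buf`). A minimal sketch of that idea, assuming a plain in-place buffer instead of git's strbuf; `strip_crlf()` is a hypothetical helper and not the real crlf_to_git():

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration, not git's crlf_to_git().
 * Drops every CR that is directly followed by LF, in place, and
 * returns the new length. A lone CR is kept as is.
 */
static size_t strip_crlf(char *buf, size_t len)
{
	size_t in, out = 0;

	assert(buf != NULL || len == 0);
	for (in = 0; in < len; in++) {
		if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
			continue; /* skip the CR, the LF is copied next round */
		buf[out++] = buf[in];
	}
	return out;
}
```

A caller can then treat `strip_crlf(buf, len) != len` as "at least one CRLF was removed", which is the condition under which the patch warns or dies when checkout would not put the CRs back.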
* [WIP v2] safecrlf: Add mechanism to warn about irreversible crlf conversions 2008-01-12 19:14 ` Dmitry Potapov @ 2008-01-13 9:05 ` Steffen Prohaska 0 siblings, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-13 9:05 UTC (permalink / raw) To: dpotapov, git; +Cc: torvalds, Steffen Prohaska This version gets the naked LF/autocrlf=true case right. However, different from what Dmitry suggested, the safety check is run for all cases that are irreversible. Dmitry suggested running it only for the CRLF_GUESS case. I believe this is not sufficient: the explicit CRLF_TEXT case should also be checked. The user explicitly marked the file as text but the conversion is nonetheless irreversible in the current setting. This might be unexpected and we should warn about it. Paranoid users can even ask git to fail in this case. Such users would need to manually fix the file, e.g. by running dos2unix. I also added basic tests. Documentation is still missing. Steffen ---- snip snap --- CRLF conversion bears a slight chance of corrupting data. autocrlf=true will convert CRLF to LF during commit and LF to CRLF during checkout. A file that contains a mixture of LF and CRLF before the commit cannot be recreated by git. For text files this does not really matter because we do not care about the line endings anyway; but for binary files that are accidentally classified as text the conversion can result in corrupted data. If you notice such corruption at commit time you can easily fix it by setting the conversion type explicitly in .gitattributes. Right after committing you still have the original file in your work tree and this file is not yet corrupted. However, in mixed Windows/Unix environments text files can quite easily end up containing a mixture of CRLF and LF line endings and git should handle such situations gracefully. For example a user could copy a CRLF file from Windows to Unix and mix it with an existing LF file there. 
The result would contain both types of line endings. Unfortunately, the desired effect of cleaning up text files with mixed line endings and the undesired effect of corrupting binary files cannot be distinguished. In both cases CRLFs are removed in an irreversible way. For text files this is the right thing to do, while for binary files it corrupts data. In a sane environment committing and checking out the same file should not modify the original file in the work tree. For autocrlf=input the original file must not contain CRLF. For autocrlf=true the original file must not contain LF without preceding CR. Otherwise the conversion is irreversible. Note, git might be able to recreate the original file with different autocrlf settings, but in the current environment checking out will yield a file that differs from the file before the commit. This patch adds a mechanism that can either warn the user about an irreversible conversion or can even refuse to convert. The mechanism is controlled by the variable core.safecrlf, with the following values - false: disable safecrlf mechanism - warn: warn about irreversible conversions - true: refuse irreversible conversions The default is to warn. The concept of a safety check was originally proposed in a similar way by Linus Torvalds. Thanks to Dmitry Potapov for insisting on getting the naked LF/autocrlf=true case right. 
Signed-off-by: Steffen Prohaska <prohaska@zib.de> --- cache.h | 8 ++++++++ config.c | 9 +++++++++ convert.c | 28 +++++++++++++++++++++++++--- environment.c | 1 + t/t0020-crlf.sh | 45 +++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 88 insertions(+), 3 deletions(-) diff --git a/cache.h b/cache.h index 39331c2..4e03e3d 100644 --- a/cache.h +++ b/cache.h @@ -330,6 +330,14 @@ extern size_t packed_git_limit; extern size_t delta_base_cache_limit; extern int auto_crlf; +enum safe_crlf { + SAFE_CRLF_FALSE = 0, + SAFE_CRLF_FAIL = 1, + SAFE_CRLF_WARN = 2, +}; + +extern enum safe_crlf safe_crlf; + #define GIT_REPO_VERSION 0 extern int repository_format_version; extern int check_repository_format(void); diff --git a/config.c b/config.c index 857deb6..0a46046 100644 --- a/config.c +++ b/config.c @@ -407,6 +407,15 @@ int git_default_config(const char *var, const char *value) return 0; } + if (!strcmp(var, "core.safecrlf")) { + if (value && !strcasecmp(value, "warn")) { + safe_crlf = SAFE_CRLF_WARN; + return 0; + } + safe_crlf = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "user.name")) { strlcpy(git_default_name, value, sizeof(git_default_name)); return 0; diff --git a/convert.c b/convert.c index 4df7559..c9678ee 100644 --- a/convert.c +++ b/convert.c @@ -90,9 +90,6 @@ static int crlf_to_git(const char *path, const char *src, size_t len, return 0; gather_stats(src, len, &stats); - /* No CR? Nothing to convert, regardless. */ - if (!stats.cr) - return 0; if (action == CRLF_GUESS) { /* @@ -110,6 +107,20 @@ static int crlf_to_git(const char *path, const char *src, size_t len, return 0; } + if (safe_crlf && auto_crlf > 0 && action != CRLF_INPUT) { + /* CRLFs would be added by checkout: check if we have "naked" LFs */ + if (stats.lf != stats.crlf) { + if (safe_crlf == SAFE_CRLF_WARN) + warning("Checkout will replace LFs with CRLF in %s", path); + else + die("Checkout would replace LFs with CRLF in %s", path); + } + } + + /* Optimization: No CR? 
Nothing to convert, regardless. */ + if (!stats.cr) + return 0; + /* only grow if not in place */ if (strbuf_avail(buf) + buf->len < len) strbuf_grow(buf, len - buf->len); @@ -132,6 +143,17 @@ static int crlf_to_git(const char *path, const char *src, size_t len, *dst++ = c; } while (--len); } + + if (safe_crlf && (action == CRLF_INPUT || auto_crlf <= 0)) { + /* CRLFs would not be restored by checkout: check if we removed CRLFs */ + if (buf->len != dst - buf->buf) { + if (safe_crlf == SAFE_CRLF_WARN) + warning("Stripped CRLF from %s.", path); + else + die("Refusing to strip CRLF from %s.", path); + } + } + strbuf_setlen(buf, dst - buf->buf); return 1; } diff --git a/environment.c b/environment.c index 18a1c4e..e351e99 100644 --- a/environment.c +++ b/environment.c @@ -35,6 +35,7 @@ int pager_use_color = 1; char *editor_program; char *excludes_file; int auto_crlf = 0; /* 1: both ways, -1: only when adding git objects */ +enum safe_crlf safe_crlf = SAFE_CRLF_WARN; unsigned whitespace_rule_cfg = WS_DEFAULT_RULE; /* This is set by setup_git_dir_gently() and/or git_default_config() */ diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh index 89baebd..e2e0f7b 100755 --- a/t/t0020-crlf.sh +++ b/t/t0020-crlf.sh @@ -8,6 +8,10 @@ q_to_nul () { tr Q '\000' } +q_to_cr () { + tr Q '\015' +} + append_cr () { sed -e 's/$/Q/' | tr Q '\015' } @@ -42,6 +46,47 @@ test_expect_success setup ' echo happy. 
' +test_expect_failure 'safecrlf: autocrlf=input, all CRLF' ' + + git repo-config core.autocrlf input && + git repo-config core.safecrlf true && + + for w in I am all CRLF; do echo $w; done | append_cr >allcrlf && + git add allcrlf +' + +test_expect_failure 'safecrlf: autocrlf=input, mixed LF/CRLF' ' + + git repo-config core.autocrlf input && + git repo-config core.safecrlf true && + + for w in Oh here is CRLFQ in text; do echo $w; done | q_to_cr >mixed && + git add mixed +' + +test_expect_failure 'safecrlf: autocrlf=true, all LF' ' + + git repo-config core.autocrlf true && + git repo-config core.safecrlf true && + + for w in I am all LF; do echo $w; done >alllf && + git add alllf +' + +test_expect_failure 'safecrlf: autocrlf=true mixed LF/CRLF' ' + + git repo-config core.autocrlf true && + git repo-config core.safecrlf true && + + for w in Oh here is CRLFQ in text; do echo $w; done | q_to_cr >mixed && + git add mixed +' + +test_expect_success 'switch off autocrlf, safecrlf' ' + git repo-config core.autocrlf false && + git repo-config core.safecrlf false +' + test_expect_success 'update with autocrlf=input' ' rm -f tmp one dir/two three && -- 1.5.4.rc2.60.g46ee ^ permalink raw reply related [flat|nested] 113+ messages in thread
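The three core.safecrlf values described in the commit message (false, warn, true) map onto a small tri-state. The sketch below shows one way to parse them; it is an assumption-laden simplification, not the patch's code: the patch hands everything other than `warn` to git_config_bool(), while this hypothetical `parse_safecrlf()` accepts only the three literal spellings and rejects anything else.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical mirror of the patch's enum; values chosen to match it. */
enum safecrlf_level {
	SAFECRLF_OFF = 0,  /* "false": no safety check */
	SAFECRLF_FAIL = 1, /* "true": refuse irreversible conversions */
	SAFECRLF_WARN = 2  /* "warn": only print a warning */
};

/*
 * Parse a core.safecrlf value, case-insensitively.
 * Returns 0 and stores the level in *out on success, -1 otherwise.
 */
static int parse_safecrlf(const char *value, enum safecrlf_level *out)
{
	assert(out != NULL);
	if (!value)
		return -1;
	if (!strcasecmp(value, "warn")) {
		*out = SAFECRLF_WARN;
		return 0;
	}
	if (!strcasecmp(value, "true")) {
		*out = SAFECRLF_FAIL;
		return 0;
	}
	if (!strcasecmp(value, "false")) {
		*out = SAFECRLF_OFF;
		return 0;
	}
	return -1;
}
```

Keeping `SAFECRLF_OFF` at 0 preserves the property the patch relies on, namely that a plain `if (safe_crlf)` test is enough to skip the check when the mechanism is disabled.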
* Re: CRLF problems with Git on Win32 2008-01-11 18:10 ` Linus Torvalds 2008-01-11 18:29 ` Steffen Prohaska @ 2008-01-11 19:53 ` Christer Weinigel 2008-01-14 9:41 ` David Kågedal 2 siblings, 0 replies; 113+ messages in thread From: Christer Weinigel @ 2008-01-11 19:53 UTC (permalink / raw) To: Linus Torvalds Cc: Steffen Prohaska, Gregory Jefferis, Junio C Hamano, Git Mailing List On Fri, 11 Jan 2008 10:10:00 -0800 (PST) Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, 11 Jan 2008, Steffen Prohaska wrote: > > Ah sorry, I misunderstood you in [1]. I thought your last point > > "Mixed Windows usage" meant what I have in mind: A user working > > in a mixed Windows/Unix environment who creates a file using > > Windows tools and commits it in the Unix environment. In this > > case the CRLF file will be transferred from Windows to Unix > > without git being involved. The right thing for git on Unix is > > to remove CRLF during a commit but still write only LF during > > check out. So autocrlf=input is the right choice. > > Oh, ok, I didn't realize. > > But yes, if you use a network share across windows and Unixand > actually *share* the working tree over it, then yes, you'd want > "autocrlf=input" on the unix side. > > However, I think that falls under the "0.1%" case, not the "99.9%" > case. That's how I work all the time. My Linux box is a Samba server where I check things out from perforce (with the "share" settings for end of line which means that text files are checked out with LF only and CRLF is converted to LF on checkin). Having the data on the Linux box is nice since I can have all the nice Unix tools such as sed, find, grep, and they run fast on a native Linux system, which is not true about cygwin on Windows. > I realize that people probably do that more often with centralized > systems, but with a distributed thing, it probably makes a *ton* more > sense to have separate trees. 
But I could kind of see having a shared > development directory and accessing it from different types of > machines too. We're working in a mixed environment, and even though I do most of my development on Linux I usually want to make sure that things build in Visual Studio before I check in, so the easiest thing to do is to point Visual Studio at the files on the Samba share. Same thing when using Altera's tools to do CPLD development, I run the Altera tools on Windows (their free version is Windows only) but all the files are on the Linux box. My tools that take the SVF file (the "binary image" for the CPLD) and program the CPLD all run under Linux though. A lot of my colleagues have Windows on the desktop, and when they develop on Linux they usually edit the files locally using the Samba share, and then they have a Putty (ssh) connected to the Linux box where they build and test the software. So the shared scenario is actually a very common one for us. > I'd also bet that crlf behavior of git itself will be the *least* of > your problems in that situation. You'd have all the *other* tools to > worry about, and would probably be very aware indeed of any CRLF > issues. So at that point, the "automatic" or default behaviour is > probably not a big deal, because everything _else_ you do likely > needs special effort too! Actually I seldom have any problems with CRLF at all. Sometimes the Xilinx or Altera editors will insert some stray CRLFs in some files, but all the tools I use seem to tolerate that. And as soon as I check in the CRLFs disappear anyway. We just have to make sure to turn on the "share" setting in our Perforce views and everything just works. /Christer ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 18:10 ` Linus Torvalds 2008-01-11 18:29 ` Steffen Prohaska 2008-01-11 19:53 ` CRLF problems with Git on Win32 Christer Weinigel @ 2008-01-14 9:41 ` David Kågedal 2 siblings, 0 replies; 113+ messages in thread From: David Kågedal @ 2008-01-14 9:41 UTC (permalink / raw) To: git Linus Torvalds <torvalds@linux-foundation.org> writes: > On Fri, 11 Jan 2008, Steffen Prohaska wrote: >> >> Ah sorry, I misunderstood you in [1]. I thought your last point >> "Mixed Windows usage" meant what I have in mind: A user working >> in a mixed Windows/Unix environment who creates a file using >> Windows tools and commits it in the Unix environment. In this >> case the CRLF file will be transferred from Windows to Unix >> without git being involved. The right thing for git on Unix is >> to remove CRLF during a commit but still write only LF during >> check out. So autocrlf=input is the right choice. > > Oh, ok, I didn't realize. > > But yes, if you use a network share across windows and Unixand actually > *share* the working tree over it, then yes, you'd want "autocrlf=input" on > the unix side. > > However, I think that falls under the "0.1%" case, not the "99.9%" case. > > I realize that people probably do that more often with centralized > systems, but with a distributed thing, it probably makes a *ton* more > sense to have separate trees. But I could kind of see having a shared > development directory and accessing it from different types of machines > too. One case is when you only want to commit compiling code, and to test-compile on all platforms that you are supposed to be portable to you need to access the source tree on different systems before committing anything. You could of course commit optimistically and checkout on the other system, and then go back and rewrite the commits if you need to fix something. But that is a lot more work. -- David Kågedal ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 16:28 ` Steffen Prohaska 2008-01-11 17:25 ` Linus Torvalds @ 2008-01-11 19:00 ` Gregory Jefferis 2008-01-12 15:26 ` Dmitry Potapov 1 sibling, 1 reply; 113+ messages in thread From: Gregory Jefferis @ 2008-01-11 19:00 UTC (permalink / raw) To: Steffen Prohaska, Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List Sticking my head above the parapet again ... On 11/1/08 16:28, "Steffen Prohaska" <prohaska@zib.de> wrote: > I don't think the CRLF problem is a Windows vs. Unix discussion. Agreed. > In my view, the discussion is whether git will have real > cross-platform support as its default or not. The current default is > sane for native Unix or native Windows projects. Absolutely > For cross-platform projects the default needs to be changed in the way > described above. Git needs to ensure that CRLF never enters the > repository for text files. LF-only repositories are the model that everyone is tending towards but I feel that there are (sane) people out there who would sometimes like to have CRLF files in the repository and do cross-platform development (I would if I were developing on a Mac for a Windows-originated Win/Mac project or if I were keeping vendor source code in a tree). In spite of the plethora of autocrlf variants so far there is still none that on unix would give you LF->CRLF on check in and CRLF->LF on checkout! This should be perfectly compatible with git's internals and I think it should be possible to allow this without breaking anything for other situations. One solution, which would have other uses, would be to allow checkin conversion to a specified line ending and checkout conversion to platform line ending as separately configurable options. 
If this seems outrageous then it should be made perfectly clear that the git project strongly discourages CRLF text files in cross-platform repositories, that to prevent CRLF creep we disallow them by default even in the privacy of your own OS (if it's Windows) and that if you want to do this you're on your own mate. But I think that would be a shame, inflexible and definitely not PC ;-) > If you did not set autocrlf=true, > copying source code from Windows to Unix would not be supported. > But as you earlier mentioned, this seems to be a common > operation and I am observing the same. So I recommend > autocrlf=input on Unix if you plan to ever go cross-platform. For me this is kind of the mathematician vs the engineer. I think Steffen is logically correct in saying that autocrlf=input on unix is the direct orthologue of autocrlf=true on windows and I dislike the idea that git should show logically different behaviour on different platforms. However I think Linus's cost/benefit analysis is right: CRLF files appear infrequently on unix system and often as not it's because someone specifically wants them to stay that way. So I think autocrlf=input is a useful option but not a necessary default on unix. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-11 19:00 ` Gregory Jefferis @ 2008-01-12 15:26 ` Dmitry Potapov 0 siblings, 0 replies; 113+ messages in thread From: Dmitry Potapov @ 2008-01-12 15:26 UTC (permalink / raw) To: Gregory Jefferis Cc: Steffen Prohaska, Linus Torvalds, Junio C Hamano, Git Mailing List On Fri, Jan 11, 2008 at 07:00:40PM +0000, Gregory Jefferis wrote: > > LF-only repositories are the model that everyone is tending towards but I feel > that there are (sane) people out there who would sometimes like to have CRLF > files in the repository and do cross-platform development (I would if I were > developing on a Mac for a Windows originated Win/Mac project or if I were > keeping vendor source code in a tree). In spite of the plethora of autocrlf > variants so far there is still none that on unix would give you LF->CRLF on > check in and CRLF->LF on checkout! This should be perfectly compatible with > git's internals Git internally considers only LF as the EOL marker. I think there are more than three hundred places in Git where the decision about end-of-line is made based on that. Though CRLF may appear to work, that is more an artifact of its LF ending, so what actually works is LF and nothing else. IOW, CRLF from Git's point of view is no better an EOL than, let's say, SPACE+LF. > and I think it should be possible to allow this without > breaking anything for other situations. One solution, which would have > other uses, would be to allow checkin conversion to a specified line ending > and checkout conversion to platform line ending as separately configurable > options. > > If this seems outrageous then it should be made perfectly clear that the git > project strongly discourages CRLF text files in cross-platform repositories, Because LF is the only true EOL marker, and CRLF is not and never will be. In fact, Git is written in C, and the decision of what is EOL in C was made many years ago. 
So, it is the only sane choice to use LF for _internal_ representation. It can be said that *nix users are lucky in that their OS uses the same symbol, but it is similar to big-endian platforms being lucky with byte order when it comes to TCP/IP. That is not because TCP/IP wants to discourage little-endian platforms, but having the single encoding is the only sane choice if you care about interoperability, and any other decision will end up being much worse. Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 19:58 ` Gregory Jefferis 2008-01-10 20:20 ` Linus Torvalds @ 2008-01-10 20:50 ` Rogan Dawes 2008-01-10 21:15 ` Gregory Jefferis 2008-01-11 1:15 ` Junio C Hamano 1 sibling, 2 replies; 113+ messages in thread From: Rogan Dawes @ 2008-01-10 20:50 UTC (permalink / raw) To: Gregory Jefferis; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List Gregory Jefferis wrote: > I'm not sure that I understand the whole deal about platform default line > endings. Isn't plain git functionally agnostic about line endings? You can > check in CRLF text files to git and it doesn't care. You can diff, show etc > just fine. I haven't yet found anything that breaks with CRLF files. In > this sense plain git is already Windows ready. Maybe I'm missing something? > > Doesn't the problem only come if you try to diff a CRLF file with a new > version that has LF only line endings? Then right now you have to use > something like: > > git diff --ignore-space-at-eol > > In conclusion all of these CRLF options are designed to help Windows users > play nicely with others. But it seems to me naïve Windows users can be > perfectly happy with plain git so long as they stay in their own Windows > world. > > jm2c, corrections welcome and apologies to those suffering from eol > exhaustion, > > Greg. One example that bit me recently was "git-apply --whitespace=strip" I have files with CRLF in my repo, but git was stripping the CR from lines that I applied via a patch. I worked around it with a smudge/clean filter of "dos2unix | unix2dos" (first removes all CR's, second puts one back on each line) Rogan ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-10 20:50 ` Rogan Dawes @ 2008-01-10 21:15 ` Gregory Jefferis 2008-01-11 1:15 ` Junio C Hamano 1 sibling, 0 replies; 113+ messages in thread From: Gregory Jefferis @ 2008-01-10 21:15 UTC (permalink / raw) To: Rogan Dawes; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List On 10/1/08 20:50, "Rogan Dawes" <lists@dawes.za.net> wrote: > Gregory Jefferis wrote: >> Isn't plain git functionally agnostic about line endings? You can >> check in CRLF text files to git and it doesn't care. You can diff, show etc >> just fine. I haven't yet found anything that breaks with CRLF files. In >> this sense plain git is already Windows ready. Maybe I'm missing something? > > One example that bit me recently was "git-apply --whitespace=strip" > > I have files with CRLF in my repo, but git was stripping the CR from > lines that I applied via a patch. > > I worked around it with a smudge/clean filter of "dos2unix | unix2dos" > (first removes all CR's, second puts one back on each line) > > Rogan OK so that's interesting. Is it a case where core git is not crlf agnostic? Looks like CR is being considered whitespace. I think git diff --ignore-space-at-eol also works because CR is considered whitespace. Maybe that's the wrong behaviour. So the big question for me. Should git expect that text files inside a repository have to have LF only line endings? I don't think that it should, but should accommodate both CRLF and LF. I guess at the moment git normally accommodates CRLF files because they look like an LF file that happens to have a funky whitespace char in front of the LFs. Maybe it would be better if edge cases like the one you described were ironed out. ^ permalink raw reply [flat|nested] 113+ messages in thread
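Editorial illustration of the point discussed in the two messages above: if lines are delimited by LF alone, the CR of a CRLF pair is simply the last byte of each line, so any cleanup that treats CR as trailing whitespace will silently turn a CRLF file into an LF file. The sketch below is an assumption-laden toy in Python, not Git's code; the helper names `split_lines_lf` and `strip_trailing_whitespace` are hypothetical.

```python
def split_lines_lf(data: bytes) -> list:
    """Split on LF only; a CR, if present, stays attached to its line."""
    return data.split(b"\n")


def strip_trailing_whitespace(line: bytes) -> bytes:
    """Naive cleanup that counts CR as whitespace, like space and tab."""
    return line.rstrip(b" \t\r")


crlf_text = b"first\r\nsecond\r\n"

# Each line still carries its CR as a trailing byte.
lines = split_lines_lf(crlf_text)

# After whitespace stripping the CRs are gone, so re-joining with LF
# yields an LF-only file.
cleaned = b"\n".join(strip_trailing_whitespace(line) for line in lines)
```

This is consistent with Rogan's report that whitespace stripping removed the CRs from patched lines, but it is only a model of that behaviour.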
* Re: CRLF problems with Git on Win32 2008-01-10 20:50 ` Rogan Dawes 2008-01-10 21:15 ` Gregory Jefferis @ 2008-01-11 1:15 ` Junio C Hamano 1 sibling, 0 replies; 113+ messages in thread From: Junio C Hamano @ 2008-01-11 1:15 UTC (permalink / raw) To: Rogan Dawes; +Cc: Gregory Jefferis, Steffen Prohaska, Git Mailing List Rogan Dawes <lists@dawes.za.net> writes: > One example that bit me recently was "git-apply --whitespace=strip" You might want to go back a few days in the list archive to find this patch: [PATCH 2/2] core.whitespace cr-at-eol-is-ok and try it out. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 21:03 ` Robin Rosenberg 2008-01-07 21:18 ` Johannes Schindelin @ 2008-01-07 21:36 ` Linus Torvalds 2008-01-08 21:26 ` Peter Karlsson 2008-01-07 21:42 ` Thomas Neumann 2 siblings, 1 reply; 113+ messages in thread From: Linus Torvalds @ 2008-01-07 21:36 UTC (permalink / raw) To: Robin Rosenberg Cc: Johannes Schindelin, Jeff King, Steffen Prohaska, Peter Karlsson, Git Mailing List On Mon, 7 Jan 2008, Robin Rosenberg wrote: > > Indeed, but the most common SCM's detect binary files automatically, > either by suffix or content analysis, so I think that is what user's expect. > It will be right for more projects that the current behaviour. Yeah, I suspect it's not only the "expected" behavior, but people have had years of getting used to the whole binary issue, and are much more likely to expect binary corruption than to expect to have to worry about CRLF. And while it's true that it probably doesn't matter at all as long as you stay windows-only (and everything is CRLF), it's also true that (a) maybe you don't necessarily even know that some day you might want to cast off the shackles of MS and (b) even under Windows you do end up having some strange tools end up using LF (ie you may be using some tools that were just straight ports from Unix, and that write just LF). So defaulting to (or asking) "autocrlf" at install time is probably the safest thing, and then people can edit their global .gitconfig to turn it off. Linus ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 21:36 ` Linus Torvalds @ 2008-01-08 21:26 ` Peter Karlsson 2008-01-09 10:56 ` Johannes Schindelin 0 siblings, 1 reply; 113+ messages in thread From: Peter Karlsson @ 2008-01-08 21:26 UTC (permalink / raw) To: Git Mailing List Cc: Robin Rosenberg, Johannes Schindelin, Jeff King, Steffen Prohaska, Linus Torvalds Linus Torvalds: > So defaulting to (or asking) "autocrlf" at install time is probably the > safest thing, and then people can edit their global .gitconfig to turn it > off. Indeed. A checkbox in the Windows installer (like Cygwin has) would be nice. -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 21:26 ` Peter Karlsson @ 2008-01-09 10:56 ` Johannes Schindelin 2008-01-09 12:41 ` Steffen Prohaska 0 siblings, 1 reply; 113+ messages in thread From: Johannes Schindelin @ 2008-01-09 10:56 UTC (permalink / raw) To: Peter Karlsson Cc: Git Mailing List, Robin Rosenberg, Jeff King, Steffen Prohaska, Linus Torvalds Hi, On Tue, 8 Jan 2008, Peter Karlsson wrote: > Linus Torvalds: > > > So defaulting to (or asking) "autocrlf" at install time is probably > > the safest thing, and then people can edit their global .gitconfig to > > turn it off. > > Indeed. A checkbox in the Windows installer (like Cygwin has) would be > nice. No. There are different needs for different projects, and having different defaults just adds to the confusion. I am no longer opposed to setting crlf=true by default for Git (although this does not necessarily hold true for msysGit, but that could be helped by explicitly unsetting crlf for the repositories we check out with the netinstaller). Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-09 10:56 ` Johannes Schindelin @ 2008-01-09 12:41 ` Steffen Prohaska 2008-01-09 13:52 ` Gregory Jefferis 0 siblings, 1 reply; 113+ messages in thread From: Steffen Prohaska @ 2008-01-09 12:41 UTC (permalink / raw) To: Johannes Schindelin Cc: Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds On Jan 9, 2008, at 11:56 AM, Johannes Schindelin wrote: > Hi, > > On Tue, 8 Jan 2008, Peter Karlsson wrote: > >> Linus Torvalds: >> >>> So defaulting to (or asking) "autocrlf" at install time is probably >>> the safest thing, and then people can edit their >>> global .gitconfig to >>> turn it off. >> >> Indeed. A checkbox in the Windows installer (like Cygwin has) >> would be >> nice. > > No. There are different needs for different projects, and having > different defaults just adds to the confusion. > > I am no longer opposed to setting crlf=true by default for Git > (although > this does not necessarily hold true for msysGit, but that could be > helped by explicitely unsetting crlf for the repositories we check out > with the netinstaller). I'll further think about "crlf=safe" (see another mail in this thread). I like the idea of safe because it guarantees that data will never be corrupted. But I have no time to think about it immediately. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-09 12:41 ` Steffen Prohaska @ 2008-01-09 13:52 ` Gregory Jefferis 2008-01-09 14:03 ` Johannes Schindelin 2008-01-09 15:03 ` Dmitry Potapov 0 siblings, 2 replies; 113+ messages in thread From: Gregory Jefferis @ 2008-01-09 13:52 UTC (permalink / raw) To: Steffen Prohaska, Johannes Schindelin Cc: Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds On 9/1/08 12:41, "Steffen Prohaska" <prohaska@zib.de> wrote: > I'll further think about "crlf=safe" (see another mail in this > thread). I like the idea of safe because it guarantees that data > will never be corrupted. But I have no time to think about it > immediately. crlf=safe [i.e. munging CRLFs only if there are no bare LFs] sounds appealing to me as well because it looks like munging that is always reversible. However there could still be problems at checkout. To be really safe, it seems to me that it must be 1) reversible in practice and 2) ALWAYS reversed unless we explicitly ask for no gnuming at checkout. Why? Re point (1) to be reversible in practice, we need to know who we've munged. Otherwise when gnuming blindly at checkout we might damage some innocent bystander file that only ever had LFs in the first place. So it seems we would have to keep track of who was munged. But do we want to store this in the repository? Re (2) well if we happen to munge a file on checkin that is actually binary, it must be gnumed on the way out otherwise it will be broken for the user. ^ permalink raw reply [flat|nested] 113+ messages in thread
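Editorial illustration of Gregory's point (1) above: converting blindly in both directions only round-trips files that were pure CRLF to begin with. The Python sketch below assumes the simplest possible conversion pair (CRLF to LF on checkin, LF to CRLF on checkout); the function names are hypothetical and this is not how Git's convert code is written.

```python
def to_repo(data: bytes) -> bytes:
    """Checkin direction: collapse every CRLF pair to a single LF."""
    return data.replace(b"\r\n", b"\n")


def to_worktree(data: bytes) -> bytes:
    """Checkout direction: expand every LF to CRLF, without asking
    whether the file was ever converted on the way in."""
    return data.replace(b"\n", b"\r\n")


def round_trips(data: bytes) -> bool:
    """True if checkin followed by checkout gives back the same bytes."""
    return to_worktree(to_repo(data)) == data


pure_crlf = b"a\r\nb\r\n"   # survives the round trip
pure_lf = b"a\nb\n"         # the "innocent bystander": comes back as CRLF
mixed = b"a\r\nb\n"         # comes back with every line ending in CRLF
```

Only `pure_crlf` comes back unchanged, which is why the discussion turns to either refusing files with bare LFs or recording which files were converted.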
* Re: CRLF problems with Git on Win32 2008-01-09 13:52 ` Gregory Jefferis @ 2008-01-09 14:03 ` Johannes Schindelin 2008-01-09 15:22 ` Dmitry Potapov 2008-01-09 15:03 ` Dmitry Potapov 1 sibling, 1 reply; 113+ messages in thread From: Johannes Schindelin @ 2008-01-09 14:03 UTC (permalink / raw) To: Gregory Jefferis Cc: Steffen Prohaska, Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds Hi, On Wed, 9 Jan 2008, Gregory Jefferis wrote: > On 9/1/08 12:41, "Steffen Prohaska" <prohaska@zib.de> wrote: > > > I'll further think about "crlf=safe" (see another mail in this > > thread). I like the idea of safe because it guarantees that data will > > never be corrupted. But I have no time to think about it immediately. > > crlf=safe [i.e. munging CRLFs only if there are no bare LFs] sounds > appealing to me as well because it looks like munging that is always > reversible. There is a bigger problem here, though: As of now, you can add a (loose) object from a big file pretty easily even on a small machine, because you do not need the whole buffer, but you stream it to hash-object. IIRC Junio wrote a patch to allow this with "git-add", using fast-import, but that patch probably hasn't been applied. Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-09 14:03 ` Johannes Schindelin @ 2008-01-09 15:22 ` Dmitry Potapov 0 siblings, 0 replies; 113+ messages in thread From: Dmitry Potapov @ 2008-01-09 15:22 UTC (permalink / raw) To: Johannes Schindelin Cc: Gregory Jefferis, Steffen Prohaska, Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds On Wed, Jan 09, 2008 at 02:03:32PM +0000, Johannes Schindelin wrote: > > On Wed, 9 Jan 2008, Gregory Jefferis wrote: > > > crlf=safe [i.e. munging CRLFs only if there are no bare LFs] sounds > > appealing to me as well because it looks like munging that is always > > reversible. > > There is a bigger problem here, though: As of now, you can add a (loose) > object from a big file pretty easily even on a small machine, because you > do not need the whole buffer, but you stream it to hash-object. IIRC > Junio wrote a patch to allow this with "git-add", using fast-import, but > that patch probably hasn't been applied. I don't think that crlf=safe requires that the whole file was put into the buffer. It can work with stream, but it will call die() if a file that was detected as text has a naked LF. Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-09 13:52 ` Gregory Jefferis 2008-01-09 14:03 ` Johannes Schindelin @ 2008-01-09 15:03 ` Dmitry Potapov 2008-01-09 17:37 ` Gregory Jefferis 1 sibling, 1 reply; 113+ messages in thread From: Dmitry Potapov @ 2008-01-09 15:03 UTC (permalink / raw) To: Gregory Jefferis Cc: Steffen Prohaska, Johannes Schindelin, Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds On Wed, Jan 09, 2008 at 01:52:59PM +0000, Gregory Jefferis wrote: > > crlf=safe [i.e. munging CRLFs only if there are no bare LFs] sounds > appealing to me as well because it looks like munging that is always > reversible. However there could still be problems at checkout. To be > really safe, it seems to me that it must be 1) reversible in practice and 2) > ALWAYS reversed unless we explicitly ask for no gnuming at checkout. Why? > > Re point (1) to be reversible in practice, we need to know who we've munged. > Otherwise when gnuming blindly at checkout we might damage some innocent > bystander file that only ever had LFs in the first place. If you work on Windows and you have crlf=safe, you cannot put a text file that has only LFs, because naked LF is not allowed. If you want to have naked LF in some file, you have to say that explicitly in .gitattributes. If you work on a cross-platform project, and somebody else put a file with bare LFs, which is not text though the heuristic wrongly detected it as text, then you can remove this file from your working directory, correct .gitattributes and checkout this file again. The idea of crlf=safe is that information is never lost. It is always fully reversible, and if you put something into the repository, you always get back exactly the same unless you changed your .gitattributes. > Re (2) well if we happen to munge a file on checkin that is actually binary, > it must be gnumed on the way out otherwise it will be broken for the user. 
Of course, it will, because the same heuristic will detect it as text, and convert it back. So as long as you stay on the same platform and with the same .gitattributes, you always get back exactly what you put. Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
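Editorial illustration of the crlf=safe idea as Dmitry describes it: refuse to check in any file that contains a naked LF, so that the LF to CRLF expansion at checkout is always an exact inverse. This Python sketch is a toy model under that stated assumption; `BareLFError`, `to_repo_safe` and `to_worktree` are hypothetical names, and the real proposal would die() inside Git rather than raise an exception.

```python
class BareLFError(ValueError):
    """Raised when a file contains an LF not preceded by CR."""


def has_bare_lf(data: bytes) -> bool:
    """Every CRLF contains exactly one LF, so the counts differ
    only when some LF is not part of a CRLF pair."""
    return data.count(b"\n") != data.count(b"\r\n")


def to_repo_safe(data: bytes) -> bytes:
    """Checkin: convert CRLF to LF, but only if that can be undone."""
    if has_bare_lf(data):
        raise BareLFError("naked LF found; conversion would not be reversible")
    return data.replace(b"\r\n", b"\n")


def to_worktree(data: bytes) -> bytes:
    """Checkout: every LF in the repository came from a CRLF, so
    expanding each one restores the original bytes."""
    return data.replace(b"\n", b"\r\n")


original = b"one\r\ntwo\r\n"
restored = to_worktree(to_repo_safe(original))

try:
    to_repo_safe(b"one\ntwo\r\n")
    rejected = False
except BareLFError:
    rejected = True
```

Under this model a file either round-trips exactly or is rejected up front, which matches the "information is never lost" claim, as long as the same conversion rules are in force at checkin and checkout.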
* Re: CRLF problems with Git on Win32 2008-01-09 15:03 ` Dmitry Potapov @ 2008-01-09 17:37 ` Gregory Jefferis 2008-01-09 19:05 ` Dmitry Potapov 0 siblings, 1 reply; 113+ messages in thread From: Gregory Jefferis @ 2008-01-09 17:37 UTC (permalink / raw) To: Dmitry Potapov Cc: Steffen Prohaska, Johannes Schindelin, Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds On 9/1/08 15:03, "Dmitry Potapov" <dpotapov@gmail.com> wrote: > On Wed, Jan 09, 2008 at 01:52:59PM +0000, Gregory Jefferis wrote: >> >> Re point (1) to be reversible in practice, we need to know who we've munged. >> Otherwise when gnuming blindly at checkout we might damage some innocent >> bystander file that only ever had LFs in the first place. > > If you work on Windows and you have clrf=safe, you cannot put a text > file that has only LFs, because naked LF is not allowed. If you want > to have naked LF in some file, you have to say that explicitly in > .gitattributes. > > If you work on cross platform project, and somebody else put a file with > bare LFs, which is not text though heauristic wrongly detected it as > text then you can remove this file from your working directory, correct > .gitattributes and checkout this file again. The idea of crlf=safe is > that information is never lost. It is always fully reversible, and if > you put something into the repostory, you always get back exactly the > same unless you changed your .gitattributes. > >> Re (2) well if we happen to munge a file on checkin that is actually binary, >> it must be gnumed on the way out otherwise it will be broken for the user. > > Of course, it will, because the same heuristic will detect it as text, > and convert it back. So as long as you stay on the same platform and > with the same .gitattributes, you always get back exactly what you > put. 
Dmitry, I think all of your comments are correct, BUT, this behaviour as currently proposed still does not seem to me safe (or perhaps transparent) enough to be enabled by default on a Windows platform (or for that matter a Unix one). If LF text files checked in on Windows get turned into CRLF files on checkout by default then I think plenty of people would be surprised and probably unhappy. Similarly I think it would be a bad thing if a binary file that looked like LF only text got mangled on checkout by LF->CRLF conversion - although I agree that it would be possible to recover from this situation with a bit of juggling. So my view is still that this behaviour would be a useful option when explicitly enabled by .gitattributes (as opposed to the current auto CRLF implementation, which could lead to irreversible munging) but that it is not an appropriate system-wide default. I could however see that sane people might disagree! For that matter autocrlf=true,input,safe are all slightly dubious when used as config vars rather than as attributes for the same collateral damage reason discussed above. The only way to prevent collateral damage is to consult .gitattributes on checkout (as Dmitry seemed to be assuming above) rather than gnuming anything in the repository that looked like LF only text. Of course even .gitattributes can change over time, so only by storing a "munged" metadata attribute in the repository could you guarantee that everything came out as it went in - which I think is a highly desirable base state. Greg. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-09 17:37 ` Gregory Jefferis @ 2008-01-09 19:05 ` Dmitry Potapov 0 siblings, 0 replies; 113+ messages in thread From: Dmitry Potapov @ 2008-01-09 19:05 UTC (permalink / raw) To: Gregory Jefferis Cc: Steffen Prohaska, Johannes Schindelin, Peter Karlsson, Git Mailing List, Robin Rosenberg, Jeff King, Linus Torvalds Hi Gregory, On Wed, Jan 09, 2008 at 05:37:07PM +0000, Gregory Jefferis wrote: > > If LF text files checked in on Windows get turned into CRLF files on > checkout by default then I think plenty of people would be surprised and > probably unhappy. LF text cannot be checked in with autocrlf=safe without marking that there is no CRLF conversion for this file. So, what you describe is impossible. IOW, you *always* get back what you put in the repository. > Similarly I think it would be a bad thing if a binary > file that looked like LF only text got mangled on checkout by LF->CRLF > conversion - although I agree that it would be possible to recover from this > situation with a bit of juggling. Again, you can't do that with autocrlf=safe. Yes, it is possible for someone else on Unix to put a file like this, but it is a rare event and easy to recover from. So, it is a very small price to pay for cross-platform projects, and those who use the same platform are not affected at all! > The only way to prevent collateral damage is to > consult .gitattributes on checkout (as Dmitry seemed to be assuming above) Yes, I assumed this. Isn't it how it is implemented now? static int crlf_to_worktree(const char *path, const char *src, size_t len, struct strbuf *buf, int action) { char *to_free = NULL; struct text_stat stats; if ((action == CRLF_BINARY) || (action == CRLF_INPUT) || auto_crlf <= 0) return 0; If crlf=false for some file then action will be CRLF_BINARY, and crlf_to_worktree will not convert LF to CRLF. Did I miss something? Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 21:03 ` Robin Rosenberg 2008-01-07 21:18 ` Johannes Schindelin 2008-01-07 21:36 ` Linus Torvalds @ 2008-01-07 21:42 ` Thomas Neumann 2008-01-08 10:56 ` Peter Karlsson 2 siblings, 1 reply; 113+ messages in thread From: Thomas Neumann @ 2008-01-07 21:42 UTC (permalink / raw) To: git > Indeed, but the most common SCM's detect binary files automatically, > either by suffix or content analysis, so I think that is what user's > expect. It will be right for more projects that the current behaviour. as a user, I expect a SCM to only modify a file when I have explicitly asked it to do so. Automatic conversions by guessing file types are evil, as they _will_ go wrong and then mess up some files. This "intelligent" file handling is a pain to use. You end up in situations where builds work on some platforms but not on others, which gets even more confusing with NFS home directories. So, please do not enable core.autocrlf by default on Windows. It might be reasonable for some projects, but not for all of them, and it will break some projects. Perhaps a project should be able to enable (or "suggest" it) in a repo-wide setting somehow, which would avoid the git clone problem. Thomas ^ permalink raw reply [flat|nested] 113+ messages in thread
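Editorial illustration of Thomas's concern that content-based guessing will eventually go wrong. The Python sketch below assumes a deliberately simple heuristic (a file is binary if a NUL byte appears near its start) and a hypothetical checkin step built on it; it is not Git's actual detection logic, and the 8000-byte probe size is an arbitrary choice for this example.

```python
def looks_binary(data: bytes, probe: int = 8000) -> bool:
    """Guess 'binary' if a NUL byte appears in the first `probe` bytes."""
    return b"\x00" in data[:probe]


def naive_checkin(data: bytes) -> bytes:
    """Convert CRLF to LF for anything the heuristic calls text."""
    if looks_binary(data):
        return data
    return data.replace(b"\r\n", b"\n")


# A binary payload that happens to contain 0x0D 0x0A but no NUL byte.
payload = bytes([0x01, 0x0D, 0x0A, 0xFF])
stored = naive_checkin(payload)

# The same bytes with a leading NUL are left alone.
protected = naive_checkin(b"\x00" + payload)
```

Here `stored` is one byte shorter than `payload`: the heuristic classified the data as text and the conversion removed a byte that was never a line ending.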
* Re: CRLF problems with Git on Win32 2008-01-07 21:42 ` Thomas Neumann @ 2008-01-08 10:56 ` Peter Karlsson 2008-01-08 11:07 ` Jeff King ` (3 more replies) 0 siblings, 4 replies; 113+ messages in thread From: Peter Karlsson @ 2008-01-08 10:56 UTC (permalink / raw) To: git Thomas Neumann: > as a user, I expect a SCM to only modify a file when I have > explicitly asked it to do so. As a user, I expect things to just work. With RCS/CVS/Subversion, it does, because it differentiates between text files (internally encoding NLs with "LF", but I couldn't care less what it uses there) and binary files (which it doesn't change). With git it currently doesn't since it treats everything as binary files. Yes, it's the whole text vs. binary file issue. We do live in a world where different systems store text differently. We have to deal with it. Preferably, the computer should deal with it without me having to do anything about it. After all, that's what computers are good at. If I occasionally need to do a git add -kb binary.txt to flag a file explicitly, that's a small price to pay for everything else to work out of the box. FWIW, I wouldn't care if git internally stored all texts as SCSU/BOCU (or UTF-32, for that matter, if Git's compression engine is better than SCSU or BOCU) using PARAGRAPH SEPARATOR to separate lines, just as long as I could get back the text I checked in. Come to think about it, locale autoconversion of text files would be a nice way to work between systems that want different encodings, like how Windows prefers UTF-16LE, Mac OS X prefers UTF-8 and Linux systems prefer whatever I have set my locale to (I still use iso-8859-1, so shoot me). -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 10:56 ` Peter Karlsson @ 2008-01-08 11:07 ` Jeff King 2008-01-08 11:54 ` Johannes Schindelin 2008-01-08 11:52 ` Johannes Schindelin ` (2 subsequent siblings) 3 siblings, 1 reply; 113+ messages in thread From: Jeff King @ 2008-01-08 11:07 UTC (permalink / raw) To: Peter Karlsson; +Cc: git On Tue, Jan 08, 2008 at 11:56:00AM +0100, Peter Karlsson wrote: > If I occasionally need to do a > > git add -kb binary.txt > > to flag a file explicitely, that's a small price to pay for everything > else to work out of the box. For you, perhaps, since you apparently infrequently commit binary files and derive some benefit from CRLF conversion. But please bear in mind that there are people on the other end of the spectrum who want the opposite (i.e., who could care less about CRLF, but _do_ have binary files). -Peff ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 11:07 ` Jeff King @ 2008-01-08 11:54 ` Johannes Schindelin 0 siblings, 0 replies; 113+ messages in thread From: Johannes Schindelin @ 2008-01-08 11:54 UTC (permalink / raw) To: Jeff King; +Cc: Peter Karlsson, git Hi, On Tue, 8 Jan 2008, Jeff King wrote: > On Tue, Jan 08, 2008 at 11:56:00AM +0100, Peter Karlsson wrote: > > > If I occasionally need to do a > > > > git add -kb binary.txt > > > > to flag a file explicitely, that's a small price to pay for everything > > else to work out of the box. > > For you, perhaps, since you apparently infrequently commit binary files > and derive some benefit from CRLF conversion. But please bear in mind > that there are people on the other end of the spectrum who want the > opposite (i.e., who could care less about CRLF, but _do_ have binary > files). Do not forget the people who say that git is a content tracker (as opposed to a content munger). Git was really intended as a tracker of octet strings which are organised in tree structures, and where you can have revisions over those tree structures. That is the beauty of git: it keeps simple things simple. Now, for some, this is a curse ;-) Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 10:56 ` Peter Karlsson 2008-01-08 11:07 ` Jeff King @ 2008-01-08 11:52 ` Johannes Schindelin 2008-01-08 13:07 ` Peter Harris 2008-01-09 18:46 ` Jan Hudec 3 siblings, 0 replies; 113+ messages in thread From: Johannes Schindelin @ 2008-01-08 11:52 UTC (permalink / raw) To: Peter Karlsson; +Cc: git Hi, On Tue, 8 Jan 2008, Peter Karlsson wrote: > Thomas Neumann: > > > as a user, I expect a SCM to only modify a file when I have > > explicitly asked it to do so. > > As a user, I exepect things to just work. With RCS/CVS/Subversion, it > does, because it differentiates between text files (internally encoding > NLs with "LF", but I couldn't care less what it uses there) and binary > files (which it doesn't change). With git it currently doesn't since it > treats everything as binary files. <tongue-in-cheek>Hey, if Subversion does what you want, why not just use it?</tongue-in-cheek> Ciao, Dscho ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 10:56 ` Peter Karlsson 2008-01-08 11:07 ` Jeff King 2008-01-08 11:52 ` Johannes Schindelin @ 2008-01-08 13:07 ` Peter Harris 2008-01-08 15:20 ` Peter Karlsson ` (2 more replies) 2008-01-09 18:46 ` Jan Hudec 3 siblings, 3 replies; 113+ messages in thread From: Peter Harris @ 2008-01-08 13:07 UTC (permalink / raw) To: Peter Karlsson; +Cc: git On Jan 8, 2008 5:56 AM, Peter Karlsson <peter@softwolves.pp.se> wrote: > Thomas Neumann: > > > as a user, I expect a SCM to only modify a file when I have > > explicitly asked it to do so. > > As a user, I exepect things to just work. With RCS/CVS/Subversion, it > does, because it differentiates between text files (internally encoding > NLs with "LF", but I couldn't care less what it uses there) and binary > files (which it doesn't change). With git it currently doesn't since it > treats everything as binary files. Actually, Subversion does the Right Thing, and treats everything as a binary file until and unless you explicitly set the svn:eol-style property on each file that you want it to mangle. Maybe you set up Subversion auto-props and forgot about it? That would be almost (but not really) like setting autocrlf=true in your global git config. Peter Harris ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 13:07 ` Peter Harris @ 2008-01-08 15:20 ` Peter Karlsson 2008-01-08 15:58 ` Kelvie Wong 2008-01-08 21:33 ` Dmitry Potapov 2 siblings, 0 replies; 113+ messages in thread From: Peter Karlsson @ 2008-01-08 15:20 UTC (permalink / raw) To: Peter Harris; +Cc: git Peter Harris: > Actually, Subversion does the Right Thing, and treats everything as a > binary file until and unless you explicitly set the svn:eol-style > property on each file that you want it to mangle. > > Maybe you set up Subversion auto-props and forgot about it? That would > be almost (but not really) like setting autocrlf=true in your global > git config. Actually, I've never actively set up a Subversion server myself, nor created any projects in Subversion (I have checked out some Subversion repos, though). I started using RCS and CVS, and now I'm migrating at least parts of that to Git (not all). Since Git is better than CVS in many ways, I would like it to be better than CVS in this one as well. -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 13:07 ` Peter Harris 2008-01-08 15:20 ` Peter Karlsson @ 2008-01-08 15:58 ` Kelvie Wong 2008-01-08 21:33 ` Dmitry Potapov 2 siblings, 0 replies; 113+ messages in thread From: Kelvie Wong @ 2008-01-08 15:58 UTC (permalink / raw) To: Peter Harris; +Cc: Peter Karlsson, git On Jan 8, 2008 5:07 AM, Peter Harris <peter@peter.is-a-geek.org> wrote: > On Jan 8, 2008 5:56 AM, Peter Karlsson <peter@softwolves.pp.se> wrote: > > Thomas Neumann: > > > > > as a user, I expect a SCM to only modify a file when I have > > > explicitly asked it to do so. > > > > As a user, I expect things to just work. With RCS/CVS/Subversion, it > > does, because it differentiates between text files (internally encoding > > NLs with "LF", but I couldn't care less what it uses there) and binary > > files (which it doesn't change). With git it currently doesn't since it > > treats everything as binary files. > > Actually, Subversion does the Right Thing, and treats everything as a > binary file until and unless you explicitly set the svn:eol-style > property on each file that you want it to mangle. > > Maybe you set up Subversion auto-props and forgot about it? That would > be almost (but not really) like setting autocrlf=true in your global > git config. > > Peter Harris > I'd actually like a feature like this. On the internal Subversion tree I'm working on (using git-svn), there are quite a few files that have CRLF endings -- we are a cross-platform development group. The solution to this in Subversion was that everyone had the same .subversion/config with a bunch of auto-props set; i.e.: [auto-props] *.H = svn:eol-style=native *.h = svn:eol-style=native *.CPP = svn:eol-style=native *.cpp = svn:eol-style=native and I can't do the same using git-svn. 
Thankfully emacs detects CRLFs and adjusts accordingly, and that's my workaround for it, but it would be nice to have some kind of gitattribute that allows you to set the autocrlf according to a filter. -- Kelvie Wong ^ permalink raw reply [flat|nested] 113+ messages in thread
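Kelvie's wish for a per-pattern setting is close to what the `crlf` attribute in `.gitattributes` offers (the mechanism Steffen refers to elsewhere in this thread). The following is only a sketch under assumptions: the helper name `write_crlf_attributes` and the chosen patterns are made up for illustration, and marking a path as text does not by itself decide whether conversion happens; that still depends on `core.autocrlf` and on the Git version in use.

```shell
#!/bin/sh
# Hypothetical helper: write a .gitattributes file into the directory "$1".
# The patterns mirror the auto-props list quoted above; they are examples only.
write_crlf_attributes() {
    cat > "$1/.gitattributes" <<'EOF'
# Mark C++ sources and headers as text, without content sniffing.
*.h   crlf
*.H   crlf
*.cpp crlf
*.CPP crlf
# Never apply line-ending conversion to Visual Studio project files.
*.sln    -crlf
*.vcproj -crlf
EOF
}
```

Calling `write_crlf_attributes .` at the top of a working tree and committing the resulting file shares the rules with everyone who clones the repository, which is the part that per-user auto-props could not guarantee.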
* Re: CRLF problems with Git on Win32 2008-01-08 13:07 ` Peter Harris 2008-01-08 15:20 ` Peter Karlsson 2008-01-08 15:58 ` Kelvie Wong @ 2008-01-08 21:33 ` Dmitry Potapov 2 siblings, 0 replies; 113+ messages in thread From: Dmitry Potapov @ 2008-01-08 21:33 UTC (permalink / raw) To: Peter Harris; +Cc: Peter Karlsson, git On Tue, Jan 08, 2008 at 08:07:15AM -0500, Peter Harris wrote: > > Actually, Subversion does the Right Thing, and treats everything as a > binary file until and unless you explicitly set the svn:eol-style > property on each file that you want it to mangle. Not exactly. Actually, Subversion detects whether a file is binary or text based on a heuristic. http://subversion.tigris.org/faq.html#binary-files But the status of a file as binary or text has no effect whatsoever on CRLF conversion, which is controlled by another property. By default, most of your text files will be detected as text (unless you use non-ASCII characters such as Cyrillic), but they will not have CRLF conversion. Now, you have to set svn:eol-style=native for each new file, which of course can be done automatically based on file extension, but that should be configured by each user in his or her global config file. Obviously, it does not work well for cross-platform projects, because many users forget to set svn:eol-style=native for some extensions. Moreover, the issue tends to repeat itself for every newly introduced file extension... IMHO, having the binary or text status completely independent from CRLF conversion is insanity... Dmitry ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-08 10:56 ` Peter Karlsson ` (2 preceding siblings ...) 2008-01-08 13:07 ` Peter Harris @ 2008-01-09 18:46 ` Jan Hudec 3 siblings, 0 replies; 113+ messages in thread From: Jan Hudec @ 2008-01-09 18:46 UTC (permalink / raw) To: Peter Karlsson; +Cc: git On Tue, Jan 08, 2008 at 11:56:00 +0100, Peter Karlsson wrote: > Thomas Neumann: > > > as a user, I expect a SCM to only modify a file when I have > > explicitly asked it to do so. > > As a user, I expect things to just work. With RCS/CVS/Subversion, it > does, because it differentiates between text files (internally encoding > NLs with "LF", but I couldn't care less what it uses there) and binary > files (which it doesn't change). With git it currently doesn't since it > treats everything as binary files. With Subversion you must explicitly enable it to "just" work. Subversion auto-tags files with specified extensions, when they are added, with the svn:eol-style property specifying how the file should be converted, and then converts (everywhere) the files to the specified line endings. However, AFAIK, it does not convert anything unless the properties are set, and the default config has the automatic setting *commented out*. -- Jan 'Bulb' Hudec <bulb@ucw.cz> ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 9:57 ` Steffen Prohaska 2008-01-07 10:00 ` Junio C Hamano 2008-01-07 10:12 ` Jeff King @ 2008-01-07 10:13 ` Peter Klavins 2008-01-07 12:58 ` Steffen Prohaska 2008-01-07 13:50 ` Peter Karlsson 2 siblings, 2 replies; 113+ messages in thread From: Peter Klavins @ 2008-01-07 10:13 UTC (permalink / raw) To: git I use an alternate workaround that clones the repository, removes the checked out files, sets autocrlf, then checks out the files again: $ git clone git://git.debian.org/git/turqstat/turqstat.git $ cd turqstat $ git config --add core.autocrlf true $ rm -rf * .gitignore $ git reset --hard The result should now be the same as using Steffen's system. However, there is still an unresolved problem with git's way of treating cr/lf as an attribute only of the checkout and not the repository itself: $ git status # On branch master # Changed but not updated: # (use "git add <file>..." to update what will be committed) # # modified: visualc/.gitignore # modified: visualc/turqstat.sln # modified: visualc/turqstat.vcproj # no changes added to commit (use "git add" and/or "git commit -a") So, checking out the repository with cr/lf true has now caused misalignment of files that were originally checked in with existing cr/lf's in place. Visual Studio in fact happily works with files that only have lf endings, _except_ *.sln and *.vcproj files, which it much prefers to have with cr/lf endings. The _real_ solution to this problem for the moment is _not_ to mix files with both lf and cr/lf endings in the repository. So, the original author of the repository should _also_ have used core.autocrlf true, thus causing the *sln and *vcproj to have their cr's stripped on checkin, but replaced on checkout when checking out with autocrlf true. ------------------------------------------------------------------------ Peter Klavins ^ permalink raw reply [flat|nested] 113+ messages in thread
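The `git status` output above comes down to a handful of files that were committed with CRLF endings. A quick way to find such files is to scan a checkout for lines ending in a carriage return. This is a minimal sketch, assuming it is run in a working tree checked out with core.autocrlf unset (so the files on disk mirror what is stored); the helper names `has_crlf` and `list_crlf_files` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if file "$1" has at least one line that ends
# in a carriage return, i.e. a CRLF line ending.
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# Hypothetical helper: print every regular file below directory "$1" that has
# CRLF line endings, skipping Git's own metadata directory.
list_crlf_files() {
    find "$1" -type f ! -path '*/.git/*' | while IFS= read -r f; do
        if has_crlf "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Running `list_crlf_files .` in the cloned turqstat tree would be expected to name the files that show up as modified once conversion is switched on, though that depends on what was actually committed.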
* Re: CRLF problems with Git on Win32 2008-01-07 10:13 ` Peter Klavins @ 2008-01-07 12:58 ` Steffen Prohaska 2008-01-07 13:50 ` Peter Karlsson 1 sibling, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-07 12:58 UTC (permalink / raw) To: Peter Klavins; +Cc: git On Jan 7, 2008, at 11:13 AM, Peter Klavins wrote: > I use an alternate workaround that clones the repository, removes > the checked out files, sets autocrlf, then checks out the files again: > > $ git clone git://git.debian.org/git/turqstat/turqstat.git > $ cd turqstat > $ git config --add core.autocrlf true > $ rm -rf * .gitignore > $ git reset --hard > > The result should now be the same as using Steffen's system. Yes, this should yield the same. > However, there is still an unresolved problem with git's way of > treating cr/lf as an attribute only of the checkout and not the > repository itself: > > $ git status > # On branch master > # Changed but not updated: > # (use "git add <file>..." to update what will be committed) > # > # modified: visualc/.gitignore > # modified: visualc/turqstat.sln > # modified: visualc/turqstat.vcproj > # > no changes added to commit (use "git add" and/or "git commit -a") > > So, checking out the repository with cr/lf true has now caused > misalignment of files that were originally checked in with existing > cr/lf's in place. Visual Studio in fact happily works with files > that only have lf endings, _except_ *.sln and *.vcproj files, which > it much prefers to have with cr/lf endings. You could try .gitattributes to exclude files from crlf conversion. But I'd not recommend this, because the mechanism has some deficiencies, as discussed in http://thread.gmane.org/gmane.comp.version-control.git/61888 > The _real_ solution to this problem for the moment is _not_ to mix > files with both lf and cr/lf endings in the repository. This is the way to go. 
> So, the original author of the repository should _also_ have used > core.autocrlf true, thus causing the *sln and *vcproj to have their > cr's stripped on checkin, but replaced on checkout when checking > out with autocrlf true. For cross-platform projects, I recommend explicitly configuring autocrlf on Windows and Unix. On Windows set git config core.autocrlf true # on Windows and on Unix set git config core.autocrlf input # on Unix This ensures that the repository only contains LF, even if someone emails source code from Windows to Unix and commits it there. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
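To make the effect of the two settings concrete, the byte-level conversions can be imitated with two small filters. This is a rough sketch rather than Git's actual code: the function names are made up, the filters assume the input has already been judged to be text, and `lf_to_crlf` assumes pure-LF input (it would double up the CR on lines that already end in CRLF).

```shell
#!/bin/sh
# Roughly what happens on commit with core.autocrlf set to "true" or "input":
# a CR directly in front of the line feed is dropped, so only LF is stored.
crlf_to_lf() {
    cr=$(printf '\r')
    sed "s/${cr}\$//"
}

# Roughly what happens on checkout with core.autocrlf=true: every line gets a
# CR appended before its line feed. With "input" this step is skipped.
lf_to_crlf() {
    cr=$(printf '\r')
    sed "s/\$/${cr}/"
}
```

For example, `crlf_to_lf < turqstat.sln | lf_to_crlf` should reproduce a file that consistently uses CRLF, which is why a repository that stores only LF round-trips cleanly on Windows.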
* Re: CRLF problems with Git on Win32 2008-01-07 10:13 ` Peter Klavins 2008-01-07 12:58 ` Steffen Prohaska @ 2008-01-07 13:50 ` Peter Karlsson 2008-01-07 14:14 ` Peter Klavins 2008-01-07 16:05 ` Steffen Prohaska 1 sibling, 2 replies; 113+ messages in thread From: Peter Karlsson @ 2008-01-07 13:50 UTC (permalink / raw) To: git; +Cc: Steffen Prohaska, Peter Klavins Steffen Prohaska: > Per default, CRLF conversion is disabled in msysgit. Git should > not convert a single file. Does it really convert some? I didn't verify, but it was only some files that had LFs, perhaps the files that I added while on the Windows machine had CRLFs. That's bad. Peter Klavins: > Visual Studio in fact happily works with files that only have lf > endings, _except_ *.sln and *.vcproj files, which it much prefers to > have with cr/lf endings. The project files were added to the repository on the Windows box (obviously), so those are correct. So apparently my repository is a bit broken at the moment with LF on some files and CRLF on some. That's bad. I just assumed everything worked, it used to "just work" for CVS (except for when you actually tried to add binary files, of course). -- \\// Peter - http://www.softwolves.pp.se/ ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 13:50 ` Peter Karlsson @ 2008-01-07 14:14 ` Peter Klavins 2008-01-07 16:05 ` Steffen Prohaska 1 sibling, 0 replies; 113+ messages in thread From: Peter Klavins @ 2008-01-07 14:14 UTC (permalink / raw) To: git > So apparently my repository is a bit broken at the moment with LF on > some files and CRLF on some. That's bad. I just assumed everything > worked, it used to "just work" for CVS (except for when you actually > tried to add binary files, of course). LOL. Exactly. That's my only gripe with git, there's still some way to go before it's as usable as CVS in this regard, but of course in every other feature it's way superior. If you follow the steps I listed, you will have new .sln and .vcproj files that you can commit over the top of the ones already there, and everything will be fixed! I checked out your project and it built fine. ------------------------------------------------------------------------ Peter Klavins ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: CRLF problems with Git on Win32 2008-01-07 13:50 ` Peter Karlsson 2008-01-07 14:14 ` Peter Klavins @ 2008-01-07 16:05 ` Steffen Prohaska 1 sibling, 0 replies; 113+ messages in thread From: Steffen Prohaska @ 2008-01-07 16:05 UTC (permalink / raw) To: Peter Karlsson; +Cc: git, Peter Klavins On Jan 7, 2008, at 2:50 PM, Peter Karlsson wrote: > Steffen Prohaska: > >> Per default, CRLF conversion is disabled in msysgit. Git should >> not convert a single file. Does it really convert some? > > I didn't verify, but it was only some files that had LFs, perhaps the > files that I added while on the Windows machine had CRLFs. That's bad. This is a typical problem. Once CRLFs are in your repository autocrlf can't "just work" anymore. You need to commit a fixed version of the files. > Peter Klavins: > >> Visual Studio in fact happily works with files that only have lf >> endings, _except_ *.sln and *.vcproj files, which it much prefers to >> have with cr/lf endings. > > The project files were added to the repository on the Windows box > (obviously), so those are correct. > > > So apparently my repository is a bit broken at the moment with LF on > some files and CRLF on some. That's bad. I just assumed everything > worked, it used to "just work" for CVS (except for when you actually > tried to add binary files, of course). "Just works" has a different meaning for git than it has for CVS. For git, it means that once you _told_ git how to convert line endings (that is you have correctly configured autocrlf), git will automatically detect text files and convert them, but leave binary files untouched. It "just works" in the sense that you do not need to explicitly tell git about every single binary file (no cvs -kb needed). Git will auto-detect the file type. But if you do not tell git to convert line endings, it "just works" as if every file were binary. Per default, git does not modify your content. And for some people, "just works" means exactly this: leave my content as is. 
So it really depends on the context and therefore some configuration is inevitable. git requires you to configure autocrlf. cvs requires you to set -kb. You may, though, set "core.autocrlf true" globally for your account. After you did this, git should "just work" for you; if "just works" means convert CRLF in _all_ text files in _every_ repository. Steffen ^ permalink raw reply [flat|nested] 113+ messages in thread
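The auto-detection Steffen describes can be pictured with a very small content check. This is only a stand-in under stated assumptions: the helper name `looks_binary`, the 8000-byte window and the NUL-byte test are simplifications chosen for illustration, and Git's real check also considers other signals, such as the share of non-printable characters.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first 8000 bytes of file "$1" contain a
# NUL byte, which is a strong hint that the file is not text.
looks_binary() {
    total=$(head -c 8000 "$1" | wc -c)
    without_nul=$(head -c 8000 "$1" | tr -d '\000' | wc -c)
    # Left unquoted on purpose so that any padding printed by wc is dropped.
    [ $total -ne $without_nul ]
}
```

A conversion step would then be applied only when `looks_binary` fails for a file, so images and other binaries pass through byte for byte.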