* [BUG] Git does not convert CRLF=>LF on files with \r not before \n
From: Alexandre Garnier @ 2015-04-21 13:51 UTC
To: git

Here is a test:

git init -q crlf-test
cd crlf-test
echo '* text=auto' > .gitattributes
git add .gitattributes
git commit -q -m "Normalize EOL"
echo -ne 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain content with\r\r\n' > inline-cr.txt
echo "Working directory content:"
cat -A inline-cr.txt
echo
git add inline-cr.txt
echo "Indexed content:"
git show :inline-cr.txt | cat -A

Result
------
File content:
some content^M$
other ^Mcontent with CR^M$
content^M$
again content with^M^M$

Indexed content:
some content^M$
other ^Mcontent with CR^M$
content^M$
again content with^M^M$

Expected result
---------------
File content:
some content^M$
other ^Mcontent with CR^M$
content^M$
again content with^M^M$

Indexed content:
some content$
other ^Mcontent with CR$
content$
again content with^M$
# or even 'again content with$' for this last line

If you remove the \r that are not at the end of the lines, EOL are
converted as expected:

File content:
some content^M$
other content with CR^M$
content^M$
again content with^M$

Indexed content:
some content$
other content with CR$
content$
again content with$

--
Alex

* Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
From: Junio C Hamano @ 2015-04-21 17:41 UTC
To: Alexandre Garnier
Cc: git, Steffen Prohaska, Alex Riesen, Eyvind Bernhardsen, Carlos Martín Nieto

Alexandre Garnier <zigarn@gmail.com> writes:

> echo '* text=auto' > .gitattributes
> git add .gitattributes
> git commit -q -m "Normalize EOL"
> echo -ne 'some content\r\nother \rcontent with CR\r\ncontent\r\nagain

With text=auto, the user instructs us to guess, and we expect either
LF or CRLF line-terminated files that are *TEXT*. A lone CR in the
middle of the line would mean we cannot reliably guess---it may be an
LF-terminated file with CRs sprinkled inside the text, some of which
happen to be at the end of the line, or it may be a CRLF-terminated
file with CRs sprinkled in. We try to preserve the user input by not
munging when we are not sure.

You are seeing the designed and intended behaviour.

But it would be a bug if the same thing happens when the user
explicitly tells us that the file has CRLF line endings, and I
suspect we have that bug, which may want to be corrected.

I've Cc'ed various people who worked on convert.c around line
endings. I recall we saw a few other discussion threads on text=auto
and eol settings. Stakeholders may want to have a unified discussion
to first list the issues in the current implementation and come up
with fixes for them.

Thanks.

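The rule described above (with text=auto, leave the file alone as soon as one CR is not part of a CRLF pair) can be sketched in shell. This is only an illustration under stated assumptions: `cr_stats` and `auto_guess` are hypothetical helper names, not Git commands, and the sketch models the bare-CR rule alone, not the rest of Git's text/binary detection.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Git): model only the bare-CR rule
# described above. Git's real detection also checks for other signs of
# binary content, which this sketch ignores.

# Print "<crlf> <lone>": the number of CR bytes directly followed by LF,
# and the number of CR bytes followed by anything else (or by end of file).
cr_stats() {
    od -An -v -tx1 < "$1" | tr -s ' ' '\n' | awk '
        NF == 0 { next }                      # skip empty lines left by od/tr
        {
            if (prev == "0d") {               # previous byte was CR
                if ($1 == "0a") { crlf++ }    # CR LF pair
                else { lone++ }               # CR followed by something else
            }
            prev = $1
        }
        END {
            if (prev == "0d") { lone++ }      # CR as the very last byte
            print crlf + 0, lone + 0
        }'
}

# Print "convert" when every CR belongs to a CRLF pair, else "keep".
auto_guess() {
    stats=$(cr_stats "$1")
    lone=${stats#* }
    if [ "$lone" -eq 0 ]; then
        echo convert
    else
        echo keep
    fi
}
```

For the inline-cr.txt file from the report, `cr_stats inline-cr.txt` prints `4 2` (four CRLF pairs, two lone CRs), so `auto_guess` prints `keep`, which matches the unchanged index content in the report.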
* Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
From: Junio C Hamano @ 2015-04-22 17:42 UTC
To: Alexandre Garnier; +Cc: Torsten Bögershausen, git

Alexandre Garnier <zigarn@gmail.com> writes:

> Indeed, when changing the gitattributes for '* text', the replacement is OK.

OK. Earlier I said:

>> But it would be a bug if the same thing happens when the user
>> explicitly tells us that the file has CRLF line endings, and I
>> suspect we have that bug, which may want to be corrected.

but you are saying that my suspicion is incorrect and we do not have
such a bug.

Thanks for digging further.

* Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
From: Torsten Bögershausen @ 2015-04-21 19:28 UTC
To: Alexandre Garnier, git

On 2015-04-21 15.51, Alexandre Garnier wrote:
> Here is a test:
> [...]

First of all, thanks for the info.

The current implementation of Git auto-detects whether a file is text
or binary.

A file which is "suspected to be text" is expected to have either LF
or CRLF as line endings, but a "bare CR" makes Git wonder:
Should this still be treated as a text file?
If yes, should the CR be kept as is, or should it be converted into
LF (or CRLF)?

The current implementation may simply be explained by the fact that
nobody has so far asked to treat this file as "text", so the
implementation assumes it to be binary.

(Which made the code a little bit easier at the time it was written.)

So the status of today is that you can force Git to leave the CR as
is, when you specify that the file is "text".

Is there a real-life problem behind it?
And what should happen to the CRs?

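The outcome reported in this thread for an explicit "text" attribute (CRLF becomes LF, every other CR stays as is) can be approximated outside Git with a short shell sketch. `crlf_to_lf` is a hypothetical helper name, not a Git command, and the sketch makes no claim about how convert.c does it internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): approximate the CRLF=>LF
# conversion reported for an explicit "text" attribute by removing one
# CR directly before each LF and leaving every other CR untouched.
# Caveat: a CR at the very end of a file with no final LF is removed
# too, because sed cannot tell that the last line was unterminated.
crlf_to_lf() {
    cr=$(printf '\r')
    LC_ALL=C sed "s/${cr}\$//" "$1"
}
```

Running `crlf_to_lf inline-cr.txt | cat -A` on the file from the report should show the "Expected result" listing from the first message, including the `again content with^M$` line that keeps its extra CR.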
* Re: [BUG] Git does not convert CRLF=>LF on files with \r not before \n
From: Alexandre Garnier @ 2015-04-22 13:06 UTC
To: Torsten Bögershausen; +Cc: git

Indeed, when changing the .gitattributes to '* text', the replacement is OK.
Thanks for all the explanations.

At first, my use case was some source files (imported from another
VCS) with CR in different contexts:
- lines ending with CRCRLF
- all content in LF or CRLF but some CR that should be EOL...
- CR in the middle of the line for no reason!
For all this, I will fix the files during import.

But when digging I found some shell or awk scripts with CR as a valid
char in a search/replacement string.
I know that the EOL should not be CRLF in this case, but I don't know
if this situation could happen in DOS batch files or PowerShell
scripts with CRLF EOL.

2015-04-21 21:28 GMT+02:00 Torsten Bögershausen <tboegi@web.de>:
> On 2015-04-21 15.51, Alexandre Garnier wrote:
>> Here is a test:
>> [...]
>
> First of all, thanks for the info.
> [...]

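For the import cleanup mentioned above, a first step could be to list the lines that carry a stray CR. The following is a minimal sketch, assuming a POSIX shell with grep and cut; `stray_cr_lines` is a hypothetical helper name and nothing here comes from Git itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print the numbers of the lines
# that contain a CR followed by at least one more character before the
# LF, i.e. a CR that is not simply the CR of a trailing CRLF.
# This covers CRCRLF endings and CRs in the middle of a line.
# Caveat: a CR that is the last byte of a file without a final LF is
# not reported.
stray_cr_lines() {
    cr=$(printf '\r')
    LC_ALL=C grep -n "${cr}." "$1" | cut -d: -f1
}
```

On the inline-cr.txt file from the original report this prints 2 and 4, the two lines that stop text=auto from converting the file.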
end of thread

Thread overview: 5+ messages
2015-04-21 13:51 [BUG] Git does not convert CRLF=>LF on files with \r not before \n Alexandre Garnier
2015-04-21 17:41 ` Junio C Hamano
2015-04-22 17:42   ` Junio C Hamano
2015-04-21 19:28 ` Torsten Bögershausen
2015-04-22 13:06   ` Alexandre Garnier