From: tboegi@web.de
To: lars.schneider@autodesk.com, git@vger.kernel.org, j6t@kdbg.org,
sunshine@sunshineco.com, peff@peff.net,
ramsay@ramsayjones.plus.com, Johannes.Schindelin@gmx.de,
--to=larsxschneider@gmail.com
Cc: "Torsten Bögershausen" <tboegi@web.de>
Subject: [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding
Date: Mon, 29 Jan 2018 21:19:11 +0100 [thread overview]
Message-ID: <20180129201911.9484-1-tboegi@web.de> (raw)
In-Reply-To: <xmqqshawfgaa.fsf@gitster.mtv.corp.google.com>
From: Torsten Bögershausen <tboegi@web.de>
UTF-16 encoded files are treated as "binary" by Git, and no CRLF
conversion is done.
When the UTF-16 encoded files are converted into UF-8 using the new
"working-tree-encoding", the CRLF are converted if core.autocrlf is true.
This may lead to confusion:
A tool writes an UTF-16 encoded file with CRLF.
The file is commited with core.autocrlf=true, the CLRF are converted into LF.
The repo is pushed somewhere and cloned by a different user, who has
decided to use core.autocrlf=false.
He uses the same tool, and now the CRLF are not there as expected, but LF,
make the file useless for the tool.
Avoid this (possible) confusion by ignoring core.autocrlf for all files
which have "working-tree-encoding" defined.
The user can still use a .gitattributes file and specify the line endings
like "text=auto", "text", or "text eol=crlf" and let that .gitattribute
file travel together with push and clone.
Change convert.c to e more careful, simplify the initialization when
attributes are retrived (and none are specified) and update the documentation.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
Documentation/gitattributes.txt | 9 ++++++---
convert.c | 15 ++++++++++++---
2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index a8dbf4be3..3665c4677 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -308,12 +308,15 @@ Use the `working-tree-encoding` attribute only if you cannot store a file in
UTF-8 encoding and if you want Git to be able to process the content as
text.
+Note that when `working-tree-encoding` is defined, core.autocrlf is ignored.
+Set the `text` attribute (or `text=auto`) to enable CRLF conversions.
+
Use the following attributes if your '*.txt' files are UTF-16 encoded
-with byte order mark (BOM) and you want Git to perform automatic line
-ending conversion based on your platform.
+with byte order mark (BOM) and you want Git to perform line
+ending conversion based on core.eol.
------------------------
-*.txt text working-tree-encoding=UTF-16
+*.txt working-tree-encoding=UTF-16 text
------------------------
Use the following attributes if your '*.txt' files are UTF-16 little
diff --git a/convert.c b/convert.c
index 13fad490c..e7f11d1db 100644
--- a/convert.c
+++ b/convert.c
@@ -1264,15 +1264,24 @@ static void convert_attrs(struct conv_attrs *ca, const char *path)
}
ca->checkout_encoding = git_path_check_encoding(ccheck + 5);
} else {
- ca->drv = NULL;
- ca->crlf_action = CRLF_UNDEFINED;
- ca->ident = 0;
+ memset(ca, 0, sizeof(*ca));
}
/* Save attr and make a decision for action */
ca->attr_action = ca->crlf_action;
if (ca->crlf_action == CRLF_TEXT)
ca->crlf_action = text_eol_is_crlf() ? CRLF_TEXT_CRLF : CRLF_TEXT_INPUT;
+ /*
+ * Often UTF-16 encoded files are read and written by programs which
+ * really need CRLF, and it is important to keep the CRLF "as is" when
+ * files are committed with core.autocrlf=true and the repo is pushed.
+ * The CRLF would be converted into LF when the repo is cloned to
+ * a machine with core.autocrlf=false.
+ * Obey the "text" and "eol" attributes and be independent on the
+ * local core.autocrlf for all "encoded" files.
+ */
+ if ((ca->crlf_action == CRLF_UNDEFINED) && ca->checkout_encoding)
+ ca->crlf_action = CRLF_BINARY;
if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_FALSE)
ca->crlf_action = CRLF_BINARY;
if (ca->crlf_action == CRLF_UNDEFINED && auto_crlf == AUTO_CRLF_TRUE)
--
2.16.0.rc0.2.g64d3e4d0cc.dirty
next prev parent reply other threads:[~2018-01-29 20:19 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-20 15:24 [PATCH v4 0/6] convert: add support for different encodings lars.schneider
2018-01-20 15:24 ` [PATCH v4 1/6] strbuf: remove unnecessary NUL assignment in xstrdup_tolower() lars.schneider
2018-01-20 15:24 ` [PATCH v4 2/6] strbuf: add xstrdup_toupper() lars.schneider
2018-01-20 15:24 ` [PATCH v4 3/6] utf8: add function to detect prohibited UTF-16/32 BOM lars.schneider
2018-01-20 15:24 ` [PATCH v4 4/6] utf8: add function to detect a missing " lars.schneider
2018-01-20 15:24 ` [PATCH v4 5/6] convert: add 'working-tree-encoding' attribute lars.schneider
2018-01-21 14:22 ` Simon Ruderich
2018-01-22 12:35 ` Lars Schneider
2018-01-23 0:54 ` Jeff King
2018-01-23 10:25 ` Simon Ruderich
2018-01-23 16:20 ` Jeff King
2018-01-23 21:12 ` Junio C Hamano
2018-01-23 9:27 ` Simon Ruderich
2018-01-22 18:00 ` SQUASH convert: add tracing for " lars.schneider
2018-01-22 19:37 ` Eric Sunshine
2018-01-23 10:17 ` [PATCH v2] " lars.schneider
2018-01-20 15:24 ` [PATCH v4 6/6] " lars.schneider
2018-01-23 20:19 ` [PATCH v4 0/6] convert: add support for different encodings Torsten Bögershausen
2018-01-23 20:53 ` Junio C Hamano
2018-01-29 20:18 ` [PATCH v5 0/7] " tboegi
2018-01-29 20:18 ` [PATCH v5 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower() tboegi
2018-01-29 20:19 ` [PATCH v5 2/7] strbuf: add xstrdup_toupper() tboegi
2018-01-29 20:19 ` [PATCH v5 3/7] utf8: add function to detect prohibited UTF-16/32 BOM tboegi
2018-01-29 20:19 ` [PATCH v5 4/7] utf8: add function to detect a missing " tboegi
2018-01-30 19:15 ` Junio C Hamano
2018-01-30 20:58 ` Lars Schneider
2018-01-30 21:54 ` Junio C Hamano
2018-01-29 20:19 ` [PATCH v5 5/7] convert: add 'working-tree-encoding' attribute tboegi
2018-01-30 20:05 ` Junio C Hamano
2018-01-30 20:31 ` Lars Schneider
2018-01-30 21:56 ` Junio C Hamano
2018-01-31 19:12 ` Lars Schneider
2018-01-31 22:45 ` Junio C Hamano
2018-01-29 20:19 ` [PATCH v5 6/7] convert: add tracing for " tboegi
2018-01-29 20:19 ` tboegi [this message]
2018-01-30 11:23 ` [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding Lars Schneider
2018-01-30 14:40 ` Torsten Bögershausen
2018-01-30 15:14 ` Lars Schneider
2018-01-31 17:28 ` Torsten Bögershausen
2018-01-31 19:37 ` Lars Schneider
2018-02-02 19:17 ` Junio C Hamano
2018-02-07 6:31 ` Torsten Bögershausen
2018-02-07 18:12 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180129201911.9484-1-tboegi@web.de \
--to=tboegi@web.de \
--cc=--to=larsxschneider@gmail.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=git@vger.kernel.org \
--cc=j6t@kdbg.org \
--cc=lars.schneider@autodesk.com \
--cc=peff@peff.net \
--cc=ramsay@ramsayjones.plus.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).