git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Lars Schneider <larsxschneider@gmail.com>
Cc: "Torsten Bögershausen" <tboegi@web.de>,
	"Git List" <git@vger.kernel.org>, "Johannes Sixt" <j6t@kdbg.org>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Jeff King" <peff@peff.net>,
	"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
	Johannes.Schindelin@gmx.de
Subject: Re: [PATCH v5 5/7] convert: add 'working-tree-encoding' attribute
Date: Wed, 31 Jan 2018 14:45:08 -0800	[thread overview]
Message-ID: <xmqqtvv1r6kr.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <57086A32-6A2A-4B66-A355-10408C9FE0B0@gmail.com> (Lars Schneider's message of "Wed, 31 Jan 2018 20:12:44 +0100")

Lars Schneider <larsxschneider@gmail.com> writes:

>> I am not sure why this is special cased and other codepaths have "if
>> WRITE_OBJECT then die, otherwise error" checks, so no, I do not
>> agree with your reasoning, at least not yet.
>
> The convert_to_git()/encode_to_git() machinery is used in two different
> kinds of code paths:
>
> Some code paths actually write to the Git database (indicated by the 
> CONV_WRITE_OBJECT flag). I consider these the "critical/important" code 
> paths and I don't want to tolerate any encoding errors in these cases as 
> the errors would be "forever" in the Git database. That's why I call 
> die() on errors for these cases to abort whatever we are doing. 
>
> Other code paths do not write to the Git database (e.g. during "git 
> checkout" we use the code to ensure that we are moving away from the 
> exact state that we think we are moving away). In these code paths I am 
> less concerned about encoding errors. I also don't want to abort the 
> operation (e.g. "git checkout") in these cases. That's why I only inform
> the user about the problem with an error message.

Warning the users early while they are doing non-writing
operation to give them chance to adjust the contents, before they
actually need to register the contents as objects by writing, at
which point we need to die.  That's a reasonable distinction and all
of that I already agree with.

What was questionable and left unexplained was why this roundtrip
thing needs to be different.

> The encoding round-trip check can be expensive. That's why I decided 
> initially to only execute the check in the "critical/important" 
> write-to-Git-database situations (CONV_WRITE_OBJECT flag!). I also 
> decided to run it only if the "SHIFT-JIS" encoding is used as this was
> the only encoding that I could find which reportedly does not round-trip
> with UTF-8 (although I was not able to replicate the round-trip 
> problems). 

I still do not see why you have problems with the approach of
maintaining a configurable set of "iffy" encodings (and throw SJIS
into the default list) to achieve all of the above and more.  For
SJIS users, instead of having to set environment variables to obtain
safe behaviour, they automatically get safe behaviour.  When using
encodings that are not problematic, they do not need to spend cycles
checking round-trip.  And when SJIS users know they do not care about
roundtrip checks, they can just configure SJIS away from the list.

  reply	other threads:[~2018-01-31 22:45 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-20 15:24 [PATCH v4 0/6] convert: add support for different encodings lars.schneider
2018-01-20 15:24 ` [PATCH v4 1/6] strbuf: remove unnecessary NUL assignment in xstrdup_tolower() lars.schneider
2018-01-20 15:24 ` [PATCH v4 2/6] strbuf: add xstrdup_toupper() lars.schneider
2018-01-20 15:24 ` [PATCH v4 3/6] utf8: add function to detect prohibited UTF-16/32 BOM lars.schneider
2018-01-20 15:24 ` [PATCH v4 4/6] utf8: add function to detect a missing " lars.schneider
2018-01-20 15:24 ` [PATCH v4 5/6] convert: add 'working-tree-encoding' attribute lars.schneider
2018-01-21 14:22   ` Simon Ruderich
2018-01-22 12:35     ` Lars Schneider
2018-01-23  0:54       ` Jeff King
2018-01-23 10:25         ` Simon Ruderich
2018-01-23 16:20           ` Jeff King
2018-01-23 21:12             ` Junio C Hamano
2018-01-23  9:27       ` Simon Ruderich
2018-01-22 18:00   ` SQUASH convert: add tracing for " lars.schneider
2018-01-22 19:37     ` Eric Sunshine
2018-01-23 10:17   ` [PATCH v2] " lars.schneider
2018-01-20 15:24 ` [PATCH v4 6/6] " lars.schneider
2018-01-23 20:19 ` [PATCH v4 0/6] convert: add support for different encodings Torsten Bögershausen
2018-01-23 20:53   ` Junio C Hamano
2018-01-29 20:18     ` [PATCH v5 0/7] " tboegi
2018-01-29 20:18     ` [PATCH v5 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower() tboegi
2018-01-29 20:19     ` [PATCH v5 2/7] strbuf: add xstrdup_toupper() tboegi
2018-01-29 20:19     ` [PATCH v5 3/7] utf8: add function to detect prohibited UTF-16/32 BOM tboegi
2018-01-29 20:19     ` [PATCH v5 4/7] utf8: add function to detect a missing " tboegi
2018-01-30 19:15       ` Junio C Hamano
2018-01-30 20:58         ` Lars Schneider
2018-01-30 21:54           ` Junio C Hamano
2018-01-29 20:19     ` [PATCH v5 5/7] convert: add 'working-tree-encoding' attribute tboegi
2018-01-30 20:05       ` Junio C Hamano
2018-01-30 20:31         ` Lars Schneider
2018-01-30 21:56           ` Junio C Hamano
2018-01-31 19:12             ` Lars Schneider
2018-01-31 22:45               ` Junio C Hamano [this message]
2018-01-29 20:19     ` [PATCH v5 6/7] convert: add tracing for " tboegi
2018-01-29 20:19     ` [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding tboegi
2018-01-30 11:23       ` Lars Schneider
2018-01-30 14:40         ` Torsten Bögershausen
2018-01-30 15:14           ` Lars Schneider
2018-01-31 17:28             ` Torsten Bögershausen
2018-01-31 19:37               ` Lars Schneider
2018-02-02 19:17               ` Junio C Hamano
2018-02-07  6:31                 ` Torsten Bögershausen
2018-02-07 18:12                   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtvv1r6kr.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=j6t@kdbg.org \
    --cc=larsxschneider@gmail.com \
    --cc=peff@peff.net \
    --cc=ramsay@ramsayjones.plus.com \
    --cc=sunshine@sunshineco.com \
    --cc=tboegi@web.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).