From: Junio C Hamano <gitster@pobox.com>
To: Robin Rosenberg <robin.rosenberg.lists@dewire.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: [PATCH 2/2] send-email: rfc2047-quote subject lines with non-ascii characters
Date: Sun, 30 Mar 2008 16:47:16 -0700
Message-ID: <7vwsnjwz97.fsf@gitster.siamese.dyndns.org>
In-Reply-To: 200803290941.54091.robin.rosenberg.lists@dewire.com
Robin Rosenberg <robin.rosenberg.lists@dewire.com> writes:
> On Saturday 29 March 2008 08.22.03, Jeff King wrote:
>> On Sat, Mar 29, 2008 at 08:19:07AM +0100, Robin Rosenberg wrote:
>> > On Friday 28 March 2008 22.29.01, Jeff King wrote:
>> > > We always use 'utf-8' as the encoding, since we currently
>> > > have no way of getting the information from the user.
>> >
>> > Don't set encoding to UTF-8 unless it actually looks like UTF-8.
>>
>> OK. Do you have an example function that guesses with high probability
>> whether a string is utf-8? If there are non-ascii characters but we
>> _don't_ guess utf-8, what should we do?
>
> Any test for valid UTF-8 will do that with a very high probability. The
> perl UTF-8 "api" is a mess. I couldn't find such a routine!?. Calling
> decode/encode and see if you get the original string works, but that is too
> clumsy, IMHO.

Decoding and then re-encoding tests two things: whether the string is
valid UTF-8, and whether it is canonically encoded. That is testing too
much. You only want to check that it is valid; you do not care about
normalization.

I see this in perluniintro.pod:

    =item *

    How Do I Detect Data That's Not Valid In a Particular Encoding?

    Use the C<Encode> package to try converting it.
    For example,

        use Encode 'decode_utf8';
        if (decode_utf8($string_of_bytes_that_I_think_is_utf8)) {
            # valid
        } else {
            # invalid
        }
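
One caveat with the snippet as quoted: without a CHECK argument, decode_utf8() substitutes U+FFFD for malformed bytes instead of failing, and an empty string or "0" is false in Perl even though both are valid. A stricter variant might look like the sketch below. It is not taken from perluniintro or from git-send-email; the helper name is_valid_utf8 is made up for illustration, and it assumes the input is a raw byte string.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of git-send-email): returns 1 if $bytes
# is well-formed UTF-8 and 0 otherwise. It checks validity only and does
# not compare a re-encoded copy against the input.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $ok = eval {
        # FB_CROAK makes decode() die on malformed input;
        # LEAVE_SRC keeps decode() from modifying $bytes in place.
        Encode::decode('UTF-8', $bytes,
                       Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}
```

The eval block returns 1 on success, so the result does not depend on whether the decoded string itself happens to be true or false.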

For commit log messages, we traditionally use a similar idea: we guess by
checking whether the message looks like a UTF-8 encoded string, and
otherwise assume Latin-1 (and I think we still do if the user does not
tell us).

If this issue is only about the --compose part of send-email, perhaps you
can ask interactively instead of "otherwise assume Latin-1"?
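
To make the "looks like UTF-8, otherwise assume Latin-1" guess concrete, here is a rough sketch in Perl. It is not git's actual code; the function name guess_charset and the returned labels are assumptions for illustration, as is the extra US-ASCII case for strings with no high-bit bytes.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not git's actual code: guess a charset label
# for a raw byte string.
sub guess_charset {
    my ($bytes) = @_;

    # Pure 7-bit text reads the same in either encoding, so there is
    # nothing to guess.
    return 'US-ASCII' unless $bytes =~ /[^\x00-\x7F]/;

    # Validity check only: decode() dies on malformed input because
    # of FB_CROAK, and LEAVE_SRC leaves $bytes untouched.
    my $ok = eval {
        Encode::decode('UTF-8', $bytes,
                       Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };

    # Fall back to Latin-1 when the bytes are not valid UTF-8.
    return $ok ? 'UTF-8' : 'ISO-8859-1';
}
```

In the interactive variant suggested above, the final fallback would be replaced by a prompt to the user rather than a silent 'ISO-8859-1'.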