linux-arm-kernel.lists.infradead.org archive mirror
From: jamie@shareable.org (Jamie Lokier)
To: linux-arm-kernel@lists.infradead.org
Subject: Sending UTF-8 patches (was: [PATCH 2/2] Remove now-defunct ts7250 nand driver)
Date: Wed, 6 Jan 2010 23:21:28 +0000
Message-ID: <20100106232128.GE24250@shareable.org>
In-Reply-To: <1262803010.3181.8484.camel@macbook.infradead.org>

David Woodhouse wrote:
> > That's unfortunate.  An option to git-am or its subsidiary tools to
> > convert the patch as well as the commit would be useful.  After all it
> > _is_ made clear in the MIME header how it's formatted.
> 
> ISTR there was some resistance to that suggestion when git-am was first
> fixed to handle the Content-Type of mails. The idea was that the patch
> should be considered sacrosanct and shouldn't be mangled.

Looks like they forgot mailers work with text, not preservation of
octets, and mailers mangle the octets in standardised ways, so a bit
of unmangling is needed on occasions.
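
To illustrate the kind of unmangling meant here, this is a minimal
sketch using Python's standard email package.  It is not what git-am
does; the sample message, the name in it and the helper body_as_utf8()
are made up for the example.  It undoes the transfer encoding, reads the
charset from the MIME header, and recodes the body to UTF-8.

```python
from email import message_from_bytes

# Hypothetical mail: ISO-8859-1 text sent as quoted-printable.
RAW = (
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Signed-off-by: Bj=F6rn Example\n"
)

def body_as_utf8(raw_message):
    """Return the body of a single-part mail recoded to UTF-8 bytes."""
    msg = message_from_bytes(raw_message)
    # decode=True undoes quoted-printable or base64 and yields raw octets.
    octets = msg.get_payload(decode=True)
    # Use the charset declared in Content-Type; assume US-ASCII if absent.
    charset = msg.get_content_charset("us-ascii")
    return octets.decode(charset).encode("utf-8")

utf8_body = body_as_utf8(RAW)
```

The same recoding applied to the patch as well as the commit message is
what the option discussed above would amount to.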

> Personally, I suspect you're right, and it should be converted too.

It would need to be optional for git users whose source code isn't UTF-8 -
possibly converting the other way for them.  But yeah I think it'd make
sense to be on by default.

> > > Care to join us in the 21st century?
> > 
> > You mean send the mail in UTF-8 format when it only contains
> > characters in ISO-8859-1?  To make that the default behaviour of an
> > email sender would possibly violate RFC2045, 
> 
> Um, why? Can you point at the particular section you think would be
> violated?

Section 4.1.2, Charset Parameter, final paragraph:

>>   In general, composition software should always use the "lowest common
>>   denominator" character set possible.  For example, if a body contains
>>   only US-ASCII characters, it SHOULD be marked as being in the US-
>>   ASCII character set, not ISO-8859-1, which, like all the ISO-8859
>>   family of character sets, is a superset of US-ASCII.  More generally,
>>   if a widely-used character set is a subset of another character set,
>>   and a body contains only characters in the widely-used subset, it
>>   should be labelled as being in that subset.  This will increase the
>>   chances that the recipient will be able to view the resulting entity
>>   correctly.

It's a SHOULD, but it's still a good idea.  ISO-8859-1 is still very
widely-used for email.

Also in that section:

>>    (1)   US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].
>>
>>    (2)   ISO-8859-X -- where "X" is to be replaced, as
>>          necessary, for the parts of ISO-8859 [ISO-8859].  Note
>>          that the ISO 646 character sets have deliberately been
>>          omitted in favor of their 8859 replacements, which are
>>          the designated character sets for Internet mail.  As of
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>          the publication of this document, the legitimate values
>>          for "X" are the digits 1 through 10.

> Would you advise that I send a mail as EBCDIC if it can fit into that?

Obviously not - RFC2045 does not recommend that, so mailers don't do it
in their default configurations.  But they do recode text into the
lowest common charset that can represent the text.  In practice that
means no effect on ASCII, but does affect some non-ASCII characters.

Mutt out of the box tries us-ascii / iso-8859-1 / utf-8, in that order,
to maximise the chance of recipients being able to read the mail.  Even
on a fully 21st-century-ised Linux with UTF-8 terminals etc. :-) I don't
know what other mailers do, sorry, but I'd expect them to do the same.
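
The "try each charset in order" behaviour can be sketched in a few lines
of Python.  This is only an illustration of the idea, not Mutt's code;
pick_charset() is a hypothetical helper, and the candidate list simply
mirrors the order given above.

```python
def pick_charset(text, candidates=("us-ascii", "iso-8859-1", "utf-8")):
    """Return the first candidate charset that can encode all of text."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            # Some character is outside this charset; try the next one.
            continue
        return charset
    raise ValueError("no candidate charset can represent the text")

ascii_only = pick_charset("Remove now-defunct ts7250 nand driver")
latin1 = pick_charset("Bj\u00f6rn")      # o-umlaut fits in ISO-8859-1
needs_utf8 = pick_charset("\u20ac 10")   # the euro sign does not
```

Plain ASCII text stays labelled us-ascii, so only mails that actually
contain non-ASCII characters end up with a wider charset.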

-- Jamie
