public inbox for kvm@vger.kernel.org
From: Avi Kivity <avi@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Chris Wright <chrisw@redhat.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] KVM call minutes for Feb 15
Date: Thu, 17 Feb 2011 15:25:21 +0200
Message-ID: <4D5D21C1.80009@redhat.com>
In-Reply-To: <4D5D1E54.1070704@codemonkey.ws>

On 02/17/2011 03:10 PM, Anthony Liguori wrote:
> On 02/17/2011 06:23 AM, Avi Kivity wrote:
>> On 02/17/2011 02:12 PM, Anthony Liguori wrote:
>>>> (btw what happens in a non-UTF-8 locale? I guess we should just 
>>>> reject unencodable strings).
>>>
>>>
>>> While QEMU is mostly ASCII internally, for the purposes of the JSON 
>>> parser, we always encode and decode UTF-8.  We reject invalid UTF-8 
>>> sequences.  But since JSON is string-encoded unicode, we can always 
>>> decode a JSON string to valid UTF-8 as long as the string is well 
>>> formed.
>>
>> That is wrong.  If the user passes a Unicode filename it is expected 
>> to be translated to the current locale encoding for the purpose of, 
>> say, filename lookup.
>
> QEMU does not support anything but UTF-8.

Since when?

AFAICT, JSON string conversion is the only place where there is any 
dependency on UTF-8.  Anything else should just work.

>
> That's pretty common with Unix software.  I don't think any modern 
> Unix platform actually uses UCS2 or UTF-16.  It's either ascii or UTF-8.

Most/all Linux distributions support UTF-8 as well as a zillion other 
encodings (single-byte ASCII + another charset, or multi-byte charsets 
for languages with many characters).

> The only place it even matters is Windows and Windows has ASCII and 
> UTF-16 versions of their APIs.  So on Windows, non-ASCII characters 
> won't be handled correctly (yet another one of the many issues with 
> Windows support in QEMU).  UTF-8 is self-recovering though so it 
> degrades gracefully.

It matters on Linux with el_GR.iso88597, for example.  If you feed a 
JSON string and translate it blindly to UTF-8, you'll get garbage when 
you feed it to system calls.
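
A rough illustration of the mismatch, as a sketch rather than anything 
taken from QEMU.  The file name is made up, and Python's "iso8859_7" 
codec is assumed to stand in for the charset of an el_GR.iso88597 locale:

```python
# Illustration only: hypothetical file name, and "iso8859_7" is assumed to
# match the charset used by an el_GR.iso88597 locale.
name = "αβγ.img"

# Bytes produced by a blind conversion of the JSON string to UTF-8.
utf8_bytes = name.encode("utf-8")

# Bytes that file names actually use on disk under the ISO-8859-7 locale.
locale_bytes = name.encode("iso8859_7")

# The two byte strings differ, so a lookup using the UTF-8 bytes would not
# match a file that was created under the locale's encoding.
mismatch = utf8_bytes != locale_bytes

# How the UTF-8 bytes appear when interpreted in the locale's charset:
# every Greek letter becomes two unrelated characters.
garbled = utf8_bytes.decode("iso8859_7", errors="replace")

print(len(utf8_bytes), len(locale_bytes), mismatch)
print(garbled)
```

The Greek letters take two bytes each in UTF-8 and one byte each in 
ISO-8859-7, so the two forms cannot be interchanged at the system call 
boundary.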

Practically everyone uses UTF-8 these days, so the impact is minimal, 
but it is more correct (as well as simpler) to ask the system libraries 
to encode using the current locale.
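
One way to picture "encode using the current locale" is the sketch 
below.  It is not QEMU code: the helper name encode_for_syscall and its 
optional encoding argument are hypothetical, added only so the behaviour 
can be shown without switching locales.  In C the analogous step would 
presumably go through iconv(3) with the codeset reported by 
nl_langinfo(CODESET).

```python
import locale


def encode_for_syscall(text, encoding=None):
    """Encode text in the locale's charset, rejecting unencodable strings.

    `encoding` is a hypothetical override for demonstration; when it is
    omitted, the charset of the current locale is used.
    """
    if encoding is None:
        # Ask the system libraries which charset the current locale uses.
        encoding = locale.getpreferredencoding(False)
    # errors="strict" raises UnicodeEncodeError instead of silently
    # substituting characters the locale cannot represent.
    return text.encode(encoding, errors="strict")


if __name__ == "__main__":
    # Encodable in ISO-8859-7: one byte per Greek letter.
    print(encode_for_syscall("αβγ.img", "iso8859_7"))
    # Not encodable in ISO-8859-7: rejected rather than mangled.
    try:
        encode_for_syscall("日本.img", "iso8859_7")
    except UnicodeEncodeError as exc:
        print("rejected:", exc.reason)
```

Strings that the locale cannot represent are refused outright, which 
matches the earlier suggestion to reject unencodable strings.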

-- 
error compiling committee.c: too many arguments to function

