From mboxrd@z Thu Jan  1 00:00:00 1970
From: Don Slutz <dslutz@verizon.com>
Subject: Re: pre_parse and strcmp
Date: Sat, 04 Oct 2014 10:57:14 -0400
Message-ID: <54300ACA.7060305@terremark.com>
References: <20141002210939.GA11350@laptop.dumpdata.com>
	<20141002211506.GA11421@laptop.dumpdata.com>
	<542EDA3B02000078000BE950@mail.emea.novell.com>
	<542ECD53.7090807@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <dslutz@verizon.com>) id 1XaQlq-00067W-3V
	for xen-devel@lists.xenproject.org; Sat, 04 Oct 2014 14:57:18 +0000
In-Reply-To: <542ECD53.7090807@oracle.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: konrad wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xenproject.org, Jan Beulich <jbeulich@suse.com>
List-Id: xen-devel@lists.xenproject.org

On 10/03/14 12:22, konrad wilk wrote:
> On 10/3/2014 12:17 PM, Jan Beulich wrote:
>>>>> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> 10/02/14 11:15 PM >>>
>>> Hit send too fast [was going to include a patch in this email
>>> once I had completed this]
>>>
>>> My thinking is that the best solution is to have a function similar to
>>> 'pre_parse' that would convert the in-memory buffer from UTF-16 to a
>>> normal ASCII one.
>>>
>>> And hook it up in pre-parse to fix this up.
>>
>> I certainly don't mind a patch to deal with UTF-16 config files (in fact,
>> when I originally coded it I already considered this a good future
>> enhancement). I'm not, however, convinced that simply converting back to
>> ASCII is the proper solution here. Instead, if we want to allow UTF-16
>> config files, we should do the conversion the other way around.
>
> OK. That will take some time to cobble up.

A simpler change might be to convert to UTF-8. My guess is that the buffer
would then look like ASCII, so strcmp would continue to "work".
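
A minimal sketch of that idea, assuming the file is little-endian UTF-16
and already sits in a memory buffer. The helper name utf16le_to_utf8 and
its signature are hypothetical and not part of the existing code:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not existing Xen code): convert in_len bytes of
 * UTF-16LE into NUL-terminated UTF-8. A leading BOM (FF FE) is skipped.
 * Returns the number of bytes written (excluding the NUL), or -1 on
 * malformed input or if out is too small.
 */
static long utf16le_to_utf8(const unsigned char *in, size_t in_len,
                            char *out, size_t out_size)
{
    size_t i = 0, o = 0;

    if (out_size == 0 || (in_len % 2) != 0)
        return -1;

    /* Skip the byte order mark if present. */
    if (in_len >= 2 && in[0] == 0xFF && in[1] == 0xFE)
        i = 2;

    while (i < in_len) {
        uint32_t cp = in[i] | ((uint32_t)in[i + 1] << 8);
        size_t need;

        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo;

            if (i >= in_len)
                return -1;
            lo = in[i] | ((uint32_t)in[i + 1] << 8);
            i += 2;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1; /* unpaired low surrogate */
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= out_size) /* keep one byte for the NUL */
            return -1;

        if (need == 1) {
            out[o++] = (char)cp; /* plain ASCII passes through unchanged */
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    out[o] = '\0';
    return (long)o;
}
```

Since code points below 0x80 come out as single bytes, ASCII keywords in
the converted buffer should compare equal under strcmp without further
changes.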

    -Don Slutz

>>
>> And then of course there is the problem of detection: the example you gave
>> didn't make clear whether the file properly starts with a BOM, yet if it
>> doesn't, telling ASCII from UTF-16 is guesswork.
>
> Ah, so that is what the odd character at the start was (BOM)! Yes, the file
> is very much that type.
>>
>> Jan
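
On the detection point quoted above, a BOM check could be sketched roughly
as follows. The enum and function names are made up for illustration, and
a file without a BOM is simply reported as unknown rather than guessed at:

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum cfg_encoding {
    CFG_ENC_UNKNOWN,  /* no BOM: could be ASCII, UTF-8, or BOM-less UTF-16 */
    CFG_ENC_UTF8,     /* EF BB BF */
    CFG_ENC_UTF16LE,  /* FF FE */
    CFG_ENC_UTF16BE   /* FE FF */
};

/* Classify a buffer by its leading byte order mark, if any. */
static enum cfg_encoding detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return CFG_ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return CFG_ENC_UTF16LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return CFG_ENC_UTF16BE;
    return CFG_ENC_UNKNOWN;
}
```

Note that FF FE followed by 00 00 is also the UTF-32LE BOM; this sketch
does not try to distinguish that case.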