From: Adam Borowski <kilobyte-b9QjgO8OEXPVItvQsEIGlw@public.gmane.org>
To: Roy Franz <roy.franz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
Cc: Linux Kernel Mailing List
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
Leif Lindholm
<leif.lindholm-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH 09/17] Move unicode to ASCII conversion to shared function.
Date: Fri, 20 Sep 2013 01:02:41 +0200 [thread overview]
Message-ID: <20130919230241.GA18666@angband.pl> (raw)
In-Reply-To: <CAFECyb-tTaCEqUHUh23MaJf-P42ZpodFKeNG=kE+vmmi-gyKrw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Wed, Sep 18, 2013 at 09:48:44PM -0700, Roy Franz wrote:
> On Wed, Sep 18, 2013 at 8:44 PM, Adam Borowski <kilobyte-b9QjgO8OEXPVItvQsEIGlw@public.gmane.org> wrote:
> > [UCS2 truncation]
>
> I stuck to re-arranging the code that was there, as I don't know enough
> about character encodings to propose changes.
I on the other hand don't know the kernel (lurking because of my first
patch), but I'm on a crusade against mangled Unicode (so far in the
userland). Can't let such a blatant error slip through on my watch :)
> Also, this code is running as part of the kernel decompressor, rather than
> the kernel itself, so it doesn't have access to any kernel facilities, and
> it also needs to be position independent.
Ok, so it can't reuse common libraries. No problem, a simplified, sanitized
and optimized copy of utf16s_to_utf8s() can be done in quite less code than
the original.
> It's running in a quite limited environment - the decompressor has
> its own copy of strstr(), and other string functions.
I'd need nothing but a way to alloc the new string. And I see this is
already done (efi_{low,high_alloc()).
> I checked the UEFI specification, and it states that all 16 bit strings
> are UCS-2, unless otherwise noted.
... which means it will either get upgraded to UTF-16 in a subsequent
version, or some Unicode strings get mangled. I'd ignore this bit and
implement full UTF-16 from the start: every legal UCS-2 string can be
decoded as UTF-16 so it's a strict superset.
> The load options that the command line is provided through a void pointer
> specified as: [snip]
Either a null pointer or a 16-bit string, that sounds clear enough.
I see not a word about endianness (does anything do EFI on big endian?),
but "same as host" seems to be a reasonable assumption.
> Would it be acceptable to fix the naming/comments, and convert values
> above 126 to '?' in the current patchset, and address a more thorough fix
> in another patch set? The ARM and ARM64 EFI stub patchsets that are
> mostly complete depend on this one, so getting this merged soon would be
> helpful.
I don't want to hinder your work, so what about putting in your version
as-is and fixing it later?
> > There's just one problem: which encoding to use, but
> > these days, most distributions have either dropped non-UTF8 or hardly pay
> > lip service, so we could get away with hard-coding UTF-8: those few who
> > use ancient charsets can stick to ASCII.
Not being able to use regular kernel facilities makes supporting ancient
charsets a lost cause. I'm so weeping about them... not.
> I would certainly appreciate your help improving this
Are we on the same page so far? If so, I can make a patch atop yours.
--
ᛊᚨᚾᛁᛏᚣ᛫ᛁᛊ᛫ᚠᛟᚱ᛫ᚦᛖ᛫ᚹᛖᚨᚲ
WARNING: multiple messages have this Message-ID (diff)
From: Adam Borowski <kilobyte@angband.pl>
To: Roy Franz <roy.franz@linaro.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-efi@vger.kernel.org, matt.fleming@intel.com,
Leif Lindholm <leif.lindholm@linaro.org>,
Mark Salter <msalter@redhat.com>
Subject: Re: [PATCH 09/17] Move unicode to ASCII conversion to shared function.
Date: Fri, 20 Sep 2013 01:02:41 +0200 [thread overview]
Message-ID: <20130919230241.GA18666@angband.pl> (raw)
In-Reply-To: <CAFECyb-tTaCEqUHUh23MaJf-P42ZpodFKeNG=kE+vmmi-gyKrw@mail.gmail.com>
On Wed, Sep 18, 2013 at 09:48:44PM -0700, Roy Franz wrote:
> On Wed, Sep 18, 2013 at 8:44 PM, Adam Borowski <kilobyte@angband.pl> wrote:
> > [UCS2 truncation]
>
> I stuck to re-arranging the code that was there, as I don't know enough
> about character encodings to propose changes.
I on the other hand don't know the kernel (lurking because of my first
patch), but I'm on a crusade against mangled Unicode (so far in the
userland). Can't let such a blatant error slip through on my watch :)
> Also, this code is running as part of the kernel decompressor, rather than
> the kernel itself, so it doesn't have access to any kernel facilities, and
> it also needs to be position independent.
Ok, so it can't reuse common libraries. No problem, a simplified, sanitized
and optimized copy of utf16s_to_utf8s() can be done in quite less code than
the original.
> It's running in a quite limited environment - the decompressor has
> its own copy of strstr(), and other string functions.
I'd need nothing but a way to alloc the new string. And I see this is
already done (efi_{low,high_alloc()).
> I checked the UEFI specification, and it states that all 16 bit strings
> are UCS-2, unless otherwise noted.
... which means it will either get upgraded to UTF-16 in a subsequent
version, or some Unicode strings get mangled. I'd ignore this bit and
implement full UTF-16 from the start: every legal UCS-2 string can be
decoded as UTF-16 so it's a strict superset.
> The load options that the command line is provided through a void pointer
> specified as: [snip]
Either a null pointer or a 16-bit string, that sounds clear enough.
I see not a word about endianness (does anything do EFI on big endian?),
but "same as host" seems to be a reasonable assumption.
> Would it be acceptable to fix the naming/comments, and convert values
> above 126 to '?' in the current patchset, and address a more thorough fix
> in another patch set? The ARM and ARM64 EFI stub patchsets that are
> mostly complete depend on this one, so getting this merged soon would be
> helpful.
I don't want to hinder your work, so what about putting in your version
as-is and fixing it later?
> > There's just one problem: which encoding to use, but
> > these days, most distributions have either dropped non-UTF8 or hardly pay
> > lip service, so we could get away with hard-coding UTF-8: those few who
> > use ancient charsets can stick to ASCII.
Not being able to use regular kernel facilities makes supporting ancient
charsets a lost cause. I'm so weeping about them... not.
> I would certainly appreciate your help improving this
Are we on the same page so far? If so, I can make a patch atop yours.
--
ᛊᚨᚾᛁᛏᚣ᛫ᛁᛊ᛫ᚠᛟᚱ᛫ᚦᛖ᛫ᚹᛖᚨᚲ
next prev parent reply other threads:[~2013-09-19 23:02 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-17 4:11 [PATCH V4 00/17] ARM EFI stub common code Roy Franz
2013-09-17 4:11 ` Roy Franz
2013-09-17 4:11 ` [PATCH 02/17] Add proper definitions for some EFI function pointers Roy Franz
2013-09-17 4:11 ` [PATCH 03/17] Move common EFI stub code from x86 arch code to common location Roy Franz
2013-09-17 4:11 ` [PATCH 04/17] Add system table pointer argument to shared functions Roy Franz
2013-09-17 4:11 ` [PATCH 05/17] Rename memory allocation/free functions Roy Franz
2013-09-17 4:11 ` [PATCH 06/17] Enforce minimum alignment of 1 page on allocations Roy Franz
[not found] ` <1379391093-27948-1-git-send-email-roy.franz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2013-09-17 4:11 ` [PATCH 01/17] EFI stub documentation updates Roy Franz
2013-09-17 4:11 ` Roy Franz
2013-09-17 4:11 ` [PATCH 07/17] Move relocate_kernel() to shared file Roy Franz
2013-09-17 4:11 ` Roy Franz
2013-09-17 4:11 ` [PATCH 13/17] Allow efi_free() to be called with size of 0, and do nothing in that case Roy Franz
2013-09-17 4:11 ` Roy Franz
2013-09-17 4:11 ` [PATCH 16/17] Fix types in EFI calls to match EFI function definitions Roy Franz
2013-09-17 4:11 ` Roy Franz
2013-09-17 4:11 ` [PATCH 17/17] resolve warnings found on ARM compile Roy Franz
2013-09-17 4:11 ` Roy Franz
2013-09-18 13:21 ` [PATCH V4 00/17] ARM EFI stub common code Matt Fleming
2013-09-18 13:21 ` Matt Fleming
2013-09-20 14:32 ` H. Peter Anvin
2013-09-20 14:32 ` H. Peter Anvin
2013-09-17 4:11 ` [PATCH 08/17] Generalize relocate_kernel() for use by other architectures Roy Franz
[not found] ` <1379391093-27948-9-git-send-email-roy.franz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2013-09-18 12:12 ` Matt Fleming
2013-09-18 12:12 ` Matt Fleming
[not found] ` <20130918121240.GJ3409-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
2013-09-18 16:31 ` Roy Franz
2013-09-18 16:31 ` Roy Franz
2013-09-17 4:11 ` [PATCH 09/17] Move unicode to ASCII conversion to shared function Roy Franz
[not found] ` <1379391093-27948-10-git-send-email-roy.franz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2013-09-19 3:44 ` Adam Borowski
2013-09-19 3:44 ` Adam Borowski
[not found] ` <20130919034406.GA26385-b9QjgO8OEXPVItvQsEIGlw@public.gmane.org>
2013-09-19 3:46 ` H. Peter Anvin
2013-09-19 3:46 ` H. Peter Anvin
2013-09-19 4:48 ` Roy Franz
2013-09-19 4:48 ` Roy Franz
[not found] ` <CAFECyb-tTaCEqUHUh23MaJf-P42ZpodFKeNG=kE+vmmi-gyKrw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-19 23:02 ` Adam Borowski [this message]
2013-09-19 23:02 ` Adam Borowski
2013-09-20 9:30 ` Matt Fleming
2013-09-20 9:27 ` Matt Fleming
2013-09-20 9:27 ` Matt Fleming
[not found] ` <20130920092713.GD4785-HNK1S37rvNbeXh+fF434Mdi2O/JbrIOy@public.gmane.org>
2013-09-20 15:00 ` H. Peter Anvin
2013-09-20 15:00 ` H. Peter Anvin
[not found] ` <523C62FC.8010907-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
2013-09-21 21:31 ` Roy Franz
2013-09-21 21:31 ` Roy Franz
2013-09-17 4:11 ` [PATCH 10/17] Rename __get_map() to efi_get_memory_map() Roy Franz
2013-09-17 4:11 ` [PATCH 11/17] generalize efi_get_memory_map() Roy Franz
2013-09-17 4:11 ` [PATCH 12/17] use efi_get_memory_map() to get final map for x86 Roy Franz
2013-09-17 4:11 ` [PATCH 14/17] Generalize handle_ramdisks() and rename to handle_cmdline_files() Roy Franz
2013-09-17 4:11 ` [PATCH 15/17] Renames in handle_cmdline_files() to complete generalization Roy Franz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130919230241.GA18666@angband.pl \
--to=kilobyte-b9qjgo8oexpvitvqseiglw@public.gmane.org \
--cc=leif.lindholm-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org \
--cc=linux-efi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=matt.fleming-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
--cc=msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=roy.franz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.