From: Jan Kara <jack@suse.cz>
To: "Vladimir 'φ-coder/phcoder' Serbinenko" <phcoder@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 6/8] Support non-BMP characters in UDF
Date: Wed, 16 May 2012 16:34:48 +0200
Message-ID: <20120516143448.GD27661@quack.suse.cz>
In-Reply-To: <4FB2E25E.900@gmail.com>
On Wed 16-05-12 01:10:22, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> I also have a counterpart for mkudffs/udf-tools, but the SourceForge homepage
> seems to be abandoned. Does anybody know if there is a new homepage for
> mkudffs?
Thanks for the patch! It looks OK, but shouldn't we rather use the helper
functions you introduced in the NLS code? It looks wrong to replicate the
decoding of UTF-16 here.
Honza
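
For illustration, the surrogate-pair arithmetic that the patch open-codes
could be factored into shared helpers roughly as follows. This is only a
minimal sketch in plain userspace C. The names utf16_combine(),
utf16be_decode() and utf16be_encode() are hypothetical and are not the NLS
helpers referred to above. Passing an unpaired surrogate through unchanged is
also an assumption, not something the patch specifies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine a high/low surrogate pair into a
 * code point in the range U+10000..U+10FFFF. */
static uint32_t utf16_combine(uint16_t hi, uint16_t lo)
{
	return ((((uint32_t)hi & 0x3ff) << 10) | (lo & 0x3ff)) + 0x10000;
}

/*
 * Hypothetical helper: decode one code point from big-endian UTF-16.
 * Returns the number of bytes consumed (2 or 4), or 0 if fewer than
 * 2 bytes remain. An unpaired surrogate is returned as-is (2 bytes).
 */
static size_t utf16be_decode(const uint8_t *p, size_t len, uint32_t *out)
{
	uint16_t hi, lo;

	if (len < 2)
		return 0;
	hi = (uint16_t)((p[0] << 8) | p[1]);
	if ((hi & 0xfc00) == 0xd800 && len >= 4) {
		lo = (uint16_t)((p[2] << 8) | p[3]);
		if ((lo & 0xfc00) == 0xdc00) {
			*out = utf16_combine(hi, lo);
			return 4;
		}
	}
	*out = hi;
	return 2;
}

/*
 * Hypothetical helper: encode one code point as big-endian UTF-16.
 * Returns the number of bytes written (2 or 4), or 0 if cp is above
 * U+10FFFF or the buffer is too small.
 */
static size_t utf16be_encode(uint32_t cp, uint8_t *p, size_t len)
{
	uint16_t hi, lo;

	if (cp > 0x10ffff)
		return 0;
	if (cp < 0x10000) {
		if (len < 2)
			return 0;
		p[0] = (uint8_t)(cp >> 8);
		p[1] = (uint8_t)(cp & 0xff);
		return 2;
	}
	if (len < 4)
		return 0;
	cp -= 0x10000;
	hi = (uint16_t)(0xd800 | (cp >> 10));
	lo = (uint16_t)(0xdc00 | (cp & 0x3ff));
	p[0] = (uint8_t)(hi >> 8);
	p[1] = (uint8_t)(hi & 0xff);
	p[2] = (uint8_t)(lo >> 8);
	p[3] = (uint8_t)(lo & 0xff);
	return 4;
}
```

With helpers of this shape, both hunks in the quoted patch would reduce to a
single call each, and the range checks would live in one place.
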
>
> Signed-off-by: Vladimir Serbinenko <phcoder@gmail.com>
> ---
> fs/udf/unicode.c | 28 +++++++++++++++++++++++-----
> 1 file changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/fs/udf/unicode.c b/fs/udf/unicode.c
> index 9b1b2de..2d8cc12 100644
> --- a/fs/udf/unicode.c
> +++ b/fs/udf/unicode.c
> @@ -280,6 +280,14 @@ static int udf_CS0toNLS(struct nls_table *nls, struct ustr *utf_o,
> if (cmp_id == 16)
> c = (c << 8) | ocu[i++];
>
> + if (cmp_id == 16 && (c & 0xfc00) == 0xd800
> + && i + 1 < ocu_len && ((ocu[i] & 0xfc) == 0xdc)) {
> + uint16_t l;
> + l = ocu[i++] << 8;
> + l |= ocu[i++];
> + c = (((c & 0x3ff) << 10) | (l & 0x3ff)) + 0x10000;
> + }
> +
> len = nls->uni2char(c, &utf_o->u_name[utf_o->u_len],
> UDF_NAME_LEN - utf_o->u_len);
> /* Valid character? */
> @@ -312,20 +320,30 @@ try_again:
> if (!len)
> continue;
> /* Invalid character, deal with it */
> - if (len < 0 || uni_char > 0xffff) {
> + if (len < 0 || uni_char > 0x10ffff) {
> len = 1;
> uni_char = '?';
> }
>
> if (uni_char > max_val) {
> - max_val = 0xffffU;
> + max_val = 0x10ffffU;
> ocu[0] = (uint8_t)0x10U;
> goto try_again;
> }
>
> - if (max_val == 0xffffU)
> - ocu[++u_len] = (uint8_t)(uni_char >> 8);
> - ocu[++u_len] = (uint8_t)(uni_char & 0xffU);
> + if (uni_char > 0xffff) {
> + u16 h, l;
> + h = 0xd800 | (((uni_char - 0x10000) >> 10) & 0x3ff);
> + l = 0xdc00 | ((uni_char - 0x10000) & 0x3ff);
> + ocu[++u_len] = (uint8_t)(h >> 8);
> + ocu[++u_len] = (uint8_t)(h & 0xffU);
> + ocu[++u_len] = (uint8_t)(l >> 8);
> + ocu[++u_len] = (uint8_t)(l & 0xffU);
> + } else {
> + if (max_val == 0x10ffffU)
> + ocu[++u_len] = (uint8_t)(uni_char >> 8);
> + ocu[++u_len] = (uint8_t)(uni_char & 0xffU);
> + }
> i += len - 1;
> }
>
> --
> 1.7.10
>
> --
> Regards
> Vladimir 'φ-coder/phcoder' Serbinenko
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR