From: Jan Kara
Subject: Re: [PATCH 6/8] Support non-BMP characters in UDF
Date: Wed, 16 May 2012 16:34:48 +0200
Message-ID: <20120516143448.GD27661@quack.suse.cz>
References: <4FB2E25E.900@gmail.com>
In-Reply-To: <4FB2E25E.900@gmail.com>
To: Vladimir 'φ-coder/phcoder' Serbinenko
Cc: Jan Kara, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Wed 16-05-12 01:10:22, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> I also have a counterpart for mkudffs/udf-tools but sourceforge homepage
> seems to be abandoned does anybody know if there is a new homepage for
> mkudffs?
  Thanks for the patch! It looks OK but shouldn't we rather use the helper
functions you introduced in the NLS code? It looks wrong to replicate
decoding of UTF16 here.

								Honza
>
> Signed-off-by: Vladimir Serbinenko
> ---
>  fs/udf/unicode.c |   28 +++++++++++++++++++++++-----
>  1 file changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/fs/udf/unicode.c b/fs/udf/unicode.c
> index 9b1b2de..2d8cc12 100644
> --- a/fs/udf/unicode.c
> +++ b/fs/udf/unicode.c
> @@ -280,6 +280,14 @@ static int udf_CS0toNLS(struct nls_table *nls, struct ustr *utf_o,
>  		if (cmp_id == 16)
>  			c = (c << 8) | ocu[i++];
>
> +		if (cmp_id == 16 && (c & 0xfc00) == 0xd800
> +		    && i + 1 < ocu_len && ((ocu[i] & 0xfc) == 0xdc)) {
> +			uint16_t l;
> +			l = ocu[i++] << 8;
> +			l |= ocu[i++];
> +			c = (((c & 0x3ff) << 10) | (l & 0x3ff)) + 0x10000;
> +		}
> +
>  		len = nls->uni2char(c, &utf_o->u_name[utf_o->u_len],
>  				    UDF_NAME_LEN - utf_o->u_len);
>  		/* Valid character? */
> @@ -312,20 +320,30 @@ try_again:
>  		if (!len)
>  			continue;
>  		/* Invalid character, deal with it */
> -		if (len < 0 || uni_char > 0xffff) {
> +		if (len < 0 || uni_char > 0x10ffff) {
>  			len = 1;
>  			uni_char = '?';
>  		}
>
>  		if (uni_char > max_val) {
> -			max_val = 0xffffU;
> +			max_val = 0x10ffffU;
>  			ocu[0] = (uint8_t)0x10U;
>  			goto try_again;
>  		}
>
> -		if (max_val == 0xffffU)
> -			ocu[++u_len] = (uint8_t)(uni_char >> 8);
> -		ocu[++u_len] = (uint8_t)(uni_char & 0xffU);
> +		if (uni_char > 0xffff) {
> +			u16 h, l;
> +			h = 0xd800 | (((uni_char - 0x10000) >> 10) & 0x3ff);
> +			l = 0xdc00 | ((uni_char - 0x10000) & 0x3ff);
> +			ocu[++u_len] = (uint8_t)(h >> 8);
> +			ocu[++u_len] = (uint8_t)(h & 0xffU);
> +			ocu[++u_len] = (uint8_t)(l >> 8);
> +			ocu[++u_len] = (uint8_t)(l & 0xffU);
> +		} else {
> +			if (max_val == 0x10ffffU)
> +				ocu[++u_len] = (uint8_t)(uni_char >> 8);
> +			ocu[++u_len] = (uint8_t)(uni_char & 0xffU);
> +		}
>  		i += len - 1;
>  	}
>
> --
> 1.7.10
>
> --
> Regards
> Vladimir 'φ-coder/phcoder' Serbinenko
>
--
Jan Kara
SUSE Labs, CR
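
To illustrate the point about not open-coding the UTF-16 handling, here is a
minimal userspace sketch of the surrogate-pair arithmetic from the patch,
factored into two helpers. The names utf16be_decode() and utf16be_encode()
are hypothetical and are NOT the NLS helpers mentioned above; the sketch only
assumes big-endian 16-bit units, as in the cmp_id == 16 case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not the kernel NLS API. */

/* High (leading) surrogates are 0xD800..0xDBFF. */
static int is_high_surrogate(uint16_t u)
{
	return (u & 0xfc00) == 0xd800;
}

/* Low (trailing) surrogates are 0xDC00..0xDFFF. */
static int is_low_surrogate(uint16_t u)
{
	return (u & 0xfc00) == 0xdc00;
}

/*
 * Decode one code point from big-endian UTF-16 bytes.
 * Returns the number of bytes consumed (2 or 4), or 0 if fewer than
 * 2 bytes remain. An unpaired surrogate is passed through unchanged,
 * which mirrors what the quoted patch does.
 */
static size_t utf16be_decode(const uint8_t *p, size_t len, uint32_t *cp)
{
	uint16_t hi, lo;

	if (len < 2)
		return 0;
	hi = (uint16_t)((p[0] << 8) | p[1]);
	if (is_high_surrogate(hi) && len >= 4) {
		lo = (uint16_t)((p[2] << 8) | p[3]);
		if (is_low_surrogate(lo)) {
			*cp = ((((uint32_t)hi & 0x3ff) << 10) |
			       (lo & 0x3ff)) + 0x10000;
			return 4;
		}
	}
	*cp = hi;
	return 2;
}

/*
 * Encode one code point as big-endian UTF-16.
 * Returns the number of bytes written (2 or 4), or 0 if the code point
 * is above 0x10ffff or the buffer is too small.
 */
static size_t utf16be_encode(uint32_t cp, uint8_t *out, size_t len)
{
	if (cp > 0x10ffff)
		return 0;
	if (cp > 0xffff) {
		uint32_t v = cp - 0x10000;
		uint16_t h = (uint16_t)(0xd800 | ((v >> 10) & 0x3ff));
		uint16_t l = (uint16_t)(0xdc00 | (v & 0x3ff));

		if (len < 4)
			return 0;
		out[0] = (uint8_t)(h >> 8);
		out[1] = (uint8_t)(h & 0xff);
		out[2] = (uint8_t)(l >> 8);
		out[3] = (uint8_t)(l & 0xff);
		return 4;
	}
	if (len < 2)
		return 0;
	out[0] = (uint8_t)(cp >> 8);
	out[1] = (uint8_t)(cp & 0xff);
	return 2;
}
```

With helpers of this shape, both hunks in fs/udf/unicode.c would reduce to a
single call each in the 16-bit path, and the bounds check on the second unit
would live in one place.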