From: "Günter Kukkukk" <linux@kukkukk.com>
To: samba-technical@lists.samba.org
Cc: Amit Sahrawat <amit.sahrawat83@gmail.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
linux-cifs@vger.kernel.org, NamJae Jeon <linkinjeon@gmail.com>,
Jeff Layton <jlayton@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Steve French <sfrench@samba.org>,
Steve French <smfrench@gmail.com>,
Alan Stern <stern@rowland.harvard.edu>,
ashishsangwan2@gmail.com, akpm@linux-foundation.org,
Anton Altaparmakov <aia21@cam.ac.uk>,
Unix Support <unix-support@ucs.cam.ac.uk>
Subject: Re: CIFS: Rename bug on servers not supporting inode numbers
Date: Thu, 24 Nov 2011 06:08:29 +0100
Message-ID: <201111240608.31146.linux@kukkukk.com>
In-Reply-To: <CADDb1s2OcfjwAJoTTbR6547hDL68PJFg_jD_vAs-nGmgiO6YWQ@mail.gmail.com>
On Wednesday 23 November 2011 19:00:16 Amit Sahrawat wrote:
> Hi Alan,
> Ok, translations cannot be added easily. But any idea why surrogate
> pairs are not handled? I think handling for surrogate pairs can be
> added by identifying proper points(there are not many I guess). Please
> share your views.
>
> Regards,
> Amit Sahrawat
>
> On Wed, Nov 23, 2011 at 10:42 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> > On Wed, 23 Nov 2011 11:31:47 -0500 (EST)
> >
> > Alan Stern <stern@rowland.harvard.edu> wrote:
> >> On Wed, 23 Nov 2011, NamJae Jeon wrote:
> >> > Hi. Alan.
> >> > Would you know why there is no upper/lower case table in nls utf8 ?
> >> > And Currently Surrogate pair is not supported also in nls utf8. Is
> >> > there the reason ?
> >>
> >> I don't know.
> >
> > For one case translations are locale specific and very very complicated.
"Surrogate pairs" had to be implemented to extend the former
16-bit limit of UCS-2.
Unicode is limited to a maximum code point of 0x0010FFFF, which
does not fit in a single 16-bit UCS-2/UTF-16 code unit.
To extend UTF-16, the "surrogate range" between D800 and DFFF was "stolen"
from one of the previously named "Private Use Areas" of UCS-2.
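To make the mechanism concrete, here is a minimal userspace sketch of how a code point above 0xFFFF is split into a surrogate pair. The function name utf16_encode is made up for this illustration and is not taken from the kernel or from Samba.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-16 (hypothetical helper).
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * above 0x10FFFF or lies inside the surrogate range D800..DFFF.
 */
static int utf16_encode(uint32_t cp, uint16_t out[2])
{
	assert(out != NULL);

	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;

	if (cp < 0x10000) {
		/* Fits in a single 16-bit unit, as in UCS-2. */
		out[0] = (uint16_t)cp;
		return 1;
	}

	/* Subtracting 0x10000 leaves a 20-bit value, split 10 + 10. */
	cp -= 0x10000;
	out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
	return 2;
}
```

With this sketch, U+1F600 comes out as the two units D83D DE00, and the largest code point U+10FFFF as DBFF DFFF, which is why 0x0010FFFF is the upper bound of what UTF-16 can express.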
-----
Do those "surrogate pairs" have any impact on _today's_ Linux file name conventions?
I think the easy answer is NO!
AFAIK, _no_ current operating system supports this!
We are talking here about "allowed dir/file name characters"!
The main reason behind "surrogate pairs" was to allow "userland" (!)
applications to use special character glyphs from around the world!
---------
Anyway - in nls_base.c
.....
static const struct utf8_table utf8_table[] =
{
    {0x80, 0x00, 0*6, 0x7F,       0,         /* 1 byte sequence */},
    {0xE0, 0xC0, 1*6, 0x7FF,      0x80,      /* 2 byte sequence */},
    {0xF0, 0xE0, 2*6, 0xFFFF,     0x800,     /* 3 byte sequence */},
    {0xF8, 0xF0, 3*6, 0x1FFFFF,   0x10000,   /* 4 byte sequence */},
    {0xFC, 0xF8, 4*6, 0x3FFFFFF,  0x200000,  /* 5 byte sequence */},
    {0xFE, 0xFC, 5*6, 0x7FFFFFFF, 0x4000000, /* 6 byte sequence */},
    {0,                                      /* end of table */}
};
........
The range configured there (up to 0x7FFFFFFF) exceeds the maximum allowed
Unicode code point 0x0010FFFF, so the table _must_ be changed to:
static const struct utf8_table utf8_table[] =
{
    {0x80, 0x00, 0*6, 0x7F,     0,       /* 1 byte sequence */},
    {0xE0, 0xC0, 1*6, 0x7FF,    0x80,    /* 2 byte sequence */},
    {0xF0, 0xE0, 2*6, 0xFFFF,   0x800,   /* 3 byte sequence */},
    {0xF8, 0xF0, 3*6, 0x1FFFFF, 0x10000, /* 4 byte sequence */},
    {0,                                  /* end of table */}
};
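As a rough illustration of how a table like this can drive decoding, here is a standalone userspace sketch. It is not the code from nls_base.c: the names utf8_row, rows, CP_MAX and utf8_decode are invented for this example, the shift column is left out because only the decoding direction is shown, and the explicit checks against 0x10FFFF and the surrogate range are assumptions added here. Note that the 4-byte mask 0x1FFFFF on its own still admits values above 0x0010FFFF, so trimming the table does not replace an explicit upper-bound check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row per sequence length (hypothetical layout for this sketch). */
struct utf8_row {
	int cmask;  /* mask selecting the lead-byte prefix bits */
	int cval;   /* prefix value expected for this length */
	long lmask; /* mask keeping only the payload bits */
	long lval;  /* smallest value allowed, rejects overlong forms */
};

static const struct utf8_row rows[] = {
	{0x80, 0x00, 0x7F,     0x0},     /* 1 byte sequence */
	{0xE0, 0xC0, 0x7FF,    0x80},    /* 2 byte sequence */
	{0xF0, 0xE0, 0xFFFF,   0x800},   /* 3 byte sequence */
	{0xF8, 0xF0, 0x1FFFFF, 0x10000}, /* 4 byte sequence */
	{0, 0, 0, 0}                     /* end of table */
};

#define CP_MAX 0x10FFFFL /* assumed upper bound, checked explicitly */

/* Returns the number of bytes consumed, or -1 on invalid input. */
static int utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	const struct utf8_row *t;
	long l;
	int c0, c, n = 0;

	assert(s != NULL && cp != NULL);
	if (len == 0)
		return -1;

	c0 = s[0];
	l = c0;
	for (t = rows; t->cmask; t++) {
		n++;
		if ((c0 & t->cmask) == t->cval) {
			l &= t->lmask; /* drop the lead-byte prefix bits */
			if (l < t->lval || l > CP_MAX ||
			    (l >= 0xD800 && l <= 0xDFFF))
				return -1;
			*cp = (uint32_t)l;
			return n;
		}
		if ((size_t)n >= len)
			return -1;
		c = s[n] ^ 0x80; /* continuation bytes are 10xxxxxx */
		if (c & 0xC0)
			return -1;
		l = (l << 6) | c;
	}
	return -1; /* lead byte of a 5 or 6 byte sequence, or garbage */
}
```

In this sketch the bytes F0 9F 98 80 decode to U+1F600, while F4 90 80 80 (which would be 0x110000) and any 5 byte sequence are rejected.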
Cheers, Günter