From: "Günter Kukkukk" <linux-KQewbsS9MvBBDgjK7y7TUQ@public.gmane.org>
To: samba-technical-w/Ol4Ecudpl8XjKLYN78aQ@public.gmane.org
Cc: Amit Sahrawat
<amit.sahrawat83-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Alan Cox <alan-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org>,
linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
NamJae Jeon <linkinjeon-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Steve French <sfrench-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>,
Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Alan Stern
<stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org>,
ashishsangwan2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
Anton Altaparmakov
<aia21-KWPb1pKIrIJaa/9Udqfwiw@public.gmane.org>,
Unix Support
<unix-support-IE+S/cj2AuA2EctHIo1CcQ@public.gmane.org>
Subject: Re: CIFS: Rename bug on servers not supporting inode numbers
Date: Thu, 24 Nov 2011 06:08:29 +0100
Message-ID: <201111240608.31146.linux@kukkukk.com>
In-Reply-To: <CADDb1s2OcfjwAJoTTbR6547hDL68PJFg_jD_vAs-nGmgiO6YWQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Wednesday 23 November 2011 19:00:16 Amit Sahrawat wrote:
> Hi Alan,
> Ok, translations cannot be added easily. But any idea why surrogate
> pairs are not handled? I think handling for surrogate pairs can be
> added by identifying proper points(there are not many I guess). Please
> share your views.
>
> Regards,
> Amit Sahrawat
>
> On Wed, Nov 23, 2011 at 10:42 PM, Alan Cox <alan-qBU/x9rampVanCEyBjwyrvXRex20P6io@public.gmane.org> wrote:
> > On Wed, 23 Nov 2011 11:31:47 -0500 (EST)
> >
> > Alan Stern <stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz@public.gmane.org> wrote:
> >> On Wed, 23 Nov 2011, NamJae Jeon wrote:
> >> > Hi. Alan.
> >> > Would you know why there is no upper/lower case table in nls utf8 ?
> >> > And Currently Surrogate pair is not supported also in nls utf8. Is
> >> > there the reason ?
> >>
> >> I don't know.
> >
> > For one case translations are locale specific and very very complicated.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> > in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
"Surrogate pairs" had to been implemented to extend the former
16 bit limit of UCS-2/UTF-16.
Unicode has been limited to max 0x0010FFFF glyphs - which
would not fit in UCS-2/UTF-16.
To extend UTF-16, the "surrogate range" between D800 and DFFF was "stolen"
from the one of the previously named "Private Use Areas" of UCS-2.
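To make the mechanism concrete, here is a minimal sketch of how a code point above 0xFFFF is split into two 16-bit units taken from the D800..DFFF range. The helper names to_surrogate_pair and from_surrogate_pair are hypothetical and only serve as illustration; they are not functions from nls_base.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): split a code point in the
 * range 0x10000..0x10FFFF into a UTF-16 surrogate pair.
 * Returns 0 on success, -1 if cp is outside that range.
 */
static int to_surrogate_pair(uint32_t cp, uint16_t *hi, uint16_t *lo)
{
    if (cp < 0x10000 || cp > 0x10FFFF)
        return -1;
    cp -= 0x10000;                            /* 20 payload bits remain */
    *hi = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate: upper 10 bits */
    *lo = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate: lower 10 bits */
    return 0;
}

/*
 * Hypothetical inverse: rebuild the code point from a pair that the
 * caller has already checked to be a valid high/low surrogate pair.
 */
static uint32_t from_surrogate_pair(uint16_t hi, uint16_t lo)
{
    return 0x10000 + (((uint32_t)(hi & 0x3FF)) << 10) + (uint32_t)(lo & 0x3FF);
}
```

With this sketch, code point 0x1F600 maps to the pair D83D DE00, and the largest code point 0x10FFFF maps to DBFF DFFF, which is why two 10-bit halves are enough to cover the whole range.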
-----
Do those "surrogate pairs" have any impact on _today's_ Linux file name conventions?
I think the easy answer is NO!
AFAIK - _no_ current operating system supports this!
We are talking here about "allowed dir/file name characters"!
The main reason behind "surrogate pairs" was to allow "userland" (!)
applications to use special character glyphs from around the world!
---------
Anyway - in nls_base.c
.....
static const struct utf8_table utf8_table[] =
{
    {0x80, 0x00, 0*6, 0x7F,       0,         /* 1 byte sequence */},
    {0xE0, 0xC0, 1*6, 0x7FF,      0x80,      /* 2 byte sequence */},
    {0xF0, 0xE0, 2*6, 0xFFFF,     0x800,     /* 3 byte sequence */},
    {0xF8, 0xF0, 3*6, 0x1FFFFF,   0x10000,   /* 4 byte sequence */},
    {0xFC, 0xF8, 4*6, 0x3FFFFFF,  0x200000,  /* 5 byte sequence */},
    {0xFE, 0xFC, 5*6, 0x7FFFFFFF, 0x4000000, /* 6 byte sequence */},
    {0,                                      /* end of table */}
};
........
That configured range exceeds the maximum allowed Unicode range of 0x0010FFFF
and _must_ be changed to:
static const struct utf8_table utf8_table[] =
{
    {0x80, 0x00, 0*6, 0x7F,     0,       /* 1 byte sequence */},
    {0xE0, 0xC0, 1*6, 0x7FF,    0x80,    /* 2 byte sequence */},
    {0xF0, 0xE0, 2*6, 0xFFFF,   0x800,   /* 3 byte sequence */},
    {0xF8, 0xF0, 3*6, 0x1FFFFF, 0x10000, /* 4 byte sequence */},
    {0,                                  /* end of table */}
};
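As a rough illustration of what the shortened table does and does not achieve, here is a sketch of a table-driven decoder. The struct field names, the table name table4 and the function decode_utf8 are assumptions made for this example and are not copied from nls_base.c. Note that the mask 0x1FFFFF in the fourth row still admits values above 0x0010FFFF, so the sketch adds an explicit upper-bound check and also rejects the surrogate range D800..DFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch; field names are illustrative only. */
struct utf8_table {
    int cmask;            /* mask selecting the length bits of the lead byte */
    int cval;             /* expected value of those bits */
    int shift;            /* kept for layout parity, unused when decoding */
    unsigned long lmask;  /* mask for the accumulated payload bits */
    unsigned long lval;   /* smallest value allowed for this length */
};

/* The four-row table proposed above, under a hypothetical name. */
static const struct utf8_table table4[] = {
    {0x80, 0x00, 0*6, 0x7F,     0},        /* 1 byte sequence */
    {0xE0, 0xC0, 1*6, 0x7FF,    0x80},     /* 2 byte sequence */
    {0xF0, 0xE0, 2*6, 0xFFFF,   0x800},    /* 3 byte sequence */
    {0xF8, 0xF0, 3*6, 0x1FFFFF, 0x10000},  /* 4 byte sequence */
    {0}                                    /* end of table */
};

/*
 * Decode one code point from s (len bytes available).
 * Returns the number of bytes consumed, or -1 on malformed, overlong,
 * out-of-range, or surrogate input.
 */
static int decode_utf8(const unsigned char *s, size_t len, uint32_t *out)
{
    const struct utf8_table *t;
    unsigned long l;
    int c0, c, nc = 0;

    if (len == 0)
        return -1;
    c0 = s[0];
    l = (unsigned long)c0;
    for (t = table4; t->cmask; t++) {
        nc++;
        if ((c0 & t->cmask) == t->cval) {
            l &= t->lmask;
            if (l < t->lval)                  /* overlong encoding */
                return -1;
            if (l > 0x10FFFFUL)               /* beyond the Unicode range */
                return -1;
            if ((l & ~0x7FFUL) == 0xD800UL)   /* surrogate range D800..DFFF */
                return -1;
            *out = (uint32_t)l;
            return nc;
        }
        if (len <= (size_t)nc)                /* input ends mid-sequence */
            return -1;
        c = s[nc] ^ 0x80;                     /* continuation must be 10xxxxxx */
        if (c & 0xC0)
            return -1;
        l = (l << 6) | (unsigned long)c;
    }
    return -1;                                /* 5 and 6 byte lead bytes end up here */
}
```

In this sketch the bytes F0 9F 98 80 decode to 0x1F600, while F4 90 80 80 (which would be 0x110000) is rejected only because of the explicit range check, not because of the table itself. Lead bytes of the former 5 and 6 byte forms fall off the end of the table and are rejected.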
Cheers, Günter