From: Christoph Hellwig <hch@infradead.org>
To: Olaf Weber <olaf@sgi.com>
Cc: linux-fsdevel@vger.kernel.org, Ben Myers <bpm@sgi.com>,
tinguely@sgi.com, xfs@oss.sgi.com
Subject: Re: [RFC v2] Unicode/UTF-8 support for XFS
Date: Fri, 26 Sep 2014 09:56:05 -0700 [thread overview]
Message-ID: <20140926165605.GA25274@infradead.org> (raw)
In-Reply-To: <54257D3F.70302@sgi.com>
On Fri, Sep 26, 2014 at 04:50:39PM +0200, Olaf Weber wrote:
> I'm not sure how common the parsing code can be if needs to be capable of
> retrieving data from a filesystem.
>
> Note given your and Andi Kleen's feedback on the trie size I've switched to
> doing algorithmic decomposition for Hangul. This reduces the size of the
> trie to 89952 bytes.
>
> In addition, if you store the trie in the filesystem, then the only part
> that needs storing is the version for that particular filesystem, e.g no
> compatibility info for different unicode versions would be required. This
> would reduce the trie size to about 50kB for case-sensitive filesystems, and
> about 55kB on case-folding filesystems.
Honestly I wouldn't worry about demand loading it too much. This is a
fairly special case code for NAS servers, and should not affect normal
uses now that we use symbol_get. Let's get back to the fundamentals.
> >It's a chicken and egg situation. I'd much prefer we enforce clean
> >utf8 from the start, because if we don't we'll never be able to do
> >that. And other filesystems (e.g. ZFS) allow you to do reject
> >anything that is not clean utf8....
>
> As I understand it, this is optional in ZFS. I wonder what people's
> experiences are with this.
It is as optional as your utf8 support for XFS is. But they do
enforce valid utf8 if they use utf8 normalization for file name
comparisms, be that case sensitive or insensitive. Take a look at the
zfs(8) man page.
> - Forbid non-UTF-8 filenames
> - Allow non-UTF-8 filenames
> - Make it a mount option
> - Make it a mkfs option
My take on this is:
- I think we'll have to prevent non-utf8 file names for any cases where
we use utf8 normalization. If you do not use utf8 normalization
it's plain old Unix everything is allowed.
- I think utf8 normalization vs not should be mkfs option, to make sure
everyone including kernel and repair knows what sort of filesystem
deal with.
- case insensitive matching for utf8 normalized filesystems should be
a runtime decision. mount time for now, but Samba people would be
extremly happy to allow per-operation or per-process CI matching.
But that is another totally different discusion I'd like to keep
separate, I just want to make sure the disk format allows for it for
now.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-09-26 16:56 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-18 19:56 [RFC v2] Unicode/UTF-8 support for XFS Ben Myers
2014-09-18 20:08 ` [PATCH 01/10] xfs: return the first match during case-insensitive lookup Ben Myers
2014-09-18 20:09 ` [PATCH 02/10] xfs: rename XFS_CMP_CASE to XFS_CMP_MATCH Ben Myers
2014-09-18 20:09 ` [PATCH 03/13] libxfs: add xfs_nameops.normhash Ben Myers
2014-09-18 20:10 ` [PATCH 04/10] xfs: change interface of xfs_nameops.normhash Ben Myers
2014-09-18 20:11 ` [PATCH 05/10] xfs: add a superblock feature bit to indicate UTF-8 support Ben Myers
2014-09-18 20:13 ` [PATCH 03/10] xfs: add xfs_nameops.normhash Ben Myers
2014-09-18 20:14 ` [PATCH 06/10] xfs: add unicode character database files Ben Myers
2014-09-22 20:54 ` Dave Chinner
2014-09-26 17:09 ` Christoph Hellwig
2014-09-18 20:15 ` [PATCH 07/10] xfs: add trie generator and supporting code for UTF-8 Ben Myers
2014-09-22 20:57 ` Dave Chinner
2014-09-23 18:57 ` Ben Myers
2014-09-26 17:10 ` Christoph Hellwig
2014-09-18 20:16 ` [PATCH 08/10] xfs: add xfs_nameops for utf8 and utf8+casefold Ben Myers
2014-09-18 20:17 ` [PATCH 09/10] xfs: apply utf-8 normalization rules to user extended attribute names Ben Myers
2014-09-18 20:18 ` [PATCH 10/10] xfs: implement demand load of utf8norm.ko Ben Myers
2014-09-18 20:31 ` [PATCH 00/13] xfsprogs: Unicode/UTF-8 support for XFS Ben Myers
2014-09-18 20:33 ` [PATCH 01/13] libxfs: return the first match during case-insensitive lookup Ben Myers
2014-09-18 20:33 ` [PATCH 02/13] libxfs: rename XFS_CMP_CASE to XFS_CMP_MATCH Ben Myers
2014-09-18 20:34 ` [PATCH 03/13] libxfs: add xfs_nameops.normhash Ben Myers
2014-09-18 20:35 ` [PATCH 04/13] libxfs: change interface of xfs_nameops.normhash Ben Myers
2014-09-18 20:36 ` [PATCH 05/13] libxfs: add a superblock feature bit to indicate UTF-8 support Ben Myers
2014-09-18 20:37 ` [PATCH 06/13] xfsprogs: add unicode character database files Ben Myers
2014-09-18 20:38 ` [PATCH 07/13] libxfs: add trie generator and supporting code for UTF-8 Ben Myers
2014-09-18 20:38 ` [PATCH 08/13] libxfs: add xfs_nameops for utf8 and utf8+casefold Ben Myers
2014-09-18 20:39 ` [PATCH 09/13] libxfs: apply utf-8 normalization rules to user extended attribute names Ben Myers
2014-09-18 20:40 ` [PATCH 10/13] xfsprogs: add utf8 support to growfs Ben Myers
2014-09-18 20:41 ` [PATCH 11/13] xfsprogs: add utf8 support to mkfs.xfs Ben Myers
2014-09-18 20:42 ` [PATCH 12/13] xfsprogs: add utf8 support to xfs_repair Ben Myers
2014-09-18 20:43 ` [PATCH 13/13] xfsprogs: add a preliminary test for utf8 support Ben Myers
2014-09-19 16:06 ` [PATCH 07a/13] xfsprogs: add trie generator for UTF-8 Ben Myers
2014-09-23 18:34 ` Roger Willcocks
2014-09-24 23:11 ` Ben Myers
2014-09-19 16:07 ` [PATCH 07b/13] libxfs: add supporting code " Ben Myers
2014-09-18 21:10 ` [RFC v2] Unicode/UTF-8 support for XFS Ben Myers
2014-09-18 21:24 ` Zach Brown
2014-09-18 22:23 ` Ben Myers
2014-09-19 16:03 ` [PATCH 07a/10] xfs: add trie generator for UTF-8 Ben Myers
2014-09-19 16:04 ` [PATCH 07b/10] xfs: add supporting code " Ben Myers
2014-09-22 14:55 ` [RFC v2] Unicode/UTF-8 support for XFS Andi Kleen
2014-09-22 18:41 ` Ben Myers
2014-09-22 19:29 ` Andi Kleen
2014-09-23 16:13 ` Olaf Weber
2014-09-23 20:15 ` Andi Kleen
2014-09-23 20:45 ` Ben Myers
2014-09-24 11:07 ` Olaf Weber
2014-09-26 14:06 ` Olaf Weber
2014-09-23 13:01 ` Olaf Weber
2014-09-23 20:02 ` Andi Kleen
2014-09-22 22:26 ` Dave Chinner
2014-09-24 13:21 ` Olaf Weber
2014-09-24 23:10 ` Dave Chinner
2014-09-25 13:33 ` Zuckerman, Boris
2014-09-26 14:50 ` Olaf Weber
2014-09-26 16:56 ` Christoph Hellwig [this message]
2014-09-26 17:04 ` Jeremy Allison
2014-09-26 17:06 ` Christoph Hellwig
2014-09-26 17:13 ` Jeremy Allison
2014-09-26 19:37 ` Olaf Weber
2014-09-26 19:46 ` Jeremy Allison
2014-09-26 20:03 ` Olaf Weber
2014-09-29 20:16 ` J. Bruce Fields
2014-09-29 11:06 ` Christoph Hellwig
2014-09-26 17:30 ` Ben Myers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140926165605.GA25274@infradead.org \
--to=hch@infradead.org \
--cc=bpm@sgi.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=olaf@sgi.com \
--cc=tinguely@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox