From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ben Myers
Subject: Re: [PATCH 07/10] xfs: add trie generator and supporting code for UTF-8.
Date: Tue, 23 Sep 2014 13:57:21 -0500
Message-ID: <20140923185721.GV19952@sgi.com>
References: <20140918195650.GI19952@sgi.com> <20140918201518.GJ4482@sgi.com> <20140922205714.GN4267@dastard>
In-Reply-To: <20140922205714.GN4267@dastard>
To: Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, tinguely@sgi.com, olaf@sgi.com, xfs@oss.sgi.com
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
List-Id: linux-fsdevel.vger.kernel.org

On Tue, Sep 23, 2014 at 06:57:14AM +1000, Dave Chinner wrote:
> On Thu, Sep 18, 2014 at 03:15:19PM -0500, Ben Myers wrote:
> > From: Olaf Weber
> >
> > mkutf8data.c is the source for a program that generates utf8data.h, which
> > contains the trie that utf8norm.c uses. The trie is generated from the
> > Unicode 7.0.0 data files. The format of the utf8data[] table is described
> > in utf8norm.c.
> >
> > Supporting functions for UTF-8 normalization are in utf8norm.c with the
> > header utf8norm.h. Two normalization forms are supported: nfkdi and nfkdicf.
> >
> >   nfkdi:
> >    - Apply unicode normalization form NFKD.
> >    - Remove any Default_Ignorable_Code_Point.
> >
> >   nfkdicf:
> >    - Apply unicode normalization form NFKD.
> >    - Remove any Default_Ignorable_Code_Point.
> >    - Apply a full casefold (C + F).
> >
> > For the purposes of the code, a string is valid UTF-8 if:
> >
> >  - The values encoded are 0x1..0x10FFFF.
> >  - The surrogate codepoints 0xD800..0xDFFF are not encoded.
> >  - The shortest possible encoding is used for all values.
> >
> > The supporting functions work on null-terminated strings (utf8 prefix) and
> > on length-limited strings (utf8n prefix).
> >
> > Signed-off-by: Olaf Weber
> >
> > ---
> > [v2: the trie is now separated into utf8norm.ko;
> >      utf8version is now a function and exported;
> >      introduced CONFIG_XFS_UTF8. -bpm]
> > ---
> >  fs/xfs/Kconfig               |    8 +
> >  fs/xfs/Makefile              |    2 +-
> >  fs/xfs/utf8norm/Makefile     |   37 +
> >  fs/xfs/utf8norm/mkutf8data.c | 3239 ++++++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/utf8norm/utf8norm.c   |  649 +++++++++
> >  fs/xfs/utf8norm/utf8norm.h   |  116 ++
>
> Again, nothing XFS specific here. It's being built as a separate
> module and the only thing that XFS uses are exported functions, so
> it really should be generic library code....

I'll get this moved to lib/ as you suggested elsewhere in the thread.

Thanks,
	Ben
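
For readers following the thread, the three validity rules quoted from the
commit message (values 0x1..0x10FFFF, no surrogates, shortest encoding only)
can be illustrated with a small userspace sketch. This is not the code in
utf8norm.c, which is trie-driven; the sketch_* names below are hypothetical
and exist only for this illustration. The two entry points mirror the
utf8/utf8n split between null-terminated and length-limited strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from utf8norm.c): check one UTF-8 sequence at p.
 * Returns its length in bytes (1..4), or 0 if it breaks one of the rules.
 */
static size_t sketch_utf8_seqlen(const unsigned char *p, size_t len)
{
	unsigned long cp, min;
	size_t n, i;

	if (len == 0)
		return 0;
	if (p[0] < 0x80) {
		cp = p[0]; n = 1; min = 0x1;	/* 0x0 is not a valid value */
	} else if ((p[0] & 0xE0) == 0xC0) {
		cp = p[0] & 0x1F; n = 2; min = 0x80;
	} else if ((p[0] & 0xF0) == 0xE0) {
		cp = p[0] & 0x0F; n = 3; min = 0x800;
	} else if ((p[0] & 0xF8) == 0xF0) {
		cp = p[0] & 0x07; n = 4; min = 0x10000;
	} else {
		return 0;	/* stray continuation byte or invalid lead byte */
	}
	if (n > len)
		return 0;	/* sequence truncated by the length limit */
	for (i = 1; i < n; i++) {
		if ((p[i] & 0xC0) != 0x80)
			return 0;	/* not a continuation byte */
		cp = (cp << 6) | (p[i] & 0x3F);
	}
	if (cp < min || cp > 0x10FFFF)
		return 0;	/* overlong encoding or out of range */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;	/* surrogate codepoint */
	return n;
}

/* Length-limited variant, in the spirit of the utf8n prefix. */
bool sketch_utf8n_valid(const char *s, size_t len)
{
	const unsigned char *p = (const unsigned char *)s;

	assert(s != NULL);
	while (len > 0) {
		size_t n = sketch_utf8_seqlen(p, len);

		if (n == 0)
			return false;
		p += n;
		len -= n;
	}
	return true;
}

/* Null-terminated variant, in the spirit of the utf8 prefix. */
bool sketch_utf8_valid(const char *s)
{
	assert(s != NULL);
	return sketch_utf8n_valid(s, strlen(s));
}
```

Under these assumptions, "\xC0\x80" (an overlong encoding of 0x0) and
"\xED\xA0\x80" (the surrogate 0xD800) are both rejected, and an embedded NUL
in a length-limited string is rejected because 0x0 lies outside 0x1..0x10FFFF.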