From: "Theodore Ts'o" <tytso@mit.edu>
To: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: linux-ext4@vger.kernel.org, sfrench@samba.org,
darrick.wong@oracle.com, jlayton@kernel.org,
bfields@fieldses.org, paulus@samba.org,
linux-fsdevel@vger.kernel.org, Olaf Weber <olaf@sgi.com>,
Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Subject: Re: [PATCH RFC v6 04/11] unicode: reduce the size of utf8data[]
Date: Sat, 6 Apr 2019 15:53:42 -0400
Message-ID: <20190406195342.GA18897@mit.edu>
In-Reply-To: <20190318202745.5200-5-krisman@collabora.com>
On Mon, Mar 18, 2019 at 04:27:38PM -0400, Gabriel Krisman Bertazi wrote:
> From: Olaf Weber <olaf@sgi.com>
>
> Remove the Hangul decompositions from the utf8data trie, and do
> algorithmic decomposition to calculate them on the fly. To store
> the decomposition the caller of utf8lookup()/utf8nlookup() must
> provide a 12-byte buffer, which is used to synthesize a leaf with
> the decomposition. Trie size is reduced from 245kB to 90kB.
I'm seeing much smaller sizes: the actual utf8data[] array is 63,584
bytes, and running size(1) on utf8-norm.o reports:
   text    data     bss     dec     hex filename
  68752      96       0   68848   10cf0 fs/unicode/utf8-norm.o
Were you measuring the size of the utf8-norm.o file? That will vary
in size depending on whether debugging symbols are enabled, etc.
- Ted
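
For readers unfamiliar with the algorithmic decomposition the quoted patch
description refers to, here is a minimal user-space sketch. It follows the
Hangul syllable decomposition arithmetic from the Unicode Standard and is
not the code from the patch: the names hangul_decompose_utf8() and encode3()
are hypothetical, and the output is a plain NUL-terminated UTF-8 string
rather than a synthesized trie leaf. Three jamo at 3 UTF-8 bytes each come to
at most 9 bytes, which fits within the 12-byte buffer the description
mentions; how the patch uses the remaining bytes is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable decomposition. */
enum {
	SBASE = 0xAC00, LBASE = 0x1100, VBASE = 0x1161, TBASE = 0x11A7,
	LCOUNT = 19, VCOUNT = 21, TCOUNT = 28,
	NCOUNT = VCOUNT * TCOUNT,	/* 588 */
	SCOUNT = LCOUNT * NCOUNT	/* 11172 */
};

/*
 * Encode a code point in U+0800..U+FFFF as 3 UTF-8 bytes.
 * Every conjoining jamo produced below falls in that range.
 */
static size_t encode3(unsigned char *out, uint32_t cp)
{
	out[0] = (unsigned char)(0xE0 | (cp >> 12));
	out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (cp & 0x3F));
	return 3;
}

/*
 * Hypothetical helper, not the kernel's function: write the canonical
 * decomposition of a precomposed Hangul syllable into buf as
 * NUL-terminated UTF-8. buf must hold at least 10 bytes.
 * Returns the number of bytes written (excluding the NUL), or 0 if cp
 * is not a precomposed Hangul syllable.
 */
size_t hangul_decompose_utf8(uint32_t cp, unsigned char buf[10])
{
	uint32_t si, l, v, t;
	size_t n = 0;

	if (cp < SBASE || cp >= SBASE + SCOUNT)
		return 0;

	si = cp - SBASE;
	l = LBASE + si / NCOUNT;		/* leading consonant */
	v = VBASE + (si % NCOUNT) / TCOUNT;	/* vowel */
	t = TBASE + si % TCOUNT;		/* trailing consonant, if any */

	n += encode3(buf + n, l);
	n += encode3(buf + n, v);
	if (t != TBASE)				/* TBASE itself means "no trailing" */
		n += encode3(buf + n, t);
	buf[n] = '\0';
	return n;
}
```

Since all 11,172 syllables can be computed this way from a handful of
constants, none of their decompositions need to be stored in a table, which
is where the size reduction described in the patch comes from.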