From: Gabriel Krisman Bertazi <krisman@collabora.com>
To: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH e2fsprogs 4/9] mke2fs: Configure encoding during superblock initialization
Date: Wed, 21 Nov 2018 14:43:23 -0500
Message-ID: <87lg5mdxno.fsf@collabora.com>
In-Reply-To: <20181121045503.GA26006@thunk.org> (Theodore Y. Ts'o's message of "Tue, 20 Nov 2018 23:55:03 -0500")

"Theodore Y. Ts'o" <tytso@mit.edu> writes:

> On Mon, Oct 15, 2018 at 05:12:15PM -0400, Gabriel Krisman Bertazi wrote:
>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
>> index f05003fc30b9..5ed7b987540e 100644
>> --- a/misc/mke2fs.c
>> +++ b/misc/mke2fs.c
>> @@ -790,6 +790,8 @@ static void parse_extended_opts(struct ext2_super_block *param,
>>  	int	len;
>>  	int	r_usage = 0;
>>  	int	ret;
>> +	int	encoding = -1;
>> +	char 	*encoding_flags = NULL;
>
>     ...
>
>> +	if (ext2fs_has_feature_fname_encoding(param)) {
>> +		param->s_encoding_flags =
>> +			ext4_encoding_map[encoding].default_flags;
>
> This code is assuming that users will specify the encoding via "-E encoding=utf8-10.0"
> and this will set the FNAME_ENCODING flag implicitly.
>
> But consider what happens if the user runs a command like this:
>
>     mke2fs -t ext4 -O fname_encoding -E resize=12T
>
> When parse_extended_opts gets called, the variable encoding will still
> be -1, and so we'll end up trying to use a negative array index to
> ext4_encoding_map[] which will be... unfortunate.
>
> As I mentioned in another e-mail, I'm a bit dubious about having
> per-encoding default flags.  Those flags should either be global ext4
> code points, or they should be forced to specific values given the
> encoding that is specified.

Normalization and casefold types are too specific to each encoding not
to be per-encoding.  ASCII has no normalization, for instance.

If I understand you correctly, we should make them ext4 code points to
ensure they don't change in the future.
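
A minimal sketch of the guard for the negative-index case described above,
assuming a table shaped like ext4_encoding_map[].  The struct layout, the
table contents, the flag values and the helper name encoding_default_flags()
are all hypothetical and only illustrate the bounds check; they are not
taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the patch's ext4_encoding_map[]. */
struct encoding_entry {
	const char	*name;
	unsigned short	default_flags;
};

/* Placeholder table; names and flag values are illustrative only. */
static const struct encoding_entry encoding_map[] = {
	{ "ascii",     0x0 },
	{ "utf8-10.0", 0x1 },
};

#define ENCODING_MAP_SIZE (sizeof(encoding_map) / sizeof(encoding_map[0]))

/*
 * Store the default flags for the given encoding index in *flags.
 * Returns 0 on success, or -1 if the index is out of range, which
 * includes the "never set" value of -1.  *flags is left untouched
 * on failure.
 */
static int encoding_default_flags(int encoding, unsigned short *flags)
{
	assert(flags != NULL);

	if (encoding < 0 || (size_t) encoding >= ENCODING_MAP_SIZE)
		return -1;

	*flags = encoding_map[encoding].default_flags;
	return 0;
}
```

With a check like this, "-O fname_encoding" without "-E encoding=..." would
produce an error return that the caller can report, instead of reading
before the start of the array.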

> We probably also want to have a default encoding if the user just
> specifies "-O fname_encoding".   Say, in /etc/mke2fs.conf:

Right.  That solves the case for -O fname_encoding.  I will do this in v3.

>
> [options]
>     default_encoding = utf8-11.0
>
> Then at some point a few years from now, we might enable
> fname_encoding by default, so we might have in /etc/mke2fs.conf:
> [fs_types]
> 	ext4 = {
> 		features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize,largedir,fname_encoding
> 		inode_size = 256
> 	}
>
> So having a way to specify the default encoding in /etc/mke2fs.conf is
> going to be important.  What will probably happen is that in two years, we'll
> be up to Unicode 13.0, and we might want to add support for Unicode
> 13.0 in some future kernel version, say, 5.8.  But then we won't want
> to make utf8-13.0 the default for some amount of time, since if the
> file system is mounted on an older kernel, it won't work; the kernel
> will have to reject mounting a file system with an unknown encoding.
>
> So that's why I always like to make these sorts of configuration
> defaults to be tuneable in /etc/mke2fs.conf.  Different distros will
> have different backwards compatibility policies.  For example, for
> enterprise distros, they might want to wait 7 years before creating
> file systems with utf8-13.0 as the default.  For a community distro,
> they might want to wait 2-3 years.  And for a purpose-built Linux
> gaming Valve box, where the kernel is under the control of the box
> manufacturers, they might want to be super-aggressive about adopting a
> new Unicode encoding, in order to crack that critical Ancient Sanskrit
> market.  :-)

Good point!
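
One possible shape for the lookup order discussed above, as a sketch only:
an explicit "-E encoding=" value wins, then a default_encoding string read
from /etc/mke2fs.conf, then a built-in fallback.  The list of known
encodings, the fallback value and the function names encoding_index() and
resolve_encoding() are assumptions for illustration; reading the value out
of mke2fs.conf is left to the caller and not shown:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of encodings this build of mke2fs knows about. */
static const char *const known_encodings[] = {
	"ascii", "utf8-10.0", "utf8-11.0",
};

#define NUM_KNOWN (sizeof(known_encodings) / sizeof(known_encodings[0]))

/* Hypothetical built-in fallback when mke2fs.conf sets no default. */
#define FALLBACK_ENCODING "utf8-10.0"

/* Return the index of a known encoding name, or -1 if it is unknown. */
static int encoding_index(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < NUM_KNOWN; i++)
		if (strcmp(known_encodings[i], name) == 0)
			return (int) i;
	return -1;
}

/*
 * Pick the encoding to use.  from_cmdline is the "-E encoding=" value
 * and from_conf is the default_encoding string from mke2fs.conf; either
 * may be NULL when not given.  An explicitly requested but unknown
 * encoding yields -1 rather than silently falling back.
 */
static int resolve_encoding(const char *from_cmdline, const char *from_conf)
{
	if (from_cmdline)
		return encoding_index(from_cmdline);
	if (from_conf)
		return encoding_index(from_conf);
	return encoding_index(FALLBACK_ENCODING);
}
```

Returning -1 for an unknown name mirrors what the kernel has to do at mount
time: refuse an encoding it does not recognize rather than guess.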

-- 
Gabriel Krisman Bertazi
