public inbox for linux-bcachefs@vger.kernel.org
 help / color / mirror / Atom feed
From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: "Joshua Ashton" <joshua@froggi.es>,
	linux-bcachefs@vger.kernel.org,
	"André Almeida" <andrealmeid@igalia.com>
Subject: Re: [v4 3/3] bcachefs: Implement casefolding
Date: Tue, 15 Aug 2023 14:00:56 -0400	[thread overview]
Message-ID: <87sf8k423r.fsf@suse.de> (raw)
In-Reply-To: <20230815005837.figd3mlz7if4w6oo@moria.home.lan> (Kent Overstreet's message of "Mon, 14 Aug 2023 20:58:37 -0400")

Kent Overstreet <kent.overstreet@linux.dev> writes:

> On Mon, Aug 14, 2023 at 10:18:33PM +0100, Joshua Ashton wrote:
>
>> Grepping the test right now I already know we will fail the
>> `test_casefold_flag_removal` test as we don't support toggling the flag on
>> non-empty directories as it would require rehashing and thus restructuring
>> the btree.
>
> What was the motivation for supporting that on other filesystems? I'm
> not inclined to add that to bcachefs without a good reason.

I don't think there was a better reason than "it is user-friendly to
allow disabling the feature without having to remove the
directory". Note that it is trivial to do so in empty directories for
ext4/f2fs: it is just a matter of flipping a bit in the inode to
enable/disable casefolding.

I think it's ok for the Windows games to not allow disabling
case-insensitive once it is set for a directory.  For fstests, I suppose
the easiest solution would be to skip that subtest when running on
bcachefs.

>> > Sure, if you have both, dirent and dentry->d_name casefolded, then it is
>> > just a regular memcmp, but at least you don't have to change the disk
>> > format and your implementation becomes simpler.
>> > 
>> 
>> Given we have the opportunity to have support for this in the dirent format
>> for bcachefs, I think it makes sense to take it right now.
>
> It also means that we have the option of defining a casefolding map that
> handles ß correctly.

What do you mean by handling ß correctly?  do you intend to support NFKD
normalization?  We can extend fs/unicode/ for that fairly easily,
but iirc, other filesystems decided to stick with NFD, because NFKD is
a bit too eager when folding some characters.

>> > > +#if IS_ENABLED(CONFIG_UNICODE)
>> > > +	struct unicode_map 	*cf_encoding;
>> > > +#endif
>> > 
>> > This goes in struct super_block.
>> 
>> The code in libfs.c that uses the s_encoding in the super_block is never
>> used for bcachefs as it defines its own inode_operations and doesn't use any
>> of the libfs common code for lookups/hashes there (ie. the
>> dentry_operations).
>> 
>> One thing we wanted to do down the line was to use the bcachefs options
>> system to be able to pick the encoding on a per-inode basis.
>> 
>> Currently I am storing it in the struct bch_fs as currently there is no need
>> for other encodings right now and it is convenient, so there can only be one
>> -- but that could change at some point with an opt attached to the inode.
>> 
>> I am happy to bung it in the vfs super_block for now though -- but I need to
>> look at how that all works out with mounting and the multiple drives
>> situation. Just be aware that at some point, it will likely be moved back
>> again to our code.
>
> We also don't always have a super_block available when we have a running
> bch_fs object (e.g. when running in userspace, for fsck and various
> other tools). Since s_encoding is used in dirent.c, it probably needs to
> be in bch_fs.

Ah, I didn't know this is reused for userspace tools.  Still, as long as
you initialize sb->s_encoding at mount time, libfs helpers will work fine.

-- 
Gabriel Krisman Bertazi

  reply	other threads:[~2023-08-15 18:01 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-14  1:24 [v4 0/3] bcachefs: Casefolding implementation Joshua Ashton
2023-08-14  1:24 ` [v4 1/3] bcachefs: Split out dirent alloc and name initialization Joshua Ashton
2023-08-14  1:24 ` [v4 2/3] bcachefs: Move dirent_val_u64s to dirent.c Joshua Ashton
2023-08-14  1:24 ` [v4 3/3] bcachefs: Implement casefolding Joshua Ashton
2023-08-14 16:34   ` Gabriel Krisman Bertazi
2023-08-14 21:18     ` Joshua Ashton
2023-08-14 21:33       ` Joshua Ashton
2023-08-14 23:01       ` Gabriel Krisman Bertazi
2023-08-15  0:58       ` Kent Overstreet
2023-08-15 18:00         ` Gabriel Krisman Bertazi [this message]
2023-08-23  9:37           ` Joshua Ashton
2023-08-15 12:36   ` Brian Foster
2023-08-24  5:43     ` Joshua Ashton
2023-08-16 19:40 ` [v4 0/3] bcachefs: Casefolding implementation Kent Overstreet
2023-08-23  9:34   ` Joshua Ashton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sf8k423r.fsf@suse.de \
    --to=krisman@suse.de \
    --cc=andrealmeid@igalia.com \
    --cc=joshua@froggi.es \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox