From: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
To: Gao Xiang <gaoxiang25@huawei.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>, <tytso@mit.edu>,
<david@fromorbit.com>, <olaf@sgi.com>,
<linux-ext4@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
<alvaro.soliverez@collabora.co.uk>,
<kernel@lists.collabora.co.uk>, hutj <hutj@huawei.com>
Subject: Re: [PATCH RFC v2 00/13] NLS/UTF-8 Case-Insensitive lookups for ext4 and VFS proposal
Date: Mon, 12 Feb 2018 17:56:22 -0200
Message-ID: <87lgfy2d95.fsf@collabora.co.uk>
In-Reply-To: <b90a4183-6b22-2864-d484-9fd3e30543d0@huawei.com> (Gao Xiang's message of "Tue, 6 Feb 2018 11:21:44 +0800")
Gao Xiang <gaoxiang25@huawei.com> writes:
> Could I express my opinion? I have working on case-insensitive sdcardfs
> for months.
Hi Gao,
Thanks for helping out with this topic.
> I think your problem is how we optimise a case-insensitive lookup on the
> file system with a case-sensitive dcache (I mean d_add and no d_compare
> and d_hash).
Are d_compare and d_hash really that disruptive performance-wise, even
if they are only used when casefold/encoding support is enabled? I
don't see how we could make better use of the dcache without at least
requiring these functions to handle the CI cases.
> In that case, we could not trust the negative dentry when _creating_ a
> case-insensitive file, for example:
> there exists "anDroid" on-disk, but ext4's in-memory dcache only has
> the negative "Android", if we lookup "Android" we will get the
> _negative_ dentry, but we _cannot_ create it since "anDroid" exists on
> disk. In the create case, an on-disk _iterate_ (or readdir) is
> necessary.
In my previous email, I mentioned my current implementation ignores
negative dentries and forces a ->lookup(), which walks over the disk
entries. (I had to add a fix to the creation path in the vfs-ms_casefold
branch to exactly match that description, so you might have missed the
updated version in that branch).
Either way, this case is supported like this:
If we have two bind-mounts of the same directory, /mnt and /mnt-ci,
case-sensitive and case-insensitive, respectively, we can do:
open("/mnt/anDroid", O_EXCL|O_CREAT) = 3
open("/mnt/Android", 0) = -2 No such file or directory
open("/mnt-ci/Android", 0) = 4
open("/mnt-ci/Android", O_EXCL|O_CREAT) = -17 File exists
open("/mnt-ci/AndROID", O_EXCL|O_CREAT) = -17 File exists
The second open() is expected to create a negative dentry for "Android".
If the third open() did not ignore that dentry, the CI lookup would have
failed. Notice that the third open() actually opens the file that was
created by the first open(). It doesn't create a new file.
Following on, the 4th operation (file creation) *must fail* because
there is a CI name collision with /mnt-ci/anDroid. The same is true for
the final case.
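To make the expected results concrete, here is a toy user-space model of
the sequence above. It is only a sketch, not the kernel code: the
Directory class, its open() method and the ci/create_excl parameters are
hypothetical names, and Python's str.casefold() stands in for whatever
folding the configured encoding provides. The ci flag plays the role of
going through /mnt-ci instead of /mnt.

```python
import errno


class Directory:
    """Toy model of one directory reached through two bind-mounts."""

    def __init__(self):
        self.entries = {}  # exact on-disk name -> inode number
        self.next_ino = 1

    def _find(self, name, ci):
        # Case-sensitive view: only an exact match counts.
        if not ci:
            return name if name in self.entries else None
        # Case-insensitive view: walk the on-disk entries, never
        # trusting a cached "does not exist" answer.
        folded = name.casefold()
        for existing in self.entries:
            if existing.casefold() == folded:
                return existing
        return None

    def open(self, name, ci=False, create_excl=False):
        """Return the matched on-disk name, or a negative errno."""
        found = self._find(name, ci)
        if create_excl:
            if found is not None:
                return -errno.EEXIST
            self.entries[name] = self.next_ino
            self.next_ino += 1
            return name
        return found if found is not None else -errno.ENOENT


d = Directory()
print(d.open("anDroid", create_excl=True))           # created via /mnt
print(d.open("Android"))                             # -ENOENT via /mnt
print(d.open("Android", ci=True))                    # finds anDroid via /mnt-ci
print(d.open("Android", ci=True, create_excl=True))  # -EEXIST
print(d.open("AndROID", ci=True, create_excl=True))  # -EEXIST
```

The model has no dcache at all, which is equivalent to always ignoring
negative dentries on the CI side.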
> I could give another example, if we uses case-insentive ext4 and create
> "Android" and "anDroid", how to deal with the case in the
> case-insensitive way?
> I mean in that case we should make both "Android" and "anDroid" can
> access, right?
Not sure if I follow you here, but I'm assuming we create Android and
anDroid in the sensitive mountpoint because, otherwise, the second file
creation in the insensitive mountpoint would fail.
This is the case where I'm hiding one of the files previously created
through the CS mountpoint when it is accessed through the insensitive
mountpoint, and the user is shooting himself in the foot. In the
sensitive case, both stay visible to the user.
> I think we need to build a special case-sensitive dcache rather than
> a case-insensitive dcache following the native case-insentive fs(use
> d_add_ci, d_compare and d_hash, eg. fat, ntfs...)
What do you think about the second part of my proposal, where I mention
dealing differently with negative dentries created by a CI lookup?
We don't need to ignore them if we can invalidate them after a creation
in the directory.
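A rough user-space sketch of that idea, under my own assumptions: the
CIDir class and its fields are hypothetical, str.casefold() stands in
for the real folding, and dropping every negative entry of the directory
on creation is just the simplest possible invalidation policy, not
necessarily what a real implementation would do.

```python
class CIDir:
    """Toy directory whose CI lookups may cache negative results."""

    def __init__(self):
        self.disk = set()      # exact names present on disk
        self.negative = set()  # names cached as "not found" by CI lookups
        self.disk_walks = 0    # how many times the entries were walked

    def lookup(self, name):
        """Case-insensitive lookup; returns the on-disk name or None."""
        if name in self.negative:
            return None  # trust the cached negative entry
        self.disk_walks += 1
        folded = name.casefold()
        for existing in self.disk:
            if existing.casefold() == folded:
                return existing
        self.negative.add(name)
        return None

    def create(self, name, ci=True):
        """Create a file through the CI (default) or CS view."""
        clash = self.lookup(name) if ci else (name if name in self.disk else None)
        if clash is not None:
            raise FileExistsError(clash)
        self.disk.add(name)
        # Any creation in the directory invalidates its negative entries.
        self.negative.clear()


d = CIDir()
print(d.lookup("Android"))       # None, and the miss is now cached
d.create("anDroid", ci=False)    # created through the sensitive view
print(d.lookup("Android"))       # anDroid, because the cache was dropped
```

Without the clear() in create(), the last lookup would still hit the
stale negative entry and miss the file.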
> Finally, I agree "let the user shot herself in the foot by having two
> files with the exact CI name", but I think it could not the VFS
> _busniess_ itself since each customer solution "case-sensitive ext4 ->
> case-insensitive lookup" has their _perfered_ way (for example,
> "android" and "Android" exist, A perfers android and B perfers Android.
I don't see how we could defer the decision to the filesystem. That's a
pretty good problem, for which I don't have a solution right now.
> Finally, I think for optmization, ext4 or other fs could add some dir
> inode _tag_ and supports native case-insensitive for these dirs could be
> better....
Agreed. But I'm seeing this as outside the scope of my proposal, since it
is specific to each filesystem. My ext4 adaptation, for instance, falls
back to linear search when it can't find the exact match.
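The fallback can be pictured roughly as below. This is a simplified
user-space sketch, not the ext4 code: the function name ci_lookup is
hypothetical, a dict keyed by the exact name stands in for the hashed
directory index, and str.casefold() stands in for the real comparison.

```python
def ci_lookup(entries, name):
    """Return (on_disk_name, inode) for name, or None if nothing matches.

    entries maps exact on-disk names to inode numbers.
    """
    # Fast path: an exact match can be found through the index.
    if name in entries:
        return name, entries[name]
    # Slow path: no exact match, so compare every entry after folding.
    folded = name.casefold()
    for existing, ino in entries.items():
        if existing.casefold() == folded:
            return existing, ino
    return None


entries = {"Android": 12, "anDroid": 13, "README": 14}
print(ci_lookup(entries, "anDroid"))  # exact match, fast path
print(ci_lookup(entries, "ANDROID"))  # linear scan, first folded match wins
print(ci_lookup(entries, "missing"))  # None
```

In this sketch, when two names fold to the same string, the scan returns
whichever it meets first and the other stays hidden from inexact
lookups. Which entry wins in a real filesystem depends on its on-disk
ordering.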
Thanks,
--
Gabriel Krisman Bertazi