From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Eugen Hristev <eugen.hristev@collabora.com>
Cc: tytso@mit.edu, adilger.kernel@dilger.ca,
linux-ext4@vger.kernel.org, jaegeuk@kernel.org,
chao@kernel.org, linux-f2fs-devel@lists.sourceforge.net,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel@collabora.com, viro@zeniv.linux.org.uk,
brauner@kernel.org, jack@suse.cz,
Gabriel Krisman Bertazi <krisman@collabora.com>
Subject: Re: [PATCH v10 3/8] libfs: Introduce case-insensitive string comparison helper
Date: Mon, 19 Feb 2024 09:55:31 -0500
Message-ID: <87msrwbj18.fsf@mailhost.krisman.be>
In-Reply-To: <50d2afaa-fd7e-4772-ac84-24e8994bfba8@collabora.com> (Eugen Hristev's message of "Mon, 19 Feb 2024 06:22:37 +0200")
Eugen Hristev <eugen.hristev@collabora.com> writes:
> On 2/16/24 18:12, Gabriel Krisman Bertazi wrote:
>> Eugen Hristev <eugen.hristev@collabora.com> writes:
>>
>>> From: Gabriel Krisman Bertazi <krisman@collabora.com>
>>>
>>> generic_ci_match can be used by case-insensitive filesystems to compare
>>> strings under lookup with dirents in a case-insensitive way. This
>>> function is currently reimplemented by each filesystem supporting
>>> casefolding, so this reduces code duplication in filesystem-specific
>>> code.
>>>
>>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
>>> [eugen.hristev@collabora.com: rework to first test the exact match]
>>> Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com>
>>> ---
>>> fs/libfs.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++
>>> include/linux/fs.h | 4 +++
>>> 2 files changed, 84 insertions(+)
>>>
>>> diff --git a/fs/libfs.c b/fs/libfs.c
>>> index bb18884ff20e..82871fa1b066 100644
>>> --- a/fs/libfs.c
>>> +++ b/fs/libfs.c
>>> @@ -1773,6 +1773,86 @@ static const struct dentry_operations generic_ci_dentry_ops = {
>>> .d_hash = generic_ci_d_hash,
>>> .d_compare = generic_ci_d_compare,
>>> };
>>> +
>>> +/**
>>> + * generic_ci_match() - Match a name (case-insensitively) with a dirent.
>>> + * This is a filesystem helper for comparison with directory entries.
>>> + * generic_ci_d_compare should be used in VFS' ->d_compare instead.
>>> + *
>>> + * @parent: Inode of the parent of the dirent under comparison
>>> + * @name: name under lookup.
>>> + * @folded_name: Optional pre-folded name under lookup
>>> + * @de_name: Dirent name.
>>> + * @de_name_len: dirent name length.
>>> + *
>>> + *
>>
>> Since this needs a respin, mind dropping the extra empty line here?
>>
>>> + * Test whether a case-insensitive directory entry matches the filename
>>> + * being searched. If @folded_name is provided, it is used instead of
>>> + * recalculating the casefold of @name.
>>> + *
>>> + * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
>>> + * < 0 on error.
>>> + */
>>> +int generic_ci_match(const struct inode *parent,
>>> +		     const struct qstr *name,
>>> +		     const struct qstr *folded_name,
>>> +		     const u8 *de_name, u32 de_name_len)
>>> +{
>>> +	const struct super_block *sb = parent->i_sb;
>>> +	const struct unicode_map *um = sb->s_encoding;
>>> +	struct fscrypt_str decrypted_name = FSTR_INIT(NULL, de_name_len);
>>> +	struct qstr dirent = QSTR_INIT(de_name, de_name_len);
>>> +	int res;
>>> +
>>> +	if (IS_ENCRYPTED(parent)) {
>>> +		const struct fscrypt_str encrypted_name =
>>> +			FSTR_INIT((u8 *) de_name, de_name_len);
>>> +
>>> +		if (WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)))
>>> +			return -EINVAL;
>>> +
>>> +		decrypted_name.name = kmalloc(de_name_len, GFP_KERNEL);
>>> +		if (!decrypted_name.name)
>>> +			return -ENOMEM;
>>> +		res = fscrypt_fname_disk_to_usr(parent, 0, 0, &encrypted_name,
>>> +						&decrypted_name);
>>> +		if (res < 0)
>>> +			goto out;
>>> +		dirent.name = decrypted_name.name;
>>> +		dirent.len = decrypted_name.len;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Attempt a case-sensitive match first. It is cheaper and
>>> +	 * should cover most lookups, including all the sane
>>> +	 * applications that expect a case-sensitive filesystem.
>>> +	 *
>>
>>
>>> +	 * This comparison is safe under RCU because the caller
>>> +	 * guarantees the consistency between str and len. See
>>> +	 * __d_lookup_rcu_op_compare() for details.
>>> +	 */
>>
>> This paragraph doesn't really make sense here. It comes from the
>> d_compare hook, which can be called under RCU, but there is no RCU
>> here. Also, here we are comparing the dirent with the
>> name-under-lookup, which is already safe.
>>
>>
>>> +	if (folded_name->name) {
>>> +		if (dirent.len == folded_name->len &&
>>> +		    !memcmp(folded_name->name, dirent.name, dirent.len)) {
>>> +			res = 1;
>>> +			goto out;
>>> +		}
>>> +		res = !utf8_strncasecmp_folded(um, folded_name, &dirent);
>>
>> Hmm, second thought on this. This will ignore errors from utf8_strncasecmp*,
>> which CAN happen for the first time here, if the dirent itself is
>> corrupted on disk (exactly why we have patch 6). Yes, ext4_match will drop the
>> error, but we want to propagate it from here, so that the warning in
>> patch 6 can trigger.
>>
>> This is why I did that match dance in the original submission. Sorry
>> for suggesting it. We really want to get the error from utf8 and
>> propagate it if it is negative. Basically:
>>
>> res > 0: match
>> res == 0: no match.
>> res < 0: propagate error and let the caller handle it
>
> In that case I will revert to the original v9 implementation and send a v11 to
> handle that.
Please note that the memcmp optimization is still valid. On an exact
match, we know the name is valid UTF-8. It is just a matter of
propagating the error code from utf8_strncasecmp* to the caller in the
cases where we do need to call it.
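
To make the shape concrete, here is a minimal userspace sketch, not the
v11 patch. It keeps the byte-exact fast path and only turns the result
of the case-insensitive compare into match/no-match after checking for
an error. struct name, stub_casecmp_folded() and ci_match_sketch() are
hypothetical stand-ins invented for illustration. The stub is
ASCII-only and assumes the same convention as utf8_strncasecmp_folded()
(0 on match, positive on mismatch, negative on an invalid name).

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct qstr. */
struct name {
	const unsigned char *s;
	size_t len;
};

/*
 * Hypothetical stand-in for utf8_strncasecmp_folded(). Assumed
 * convention: 0 on match, 1 on mismatch, -EINVAL on an invalid
 * dirent. ASCII only: any byte >= 0x80 is treated as corruption.
 */
static int stub_casecmp_folded(const struct name *folded,
			       const struct name *dirent)
{
	size_t i;

	for (i = 0; i < dirent->len; i++)
		if (dirent->s[i] >= 0x80)
			return -EINVAL;
	if (folded->len != dirent->len)
		return 1;
	for (i = 0; i < dirent->len; i++)
		if (tolower(dirent->s[i]) != folded->s[i])
			return 1;
	return 0;
}

/* Returns > 0 on match, 0 on no match, < 0 on error. */
static int ci_match_sketch(const struct name *folded,
			   const struct name *dirent)
{
	int res;

	/* Fast path: a byte-exact match needs no validation. */
	if (dirent->len == folded->len &&
	    !memcmp(folded->s, dirent->s, dirent->len))
		return 1;

	res = stub_casecmp_folded(folded, dirent);
	if (res < 0)
		return res;	/* propagate, do not fold into "no match" */

	return !res;
}
```

With a folded name of "readme", this returns 1 for "readme" and
"ReadMe", 0 for "notes", and -EINVAL for a dirent containing an
invalid byte, which is the case the caller needs to see.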
--
Gabriel Krisman Bertazi