From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Eric Biggers <ebiggers@kernel.org>
Cc: Eugen Hristev <eugen.hristev@collabora.com>,
	 tytso@mit.edu, adilger.kernel@dilger.ca,
	 linux-ext4@vger.kernel.org, jaegeuk@kernel.org,
	 chao@kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	 kernel@collabora.com, viro@zeniv.linux.org.uk,
	 brauner@kernel.org,  jack@suse.cz,
	 Gabriel Krisman Bertazi <krisman@collabora.com>
Subject: Re: [PATCH v16 3/9] libfs: Introduce case-insensitive string comparison helper
Date: Sun, 12 May 2024 17:27:48 -0400
Message-ID: <875xviyb3f.fsf@mailhost.krisman.be>
In-Reply-To: <20240510013330.GI1110919@google.com> (Eric Biggers's message of "Fri, 10 May 2024 01:33:30 +0000")

Eric Biggers <ebiggers@kernel.org> writes:

> On Fri, Apr 05, 2024 at 03:13:26PM +0300, Eugen Hristev wrote:

>> +		if (WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)))
>> +			return -EINVAL;
>> +
>> +		decrypted_name.name = kmalloc(de_name_len, GFP_KERNEL);
>> +		if (!decrypted_name.name)
>> +			return -ENOMEM;
>> +		res = fscrypt_fname_disk_to_usr(parent, 0, 0, &encrypted_name,
>> +						&decrypted_name);
>> +		if (res < 0)
>> +			goto out;
>
> If fscrypt_fname_disk_to_usr() returns an error and !sb_has_strict_encoding(sb),
> then this function returns 0 (indicating no match) instead of the error code
> (indicating an error).  Is that the correct behavior?  I would think that
> strict_encoding should only have an effect on the actual name
> comparison.

No.  We *want* this return code to be propagated back to f2fs.  In ext4 it
wouldn't matter since the error is not visible outside of ext4_match,
but f2fs does the right thing and stops the lookup.

Thinking about it, there is a second problem with this series.
Currently, in strict mode, f2fs_match_ci_name does not propagate
unicode errors back to f2fs.  So, once an invalid UTF-8 sequence is
found during lookup, it is considered not-a-match but the lookup
continues.  This allows some lookups to succeed even in a corrupted
directory.  With this patch, we will abort the lookup on the first
error, breaking existing semantics.  Note that these errors are
different from a memory allocation failure or a failure in
fscrypt_fname_disk_to_usr.  For those, it makes sense to abort.

Also, once patches 6 and 7 are added, if fscrypt fails with -EINVAL for
any reason unrelated to unicode (like in the WARN_ON above), we will
incorrectly print the error message saying there is a bad UTF-8 string.

My suggestion would be to keep the current behavior.  Make
generic_ci_match propagate only non-unicode-related errors back to the
filesystem.  This means that we need to move the error messages in
patches 6 and 7 into this function, so they only trigger when
utf8_strncasecmp* itself fails.
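
As a rough illustration of the split suggested above, here is a
userspace sketch, not the kernel code.  ci_match_result() and its
parameters are hypothetical names: setup_err models the result of the
allocation/fscrypt steps, cmp_res models what utf8_strncasecmp*()
returned, and the 1/0/negative return convention is assumed from the
discussion (0 meaning no match).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the tail of a ci-match helper.
 *
 * setup_err: 0, or a negative errno from kmalloc/fscrypt-style steps.
 * cmp_res:   0 on match, >0 on mismatch, <0 if the comparison hit an
 *            invalid UTF-8 sequence.
 *
 * Returns 1 on match, 0 on no match, or a negative errno that the
 * filesystem should treat as "abort the lookup".
 */
static int ci_match_result(int setup_err, int cmp_res, bool *log_bad_utf8)
{
	assert(log_bad_utf8 != NULL);
	*log_bad_utf8 = false;

	/* Non-unicode failures (e.g. -ENOMEM, fscrypt errors) propagate. */
	if (setup_err < 0)
		return setup_err;

	/*
	 * Only a failure of the comparison itself is a unicode error:
	 * flag it for logging, report "no match", and let the caller
	 * keep scanning the directory.
	 */
	if (cmp_res < 0) {
		*log_bad_utf8 = true;
		return 0;
	}

	return cmp_res == 0;
}
```

With this shape, an -EINVAL coming from the fscrypt path never sets
log_bad_utf8, so the "bad UTF-8" message would not be printed for it.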

>> +	/*
>> +	 * Attempt a case-sensitive match first. It is cheaper and
>> +	 * should cover most lookups, including all the sane
>> +	 * applications that expect a case-sensitive filesystem.
>> +	 */
>> +	if (folded_name->name) {
>> +		if (dirent.len == folded_name->len &&
>> +		    !memcmp(folded_name->name, dirent.name, dirent.len))
>> +			goto out;
>> +		res = utf8_strncasecmp_folded(um, folded_name, &dirent);
>
> Shouldn't the memcmp be done with the original user-specified name, not the
> casefolded name?  I would think that the user-specified name is the one that's
> more likely to match the on-disk name, because of case preservation.  In most
> cases users will specify the same case on both file creation and later access.

Yes.
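
To make the point concrete, a small userspace sketch of the fast path
comparing against the name as the user typed it.  struct name_buf and
exact_name_match() are hypothetical stand-ins for illustration, not
the kernel's types or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a (pointer, length) name such as struct qstr. */
struct name_buf {
	const char *name;
	size_t len;
};

/*
 * Byte-exact comparison of the user-specified name against the on-disk
 * name.  Because case is preserved on disk, this is the comparison most
 * likely to hit when the user repeats the case used at creation time.
 */
static bool exact_name_match(const struct name_buf *user,
			     const struct name_buf *dirent)
{
	assert(user != NULL && dirent != NULL);
	return user->len == dirent->len &&
	       !memcmp(user->name, dirent->name, dirent->len);
}
```

For an on-disk "README.txt", the user-specified "README.txt" matches
byte for byte, while a casefolded "readme.txt" does not and would have
to go through the slower case-insensitive comparison.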

-- 
Gabriel Krisman Bertazi
