[PATCH] ext4: introduce linear search for dentries

public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] ext4: introduce linear search for dentries
@ 2025-02-12 16:44 Theodore Ts'o
  2025-02-12 21:02 ` Gabriel Krisman Bertazi
  2025-02-13 20:10 ` [PATCH -v2] " Theodore Ts'o
  0 siblings, 2 replies; 8+ messages in thread
From: Theodore Ts'o @ 2025-02-12 16:44 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: drosen, krisman, Theodore Ts'o

This patch addresses an issue where some files in case-insensitive
directories become inaccessible due to changes in how the kernel
function, utf8_casefold(), generates case-folded strings from the
commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code
points").

There are good reasons why this change should be made; it's actually
quite stupid that Unicode seems to think that the characters ❤ and ❤️
should be casefolded.  Unfortimately because of the backwards
compatibility issue, this commit was reverted in 231825b2e1ff.

This problem is addressed by instituting a brute-force linear fallback
if a lookup fails on case-folded directory, which does result in a
performance hit when looking up files affected by the changing how
thekernel treats ignorable Uniode characters, or when attempting to
look up non-existent file names.  So this fallback can be disabled by
setting an encoding flag if in the future, the system administrator or
the manufacturer of a mobile handset or tablet can be sure that there
was no opportunity for a kernel to insert file names with incompatible
encodings.

Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/namei.c    | 14 ++++++++++----
 include/linux/fs.h |  6 +++++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 536d56d15072..820e7ab7f3a3 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1462,7 +1462,8 @@ static bool ext4_match(struct inode *parent,
 		 * sure cf_name was properly initialized before
 		 * considering the calculated hash.
 		 */
-		if (IS_ENCRYPTED(parent) && fname->cf_name.name &&
+		if (sb_no_casefold_compat_fallback(parent->i_sb) &&
+		    IS_ENCRYPTED(parent) && fname->cf_name.name &&
 		    (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
 		     fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de)))
 			return false;
@@ -1595,10 +1596,15 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
 		 * return.  Otherwise, fall back to doing a search the
 		 * old fashioned way.
 		 */
-		if (!IS_ERR(ret) || PTR_ERR(ret) != ERR_BAD_DX_DIR)
+		if (IS_ERR(ret) && PTR_ERR(ret) == ERR_BAD_DX_DIR)
+			dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
+				       "falling back\n"));
+		else if (!sb_no_casefold_compat_fallback(dir->i_sb) &&
+			 *res_dir == NULL && IS_CASEFOLDED(dir))
+			dxtrace(printk(KERN_DEBUG "ext4_find_entry: casefold "
+				       "failed, falling back\n"));
+		else
 			goto cleanup_and_exit;
-		dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
-			       "falling back\n"));
 		ret = NULL;
 	}
 	nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2c3b2f8a621f..a10edf804140 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1258,11 +1258,15 @@ extern int send_sigurg(struct file *file);
 #define SB_NOUSER       BIT(31)
 
 /* These flags relate to encoding and casefolding */
-#define SB_ENC_STRICT_MODE_FL	(1 << 0)
+#define SB_ENC_STRICT_MODE_FL		(1 << 0)
+#define SB_ENC_NO_COMPAT_FALLBACK_FL	(1 << 1)
 
 #define sb_has_strict_encoding(sb) \
 	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
 
+#define sb_no_casefold_compat_fallback(sb) \
+	(sb->s_encoding_flags & SB_ENC_NO_COMPAT_FALLBACK_FL)
+
 /*
  *	Umount options
  */
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] ext4: introduce linear search for dentries
  2025-02-12 16:44 [PATCH] ext4: introduce linear search for dentries Theodore Ts'o
@ 2025-02-12 21:02 ` Gabriel Krisman Bertazi
  2025-02-13 20:10 ` [PATCH -v2] " Theodore Ts'o
  1 sibling, 0 replies; 8+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-02-12 21:02 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, drosen

"Theodore Ts'o" <tytso@mit.edu> writes:

> This patch addresses an issue where some files in case-insensitive
> directories become inaccessible due to changes in how the kernel
> function, utf8_casefold(), generates case-folded strings from the
> commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code
> points").
>
> There are good reasons why this change should be made; it's actually
> quite stupid that Unicode seems to think that the characters ❤ and ❤️
> should be casefolded.  Unfortimately because of the backwards
> compatibility issue, this commit was reverted in 231825b2e1ff.
>
> This problem is addressed by instituting a brute-force linear fallback
> if a lookup fails on case-folded directory, which does result in a
> performance hit when looking up files affected by the changing how
> thekernel treats ignorable Uniode characters, or when attempting to
> look up non-existent file names.  So this fallback can be disabled by
> setting an encoding flag if in the future, the system administrator or
> the manufacturer of a mobile handset or tablet can be sure that there
> was no opportunity for a kernel to insert file names with incompatible
> encodings.
>
> Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>

Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH -v2] ext4: introduce linear search for dentries
  2025-02-12 16:44 [PATCH] ext4: introduce linear search for dentries Theodore Ts'o
  2025-02-12 21:02 ` Gabriel Krisman Bertazi
@ 2025-02-13 20:10 ` Theodore Ts'o
  2025-02-13 20:17   ` Eric Biggers
  2025-02-19 20:30   ` Andreas Dilger
  1 sibling, 2 replies; 8+ messages in thread
From: Theodore Ts'o @ 2025-02-13 20:10 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: krisman, drosen, Theodore Ts'o

This patch addresses an issue where some files in case-insensitive
directories become inaccessible due to changes in how the kernel
function, utf8_casefold(), generates case-folded strings from the
commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code
points").

There are good reasons why this change should be made; it's actually
quite stupid that Unicode seems to think that the characters ❤ and ❤️
should be casefolded.  Unfortimately because of the backwards
compatibility issue, this commit was reverted in 231825b2e1ff.

This problem is addressed by instituting a brute-force linear fallback
if a lookup fails on case-folded directory, which does result in a
performance hit when looking up files affected by the changing how
thekernel treats ignorable Uniode characters, or when attempting to
look up non-existent file names.  So this fallback can be disabled by
setting an encoding flag if in the future, the system administrator or
the manufacturer of a mobile handset or tablet can be sure that there
was no opportunity for a kernel to insert file names with incompatible
encodings.

Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
v2:
   * Fix compile failure when CONFIG_UNICODE is not enabled
   * Added reviewed-by from Gabriel Krisman

 fs/ext4/namei.c    | 14 ++++++++++----
 include/linux/fs.h | 10 +++++++++-
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 536d56d15072..820e7ab7f3a3 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1462,7 +1462,8 @@ static bool ext4_match(struct inode *parent,
 		 * sure cf_name was properly initialized before
 		 * considering the calculated hash.
 		 */
-		if (IS_ENCRYPTED(parent) && fname->cf_name.name &&
+		if (sb_no_casefold_compat_fallback(parent->i_sb) &&
+		    IS_ENCRYPTED(parent) && fname->cf_name.name &&
 		    (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
 		     fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de)))
 			return false;
@@ -1595,10 +1596,15 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
 		 * return.  Otherwise, fall back to doing a search the
 		 * old fashioned way.
 		 */
-		if (!IS_ERR(ret) || PTR_ERR(ret) != ERR_BAD_DX_DIR)
+		if (IS_ERR(ret) && PTR_ERR(ret) == ERR_BAD_DX_DIR)
+			dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
+				       "falling back\n"));
+		else if (!sb_no_casefold_compat_fallback(dir->i_sb) &&
+			 *res_dir == NULL && IS_CASEFOLDED(dir))
+			dxtrace(printk(KERN_DEBUG "ext4_find_entry: casefold "
+				       "failed, falling back\n"));
+		else
 			goto cleanup_and_exit;
-		dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
-			       "falling back\n"));
 		ret = NULL;
 	}
 	nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2c3b2f8a621f..aa4ec39202c3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1258,11 +1258,19 @@ extern int send_sigurg(struct file *file);
 #define SB_NOUSER       BIT(31)
 
 /* These flags relate to encoding and casefolding */
-#define SB_ENC_STRICT_MODE_FL	(1 << 0)
+#define SB_ENC_STRICT_MODE_FL		(1 << 0)
+#define SB_ENC_NO_COMPAT_FALLBACK_FL	(1 << 1)
 
 #define sb_has_strict_encoding(sb) \
 	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
 
+#if IS_ENABLED(CONFIG_UNICODE)
+#define sb_no_casefold_compat_fallback(sb) \
+	(sb->s_encoding_flags & SB_ENC_NO_COMPAT_FALLBACK_FL)
+#else
+#define sb_no_casefold_compat_fallback(sb) (1)
+#endif
+
 /*
  *	Umount options
  */
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH -v2] ext4: introduce linear search for dentries
  2025-02-13 20:10 ` [PATCH -v2] " Theodore Ts'o
@ 2025-02-13 20:17   ` Eric Biggers
  2025-02-19 20:30   ` Andreas Dilger
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Biggers @ 2025-02-13 20:17 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, krisman, drosen

On Thu, Feb 13, 2025 at 03:10:21PM -0500, Theodore Ts'o wrote:
> There are good reasons why this change should be made; it's actually
> quite stupid that Unicode seems to think that the characters ❤ and ❤️
> should be casefolded.

It actually doesn't.  See
https://www.unicode.org/Public/16.0.0/ucd/CaseFolding.txt and note that heart is
not listed there.

For some reason ext4 actually does Unicode normalization, which is different.

- Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -v2] ext4: introduce linear search for dentries
  2025-02-13 20:10 ` [PATCH -v2] " Theodore Ts'o
  2025-02-13 20:17   ` Eric Biggers
@ 2025-02-19 20:30   ` Andreas Dilger
  2025-02-19 21:43     ` Gabriel Krisman Bertazi
  1 sibling, 1 reply; 8+ messages in thread
From: Andreas Dilger @ 2025-02-19 20:30 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, krisman, drosen

[-- Attachment #1: Type: text/plain, Size: 5080 bytes --]

On Feb 13, 2025, at 1:10 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> 
> This patch addresses an issue where some files in case-insensitive
> directories become inaccessible due to changes in how the kernel
> function, utf8_casefold(), generates case-folded strings from the
> commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code
> points").
> 
> There are good reasons why this change should be made; it's actually
> quite stupid that Unicode seems to think that the characters ❤ and ❤️
> should be casefolded.  Unfortimately because of the backwards
> compatibility issue, this commit was reverted in 231825b2e1ff.
> 
> This problem is addressed by instituting a brute-force linear fallback
> if a lookup fails on case-folded directory, which does result in a
> performance hit when looking up files affected by the changing how
> thekernel treats ignorable Uniode characters, or when attempting to
> look up non-existent file names.  So this fallback can be disabled by
> setting an encoding flag if in the future, the system administrator or
> the manufacturer of a mobile handset or tablet can be sure that there
> was no opportunity for a kernel to insert file names with incompatible
> encodings.

I don't have the full context here, but falling back to a full directory
scan for every failed lookup in a casefolded directory would be *very*
expensive for a large directory.

This could be made conditional upon a much narrower set of conditions:
- if the filename has non-ASCII characters (already uncommon)
- if the filename contains characters that may be case folded (normalized?)

This avoids a huge performance hit for every name lookup in very common
workloads that do not need it (i.e. most computer-generated filenames are
still only using ASCII characters).

Also, depending on the size of the directory vs. the number of case-folded
(normalized?) characters in the filename, it might be faster to do
2^(ambiguous_chars) htree lookups instead of a linear scan of the whole dir.

That could be checked easily whether 2^(ambiguous_chars) < dir blocks, since
the htree leaf blocks will always be fully scanned anyway once found.  That
could be a big win if there are only a few remapped characters in a filename.

Cheers, Andreas

> 
> Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
> ---
> v2:
>   * Fix compile failure when CONFIG_UNICODE is not enabled
>   * Added reviewed-by from Gabriel Krisman
> 
> fs/ext4/namei.c    | 14 ++++++++++----
> include/linux/fs.h | 10 +++++++++-
> 2 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 536d56d15072..820e7ab7f3a3 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1462,7 +1462,8 @@ static bool ext4_match(struct inode *parent,
> 		 * sure cf_name was properly initialized before
> 		 * considering the calculated hash.
> 		 */
> -		if (IS_ENCRYPTED(parent) && fname->cf_name.name &&
> +		if (sb_no_casefold_compat_fallback(parent->i_sb) &&
> +		    IS_ENCRYPTED(parent) && fname->cf_name.name &&
> 		    (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
> 		     fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de)))
> 			return false;
> @@ -1595,10 +1596,15 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
> 		 * return.  Otherwise, fall back to doing a search the
> 		 * old fashioned way.
> 		 */
> -		if (!IS_ERR(ret) || PTR_ERR(ret) != ERR_BAD_DX_DIR)
> +		if (IS_ERR(ret) && PTR_ERR(ret) == ERR_BAD_DX_DIR)
> +			dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
> +				       "falling back\n"));
> +		else if (!sb_no_casefold_compat_fallback(dir->i_sb) &&
> +			 *res_dir == NULL && IS_CASEFOLDED(dir))
> +			dxtrace(printk(KERN_DEBUG "ext4_find_entry: casefold "
> +				       "failed, falling back\n"));
> +		else
> 			goto cleanup_and_exit;
> -		dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
> -			       "falling back\n"));
> 		ret = NULL;
> 	}
> 	nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2c3b2f8a621f..aa4ec39202c3 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1258,11 +1258,19 @@ extern int send_sigurg(struct file *file);
> #define SB_NOUSER       BIT(31)
> 
> /* These flags relate to encoding and casefolding */
> -#define SB_ENC_STRICT_MODE_FL	(1 << 0)
> +#define SB_ENC_STRICT_MODE_FL		(1 << 0)
> +#define SB_ENC_NO_COMPAT_FALLBACK_FL	(1 << 1)
> 
> #define sb_has_strict_encoding(sb) \
> 	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
> 
> +#if IS_ENABLED(CONFIG_UNICODE)
> +#define sb_no_casefold_compat_fallback(sb) \
> +	(sb->s_encoding_flags & SB_ENC_NO_COMPAT_FALLBACK_FL)
> +#else
> +#define sb_no_casefold_compat_fallback(sb) (1)
> +#endif
> +
> /*
>  *	Umount options
>  */
> --
> 2.45.2
> 
> 


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -v2] ext4: introduce linear search for dentries
  2025-02-19 20:30   ` Andreas Dilger
@ 2025-02-19 21:43     ` Gabriel Krisman Bertazi
  2025-02-20  1:58       ` Andreas Dilger
  0 siblings, 1 reply; 8+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-02-19 21:43 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Theodore Ts'o, Ext4 Developers List, drosen

Andreas Dilger <adilger@dilger.ca> writes:

> On Feb 13, 2025, at 1:10 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>> 
>> This patch addresses an issue where some files in case-insensitive
>> directories become inaccessible due to changes in how the kernel
>> function, utf8_casefold(), generates case-folded strings from the
>> commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code
>> points").
>> 
>> There are good reasons why this change should be made; it's actually
>> quite stupid that Unicode seems to think that the characters ❤ and ❤️
>> should be casefolded.  Unfortimately because of the backwards
>> compatibility issue, this commit was reverted in 231825b2e1ff.
>> 
>> This problem is addressed by instituting a brute-force linear fallback
>> if a lookup fails on case-folded directory, which does result in a
>> performance hit when looking up files affected by the changing how
>> thekernel treats ignorable Uniode characters, or when attempting to
>> look up non-existent file names.  So this fallback can be disabled by
>> setting an encoding flag if in the future, the system administrator or
>> the manufacturer of a mobile handset or tablet can be sure that there
>> was no opportunity for a kernel to insert file names with incompatible
>> encodings.
>
> I don't have the full context here, but falling back to a full directory
> scan for every failed lookup in a casefolded directory would be *very*
> expensive for a large directory.

The context is that I made a change in unicode that caused a change in
the filename hash of case-insensitive dirents, which are calculated from
the casefolded form of the name.  While that change was reverted, users
have created files with the broken kernels and there are reports of
files inaccessible.

> This could be made conditional upon a much narrower set of conditions:
> - if the filename has non-ASCII characters (already uncommon)
> - if the filename contains characters that may be case folded
> (normalized?)

It could be even simpler, by only doing it to filenames that have
zero-length characters before normalization.  We can easily check it
with utf8nlen or utf8ncursor.  I'm very wary of differentiating ASCII
from other characters if we can avoid it.

> This avoids a huge performance hit for every name lookup in very common
> workloads that do not need it (i.e. most computer-generated filenames are
> still only using ASCII characters).

Right.  But this also only affects case-insensitive filesystems, which
have very specific and controlled applications and hardly thousands of
files in the same directory.  If we really need it, I'd suggest we don't
differentiating ASCII from utf8, but only filenames with those
sequences.

IMO, Ted's patch seems fine as a temporary stopgap to recover those
filesystems.

>
> Also, depending on the size of the directory vs. the number of case-folded
> (normalized?) characters in the filename, it might be faster to do
> 2^(ambiguous_chars) htree lookups instead of a linear scan of the whole dir.
>
> That could be checked easily whether 2^(ambiguous_chars) < dir blocks, since
> the htree leaf blocks will always be fully scanned anyway once found.  That
> could be a big win if there are only a few remapped characters in a filename.
>
> Cheers, Andreas
>
>> 
>> Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
>> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
>> ---
>> v2:
>>   * Fix compile failure when CONFIG_UNICODE is not enabled
>>   * Added reviewed-by from Gabriel Krisman
>> 
>> fs/ext4/namei.c    | 14 ++++++++++----
>> include/linux/fs.h | 10 +++++++++-
>> 2 files changed, 19 insertions(+), 5 deletions(-)
>> 
>> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
>> index 536d56d15072..820e7ab7f3a3 100644
>> --- a/fs/ext4/namei.c
>> +++ b/fs/ext4/namei.c
>> @@ -1462,7 +1462,8 @@ static bool ext4_match(struct inode *parent,
>> 		 * sure cf_name was properly initialized before
>> 		 * considering the calculated hash.
>> 		 */
>> -		if (IS_ENCRYPTED(parent) && fname->cf_name.name &&
>> +		if (sb_no_casefold_compat_fallback(parent->i_sb) &&
>> +		    IS_ENCRYPTED(parent) && fname->cf_name.name &&
>> 		    (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
>> 		     fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de)))
>> 			return false;
>> @@ -1595,10 +1596,15 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
>> 		 * return.  Otherwise, fall back to doing a search the
>> 		 * old fashioned way.
>> 		 */
>> -		if (!IS_ERR(ret) || PTR_ERR(ret) != ERR_BAD_DX_DIR)
>> +		if (IS_ERR(ret) && PTR_ERR(ret) == ERR_BAD_DX_DIR)
>> +			dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
>> +				       "falling back\n"));
>> +		else if (!sb_no_casefold_compat_fallback(dir->i_sb) &&
>> +			 *res_dir == NULL && IS_CASEFOLDED(dir))
>> +			dxtrace(printk(KERN_DEBUG "ext4_find_entry: casefold "
>> +				       "failed, falling back\n"));
>> +		else
>> 			goto cleanup_and_exit;
>> -		dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
>> -			       "falling back\n"));
>> 		ret = NULL;
>> 	}
>> 	nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb);
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 2c3b2f8a621f..aa4ec39202c3 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -1258,11 +1258,19 @@ extern int send_sigurg(struct file *file);
>> #define SB_NOUSER       BIT(31)
>> 
>> /* These flags relate to encoding and casefolding */
>> -#define SB_ENC_STRICT_MODE_FL	(1 << 0)
>> +#define SB_ENC_STRICT_MODE_FL		(1 << 0)
>> +#define SB_ENC_NO_COMPAT_FALLBACK_FL	(1 << 1)
>> 
>> #define sb_has_strict_encoding(sb) \
>> 	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
>> 
>> +#if IS_ENABLED(CONFIG_UNICODE)
>> +#define sb_no_casefold_compat_fallback(sb) \
>> +	(sb->s_encoding_flags & SB_ENC_NO_COMPAT_FALLBACK_FL)
>> +#else
>> +#define sb_no_casefold_compat_fallback(sb) (1)
>> +#endif
>> +
>> /*
>>  *	Umount options
>>  */
>> --
>> 2.45.2
>> 
>> 
>
>
> Cheers, Andreas
>
>
>
>
>

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -v2] ext4: introduce linear search for dentries
  2025-02-19 21:43     ` Gabriel Krisman Bertazi
@ 2025-02-20  1:58       ` Andreas Dilger
  2025-02-20 14:46         ` Theodore Ts'o
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Dilger @ 2025-02-20  1:58 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi; +Cc: Theodore Ts'o, Ext4 Developers List, drosen

[-- Attachment #1: Type: text/plain, Size: 8525 bytes --]

On Feb 19, 2025, at 2:43 PM, Gabriel Krisman Bertazi <krisman@suse.de> wrote:
> 
> Andreas Dilger <adilger@dilger.ca> writes:
> 
>> On Feb 13, 2025, at 1:10 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>>> 
>>> This patch addresses an issue where some files in case-insensitive
>>> directories become inaccessible due to changes in how the kernel
>>> function, utf8_casefold(), generates case-folded strings from the
>>> commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code
>>> points").
>>> 
>>> There are good reasons why this change should be made; it's actually
>>> quite stupid that Unicode seems to think that the characters ❤ and ❤️
>>> should be casefolded.  Unfortimately because of the backwards
>>> compatibility issue, this commit was reverted in 231825b2e1ff.
>>> 
>>> This problem is addressed by instituting a brute-force linear fallback
>>> if a lookup fails on case-folded directory, which does result in a
>>> performance hit when looking up files affected by the changing how
>>> thekernel treats ignorable Uniode characters, or when attempting to
>>> look up non-existent file names.  So this fallback can be disabled by
>>> setting an encoding flag if in the future, the system administrator or
>>> the manufacturer of a mobile handset or tablet can be sure that there
>>> was no opportunity for a kernel to insert file names with incompatible
>>> encodings.
>> 
>> I don't have the full context here, but falling back to a full directory
>> scan for every failed lookup in a casefolded directory would be *very*
>> expensive for a large directory.
> 
> The context is that I made a change in unicode that caused a change in
> the filename hash of case-insensitive dirents, which are calculated from
> the casefolded form of the name.  While that change was reverted, users
> have created files with the broken kernels and there are reports of
> files inaccessible.
> 
>> This could be made conditional upon a much narrower set of conditions:
>> - if the filename has non-ASCII characters (already uncommon)
>> - if the filename contains characters that may be case folded
>> (normalized?)
> 
> It could be even simpler, by only doing it to filenames that have
> zero-length characters before normalization.  We can easily check it
> with utf8nlen or utf8ncursor.  I'm very wary of differentiating ASCII
> from other characters if we can avoid it.

Sure, my suggestions are aimed at minimizing the impact of this extra
(and very expensive) fallback mechanism.  If there is a direct way to
determine which filenames were impacted by the earlier bug, and then
do only two lookups (one with the "buggy" casefolded name, one with the
"good" casefolded name) then this would be (at worst) a 2x slowdown for
the lookup, instead of a 1000x slowdown (or whatever, for large directories).

Also, since the number of users affected by this bug is relatively small
(only users running kernels >= v6.12-rc2-1-g5c26d2f1d3f5 where the broken
patch landed and v6.13-rc2-36-g231825b2e1ff when it was reverted), but the
workaround by default affects everyone using the casefold feature, it
behooves us to minimize the performance impact of the workaround.

> 
>> This avoids a huge performance hit for every name lookup in very common
>> workloads that do not need it (i.e. most computer-generated filenames are
>> still only using ASCII characters).
> 
> Right.  But this also only affects case-insensitive filesystems, which
> have very specific and controlled applications and hardly thousands of
> files in the same directory.

I think this is a generalization that does not hold true in all cases.

We have been looking at adding casefold support to Lustre, in order to
improve Samba export performance (which also has a "scan all entries"
fallback), and we cannot control how many files are in a single directory.

It seems likely that systems have been using casefold directly on ext4
for Samba as well.  If the performance impact of "scan all entries" is
noticeable for Samba, then it would be noticeable for this fallback.

> If we really need it, I'd suggest we don't differentiating ASCII from
> utf8, but only filenames with those sequences.

Even better if the fallback can be limited to the subset of affected names.

> IMO, Ted's patch seems fine as a temporary stopgap to recover those
> filesystems.

Sure, but it is enabled by default and affects anyone using the casefold
feature forever into the future, unless the plan is not to land this patch
and only deploy it in the specific distros where the broken kernel was used?

One option would be to have the kernel re-hash any entries that it finds
with the old filename, so that the directories repair themselves, and the
workaround could be removed after some time.  Also, have e2fsck re-hash
the filenames in this case, so that there is a long-term solution after
the kernel workaround is removed.

Cheers, Andreas

>> Also, depending on the size of the directory vs. the number of case-folded
>> (normalized?) characters in the filename, it might be faster to do
>> 2^(ambiguous_chars) htree lookups instead of a linear scan of the whole dir.
>> 
>> That could be checked easily whether 2^(ambiguous_chars) < dir blocks, since
>> the htree leaf blocks will always be fully scanned anyway once found.  That
>> could be a big win if there are only a few remapped characters in a filename.
>> 
>> Cheers, Andreas
>> 
>>> 
>>> Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points")
>>> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>>> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
>>> ---
>>> v2:
>>>  * Fix compile failure when CONFIG_UNICODE is not enabled
>>>  * Added reviewed-by from Gabriel Krisman
>>> 
>>> fs/ext4/namei.c    | 14 ++++++++++----
>>> include/linux/fs.h | 10 +++++++++-
>>> 2 files changed, 19 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
>>> index 536d56d15072..820e7ab7f3a3 100644
>>> --- a/fs/ext4/namei.c
>>> +++ b/fs/ext4/namei.c
>>> @@ -1462,7 +1462,8 @@ static bool ext4_match(struct inode *parent,
>>> 		 * sure cf_name was properly initialized before
>>> 		 * considering the calculated hash.
>>> 		 */
>>> -		if (IS_ENCRYPTED(parent) && fname->cf_name.name &&
>>> +		if (sb_no_casefold_compat_fallback(parent->i_sb) &&
>>> +		    IS_ENCRYPTED(parent) && fname->cf_name.name &&
>>> 		    (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
>>> 		     fname->hinfo.minor_hash != EXT4_DIRENT_MINOR_HASH(de)))
>>> 			return false;
>>> @@ -1595,10 +1596,15 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
>>> 		 * return.  Otherwise, fall back to doing a search the
>>> 		 * old fashioned way.
>>> 		 */
>>> -		if (!IS_ERR(ret) || PTR_ERR(ret) != ERR_BAD_DX_DIR)
>>> +		if (IS_ERR(ret) && PTR_ERR(ret) == ERR_BAD_DX_DIR)
>>> +			dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
>>> +				       "falling back\n"));
>>> +		else if (!sb_no_casefold_compat_fallback(dir->i_sb) &&
>>> +			 *res_dir == NULL && IS_CASEFOLDED(dir))
>>> +			dxtrace(printk(KERN_DEBUG "ext4_find_entry: casefold "
>>> +				       "failed, falling back\n"));
>>> +		else
>>> 			goto cleanup_and_exit;
>>> -		dxtrace(printk(KERN_DEBUG "ext4_find_entry: dx failed, "
>>> -			       "falling back\n"));
>>> 		ret = NULL;
>>> 	}
>>> 	nblocks = dir->i_size >> EXT4_BLOCK_SIZE_BITS(sb);
>>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>>> index 2c3b2f8a621f..aa4ec39202c3 100644
>>> --- a/include/linux/fs.h
>>> +++ b/include/linux/fs.h
>>> @@ -1258,11 +1258,19 @@ extern int send_sigurg(struct file *file);
>>> #define SB_NOUSER       BIT(31)
>>> 
>>> /* These flags relate to encoding and casefolding */
>>> -#define SB_ENC_STRICT_MODE_FL	(1 << 0)
>>> +#define SB_ENC_STRICT_MODE_FL		(1 << 0)
>>> +#define SB_ENC_NO_COMPAT_FALLBACK_FL	(1 << 1)
>>> 
>>> #define sb_has_strict_encoding(sb) \
>>> 	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
>>> 
>>> +#if IS_ENABLED(CONFIG_UNICODE)
>>> +#define sb_no_casefold_compat_fallback(sb) \
>>> +	(sb->s_encoding_flags & SB_ENC_NO_COMPAT_FALLBACK_FL)
>>> +#else
>>> +#define sb_no_casefold_compat_fallback(sb) (1)
>>> +#endif
>>> +
>>> /*
>>> *	Umount options
>>> */
>>> --
>>> 2.45.2
>>> 
>>> 
>> 
>> 
>> Cheers, Andreas
>> 
>> 
>> 
>> 
>> 
> 
> --
> Gabriel Krisman Bertazi


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH -v2] ext4: introduce linear search for dentries
  2025-02-20  1:58       ` Andreas Dilger
@ 2025-02-20 14:46         ` Theodore Ts'o
  0 siblings, 0 replies; 8+ messages in thread
From: Theodore Ts'o @ 2025-02-20 14:46 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Gabriel Krisman Bertazi, Ext4 Developers List, drosen

On Wed, Feb 19, 2025 at 06:58:00PM -0700, Andreas Dilger wrote:
> Sure, my suggestions are aimed at minimizing the impact of this extra
> (and very expensive) fallback mechanism.  If there is a direct way to
> determine which filenames were impacted by the earlier bug, and then
> do only two lookups (one with the "buggy" casefolded name, one with the
> "good" casefolded name) then this would be (at worst) a 2x slowdown for
> the lookup, instead of a 1000x slowdown (or whatever, for large directories).
> 
> Also, since the number of users affected by this bug is relatively small
> (only users running kernels >= v6.12-rc2-1-g5c26d2f1d3f5 where the broken
> patch landed and v6.13-rc2-36-g231825b2e1ff when it was reverted), but the
> workaround by default affects everyone using the casefold feature, it
> behooves us to minimize the performance impact of the workaround.

This is why I added a new encoding flag, SB_ENC_NO_COMPAT_FALLBACK_FL,
so if the system administrator is sure that the device never had that
alternate encoding, we don't have to pay that performance penalty.

The problem is the original reason for making the change was a
"security vulnerability" where if you had one of these invisibile
zero-length characters in a directory named ".git", this could cause
someone who was using git on a case-fold directory vulnerable to an
attach where if they pulled from reponsitory that was controlled by a
malicious entity, that this could cause the pull to resullt in an
overwrite of .git/config.  So it was a relativey narrow range in
Linus's tree, but the "security fix" was backported into LTS kernels,
and pushed out to a large number of Android handsets which do use case
folding.

This is why the default is to do the fallback; other than Android
handsets, the number of user of case folding is mercifully quite
small.  And the problem was detected on Android machines, where users
who had files that included characters such as '❤️' or '❤' could no
longer access them; fixing that regression had to take priority.

> We have been looking at adding casefold support to Lustre, in order to
> improve Samba export performance (which also has a "scan all entries"
> fallback), and we cannot control how many files are in a single directory.

For Lustre, if you know that no one is going to be using kernels with
the changed encoding, you could just aways set
SB_ENC_NO_COMPAT_FALLBACK_FL and just be happy.  If you think that
Lustre users might actually use git, and you are worried about this
"security vulnerability", we could ask the git project to fix it at
their level.  I personally don't care that much, since I'm not sure
how many people would really want to be doing development using git on
an Android handset using Termux.  :-)

> It seems likely that systems have been using casefold directly on ext4
> for Samba as well.  If the performance impact of "scan all entries" is
> noticeable for Samba, then it would be noticeable for this fallback.

I'm not sure how many Samba installations actually do use it, but if
they do, but it might not be that bad, since we do have negative
dentries for the common misses in a search path (for example).  And if
it is safe, we can provide utilities to make it easier to
set SB_ENC_NO_COMPAT_FALLBACK_FL.

> One option would be to have the kernel re-hash any entries that it finds
> with the old filename, so that the directories repair themselves, and the
> workaround could be removed after some time.  Also, have e2fsck re-hash
> the filenames in this case, so that there is a long-term solution after
> the kernel workaround is removed.

There are quite a lot of things that can be done, but quite frankly,
I'm not really that excited to invest the time to do something more
complicated.  If someone does want to spend more time, including
changes that might involve another encoding bit so we could actualy
safely eliminate the "dangerous" zero-width characters while allowing
'❤️' or '❤' to be distinct and without breaking file systems that have
those characters, I'm certainly willing to entertain patches.

      		      			   	     - Ted

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-02-20 14:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-12 16:44 [PATCH] ext4: introduce linear search for dentries Theodore Ts'o
2025-02-12 21:02 ` Gabriel Krisman Bertazi
2025-02-13 20:10 ` [PATCH -v2] " Theodore Ts'o
2025-02-13 20:17   ` Eric Biggers
2025-02-19 20:30   ` Andreas Dilger
2025-02-19 21:43     ` Gabriel Krisman Bertazi
2025-02-20  1:58       ` Andreas Dilger
2025-02-20 14:46         ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox