From: bugzilla-daemon--- via Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net>
To: linux-f2fs-devel@lists.sourceforge.net
Subject: [f2fs-dev] [Bug 219586] New: Unable to find file after unicode change
Date: Tue, 10 Dec 2024 06:58:44 +0000
Message-ID: <bug-219586-202145@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=219586
Bug ID: 219586
Summary: Unable to find file after unicode change
Product: File System
Version: 2.5
Hardware: All
OS: Linux
Status: NEW
Severity: blocking
Priority: P3
Component: f2fs
Assignee: filesystem_f2fs@kernel-bugs.kernel.org
Reporter: hanqi@vivo.com
Regression: No
Hi everybody,
The f2fs filesystem is unable to read some files with special characters,
such as ❤️, after the kernel was updated with the following patch:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=18b5f47e7da46d3a0d7331e48befcaf151ed2ddf
We can reproduce this with the following steps:
1. First, roll back the unicode-related change above and create a file or
folder whose name contains the special character:
./tools/mkfs.f2fs -f -O casefold -C utf8 f2fs.img
mount f2fs.img f2fs_dir/
mkdir Picture
./f2fs_io setflags casefold Picture
cd Picture
touch ❤️
2. Then apply the above unicode patch. After mounting the filesystem again,
we get a message that the special character file was not found:
mount f2fs.img f2fs_dir/
cd Picture
ls -alh
ls: cannot access '❤️': No such file or directory
total 8
drwxr-xr-x 2 root root 3488 Dec 10 06:11 .
drwxr-xr-x 3 root root 4096 Dec 9 10:21 ..
-????????? ? ? ? ? ? ❤️
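The listing above shows an entry whose name is returned by readdir but
cannot be resolved by a lookup. A minimal sketch for finding every such
entry in a directory, assuming Python 3 is available on the test machine
(the function name find_unresolvable_entries is hypothetical and not part
of any f2fs tool):

```python
import os


def find_unresolvable_entries(directory):
    """Return names that the directory listing reports but lstat cannot resolve."""
    broken = []
    for name in os.listdir(directory):
        try:
            # lstat goes through the normal name lookup, which is the step
            # that fails for the affected entries.
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            broken.append(name)
    return sorted(broken)
```

Calling find_unresolvable_entries("f2fs_dir/Picture") on the image from the
steps above would be expected to report the ❤️ entry, and an empty list on a
directory that is not affected.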
Here are the conclusions of my preliminary analysis.
In casefold-enabled f2fs filesystems, a file name is case-folded by the
utf8_casefold function when the file is looked up, and the hash is then
calculated from the case-folded name. That hash is what is stored on disk.
The call path is:
f2fs_lookup
f2fs_prepare_lookup
__f2fs_setup_filename
f2fs_init_casefolded_name
utf8_casefold
f2fs_hash_filename
__f2fs_find_entry
For some file names that contain special characters, such as ❤️, we found
that the length of the output of utf8_casefold differs before and after the
patch, which in turn changes the calculated hash. As a result, files created
before the patch cannot be found once the patch is applied.
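The effect can be illustrated with a small user-space model. This is only a
sketch under stated assumptions: it assumes the length difference comes from
whether the variation selector U+FE0F (the second code point of ❤️) is
dropped or kept during folding, which the analysis above does not state
explicitly. The two fold functions, the zlib.crc32 stand-in hash, and the
ToyDir class are all hypothetical and are not the kernel's utf8_casefold,
f2fs_hash_filename, or on-disk layout.

```python
import unicodedata
import zlib

VS16 = "\ufe0f"  # VARIATION SELECTOR-16, second code point of the emoji heart


def fold_dropping_vs16(name):
    """Assumed 'before' behavior: fold the name, then drop U+FE0F."""
    folded = unicodedata.normalize("NFD", name.casefold())
    return folded.replace(VS16, "").encode("utf-8")


def fold_keeping_vs16(name):
    """Assumed 'after' behavior: fold the name and keep U+FE0F."""
    return unicodedata.normalize("NFD", name.casefold()).encode("utf-8")


def stand_in_hash(folded):
    """Stand-in for the real filename hash; any hash of the folded bytes works here."""
    return zlib.crc32(folded)


class ToyDir:
    """Toy directory whose entries are keyed by the hash and folded name at creation."""

    def __init__(self):
        self.entries = {}

    def create(self, name, fold):
        key = fold(name)
        self.entries[(stand_in_hash(key), key)] = name

    def lookup(self, name, fold):
        key = fold(name)
        return self.entries.get((stand_in_hash(key), key))


heart = "\u2764\ufe0f"
picture = ToyDir()
picture.create(heart, fold_dropping_vs16)  # created before the change

print(len(fold_dropping_vs16(heart)), len(fold_keeping_vs16(heart)))  # 3 6
print(picture.lookup(heart, fold_dropping_vs16) == heart)  # True
print(picture.lookup(heart, fold_keeping_vs16))  # None
```

In this model the entry is still present in the directory, but a lookup that
folds the name differently from the way it was folded at creation time
computes a different key and misses it, which matches the symptom in the ls
output.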
I think we need to modify the f2fs filesystem so that it stays compatible
with this unicode-related change.