From: "Theodore Ts'o" <tytso@mit.edu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
Gabriel Krisman Bertazi <krisman@suse.de>,
viro@zeniv.linux.org.uk, linux-f2fs-devel@lists.sourceforge.net,
ebiggers@kernel.org, linux-fsdevel@vger.kernel.org,
jaegeuk@kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [f2fs-dev] [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs
Date: Tue, 21 Nov 2023 00:12:15 -0500
Message-ID: <20231121051215.GA335601@mit.edu>
In-Reply-To: <CAHk-=wh+o0Zkzn=mtF6nB1b-EEcod-y4+ZWtAe7=Mi1v7RjUpg@mail.gmail.com>
On Mon, Nov 20, 2023 at 07:03:13PM -0800, Linus Torvalds wrote:
> On Mon, 20 Nov 2023 at 18:29, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > It's a bit complicated, yes. But no, doing things one unicode
> > character at a time is just bad bad bad.
>
> Put another way: the _point_ of UTF-8 is that ASCII is still ASCII.
> It's literally why UTF-8 doesn't suck.
>
> So you can still compare ASCII strings as-is.
>
> No, that doesn't help people who are really using other locales, and
> are actively using complicated characters.
>
> But it very much does mean that you can compare "Bad" and "bad" and
> never ever look at any unicode translation ever.

Yeah, agreed, that would be a nice optimization. However, in the
unfortunate case where (a) it's non-ASCII, and (b) the input string is
non-normalized and/or differs in case, we end up scanning some portion
of the two strings twice; once doing the strcmp, and once doing the
Unicode slow path.

That being said, even when we're dealing with non-ASCII strings, in
the fairly common case where the program is doing a readdir() followed
by an open() or stat(), the filename will be byte-identical and so a
strcmp() will suffice.
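
To make the fast path concrete, here is a minimal userspace sketch,
not the kernel's actual code. The names ci_name_equal() and
slow_casefold_equal() are hypothetical, and the slow path is only an
ASCII-folding stand-in for the real Unicode normalize-and-casefold
comparison. The point it illustrates is the ordering: try a byte
comparison first, and pay for the slow path only when that fails.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the Unicode slow path. A real
 * implementation would normalize and casefold UTF-8; this one folds
 * ASCII letters only, so it may reject names that differ in length.
 * (Real Unicode-equivalent names can have different byte lengths.)
 */
static bool slow_casefold_equal(const char *a, size_t alen,
                                const char *b, size_t blen)
{
    if (alen != blen)
        return false;
    for (size_t i = 0; i < alen; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return false;
    }
    return true;
}

/*
 * Fast path first: a name that came straight from readdir() is
 * byte-identical to the on-disk name, so memcmp() settles it without
 * touching any folding tables. Only a mismatch falls through.
 */
static bool ci_name_equal(const char *a, size_t alen,
                          const char *b, size_t blen)
{
    if (alen == blen && memcmp(a, b, alen) == 0)
        return true;
    return slow_casefold_equal(a, alen, b, blen);
}
```

The cost noted above is visible here: when the byte comparison fails
partway through, the slow path starts again from the beginning, so a
prefix of both names is scanned twice.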

So I agree that it's a nice optimization. It'd be interesting to see
how much such an optimization would actually show up in various
benchmarks. It'd have to be something that was really metadata-heavy,
or else the filename lookups would get drowned out.

- Ted