linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Amir Goldstein <amir73il@gmail.com>
Cc: "André Almeida" <andrealmeid@igalia.com>,
	"Miklos Szeredi" <miklos@szeredi.hu>,
	"Theodore Tso" <tytso@mit.edu>,
	linux-unionfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Christian Brauner" <brauner@kernel.org>,
	"Jan Kara" <jack@suse.cz>,
	kernel-dev@igalia.com
Subject: Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
Date: Mon, 25 Aug 2025 21:34:35 -0400	[thread overview]
Message-ID: <87plci3lxw.fsf@mailhost.krisman.be> (raw)
In-Reply-To: <871poz4983.fsf@mailhost.krisman.be> (Gabriel Krisman Bertazi's message of "Mon, 25 Aug 2025 13:11:40 -0400")

Gabriel Krisman Bertazi <gabriel@krisman.be> writes:

> Amir Goldstein <amir73il@gmail.com> writes:
>
>> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
>>>
>>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>>> <gabriel@krisman.be> wrote:
>>> >
>>> > André Almeida <andrealmeid@igalia.com> writes:
>>> >
>>> > > To add overlayfs support casefold layers, create a new function
>>> > > ovl_casefold(), to be able to do case-insensitive strncmp().
>>> > >
>>> > > ovl_casefold() allocates a new buffer and stores the casefolded version
>>> > > of the string on it. If the allocation or the casefold operation fails,
>>> > > fallback to use the original string.
>>> > >
>>> > > The case-insentive name is then used in the rb-tree search/insertion
>>> > > operation. If the name is found in the rb-tree, the name can be
>>> > > discarded and the buffer is freed. If the name isn't found, it's then
>>> > > stored at struct ovl_cache_entry to be used later.
>>> > >
>>> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>> > > Signed-off-by: André Almeida <andrealmeid@igalia.com>
>>> > > ---
>>> > > Changes from v6:
>>> > >  - Last version was using `strncmp(... tmp->len)` which was causing
>>> > >    regressions. It should be `strncmp(... len)`.
>>> > >  - Rename cf_len to c_len
>>> > >  - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>>> > >  - Remove needless kfree(cf_name)
>>> > > ---
>>> > >  fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>>> > >  1 file changed, 94 insertions(+), 19 deletions(-)
>>> > >
>>> > > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>> > > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>>> > > --- a/fs/overlayfs/readdir.c
>>> > > +++ b/fs/overlayfs/readdir.c
>>> > > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>>> > >       bool is_upper;
>>> > >       bool is_whiteout;
>>> > >       bool check_xwhiteout;
>>> > > +     const char *c_name;
>>> > > +     int c_len;
>>> > >       char name[];
>>> > >  };
>>> > >
>>> > > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>>> > >       struct list_head *list;
>>> > >       struct list_head middle;
>>> > >       struct ovl_cache_entry *first_maybe_whiteout;
>>> > > +     struct unicode_map *map;
>>> > >       int count;
>>> > >       int err;
>>> > >       bool is_upper;
>>> > > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>>> > >       return rb_entry(n, struct ovl_cache_entry, node);
>>> > >  }
>>> > >
>>> > > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>>> > > +{
>>> > > +     const struct qstr qstr = { .name = str, .len = len };
>>> > > +     int cf_len;
>>> > > +
>>> > > +     if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>>> > > +             return 0;
>>> > > +
>>> > > +     *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>>> > > +
>>> > > +     if (dst) {
>>> > > +             cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>>> > > +
>>> > > +             if (cf_len > 0)
>>> > > +                     return cf_len;
>>> > > +     }
>>> > > +
>>> > > +     kfree(*dst);
>>> > > +     return 0;
>>> > > +}
>>> >
>>> > Hi,
>>> >
>>> > I should just note this does not differentiates allocation errors from
>>> > casefolding errors (invalid encoding).  It might be just a theoretical
>>> > error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>>> > operation is likely to fail too, but if you have an allocation failure, you
>>> > can end up with an inconsistent cache, because a file is added under the
>>> > !casefolded name and a later successful lookup will look for the
>>> > casefolded version.
>>>
>>> Good point.
>>> I will fix this in my tree.
>>
>> wait why should we not fail to fill the cache for both allocation
>> and encoding errors?
>>
>
> We shouldn't fail the cache for encoding errors, just for allocation errors.
>
> Perhaps I am misreading the code, so please correct me if I'm wrong.  if
> ovl_casefold fails, the non-casefolded name is used in the cache.  That
> makes sense if the reason utf8_casefold failed is because the string
> cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
> everything is fine.  But on an allocation failure, the string might have
> a real casefolded version.  If we fallback to the original string as the
> key, a cache lookup won't find the entry, since we compare with memcmp.

I was thinking again about this and I suspect I misunderstood your
question.  let me try to answer it again:

Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
casefolded directory when running on non-strict-mode.  They are treated
as non-encoded byte-sequences, as if they were seen on a case-Sensitive
directory.  They can't collide with other filenames because they
basically "fold" to themselves.

Now I suspect there is another problem with this series: I don't see how
it implements the semantics of strict mode.  What happens if upper and
lower are in strict mode (which is valid, same encoding_flags) but there
is an invalid name in the lower?  overlayfs should reject the dentry,
because any attempt to create it to the upper will fail.

André, did you consider this scenario?  You can test by creating a file
with an invalid-encoded name in a casefolded directory of a
non-strict-mode filesystem and then flip the strict-mode flag in the
superblock.  I can give it a try tomorrow too.

Thanks,

-- 
Gabriel Krisman Bertazi

  reply	other threads:[~2025-08-26  1:34 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
2025-08-25  9:19   ` Gabriel Krisman Bertazi
2025-08-25 12:38   ` Gabriel Krisman Bertazi
2025-08-25 15:28     ` Amir Goldstein
2025-08-22 14:17 ` [PATCH v6 2/9] fs: Create sb_same_encoding() helper André Almeida
2025-08-23 10:02   ` Amir Goldstein
2025-08-25  9:24   ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers André Almeida
2025-08-25 10:42   ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp() André Almeida
2025-08-22 16:53   ` Amir Goldstein
2025-08-25 11:09   ` Gabriel Krisman Bertazi
2025-08-25 15:27     ` Amir Goldstein
2025-08-25 15:45       ` Amir Goldstein
2025-08-25 17:11         ` Gabriel Krisman Bertazi
2025-08-26  1:34           ` Gabriel Krisman Bertazi [this message]
2025-08-26  7:19             ` Amir Goldstein
2025-08-26 15:02               ` Gabriel Krisman Bertazi
2025-08-26 19:58                 ` André Almeida
2025-08-27  9:28                   ` Amir Goldstein
2025-08-26 20:01               ` André Almeida
2025-08-27 20:45               ` André Almeida
2025-08-28 11:09                 ` Amir Goldstein
2025-08-22 14:17 ` [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding André Almeida
2025-08-25 11:17   ` Gabriel Krisman Bertazi
2025-08-25 15:32     ` Amir Goldstein
2025-08-26 20:12       ` André Almeida
2025-08-27  9:17         ` Amir Goldstein
2025-08-22 14:17 ` [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb André Almeida
2025-08-25 11:24   ` Gabriel Krisman Bertazi
2025-08-25 15:34     ` Amir Goldstein
2025-08-26 20:13       ` André Almeida
2025-08-22 14:17 ` [PATCH v6 7/9] ovl: Add S_CASEFOLD as part of the inode flag to be copied André Almeida
2025-08-22 14:17 ` [PATCH v6 8/9] ovl: Check for casefold consistency when creating new dentries André Almeida
2025-08-22 14:17 ` [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers André Almeida
2025-08-22 16:34   ` Amir Goldstein
2025-08-22 16:47     ` André Almeida
2025-08-22 19:17       ` Amir Goldstein
2025-08-25 13:31         ` André Almeida
2025-08-26  7:31           ` Amir Goldstein
2025-08-26 19:01             ` André Almeida
2025-08-27 18:06               ` Amir Goldstein
2025-08-27 20:37                 ` André Almeida
2025-08-27 23:58                 ` NeilBrown
2025-08-28  3:15                   ` Gabriel Krisman Bertazi
2025-08-28  7:25                     ` Amir Goldstein
2025-08-28 16:44                       ` Amir Goldstein
2025-08-29  1:27                         ` NeilBrown
2025-08-29  1:25                       ` NeilBrown
2025-08-29  9:31                         ` Amir Goldstein
2025-09-01 22:02                           ` NeilBrown
2025-08-22 19:28 ` [syzbot ci] Re: ovl: Enable support for casefold layers syzbot ci

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87plci3lxw.fsf@mailhost.krisman.be \
    --to=krisman@suse.de \
    --cc=amir73il@gmail.com \
    --cc=andrealmeid@igalia.com \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=kernel-dev@igalia.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).