linux-fsdevel.vger.kernel.org archive mirror
From: Gabriel Krisman Bertazi <gabriel@krisman.be>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	 Amir Goldstein <amir73il@gmail.com>,
	 Theodore Tso <tytso@mit.edu>,
	linux-unionfs@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,  Jan Kara <jack@suse.cz>,
	kernel-dev@igalia.com
Subject: Re: [PATCH RFC v2 2/8] ovl: Create ovl_strcmp() with casefold support
Date: Tue, 05 Aug 2025 10:56:45 -0400
Message-ID: <87o6stakb6.fsf@mailhost.krisman.be>
In-Reply-To: <20250805-tonyk-overlayfs-v2-2-0e54281da318@igalia.com> ("André Almeida"'s message of "Tue, 05 Aug 2025 00:09:06 -0300")

André Almeida <andrealmeid@igalia.com> writes:

> To add overlayfs support for casefold filesystems, create a new function
> ovl_strcmp() with support for casefold names.
>
> If the ovl_cache_entry has stored a casefold name, use it and create
> a casefolded version of the name that is going to be compared to.
>
> For the casefold support, just comparing the strings does not work
> because we need the dentry encoding, so make this function find the
> equivalent dentry for a given directory, if any.
>
> As this function is used for search and insertion in the red-black tree,
> that means that the tree node keys are going to be the casefolded
> version of the dentry's names. Otherwise, the search would not work for
> case-insensitive mount points.
>
> For the non-casefold names, nothing changes.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> I wonder what should be done here if kmalloc fails, if the strcmp()
> should fail as well or just fallback to the normal name?
> ---
>  fs/overlayfs/readdir.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 83bca1bcb0488461b08effa70b32ff2fefba134e..1b8eb10e72a229ade40d18795746d3c779797a06 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -72,6 +72,44 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>  	return rb_entry(n, struct ovl_cache_entry, node);
>  }
>  
> +/*
> + * Compare a string with a cache entry, with support for casefold names.
> + */
> +static int ovl_strcmp(const char *str, struct ovl_cache_entry *p, int len)
> +{

Why do you need to re-casefold str on every call to ovl_strcmp?  Isn't
it done in a loop while walking the rbtree with a constant "str" (i.e.,
the name being added, see ovl_cache_entry_find)? Can't you do it once,
outside of ovl_strcmp? This way you don't repeatedly allocate/free
memory for each node of the tree (as Viro mentioned), and you don't have
to deal with kmalloc failures here.
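
To make the suggestion concrete, here is a rough userspace sketch, not the
actual overlayfs code.  fold_name() is a hypothetical ASCII-only stand-in
for utf8_casefold(), NAME_MAX_LEN stands in for OVL_NAME_LEN, and struct
entry is a simplified placeholder for ovl_cache_entry whose key is assumed
to have been folded at insertion time.  The lookup name is folded once,
before the walk, so the only failure point sits outside the comparison loop:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 256	/* hypothetical stand-in for OVL_NAME_LEN */

/*
 * Hypothetical stand-in for utf8_casefold(): ASCII-only lowercase fold.
 * Returns the folded length, or -1 if dst is too small.
 */
int fold_name(const char *src, size_t len, char *dst, size_t dst_size)
{
	size_t i;

	if (len >= dst_size)
		return -1;
	for (i = 0; i < len; i++)
		dst[i] = (char)tolower((unsigned char)src[i]);
	dst[len] = '\0';
	return (int)len;
}

/* Simplified placeholder for ovl_cache_entry, kept in a plain binary tree. */
struct entry {
	const char *cf_name;	/* key, assumed already folded at insertion */
	size_t len;
	struct entry *left, *right;
};

/* Fold the lookup name once, then walk the tree comparing folded keys. */
struct entry *find_folded(struct entry *root, const char *name, size_t len)
{
	char key[NAME_MAX_LEN];
	int key_len = fold_name(name, len, key, sizeof(key));
	struct entry *node = root;

	if (key_len < 0)
		return NULL;	/* failure handled once, before the walk */

	while (node) {
		int cmp = strncmp(key, node->cf_name, (size_t)key_len);

		if (cmp > 0)
			node = node->right;
		else if (cmp < 0 || (size_t)key_len < node->len)
			node = node->left;
		else
			return node;
	}
	return NULL;
}
```

The on-stack buffer is only for brevity in userspace; the point is that the
per-node comparison no longer allocates or folds anything.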

> +
> +	const struct qstr qstr = { .name = str, .len = len };
> +	const char *p_name = p->name, *name = str;
> +	char *dst = NULL;
> +	int cmp, cf_len;
> +
> +	if (p->cf_name)
> +		p_name = p->cf_name;

This should check IS_ENABLED(CONFIG_UNICODE) so it can be
compiled out by anyone doing CONFIG_UNICODE=n
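
A minimal userspace illustration of the idea, with assumed names:
UNICODE_ENABLED is a simplified stand-in for IS_ENABLED(CONFIG_UNICODE)
from the kernel's kconfig machinery, and struct cache_entry is a
hypothetical reduction of ovl_cache_entry.  Because the condition is a
compile-time constant, the compiler can drop the casefold branch when the
option is off:

```c
#include <stddef.h>

/*
 * Simplified stand-in for IS_ENABLED(CONFIG_UNICODE): a constant 1 or 0
 * that is usable inside ordinary C conditions.
 */
#ifdef CONFIG_UNICODE
#define UNICODE_ENABLED 1
#else
#define UNICODE_ENABLED 0
#endif

/* Hypothetical reduction of ovl_cache_entry to the two name fields. */
struct cache_entry {
	const char *name;
	const char *cf_name;	/* casefolded name, may be NULL */
};

/* Pick the name to compare against; the branch folds away when disabled. */
const char *entry_cmp_name(const struct cache_entry *p)
{
	if (UNICODE_ENABLED && p->cf_name)
		return p->cf_name;
	return p->name;
}
```

Building with -DCONFIG_UNICODE selects the casefolded name when one is
stored; without it the function always returns the original name.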

> +
> +	if (p->map && !is_dot_dotdot(str, len)) {
> +		dst = kmalloc(OVL_NAME_LEN, GFP_KERNEL);
> +
> +		/*
> +		 * strcmp can't fail, so we fallback to the use the original
> +		 * name
> +		 */
> +		if (dst) {
> +			cf_len = utf8_casefold(p->map, &qstr, dst, OVL_NAME_LEN);

utf8_casefold can fail, as you know and checked.  But if it does, the
negative cf_len is still passed to strncmp below, where it is converted
to size_t and becomes a very large length.
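
A small userspace sketch of the conversion problem and one possible guard.
Both helpers are hypothetical, and -22 is only used as an example of a
negative error value:

```c
#include <stddef.h>
#include <string.h>

/* Shows what strncmp() would see: the int is converted to size_t. */
size_t as_size(int len)
{
	return (size_t)len;
}

/*
 * Hypothetical guard: only use the folded length when it is positive,
 * otherwise fall back to a known-good, non-negative length.
 */
int cmp_checked(const char *a, const char *b, int len, int fallback_len)
{
	size_t n = (len > 0) ? (size_t)len : (size_t)fallback_len;

	return strncmp(a, b, n);
}
```

For example, as_size(-22) yields a value close to the maximum of size_t, so
the comparison is effectively unbounded by the intended length, while
cmp_checked("abc", "abd", -22, 2) compares only the first two bytes.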

> +
> +			if (cf_len > 0) {
> +				name = dst;
> +				dst[cf_len] = '\0';
> +			}

utf8_casefold already NUL-terminates the string on success.

> +		}
> +	}
> +
> +	cmp = strncmp(name, p_name, cf_len);
> +
> +	kfree(dst);
> +
> +	return cmp;
> +}
> +
>  static bool ovl_cache_entry_find_link(const char *name, int len,
>  				      struct rb_node ***link,
>  				      struct rb_node **parent)
> @@ -85,7 +123,7 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
>  
>  		*parent = *newp;
>  		tmp = ovl_cache_entry_from_node(*newp);
> -		cmp = strncmp(name, tmp->name, len);
> +		cmp = ovl_strcmp(name, tmp, len);
>  		if (cmp > 0)
>  			newp = &tmp->node.rb_right;
>  		else if (cmp < 0 || len < tmp->len)
> @@ -107,7 +145,7 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
>  	while (node) {
>  		struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
>  
> -		cmp = strncmp(name, p->name, len);
> +		cmp = ovl_strcmp(name, p, len);
>  		if (cmp > 0)
>  			node = p->node.rb_right;
>  		else if (cmp < 0 || len < p->len)

-- 
Gabriel Krisman Bertazi
