From: Gabriel Krisman Bertazi <gabriel@krisman.be>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	 Amir Goldstein <amir73il@gmail.com>,
	 Theodore Tso <tytso@mit.edu>,
	linux-unionfs@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,  Jan Kara <jack@suse.cz>,
	kernel-dev@igalia.com
Subject: Re: [PATCH RFC v2 2/8] ovl: Create ovl_strcmp() with casefold support
Date: Tue, 05 Aug 2025 10:56:45 -0400
Message-ID: <87o6stakb6.fsf@mailhost.krisman.be>
In-Reply-To: <20250805-tonyk-overlayfs-v2-2-0e54281da318@igalia.com> ("André Almeida"'s message of "Tue, 05 Aug 2025 00:09:06 -0300")

André Almeida <andrealmeid@igalia.com> writes:

> To add overlayfs support casefold filesystems, create a new function
> ovl_strcmp() with support for casefold names.
>
> If the ovl_cache_entry have stored a casefold name, use it and create
> a casfold version of the name that is going to be compared to.
>
> For the casefold support, just comparing the strings does not work
> because we need the dentry enconding, so make this function find the
> equivalent dentry for a giving directory, if any.
>
> As this function is used for search and insertion in the red-black tree,
> that means that the tree node keys are going to be the casefolded
> version of the dentry's names. Otherwise, the search would not work for
> case-insensitive mount points.
>
> For the non-casefold names, nothing changes.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> I wonder what should be done here if kmalloc fails, if the strcmp()
> should fail as well or just fallback to the normal name?
> ---
>  fs/overlayfs/readdir.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 83bca1bcb0488461b08effa70b32ff2fefba134e..1b8eb10e72a229ade40d18795746d3c779797a06 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -72,6 +72,44 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>  	return rb_entry(n, struct ovl_cache_entry, node);
>  }
>  
> +/*
> + * Compare a string with a cache entry, with support for casefold names.
> + */
> +static int ovl_strcmp(const char *str, struct ovl_cache_entry *p, int len)
> +{

Why do you need to re-casefold str on every call to ovl_strcmp?  It is
called in a loop while walking the rbtree with a constant "str" (i.e.,
the name being added, see ovl_cache_entry_find and
ovl_cache_entry_find_link).  Can't you casefold it once, outside of
ovl_strcmp?  That way you don't repeatedly allocate and free memory for
each node of the tree (as Viro mentioned), and you don't have to deal
with kmalloc failures here.
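
A minimal userspace sketch of that shape, not the overlayfs code itself.
It assumes an ASCII-only fold standing in for utf8_casefold() and a
sorted array standing in for the rbtree; every sketch_* name and
SKETCH_NAME_LEN are made up for illustration:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NAME_LEN 256	/* hypothetical buffer size */

/*
 * Hypothetical stand-in for utf8_casefold(): ASCII-only lowercase fold.
 * Returns the folded length, or -1 if dst is too small.
 */
static int sketch_casefold(const char *src, size_t len, char *dst, size_t dlen)
{
	size_t i;

	if (len >= dlen)
		return -1;
	for (i = 0; i < len; i++)
		dst[i] = (char)tolower((unsigned char)src[i]);
	dst[len] = '\0';
	return (int)len;
}

/* Entries sorted by their already-folded key; stands in for the rbtree. */
struct sketch_entry {
	const char *cf_name;
};

/* Fold the key once, then search; nothing is allocated per comparison. */
static int sketch_find(const struct sketch_entry *tbl, size_t n,
		       const char *name, size_t len)
{
	char folded[SKETCH_NAME_LEN];
	size_t lo = 0, hi = n;

	/* The only place folding can fail, and it is outside the loop. */
	if (sketch_casefold(name, len, folded, sizeof(folded)) < 0)
		return -1;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(folded, tbl[mid].cf_name);

		if (cmp == 0)
			return (int)mid;
		if (cmp > 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

The comparison inside the loop is then a plain string compare against
the stored folded key, so the comparator itself has no failure path.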

> +
> +	const struct qstr qstr = { .name = str, .len = len };
> +	const char *p_name = p->name, *name = str;
> +	char *dst = NULL;
> +	int cmp, cf_len;
> +
> +	if (p->cf_name)
> +		p_name = p->cf_name;

This should check IS_ENABLED(CONFIG_UNICODE) so that it can be
compiled out when building with CONFIG_UNICODE=n.
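
A userspace sketch of the pattern, assuming a plain 0/1 macro
(SKETCH_UNICODE, hypothetical) in place of the kernel's
IS_ENABLED(CONFIG_UNICODE), and a made-up struct in place of
ovl_cache_entry:

```c
/* Hypothetical userspace stand-in for IS_ENABLED(CONFIG_UNICODE). */
#ifndef SKETCH_UNICODE
#define SKETCH_UNICODE 0
#endif

/* Made-up struct for illustration; not the real ovl_cache_entry. */
struct sketch_dirent {
	const char *name;
	const char *cf_name;	/* folded name, may be NULL */
};

static const char *sketch_key(const struct sketch_dirent *p)
{
	/*
	 * With the macro at 0 the condition is constant-false, so the
	 * compiler can drop the branch while still type-checking it.
	 */
	if (SKETCH_UNICODE && p->cf_name)
		return p->cf_name;
	return p->name;
}
```

Compared to an #ifdef block, the branch stays visible to the compiler
in both configurations, so it cannot silently bit-rot.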

> +
> +	if (p->map && !is_dot_dotdot(str, len)) {
> +		dst = kmalloc(OVL_NAME_LEN, GFP_KERNEL);
> +
> +		/*
> +		 * strcmp can't fail, so we fallback to the use the original
> +		 * name
> +		 */
> +		if (dst) {
> +			cf_len = utf8_casefold(p->map, &qstr, dst, OVL_NAME_LEN);

utf8_casefold can fail, as you know and checked.  But if it does, the
negative cf_len is still passed to strncmp, where it is converted to
size_t and becomes a very large count.
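
To illustrate the conversion in userspace C: strncmp() takes a size_t
count, so a negative int wraps around.  The helper names below are
hypothetical, and falling back to the original length is only one
possible way to guard against it:

```c
#include <stddef.h>
#include <stdint.h>

/* What strncmp() effectively sees when handed an int count. */
static size_t sketch_as_count(int n)
{
	return (size_t)n;	/* a negative n wraps to a huge value */
}

/*
 * Hypothetical guard: use the folded length only if folding succeeded,
 * otherwise fall back to the length of the original name.
 */
static size_t sketch_cmp_len(int cf_len, int len)
{
	return cf_len > 0 ? (size_t)cf_len : (size_t)len;
}
```

For example, sketch_as_count(-22) (the value of -EINVAL on Linux) is
SIZE_MAX - 21, so the comparison is bounded only by the terminating
NUL bytes of the two strings.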

> +
> +			if (cf_len > 0) {
> +				name = dst;
> +				dst[cf_len] = '\0';
> +			}

utf8_casefold already ensures the string is NUL-terminated on success,
so this is not needed.

> +		}
> +	}
> +
> +	cmp = strncmp(name, p_name, cf_len);
> +
> +	kfree(dst);
> +
> +	return cmp;
> +}
> +
>  static bool ovl_cache_entry_find_link(const char *name, int len,
>  				      struct rb_node ***link,
>  				      struct rb_node **parent)
> @@ -85,7 +123,7 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
>  
>  		*parent = *newp;
>  		tmp = ovl_cache_entry_from_node(*newp);
> -		cmp = strncmp(name, tmp->name, len);
> +		cmp = ovl_strcmp(name, tmp, len);
>  		if (cmp > 0)
>  			newp = &tmp->node.rb_right;
>  		else if (cmp < 0 || len < tmp->len)
> @@ -107,7 +145,7 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
>  	while (node) {
>  		struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
>  
> -		cmp = strncmp(name, p->name, len);
> +		cmp = ovl_strcmp(name, p, len);
>  		if (cmp > 0)
>  			node = p->node.rb_right;
>  		else if (cmp < 0 || len < p->len)

-- 
Gabriel Krisman Bertazi
