From: Gabriel Krisman Bertazi <gabriel@krisman.be>
To: "André Almeida" <andrealmeid@igalia.com>
Cc: Hugh Dickins <hughd@google.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,  Jan Kara <jack@suse.cz>,
	krisman@kernel.org,  linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	 kernel-dev@igalia.com,  Daniel Rosenberg <drosen@google.com>,
	 smcv@collabora.com,  Christoph Hellwig <hch@lst.de>,
	Theodore Ts'o <tytso@mit.edu>
Subject: Re: [PATCH v3 6/9] tmpfs: Add casefold lookup support
Date: Thu, 05 Sep 2024 17:28:53 -0400
Message-ID: <87zfoln622.fsf@mailhost.krisman.be>
In-Reply-To: <20240905190252.461639-7-andrealmeid@igalia.com> ("André Almeida"'s message of "Thu, 5 Sep 2024 16:02:49 -0300")


Hi,

André Almeida <andrealmeid@igalia.com> writes:
> @@ -3427,6 +3431,10 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
>  	if (IS_ERR(inode))
>  		return PTR_ERR(inode);
>  
> +	if (IS_ENABLED(CONFIG_UNICODE))
> +		if (!generic_ci_validate_strict_name(dir, &dentry->d_name))
> +			return -EINVAL;
> +

if (IS_ENABLED(CONFIG_UNICODE) &&
    !generic_ci_validate_strict_name(dir, &dentry->d_name))
	return -EINVAL;

>  static const struct constant_table shmem_param_enums_huge[] = {
> @@ -4081,9 +4111,62 @@ const struct fs_parameter_spec shmem_fs_parameters[] = {
>  	fsparam_string("grpquota_block_hardlimit", Opt_grpquota_block_hardlimit),
>  	fsparam_string("grpquota_inode_hardlimit", Opt_grpquota_inode_hardlimit),
>  #endif
> +	fsparam_string("casefold",	Opt_casefold_version),
> +	fsparam_flag  ("casefold",	Opt_casefold),
> +	fsparam_flag  ("strict_encoding", Opt_strict_encoding),

I don't know if it is possible, but can we do it with a single parameter?

> +static int shmem_parse_opt_casefold(struct fs_context *fc, struct fs_parameter *param,
> +				    bool latest_version)

Instead of the boolean, can't you check if param->string != NULL? (Real
question, I have never used fs_parameter.)

> +{
> +	struct shmem_options *ctx = fc->fs_private;
> +	unsigned int maj = 0, min = 0, rev = 0, version = 0;
> +	struct unicode_map *encoding;
> +	char *version_str = param->string + 5;
> +	int ret;

unsigned int version = UTF8_LATEST;

and kill the if/else below:
> +
> +	if (latest_version) {
> +		version = UTF8_LATEST;
> +	} else {
> +		if (strncmp(param->string, "utf8-", 5))
> +			return invalfc(fc, "Only UTF-8 encodings are supported "
> +				       "in the format: utf8-<version number>");
> +
> +		ret = utf8_parse_version(version_str, &maj, &min, &rev);

The utf8_parse_version() interface could return the UNICODE_AGE() value
directly, so we hide the details from the caller. What do you think?
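
To make the suggestions above concrete (default to the latest version,
test the string for NULL instead of passing a boolean, and have the
parser hand back the packed value), here is a minimal userspace sketch.
It is not kernel code.  parse_casefold_version(), SKETCH_AGE() and
SKETCH_UTF8_LATEST are hypothetical names, the packed layout
(major << 16 | minor << 8 | revision) is an assumption modeled on
UNICODE_AGE(), and a NULL string stands in for the bare "casefold" flag.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed packing, modeled on UNICODE_AGE(); for illustration only. */
#define SKETCH_AGE(maj, min, rev) \
	(((unsigned int)(maj) << 16) | ((unsigned int)(min) << 8) | (unsigned int)(rev))

/* Placeholder for UTF8_LATEST; the value is chosen for illustration. */
#define SKETCH_UTF8_LATEST SKETCH_AGE(12, 1, 0)

/*
 * Hypothetical helper.  Returns 0 and stores the packed version in
 * *version, or -1 if the string is malformed.  A NULL string selects
 * the latest version.  On failure *version is left at the default.
 */
static int parse_casefold_version(const char *str, unsigned int *version)
{
	unsigned int maj, min, rev;
	char trailing;

	assert(version != NULL);

	*version = SKETCH_UTF8_LATEST;	/* default, so no if/else is needed */
	if (!str)
		return 0;

	if (strncmp(str, "utf8-", 5))
		return -1;

	/* Require exactly three numeric fields and nothing after them. */
	if (sscanf(str + 5, "%u.%u.%u%c", &maj, &min, &rev, &trailing) != 3)
		return -1;
	if (maj > 255 || min > 255 || rev > 255)
		return -1;

	*version = SKETCH_AGE(maj, min, rev);
	return 0;
}
```

With this shape the caller never sees maj/min/rev, and the flag and
string variants of the option can share one code path.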

> +		if (ret)
> +			return invalfc(fc, "Invalid UTF-8 version: %s", version_str);
> +
> +		version = UNICODE_AGE(maj, min, rev);
> +	}
> +
> +	encoding = utf8_load(version);
> +
> +	if (IS_ERR(encoding)) {
> +		if (latest_version)
> +			return invalfc(fc, "Failed loading latest UTF-8 version");
> +		else
> +			return invalfc(fc, "Failed loading UTF-8 version: %s", version_str);

The following covers both legs (untested):

if (IS_ERR(encoding))
	return invalfc(fc, "Failed loading UTF-8 version: utf8-%u.%u.%u\n",
		       unicode_maj(version), unicode_min(version), unicode_rev(version));

> +	if (latest_version)
> +		pr_info("tmpfs: Using the latest UTF-8 version available");
> +	else
> +		pr_info("tmpfs: Using encoding provided by mount
> options: %s\n", param->string);

The following covers both legs (untested):

pr_info("tmpfs: Using encoding: utf8-%u.%u.%u\n",
	unicode_maj(version), unicode_min(version), unicode_rev(version));
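
The reason one message can cover both legs is that the packed version
already carries everything needed to print the name.  A minimal
userspace sketch, assuming the major number sits in bits 16-23, the
minor in bits 8-15 and the revision in bits 0-7; sketch_maj(),
sketch_min(), sketch_rev() and format_encoding_name() are hypothetical
names, not kernel helpers:

```c
#include <assert.h>
#include <stdio.h>

/* Assumed field layout of the packed version; for illustration only. */
static unsigned int sketch_maj(unsigned int v) { return (v >> 16) & 0xff; }
static unsigned int sketch_min(unsigned int v) { return (v >> 8) & 0xff; }
static unsigned int sketch_rev(unsigned int v) { return v & 0xff; }

/*
 * Hypothetical helper: writes "utf8-<maj>.<min>.<rev>" into buf and
 * returns the snprintf() result, i.e. the length the full string needs.
 */
static int format_encoding_name(char *buf, size_t len, unsigned int version)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "utf8-%u.%u.%u",
			sketch_maj(version), sketch_min(version),
			sketch_rev(version));
}
```

The same formatting works whether the version came from the mount
option or from the default, so the latest_version special case is not
needed for either message.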

> +
> +	ctx->encoding = encoding;
> +
> +	return 0;
> +}
> +#else
> +static int shmem_parse_opt_casefold(struct fs_context *fc, struct fs_parameter *param,
> +				    bool latest_version)
> +{
> +	return invalfc(fc, "tmpfs: No kernel support for casefold filesystems\n");
> +}

A message like "Kernel not built with CONFIG_UNICODE" immediately tells
you how to fix it.

> @@ -4515,6 +4610,16 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
>  	}
>  	sb->s_export_op = &shmem_export_ops;
>  	sb->s_flags |= SB_NOSEC | SB_I_VERSION;
> +
> +#if IS_ENABLED(CONFIG_UNICODE)
> +	if (ctx->encoding) {
> +		sb->s_encoding = ctx->encoding;
> +		generic_set_sb_d_ops(sb);

This is the right place for setting d_ops (see the next comment), but you
should be loading generic_ci_always_del_dentry_ops, right?

Also, since generic_ci_always_del_dentry_ops is only used here, can you
move it to this file?

> +static struct dentry *shmem_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
> +{
> +	const struct dentry_operations *d_ops = &simple_dentry_operations;
> +
> +#if IS_ENABLED(CONFIG_UNICODE)
> +	if (dentry->d_sb->s_encoding)
> +		d_ops = &generic_ci_always_del_dentry_ops;
> +#endif

This needs to be done at mount time through sb->s_d_op. See

https://lore.kernel.org/all/20240221171412.10710-1-krisman@suse.de/

I suppose we can do it at mount-time for
generic_ci_always_del_dentry_ops and simple_dentry_operations.

> +
> +	if (dentry->d_name.len > NAME_MAX)
> +		return ERR_PTR(-ENAMETOOLONG);
> +
> +	if (!dentry->d_sb->s_d_op)
> +		d_set_d_op(dentry, d_ops);
> +
> +	/*
> +	 * For now, VFS can't deal with case-insensitive negative dentries, so
> +	 * we prevent them from being created
> +	 */
> +	if (IS_ENABLED(CONFIG_UNICODE) && IS_CASEFOLDED(dir))
> +		return NULL;

Thinking out loud:

I misunderstood always_delete_dentry before.  It removes negative
dentries right after the lookup, since ->d_delete is called on dput.

But you still need this check here, IMO, to prevent the negative dentry
from ever being hashed.  Otherwise it can be found by a concurrent
lookup.  And you cannot drop ->d_delete from the case-insensitive
operations either, because we still want it for !IS_CASEFOLDED(dir).

The window is that, without this code, the negative dentry would be
hashed in d_add() and a concurrent lookup might find it between that
time and the dput() that removes it at the end of the lookup that
created it.

All of this would hopefully go away with negative dentry support for
casefolded directories.

> +
> +	d_add(dentry, NULL);
> +
> +	return NULL;
> +}

The sole reason you are doing this custom function is to exclude negative
dentries from casefolded directories. I doubt we care about the extra
check being done.  Can we just do it in simple_lookup?

> +
>  static const struct inode_operations shmem_dir_inode_operations = {
>  #ifdef CONFIG_TMPFS
>  	.getattr	= shmem_getattr,
>  	.create		= shmem_create,
> -	.lookup		= simple_lookup,
> +	.lookup		= shmem_lookup,
>  	.link		= shmem_link,
>  	.unlink		= shmem_unlink,
>  	.symlink	= shmem_symlink,
> @@ -4791,6 +4923,8 @@ int shmem_init_fs_context(struct fs_context *fc)
>  	ctx->uid = current_fsuid();
>  	ctx->gid = current_fsgid();
>  
> +	ctx->encoding = NULL;
> +
>  	fc->fs_private = ctx;
>  	fc->ops = &shmem_fs_context_ops;
>  	return 0;

-- 
Gabriel Krisman Bertazi

Thread overview:
2024-09-05 19:02 [PATCH v3 0/9] tmpfs: Add case-insensitive support for tmpfs André Almeida
2024-09-05 19:02 ` [PATCH v3 1/9] libfs: Create the helper function generic_ci_validate_strict_name() André Almeida
2024-09-05 19:02 ` [PATCH v3 2/9] ext4: Use generic_ci_validate_strict_name helper André Almeida
2024-09-05 19:02 ` [PATCH v3 3/9] unicode: Recreate utf8_parse_version() André Almeida
2024-09-05 19:02 ` [PATCH v3 4/9] unicode: Export latest available UTF-8 version number André Almeida
2024-09-05 19:57   ` Gabriel Krisman Bertazi
2024-09-05 19:02 ` [PATCH v3 5/9] libfs: Create the helper struct generic_ci_always_del_dentry_ops André Almeida
2024-09-05 19:02 ` [PATCH v3 6/9] tmpfs: Add casefold lookup support André Almeida
2024-09-05 21:28   ` Gabriel Krisman Bertazi [this message]
2024-09-06 14:59     ` André Almeida
2024-09-09 14:15       ` Gabriel Krisman Bertazi
2024-09-05 19:02 ` [PATCH v3 7/9] tmpfs: Add flag FS_CASEFOLD_FL support for tmpfs dirs André Almeida
2024-09-05 19:02 ` [PATCH v3 8/9] tmpfs: Expose filesystem features via sysfs André Almeida
2024-09-05 19:02 ` [PATCH v3 9/9] docs: tmpfs: Add casefold options André Almeida
2024-09-05 19:48   ` Gabriel Krisman Bertazi
