From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A445CE7A9F for ; Thu, 5 Sep 2024 21:29:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 031BB6B0085; Thu, 5 Sep 2024 17:29:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F23F66B008C; Thu, 5 Sep 2024 17:29:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DEADC6B0093; Thu, 5 Sep 2024 17:29:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id C18756B0085 for ; Thu, 5 Sep 2024 17:29:08 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 39B9640B12 for ; Thu, 5 Sep 2024 21:29:08 +0000 (UTC) X-FDA: 82531975176.27.55813C9 Received: from relay7-d.mail.gandi.net (relay7-d.mail.gandi.net [217.70.183.200]) by imf05.hostedemail.com (Postfix) with ESMTP id 1D98110000D for ; Thu, 5 Sep 2024 21:29:04 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=krisman.be header.s=gm1 header.b=mzg3OvN7; dmarc=none; spf=pass (imf05.hostedemail.com: domain of gabriel@krisman.be designates 217.70.183.200 as permitted sender) smtp.mailfrom=gabriel@krisman.be ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1725571720; a=rsa-sha256; cv=none; b=iZ+9/r218MB++5+aw+Y8/5gJHtm3hxD1YbjkNjNveyF9S8zsb4J2sl1ThX40/vfShANfyv 1zOKTZSFSBKhtfVgY59QTAV8q8B0BEI8DZz/yap3xeIRzPIH0yPzkLTkXgFdEG19m840Un 3Vb3eW/Ssa+85PHc2CKsTEglb72Ep8k= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=krisman.be header.s=gm1 header.b=mzg3OvN7; dmarc=none; spf=pass (imf05.hostedemail.com: domain of gabriel@krisman.be designates 217.70.183.200 as permitted sender) smtp.mailfrom=gabriel@krisman.be ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1725571720; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=NDsRbYbLqFB4B0nmxpbXiNrnJ8uSYv3Op+Q13MZehUY=; b=NOL3AObZ2fbPyiw7CHDG1IE08SRPAL8g4UzLep0Wx/iJLIRGbdUvd4IXo3O7JXkiCoAL8k moCuQHnNmMQtQYTb6X2WoLzVRcEkvc1r2JWBQutnz+bzI1uuaFYx/lEgOwn8AoVZ0qwCur vl6vgiFB7ErjdqmNkLFK8Jq3+eNWnDc= Received: by mail.gandi.net (Postfix) with ESMTPSA id 31B5420003; Thu, 5 Sep 2024 21:29:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=krisman.be; s=gm1; t=1725571743; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NDsRbYbLqFB4B0nmxpbXiNrnJ8uSYv3Op+Q13MZehUY=; b=mzg3OvN7W/aRpX/2o+c4lwP7zBzhXom27cj4rmDf6Pi74rJ5rDXKgJTZ/OGHbBiPeZnUD2 3KJFgrPHOVjusMtEjE601spTHSvPvXrhk7gOB1ylro5zBQ4XKkbr1wjzRoG6ZsH9FVxPLy ll6Hsn7V/2/5R8WnMcc4hF2uil5Xi+ldFPp48Pe9HGhOKbuu0Esc+NhoPcYJcKYi5GoVaM jCsrw9RS+Zvoq2Dl9Gp7VmfryRD7he4O+np1qSqvu89fLcAu4Z+CP4bouaABEGKSjYyVJI EWsHMBRoFdgJbC+SPJmuXI7qd0jH23nniOzI8MPgQ/VbG1zIXwadqLVFfFTV6A== From: Gabriel Krisman Bertazi To: =?utf-8?Q?Andr=C3=A9?= Almeida Cc: Hugh Dickins , Andrew Morton , Alexander Viro , Christian Brauner , Jan Kara , krisman@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, kernel-dev@igalia.com, Daniel Rosenberg , smcv@collabora.com, Christoph Hellwig , Theodore Ts'o Subject: Re: [PATCH v3 6/9] tmpfs: Add casefold lookup support In-Reply-To: <20240905190252.461639-7-andrealmeid@igalia.com> (=?utf-8?Q?=22Andr=C3=A9?= Almeida"'s message of "Thu, 5 Sep 2024 16:02:49 -0300") References: <20240905190252.461639-1-andrealmeid@igalia.com> <20240905190252.461639-7-andrealmeid@igalia.com> Date: Thu, 05 Sep 2024 17:28:53 -0400 Message-ID: <87zfoln622.fsf@mailhost.krisman.be> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-GND-Sasl: gabriel@krisman.be X-Rspam-User: X-Rspamd-Queue-Id: 1D98110000D X-Rspamd-Server: rspam01 X-Stat-Signature: yfh9kj5pn1scduf5dg7onc1okagqg7jf X-HE-Tag: 1725571744-253456 X-HE-Meta: U2FsdGVkX1/bbzsWRyTOBWBBJtabYLSnpK9eoxs2vBIAPDuL0iHpkwNFrAp0IAjHa3RpYY/iD12FsGO+uvlqahBhUSOmpHh+DiJqTXsEJacYDD0ZvFzdlt8f5yb3DNZAv4Vp2SIenT7TQba2rXa0StXCO5dcFMH3u6358tNC/msva5yis0qu5OyarnKQX9cEirH15a/rK8phxRcS/gBARycg66NYCuqDTtduObUOU8H5l/xHPH0xdzvvdUb2SJKTIoGIpjGsA/zq+QYjzJuVt/fPjtgCoTXj9Srk1F/VATaDoEFFVVl7Mfuijt0T2b3EiM1yU1E9YawSDDYzs3p9G2QrI/tb9M9pnORvTZcWkBj56pfESx2UXZim0XETJuSk6tZsvAeEBvzD/RqWOatpkSdjp94jtfbTqszXDhBXsSUDrFgGgs5JhM8MnHrjLCwpAX6X7bh3CNlSCafh3W54o7NA+v9kymDdYRJiNjRtd23TZldPNrLgAhHEhCUJZDpnacGnPaSaLg0VMQbSSl0cf27RT9tr98JXq6lJYfq/uMttHA9nZRsE01OamOUDzLQrUJCWl2PCjwmmj/Wp9XOFoQe0PjyQa4zdBF2IJYlhDoiUWQecZQ77jvweOezv4VMvM9p/ROOxd+5zq1OOEvBsvwnygHTtGqipNtXsz8HGVeQvh1ICCRMHg77zgWVaDDqYAPPRM+NX0/5u4YTbqKoQRySGkRLPLVwkYgRifjK/nCaP8tnJUDU2Hh4m9Xuws7nEyLcqIc38g+kIFWHLoHCxqVapmjFOL/Zx8RH8yQ2jcEW+CStJInXc9U/SewKSwiglxtY0/5XBQWtsoL37eCWnF2eqHkx89cCgatZQ8ExYjEdPGU15XkEPMwSa8WcDumLxBOwm7OiH91k9C0k/naFpS7iTq58wWUcL8JQlSTcB2ArpJqhApeHjskurReTwuPg4yBPrVSf0wd3mhVpddtt 0b1a3JnH Ez1nlgSZVnOmqeQN/9CFIbh9JuaIGW38uRjCOT6ZIIr/MUPbW0qozWxGtNzaaMu7Bp5M1XTReTlAaPXXHgvqolu+xWFtQYxZU4WH0oZoNUbPPQV1uEzupf4Ia71OAIERBFEIXYmlO4BoCs3tiMrIY+CTnD2ii9/Jy1mV7jufFHB0YbXhNk/+vFA8Uo1QGOW4GAIWMLM5RE9VJh7minYfXk4nUsSi7G/HC3Z2qC0W4sI2e1Kt42mdQ1oNI7844ZFCH/BjDYAq8T5Nxnd/PTNDmb0splm91oPqkC2whCO/Orsy9MkTvXlVFHcGFl/GGgTROhm1r X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi, Andr=C3=A9 Almeida writes: > @@ -3427,6 +3431,10 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode = *dir, > if (IS_ERR(inode)) > return PTR_ERR(inode); >=20=20 > + if (IS_ENABLED(CONFIG_UNICODE)) > + if (!generic_ci_validate_strict_name(dir, &dentry->d_name)) > + return -EINVAL; > + if (IS_ENABLED(CONFIG_UNICODE) && generic_ci_validate_strict_name(dir, &dentry->d_name)) > static const struct constant_table shmem_param_enums_huge[] =3D { > @@ -4081,9 +4111,62 @@ const struct fs_parameter_spec shmem_fs_parameters= [] =3D { > fsparam_string("grpquota_block_hardlimit", Opt_grpquota_block_hardlimit= ), > fsparam_string("grpquota_inode_hardlimit", Opt_grpquota_inode_hardlimit= ), > #endif > + fsparam_string("casefold", Opt_casefold_version), > + fsparam_flag ("casefold", Opt_casefold), > + fsparam_flag ("strict_encoding", Opt_strict_encoding), I don't know if it is possible, but can we do it with a single parameter? > +static int shmem_parse_opt_casefold(struct fs_context *fc, struct fs_par= ameter *param, > + bool latest_version) Instead of the boolean, can't you check if param->string !=3D NULL? (real question, I never used fs_parameter. > +{ > + struct shmem_options *ctx =3D fc->fs_private; > + unsigned int maj =3D 0, min =3D 0, rev =3D 0, version =3D 0; > + struct unicode_map *encoding; > + char *version_str =3D param->string + 5; > + int ret; unsigned int version =3D UTF8_LATEST; and kill the if/else below: > + > + if (latest_version) { > + version =3D UTF8_LATEST; > + } else { > + if (strncmp(param->string, "utf8-", 5)) > + return invalfc(fc, "Only UTF-8 encodings are supported " > + "in the format: utf8-"); > + > + ret =3D utf8_parse_version(version_str, &maj, &min, &rev); utf8_parse_version interface could return UNICODE_AGE() already, so we hide= the details from the caller. wdyt? > + if (ret) > + return invalfc(fc, "Invalid UTF-8 version: %s", version_str); > + > + version =3D UNICODE_AGE(maj, min, rev); > + } > + > + encoding =3D utf8_load(version); > + > + if (IS_ERR(encoding)) { > + if (latest_version) > + return invalfc(fc, "Failed loading latest UTF-8 version"); > + else > + return invalfc(fc, "Failed loading UTF-8 version: %s", version_str); The following covers both legs (untested): if (IS_ERR(encoding)) return invalfc(fc, "Failed loading UTF-8 version: utf8-%u.%u.%u\n"", unicode_maj(version), unicode_min(version), unicode_rev(version= )); > + if (latest_version) > + pr_info("tmpfs: Using the latest UTF-8 version available"); > + else > + pr_info("tmpfs: Using encoding provided by mount > options: %s\n", param->string); The following covers both legs (untested): pr_info (fc, "tmpfs: Using encoding : utf8-%u.%u.%u\n" unicode_maj(version), unicode_min(version), unicode_rev(version)); > + > + ctx->encoding =3D encoding; > + > + return 0; > +} > +#else > +static int shmem_parse_opt_casefold(struct fs_context *fc, struct fs_par= ameter *param, > + bool latest_version) > +{ > + return invalfc(fc, "tmpfs: No kernel support for casefold filesystems\n= "); > +} A message like "Kernel not built with CONFIG_UNICODE" immediately tells you how to fix it. > @@ -4515,6 +4610,16 @@ static int shmem_fill_super(struct super_block *sb= , struct fs_context *fc) > } > sb->s_export_op =3D &shmem_export_ops; > sb->s_flags |=3D SB_NOSEC | SB_I_VERSION; > + > +#if IS_ENABLED(CONFIG_UNICODE) > + if (ctx->encoding) { > + sb->s_encoding =3D ctx->encoding; > + generic_set_sb_d_ops(sb); This is the right place for setting d_ops (see the next comment), but you should be loading generic_ci_always_del_dentry_ops, right? Also, since generic_ci_always_del_dentry_ops is only used by this one, can you move it to this file? > +static struct dentry *shmem_lookup(struct inode *dir, struct dentry *den= try, unsigned int flags) > +{ > + const struct dentry_operations *d_ops =3D &simple_dentry_operations; > + > +#if IS_ENABLED(CONFIG_UNICODE) > + if (dentry->d_sb->s_encoding) > + d_ops =3D &generic_ci_always_del_dentry_ops; > +#endif This needs to be done at mount time through sb->s_d_op. See https://lore.kernel.org/all/20240221171412.10710-1-krisman@suse.de/ I suppose we can do it at mount-time for generic_ci_always_del_dentry_ops and simple_dentry_operations. > + > + if (dentry->d_name.len > NAME_MAX) > + return ERR_PTR(-ENAMETOOLONG); > + > + if (!dentry->d_sb->s_d_op) > + d_set_d_op(dentry, d_ops); > + > + /* > + * For now, VFS can't deal with case-insensitive negative dentries, so > + * we prevent them from being created > + */ > + if (IS_ENABLED(CONFIG_UNICODE) && IS_CASEFOLDED(dir)) > + return NULL; Thinking out loud: I misunderstood always_delete_dentry before. It removes negative dentries right after the lookup, since ->d_delete is called on dput. But you still need this check here, IMO, to prevent the negative dentry from ever being hashed. Otherwise it can be found by a concurrent lookup. And you cannot drop ->d_delete from the case-insensitive operations too, because we still wants it for !IS_CASEFOLDED(dir). The window is that, without this code, the negative dentry dentry would be hashed in d_add() and a concurrent lookup might find it between that time and the d_put, where it is removed at the end of the concurrent lookup. All of this would hopefully go away with the negative dentry for casefolded directories. > + > + d_add(dentry, NULL); > + > + return NULL; > +} The sole reason you are doing this custom function is to exclude negative dentries from casefolded directories. I doubt we care about the extra check being done. Can we just do it in simple_lookup? > + > static const struct inode_operations shmem_dir_inode_operations =3D { > #ifdef CONFIG_TMPFS > .getattr =3D shmem_getattr, > .create =3D shmem_create, > - .lookup =3D simple_lookup, > + .lookup =3D shmem_lookup, > .link =3D shmem_link, > .unlink =3D shmem_unlink, > .symlink =3D shmem_symlink, > @@ -4791,6 +4923,8 @@ int shmem_init_fs_context(struct fs_context *fc) > ctx->uid =3D current_fsuid(); > ctx->gid =3D current_fsgid(); >=20=20 > + ctx->encoding =3D NULL; > + > fc->fs_private =3D ctx; > fc->ops =3D &shmem_fs_context_ops; > return 0; --=20 Gabriel Krisman Bertazi