* [PATCH v6 0/9] ovl: Enable support for casefold layers
@ 2025-08-22 14:17 André Almeida
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
` (9 more replies)
0 siblings, 10 replies; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
Hi all,
We would like to support the usage of casefold layers with overlayfs to
be used with container tools. This use case requires a simple setup,
where every layer will have the same encoding setting (i.e. Unicode
version and flags), using one upper and one lower layer.
* Implementation
When merging layers, ovl uses a red-black tree to check if a given dentry
name from a lower layer already exists in the upper layer. For merging
case-insensitive names, we need to store them in the tree casefolded.
However, when displaying the dentry name to the user, we need to respect
the name chosen when the file was created (e.g. Picture.PNG, instead of
picture.png). To achieve this, I create a new field for cache entries
that stores the casefolded names and a function ovl_casefold() that
produces this name for searching the rb-tree. For composing the layer, ovl
uses the original name, keeping it consistent with whatever name the user
created.
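
The split between the displayed name and the lookup key can be sketched in
userspace as follows. This is only an illustration under stated assumptions:
struct entry, struct dir_cache, fold_ascii(), cache_find() and cache_add() are
hypothetical names, a flat array stands in for the rb-tree, and an ASCII-only
tolower() stands in for utf8_casefold().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16
#define MAX_NAME 64

/* Hypothetical userspace stand-in for struct ovl_cache_entry. */
struct entry {
	char name[MAX_NAME];   /* name as created, shown to the user */
	char c_name[MAX_NAME]; /* casefolded key, used only for lookups */
};

/* A flat array replaces the rb-tree to keep the sketch short. */
struct dir_cache {
	struct entry entries[MAX_ENTRIES];
	size_t count;
};

/* ASCII-only fold; the series uses utf8_casefold() with the sb encoding. */
static void fold_ascii(const char *src, char *dst, size_t size)
{
	size_t i;

	for (i = 0; i + 1 < size && src[i]; i++)
		dst[i] = (char)tolower((unsigned char)src[i]);
	dst[i] = '\0';
}

/* Look an entry up by the casefolded form of @name; NULL if absent. */
static const struct entry *cache_find(const struct dir_cache *c,
				      const char *name)
{
	char key[MAX_NAME];
	size_t i;

	fold_ascii(name, key, sizeof(key));
	for (i = 0; i < c->count; i++)
		if (strcmp(c->entries[i].c_name, key) == 0)
			return &c->entries[i];
	return NULL;
}

/* Returns 1 if added, 0 if a case-insensitive match exists, -1 if full. */
static int cache_add(struct dir_cache *c, const char *name)
{
	struct entry *e;
	size_t len;

	if (cache_find(c, name))
		return 0;
	if (c->count == MAX_ENTRIES)
		return -1;

	e = &c->entries[c->count++];
	len = strlen(name);
	if (len >= MAX_NAME)
		len = MAX_NAME - 1;
	memcpy(e->name, name, len);	/* keep the original spelling */
	e->name[len] = '\0';
	fold_ascii(name, e->c_name, sizeof(e->c_name));
	return 1;
}
```

With this layout, adding "Picture.PNG" from the upper layer and then
"picture.png" from a lower layer yields a single entry whose name field still
reads "Picture.PNG".
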
The rest of the patches are mostly for checking if casefold is being
consistently used across the layers and dropping the mount restrictions
that prevented case-insensitive filesystems from being mounted.
Thanks for the feedback!
---
Changes in v6:
- Change pr_warn_ratelimited() message for ovl_create_real()
- Fixed kernel bot warning: "unused variable 'ofs'"
- Last version was using `strncmp(... tmp->len)` which was causing
xfstests regressions. It should be `strncmp(... len)`.
- Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
- Remove needless kfree(cf_name)
- Fix mounting layers without casefold enabled in ovl_dentry_weird()
v5: https://lore.kernel.org/r/20250814-tonyk-overlayfs-v5-0-c5b80a909cbd@igalia.com
Changes in v5:
- Reordered commits. libfs commits come earlier in the series
- First ovl commit just prepares and creates ofs->casefold. The proper
enablement is done in the last commit
- Rework ovl_casefold() consumer/free buffer logic out to the caller
- Replace `const char *aux` with `const char *c_name`
- Add pr_warn_ratelimited() for ovl_create_real() error
- Replace "filesystems" with "layers" in the commit messages
- Add "Testing" section to cover letter
v4: https://lore.kernel.org/r/20250813-tonyk-overlayfs-v4-0-357ccf2e12ad@igalia.com
Changes in v4:
- Split patch "ovl: Support case-insensitive lookup" and move patch that
creates ofs->casefold to the beginning of the series
- Merge patch "Store casefold name..." and "Create ovl_casefold()..."
- Make encoding restrictions apply just when casefold is enabled
- Rework set_d_op() with new helper
- Set encoding and encoding flags inside of ovl_get_layers()
- Rework how inode flags are set and checked
v3: https://lore.kernel.org/r/20250808-tonyk-overlayfs-v3-0-30f9be426ba8@igalia.com
Changes in v3:
- Rebased on top of vfs-6.18.misc branch
- Added more guards for casefolding things inside of IS_ENABLED(UNICODE)
- Refactor the strncmp() patch to do a single kmalloc() per rb_tree operation
- Instead of casefolding the cache entry name every time per strncmp(),
casefold it once and reuse it for every strncmp().
- Created ovl_dentry_ci_operations to not override dentry ops set by
ovl_dentry_operations
- Instead of setting encoding just when there's an upper layer, set it
for any first layer (ofs->fs[0].sb), regardless of it being upper or
not.
- Rewrote the patch that set inode flags
- Check if every dentry is consistent with the root dentry regarding
casefold
v2: https://lore.kernel.org/r/20250805-tonyk-overlayfs-v2-0-0e54281da318@igalia.com
Changes in v2:
- Almost a full rewrite of v1.
v1: https://lore.kernel.org/lkml/20250409-tonyk-overlayfs-v1-0-3991616fe9a3@igalia.com/
---
André Almeida (9):
fs: Create sb_encoding() helper
fs: Create sb_same_encoding() helper
ovl: Prepare for mounting case-insensitive enabled layers
ovl: Create ovl_casefold() to support casefolded strncmp()
ovl: Ensure that all layers have the same encoding
ovl: Set case-insensitive dentry operations for ovl sb
ovl: Add S_CASEFOLD as part of the inode flag to be copied
ovl: Check for casefold consistency when creating new dentries
ovl: Support mounting case-insensitive enabled layers
fs/overlayfs/copy_up.c | 2 +-
fs/overlayfs/dir.c | 7 +++
fs/overlayfs/inode.c | 1 +
fs/overlayfs/namei.c | 17 +++----
fs/overlayfs/overlayfs.h | 8 ++--
fs/overlayfs/ovl_entry.h | 1 +
fs/overlayfs/params.c | 15 +++++--
fs/overlayfs/params.h | 1 +
fs/overlayfs/readdir.c | 113 +++++++++++++++++++++++++++++++++++++++--------
fs/overlayfs/super.c | 51 +++++++++++++++++++++
fs/overlayfs/util.c | 10 +++--
include/linux/fs.h | 27 ++++++++++-
12 files changed, 213 insertions(+), 40 deletions(-)
---
base-commit: 3b7f28e441a100531fa9eff20e011a42376ca7d5
change-id: 20250409-tonyk-overlayfs-591f5e4d407a
Best regards,
--
André Almeida <andrealmeid@igalia.com>
* [PATCH v6 1/9] fs: Create sb_encoding() helper
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-25 9:19 ` Gabriel Krisman Bertazi
2025-08-25 12:38 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 2/9] fs: Create sb_same_encoding() helper André Almeida
` (8 subsequent siblings)
9 siblings, 2 replies; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
Filesystems that need to deal with the super block encoding need to use
an #if IS_ENABLED(CONFIG_UNICODE) guard around it because this struct
member is not declared otherwise. In order to move these #if/#endif guards
outside of the filesystem code and make it simpler, create a new function
that returns the s_encoding member of struct super_block if Unicode is
enabled, and returns NULL otherwise.
Suggested-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
include/linux/fs.h | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e1d4fef5c181d291a7c685e5897b2c018df439ae..a4d353a871b094b562a87ddcffe8336a26c5a3e2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3733,15 +3733,20 @@ static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qst
}
#endif
-static inline bool sb_has_encoding(const struct super_block *sb)
+static inline struct unicode_map *sb_encoding(const struct super_block *sb)
{
#if IS_ENABLED(CONFIG_UNICODE)
- return !!sb->s_encoding;
+ return sb->s_encoding;
#else
- return false;
+ return NULL;
#endif
}
+static inline bool sb_has_encoding(const struct super_block *sb)
+{
+ return !!sb_encoding(sb);
+}
+
int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
unsigned int ia_valid);
int setattr_prepare(struct mnt_idmap *, struct dentry *, struct iattr *);
--
2.50.1
* [PATCH v6 2/9] fs: Create sb_same_encoding() helper
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-23 10:02 ` Amir Goldstein
2025-08-25 9:24 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers André Almeida
` (7 subsequent siblings)
9 siblings, 2 replies; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
For cases where a file lookup can look in different filesystems (like in
overlayfs), both super blocks must have the same encoding and the same
flags. To help with that, create a sb_same_encoding() function.
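
The comparison rule can be exercised in userspace with reduced stand-ins for
the kernel structures. This is a sketch, not the kernel code: struct
mock_unicode_map, struct mock_sb and mock_same_encoding() are hypothetical
names that only mirror the logic of the hunk in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for unicode_map and super_block. */
struct mock_unicode_map {
	unsigned int version;
};

struct mock_sb {
	struct mock_unicode_map *s_encoding;
	unsigned short s_encoding_flags;
};

/* Same decision order as sb_same_encoding() in this patch. */
static bool mock_same_encoding(const struct mock_sb *a,
			       const struct mock_sb *b)
{
	/* Covers "both have no encoding" and "both share one map". */
	if (a->s_encoding == b->s_encoding)
		return true;

	/* Otherwise both must have a map, same version, same flags. */
	return a->s_encoding && b->s_encoding &&
	       a->s_encoding->version == b->s_encoding->version &&
	       a->s_encoding_flags == b->s_encoding_flags;
}
```

Two super blocks without any encoding compare as equal, one with and one
without compare as different, and two distinct maps only match when both the
version and the flags agree. Note that, as written, identical map pointers
return early, before the flags are compared.
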
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
include/linux/fs.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a4d353a871b094b562a87ddcffe8336a26c5a3e2..7de9e1e4839a2726f4355ddf20b9babb74cc9681 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3747,6 +3747,24 @@ static inline bool sb_has_encoding(const struct super_block *sb)
return !!sb_encoding(sb);
}
+/*
+ * Compare if two super blocks have the same encoding and flags
+ */
+static inline bool sb_same_encoding(const struct super_block *sb1,
+ const struct super_block *sb2)
+{
+#if IS_ENABLED(CONFIG_UNICODE)
+ if (sb1->s_encoding == sb2->s_encoding)
+ return true;
+
+ return (sb1->s_encoding && sb2->s_encoding &&
+ (sb1->s_encoding->version == sb2->s_encoding->version) &&
+ (sb1->s_encoding_flags == sb2->s_encoding_flags));
+#else
+ return true;
+#endif
+}
+
int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
unsigned int ia_valid);
int setattr_prepare(struct mnt_idmap *, struct dentry *, struct iattr *);
--
2.50.1
* [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
2025-08-22 14:17 ` [PATCH v6 2/9] fs: Create sb_same_encoding() helper André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-25 10:42 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp() André Almeida
` (6 subsequent siblings)
9 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
Prepare for mounting layers with case-insensitive dentries in order to
support such layers in overlayfs, while enforcing uniform casefold
layers.
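
The "first layer decides, later layers must agree" rule can be modeled in a
few lines of userspace C. This is a sketch under assumptions: struct
mount_state and check_layer() are hypothetical names, and the two booleans
only mirror ofs->casefold and ctx->casefold_set from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced mount state for the consistency check. */
struct mount_state {
	bool casefold;     /* mirrors ofs->casefold */
	bool casefold_set; /* mirrors ctx->casefold_set */
};

/* Returns 0 if the layer is accepted, -1 if it disagrees with earlier ones. */
static int check_layer(struct mount_state *st, bool layer_is_casefolded)
{
	/* The first directory seen fixes the expectation for the mount. */
	if (!st->casefold_set) {
		st->casefold = layer_is_casefolded;
		st->casefold_set = true;
	}

	return st->casefold == layer_is_casefolded ? 0 : -1;
}
```

A mount whose first directory is casefolded then accepts further casefolded
directories and rejects a case-sensitive one, and the other way around.
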
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
fs/overlayfs/ovl_entry.h | 1 +
fs/overlayfs/params.c | 15 ++++++++++++---
fs/overlayfs/params.h | 1 +
3 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 4c1bae935ced274f93a0d23fe10d34455e226ec4..1d4828dbcf7ac4ba9657221e601bbf79d970d225 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -91,6 +91,7 @@ struct ovl_fs {
struct mutex whiteout_lock;
/* r/o snapshot of upperdir sb's only taken on volatile mounts */
errseq_t errseq;
+ bool casefold;
};
/* Number of lower layers, not including data-only layers */
diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
index f4e7fff909ac49e2f8c58a76273426c1158a7472..63b7346c5ee1c127a9c33b12c3704aa035ff88cf 100644
--- a/fs/overlayfs/params.c
+++ b/fs/overlayfs/params.c
@@ -276,17 +276,26 @@ static int ovl_mount_dir(const char *name, struct path *path)
static int ovl_mount_dir_check(struct fs_context *fc, const struct path *path,
enum ovl_opt layer, const char *name, bool upper)
{
+ bool is_casefolded = ovl_dentry_casefolded(path->dentry);
struct ovl_fs_context *ctx = fc->fs_private;
+ struct ovl_fs *ofs = fc->s_fs_info;
if (!d_is_dir(path->dentry))
return invalfc(fc, "%s is not a directory", name);
/*
* Allow filesystems that are case-folding capable but deny composing
- * ovl stack from case-folded directories.
+ * ovl stack from inconsistent case-folded directories.
*/
- if (ovl_dentry_casefolded(path->dentry))
- return invalfc(fc, "case-insensitive directory on %s not supported", name);
+ if (!ctx->casefold_set) {
+ ofs->casefold = is_casefolded;
+ ctx->casefold_set = true;
+ }
+
+ if (ofs->casefold != is_casefolded) {
+ return invalfc(fc, "case-%ssensitive directory on %s is inconsistent",
+ is_casefolded ? "in" : "", name);
+ }
if (ovl_dentry_weird(path->dentry))
return invalfc(fc, "filesystem on %s not supported", name);
diff --git a/fs/overlayfs/params.h b/fs/overlayfs/params.h
index c96d939820211ddc63e265670a2aff60d95eec49..ffd53cdd84827cce827e8852f2de545f966ce60d 100644
--- a/fs/overlayfs/params.h
+++ b/fs/overlayfs/params.h
@@ -33,6 +33,7 @@ struct ovl_fs_context {
struct ovl_opt_set set;
struct ovl_fs_context_layer *lower;
char *lowerdir_all; /* user provided lowerdir string */
+ bool casefold_set;
};
int ovl_init_fs_context(struct fs_context *fc);
--
2.50.1
* [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (2 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-22 16:53 ` Amir Goldstein
2025-08-25 11:09 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding André Almeida
` (5 subsequent siblings)
9 siblings, 2 replies; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
To add overlayfs support for casefold layers, create a new function
ovl_casefold(), to be able to do case-insensitive strncmp().
ovl_casefold() allocates a new buffer and stores the casefolded version
of the string in it. If the allocation or the casefold operation fails,
fall back to using the original string.
The case-insensitive name is then used in the rb-tree search/insertion
operation. If the name is found in the rb-tree, the name can be
discarded and the buffer is freed. If the name isn't found, it's then
stored at struct ovl_cache_entry to be used later.
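
The buffer ownership described above can be sketched in userspace as follows.
Everything here is an assumption made for illustration: fold_alloc(),
pick_key() and release_key() are hypothetical helpers, malloc()/free() stand
in for kmalloc()/kfree(), and an ASCII-only tolower() replaces
utf8_casefold().

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical ASCII-only stand-in for ovl_casefold(): returns the folded
 * length and a heap buffer in *dst, or 0 with *dst set to NULL on failure.
 */
static int fold_alloc(const char *str, int len, char **dst)
{
	int i;

	*dst = NULL;
	if (len <= 0)
		return 0;

	*dst = malloc((size_t)len + 1);
	if (!*dst)
		return 0;

	for (i = 0; i < len; i++)
		(*dst)[i] = (char)tolower((unsigned char)str[i]);
	(*dst)[len] = '\0';
	return len;
}

/* Pick the lookup key: the folded buffer if available, else the original. */
static const char *pick_key(const char *name, int len,
			    char **cf_name, int *c_len)
{
	*c_len = fold_alloc(name, len, cf_name);
	if (*c_len <= 0) {
		*c_len = len;
		return name;
	}
	return *cf_name;
}

/*
 * After the tree operation: ret == 1 means a new entry took ownership of
 * the key. Otherwise the caller frees it, but only when it is a separate
 * buffer. Returns 1 when the key was freed here, 0 otherwise.
 */
static int release_key(int ret, const char *name, const char *key)
{
	if (ret != 1 && key != name) {
		free((void *)key);
		return 1;
	}
	return 0;
}
```

The pointer comparison key != name is what distinguishes "a folded copy
exists" from "fell back to the original string", so the fallback path never
frees memory it does not own.
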
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
Changes in v6:
- Last version was using `strncmp(... tmp->len)` which was causing
regressions. It should be `strncmp(... len)`.
- Rename cf_len to c_len
- Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
- Remove needless kfree(cf_name)
---
fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 94 insertions(+), 19 deletions(-)
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -27,6 +27,8 @@ struct ovl_cache_entry {
bool is_upper;
bool is_whiteout;
bool check_xwhiteout;
+ const char *c_name;
+ int c_len;
char name[];
};
@@ -45,6 +47,7 @@ struct ovl_readdir_data {
struct list_head *list;
struct list_head middle;
struct ovl_cache_entry *first_maybe_whiteout;
+ struct unicode_map *map;
int count;
int err;
bool is_upper;
@@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
return rb_entry(n, struct ovl_cache_entry, node);
}
+static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
+{
+ const struct qstr qstr = { .name = str, .len = len };
+ int cf_len;
+
+ if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
+ return 0;
+
+ *dst = kmalloc(NAME_MAX, GFP_KERNEL);
+
+ if (*dst) {
+ cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
+
+ if (cf_len > 0)
+ return cf_len;
+ }
+
+ kfree(*dst);
+ return 0;
+}
+
static bool ovl_cache_entry_find_link(const char *name, int len,
struct rb_node ***link,
struct rb_node **parent)
@@ -79,10 +103,10 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
*parent = *newp;
tmp = ovl_cache_entry_from_node(*newp);
- cmp = strncmp(name, tmp->name, len);
+ cmp = strncmp(name, tmp->c_name, len);
if (cmp > 0)
newp = &tmp->node.rb_right;
- else if (cmp < 0 || len < tmp->len)
+ else if (cmp < 0 || len < tmp->c_len)
newp = &tmp->node.rb_left;
else
found = true;
@@ -101,10 +125,10 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
while (node) {
struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
- cmp = strncmp(name, p->name, len);
+ cmp = strncmp(name, p->c_name, len);
if (cmp > 0)
node = p->node.rb_right;
- else if (cmp < 0 || len < p->len)
+ else if (cmp < 0 || len < p->c_len)
node = p->node.rb_left;
else
return p;
@@ -145,6 +169,7 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
const char *name, int len,
+ const char *c_name, int c_len,
u64 ino, unsigned int d_type)
{
struct ovl_cache_entry *p;
@@ -167,6 +192,14 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
/* Defer check for overlay.whiteout to ovl_iterate() */
p->check_xwhiteout = rdd->in_xwhiteouts_dir && d_type == DT_REG;
+ if (c_name && c_name != name) {
+ p->c_name = c_name;
+ p->c_len = c_len;
+ } else {
+ p->c_name = p->name;
+ p->c_len = len;
+ }
+
if (d_type == DT_CHR) {
p->next_maybe_whiteout = rdd->first_maybe_whiteout;
rdd->first_maybe_whiteout = p;
@@ -174,48 +207,55 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
return p;
}
-static bool ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
- const char *name, int len, u64 ino,
+/* Return 0 for found, 1 for added, <0 for error */
+static int ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
+ const char *name, int len,
+ const char *c_name, int c_len,
+ u64 ino,
unsigned int d_type)
{
struct rb_node **newp = &rdd->root->rb_node;
struct rb_node *parent = NULL;
struct ovl_cache_entry *p;
- if (ovl_cache_entry_find_link(name, len, &newp, &parent))
- return true;
+ if (ovl_cache_entry_find_link(c_name, c_len, &newp, &parent))
+ return 0;
- p = ovl_cache_entry_new(rdd, name, len, ino, d_type);
+ p = ovl_cache_entry_new(rdd, name, len, c_name, c_len, ino, d_type);
if (p == NULL) {
rdd->err = -ENOMEM;
- return false;
+ return -ENOMEM;
}
list_add_tail(&p->l_node, rdd->list);
rb_link_node(&p->node, parent, newp);
rb_insert_color(&p->node, rdd->root);
- return true;
+ return 1;
}
-static bool ovl_fill_lowest(struct ovl_readdir_data *rdd,
+/* Return 0 for found, 1 for added, <0 for error */
+static int ovl_fill_lowest(struct ovl_readdir_data *rdd,
const char *name, int namelen,
+ const char *c_name, int c_len,
loff_t offset, u64 ino, unsigned int d_type)
{
struct ovl_cache_entry *p;
- p = ovl_cache_entry_find(rdd->root, name, namelen);
+ p = ovl_cache_entry_find(rdd->root, c_name, c_len);
if (p) {
list_move_tail(&p->l_node, &rdd->middle);
+ return 0;
} else {
- p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
+ p = ovl_cache_entry_new(rdd, name, namelen, c_name, c_len,
+ ino, d_type);
if (p == NULL)
rdd->err = -ENOMEM;
else
list_add_tail(&p->l_node, &rdd->middle);
}
- return rdd->err == 0;
+ return rdd->err ?: 1;
}
void ovl_cache_free(struct list_head *list)
@@ -223,8 +263,11 @@ void ovl_cache_free(struct list_head *list)
struct ovl_cache_entry *p;
struct ovl_cache_entry *n;
- list_for_each_entry_safe(p, n, list, l_node)
+ list_for_each_entry_safe(p, n, list, l_node) {
+ if (p->c_name != p->name)
+ kfree(p->c_name);
kfree(p);
+ }
INIT_LIST_HEAD(list);
}
@@ -260,12 +303,36 @@ static bool ovl_fill_merge(struct dir_context *ctx, const char *name,
{
struct ovl_readdir_data *rdd =
container_of(ctx, struct ovl_readdir_data, ctx);
+ struct ovl_fs *ofs = OVL_FS(rdd->dentry->d_sb);
+ const char *c_name = NULL;
+ char *cf_name = NULL;
+ int c_len = 0, ret;
+
+ if (ofs->casefold)
+ c_len = ovl_casefold(rdd->map, name, namelen, &cf_name);
+
+ if (c_len <= 0) {
+ c_name = name;
+ c_len = namelen;
+ } else {
+ c_name = cf_name;
+ }
rdd->count++;
if (!rdd->is_lowest)
- return ovl_cache_entry_add_rb(rdd, name, namelen, ino, d_type);
+ ret = ovl_cache_entry_add_rb(rdd, name, namelen, c_name, c_len, ino, d_type);
else
- return ovl_fill_lowest(rdd, name, namelen, offset, ino, d_type);
+ ret = ovl_fill_lowest(rdd, name, namelen, c_name, c_len, offset, ino, d_type);
+
+ /*
+ * If ret == 1, that means that c_name is being used as part of struct
+ * ovl_cache_entry and will be freed at ovl_cache_free(). Otherwise,
+ * c_name was found in the rb-tree so we can free it here.
+ */
+ if (ret != 1 && c_name != name)
+ kfree(c_name);
+
+ return ret >= 0;
}
static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data *rdd)
@@ -357,12 +424,18 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list,
.list = list,
.root = root,
.is_lowest = false,
+ .map = NULL,
};
int idx, next;
const struct ovl_layer *layer;
+ struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
for (idx = 0; idx != -1; idx = next) {
next = ovl_path_next(idx, dentry, &realpath, &layer);
+
+ if (ofs->casefold)
+ rdd.map = sb_encoding(realpath.dentry->d_sb);
+
rdd.is_upper = ovl_dentry_upper(dentry) == realpath.dentry;
rdd.in_xwhiteouts_dir = layer->has_xwhiteouts &&
ovl_dentry_has_xwhiteouts(dentry);
@@ -555,7 +628,7 @@ static bool ovl_fill_plain(struct dir_context *ctx, const char *name,
container_of(ctx, struct ovl_readdir_data, ctx);
rdd->count++;
- p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
+ p = ovl_cache_entry_new(rdd, name, namelen, NULL, 0, ino, d_type);
if (p == NULL) {
rdd->err = -ENOMEM;
return false;
@@ -1023,6 +1096,8 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
del_entry:
list_del(&p->l_node);
+ if (p->c_name != p->name)
+ kfree(p->c_name);
kfree(p);
}
--
2.50.1
* [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (3 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp() André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-25 11:17 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb André Almeida
` (4 subsequent siblings)
9 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
When merging layers from different filesystems with casefold enabled,
all layers should use the same encoding version and have the same flags
to avoid any kind of incompatibility issues.
Also, set the encoding and the encoding flags of the ovl super block to
the ones used by the first valid layer.
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index df85a76597e910d00323018f1d2cd720c5db921d..b1dbd3c79961094d00c7f99cc622e515d544d22f 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -991,6 +991,18 @@ static int ovl_get_data_fsid(struct ovl_fs *ofs)
return ofs->numfs;
}
+/*
+ * Set the ovl sb encoding as the same one used by the first layer
+ */
+static void ovl_set_encoding(struct super_block *sb, struct super_block *fs_sb)
+{
+#if IS_ENABLED(CONFIG_UNICODE)
+ if (sb_has_encoding(fs_sb)) {
+ sb->s_encoding = fs_sb->s_encoding;
+ sb->s_encoding_flags = fs_sb->s_encoding_flags;
+ }
+#endif
+}
static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
struct ovl_fs_context *ctx, struct ovl_layer *layers)
@@ -1024,6 +1036,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
if (ovl_upper_mnt(ofs)) {
ofs->fs[0].sb = ovl_upper_mnt(ofs)->mnt_sb;
ofs->fs[0].is_lower = false;
+
+ if (ofs->casefold)
+ ovl_set_encoding(sb, ofs->fs[0].sb);
}
nr_merged_lower = ctx->nr - ctx->nr_data;
@@ -1083,6 +1098,16 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
l->name = NULL;
ofs->numlayer++;
ofs->fs[fsid].is_lower = true;
+
+ if (ofs->casefold) {
+ if (!ovl_upper_mnt(ofs) && !sb_has_encoding(sb))
+ ovl_set_encoding(sb, ofs->fs[fsid].sb);
+
+ if (!sb_has_encoding(sb) || !sb_same_encoding(sb, mnt->mnt_sb)) {
+ pr_err("all layers must have the same encoding\n");
+ return -EINVAL;
+ }
+ }
}
/*
--
2.50.1
* [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (4 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-25 11:24 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 7/9] ovl: Add S_CASEFOLD as part of the inode flag to be copied André Almeida
` (3 subsequent siblings)
9 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
For filesystems with encoding (i.e. with case-insensitive support), set
the dentry operations for the super block as ovl_dentry_ci_operations.
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
Changes in v6:
- Fix kernel bot warning: unused variable 'ofs'
---
fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index b1dbd3c79961094d00c7f99cc622e515d544d22f..8db4e55d5027cb975fec9b92251f62fe5924af4f 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -161,6 +161,16 @@ static const struct dentry_operations ovl_dentry_operations = {
.d_weak_revalidate = ovl_dentry_weak_revalidate,
};
+#if IS_ENABLED(CONFIG_UNICODE)
+static const struct dentry_operations ovl_dentry_ci_operations = {
+ .d_real = ovl_d_real,
+ .d_revalidate = ovl_dentry_revalidate,
+ .d_weak_revalidate = ovl_dentry_weak_revalidate,
+ .d_hash = generic_ci_d_hash,
+ .d_compare = generic_ci_d_compare,
+};
+#endif
+
static struct kmem_cache *ovl_inode_cachep;
static struct inode *ovl_alloc_inode(struct super_block *sb)
@@ -1332,6 +1342,19 @@ static struct dentry *ovl_get_root(struct super_block *sb,
return root;
}
+static void ovl_set_d_op(struct super_block *sb)
+{
+#if IS_ENABLED(CONFIG_UNICODE)
+ struct ovl_fs *ofs = sb->s_fs_info;
+
+ if (ofs->casefold) {
+ set_default_d_op(sb, &ovl_dentry_ci_operations);
+ return;
+ }
+#endif
+ set_default_d_op(sb, &ovl_dentry_operations);
+}
+
int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
{
struct ovl_fs *ofs = sb->s_fs_info;
@@ -1443,6 +1466,8 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
if (IS_ERR(oe))
goto out_err;
+ ovl_set_d_op(sb);
+
/* If the upper fs is nonexistent, we mark overlayfs r/o too */
if (!ovl_upper_mnt(ofs))
sb->s_flags |= SB_RDONLY;
--
2.50.1
* [PATCH v6 7/9] ovl: Add S_CASEFOLD as part of the inode flag to be copied
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (5 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-22 14:17 ` [PATCH v6 8/9] ovl: Check for casefold consistency when creating new dentries André Almeida
` (2 subsequent siblings)
9 siblings, 0 replies; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
To keep ovl's inodes consistent with their real inodes, create a new
mask for inode file attributes that need to be copied. Add the
S_CASEFOLD flag as part of the flags that need to be copied along with
the other file attributes.
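
The effect of splitting the masks can be checked with a small userspace
sketch. The MOCK_* names and both helper functions are hypothetical, and the
bit positions are illustrative only (the real S_* values are defined in
include/linux/fs.h); only the way the masks nest follows this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; not taken from the kernel headers. */
#define MOCK_S_SYNC      (1u << 0)
#define MOCK_S_NOATIME   (1u << 1)
#define MOCK_S_APPEND    (1u << 2)
#define MOCK_S_IMMUTABLE (1u << 3)
#define MOCK_S_OTHER     (1u << 4)  /* any flag outside the masks */
#define MOCK_S_CASEFOLD  (1u << 15)

/* Same nesting as OVL_PROT/OVL_FATTR/OVL_COPY_I_FLAGS_MASK in this patch. */
#define MOCK_PROT_MASK  (MOCK_S_APPEND | MOCK_S_IMMUTABLE)
#define MOCK_FATTR_MASK (MOCK_PROT_MASK | MOCK_S_SYNC | MOCK_S_NOATIME)
#define MOCK_COPY_MASK  (MOCK_FATTR_MASK | MOCK_S_CASEFOLD)

/* Flags that the ovl inode takes over from the real inode. */
static unsigned int flags_for_ovl_inode(unsigned int real_flags)
{
	return real_flags & MOCK_COPY_MASK;
}

/* Whether the fileattr copy-up branch in copy_up.c would be entered. */
static bool wants_fileattr_copy_up(unsigned int real_flags)
{
	return (real_flags & MOCK_FATTR_MASK) != 0;
}
```

Under these assumptions a casefolded directory with no other attributes has
its casefold bit mirrored on the ovl inode, yet does not enter the fileattr
copy-up branch by itself, which is the reason the copy_up.c hunk switches to
the narrower mask.
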
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
fs/overlayfs/copy_up.c | 2 +-
fs/overlayfs/inode.c | 1 +
fs/overlayfs/overlayfs.h | 8 +++++---
fs/overlayfs/super.c | 1 +
4 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 27396fe63f6d5b36143750443304a1f0856e2f56..66bd43a99d2e8548eecf21699a9a6b97e9454d79 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -670,7 +670,7 @@ static int ovl_copy_up_metadata(struct ovl_copy_up_ctx *c, struct dentry *temp)
if (err)
return err;
- if (inode->i_flags & OVL_COPY_I_FLAGS_MASK &&
+ if (inode->i_flags & OVL_FATTR_I_FLAGS_MASK &&
(S_ISREG(c->stat.mode) || S_ISDIR(c->stat.mode))) {
/*
* Copy the fileattr inode flags that are the source of already
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index ecb9f2019395ecd01a124ad029375b1a1d13ebb5..aaa4cf579561299c50046f5ded03d93f056c370c 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -1277,6 +1277,7 @@ struct inode *ovl_get_inode(struct super_block *sb,
}
ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
ovl_inode_init(inode, oip, ino, fsid);
+ WARN_ON_ONCE(!!IS_CASEFOLDED(inode) != ofs->casefold);
if (upperdentry && ovl_is_impuredir(sb, upperdentry))
ovl_set_flag(OVL_IMPURE, inode);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index bb0d7ded8e763a4a7a6fc506d966ed2f3bdb4f06..50d550dd1b9d7841723880da85359e735bfc9277 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -821,10 +821,12 @@ struct inode *ovl_get_inode(struct super_block *sb,
struct ovl_inode_params *oip);
void ovl_copyattr(struct inode *to);
+/* vfs fileattr flags read from overlay.protattr xattr to ovl inode */
+#define OVL_PROT_I_FLAGS_MASK (S_APPEND | S_IMMUTABLE)
+/* vfs fileattr flags copied from real to ovl inode */
+#define OVL_FATTR_I_FLAGS_MASK (OVL_PROT_I_FLAGS_MASK | S_SYNC | S_NOATIME)
/* vfs inode flags copied from real to ovl inode */
-#define OVL_COPY_I_FLAGS_MASK (S_SYNC | S_NOATIME | S_APPEND | S_IMMUTABLE)
-/* vfs inode flags read from overlay.protattr xattr to ovl inode */
-#define OVL_PROT_I_FLAGS_MASK (S_APPEND | S_IMMUTABLE)
+#define OVL_COPY_I_FLAGS_MASK (OVL_FATTR_I_FLAGS_MASK | S_CASEFOLD)
/*
* fileattr flags copied from lower to upper inode on copy up.
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 8db4e55d5027cb975fec9b92251f62fe5924af4f..f5fce0a67ed5ea4de56462cab56f82ba7a020c84 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1335,6 +1335,7 @@ static struct dentry *ovl_get_root(struct super_block *sb,
ovl_dentry_set_flag(OVL_E_CONNECTED, root);
ovl_set_upperdata(d_inode(root));
ovl_inode_init(d_inode(root), &oip, ino, fsid);
+ WARN_ON(!!IS_CASEFOLDED(d_inode(root)) != ofs->casefold);
ovl_dentry_init_flags(root, upperdentry, oe, DCACHE_OP_WEAK_REVALIDATE);
/* root keeps a reference of upperdentry */
dget(upperdentry);
--
2.50.1
* [PATCH v6 8/9] ovl: Check for casefold consistency when creating new dentries
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (6 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 7/9] ovl: Add S_CASEFOLD as part of the inode flag to be copied André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-22 14:17 ` [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers André Almeida
2025-08-22 19:28 ` [syzbot ci] Re: ovl: Enable support for casefold layers syzbot ci
9 siblings, 0 replies; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
In an overlayfs with casefold enabled, all new dentries should have
casefold enabled as well. Check this at ovl_create_real().
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
Changes from v5:
- Change pr_warn message
---
fs/overlayfs/dir.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 70b8687dc45e8e33079c865ae302ac58464224a6..fc1116f36a30e7217939b087435955e18a40ad2e 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -187,6 +187,13 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct dentry *parent,
/* mkdir is special... */
newdentry = ovl_do_mkdir(ofs, dir, newdentry, attr->mode);
err = PTR_ERR_OR_ZERO(newdentry);
+ /* expect to inherit casefolding from workdir/upperdir */
+ if (!err && ofs->casefold != ovl_dentry_casefolded(newdentry)) {
+ pr_warn_ratelimited("wrong inherited casefold (%pd2)\n",
+ newdentry);
+ dput(newdentry);
+ err = -EINVAL;
+ }
break;
case S_IFCHR:
--
2.50.1
* [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (7 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 8/9] ovl: Check for casefold consistency when creating new dentries André Almeida
@ 2025-08-22 14:17 ` André Almeida
2025-08-22 16:34 ` Amir Goldstein
2025-08-22 19:28 ` [syzbot ci] Re: ovl: Enable support for casefold layers syzbot ci
9 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-22 14:17 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein, Theodore Tso,
Gabriel Krisman Bertazi
Cc: linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev, André Almeida
Drop the restriction for casefold dentries lookup to enable support for
case-insensitive layers in overlayfs.
Support case-insensitive layers with the condition that they should be
uniformly enabled across the stack (i.e. if the root mount dir has
casefold enabled, so should all the dirs below for every layer).
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
Changes from v5:
- Fix mounting layers without casefold flag
---
fs/overlayfs/namei.c | 17 +++++++++--------
fs/overlayfs/util.c | 10 ++++++----
2 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 76d6248b625e7c58e09685e421aef616aadea40a..e93bcc5727bcafdc18a499b47a7609fd41ecaec8 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -239,13 +239,14 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
char val;
/*
- * We allow filesystems that are case-folding capable but deny composing
- * ovl stack from case-folded directories. If someone has enabled case
- * folding on a directory on underlying layer, the warranty of the ovl
- * stack is voided.
+ * We allow filesystems that are case-folding capable as long as the
+ * layers are consistently enabled in the stack, enabled for every dir
+ * or disabled in all dirs. If someone has modified case folding on a
+ * directory on underlying layer, the warranty of the ovl stack is
+ * voided.
*/
- if (ovl_dentry_casefolded(base)) {
- warn = "case folded parent";
+ if (ofs->casefold != ovl_dentry_casefolded(base)) {
+ warn = "parent wrong casefold";
err = -ESTALE;
goto out_warn;
}
@@ -259,8 +260,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
goto out_err;
}
- if (ovl_dentry_casefolded(this)) {
- warn = "case folded child";
+ if (ofs->casefold != ovl_dentry_casefolded(this)) {
+ warn = "child wrong casefold";
err = -EREMOTE;
goto out_warn;
}
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index a33115e7384c129c543746326642813add63f060..52582b1da52598fbb14866f8c33eb27e36adda36 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -203,6 +203,8 @@ void ovl_dentry_init_flags(struct dentry *dentry, struct dentry *upperdentry,
bool ovl_dentry_weird(struct dentry *dentry)
{
+ struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
+
if (!d_can_lookup(dentry) && !d_is_file(dentry) && !d_is_symlink(dentry))
return true;
@@ -210,11 +212,11 @@ bool ovl_dentry_weird(struct dentry *dentry)
return true;
/*
- * Allow filesystems that are case-folding capable but deny composing
- * ovl stack from case-folded directories.
+ * Exceptionally for layers with casefold, we accept that they have
+ * their own hash and compare operations
*/
- if (sb_has_encoding(dentry->d_sb))
- return IS_CASEFOLDED(d_inode(dentry));
+ if (ofs->casefold)
+ return false;
return dentry->d_flags & (DCACHE_OP_HASH | DCACHE_OP_COMPARE);
}
--
2.50.1
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-22 14:17 ` [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers André Almeida
@ 2025-08-22 16:34 ` Amir Goldstein
2025-08-22 16:47 ` André Almeida
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-22 16:34 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Fri, Aug 22, 2025 at 4:17 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> Drop the restriction for casefold dentries lookup to enable support for
> case-insensitive layers in overlayfs.
>
> Support case-insensitive layers with the condition that they should be
> uniformly enabled across the stack (i.e. if the root mount dir has
> casefold enabled, so should all the dirs below for every layer).
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> Changes from v5:
> - Fix mounting layers without casefold flag
> ---
> fs/overlayfs/namei.c | 17 +++++++++--------
> fs/overlayfs/util.c | 10 ++++++----
> 2 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> index 76d6248b625e7c58e09685e421aef616aadea40a..e93bcc5727bcafdc18a499b47a7609fd41ecaec8 100644
> --- a/fs/overlayfs/namei.c
> +++ b/fs/overlayfs/namei.c
> @@ -239,13 +239,14 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
> char val;
>
> /*
> - * We allow filesystems that are case-folding capable but deny composing
> - * ovl stack from case-folded directories. If someone has enabled case
> - * folding on a directory on underlying layer, the warranty of the ovl
> - * stack is voided.
> + * We allow filesystems that are case-folding capable as long as the
> + * layers are consistently enabled in the stack, enabled for every dir
> + * or disabled in all dirs. If someone has modified case folding on a
> + * directory on underlying layer, the warranty of the ovl stack is
> + * voided.
> */
> - if (ovl_dentry_casefolded(base)) {
> - warn = "case folded parent";
> + if (ofs->casefold != ovl_dentry_casefolded(base)) {
> + warn = "parent wrong casefold";
> err = -ESTALE;
> goto out_warn;
> }
> @@ -259,8 +260,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
> goto out_err;
> }
>
> - if (ovl_dentry_casefolded(this)) {
> - warn = "case folded child";
> + if (ofs->casefold != ovl_dentry_casefolded(this)) {
> + warn = "child wrong casefold";
> err = -EREMOTE;
> goto out_warn;
> }
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index a33115e7384c129c543746326642813add63f060..52582b1da52598fbb14866f8c33eb27e36adda36 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -203,6 +203,8 @@ void ovl_dentry_init_flags(struct dentry *dentry, struct dentry *upperdentry,
>
> bool ovl_dentry_weird(struct dentry *dentry)
> {
> + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
> +
> if (!d_can_lookup(dentry) && !d_is_file(dentry) && !d_is_symlink(dentry))
> return true;
>
> @@ -210,11 +212,11 @@ bool ovl_dentry_weird(struct dentry *dentry)
> return true;
>
> /*
> - * Allow filesystems that are case-folding capable but deny composing
> - * ovl stack from case-folded directories.
> + * Exceptionally for layers with casefold, we accept that they have
> + * their own hash and compare operations
> */
> - if (sb_has_encoding(dentry->d_sb))
> - return IS_CASEFOLDED(d_inode(dentry));
> + if (ofs->casefold)
> + return false;
I think this is better as:
if (sb_has_encoding(dentry->d_sb))
return false;
I don't think there is a reason to test ofs->casefold here.
A "weird" dentry is one that overlayfs doesn't know how to
handle. Now it knows how to handle dentries with hash()/compare()
on casefolding capable filesystems.
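To illustrate the suggestion, here is a minimal user-space sketch of the
last part of the check. It is only a model: struct model_sb,
struct model_dentry and model_dentry_weird() are hypothetical stand-ins
for the kernel objects, the earlier checks in ovl_dentry_weird() are left
out, and has_encoding merely mirrors what sb_has_encoding() would report
for the real layer's superblock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real layer's super_block. */
struct model_sb {
	bool has_encoding;	/* models sb_has_encoding(sb) */
};

/* Hypothetical stand-in for a dentry that belongs to a layer. */
struct model_dentry {
	const struct model_sb *sb;	/* superblock of the layer fs */
	bool has_hash_op;		/* models DCACHE_OP_HASH */
	bool has_compare_op;		/* models DCACHE_OP_COMPARE */
};

/*
 * Casefold-capable filesystems are accepted even though they install
 * their own hash/compare operations. Any other dentry with custom
 * hash/compare operations is still treated as weird.
 */
static bool model_dentry_weird(const struct model_dentry *d)
{
	assert(d && d->sb);

	if (d->sb->has_encoding)
		return false;

	return d->has_hash_op || d->has_compare_op;
}
```

The point of the sketch is that the decision depends only on the layer's
own superblock, so nothing overlay-specific has to be looked up from the
dentry that is being checked.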
Can you please push v6 after this fix to your gitlab branch?
Thanks,
Amir.
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-22 16:34 ` Amir Goldstein
@ 2025-08-22 16:47 ` André Almeida
2025-08-22 19:17 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-22 16:47 UTC (permalink / raw)
To: Amir Goldstein
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
Em 22/08/2025 13:34, Amir Goldstein escreveu:
> On Fri, Aug 22, 2025 at 4:17 PM André Almeida <andrealmeid@igalia.com> wrote:
>>
>> Drop the restriction for casefold dentries lookup to enable support for
>> case-insensitive layers in overlayfs.
>>
>> Support case-insensitive layers with the condition that they should be
>> uniformly enabled across the stack (i.e. if the root mount dir has
>> casefold enabled, so should all the dirs below for every layer).
>>
>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> ---
>> Changes from v5:
>> - Fix mounting layers without casefold flag
>> ---
>> fs/overlayfs/namei.c | 17 +++++++++--------
>> fs/overlayfs/util.c | 10 ++++++----
>> 2 files changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
>> index 76d6248b625e7c58e09685e421aef616aadea40a..e93bcc5727bcafdc18a499b47a7609fd41ecaec8 100644
>> --- a/fs/overlayfs/namei.c
>> +++ b/fs/overlayfs/namei.c
>> @@ -239,13 +239,14 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
>> char val;
>>
>> /*
>> - * We allow filesystems that are case-folding capable but deny composing
>> - * ovl stack from case-folded directories. If someone has enabled case
>> - * folding on a directory on underlying layer, the warranty of the ovl
>> - * stack is voided.
>> + * We allow filesystems that are case-folding capable as long as the
>> + * layers are consistently enabled in the stack, enabled for every dir
>> + * or disabled in all dirs. If someone has modified case folding on a
>> + * directory on underlying layer, the warranty of the ovl stack is
>> + * voided.
>> */
>> - if (ovl_dentry_casefolded(base)) {
>> - warn = "case folded parent";
>> + if (ofs->casefold != ovl_dentry_casefolded(base)) {
>> + warn = "parent wrong casefold";
>> err = -ESTALE;
>> goto out_warn;
>> }
>> @@ -259,8 +260,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
>> goto out_err;
>> }
>>
>> - if (ovl_dentry_casefolded(this)) {
>> - warn = "case folded child";
>> + if (ofs->casefold != ovl_dentry_casefolded(this)) {
>> + warn = "child wrong casefold";
>> err = -EREMOTE;
>> goto out_warn;
>> }
>> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
>> index a33115e7384c129c543746326642813add63f060..52582b1da52598fbb14866f8c33eb27e36adda36 100644
>> --- a/fs/overlayfs/util.c
>> +++ b/fs/overlayfs/util.c
>> @@ -203,6 +203,8 @@ void ovl_dentry_init_flags(struct dentry *dentry, struct dentry *upperdentry,
>>
>> bool ovl_dentry_weird(struct dentry *dentry)
>> {
>> + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>> +
>> if (!d_can_lookup(dentry) && !d_is_file(dentry) && !d_is_symlink(dentry))
>> return true;
>>
>> @@ -210,11 +212,11 @@ bool ovl_dentry_weird(struct dentry *dentry)
>> return true;
>>
>> /*
>> - * Allow filesystems that are case-folding capable but deny composing
>> - * ovl stack from case-folded directories.
>> + * Exceptionally for layers with casefold, we accept that they have
>> + * their own hash and compare operations
>> */
>> - if (sb_has_encoding(dentry->d_sb))
>> - return IS_CASEFOLDED(d_inode(dentry));
>> + if (ofs->casefold)
>> + return false;
>
> I think this is better as:
> if (sb_has_encoding(dentry->d_sb))
> return false;
>
> I don't think there is a reason to test ofs->casefold here.
> A "weird" dentry is one that overlayfs doesn't know how to
> handle. Now it knows how to handle dentries with hash()/compare()
> on casefolding capable filesystems.
>
> Can you please push v6 after this fix to your gitlab branch?
>
Ok, it's done
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-22 14:17 ` [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp() André Almeida
@ 2025-08-22 16:53 ` Amir Goldstein
2025-08-25 11:09 ` Gabriel Krisman Bertazi
1 sibling, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-22 16:53 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Fri, Aug 22, 2025 at 4:17 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> To add overlayfs support for casefold layers, create a new function
> ovl_casefold(), to be able to do case-insensitive strncmp().
>
> ovl_casefold() allocates a new buffer and stores the casefolded version
> of the string in it. If the allocation or the casefold operation fails,
> fall back to using the original string.
>
> The case-insensitive name is then used in the rb-tree search/insertion
> operation. If the name is found in the rb-tree, the name can be
> discarded and the buffer is freed. If the name isn't found, it's then
> stored at struct ovl_cache_entry to be used later.
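The ordering used for the rb-tree search can be sketched in user space as
below. This is an illustration under assumptions, not the kernel code:
folded_name_cmp() is a hypothetical name, both strings are assumed to be
casefolded already, and the stored entry is assumed to be NUL-terminated.
The strncmp() bound is the length of the lookup key, matching the
changelog note that bounding it by the stored entry's length caused
regressions.

```c
#include <assert.h>
#include <string.h>

/*
 * Three-way comparison between a lookup key and a stored entry, using
 * strncmp() plus a length tie-break. Returns <0, 0 or >0.
 *
 * key:   casefolded lookup name, at least key_len bytes long
 * entry: casefolded stored name, NUL-terminated, entry_len bytes long
 */
static int folded_name_cmp(const char *key, int key_len,
			   const char *entry, int entry_len)
{
	int cmp;

	assert(key && entry && key_len >= 0 && entry_len >= 0);

	/* Compare at most key_len bytes of the lookup key. */
	cmp = strncmp(key, entry, (size_t)key_len);
	if (cmp)
		return cmp;

	/* Equal prefix: a shorter key sorts before a longer stored name. */
	if (key_len < entry_len)
		return -1;

	return 0;
}
```

A key longer than the stored name needs no extra branch here, because
strncmp() reaches the stored name's terminating NUL first and reports the
key as greater.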
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> Changes from v6:
> - Last version was using `strncmp(... tmp->len)` which was causing
> regressions. It should be `strncmp(... len)`.
> - Rename cf_len to c_len
> - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
> - Remove needless kfree(cf_name)
> ---
> fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 94 insertions(+), 19 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -27,6 +27,8 @@ struct ovl_cache_entry {
> bool is_upper;
> bool is_whiteout;
> bool check_xwhiteout;
> + const char *c_name;
> + int c_len;
> char name[];
> };
>
> @@ -45,6 +47,7 @@ struct ovl_readdir_data {
> struct list_head *list;
> struct list_head middle;
> struct ovl_cache_entry *first_maybe_whiteout;
> + struct unicode_map *map;
> int count;
> int err;
> bool is_upper;
> @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
> return rb_entry(n, struct ovl_cache_entry, node);
> }
>
> +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
> +{
> + const struct qstr qstr = { .name = str, .len = len };
> + int cf_len;
> +
> + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
> + return 0;
> +
> + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
> +
> + if (dst) {
> + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
> +
> + if (cf_len > 0)
> + return cf_len;
> + }
> +
> + kfree(*dst);
> + return 0;
> +}
> +
> static bool ovl_cache_entry_find_link(const char *name, int len,
> struct rb_node ***link,
> struct rb_node **parent)
> @@ -79,10 +103,10 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
>
> *parent = *newp;
> tmp = ovl_cache_entry_from_node(*newp);
> - cmp = strncmp(name, tmp->name, len);
> + cmp = strncmp(name, tmp->c_name, len);
> if (cmp > 0)
> newp = &tmp->node.rb_right;
> - else if (cmp < 0 || len < tmp->len)
> + else if (cmp < 0 || len < tmp->c_len)
> newp = &tmp->node.rb_left;
> else
> found = true;
> @@ -101,10 +125,10 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
> while (node) {
> struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
>
> - cmp = strncmp(name, p->name, len);
> + cmp = strncmp(name, p->c_name, len);
> if (cmp > 0)
> node = p->node.rb_right;
> - else if (cmp < 0 || len < p->len)
> + else if (cmp < 0 || len < p->c_len)
> node = p->node.rb_left;
> else
> return p;
> @@ -145,6 +169,7 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
>
> static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> const char *name, int len,
> + const char *c_name, int c_len,
> u64 ino, unsigned int d_type)
> {
> struct ovl_cache_entry *p;
> @@ -167,6 +192,14 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> /* Defer check for overlay.whiteout to ovl_iterate() */
> p->check_xwhiteout = rdd->in_xwhiteouts_dir && d_type == DT_REG;
>
> + if (c_name && c_name != name) {
> + p->c_name = c_name;
> + p->c_len = c_len;
> + } else {
> + p->c_name = p->name;
> + p->c_len = len;
> + }
> +
> if (d_type == DT_CHR) {
> p->next_maybe_whiteout = rdd->first_maybe_whiteout;
> rdd->first_maybe_whiteout = p;
> @@ -174,48 +207,55 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> return p;
> }
>
> -static bool ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
> - const char *name, int len, u64 ino,
> +/* Return 0 for found, 1 for added, <0 for error */
> +static int ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
> + const char *name, int len,
> + const char *c_name, int c_len,
> + u64 ino,
> unsigned int d_type)
> {
> struct rb_node **newp = &rdd->root->rb_node;
> struct rb_node *parent = NULL;
> struct ovl_cache_entry *p;
>
> - if (ovl_cache_entry_find_link(name, len, &newp, &parent))
> - return true;
> + if (ovl_cache_entry_find_link(c_name, c_len, &newp, &parent))
> + return 0;
>
> - p = ovl_cache_entry_new(rdd, name, len, ino, d_type);
> + p = ovl_cache_entry_new(rdd, name, len, c_name, c_len, ino, d_type);
> if (p == NULL) {
> rdd->err = -ENOMEM;
> - return false;
> + return -ENOMEM;
> }
>
> list_add_tail(&p->l_node, rdd->list);
> rb_link_node(&p->node, parent, newp);
> rb_insert_color(&p->node, rdd->root);
>
> - return true;
> + return 1;
> }
>
> -static bool ovl_fill_lowest(struct ovl_readdir_data *rdd,
> +/* Return 0 for found, 1 for added, <0 for error */
> +static int ovl_fill_lowest(struct ovl_readdir_data *rdd,
> const char *name, int namelen,
> + const char *c_name, int c_len,
> loff_t offset, u64 ino, unsigned int d_type)
> {
> struct ovl_cache_entry *p;
>
> - p = ovl_cache_entry_find(rdd->root, name, namelen);
> + p = ovl_cache_entry_find(rdd->root, c_name, c_len);
> if (p) {
> list_move_tail(&p->l_node, &rdd->middle);
> + return 0;
> } else {
> - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
> + p = ovl_cache_entry_new(rdd, name, namelen, c_name, c_len,
> + ino, d_type);
> if (p == NULL)
> rdd->err = -ENOMEM;
> else
> list_add_tail(&p->l_node, &rdd->middle);
> }
>
> - return rdd->err == 0;
> + return rdd->err ?: 1;
> }
>
> void ovl_cache_free(struct list_head *list)
> @@ -223,8 +263,11 @@ void ovl_cache_free(struct list_head *list)
> struct ovl_cache_entry *p;
> struct ovl_cache_entry *n;
>
> - list_for_each_entry_safe(p, n, list, l_node)
> + list_for_each_entry_safe(p, n, list, l_node) {
> + if (p->c_name != p->name)
> + kfree(p->c_name);
> kfree(p);
> + }
>
> INIT_LIST_HEAD(list);
> }
> @@ -260,12 +303,36 @@ static bool ovl_fill_merge(struct dir_context *ctx, const char *name,
> {
> struct ovl_readdir_data *rdd =
> container_of(ctx, struct ovl_readdir_data, ctx);
> + struct ovl_fs *ofs = OVL_FS(rdd->dentry->d_sb);
> + const char *c_name = NULL;
> + char *cf_name = NULL;
> + int c_len = 0, ret;
> +
> + if (ofs->casefold)
> + c_len = ovl_casefold(rdd->map, name, namelen, &cf_name);
> +
> + if (c_len <= 0) {
> + c_name = name;
> + c_len = namelen;
> + } else {
> + c_name = cf_name;
> + }
>
> rdd->count++;
> if (!rdd->is_lowest)
> - return ovl_cache_entry_add_rb(rdd, name, namelen, ino, d_type);
> + ret = ovl_cache_entry_add_rb(rdd, name, namelen, c_name, c_len, ino, d_type);
> else
> - return ovl_fill_lowest(rdd, name, namelen, offset, ino, d_type);
> + ret = ovl_fill_lowest(rdd, name, namelen, c_name, c_len, offset, ino, d_type);
> +
> + /*
> + * If ret == 1, that means that c_name is being used as part of struct
> + * ovl_cache_entry and will be freed at ovl_cache_free(). Otherwise,
> + * c_name was found in the rb-tree so we can free it here.
> + */
> + if (ret != 1 && c_name != name)
> + kfree(c_name);
> +
> + return ret >= 0;
> }
>
> static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data *rdd)
> @@ -357,12 +424,18 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list,
> .list = list,
> .root = root,
> .is_lowest = false,
> + .map = NULL,
> };
> int idx, next;
> const struct ovl_layer *layer;
> + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>
> for (idx = 0; idx != -1; idx = next) {
> next = ovl_path_next(idx, dentry, &realpath, &layer);
> +
> + if (ofs->casefold)
> + rdd.map = sb_encoding(realpath.dentry->d_sb);
> +
> rdd.is_upper = ovl_dentry_upper(dentry) == realpath.dentry;
> rdd.in_xwhiteouts_dir = layer->has_xwhiteouts &&
> ovl_dentry_has_xwhiteouts(dentry);
> @@ -555,7 +628,7 @@ static bool ovl_fill_plain(struct dir_context *ctx, const char *name,
> container_of(ctx, struct ovl_readdir_data, ctx);
>
> rdd->count++;
> - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
> + p = ovl_cache_entry_new(rdd, name, namelen, NULL, 0, ino, d_type);
> if (p == NULL) {
> rdd->err = -ENOMEM;
> return false;
> @@ -1023,6 +1096,8 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>
> del_entry:
> list_del(&p->l_node);
> + if (p->c_name != p->name)
> + kfree(p->c_name);
> kfree(p);
OK I thought this was contained in ovl_cache_free().
If we need to repeat this check, we need a helper
ovl_cache_entry_free() to use instead of kfree(p)
everywhere even in ovl_dir_read_impure() when it won't
actually be needed.
I can make this change on commit; no need to repost.
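For reference, a helper of that shape could look roughly like the
user-space sketch below. Everything in it is hypothetical: the struct only
mirrors the field names from the patch, cache_entry_new() copies the
folded name instead of taking ownership of an existing buffer, and the
return value of cache_entry_free() exists only so the behaviour is easy to
check in a test.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* User-space model of the cache entry; field names follow the patch. */
struct cache_entry {
	int len;
	int c_len;
	const char *c_name;	/* aliases name[] unless a folded copy exists */
	char name[];		/* original name, NUL-terminated */
};

/* Hypothetical constructor: copies name, and folded when it is given. */
static struct cache_entry *cache_entry_new(const char *name,
					   const char *folded)
{
	size_t len = strlen(name);
	struct cache_entry *p = malloc(sizeof(*p) + len + 1);

	if (!p)
		return NULL;

	memcpy(p->name, name, len + 1);
	p->len = (int)len;
	p->c_name = p->name;
	p->c_len = p->len;

	if (folded) {
		size_t c_len = strlen(folded);
		char *copy = malloc(c_len + 1);

		if (!copy) {
			free(p);
			return NULL;
		}
		memcpy(copy, folded, c_len + 1);
		p->c_name = copy;
		p->c_len = (int)c_len;
	}

	return p;
}

/*
 * Single release point for an entry. Returns the number of buffers that
 * were freed (1 or 2); this is for demonstration only.
 */
static int cache_entry_free(struct cache_entry *p)
{
	int freed = 1;

	assert(p);

	/* Free the folded name only when it is a separate allocation. */
	if (p->c_name != p->name) {
		free((void *)p->c_name);
		freed++;
	}
	free(p);

	return freed;
}
```

With one helper, callers that never set a separate c_name pay only for
the pointer comparison, and no call site has to repeat the check.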
Thanks,
Amir.
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-22 16:47 ` André Almeida
@ 2025-08-22 19:17 ` Amir Goldstein
2025-08-25 13:31 ` André Almeida
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-22 19:17 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Fri, Aug 22, 2025 at 6:47 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> Em 22/08/2025 13:34, Amir Goldstein escreveu:
> > On Fri, Aug 22, 2025 at 4:17 PM André Almeida <andrealmeid@igalia.com> wrote:
> >>
> >> Drop the restriction for casefold dentries lookup to enable support for
> >> case-insensitive layers in overlayfs.
> >>
> >> Support case-insensitive layers with the condition that they should be
> >> uniformly enabled across the stack (i.e. if the root mount dir has
> >> casefold enabled, so should all the dirs below for every layer).
> >>
> >> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >> ---
> >> Changes from v5:
> >> - Fix mounting layers without casefold flag
> >> ---
> >> fs/overlayfs/namei.c | 17 +++++++++--------
> >> fs/overlayfs/util.c | 10 ++++++----
> >> 2 files changed, 15 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
> >> index 76d6248b625e7c58e09685e421aef616aadea40a..e93bcc5727bcafdc18a499b47a7609fd41ecaec8 100644
> >> --- a/fs/overlayfs/namei.c
> >> +++ b/fs/overlayfs/namei.c
> >> @@ -239,13 +239,14 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
> >> char val;
> >>
> >> /*
> >> - * We allow filesystems that are case-folding capable but deny composing
> >> - * ovl stack from case-folded directories. If someone has enabled case
> >> - * folding on a directory on underlying layer, the warranty of the ovl
> >> - * stack is voided.
> >> + * We allow filesystems that are case-folding capable as long as the
> >> + * layers are consistently enabled in the stack, enabled for every dir
> >> + * or disabled in all dirs. If someone has modified case folding on a
> >> + * directory on underlying layer, the warranty of the ovl stack is
> >> + * voided.
> >> */
> >> - if (ovl_dentry_casefolded(base)) {
> >> - warn = "case folded parent";
> >> + if (ofs->casefold != ovl_dentry_casefolded(base)) {
> >> + warn = "parent wrong casefold";
> >> err = -ESTALE;
> >> goto out_warn;
> >> }
> >> @@ -259,8 +260,8 @@ static int ovl_lookup_single(struct dentry *base, struct ovl_lookup_data *d,
> >> goto out_err;
> >> }
> >>
> >> - if (ovl_dentry_casefolded(this)) {
> >> - warn = "case folded child";
> >> + if (ofs->casefold != ovl_dentry_casefolded(this)) {
> >> + warn = "child wrong casefold";
> >> err = -EREMOTE;
> >> goto out_warn;
> >> }
> >> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> >> index a33115e7384c129c543746326642813add63f060..52582b1da52598fbb14866f8c33eb27e36adda36 100644
> >> --- a/fs/overlayfs/util.c
> >> +++ b/fs/overlayfs/util.c
> >> @@ -203,6 +203,8 @@ void ovl_dentry_init_flags(struct dentry *dentry, struct dentry *upperdentry,
> >>
> >> bool ovl_dentry_weird(struct dentry *dentry)
> >> {
> >> + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
> >> +
FWIW this was a bug that hits
WARN_ON_ONCE(sb->s_type != &ovl_fs_type)
because dentry is NOT an ovl dentry.
> >> if (!d_can_lookup(dentry) && !d_is_file(dentry) && !d_is_symlink(dentry))
> >> return true;
> >>
> >> @@ -210,11 +212,11 @@ bool ovl_dentry_weird(struct dentry *dentry)
> >> return true;
> >>
> >> /*
> >> - * Allow filesystems that are case-folding capable but deny composing
> >> - * ovl stack from case-folded directories.
> >> + * Exceptionally for layers with casefold, we accept that they have
> >> + * their own hash and compare operations
> >> */
> >> - if (sb_has_encoding(dentry->d_sb))
> >> - return IS_CASEFOLDED(d_inode(dentry));
> >> + if (ofs->casefold)
> >> + return false;
> >
> > I think this is better as:
> > if (sb_has_encoding(dentry->d_sb))
> > return false;
> >
And this still fails the test "Casefold enabled" for me.
Maybe you are confused because this does not look like
a test failure. It looks like this:
generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
casefold
[ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
name='subdir', err=-116): parent wrong casefold
[ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
name='casefold', err=-66): child wrong casefold
[19:10:24] [not run]
generic/999 -- overlayfs does not support casefold enabled layers
Ran: generic/999
Not run: generic/999
Passed all 1 tests
I'm not sure I will keep the test this way. It is neither standard nor
good practice to run half of the test and then skip it.
I would probably split it into two tests.
The first one as it is now will run to completion on kernels >= v6.17
and the Casefold enabled test will run on kernels >= v6.18.
In any case, please make sure that the test is not skipped when testing
Casefold enabled layers.
And then continue with the missing test cases.
When you have a test that passes please send the test itself or
a fstest branch for me to test.
Thanks,
Amir.
* [syzbot ci] Re: ovl: Enable support for casefold layers
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
` (8 preceding siblings ...)
2025-08-22 14:17 ` [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers André Almeida
@ 2025-08-22 19:28 ` syzbot ci
9 siblings, 0 replies; 53+ messages in thread
From: syzbot ci @ 2025-08-22 19:28 UTC (permalink / raw)
To: amir73il, andrealmeid, brauner, jack, kernel-dev, krisman,
linux-fsdevel, linux-kernel, linux-unionfs, miklos, tytso, viro
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v6] ovl: Enable support for casefold layers
https://lore.kernel.org/all/20250822-tonyk-overlayfs-v6-0-8b6e9e604fa2@igalia.com
* [PATCH v6 1/9] fs: Create sb_encoding() helper
* [PATCH v6 2/9] fs: Create sb_same_encoding() helper
* [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers
* [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
* [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding
* [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb
* [PATCH v6 7/9] ovl: Add S_CASEFOLD as part of the inode flag to be copied
* [PATCH v6 8/9] ovl: Check for casefold consistency when creating new dentries
* [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
and found the following issue:
WARNING in ovl_dentry_weird
Full report is available here:
https://ci.syzbot.org/series/efd002b5-e585-4cf8-86e7-4f24ba2247c7
***
WARNING in ovl_dentry_weird
tree: torvalds
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
base: 068a56e56fa81e42fc5f08dff34fab149bb60a09
arch: amd64
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
config: https://ci.syzbot.org/builds/039eb31b-2b45-4207-b63e-71a25ed89f00/config
C repro: https://ci.syzbot.org/findings/726ae90b-83b6-49e2-a496-9bfe444dc24f/c_repro
syz repro: https://ci.syzbot.org/findings/726ae90b-83b6-49e2-a496-9bfe444dc24f/syz_repro
EXT4-fs (loop0): 1 orphan inode deleted
EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 r/w without journal. Quota mode: none.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 6001 at fs/overlayfs/ovl_entry.h:118 OVL_FS fs/overlayfs/ovl_entry.h:118 [inline]
WARNING: CPU: 0 PID: 6001 at fs/overlayfs/ovl_entry.h:118 ovl_dentry_weird+0x15a/0x1a0 fs/overlayfs/util.c:206
Modules linked in:
CPU: 0 UID: 0 PID: 6001 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:OVL_FS fs/overlayfs/ovl_entry.h:118 [inline]
RIP: 0010:ovl_dentry_weird+0x15a/0x1a0 fs/overlayfs/util.c:206
Code: e8 6b f9 8f fe 83 e5 03 0f 95 c3 31 ff 89 ee e8 9c fd 8f fe 89 d8 5b 41 5c 41 5e 41 5f 5d e9 3d b9 4c 08 cc e8 47 f9 8f fe 90 <0f> 0b 90 e9 08 ff ff ff 44 89 f1 80 e1 07 80 c1 03 38 c1 0f 8c 0b
RSP: 0018:ffffc90002caf9c8 EFLAGS: 00010293
RAX: ffffffff832fb1e9 RBX: ffff888109730000 RCX: ffff888023295640
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88802b624a48
RBP: dffffc0000000000 R08: 0000000030656c69 R09: 1ffff110048d0ce0
R10: dffffc0000000000 R11: ffffed10048d0ce1 R12: dffffc0000000000
R13: 0000000000000003 R14: ffff88802b624a48 R15: ffff888109730028
FS: 0000555581e17500(0000) GS:ffff8880b861b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000001000 CR3: 00000000242f4000 CR4: 00000000000006f0
Call Trace:
<TASK>
ovl_mount_dir_check fs/overlayfs/params.c:300 [inline]
ovl_do_parse_layer+0x307/0xbb0 fs/overlayfs/params.c:422
ovl_parse_layer fs/overlayfs/params.c:448 [inline]
ovl_parse_param+0xb62/0xee0 fs/overlayfs/params.c:633
vfs_parse_fs_param+0x1a9/0x420 fs/fs_context.c:146
vfs_parse_fs_string fs/fs_context.c:188 [inline]
vfs_parse_monolithic_sep+0x24d/0x310 fs/fs_context.c:230
do_new_mount+0x273/0x9e0 fs/namespace.c:3804
do_mount fs/namespace.c:4136 [inline]
__do_sys_mount fs/namespace.c:4347 [inline]
__se_sys_mount+0x317/0x410 fs/namespace.c:4324
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f0c2558ebe9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd67150878 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f0c257b5fa0 RCX: 00007f0c2558ebe9
RDX: 0000200000000b80 RSI: 0000200000000100 RDI: 0000000000000000
RBP: 00007f0c25611e19 R08: 0000200000000180 R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f0c257b5fa0 R14: 00007f0c257b5fa0 R15: 0000000000000005
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
* Re: [PATCH v6 2/9] fs: Create sb_same_encoding() helper
2025-08-22 14:17 ` [PATCH v6 2/9] fs: Create sb_same_encoding() helper André Almeida
@ 2025-08-23 10:02 ` Amir Goldstein
2025-08-25 9:24 ` Gabriel Krisman Bertazi
1 sibling, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-23 10:02 UTC (permalink / raw)
To: André Almeida, Christian Brauner, Gabriel Krisman Bertazi
Cc: Miklos Szeredi, Theodore Tso, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Jan Kara, kernel-dev
On Fri, Aug 22, 2025 at 4:17 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> For cases where a file lookup can look in different filesystems (like in
> overlayfs), both super blocks must have the same encoding and the same
> flags. To help with that, create a sb_same_encoding() function.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> include/linux/fs.h | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index a4d353a871b094b562a87ddcffe8336a26c5a3e2..7de9e1e4839a2726f4355ddf20b9babb74cc9681 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3747,6 +3747,24 @@ static inline bool sb_has_encoding(const struct super_block *sb)
> return !!sb_encoding(sb);
> }
>
> +/*
> + * Compare if two super blocks have the same encoding and flags
> + */
> +static inline bool sb_same_encoding(const struct super_block *sb1,
> + const struct super_block *sb2)
> +{
> +#if IS_ENABLED(CONFIG_UNICODE)
> + if (sb1->s_encoding == sb2->s_encoding)
> + return true;
> +
> + return (sb1->s_encoding && sb2->s_encoding &&
> + (sb1->s_encoding->version == sb2->s_encoding->version) &&
> + (sb1->s_encoding_flags == sb2->s_encoding_flags));
> +#else
> + return true;
> +#endif
> +}
> +
> int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
> unsigned int ia_valid);
> int setattr_prepare(struct mnt_idmap *, struct dentry *, struct iattr *);
>
Christian,
I am planning to stage this series for v6.18 [1].
I think it would be better to avoid splitting the two minor vfs helpers
in the first two patches of this series into a stable vfs branch; it
would be better to get your RVB on the two vfs patches and let them
go upstream via the ovl tree.
WDYT?
Gabriel,
It would be great if you could also provide RVB for the vfs helpers
and of course, review for the entire series would be most welcome as well.
Thanks,
Amir.
[1] https://lore.kernel.org/linux-unionfs/20250822-tonyk-overlayfs-v6-0-8b6e9e604fa2@igalia.com/
* Re: [PATCH v6 1/9] fs: Create sb_encoding() helper
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
@ 2025-08-25 9:19 ` Gabriel Krisman Bertazi
2025-08-25 12:38 ` Gabriel Krisman Bertazi
1 sibling, 0 replies; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 9:19 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> Filesystems that need to deal with the super block encoding need to use
> a if IS_ENABLED(CONFIG_UNICODE) around it because this struct member is
> not declared otherwise. In order to move this if/endif guards outside of
> the filesystem code and make it simpler, create a new function that
> returns the s_encoding member of struct super_block if Unicode is
> enabled, and return NULL otherwise.
>
> Suggested-by: Amir Goldstein <amir73il@gmail.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
> ---
> include/linux/fs.h | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index e1d4fef5c181d291a7c685e5897b2c018df439ae..a4d353a871b094b562a87ddcffe8336a26c5a3e2 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3733,15 +3733,20 @@ static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qst
> }
> #endif
>
> -static inline bool sb_has_encoding(const struct super_block *sb)
> +static inline struct unicode_map *sb_encoding(const struct super_block *sb)
> {
> #if IS_ENABLED(CONFIG_UNICODE)
> - return !!sb->s_encoding;
> + return sb->s_encoding;
> #else
> - return false;
> + return NULL;
> #endif
> }
>
> +static inline bool sb_has_encoding(const struct super_block *sb)
> +{
> + return !!sb_encoding(sb);
> +}
> +
> int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
> unsigned int ia_valid);
> int setattr_prepare(struct mnt_idmap *, struct dentry *, struct iattr *);
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 2/9] fs: Create sb_same_encoding() helper
2025-08-22 14:17 ` [PATCH v6 2/9] fs: Create sb_same_encoding() helper André Almeida
2025-08-23 10:02 ` Amir Goldstein
@ 2025-08-25 9:24 ` Gabriel Krisman Bertazi
1 sibling, 0 replies; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 9:24 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> For cases where a file lookup can look in different filesystems (like in
> overlayfs), both super blocks must have the same encoding and the same
> flags. To help with that, create a sb_same_encoding() function.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
> ---
> include/linux/fs.h | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index a4d353a871b094b562a87ddcffe8336a26c5a3e2..7de9e1e4839a2726f4355ddf20b9babb74cc9681 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3747,6 +3747,24 @@ static inline bool sb_has_encoding(const struct super_block *sb)
> return !!sb_encoding(sb);
> }
>
> +/*
> + * Compare if two super blocks have the same encoding and flags
> + */
> +static inline bool sb_same_encoding(const struct super_block *sb1,
> + const struct super_block *sb2)
> +{
> +#if IS_ENABLED(CONFIG_UNICODE)
> + if (sb1->s_encoding == sb2->s_encoding)
> + return true;
> +
> + return (sb1->s_encoding && sb2->s_encoding &&
> + (sb1->s_encoding->version == sb2->s_encoding->version) &&
> + (sb1->s_encoding_flags == sb2->s_encoding_flags));
> +#else
> + return true;
> +#endif
> +}
> +
> int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
> unsigned int ia_valid);
> int setattr_prepare(struct mnt_idmap *, struct dentry *, struct iattr *);
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers
2025-08-22 14:17 ` [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers André Almeida
@ 2025-08-25 10:42 ` Gabriel Krisman Bertazi
0 siblings, 0 replies; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 10:42 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> Prepare for mounting layers with case-insensitive dentries in order to
> support such layers in overlayfs, while enforcing uniform casefold
> layers.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
> ---
> fs/overlayfs/ovl_entry.h | 1 +
> fs/overlayfs/params.c | 15 ++++++++++++---
> fs/overlayfs/params.h | 1 +
> 3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index 4c1bae935ced274f93a0d23fe10d34455e226ec4..1d4828dbcf7ac4ba9657221e601bbf79d970d225 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -91,6 +91,7 @@ struct ovl_fs {
> struct mutex whiteout_lock;
> /* r/o snapshot of upperdir sb's only taken on volatile mounts */
> errseq_t errseq;
> + bool casefold;
> };
>
> /* Number of lower layers, not including data-only layers */
> diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c
> index f4e7fff909ac49e2f8c58a76273426c1158a7472..63b7346c5ee1c127a9c33b12c3704aa035ff88cf 100644
> --- a/fs/overlayfs/params.c
> +++ b/fs/overlayfs/params.c
> @@ -276,17 +276,26 @@ static int ovl_mount_dir(const char *name, struct path *path)
> static int ovl_mount_dir_check(struct fs_context *fc, const struct path *path,
> enum ovl_opt layer, const char *name, bool upper)
> {
> + bool is_casefolded = ovl_dentry_casefolded(path->dentry);
> struct ovl_fs_context *ctx = fc->fs_private;
> + struct ovl_fs *ofs = fc->s_fs_info;
>
> if (!d_is_dir(path->dentry))
> return invalfc(fc, "%s is not a directory", name);
>
> /*
> * Allow filesystems that are case-folding capable but deny composing
> - * ovl stack from case-folded directories.
> + * ovl stack from inconsistent case-folded directories.
> */
> - if (ovl_dentry_casefolded(path->dentry))
> - return invalfc(fc, "case-insensitive directory on %s not supported", name);
> + if (!ctx->casefold_set) {
> + ofs->casefold = is_casefolded;
> + ctx->casefold_set = true;
> + }
> +
> + if (ofs->casefold != is_casefolded) {
> + return invalfc(fc, "case-%ssensitive directory on %s is inconsistent",
> + is_casefolded ? "in" : "", name);
> + }
>
> if (ovl_dentry_weird(path->dentry))
> return invalfc(fc, "filesystem on %s not supported", name);
> diff --git a/fs/overlayfs/params.h b/fs/overlayfs/params.h
> index c96d939820211ddc63e265670a2aff60d95eec49..ffd53cdd84827cce827e8852f2de545f966ce60d 100644
> --- a/fs/overlayfs/params.h
> +++ b/fs/overlayfs/params.h
> @@ -33,6 +33,7 @@ struct ovl_fs_context {
> struct ovl_opt_set set;
> struct ovl_fs_context_layer *lower;
> char *lowerdir_all; /* user provided lowerdir string */
> + bool casefold_set;
> };
>
> int ovl_init_fs_context(struct fs_context *fc);
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-22 14:17 ` [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp() André Almeida
2025-08-22 16:53 ` Amir Goldstein
@ 2025-08-25 11:09 ` Gabriel Krisman Bertazi
2025-08-25 15:27 ` Amir Goldstein
1 sibling, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 11:09 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> To add overlayfs support for casefold layers, create a new function
> ovl_casefold(), to be able to do case-insensitive strncmp().
>
> ovl_casefold() allocates a new buffer and stores the casefolded version
> of the string on it. If the allocation or the casefold operation fails,
> fallback to use the original string.
>
> The case-insensitive name is then used in the rb-tree search/insertion
> operation. If the name is found in the rb-tree, the name can be
> discarded and the buffer is freed. If the name isn't found, it's then
> stored at struct ovl_cache_entry to be used later.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> Changes from v6:
> - Last version was using `strncmp(... tmp->len)` which was causing
> regressions. It should be `strncmp(... len)`.
> - Rename cf_len to c_len
> - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
> - Remove needless kfree(cf_name)
> ---
> fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 94 insertions(+), 19 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -27,6 +27,8 @@ struct ovl_cache_entry {
> bool is_upper;
> bool is_whiteout;
> bool check_xwhiteout;
> + const char *c_name;
> + int c_len;
> char name[];
> };
>
> @@ -45,6 +47,7 @@ struct ovl_readdir_data {
> struct list_head *list;
> struct list_head middle;
> struct ovl_cache_entry *first_maybe_whiteout;
> + struct unicode_map *map;
> int count;
> int err;
> bool is_upper;
> @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
> return rb_entry(n, struct ovl_cache_entry, node);
> }
>
> +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
> +{
> + const struct qstr qstr = { .name = str, .len = len };
> + int cf_len;
> +
> + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
> + return 0;
> +
> + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
> +
> + if (dst) {
> + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
> +
> + if (cf_len > 0)
> + return cf_len;
> + }
> +
> + kfree(*dst);
> + return 0;
> +}
Hi,
I should just note this does not differentiate allocation errors from
casefolding errors (invalid encoding). It might be just a theoretical
error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
operation is likely to fail too, but if you have an allocation failure, you
can end up with an inconsistent cache, because a file is added under the
!casefolded name and a later successful lookup will look for the
casefolded version.
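One way to keep the two failure modes apart is sketched below in userspace C, not kernel code. The name fold_name() and the ASCII-only tolower() folding are assumptions standing in for ovl_casefold() and utf8_casefold(); only the return convention matters here: a negative errno for an allocation failure, 0 when the name cannot be folded, and a positive length on success.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

#define FOLD_NAME_MAX 255

/*
 * Hypothetical userspace stand-in for ovl_casefold().
 * Returns:
 *   > 0      length of the folded name stored in *dst (caller frees *dst)
 *   0        name could not be folded (here: a non-ASCII byte); *dst is NULL
 *   -ENOMEM  allocation failed; *dst is NULL
 */
static int fold_name(const char *str, int len, char **dst)
{
	char *buf;
	int i;

	*dst = NULL;

	if (len <= 0 || len > FOLD_NAME_MAX)
		return 0;

	buf = malloc(FOLD_NAME_MAX);
	if (!buf)
		return -ENOMEM;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)str[i];

		/* Stand-in for an invalid-encoding error from the real folder. */
		if (c >= 0x80) {
			free(buf);
			return 0;
		}
		buf[i] = (char)tolower(c);
	}

	*dst = buf;
	return len;
}
```

With this convention a caller can abort the readdir on a negative return and fall back to the original name only when the return is 0, so an entry is never cached under the unfolded name merely because memory ran out.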
> +
> static bool ovl_cache_entry_find_link(const char *name, int len,
> struct rb_node ***link,
> struct rb_node **parent)
> @@ -79,10 +103,10 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
>
> *parent = *newp;
> tmp = ovl_cache_entry_from_node(*newp);
> - cmp = strncmp(name, tmp->name, len);
> + cmp = strncmp(name, tmp->c_name, len);
> if (cmp > 0)
> newp = &tmp->node.rb_right;
> - else if (cmp < 0 || len < tmp->len)
> + else if (cmp < 0 || len < tmp->c_len)
> newp = &tmp->node.rb_left;
> else
> found = true;
> @@ -101,10 +125,10 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
> while (node) {
> struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
>
> - cmp = strncmp(name, p->name, len);
> + cmp = strncmp(name, p->c_name, len);
> if (cmp > 0)
> node = p->node.rb_right;
> - else if (cmp < 0 || len < p->len)
> + else if (cmp < 0 || len < p->c_len)
> node = p->node.rb_left;
> else
> return p;
> @@ -145,6 +169,7 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
>
> static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> const char *name, int len,
> + const char *c_name, int c_len,
> u64 ino, unsigned int d_type)
> {
> struct ovl_cache_entry *p;
> @@ -167,6 +192,14 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> /* Defer check for overlay.whiteout to ovl_iterate() */
> p->check_xwhiteout = rdd->in_xwhiteouts_dir && d_type == DT_REG;
>
> + if (c_name && c_name != name) {
> + p->c_name = c_name;
> + p->c_len = c_len;
> + } else {
> + p->c_name = p->name;
> + p->c_len = len;
> + }
> +
> if (d_type == DT_CHR) {
> p->next_maybe_whiteout = rdd->first_maybe_whiteout;
> rdd->first_maybe_whiteout = p;
> @@ -174,48 +207,55 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> return p;
> }
>
> -static bool ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
> - const char *name, int len, u64 ino,
> +/* Return 0 for found, 1 for added, <0 for error */
> +static int ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
> + const char *name, int len,
> + const char *c_name, int c_len,
> + u64 ino,
> unsigned int d_type)
> {
> struct rb_node **newp = &rdd->root->rb_node;
> struct rb_node *parent = NULL;
> struct ovl_cache_entry *p;
>
> - if (ovl_cache_entry_find_link(name, len, &newp, &parent))
> - return true;
> + if (ovl_cache_entry_find_link(c_name, c_len, &newp, &parent))
> + return 0;
>
> - p = ovl_cache_entry_new(rdd, name, len, ino, d_type);
> + p = ovl_cache_entry_new(rdd, name, len, c_name, c_len, ino, d_type);
> if (p == NULL) {
> rdd->err = -ENOMEM;
> - return false;
> + return -ENOMEM;
> }
>
> list_add_tail(&p->l_node, rdd->list);
> rb_link_node(&p->node, parent, newp);
> rb_insert_color(&p->node, rdd->root);
>
> - return true;
> + return 1;
> }
>
> -static bool ovl_fill_lowest(struct ovl_readdir_data *rdd,
> +/* Return 0 for found, 1 for added, <0 for error */
> +static int ovl_fill_lowest(struct ovl_readdir_data *rdd,
> const char *name, int namelen,
> + const char *c_name, int c_len,
> loff_t offset, u64 ino, unsigned int d_type)
> {
> struct ovl_cache_entry *p;
>
> - p = ovl_cache_entry_find(rdd->root, name, namelen);
> + p = ovl_cache_entry_find(rdd->root, c_name, c_len);
> if (p) {
> list_move_tail(&p->l_node, &rdd->middle);
> + return 0;
> } else {
> - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
> + p = ovl_cache_entry_new(rdd, name, namelen, c_name, c_len,
> + ino, d_type);
> if (p == NULL)
> rdd->err = -ENOMEM;
> else
> list_add_tail(&p->l_node, &rdd->middle);
> }
>
> - return rdd->err == 0;
> + return rdd->err ?: 1;
> }
>
> void ovl_cache_free(struct list_head *list)
> @@ -223,8 +263,11 @@ void ovl_cache_free(struct list_head *list)
> struct ovl_cache_entry *p;
> struct ovl_cache_entry *n;
>
> - list_for_each_entry_safe(p, n, list, l_node)
> + list_for_each_entry_safe(p, n, list, l_node) {
> + if (p->c_name != p->name)
> + kfree(p->c_name);
> kfree(p);
> + }
>
> INIT_LIST_HEAD(list);
> }
> @@ -260,12 +303,36 @@ static bool ovl_fill_merge(struct dir_context *ctx, const char *name,
> {
> struct ovl_readdir_data *rdd =
> container_of(ctx, struct ovl_readdir_data, ctx);
> + struct ovl_fs *ofs = OVL_FS(rdd->dentry->d_sb);
> + const char *c_name = NULL;
> + char *cf_name = NULL;
> + int c_len = 0, ret;
> +
> + if (ofs->casefold)
> + c_len = ovl_casefold(rdd->map, name, namelen, &cf_name);
> +
> + if (c_len <= 0) {
> + c_name = name;
> + c_len = namelen;
> + } else {
> + c_name = cf_name;
> + }
>
> rdd->count++;
> if (!rdd->is_lowest)
> - return ovl_cache_entry_add_rb(rdd, name, namelen, ino, d_type);
> + ret = ovl_cache_entry_add_rb(rdd, name, namelen, c_name, c_len, ino, d_type);
> else
> - return ovl_fill_lowest(rdd, name, namelen, offset, ino, d_type);
> + ret = ovl_fill_lowest(rdd, name, namelen, c_name, c_len, offset, ino, d_type);
> +
> + /*
> + * If ret == 1, that means that c_name is being used as part of struct
> + * ovl_cache_entry and will be freed at ovl_cache_free(). Otherwise,
> + * c_name was found in the rb-tree so we can free it here.
> + */
> + if (ret != 1 && c_name != name)
> + kfree(c_name);
> +
The semantics of this conditional free are a bit annoying, as the check
is already replicated in 3 places. I suppose a helper would come in
handy.
In this specific case, it could just be:
if (ret != 1)
kfree(cf_name);
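A rough userspace sketch of what such a helper could look like follows. struct cache_entry is a cut-down model of struct ovl_cache_entry, and the names cache_entry_new(), cache_entry_free_cf() and cache_entry_free() are hypothetical; the patch does not define them.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified model of struct ovl_cache_entry: only the name fields. */
struct cache_entry {
	const char *c_name;	/* points at name[] or at a separate buffer */
	int c_len;
	int len;
	char name[];
};

/* Hypothetical helper: free c_name only when it is a separate allocation. */
static void cache_entry_free_cf(struct cache_entry *p)
{
	if (p->c_name != p->name)
		free((void *)p->c_name);
	p->c_name = p->name;
	p->c_len = p->len;
}

/* Hypothetical helper: free the folded name, then the entry itself. */
static void cache_entry_free(struct cache_entry *p)
{
	cache_entry_free_cf(p);
	free(p);
}

/* Allocate an entry; takes ownership of c_name if it is non-NULL. */
static struct cache_entry *cache_entry_new(const char *name, int len,
					   char *c_name, int c_len)
{
	struct cache_entry *p = malloc(sizeof(*p) + len + 1);

	if (!p)
		return NULL;

	memcpy(p->name, name, len);
	p->name[len] = '\0';
	p->len = len;
	p->c_name = c_name ? c_name : p->name;
	p->c_len = c_name ? c_len : len;
	return p;
}
```

The "is c_name a separate allocation" test then lives in one place, and the free paths in ovl_cache_free() and ovl_check_empty_dir() would reduce to a single call each.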
> + return ret >= 0;
> }
>
> static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data *rdd)
> @@ -357,12 +424,18 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list,
> .list = list,
> .root = root,
> .is_lowest = false,
> + .map = NULL,
> };
> int idx, next;
> const struct ovl_layer *layer;
> + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>
> for (idx = 0; idx != -1; idx = next) {
> next = ovl_path_next(idx, dentry, &realpath, &layer);
> +
> + if (ofs->casefold)
> + rdd.map = sb_encoding(realpath.dentry->d_sb);
> +
> rdd.is_upper = ovl_dentry_upper(dentry) == realpath.dentry;
> rdd.in_xwhiteouts_dir = layer->has_xwhiteouts &&
> ovl_dentry_has_xwhiteouts(dentry);
> @@ -555,7 +628,7 @@ static bool ovl_fill_plain(struct dir_context *ctx, const char *name,
> container_of(ctx, struct ovl_readdir_data, ctx);
>
> rdd->count++;
> - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
> + p = ovl_cache_entry_new(rdd, name, namelen, NULL, 0, ino, d_type);
> if (p == NULL) {
> rdd->err = -ENOMEM;
> return false;
> @@ -1023,6 +1096,8 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>
> del_entry:
> list_del(&p->l_node);
> + if (p->c_name != p->name)
> + kfree(p->c_name);
> kfree(p);
> }
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding
2025-08-22 14:17 ` [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding André Almeida
@ 2025-08-25 11:17 ` Gabriel Krisman Bertazi
2025-08-25 15:32 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 11:17 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> When merging layers from different filesystems with casefold enabled,
> all layers should use the same encoding version and have the same flags
> to avoid any kind of incompatibility issues.
>
> Also, set the encoding and the encoding flags for the ovl super block as
> the same as used by the first valid layer.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index df85a76597e910d00323018f1d2cd720c5db921d..b1dbd3c79961094d00c7f99cc622e515d544d22f 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -991,6 +991,18 @@ static int ovl_get_data_fsid(struct ovl_fs *ofs)
> return ofs->numfs;
> }
>
> +/*
> + * Set the ovl sb encoding as the same one used by the first layer
> + */
> +static void ovl_set_encoding(struct super_block *sb, struct super_block *fs_sb)
> +{
> +#if IS_ENABLED(CONFIG_UNICODE)
> + if (sb_has_encoding(fs_sb)) {
> + sb->s_encoding = fs_sb->s_encoding;
> + sb->s_encoding_flags = fs_sb->s_encoding_flags;
> + }
> +#endif
> +}
>
> static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> struct ovl_fs_context *ctx, struct ovl_layer *layers)
> @@ -1024,6 +1036,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> if (ovl_upper_mnt(ofs)) {
> ofs->fs[0].sb = ovl_upper_mnt(ofs)->mnt_sb;
> ofs->fs[0].is_lower = false;
> +
> + if (ofs->casefold)
> + ovl_set_encoding(sb, ofs->fs[0].sb);
> }
>
> nr_merged_lower = ctx->nr - ctx->nr_data;
> @@ -1083,6 +1098,16 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> l->name = NULL;
> ofs->numlayer++;
> ofs->fs[fsid].is_lower = true;
> +
> + if (ofs->casefold) {
> + if (!ovl_upper_mnt(ofs) && !sb_has_encoding(sb))
> + ovl_set_encoding(sb, ofs->fs[fsid].sb);
> +
> + if (!sb_has_encoding(sb) || !sb_same_encoding(sb, mnt->mnt_sb)) {
Minor nit, but isn't the sb_has_encoding() check redundant here? sb_same_encoding()
will already check that sb->s_encoding matches the mnt_sb encoding.
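To see which inputs each check rejects, here is a userspace model of the CONFIG_UNICODE branch of the sb_same_encoding() helper from patch 2/9. The model_* types and names are stand-ins, not kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for struct unicode_map and struct super_block. */
struct model_map {
	unsigned int version;
};

struct model_sb {
	const struct model_map *s_encoding;
	unsigned short s_encoding_flags;
};

/* Mirrors the CONFIG_UNICODE branch of the quoted sb_same_encoding(). */
static bool model_same_encoding(const struct model_sb *sb1,
				const struct model_sb *sb2)
{
	/* Same map pointer, including NULL on both sides, counts as same. */
	if (sb1->s_encoding == sb2->s_encoding)
		return true;

	return sb1->s_encoding && sb2->s_encoding &&
	       sb1->s_encoding->version == sb2->s_encoding->version &&
	       sb1->s_encoding_flags == sb2->s_encoding_flags;
}
```

In this model two super blocks with no encoding at all compare as "same", so the helper alone would accept that pair. Whether the extra check is redundant therefore depends on whether that state can be reached once ofs->casefold is set.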
> + pr_err("all layers must have the same encoding\n");
> + return -EINVAL;
> + }
> + }
> }
>
> /*
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb
2025-08-22 14:17 ` [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb André Almeida
@ 2025-08-25 11:24 ` Gabriel Krisman Bertazi
2025-08-25 15:34 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 11:24 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> For filesystems with encoding (i.e. with case-insensitive support), set
> the dentry operations for the super block as ovl_dentry_ci_operations.
>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> Changes in v6:
> - Fix kernel bot warning: unused variable 'ofs'
> ---
> fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index b1dbd3c79961094d00c7f99cc622e515d544d22f..8db4e55d5027cb975fec9b92251f62fe5924af4f 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -161,6 +161,16 @@ static const struct dentry_operations ovl_dentry_operations = {
> .d_weak_revalidate = ovl_dentry_weak_revalidate,
> };
>
> +#if IS_ENABLED(CONFIG_UNICODE)
> +static const struct dentry_operations ovl_dentry_ci_operations = {
> + .d_real = ovl_d_real,
> + .d_revalidate = ovl_dentry_revalidate,
> + .d_weak_revalidate = ovl_dentry_weak_revalidate,
> + .d_hash = generic_ci_d_hash,
> + .d_compare = generic_ci_d_compare,
> +};
> +#endif
> +
> static struct kmem_cache *ovl_inode_cachep;
>
> static struct inode *ovl_alloc_inode(struct super_block *sb)
> @@ -1332,6 +1342,19 @@ static struct dentry *ovl_get_root(struct super_block *sb,
> return root;
> }
>
> +static void ovl_set_d_op(struct super_block *sb)
> +{
> +#if IS_ENABLED(CONFIG_UNICODE)
> + struct ovl_fs *ofs = sb->s_fs_info;
> +
> + if (ofs->casefold) {
> + set_default_d_op(sb, &ovl_dentry_ci_operations);
> + return;
> + }
> +#endif
> + set_default_d_op(sb, &ovl_dentry_operations);
> +}
> +
> int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
> {
> struct ovl_fs *ofs = sb->s_fs_info;
> @@ -1443,6 +1466,8 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
> if (IS_ERR(oe))
> goto out_err;
>
> + ovl_set_d_op(sb);
> +
Absolutely minor, but fill_super is now calling
set_default_d_op(sb, &ovl_dentry_operations) twice, once here and once
at the beginning of the function. You can remove the original call.
> /* If the upper fs is nonexistent, we mark overlayfs r/o too */
> if (!ovl_upper_mnt(ofs))
> sb->s_flags |= SB_RDONLY;
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 1/9] fs: Create sb_encoding() helper
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
2025-08-25 9:19 ` Gabriel Krisman Bertazi
@ 2025-08-25 12:38 ` Gabriel Krisman Bertazi
2025-08-25 15:28 ` Amir Goldstein
1 sibling, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 12:38 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Amir Goldstein, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
André Almeida <andrealmeid@igalia.com> writes:
> Filesystems that need to deal with the super block encoding need to use
> a if IS_ENABLED(CONFIG_UNICODE) around it because this struct member is
> not declared otherwise. In order to move this if/endif guards outside of
> the filesystem code and make it simpler, create a new function that
> returns the s_encoding member of struct super_block if Unicode is
> enabled, and return NULL otherwise.
>
> Suggested-by: Amir Goldstein <amir73il@gmail.com>
> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> include/linux/fs.h | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index e1d4fef5c181d291a7c685e5897b2c018df439ae..a4d353a871b094b562a87ddcffe8336a26c5a3e2 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3733,15 +3733,20 @@ static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qst
> }
> #endif
>
> -static inline bool sb_has_encoding(const struct super_block *sb)
> +static inline struct unicode_map *sb_encoding(const struct super_block *sb)
> {
> #if IS_ENABLED(CONFIG_UNICODE)
> - return !!sb->s_encoding;
> + return sb->s_encoding;
> #else
> - return false;
> + return NULL;
> #endif
> }
>
> +static inline bool sb_has_encoding(const struct super_block *sb)
> +{
> + return !!sb_encoding(sb);
> +}
> +
FWIW, sb_has_encoding is completely superfluous now. It is also only
used by overlayfs itself, so it should be easy to drop in favor of your
new helper in the following patches. It even has a smaller function
name :)
> int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
> unsigned int ia_valid);
> int setattr_prepare(struct mnt_idmap *, struct dentry *, struct iattr *);
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-22 19:17 ` Amir Goldstein
@ 2025-08-25 13:31 ` André Almeida
2025-08-26 7:31 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-25 13:31 UTC (permalink / raw)
To: Amir Goldstein
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
Hi Amir,
On 22/08/2025 16:17, Amir Goldstein wrote:
[...]
/*
>>>> - * Allow filesystems that are case-folding capable but deny composing
>>>> - * ovl stack from case-folded directories.
>>>> + * Exceptionally for layers with casefold, we accept that they have
>>>> + * their own hash and compare operations
>>>> */
>>>> - if (sb_has_encoding(dentry->d_sb))
>>>> - return IS_CASEFOLDED(d_inode(dentry));
>>>> + if (ofs->casefold)
>>>> + return false;
>>>
>>> I think this is better as:
>>> if (sb_has_encoding(dentry->d_sb))
>>> return false;
>>>
>
> And this still fails the test "Casefold enabled" for me.
>
> Maybe you are confused because this does not look like
> a test failure. It looks like this:
>
> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
> casefold
> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
> name='subdir', err=-116): parent wrong casefold
> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
> name='casefold', err=-66): child wrong casefold
> [19:10:24] [not run]
> generic/999 -- overlayfs does not support casefold enabled layers
> Ran: generic/999
> Not run: generic/999
> Passed all 1 tests
>
This is how the test output looks before my changes[1] to the test:
$ ./run.sh
FSTYP -- ext4
PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
MKFS_OPTIONS -- -F /dev/vdc
MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
generic/999 1s ... [not run] overlayfs does not support casefold enabled
layers
Ran: generic/999
Not run: generic/999
Passed all 1 tests
And this is how it looks after my changes[1] to the test:
$ ./run.sh
FSTYP -- ext4
PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
MKFS_OPTIONS -- -F /dev/vdc
MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
generic/999 1s
Ran: generic/999
Passed all 1 tests
So, as far as I can tell, the "Casefold enabled" part of the test is no
longer skipped after the fix to the test.
[1]
https://lore.kernel.org/lkml/5da6b0f4-2730-4783-9c57-c46c2d13e848@igalia.com/
> I'm not sure I will keep the test this way. This is not very standard nor
> good practice, to run half of the test and then skip it.
> I would probably split it into two tests.
> The first one as it is now will run to completion on kernels >= v6.17
> and the Casefold enable test will run on kernels >= v6.18.
>
> In any case, please make sure that the test is not skipped when testing
> Casefold enabled layers
>
> And then continue with the missing test cases.
>
> When you have a test that passes please send the test itself or
> a fstest branch for me to test.
Ok!
>
> Thanks,
> Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-25 11:09 ` Gabriel Krisman Bertazi
@ 2025-08-25 15:27 ` Amir Goldstein
2025-08-25 15:45 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-25 15:27 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
<gabriel@krisman.be> wrote:
>
> André Almeida <andrealmeid@igalia.com> writes:
>
> > To add overlayfs support casefold layers, create a new function
> > ovl_casefold(), to be able to do case-insensitive strncmp().
> >
> > ovl_casefold() allocates a new buffer and stores the casefolded version
> > of the string on it. If the allocation or the casefold operation fails,
> > fallback to use the original string.
> >
> > The case-insensitive name is then used in the rb-tree search/insertion
> > operation. If the name is found in the rb-tree, the name can be
> > discarded and the buffer is freed. If the name isn't found, it's then
> > stored at struct ovl_cache_entry to be used later.
> >
> > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > Signed-off-by: André Almeida <andrealmeid@igalia.com>
> > ---
> > Changes from v6:
> > - Last version was using `strncmp(... tmp->len)` which was causing
> > regressions. It should be `strncmp(... len)`.
> > - Rename cf_len to c_len
> > - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
> > - Remove needless kfree(cf_name)
> > ---
> > fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
> > 1 file changed, 94 insertions(+), 19 deletions(-)
> >
> > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
> > --- a/fs/overlayfs/readdir.c
> > +++ b/fs/overlayfs/readdir.c
> > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
> > bool is_upper;
> > bool is_whiteout;
> > bool check_xwhiteout;
> > + const char *c_name;
> > + int c_len;
> > char name[];
> > };
> >
> > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
> > struct list_head *list;
> > struct list_head middle;
> > struct ovl_cache_entry *first_maybe_whiteout;
> > + struct unicode_map *map;
> > int count;
> > int err;
> > bool is_upper;
> > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
> > return rb_entry(n, struct ovl_cache_entry, node);
> > }
> >
> > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
> > +{
> > + const struct qstr qstr = { .name = str, .len = len };
> > + int cf_len;
> > +
> > + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
> > + return 0;
> > +
> > + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
> > +
> > + if (dst) {
> > + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
> > +
> > + if (cf_len > 0)
> > + return cf_len;
> > + }
> > +
> > + kfree(*dst);
> > + return 0;
> > +}
>
> Hi,
>
> I should just note this does not differentiate allocation errors from
> casefolding errors (invalid encoding). It might be just a theoretical
> error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
> operation is likely to fail too, but if you have an allocation failure, you
> can end up with an inconsistent cache, because a file is added under the
> !casefolded name and a later successful lookup will look for the
> casefolded version.
Good point.
I will fix this in my tree.
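
A rough userspace sketch of what separating the two failure modes could
look like. Everything here is an assumption for illustration: toy_casefold()
is a stand-in for utf8_casefold() that only handles ASCII, casefold_name()
is a hypothetical name, and malloc()/free() replace kmalloc()/kfree(). The
sketch also tests *dst after the allocation, since dst itself is the
caller's out-parameter and is never NULL on this path.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

#define NAME_MAX_SKETCH 255

/*
 * Toy stand-in for utf8_casefold(): lowercases ASCII and rejects any byte
 * >= 0x80 as "invalid encoding". Returns the folded length or -EINVAL.
 */
static int toy_casefold(const char *str, int len, char *dst, int dst_len)
{
	if (len > dst_len)
		return -EINVAL;
	for (int i = 0; i < len; i++) {
		unsigned char c = (unsigned char)str[i];

		if (c >= 0x80)
			return -EINVAL;
		dst[i] = (char)tolower(c);
	}
	return len;
}

/*
 * Returns the folded length (> 0) with *dst owning a new buffer,
 * -ENOMEM if the buffer could not be allocated, or -EINVAL if the name
 * could not be folded. On any error *dst is left NULL.
 */
static int casefold_name(const char *str, int len, char **dst)
{
	int cf_len;

	*dst = malloc(NAME_MAX_SKETCH);
	if (!*dst)		/* test the buffer, not the out-parameter */
		return -ENOMEM;

	cf_len = toy_casefold(str, len, *dst, NAME_MAX_SKETCH);
	if (cf_len <= 0) {
		free(*dst);
		*dst = NULL;
		return -EINVAL;
	}
	return cf_len;
}
```

With distinct return values the caller can decide per error whether to
fall back to the original name or to fail the cache fill.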
>
> > +
> > static bool ovl_cache_entry_find_link(const char *name, int len,
> > struct rb_node ***link,
> > struct rb_node **parent)
> > @@ -79,10 +103,10 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
> >
> > *parent = *newp;
> > tmp = ovl_cache_entry_from_node(*newp);
> > - cmp = strncmp(name, tmp->name, len);
> > + cmp = strncmp(name, tmp->c_name, len);
> > if (cmp > 0)
> > newp = &tmp->node.rb_right;
> > - else if (cmp < 0 || len < tmp->len)
> > + else if (cmp < 0 || len < tmp->c_len)
> > newp = &tmp->node.rb_left;
> > else
> > found = true;
> > @@ -101,10 +125,10 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
> > while (node) {
> > struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
> >
> > - cmp = strncmp(name, p->name, len);
> > + cmp = strncmp(name, p->c_name, len);
> > if (cmp > 0)
> > node = p->node.rb_right;
> > - else if (cmp < 0 || len < p->len)
> > + else if (cmp < 0 || len < p->c_len)
> > node = p->node.rb_left;
> > else
> > return p;
> > @@ -145,6 +169,7 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
> >
> > static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> > const char *name, int len,
> > + const char *c_name, int c_len,
> > u64 ino, unsigned int d_type)
> > {
> > struct ovl_cache_entry *p;
> > @@ -167,6 +192,14 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> > /* Defer check for overlay.whiteout to ovl_iterate() */
> > p->check_xwhiteout = rdd->in_xwhiteouts_dir && d_type == DT_REG;
> >
> > + if (c_name && c_name != name) {
> > + p->c_name = c_name;
> > + p->c_len = c_len;
> > + } else {
> > + p->c_name = p->name;
> > + p->c_len = len;
> > + }
> > +
> > if (d_type == DT_CHR) {
> > p->next_maybe_whiteout = rdd->first_maybe_whiteout;
> > rdd->first_maybe_whiteout = p;
> > @@ -174,48 +207,55 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
> > return p;
> > }
> >
> > -static bool ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
> > - const char *name, int len, u64 ino,
> > +/* Return 0 for found, 1 for added, <0 for error */
> > +static int ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
> > + const char *name, int len,
> > + const char *c_name, int c_len,
> > + u64 ino,
> > unsigned int d_type)
> > {
> > struct rb_node **newp = &rdd->root->rb_node;
> > struct rb_node *parent = NULL;
> > struct ovl_cache_entry *p;
> >
> > - if (ovl_cache_entry_find_link(name, len, &newp, &parent))
> > - return true;
> > + if (ovl_cache_entry_find_link(c_name, c_len, &newp, &parent))
> > + return 0;
> >
> > - p = ovl_cache_entry_new(rdd, name, len, ino, d_type);
> > + p = ovl_cache_entry_new(rdd, name, len, c_name, c_len, ino, d_type);
> > if (p == NULL) {
> > rdd->err = -ENOMEM;
> > - return false;
> > + return -ENOMEM;
> > }
> >
> > list_add_tail(&p->l_node, rdd->list);
> > rb_link_node(&p->node, parent, newp);
> > rb_insert_color(&p->node, rdd->root);
> >
> > - return true;
> > + return 1;
> > }
> >
> > -static bool ovl_fill_lowest(struct ovl_readdir_data *rdd,
> > +/* Return 0 for found, 1 for added, <0 for error */
> > +static int ovl_fill_lowest(struct ovl_readdir_data *rdd,
> > const char *name, int namelen,
> > + const char *c_name, int c_len,
> > loff_t offset, u64 ino, unsigned int d_type)
> > {
> > struct ovl_cache_entry *p;
> >
> > - p = ovl_cache_entry_find(rdd->root, name, namelen);
> > + p = ovl_cache_entry_find(rdd->root, c_name, c_len);
> > if (p) {
> > list_move_tail(&p->l_node, &rdd->middle);
> > + return 0;
> > } else {
> > - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
> > + p = ovl_cache_entry_new(rdd, name, namelen, c_name, c_len,
> > + ino, d_type);
> > if (p == NULL)
> > rdd->err = -ENOMEM;
> > else
> > list_add_tail(&p->l_node, &rdd->middle);
> > }
> >
> > - return rdd->err == 0;
> > + return rdd->err ?: 1;
> > }
> >
> > void ovl_cache_free(struct list_head *list)
> > @@ -223,8 +263,11 @@ void ovl_cache_free(struct list_head *list)
> > struct ovl_cache_entry *p;
> > struct ovl_cache_entry *n;
> >
> > - list_for_each_entry_safe(p, n, list, l_node)
> > + list_for_each_entry_safe(p, n, list, l_node) {
> > + if (p->c_name != p->name)
> > + kfree(p->c_name);
> > kfree(p);
> > + }
> >
> > INIT_LIST_HEAD(list);
> > }
> > @@ -260,12 +303,36 @@ static bool ovl_fill_merge(struct dir_context *ctx, const char *name,
> > {
> > struct ovl_readdir_data *rdd =
> > container_of(ctx, struct ovl_readdir_data, ctx);
> > + struct ovl_fs *ofs = OVL_FS(rdd->dentry->d_sb);
> > + const char *c_name = NULL;
> > + char *cf_name = NULL;
> > + int c_len = 0, ret;
> > +
> > + if (ofs->casefold)
> > + c_len = ovl_casefold(rdd->map, name, namelen, &cf_name);
> > +
> > + if (c_len <= 0) {
> > + c_name = name;
> > + c_len = namelen;
> > + } else {
> > + c_name = cf_name;
> > + }
> >
> > rdd->count++;
> > if (!rdd->is_lowest)
> > - return ovl_cache_entry_add_rb(rdd, name, namelen, ino, d_type);
> > + ret = ovl_cache_entry_add_rb(rdd, name, namelen, c_name, c_len, ino, d_type);
> > else
> > - return ovl_fill_lowest(rdd, name, namelen, offset, ino, d_type);
> > + ret = ovl_fill_lowest(rdd, name, namelen, c_name, c_len, offset, ino, d_type);
> > +
> > + /*
> > + * If ret == 1, that means that c_name is being used as part of struct
> > + * ovl_cache_entry and will be freed at ovl_cache_free(). Otherwise,
> > + * c_name was found in the rb-tree so we can free it here.
> > + */
> > + if (ret != 1 && c_name != name)
> > + kfree(c_name);
> > +
>
> The semantics of this being conditionally freed is a bit annoying, as
> it is already replicated in 3 places. I suppose a helper would come in
> handy.
Yeh.
I have already used ovl_cache_entry_free() in my tree.
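
For readers following along, a userspace sketch of what such a helper might
look like. This is not the code from that tree, which is not shown in this
thread: struct cache_entry is a reduced stand-in for struct ovl_cache_entry,
the function names are hypothetical, and malloc()/free() replace the kernel
allocators. The idea is that only one function knows whether c_name is a
separate buffer or an alias of the inline name.

```c
#include <stdlib.h>
#include <string.h>

/* Reduced userspace stand-in for struct ovl_cache_entry. */
struct cache_entry {
	const char *c_name;	/* folded name, or alias of name[] */
	int c_len;
	int len;
	char name[];		/* original name, stored inline */
};

/* Hypothetical helper: the single place that knows who owns c_name. */
static void cache_entry_free(struct cache_entry *p)
{
	if (!p)
		return;
	if (p->c_name != p->name)
		free((void *)p->c_name);
	free(p);
}

/* Takes ownership of c_name when it is a separate (non-NULL) buffer. */
static struct cache_entry *cache_entry_new(const char *name, int len,
					   char *c_name, int c_len)
{
	struct cache_entry *p = malloc(sizeof(*p) + len + 1);

	if (!p)
		return NULL;
	memcpy(p->name, name, len);
	p->name[len] = '\0';
	p->len = len;
	if (c_name) {
		p->c_name = c_name;
		p->c_len = c_len;
	} else {
		p->c_name = p->name;
		p->c_len = len;
	}
	return p;
}
```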
Thanks,
Amir.
>
> In this specific case, it could just be:
>
> if (ret != 1)
> kfree(cf_name);
>
>
> > + return ret >= 0;
> > }
> >
> > static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data *rdd)
> > @@ -357,12 +424,18 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list,
> > .list = list,
> > .root = root,
> > .is_lowest = false,
> > + .map = NULL,
> > };
> > int idx, next;
> > const struct ovl_layer *layer;
> > + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
> >
> > for (idx = 0; idx != -1; idx = next) {
> > next = ovl_path_next(idx, dentry, &realpath, &layer);
> > +
> > + if (ofs->casefold)
> > + rdd.map = sb_encoding(realpath.dentry->d_sb);
> > +
> > rdd.is_upper = ovl_dentry_upper(dentry) == realpath.dentry;
> > rdd.in_xwhiteouts_dir = layer->has_xwhiteouts &&
> > ovl_dentry_has_xwhiteouts(dentry);
> > @@ -555,7 +628,7 @@ static bool ovl_fill_plain(struct dir_context *ctx, const char *name,
> > container_of(ctx, struct ovl_readdir_data, ctx);
> >
> > rdd->count++;
> > - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
> > + p = ovl_cache_entry_new(rdd, name, namelen, NULL, 0, ino, d_type);
> > if (p == NULL) {
> > rdd->err = -ENOMEM;
> > return false;
> > @@ -1023,6 +1096,8 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
> >
> > del_entry:
> > list_del(&p->l_node);
> > + if (p->c_name != p->name)
> > + kfree(p->c_name);
> > kfree(p);
> > }
>
> --
> Gabriel Krisman Bertazi
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 1/9] fs: Create sb_encoding() helper
2025-08-25 12:38 ` Gabriel Krisman Bertazi
@ 2025-08-25 15:28 ` Amir Goldstein
0 siblings, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-25 15:28 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Mon, Aug 25, 2025 at 2:38 PM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>
> André Almeida <andrealmeid@igalia.com> writes:
>
> > Filesystems that need to deal with the super block encoding need to use
> > a if IS_ENABLED(CONFIG_UNICODE) around it because this struct member is
> > not declared otherwise. In order to move this if/endif guards outside of
> > the filesystem code and make it simpler, create a new function
> > returns the s_encoding member of struct super_block if Unicode is
> > enabled, and return NULL otherwise.
> >
> > Suggested-by: Amir Goldstein <amir73il@gmail.com>
> > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > Signed-off-by: André Almeida <andrealmeid@igalia.com>
> > ---
> > include/linux/fs.h | 11 ++++++++---
> > 1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index e1d4fef5c181d291a7c685e5897b2c018df439ae..a4d353a871b094b562a87ddcffe8336a26c5a3e2 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -3733,15 +3733,20 @@ static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qst
> > }
> > #endif
> >
> > -static inline bool sb_has_encoding(const struct super_block *sb)
> > +static inline struct unicode_map *sb_encoding(const struct super_block *sb)
> > {
> > #if IS_ENABLED(CONFIG_UNICODE)
> > - return !!sb->s_encoding;
> > + return sb->s_encoding;
> > #else
> > - return false;
> > + return NULL;
> > #endif
> > }
> >
> > +static inline bool sb_has_encoding(const struct super_block *sb)
> > +{
> > + return !!sb_encoding(sb);
> > +}
> > +
>
> FWIW, sb_has_encoding is completely superfluous now. It is also only
> used by overlayfs itself, so it should be easy to drop in favor of your
> new helper in the following patches. It even has a smaller function
> name :)
Heh. ok maybe we should.
I'll wait for Christian to decide how he would like to funnel those helpers
and maybe he has an opinion.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding
2025-08-25 11:17 ` Gabriel Krisman Bertazi
@ 2025-08-25 15:32 ` Amir Goldstein
2025-08-26 20:12 ` André Almeida
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-25 15:32 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Mon, Aug 25, 2025 at 1:17 PM Gabriel Krisman Bertazi
<gabriel@krisman.be> wrote:
>
> André Almeida <andrealmeid@igalia.com> writes:
>
> > When merging layers from different filesystems with casefold enabled,
> > all layers should use the same encoding version and have the same flags
> > to avoid any kind of incompatibility issues.
> >
> > Also, set the encoding and the encoding flags for the ovl super block as
> > the same as used by the first valid layer.
> >
> > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > Signed-off-by: André Almeida <andrealmeid@igalia.com>
> > ---
> > fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
> > 1 file changed, 25 insertions(+)
> >
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index df85a76597e910d00323018f1d2cd720c5db921d..b1dbd3c79961094d00c7f99cc622e515d544d22f 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -991,6 +991,18 @@ static int ovl_get_data_fsid(struct ovl_fs *ofs)
> > return ofs->numfs;
> > }
> >
> > +/*
> > + * Set the ovl sb encoding as the same one used by the first layer
> > + */
> > +static void ovl_set_encoding(struct super_block *sb, struct super_block *fs_sb)
> > +{
> > +#if IS_ENABLED(CONFIG_UNICODE)
> > + if (sb_has_encoding(fs_sb)) {
> > + sb->s_encoding = fs_sb->s_encoding;
> > + sb->s_encoding_flags = fs_sb->s_encoding_flags;
> > + }
> > +#endif
> > +}
> >
> > static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> > struct ovl_fs_context *ctx, struct ovl_layer *layers)
> > @@ -1024,6 +1036,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> > if (ovl_upper_mnt(ofs)) {
> > ofs->fs[0].sb = ovl_upper_mnt(ofs)->mnt_sb;
> > ofs->fs[0].is_lower = false;
> > +
> > + if (ofs->casefold)
> > + ovl_set_encoding(sb, ofs->fs[0].sb);
> > }
> >
> > nr_merged_lower = ctx->nr - ctx->nr_data;
> > @@ -1083,6 +1098,16 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> > l->name = NULL;
> > ofs->numlayer++;
> > ofs->fs[fsid].is_lower = true;
> > +
> > + if (ofs->casefold) {
> > + if (!ovl_upper_mnt(ofs) && !sb_has_encoding(sb))
> > + ovl_set_encoding(sb, ofs->fs[fsid].sb);
> > +
> > + if (!sb_has_encoding(sb) || !sb_same_encoding(sb, mnt->mnt_sb)) {
>
> Minor nit, but isn't the sb_has_encoding() check redundant here? sb_same_encoding
> will check the sb->encoding matches the mnt_sb already.
Maybe we did something wrong, but the intention was:
If the root directories of all layers are casefold disabled (or casefold
is not supported), then a mix of layers on filesystems with different
encodings (and filesystems with no encoding support) is allowed, because
we make sure that all directories are always casefold disabled.
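
A userspace sketch of that intention, under stated assumptions: the structs
are simplified stand-ins, same_encoding() is a mock comparison written for
this example (not the kernel's sb_same_encoding()), and layers_acceptable()
is a hypothetical name. Only when casefold is enabled must every layer have
an encoding that matches the first layer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types, for illustration only. */
struct unicode_map { unsigned int version; };
struct super_block {
	struct unicode_map *s_encoding;	/* NULL: no encoding support */
	unsigned int s_encoding_flags;
};

/* Mock comparison: both without encoding, or same version and flags. */
static bool same_encoding(const struct super_block *a,
			  const struct super_block *b)
{
	if (!a->s_encoding || !b->s_encoding)
		return a->s_encoding == b->s_encoding;
	return a->s_encoding->version == b->s_encoding->version &&
	       a->s_encoding_flags == b->s_encoding_flags;
}

/* Hypothetical mount-time check over all layers of the stack. */
static bool layers_acceptable(bool casefold,
			      const struct super_block *const *layers,
			      size_t n)
{
	if (!casefold)
		return true;	/* every directory stays casefold disabled */

	for (size_t i = 0; i < n; i++) {
		if (!layers[i]->s_encoding ||
		    !same_encoding(layers[0], layers[i]))
			return false;
	}
	return true;
}
```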
Thanks,
Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb
2025-08-25 11:24 ` Gabriel Krisman Bertazi
@ 2025-08-25 15:34 ` Amir Goldstein
2025-08-26 20:13 ` André Almeida
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-25 15:34 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Mon, Aug 25, 2025 at 1:24 PM Gabriel Krisman Bertazi
<gabriel@krisman.be> wrote:
>
> André Almeida <andrealmeid@igalia.com> writes:
>
> > For filesystems with encoding (i.e. with case-insensitive support), set
> > the dentry operations for the super block as ovl_dentry_ci_operations.
> >
> > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > Signed-off-by: André Almeida <andrealmeid@igalia.com>
> > ---
> > Changes in v6:
> > - Fix kernel bot warning: unused variable 'ofs'
> > ---
> > fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
> > 1 file changed, 25 insertions(+)
> >
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index b1dbd3c79961094d00c7f99cc622e515d544d22f..8db4e55d5027cb975fec9b92251f62fe5924af4f 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -161,6 +161,16 @@ static const struct dentry_operations ovl_dentry_operations = {
> > .d_weak_revalidate = ovl_dentry_weak_revalidate,
> > };
> >
> > +#if IS_ENABLED(CONFIG_UNICODE)
> > +static const struct dentry_operations ovl_dentry_ci_operations = {
> > + .d_real = ovl_d_real,
> > + .d_revalidate = ovl_dentry_revalidate,
> > + .d_weak_revalidate = ovl_dentry_weak_revalidate,
> > + .d_hash = generic_ci_d_hash,
> > + .d_compare = generic_ci_d_compare,
> > +};
> > +#endif
> > +
> > static struct kmem_cache *ovl_inode_cachep;
> >
> > static struct inode *ovl_alloc_inode(struct super_block *sb)
> > @@ -1332,6 +1342,19 @@ static struct dentry *ovl_get_root(struct super_block *sb,
> > return root;
> > }
> >
> > +static void ovl_set_d_op(struct super_block *sb)
> > +{
> > +#if IS_ENABLED(CONFIG_UNICODE)
> > + struct ovl_fs *ofs = sb->s_fs_info;
> > +
> > + if (ofs->casefold) {
> > + set_default_d_op(sb, &ovl_dentry_ci_operations);
> > + return;
> > + }
> > +#endif
> > + set_default_d_op(sb, &ovl_dentry_operations);
> > +}
> > +
> > int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
> > {
> > struct ovl_fs *ofs = sb->s_fs_info;
> > @@ -1443,6 +1466,8 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
> > if (IS_ERR(oe))
> > goto out_err;
> >
> > + ovl_set_d_op(sb);
> > +
>
> Absolutely minor, but fill_super is now calling
> set_default_d_op(sb, &ovl_dentry_operations) twice, once here and once
> at the beginning of the function. You can remove the original call.
Good catch!
That was not my intention at all.
I asked to replace the set_default_d_op() call with ovl_set_d_op()
but I missed that in the review.
Will fix it in my tree.
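
A small userspace mock of the intended end state, assuming made-up types
and names throughout (mock_sb, mock_set_default_d_op() and mock_fill_super()
are illustrative only): the selection helper is the only caller that
installs the dentry operations, so they are set exactly once per mount.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock types; the real structures carry far more state. */
struct dentry_operations { const char *label; };
struct mock_sb {
	const struct dentry_operations *s_d_op;
	bool casefold;
	int set_calls;		/* counts installs, for the example only */
};

static const struct dentry_operations plain_ops = { "plain" };
static const struct dentry_operations ci_ops = { "case-insensitive" };

/* Mock of the install step. */
static void mock_set_default_d_op(struct mock_sb *sb,
				  const struct dentry_operations *ops)
{
	sb->s_d_op = ops;
	sb->set_calls++;
}

/* Pick the table once, based on the casefold setting. */
static void mock_set_d_op(struct mock_sb *sb)
{
	mock_set_default_d_op(sb, sb->casefold ? &ci_ops : &plain_ops);
}

/* The fill path calls only the selection helper, with no earlier install. */
static void mock_fill_super(struct mock_sb *sb)
{
	mock_set_d_op(sb);
}
```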
Thanks!
Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-25 15:27 ` Amir Goldstein
@ 2025-08-25 15:45 ` Amir Goldstein
2025-08-25 17:11 ` Gabriel Krisman Bertazi
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-25 15:45 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
> <gabriel@krisman.be> wrote:
> >
> > André Almeida <andrealmeid@igalia.com> writes:
> >
> > > To add overlayfs support casefold layers, create a new function
> > > ovl_casefold(), to be able to do case-insensitive strncmp().
> > >
> > > ovl_casefold() allocates a new buffer and stores the casefolded version
> > > of the string on it. If the allocation or the casefold operation fails,
> > > fallback to use the original string.
> > >
> > > > The case-insensitive name is then used in the rb-tree search/insertion
> > > operation. If the name is found in the rb-tree, the name can be
> > > discarded and the buffer is freed. If the name isn't found, it's then
> > > stored at struct ovl_cache_entry to be used later.
> > >
> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> > > Signed-off-by: André Almeida <andrealmeid@igalia.com>
> > > ---
> > > Changes from v6:
> > > - Last version was using `strncmp(... tmp->len)` which was causing
> > > regressions. It should be `strncmp(... len)`.
> > > - Rename cf_len to c_len
> > > - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
> > > - Remove needless kfree(cf_name)
> > > ---
> > > fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
> > > 1 file changed, 94 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> > > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
> > > --- a/fs/overlayfs/readdir.c
> > > +++ b/fs/overlayfs/readdir.c
> > > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
> > > bool is_upper;
> > > bool is_whiteout;
> > > bool check_xwhiteout;
> > > + const char *c_name;
> > > + int c_len;
> > > char name[];
> > > };
> > >
> > > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
> > > struct list_head *list;
> > > struct list_head middle;
> > > struct ovl_cache_entry *first_maybe_whiteout;
> > > + struct unicode_map *map;
> > > int count;
> > > int err;
> > > bool is_upper;
> > > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
> > > return rb_entry(n, struct ovl_cache_entry, node);
> > > }
> > >
> > > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
> > > +{
> > > + const struct qstr qstr = { .name = str, .len = len };
> > > + int cf_len;
> > > +
> > > + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
> > > + return 0;
> > > +
> > > + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
> > > +
> > > + if (dst) {
> > > + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
> > > +
> > > + if (cf_len > 0)
> > > + return cf_len;
> > > + }
> > > +
> > > + kfree(*dst);
> > > + return 0;
> > > +}
> >
> > Hi,
> >
> > I should just note this does not differentiate allocation errors from
> > casefolding errors (invalid encoding). It might be just a theoretical
> > error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
> > operation is likely to fail too, but if you have an allocation failure, you
> > can end up with an inconsistent cache, because a file is added under the
> > !casefolded name and a later successful lookup will look for the
> > casefolded version.
>
> Good point.
> I will fix this in my tree.
Wait, why should we not fail to fill the cache for both allocation
and encoding errors?
Thanks,
Amir.
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-25 15:45 ` Amir Goldstein
@ 2025-08-25 17:11 ` Gabriel Krisman Bertazi
2025-08-26 1:34 ` Gabriel Krisman Bertazi
0 siblings, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-25 17:11 UTC (permalink / raw)
To: Amir Goldstein
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
Amir Goldstein <amir73il@gmail.com> writes:
> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
>>
>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>> <gabriel@krisman.be> wrote:
>> >
>> > André Almeida <andrealmeid@igalia.com> writes:
>> >
>> > > To add overlayfs support casefold layers, create a new function
>> > > ovl_casefold(), to be able to do case-insensitive strncmp().
>> > >
>> > > ovl_casefold() allocates a new buffer and stores the casefolded version
>> > > of the string on it. If the allocation or the casefold operation fails,
>> > > fallback to use the original string.
>> > >
>> > > The case-insensitive name is then used in the rb-tree search/insertion
>> > > operation. If the name is found in the rb-tree, the name can be
>> > > discarded and the buffer is freed. If the name isn't found, it's then
>> > > stored at struct ovl_cache_entry to be used later.
>> > >
>> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>> > > Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> > > ---
>> > > Changes from v6:
>> > > - Last version was using `strncmp(... tmp->len)` which was causing
>> > > regressions. It should be `strncmp(... len)`.
>> > > - Rename cf_len to c_len
>> > > - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>> > > - Remove needless kfree(cf_name)
>> > > ---
>> > > fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>> > > 1 file changed, 94 insertions(+), 19 deletions(-)
>> > >
>> > > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>> > > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>> > > --- a/fs/overlayfs/readdir.c
>> > > +++ b/fs/overlayfs/readdir.c
>> > > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>> > > bool is_upper;
>> > > bool is_whiteout;
>> > > bool check_xwhiteout;
>> > > + const char *c_name;
>> > > + int c_len;
>> > > char name[];
>> > > };
>> > >
>> > > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>> > > struct list_head *list;
>> > > struct list_head middle;
>> > > struct ovl_cache_entry *first_maybe_whiteout;
>> > > + struct unicode_map *map;
>> > > int count;
>> > > int err;
>> > > bool is_upper;
>> > > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>> > > return rb_entry(n, struct ovl_cache_entry, node);
>> > > }
>> > >
>> > > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>> > > +{
>> > > + const struct qstr qstr = { .name = str, .len = len };
>> > > + int cf_len;
>> > > +
>> > > + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>> > > + return 0;
>> > > +
>> > > + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>> > > +
>> > > + if (dst) {
>> > > + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>> > > +
>> > > + if (cf_len > 0)
>> > > + return cf_len;
>> > > + }
>> > > +
>> > > + kfree(*dst);
>> > > + return 0;
>> > > +}
>> >
>> > Hi,
>> >
>> > I should just note this does not differentiate allocation errors from
>> > casefolding errors (invalid encoding). It might be just a theoretical
>> > error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>> > operation is likely to fail too, but if you have an allocation failure, you
>> > can end up with an inconsistent cache, because a file is added under the
>> > !casefolded name and a later successful lookup will look for the
>> > casefolded version.
>>
>> Good point.
>> I will fix this in my tree.
>
> wait why should we not fail to fill the cache for both allocation
> and encoding errors?
>
We shouldn't fail the cache for encoding errors, just for allocation errors.
Perhaps I am misreading the code, so please correct me if I'm wrong. If
ovl_casefold() fails, the non-casefolded name is used in the cache. That
makes sense if the reason utf8_casefold() failed is that the string
cannot be casefolded (i.e. an invalid UTF-8 string). For those strings,
everything is fine. But on an allocation failure, the string might have
a real casefolded version. If we fall back to the original string as the
key, a cache lookup won't find the entry, since we compare with memcmp.
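To make the distinction concrete, here is a minimal userspace sketch, not the kernel code. It assumes a hypothetical toy_casefold() that only lowercases ASCII and rejects any byte >= 0x80 as a stand-in for utf8_casefold(), and a hypothetical fail_alloc parameter that simulates an allocation failure. The point is only the policy: an encoding failure falls back to the raw name as the key, while an allocation failure is reported as a hard error.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NAME_MAX 255

/*
 * Hypothetical stand-in for utf8_casefold(): lowercases ASCII and returns
 * -EINVAL for any byte >= 0x80. This is NOT how the kernel folds names.
 */
static int toy_casefold(const char *str, int len, char *dst, int dst_len)
{
	if (len > dst_len)
		return -EINVAL;

	for (int i = 0; i < len; i++) {
		unsigned char c = (unsigned char)str[i];

		if (c >= 0x80)
			return -EINVAL;
		dst[i] = (char)tolower(c);
	}
	return len;
}

/*
 * Pick the key used for cache lookups.
 * On success, returns the key length and sets *key; the caller frees *key
 * only when *key != name.
 * Returns -ENOMEM when the buffer cannot be allocated, so the caller can
 * abort instead of silently inserting the raw name.
 * fail_alloc is a hypothetical knob that simulates allocation failure.
 */
static int pick_key(const char *name, int len, const char **key, int fail_alloc)
{
	char *buf = fail_alloc ? NULL : malloc(TOY_NAME_MAX);
	int cf_len;

	if (!buf)
		return -ENOMEM;		/* hard error: do not fall back */

	cf_len = toy_casefold(name, len, buf, TOY_NAME_MAX);
	if (cf_len > 0) {
		*key = buf;		/* folded key, owned by the caller */
		return cf_len;
	}

	/* Name cannot be folded: it "folds" to itself. */
	free(buf);
	*key = name;
	return len;
}
```

With this split, a caller can treat a negative return as a reason to stop filling the cache, and any non-negative return as a usable key.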
>
>>
>> >
>> > > +
>> > > static bool ovl_cache_entry_find_link(const char *name, int len,
>> > > struct rb_node ***link,
>> > > struct rb_node **parent)
>> > > @@ -79,10 +103,10 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
>> > >
>> > > *parent = *newp;
>> > > tmp = ovl_cache_entry_from_node(*newp);
>> > > - cmp = strncmp(name, tmp->name, len);
>> > > + cmp = strncmp(name, tmp->c_name, len);
>> > > if (cmp > 0)
>> > > newp = &tmp->node.rb_right;
>> > > - else if (cmp < 0 || len < tmp->len)
>> > > + else if (cmp < 0 || len < tmp->c_len)
>> > > newp = &tmp->node.rb_left;
>> > > else
>> > > found = true;
>> > > @@ -101,10 +125,10 @@ static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
>> > > while (node) {
>> > > struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
>> > >
>> > > - cmp = strncmp(name, p->name, len);
>> > > + cmp = strncmp(name, p->c_name, len);
>> > > if (cmp > 0)
>> > > node = p->node.rb_right;
>> > > - else if (cmp < 0 || len < p->len)
>> > > + else if (cmp < 0 || len < p->c_len)
>> > > node = p->node.rb_left;
>> > > else
>> > > return p;
>> > > @@ -145,6 +169,7 @@ static bool ovl_calc_d_ino(struct ovl_readdir_data *rdd,
>> > >
>> > > static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
>> > > const char *name, int len,
>> > > + const char *c_name, int c_len,
>> > > u64 ino, unsigned int d_type)
>> > > {
>> > > struct ovl_cache_entry *p;
>> > > @@ -167,6 +192,14 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
>> > > /* Defer check for overlay.whiteout to ovl_iterate() */
>> > > p->check_xwhiteout = rdd->in_xwhiteouts_dir && d_type == DT_REG;
>> > >
>> > > + if (c_name && c_name != name) {
>> > > + p->c_name = c_name;
>> > > + p->c_len = c_len;
>> > > + } else {
>> > > + p->c_name = p->name;
>> > > + p->c_len = len;
>> > > + }
>> > > +
>> > > if (d_type == DT_CHR) {
>> > > p->next_maybe_whiteout = rdd->first_maybe_whiteout;
>> > > rdd->first_maybe_whiteout = p;
>> > > @@ -174,48 +207,55 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
>> > > return p;
>> > > }
>> > >
>> > > -static bool ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
>> > > - const char *name, int len, u64 ino,
>> > > +/* Return 0 for found, 1 for added, <0 for error */
>> > > +static int ovl_cache_entry_add_rb(struct ovl_readdir_data *rdd,
>> > > + const char *name, int len,
>> > > + const char *c_name, int c_len,
>> > > + u64 ino,
>> > > unsigned int d_type)
>> > > {
>> > > struct rb_node **newp = &rdd->root->rb_node;
>> > > struct rb_node *parent = NULL;
>> > > struct ovl_cache_entry *p;
>> > >
>> > > - if (ovl_cache_entry_find_link(name, len, &newp, &parent))
>> > > - return true;
>> > > + if (ovl_cache_entry_find_link(c_name, c_len, &newp, &parent))
>> > > + return 0;
>> > >
>> > > - p = ovl_cache_entry_new(rdd, name, len, ino, d_type);
>> > > + p = ovl_cache_entry_new(rdd, name, len, c_name, c_len, ino, d_type);
>> > > if (p == NULL) {
>> > > rdd->err = -ENOMEM;
>> > > - return false;
>> > > + return -ENOMEM;
>> > > }
>> > >
>> > > list_add_tail(&p->l_node, rdd->list);
>> > > rb_link_node(&p->node, parent, newp);
>> > > rb_insert_color(&p->node, rdd->root);
>> > >
>> > > - return true;
>> > > + return 1;
>> > > }
>> > >
>> > > -static bool ovl_fill_lowest(struct ovl_readdir_data *rdd,
>> > > +/* Return 0 for found, 1 for added, <0 for error */
>> > > +static int ovl_fill_lowest(struct ovl_readdir_data *rdd,
>> > > const char *name, int namelen,
>> > > + const char *c_name, int c_len,
>> > > loff_t offset, u64 ino, unsigned int d_type)
>> > > {
>> > > struct ovl_cache_entry *p;
>> > >
>> > > - p = ovl_cache_entry_find(rdd->root, name, namelen);
>> > > + p = ovl_cache_entry_find(rdd->root, c_name, c_len);
>> > > if (p) {
>> > > list_move_tail(&p->l_node, &rdd->middle);
>> > > + return 0;
>> > > } else {
>> > > - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
>> > > + p = ovl_cache_entry_new(rdd, name, namelen, c_name, c_len,
>> > > + ino, d_type);
>> > > if (p == NULL)
>> > > rdd->err = -ENOMEM;
>> > > else
>> > > list_add_tail(&p->l_node, &rdd->middle);
>> > > }
>> > >
>> > > - return rdd->err == 0;
>> > > + return rdd->err ?: 1;
>> > > }
>> > >
>> > > void ovl_cache_free(struct list_head *list)
>> > > @@ -223,8 +263,11 @@ void ovl_cache_free(struct list_head *list)
>> > > struct ovl_cache_entry *p;
>> > > struct ovl_cache_entry *n;
>> > >
>> > > - list_for_each_entry_safe(p, n, list, l_node)
>> > > + list_for_each_entry_safe(p, n, list, l_node) {
>> > > + if (p->c_name != p->name)
>> > > + kfree(p->c_name);
>> > > kfree(p);
>> > > + }
>> > >
>> > > INIT_LIST_HEAD(list);
>> > > }
>> > > @@ -260,12 +303,36 @@ static bool ovl_fill_merge(struct dir_context *ctx, const char *name,
>> > > {
>> > > struct ovl_readdir_data *rdd =
>> > > container_of(ctx, struct ovl_readdir_data, ctx);
>> > > + struct ovl_fs *ofs = OVL_FS(rdd->dentry->d_sb);
>> > > + const char *c_name = NULL;
>> > > + char *cf_name = NULL;
>> > > + int c_len = 0, ret;
>> > > +
>> > > + if (ofs->casefold)
>> > > + c_len = ovl_casefold(rdd->map, name, namelen, &cf_name);
>> > > +
>> > > + if (c_len <= 0) {
>> > > + c_name = name;
>> > > + c_len = namelen;
>> > > + } else {
>> > > + c_name = cf_name;
>> > > + }
>> > >
>> > > rdd->count++;
>> > > if (!rdd->is_lowest)
>> > > - return ovl_cache_entry_add_rb(rdd, name, namelen, ino, d_type);
>> > > + ret = ovl_cache_entry_add_rb(rdd, name, namelen, c_name, c_len, ino, d_type);
>> > > else
>> > > - return ovl_fill_lowest(rdd, name, namelen, offset, ino, d_type);
>> > > + ret = ovl_fill_lowest(rdd, name, namelen, c_name, c_len, offset, ino, d_type);
>> > > +
>> > > + /*
>> > > + * If ret == 1, that means that c_name is being used as part of struct
>> > > + * ovl_cache_entry and will be freed at ovl_cache_free(). Otherwise,
>> > > + * c_name was found in the rb-tree so we can free it here.
>> > > + */
>> > > + if (ret != 1 && c_name != name)
>> > > + kfree(c_name);
>> > > +
>> >
>> > The semantics of this being conditionally freed is a bit annoying, as
>> > it is already replicated in 3 places. I suppose a helper would come in
>> > handy.
>>
>> Yeh.
>>
>> I have already used ovl_cache_entry_free() in my tree.
>>
>> Thanks,
>> Amir.
>>
>> >
>> > In this specific case, it could just be:
>> >
>> > if (ret != 1)
>> > kfree(cf_name);
>> >
>> >
>> > > + return ret >= 0;
>> > > }
>> > >
>> > > static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data *rdd)
>> > > @@ -357,12 +424,18 @@ static int ovl_dir_read_merged(struct dentry *dentry, struct list_head *list,
>> > > .list = list,
>> > > .root = root,
>> > > .is_lowest = false,
>> > > + .map = NULL,
>> > > };
>> > > int idx, next;
>> > > const struct ovl_layer *layer;
>> > > + struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
>> > >
>> > > for (idx = 0; idx != -1; idx = next) {
>> > > next = ovl_path_next(idx, dentry, &realpath, &layer);
>> > > +
>> > > + if (ofs->casefold)
>> > > + rdd.map = sb_encoding(realpath.dentry->d_sb);
>> > > +
>> > > rdd.is_upper = ovl_dentry_upper(dentry) == realpath.dentry;
>> > > rdd.in_xwhiteouts_dir = layer->has_xwhiteouts &&
>> > > ovl_dentry_has_xwhiteouts(dentry);
>> > > @@ -555,7 +628,7 @@ static bool ovl_fill_plain(struct dir_context *ctx, const char *name,
>> > > container_of(ctx, struct ovl_readdir_data, ctx);
>> > >
>> > > rdd->count++;
>> > > - p = ovl_cache_entry_new(rdd, name, namelen, ino, d_type);
>> > > + p = ovl_cache_entry_new(rdd, name, namelen, NULL, 0, ino, d_type);
>> > > if (p == NULL) {
>> > > rdd->err = -ENOMEM;
>> > > return false;
>> > > @@ -1023,6 +1096,8 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
>> > >
>> > > del_entry:
>> > > list_del(&p->l_node);
>> > > + if (p->c_name != p->name)
>> > > + kfree(p->c_name);
>> > > kfree(p);
>> > > }
>> >
>> > --
>> > Gabriel Krisman Bertazi
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-25 17:11 ` Gabriel Krisman Bertazi
@ 2025-08-26 1:34 ` Gabriel Krisman Bertazi
2025-08-26 7:19 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-26 1:34 UTC (permalink / raw)
To: Amir Goldstein
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
Gabriel Krisman Bertazi <gabriel@krisman.be> writes:
> Amir Goldstein <amir73il@gmail.com> writes:
>
>> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
>>>
>>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>>> <gabriel@krisman.be> wrote:
>>> >
>>> > André Almeida <andrealmeid@igalia.com> writes:
>>> >
>>> > > To add overlayfs support casefold layers, create a new function
>>> > > ovl_casefold(), to be able to do case-insensitive strncmp().
>>> > >
>>> > > ovl_casefold() allocates a new buffer and stores the casefolded version
>>> > > of the string on it. If the allocation or the casefold operation fails,
>>> > > fallback to use the original string.
>>> > >
>>> > > The case-insensitive name is then used in the rb-tree search/insertion
>>> > > operation. If the name is found in the rb-tree, the name can be
>>> > > discarded and the buffer is freed. If the name isn't found, it's then
>>> > > stored at struct ovl_cache_entry to be used later.
>>> > >
>>> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>> > > Signed-off-by: André Almeida <andrealmeid@igalia.com>
>>> > > ---
>>> > > Changes from v6:
>>> > > - Last version was using `strncmp(... tmp->len)` which was causing
>>> > > regressions. It should be `strncmp(... len)`.
>>> > > - Rename cf_len to c_len
>>> > > - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>>> > > - Remove needless kfree(cf_name)
>>> > > ---
>>> > > fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>>> > > 1 file changed, 94 insertions(+), 19 deletions(-)
>>> > >
>>> > > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>> > > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>>> > > --- a/fs/overlayfs/readdir.c
>>> > > +++ b/fs/overlayfs/readdir.c
>>> > > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>>> > > bool is_upper;
>>> > > bool is_whiteout;
>>> > > bool check_xwhiteout;
>>> > > + const char *c_name;
>>> > > + int c_len;
>>> > > char name[];
>>> > > };
>>> > >
>>> > > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>>> > > struct list_head *list;
>>> > > struct list_head middle;
>>> > > struct ovl_cache_entry *first_maybe_whiteout;
>>> > > + struct unicode_map *map;
>>> > > int count;
>>> > > int err;
>>> > > bool is_upper;
>>> > > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>>> > > return rb_entry(n, struct ovl_cache_entry, node);
>>> > > }
>>> > >
>>> > > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>>> > > +{
>>> > > + const struct qstr qstr = { .name = str, .len = len };
>>> > > + int cf_len;
>>> > > +
>>> > > + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>>> > > + return 0;
>>> > > +
>>> > > + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>>> > > +
>>> > > + if (dst) {
>>> > > + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>>> > > +
>>> > > + if (cf_len > 0)
>>> > > + return cf_len;
>>> > > + }
>>> > > +
>>> > > + kfree(*dst);
>>> > > + return 0;
>>> > > +}
>>> >
>>> > Hi,
>>> >
>>> > I should just note this does not differentiate allocation errors from
>>> > casefolding errors (invalid encoding). It might be just a theoretical
>>> > error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>>> > operation is likely to fail too, but if you have an allocation failure, you
>>> > can end up with an inconsistent cache, because a file is added under the
>>> > !casefolded name and a later successful lookup will look for the
>>> > casefolded version.
>>>
>>> Good point.
>>> I will fix this in my tree.
>>
>> wait why should we not fail to fill the cache for both allocation
>> and encoding errors?
>>
>
> We shouldn't fail the cache for encoding errors, just for allocation errors.
>
> Perhaps I am misreading the code, so please correct me if I'm wrong. if
> ovl_casefold fails, the non-casefolded name is used in the cache. That
> makes sense if the reason utf8_casefold failed is because the string
> cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
> everything is fine. But on an allocation failure, the string might have
> a real casefolded version. If we fallback to the original string as the
> key, a cache lookup won't find the entry, since we compare with memcmp.
I was thinking again about this and I suspect I misunderstood your
question. Let me try to answer it again:
Ext4, f2fs and tmpfs all allow invalid UTF-8 encoded strings in a
casefolded directory when running in non-strict mode. They are treated
as non-encoded byte sequences, as if they were seen in a case-sensitive
directory. They can't collide with other filenames because they
basically "fold" to themselves.
Now I suspect there is another problem with this series: I don't see how
it implements the semantics of strict mode. What happens if upper and
lower are in strict mode (which is valid, same encoding_flags) but there
is an invalid name in the lower? overlayfs should reject the dentry,
because any attempt to create it in the upper will fail.
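The two behaviours described above can be summarised as a small decision function. This is only a sketch of the suggested policy, not something the series implements: choose_key() and enum key_choice are hypothetical names, and cf_ret stands for whatever the casefold attempt returned (a positive length on success, zero or a negative value when the name could not be folded).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Which name ends up as the lookup key (hypothetical, for illustration). */
enum key_choice {
	KEY_FOLDED,	/* use the casefolded name */
	KEY_RAW,	/* name "folds" to itself: use the bytes as stored */
};

/*
 * cf_ret: result of the casefold attempt (> 0 is the folded length,
 *         <= 0 means the name could not be folded).
 * strict: whether the layers run in strict mode.
 * Returns 0 and sets *out, or -EINVAL when the entry should be rejected.
 */
static int choose_key(int cf_ret, bool strict, enum key_choice *out)
{
	if (cf_ret > 0) {
		*out = KEY_FOLDED;
		return 0;
	}

	/* Strict mode: an invalid name could never be created in the upper. */
	if (strict)
		return -EINVAL;

	/* Non-strict mode: treat the name as an opaque byte sequence. */
	*out = KEY_RAW;
	return 0;
}
```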
André, did you consider this scenario? You can test by creating a file
with an invalid-encoded name in a casefolded directory of a
non-strict-mode filesystem and then flip the strict-mode flag in the
superblock. I can give it a try tomorrow too.
Thanks,
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-26 1:34 ` Gabriel Krisman Bertazi
@ 2025-08-26 7:19 ` Amir Goldstein
2025-08-26 15:02 ` Gabriel Krisman Bertazi
` (2 more replies)
0 siblings, 3 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-26 7:19 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>
> Gabriel Krisman Bertazi <gabriel@krisman.be> writes:
>
> > Amir Goldstein <amir73il@gmail.com> writes:
> >
> >> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >>>
> >>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
> >>> <gabriel@krisman.be> wrote:
> >>> >
> >>> > André Almeida <andrealmeid@igalia.com> writes:
> >>> >
> >>> > > To add overlayfs support casefold layers, create a new function
> >>> > > ovl_casefold(), to be able to do case-insensitive strncmp().
> >>> > >
> >>> > > ovl_casefold() allocates a new buffer and stores the casefolded version
> >>> > > of the string on it. If the allocation or the casefold operation fails,
> >>> > > fallback to use the original string.
> >>> > >
> >>> > > The case-insensitive name is then used in the rb-tree search/insertion
> >>> > > operation. If the name is found in the rb-tree, the name can be
> >>> > > discarded and the buffer is freed. If the name isn't found, it's then
> >>> > > stored at struct ovl_cache_entry to be used later.
> >>> > >
> >>> > > Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >>> > > Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >>> > > ---
> >>> > > Changes from v6:
> >>> > > - Last version was using `strncmp(... tmp->len)` which was causing
> >>> > > regressions. It should be `strncmp(... len)`.
> >>> > > - Rename cf_len to c_len
> >>> > > - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
> >>> > > - Remove needless kfree(cf_name)
> >>> > > ---
> >>> > > fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
> >>> > > 1 file changed, 94 insertions(+), 19 deletions(-)
> >>> > >
> >>> > > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> >>> > > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
> >>> > > --- a/fs/overlayfs/readdir.c
> >>> > > +++ b/fs/overlayfs/readdir.c
> >>> > > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
> >>> > > bool is_upper;
> >>> > > bool is_whiteout;
> >>> > > bool check_xwhiteout;
> >>> > > + const char *c_name;
> >>> > > + int c_len;
> >>> > > char name[];
> >>> > > };
> >>> > >
> >>> > > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
> >>> > > struct list_head *list;
> >>> > > struct list_head middle;
> >>> > > struct ovl_cache_entry *first_maybe_whiteout;
> >>> > > + struct unicode_map *map;
> >>> > > int count;
> >>> > > int err;
> >>> > > bool is_upper;
> >>> > > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
> >>> > > return rb_entry(n, struct ovl_cache_entry, node);
> >>> > > }
> >>> > >
> >>> > > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
> >>> > > +{
> >>> > > + const struct qstr qstr = { .name = str, .len = len };
> >>> > > + int cf_len;
> >>> > > +
> >>> > > + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
> >>> > > + return 0;
> >>> > > +
> >>> > > + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
> >>> > > +
> >>> > > + if (dst) {
Andre,
Just noticed this is a bug: it should have been if (*dst). Anyway, following
Gabriel's comments, I have made this change in my tree (pending more
strict-mode related changes):
static int ovl_casefold(struct ovl_readdir_data *rdd, const char *str, int len,
			char **dst)
{
	const struct qstr qstr = { .name = str, .len = len };
	char *cf_name;
	int cf_len;

	if (!IS_ENABLED(CONFIG_UNICODE) || !rdd->map || is_dot_dotdot(str, len))
		return 0;

	cf_name = kmalloc(NAME_MAX, GFP_KERNEL);
	if (!cf_name) {
		rdd->err = -ENOMEM;
		return -ENOMEM;
	}

	/* Fold into the freshly allocated buffer, not into *dst. */
	cf_len = utf8_casefold(rdd->map, &qstr, cf_name, NAME_MAX);
	if (cf_len > 0)
		*dst = cf_name;
	else
		kfree(cf_name);

	return cf_len;
}
> >>> > > + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
> >>> > > +
> >>> > > + if (cf_len > 0)
> >>> > > + return cf_len;
> >>> > > + }
> >>> > > +
> >>> > > + kfree(*dst);
> >>> > > + return 0;
> >>> > > +}
> >>> >
> >>> > Hi,
> >>> >
> >>> > I should just note this does not differentiate allocation errors from
> >>> > casefolding errors (invalid encoding). It might be just a theoretical
> >>> > error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
> >>> > operation is likely to fail too, but if you have an allocation failure, you
> >>> > can end up with an inconsistent cache, because a file is added under the
> >>> > !casefolded name and a later successful lookup will look for the
> >>> > casefolded version.
> >>>
> >>> Good point.
> >>> I will fix this in my tree.
> >>
> >> wait why should we not fail to fill the cache for both allocation
> >> and encoding errors?
> >>
> >
> > We shouldn't fail the cache for encoding errors, just for allocation errors.
> >
> > Perhaps I am misreading the code, so please correct me if I'm wrong. if
> > ovl_casefold fails, the non-casefolded name is used in the cache. That
> > makes sense if the reason utf8_casefold failed is because the string
> > cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
> > everything is fine. But on an allocation failure, the string might have
> > a real casefolded version. If we fallback to the original string as the
> > key, a cache lookup won't find the entry, since we compare with memcmp.
Just to make it clear in case the name "cache lookup" confuses anyone
on this thread: we are talking about the ovl readdir cache, not about the vfs
lookup cache. The purpose of the ovl readdir cache is twofold:
1. plain in-memory readdir cache
2. (more important to this discussion) implementation of "merged dir" content
So I agree with you that in non-strict mode, invalid-encoded names
should be added to the readdir cache as is, but that this fallback must not
be used in the case of allocation failure.
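As an illustration of the "merged dir" role, here is a minimal userspace sketch. It is not the overlayfs implementation: it uses a flat array instead of an rb-tree and a hypothetical fold_ascii() that only lowercases ASCII. What it shows is the split between the name shown to the user and the folded key used for matching, with the same "0 for found, 1 for added" convention as the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 16
#define MAX_NAME 64

struct entry {
	char name[MAX_NAME];	/* name as created, shown to the user */
	char c_name[MAX_NAME];	/* folded key, used only for matching */
};

struct merged_dir {
	struct entry e[MAX_ENTRIES];
	int count;
};

/* Hypothetical ASCII-only fold; real casefolding is Unicode-aware. */
static void fold_ascii(const char *src, char *dst, size_t dst_size)
{
	size_t i;

	for (i = 0; src[i] && i + 1 < dst_size; i++)
		dst[i] = (char)tolower((unsigned char)src[i]);
	dst[i] = '\0';
}

/*
 * Add a name coming from a layer. Layers are fed from upper to lower,
 * so the first spelling seen for a given folded key is the one kept.
 * Returns 1 if added, 0 if the folded key already exists, -1 if full.
 */
static int merged_add(struct merged_dir *d, const char *name)
{
	char key[MAX_NAME];

	fold_ascii(name, key, sizeof(key));

	for (int i = 0; i < d->count; i++)
		if (strcmp(d->e[i].c_name, key) == 0)
			return 0;

	if (d->count == MAX_ENTRIES)
		return -1;

	snprintf(d->e[d->count].name, MAX_NAME, "%s", name);
	snprintf(d->e[d->count].c_name, MAX_NAME, "%s", key);
	d->count++;
	return 1;
}
```

Feeding "Picture.PNG" from the upper layer and then "picture.png" from a lower layer leaves a single entry whose displayed name is still "Picture.PNG".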
>
> I was thinking again about this and I suspect I misunderstood your
> question. let me try to answer it again:
>
> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
> casefolded directory when running on non-strict-mode. They are treated
> as non-encoded byte-sequences, as if they were seen on a case-Sensitive
> directory. They can't collide with other filenames because they
> basically "fold" to themselves.
>
> Now I suspect there is another problem with this series: I don't see how
> it implements the semantics of strict mode. What happens if upper and
> lower are in strict mode (which is valid, same encoding_flags) but there
> is an invalid name in the lower? overlayfs should reject the dentry,
> because any attempt to create it to the upper will fail.
Ok, so IIUC, one issue is that the return value from ovl_casefold() should be
conditional on the sb encoding_flags, which were inherited from the layers.
Again, *IF* I understand correctly, strict mode ext4 will not allow
creating an invalid-encoded name, but will strict mode ext4 allow
it as a valid lookup result?
>
> André, did you consider this scenario?
In general, as I have told Andre since v1, please stick to the most common
configs that people actually need.
We do NOT need to support every possible combination of layer configurations.
This is why we went with supporting all-or-nothing configs for casefolded dirs:
it is simpler for overlayfs semantics and good enough for what
users need.
So my question to you both is: do users actually use strict mode for
wine and such?
Because if they don't, I would rather support only the default mode
(enforced on mount) and add support for strict mode later, per actual
user demand.
> You can test by creating a file
> with an invalid-encoded name in a casefolded directory of a
> non-strict-mode filesystem and then flip the strict-mode flag in the
> superblock. I can give it a try tomorrow too.
Can the sb flags be flipped at runtime, while mounted?
I suppose you are talking about an offline change that requires a
re-mount of overlayfs and re-validation of the same encoding flags on all layers?
Andre,
Please also add these and other casefold functional tests to fstests to
validate correctness of the merged dir implementation with different
casefold variants in different layers.
Thanks,
Amir.
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-25 13:31 ` André Almeida
@ 2025-08-26 7:31 ` Amir Goldstein
2025-08-26 19:01 ` André Almeida
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-26 7:31 UTC (permalink / raw)
To: André Almeida
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> Hi Amir,
>
> Em 22/08/2025 16:17, Amir Goldstein escreveu:
>
> [...]
>
> /*
> >>>> - * Allow filesystems that are case-folding capable but deny composing
> >>>> - * ovl stack from case-folded directories.
> >>>> + * Exceptionally for layers with casefold, we accept that they have
> >>>> + * their own hash and compare operations
> >>>> */
> >>>> - if (sb_has_encoding(dentry->d_sb))
> >>>> - return IS_CASEFOLDED(d_inode(dentry));
> >>>> + if (ofs->casefold)
> >>>> + return false;
> >>>
> >>> I think this is better as:
> >>> if (sb_has_encoding(dentry->d_sb))
> >>> return false;
> >>>
> >
> > And this still fails the test "Casefold enabled" for me.
> >
> > Maybe you are confused because this does not look like
> > a test failure. It looks like this:
> >
> > generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
> > in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
> > casefold
> > [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
> > name='subdir', err=-116): parent wrong casefold
> > [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
> > name='casefold', err=-66): child wrong casefold
> > [19:10:24] [not run]
> > generic/999 -- overlayfs does not support casefold enabled layers
> > Ran: generic/999
> > Not run: generic/999
> > Passed all 1 tests
> >
>
> This is how the test output looks before my changes[1] to the test:
>
> $ ./run.sh
> FSTYP -- ext4
> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> MKFS_OPTIONS -- -F /dev/vdc
> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
>
> generic/999 1s ... [not run] overlayfs does not support casefold enabled
> layers
> Ran: generic/999
> Not run: generic/999
> Passed all 1 tests
>
>
> And this is how it looks after my changes[1] to the test:
>
> $ ./run.sh
> FSTYP -- ext4
> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> MKFS_OPTIONS -- -F /dev/vdc
> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
>
> generic/999 1s
> Ran: generic/999
> Passed all 1 tests
>
> So, as far as I can tell, the casefold enabled is not being skipped
> after the fix to the test.
Is this how it looks with your v6, or after fixing this bug?
https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
Because for me this skipping started after fixing this bug.
Maybe we fixed the bug incorrectly, but I did not see what the problem
was from a quick look.
Can you test with my branch?
https://github.com/amir73il/linux/commits/ovl_casefold/
>
> [1]
> https://lore.kernel.org/lkml/5da6b0f4-2730-4783-9c57-c46c2d13e848@igalia.com/
>
>
> > I'm not sure I will keep the test this way. This is not very standard nor
> > good practice, to run half of the test and then skip it.
> > I would probably split it into two tests.
> > > The first one as it is now will run to completion on kernels >= v6.17
> > and the Casefold enable test will run on kernels >= v6.18.
> >
> > In any case, please make sure that the test is not skipped when testing
> > Casefold enabled layers
> >
> > And then continue with the missing test cases.
> >
> > When you have a test that passes please send the test itself or
> > a fstest branch for me to test.
>
> Ok!
I assume you are testing with ext4 layers?
If we are both testing the same code and same test and getting different
results I would like to get to the bottom of this, so please share as much
information on your test setup as you can.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-26 7:19 ` Amir Goldstein
@ 2025-08-26 15:02 ` Gabriel Krisman Bertazi
2025-08-26 19:58 ` André Almeida
2025-08-26 20:01 ` André Almeida
2025-08-27 20:45 ` André Almeida
2 siblings, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-26 15:02 UTC (permalink / raw)
To: Amir Goldstein
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
Amir Goldstein <amir73il@gmail.com> writes:
> On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>
>>
>> I was thinking again about this and I suspect I misunderstood your
>> question. let me try to answer it again:
>>
>> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
>> casefolded directory when running on non-strict-mode. They are treated
>> as non-encoded byte-sequences, as if they were seen on a case-Sensitive
>> directory. They can't collide with other filenames because they
>> basically "fold" to themselves.
>>
>> Now I suspect there is another problem with this series: I don't see how
>> it implements the semantics of strict mode. What happens if upper and
>> lower are in strict mode (which is valid, same encoding_flags) but there
>> is an invalid name in the lower? overlayfs should reject the dentry,
>> because any attempt to create it to the upper will fail.
>
> Ok, so IIUC, one issue is that return value from ovl_casefold() should be
> conditional to the sb encoding_flags, which was inherited from the
> layers.
Yes, unless you reject mounting strict-mode filesystems, which is the best
course of action, in my opinion.
>
> Again, *IF* I understand correctly, then strict mode ext4 will not allow
> creating an invalid-encoded name, but will strict mode ext4 allow
> it as a valid lookup result?
strict mode ext4 will not allow creating an invalid-encoded name. And
even lookups will fail. Because the kernel can't casefold it, it will
assume the dirent is broken and ignore it during lookup.
(I just noticed the dirent is ignored and the error is not propagated in
ext4_match. That needs improvement.).
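To make the two behaviours concrete, here is a minimal userspace sketch
(not kernel code). The helpers toy_name_is_valid() and toy_lookup_key()
are hypothetical, and the validity check and ASCII-only folding are
simplifying assumptions standing in for the real UTF-8 handling:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy validity check (assumption): any byte >= 0x80 counts as an invalid
 * encoding. Real UTF-8 validation is more involved.
 */
static bool toy_name_is_valid(const char *name, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if ((unsigned char)name[i] >= 0x80)
			return false;
	return true;
}

/*
 * Build the comparison key for a name in a casefolded directory.
 * Returns the key length, or a negative errno value.
 */
static int toy_lookup_key(const char *name, size_t len, bool strict,
			  char *out, size_t out_size)
{
	if (len > out_size)
		return -ENAMETOOLONG;

	if (!toy_name_is_valid(name, len)) {
		if (strict)
			return -EINVAL;	/* strict mode: the name is rejected */
		memcpy(out, name, len);	/* non-strict: it "folds" to itself */
		return (int)len;
	}

	/* Valid name: fold it (ASCII lowercase only in this sketch). */
	for (size_t i = 0; i < len; i++) {
		char c = name[i];

		out[i] = (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
	}
	return (int)len;
}
```

In non-strict mode the invalid name is kept byte for byte, so it can only
match itself. In strict mode the caller gets an error and would have to
skip the entry.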
>>
>> André, did you consider this scenario?
>
> In general, as I have told Andre from v1, please stick to the most common
> configs that people actually need.
>
> We do NOT need to support every possible combination of layers configurations.
>
> This is why we went with supporting all-or-nothing configs for casefolded dirs.
> Because it is simpler for overlayfs semantics and good enough for what
> users need.
>
> So my question is to you both: do users actually use strict mode for
> wine and such?
> Because if they don't I would rather support the default mode only
> (enforced on mount)
> and add support for strict mode later per actual users demand.
I doubt we care. strict mode is a restricted version of casefolding
support with minor advantages. Basically, with it, you can trust that
if you update the unicode version, there won't be any behavior change in
casefolding due to newly assigned code-points. For Wine, that is
irrelevant.
You can very well reject strict mode and be done with it.
>
>> You can test by creating a file
>> with an invalid-encoded name in a casefolded directory of a
>> non-strict-mode filesystem and then flip the strict-mode flag in the
>> superblock. I can give it a try tomorrow too.
>
> Can the sb flags be flipped in runtime? while mounted?
> I suppose you are talking about an offline change that requires
> re-mount of overlayfs and re-validate the same encoding flags on all layers?
No, it is set at mkfs-time. The scenario I'm describing is a
filesystem corruption, where a filename has invalid characters but the
disk is in strict mode. What I proposed is a way to test this by
crafting an image.
--
Gabriel Krisman Bertazi
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-26 7:31 ` Amir Goldstein
@ 2025-08-26 19:01 ` André Almeida
2025-08-27 18:06 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-26 19:01 UTC (permalink / raw)
To: Amir Goldstein
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
Em 26/08/2025 04:31, Amir Goldstein escreveu:
> On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@igalia.com> wrote:
>>
>> Hi Amir,
>>
>> Em 22/08/2025 16:17, Amir Goldstein escreveu:
>>
>> [...]
>>
>> /*
>>>>>> - * Allow filesystems that are case-folding capable but deny composing
>>>>>> - * ovl stack from case-folded directories.
>>>>>> + * Exceptionally for layers with casefold, we accept that they have
>>>>>> + * their own hash and compare operations
>>>>>> */
>>>>>> - if (sb_has_encoding(dentry->d_sb))
>>>>>> - return IS_CASEFOLDED(d_inode(dentry));
>>>>>> + if (ofs->casefold)
>>>>>> + return false;
>>>>>
>>>>> I think this is better as:
>>>>> if (sb_has_encoding(dentry->d_sb))
>>>>> return false;
>>>>>
>>>
>>> And this still fails the test "Casefold enabled" for me.
>>>
>>> Maybe you are confused because this does not look like
>>> a test failure. It looks like this:
>>>
>>> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
>>> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
>>> casefold
>>> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
>>> name='subdir', err=-116): parent wrong casefold
>>> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
>>> name='casefold', err=-66): child wrong casefold
>>> [19:10:24] [not run]
>>> generic/999 -- overlayfs does not support casefold enabled layers
>>> Ran: generic/999
>>> Not run: generic/999
>>> Passed all 1 tests
>>>
>>
>> This is how the test output looks before my changes[1] to the test:
>>
>> $ ./run.sh
>> FSTYP -- ext4
>> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
>> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
>> MKFS_OPTIONS -- -F /dev/vdc
>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
>>
>> generic/999 1s ... [not run] overlayfs does not support casefold enabled
>> layers
>> Ran: generic/999
>> Not run: generic/999
>> Passed all 1 tests
>>
>>
>> And this is how it looks after my changes[1] to the test:
>>
>> $ ./run.sh
>> FSTYP -- ext4
>> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
>> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
>> MKFS_OPTIONS -- -F /dev/vdc
>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
>>
>> generic/999 1s
>> Ran: generic/999
>> Passed all 1 tests
>>
>> So, as far as I can tell, the casefold enabled is not being skipped
>> after the fix to the test.
>
> Is this how it looks with your v6 or after fixing the bug:
> https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
>
> Because for me this skipping started after fixing this bug
> Maybe we fixed the bug incorrectly, but I did not see what the problem
> was from a quick look.
>
> Can you test with my branch:
> https://github.com/amir73il/linux/commits/ovl_casefold/
>
Right, our branches have a different base, mine is older and based on
the tag vfs/vfs-6.18.mount.
I have now tested with your branch, and indeed the test fails with
"overlayfs does not support casefold enabled". I did some debugging and
the missing commit from my branch that is making this difference here is
e8bd877fb76bb9f3 ("ovl: fix possible double unlink"). After reverting it
on top of your branch, the test works. I'm not sure yet why this
prevents the mount, but this is the call trace when the error happens:
TID/PID 860/860 (mount/mount):
entry_SYSCALL_64_after_hwframe+0x77
do_syscall_64+0xa2
x64_sys_call+0x1bc3
__x64_sys_fsconfig+0x46c
vfs_cmd_create+0x60
vfs_get_tree+0x2e
ovl_get_tree+0x19
get_tree_nodev+0x70
ovl_fill_super+0x53b
! 0us [-EINVAL] ovl_parent_lock
And for the ovl_parent_lock() arguments, *parent="work", *child="#7". So
right now I'm trying to figure out why the dentry for #7 is not hashed.
>>
>> [1]
>> https://lore.kernel.org/lkml/5da6b0f4-2730-4783-9c57-c46c2d13e848@igalia.com/
>>
>>
>>> I'm not sure I will keep the test this way. This is not very standard nor
>>> good practice, to run half of the test and then skip it.
>>> I would probably split it into two tests.
>>> The first one as it is now will run to completion on kernels >= v6.17
>>> and the Casefold enable test will run on kernels >= v6.18.
>>>
>>> In any case, please make sure that the test is not skipped when testing
>>> Casefold enabled layers
>>>
>>> And then continue with the missing test cases.
>>>
>>> When you have a test that passes please send the test itself or
>>> a fstest branch for me to test.
>>
>> Ok!
>
> I assume you are testing with ext4 layers?
>
> If we are both testing the same code and same test and getting different
> results I would like to get to the bottom of this, so please share as much
> information on your test setup as you can.
>
> Thanks,
> Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-26 15:02 ` Gabriel Krisman Bertazi
@ 2025-08-26 19:58 ` André Almeida
2025-08-27 9:28 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-26 19:58 UTC (permalink / raw)
To: Gabriel Krisman Bertazi, Amir Goldstein
Cc: Miklos Szeredi, Theodore Tso, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Christian Brauner, Jan Kara,
kernel-dev
Em 26/08/2025 12:02, Gabriel Krisman Bertazi escreveu:
> Amir Goldstein <amir73il@gmail.com> writes:
>
>> On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>>
>>>
>>> I was thinking again about this and I suspect I misunderstood your
>>> question. let me try to answer it again:
>>>
>>> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
>>> casefolded directory when running on non-strict-mode. They are treated
>>> as non-encoded byte-sequences, as if they were seen on a case-Sensitive
>>> directory. They can't collide with other filenames because they
>>> basically "fold" to themselves.
>>>
>>> Now I suspect there is another problem with this series: I don't see how
>>> it implements the semantics of strict mode. What happens if upper and
>>> lower are in strict mode (which is valid, same encoding_flags) but there
>>> is an invalid name in the lower? overlayfs should reject the dentry,
>>> because any attempt to create it to the upper will fail.
>>
>> Ok, so IIUC, one issue is that return value from ovl_casefold() should be
>> conditional to the sb encoding_flags, which was inherited from the
>> layers.
>
> yes, unless you reject mounting strict_mode filesystems, which is the best
> course of action, in my opinion.
>
>>
>> Again, *IF* I understand correctly, then strict mode ext4 will not allow
>> creating an invalid-encoded name, but will strict mode ext4 allow
>> it as a valid lookup result?
>
> strict mode ext4 will not allow creating an invalid-encoded name. And
> even lookups will fail. Because the kernel can't casefold it, it will
> assume the dirent is broken and ignore it during lookup.
>
> (I just noticed the dirent is ignored and the error is not propagated in
> ext4_match. That needs improvement.).
>
>>>
>>> André, did you consider this scenario?
>>
>> In general, as I have told Andre from v1, please stick to the most common
>> configs that people actually need.
>>
>> We do NOT need to support every possible combination of layers configurations.
>>
>> This is why we went with supporting all-or-nothing configs for casefolded dirs.
>> Because it is simpler for overlayfs semantics and good enough for what
>> users need.
>>
>> So my question is to you both: do users actually use strict mode for
>> wine and such?
>> Because if they don't I would rather support the default mode only
>> (enforced on mount)
>> and add support for strict mode later per actual users demand.
>
> I doubt we care. strict mode is a restricted version of casefolding
> support with minor advantages. Basically, with it, you can trust that
> if you update the unicode version, there won't be any behavior change in
> casefolding due to newly assigned code-points. For Wine, that is
> irrelevant.
>
> You can very well reject strict mode and be done with it.
>
Amir,
I think this can be done at ovl_get_layers(), something like:

	if (sb_has_strict_encoding(sb)) {
		pr_err("strict encoding not supported\n");
		return -EINVAL;
	}
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-26 7:19 ` Amir Goldstein
2025-08-26 15:02 ` Gabriel Krisman Bertazi
@ 2025-08-26 20:01 ` André Almeida
2025-08-27 20:45 ` André Almeida
2 siblings, 0 replies; 53+ messages in thread
From: André Almeida @ 2025-08-26 20:01 UTC (permalink / raw)
To: Amir Goldstein, Gabriel Krisman Bertazi
Cc: Miklos Szeredi, Theodore Tso, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Christian Brauner, Jan Kara,
kernel-dev
Em 26/08/2025 04:19, Amir Goldstein escreveu:
> On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>>
>> Gabriel Krisman Bertazi <gabriel@krisman.be> writes:
>>
>>> Amir Goldstein <amir73il@gmail.com> writes:
>>>
>>>> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@gmail.com> wrote:
>>>>>
>>>>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>>>>> <gabriel@krisman.be> wrote:
>>>>>>
>>>>>> André Almeida <andrealmeid@igalia.com> writes:
>>>>>>
>>>>>>> To add overlayfs support for casefold layers, create a new function
>>>>>>> ovl_casefold(), to be able to do case-insensitive strncmp().
>>>>>>>
>>>>>>> ovl_casefold() allocates a new buffer and stores the casefolded version
>>>>>>> of the string on it. If the allocation or the casefold operation fails,
>>>>>>> fallback to use the original string.
>>>>>>>
>>>>>>> The case-insensitive name is then used in the rb-tree search/insertion
>>>>>>> operation. If the name is found in the rb-tree, the name can be
>>>>>>> discarded and the buffer is freed. If the name isn't found, it's then
>>>>>>> stored at struct ovl_cache_entry to be used later.
>>>>>>>
>>>>>>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>>>>>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>>>>>>> ---
>>>>>>> Changes from v6:
>>>>>>> - Last version was using `strncmp(... tmp->len)` which was causing
>>>>>>> regressions. It should be `strncmp(... len)`.
>>>>>>> - Rename cf_len to c_len
>>>>>>> - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>>>>>>> - Remove needless kfree(cf_name)
>>>>>>> ---
>>>>>>> fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>>>>>>> 1 file changed, 94 insertions(+), 19 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>>>>>> index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>>>>>>> --- a/fs/overlayfs/readdir.c
>>>>>>> +++ b/fs/overlayfs/readdir.c
>>>>>>> @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>>>>>>> bool is_upper;
>>>>>>> bool is_whiteout;
>>>>>>> bool check_xwhiteout;
>>>>>>> + const char *c_name;
>>>>>>> + int c_len;
>>>>>>> char name[];
>>>>>>> };
>>>>>>>
>>>>>>> @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>>>>>>> struct list_head *list;
>>>>>>> struct list_head middle;
>>>>>>> struct ovl_cache_entry *first_maybe_whiteout;
>>>>>>> + struct unicode_map *map;
>>>>>>> int count;
>>>>>>> int err;
>>>>>>> bool is_upper;
>>>>>>> @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>>>>>>> return rb_entry(n, struct ovl_cache_entry, node);
>>>>>>> }
>>>>>>>
>>>>>>> +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>>>>>>> +{
>>>>>>> + const struct qstr qstr = { .name = str, .len = len };
>>>>>>> + int cf_len;
>>>>>>> +
>>>>>>> + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>>>>>>> +
>>>>>>> + if (dst) {
>
> Andre,
>
> Just noticed this is a bug, should have been if (*dst), but anyway following
> Gabriel's comments I have made this change in my tree (pending more
> strict related changes):
>
> static int ovl_casefold(struct ovl_readdir_data *rdd, const char *str, int len,
> 			char **dst)
> {
> 	const struct qstr qstr = { .name = str, .len = len };
> 	char *cf_name;
> 	int cf_len;
>
> 	/* No folding without unicode support, a map, or for "." and ".." */
> 	if (!IS_ENABLED(CONFIG_UNICODE) || !rdd->map || is_dot_dotdot(str, len))
> 		return 0;
>
> 	cf_name = kmalloc(NAME_MAX, GFP_KERNEL);
> 	if (!cf_name) {
> 		rdd->err = -ENOMEM;
> 		return -ENOMEM;
> 	}
>
> 	/* Fold into the freshly allocated buffer, not into *dst */
> 	cf_len = utf8_casefold(rdd->map, &qstr, cf_name, NAME_MAX);
> 	if (cf_len > 0)
> 		*dst = cf_name;
> 	else
> 		kfree(cf_name);
>
> 	return cf_len;
> }
Right, that makes sense to me. I was unsure what to do about allocation
failures, but this seems like the right direction. Thanks!
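For reference, the distinction between the two failure kinds can be
sketched in userspace (not kernel code). Everything named toy_* below is
hypothetical: toy_casefold() is an ASCII-only stand-in for
utf8_casefold(), the allocator is passed in so a failure can be
simulated, and mapping an encoding error to 0 is a choice made for this
sketch:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NAME_MAX 255

/*
 * Hypothetical stand-in for utf8_casefold(): lowercases ASCII and returns
 * -EINVAL for any byte >= 0x80, otherwise the number of bytes written.
 */
static int toy_casefold(const char *str, int len, char *dst, int dst_len)
{
	if (len > dst_len)
		return -ENAMETOOLONG;
	for (int i = 0; i < len; i++) {
		unsigned char c = (unsigned char)str[i];

		if (c >= 0x80)
			return -EINVAL;
		dst[i] = (c >= 'A' && c <= 'Z') ? (char)(c + 32) : (char)c;
	}
	return len;
}

/*
 * Returns:
 *   > 0      length of the folded name; *dst owns a new buffer
 *   0        name cannot be folded; caller keeps the original name
 *   -ENOMEM  allocation failed; caller must abort filling the cache
 */
static int toy_fold_for_cache(void *(*alloc)(size_t), const char *str,
			      int len, char **dst)
{
	char *cf_name = alloc(TOY_NAME_MAX);
	int cf_len;

	if (!cf_name)
		return -ENOMEM;

	cf_len = toy_casefold(str, len, cf_name, TOY_NAME_MAX);
	if (cf_len > 0) {
		*dst = cf_name;
		return cf_len;
	}

	free(cf_name);
	return 0;
}

/* Allocator that always fails, to exercise the -ENOMEM path. */
static void *toy_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Only the -ENOMEM case stops the cache from being filled. A name that
cannot be folded is still usable under its original bytes.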
>
>>>>>>> + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>>>>>>> +
>>>>>>> + if (cf_len > 0)
>>>>>>> + return cf_len;
>>>>>>> + }
>>>>>>> +
>>>>>>> + kfree(*dst);
>>>>>>> + return 0;
>>>>>>> +}
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I should just note this does not differentiate allocation errors from
>>>>>> casefolding errors (invalid encoding). It might be just a theoretical
>>>>>> error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>>>>>> operation is likely to fail too, but if you have an allocation failure, you
>>>>>> can end up with an inconsistent cache, because a file is added under the
>>>>>> !casefolded name and a later successful lookup will look for the
>>>>>> casefolded version.
>>>>>
>>>>> Good point.
>>>>> I will fix this in my tree.
>>>>
>>>> wait why should we not fail to fill the cache for both allocation
>>>> and encoding errors?
>>>>
>>>
>>> We shouldn't fail the cache for encoding errors, just for allocation errors.
>>>
>>> Perhaps I am misreading the code, so please correct me if I'm wrong. if
>>> ovl_casefold fails, the non-casefolded name is used in the cache. That
>>> makes sense if the reason utf8_casefold failed is because the string
>>> cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
>>> everything is fine. But on an allocation failure, the string might have
>>> a real casefolded version. If we fallback to the original string as the
>>> key, a cache lookup won't find the entry, since we compare with memcmp.
>
> Just to make it clear in case the name "cache lookup" confuses anyone
> on this thread - we are talking about ovl readdir cache, not about the vfs
> lookup cache. The purpose of the ovl readdir cache is twofold:
> 1. plain in-memory readdir cache
> 2. (more important to this discussion) implementation of "merged dir" content
>
> So I agree with you that with non-strict mode, invalid encoded names
> should be added to readdir cache as is and not in the case of allocation
> failure.
>
>>
>> I was thinking again about this and I suspect I misunderstood your
>> question. let me try to answer it again:
>>
>> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
>> casefolded directory when running on non-strict-mode. They are treated
>> as non-encoded byte-sequences, as if they were seen on a case-Sensitive
>> directory. They can't collide with other filenames because they
>> basically "fold" to themselves.
>>
>> Now I suspect there is another problem with this series: I don't see how
>> it implements the semantics of strict mode. What happens if upper and
>> lower are in strict mode (which is valid, same encoding_flags) but there
>> is an invalid name in the lower? overlayfs should reject the dentry,
>> because any attempt to create it to the upper will fail.
>
> Ok, so IIUC, one issue is that return value from ovl_casefold() should be
> conditional to the sb encoding_flags, which was inherited from the layers.
>
> Again, *IF* I understand correctly, then strict mode ext4 will not allow
> creating an invalid-encoded name, but will strict mode ext4 allow
> it as a valid lookup result?
>
>>
>> André, did you consider this scenario?
>
> In general, as I have told Andre from v1, please stick to the most common
> configs that people actually need.
>
> We do NOT need to support every possible combination of layers configurations.
>
> This is why we went with supporting all-or-nothing configs for casefolded dirs.
> Because it is simpler for overlayfs semantics and good enough for what
> users need.
>
> So my question is to you both: do users actually use strict mode for
> wine and such?
> Because if they don't I would rather support the default mode only
> (enforced on mount)
> and add support for strict mode later per actual users demand.
>
I agree with Gabriel, no need to add this for Wine. We can refuse to
mount to make things easier.
>> You can test by creating a file
>> with an invalid-encoded name in a casefolded directory of a
>> non-strict-mode filesystem and then flip the strict-mode flag in the
>> superblock. I can give it a try tomorrow too.
>
> Can the sb flags be flipped in runtime? while mounted?
> I suppose you are talking about an offline change that requires
> re-mount of overlayfs and re-validate the same encoding flags on all layers?
>
> Andre,
>
> Please also add these and other casefold functional tests to fstest to
> validate correctness of the merge dir implementation with different
> casefold variants in different layers.
>
Ok, I will add a test case to stress mounting layers with different
encoding versions, flags, etc.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding
2025-08-25 15:32 ` Amir Goldstein
@ 2025-08-26 20:12 ` André Almeida
2025-08-27 9:17 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-26 20:12 UTC (permalink / raw)
To: Amir Goldstein, Gabriel Krisman Bertazi
Cc: Miklos Szeredi, Theodore Tso, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Christian Brauner, Jan Kara,
kernel-dev
Em 25/08/2025 12:32, Amir Goldstein escreveu:
> On Mon, Aug 25, 2025 at 1:17 PM Gabriel Krisman Bertazi
> <gabriel@krisman.be> wrote:
>>
>> André Almeida <andrealmeid@igalia.com> writes:
>>
>>> When merging layers from different filesystems with casefold enabled,
>>> all layers should use the same encoding version and have the same flags
>>> to avoid any kind of incompatibility issues.
>>>
>>> Also, set the encoding and the encoding flags for the ovl super block as
>>> the same as used by the first valid layer.
>>>
>>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>>> ---
>>> fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
>>> 1 file changed, 25 insertions(+)
>>>
>>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>>> index df85a76597e910d00323018f1d2cd720c5db921d..b1dbd3c79961094d00c7f99cc622e515d544d22f 100644
>>> --- a/fs/overlayfs/super.c
>>> +++ b/fs/overlayfs/super.c
>>> @@ -991,6 +991,18 @@ static int ovl_get_data_fsid(struct ovl_fs *ofs)
>>> return ofs->numfs;
>>> }
>>>
>>> +/*
>>> + * Set the ovl sb encoding as the same one used by the first layer
>>> + */
>>> +static void ovl_set_encoding(struct super_block *sb, struct super_block *fs_sb)
>>> +{
>>> +#if IS_ENABLED(CONFIG_UNICODE)
>>> + if (sb_has_encoding(fs_sb)) {
>>> + sb->s_encoding = fs_sb->s_encoding;
>>> + sb->s_encoding_flags = fs_sb->s_encoding_flags;
>>> + }
>>> +#endif
>>> +}
>>>
>>> static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>>> struct ovl_fs_context *ctx, struct ovl_layer *layers)
>>> @@ -1024,6 +1036,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>>> if (ovl_upper_mnt(ofs)) {
>>> ofs->fs[0].sb = ovl_upper_mnt(ofs)->mnt_sb;
>>> ofs->fs[0].is_lower = false;
>>> +
>>> + if (ofs->casefold)
>>> + ovl_set_encoding(sb, ofs->fs[0].sb);
>>> }
>>>
>>> nr_merged_lower = ctx->nr - ctx->nr_data;
>>> @@ -1083,6 +1098,16 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
>>> l->name = NULL;
>>> ofs->numlayer++;
>>> ofs->fs[fsid].is_lower = true;
>>> +
>>> + if (ofs->casefold) {
>>> + if (!ovl_upper_mnt(ofs) && !sb_has_encoding(sb))
>>> + ovl_set_encoding(sb, ofs->fs[fsid].sb);
>>> +
>>> + if (!sb_has_encoding(sb) || !sb_same_encoding(sb, mnt->mnt_sb)) {
>>
>> Minor nit, but isn't the sb_has_encoding() check redundant here? sb_same_encoding
>> will check the sb->encoding matches the mnt_sb already.
>
> Maybe we did something wrong but the intention was:
> If all layers root are casefold disabled (or not supported) then
> a mix of layers with fs of different encoding (and fs with no encoding support)
> is allowed because we take care that all directories are always
> casefold disabled.
>
We are going to reach this code only if ofs->casefold is true, so that
means that ovl_dentry_casefolded() was true, and that means that
sb_has_encoding(dentry->d_sb) is also true... so I think that Gabriel is
right, if we reach this part of the code, that means that casefold is
enabled and being used by at least one layer, so we can call
sb_same_encoding() to check if they are consistent for all layers.
For the case where we don't care about the layers having different
encodings, the code will already skip this check because of the
if (ofs->casefold) condition.
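A rough userspace model of that reasoning (not kernel code). struct
toy_sb, toy_same_encoding() and toy_check_layer() are hypothetical, and
the comparison rule (same version and same flags) is an assumption
based on the commit message rather than a copy of sb_same_encoding():

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the encoding state of a super block. */
struct toy_sb {
	bool has_encoding;
	unsigned int version;
	unsigned int flags;
};

/* Assumed rule: compatible when both lack an encoding, or both match. */
static bool toy_same_encoding(const struct toy_sb *a, const struct toy_sb *b)
{
	if (!a->has_encoding || !b->has_encoding)
		return a->has_encoding == b->has_encoding;
	return a->version == b->version && a->flags == b->flags;
}

/*
 * Per-layer check. With casefold off, layers may mix encodings freely.
 * With casefold on, the ovl sb already carries an encoding, so the
 * comparison alone decides and no separate "has encoding" test is needed.
 */
static int toy_check_layer(bool casefold, const struct toy_sb *ovl_sb,
			   const struct toy_sb *layer_sb)
{
	if (!casefold)
		return 0;
	return toy_same_encoding(ovl_sb, layer_sb) ? 0 : -EINVAL;
}
```

A layer without encoding fails the comparison against an ovl sb that has
one, which is the case the extra check was meant to catch.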
> Thanks,
> Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb
2025-08-25 15:34 ` Amir Goldstein
@ 2025-08-26 20:13 ` André Almeida
0 siblings, 0 replies; 53+ messages in thread
From: André Almeida @ 2025-08-26 20:13 UTC (permalink / raw)
To: Amir Goldstein, Gabriel Krisman Bertazi
Cc: Miklos Szeredi, Theodore Tso, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Christian Brauner, Jan Kara,
kernel-dev
Em 25/08/2025 12:34, Amir Goldstein escreveu:
> On Mon, Aug 25, 2025 at 1:24 PM Gabriel Krisman Bertazi
> <gabriel@krisman.be> wrote:
>>
>> André Almeida <andrealmeid@igalia.com> writes:
>>
>>> For filesystems with encoding (i.e. with case-insensitive support), set
>>> the dentry operations for the super block as ovl_dentry_ci_operations.
>>>
>>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
>>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>>> ---
>>> Changes in v6:
>>> - Fix kernel bot warning: unused variable 'ofs'
>>> ---
>>> fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
>>> 1 file changed, 25 insertions(+)
>>>
>>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>>> index b1dbd3c79961094d00c7f99cc622e515d544d22f..8db4e55d5027cb975fec9b92251f62fe5924af4f 100644
>>> --- a/fs/overlayfs/super.c
>>> +++ b/fs/overlayfs/super.c
>>> @@ -161,6 +161,16 @@ static const struct dentry_operations ovl_dentry_operations = {
>>> .d_weak_revalidate = ovl_dentry_weak_revalidate,
>>> };
>>>
>>> +#if IS_ENABLED(CONFIG_UNICODE)
>>> +static const struct dentry_operations ovl_dentry_ci_operations = {
>>> + .d_real = ovl_d_real,
>>> + .d_revalidate = ovl_dentry_revalidate,
>>> + .d_weak_revalidate = ovl_dentry_weak_revalidate,
>>> + .d_hash = generic_ci_d_hash,
>>> + .d_compare = generic_ci_d_compare,
>>> +};
>>> +#endif
>>> +
>>> static struct kmem_cache *ovl_inode_cachep;
>>>
>>> static struct inode *ovl_alloc_inode(struct super_block *sb)
>>> @@ -1332,6 +1342,19 @@ static struct dentry *ovl_get_root(struct super_block *sb,
>>> return root;
>>> }
>>>
>>> +static void ovl_set_d_op(struct super_block *sb)
>>> +{
>>> +#if IS_ENABLED(CONFIG_UNICODE)
>>> + struct ovl_fs *ofs = sb->s_fs_info;
>>> +
>>> + if (ofs->casefold) {
>>> + set_default_d_op(sb, &ovl_dentry_ci_operations);
>>> + return;
>>> + }
>>> +#endif
>>> + set_default_d_op(sb, &ovl_dentry_operations);
>>> +}
>>> +
>>> int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
>>> {
>>> struct ovl_fs *ofs = sb->s_fs_info;
>>> @@ -1443,6 +1466,8 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
>>> if (IS_ERR(oe))
>>> goto out_err;
>>>
>>> + ovl_set_d_op(sb);
>>> +
>>
>> Absolutely minor, but fill_super is now calling
>> set_default_d_op(sb, &ovl_dentry_operations) twice, once here and once
>> at the beginning of the function. You can remove the original call.
>
> Good catch!
>
> That was not my intention at all.
> I asked to replace the set_default_d_op() call with ovl_set_d_op()
> but I missed that in the review.
>
> Will fix it in my tree.
>
Oops, my bad. Thank you for the fix :)
> Thanks!
> Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding
2025-08-26 20:12 ` André Almeida
@ 2025-08-27 9:17 ` Amir Goldstein
0 siblings, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-27 9:17 UTC (permalink / raw)
To: André Almeida
Cc: Gabriel Krisman Bertazi, Miklos Szeredi, Theodore Tso,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Tue, Aug 26, 2025 at 10:12 PM André Almeida <andrealmeid@igalia.com> wrote:
>
>
>
> Em 25/08/2025 12:32, Amir Goldstein escreveu:
> > On Mon, Aug 25, 2025 at 1:17 PM Gabriel Krisman Bertazi
> > <gabriel@krisman.be> wrote:
> >>
> >> André Almeida <andrealmeid@igalia.com> writes:
> >>
> >>> When merging layers from different filesystems with casefold enabled,
> >>> all layers should use the same encoding version and have the same flags
> >>> to avoid any kind of incompatibility issues.
> >>>
> >>> Also, set the encoding and the encoding flags for the ovl super block as
> >>> the same as used by the first valid layer.
> >>>
> >>> Reviewed-by: Amir Goldstein <amir73il@gmail.com>
> >>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >>> ---
> >>> fs/overlayfs/super.c | 25 +++++++++++++++++++++++++
> >>> 1 file changed, 25 insertions(+)
> >>>
> >>> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> >>> index df85a76597e910d00323018f1d2cd720c5db921d..b1dbd3c79961094d00c7f99cc622e515d544d22f 100644
> >>> --- a/fs/overlayfs/super.c
> >>> +++ b/fs/overlayfs/super.c
> >>> @@ -991,6 +991,18 @@ static int ovl_get_data_fsid(struct ovl_fs *ofs)
> >>> return ofs->numfs;
> >>> }
> >>>
> >>> +/*
> >>> + * Set the ovl sb encoding as the same one used by the first layer
> >>> + */
> >>> +static void ovl_set_encoding(struct super_block *sb, struct super_block *fs_sb)
> >>> +{
> >>> +#if IS_ENABLED(CONFIG_UNICODE)
> >>> + if (sb_has_encoding(fs_sb)) {
> >>> + sb->s_encoding = fs_sb->s_encoding;
> >>> + sb->s_encoding_flags = fs_sb->s_encoding_flags;
> >>> + }
> >>> +#endif
> >>> +}
> >>>
> >>> static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >>> struct ovl_fs_context *ctx, struct ovl_layer *layers)
> >>> @@ -1024,6 +1036,9 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >>> if (ovl_upper_mnt(ofs)) {
> >>> ofs->fs[0].sb = ovl_upper_mnt(ofs)->mnt_sb;
> >>> ofs->fs[0].is_lower = false;
> >>> +
> >>> + if (ofs->casefold)
> >>> + ovl_set_encoding(sb, ofs->fs[0].sb);
> >>> }
> >>>
> >>> nr_merged_lower = ctx->nr - ctx->nr_data;
> >>> @@ -1083,6 +1098,16 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
> >>> l->name = NULL;
> >>> ofs->numlayer++;
> >>> ofs->fs[fsid].is_lower = true;
> >>> +
> >>> + if (ofs->casefold) {
> >>> + if (!ovl_upper_mnt(ofs) && !sb_has_encoding(sb))
> >>> + ovl_set_encoding(sb, ofs->fs[fsid].sb);
> >>> +
> >>> + if (!sb_has_encoding(sb) || !sb_same_encoding(sb, mnt->mnt_sb)) {
> >>
> >> Minor nit, but isn't the sb_has_encoding() check redundant here? sb_same_encoding
> >> will check the sb->encoding matches the mnt_sb already.
> >
> > Maybe we did something wrong but the intention was:
> > If all layers root are casefold disabled (or not supported) then
> > a mix of layers with fs of different encoding (and fs with no encoding support)
> > is allowed because we take care that all directories are always
> > casefold disabled.
> >
>
> We are going to reach this code only if ofs->casefold is true, so that
> means that ovl_dentry_casefolded() was true, and that means that
> sb_has_encoding(dentry->d_sb) is also true... so I think that Gabriel is
> right, if we reach this part of the code, that means that casefold is
> enabled and being used by at least one layer, so we can call
> sb_same_encoding() to check if they are consistent for all layers.
>
> For the case that we don't care about the layers having different
> encoding, the code will already skip this because of if (ofs->casefold)
Doh! Yeah, that was silly. I have removed that check now.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-26 19:58 ` André Almeida
@ 2025-08-27 9:28 ` Amir Goldstein
0 siblings, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-27 9:28 UTC (permalink / raw)
To: André Almeida
Cc: Gabriel Krisman Bertazi, Miklos Szeredi, Theodore Tso,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Tue, Aug 26, 2025 at 9:58 PM André Almeida <andrealmeid@igalia.com> wrote:
>
>
>
> On 26/08/2025 12:02, Gabriel Krisman Bertazi wrote:
> > Amir Goldstein <amir73il@gmail.com> writes:
> >
> >> On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
> >>
> >>>
> >>> I was thinking again about this and I suspect I misunderstood your
> >>> question. Let me try to answer it again:
> >>>
> >>> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
> >>> casefolded directory when running in non-strict mode. They are treated
> >>> as non-encoded byte sequences, as if they were seen in a case-sensitive
> >>> directory. They can't collide with other filenames because they
> >>> basically "fold" to themselves.
> >>>
> >>> Now I suspect there is another problem with this series: I don't see how
> >>> it implements the semantics of strict mode. What happens if upper and
> >>> lower are in strict mode (which is valid, same encoding_flags) but there
> >>> is an invalid name in the lower? overlayfs should reject the dentry,
> >>> because any attempt to create it to the upper will fail.
> >>
> >> Ok, so IIUC, one issue is that return value from ovl_casefold() should be
> >> conditional to the sb encoding_flags, which was inherited from the
> >> layers.
> >
> > yes, unless you reject mounting strict_mode filesystems, which is the best
> > course of action, in my opinion.
> >
> >>
> >> Again, *IF* I understand correctly, then strict mode ext4 will not allow
> >> creating an invalid-encoded name, but will strict mode ext4 allow
> >> it as a valid lookup result?
> >
> > strict mode ext4 will not allow creating an invalid-encoded name. And
> > even lookups will fail. Because the kernel can't casefold it, it will
> > assume the dirent is broken and ignore it during lookup.
> >
> > (I just noticed the dirent is ignored and the error is not propagated in
> > ext4_match. That needs improvement.).
> >
> >>>
> >>> André, did you consider this scenario?
> >>
> >> In general, as I have told Andre from v1, please stick to the most common
> >> configs that people actually need.
> >>
> >> We do NOT need to support every possible combination of layers configurations.
> >>
> >> This is why we went with supporting all-or-nothing configs for casefolded dirs.
> >> Because it is simpler for overlayfs semantics and good enough for what
> >> users need.
> >>
> >> So my question is to you both: do users actually use strict mode for
> >> wine and such?
> >> Because if they don't I would rather support the default mode only
> >> (enforced on mount)
> >> and add support for strict mode later per actual users demand.
> >
> > I doubt we care. strict mode is a restricted version of casefolding
> > support with minor advantages. Basically, with it, you can trust that
> > if you update the unicode version, there won't be any behavior change in
> > casefolding due to newly assigned code-points. For Wine, that is
> > irrelevant.
> >
> > You can very well reject strict mode and be done with it.
> >
>
> Amir,
>
> I think this can be done at ovl_get_layers(), something like:
>
> if (sb_has_strict_encoding(sb)) {
> pr_err("strict encoding not supported\n");
> return -EINVAL;
> }
>
Yep, I've put it into ovl_set_encoding() to warn more accurately
on upper fs:
/*
* Set the ovl sb encoding as the same one used by the first layer
*/
static int ovl_set_encoding(struct super_block *sb, struct super_block *fs_sb)
{
if (!sb_has_encoding(fs_sb))
return 0;
#if IS_ENABLED(CONFIG_UNICODE)
if (sb_has_strict_encoding(fs_sb)) {
pr_err("strict encoding not supported\n");
return -EINVAL;
}
sb->s_encoding = fs_sb->s_encoding;
sb->s_encoding_flags = fs_sb->s_encoding_flags;
#endif
return 0;
}
Thanks,
Amir.
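
For anyone reading along without a kernel tree, the helper above can be
modelled in userspace roughly as follows. This is only a sketch: every
mock_ name, the struct layout, and the value of the strict-mode flag are
assumptions made for illustration and are not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real layouts. */
struct mock_unicode_map {
	unsigned int version;
};

struct mock_super_block {
	const struct mock_unicode_map *s_encoding;	/* NULL: no encoding */
	unsigned short s_encoding_flags;
};

#define MOCK_ENC_STRICT_MODE_FL (1 << 0)	/* assumed flag value */

static bool mock_sb_has_encoding(const struct mock_super_block *sb)
{
	return sb->s_encoding != NULL;
}

static bool mock_sb_has_strict_encoding(const struct mock_super_block *sb)
{
	return (sb->s_encoding_flags & MOCK_ENC_STRICT_MODE_FL) != 0;
}

/*
 * Copy the encoding of a layer into the overlay sb.
 * A layer without encoding is a no-op, a strict-mode layer is refused.
 */
static int mock_ovl_set_encoding(struct mock_super_block *sb,
				 const struct mock_super_block *fs_sb)
{
	assert(sb && fs_sb);

	if (!mock_sb_has_encoding(fs_sb))
		return 0;

	if (mock_sb_has_strict_encoding(fs_sb))
		return -EINVAL;

	sb->s_encoding = fs_sb->s_encoding;
	sb->s_encoding_flags = fs_sb->s_encoding_flags;
	return 0;
}
```

In this model the overlay sb is left untouched on the -EINVAL path, so a
caller can fail the mount without having to undo anything.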
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-26 19:01 ` André Almeida
@ 2025-08-27 18:06 ` Amir Goldstein
2025-08-27 20:37 ` André Almeida
2025-08-27 23:58 ` NeilBrown
0 siblings, 2 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-27 18:06 UTC (permalink / raw)
To: André Almeida, NeilBrown
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Tue, Aug 26, 2025 at 9:01 PM André Almeida <andrealmeid@igalia.com> wrote:
>
>
>
> On 26/08/2025 04:31, Amir Goldstein wrote:
> > On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@igalia.com> wrote:
> >>
> >> Hi Amir,
> >>
> >> On 22/08/2025 16:17, Amir Goldstein wrote:
> >>
> >> [...]
> >>
> >> /*
> >>>>>> - * Allow filesystems that are case-folding capable but deny composing
> >>>>>> - * ovl stack from case-folded directories.
> >>>>>> + * Exceptionally for layers with casefold, we accept that they have
> >>>>>> + * their own hash and compare operations
> >>>>>> */
> >>>>>> - if (sb_has_encoding(dentry->d_sb))
> >>>>>> - return IS_CASEFOLDED(d_inode(dentry));
> >>>>>> + if (ofs->casefold)
> >>>>>> + return false;
> >>>>>
> >>>>> I think this is better as:
> >>>>> if (sb_has_encoding(dentry->d_sb))
> >>>>> return false;
> >>>>>
> >>>
> >>> And this still fails the test "Casefold enabled" for me.
> >>>
> >>> Maybe you are confused because this does not look like
> >>> a test failure. It looks like this:
> >>>
> >>> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
> >>> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
> >>> casefold
> >>> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
> >>> name='subdir', err=-116): parent wrong casefold
> >>> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
> >>> name='casefold', err=-66): child wrong casefold
> >>> [19:10:24] [not run]
> >>> generic/999 -- overlayfs does not support casefold enabled layers
> >>> Ran: generic/999
> >>> Not run: generic/999
> >>> Passed all 1 tests
> >>>
> >>
> >> This is how the test output looks before my changes[1] to the test:
> >>
> >> $ ./run.sh
> >> FSTYP -- ext4
> >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> >> MKFS_OPTIONS -- -F /dev/vdc
> >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> >>
> >> generic/999 1s ... [not run] overlayfs does not support casefold enabled
> >> layers
> >> Ran: generic/999
> >> Not run: generic/999
> >> Passed all 1 tests
> >>
> >>
> >> And this is how it looks after my changes[1] to the test:
> >>
> >> $ ./run.sh
> >> FSTYP -- ext4
> >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> >> MKFS_OPTIONS -- -F /dev/vdc
> >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> >>
> >> generic/999 1s
> >> Ran: generic/999
> >> Passed all 1 tests
> >>
> >> So, as far as I can tell, the casefold enabled is not being skipped
> >> after the fix to the test.
> >
> > Is this how it looks with your v6 or after fixing the bug:
> > https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
> >
> > Because for me this skipping started after fixing this bug
> > Maybe we fixed the bug incorrectly, but I did not see what the problem
> > was from a quick look.
> >
> > Can you test with my branch:
> > https://github.com/amir73il/linux/commits/ovl_casefold/
> >
>
> Right, our branches have a different base, mine is older and based on
> the tag vfs/vfs-6.18.mount.
>
> I have now tested with your branch, and indeed the test fails with
> "overlayfs does not support casefold enabled". I did some debugging and
> the missing commit from my branch that is making this difference here is
> e8bd877fb76bb9f3 ("ovl: fix possible double unlink"). After reverting it
> on top of your branch, the test works. I'm not sure yet why this
> prevents the mount, but this is the call trace when the error happens:
Wow, that is an interesting development race...
>
> TID/PID 860/860 (mount/mount):
>
> entry_SYSCALL_64_after_hwframe+0x77
> do_syscall_64+0xa2
> x64_sys_call+0x1bc3
> __x64_sys_fsconfig+0x46c
> vfs_cmd_create+0x60
> vfs_get_tree+0x2e
> ovl_get_tree+0x19
> get_tree_nodev+0x70
> ovl_fill_super+0x53b
> ! 0us [-EINVAL] ovl_parent_lock
>
> And for the ovl_parent_lock() arguments, *parent="work", *child="#7". So
> right now I'm trying to figure out why the dentry for #7 is not hashed.
>
The reason is this:
static struct dentry *ext4_lookup(...
{
...
if (IS_ENABLED(CONFIG_UNICODE) && !inode && IS_CASEFOLDED(dir)) {
/* Eventually we want to call d_add_ci(dentry, NULL)
* for negative dentries in the encoding case as
* well. For now, prevent the negative dentry
* from being cached.
*/
return NULL;
}
return d_splice_alias(inode, dentry);
}
Neil,
Apparently, the assumption that
ovl_lookup_temp() => ovl_lookup_upper() => lookup_one()
returns a hashed dentry is not always true.
It may be always true for all the filesystems that are currently
supported as an overlayfs upper layer fs (?), but it does not look
like you can count on this for the wider vfs effort and we should
try to come up with a solution for ovl_parent_lock() that will allow
enabling casefolding on overlayfs layers.
This patch seems to work. WDYT?
Thanks,
Amir.
commit 5dfcd10378038637648f3f422e3d5097eb6faa5f
Author: Amir Goldstein <amir73il@gmail.com>
Date: Wed Aug 27 19:55:26 2025 +0200
ovl: adapt ovl_parent_lock() to casefolded directories
e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
check of !d_unhashed(child) to try to verify that child dentry was not
unlinked while parent dir was unlocked.
This "was not unlinked" check has a false positive result in the case of
casefolded parent dir, because in that case, ovl_create_temp() returns
an unhashed dentry.
Change the "was not unlinked" check to use cant_mount(child).
cant_mount(child) means that child was unlinked while we have been
holding a reference to child, so it could not have become negative.
This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
after ovl_create_temp() and allows mount of overlayfs with casefolding
enabled layers.
Reported-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index bec4a39d1b97c..bffbb59776720 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1551,9 +1551,23 @@ void ovl_copyattr(struct inode *inode)
int ovl_parent_lock(struct dentry *parent, struct dentry *child)
{
+ bool is_unlinked;
+
inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
- if (!child ||
- (!d_unhashed(child) && child->d_parent == parent))
+ if (!child)
+ return 0;
+
+ /*
+ * After re-acquiring parent dir lock, verify that child was not moved
+ * to another parent and that it was not unlinked. cant_mount() means
+ * that child was unlinked while parent was unlocked. Since we are
+ * holding a reference to child, it could not have become negative.
+ * d_unhashed(child) is not a strong enough indication for unlinked,
+ * because with casefolded parent dir, ovl_create_temp() returns an
+ * unhashed dentry.
+ */
+ is_unlinked = cant_mount(child) || WARN_ON_ONCE(d_is_negative(child));
+ if (!is_unlinked && child->d_parent == parent)
return 0;
inode_unlock(parent->d_inode);
^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-27 18:06 ` Amir Goldstein
@ 2025-08-27 20:37 ` André Almeida
2025-08-27 23:58 ` NeilBrown
1 sibling, 0 replies; 53+ messages in thread
From: André Almeida @ 2025-08-27 20:37 UTC (permalink / raw)
To: Amir Goldstein, NeilBrown
Cc: Miklos Szeredi, Theodore Tso, Gabriel Krisman Bertazi,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On 27/08/2025 15:06, Amir Goldstein wrote:
[...]
>
> The reason is this:
>
> static struct dentry *ext4_lookup(...
> {
> ...
> if (IS_ENABLED(CONFIG_UNICODE) && !inode && IS_CASEFOLDED(dir)) {
> /* Eventually we want to call d_add_ci(dentry, NULL)
> * for negative dentries in the encoding case as
> * well. For now, prevent the negative dentry
> * from being cached.
> */
> return NULL;
> }
>
> return d_splice_alias(inode, dentry);
> }
>
> Neil,
>
> Apparently, the assumption that
> ovl_lookup_temp() => ovl_lookup_upper() => lookup_one()
> returns a hashed dentry is not always true.
>
> It may be always true for all the filesystems that are currently
> supported as an overlayfs upper layer fs (?), but it does not look
> like you can count on this for the wider vfs effort and we should
> try to come up with a solution for ovl_parent_lock() that will allow
> enabling casefolding on overlayfs layers.
>
> This patch seems to work. WDYT?
>
> Thanks,
> Amir.
>
Thank you for the fix!
> commit 5dfcd10378038637648f3f422e3d5097eb6faa5f
> Author: Amir Goldstein <amir73il@gmail.com>
> Date: Wed Aug 27 19:55:26 2025 +0200
>
> ovl: adapt ovl_parent_lock() to casefolded directories
>
> e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
Just to make checkpatch happy, this should be:
Commit e8bd877fb76b ("ovl: fix possible double unlink") added a sanity
> check of !d_unhashed(child) to try to verify that child dentry was not
> unlinked while parent dir was unlocked.
>
> This "was not unlinked" check has a false positive result in the case of
> casefolded parent dir, because in that case, ovl_create_temp() returns
> an unhashed dentry.
>
> Change the "was not unlinked" check to use cant_mount(child).
> cant_mount(child) means that child was unlinked while we have been
> holding a reference to child, so it could not have become negative.
>
> This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
> after ovl_create_temp() and allows mount of overlayfs with casefolding
> enabled layers.
>
> Reported-by: André Almeida <andrealmeid@igalia.com>
> Link: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
I think the correct chain here is:
Reported-by: André Almeida <andrealmeid@igalia.com>
Closes:
https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
Fixes: e8bd877fb76b ("ovl: fix possible double unlink")
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index bec4a39d1b97c..bffbb59776720 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1551,9 +1551,23 @@ void ovl_copyattr(struct inode *inode)
>
> int ovl_parent_lock(struct dentry *parent, struct dentry *child)
> {
> + bool is_unlinked;
> +
> inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> - if (!child ||
> - (!d_unhashed(child) && child->d_parent == parent))
> + if (!child)
> + return 0;
> +
> + /*
> + * After re-acquiring parent dir lock, verify that child was not moved
> + * to another parent and that it was not unlinked. cant_mount() means
> + * that child was unlinked while parent was unlocked. Since we are
> + * holding a reference to child, it could not have become negative.
> + * d_unhashed(child) is not a strong enough indication for unlinked,
> + * because with casefolded parent dir, ovl_create_temp() returns an
> + * unhashed dentry.
> + */
> + is_unlinked = cant_mount(child) || WARN_ON_ONCE(d_is_negative(child));
> + if (!is_unlinked && child->d_parent == parent)
> return 0;
>
> inode_unlock(parent->d_inode);
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-26 7:19 ` Amir Goldstein
2025-08-26 15:02 ` Gabriel Krisman Bertazi
2025-08-26 20:01 ` André Almeida
@ 2025-08-27 20:45 ` André Almeida
2025-08-28 11:09 ` Amir Goldstein
2 siblings, 1 reply; 53+ messages in thread
From: André Almeida @ 2025-08-27 20:45 UTC (permalink / raw)
To: Amir Goldstein, Gabriel Krisman Bertazi
Cc: Miklos Szeredi, Theodore Tso, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Christian Brauner, Jan Kara,
kernel-dev
On 26/08/2025 04:19, Amir Goldstein wrote:
>
> Andre,
>
> Just noticed this is a bug, should have been if (*dst), but anyway following
> Gabriel's comments I have made this change in my tree (pending more
> strict related changes):
>
> static int ovl_casefold(struct ovl_readdir_data *rdd, const char *str, int len,
> char **dst)
> {
> const struct qstr qstr = { .name = str, .len = len };
> char *cf_name;
> int cf_len;
>
> if (!IS_ENABLED(CONFIG_UNICODE) || !rdd->map || is_dot_dotdot(str, len))
> return 0;
>
> cf_name = kmalloc(NAME_MAX, GFP_KERNEL);
> if (!cf_name) {
> rdd->err = -ENOMEM;
> return -ENOMEM;
> }
>
> cf_len = utf8_casefold(rdd->map, &qstr, *dst, NAME_MAX);
The third argument here should be cf_name, not *dst anymore.
> if (cf_len > 0)
> *dst = cf_name;
> else
> kfree(cf_name);
>
> return cf_len;
> }
>
> Thanks,
> Amir.
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-27 18:06 ` Amir Goldstein
2025-08-27 20:37 ` André Almeida
@ 2025-08-27 23:58 ` NeilBrown
2025-08-28 3:15 ` Gabriel Krisman Bertazi
1 sibling, 1 reply; 53+ messages in thread
From: NeilBrown @ 2025-08-27 23:58 UTC (permalink / raw)
To: Amir Goldstein
Cc: André Almeida, Miklos Szeredi, Theodore Tso,
Gabriel Krisman Bertazi, linux-unionfs, linux-kernel,
linux-fsdevel, Alexander Viro, Christian Brauner, Jan Kara,
kernel-dev
On Thu, 28 Aug 2025, Amir Goldstein wrote:
> On Tue, Aug 26, 2025 at 9:01 PM André Almeida <andrealmeid@igalia.com> wrote:
> >
> >
> >
> > On 26/08/2025 04:31, Amir Goldstein wrote:
> > > On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@igalia.com> wrote:
> > >>
> > >> Hi Amir,
> > >>
> > >> On 22/08/2025 16:17, Amir Goldstein wrote:
> > >>
> > >> [...]
> > >>
> > >> /*
> > >>>>>> - * Allow filesystems that are case-folding capable but deny composing
> > >>>>>> - * ovl stack from case-folded directories.
> > >>>>>> + * Exceptionally for layers with casefold, we accept that they have
> > >>>>>> + * their own hash and compare operations
> > >>>>>> */
> > >>>>>> - if (sb_has_encoding(dentry->d_sb))
> > >>>>>> - return IS_CASEFOLDED(d_inode(dentry));
> > >>>>>> + if (ofs->casefold)
> > >>>>>> + return false;
> > >>>>>
> > >>>>> I think this is better as:
> > >>>>> if (sb_has_encoding(dentry->d_sb))
> > >>>>> return false;
> > >>>>>
> > >>>
> > >>> And this still fails the test "Casefold enabled" for me.
> > >>>
> > >>> Maybe you are confused because this does not look like
> > >>> a test failure. It looks like this:
> > >>>
> > >>> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
> > >>> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
> > >>> casefold
> > >>> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
> > >>> name='subdir', err=-116): parent wrong casefold
> > >>> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
> > >>> name='casefold', err=-66): child wrong casefold
> > >>> [19:10:24] [not run]
> > >>> generic/999 -- overlayfs does not support casefold enabled layers
> > >>> Ran: generic/999
> > >>> Not run: generic/999
> > >>> Passed all 1 tests
> > >>>
> > >>
> > >> This is how the test output looks before my changes[1] to the test:
> > >>
> > >> $ ./run.sh
> > >> FSTYP -- ext4
> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> > >> MKFS_OPTIONS -- -F /dev/vdc
> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> > >>
> > >> generic/999 1s ... [not run] overlayfs does not support casefold enabled
> > >> layers
> > >> Ran: generic/999
> > >> Not run: generic/999
> > >> Passed all 1 tests
> > >>
> > >>
> > >> And this is how it looks after my changes[1] to the test:
> > >>
> > >> $ ./run.sh
> > >> FSTYP -- ext4
> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> > >> MKFS_OPTIONS -- -F /dev/vdc
> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> > >>
> > >> generic/999 1s
> > >> Ran: generic/999
> > >> Passed all 1 tests
> > >>
> > >> So, as far as I can tell, the casefold enabled is not being skipped
> > >> after the fix to the test.
> > >
> > > Is this how it looks with your v6 or after fixing the bug:
> > > https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
> > >
> > > Because for me this skipping started after fixing this bug
> > > Maybe we fixed the bug incorrectly, but I did not see what the problem
> > > was from a quick look.
> > >
> > > Can you test with my branch:
> > > https://github.com/amir73il/linux/commits/ovl_casefold/
> > >
> >
> > Right, our branches have a different base, mine is older and based on
> > the tag vfs/vfs-6.18.mount.
> >
> > I have now tested with your branch, and indeed the test fails with
> > "overlayfs does not support casefold enabled". I did some debugging and
> > the missing commit from my branch that is making this difference here is
> > e8bd877fb76bb9f3 ("ovl: fix possible double unlink"). After reverting it
> > on top of your branch, the test works. I'm not sure yet why this
> > prevents the mount, but this is the call trace when the error happens:
>
> Wow, that is an interesting development race...
>
> >
> > TID/PID 860/860 (mount/mount):
> >
> > entry_SYSCALL_64_after_hwframe+0x77
> > do_syscall_64+0xa2
> > x64_sys_call+0x1bc3
> > __x64_sys_fsconfig+0x46c
> > vfs_cmd_create+0x60
> > vfs_get_tree+0x2e
> > ovl_get_tree+0x19
> > get_tree_nodev+0x70
> > ovl_fill_super+0x53b
> > ! 0us [-EINVAL] ovl_parent_lock
> >
> > And for the ovl_parent_lock() arguments, *parent="work", *child="#7". So
> > right now I'm trying to figure out why the dentry for #7 is not hashed.
> >
>
> The reason is this:
>
> static struct dentry *ext4_lookup(...
> {
> ...
> if (IS_ENABLED(CONFIG_UNICODE) && !inode && IS_CASEFOLDED(dir)) {
> /* Eventually we want to call d_add_ci(dentry, NULL)
> * for negative dentries in the encoding case as
> * well. For now, prevent the negative dentry
> * from being cached.
> */
> return NULL;
> }
>
> return d_splice_alias(inode, dentry);
> }
>
> Neil,
>
> Apparently, the assumption that
> ovl_lookup_temp() => ovl_lookup_upper() => lookup_one()
> returns a hashed dentry is not always true.
>
> It may be always true for all the filesystems that are currently
> supported as an overlayfs upper layer fs (?), but it does not look
> like you can count on this for the wider vfs effort and we should
> try to come up with a solution for ovl_parent_lock() that will allow
> enabling casefolding on overlayfs layers.
>
> This patch seems to work. WDYT?
>
> Thanks,
> Amir.
>
> commit 5dfcd10378038637648f3f422e3d5097eb6faa5f
> Author: Amir Goldstein <amir73il@gmail.com>
> Date: Wed Aug 27 19:55:26 2025 +0200
>
> ovl: adapt ovl_parent_lock() to casefolded directories
>
> e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
> check of !d_unhashed(child) to try to verify that child dentry was not
> unlinked while parent dir was unlocked.
>
> This "was not unlinked" check has a false positive result in the case of
> casefolded parent dir, because in that case, ovl_create_temp() returns
> an unhashed dentry.
>
> Change the "was not unlinked" check to use cant_mount(child).
> cant_mount(child) means that child was unlinked while we have been
> holding a reference to child, so it could not have become negative.
>
> This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
> after ovl_create_temp() and allows mount of overlayfs with casefolding
> enabled layers.
>
> Reported-by: André Almeida <andrealmeid@igalia.com>
> Link: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index bec4a39d1b97c..bffbb59776720 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1551,9 +1551,23 @@ void ovl_copyattr(struct inode *inode)
>
> int ovl_parent_lock(struct dentry *parent, struct dentry *child)
> {
> + bool is_unlinked;
> +
> inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> - if (!child ||
> - (!d_unhashed(child) && child->d_parent == parent))
> + if (!child)
> + return 0;
> +
> + /*
> + * After re-acquiring parent dir lock, verify that child was not moved
> + * to another parent and that it was not unlinked. cant_mount() means
> + * that child was unlinked while parent was unlocked. Since we are
> + * holding a reference to child, it could not have become negative.
> + * d_unhashed(child) is not a strong enough indication for unlinked,
> + * because with casefolded parent dir, ovl_create_temp() returns an
> + * unhashed dentry.
> + */
> + is_unlinked = cant_mount(child) || WARN_ON_ONCE(d_is_negative(child));
> + if (!is_unlinked && child->d_parent == parent)
> return 0;
>
> inode_unlock(parent->d_inode);
>
I don't feel comfortable with that. Letting ovl_parent_lock() succeed
on an unhashed dentry doesn't work for my longer term plans for locking.
I would really rather we got that dentry hashed.
What is happening is:
- lookup on non-existent name -> unhashed dentry
- vfs_create on that dentry - still unhashed
- rename of that unhashed dentry -> confusion in ovl_parent_lock()
If this were being done from user-space there would be another lookup
after the create and before the rename, and that would result in a
hashed dentry.
Could ovl_create_real() do a lookup for the name if the dentry isn't
hashed? That should result in a dentry that can safely be passed to
ovl_parent_lock().
NeilBrown
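
The three steps listed above, and the options discussed in this
subthread, can be followed with a small userspace model. It is a sketch
under loud assumptions: struct mock_dentry and all mock_ functions are
hypothetical, a dentry is reduced to three booleans, and "re-lookup
hashes the dentry" is simply asserted by the model rather than derived
from any filesystem code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct dentry. */
struct mock_dentry {
	const struct mock_dentry *parent;
	bool hashed;		/* models !d_unhashed() */
	bool positive;		/* models !d_is_negative() */
	bool cant_mount;	/* models cant_mount() */
};

/* Steps 1 and 2: lookup of a non-existent name, then create. */
static struct mock_dentry mock_create_temp(const struct mock_dentry *dir,
					   bool dir_casefolded)
{
	struct mock_dentry d = {
		.parent = dir,
		/* a casefolded dir does not cache the negative dentry */
		.hashed = !dir_casefolded,
		.positive = false,
		.cant_mount = false,
	};

	d.positive = true;	/* create: inode attached, hash state unchanged */
	return d;
}

/* Check with !d_unhashed(): an unhashed child counts as unlinked. */
static int mock_lock_check_hashed(const struct mock_dentry *parent,
				  const struct mock_dentry *child)
{
	assert(parent);

	if (!child || (child->hashed && child->parent == parent))
		return 0;
	return -EINVAL;
}

/* Check from the proposed patch: rely on cant_mount() instead. */
static int mock_lock_check_cant_mount(const struct mock_dentry *parent,
				      const struct mock_dentry *child)
{
	bool is_unlinked;

	assert(parent);

	if (!child)
		return 0;

	is_unlinked = child->cant_mount || !child->positive;
	if (!is_unlinked && child->parent == parent)
		return 0;
	return -EINVAL;
}

/* Suggested alternative: look the name up again after create. */
static void mock_relookup(struct mock_dentry *d)
{
	d->hashed = true;
}
```

In this model the original check only fails for the casefolded case, the
patched check accepts it, and a re-lookup lets the original check pass
unchanged.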
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-27 23:58 ` NeilBrown
@ 2025-08-28 3:15 ` Gabriel Krisman Bertazi
2025-08-28 7:25 ` Amir Goldstein
0 siblings, 1 reply; 53+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-08-28 3:15 UTC (permalink / raw)
To: NeilBrown
Cc: Amir Goldstein, André Almeida, Miklos Szeredi, Theodore Tso,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
"NeilBrown" <neil@brown.name> writes:
> On Thu, 28 Aug 2025, Amir Goldstein wrote:
>> On Tue, Aug 26, 2025 at 9:01 PM André Almeida <andrealmeid@igalia.com> wrote:
>> >
>> >
>> >
>> > On 26/08/2025 04:31, Amir Goldstein wrote:
>> > > On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@igalia.com> wrote:
>> > >>
>> > >> Hi Amir,
>> > >>
>> > >> On 22/08/2025 16:17, Amir Goldstein wrote:
>> > >>
>> > >> [...]
>> > >>
>> > >> /*
>> > >>>>>> - * Allow filesystems that are case-folding capable but deny composing
>> > >>>>>> - * ovl stack from case-folded directories.
>> > >>>>>> + * Exceptionally for layers with casefold, we accept that they have
>> > >>>>>> + * their own hash and compare operations
>> > >>>>>> */
>> > >>>>>> - if (sb_has_encoding(dentry->d_sb))
>> > >>>>>> - return IS_CASEFOLDED(d_inode(dentry));
>> > >>>>>> + if (ofs->casefold)
>> > >>>>>> + return false;
>> > >>>>>
>> > >>>>> I think this is better as:
>> > >>>>> if (sb_has_encoding(dentry->d_sb))
>> > >>>>> return false;
>> > >>>>>
>> > >>>
>> > >>> And this still fails the test "Casefold enabled" for me.
>> > >>>
>> > >>> Maybe you are confused because this does not look like
>> > >>> a test failure. It looks like this:
>> > >>>
>> > >>> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
>> > >>> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
>> > >>> casefold
>> > >>> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
>> > >>> name='subdir', err=-116): parent wrong casefold
>> > >>> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
>> > >>> name='casefold', err=-66): child wrong casefold
>> > >>> [19:10:24] [not run]
>> > >>> generic/999 -- overlayfs does not support casefold enabled layers
>> > >>> Ran: generic/999
>> > >>> Not run: generic/999
>> > >>> Passed all 1 tests
>> > >>>
>> > >>
>> > >> This is how the test output looks before my changes[1] to the test:
>> > >>
>> > >> $ ./run.sh
>> > >> FSTYP -- ext4
>> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
>> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
>> > >> MKFS_OPTIONS -- -F /dev/vdc
>> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
>> > >>
>> > >> generic/999 1s ... [not run] overlayfs does not support casefold enabled
>> > >> layers
>> > >> Ran: generic/999
>> > >> Not run: generic/999
>> > >> Passed all 1 tests
>> > >>
>> > >>
>> > >> And this is how it looks after my changes[1] to the test:
>> > >>
>> > >> $ ./run.sh
>> > >> FSTYP -- ext4
>> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
>> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
>> > >> MKFS_OPTIONS -- -F /dev/vdc
>> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
>> > >>
>> > >> generic/999 1s
>> > >> Ran: generic/999
>> > >> Passed all 1 tests
>> > >>
>> > >> So, as far as I can tell, the casefold enabled is not being skipped
>> > >> after the fix to the test.
>> > >
>> > > Is this how it looks with your v6 or after fixing the bug:
>> > > https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
>> > >
>> > > Because for me this skipping started after fixing this bug
>> > > Maybe we fixed the bug incorrectly, but I did not see what the problem
>> > > was from a quick look.
>> > >
>> > > Can you test with my branch:
>> > > https://github.com/amir73il/linux/commits/ovl_casefold/
>> > >
>> >
>> > Right, our branches have a different base, mine is older and based on
>> > the tag vfs/vfs-6.18.mount.
>> >
>> > I have now tested with your branch, and indeed the test fails with
>> > "overlayfs does not support casefold enabled". I did some debugging and
>> > the missing commit from my branch that is making this difference here is
>> > e8bd877fb76bb9f3 ("ovl: fix possible double unlink"). After reverting it
>> > on top of your branch, the test works. I'm not sure yet why this
>> > prevents the mount, but this is the call trace when the error happens:
>>
>> Wow, that is an interesting development race...
>>
>> >
>> > TID/PID 860/860 (mount/mount):
>> >
>> > entry_SYSCALL_64_after_hwframe+0x77
>> > do_syscall_64+0xa2
>> > x64_sys_call+0x1bc3
>> > __x64_sys_fsconfig+0x46c
>> > vfs_cmd_create+0x60
>> > vfs_get_tree+0x2e
>> > ovl_get_tree+0x19
>> > get_tree_nodev+0x70
>> > ovl_fill_super+0x53b
>> > ! 0us [-EINVAL] ovl_parent_lock
>> >
>> > And for the ovl_parent_lock() arguments, *parent="work", *child="#7". So
>> > right now I'm trying to figure out why the dentry for #7 is not hashed.
>> >
>>
>> The reason is this:
>>
>> static struct dentry *ext4_lookup(...
>> {
>> ...
>> if (IS_ENABLED(CONFIG_UNICODE) && !inode && IS_CASEFOLDED(dir)) {
>> /* Eventually we want to call d_add_ci(dentry, NULL)
>> * for negative dentries in the encoding case as
>> * well. For now, prevent the negative dentry
>> * from being cached.
>> */
>> return NULL;
>> }
>>
>> return d_splice_alias(inode, dentry);
>> }
>>
>> Neil,
>>
>> Apparently, the assumption that
>> ovl_lookup_temp() => ovl_lookup_upper() => lookup_one()
>> returns a hashed dentry is not always true.
>>
>> It may be always true for all the filesystems that are currently
>> supported as an overlayfs
>> upper layer fs (?), but it does not look like you can count on this
>> for the wider vfs effort
>> and we should try to come up with a solution for ovl_parent_lock()
>> that will allow enabling
>> casefolding on overlayfs layers.
>>
>> This patch seems to work. WDYT?
>>
>> Thanks,
>> Amir.
>>
>> commit 5dfcd10378038637648f3f422e3d5097eb6faa5f
>> Author: Amir Goldstein <amir73il@gmail.com>
>> Date: Wed Aug 27 19:55:26 2025 +0200
>>
>> ovl: adapt ovl_parent_lock() to casefolded directories
>>
>> e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
>> check of !d_unhashed(child) to try to verify that child dentry was not
>> unlinked while parent dir was unlocked.
>>
>> This "was not unlinked" check has a false positive result in the case of
>> casefolded parent dir, because in that case, ovl_create_temp() returns
>> an unhashed dentry.
>>
>> Change the "was not unlinked" check to use cant_mount(child).
>> cant_mount(child) means that child was unlinked while we have been
>> holding a reference to child, so it could not have become negative.
>>
>> This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
>> after ovl_create_temp() and allows mount of overlayfs with casefolding
>> enabled layers.
>>
>> Reported-by: André Almeida <andrealmeid@igalia.com>
>> Link: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>
>> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
>> index bec4a39d1b97c..bffbb59776720 100644
>> --- a/fs/overlayfs/util.c
>> +++ b/fs/overlayfs/util.c
>> @@ -1551,9 +1551,23 @@ void ovl_copyattr(struct inode *inode)
>>
>> int ovl_parent_lock(struct dentry *parent, struct dentry *child)
>> {
>> + bool is_unlinked;
>> +
>> inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
>> - if (!child ||
>> - (!d_unhashed(child) && child->d_parent == parent))
>> + if (!child)
>> + return 0;
>> +
>> + /*
>> + * After re-acquiring parent dir lock, verify that child was not moved
>> + * to another parent and that it was not unlinked. cant_mount() means
>> + * that child was unlinked while parent was unlocked. Since we are
>> + * holding a reference to child, it could not have become negative.
>> + * d_unhashed(child) is not a strong enough indication for unlinked,
>> + * because with casefolded parent dir, ovl_create_temp() returns an
>> + * unhashed dentry.
>> + */
>> + is_unlinked = cant_mount(child) || WARN_ON_ONCE(d_is_negative(child));
>> + if (!is_unlinked && child->d_parent == parent)
>> return 0;
>>
>> inode_unlock(parent->d_inode);
>>
>
> I don't feel comfortable with that. Letting ovl_parent_lock() succeed
> on an unhashed dentry doesn't work for my longer term plans for locking.
> I would really rather we got that dentry hashed.
>
> What is happening is:
> - lookup on non-existent name -> unhashed dentry
> - vfs_create on that dentry - still unhashed
> - rename of that unhashed dentry -> confusion in ovl_parent_lock()
>
> If this were being done from user-space there would be another lookup
> after the create and before the rename, and that would result in a
> hashed dentry.
>
> Could ovl_create_real() do a lookup for the name if the dentry isn't
> hashed? That should result in a dentry that can safely be passed to
> ovl_parent_lock().
Might be a good time to mention I have a branch enabling negative
dentries in casefolded directories. It didn't have any major issues last
time I posted, but it didn't get much interest. It should be enough to
resolve the unhashed dentries after a lookup due to casefolding.
I'd need to revisit and retest, but it is a way out of it.
--
Gabriel Krisman Bertazi
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-28 3:15 ` Gabriel Krisman Bertazi
@ 2025-08-28 7:25 ` Amir Goldstein
2025-08-28 16:44 ` Amir Goldstein
2025-08-29 1:25 ` NeilBrown
0 siblings, 2 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-28 7:25 UTC (permalink / raw)
To: Gabriel Krisman Bertazi, NeilBrown
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev
On Thu, Aug 28, 2025 at 5:15 AM Gabriel Krisman Bertazi
<gabriel@krisman.be> wrote:
>
> "NeilBrown" <neil@brown.name> writes:
>
> > On Thu, 28 Aug 2025, Amir Goldstein wrote:
> >> [...]
> >
> > I don't feel comfortable with that. Letting ovl_parent_lock() succeed
> > on an unhashed dentry doesn't work for my longer term plans for locking.
> > I would really rather we got that dentry hashed.
> >
> > What is happening is:
> > - lookup on non-existent name -> unhashed dentry
> > - vfs_create on that dentry - still unhashed
> > - rename of that unhashed dentry -> confusion in ovl_parent_lock()
> >
> > If this were being done from user-space there would be another lookup
> > after the create and before the rename, and that would result in a
> > hashed dentry.
> >
> > Could ovl_create_real() do a lookup for the name if the dentry isn't
> > hashed? That should result in a dentry that can safely be passed to
> > ovl_parent_lock().
>
> Might be a good time to mention I have a branch enabling negative
> dentries in casefolded directories. It didn't have any major issues last
> time I posted, but it didn't get much interest. It should be enough to
> resolve the unhashed dentries after a lookup due to casefolding.
>
> I'd need to revisit and retest, but it is a way out of it.
>
That's definitely a way out, but I don't know if it's needed to unblock the
ovl_casefold work.
I will try Neil's suggestion because it makes sense.
Neil,
FYI, if your future work for vfs assumes that fs will always have the
dentry hashed after create, you may want to look at:
static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
...
	/* Force lookup of new upper hardlink to find its lower */
	if (hardlink)
		d_drop(dentry);
	return 0;
}
If your assumption is not true for overlayfs, it may not be true for other fs
as well. How could you verify that it is correct?
I really hope that you have some opt-in strategy in mind, so those new
dirops assumptions would not have to include all possible filesystems.
Thanks,
Amir.
* Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()
2025-08-27 20:45 ` André Almeida
@ 2025-08-28 11:09 ` Amir Goldstein
0 siblings, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2025-08-28 11:09 UTC (permalink / raw)
To: André Almeida
Cc: Gabriel Krisman Bertazi, Miklos Szeredi, Theodore Tso,
linux-unionfs, linux-kernel, linux-fsdevel, Alexander Viro,
Christian Brauner, Jan Kara, kernel-dev
On Wed, Aug 27, 2025 at 10:45 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> Em 26/08/2025 04:19, Amir Goldstein escreveu:
> >
> > Andre,
> >
> > Just noticed this is a bug, should have been if (*dst), but anyway following
> > Gabriel's comments I have made this change in my tree (pending more
> > strict related changes):
> >
> > static int ovl_casefold(struct ovl_readdir_data *rdd, const char *str, int len,
> > char **dst)
> > {
> > const struct qstr qstr = { .name = str, .len = len };
> > char *cf_name;
> > int cf_len;
> >
> > if (!IS_ENABLED(CONFIG_UNICODE) || !rdd->map || is_dot_dotdot(str, len))
> > return 0;
> >
> > cf_name = kmalloc(NAME_MAX, GFP_KERNEL);
> > if (!cf_name) {
> > rdd->err = -ENOMEM;
> > return -ENOMEM;
> > }
> >
> > cf_len = utf8_casefold(rdd->map, &qstr, *dst, NAME_MAX);
>
> The third argument here should be cf_name, not *dst anymore.
oops. fixed in my tree.
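For illustration only, a minimal userspace sketch of the corrected pattern: fold into the freshly allocated buffer and publish it through *dst only on success. This is not the kernel code. ascii_casefold() is a hypothetical ASCII-only stand-in for utf8_casefold(), and NAME_BUF_LEN, casefold_name() and malloc()/free() are assumptions standing in for NAME_MAX, ovl_casefold() and kmalloc()/kfree().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_BUF_LEN 255	/* hypothetical stand-in for NAME_MAX */

/*
 * Hypothetical stand-in for utf8_casefold(): ASCII-only lowercasing.
 * Returns the folded length, or -EINVAL if the name does not fit.
 */
static int ascii_casefold(const char *str, int len, char *dest, int maxlen)
{
	if (len > maxlen)
		return -EINVAL;
	for (int i = 0; i < len; i++)
		dest[i] = (char)tolower((unsigned char)str[i]);
	return len;
}

/*
 * Fold str into a newly allocated buffer. *dst is written only when
 * folding succeeds; on failure the buffer is freed and *dst is untouched.
 */
static int casefold_name(const char *str, int len, char **dst)
{
	char *cf_name;
	int cf_len;

	assert(str && dst);

	cf_name = malloc(NAME_BUF_LEN);
	if (!cf_name)
		return -ENOMEM;

	/* The destination is cf_name (just allocated), not *dst */
	cf_len = ascii_casefold(str, len, cf_name, NAME_BUF_LEN);
	if (cf_len <= 0) {
		free(cf_name);
		return cf_len;
	}

	*dst = cf_name;
	return cf_len;
}
```

A caller would pass the address of a NULL-initialised char pointer and free the result once it is done with the folded name.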
Thanks,
Amir.
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-28 7:25 ` Amir Goldstein
@ 2025-08-28 16:44 ` Amir Goldstein
2025-08-29 1:27 ` NeilBrown
2025-08-29 1:25 ` NeilBrown
1 sibling, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-28 16:44 UTC (permalink / raw)
To: NeilBrown
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev, Gabriel Krisman Bertazi
On Thu, Aug 28, 2025 at 9:25 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Thu, Aug 28, 2025 at 5:15 AM Gabriel Krisman Bertazi
> <gabriel@krisman.be> wrote:
> >
> > "NeilBrown" <neil@brown.name> writes:
> >
> > > On Thu, 28 Aug 2025, Amir Goldstein wrote:
> > >> [...]
> > >
> > > I don't feel comfortable with that. Letting ovl_parent_lock() succeed
> > > on an unhashed dentry doesn't work for my longer term plans for locking.
> > > I would really rather we got that dentry hashed.
> > >
> > > What is happening is:
> > > - lookup on non-existent name -> unhashed dentry
> > > - vfs_create on that dentry - still unhashed
> > > - rename of that unhashed dentry -> confusion in ovl_parent_lock()
> > >
> > > If this were being done from user-space there would be another lookup
> > > after the create and before the rename, and that would result in a
> > > hashed dentry.
> > >
> > > Could ovl_create_real() do a lookup for the name if the dentry isn't
> > > hashed? That should result in a dentry that can safely be passed to
> > > ovl_parent_lock().
> >
See patch below.
Seems to get the job done.
Thanks,
Amir.
>
> FYI, if your future work for vfs assumes that fs will always have the
> dentry hashed after create, you may want to look at:
>
> static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
> ...
> /* Force lookup of new upper hardlink to find its lower */
> if (hardlink)
> d_drop(dentry);
>
> return 0;
> }
>
> If your assumption is not true for overlayfs, it may not be true for other fs
> as well. How could you verify that it is correct?
>
> I really hope that you have some opt-in strategy in mind, so those new
> dirops assumptions would not have to include all possible filesystems.
>
commit 32786370148617766043f6d054ff40758ce79f21 (HEAD -> ovl_casefold)
Author: Amir Goldstein <amir73il@gmail.com>
Date: Wed Aug 27 19:55:26 2025 +0200
ovl: make sure that ovl_create_real() returns a hashed dentry
e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
check of !d_unhashed(child) to try to verify that child dentry was not
unlinked while parent dir was unlocked.
This "was not unlinked" check has a false positive result in the case of
casefolded parent dir, because in that case, ovl_create_temp() returns
an unhashed dentry after ovl_create_real() gets an unhashed dentry from
ovl_lookup_upper() and makes it positive.
To avoid returning unhashed dentry from ovl_create_temp(), let
ovl_create_real() lookup again after making the newdentry positive,
so it always returns a hashed positive dentry (or an error).
This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
after ovl_create_temp() and allows mount of overlayfs with casefolding
enabled layers.
Reported-by: André Almeida <andrealmeid@igalia.com>
Closes: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
Suggested-by: Neil Brown <neil@brown.name>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 538a1b2dbb387..a5e9ddf3023b3 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -212,12 +212,32 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct dentry *parent,
 			err = -EPERM;
 		}
 	}
-	if (!err && WARN_ON(!newdentry->d_inode)) {
+	if (err)
+		goto out;
+
+	if (WARN_ON(!newdentry->d_inode)) {
 		/*
 		 * Not quite sure if non-instantiated dentry is legal or not.
 		 * VFS doesn't seem to care so check and warn here.
 		 */
 		err = -EIO;
+	} else if (d_unhashed(newdentry)) {
+		struct dentry *d;
+		/*
+		 * Some filesystems (e.g. casefolded) may return an unhashed
+		 * negative dentry from the ovl_lookup_upper() call before
+		 * ovl_create_real().
+		 * In that case, lookup again after making the newdentry
+		 * positive, so ovl_create_real() always returns a hashed
+		 * positive dentry.
+		 */
+		d = ovl_lookup_upper(ofs, newdentry->d_name.name, parent,
+				     newdentry->d_name.len);
+		dput(newdentry);
+		if (IS_ERR_OR_NULL(d))
+			err = d ? PTR_ERR(d) : -ENOENT;
+		else
+			return d;
 	}
 out:
 	if (err) {
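As a rough userspace model of the sequence this patch addresses (lookup of a missing name in a casefolded dir gives an unhashed negative dentry, create makes it positive but leaves it unhashed, a second lookup returns a hashed positive one), something like the sketch below may help. Everything in it is hypothetical: struct toy_dentry, struct toy_dir and the toy_*() helpers are simplified stand-ins, not the dcache or overlayfs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy dentry holding only the state discussed here (hypothetical) */
struct toy_dentry {
	char name[32];
	bool positive;	/* has an inode */
	bool hashed;	/* findable in the name cache */
};

/* Toy directory with a single name slot and a casefold flag */
struct toy_dir {
	bool casefolded;
	bool has_file;
	char file[32];
};

/*
 * Models a lookup: a negative result in a casefolded directory is
 * left unhashed, every other result is hashed.
 */
static void toy_lookup(const struct toy_dir *dir, const char *name,
		       struct toy_dentry *out)
{
	assert(dir && name && out);
	snprintf(out->name, sizeof(out->name), "%s", name);
	out->positive = dir->has_file && strcmp(dir->file, name) == 0;
	out->hashed = out->positive || !dir->casefolded;
}

/*
 * Models create followed by the extra lookup: creation makes the
 * dentry positive but does not hash it, so an unhashed result is
 * replaced by a fresh lookup of the same name.
 */
static void toy_create_real(struct toy_dir *dir, struct toy_dentry *d)
{
	assert(dir && d);
	snprintf(dir->file, sizeof(dir->file), "%s", d->name);
	dir->has_file = true;
	d->positive = true;

	if (!d->hashed) {
		struct toy_dentry again;

		toy_lookup(dir, d->name, &again);
		*d = again;
	}
}
```

In this model the non-casefolded case never takes the extra lookup, which matches the intent that only filesystems returning unhashed negative dentries pay for it.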
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-28 7:25 ` Amir Goldstein
2025-08-28 16:44 ` Amir Goldstein
@ 2025-08-29 1:25 ` NeilBrown
2025-08-29 9:31 ` Amir Goldstein
1 sibling, 1 reply; 53+ messages in thread
From: NeilBrown @ 2025-08-29 1:25 UTC (permalink / raw)
To: Amir Goldstein
Cc: Gabriel Krisman Bertazi, André Almeida, Miklos Szeredi,
Theodore Tso, linux-unionfs, linux-kernel, linux-fsdevel,
Alexander Viro, Christian Brauner, Jan Kara, kernel-dev
On Thu, 28 Aug 2025, Amir Goldstein wrote:
>
> Neil,
>
> FYI, if your future work for vfs assumes that fs will always have the
> dentry hashed after create, you may want to look at:
>
> static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
> ...
> /* Force lookup of new upper hardlink to find its lower */
> if (hardlink)
> d_drop(dentry);
>
> return 0;
> }
>
> If your assumption is not true for overlayfs, it may not be true for other fs
> as well. How could you verify that it is correct?
I don't need the dentry to be hashed after the create has completed (or
failed).
I only need it to be hashed when the create starts, and ideally for the
duration of the creation process.
Several filesystems d_drop() a newly created dentry so as to trigger a
lookup - overlayfs is not unique.
>
> I really hope that you have some opt-in strategy in mind, so those new
> dirops assumptions would not have to include all possible filesystems.
Filesystems will need to opt-in to not having the parent locked. If
a fs still has the parent locked across operations it doesn't really
matter when the d_drop() happens. However I want to move all the
d_drop()s to the end (which is where ovl has it) to ensure there are no
structural issues that mean an early d_drop() is needed. e.g. Some
filesystems d_drop() and then d_splice_alias() and I want to add a new
d_splice_alias() variant that doesn't require the d_drop().
So it is only at the start of an operation (create, remove, rename) that
I need the dentry to be hashed. That raises questions about ext4_lookup
not hashing a negative dentry as a lookup-create pair in do_mknodat or
lookup_open could call vfs_create with a non-hashed dentry.
That isn't *actually* a problem (I think - I should double-check) as the
dentry is still d_in_lookup() so it is hashed in the separate
in_lookup_hashtable(). So a d_lookup() will find it even though it
isn't hashed.
That suggests an alternate fix for ovl_parent_lock(). Rather than
insisting that the child is hashed, we can insist that either
d_in_lookup(child) || !d_unhashed(child)
Such a dentry really is hashed: it might be hashed in one table, it
might be hashed in the other.
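To make the relaxed condition concrete, here is a minimal userspace sketch.
It is not kernel code: struct toy_dentry and every toy_* name are
hypothetical stand-ins, and the two booleans only model what d_unhashed()
and d_in_lookup() report for a real dentry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct dentry: only the two states discussed above. */
struct toy_dentry {
	bool in_primary_hash;	/* models !d_unhashed(dentry) */
	bool in_lookup_hash;	/* models d_in_lookup(dentry) */
};

static bool toy_d_unhashed(const struct toy_dentry *d)
{
	return !d->in_primary_hash;
}

static bool toy_d_in_lookup(const struct toy_dentry *d)
{
	return d->in_lookup_hash;
}

/*
 * Relaxed check: accept the child if it can be found through either
 * table, instead of insisting on !d_unhashed() alone.
 */
static bool toy_child_is_findable(const struct toy_dentry *child)
{
	return toy_d_in_lookup(child) || !toy_d_unhashed(child);
}

/* Exercise the three states that matter for the discussion. */
static void toy_self_check(void)
{
	struct toy_dentry hashed   = { .in_primary_hash = true,  .in_lookup_hash = false };
	struct toy_dentry inlookup = { .in_primary_hash = false, .in_lookup_hash = true  };
	struct toy_dentry dropped  = { .in_primary_hash = false, .in_lookup_hash = false };

	assert(toy_child_is_findable(&hashed));
	assert(toy_child_is_findable(&inlookup));
	assert(!toy_child_is_findable(&dropped));
}
```

Only the third state, a dentry in neither table (for example after a
d_drop()), is rejected by this check.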
However that wouldn't protect against filesystems which deliberately
d_drop() during create, so I think ovl still needs to perform a lookup
after a create and before a rename - if the create succeeds but the
dentry is negative.
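A rough userspace model of that "lookup again after create" step follows.
All names here (toy_dir, toy_lookup(), toy_settle_after_create()) are
hypothetical, and as an assumption of this sketch the re-lookup is done
whenever the dentry handed back by create is negative or unhashed; the
real ovl code paths may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy dentry: a name plus the two properties the caller cares about. */
struct toy_dentry {
	const char *name;
	bool positive;	/* has an inode attached */
	bool hashed;	/* can be found by name */
};

#define TOY_DIR_SLOTS 4

/* Toy directory: a tiny table of entries, NULL slots are empty. */
struct toy_dir {
	struct toy_dentry *slot[TOY_DIR_SLOTS];
};

/* Return the hashed, positive entry called @name, or NULL if none. */
static struct toy_dentry *toy_lookup(struct toy_dir *dir, const char *name)
{
	for (size_t i = 0; i < TOY_DIR_SLOTS; i++) {
		struct toy_dentry *d = dir->slot[i];

		if (d && d->hashed && d->positive && strcmp(d->name, name) == 0)
			return d;
	}
	return NULL;
}

/*
 * After a successful create: if the dentry we hold is already positive
 * and hashed, keep it. Otherwise look the name up again so the caller
 * continues (e.g. to a rename) with a usable dentry, or gets -ENOENT.
 */
static int toy_settle_after_create(struct toy_dir *dir,
				   struct toy_dentry *created,
				   struct toy_dentry **out)
{
	struct toy_dentry *d;

	*out = NULL;
	if (created->positive && created->hashed) {
		*out = created;
		return 0;
	}
	d = toy_lookup(dir, created->name);
	if (!d)
		return -ENOENT;
	*out = d;
	return 0;
}

/* Small self-check: a stale dentry is replaced by the fresh one. */
static void toy_self_check(void)
{
	struct toy_dentry stale = { "f", false, false };
	struct toy_dentry fresh = { "f", true, true };
	struct toy_dir dir = { { &fresh } };
	struct toy_dentry *out;

	assert(toy_settle_after_create(&dir, &stale, &out) == 0);
	assert(out == &fresh);
}
```

The point of the model is only the control flow: the caller never proceeds
with the dentry it passed to create unless that dentry is already positive
and hashed.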
Thanks,
NeilBrown
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-28 16:44 ` Amir Goldstein
@ 2025-08-29 1:27 ` NeilBrown
0 siblings, 0 replies; 53+ messages in thread
From: NeilBrown @ 2025-08-29 1:27 UTC (permalink / raw)
To: Amir Goldstein
Cc: André Almeida, Miklos Szeredi, Theodore Tso, linux-unionfs,
linux-kernel, linux-fsdevel, Alexander Viro, Christian Brauner,
Jan Kara, kernel-dev, Gabriel Krisman Bertazi
On Fri, 29 Aug 2025, Amir Goldstein wrote:
>
> commit 32786370148617766043f6d054ff40758ce79f21 (HEAD -> ovl_casefold)
> Author: Amir Goldstein <amir73il@gmail.com>
> Date: Wed Aug 27 19:55:26 2025 +0200
>
> ovl: make sure that ovl_create_real() returns a hashed dentry
>
> e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
> check of !d_unhashed(child) to try to verify that child dentry was not
> unlinked while parent dir was unlocked.
>
> This "was not unlink" check has a false positive result in the case of
> casefolded parent dir, because in that case, ovl_create_temp() returns
> an unhashed dentry after ovl_create_real() gets an unhashed dentry from
> ovl_lookup_upper() and makes it positive.
>
> To avoid returning unhashed dentry from ovl_create_temp(), let
> ovl_create_real() lookup again after making the newdentry positive,
> so it always returns a hashed positive dentry (or an error).
>
> This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
> after ovl_create_temp() and allows mount of overlayfs with casefolding
> enabled layers.
>
> Reported-by: André Almeida <andrealmeid@igalia.com>
> Closes: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
> Suggested-by: Neil Brown <neil@brown.name>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: NeilBrown <neil@brown.name>
Thanks,
NeilBrown
>
> diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> index 538a1b2dbb387..a5e9ddf3023b3 100644
> --- a/fs/overlayfs/dir.c
> +++ b/fs/overlayfs/dir.c
> @@ -212,12 +212,32 @@ struct dentry *ovl_create_real(struct ovl_fs
> *ofs, struct dentry *parent,
> err = -EPERM;
> }
> }
> - if (!err && WARN_ON(!newdentry->d_inode)) {
> + if (err)
> + goto out;
> +
> + if (WARN_ON(!newdentry->d_inode)) {
> /*
> * Not quite sure if non-instantiated dentry is legal or not.
> * VFS doesn't seem to care so check and warn here.
> */
> err = -EIO;
> + } else if (d_unhashed(newdentry)) {
> + struct dentry *d;
> + /*
> + * Some filesystems (i.e. casefolded) may return an unhashed
> + * negative dentry from the ovl_lookup_upper() call before
> + * ovl_create_real().
> + * In that case, lookup again after making the newdentry
> + * positive, so ovl_create_upper() always returns a hashed
> + * positive dentry.
> + */
> + d = ovl_lookup_upper(ofs, newdentry->d_name.name, parent,
> + newdentry->d_name.len);
> + dput(newdentry);
> + if (IS_ERR_OR_NULL(d))
> + err = d ? PTR_ERR(d) : -ENOENT;
> + else
> + return d;
> }
> out:
> if (err) {
>
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-29 1:25 ` NeilBrown
@ 2025-08-29 9:31 ` Amir Goldstein
2025-09-01 22:02 ` NeilBrown
0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2025-08-29 9:31 UTC (permalink / raw)
To: NeilBrown
Cc: Gabriel Krisman Bertazi, André Almeida, Miklos Szeredi,
Theodore Tso, linux-unionfs, linux-kernel, linux-fsdevel,
Alexander Viro, Christian Brauner, Jan Kara, kernel-dev
On Fri, Aug 29, 2025 at 3:25 AM NeilBrown <neil@brown.name> wrote:
>
> On Thu, 28 Aug 2025, Amir Goldstein wrote:
> >
> > Neil,
> >
> > FYI, if your future work for vfs assumes that fs will always have the
> > dentry hashed after create, you may want to look at:
> >
> > static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
> > ...
> > /* Force lookup of new upper hardlink to find its lower */
> > if (hardlink)
> > d_drop(dentry);
> >
> > return 0;
> > }
> >
> > If your assumption is not true for overlayfs, it may not be true for other fs
> > as well. How could you verify that it is correct?
>
> I don't need the dentry to be hashed after the create has completed (or
> failed).
> I only need it to be hashed when the create starts, and ideally for the
> duration of the creation process.
> Several filesystems d_drop() a newly created dentry so as to trigger a
> lookup - overlayfs is not unique.
>
> >
> > I really hope that you have some opt-in strategy in mind, so those new
> > dirops assumptions would not have to include all possible filesystems.
>
> Filesystems will need to opt-in to not having the parent locked. If
> a fs still has the parent locked across operations it doesn't really
> matter when the d_drop() happens. However I want to move all the
> d_drop()s to the end (which is where ovl has it) to ensure there are no
> structural issues that mean an early d_drop() is needed. e.g. Some
> filesystems d_drop() and then d_splice_alias() and I want to add a new
> d_splice_alias() variant that doesn't require the d_drop().
>
Do you mean reverting c971e6a006175 ("kill d_instantiate_no_diralias()")?
In any case, I hope that in the end the semantics of the dentry state after
lookup/create will be clearer than they are now...
Thanks,
Amir.
* Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
2025-08-29 9:31 ` Amir Goldstein
@ 2025-09-01 22:02 ` NeilBrown
0 siblings, 0 replies; 53+ messages in thread
From: NeilBrown @ 2025-09-01 22:02 UTC (permalink / raw)
To: Amir Goldstein
Cc: Gabriel Krisman Bertazi, André Almeida, Miklos Szeredi,
Theodore Tso, linux-unionfs, linux-kernel, linux-fsdevel,
Alexander Viro, Christian Brauner, Jan Kara, kernel-dev
On Fri, 29 Aug 2025, Amir Goldstein wrote:
> On Fri, Aug 29, 2025 at 3:25 AM NeilBrown <neil@brown.name> wrote:
> >
> > On Thu, 28 Aug 2025, Amir Goldstein wrote:
> > >
> > > Neil,
> > >
> > > FYI, if your future work for vfs assumes that fs will always have the
> > > dentry hashed after create, you may want to look at:
> > >
> > > static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
> > > ...
> > > /* Force lookup of new upper hardlink to find its lower */
> > > if (hardlink)
> > > d_drop(dentry);
> > >
> > > return 0;
> > > }
> > >
> > > If your assumption is not true for overlayfs, it may not be true for other fs
> > > as well. How could you verify that it is correct?
> >
> > I don't need the dentry to be hashed after the create has completed (or
> > failed).
> > I only need it to be hashed when the create starts, and ideally for the
> > duration of the creation process.
> > Several filesystems d_drop() a newly created dentry so as to trigger a
> > lookup - overlayfs is not unique.
> >
> > >
> > > I really hope that you have some opt-in strategy in mind, so those new
> > > dirops assumptions would not have to include all possible filesystems.
> >
> > Filesystems will need to opt-in to not having the parent locked. If
> > a fs still has the parent locked across operations it doesn't really
> > matter when the d_drop() happens. However I want to move all the
> > d_drop()s to the end (which is where ovl has it) to ensure there are no
> > structural issues that mean an early d_drop() is needed. e.g. Some
> > filesystems d_drop() and then d_splice_alias() and I want to add a new
> > d_splice_alias() variant that doesn't require the d_drop().
> >
>
> Do you mean revert c971e6a006175 kill d_instantiate_no_diralias()?
Something like that, yes. Details will probably end up being a bit
different.
>
> In any case, I hope that in the end the semantics of state of dentry after
> lookup/create will be more clear than they are now...
That would be nice. Not just clear, but documented would be the aim.
NeilBrown
Thread overview: 53+ messages
2025-08-22 14:17 [PATCH v6 0/9] ovl: Enable support for casefold layers André Almeida
2025-08-22 14:17 ` [PATCH v6 1/9] fs: Create sb_encoding() helper André Almeida
2025-08-25 9:19 ` Gabriel Krisman Bertazi
2025-08-25 12:38 ` Gabriel Krisman Bertazi
2025-08-25 15:28 ` Amir Goldstein
2025-08-22 14:17 ` [PATCH v6 2/9] fs: Create sb_same_encoding() helper André Almeida
2025-08-23 10:02 ` Amir Goldstein
2025-08-25 9:24 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 3/9] ovl: Prepare for mounting case-insensitive enabled layers André Almeida
2025-08-25 10:42 ` Gabriel Krisman Bertazi
2025-08-22 14:17 ` [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp() André Almeida
2025-08-22 16:53 ` Amir Goldstein
2025-08-25 11:09 ` Gabriel Krisman Bertazi
2025-08-25 15:27 ` Amir Goldstein
2025-08-25 15:45 ` Amir Goldstein
2025-08-25 17:11 ` Gabriel Krisman Bertazi
2025-08-26 1:34 ` Gabriel Krisman Bertazi
2025-08-26 7:19 ` Amir Goldstein
2025-08-26 15:02 ` Gabriel Krisman Bertazi
2025-08-26 19:58 ` André Almeida
2025-08-27 9:28 ` Amir Goldstein
2025-08-26 20:01 ` André Almeida
2025-08-27 20:45 ` André Almeida
2025-08-28 11:09 ` Amir Goldstein
2025-08-22 14:17 ` [PATCH v6 5/9] ovl: Ensure that all layers have the same encoding André Almeida
2025-08-25 11:17 ` Gabriel Krisman Bertazi
2025-08-25 15:32 ` Amir Goldstein
2025-08-26 20:12 ` André Almeida
2025-08-27 9:17 ` Amir Goldstein
2025-08-22 14:17 ` [PATCH v6 6/9] ovl: Set case-insensitive dentry operations for ovl sb André Almeida
2025-08-25 11:24 ` Gabriel Krisman Bertazi
2025-08-25 15:34 ` Amir Goldstein
2025-08-26 20:13 ` André Almeida
2025-08-22 14:17 ` [PATCH v6 7/9] ovl: Add S_CASEFOLD as part of the inode flag to be copied André Almeida
2025-08-22 14:17 ` [PATCH v6 8/9] ovl: Check for casefold consistency when creating new dentries André Almeida
2025-08-22 14:17 ` [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers André Almeida
2025-08-22 16:34 ` Amir Goldstein
2025-08-22 16:47 ` André Almeida
2025-08-22 19:17 ` Amir Goldstein
2025-08-25 13:31 ` André Almeida
2025-08-26 7:31 ` Amir Goldstein
2025-08-26 19:01 ` André Almeida
2025-08-27 18:06 ` Amir Goldstein
2025-08-27 20:37 ` André Almeida
2025-08-27 23:58 ` NeilBrown
2025-08-28 3:15 ` Gabriel Krisman Bertazi
2025-08-28 7:25 ` Amir Goldstein
2025-08-28 16:44 ` Amir Goldstein
2025-08-29 1:27 ` NeilBrown
2025-08-29 1:25 ` NeilBrown
2025-08-29 9:31 ` Amir Goldstein
2025-09-01 22:02 ` NeilBrown
2025-08-22 19:28 ` [syzbot ci] Re: ovl: Enable support for casefold layers syzbot ci