* [PATCH RESEND 2/4] libext2fs: add quota to libext2fs
From: Etienne AUJAMES @ 2026-06-19 15:32 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o; +Cc: Andreas Dilger, Li Dongyang
In-Reply-To: <ajVdnQUu9tSrKldW@eaujamesFR0130>
add quota related interface to libext2fs and install the
relevant headers.
Change-Id: I17e6b5aa74e0f1bb1465168a1cf4e03184e003b0
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13241
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
---
lib/ext2fs/Makefile.in | 43 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/lib/ext2fs/Makefile.in b/lib/ext2fs/Makefile.in
index e9a6ced24..0656c4c5c 100644
--- a/lib/ext2fs/Makefile.in
+++ b/lib/ext2fs/Makefile.in
@@ -28,6 +28,8 @@ DEBUG_OBJS= debug_cmds.o extent_cmds.o tst_cmds.o debugfs.o util.o \
create_inode_libarchive.o journal.o revoke.o recovery.o \
do_journal.o do_orphan.o
+QUOTA_LIB_OBJS= mkquota.o quotaio.o quotaio_v2.o quotaio_tree.o dict.o
+
DEBUG_SRCS= debug_cmds.c extent_cmds.c tst_cmds.c \
$(top_srcdir)/debugfs/debugfs.c \
$(top_srcdir)/debugfs/util.c \
@@ -57,6 +59,7 @@ DEBUG_SRCS= debug_cmds.c extent_cmds.c tst_cmds.c \
@TDB_CMT@TDB_OBJ= tdb.o
OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_OBJS) $(E2IMAGE_LIB_OBJS) \
+ $(QUOTA_LIB_OBJS) \
$(TEST_IO_LIB_OBJS) \
ext2_err.o \
alloc.o \
@@ -236,6 +239,7 @@ SRCS= ext2_err.c \
HFILES= bitops.h ext2fs.h ext2_io.h ext2_fs.h ext2_ext_attr.h ext3_extents.h \
tdb.h qcow2.h hashmap.h
+QUOTA_HFILES= quotaio.h dqblk_v2.h quotaio_tree.h dict.h
HFILES_IN= ext2_err.h ext2_types.h
LIBRARY= libext2fs
@@ -459,6 +463,41 @@ do_orphan.o: $(top_srcdir)/debugfs/do_orphan.c
$(E) " CC $<"
$(Q) $(CC) $(DEBUGFS_CFLAGS) -c $< -o $@
+mkquota.o: $(top_srcdir)/lib/support/mkquota.c
+ $(E) " CC $<"
+ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -c $< -o $@
+@PROFILE_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -g -pg -o profiled/$*.o -c $<
+@ELF_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) -fPIC -shared -o elfshared/$*.o -c $<
+@BSDLIB_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) $(BSDLIB_PIC_FLAG) -o pic/$*.o -c $<
+
+quotaio.o: $(top_srcdir)/lib/support/quotaio.c
+ $(E) " CC $<"
+ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -c $< -o $@
+@PROFILE_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -g -pg -o profiled/$*.o -c $<
+@ELF_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) -fPIC -shared -o elfshared/$*.o -c $<
+@BSDLIB_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) $(BSDLIB_PIC_FLAG) -o pic/$*.o -c $<
+
+quotaio_v2.o: $(top_srcdir)/lib/support/quotaio_v2.c
+ $(E) " CC $<"
+ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -c $< -o $@
+@PROFILE_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -g -pg -o profiled/$*.o -c $<
+@ELF_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) -fPIC -shared -o elfshared/$*.o -c $<
+@BSDLIB_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) $(BSDLIB_PIC_FLAG) -o pic/$*.o -c $<
+
+quotaio_tree.o: $(top_srcdir)/lib/support/quotaio_tree.c
+ $(E) " CC $<"
+ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -c $< -o $@
+@PROFILE_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -g -pg -o profiled/$*.o -c $<
+@ELF_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) -fPIC -shared -o elfshared/$*.o -c $<
+@BSDLIB_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) $(BSDLIB_PIC_FLAG) -o pic/$*.o -c $<
+
+dict.o: $(top_srcdir)/lib/support/dict.c
+ $(E) " CC $<"
+ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -c $< -o $@
+@PROFILE_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_STLIB) -g -pg -o profiled/$*.o -c $<
+@ELF_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) -fPIC -shared -o elfshared/$*.o -c $<
+@BSDLIB_CMT@ $(Q) $(CC) -I$(top_srcdir)/lib/support $(ALL_CFLAGS_SHLIB) $(BSDLIB_PIC_FLAG) -o pic/$*.o -c $<
+
xattrs.o: $(top_srcdir)/debugfs/xattrs.c
$(E) " CC $<"
$(Q) $(CC) $(DEBUGFS_CFLAGS) -c $< -o $@
@@ -586,6 +625,10 @@ install:: all $(HFILES) $(HFILES_IN) installdirs ext2fs.pc
echo " INSTALL_DATA $(includedir)/ext2fs/$$i"; \
$(INSTALL_DATA) $(srcdir)/$$i $(DESTDIR)$(includedir)/ext2fs/$$i; \
done
+ $(Q) for i in $(QUOTA_HFILES); do \
+ echo " INSTALL_DATA $(includedir)/ext2fs/$$i"; \
+ $(INSTALL_DATA) $(top_srcdir)/lib/support/$$i $(DESTDIR)$(includedir)/ext2fs/$$i; \
+ done
$(Q) for i in $(HFILES_IN); do \
echo " INSTALL_DATA $(includedir)/ext2fs/$$i"; \
$(INSTALL_DATA) $$i $(DESTDIR)$(includedir)/ext2fs/$$i; \
--
2.43.7
^ permalink raw reply related
* Re: [PATCH v10 03/22] ovl: use core fsverity ensure info interface
From: Eric Biggers @ 2026-06-19 16:54 UTC (permalink / raw)
To: Amir Goldstein
Cc: Andrey Albershteyn, linux-xfs, fsverity, linux-fsdevel, hch,
linux-ext4, linux-f2fs-devel, linux-btrfs, linux-unionfs, djwong
In-Reply-To: <CAOQ4uxh_hfiSwMw8ABhhrz7GguZWjHEiBmvb3eq16Wfqw0+ZrQ@mail.gmail.com>
On Fri, Jun 19, 2026 at 09:28:31AM +0200, Amir Goldstein wrote:
> On Wed, May 20, 2026 at 9:07 PM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Wed, May 20, 2026 at 02:37:01PM +0200, Andrey Albershteyn wrote:
> > > fsverity now exposes fsverity_ensure_verity_info() which could be used
> > > instead of opening file to ensure that fsverity info is loaded and
> > > attached to inode.
> > >
> > > Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
> > > Acked-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > > fs/overlayfs/util.c | 14 +++-----------
> > > 1 file changed, 3 insertions(+), 11 deletions(-)
> >
> > Reviewed-by: Eric Biggers <ebiggers@kernel.org>
> >
> > I'm still confused by the new implementation of fsverity_active() that
> > got introduced by "fsverity: use a hashtable to find the fsverity_info",
> > though. I should have caught this during review of that commit. For
> > one its comment is outdated, but also the memory barrier seems to be
> > specific to the fsverity_get_info() caller and probably should be moved
> > to there. Anyway, that's not directly related to this patch.
>
> Eric, Andrey,
>
> Did you see the Sashiko review for this patch and others in this series?
>
> https://sashiko.dev/#/patchset/20260520123722.405752-1-aalbersh%40kernel.org
>
> It annotated some review comments as high and critical.
> For this patch it is about interaction with fscrypt.
>
> Please take a look and say if this is concerning or false positive.
Yes, this patch is broken and should be dropped. I need to remember to
look at the Sashiko reviews for other people's patches and not just
trust that the submitter will. Fortunately this one wasn't applied yet.
I pointed out the HIGHMEM performance bug in
"fsverity: generate and store zero-block hash" earlier
(https://lore.kernel.org/linux-fsdevel/20260401222717.GH2466@quark/). I
assume it was decided that no one will care about the combination of XFS
&& fsverity && HIGHMEM. But the XFS folks should double-check that.
Andrey, could you check the Sashiko reviews for the other patches too?
- Eric
^ permalink raw reply
* [PATCH RESEND 0/4] e2fsck: Fix orphan inodes processing
From: Etienne AUJAMES @ 2026-06-19 15:17 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o; +Cc: Andreas Dilger, Li Dongyang
e2fsck does not handle properly orphan inodes.
Case 1: bad free_blocks accounting with extent files
# e2fsck -v /tmp/ext4
e2fsck 1.47.3-wc2 (11-Nov-2025)
Truncating orphaned inode 12 (uid=0, gid=0, mode=0100644, size=4096)
Setting free blocks count to 2554682 (was 2554683)
/tmp/ext4: clean, 13/655360 files, 66758/2621440 blocks
# e2fsck -yf /tmp/ext4
e2fsck 1.47.3-wc2 (11-Nov-2025)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (2554682, counted=2554683).
Fix<y>? yes
Case 2: e2fsck does not support orphan inodes with ea_inode
# e2fsck -yf /tmp/ext4
e2fsck 1.47.3-wc2 (11-Nov-2025)
Clearing orphaned inode 12 (uid=0, gid=0, mode=0100644, size=0)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Regular filesystem inode 13 has EA_INODE flag set. Clear<y>? yes
Unattached inode 13
Connect to /lost+found<y>? yes
Inode 13 ref count is 2, should be 1. Fix<y>? yes
Pass 5: Checking group summary information
Patch 1 fixes the first case.
Patch 2 includes quota function in libext2fs (required by patch 2).
Patch 3 fixes ext2fs_xattrs_* function to update inode iblk and quota.
Patch 4 fixes the second case.
Bugs tracked by: https://jira.whamcloud.com/browse/LU-20049
Etienne AUJAMES (3):
e2fsck: fix orphaned extent files handling
libext2fs: update iblock when using ea_inode feature
libext2fs: add ext2fs_xattrs_release_all() helper
Li Dongyang (1):
libext2fs: add quota to libext2fs
debugfs/debugfs.c | 33 +-
debugfs/xattrs.c | 19 +-
e2fsck/pass1.c | 12 +-
e2fsck/super.c | 295 +++++++++---------
lib/ext2fs/Makefile.in | 43 +++
lib/ext2fs/ext2fs.h | 10 +
lib/ext2fs/ext_attr.c | 268 +++++++++++-----
lib/ext2fs/i_block.c | 14 +
lib/support/quotaio.h | 1 -
misc/create_inode_libarchive.c | 35 ++-
misc/fuse2fs.c | 117 +++----
tests/d_xattr_ea_inode/expect | 188 +++++++++++
tests/d_xattr_ea_inode/name | 1 +
tests/d_xattr_ea_inode/script | 104 ++++++
tests/f_orphan_ea_inode/expect.1 | 6 +
tests/f_orphan_ea_inode/expect.2 | 7 +
tests/f_orphan_ea_inode/image.gz | Bin 0 -> 2139 bytes
tests/f_orphan_ea_inode/name | 1 +
tests/f_orphan_ea_inode/script | 3 +
.../f_orphan_truncate_extents_inode/expect.1 | 3 +
.../f_orphan_truncate_extents_inode/expect.2 | 7 +
.../f_orphan_truncate_extents_inode/image.gz | Bin 0 -> 2854 bytes
tests/f_orphan_truncate_extents_inode/name | 1 +
tests/f_orphan_truncate_extents_inode/script | 3 +
24 files changed, 842 insertions(+), 329 deletions(-)
create mode 100644 tests/d_xattr_ea_inode/expect
create mode 100644 tests/d_xattr_ea_inode/name
create mode 100644 tests/d_xattr_ea_inode/script
create mode 100644 tests/f_orphan_ea_inode/expect.1
create mode 100644 tests/f_orphan_ea_inode/expect.2
create mode 100644 tests/f_orphan_ea_inode/image.gz
create mode 100644 tests/f_orphan_ea_inode/name
create mode 100644 tests/f_orphan_ea_inode/script
create mode 100644 tests/f_orphan_truncate_extents_inode/expect.1
create mode 100644 tests/f_orphan_truncate_extents_inode/expect.2
create mode 100644 tests/f_orphan_truncate_extents_inode/image.gz
create mode 100644 tests/f_orphan_truncate_extents_inode/name
create mode 100644 tests/f_orphan_truncate_extents_inode/script
--
2.43.7
^ permalink raw reply
* [PATCH RESEND 1/4] e2fsck: fix orphaned extent files handling
From: Etienne AUJAMES @ 2026-06-19 15:24 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o; +Cc: Andreas Dilger, Li Dongyang
In-Reply-To: <ajVdnQUu9tSrKldW@eaujamesFR0130>
release_inode_blocks() does not handle corectly multi-levels extent
files: it does not count the non-leaf blocks directly released by
ext2fs_block_iterate3().
This patch relies on ext2fs_get_stat_i_blocks() count for quota update
and ext2fs_free_blocks_count() to count number of blocks released by
release_inode_blocks().
Add regression test: f_orphan_truncate_extents_inode
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ib0c3aaaa685e7bcfae896617cda03005d19539ff
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-20049
---
e2fsck/super.c | 244 +++++++++---------
.../f_orphan_truncate_extents_inode/expect.1 | 3 +
.../f_orphan_truncate_extents_inode/expect.2 | 7 +
.../f_orphan_truncate_extents_inode/image.gz | Bin 0 -> 2854 bytes
tests/f_orphan_truncate_extents_inode/name | 1 +
tests/f_orphan_truncate_extents_inode/script | 3 +
6 files changed, 133 insertions(+), 125 deletions(-)
create mode 100644 tests/f_orphan_truncate_extents_inode/expect.1
create mode 100644 tests/f_orphan_truncate_extents_inode/expect.2
create mode 100644 tests/f_orphan_truncate_extents_inode/image.gz
create mode 100644 tests/f_orphan_truncate_extents_inode/name
create mode 100644 tests/f_orphan_truncate_extents_inode/script
diff --git a/e2fsck/super.c b/e2fsck/super.c
index cfc0919a2..c2ccefd54 100644
--- a/e2fsck/super.c
+++ b/e2fsck/super.c
@@ -62,21 +62,14 @@ static int check_super_value64(e2fsck_t ctx, const char *descr,
return 1;
}
-/*
- * helper function to release an inode
- */
struct process_block_struct {
- e2fsck_t ctx;
- char *buf;
+ e2fsck_t ctx;
+ char *buf;
struct problem_context *pctx;
- int truncating;
- int truncate_offset;
e2_blkcnt_t truncate_block;
- int truncated_blocks;
- int abort;
+ e2_blkcnt_t truncated_blocks;
errcode_t errcode;
blk64_t last_cluster;
- struct ext2_inode_large *inode;
};
static int release_inode_block(ext2_filsys fs,
@@ -91,7 +84,6 @@ static int release_inode_block(ext2_filsys fs,
struct problem_context *pctx;
blk64_t blk = *block_nr;
blk64_t cluster = EXT2FS_B2C(fs, *block_nr);
- int retval = 0;
pb = (struct process_block_struct *) priv_data;
ctx = pb->ctx;
@@ -111,155 +103,157 @@ static int release_inode_block(ext2_filsys fs,
if ((blk < fs->super->s_first_data_block) ||
(blk >= ext2fs_blocks_count(fs->super))) {
fix_problem(ctx, PR_0_ORPHAN_ILLEGAL_BLOCK_NUM, pctx);
- return_abort:
- pb->abort = 1;
+ pb->errcode = EXT2_ET_BAD_BLOCK_NUM;
return BLOCK_ABORT;
}
if (!ext2fs_test_block_bitmap2(fs->block_map, blk)) {
fix_problem(ctx, PR_0_ORPHAN_ALREADY_CLEARED_BLOCK, pctx);
- goto return_abort;
+ pb->errcode = EXT2_ET_BAD_BLOCK_NUM;
+ return BLOCK_ABORT;
}
/*
- * If we are deleting an orphan, then we leave the fields alone.
- * If we are truncating an orphan, then update the inode fields
- * and clean up any partial block data.
+ * We don't remove direct blocks until we've reached
+ * the truncation block.
*/
- if (pb->truncating) {
- /*
- * We only remove indirect blocks if they are
- * completely empty.
- */
- if (blockcnt < 0) {
- int i, limit;
- blk_t *bp;
-
- pb->errcode = io_channel_read_blk64(fs->io, blk, 1,
- pb->buf);
- if (pb->errcode)
- goto return_abort;
-
- limit = fs->blocksize >> 2;
- for (i = 0, bp = (blk_t *) pb->buf;
- i < limit; i++, bp++)
- if (*bp)
- return 0;
- }
- /*
- * We don't remove direct blocks until we've reached
- * the truncation block.
- */
- if (blockcnt >= 0 && blockcnt < pb->truncate_block)
- return 0;
- /*
- * If part of the last block needs truncating, we do
- * it here.
- */
- if ((blockcnt == pb->truncate_block) && pb->truncate_offset) {
- pb->errcode = io_channel_read_blk64(fs->io, blk, 1,
- pb->buf);
- if (pb->errcode)
- goto return_abort;
- memset(pb->buf + pb->truncate_offset, 0,
- fs->blocksize - pb->truncate_offset);
- pb->errcode = io_channel_write_blk64(fs->io, blk, 1,
- pb->buf);
- if (pb->errcode)
- goto return_abort;
- }
- pb->truncated_blocks++;
- *block_nr = 0;
- retval |= BLOCK_CHANGED;
+ if (blockcnt >= 0 && blockcnt < pb->truncate_block)
+ return 0;
+
+ /*
+ * We only remove indirect blocks if they are
+ * completely empty.
+ */
+ if (blockcnt < 0) {
+ int i, limit;
+ blk_t *bp;
+
+ pb->errcode = io_channel_read_blk64(fs->io, blk, 1,
+ pb->buf);
+ if (pb->errcode)
+ return BLOCK_ABORT;
+
+ limit = fs->blocksize >> 2;
+ for (i = 0, bp = (blk_t *) pb->buf;
+ i < limit; i++, bp++)
+ if (*bp)
+ return 0;
}
- if (ctx->qctx)
- quota_data_sub(ctx->qctx, pb->inode, 0, ctx->fs->blocksize);
ext2fs_block_alloc_stats2(fs, blk, -1);
- ctx->free_blocks++;
- return retval;
+ pb->truncated_blocks++;
+ *block_nr = 0;
+
+ return BLOCK_CHANGED;
}
-/*
- * This function releases an inode. Returns 1 if an inconsistency was
- * found. If the inode has a link count, then it is being truncated and
- * not deleted.
- */
-static int release_inode_blocks(e2fsck_t ctx, ext2_ino_t ino,
- struct ext2_inode_large *inode, char *block_buf,
- struct problem_context *pctx)
+static errcode_t truncate_inode_blocks(e2fsck_t ctx, ext2_ino_t ino,
+ struct ext2_inode_large *inode,
+ char *block_buf,
+ struct problem_context *pctx)
{
- struct process_block_struct pb;
- ext2_filsys fs = ctx->fs;
- blk64_t blk;
- errcode_t retval;
- __u32 count;
+ ext2_filsys fs = ctx->fs;
+ struct process_block_struct pb = { 0 };
+ e2_blkcnt_t truncate_block = 0;
+ __u32 truncate_offset = 0;
+ blk64_t blk;
+ int ret_flags;
+ errcode_t retval = 0;
if (!ext2fs_inode_has_valid_blocks2(fs, EXT2_INODE(inode)))
- goto release_acl;
+ return 0;
- pb.buf = block_buf + 3 * ctx->fs->blocksize;
- pb.ctx = ctx;
- pb.abort = 0;
- pb.errcode = 0;
- pb.pctx = pctx;
- pb.last_cluster = 0;
- pb.inode = inode;
if (inode->i_links_count) {
- pb.truncating = 1;
- pb.truncate_block = (e2_blkcnt_t)
+ truncate_offset = inode->i_size % fs->blocksize;
+ truncate_block = (e2_blkcnt_t)
((EXT2_I_SIZE(inode) + fs->blocksize - 1) /
fs->blocksize);
- pb.truncate_offset = inode->i_size % fs->blocksize;
- } else {
- pb.truncating = 0;
- pb.truncate_block = 0;
- pb.truncate_offset = 0;
}
- pb.truncated_blocks = 0;
+
+ pb.buf = block_buf;
+ pb.ctx = ctx;
+ pb.pctx = pctx;
+ pb.truncate_block = truncate_block;
retval = ext2fs_block_iterate3(fs, ino, BLOCK_FLAG_DEPTH_TRAVERSE,
block_buf, release_inode_block, &pb);
if (retval) {
com_err("release_inode_blocks", retval,
_("while calling ext2fs_block_iterate for inode %u"),
ino);
- return 1;
+ return retval;
}
- if (pb.abort)
- return 1;
+ if (pb.errcode)
+ return pb.errcode;
/* Refresh the inode since ext2fs_block_iterate may have changed it */
e2fsck_read_inode_full(ctx, ino, EXT2_INODE(inode), sizeof(*inode),
"release_inode_blocks");
- if (pb.truncated_blocks)
- ext2fs_iblk_sub_blocks(fs, EXT2_INODE(inode),
- pb.truncated_blocks);
-release_acl:
- blk = ext2fs_file_acl_block(fs, EXT2_INODE(inode));
- if (blk) {
- retval = ext2fs_adjust_ea_refcount3(fs, blk, block_buf, -1,
- &count, ino);
- if (retval == EXT2_ET_BAD_EA_BLOCK_NUM) {
- retval = 0;
- count = 1;
- }
- if (retval) {
- com_err("release_inode_blocks", retval,
- _("while calling ext2fs_adjust_ea_refcount2 for inode %u"),
- ino);
- return 1;
- }
- if (count == 0) {
- if (ctx->qctx)
- quota_data_sub(ctx->qctx, inode, 0,
- ctx->fs->blocksize);
- ext2fs_block_alloc_stats2(fs, blk, -1);
- ctx->free_blocks++;
- }
- ext2fs_file_acl_block_set(fs, EXT2_INODE(inode), 0);
+ ext2fs_iblk_sub_blocks(fs, EXT2_INODE(inode), pb.truncated_blocks);
+ if (!truncate_offset)
+ return 0;
+
+ /* Is there an initialized block at the end? */
+ retval = ext2fs_bmap2(fs, ino, NULL, NULL, 0,
+ truncate_block, &ret_flags, &blk);
+ if (retval)
+ return retval;
+ if ((blk == 0) || (ret_flags & BMAP_RET_UNINIT))
+ return 0;
+
+ retval = io_channel_read_blk64(fs->io, blk, 1, block_buf);
+ if (retval)
+ return retval;
+
+ memset(block_buf + truncate_offset, 0, fs->blocksize - truncate_offset);
+ retval = io_channel_write_blk64(fs->io, blk, 1, block_buf);
+
+ return retval;
+}
+
+/*
+ * This function releases an inode. Returns 1 if an inconsistency was
+ * found. If the inode has a link count, then it is being truncated and
+ * not deleted.
+ */
+static int release_inode_blocks(e2fsck_t ctx, ext2_ino_t ino,
+ struct ext2_inode_large *inode, char *block_buf,
+ struct problem_context *pctx)
+{
+ ext2_filsys fs = ctx->fs;
+ blk64_t free_blks, ino_blks;
+ char *buf;
+ errcode_t err;
+ int rc = 0;
+
+ free_blks = ext2fs_free_blocks_count(fs->super);
+ ino_blks = ext2fs_get_stat_i_blocks(fs, EXT2_INODE(inode));
+ buf = block_buf + 3 * ctx->fs->blocksize;
+ if (truncate_inode_blocks(ctx, ino, inode, buf, pctx)) {
+ rc = 1;
+ goto update_counts;
}
- return 0;
+ if (inode->i_links_count)
+ goto update_counts;
+
+ err = ext2fs_free_ext_attr(fs, ino, inode);
+ if (err) {
+ com_err(__func__, err,
+ _("while calling ext2fs_free_ext_attr for inode %u"),
+ ino);
+ rc = 1;
+ goto update_counts;
+ }
+
+ rc = 0;
+
+update_counts:
+ ctx->free_blocks += ext2fs_free_blocks_count(fs->super) - free_blks;
+ ino_blks -= ext2fs_get_stat_i_blocks(fs, EXT2_INODE(inode));
+ if (ctx->qctx)
+ quota_data_sub(ctx->qctx, inode, 0, ino_blks << 9);
+
+ return rc;
}
/* Load all quota data in preparation for orphan clearing. */
diff --git a/tests/f_orphan_truncate_extents_inode/expect.1 b/tests/f_orphan_truncate_extents_inode/expect.1
new file mode 100644
index 000000000..b24aae7ad
--- /dev/null
+++ b/tests/f_orphan_truncate_extents_inode/expect.1
@@ -0,0 +1,3 @@
+test_filesys: Truncating orphaned inode 12 (uid=0, gid=0, mode=0100644, size=4096)
+test_filesys: clean, 12/128 files, 75/1024 blocks
+Exit status is 0
diff --git a/tests/f_orphan_truncate_extents_inode/expect.2 b/tests/f_orphan_truncate_extents_inode/expect.2
new file mode 100644
index 000000000..7edff9bce
--- /dev/null
+++ b/tests/f_orphan_truncate_extents_inode/expect.2
@@ -0,0 +1,7 @@
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 12/128 files (16.7% non-contiguous), 75/1024 blocks
+Exit status is 0
diff --git a/tests/f_orphan_truncate_extents_inode/image.gz b/tests/f_orphan_truncate_extents_inode/image.gz
new file mode 100644
index 0000000000000000000000000000000000000000..30681b879455b936e05d4dcbb4feaed6c4ff1eb9
GIT binary patch
literal 2854
zcmeHI{Z|ub7RHY)J?j?rU>6Hbay;x&EDLO;6|5w)E)secX|xtmD~TnV2mvC7HJUid
zQceX863bCRLDH29M-);L`3R7JUqnJ68v*MeBqBnh8Iq!8LP#>R!|wj*U*PHdF!S7d
zp7*))-g)mcB<cI_uW5H?EnSwC`z_~iz|44l$zvCxXV$_uLK&f*VL@Hf4~zB?`7d(G
z9Z{!g&3Sg&a>DO7t*O4TcjJk~J3vBKc=8_z<Sx_ZjK+W5OQMCD&zoA-ZaHK-k{JHC
zZM3Gff_sHcoAcz(H>U;-V<l(x4%b;(x0@n(<rf>BqHdjt*wL`GuHnw>6mHu1+Nyoz
zW#pC>z>l3<rY6NLyqxh`dRgPmwBTT=F8iEJ6rWe+`*)uuF5F_14mkHWWP@|oHuvqk
zFtsZg@btzm3uh`mVERrPNbQ_SpU6QtArK8AfQ6rCaPSmduP5Cr`@p3+K0lm-sT=o1
z<%}^hZj+oB9rI3o${4Y@Q+VB9n~x6jB~<Rlj7xR8Q=6t^&$)n|E!E7`0sp&trZS&*
zN$6dy6=sCTJUS6JNg!<9@XwVyQqmY<>0I@?)`7e%^~C9E?mTr@Do-uCS66sakURfa
zW69B;b{JjAXwR$EWrD^|QH!GH=*g;l(*nlPsL{fW*GAR_*hxR5OWtxT?7yS4mTGsR
z-Qz{f4kh`)haPQK(8^OkjS!YrD%XD=EexyN<QpQ@MPt2z+0Hrd@>&xA*JqnXp0Db;
z@R>W?h~4>m$^%q(oWgL9Y<7tSE54y~V@noy^{m!D0kZ7tIiJk9Exa3{G2eaS_1|D_
zuJ_aksd10YC-MpwE{|3*BGdA!W{E!kIQd@69%FRuNuW#a1==a%mmR}?CfC>H9(;l|
zG-HiC2x}U<q^`@RR(!qis=;$A|1!j5VKLN|K(&`C=|_#~gBWntYZ0CS7yjY|c7HSJ
zbLQr1hCLQfGzKfSU=xYCw-eBpnqimXdOoNaWMXFDytSd){e0FKs758Nn*1LW`bjPK
z*!pf1ygqYRCz)EndT+j1c=LU1VzVZeXRq3mCRJ`Ag=K{BLC{=-wPDLR^scY!65ji>
zT=#;t?f)P2t&e9#>EbU%9;j(3v#DewfZX;C{}M?EUF5dkN(EFu?JuVcKk`wZO}ZLJ
zt0*z8_n@lhh#%@MBle-mb4L(srYJ(h9TR`*l{Qmq{7>8chmh`TbWV)yBO@c2pZD4s
zB|o5no@S1!NF^@~X2Q@#{}($1NHCcK1+s)Hx%6efn9BYZIa7(k>8w8<mY>^qDMOJq
zqKC_DEs&}<j%_?=E(ck22a)e=1V~ydIIJik>KP<WoQOYn6ah_Mm4GHY1gxQ4@$a4c
z7hHc$#s&yrg7@fLAIvh<tLRBGjx0&uFQ#;Cf7#kjA60Q^0=DstL(d?NU)8Md*dW?n
zz(ZPXS~&1p2-{iaFpB>q+=T?2`jsnb3WGRWn2WqIg{kP<d3`7g)*VDvn6@ghP)(#r
zG`ACZRH=b(W}a5X-G`0l(AaWNVmN{nk&Bh?`yUUOL-+~azZ&{r4(m$}X`@wwNT+;8
zvAP-?x&qc!V^A<2)~t5#N=I(l95A<5q_ifx*_H!U2e7tkCI))6Pba|@Drc#~$RKJW
zM5%L1IQ1~)5HHfc&ReJ?Dg_m;^ZqaPt%T?oT<5``ZxzE<z3`z}i-TaC*S-I7A_Cz&
zd%i`+7F<vy<4u*ZjZsXQvU?S{$=i>&h$kh=6M{@uWd*Hb!{0-#$+%n?u3}zX?8jAr
zy*Q}BRooZxB0u8VoPOa$>Q{JHx>)1@4))@U7PB=G_H~_&IHe5db8tST9uL%0!t)eB
z81Jn+MtR8C*%S!1RoJ&7S53vric2^+Ynz0)de_0%YZs$w+bry)$@@|9$-51W5Kx+D
zM6(JwNX)fPrM%QJh^7|Mkyw)kl9V|5(G<^{t(JD0_VGmi(zhP--;(ce2YNa`^Bc;u
z`+mo3>m6G?M1`W#MlOTkj`ST-(;byJS+B@#_jLtUosXJl>q8euJ;ek<5-Fq7f7OP<
z&ZHPUx(%PI2joaq`u$r243dg0;u|i(-puz@f?oKcID(yyu*iuJ{Q*26{+u1}J!(K<
z7C9WM&!nkznL&rUiTqDHqk<mI!k0}ORMzeC!I}_C4Y+$w4S&NOC>nd>j&GfTBK6=8
z8fr(Rh`$Ae+)3_3_)rgsBRXQd&9?6$dXk$15Hu0EZz*x#io|_OF`x}^4O3P09#26U
zo&>RZB{OAkWApe$P?A%uB$dvXVM;S$&>ZsA4+Um!E%)c-B&%fik(~%`##j8u@G>0z
ztg$9S2Z(5J{|Ve;_|Px3^r!)G%f}cTJ2lVgW|T>eQwB#I@JEYAv~LiDAslF1acg>`
z_sBuk7EHy9#(k?1e<f!Lqe>GmWFfC@Q4ml@G%!BYgnLc449ITRBLDrzzQYIY9o*Wl
Qs3(SfSGqhPU{%0>0O93B*Z=?k
literal 0
HcmV?d00001
diff --git a/tests/f_orphan_truncate_extents_inode/name b/tests/f_orphan_truncate_extents_inode/name
new file mode 100644
index 000000000..6f16502b3
--- /dev/null
+++ b/tests/f_orphan_truncate_extents_inode/name
@@ -0,0 +1 @@
+truncating an orphaned extent-mapped inode in preen mode
diff --git a/tests/f_orphan_truncate_extents_inode/script b/tests/f_orphan_truncate_extents_inode/script
new file mode 100644
index 000000000..fb895e9a4
--- /dev/null
+++ b/tests/f_orphan_truncate_extents_inode/script
@@ -0,0 +1,3 @@
+FSCK_OPT=-p
+SECOND_FSCK_OPT="-yf -E no_optimize_extents"
+. $cmd_dir/run_e2fsck
--
2.43.7
^ permalink raw reply related
* [syzbot ci] Re: Data in direntry (dirdata) feature
From: syzbot ci @ 2026-06-19 14:50 UTC (permalink / raw)
To: artem.blagodarenko, adilger, linux-ext4, pravin.shelar, syzbot,
syzkaller-bugs
Cc: syzbot, syzkaller-bugs
In-Reply-To: <CA+rD4x_2wXOP=4RwPY-A2vJjK4Vv9hGUSVFzprCe1H+8MTOKhA@mail.gmail.com>
syzbot ci has tested the suggested fix patch on top of the following series:
[v2] Data in direntry (dirdata) feature
https://lore.kernel.org/all/20260610152417.13576-1-ablagodarenko@thelustrecollective.com
Patch: https://ci.syzbot.org/jobs/2471bcf5-fa8b-4932-846b-3db72cc2b56c/patch
Testing results:
* [build 0] Build Patched: passed
* [build 0] Boot test: Patched: passed
Full report is available here:
https://ci.syzbot.org/session/08769134-a853-4686-a652-a4c24e8773d7
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
^ permalink raw reply
* Re: [syzbot ci] Re: Data in direntry (dirdata) feature
From: syzbot @ 2026-06-19 14:11 UTC (permalink / raw)
To: artem.blagodarenko
Cc: adilger, artem.blagodarenko, linux-ext4, pravin.shelar, syzbot,
syzkaller-bugs
In-Reply-To: <CA+rD4x_2wXOP=4RwPY-A2vJjK4Vv9hGUSVFzprCe1H+8MTOKhA@mail.gmail.com>
> Thanks for the report. The attached patch addresses the issues found in
> the dirdata series review (dx_get_dx_info/get_dx_countlimit blocksize
> fallback, dfid parameter shadowing in ext4_dirdata_get, and the unsafe
> delete-before-add in EXT4_IOC_SET_LUFID).
>
>
> #syz test
I see the command but can't find the corresponding bug.
The email is sent to syzbot+HASH@syzkaller.appspotmail.com address
but the HASH does not correspond to any known bug.
Please double check the address.
>
> On Thu, Jun 11, 2026 11:29 AM, syzbot ci <
> syzbot+cid7b922cb3d448114@syzkaller.appspotmail.com> wrote:
>
>> syzbot ci has tested the following series
>>
>> [v2] Data in direntry (dirdata) feature
>>
>> https://lore.kernel.org/all/20260610152417.13576-1-ablagodarenko@thelustrecollective.com
>> * [PATCH v2 01/10] ext4: replace ext4_dir_entry with ext4_dir_entry_2
>> * [PATCH v2 02/10] ext4: add ext4_dir_entry_is_tail()
>> * [PATCH v2 03/10] ext4: refactor dx_root to support variable dirent sizes
>> * [PATCH v2 04/10] ext4: add dirdata format definitions and access helpers
>> * [PATCH v2 05/10] ext4: preserve dirdata bits in get_dtype()
>> * [PATCH v2 06/10] ext4: add ext4_dir_entry_len() and harden dirdata
>> parsing
>> * [PATCH v2 07/10] ext4: rename ext4_dir_rec_len() and clarify dirdata
>> usage
>> * [PATCH v2 08/10] ext4: dirdata feature
>> * [PATCH v2 09/10] ext4: add dirdata set/get helpers
>> * [PATCH v2 10/10] ext4: Add EXT4_IOC_SET_LUFID ioctl for setting LUFID on
>> directory entries
>>
>> and found the following issues:
>> * KASAN: slab-out-of-bounds Read in __ext4_check_dir_entry
>> * KASAN: slab-out-of-bounds Read in ext4_inlinedir_to_tree
>> * KASAN: slab-use-after-free Read in __ext4_check_dir_entry
>> * KASAN: slab-use-after-free Read in ext4_inlinedir_to_tree
>> * KASAN: use-after-free Read in __ext4_check_dir_entry
>>
>> Full report is available here:
>> https://ci.syzbot.org/series/5bf0e2fa-2e68-4532-8396-4568879b2788
>>
>> ***
>>
>> KASAN: slab-out-of-bounds Read in __ext4_check_dir_entry
>>
>> tree: torvalds
>> URL:
>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
>> arch: amd64
>> compiler: Debian clang version 21.1.8
>> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>> config:
>> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
>> syz repro:
>> https://ci.syzbot.org/findings/b0854918-13f9-49dd-ab30-12154f0debe2/syz_repro
>>
>> loop0: lost filesystem error report for type 5 error -117
>> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000
>> r/w without journal. Quota mode: none.
>> ==================================================================
>> BUG: KASAN: slab-out-of-bounds in ext4_dirent_get_data_len
>> fs/ext4/ext4.h:4069 [inline]
>> BUG: KASAN: slab-out-of-bounds in ext4_dir_entry_len fs/ext4/ext4.h:4096
>> [inline]
>> BUG: KASAN: slab-out-of-bounds in __ext4_check_dir_entry+0x65a/0xc40
>> fs/ext4/dir.c:96
>> Read of size 1 at addr ffff8881022db7f5 by task syz.0.23/5815
>>
>> CPU: 1 UID: 0 PID: 5815 Comm: syz.0.23 Not tainted syzkaller #0
>> PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> 1.16.2-debian-1.16.2-1 04/01/2014
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
>> print_report+0x58/0x70 mm/kasan/report.c:482
>> kasan_report+0x117/0x150 mm/kasan/report.c:595
>> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
>> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
>> __ext4_check_dir_entry+0x65a/0xc40 fs/ext4/dir.c:96
>> ext4_check_all_de+0x66/0x150 fs/ext4/dir.c:657
>> ext4_convert_inline_data_nolock+0x1b7/0x990 fs/ext4/inline.c:1121
>> ext4_try_add_inline_entry+0x604/0x8e0 fs/ext4/inline.c:1247
>> __ext4_add_entry+0x390/0x1f40 fs/ext4/namei.c:2529
>> ext4_add_entry fs/ext4/namei.c:2613 [inline]
>> ext4_mkdir+0x5e5/0xce0 fs/ext4/namei.c:3175
>> vfs_mkdir+0x413/0x630 fs/namei.c:5271
>> filename_mkdirat+0x285/0x510 fs/namei.c:5304
>> __do_sys_mkdirat fs/namei.c:5325 [inline]
>> __se_sys_mkdirat+0x35/0x150 fs/namei.c:5322
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f669359bcc7
>> Code: 00 66 90 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 db f7 ff
>> ff 66 2e 0f 1f 84 00 00 00 00 00 90 b8 02 01 00 00 0f 05 <48> 3d 01 f0 ff
>> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007ffd42381d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000102
>> RAX: ffffffffffffffda RBX: 00007ffd42381dc0 RCX: 00007f669359bcc7
>> RDX: 00000000000001ff RSI: 0000200000001200 RDI: 00000000ffffff9c
>> RBP: 00002000000024c0 R08: 0000200000000240 R09: 0000000000000000
>> R10: 00002000000024c0 R11: 0000000000000246 R12: 0000200000001200
>> R13: 00007ffd42381d80 R14: 0000000000000000 R15: 0000000000000000
>> </TASK>
>>
>> Allocated by task 5066:
>> kasan_save_stack mm/kasan/common.c:57 [inline]
>> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>> poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
>> __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
>> kasan_kmalloc include/linux/kasan.h:263 [inline]
>> __kmalloc_cache_noprof+0x31c/0x660 mm/slub.c:5420
>> kmalloc_noprof include/linux/slab.h:950 [inline]
>> kzalloc_noprof include/linux/slab.h:1188 [inline]
>> kernfs_get_open_node fs/kernfs/file.c:543 [inline]
>> kernfs_fop_open+0x862/0xda0 fs/kernfs/file.c:718
>> do_dentry_open+0x822/0x13a0 fs/open.c:947
>> vfs_open+0x3b/0x340 fs/open.c:1079
>> do_open fs/namei.c:4699 [inline]
>> path_openat+0x2e08/0x3860 fs/namei.c:4858
>> do_file_open+0x23e/0x4a0 fs/namei.c:4887
>> do_sys_openat2+0x113/0x200 fs/open.c:1364
>> do_sys_open fs/open.c:1370 [inline]
>> __do_sys_openat fs/open.c:1386 [inline]
>> __se_sys_openat fs/open.c:1381 [inline]
>> __x64_sys_openat+0x138/0x170 fs/open.c:1381
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> Last potentially related work creation:
>> kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57
>> kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556
>> kvfree_call_rcu+0x100/0x430 mm/slab_common.c:1970
>> kernfs_unlink_open_file+0x3fe/0x4b0 fs/kernfs/file.c:604
>> kernfs_fop_release+0x2eb/0x440 fs/kernfs/file.c:783
>> __fput+0x44f/0xa60 fs/file_table.c:510
>> fput_close_sync+0x11f/0x240 fs/file_table.c:615
>> __do_sys_close fs/open.c:1507 [inline]
>> __se_sys_close fs/open.c:1492 [inline]
>> __x64_sys_close+0x7e/0x110 fs/open.c:1492
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> The buggy address belongs to the object at ffff8881022db700
>> which belongs to the cache kmalloc-128 of size 128
>> The buggy address is located 117 bytes to the right of
>> allocated 128-byte region [ffff8881022db700, ffff8881022db780)
>>
>> The buggy address belongs to the physical page:
>> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1022db
>> flags: 0x17ff00000000000(node=0|zone=2|lastcpupid=0x7ff)
>> page_type: f5(slab)
>> raw: 017ff00000000000 ffff888100041a00 dead000000000100 dead000000000122
>> raw: 0000000000000000 0000000800100010 00000000f5000000 0000000000000000
>> page dumped because: kasan: bad access detected
>> page_owner tracks the page as allocated
>> page last allocated via order 0, migratetype Unmovable, gfp_mask
>> 0xd2000(__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 0,
>> tgid 0 (swapper/0), ts 2408938923, free_ts 0
>> set_page_owner include/linux/page_owner.h:32 [inline]
>> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
>> prep_new_page mm/page_alloc.c:1861 [inline]
>> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
>> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
>> alloc_slab_page mm/slub.c:3278 [inline]
>> allocate_slab+0x77/0x660 mm/slub.c:3467
>> new_slab mm/slub.c:3525 [inline]
>> refill_objects+0x339/0x3d0 mm/slub.c:7272
>> refill_sheaf mm/slub.c:2816 [inline]
>> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
>> alloc_from_pcs mm/slub.c:4750 [inline]
>> slab_alloc_node mm/slub.c:4884 [inline]
>> __do_kmalloc_node mm/slub.c:5295 [inline]
>> __kmalloc_noprof+0x474/0x760 mm/slub.c:5308
>> kmalloc_noprof include/linux/slab.h:954 [inline]
>> kzalloc_noprof include/linux/slab.h:1188 [inline]
>> __alloc_empty_sheaf mm/slub.c:2768 [inline]
>> alloc_empty_sheaf mm/slub.c:2783 [inline]
>> __pcs_replace_empty_main+0x2df/0x720 mm/slub.c:4647
>> alloc_from_pcs mm/slub.c:4750 [inline]
>> slab_alloc_node mm/slub.c:4884 [inline]
>> kmem_cache_alloc_noprof+0x37d/0x650 mm/slub.c:4906
>> dup_fd+0x55/0xb40 fs/file.c:390
>> copy_files+0xc8/0x120 kernel/fork.c:1639
>> copy_process+0x1d94/0x4440 kernel/fork.c:2252
>> kernel_clone+0x2d7/0x940 kernel/fork.c:2722
>> user_mode_thread+0x110/0x180 kernel/fork.c:2798
>> rest_init+0x23/0x300 init/main.c:727
>> start_kernel+0x38a/0x3e0 init/main.c:1220
>> page_owner free stack trace missing
>>
>> Memory state around the buggy address:
>> ffff8881022db680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ffff8881022db700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> >ffff8881022db780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ^
>> ffff8881022db800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ffff8881022db880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ==================================================================
>>
>>
>> ***
>>
>> KASAN: slab-out-of-bounds Read in ext4_inlinedir_to_tree
>>
>> tree: torvalds
>> URL:
>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
>> arch: amd64
>> compiler: Debian clang version 21.1.8
>> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>> config:
>> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
>> syz repro:
>> https://ci.syzbot.org/findings/2dff870b-f382-4c93-8d8d-b2291d921224/syz_repro
>>
>> loop1: lost filesystem error report for type 5 error -117
>> EXT4-fs (loop1): mounted filesystem 00000000-0000-0000-0000-000000000000
>> r/w without journal. Quota mode: none.
>> ==================================================================
>> BUG: KASAN: slab-out-of-bounds in ext4_dir_entry_len fs/ext4/ext4.h:4095
>> [inline]
>> BUG: KASAN: slab-out-of-bounds in ext4_inlinedir_to_tree+0xda5/0x10d0
>> fs/ext4/inline.c:1335
>> Read of size 2 at addr ffff888115a3183c by task syz.1.18/5839
>>
>> CPU: 1 UID: 0 PID: 5839 Comm: syz.1.18 Not tainted syzkaller #0
>> PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> 1.16.2-debian-1.16.2-1 04/01/2014
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
>> print_report+0x58/0x70 mm/kasan/report.c:482
>> kasan_report+0x117/0x150 mm/kasan/report.c:595
>> ext4_dir_entry_len fs/ext4/ext4.h:4095 [inline]
>> ext4_inlinedir_to_tree+0xda5/0x10d0 fs/ext4/inline.c:1335
>> ext4_htree_fill_tree+0x517/0x1230 fs/ext4/namei.c:1182
>> ext4_dx_readdir fs/ext4/dir.c:600 [inline]
>> ext4_readdir+0x2db4/0x3640 fs/ext4/dir.c:146
>> iterate_dir+0x399/0x570 fs/readdir.c:110
>> __do_sys_getdents64 fs/readdir.c:399 [inline]
>> __se_sys_getdents64+0xf1/0x280 fs/readdir.c:384
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f3e02b9ce59
>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
>> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007f3e03ad5028 EFLAGS: 00000246 ORIG_RAX: 00000000000000d9
>> RAX: ffffffffffffffda RBX: 00007f3e02e15fa0 RCX: 00007f3e02b9ce59
>> RDX: 0000000000001000 RSI: 0000200000000f80 RDI: 0000000000000004
>> RBP: 00007f3e02c32d6f R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007f3e02e16038 R14: 00007f3e02e15fa0 R15: 00007ffcaa902298
>> </TASK>
>>
>> Allocated by task 5839:
>> kasan_save_stack mm/kasan/common.c:57 [inline]
>> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>> poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
>> __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
>> kasan_kmalloc include/linux/kasan.h:263 [inline]
>> __do_kmalloc_node mm/slub.c:5296 [inline]
>> __kmalloc_noprof+0x35c/0x760 mm/slub.c:5308
>> kmalloc_noprof include/linux/slab.h:954 [inline]
>> ext4_inlinedir_to_tree+0x312/0x10d0 fs/ext4/inline.c:1292
>> ext4_htree_fill_tree+0x517/0x1230 fs/ext4/namei.c:1182
>> ext4_dx_readdir fs/ext4/dir.c:600 [inline]
>> ext4_readdir+0x2db4/0x3640 fs/ext4/dir.c:146
>> iterate_dir+0x399/0x570 fs/readdir.c:110
>> __do_sys_getdents64 fs/readdir.c:399 [inline]
>> __se_sys_getdents64+0xf1/0x280 fs/readdir.c:384
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> The buggy address belongs to the object at ffff888115a31800
>> which belongs to the cache kmalloc-64 of size 64
>> The buggy address is located 0 bytes to the right of
>> allocated 60-byte region [ffff888115a31800, ffff888115a3183c)
>>
>> The buggy address belongs to the physical page:
>> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x115a31
>> flags: 0x17ff00000000000(node=0|zone=2|lastcpupid=0x7ff)
>> page_type: f5(slab)
>> raw: 017ff00000000000 ffff8881000418c0 dead000000000100 dead000000000122
>> raw: 0000000000000000 0000000800200020 00000000f5000000 0000000000000000
>> page dumped because: kasan: bad access detected
>> page_owner tracks the page as allocated
>> page last allocated via order 0, migratetype Unmovable, gfp_mask
>> 0xd2c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC),
>> pid 5051, tgid 5051 (acpid), ts 27203740677, free_ts 27201732767
>> set_page_owner include/linux/page_owner.h:32 [inline]
>> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
>> prep_new_page mm/page_alloc.c:1861 [inline]
>> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
>> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
>> alloc_slab_page mm/slub.c:3278 [inline]
>> allocate_slab+0x77/0x660 mm/slub.c:3467
>> new_slab mm/slub.c:3525 [inline]
>> refill_objects+0x339/0x3d0 mm/slub.c:7272
>> refill_sheaf mm/slub.c:2816 [inline]
>> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
>> alloc_from_pcs mm/slub.c:4750 [inline]
>> slab_alloc_node mm/slub.c:4884 [inline]
>> __do_kmalloc_node mm/slub.c:5295 [inline]
>> __kmalloc_noprof+0x474/0x760 mm/slub.c:5308
>> kmalloc_noprof include/linux/slab.h:954 [inline]
>> kzalloc_noprof include/linux/slab.h:1188 [inline]
>> tomoyo_get_name+0x20c/0x590 security/tomoyo/memory.c:173
>> tomoyo_parse_name_union+0xd9/0x130 security/tomoyo/util.c:260
>> tomoyo_update_path_acl security/tomoyo/file.c:399 [inline]
>> tomoyo_write_file+0x3a6/0xc50 security/tomoyo/file.c:1027
>> tomoyo_write_domain2 security/tomoyo/common.c:1160 [inline]
>> tomoyo_add_entry security/tomoyo/common.c:2177 [inline]
>> tomoyo_supervisor+0x1208/0x1570 security/tomoyo/common.c:2238
>> tomoyo_audit_path_log security/tomoyo/file.c:169 [inline]
>> tomoyo_path_permission+0x25a/0x380 security/tomoyo/file.c:592
>> tomoyo_check_open_permission+0x2b2/0x470 security/tomoyo/file.c:782
>> security_file_open+0xa9/0x240 security/security.c:2739
>> do_dentry_open+0x4a8/0x13a0 fs/open.c:924
>> vfs_open+0x3b/0x340 fs/open.c:1079
>> page last free pid 15 tgid 15 stack trace:
>> reset_page_owner include/linux/page_owner.h:25 [inline]
>> __free_pages_prepare mm/page_alloc.c:1397 [inline]
>> __free_frozen_pages+0xc1c/0xd30 mm/page_alloc.c:2938
>> __tlb_remove_table_free mm/mmu_gather.c:228 [inline]
>> tlb_remove_table_rcu+0x85/0x100 mm/mmu_gather.c:291
>> rcu_do_batch kernel/rcu/tree.c:2617 [inline]
>> rcu_core+0x7cd/0x1070 kernel/rcu/tree.c:2869
>> handle_softirqs+0x22a/0x840 kernel/softirq.c:622
>> run_ksoftirqd+0x36/0x60 kernel/softirq.c:1076
>> smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
>> kthread+0x389/0x470 kernel/kthread.c:436
>> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>
>> Memory state around the buggy address:
>> ffff888115a31700: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>> ffff888115a31780: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
>> >ffff888115a31800: 00 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc
>> ^
>> ffff888115a31880: 00 00 00 00 00 00 02 fc fc fc fc fc fc fc fc fc
>> ffff888115a31900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>> ==================================================================
>>
>>
>> ***
>>
>> KASAN: slab-use-after-free Read in __ext4_check_dir_entry
>>
>> tree: torvalds
>> URL:
>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
>> arch: amd64
>> compiler: Debian clang version 21.1.8
>> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>> config:
>> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
>> syz repro:
>> https://ci.syzbot.org/findings/f1d48ea1-6e87-4d64-9c13-8bf8aed109fc/syz_repro
>>
>> loop0: lost filesystem error report for type 5 error -117
>> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000
>> r/w without journal. Quota mode: none.
>> ==================================================================
>> BUG: KASAN: slab-use-after-free in ext4_dirent_get_data_len
>> fs/ext4/ext4.h:4069 [inline]
>> BUG: KASAN: slab-use-after-free in ext4_dir_entry_len fs/ext4/ext4.h:4096
>> [inline]
>> BUG: KASAN: slab-use-after-free in __ext4_check_dir_entry+0x65a/0xc40
>> fs/ext4/dir.c:96
>> Read of size 1 at addr ffff888114d8c045 by task syz.0.20/5821
>>
>> CPU: 1 UID: 0 PID: 5821 Comm: syz.0.20 Not tainted syzkaller #0
>> PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> 1.16.2-debian-1.16.2-1 04/01/2014
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
>> print_report+0x58/0x70 mm/kasan/report.c:482
>> kasan_report+0x117/0x150 mm/kasan/report.c:595
>> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
>> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
>> __ext4_check_dir_entry+0x65a/0xc40 fs/ext4/dir.c:96
>> ext4_find_dest_de+0x136/0x770 fs/ext4/namei.c:2203
>> ext4_add_dirent_to_inline+0xcf/0x430 fs/ext4/inline.c:984
>> ext4_try_add_inline_entry+0x235/0x8e0 fs/ext4/inline.c:1213
>> __ext4_add_entry+0x390/0x1f40 fs/ext4/namei.c:2529
>> ext4_add_entry fs/ext4/namei.c:2613 [inline]
>> ext4_add_nondir+0x111/0x310 fs/ext4/namei.c:2936
>> ext4_create+0x2e9/0x470 fs/ext4/namei.c:2982
>> lookup_open fs/namei.c:4511 [inline]
>> open_last_lookups fs/namei.c:4611 [inline]
>> path_openat+0x1395/0x3860 fs/namei.c:4855
>> do_file_open+0x23e/0x4a0 fs/namei.c:4887
>> do_sys_openat2+0x113/0x200 fs/open.c:1364
>> do_sys_open fs/open.c:1370 [inline]
>> __do_sys_openat fs/open.c:1386 [inline]
>> __se_sys_openat fs/open.c:1381 [inline]
>> __x64_sys_openat+0x138/0x170 fs/open.c:1381
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f922219ce59
>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
>> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007f9223137028 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
>> RAX: ffffffffffffffda RBX: 00007f9222415fa0 RCX: 00007f922219ce59
>> RDX: 0000000000042042 RSI: 0000200000000080 RDI: 0000000000000004
>> RBP: 00007f9222232d6f R08: 0000000000000000 R09: 0000000000000000
>> R10: 000000000000014a R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007f9222416038 R14: 00007f9222415fa0 R15: 00007ffd01a2d448
>> </TASK>
>>
>> Allocated by task 5484:
>> kasan_save_stack mm/kasan/common.c:57 [inline]
>> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>> unpoison_slab_object mm/kasan/common.c:340 [inline]
>> __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
>> kasan_slab_alloc include/linux/kasan.h:253 [inline]
>> slab_post_alloc_hook mm/slub.c:4570 [inline]
>> slab_alloc_node mm/slub.c:4899 [inline]
>> kmem_cache_alloc_node_noprof+0x384/0x690 mm/slub.c:4951
>> kmalloc_reserve net/core/skbuff.c:613 [inline]
>> __alloc_skb+0x27d/0x7d0 net/core/skbuff.c:713
>> alloc_skb include/linux/skbuff.h:1385 [inline]
>> nlmsg_new include/net/netlink.h:1055 [inline]
>> mpls_netconf_notify_devconf+0x46/0x100 net/mpls/af_mpls.c:1217
>> mpls_dev_notify+0xb2d/0xd10 net/mpls/af_mpls.c:1691
>> notifier_call_chain+0x1ad/0x3d0 kernel/notifier.c:85
>> call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
>> call_netdevice_notifiers net/core/dev.c:2301 [inline]
>> unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12421
>> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
>> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
>> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
>> process_one_work kernel/workqueue.c:3314 [inline]
>> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
>> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
>> kthread+0x389/0x470 kernel/kthread.c:436
>> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>
>> Freed by task 5484:
>> kasan_save_stack mm/kasan/common.c:57 [inline]
>> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>> kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
>> poison_slab_object mm/kasan/common.c:253 [inline]
>> __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
>> kasan_slab_free include/linux/kasan.h:235 [inline]
>> slab_free_hook mm/slub.c:2689 [inline]
>> slab_free mm/slub.c:6251 [inline]
>> kfree+0x1c5/0x640 mm/slub.c:6566
>> skb_kfree_head net/core/skbuff.c:1075 [inline]
>> skb_free_head net/core/skbuff.c:1087 [inline]
>> skb_release_data+0x828/0xa60 net/core/skbuff.c:1114
>> skb_release_all net/core/skbuff.c:1189 [inline]
>> __kfree_skb+0x5d/0x210 net/core/skbuff.c:1203
>> netlink_broadcast_filtered+0xe18/0xf20 net/netlink/af_netlink.c:1540
>> nlmsg_multicast_filtered include/net/netlink.h:1165 [inline]
>> nlmsg_multicast include/net/netlink.h:1184 [inline]
>> nlmsg_notify+0xf0/0x1a0 net/netlink/af_netlink.c:2598
>> mpls_dev_notify+0xb2d/0xd10 net/mpls/af_mpls.c:1691
>> notifier_call_chain+0x1ad/0x3d0 kernel/notifier.c:85
>> call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
>> call_netdevice_notifiers net/core/dev.c:2301 [inline]
>> unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12421
>> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
>> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
>> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
>> process_one_work kernel/workqueue.c:3314 [inline]
>> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
>> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
>> kthread+0x389/0x470 kernel/kthread.c:436
>> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>
>> The buggy address belongs to the object at ffff888114d8c000
>> which belongs to the cache skbuff_small_head of size 704
>> The buggy address is located 69 bytes inside of
>> freed 704-byte region [ffff888114d8c000, ffff888114d8c2c0)
>>
>> The buggy address belongs to the physical page:
>> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x114d8c
>> head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
>> flags: 0x17ff00000000040(head|node=0|zone=2|lastcpupid=0x7ff)
>> page_type: f5(slab)
>> raw: 017ff00000000040 ffff888160416b40 dead000000000100 dead000000000122
>> raw: 0000000000000000 0000000800120012 00000000f5000000 0000000000000000
>> head: 017ff00000000040 ffff888160416b40 dead000000000100 dead000000000122
>> head: 0000000000000000 0000000800120012 00000000f5000000 0000000000000000
>> head: 017ff00000000002 ffffffffffffff01 00000000ffffffff 00000000ffffffff
>> head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000004
>> page dumped because: kasan: bad access detected
>> page_owner tracks the page as allocated
>> page last allocated via order 2, migratetype Unmovable, gfp_mask
>> 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC),
>> pid 5484, tgid 5484 (kworker/u8:2), ts 72573003529, free_ts 72546506446
>> set_page_owner include/linux/page_owner.h:32 [inline]
>> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
>> prep_new_page mm/page_alloc.c:1861 [inline]
>> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
>> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
>> alloc_slab_page mm/slub.c:3278 [inline]
>> allocate_slab+0x77/0x660 mm/slub.c:3467
>> new_slab mm/slub.c:3525 [inline]
>> refill_objects+0x339/0x3d0 mm/slub.c:7272
>> refill_sheaf mm/slub.c:2816 [inline]
>> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
>> alloc_from_pcs mm/slub.c:4750 [inline]
>> slab_alloc_node mm/slub.c:4884 [inline]
>> kmem_cache_alloc_node_noprof+0x441/0x690 mm/slub.c:4951
>> kmalloc_reserve net/core/skbuff.c:613 [inline]
>> __alloc_skb+0x27d/0x7d0 net/core/skbuff.c:713
>> alloc_skb include/linux/skbuff.h:1385 [inline]
>> nlmsg_new include/net/netlink.h:1055 [inline]
>> mpls_netconf_notify_devconf+0x46/0x100 net/mpls/af_mpls.c:1217
>> mpls_dev_notify+0xb2d/0xd10 net/mpls/af_mpls.c:1691
>> notifier_call_chain+0x1ad/0x3d0 kernel/notifier.c:85
>> call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
>> call_netdevice_notifiers net/core/dev.c:2301 [inline]
>> unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12421
>> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
>> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
>> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
>> process_one_work kernel/workqueue.c:3314 [inline]
>> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
>> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
>> page last free pid 5484 tgid 5484 stack trace:
>> reset_page_owner include/linux/page_owner.h:25 [inline]
>> __free_pages_prepare mm/page_alloc.c:1397 [inline]
>> __free_frozen_pages+0xc1c/0xd30 mm/page_alloc.c:2938
>> stack_depot_save_flags+0x40e/0x810 lib/stackdepot.c:735
>> kasan_save_stack mm/kasan/common.c:58 [inline]
>> kasan_save_track+0x4f/0x80 mm/kasan/common.c:78
>> unpoison_slab_object mm/kasan/common.c:340 [inline]
>> __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
>> kasan_slab_alloc include/linux/kasan.h:253 [inline]
>> slab_post_alloc_hook mm/slub.c:4570 [inline]
>> slab_alloc_node mm/slub.c:4899 [inline]
>> kmem_cache_alloc_noprof+0x2bc/0x650 mm/slub.c:4906
>> kmem_alloc_batch lib/debugobjects.c:371 [inline]
>> fill_pool+0x156/0x580 lib/debugobjects.c:420
>> debug_objects_fill_pool lib/debugobjects.c:752 [inline]
>> debug_object_activate+0x4a3/0x580 lib/debugobjects.c:841
>> debug_rcu_head_queue kernel/rcu/rcu.h:236 [inline]
>> __call_rcu_common kernel/rcu/tree.c:3116 [inline]
>> call_rcu+0x43/0x890 kernel/rcu/tree.c:3251
>> kernfs_put+0x259/0x520 fs/kernfs/dir.c:618
>> kernfs_remove_by_name_ns+0xc8/0x140 fs/kernfs/dir.c:1799
>> device_remove_class_symlinks+0x178/0x190 drivers/base/core.c:3479
>> device_del+0x400/0x8f0 drivers/base/core.c:3881
>> unregister_netdevice_many_notify+0x1d5f/0x22c0 net/core/dev.c:12456
>> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
>> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
>> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
>> process_one_work kernel/workqueue.c:3314 [inline]
>> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
>>
>> Memory state around the buggy address:
>> ffff888114d8bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ffff888114d8bf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> >ffff888114d8c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ^
>> ffff888114d8c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ffff888114d8c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================
>>
>>
>> ***
>>
>> KASAN: slab-use-after-free Read in ext4_inlinedir_to_tree
>>
>> tree: torvalds
>> URL:
>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
>> arch: amd64
>> compiler: Debian clang version 21.1.8
>> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>> config:
>> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
>> syz repro:
>> https://ci.syzbot.org/findings/f42da242-e16e-4f10-bf25-0bd7e192d989/syz_repro
>>
>> loop0: lost filesystem error report for type 5 error -117
>> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000
>> r/w without journal. Quota mode: none.
>> ==================================================================
>> BUG: KASAN: slab-use-after-free in ext4_dirent_get_data_len
>> fs/ext4/ext4.h:4069 [inline]
>> BUG: KASAN: slab-use-after-free in ext4_dir_entry_len fs/ext4/ext4.h:4096
>> [inline]
>> BUG: KASAN: slab-use-after-free in ext4_inlinedir_to_tree+0x94c/0x10d0
>> fs/ext4/inline.c:1335
>> Read of size 1 at addr ffff88816fee8825 by task syz.0.20/5867
>>
>> CPU: 1 UID: 0 PID: 5867 Comm: syz.0.20 Not tainted syzkaller #0
>> PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> 1.16.2-debian-1.16.2-1 04/01/2014
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
>> print_report+0x58/0x70 mm/kasan/report.c:482
>> kasan_report+0x117/0x150 mm/kasan/report.c:595
>> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
>> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
>> ext4_inlinedir_to_tree+0x94c/0x10d0 fs/ext4/inline.c:1335
>> ext4_htree_fill_tree+0x517/0x1230 fs/ext4/namei.c:1182
>> ext4_dx_readdir fs/ext4/dir.c:600 [inline]
>> ext4_readdir+0x2db4/0x3640 fs/ext4/dir.c:146
>> iterate_dir+0x399/0x570 fs/readdir.c:110
>> __do_sys_getdents fs/readdir.c:319 [inline]
>> __se_sys_getdents+0xf1/0x270 fs/readdir.c:304
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f010ad9ce59
>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
>> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007f010bc0f028 EFLAGS: 00000246 ORIG_RAX: 000000000000004e
>> RAX: ffffffffffffffda RBX: 00007f010b015fa0 RCX: 00007f010ad9ce59
>> RDX: 0000000000000054 RSI: 0000000000000000 RDI: 0000000000000004
>> RBP: 00007f010ae32d6f R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007f010b016038 R14: 00007f010b015fa0 R15: 00007ffd93577348
>> </TASK>
>>
>> Allocated by task 5064:
>> kasan_save_stack mm/kasan/common.c:57 [inline]
>> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>> poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
>> __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
>> kasan_kmalloc include/linux/kasan.h:263 [inline]
>> __do_kmalloc_node mm/slub.c:5296 [inline]
>> __kmalloc_noprof+0x35c/0x760 mm/slub.c:5308
>> kmalloc_noprof include/linux/slab.h:954 [inline]
>> kzalloc_noprof include/linux/slab.h:1188 [inline]
>> tomoyo_encode2 security/tomoyo/realpath.c:45 [inline]
>> tomoyo_encode+0x28b/0x550 security/tomoyo/realpath.c:80
>> tomoyo_realpath_from_path+0x58d/0x5d0 security/tomoyo/realpath.c:283
>> tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
>> tomoyo_path_perm+0x283/0x560 security/tomoyo/file.c:827
>> security_inode_getattr+0x12b/0x310 security/security.c:1895
>> vfs_getattr fs/stat.c:259 [inline]
>> vfs_fstat fs/stat.c:281 [inline]
>> vfs_fstatat+0xb4/0x170 fs/stat.c:371
>> __do_sys_newfstatat fs/stat.c:538 [inline]
>> __se_sys_newfstatat fs/stat.c:532 [inline]
>> __x64_sys_newfstatat+0x151/0x200 fs/stat.c:532
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> Freed by task 5064:
>> kasan_save_stack mm/kasan/common.c:57 [inline]
>> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>> kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
>> poison_slab_object mm/kasan/common.c:253 [inline]
>> __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
>> kasan_slab_free include/linux/kasan.h:235 [inline]
>> slab_free_hook mm/slub.c:2689 [inline]
>> slab_free mm/slub.c:6251 [inline]
>> kfree+0x1c5/0x640 mm/slub.c:6566
>> tomoyo_path_perm+0x403/0x560 security/tomoyo/file.c:847
>> security_inode_getattr+0x12b/0x310 security/security.c:1895
>> vfs_getattr fs/stat.c:259 [inline]
>> vfs_fstat fs/stat.c:281 [inline]
>> vfs_fstatat+0xb4/0x170 fs/stat.c:371
>> __do_sys_newfstatat fs/stat.c:538 [inline]
>> __se_sys_newfstatat fs/stat.c:532 [inline]
>> __x64_sys_newfstatat+0x151/0x200 fs/stat.c:532
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> The buggy address belongs to the object at ffff88816fee8800
>> which belongs to the cache kmalloc-64 of size 64
>> The buggy address is located 37 bytes inside of
>> freed 64-byte region [ffff88816fee8800, ffff88816fee8840)
>>
>> The buggy address belongs to the physical page:
>> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x16fee8
>> flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff)
>> page_type: f5(slab)
>> raw: 057ff00000000000 ffff8881000418c0 dead000000000100 dead000000000122
>> raw: 0000000000000000 0000000800200020 00000000f5000000 0000000000000000
>> page dumped because: kasan: bad access detected
>> page_owner tracks the page as allocated
>> page last allocated via order 0, migratetype Unmovable, gfp_mask
>> 0xd2cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC),
>> pid 1, tgid 1 (swapper/0), ts 21294026082, free_ts 0
>> set_page_owner include/linux/page_owner.h:32 [inline]
>> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
>> prep_new_page mm/page_alloc.c:1861 [inline]
>> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
>> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
>> alloc_slab_page mm/slub.c:3278 [inline]
>> allocate_slab+0x77/0x660 mm/slub.c:3467
>> new_slab mm/slub.c:3525 [inline]
>> refill_objects+0x339/0x3d0 mm/slub.c:7272
>> refill_sheaf mm/slub.c:2816 [inline]
>> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
>> alloc_from_pcs mm/slub.c:4750 [inline]
>> slab_alloc_node mm/slub.c:4884 [inline]
>> __do_kmalloc_node mm/slub.c:5295 [inline]
>> __kmalloc_noprof+0x474/0x760 mm/slub.c:5308
>> kmalloc_noprof include/linux/slab.h:954 [inline]
>> kzalloc_noprof include/linux/slab.h:1188 [inline]
>> handler_new_ref+0x261/0x9c0 drivers/media/v4l2-core/v4l2-ctrls-core.c:1882
>> v4l2_ctrl_add_handler+0x19f/0x290
>> drivers/media/v4l2-core/v4l2-ctrls-core.c:2443
>> vivid_create_controls+0x332d/0x3bd0
>> drivers/media/test-drivers/vivid/vivid-ctrls.c:2072
>> vivid_create_instance drivers/media/test-drivers/vivid/vivid-core.c:1933
>> [inline]
>> vivid_probe+0x4261/0x72b0
>> drivers/media/test-drivers/vivid/vivid-core.c:2095
>> platform_probe+0xf9/0x190 drivers/base/platform.c:1432
>> call_driver_probe drivers/base/dd.c:-1 [inline]
>> really_probe+0x267/0xaf0 drivers/base/dd.c:709
>> __driver_probe_device+0x1ef/0x380 drivers/base/dd.c:871
>> driver_probe_device+0x4f/0x240 drivers/base/dd.c:901
>> __driver_attach+0x34c/0x640 drivers/base/dd.c:1295
>> page_owner free stack trace missing
>>
>> Memory state around the buggy address:
>> ffff88816fee8700: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
>> ffff88816fee8780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
>> >ffff88816fee8800: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>> ^
>> ffff88816fee8880: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>> ffff88816fee8900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>> ==================================================================
>>
>>
>> ***
>>
>> KASAN: use-after-free Read in __ext4_check_dir_entry
>>
>> tree: torvalds
>> URL:
>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
>> arch: amd64
>> compiler: Debian clang version 21.1.8
>> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>> config:
>> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
>> syz repro:
>> https://ci.syzbot.org/findings/57c0b75a-8922-4dc1-9a20-ca947564792b/syz_repro
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in ext4_dirent_get_data_len fs/ext4/ext4.h:4069
>> [inline]
>> BUG: KASAN: use-after-free in ext4_dir_entry_len fs/ext4/ext4.h:4096
>> [inline]
>> BUG: KASAN: use-after-free in __ext4_check_dir_entry+0x65a/0xc40
>> fs/ext4/dir.c:96
>> Read of size 1 at addr ffff88816be85045 by task syz.2.21/5880
>>
>> CPU: 1 UID: 0 PID: 5880 Comm: syz.2.21 Not tainted syzkaller #0
>> PREEMPT(full)
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> 1.16.2-debian-1.16.2-1 04/01/2014
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
>> print_report+0x58/0x70 mm/kasan/report.c:482
>> kasan_report+0x117/0x150 mm/kasan/report.c:595
>> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
>> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
>> __ext4_check_dir_entry+0x65a/0xc40 fs/ext4/dir.c:96
>> ext4_find_dest_de+0x136/0x770 fs/ext4/namei.c:2203
>> ext4_add_dirent_to_inline+0xcf/0x430 fs/ext4/inline.c:984
>> ext4_try_add_inline_entry+0x235/0x8e0 fs/ext4/inline.c:1213
>> __ext4_add_entry+0x390/0x1f40 fs/ext4/namei.c:2529
>> ext4_add_entry fs/ext4/namei.c:2613 [inline]
>> ext4_add_nondir+0x111/0x310 fs/ext4/namei.c:2936
>> ext4_create+0x2e9/0x470 fs/ext4/namei.c:2982
>> lookup_open fs/namei.c:4511 [inline]
>> open_last_lookups fs/namei.c:4611 [inline]
>> path_openat+0x1395/0x3860 fs/namei.c:4855
>> do_file_open+0x23e/0x4a0 fs/namei.c:4887
>> do_sys_openat2+0x113/0x200 fs/open.c:1364
>> do_sys_open fs/open.c:1370 [inline]
>> __do_sys_openat fs/open.c:1386 [inline]
>> __se_sys_openat fs/open.c:1381 [inline]
>> __x64_sys_openat+0x138/0x170 fs/open.c:1381
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f5713b9ce59
>> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
>> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
>> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007fff672b25f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
>> RAX: ffffffffffffffda RBX: 00007f5713e15fa0 RCX: 00007f5713b9ce59
>> RDX: 0000000000042042 RSI: 0000200000000080 RDI: 0000000000000004
>> RBP: 00007f5713c32d6f R08: 0000000000000000 R09: 0000000000000000
>> R10: 000000000000014a R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007f5713e15fac R14: 00007f5713e15fa0 R15: 00007f5713e15fa0
>> </TASK>
>>
>> The buggy address belongs to the physical page:
>> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x16be85
>> flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff)
>> page_type: f0(buddy)
>> raw: 057ff00000000000 ffffea0005afa0c8 ffffea0005afa1c8 0000000000000000
>> raw: 0000000000000000 0000000000000000 00000000f0000000 0000000000000000
>> page dumped because: kasan: bad access detected
>> page_owner tracks the page as freed
>> page last allocated via order 0, migratetype Unmovable, gfp_mask
>> 0xcc0(GFP_KERNEL), pid 5630, tgid 5630 (syz-executor), ts 67290853657,
>> free_ts 69321168948
>> set_page_owner include/linux/page_owner.h:32 [inline]
>> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
>> prep_new_page mm/page_alloc.c:1861 [inline]
>> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
>> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
>> __alloc_pages_noprof+0x10/0x100 mm/page_alloc.c:5255
>> alloc_pages_bulk_noprof+0x5ff/0x7c0 mm/page_alloc.c:5175
>> ___alloc_pages_bulk mm/kasan/shadow.c:345 [inline]
>> __kasan_populate_vmalloc_do mm/kasan/shadow.c:370 [inline]
>> __kasan_populate_vmalloc+0xc1/0x1d0 mm/kasan/shadow.c:424
>> kasan_populate_vmalloc include/linux/kasan.h:580 [inline]
>> alloc_vmap_area+0xd47/0x1480 mm/vmalloc.c:2123
>> __get_vm_area_node+0x1f8/0x300 mm/vmalloc.c:3226
>> __vmalloc_node_range_noprof+0x36a/0x1750 mm/vmalloc.c:4024
>> vmalloc_user_noprof+0xad/0xe0 mm/vmalloc.c:4218
>> kcov_ioctl+0x55/0x620 kernel/kcov.c:726
>> vfs_ioctl fs/ioctl.c:51 [inline]
>> __do_sys_ioctl fs/ioctl.c:597 [inline]
>> __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
>> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> page last free pid 5693 tgid 5693 stack trace:
>> reset_page_owner include/linux/page_owner.h:25 [inline]
>> __free_pages_prepare mm/page_alloc.c:1397 [inline]
>> __free_frozen_pages+0xc1c/0xd30 mm/page_alloc.c:2938
>> kasan_depopulate_vmalloc_pte+0x6d/0x90 mm/kasan/shadow.c:484
>> apply_to_pte_range mm/memory.c:3338 [inline]
>> apply_to_pmd_range mm/memory.c:3382 [inline]
>> apply_to_pud_range mm/memory.c:3418 [inline]
>> apply_to_p4d_range mm/memory.c:3454 [inline]
>> __apply_to_page_range+0xbdc/0x1420 mm/memory.c:3490
>> __kasan_release_vmalloc+0xa2/0xd0 mm/kasan/shadow.c:602
>> kasan_release_vmalloc include/linux/kasan.h:593 [inline]
>> kasan_release_vmalloc_node mm/vmalloc.c:2284 [inline]
>> purge_vmap_node+0x220/0x960 mm/vmalloc.c:2306
>> __purge_vmap_area_lazy+0x779/0xb40 mm/vmalloc.c:2396
>> drain_vmap_area_work+0x27/0x40 mm/vmalloc.c:2430
>> process_one_work kernel/workqueue.c:3314 [inline]
>> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
>> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
>> kthread+0x389/0x470 kernel/kthread.c:436
>> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>
>> Memory state around the buggy address:
>> ffff88816be84f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ffff88816be84f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> >ffff88816be85000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> ^
>> ffff88816be85080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> ffff88816be85100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>> ==================================================================
>>
>>
>> ***
>>
>> If these findings have caused you to resend the series or submit a
>> separate fix, please add the following tag to your commit message:
>> Tested-by: syzbot@syzkaller.appspotmail.com
>>
>> ---
>> This report is generated by a bot. It may contain errors.
>> syzbot ci engineers can be reached at syzkaller@googlegroups.com.
>>
>> To test a patch for this bug, please reply with `#syz test`
>> (should be on a separate line).
>>
>> The patch should be attached to the email.
>> Note: arguments like custom git repos and branches are not supported.
>>
>>
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/syzkaller-bugs/CA%2BrD4x_2wXOP%3D4RwPY-A2vJjK4Vv9hGUSVFzprCe1H%2B8MTOKhA%40mail.gmail.com.
^ permalink raw reply
* Re: [syzbot ci] Re: Data in direntry (dirdata) feature
From: Artem Blagodarenko @ 2026-06-19 14:10 UTC (permalink / raw)
To: adilger, artem.blagodarenko, linux-ext4, pravin.shelar
Cc: syzbot, syzkaller-bugs
In-Reply-To: <6a2a8e0d.3b0a2d4e.8c8d1.000f.GAE@google.com>
[-- Attachment #1.1: Type: text/plain, Size: 43669 bytes --]
Thanks for the report. The attached patch addresses the issues found in
the dirdata series review (dx_get_dx_info/get_dx_countlimit blocksize
fallback, dfid parameter shadowing in ext4_dirdata_get, and the unsafe
delete-before-add in EXT4_IOC_SET_LUFID).
#syz test
On Thu, Jun 11, 2026 11:29 AM, syzbot ci <
syzbot+cid7b922cb3d448114@syzkaller.appspotmail.com> wrote:
> syzbot ci has tested the following series
>
> [v2] Data in direntry (dirdata) feature
>
> https://lore.kernel.org/all/20260610152417.13576-1-ablagodarenko@thelustrecollective.com
> * [PATCH v2 01/10] ext4: replace ext4_dir_entry with ext4_dir_entry_2
> * [PATCH v2 02/10] ext4: add ext4_dir_entry_is_tail()
> * [PATCH v2 03/10] ext4: refactor dx_root to support variable dirent sizes
> * [PATCH v2 04/10] ext4: add dirdata format definitions and access helpers
> * [PATCH v2 05/10] ext4: preserve dirdata bits in get_dtype()
> * [PATCH v2 06/10] ext4: add ext4_dir_entry_len() and harden dirdata
> parsing
> * [PATCH v2 07/10] ext4: rename ext4_dir_rec_len() and clarify dirdata
> usage
> * [PATCH v2 08/10] ext4: dirdata feature
> * [PATCH v2 09/10] ext4: add dirdata set/get helpers
> * [PATCH v2 10/10] ext4: Add EXT4_IOC_SET_LUFID ioctl for setting LUFID on
> directory entries
>
> and found the following issues:
> * KASAN: slab-out-of-bounds Read in __ext4_check_dir_entry
> * KASAN: slab-out-of-bounds Read in ext4_inlinedir_to_tree
> * KASAN: slab-use-after-free Read in __ext4_check_dir_entry
> * KASAN: slab-use-after-free Read in ext4_inlinedir_to_tree
> * KASAN: use-after-free Read in __ext4_check_dir_entry
>
> Full report is available here:
> https://ci.syzbot.org/series/5bf0e2fa-2e68-4532-8396-4568879b2788
>
> ***
>
> KASAN: slab-out-of-bounds Read in __ext4_check_dir_entry
>
> tree: torvalds
> URL:
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
> arch: amd64
> compiler: Debian clang version 21.1.8
> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> config:
> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
> syz repro:
> https://ci.syzbot.org/findings/b0854918-13f9-49dd-ab30-12154f0debe2/syz_repro
>
> loop0: lost filesystem error report for type 5 error -117
> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000
> r/w without journal. Quota mode: none.
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in ext4_dirent_get_data_len
> fs/ext4/ext4.h:4069 [inline]
> BUG: KASAN: slab-out-of-bounds in ext4_dir_entry_len fs/ext4/ext4.h:4096
> [inline]
> BUG: KASAN: slab-out-of-bounds in __ext4_check_dir_entry+0x65a/0xc40
> fs/ext4/dir.c:96
> Read of size 1 at addr ffff8881022db7f5 by task syz.0.23/5815
>
> CPU: 1 UID: 0 PID: 5815 Comm: syz.0.23 Not tainted syzkaller #0
> PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.2-debian-1.16.2-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
> print_report+0x58/0x70 mm/kasan/report.c:482
> kasan_report+0x117/0x150 mm/kasan/report.c:595
> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
> __ext4_check_dir_entry+0x65a/0xc40 fs/ext4/dir.c:96
> ext4_check_all_de+0x66/0x150 fs/ext4/dir.c:657
> ext4_convert_inline_data_nolock+0x1b7/0x990 fs/ext4/inline.c:1121
> ext4_try_add_inline_entry+0x604/0x8e0 fs/ext4/inline.c:1247
> __ext4_add_entry+0x390/0x1f40 fs/ext4/namei.c:2529
> ext4_add_entry fs/ext4/namei.c:2613 [inline]
> ext4_mkdir+0x5e5/0xce0 fs/ext4/namei.c:3175
> vfs_mkdir+0x413/0x630 fs/namei.c:5271
> filename_mkdirat+0x285/0x510 fs/namei.c:5304
> __do_sys_mkdirat fs/namei.c:5325 [inline]
> __se_sys_mkdirat+0x35/0x150 fs/namei.c:5322
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f669359bcc7
> Code: 00 66 90 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 db f7 ff
> ff 66 2e 0f 1f 84 00 00 00 00 00 90 b8 02 01 00 00 0f 05 <48> 3d 01 f0 ff
> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007ffd42381d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000102
> RAX: ffffffffffffffda RBX: 00007ffd42381dc0 RCX: 00007f669359bcc7
> RDX: 00000000000001ff RSI: 0000200000001200 RDI: 00000000ffffff9c
> RBP: 00002000000024c0 R08: 0000200000000240 R09: 0000000000000000
> R10: 00002000000024c0 R11: 0000000000000246 R12: 0000200000001200
> R13: 00007ffd42381d80 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
>
> Allocated by task 5066:
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
> __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
> kasan_kmalloc include/linux/kasan.h:263 [inline]
> __kmalloc_cache_noprof+0x31c/0x660 mm/slub.c:5420
> kmalloc_noprof include/linux/slab.h:950 [inline]
> kzalloc_noprof include/linux/slab.h:1188 [inline]
> kernfs_get_open_node fs/kernfs/file.c:543 [inline]
> kernfs_fop_open+0x862/0xda0 fs/kernfs/file.c:718
> do_dentry_open+0x822/0x13a0 fs/open.c:947
> vfs_open+0x3b/0x340 fs/open.c:1079
> do_open fs/namei.c:4699 [inline]
> path_openat+0x2e08/0x3860 fs/namei.c:4858
> do_file_open+0x23e/0x4a0 fs/namei.c:4887
> do_sys_openat2+0x113/0x200 fs/open.c:1364
> do_sys_open fs/open.c:1370 [inline]
> __do_sys_openat fs/open.c:1386 [inline]
> __se_sys_openat fs/open.c:1381 [inline]
> __x64_sys_openat+0x138/0x170 fs/open.c:1381
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Last potentially related work creation:
> kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57
> kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556
> kvfree_call_rcu+0x100/0x430 mm/slab_common.c:1970
> kernfs_unlink_open_file+0x3fe/0x4b0 fs/kernfs/file.c:604
> kernfs_fop_release+0x2eb/0x440 fs/kernfs/file.c:783
> __fput+0x44f/0xa60 fs/file_table.c:510
> fput_close_sync+0x11f/0x240 fs/file_table.c:615
> __do_sys_close fs/open.c:1507 [inline]
> __se_sys_close fs/open.c:1492 [inline]
> __x64_sys_close+0x7e/0x110 fs/open.c:1492
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> The buggy address belongs to the object at ffff8881022db700
> which belongs to the cache kmalloc-128 of size 128
> The buggy address is located 117 bytes to the right of
> allocated 128-byte region [ffff8881022db700, ffff8881022db780)
>
> The buggy address belongs to the physical page:
> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1022db
> flags: 0x17ff00000000000(node=0|zone=2|lastcpupid=0x7ff)
> page_type: f5(slab)
> raw: 017ff00000000000 ffff888100041a00 dead000000000100 dead000000000122
> raw: 0000000000000000 0000000800100010 00000000f5000000 0000000000000000
> page dumped because: kasan: bad access detected
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Unmovable, gfp_mask
> 0xd2000(__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 0,
> tgid 0 (swapper/0), ts 2408938923, free_ts 0
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
> prep_new_page mm/page_alloc.c:1861 [inline]
> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
> alloc_slab_page mm/slub.c:3278 [inline]
> allocate_slab+0x77/0x660 mm/slub.c:3467
> new_slab mm/slub.c:3525 [inline]
> refill_objects+0x339/0x3d0 mm/slub.c:7272
> refill_sheaf mm/slub.c:2816 [inline]
> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
> alloc_from_pcs mm/slub.c:4750 [inline]
> slab_alloc_node mm/slub.c:4884 [inline]
> __do_kmalloc_node mm/slub.c:5295 [inline]
> __kmalloc_noprof+0x474/0x760 mm/slub.c:5308
> kmalloc_noprof include/linux/slab.h:954 [inline]
> kzalloc_noprof include/linux/slab.h:1188 [inline]
> __alloc_empty_sheaf mm/slub.c:2768 [inline]
> alloc_empty_sheaf mm/slub.c:2783 [inline]
> __pcs_replace_empty_main+0x2df/0x720 mm/slub.c:4647
> alloc_from_pcs mm/slub.c:4750 [inline]
> slab_alloc_node mm/slub.c:4884 [inline]
> kmem_cache_alloc_noprof+0x37d/0x650 mm/slub.c:4906
> dup_fd+0x55/0xb40 fs/file.c:390
> copy_files+0xc8/0x120 kernel/fork.c:1639
> copy_process+0x1d94/0x4440 kernel/fork.c:2252
> kernel_clone+0x2d7/0x940 kernel/fork.c:2722
> user_mode_thread+0x110/0x180 kernel/fork.c:2798
> rest_init+0x23/0x300 init/main.c:727
> start_kernel+0x38a/0x3e0 init/main.c:1220
> page_owner free stack trace missing
>
> Memory state around the buggy address:
> ffff8881022db680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff8881022db700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >ffff8881022db780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ^
> ffff8881022db800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffff8881022db880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
>
>
> ***
>
> KASAN: slab-out-of-bounds Read in ext4_inlinedir_to_tree
>
> tree: torvalds
> URL:
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
> arch: amd64
> compiler: Debian clang version 21.1.8
> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> config:
> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
> syz repro:
> https://ci.syzbot.org/findings/2dff870b-f382-4c93-8d8d-b2291d921224/syz_repro
>
> loop1: lost filesystem error report for type 5 error -117
> EXT4-fs (loop1): mounted filesystem 00000000-0000-0000-0000-000000000000
> r/w without journal. Quota mode: none.
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in ext4_dir_entry_len fs/ext4/ext4.h:4095
> [inline]
> BUG: KASAN: slab-out-of-bounds in ext4_inlinedir_to_tree+0xda5/0x10d0
> fs/ext4/inline.c:1335
> Read of size 2 at addr ffff888115a3183c by task syz.1.18/5839
>
> CPU: 1 UID: 0 PID: 5839 Comm: syz.1.18 Not tainted syzkaller #0
> PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.2-debian-1.16.2-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
> print_report+0x58/0x70 mm/kasan/report.c:482
> kasan_report+0x117/0x150 mm/kasan/report.c:595
> ext4_dir_entry_len fs/ext4/ext4.h:4095 [inline]
> ext4_inlinedir_to_tree+0xda5/0x10d0 fs/ext4/inline.c:1335
> ext4_htree_fill_tree+0x517/0x1230 fs/ext4/namei.c:1182
> ext4_dx_readdir fs/ext4/dir.c:600 [inline]
> ext4_readdir+0x2db4/0x3640 fs/ext4/dir.c:146
> iterate_dir+0x399/0x570 fs/readdir.c:110
> __do_sys_getdents64 fs/readdir.c:399 [inline]
> __se_sys_getdents64+0xf1/0x280 fs/readdir.c:384
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f3e02b9ce59
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f3e03ad5028 EFLAGS: 00000246 ORIG_RAX: 00000000000000d9
> RAX: ffffffffffffffda RBX: 00007f3e02e15fa0 RCX: 00007f3e02b9ce59
> RDX: 0000000000001000 RSI: 0000200000000f80 RDI: 0000000000000004
> RBP: 00007f3e02c32d6f R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f3e02e16038 R14: 00007f3e02e15fa0 R15: 00007ffcaa902298
> </TASK>
>
> Allocated by task 5839:
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
> __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
> kasan_kmalloc include/linux/kasan.h:263 [inline]
> __do_kmalloc_node mm/slub.c:5296 [inline]
> __kmalloc_noprof+0x35c/0x760 mm/slub.c:5308
> kmalloc_noprof include/linux/slab.h:954 [inline]
> ext4_inlinedir_to_tree+0x312/0x10d0 fs/ext4/inline.c:1292
> ext4_htree_fill_tree+0x517/0x1230 fs/ext4/namei.c:1182
> ext4_dx_readdir fs/ext4/dir.c:600 [inline]
> ext4_readdir+0x2db4/0x3640 fs/ext4/dir.c:146
> iterate_dir+0x399/0x570 fs/readdir.c:110
> __do_sys_getdents64 fs/readdir.c:399 [inline]
> __se_sys_getdents64+0xf1/0x280 fs/readdir.c:384
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> The buggy address belongs to the object at ffff888115a31800
> which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 0 bytes to the right of
> allocated 60-byte region [ffff888115a31800, ffff888115a3183c)
>
> The buggy address belongs to the physical page:
> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x115a31
> flags: 0x17ff00000000000(node=0|zone=2|lastcpupid=0x7ff)
> page_type: f5(slab)
> raw: 017ff00000000000 ffff8881000418c0 dead000000000100 dead000000000122
> raw: 0000000000000000 0000000800200020 00000000f5000000 0000000000000000
> page dumped because: kasan: bad access detected
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Unmovable, gfp_mask
> 0xd2c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC),
> pid 5051, tgid 5051 (acpid), ts 27203740677, free_ts 27201732767
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
> prep_new_page mm/page_alloc.c:1861 [inline]
> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
> alloc_slab_page mm/slub.c:3278 [inline]
> allocate_slab+0x77/0x660 mm/slub.c:3467
> new_slab mm/slub.c:3525 [inline]
> refill_objects+0x339/0x3d0 mm/slub.c:7272
> refill_sheaf mm/slub.c:2816 [inline]
> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
> alloc_from_pcs mm/slub.c:4750 [inline]
> slab_alloc_node mm/slub.c:4884 [inline]
> __do_kmalloc_node mm/slub.c:5295 [inline]
> __kmalloc_noprof+0x474/0x760 mm/slub.c:5308
> kmalloc_noprof include/linux/slab.h:954 [inline]
> kzalloc_noprof include/linux/slab.h:1188 [inline]
> tomoyo_get_name+0x20c/0x590 security/tomoyo/memory.c:173
> tomoyo_parse_name_union+0xd9/0x130 security/tomoyo/util.c:260
> tomoyo_update_path_acl security/tomoyo/file.c:399 [inline]
> tomoyo_write_file+0x3a6/0xc50 security/tomoyo/file.c:1027
> tomoyo_write_domain2 security/tomoyo/common.c:1160 [inline]
> tomoyo_add_entry security/tomoyo/common.c:2177 [inline]
> tomoyo_supervisor+0x1208/0x1570 security/tomoyo/common.c:2238
> tomoyo_audit_path_log security/tomoyo/file.c:169 [inline]
> tomoyo_path_permission+0x25a/0x380 security/tomoyo/file.c:592
> tomoyo_check_open_permission+0x2b2/0x470 security/tomoyo/file.c:782
> security_file_open+0xa9/0x240 security/security.c:2739
> do_dentry_open+0x4a8/0x13a0 fs/open.c:924
> vfs_open+0x3b/0x340 fs/open.c:1079
> page last free pid 15 tgid 15 stack trace:
> reset_page_owner include/linux/page_owner.h:25 [inline]
> __free_pages_prepare mm/page_alloc.c:1397 [inline]
> __free_frozen_pages+0xc1c/0xd30 mm/page_alloc.c:2938
> __tlb_remove_table_free mm/mmu_gather.c:228 [inline]
> tlb_remove_table_rcu+0x85/0x100 mm/mmu_gather.c:291
> rcu_do_batch kernel/rcu/tree.c:2617 [inline]
> rcu_core+0x7cd/0x1070 kernel/rcu/tree.c:2869
> handle_softirqs+0x22a/0x840 kernel/softirq.c:622
> run_ksoftirqd+0x36/0x60 kernel/softirq.c:1076
> smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
> kthread+0x389/0x470 kernel/kthread.c:436
> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> Memory state around the buggy address:
> ffff888115a31700: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff888115a31780: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
> >ffff888115a31800: 00 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc
> ^
> ffff888115a31880: 00 00 00 00 00 00 02 fc fc fc fc fc fc fc fc fc
> ffff888115a31900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ==================================================================
>
>
> ***
>
> KASAN: slab-use-after-free Read in __ext4_check_dir_entry
>
> tree: torvalds
> URL:
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
> arch: amd64
> compiler: Debian clang version 21.1.8
> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> config:
> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
> syz repro:
> https://ci.syzbot.org/findings/f1d48ea1-6e87-4d64-9c13-8bf8aed109fc/syz_repro
>
> loop0: lost filesystem error report for type 5 error -117
> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000
> r/w without journal. Quota mode: none.
> ==================================================================
> BUG: KASAN: slab-use-after-free in ext4_dirent_get_data_len
> fs/ext4/ext4.h:4069 [inline]
> BUG: KASAN: slab-use-after-free in ext4_dir_entry_len fs/ext4/ext4.h:4096
> [inline]
> BUG: KASAN: slab-use-after-free in __ext4_check_dir_entry+0x65a/0xc40
> fs/ext4/dir.c:96
> Read of size 1 at addr ffff888114d8c045 by task syz.0.20/5821
>
> CPU: 1 UID: 0 PID: 5821 Comm: syz.0.20 Not tainted syzkaller #0
> PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.2-debian-1.16.2-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
> print_report+0x58/0x70 mm/kasan/report.c:482
> kasan_report+0x117/0x150 mm/kasan/report.c:595
> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
> __ext4_check_dir_entry+0x65a/0xc40 fs/ext4/dir.c:96
> ext4_find_dest_de+0x136/0x770 fs/ext4/namei.c:2203
> ext4_add_dirent_to_inline+0xcf/0x430 fs/ext4/inline.c:984
> ext4_try_add_inline_entry+0x235/0x8e0 fs/ext4/inline.c:1213
> __ext4_add_entry+0x390/0x1f40 fs/ext4/namei.c:2529
> ext4_add_entry fs/ext4/namei.c:2613 [inline]
> ext4_add_nondir+0x111/0x310 fs/ext4/namei.c:2936
> ext4_create+0x2e9/0x470 fs/ext4/namei.c:2982
> lookup_open fs/namei.c:4511 [inline]
> open_last_lookups fs/namei.c:4611 [inline]
> path_openat+0x1395/0x3860 fs/namei.c:4855
> do_file_open+0x23e/0x4a0 fs/namei.c:4887
> do_sys_openat2+0x113/0x200 fs/open.c:1364
> do_sys_open fs/open.c:1370 [inline]
> __do_sys_openat fs/open.c:1386 [inline]
> __se_sys_openat fs/open.c:1381 [inline]
> __x64_sys_openat+0x138/0x170 fs/open.c:1381
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f922219ce59
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f9223137028 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
> RAX: ffffffffffffffda RBX: 00007f9222415fa0 RCX: 00007f922219ce59
> RDX: 0000000000042042 RSI: 0000200000000080 RDI: 0000000000000004
> RBP: 00007f9222232d6f R08: 0000000000000000 R09: 0000000000000000
> R10: 000000000000014a R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f9222416038 R14: 00007f9222415fa0 R15: 00007ffd01a2d448
> </TASK>
>
> Allocated by task 5484:
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> unpoison_slab_object mm/kasan/common.c:340 [inline]
> __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
> kasan_slab_alloc include/linux/kasan.h:253 [inline]
> slab_post_alloc_hook mm/slub.c:4570 [inline]
> slab_alloc_node mm/slub.c:4899 [inline]
> kmem_cache_alloc_node_noprof+0x384/0x690 mm/slub.c:4951
> kmalloc_reserve net/core/skbuff.c:613 [inline]
> __alloc_skb+0x27d/0x7d0 net/core/skbuff.c:713
> alloc_skb include/linux/skbuff.h:1385 [inline]
> nlmsg_new include/net/netlink.h:1055 [inline]
> mpls_netconf_notify_devconf+0x46/0x100 net/mpls/af_mpls.c:1217
> mpls_dev_notify+0xb2d/0xd10 net/mpls/af_mpls.c:1691
> notifier_call_chain+0x1ad/0x3d0 kernel/notifier.c:85
> call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
> call_netdevice_notifiers net/core/dev.c:2301 [inline]
> unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12421
> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
> process_one_work kernel/workqueue.c:3314 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
> kthread+0x389/0x470 kernel/kthread.c:436
> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> Freed by task 5484:
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
> poison_slab_object mm/kasan/common.c:253 [inline]
> __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
> kasan_slab_free include/linux/kasan.h:235 [inline]
> slab_free_hook mm/slub.c:2689 [inline]
> slab_free mm/slub.c:6251 [inline]
> kfree+0x1c5/0x640 mm/slub.c:6566
> skb_kfree_head net/core/skbuff.c:1075 [inline]
> skb_free_head net/core/skbuff.c:1087 [inline]
> skb_release_data+0x828/0xa60 net/core/skbuff.c:1114
> skb_release_all net/core/skbuff.c:1189 [inline]
> __kfree_skb+0x5d/0x210 net/core/skbuff.c:1203
> netlink_broadcast_filtered+0xe18/0xf20 net/netlink/af_netlink.c:1540
> nlmsg_multicast_filtered include/net/netlink.h:1165 [inline]
> nlmsg_multicast include/net/netlink.h:1184 [inline]
> nlmsg_notify+0xf0/0x1a0 net/netlink/af_netlink.c:2598
> mpls_dev_notify+0xb2d/0xd10 net/mpls/af_mpls.c:1691
> notifier_call_chain+0x1ad/0x3d0 kernel/notifier.c:85
> call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
> call_netdevice_notifiers net/core/dev.c:2301 [inline]
> unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12421
> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
> process_one_work kernel/workqueue.c:3314 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
> kthread+0x389/0x470 kernel/kthread.c:436
> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> The buggy address belongs to the object at ffff888114d8c000
> which belongs to the cache skbuff_small_head of size 704
> The buggy address is located 69 bytes inside of
> freed 704-byte region [ffff888114d8c000, ffff888114d8c2c0)
>
> The buggy address belongs to the physical page:
> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x114d8c
> head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> flags: 0x17ff00000000040(head|node=0|zone=2|lastcpupid=0x7ff)
> page_type: f5(slab)
> raw: 017ff00000000040 ffff888160416b40 dead000000000100 dead000000000122
> raw: 0000000000000000 0000000800120012 00000000f5000000 0000000000000000
> head: 017ff00000000040 ffff888160416b40 dead000000000100 dead000000000122
> head: 0000000000000000 0000000800120012 00000000f5000000 0000000000000000
> head: 017ff00000000002 ffffffffffffff01 00000000ffffffff 00000000ffffffff
> head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000004
> page dumped because: kasan: bad access detected
> page_owner tracks the page as allocated
> page last allocated via order 2, migratetype Unmovable, gfp_mask
> 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC),
> pid 5484, tgid 5484 (kworker/u8:2), ts 72573003529, free_ts 72546506446
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
> prep_new_page mm/page_alloc.c:1861 [inline]
> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
> alloc_slab_page mm/slub.c:3278 [inline]
> allocate_slab+0x77/0x660 mm/slub.c:3467
> new_slab mm/slub.c:3525 [inline]
> refill_objects+0x339/0x3d0 mm/slub.c:7272
> refill_sheaf mm/slub.c:2816 [inline]
> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
> alloc_from_pcs mm/slub.c:4750 [inline]
> slab_alloc_node mm/slub.c:4884 [inline]
> kmem_cache_alloc_node_noprof+0x441/0x690 mm/slub.c:4951
> kmalloc_reserve net/core/skbuff.c:613 [inline]
> __alloc_skb+0x27d/0x7d0 net/core/skbuff.c:713
> alloc_skb include/linux/skbuff.h:1385 [inline]
> nlmsg_new include/net/netlink.h:1055 [inline]
> mpls_netconf_notify_devconf+0x46/0x100 net/mpls/af_mpls.c:1217
> mpls_dev_notify+0xb2d/0xd10 net/mpls/af_mpls.c:1691
> notifier_call_chain+0x1ad/0x3d0 kernel/notifier.c:85
> call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
> call_netdevice_notifiers net/core/dev.c:2301 [inline]
> unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12421
> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
> process_one_work kernel/workqueue.c:3314 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
> page last free pid 5484 tgid 5484 stack trace:
> reset_page_owner include/linux/page_owner.h:25 [inline]
> __free_pages_prepare mm/page_alloc.c:1397 [inline]
> __free_frozen_pages+0xc1c/0xd30 mm/page_alloc.c:2938
> stack_depot_save_flags+0x40e/0x810 lib/stackdepot.c:735
> kasan_save_stack mm/kasan/common.c:58 [inline]
> kasan_save_track+0x4f/0x80 mm/kasan/common.c:78
> unpoison_slab_object mm/kasan/common.c:340 [inline]
> __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
> kasan_slab_alloc include/linux/kasan.h:253 [inline]
> slab_post_alloc_hook mm/slub.c:4570 [inline]
> slab_alloc_node mm/slub.c:4899 [inline]
> kmem_cache_alloc_noprof+0x2bc/0x650 mm/slub.c:4906
> kmem_alloc_batch lib/debugobjects.c:371 [inline]
> fill_pool+0x156/0x580 lib/debugobjects.c:420
> debug_objects_fill_pool lib/debugobjects.c:752 [inline]
> debug_object_activate+0x4a3/0x580 lib/debugobjects.c:841
> debug_rcu_head_queue kernel/rcu/rcu.h:236 [inline]
> __call_rcu_common kernel/rcu/tree.c:3116 [inline]
> call_rcu+0x43/0x890 kernel/rcu/tree.c:3251
> kernfs_put+0x259/0x520 fs/kernfs/dir.c:618
> kernfs_remove_by_name_ns+0xc8/0x140 fs/kernfs/dir.c:1799
> device_remove_class_symlinks+0x178/0x190 drivers/base/core.c:3479
> device_del+0x400/0x8f0 drivers/base/core.c:3881
> unregister_netdevice_many_notify+0x1d5f/0x22c0 net/core/dev.c:12456
> ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
> ops_undo_list+0x3d3/0x940 net/core/net_namespace.c:248
> cleanup_net+0x56b/0x800 net/core/net_namespace.c:702
> process_one_work kernel/workqueue.c:3314 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
>
> Memory state around the buggy address:
> ffff888114d8bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffff888114d8bf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >ffff888114d8c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff888114d8c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888114d8c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
>
>
> ***
>
> KASAN: slab-use-after-free Read in ext4_inlinedir_to_tree
>
> tree: torvalds
> URL:
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
> arch: amd64
> compiler: Debian clang version 21.1.8
> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> config:
> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
> syz repro:
> https://ci.syzbot.org/findings/f42da242-e16e-4f10-bf25-0bd7e192d989/syz_repro
>
> loop0: lost filesystem error report for type 5 error -117
> EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000
> r/w without journal. Quota mode: none.
> ==================================================================
> BUG: KASAN: slab-use-after-free in ext4_dirent_get_data_len
> fs/ext4/ext4.h:4069 [inline]
> BUG: KASAN: slab-use-after-free in ext4_dir_entry_len fs/ext4/ext4.h:4096
> [inline]
> BUG: KASAN: slab-use-after-free in ext4_inlinedir_to_tree+0x94c/0x10d0
> fs/ext4/inline.c:1335
> Read of size 1 at addr ffff88816fee8825 by task syz.0.20/5867
>
> CPU: 1 UID: 0 PID: 5867 Comm: syz.0.20 Not tainted syzkaller #0
> PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.2-debian-1.16.2-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
> print_report+0x58/0x70 mm/kasan/report.c:482
> kasan_report+0x117/0x150 mm/kasan/report.c:595
> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
> ext4_inlinedir_to_tree+0x94c/0x10d0 fs/ext4/inline.c:1335
> ext4_htree_fill_tree+0x517/0x1230 fs/ext4/namei.c:1182
> ext4_dx_readdir fs/ext4/dir.c:600 [inline]
> ext4_readdir+0x2db4/0x3640 fs/ext4/dir.c:146
> iterate_dir+0x399/0x570 fs/readdir.c:110
> __do_sys_getdents fs/readdir.c:319 [inline]
> __se_sys_getdents+0xf1/0x270 fs/readdir.c:304
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f010ad9ce59
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f010bc0f028 EFLAGS: 00000246 ORIG_RAX: 000000000000004e
> RAX: ffffffffffffffda RBX: 00007f010b015fa0 RCX: 00007f010ad9ce59
> RDX: 0000000000000054 RSI: 0000000000000000 RDI: 0000000000000004
> RBP: 00007f010ae32d6f R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f010b016038 R14: 00007f010b015fa0 R15: 00007ffd93577348
> </TASK>
>
> Allocated by task 5064:
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
> __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
> kasan_kmalloc include/linux/kasan.h:263 [inline]
> __do_kmalloc_node mm/slub.c:5296 [inline]
> __kmalloc_noprof+0x35c/0x760 mm/slub.c:5308
> kmalloc_noprof include/linux/slab.h:954 [inline]
> kzalloc_noprof include/linux/slab.h:1188 [inline]
> tomoyo_encode2 security/tomoyo/realpath.c:45 [inline]
> tomoyo_encode+0x28b/0x550 security/tomoyo/realpath.c:80
> tomoyo_realpath_from_path+0x58d/0x5d0 security/tomoyo/realpath.c:283
> tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
> tomoyo_path_perm+0x283/0x560 security/tomoyo/file.c:827
> security_inode_getattr+0x12b/0x310 security/security.c:1895
> vfs_getattr fs/stat.c:259 [inline]
> vfs_fstat fs/stat.c:281 [inline]
> vfs_fstatat+0xb4/0x170 fs/stat.c:371
> __do_sys_newfstatat fs/stat.c:538 [inline]
> __se_sys_newfstatat fs/stat.c:532 [inline]
> __x64_sys_newfstatat+0x151/0x200 fs/stat.c:532
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Freed by task 5064:
> kasan_save_stack mm/kasan/common.c:57 [inline]
> kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
> poison_slab_object mm/kasan/common.c:253 [inline]
> __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
> kasan_slab_free include/linux/kasan.h:235 [inline]
> slab_free_hook mm/slub.c:2689 [inline]
> slab_free mm/slub.c:6251 [inline]
> kfree+0x1c5/0x640 mm/slub.c:6566
> tomoyo_path_perm+0x403/0x560 security/tomoyo/file.c:847
> security_inode_getattr+0x12b/0x310 security/security.c:1895
> vfs_getattr fs/stat.c:259 [inline]
> vfs_fstat fs/stat.c:281 [inline]
> vfs_fstatat+0xb4/0x170 fs/stat.c:371
> __do_sys_newfstatat fs/stat.c:538 [inline]
> __se_sys_newfstatat fs/stat.c:532 [inline]
> __x64_sys_newfstatat+0x151/0x200 fs/stat.c:532
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> The buggy address belongs to the object at ffff88816fee8800
> which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 37 bytes inside of
> freed 64-byte region [ffff88816fee8800, ffff88816fee8840)
>
> The buggy address belongs to the physical page:
> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x16fee8
> flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff)
> page_type: f5(slab)
> raw: 057ff00000000000 ffff8881000418c0 dead000000000100 dead000000000122
> raw: 0000000000000000 0000000800200020 00000000f5000000 0000000000000000
> page dumped because: kasan: bad access detected
> page_owner tracks the page as allocated
> page last allocated via order 0, migratetype Unmovable, gfp_mask
> 0xd2cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC),
> pid 1, tgid 1 (swapper/0), ts 21294026082, free_ts 0
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
> prep_new_page mm/page_alloc.c:1861 [inline]
> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
> alloc_slab_page mm/slub.c:3278 [inline]
> allocate_slab+0x77/0x660 mm/slub.c:3467
> new_slab mm/slub.c:3525 [inline]
> refill_objects+0x339/0x3d0 mm/slub.c:7272
> refill_sheaf mm/slub.c:2816 [inline]
> __pcs_replace_empty_main+0x321/0x720 mm/slub.c:4652
> alloc_from_pcs mm/slub.c:4750 [inline]
> slab_alloc_node mm/slub.c:4884 [inline]
> __do_kmalloc_node mm/slub.c:5295 [inline]
> __kmalloc_noprof+0x474/0x760 mm/slub.c:5308
> kmalloc_noprof include/linux/slab.h:954 [inline]
> kzalloc_noprof include/linux/slab.h:1188 [inline]
> handler_new_ref+0x261/0x9c0 drivers/media/v4l2-core/v4l2-ctrls-core.c:1882
> v4l2_ctrl_add_handler+0x19f/0x290
> drivers/media/v4l2-core/v4l2-ctrls-core.c:2443
> vivid_create_controls+0x332d/0x3bd0
> drivers/media/test-drivers/vivid/vivid-ctrls.c:2072
> vivid_create_instance drivers/media/test-drivers/vivid/vivid-core.c:1933
> [inline]
> vivid_probe+0x4261/0x72b0
> drivers/media/test-drivers/vivid/vivid-core.c:2095
> platform_probe+0xf9/0x190 drivers/base/platform.c:1432
> call_driver_probe drivers/base/dd.c:-1 [inline]
> really_probe+0x267/0xaf0 drivers/base/dd.c:709
> __driver_probe_device+0x1ef/0x380 drivers/base/dd.c:871
> driver_probe_device+0x4f/0x240 drivers/base/dd.c:901
> __driver_attach+0x34c/0x640 drivers/base/dd.c:1295
> page_owner free stack trace missing
>
> Memory state around the buggy address:
> ffff88816fee8700: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
> ffff88816fee8780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
> >ffff88816fee8800: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ^
> ffff88816fee8880: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff88816fee8900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ==================================================================
>
>
> ***
>
> KASAN: use-after-free Read in __ext4_check_dir_entry
>
> tree: torvalds
> URL:
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
> base: 9716c086c8e8b141d35aa61f2e96a2e83de212a7
> arch: amd64
> compiler: Debian clang version 21.1.8
> (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> config:
> https://ci.syzbot.org/builds/ddf6ee7c-dfa8-4383-b004-10140edc081c/config
> syz repro:
> https://ci.syzbot.org/findings/57c0b75a-8922-4dc1-9a20-ca947564792b/syz_repro
>
> ==================================================================
> BUG: KASAN: use-after-free in ext4_dirent_get_data_len fs/ext4/ext4.h:4069
> [inline]
> BUG: KASAN: use-after-free in ext4_dir_entry_len fs/ext4/ext4.h:4096
> [inline]
> BUG: KASAN: use-after-free in __ext4_check_dir_entry+0x65a/0xc40
> fs/ext4/dir.c:96
> Read of size 1 at addr ffff88816be85045 by task syz.2.21/5880
>
> CPU: 1 UID: 0 PID: 5880 Comm: syz.2.21 Not tainted syzkaller #0
> PREEMPT(full)
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.2-debian-1.16.2-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> print_address_description+0x55/0x1e0 mm/kasan/report.c:378
> print_report+0x58/0x70 mm/kasan/report.c:482
> kasan_report+0x117/0x150 mm/kasan/report.c:595
> ext4_dirent_get_data_len fs/ext4/ext4.h:4069 [inline]
> ext4_dir_entry_len fs/ext4/ext4.h:4096 [inline]
> __ext4_check_dir_entry+0x65a/0xc40 fs/ext4/dir.c:96
> ext4_find_dest_de+0x136/0x770 fs/ext4/namei.c:2203
> ext4_add_dirent_to_inline+0xcf/0x430 fs/ext4/inline.c:984
> ext4_try_add_inline_entry+0x235/0x8e0 fs/ext4/inline.c:1213
> __ext4_add_entry+0x390/0x1f40 fs/ext4/namei.c:2529
> ext4_add_entry fs/ext4/namei.c:2613 [inline]
> ext4_add_nondir+0x111/0x310 fs/ext4/namei.c:2936
> ext4_create+0x2e9/0x470 fs/ext4/namei.c:2982
> lookup_open fs/namei.c:4511 [inline]
> open_last_lookups fs/namei.c:4611 [inline]
> path_openat+0x1395/0x3860 fs/namei.c:4855
> do_file_open+0x23e/0x4a0 fs/namei.c:4887
> do_sys_openat2+0x113/0x200 fs/open.c:1364
> do_sys_open fs/open.c:1370 [inline]
> __do_sys_openat fs/open.c:1386 [inline]
> __se_sys_openat fs/open.c:1381 [inline]
> __x64_sys_openat+0x138/0x170 fs/open.c:1381
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f5713b9ce59
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fff672b25f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
> RAX: ffffffffffffffda RBX: 00007f5713e15fa0 RCX: 00007f5713b9ce59
> RDX: 0000000000042042 RSI: 0000200000000080 RDI: 0000000000000004
> RBP: 00007f5713c32d6f R08: 0000000000000000 R09: 0000000000000000
> R10: 000000000000014a R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f5713e15fac R14: 00007f5713e15fa0 R15: 00007f5713e15fa0
> </TASK>
>
> The buggy address belongs to the physical page:
> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x16be85
> flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff)
> page_type: f0(buddy)
> raw: 057ff00000000000 ffffea0005afa0c8 ffffea0005afa1c8 0000000000000000
> raw: 0000000000000000 0000000000000000 00000000f0000000 0000000000000000
> page dumped because: kasan: bad access detected
> page_owner tracks the page as freed
> page last allocated via order 0, migratetype Unmovable, gfp_mask
> 0xcc0(GFP_KERNEL), pid 5630, tgid 5630 (syz-executor), ts 67290853657,
> free_ts 69321168948
> set_page_owner include/linux/page_owner.h:32 [inline]
> post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
> prep_new_page mm/page_alloc.c:1861 [inline]
> get_page_from_freelist+0x2593/0x2610 mm/page_alloc.c:3941
> __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
> __alloc_pages_noprof+0x10/0x100 mm/page_alloc.c:5255
> alloc_pages_bulk_noprof+0x5ff/0x7c0 mm/page_alloc.c:5175
> ___alloc_pages_bulk mm/kasan/shadow.c:345 [inline]
> __kasan_populate_vmalloc_do mm/kasan/shadow.c:370 [inline]
> __kasan_populate_vmalloc+0xc1/0x1d0 mm/kasan/shadow.c:424
> kasan_populate_vmalloc include/linux/kasan.h:580 [inline]
> alloc_vmap_area+0xd47/0x1480 mm/vmalloc.c:2123
> __get_vm_area_node+0x1f8/0x300 mm/vmalloc.c:3226
> __vmalloc_node_range_noprof+0x36a/0x1750 mm/vmalloc.c:4024
> vmalloc_user_noprof+0xad/0xe0 mm/vmalloc.c:4218
> kcov_ioctl+0x55/0x620 kernel/kcov.c:726
> vfs_ioctl fs/ioctl.c:51 [inline]
> __do_sys_ioctl fs/ioctl.c:597 [inline]
> __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> page last free pid 5693 tgid 5693 stack trace:
> reset_page_owner include/linux/page_owner.h:25 [inline]
> __free_pages_prepare mm/page_alloc.c:1397 [inline]
> __free_frozen_pages+0xc1c/0xd30 mm/page_alloc.c:2938
> kasan_depopulate_vmalloc_pte+0x6d/0x90 mm/kasan/shadow.c:484
> apply_to_pte_range mm/memory.c:3338 [inline]
> apply_to_pmd_range mm/memory.c:3382 [inline]
> apply_to_pud_range mm/memory.c:3418 [inline]
> apply_to_p4d_range mm/memory.c:3454 [inline]
> __apply_to_page_range+0xbdc/0x1420 mm/memory.c:3490
> __kasan_release_vmalloc+0xa2/0xd0 mm/kasan/shadow.c:602
> kasan_release_vmalloc include/linux/kasan.h:593 [inline]
> kasan_release_vmalloc_node mm/vmalloc.c:2284 [inline]
> purge_vmap_node+0x220/0x960 mm/vmalloc.c:2306
> __purge_vmap_area_lazy+0x779/0xb40 mm/vmalloc.c:2396
> drain_vmap_area_work+0x27/0x40 mm/vmalloc.c:2430
> process_one_work kernel/workqueue.c:3314 [inline]
> process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
> kthread+0x389/0x470 kernel/kthread.c:436
> ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> Memory state around the buggy address:
> ffff88816be84f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffff88816be84f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >ffff88816be85000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> ^
> ffff88816be85080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> ffff88816be85100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> ==================================================================
>
>
> ***
>
> If these findings have caused you to resend the series or submit a
> separate fix, please add the following tag to your commit message:
> Tested-by: syzbot@syzkaller.appspotmail.com
>
> ---
> This report is generated by a bot. It may contain errors.
> syzbot ci engineers can be reached at syzkaller@googlegroups.com.
>
> To test a patch for this bug, please reply with `#syz test`
> (should be on a separate line).
>
> The patch should be attached to the email.
> Note: arguments like custom git repos and branches are not supported.
>
>
[-- Attachment #1.2: Type: text/html, Size: 49607 bytes --]
[-- Attachment #2: dirdata-syzbot-fix.patch --]
[-- Type: application/octet-stream, Size: 11438 bytes --]
From e3d5c74f1ec0fbefb9a4b9193a474614b98d640a Mon Sep 17 00:00:00 2001
From: Artem Blagodarenko <artem.blagodarenko@gmail.com>
Date: Fri, 19 Jun 2026 09:48:12 -0400
Subject: [PATCH] ext4: fix issues reported by syzbot/CI on the dirdata series
Address the following issues found by automated review of the v2
dirdata patch series:
- dx_get_dx_info() called ext4_dir_entry_len() with dir hardcoded to
NULL, forcing its blocksize fallback to 4096 regardless of the real
filesystem blocksize, and never validated that the computed offset
stayed within the block. Thread the real inode through and reject
out-of-bounds results.
- get_dx_countlimit() had the same NULL-dir blocksize-fallback bug at
a separate call site; pass the real inode through there too.
- ext4_dirdata_get() declared a local "dfid" inside the
EXT4_DIRENT_LUFID branch that shadowed the function's own "dfid"
output parameter, so the LUFID copy never reached the caller's
buffer. Rename the local and copy into the real parameter. Also,
both ext4_dirdata_get() and ext4_dirdata_set() compared offsets
against the raw on-disk de->rec_len instead of decoding it via
ext4_rec_len_from_disk(), which is wrong on big-endian hosts and
mishandles the "0/65535 means full block" sentinel.
- ext4_dirdata_set_lufid() (EXT4_IOC_SET_LUFID) deleted the existing
directory entry and then tried to re-add it with the new LUFID
data; if ext4_add_entry() failed, the inode was left with no
directory entry pointing at it. On failure, attempt to restore the
original entry, and loudly flag inode corruption if that also
fails.
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@gmail.com>
---
fs/ext4/namei.c | 105 +++++++++++++++++++++++++++++++++++-------------
1 file changed, 78 insertions(+), 27 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 65c53c08213a..e6f54dba735e 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -412,7 +412,7 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
if (le16_to_cpu(de->rec_len) != (blocksize - rlen))
return NULL;
/* de->rec_len covers whole dx_root block, calculate actual length */
- dotdot_rec_len = ext4_dir_entry_len(de, NULL);
+ dotdot_rec_len = ext4_dir_entry_len(de, inode);
root = (struct dx_root_info *)(((char *)de + dotdot_rec_len));
if (root->reserved_zero ||
root->info_length != sizeof(struct dx_root_info))
@@ -520,13 +520,20 @@ ext4_next_entry(struct ext4_dir_entry_2 *p, unsigned long blocksize)
* Future: use high four bits of block for coalesce-on-delete flags
* Mask them off for now.
*/
-static struct dx_root_info *dx_get_dx_info(void *de_buf)
+static struct dx_root_info *dx_get_dx_info(struct inode *dir, void *de_buf)
{
+ unsigned int blocksize = dir->i_sb->s_blocksize;
+ void *base = de_buf;
+
/* get dotdot first */
- de_buf += ext4_dir_entry_len(de_buf, NULL);
+ de_buf += ext4_dir_entry_len(de_buf, dir);
/* dx root info is after dotdot entry */
- de_buf += ext4_dir_entry_len(de_buf, NULL);
+ de_buf += ext4_dir_entry_len(de_buf, dir);
+
+ if (de_buf < base || (char *)de_buf - (char *)base +
+ sizeof(struct dx_root_info) > blocksize)
+ return ERR_PTR(-EFSCORRUPTED);
return (struct dx_root_info *)de_buf;
}
@@ -577,7 +584,9 @@ static inline unsigned dx_root_limit(struct inode *dir,
struct dx_root_info *info;
unsigned int entry_space;
- info = dx_get_dx_info(dot_de);
+ info = dx_get_dx_info(dir, dot_de);
+ if (IS_ERR(info))
+ return 0;
entry_space = dir->i_sb->s_blocksize - ((char *)info - (char *)dot_de) -
info->info_length;
@@ -793,7 +802,9 @@ dx_probe(struct ext4_filename *fname, struct inode *dir,
if (IS_ERR(frame->bh))
return (struct dx_frame *) frame->bh;
- info = dx_get_dx_info((struct ext4_dir_entry_2 *)frame->bh->b_data);
+ info = dx_get_dx_info(dir, (struct ext4_dir_entry_2 *)frame->bh->b_data);
+ if (IS_ERR(info))
+ goto fail;
if (info->hash_version != DX_HASH_TEA &&
info->hash_version != DX_HASH_HALF_MD4 &&
info->hash_version != DX_HASH_LEGACY &&
@@ -938,7 +949,7 @@ dx_probe(struct ext4_filename *fname, struct inode *dir,
return ret_err;
}
-static void dx_release(struct dx_frame *frames)
+static void dx_release(struct inode *dir, struct dx_frame *frames)
{
struct dx_root_info *info;
int i;
@@ -947,7 +958,9 @@ static void dx_release(struct dx_frame *frames)
if (frames[0].bh == NULL)
return;
- info = dx_get_dx_info((struct ext4_dir_entry_2 *)frames[0].bh->b_data);
+ info = dx_get_dx_info(dir, (struct ext4_dir_entry_2 *)frames[0].bh->b_data);
+ if (IS_ERR(info))
+ return;
/* save local copy, "info" may be freed after brelse() */
indirect_levels = info->indirect_levels;
for (i = 0; i <= indirect_levels; i++) {
@@ -1253,12 +1266,12 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash,
(count && ((hashval & 1) == 0)))
break;
}
- dx_release(frames);
+ dx_release(dir, frames);
dxtrace(printk(KERN_DEBUG "Fill tree: returned %d entries, "
"next hash: %x\n", count, *next_hash));
return count;
errout:
- dx_release(frames);
+ dx_release(dir, frames);
return (err);
}
@@ -1296,8 +1309,10 @@ unsigned char ext4_dirdata_get(struct ext4_dir_entry_2 *de, struct inode *dir,
{
unsigned char ret = 0;
unsigned int data_offset = de->name_len + 1;
+ unsigned int rec_len = ext4_rec_len_from_disk(de->rec_len,
+ dir->i_sb->s_blocksize);
- if (data_offset > de->rec_len)
+ if (data_offset > rec_len)
return ret;
/* compatibility: hash stored inline after filename (no dirdata) */
@@ -1312,19 +1327,20 @@ unsigned char ext4_dirdata_get(struct ext4_dir_entry_2 *de, struct inode *dir,
/* EXT4_DIRENT_* are not expected without flag in i_sb */
if (de->file_type & EXT4_DIRENT_LUFID) {
- struct ext4_dirent_fid *dfid =
+ struct ext4_dirent_fid *disk_fid =
(struct ext4_dirent_fid *)(de->name + data_offset);
unsigned int dlen;
- if (data_offset + sizeof(dfid->df_header) > de->rec_len)
+ if (data_offset + sizeof(disk_fid->df_header) > rec_len)
return ret;
- dlen = dfid->df_header.ddh_length;
- if (dlen < sizeof(*dfid) || data_offset + dlen > de->rec_len)
+ dlen = disk_fid->df_header.ddh_length;
+ if (dlen < sizeof(*disk_fid) || data_offset + dlen > rec_len)
return ret;
if (dfid) {
- memcpy(dfid, dfid->df_fid, dfid->df_header.ddh_length);
+ memcpy(dfid, disk_fid->df_fid,
+ disk_fid->df_header.ddh_length);
ret |= EXT4_DIRENT_LUFID;
}
data_offset += dlen;
@@ -1336,11 +1352,11 @@ unsigned char ext4_dirdata_get(struct ext4_dir_entry_2 *de, struct inode *dir,
(struct ext4_dirent_data_header *)(de->name + data_offset);
unsigned int dlen;
- if (data_offset + sizeof(*ddh) > de->rec_len)
+ if (data_offset + sizeof(*ddh) > rec_len)
return ret;
dlen = ddh->ddh_length;
- if (dlen < sizeof(*ddh) || data_offset + dlen > de->rec_len)
+ if (dlen < sizeof(*ddh) || data_offset + dlen > rec_len)
return ret;
data_offset += dlen;
@@ -1355,7 +1371,7 @@ unsigned char ext4_dirdata_get(struct ext4_dir_entry_2 *de, struct inode *dir,
unsigned int dlen;
dlen = dh->dh_header.ddh_length;
- if (dlen < sizeof(*dh) || data_offset + dlen > de->rec_len)
+ if (dlen < sizeof(*dh) || data_offset + dlen > rec_len)
return ret;
hinfo->hash = le32_to_cpu(dh->dh_hash.hash);
@@ -1383,12 +1399,14 @@ static void ext4_dirdata_set(struct ext4_dir_entry_2 *de, struct inode *dir,
{
struct dx_hash_info *hinfo = &fname->hinfo;
unsigned int data_offset = de->name_len + 1;
+ unsigned int rec_len = ext4_rec_len_from_disk(de->rec_len,
+ dir->i_sb->s_blocksize);
if (dfid) {
unsigned int dlen = dfid->df_header.ddh_length;
- if (data_offset + dlen > de->rec_len) {
+ if (data_offset + dlen > rec_len) {
EXT4_ERROR_INODE(dir, "Can not insert FID");
return;
}
@@ -1406,7 +1424,7 @@ static void ext4_dirdata_set(struct ext4_dir_entry_2 *de, struct inode *dir,
struct ext4_dirent_hash *dh =
(struct ext4_dirent_hash *)(de->name + data_offset);
- if (data_offset + sizeof(*dh) > de->rec_len) {
+ if (data_offset + sizeof(*dh) > rec_len) {
EXT4_ERROR_INODE(dir, "Can not insert dhash dirdata");
return;
}
@@ -1418,7 +1436,7 @@ static void ext4_dirdata_set(struct ext4_dir_entry_2 *de, struct inode *dir,
} else {
/* Compatibility: store hash inline after filename */
if (data_offset + sizeof(struct ext4_dir_entry_hash) >
- de-> rec_len) {
+ rec_len) {
EXT4_ERROR_INODE(dir, "Can not insert dhash");
return;
}
@@ -1906,7 +1924,7 @@ static struct buffer_head * ext4_dx_find_entry(struct inode *dir,
errout:
dxtrace(printk(KERN_DEBUG "%s not found\n", fname->usr_fname->name));
success:
- dx_release(frames);
+ dx_release(dir, frames);
return bh;
}
@@ -2425,7 +2443,12 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
blocksize);
/* initialize hashing info */
- dx_info = dx_get_dx_info(dot_de);
+ dx_info = dx_get_dx_info(dir, dot_de);
+ if (IS_ERR(dx_info)) {
+ brelse(bh2);
+ brelse(bh);
+ return PTR_ERR(dx_info);
+ }
memset(dx_info, 0, sizeof(*dx_info));
dx_info->info_length = sizeof(*dx_info);
if (ext4_hash_in_dirent(dir))
@@ -2483,7 +2506,7 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
*/
if (retval)
ext4_mark_inode_dirty(handle, dir);
- dx_release(frames);
+ dx_release(dir, frames);
brelse(bh2);
return retval;
}
@@ -2759,8 +2782,13 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
/* Set up root */
dx_set_count(entries, 1);
dx_set_block(entries + 0, newblock);
- info = dx_get_dx_info((struct ext4_dir_entry_2 *)
+ info = dx_get_dx_info(dir, (struct ext4_dir_entry_2 *)
frames[0].bh->b_data);
+ if (IS_ERR(info)) {
+ err = PTR_ERR(info);
+ brelse(bh2);
+ goto journal_error;
+ }
info->indirect_levels += 1;
dxtrace(printk(KERN_DEBUG
"Creating %d level index...\n",
@@ -2788,7 +2816,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
ext4_std_error(dir->i_sb, err); /* this is a no-op if err == 0 */
cleanup:
brelse(bh);
- dx_release(frames);
+ dx_release(dir, frames);
/* @restart is true means htree-path has been changed, we need to
* repeat dx_probe() to find out valid htree-path
*/
@@ -4463,6 +4491,29 @@ int ext4_dirdata_set_lufid(struct inode *dir, const char *filename,
}
EXT4_I(inode)->i_dirdata = old_dirdata;
+ if (err) {
+ /*
+ * The original entry was already removed above and the
+ * re-add with the new LUFID failed; try to restore the
+ * original entry so the inode isn't left without any
+ * directory entry pointing at it.
+ */
+ struct dentry parent_dentry = { .d_inode = dir };
+ struct dentry orig_dentry = {
+ .d_name = d_name,
+ .d_parent = &parent_dentry,
+ .d_inode = inode,
+ };
+ int rollback_err = ext4_add_entry(handle, &orig_dentry, inode);
+
+ if (rollback_err)
+ EXT4_ERROR_INODE(dir,
+ "Failed to set LUFID on '%.*s' (err=%d) and failed to restore the original directory entry (err=%d); inode %llu may be orphaned",
+ namelen, filename, err, rollback_err,
+ inode->i_ino);
+ goto out_unlock;
+ }
+
/* Update inode times */
inode_set_ctime_current(dir);
inode_inc_iversion(dir);
--
2.43.7
^ permalink raw reply related
* Re: [PATCH 0/2] fs: refactor code to use clear_and_wake_up_bit()
From: Christian Brauner @ 2026-06-19 13:34 UTC (permalink / raw)
To: linux-fsdevel, linux-ext4, linux-kernel, Jan Kara, shuo chen,
Theodore Ts'o, linux-kernel-mentees, shuah, patch-reply,
Agatha Isabelle Moreira
In-Reply-To: <ag4PEP52c8rxrYPc@guidai>
On Wed, 20 May 2026 16:45:35 -0300, Agatha Isabelle Moreira wrote:
> fs: refactor code to use clear_and_wake_up_bit()
>
> Refactor code to use `clear_and_wake_up_bit()` instead of manual calls
> to:
> clear_bit_unlock();
> smp_mb__after_atomic();
> wake_up_bit();
>
> [...]
Applied to the vfs-7.3.misc branch of the vfs/vfs.git tree.
Patches in the vfs-7.3.misc branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-7.3.misc
[1/2] fs: buffer: use clear_and_wake_up_bit() in unlock_buffer()
https://git.kernel.org/vfs/vfs/c/1a6e4692deca
[2/2] fs: jbd2: use clear_and_wake_up_bit() in journal_end_buffer_io_sync()
https://git.kernel.org/vfs/vfs/c/8efd38683c81
^ permalink raw reply
* Re: [PATCH v10 03/22] ovl: use core fsverity ensure info interface
From: Amir Goldstein @ 2026-06-19 7:28 UTC (permalink / raw)
To: Eric Biggers
Cc: Andrey Albershteyn, linux-xfs, fsverity, linux-fsdevel, hch,
linux-ext4, linux-f2fs-devel, linux-btrfs, linux-unionfs, djwong
In-Reply-To: <20260520190719.GB3424023@google.com>
On Wed, May 20, 2026 at 9:07 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, May 20, 2026 at 02:37:01PM +0200, Andrey Albershteyn wrote:
> > fsverity now exposes fsverity_ensure_verity_info() which could be used
> > instead of opening file to ensure that fsverity info is loaded and
> > attached to inode.
> >
> > Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
> > Acked-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> > fs/overlayfs/util.c | 14 +++-----------
> > 1 file changed, 3 insertions(+), 11 deletions(-)
>
> Reviewed-by: Eric Biggers <ebiggers@kernel.org>
>
> I'm still confused by the new implementation of fsverity_active() that
> got introduced by "fsverity: use a hashtable to find the fsverity_info",
> though. I should have caught this during review of that commit. For
> one its comment is outdated, but also the memory barrier seems to be
> specific to the fsverity_get_info() caller and probably should be moved
> to there. Anyway, that's not directly related to this patch.
Eric, Andrey,
Did you see the Sashiko review for this patch and others in this series?
https://sashiko.dev/#/patchset/20260520123722.405752-1-aalbersh%40kernel.org
It annotated some review comments as high and critical.
For this patch it is about interaction with fscrypt.
Please take a look and say if this is concerning or false positive.
Thanks,
Amir.
^ permalink raw reply
* Re: [PATCH v7 3/4] ext4: introduce ext4_put_ea_inode() for safe deferred iput
From: Zhou, Yun @ 2026-06-19 6:24 UTC (permalink / raw)
To: Jan Kara
Cc: tytso, adilger.kernel, libaokun, ojaswin, ritesh.list, yi.zhang,
linux-ext4, linux-kernel
In-Reply-To: <jxcbsd2ot63wy3dcoximemkuitwoqn2a7jgxcsfdwaf5q3ecdu@sahahqqopo6y>
On 6/18/2026 2:42 AM, Jan Kara wrote:
> On Tue 16-06-26 23:15:57, Yun Zhou wrote:
>> +
>> + /* Deferred iput for EA inodes to avoid lock ordering issues */
>> + struct llist_head s_ea_inode_to_free;
>> + struct work_struct s_ea_inode_work;
>> +
>
> I'd probably use delayed work and schedule it with a delay of one jiffie so
> that some inodes can accumulate before we process them which should reduce
> the amount of task switching to workqueues.
>
Good idea, I will use delayed_work in next version.
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 6a77db4d3124..b777bb0a81ea 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -1308,6 +1308,9 @@ static void ext4_put_super(struct super_block *sb)
>> destroy_workqueue(sbi->rsv_conversion_wq);
>> ext4_release_orphan_info(sb);
>>
>> + /* Flush deferred EA inode iputs before destroying journal */
>> + flush_work(&sbi->s_ea_inode_work);
>> +
>
> This should happen earlier in ext4_put_super(). At this place quotas were
> already turned off and so quota accounting would go wrong.
That makes sense. I'll move it up to right before ext4_quotas_off().
>> +static void ext4_xattr_inode_array_free_deferred(struct super_block *sb,
>> + struct ext4_xattr_inode_array *array)
>
> The array of EA inodes used in xattr handling is just another mechanism
> used for delaying iput() of EA inodes. It doesn't make sense to stack these
> to one on top of another. Just completely replace the array mechanism with
> always deferring iput of EA inode into the workqueue.
>
I'm thinking that a complete replacement might be too large a change.
Should we consider postponing this work, or perhaps appending a new
patch to this series to handle it?
>
> Allocating ext4_ea_iput_entry for dropping each inode is somewhat wasteful.
> I want to suggest another scheme (somewhat more involved but more efficient
> scheme):
>
> 1) Create a VFS helper bool iput_if_not_last(struct inode *inode) which
> drops inode reference if it is not the last one (and returns true in that
> case). Basically:
>
> bool iput_if_not_last(struct inode *inode)
> {
> return atomic_add_unless(&inode->i_count, -1, 1);
> }
>
> This needs to be a separate patch as it should get vetting from VFS
> maintainers.
>
> 2) Use iput_if_not_last() in ext4_put_ea_inode(). If it returns true, we
> are done. Otherwise we know we were at least for a moment holders of the
> last inode reference, so we link the inode to the list of inodes to drop
> through llist_node embedded in ext4_inode_info. We cannot race with anybody
> else trying to link the same inode into the list because we hold one inode
> ref and so nobody else can hit this "I was holding the last ref" path.
> I'd union this llist_node say with xattr_sem which is unused for EA inodes
> to avoid growing ext4_inode_info.
>
> This way we avoid offloading unless really necessary and we don't have to
> do allocations just to drop EA inode ref.
>
Your idea makes a lot of sense. It greatly simplifies the current deferred
iput logic and eliminates the risk of failing to allocate an entry during
an OOM. However, as you mentioned, getting the VFS maintainers to agree
might be quite challenging.
BR,
Yun
^ permalink raw reply
* WARNING: at ext4_check_map_extents_env, CPU: syz.NUM.NUM/ADDR
From: sanan.hasanou @ 2026-06-18 22:26 UTC (permalink / raw)
To: tytso, adilger.kernel, linux-ext4, linux-kernel; +Cc: syzkaller, contact
Good day, dear maintainers,
We found a bug using a modified version of syzkaller.
Kernel Branch: 7.0-rc1
Kernel Config: <https://drive.google.com/open?id=173DLEAEPKPhhR1TcqofdnkLpdoK7PMFl>
Unfortunately, we don't have any reproducer for this bug yet.
Thank you!
Best regards,
Sanan Hasanov
EXT4-fs (loop7): stripe (65535) is not aligned with cluster size (16), stripe is disabled
[EXT4 FS bs=1024, gc=1, bpg=131072, ipg=32, mo=a840e11d, mo2=0002]
------------[ cut here ]------------
WARNING: at ext4_check_map_extents_env+0x471/0x510 fs/ext4/inode.c:436, CPU#1: syz.7.16867/107084
Modules linked in:
CPU: 1 UID: 0 PID: 107084 Comm: syz.7.16867 Not tainted 7.0.0-rc1 #1 PREEMPT(full)
Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:ext4_check_map_extents_env+0x471/0x510 fs/ext4/inode.c:436
Code: ff e9 89 fc ff ff 44 89 e1 80 e1 07 80 c1 03 38 c1 0f 8c 05 fd ff ff 4c 89 e7 e8 da d6 ae ff e9 f8 fc ff ff e8 60 a7 42 ff 90 <0f> 0b 90 e9 8a fc ff ff 44 89 e1 80 e1 07 80 c1 03 38 c1 0f 8c 25
RSP: 0018:ffffc900015376c8 EFLAGS: 00010283
RAX: ffffffff827faa70 RBX: 0000000000000000 RCX: 0000000000080000
RDX: ffffc900150f1000 RSI: 00000000000052eb RDI: 00000000000052ec
RBP: 0000000000000000 R08: ffff888034a03be7 R09: 1ffff1100694077c
R10: dffffc0000000000 R11: ffffed100694077d R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f2dfb24b6c0(0000) GS:ffff8880d99df000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2dfa45fb85 CR3: 0000000038fd7000 CR4: 00000000000006f0
Call Trace:
<TASK>
ext4_map_blocks+0x1e9/0x1540 fs/ext4/inode.c:721
ext4_protect_reserved_inode fs/ext4/block_validity.c:168 [inline]
ext4_setup_system_zone+0x872/0xa90 fs/ext4/block_validity.c:251
__ext4_fill_super fs/ext4/super.c:5594 [inline]
ext4_fill_super+0x534c/0x6390 fs/ext4/super.c:5791
get_tree_bdev_flags+0x3fe/0x4c0 fs/super.c:1694
vfs_get_tree+0x8e/0x290 fs/super.c:1754
fc_mount fs/namespace.c:1193 [inline]
do_new_mount_fc fs/namespace.c:3760 [inline]
do_new_mount+0x31f/0xd40 fs/namespace.c:3836
do_mount fs/namespace.c:4159 [inline]
__do_sys_mount fs/namespace.c:4348 [inline]
__se_sys_mount+0x3a1/0x4b0 fs/namespace.c:4325
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x19a/0x7b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f2dfa3a559e
Code: 0f 1f 40 00 48 c7 c2 b0 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f2dfb24ae28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f2dfb24aec0 RCX: 00007f2dfa3a559e
RDX: 0000200000000080 RSI: 0000200000000040 RDI: 00007f2dfb24ae80
RBP: 0000200000000080 R08: 00007f2dfb24aec0 R09: 0000000000000011
R10: 0000000000000011 R11: 0000000000000246 R12: 0000200000000040
R13: 00007f2dfb24ae80 R14: 000000000000060c R15: 0000200000000180
</TASK>
<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>
^ permalink raw reply
* WARNING: at ext4_check_map_extents_env, CPU: syz.NUM.NUM/NUM
From: sanan.hasanou @ 2026-06-18 22:24 UTC (permalink / raw)
To: tytso, adilger.kernel, linux-ext4, linux-kernel; +Cc: syzkaller, contact
Good day, dear maintainers,
We found a bug using a modified version of syzkaller.
Kernel Branch: 7.0-rc1
Kernel Config: <https://drive.google.com/open?id=173DLEAEPKPhhR1TcqofdnkLpdoK7PMFl>
Reproducer: <https://drive.google.com/open?id=1_mGsSS7wCRfk8qXdMEw08C6S49PejwXr>
Thank you!
Best regards,
Sanan Hasanov
EXT4-fs (loop0): stripe (65535) is not aligned with cluster size (16), stripe is disabled
[EXT4 FS bs=1024, gc=1, bpg=131072, ipg=32, mo=a840e11d, mo2=0002]
------------[ cut here ]------------
WARNING: at ext4_check_map_extents_env+0x471/0x510 fs/ext4/inode.c:436, CPU#0: syz.0.1217/21399
Modules linked in:
CPU: 0 UID: 0 PID: 21399 Comm: syz.0.1217 Not tainted 7.0.0-rc1 #1 PREEMPT(full)
Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:ext4_check_map_extents_env+0x471/0x510 fs/ext4/inode.c:436
Code: ff e9 89 fc ff ff 44 89 e1 80 e1 07 80 c1 03 38 c1 0f 8c 05 fd ff ff 4c 89 e7 e8 da d6 ae ff e9 f8 fc ff ff e8 60 a7 42 ff 90 <0f> 0b 90 e9 8a fc ff ff 44 89 e1 80 e1 07 80 c1 03 38 c1 0f 8c 25
RSP: 0018:ffffc90002ce76c8 EFLAGS: 00010287
RAX: ffffffff827faa70 RBX: 0000000000000000 RCX: 0000000000080000
RDX: ffffc900019f5000 RSI: 00000000000030b1 RDI: 00000000000030b2
RBP: 0000000000000000 R08: ffff888017201637 R09: 1ffff11002e402c6
R10: dffffc0000000000 R11: ffffed1002e402c7 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007fbe746086c0(0000) GS:ffff8880d98df000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe495821940 CR3: 00000000683d9000 CR4: 00000000000006f0
Call Trace:
<TASK>
ext4_map_blocks+0x1e9/0x1540 fs/ext4/inode.c:721
ext4_protect_reserved_inode fs/ext4/block_validity.c:168 [inline]
ext4_setup_system_zone+0x872/0xa90 fs/ext4/block_validity.c:251
__ext4_fill_super fs/ext4/super.c:5594 [inline]
ext4_fill_super+0x534c/0x6390 fs/ext4/super.c:5791
get_tree_bdev_flags+0x3fe/0x4c0 fs/super.c:1694
vfs_get_tree+0x8e/0x290 fs/super.c:1754
fc_mount fs/namespace.c:1193 [inline]
do_new_mount_fc fs/namespace.c:3760 [inline]
do_new_mount+0x31f/0xd40 fs/namespace.c:3836
do_mount fs/namespace.c:4159 [inline]
__do_sys_mount fs/namespace.c:4348 [inline]
__se_sys_mount+0x3a1/0x4b0 fs/namespace.c:4325
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x19a/0x7b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fbe737a559e
Code: 0f 1f 40 00 48 c7 c2 b0 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fbe74607e28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007fbe74607ec0 RCX: 00007fbe737a559e
RDX: 0000200000000080 RSI: 0000200000000040 RDI: 00007fbe74607e80
RBP: 0000200000000080 R08: 00007fbe74607ec0 R09: 0000000000000011
R10: 0000000000000011 R11: 0000000000000246 R12: 0000200000000040
R13: 00007fbe74607e80 R14: 000000000000060c R15: 0000200000000180
</TASK>
<<<<<<<<<<<<<<< tail report >>>>>>>>>>>>>>>
^ permalink raw reply
* [PATCH] fscrypt: Fix key setup in edge case with multiple data unit sizes
From: Eric Biggers @ 2026-06-18 18:06 UTC (permalink / raw)
To: linux-fscrypt
Cc: Theodore Ts'o, Jaegeuk Kim, linux-kernel, linux-fsdevel,
linux-ext4, linux-f2fs-devel, Eric Biggers, stable
The addition of support for customizable data unit sizes introduced an
edge case where a file's contents can be en/decrypted with the wrong
data unit size. It occurs when there are multiple v2 policies that:
- Have *different* data unit sizes, via the log2_data_unit_size field
- Share the same master_key_identifier, contents_encryption_mode, and
either FSCRYPT_POLICY_FLAG_DIRECT_KEY,
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32, or
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64
- Are being used on the same filesystem, which also must be mounted with
the "inlinecrypt" mount option.
Fortunately this edge case doesn't actually occur in practice. I just
found it via code review. But it needs to be fixed regardless.
The bug is caused by the data unit size not being fully considered when
blk_crypto_keys are cached in mk_direct_keys, mk_iv_ino_lblk_32_keys,
and mk_iv_ino_lblk_64_keys. They're differentiated only by master key,
encryption mode, and flag. However, each one actually has a data unit
size too. Only the first data unit size that is cached is used.
To fix this, start using the data unit size to differentiate the cached
keys. For several reasons, including avoiding increasing the size of
struct fscrypt_master_key, just replace all three arrays with a single
linked list instead of changing them into two-dimensional arrays. This
works well when considering that in practice at most 2 entries are used
across all three arrays, so it was already mostly wasted space.
For simplicity, make the list also take over the publish/subscribe of
the prepared key itself. That is, create separate list nodes for
blk_crypto_keys vs crypto_skciphers, and add nodes to the list only when
their key is actually prepared. (Note that the legacy
fscrypt_direct_keys table in fs/crypto/keysetup_v1.c already works this
way.) This eliminates the need for the additional memory barriers when
reading and writing the fields of struct fscrypt_prepared_key.
Note that I technically should have included the data unit size in the
HKDF info string as well. But it's too late to change that.
Fixes: 5b1188847180 ("fscrypt: support crypto data unit size less than filesystem block size")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
---
I'm planning to take this via the fscrypt tree for 7.2
fs/crypto/fscrypt_private.h | 52 +++++++++-------
fs/crypto/inline_crypt.c | 8 +--
fs/crypto/keyring.c | 23 ++++---
fs/crypto/keysetup.c | 118 ++++++++++++++++++++++--------------
4 files changed, 120 insertions(+), 81 deletions(-)
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index 8d3c278a7591..4263cac24b32 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -234,19 +234,28 @@ struct fscrypt_symlink_data {
/**
* struct fscrypt_prepared_key - a key prepared for actual encryption/decryption
* @tfm: crypto API transform object
* @blk_key: key for blk-crypto
*
- * Normally only one of the fields will be non-NULL.
+ * Only one of the fields is non-NULL.
*/
struct fscrypt_prepared_key {
struct crypto_sync_skcipher *tfm;
#ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT
struct blk_crypto_key *blk_key;
#endif
};
+/* An entry in the linked list ->mk_mode_keys */
+struct fscrypt_mode_key {
+ struct fscrypt_prepared_key key;
+ struct list_head link;
+ u8 hkdf_context;
+ u8 mode_num;
+ u8 data_unit_bits;
+};
+
/*
* fscrypt_inode_info - the "encryption key" for an inode
*
* When an encrypted file's key is made available, an instance of this struct is
* allocated and a pointer to it is stored in the file's in-memory inode. Once
@@ -428,24 +437,16 @@ int fscrypt_derive_sw_secret(struct super_block *sb,
/*
* Check whether the crypto transform or blk-crypto key has been allocated in
* @prep_key, depending on which encryption implementation the file will use.
*/
static inline bool
-fscrypt_is_key_prepared(struct fscrypt_prepared_key *prep_key,
+fscrypt_is_key_prepared(const struct fscrypt_prepared_key *prep_key,
const struct fscrypt_inode_info *ci)
{
- /*
- * The two smp_load_acquire()'s here pair with the smp_store_release()'s
- * in fscrypt_prepare_inline_crypt_key() and fscrypt_prepare_key().
- * I.e., in some cases (namely, if this prep_key is a per-mode
- * encryption key) another task can publish blk_key or tfm concurrently,
- * executing a RELEASE barrier. We need to use smp_load_acquire() here
- * to safely ACQUIRE the memory the other task published.
- */
if (fscrypt_using_inline_encryption(ci))
- return smp_load_acquire(&prep_key->blk_key) != NULL;
- return smp_load_acquire(&prep_key->tfm) != NULL;
+ return prep_key->blk_key != NULL;
+ return prep_key->tfm != NULL;
}
#else /* CONFIG_FS_ENCRYPTION_INLINE_CRYPT */
static inline int fscrypt_select_encryption_impl(struct fscrypt_inode_info *ci,
@@ -484,14 +485,14 @@ fscrypt_derive_sw_secret(struct super_block *sb,
fscrypt_warn(NULL, "kernel doesn't support hardware-wrapped keys");
return -EOPNOTSUPP;
}
static inline bool
-fscrypt_is_key_prepared(struct fscrypt_prepared_key *prep_key,
+fscrypt_is_key_prepared(const struct fscrypt_prepared_key *prep_key,
const struct fscrypt_inode_info *ci)
{
- return smp_load_acquire(&prep_key->tfm) != NULL;
+ return prep_key->tfm != NULL;
}
#endif /* !CONFIG_FS_ENCRYPTION_INLINE_CRYPT */
/* keyring.c */
@@ -575,12 +576,12 @@ struct fscrypt_master_key {
struct rw_semaphore mk_sem;
/*
* Active and structural reference counts. An active ref guarantees
* that the struct continues to exist, continues to be in the keyring
- * ->s_master_keys, and that any embedded subkeys (e.g.
- * ->mk_direct_keys) that have been prepared continue to exist.
+ * ->s_master_keys, and that any non-file-scoped subkeys (e.g.
+ * ->mk_mode_keys) that have been prepared continue to exist.
* A structural ref only guarantees that the struct continues to exist.
*
* There is one active ref associated with ->mk_present being true, and
* one active ref for each inode in ->mk_decrypted_inodes.
*
@@ -630,16 +631,25 @@ struct fscrypt_master_key {
*/
struct list_head mk_decrypted_inodes;
spinlock_t mk_decrypted_inodes_lock;
/*
- * Per-mode encryption keys for the various types of encryption policies
- * that use them. Allocated and derived on-demand.
+ * A list of 'struct fscrypt_mode_key' for the (hkdf_context, mode_num,
+ * data_unit_bits, inlinecrypt) combinations that are in use for this
+ * master key, for hkdf_context in [HKDF_CONTEXT_DIRECT_KEY,
+ * HKDF_CONTEXT_IV_INO_LBLK_32_KEY, HKDF_CONTEXT_IV_INO_LBLK_64_KEY].
+ *
+ * This is a linked list and not a hash table because in practice
+ * there's just a single encryption policy per master key, using
+ * _at most_ 2 nodes in this list. Per-file keys don't use this at all.
+ *
+ * This list is append-only until the master key is fully removed, at
+ * which time the list is cleared. Before then,
+ * fscrypt_mode_key_setup_mutex synchronizes appends, and searches use
+ * the RCU read lock together with ->mk_sem held for read.
*/
- struct fscrypt_prepared_key mk_direct_keys[FSCRYPT_MODE_MAX + 1];
- struct fscrypt_prepared_key mk_iv_ino_lblk_64_keys[FSCRYPT_MODE_MAX + 1];
- struct fscrypt_prepared_key mk_iv_ino_lblk_32_keys[FSCRYPT_MODE_MAX + 1];
+ struct list_head mk_mode_keys;
/* Hash key for inode numbers. Initialized only when needed. */
siphash_key_t mk_ino_hash_key;
bool mk_ino_hash_key_initialized;
diff --git a/fs/crypto/inline_crypt.c b/fs/crypto/inline_crypt.c
index 37d42d357925..47324062fee5 100644
--- a/fs/crypto/inline_crypt.c
+++ b/fs/crypto/inline_crypt.c
@@ -196,17 +196,11 @@ int fscrypt_prepare_inline_crypt_key(struct fscrypt_prepared_key *prep_key,
if (err) {
fscrypt_err(inode, "error %d starting to use blk-crypto", err);
goto fail;
}
- /*
- * Pairs with the smp_load_acquire() in fscrypt_is_key_prepared().
- * I.e., here we publish ->blk_key with a RELEASE barrier so that
- * concurrent tasks can ACQUIRE it. Note that this concurrency is only
- * possible for per-mode keys, not for per-file keys.
- */
- smp_store_release(&prep_key->blk_key, blk_key);
+ prep_key->blk_key = blk_key;
return 0;
fail:
kfree_sensitive(blk_key);
return err;
diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index be8e6e8011f2..5fe0d985a58d 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -85,18 +85,18 @@ void fscrypt_put_master_key(struct fscrypt_master_key *mk)
}
void fscrypt_put_master_key_activeref(struct super_block *sb,
struct fscrypt_master_key *mk)
{
- size_t i;
+ struct fscrypt_mode_key *node, *tmp;
if (!refcount_dec_and_test(&mk->mk_active_refs))
return;
/*
* No active references left, so complete the full removal of this
* fscrypt_master_key struct by removing it from the keyring and
- * destroying any subkeys embedded in it.
+ * destroying any non-file-scoped subkeys.
*/
if (WARN_ON_ONCE(!sb->s_master_keys))
return;
spin_lock(&sb->s_master_keys->lock);
@@ -108,17 +108,20 @@ void fscrypt_put_master_key_activeref(struct super_block *sb,
* ->mk_decrypted_inodes is empty.
*/
WARN_ON_ONCE(mk->mk_present);
WARN_ON_ONCE(!list_empty(&mk->mk_decrypted_inodes));
- for (i = 0; i <= FSCRYPT_MODE_MAX; i++) {
- fscrypt_destroy_prepared_key(
- sb, &mk->mk_direct_keys[i]);
- fscrypt_destroy_prepared_key(
- sb, &mk->mk_iv_ino_lblk_64_keys[i]);
- fscrypt_destroy_prepared_key(
- sb, &mk->mk_iv_ino_lblk_32_keys[i]);
+ /*
+ * Destroy any non-file-scoped subkeys. Since ->mk_active_refs == 0,
+ * they're no longer referenced by any inodes. Nor can key setup run
+ * and use them again. So they're no longer needed. (This implies no
+ * concurrent readers, so we don't need list_del_rcu() for example.)
+ */
+ list_for_each_entry_safe(node, tmp, &mk->mk_mode_keys, link) {
+ fscrypt_destroy_prepared_key(sb, &node->key);
+ list_del(&node->link);
+ kfree(node);
}
memzero_explicit(&mk->mk_ino_hash_key,
sizeof(mk->mk_ino_hash_key));
mk->mk_ino_hash_key_initialized = false;
@@ -443,10 +446,12 @@ static int add_new_master_key(struct super_block *sb,
mk->mk_spec = *mk_spec;
INIT_LIST_HEAD(&mk->mk_decrypted_inodes);
spin_lock_init(&mk->mk_decrypted_inodes_lock);
+ INIT_LIST_HEAD(&mk->mk_mode_keys);
+
if (mk_spec->type == FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER) {
err = allocate_master_key_users_keyring(mk);
if (err)
goto out_put;
err = add_master_key_user(mk);
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index ce327bfdada4..f905f9f94bdd 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -161,17 +161,11 @@ int fscrypt_prepare_key(struct fscrypt_prepared_key *prep_key,
false, ci);
tfm = fscrypt_allocate_skcipher(ci->ci_mode, raw_key, ci->ci_inode);
if (IS_ERR(tfm))
return PTR_ERR(tfm);
- /*
- * Pairs with the smp_load_acquire() in fscrypt_is_key_prepared().
- * I.e., here we publish ->tfm with a RELEASE barrier so that
- * concurrent tasks can ACQUIRE it. Note that this concurrency is only
- * possible for per-mode keys, not for per-file keys.
- */
- smp_store_release(&prep_key->tfm, tfm);
+ prep_key->tfm = tfm;
return 0;
}
/* Destroy a crypto transform object and/or blk-crypto key. */
void fscrypt_destroy_prepared_key(struct super_block *sb,
@@ -188,21 +182,50 @@ int fscrypt_set_per_file_enc_key(struct fscrypt_inode_info *ci,
{
ci->ci_owns_key = true;
return fscrypt_prepare_key(&ci->ci_enc_key, raw_key, ci);
}
+/*
+ * Find the fscrypt_prepared_key (if any) for a particular (mk, hkdf_context,
+ * mode_num, data_unit_bits, inlinecrypt) combination.
+ *
+ * The caller must hold ->mk_sem for reading and ->mk_present must be true,
+ * ensuring that ->mk_mode_keys is still append-only.
+ */
+static struct fscrypt_prepared_key *
+fscrypt_find_mode_key(struct fscrypt_master_key *mk, u8 hkdf_context,
+ u8 mode_num, const struct fscrypt_inode_info *ci)
+{
+ struct fscrypt_mode_key *node;
+
+ /*
+ * The RCU read lock here is used only to synchronize with concurrent
+ * list_add_tail_rcu(). Concurrent deletions are impossible here, so
+ * returning a pointer to a node without taking any refcount is safe.
+ */
+ guard(rcu)();
+ list_for_each_entry_rcu(node, &mk->mk_mode_keys, link) {
+ if (node->hkdf_context == hkdf_context &&
+ node->mode_num == mode_num &&
+ node->data_unit_bits == ci->ci_data_unit_bits &&
+ fscrypt_is_key_prepared(&node->key, ci))
+ return &node->key;
+ }
+ return NULL;
+}
+
static int setup_per_mode_enc_key(struct fscrypt_inode_info *ci,
struct fscrypt_master_key *mk,
- struct fscrypt_prepared_key *keys,
u8 hkdf_context, bool include_fs_uuid)
{
const struct inode *inode = ci->ci_inode;
const struct super_block *sb = inode->i_sb;
struct fscrypt_mode *mode = ci->ci_mode;
const u8 mode_num = mode - fscrypt_modes;
struct fscrypt_prepared_key *prep_key;
- u8 mode_key[FSCRYPT_MAX_RAW_KEY_SIZE];
+ struct fscrypt_mode_key *new_node;
+ u8 raw_mode_key[FSCRYPT_MAX_RAW_KEY_SIZE];
u8 hkdf_info[sizeof(mode_num) + sizeof(sb->s_uuid)];
unsigned int hkdf_infolen = 0;
bool use_hw_wrapped_key = false;
int err;
@@ -221,52 +244,60 @@ static int setup_per_mode_enc_key(struct fscrypt_inode_info *ci,
return -EINVAL;
}
use_hw_wrapped_key = true;
}
- prep_key = &keys[mode_num];
- if (fscrypt_is_key_prepared(prep_key, ci)) {
+ prep_key = fscrypt_find_mode_key(mk, hkdf_context, mode_num, ci);
+ if (prep_key) {
ci->ci_enc_key = *prep_key;
return 0;
}
- mutex_lock(&fscrypt_mode_key_setup_mutex);
+ guard(mutex)(&fscrypt_mode_key_setup_mutex);
- if (fscrypt_is_key_prepared(prep_key, ci))
- goto done_unlock;
+ prep_key = fscrypt_find_mode_key(mk, hkdf_context, mode_num, ci);
+ if (prep_key) {
+ ci->ci_enc_key = *prep_key;
+ return 0;
+ }
+
+ new_node = kzalloc_obj(*new_node);
+ if (!new_node)
+ return -ENOMEM;
+ new_node->hkdf_context = hkdf_context;
+ new_node->mode_num = mode_num;
+ new_node->data_unit_bits = ci->ci_data_unit_bits;
+ prep_key = &new_node->key;
if (use_hw_wrapped_key) {
err = fscrypt_prepare_inline_crypt_key(prep_key,
mk->mk_secret.bytes,
mk->mk_secret.size, true,
ci);
- if (err)
- goto out_unlock;
- goto done_unlock;
+ } else {
+ static_assert(sizeof(mode_num) == 1);
+ static_assert(sizeof(sb->s_uuid) == 16);
+ static_assert(sizeof(hkdf_info) == 17);
+ hkdf_info[hkdf_infolen++] = mode_num;
+ if (include_fs_uuid) {
+ memcpy(&hkdf_info[hkdf_infolen], &sb->s_uuid,
+ sizeof(sb->s_uuid));
+ hkdf_infolen += sizeof(sb->s_uuid);
+ }
+ fscrypt_hkdf_expand(&mk->mk_secret.hkdf, hkdf_context,
+ hkdf_info, hkdf_infolen, raw_mode_key,
+ mode->keysize);
+ err = fscrypt_prepare_key(prep_key, raw_mode_key, ci);
+ memzero_explicit(raw_mode_key, mode->keysize);
}
-
- BUILD_BUG_ON(sizeof(mode_num) != 1);
- BUILD_BUG_ON(sizeof(sb->s_uuid) != 16);
- BUILD_BUG_ON(sizeof(hkdf_info) != 17);
- hkdf_info[hkdf_infolen++] = mode_num;
- if (include_fs_uuid) {
- memcpy(&hkdf_info[hkdf_infolen], &sb->s_uuid,
- sizeof(sb->s_uuid));
- hkdf_infolen += sizeof(sb->s_uuid);
+ if (err) {
+ kfree(new_node);
+ return err;
}
- fscrypt_hkdf_expand(&mk->mk_secret.hkdf, hkdf_context, hkdf_info,
- hkdf_infolen, mode_key, mode->keysize);
- err = fscrypt_prepare_key(prep_key, mode_key, ci);
- memzero_explicit(mode_key, mode->keysize);
- if (err)
- goto out_unlock;
-done_unlock:
+ list_add_tail_rcu(&new_node->link, &mk->mk_mode_keys);
ci->ci_enc_key = *prep_key;
- err = 0;
-out_unlock:
- mutex_unlock(&fscrypt_mode_key_setup_mutex);
- return err;
+ return 0;
}
/*
* Derive a SipHash key from the given fscrypt master key and the given
* application-specific information string.
@@ -309,12 +340,12 @@ void fscrypt_hash_inode_number(struct fscrypt_inode_info *ci,
static int fscrypt_setup_iv_ino_lblk_32_key(struct fscrypt_inode_info *ci,
struct fscrypt_master_key *mk)
{
int err;
- err = setup_per_mode_enc_key(ci, mk, mk->mk_iv_ino_lblk_32_keys,
- HKDF_CONTEXT_IV_INO_LBLK_32_KEY, true);
+ err = setup_per_mode_enc_key(ci, mk, HKDF_CONTEXT_IV_INO_LBLK_32_KEY,
+ true);
if (err)
return err;
/* pairs with smp_store_release() below */
if (!smp_load_acquire(&mk->mk_ino_hash_key_initialized)) {
@@ -362,23 +393,22 @@ static int fscrypt_setup_v2_file_key(struct fscrypt_inode_info *ci,
* v1 policies, for v2 policies in this case we don't encrypt
* with the master key directly but rather derive a per-mode
* encryption key. This ensures that the master key is
* consistently used only for HKDF, avoiding key reuse issues.
*/
- err = setup_per_mode_enc_key(ci, mk, mk->mk_direct_keys,
- HKDF_CONTEXT_DIRECT_KEY, false);
+ err = setup_per_mode_enc_key(ci, mk, HKDF_CONTEXT_DIRECT_KEY,
+ false);
} else if (ci->ci_policy.v2.flags &
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) {
/*
* IV_INO_LBLK_64: encryption keys are derived from (master_key,
* mode_num, filesystem_uuid), and inode number is included in
* the IVs. This format is optimized for use with inline
* encryption hardware compliant with the UFS standard.
*/
- err = setup_per_mode_enc_key(ci, mk, mk->mk_iv_ino_lblk_64_keys,
- HKDF_CONTEXT_IV_INO_LBLK_64_KEY,
- true);
+ err = setup_per_mode_enc_key(
+ ci, mk, HKDF_CONTEXT_IV_INO_LBLK_64_KEY, true);
} else if (ci->ci_policy.v2.flags &
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32) {
err = fscrypt_setup_iv_ino_lblk_32_key(ci, mk);
} else {
u8 derived_key[FSCRYPT_MAX_RAW_KEY_SIZE];
base-commit: 83f1454877cc292b88baf13c829c16ce6937d120
--
2.54.0
^ permalink raw reply related
* Re: [GIT PULL] ext4 changes for 7.2-rc1
From: pr-tracker-bot @ 2026-06-18 17:04 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Linus Torvalds, Linux Kernel Developers List,
Ext4 Developers List
In-Reply-To: <ajPrqTd4FaxlpYPs@mit.edu>
The pull request you sent on Thu, 18 Jun 2026 09:00:01 -0400:
> https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus-7.2-rc1
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/83f1454877cc292b88baf13c829c16ce6937d120
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
^ permalink raw reply
* Re: [PATCH v2 6/8] ext4: return -EAGAIN from ext4_map_blocks() in NOWAIT cache miss
From: Baokun Li @ 2026-06-18 15:51 UTC (permalink / raw)
To: Jan Kara
Cc: linux-ext4, tytso, adilger.kernel, yi.zhang, ojaswin, ritesh.list,
peng_wang
In-Reply-To: <ekslfvokadqeiqmfbhv7d3v4ayguuqt7z53i5ze4u55fqkvkjg@waocl7jdgpja>
On 2026/6/18 22:09, Jan Kara wrote:
> On Thu 18-06-26 20:57:33, Baokun Li wrote:
>> Make ext4_map_blocks() return -EAGAIN instead of 0 when
>> EXT4_GET_BLOCKS_CACHED_NOWAIT is set and the extent status cache
>> misses. This allows callers to easily distinguish between a successful
>> cache lookup (positive return value) and a cache miss requiring disk
>> access (-EAGAIN), simplifying error handling in NOWAIT paths.
>>
>> The change affects two locations:
>> 1. After cache hit: return retval ? retval : -EAGAIN
>> (return -EAGAIN if cache hit is hole/delayed)
> Are you sure about this case? -EAGAIN looks wrong here - we have the valid
> information cached and provide it to the caller without blocking. So at
> least from the POV of ext4_map_blocks() there's no reason to return -EAGAIN.
>
> Honza
You're right, there's no need to return -EAGAIN here. I only considered
the write path - even without returning -EAGAIN, ext4_iomap_alloc()
would return it anyway. But I missed the read path, where this would cause
an unnecessary retry.
I'll remove this change in the next version and only keep the essential
second modification (for cache miss case).
Thanks for the review!
Cheers,
Baokun
>> 2. After cache miss: return -EAGAIN
>> (instead of 0, indicating need for disk lookup)
>>
>> The only existing caller using EXT4_GET_BLOCKS_CACHED_NOWAIT is the
>> ext4_get_link() -> ext4_getblk() path. Although ext4_getblk() now
>> takes a different return branch (err < 0 instead of err == 0) and
>> propagates -EAGAIN instead of NULL, ext4_get_link() converts both
>> cases to -ECHILD via IS_ERR_OR_NULL(), so the final error seen by
>> the VFS remains unchanged.
>>
>> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
>> ---
>> fs/ext4/inode.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 832794294ccf..03adbca3ec78 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -760,7 +760,8 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>> }
>>
>> if (flags & EXT4_GET_BLOCKS_CACHED_NOWAIT)
>> - return retval;
>> + return retval ? retval : -EAGAIN;
>> +
>> #ifdef ES_AGGRESSIVE_TEST
>> ext4_map_blocks_es_recheck(handle, inode, map,
>> &orig_map, flags);
>> @@ -776,7 +777,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
>> * cannot find extent in the cache.
>> */
>> if (flags & EXT4_GET_BLOCKS_CACHED_NOWAIT)
>> - return 0;
>> + return -EAGAIN;
>>
>> /*
>> * Try to see if we can get the block without requesting a new
>> --
>> 2.43.7
>>
^ permalink raw reply
* Re: [PATCH v3] ext4: drop s_writepages_rwsem around inline data handling in writepages
From: Zhou, Yun @ 2026-06-18 14:52 UTC (permalink / raw)
To: Jan Kara
Cc: tytso, adilger.kernel, libaokun, ojaswin, ritesh.list, yi.zhang,
ebiggers, linux-ext4, linux-kernel
In-Reply-To: <rvwzagttciesrgonspk37dm4sxkxxgd7marnwtz5c6cpag747e@wvnqduid6hv7>
On 6/18/2026 10:23 PM, Jan Kara wrote:
>
> On Mon 15-06-26 14:10:15, Yun Zhou wrote:
>
> You have fixed this differently (not expanding extra isize from
> ext4_evict_inode()) and furthermore this scenario is really impossible
> because you cannot be inside ext4_writepages() on inode that's undergoing
> eviction. SO let's discard this patch.
>
Yes, that patch series can resolves all the deadlock risks associated with
calling iput(ea_inode) while holding a jbd2 handle—something I hadn't
even considered at first. I really owe this to your suggestions.
BR,
Yun
^ permalink raw reply
* Re: [PATCH v3] ext4: drop s_writepages_rwsem around inline data handling in writepages
From: Jan Kara @ 2026-06-18 14:23 UTC (permalink / raw)
To: Yun Zhou
Cc: tytso, adilger.kernel, libaokun, jack, ojaswin, ritesh.list,
yi.zhang, ebiggers, linux-ext4, linux-kernel
In-Reply-To: <20260615061015.1523668-1-yun.zhou@windriver.com>
On Mon 15-06-26 14:10:15, Yun Zhou wrote:
> ext4_do_writepages() calls ext4_destroy_inline_data() which acquires
> xattr_sem while s_writepages_rwsem is held (read). This creates a
> circular lock dependency:
>
> CPU0 CPU1
> ---- ----
> ext4_writepages()
> ext4_writepages_down_read()
> [holds s_writepages_rwsem]
> ext4_evict_inode()
> __ext4_mark_inode_dirty()
> ext4_expand_extra_isize_ea()
> ext4_xattr_block_set()
> [holds xattr_sem]
> iput(old_bh inode)
> write_inode_now()
> ext4_writepages()
> ext4_writepages_down_read()
> [BLOCKED on s_writepages_rwsem]
> ext4_do_writepages()
> ext4_destroy_inline_data()
> down_write(xattr_sem)
> [BLOCKED on xattr_sem]
You have fixed this differently (not expanding extra isize from
ext4_evict_inode()) and furthermore this scenario is really impossible
because you cannot be inside ext4_writepages() on inode that's undergoing
eviction. SO let's discard this patch.
Honza
>
> Fix by temporarily dropping s_writepages_rwsem for the entire inline
> data handling block, including the journal handle start/stop. The
> rwsem must be dropped before ext4_journal_start() -- not between
> journal_start and journal_stop -- to avoid a secondary deadlock with
> ext4_change_inode_journal_flag() which takes rwsem (write) and then
> calls jbd2_journal_lock_updates() waiting for active handles to stop.
>
> This is safe because:
>
> - This code runs before any block mapping or IO submission, so no
> writepages state depends on the rwsem being held at this point.
>
> - Inline data destruction is a one-way format transition (once cleared,
> EXT4_INODE_INLINE_DATA is never set again). The rwsem is
> re-acquired after journal_stop, ensuring format stability for the
> remainder of writepages.
>
> - The can_map flag identifies the ext4_writepages() path (holds rwsem)
> vs ext4_normal_submit_inode_data_buffers() (does not), so the
> drop/reacquire is skipped when the rwsem is not held.
>
> Also check the return value of ext4_destroy_inline_data() to avoid
> proceeding with an inconsistent inode format on failure.
>
> Reported-by: syzbot+bb2455d02bda0b5701e3@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=bb2455d02bda0b5701e3
> Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages")
> Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
> ---
> v3: Drop s_writepages_rwsem before ext4_journal_start() and reacquire
> after ext4_journal_stop(), instead of dropping between journal_start
> and journal_stop as in v2. This avoids two issues identified in v2
> review:
> - memalloc_nofs_restore() in ext4_writepages_up_read() would clear
> PF_MEMALLOC_NOFS while the jbd2 handle is active.
> - Reacquiring s_writepages_rwsem while holding a handle creates an
> ABBA deadlock with ext4_change_inode_journal_flag() which takes
> the rwsem (write) then calls jbd2_journal_lock_updates().
>
> v2: Instead of moving inline data handling to ext4_writepages(),
> temporarily drop s_writepages_rwsem around ext4_destroy_inline_data()
> in ext4_do_writepages(). The move approach had a race where concurrent
> writes could create dirty pages with inline data after the early check,
> and unconditional destruction without dirty pages would lose data.
>
> v1: Moved inline data cleanup from ext4_do_writepages() to
> ext4_writepages() before acquiring s_writepages_rwsem.
>
> fs/ext4/inode.c | 31 ++++++++++++++++++++++++++-----
> 1 file changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index c2c2d6ac7f3d..cd7588a3fa45 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1694,6 +1694,9 @@ struct mpage_da_data {
> struct writeback_control *wbc;
> unsigned int can_map:1; /* Can writepages call map blocks? */
>
> + /* Saved memalloc context from ext4_writepages_down_read() */
> + int alloc_ctx;
> +
> /* These are internal state of ext4_do_writepages() */
> loff_t start_pos; /* The start pos to write */
> loff_t next_pos; /* Current pos to examine */
> @@ -2816,16 +2819,35 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
> * we'd better clear the inline data here.
> */
> if (ext4_has_inline_data(inode)) {
> - /* Just inode will be modified... */
> + /*
> + * Temporarily drop s_writepages_rwsem because
> + * ext4_destroy_inline_data() acquires xattr_sem, which has
> + * a higher lock ordering rank. Holding both would create a
> + * circular dependency with ext4_xattr_block_set() -> iput()
> + * -> ext4_writepages() -> s_writepages_rwsem.
> + *
> + * Drop the rwsem before starting the journal handle to also
> + * avoid a deadlock with ext4_change_inode_journal_flag(),
> + * which takes rwsem (write) then jbd2_journal_lock_updates().
> + */
> + if (mpd->can_map)
> + ext4_writepages_up_read(inode->i_sb, mpd->alloc_ctx);
> handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
> if (IS_ERR(handle)) {
> + if (mpd->can_map)
> + mpd->alloc_ctx =
> + ext4_writepages_down_read(inode->i_sb);
> ret = PTR_ERR(handle);
> goto out_writepages;
> }
> BUG_ON(ext4_test_inode_state(inode,
> EXT4_STATE_MAY_INLINE_DATA));
> - ext4_destroy_inline_data(handle, inode);
> + ret = ext4_destroy_inline_data(handle, inode);
> ext4_journal_stop(handle);
> + if (mpd->can_map)
> + mpd->alloc_ctx = ext4_writepages_down_read(inode->i_sb);
> + if (ret)
> + goto out_writepages;
> }
>
> /*
> @@ -3032,13 +3054,12 @@ static int ext4_writepages(struct address_space *mapping,
> .can_map = 1,
> };
> int ret;
> - int alloc_ctx;
>
> ret = ext4_emergency_state(sb);
> if (unlikely(ret))
> return ret;
>
> - alloc_ctx = ext4_writepages_down_read(sb);
> + mpd.alloc_ctx = ext4_writepages_down_read(sb);
> ret = ext4_do_writepages(&mpd);
> /*
> * For data=journal writeback we could have come across pages marked
> @@ -3047,7 +3068,7 @@ static int ext4_writepages(struct address_space *mapping,
> */
> if (!ret && mpd.journalled_more_data)
> ret = ext4_do_writepages(&mpd);
> - ext4_writepages_up_read(sb, alloc_ctx);
> + ext4_writepages_up_read(sb, mpd.alloc_ctx);
>
> return ret;
> }
> --
> 2.43.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v2 8/8] ext4: handle IOCB_NOWAIT in ext4_dio_needs_zeroing() with cache-only lookup
From: Jan Kara @ 2026-06-18 14:10 UTC (permalink / raw)
To: Baokun Li
Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, ojaswin,
ritesh.list, peng_wang
In-Reply-To: <20260618125735.4156639-9-libaokun@linux.alibaba.com>
On Thu 18-06-26 20:57:35, Baokun Li wrote:
> Add a nowait parameter to ext4_dio_needs_zeroing() and pass
> EXT4_GET_BLOCKS_CACHED_NOWAIT flag to ext4_map_blocks() when
> IOCB_NOWAIT is set. This ensures the needs_zeroing check only uses
> cached extent info. If cache misses, ext4_map_blocks() returns
> -EAGAIN, causing ext4_dio_needs_zeroing() to conservatively return
> true (needs zeroing). The caller then tries to upgrade to exclusive
> lock, which returns -EAGAIN for NOWAIT, avoiding potential sleep on
> down_read(i_data_sem).
>
> The caller in ext4_dio_write_checks() is updated to pass the
> IOCB_NOWAIT flag from the kiocb.
>
> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/file.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 5ffc1afd8050..44d1658d2b5a 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -228,7 +228,8 @@ ext4_extending_io(struct inode *inode, loff_t offset, size_t len)
> * unwritten conversion for middle blocks are protected by i_data_sem
> * and inode_dio_begin().
> */
> -static bool ext4_dio_needs_zeroing(struct inode *inode, loff_t pos, loff_t len)
> +static bool ext4_dio_needs_zeroing(struct inode *inode, loff_t pos, loff_t len,
> + bool nowait)
> {
> struct ext4_map_blocks map;
> unsigned int blkbits = inode->i_blkbits;
> @@ -236,10 +237,14 @@ static bool ext4_dio_needs_zeroing(struct inode *inode, loff_t pos, loff_t len)
> bool head_partial, tail_partial;
> ext4_lblk_t head_lblk, tail_lblk;
> int err;
> + int map_flags = 0;
>
> if (pos + len > i_size_read(inode))
> return true;
>
> + if (nowait)
> + map_flags = EXT4_GET_BLOCKS_CACHED_NOWAIT;
> +
> head_partial = (pos & blockmask) != 0;
> tail_partial = ((pos + len) & blockmask) != 0;
> head_lblk = pos >> blkbits;
> @@ -249,7 +254,7 @@ static bool ext4_dio_needs_zeroing(struct inode *inode, loff_t pos, loff_t len)
> if (head_partial) {
> map.m_lblk = head_lblk;
> map.m_len = tail_lblk - head_lblk + 1;
> - err = ext4_map_blocks(NULL, inode, &map, 0);
> + err = ext4_map_blocks(NULL, inode, &map, map_flags);
> if (err <= 0 || !(map.m_flags & EXT4_MAP_MAPPED))
> return true;
> /* If this mapping already covers the tail block, we're done. */
> @@ -261,7 +266,7 @@ static bool ext4_dio_needs_zeroing(struct inode *inode, loff_t pos, loff_t len)
> if (tail_partial) {
> map.m_lblk = tail_lblk;
> map.m_len = 1;
> - err = ext4_map_blocks(NULL, inode, &map, 0);
> + err = ext4_map_blocks(NULL, inode, &map, map_flags);
> if (err <= 0 || !(map.m_flags & EXT4_MAP_MAPPED))
> return true;
> }
> @@ -516,7 +521,8 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
> * under shared lock is safe.
> */
> if (ext4_unaligned_io(inode, from, offset))
> - needs_zeroing = ext4_dio_needs_zeroing(inode, offset, count);
> + needs_zeroing = ext4_dio_needs_zeroing(inode, offset, count,
> + iocb->ki_flags & IOCB_NOWAIT);
>
> /* Determine whether we need to upgrade to an exclusive lock. */
> if (*ilock_shared &&
> --
> 2.43.7
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v2 7/8] ext4: handle IOMAP_NOWAIT in ext4_iomap_begin() with cache-only lookup
From: Jan Kara @ 2026-06-18 14:09 UTC (permalink / raw)
To: Baokun Li
Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, ojaswin,
ritesh.list, peng_wang
In-Reply-To: <20260618125735.4156639-8-libaokun@linux.alibaba.com>
On Thu 18-06-26 20:57:34, Baokun Li wrote:
> Pass EXT4_GET_BLOCKS_CACHED_NOWAIT flag to ext4_map_blocks() when
> IOMAP_NOWAIT is set, ensuring that extent lookups only use the cached
> extent status tree. If the cache misses, ext4_map_blocks() returns
> -EAGAIN instead of sleeping on down_read(i_data_sem) to read extent
> tree from disk.
>
> This applies to both write and read paths in ext4_iomap_begin(),
> allowing DIO/DAX operations with RWF_NOWAIT to avoid blocking on
> extent tree lookups.
>
> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 03adbca3ec78..09f85cd6c118 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3781,6 +3781,7 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> struct ext4_map_blocks map;
> u8 blkbits = inode->i_blkbits;
> unsigned int orig_mlen;
> + int map_flags = 0;
>
> if ((offset >> blkbits) > EXT4_MAX_LOGICAL_BLOCK)
> return -EINVAL;
> @@ -3795,6 +3796,12 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
> EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
> orig_mlen = map.m_len;
> + /*
> + * In NOWAIT context, only use cached extent info. If es cache misses,
> + * return -EAGAIN to avoid sleeping on down_read(i_data_sem).
> + */
> + if (flags & IOMAP_NOWAIT)
> + map_flags = EXT4_GET_BLOCKS_CACHED_NOWAIT;
>
> if (flags & IOMAP_WRITE) {
> /*
> @@ -3804,7 +3811,7 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> * especially in multi-threaded overwrite requests.
> */
> if (offset + length <= i_size_read(inode)) {
> - ret = ext4_map_blocks(NULL, inode, &map, 0);
> + ret = ext4_map_blocks(NULL, inode, &map, map_flags);
> /*
> * For DAX we convert extents to initialized ones before
> * copying the data, otherwise we do it after I/O so
> @@ -3825,7 +3832,7 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> }
> ret = ext4_iomap_alloc(inode, &map, flags);
> } else {
> - ret = ext4_map_blocks(NULL, inode, &map, 0);
> + ret = ext4_map_blocks(NULL, inode, &map, map_flags);
> }
>
> if (ret < 0)
> --
> 2.43.7
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v2 6/8] ext4: return -EAGAIN from ext4_map_blocks() in NOWAIT cache miss
From: Jan Kara @ 2026-06-18 14:09 UTC (permalink / raw)
To: Baokun Li
Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, ojaswin,
ritesh.list, peng_wang
In-Reply-To: <20260618125735.4156639-7-libaokun@linux.alibaba.com>
On Thu 18-06-26 20:57:33, Baokun Li wrote:
> Make ext4_map_blocks() return -EAGAIN instead of 0 when
> EXT4_GET_BLOCKS_CACHED_NOWAIT is set and the extent status cache
> misses. This allows callers to easily distinguish between a successful
> cache lookup (positive return value) and a cache miss requiring disk
> access (-EAGAIN), simplifying error handling in NOWAIT paths.
>
> The change affects two locations:
> 1. After cache hit: return retval ? retval : -EAGAIN
> (return -EAGAIN if cache hit is hole/delayed)
Are you sure about this case? -EAGAIN looks wrong here - we have the valid
information cached and provide it to the caller without blocking. So at
least from the POV of ext4_map_blocks() there's no reason to return -EAGAIN.
Honza
> 2. After cache miss: return -EAGAIN
> (instead of 0, indicating need for disk lookup)
>
> The only existing caller using EXT4_GET_BLOCKS_CACHED_NOWAIT is the
> ext4_get_link() -> ext4_getblk() path. Although ext4_getblk() now
> takes a different return branch (err < 0 instead of err == 0) and
> propagates -EAGAIN instead of NULL, ext4_get_link() converts both
> cases to -ECHILD via IS_ERR_OR_NULL(), so the final error seen by
> the VFS remains unchanged.
>
> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
> ---
> fs/ext4/inode.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 832794294ccf..03adbca3ec78 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -760,7 +760,8 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
> }
>
> if (flags & EXT4_GET_BLOCKS_CACHED_NOWAIT)
> - return retval;
> + return retval ? retval : -EAGAIN;
> +
> #ifdef ES_AGGRESSIVE_TEST
> ext4_map_blocks_es_recheck(handle, inode, map,
> &orig_map, flags);
> @@ -776,7 +777,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
> * cannot find extent in the cache.
> */
> if (flags & EXT4_GET_BLOCKS_CACHED_NOWAIT)
> - return 0;
> + return -EAGAIN;
>
> /*
> * Try to see if we can get the block without requesting a new
> --
> 2.43.7
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v2 5/8] ext4: use kiocb_modified instead of file_modified in DIO/DAX write path
From: Jan Kara @ 2026-06-18 13:56 UTC (permalink / raw)
To: Baokun Li
Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, ojaswin,
ritesh.list, peng_wang
In-Reply-To: <20260618125735.4156639-6-libaokun@linux.alibaba.com>
On Thu 18-06-26 20:57:32, Baokun Li wrote:
> file_modified() passes flags=0 which drops IOCB_NOWAIT, causing
> file_update_time() to sleep in ext4_journal_start() via
> ext4_dirty_inode() even in non-blocking contexts.
>
> kiocb_modified(iocb) propagates iocb->ki_flags so that
> generic_update_time() correctly returns -EAGAIN when IOCB_NOWAIT
> is set and ->dirty_inode could block, matching the behavior
> already adopted by XFS, FUSE, and ext2.
>
> Affected paths:
> - ext4_dio_write_checks(): DIO NOWAIT write
> - ext4_write_checks(): shared by buffered (rejects NOWAIT upfront)
> and DAX write (supports NOWAIT)
>
> ext4_fallocate() in extents.c is not affected as it has no kiocb.
>
> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Indeed, good catch! Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/file.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 2681f148e7b8..5ffc1afd8050 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -307,7 +307,7 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
> if (count <= 0)
> return count;
>
> - ret = file_modified(iocb->ki_filp);
> + ret = kiocb_modified(iocb);
> if (ret)
> return ret;
>
> @@ -465,7 +465,7 @@ static const struct iomap_dio_ops ext4_dio_write_ops = {
> *
> * The decision is layered, evaluated in this order:
> *
> - * 1. If file_modified() needs to update security info (!IS_NOSEC), upgrade
> + * 1. If kiocb_modified() needs to update security info (!IS_NOSEC), upgrade
> * to the exclusive lock -- the security update itself requires it,
> * regardless of whether the write extends the file or is aligned.
> *
> @@ -555,7 +555,7 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
> *dio_flags = IOMAP_DIO_FORCE_WAIT;
> }
>
> - ret = file_modified(file);
> + ret = kiocb_modified(iocb);
> if (ret < 0)
> goto out;
>
> --
> 2.43.7
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v2 2/8] ext4: drain in-flight DIO before buffered write fallback
From: Jan Kara @ 2026-06-18 13:54 UTC (permalink / raw)
To: Baokun Li
Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, ojaswin,
ritesh.list, peng_wang
In-Reply-To: <20260618125735.4156639-3-libaokun@linux.alibaba.com>
On Thu 18-06-26 20:57:29, Baokun Li wrote:
> generic/746 started failing intermittently on ext3 (no-extent inodes).
> The test triggers 'Page cache invalidation failure on direct I/O'
> warnings and subsequent fsync returns -EIO. Adding a 50ms delay
> between ext4_buffered_write_iter() and filemap_write_and_wait_range()
> in ext4_dio_write_iter() makes the race almost always reproducible.
>
> On no-extent inodes, DIO writes to holes cannot use unwritten extents,
> so ext4_iomap_alloc() leaves m_flags=0 and ext4_map_blocks() returns 0.
> The iomap layer then returns -ENOTBLK, causing fallback to buffered I/O.
>
> The fallback path in ext4_dio_write_iter() calls
> ext4_buffered_write_iter() which dirties pages, then does flush and
> invalidate. However, there's an unprotected window between
> ext4_buffered_write_iter() returning (with inode lock released) and
> the subsequent flush+invalidate.
>
> Concurrent async DIO completions from other threads can run
> kiocb_invalidate_post_direct_write() during this window. If pages have
> been re-dirtied, post-invalidation finds dirty pages and triggers the
> warning, setting -EIO in the error sequence.
>
> Consider a file with two 4k extents: [hole][written]. Thread A does
> DIO to the written extent, while thread B does DIO spanning both:
>
> kworker A (4k DIO, allocated block) kworker B (8k DIO, fallback)
> ----------------------------------- ----------------------------
> inode_lock_shared() inode_lock_shared()
> iomap_dio_rw(): iomap_dio_rw():
> kiocb_invalidate_pages -> clean iomap_begin -> -ENOTBLK
> submit_bio (async) dio->size = 0
> inode_unlock_shared() inode_unlock_shared()
>
> [bio pending in block layer] /* fallback: lock released */
> ext4_buffered_write_iter()
> inode_lock(exclusive)
> generic_perform_write()
> -> dirty pages [0, 8k]
> inode_unlock(exclusive)
>
> /* pages dirty, no lock */
> [bio completes] filemap_write_and_wait_range()
> iomap_dio_complete() -> flush dirty pages
> kiocb_invalidate_post_direct_write() invalidate_mapping_pages()
> invalidate_inode_pages2_range()
> -> finds dirty page!
> -> dio_warn_stale_pagecache()
> -> errseq_set(-EIO)
>
> This issue can be triggered through normal I/O paths, not just
> intentionally overlapping DIO writes from userspace. For example,
> generic/746 uses a loop device where multiple kworkers issue concurrent
> I/O to the backing file. Additionally, when block_size < folio_size,
> non-overlapping DIO writes that share a large folio can also trigger
> the race.
>
> Add inode_dio_wait() in ext4_buffered_write_iter() before
> generic_perform_write() to drain all in-flight DIO. This ensures
> that all DIO clears existing pages before submitting IO (via
> kiocb_invalidate_pages()), and all BIO waits for all DIO to
> complete (via inode_dio_wait()), thus eliminating the race.
>
> Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure")
> Suggested-by: Zhang Yi <yi.zhang@huawei.com>
> Link: https://patch.msgid.link/d1adcf7c-c276-458d-9cac-68a4410f7626@gmail.com
> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/file.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index eb1a323962b1..9f9bc0b13772 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -313,6 +313,12 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
> if (ret <= 0)
> goto out;
>
> + /*
> + * Prevent concurrent DIO and BIO to the same file range.
> + * Wait for all in-flight DIO to complete before dirtying pages.
> + */
> + inode_dio_wait(inode);
> +
> ret = generic_perform_write(iocb, from);
>
> out:
> --
> 2.43.7
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v2 1/8] ext4: prevent sleeping allocation in NOWAIT write path
From: Jan Kara @ 2026-06-18 13:52 UTC (permalink / raw)
To: Baokun Li
Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, ojaswin,
ritesh.list, peng_wang, Sashiko
In-Reply-To: <20260618125735.4156639-2-libaokun@linux.alibaba.com>
On Thu 18-06-26 20:57:28, Baokun Li wrote:
> Block allocation requires journal access which may sleep, violating
> NOWAIT semantics. Return -EAGAIN early when IOMAP_NOWAIT is set,
> allowing the caller to retry without the NOWAIT constraint.
>
> This ensures that write paths using IOMAP_NOWAIT (e.g., DIO with
> RWF_NOWAIT) will not block on journal operations when blocks need
> to be allocated.
>
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Closes: https://sashiko.dev/#/patchset/20260611163441.2431805-1-libaokun@linux.alibaba.com?part=1
> Signed-off-by: Baokun Li <libaokun@linux.alibaba.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index c2c2d6ac7f3d..832794294ccf 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3672,6 +3672,9 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
> int ret, dio_credits, m_flags = 0, retries = 0;
> bool force_commit = false;
>
> + if (flags & IOMAP_NOWAIT)
> + return -EAGAIN;
> +
> /*
> * Trim the mapping request to the maximum value that we can map at
> * once for direct I/O.
> --
> 2.43.7
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* Re: [PATCH v4 18/23] ext4: wait for ordered I/O in the iomap buffered I/O path
From: Jan Kara @ 2026-06-18 13:48 UTC (permalink / raw)
To: Zhang Yi
Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
libaokun, jack, ojaswin, ritesh.list, djwong, hch, yi.zhang,
yizhang089, yangerkun, yukuai
In-Reply-To: <20260511072344.191271-19-yi.zhang@huaweicloud.com>
On Mon 11-05-26 15:23:38, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> For append writes, wait for ordered I/O to complete before updating
> i_disksize. This ensures that zeroed data is flushed to disk before the
> metadata update, preventing stale data from being exposed during
> unaligned post-EOF append writes.
>
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Frankly, this all looks too complex to me. Plus your are adding 32-bytes to
struct ext4_inode_info which isn't great either. Why don't you just do
filemap_fdatawait() for the byte at old i_disksize and be done with it?
I believe we have to simplify this. All this complexity (and thus
maintenance burden) across several patches for the corner case of zeroing
tail block on extention is in my opinion difficult to justify.
Honza
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 078feda47e36..9ce2128eea3e 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1195,6 +1195,15 @@ struct ext4_inode_info {
> #ifdef CONFIG_FS_ENCRYPTION
> struct fscrypt_inode_info *i_crypt_info;
> #endif
> +
> + /*
> + * Track ordered zeroed data during post-EOF append writes, fallocate,
> + * and truncate-up operations. These parameters are used only in the
> + * iomap buffered I/O path.
> + */
> + ext4_lblk_t i_ordered_lblk;
> + ext4_lblk_t i_ordered_len;
> + wait_queue_head_t i_ordered_wq;
> };
>
> /*
> @@ -3858,6 +3867,8 @@ extern int ext4_move_extents(struct file *o_filp, struct file *d_filp,
> __u64 len, __u64 *moved_len);
>
> /* page-io.c */
> +#define EXT4_IOMAP_IOEND_ORDER_IO 1UL /* This I/O is an ordered one */
> +
> extern int __init ext4_init_pageio(void);
> extern void ext4_exit_pageio(void);
> extern ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index e013aeb03d7b..11fb369efeb1 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4345,6 +4345,7 @@ static int ext4_iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
> {
> struct iomap_ioend *ioend = wpc->wb_ctx;
> struct ext4_inode_info *ei = EXT4_I(ioend->io_inode);
> + ext4_lblk_t start, end, order_lblk, order_len;
>
> /*
> * After I/O completion, a worker needs to be scheduled when:
> @@ -4357,6 +4358,30 @@ static int ext4_iomap_writeback_submit(struct iomap_writepage_ctx *wpc,
> test_opt(ioend->io_inode->i_sb, DATA_ERR_ABORT))
> ioend->io_bio.bi_end_io = ext4_iomap_end_bio;
>
> + /*
> + * Mark the I/O as ordered. Ordered I/O requires separate endio
> + * handling and must not be merged with regular I/O operations.
> + */
> + order_len = READ_ONCE(ei->i_ordered_len);
> + if (order_len) {
> + /*
> + * Pair with smp_store_release() in ext4_block_zero_eof().
> + * Ensure we see the updated i_ordered_lblk that was written
> + * before the release store to i_ordered_len.
> + */
> + smp_rmb();
> + order_lblk = READ_ONCE(ei->i_ordered_lblk);
> + start = ioend->io_offset >> ioend->io_inode->i_blkbits;
> + end = EXT4_B_TO_LBLK(ioend->io_inode,
> + ioend->io_offset + ioend->io_size);
> +
> + if (start <= order_lblk && end >= order_lblk + order_len) {
> + ioend->io_bio.bi_end_io = ext4_iomap_end_bio;
> + ioend->io_private = (void *)EXT4_IOMAP_IOEND_ORDER_IO;
> + ioend->io_flags |= IOMAP_IOEND_BOUNDARY;
> + }
> + }
> +
> return iomap_ioend_writeback_submit(wpc, error);
> }
>
> @@ -4746,8 +4771,10 @@ static int ext4_iomap_submit_zero_block(struct inode *inode,
> loff_t from, loff_t end)
> {
> struct address_space *mapping = inode->i_mapping;
> + struct ext4_inode_info *ei = EXT4_I(inode);
> struct folio *folio;
> bool do_submit = false;
> + int ret;
>
> folio = filemap_lock_folio(mapping, from >> PAGE_SHIFT);
> if (IS_ERR(folio))
> @@ -4757,14 +4784,50 @@ static int ext4_iomap_submit_zero_block(struct inode *inode,
> folio_wait_writeback(folio);
> WARN_ON_ONCE(folio_test_writeback(folio));
>
> - if (likely(folio_test_dirty(folio)))
> + /*
> + * Mark the ordered range. It will be cleared upon I/O completion
> + * in ext4_iomap_end_bio(). Any operation that extends i_disksize
> + * (including append write end io past the zeroed boundary,
> + * truncate up and append fallocate) must wait for this I/O to
> + * complete before updating i_disksize.
> + *
> + * When multiple overlapping unaligned EOF writes are in flight, we
> + * only need to track and wait for the first one. Subsequent writes
> + * will zero the gap in memory and ensure that the zeroed data is
> + * written out along with the valid data in the same block before
> + * i_disksize is updated.
> + */
> + if (likely(folio_test_dirty(folio) &&
> + READ_ONCE(ei->i_ordered_len) == 0)) {
> + WRITE_ONCE(ei->i_ordered_lblk,
> + from >> inode->i_blkbits);
> + /*
> + * Pairs with smp_rmb() in ext4_iomap_writeback_submit()
> + * and ext4_iomap_wb_ordered_wait(). Ensure the updated
> + * i_ordered_lblk is visible when i_ordered_len becomes
> + * non-zero.
> + */
> + smp_store_release(&ei->i_ordered_len, 1);
> do_submit = true;
> + }
> folio_unlock(folio);
> folio_put(folio);
>
> /* Submit zeroed block. */
> - if (do_submit)
> - return filemap_fdatawrite_range(mapping, from, end - 1);
> + if (do_submit) {
> + ret = filemap_fdatawrite_range(mapping, from, end - 1);
> + if (ret) {
> + /*
> + * Pairs with wait_event() in
> + * ext4_iomap_wb_ordered_wait(). Ensure
> + * i_ordered_len = 0 is visible before waking up
> + * waiters.
> + */
> + smp_store_release(&ei->i_ordered_len, 0);
> + wake_up_all(&ei->i_ordered_wq);
> + return ret;
> + }
> + }
> return 0;
> }
>
> @@ -4827,10 +4890,13 @@ int ext4_block_zero_eof(struct inode *inode, loff_t from, loff_t end)
> * data=ordered mode. We submit zeroed range directly here.
> * Do not wait for I/O completion for performance.
> *
> - * TODO: Any operation that extends i_disksize (including
> - * append write end io past the zeroed boundary, truncate up,
> - * and append fallocate) must wait for the relevant I/O to
> - * complete before updating i_disksize.
> + * The end_io handler ext4_iomap_wb_ordered_wait() will wait
> + * for I/O completion before updating i_disksize if the write
> + * extends beyond the zeroed boundary.
> + *
> + * TODO: Any other operation that extends i_disksize
> + * (including truncate up and append fallocate) must wait for
> + * the relevant I/O to complete before updating i_disksize.
> */
> } else if (ext4_inode_buffered_iomap(inode)) {
> err = ext4_iomap_submit_zero_block(inode, from, end);
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 3050c887329f..ad05ebb49bf6 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -613,6 +613,46 @@ int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *folio,
> return 0;
> }
>
> +/*
> + * If the old disk size is not block size aligned and the current
> + * writeback range is entirely beyond the old EOF block, we should
> + * wait for the zeroed data written in ext4_block_zero_eof() to be
> + * written out, otherwise, it may expose stale data in that block.
> + */
> +static void ext4_iomap_wb_ordered_wait(struct inode *inode,
> + loff_t pos, loff_t end)
> +{
> + struct ext4_inode_info *ei = EXT4_I(inode);
> + unsigned int blocksize = i_blocksize(inode);
> + loff_t disksize = READ_ONCE(ei->i_disksize);
> + ext4_lblk_t order_lblk, order_len;
> +
> + /*
> + * Waiting for ordered I/O is unnecessary when:
> + * - The on-disk size is block-aligned (no stale data exists).
> + * - The write start is within the block of the old EOF
> + * (overwriting, or appending to a block that already contains
> + * valid data).
> + */
> + if (!(disksize & (blocksize - 1)) ||
> + pos < round_up(disksize, blocksize))
> + return;
> +
> + order_len = READ_ONCE(ei->i_ordered_len);
> + if (!order_len)
> + return;
> +
> + /*
> + * Pair with smp_store_release() in ext4_iomap_end_bio() and
> + * ext4_block_zero_eof(). Ensure we see the updated i_ordered_lblk
> + * that was written before the release store to i_ordered_len.
> + */
> + smp_rmb();
> + order_lblk = READ_ONCE(ei->i_ordered_lblk);
> + if ((pos >> inode->i_blkbits) >= order_lblk + order_len)
> + wait_event(ei->i_ordered_wq, READ_ONCE(ei->i_ordered_len) == 0);
> +}
> +
> static int ext4_iomap_wb_update_disksize(handle_t *handle, struct inode *inode,
> loff_t end)
> {
> @@ -656,6 +696,9 @@ static void ext4_iomap_finish_ioend(struct iomap_ioend *ioend)
> goto out;
> }
>
> + /* Wait ordered zero data to be written out. */
> + ext4_iomap_wb_ordered_wait(inode, pos, pos + size);
> +
> /* We may need to convert one extent and dirty the inode. */
> credits = ext4_chunk_trans_blocks(inode,
> EXT4_MAX_BLOCKS(size, pos, inode->i_blkbits));
> @@ -717,8 +760,25 @@ void ext4_iomap_end_bio(struct bio *bio)
> struct inode *inode = ioend->io_inode;
> struct ext4_inode_info *ei = EXT4_I(inode);
> struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> + unsigned long io_mode = (unsigned long)ioend->io_private;
> unsigned long flags;
>
> + /*
> + * This is an ordered I/O, clear the ordered range set in
> + * ext4_block_zero_eof() and wake up all waiters that will update
> + * the inode i_disksize.
> + */
> + if (io_mode == EXT4_IOMAP_IOEND_ORDER_IO) {
> + /*
> + * Pairs with wait_event() in ext4_iomap_wb_ordered_wait().
> + * Ensure i_ordered_len = 0 is visible before waking up
> + * waiters.
> + */
> + smp_store_release(&ei->i_ordered_len, 0);
> + wake_up_all(&ei->i_ordered_wq);
> + goto defer;
> + }
> +
> /* Needs to convert unwritten extents or update the i_disksize. */
> if ((ioend->io_flags & IOMAP_IOEND_UNWRITTEN) ||
> ioend->io_offset + ioend->io_size > READ_ONCE(ei->i_disksize))
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 62bfe05a64bc..9c0a00e716f3 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1444,6 +1444,9 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
> ext4_fc_init_inode(&ei->vfs_inode);
> spin_lock_init(&ei->i_fc_lock);
> mmb_init(&ei->i_metadata_bhs, &ei->vfs_inode.i_data);
> + ei->i_ordered_lblk = 0;
> + ei->i_ordered_len = 0;
> + init_waitqueue_head(&ei->i_ordered_wq);
> return &ei->vfs_inode;
> }
>
> @@ -1480,12 +1483,20 @@ static void ext4_destroy_inode(struct inode *inode)
> dump_stack();
> }
>
> - if (!(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_ERROR_FS) &&
> - WARN_ON_ONCE(EXT4_I(inode)->i_reserved_data_blocks))
> - ext4_msg(inode->i_sb, KERN_ERR,
> - "Inode %llu (%p): i_reserved_data_blocks (%u) not cleared!",
> - inode->i_ino, EXT4_I(inode),
> - EXT4_I(inode)->i_reserved_data_blocks);
> + if (!(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_ERROR_FS)) {
> + if (WARN_ON_ONCE(EXT4_I(inode)->i_reserved_data_blocks))
> + ext4_msg(inode->i_sb, KERN_ERR,
> + "Inode %llu (%p): i_reserved_data_blocks (%u) not cleared!",
> + inode->i_ino, EXT4_I(inode),
> + EXT4_I(inode)->i_reserved_data_blocks);
> +
> + if (WARN_ON_ONCE(EXT4_I(inode)->i_ordered_len))
> + ext4_msg(inode->i_sb, KERN_ERR,
> + "Inode %llu (%p): i_ordered_lblk (%u) and i_ordered_len (%u) not cleared!",
> + inode->i_ino, EXT4_I(inode),
> + EXT4_I(inode)->i_ordered_lblk,
> + EXT4_I(inode)->i_ordered_len);
> + }
> }
>
> static void ext4_shutdown(struct super_block *sb)
> --
> 2.52.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply
* [GIT PULL] ext4 changes for 7.2-rc1
From: Theodore Ts'o @ 2026-06-18 13:00 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Developers List, Ext4 Developers List
The following changes since commit 5200f5f493f79f14bbdc349e402a40dfb32f23c8:
Linux 7.1-rc4 (2026-05-17 13:59:58 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus-7.2-rc1
for you to fetch changes up to c143957520c6c9b5cd72e0de8b52b814f0c576fe:
ext4: validate donor file superblock early in EXT4_IOC_MOVE_EXT (2026-06-10 10:53:50 -0400)
----------------------------------------------------------------
Various ext4 updates for 7.2-rc1:
* A major rework of the fast commit mechanism to avoid lock
contention and deadlocks. We also export snapshot statistics
in /proc/fs/ext4/*/fc_info.
* Performance optimization for directory hash computation by
processing input in 4-byte chunks and removing function pointers,
along with new KUnit tests for directory hash.
* Cleanups in JBD2 to remove special slabs and use kmalloc() instead.
* Various bug fixes, including:
- Early validation of donor superblock in EXT4_IOC_MOVE_EXT to avoid
cross-fs deadlock
- Fix for a kernel BUG in ext4_write_inline_data_end under
data=journal
- Fix for a NULL dereference in jbd2_journal_dirty_metadata when
handle is aborted
- Fix for an underflow in JBD2 fast commit block initialization check
- Fix for LOGFLUSH shutdown ordering to ensure ordered data writeback
- Miscellaneous fixes for error path return values and KUnit assertions.
----------------------------------------------------------------
Abdellah Ouhbi (1):
ext4: Use %pe to print PTR_ERR()
Aditya Prakash Srivastava (1):
ext4: fix kernel BUG in ext4_write_inline_data_end
Deepanshu Kartikey (1):
jbd2: check for aborted handle in jbd2_journal_dirty_metadata()
Guan-Chun Wu (2):
ext4: add Kunit coverage for directory hash computation
ext4: improve str2hashbuf by processing 4-byte chunks and removing function pointers
Hongling Zeng (1):
ext4: fix ERR_PTR(0) in ext4_mkdir()
Junrui Luo (1):
jbd2: fix integer underflow in jbd2_journal_initialize_fast_commit()
Li Chen (8):
ext4: fix fast commit wait/wake bit mapping on 64-bit
ext4: fast commit: snapshot inode state before writing log
ext4: lockdep: handle i_data_sem subclassing for special inodes
ext4: fast commit: avoid waiting for FC_COMMITTING
ext4: fast commit: avoid self-deadlock in inode snapshotting
ext4: fast commit: avoid i_data_sem by dropping ext4_map_blocks() in snapshots
ext4: fast commit: add lock_updates tracepoint
ext4: fast commit: export snapshot stats in fc_info
Matthew Wilcox (Oracle) (2):
ext4: remove mention of PageWriteback
jbd2: remove special jbd2 slabs
Ryota Sakamoto (1):
ext4: replace KUnit tests for memcmp() with KUNIT_ASSERT_MEMEQ()
Yun Zhou (1):
ext4: validate donor file superblock early in EXT4_IOC_MOVE_EXT
Zhang Yi (1):
ext4: fix LOGFLUSH shutdown ordering to allow ordered-mode data writeback
fs/ext4/Makefile | 2 +-
fs/ext4/ext4.h | 93 ++++-
fs/ext4/extents.c | 4 +-
fs/ext4/fast_commit.c | 784 ++++++++++++++++++++++++++++++++---------
fs/ext4/hash-test.c | 567 +++++++++++++++++++++++++++++
fs/ext4/hash.c | 68 ++--
fs/ext4/inode.c | 54 ++-
fs/ext4/ioctl.c | 15 +-
fs/ext4/mballoc-test.c | 9 +-
fs/ext4/namei.c | 6 +-
fs/ext4/page-io.c | 2 +-
fs/ext4/super.c | 13 +-
fs/jbd2/commit.c | 8 +-
fs/jbd2/journal.c | 127 +------
fs/jbd2/transaction.c | 17 +-
include/linux/jbd2.h | 3 -
include/trace/events/ext4.h | 61 ++++
17 files changed, 1495 insertions(+), 338 deletions(-)
create mode 100644 fs/ext4/hash-test.c
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox