Linux EXT4 FS development
* [PATCH v4 00/11] Data in direntry (dirdata) feature
@ 2026-06-24 13:36 Artem Blagodarenko
  2026-06-24 13:36 ` [PATCH v4 01/11] ext4: validate count against limit in ext4_dx_csum_verify/_set Artem Blagodarenko
                   ` (11 more replies)
  0 siblings, 12 replies; 19+ messages in thread
From: Artem Blagodarenko @ 2026-06-24 13:36 UTC
  To: linux-ext4; +Cc: adilger.kernel, Artem Blagodarenko

EXT4 currently stores a hash in the directory entry
(dirent) immediately after the file name to support
simultaneous fscrypt and casefold functionality.

It has been discussed within the EXT4 community that
this hash could instead be stored in dirdata. This
would make it the second (or third, in the case of
64-bit inode counts) user of dirdata.

At the same time, the existing format—where the hash
is placed after the file name—must continue to be
supported. With these patches, EXT4 can handle the
hash in both formats.

The first user of this feature is LUFID -
Locally Unique File ID.

Support for fscrypt and case-insensitive directories
with dirdata enabled has been verified using a
dedicated xfstest submitted to the xfstests list as
a separate patch.

e2fsprogs support is provided in a separate patch
series.

Changes in v4:
- syzbot ci actually ran the v3 series and found real,
  reproducible KASAN slab-out-of-bounds and use-after-free
  reads, all rooted in ext4_dir_entry_len() decoding
  de->rec_len with a hardcoded full-block size even when
  the entry lives in a smaller buffer (inline directory
  data). Gave it an explicit blocksize parameter and fixed
  every caller to pass the real containing-buffer size.
- dx_get_dx_info() and get_dx_countlimit() additionally
  needed dir=NULL (not just the right blocksize) when
  computing past the on-disk '.'/'..' entries, since those
  never carry the casefold+fscrypt hash regardless of the
  directory's feature flags; passing the real dir made
  ext4_dirent_rec_len() add 8 bytes of hash space that was
  never written on disk, corrupting dx_root_info's offset
  for every casefold+encrypt directory.
- ext4_dirdata_get()/ext4_dirdata_set(): fixed bounds checks
  that were off by EXT4_BASE_DIR_LEN (the 8-byte dirent
  header), a LUFID memcpy that used the wrong source/length,
  an out-of-bounds array write for maximum-length filenames,
  and an uninitialized gap byte leaking a stale memory byte
  to disk.
- EXT4_IOC_SET_LUFID: fixed ddh_length under-counting the
  header byte (silently dropping the last byte of every
  LUFID payload), rejected '.'/'..' as targets, added a
  missing inode_permission(dir, MAY_WRITE) check, and closed
  a data race on the shared i_dirdata field by also locking
  the target inode (not just the parent directory) for the
  duration it's used.
- Fixed a missing bounds check in ext4_dx_csum_verify()/
  ext4_dx_csum_set() that let an unvalidated on-disk `count`
  field drive an out-of-bounds checksum read. This bug
  predates this series, but is included as patch 1 (ahead
  of the patch that touches this function) since it was
  found via review of this series.
- Thanks to Sashiko AI review and to Xiao (xiaowu.417@qq.com)
  for reproducing several of the above with concrete crash
  logs and PoCs.
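
For readers who want to see the shape of the patch 1 fix, the
count-versus-limit check can be modelled in userspace as below. This is
a rough sketch under stated assumptions, not the code from the series:
model_dx_csum_span() and the two MODEL_* sizes are hypothetical names,
the 8-byte entry and tail sizes are assumptions of the model, and the
values are taken as already converted to host byte order.

```c
#include <stddef.h>

/* Assumed sizes for this model only; not taken from the patches. */
#define MODEL_DX_ENTRY_SIZE 8u
#define MODEL_DX_TAIL_SIZE  8u

/*
 * Return the number of bytes that may safely be checksummed for an
 * index block, or -1 if the on-disk count/limit values cannot be
 * trusted. The checksum covers count_offset bytes of header plus
 * `count` entries; `limit` entries and the tail must fit in the block.
 */
static long model_dx_csum_span(size_t count_offset, unsigned int limit,
                               unsigned int count, size_t blocksize)
{
    size_t room;

    /* The header and the checksum tail must both fit in the block. */
    if (blocksize < MODEL_DX_TAIL_SIZE ||
        count_offset > blocksize - MODEL_DX_TAIL_SIZE)
        return -1;

    /* `limit` entries must fit between the header and the tail. */
    room = blocksize - MODEL_DX_TAIL_SIZE - count_offset;
    if (limit > room / MODEL_DX_ENTRY_SIZE)
        return -1;

    /* The check described above: never trust count beyond limit. */
    if (count > limit)
        return -1;

    return (long)(count_offset + (size_t)count * MODEL_DX_ENTRY_SIZE);
}
```

Rejecting count > limit before computing the span is what keeps the
checksum read inside the block in this model; without it, a crafted
count would extend the span past the end of the buffer.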

Artem Blagodarenko (11):
  ext4: validate count against limit in ext4_dx_csum_verify/_set
  ext4: replace ext4_dir_entry with ext4_dir_entry_2
  ext4: add ext4_dir_entry_is_tail()
  ext4: refactor dx_root to support variable dirent sizes
  ext4: add dirdata format definitions and access helpers
  ext4: preserve dirdata bits in get_dtype()
  ext4: add ext4_dir_entry_len() and harden dirdata parsing
  ext4: rename ext4_dir_rec_len() and clarify dirdata usage
  ext4: dirdata feature
  ext4: add dirdata set/get helpers
  ext4: Add EXT4_IOC_SET_LUFID ioctl for setting LUFID on directory
    entries

 foofile.txt               |   0
 fs/ext4/dir.c             |   9 +-
 fs/ext4/ext4.h            | 211 +++++++++++-
 fs/ext4/inline.c          |  41 ++-
 fs/ext4/ioctl.c           |  84 +++++
 fs/ext4/namei.c           | 699 +++++++++++++++++++++++++++++---------
 fs/ext4/sysfs.c           |   2 +
 include/uapi/linux/ext4.h |  13 +
 8 files changed, 861 insertions(+), 198 deletions(-)
 create mode 100644 foofile.txt

-- 
2.43.7


* [PATCH v3 00/10] Data in direntry (dirdata) feature
@ 2026-06-19 19:10 Artem Blagodarenko
  2026-06-20  6:55 ` [syzbot ci] " syzbot ci
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Blagodarenko @ 2026-06-19 19:10 UTC
  To: linux-ext4; +Cc: adilger.kernel, Artem Blagodarenko, syzbot

EXT4 currently stores a hash in the directory entry
(dirent) immediately after the file name to support
simultaneous fscrypt and casefold functionality.

It has been discussed within the EXT4 community that
this hash could instead be stored in dirdata. This
would make it the second (or third, in the case of
64-bit inode counts) user of dirdata.

At the same time, the existing format—where the hash
is placed after the file name—must continue to be
supported. With these patches, EXT4 can handle the
hash in both formats.

The first user of this feature is LUFID -
Locally Unique File ID.

Support for fscrypt and case-insensitive directories
with dirdata enabled has been verified using a
dedicated xfstest submitted to the xfstests list as
a separate patch.

e2fsprogs support is provided in a separate patch
series.

Changes in v3:
- Fixed issues reported by automated review of v2:
  - dx_get_dx_info() and get_dx_countlimit() called
    ext4_dir_entry_len() with the directory inode
    hardcoded to NULL, forcing its blocksize fallback
    to 4096 regardless of the real filesystem blocksize.
    Both now pass the real inode through, and
    dx_get_dx_info() also rejects results that fall
    outside the directory block.
  - ext4_dirdata_get() declared a local "dfid" that
    shadowed the function's own "dfid" output parameter,
    so a requested LUFID copy never reached the caller's
    buffer. Renamed the local and fixed the copy.
  - ext4_dirdata_get()/ext4_dirdata_set() compared
    offsets against the raw on-disk rec_len instead of
    decoding it via ext4_rec_len_from_disk(), which is
    incorrect on big-endian hosts and mishandles the
    "0/65535 means full block" sentinel. Both now decode
    rec_len once and use the decoded value throughout.
  - EXT4_IOC_SET_LUFID deleted the existing directory
    entry before re-adding it with the new LUFID data;
    if the re-add failed, the inode was left with no
    directory entry at all. It now attempts to restore
    the original entry on failure, and loudly flags
    inode corruption if that also fails.
- syzbot ci tested the fix for these issues; per its
  request, this is being submitted with the corresponding
  Tested-by tag below.
- Rebased onto the latest codebase.
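
The rec_len decoding issue described above can be illustrated with a
small userspace model. This is only a sketch under stated assumptions:
model_rec_len_from_disk(), model_range_in_entry() and
MODEL_MAX_REC_LEN are hypothetical names, the field is assumed to be
16-bit little-endian on disk, and the "0/65535 means full block"
sentinel is applied unconditionally. The in-kernel
ext4_rec_len_from_disk() may differ in detail.

```c
#include <stdint.h>

/* Hypothetical stand-in for the sentinel value used in this model. */
#define MODEL_MAX_REC_LEN 65535u

/*
 * Decode a 16-bit little-endian rec_len as stored on disk. Assembling
 * the value byte by byte gives the same result on little-endian and
 * big-endian hosts, unlike comparing against the raw field.
 */
static unsigned int model_rec_len_from_disk(const uint8_t raw[2],
                                            unsigned int blocksize)
{
    unsigned int len = (unsigned int)raw[0] | ((unsigned int)raw[1] << 8);

    /* Assumed sentinel: 0 or 65535 means the entry spans the block. */
    if (len == 0 || len == MODEL_MAX_REC_LEN)
        return blocksize;
    return len;
}

/*
 * Check that the byte range [offset, offset + size) lies inside the
 * entry, using the decoded length. Returns 1 if it fits, 0 otherwise.
 */
static int model_range_in_entry(const uint8_t raw[2],
                                unsigned int blocksize,
                                unsigned int offset, unsigned int size)
{
    unsigned int rec_len = model_rec_len_from_disk(raw, blocksize);

    /* Written this way so offset + size cannot overflow. */
    return offset <= rec_len && size <= rec_len - offset;
}
```

Decoding once and reusing the decoded value for every bounds check, as
the changelog entry describes, avoids mixing raw and decoded lengths in
the same function.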

Artem Blagodarenko (10):
  ext4: replace ext4_dir_entry with ext4_dir_entry_2
  ext4: add ext4_dir_entry_is_tail()
  ext4: refactor dx_root to support variable dirent sizes
  ext4: add dirdata format definitions and access helpers
  ext4: preserve dirdata bits in get_dtype()
  ext4: add ext4_dir_entry_len() and harden dirdata parsing
  ext4: rename ext4_dir_rec_len() and clarify dirdata usage
  ext4: dirdata feature
  ext4: add dirdata set/get helpers
  ext4: Add EXT4_IOC_SET_LUFID ioctl for setting LUFID on directory
    entries

 foofile.txt               |   0
 fs/ext4/dir.c             |   9 +-
 fs/ext4/ext4.h            | 205 +++++++++++-
 fs/ext4/inline.c          |  37 ++-
 fs/ext4/ioctl.c           |  62 ++++
 fs/ext4/namei.c           | 650 ++++++++++++++++++++++++++++----------
 fs/ext4/sysfs.c           |   2 +
 include/uapi/linux/ext4.h |  13 +
 8 files changed, 780 insertions(+), 198 deletions(-)
 create mode 100644 foofile.txt

Tested-by: syzbot@syzkaller.appspotmail.com
-- 
2.43.7


* [PATCH v2 00/10] Data in direntry (dirdata) feature
@ 2026-06-10 15:24 Artem Blagodarenko
  2026-06-11 10:29 ` [syzbot ci] " syzbot ci
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Blagodarenko @ 2026-06-10 15:24 UTC
  To: linux-ext4; +Cc: adilger.kernel, Artem Blagodarenko, Andreas Dilger, syzbot

EXT4 currently stores a hash in the directory entry
(dirent) immediately after the file name to support
simultaneous fscrypt and casefold functionality.

It has been discussed within the EXT4 community that
this hash could instead be stored in dirdata. This
would make it the second (or third, in the case of
64-bit inode counts) user of dirdata.

At the same time, the existing format—where the hash
is placed after the file name—must continue to be
supported. With these patches, EXT4 can handle the
hash in both formats.

The first user of this feature is LUFID -
Locally Unique File ID.

Support for fscrypt and case-insensitive directories
with dirdata enabled has been verified using a
dedicated xfstest submitted to the xfstests list as
a separate patch.

e2fsprogs support is provided in a separate patch
series.

Changes in v2:
- Split the patch set into 10 smaller patches for
  easier reading and review.
- Added an IOCTL to set the LUFID for testing purposes.
  LUFIDs can be listed via debugfs. Corresponding support
  has been added in the related e2fsprogs series.
- Removed the dirdata mount option.
- Fixed the following issue:
  KASAN: slab-out-of-bounds read in __ext4_check_dir_entry
- Rebased onto the latest codebase.

Artem Blagodarenko (10):
  ext4: replace ext4_dir_entry with ext4_dir_entry_2
  ext4: add ext4_dir_entry_is_tail()
  ext4: refactor dx_root to support variable dirent sizes
  ext4: add dirdata format definitions and access helpers
  ext4: preserve dirdata bits in get_dtype()
  ext4: add ext4_dir_entry_len() and harden dirdata parsing
  ext4: rename ext4_dir_rec_len() and clarify dirdata usage
  ext4: dirdata feature
  ext4: add dirdata set/get helpers
  ext4: Add EXT4_IOC_SET_LUFID ioctl for setting LUFID on directory
    entries

 foofile.txt               |   0
 fs/ext4/dir.c             |   9 +-
 fs/ext4/ext4.h            | 205 ++++++++++++-
 fs/ext4/inline.c          |  37 ++-
 fs/ext4/ioctl.c           |  62 ++++
 fs/ext4/namei.c           | 587 +++++++++++++++++++++++++++-----------
 fs/ext4/sysfs.c           |   2 +
 include/uapi/linux/ext4.h |  13 +
 8 files changed, 723 insertions(+), 192 deletions(-)
 create mode 100644 foofile.txt

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Tested-by: syzbot@syzkaller.appspotmail.com
-- 
2.43.7


* [PATCH 0/3] Data in direntry (dirdata) feature
@ 2026-04-17 21:37 Artem Blagodarenko
  2026-04-18  6:47 ` [syzbot ci] " syzbot ci
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Blagodarenko @ 2026-04-17 21:37 UTC
  To: linux-ext4; +Cc: adilger.kernel, Artem Blagodarenko

EXT4 currently stores a hash in the directory entry
(dirent) immediately after the file name to support
simultaneous fscrypt and casefold functionality.

It has been discussed within the EXT4 community that
this hash could instead be stored in dirdata. This
would make it the second (or third, in the case of
64-bit inode counts) user of dirdata.

At the same time, the existing format—where the hash
is placed after the file name—must continue to be
supported. With these patches, EXT4 can handle the
hash in both formats.

The first user of this feature, LUFID, has been
tested in the Lustre filesystem backend (LDISKFS)
[1].

Support for fscrypt and case-insensitive directories
with dirdata enabled has been verified using a
dedicated xfstest submitted to the EXT4 community as
a separate patch.

e2fsprogs support is provided in a separate patch.

[1] https://review.whamcloud.com/c/fs/lustre-release/+/64439

Artem Blagodarenko (3):
  ext4: make dirdata work with metadata_csum
  ext4: add dirdata support structures and helpers
  ext4: dirdata feature

 fs/ext4/dir.c    |   9 +-
 fs/ext4/ext4.h   | 169 +++++++++++++++++++--
 fs/ext4/inline.c |  22 +--
 fs/ext4/namei.c  | 379 ++++++++++++++++++++++++++++++-----------------
 fs/ext4/super.c  |   4 +-
 fs/ext4/sysfs.c  |   2 +
 6 files changed, 422 insertions(+), 163 deletions(-)

-- 
2.43.5



Thread overview: 19+ messages
2026-06-24 13:36 [PATCH v4 00/11] Data in direntry (dirdata) feature Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 01/11] ext4: validate count against limit in ext4_dx_csum_verify/_set Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 02/11] ext4: replace ext4_dir_entry with ext4_dir_entry_2 Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 03/11] ext4: add ext4_dir_entry_is_tail() Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 04/11] ext4: refactor dx_root to support variable dirent sizes Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 05/11] ext4: add dirdata format definitions and access helpers Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 06/11] ext4: preserve dirdata bits in get_dtype() Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 07/11] ext4: add ext4_dir_entry_len() and harden dirdata parsing Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 08/11] ext4: rename ext4_dir_rec_len() and clarify dirdata usage Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 09/11] ext4: dirdata feature Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 10/11] ext4: add dirdata set/get helpers Artem Blagodarenko
2026-06-24 13:36 ` [PATCH v4 11/11] ext4: Add EXT4_IOC_SET_LUFID ioctl for setting LUFID on directory entries Artem Blagodarenko
2026-06-24 23:18 ` [syzbot ci] Re: Data in direntry (dirdata) feature syzbot ci
  -- strict thread matches above, loose matches on Subject: below --
2026-06-19 19:10 [PATCH v3 00/10] " Artem Blagodarenko
2026-06-20  6:55 ` [syzbot ci] " syzbot ci
2026-06-10 15:24 [PATCH v2 00/10] " Artem Blagodarenko
2026-06-11 10:29 ` [syzbot ci] " syzbot ci
2026-06-19 14:10   ` Artem Blagodarenko
2026-06-19 14:11     ` syzbot
2026-06-19 14:50     ` syzbot ci
2026-04-17 21:37 [PATCH 0/3] " Artem Blagodarenko
2026-04-18  6:47 ` [syzbot ci] " syzbot ci
