public inbox for linux-ext4@vger.kernel.org
* [SECURITY] e2fsprogs v1.47.4 Vulnerabilities — Orphan File & Extent Handling
@ 2026-04-03 11:29 4fqr
  2026-04-03 13:48 ` Theodore Tso
  2026-04-03 16:17 ` Andreas Dilger
  0 siblings, 2 replies; 5+ messages in thread
From: 4fqr @ 2026-04-03 11:29 UTC
  To: linux-ext4@vger.kernel.org


[-- Attachment #1.1: Type: text/plain, Size: 1349 bytes --]

linux-ext4@vger.kernel.org,

I'm disclosing three security vulnerabilities in e2fsprogs v1.47.4 affecting orphan file inode processing and extent tree validation. This follows responsible disclosure notification to the maintainer (Theodore Ts'o).

**Vulnerability Overview:**

F1 (CRITICAL): process_orphan_block() lacks inode range validation before calling release_orphan_inode(), allowing arbitrary inode destruction via crafted orphan file blocks.

F2 (HIGH): the pass1 dispatch chain lets s_orphan_file_inum alias reserved inodes (5, 6, 7, 9, or 10), bypassing the reserved-inode guard and enabling mode corruption on critical system inodes such as the resize inode.

F3 (MEDIUM): ext2fs_extent_fix_parents() contains an unsigned underflow where blk64_t subtraction truncates to __u32, corrupting parent extent length metadata.

**Technical Details:**

All three are exploitable via crafted .img files processed by e2fsck -y with no special privileges. Detailed technical report with code locations, attack scenarios, and exact fixes is attached.

**Timeline:**
- Primary maintainer contact: Theodore Ts'o (tytso@mit.edu)
- 90-day embargo from maintainer acknowledgment
- Kernel security team notified concurrently

Patches and coordination discussion will follow once the maintainer has reviewed.

Thanks,
4fqr
4fqr@proton.me
Attachment: e2fsprogs_audit_4fqr.txt

[-- Attachment #1.2: Type: text/html, Size: 2449 bytes --]

[-- Attachment #2: e2fsprogs_audit_4fqr.txt --]
[-- Type: text/plain, Size: 30030 bytes --]

================================================================================
                     e2fsprogs — SECURITY AUDIT REPORT
                     Target: v1.47.4  |  Date: 2026-04-03
================================================================================

  Auditor  : 4fqr
  Scope    : Full source tree — emphasis on e2fsck, libext2fs, orphan handling,
             extent tree code, and any path reachable by a malicious disk image.
  Method   : Static analysis + manual code review. No fuzzing performed.
  Threat   : Attacker supplies a crafted ext4 filesystem image.
             Victim runs  e2fsck -y  on it  (USB attach, cloud disk, VM image).

================================================================================
  EXECUTIVE SUMMARY
================================================================================

  Three confirmed vulnerabilities were found. Two are directly triggerable by
  a crafted filesystem image processed by e2fsck. The third (F2) becomes a
  destructive chain attack when combined with F1.

  The root cause in each case is the same class of mistake: on-disk values are
  trusted as valid indices or inode numbers without the same range-checks that
  are applied in older, parallel code paths.

  None of these require kernel exploitation, memory corruption primitives, or
  special privileges. A crafted .img file + "e2fsck -y image.img" is enough.

  Severity summary:
  ┌────┬───────────────────────────────────────────────────────┬──────────┐
  │ #  │ Title                                                 │ Severity │
  ├────┼───────────────────────────────────────────────────────┼──────────┤
  │ F1 │ Orphan file blocks: no inode range check before       │ CRITICAL │
  │    │ release_orphan_inode()                                │          │
  ├────┼───────────────────────────────────────────────────────┼──────────┤
  │ F2 │ s_orphan_file_inum aliases reserved inodes in pass1   │ HIGH     │
  │    │ dispatch; triggers destructive chain via resize inode │          │
  ├────┼───────────────────────────────────────────────────────┼──────────┤
  │ F3 │ ext2fs_extent_fix_parents(): __u32 += blk64_t         │ MEDIUM   │
  │    │ unsigned underflow corrupts parent extent length      │          │
  └────┴───────────────────────────────────────────────────────┴──────────┘

================================================================================
  BACKGROUND — HOW ORPHAN PROCESSING WORKS
================================================================================

  When ext4 needs to track inodes that must be cleaned up on next mount/fsck
  (e.g. unlinked files still open), it uses one of two mechanisms:

  LEGACY   — a singly-linked list threaded through s_last_orphan →
             inode.i_dtime → inode.i_dtime → ... → 0

  NEW      — an "orphan file": a dedicated hidden inode (s_orphan_file_inum)
             whose data blocks each contain an array of u32 inode numbers.
             Indicated by feature flags  orphan_file  +  orphan_present.

  e2fsck processes both at startup in  release_orphan_inodes()  (super.c:509),
  before Pass 1 runs — meaning before any inode bitmap, block bitmap, or inode
  table has been validated against the actual filesystem state.

  The function that does the actual work for both paths is:

      release_orphan_inode(ctx, &ino, block_buf)   [super.c:317]

  It reads the inode, frees its blocks, and — if i_links_count == 0 — marks
  the inode itself as free in the inode allocation bitmap.
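
  To make the orphan-file layout concrete, here is a minimal sketch of
  walking one orphan block as an array of little-endian u32 inode numbers.
  The helper names (le32_load, collect_orphan_entries) are hypothetical and
  are not e2fsprogs functions. The sketch ignores any trailer or checksum
  the real block format carries, and it performs no range check on the
  values it returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian u32 regardless of host order. */
static uint32_t le32_load(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: copy the non-zero entries of one orphan block
 * into out[] and return how many were found. A zero slot is unused.
 * out[] must have room for `entries` values.
 */
static size_t collect_orphan_entries(const unsigned char *block,
                                     size_t entries, uint32_t *out)
{
    size_t found = 0;

    assert(block != NULL && out != NULL);
    for (size_t j = 0; j < entries; j++) {
        uint32_t ino = le32_load(block + 4 * j);

        if (ino != 0)
            out[found++] = ino;   /* raw on-disk value, not yet validated */
    }
    return found;
}
```

  Every value this returns is still untrusted input. Whatever consumes the
  list has to decide which inode numbers are acceptable.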

================================================================================
  FINDING F1 — CRITICAL
  Missing inode range validation in process_orphan_block()
================================================================================

  File  : e2fsck/super.c
  Lines : 420 – 427  (vulnerable path)
           548 – 552  (the guard that exists for the legacy path, but NOT here)

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  VULNERABLE CODE                                                        │
  └─────────────────────────────────────────────────────────────────────────┘

  super.c:420
  ───────────────────────────────────────────────────────────────────────────
    bdata = (__u32 *)pd->buf;
    for (j = 0; j < inodes_per_ob; j++) {
        if (!bdata[j])
            continue;
        ino = ext2fs_le32_to_cpu(bdata[j]);          /* raw disk value     */
        if (release_orphan_inode(ctx, &ino, pd->block_buf))  /* NO CHECK   */
            goto return_abort;
        bdata[j] = 0;
    }
  ───────────────────────────────────────────────────────────────────────────

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  THE GUARD THAT PROTECTS THE LEGACY PATH (but is absent here)          │
  └─────────────────────────────────────────────────────────────────────────┘

  super.c:548
  ───────────────────────────────────────────────────────────────────────────
    /* Traditional orphan list: head inode is validated */
    if (ino && ((ino < EXT2_FIRST_INODE(fs->super)) ||
        (ino > fs->super->s_inodes_count))) {
        fix_problem(ctx, PR_0_ORPHAN_ILLEGAL_HEAD_INODE, &pctx);
        goto err_qctx;
    }
  ───────────────────────────────────────────────────────────────────────────

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  WHAT release_orphan_inode() DOES WITHOUT A VALID INODE                │
  └─────────────────────────────────────────────────────────────────────────┘

  super.c:317
  ───────────────────────────────────────────────────────────────────────────
    static int release_orphan_inode(e2fsck_t ctx, ext2_ino_t *ino, ...)
    {
        e2fsck_read_inode_full(ctx, *ino, ...);     /* reads inode from disk */

        next_ino = inode.i_dtime;
        if (next_ino &&
            ((next_ino < EXT2_FIRST_INODE(fs->super)) || ...))  /* NEXT is
            { return 1; }                                          checked,
                                                                   not *ino */
        if (release_inode_blocks(ctx, *ino, &inode, ...))       /* frees    */
            return 1;                                            /* blocks   */

        if (!inode.i_links_count) {
            ext2fs_inode_alloc_stats2(fs, *ino, -1, ...);  /* marks inode  */
            ctx->free_inodes++;                             /* as FREE in   */
            ext2fs_set_dtime(fs, ...);                      /* the bitmap   */
        }
        e2fsck_write_inode_full(ctx, *ino, ...);       /* writes back      */
    }
  ───────────────────────────────────────────────────────────────────────────

  Note carefully: inside release_orphan_inode, the range check at lines
  334–339 validates NEXT (i_dtime chain), not *ino itself. The current
  inode is never validated.

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  ATTACK SCENARIO                                                        │
  └─────────────────────────────────────────────────────────────────────────┘

  Craft an ext4 image with:

    1. Feature flags: orphan_file + orphan_present set in the superblock.
       Do NOT set metadata_csum — the block checksum is then skipped.
       (If metadata_csum is desired, set s_checksum_seed to any known value
        and compute the matching CRC32c — the attacker controls the seed.)

    2. A valid inode for s_orphan_file_inum (any regular non-reserved inode).
       Its data block(s) are attacker-controlled.

    3. The data block(s) filled with target inode numbers as little-endian
       u32 values. Useful targets with i_links_count == 0:
         - inode 7  (EXT2_RESIZE_INO)      → journal/block bitmaps freed
         - inode 9  (exclude bitmap inode)
         - inode 10 (EXT4_REPLICA_INO, reserved)
       Useful targets with i_links_count > 0 (blocks released, inode kept):
         - Any inode whose blocks you want freed (data destruction)

    4. Do NOT set EXT2_ERROR_FS in s_state — that flag causes
       release_orphan_inodes() to short-circuit before reaching this code.

  Result: e2fsck -y reads the image, enters release_orphan_inodes(), calls
  process_orphan_file(), reaches process_orphan_block(), and invokes
  release_inode_blocks() + ext2fs_inode_alloc_stats2() on each target inode
  — all before Pass 1 has validated a single bitmap.

  Impact: targeted inodes have their blocks freed and are marked available
  for reuse. The filesystem is silently corrupted. With EXT2_RESIZE_INO as
  a target, the block group descriptor tables are freed.

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  FIX                                                                    │
  └─────────────────────────────────────────────────────────────────────────┘

  Add the same guard that the legacy path has, immediately before the call
  to release_orphan_inode() in process_orphan_block():

      ino = ext2fs_le32_to_cpu(bdata[j]);
  +   if (ino < EXT2_FIRST_INODE(fs->super) ||
  +       ino > fs->super->s_inodes_count) {
  +       fix_problem(ctx, PR_0_ORPHAN_ILLEGAL_INODE, &pctx);
  +       goto return_abort;
  +   }
      if (release_orphan_inode(ctx, &ino, pd->block_buf))

================================================================================
  FINDING F2 — HIGH
  s_orphan_file_inum aliases reserved inodes in pass1 dispatch chain
================================================================================

  File  : e2fsck/pass1.c
  Lines : 1725 – 1879  (the dispatch chain)
           1853 – 1866  (the orphan-file branch — fires too early)

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  THE DISPATCH CHAIN IN PASS 1                                           │
  └─────────────────────────────────────────────────────────────────────────┘

  pass1.c evaluates the following else-if ladder for every inode:

    1725:  if   (ino == EXT2_BAD_INO)             → inode 1  — protected
    1773:  elif (ino == EXT2_ROOT_INO)             → inode 2  — protected
    1800:  elif (ino == EXT2_JOURNAL_INO)          → inode 8  — protected
    1826:  elif (quota_inum_is_reserved(fs, ino))  → inodes 3, 4 — protected
    1853:  elif (ino == s_orphan_file_inum)         ← ORPHAN FILE BRANCH
    1879:  elif (ino < EXT2_FIRST_INODE(fs->super))← catches 5,6,7,9,10

  If s_orphan_file_inum is set to 5, 6, 7, 9, or 10 in the superblock,
  the orphan-file branch at line 1853 fires BEFORE the generic reserved-inode
  guard at line 1879, so that guard never runs for the aliased inode.
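
  The ordering problem can be modelled with a toy version of the ladder.
  This is a simplified sketch, not the pass1.c code: classify_inode and the
  enum are hypothetical, the quota test is reduced to inodes 3 and 4 as in
  the listing above, and the lower_bound_guard flag models the lower-bound
  check that this report proposes for the orphan-file branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum pass1_branch {
    BR_BAD_INO, BR_ROOT_INO, BR_JOURNAL_INO, BR_QUOTA_INO,
    BR_ORPHAN_FILE, BR_RESERVED, BR_ORDINARY
};

/* Hypothetical model of the else-if ladder; first match wins. */
static enum pass1_branch classify_inode(uint32_t ino,
                                        uint32_t orphan_file_inum,
                                        uint32_t first_ino,
                                        bool lower_bound_guard)
{
    assert(ino != 0);   /* inode numbers start at 1 */
    if (ino == 1)
        return BR_BAD_INO;
    else if (ino == 2)
        return BR_ROOT_INO;
    else if (ino == 8)
        return BR_JOURNAL_INO;
    else if (ino == 3 || ino == 4)   /* stand-in for the quota check */
        return BR_QUOTA_INO;
    else if (ino == orphan_file_inum &&
             (!lower_bound_guard || ino >= first_ino))
        return BR_ORPHAN_FILE;       /* tested before the reserved guard */
    else if (ino < first_ino)
        return BR_RESERVED;
    return BR_ORDINARY;
}
```

  With s_orphan_file_inum = 7 and no guard, inode 7 lands in the orphan-file
  branch. With the guard enabled it falls through to the reserved branch.
  Inode 8 is unaffected either way because the journal branch is tested
  first.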

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  VULNERABLE CODE                                                        │
  └─────────────────────────────────────────────────────────────────────────┘

  pass1.c:1853
  ───────────────────────────────────────────────────────────────────────────
    } else if (ino == fs->super->s_orphan_file_inum) {
        ext2fs_mark_inode_bitmap2(ctx->inode_used_map, ino);
        if (ext2fs_has_feature_orphan_file(fs->super)) {
            if (!LINUX_S_ISREG(inode->i_mode) &&          /* mode check   */
                fix_problem(ctx, PR_1_ORPHAN_FILE_BAD_MODE, &pctx)) {
                inode->i_mode = LINUX_S_IFREG;             /* WRITES BACK  */
                e2fsck_write_inode(ctx, ino, inode, "pass1");
            }
            check_blocks(ctx, &pctx, block_buf, NULL);
            FINISH_INODE_LOOP(ctx, ino, &pctx, failed_csum);
            continue;   /* skips all reserved-inode validation */
        }
  ───────────────────────────────────────────────────────────────────────────

  With -y, fix_problem() returns 1 unconditionally. For any reserved inode
  that is not a regular file (resize inode has S_IFREG, others have mode=0),
  e2fsck would write LINUX_S_IFREG into that inode's i_mode field on disk.

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  THE DESTRUCTIVE CHAIN: s_orphan_file_inum = 7 (EXT2_RESIZE_INO)      │
  └─────────────────────────────────────────────────────────────────────────┘

  The resize inode (inode 7) is used for online filesystem expansion. Its
  data blocks are the per-group BLOCK BITMAP IMAGES — one block per group,
  containing bit-for-bit maps of allocated blocks.

  When process_orphan_file() runs on inode 7 (in release_orphan_inodes,
  before Pass 1), it iterates every data block of inode 7 and calls
  process_orphan_block() on each:

      /* process_orphan_block reads the block as an array of u32 inode#s */
      bdata = (__u32 *)pd->buf;               /* resize inode block data  */
      for (j = 0; j < inodes_per_ob; j++) {
          ino = ext2fs_le32_to_cpu(bdata[j]); /* bitmap words as inode#s  */
          release_orphan_inode(ctx, &ino, ...);
      }

  A per-group block bitmap for a dense filesystem is full of 0xFFFFFFFF
  words → inode# 4294967295 > s_inodes_count, harmless. But for a filesystem
  that is moderately allocated, the bitmaps contain words like 0x0000003F or
  0x000003FF — small inode numbers that ARE valid inodes. Those inodes get
  their blocks released and are marked free.

  On a semi-sparse crafted filesystem, the attacker can arrange exactly which
  words appear in the resize inode's blocks by controlling block allocation
  density, giving precise control over which inodes get wiped.

  This chain requires BOTH F1 and F2:
    F2 → process_orphan_file() is called on the resize inode's blocks
    F1 → each word in those blocks passes unchecked into release_orphan_inode

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  FIX                                                                    │
  └─────────────────────────────────────────────────────────────────────────┘

  Validate s_orphan_file_inum at superblock-check time and in the pass1
  dispatch chain. Two changes:

  (a) In e2fsck/super.c, when opening the orphan file, guard the inode number:

      orphan_inum = fs->super->s_orphan_file_inum;
  +   if (orphan_inum < EXT2_FIRST_INODE(fs->super) ||
  +       orphan_inum > fs->super->s_inodes_count) {
  +       /* fix_problem / clear orphan_file feature */
  +   }

  (b) In e2fsck/pass1.c, move the orphan-file else-if AFTER the
      ino < EXT2_FIRST_INODE guard (line 1879), or add an explicit
      lower-bound check:

      } else if (ino == fs->super->s_orphan_file_inum
  +              && ino >= EXT2_FIRST_INODE(fs->super)) {

================================================================================
  FINDING F3 — MEDIUM
  ext2fs_extent_fix_parents(): __u32 += blk64_t unsigned underflow
================================================================================

  File  : lib/ext2fs/extent.c
  Line  : 816

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  VULNERABLE CODE                                                        │
  └─────────────────────────────────────────────────────────────────────────┘

  extent.c:800
  ───────────────────────────────────────────────────────────────────────────
    /* modified node's start block */
    start = extent.e_lblk;           /* blk64_t: leaf's logical block      */
    ...
    while (handle->level > 0 &&
           (path->left == path->entries - 1)) {
        ext2fs_extent_get(handle, EXT2_EXTENT_UP, &extent);  /* go to parent*/
        if (extent.e_lblk == start)
            break;
        path = handle->path + handle->level;
        extent.e_len += (extent.e_lblk - start);  /* LINE 816              */
        extent.e_lblk = start;
        ext2fs_extent_replace(handle, 0, &extent);
    }
  ───────────────────────────────────────────────────────────────────────────

  Type breakdown:
    extent.e_len   is  __u32          (32-bit unsigned)
    extent.e_lblk  is  blk64_t        (64-bit unsigned)
    start          is  blk64_t        (64-bit unsigned)

  The function is designed for the case where a leaf's start block was moved
  EARLIER (smaller) — the parent index entry must extend to cover it:
    extent.e_lblk > start → subtraction positive → e_len grows. Correct.

  But if a crafted extent tree has a parent index entry whose e_lblk is
  LESS than the current leaf's e_lblk (a B-tree invariant violation that is
  not rejected on read), then:
    extent.e_lblk < start
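    → (extent.e_lblk - start) as blk64_t wraps to ~0ULL - delta + 1,
      where delta = start - extent.e_lblk
    → += on __u32 keeps only the low 32 bits, so e_len becomes
      (e_len - delta) mod 2^32: either too short or a huge wrapped value
    → extent_replace() writes the corrupted length back to disk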
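
  The arithmetic can be reproduced in isolation with fixed-width types. In
  this sketch uint32_t and uint64_t stand in for __u32 and blk64_t, and
  fix_parent_len is a hypothetical function that contains only the flagged
  expression, not the surrounding libext2fs logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the compound assignment on the flagged line. */
static uint32_t fix_parent_len(uint32_t e_len, uint64_t parent_lblk,
                               uint64_t start)
{
    /* 64-bit difference, then stored back into a 32-bit length. */
    e_len += (parent_lblk - start);
    return e_len;
}

/* Self-check covering the intended case and the two wrapped cases. */
static void check_fix_parent_len(void)
{
    /* Parent starts later than the leaf: length grows by 60. */
    assert(fix_parent_len(10, 100, 40) == 70);
    /* Parent starts earlier: length silently shrinks by 60. */
    assert(fix_parent_len(100, 40, 100) == 40);
    /* Shrinking below zero wraps around modulo 2^32. */
    assert(fix_parent_len(10, 40, 100) == 0xFFFFFFCEu);
}
```

  Unsigned wraparound is well defined in C, so the compiler raises no error
  and nothing traps at run time. The only symptom is the wrong length.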

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  REACHABILITY                                                           │
  └─────────────────────────────────────────────────────────────────────────┘

  ext2fs_extent_fix_parents() is called by:
    - ext2fs_extent_insert()           used throughout e2fsck and libext2fs
    - ext2fs_extent_set_bmap()         used during extent tree rebuild
    - rewrite_extent_replay()          e2fsck/extents.c — called in Pass 1E
                                        on every inode flagged for rebuild

  A crafted inode with an extent tree where a parent index's e_lblk < its
  leaf's e_lblk will reach this path during Pass 1E. The attacker does not
  need to corrupt the leaf-writing path — the malformed parent-child
  relationship in the on-disk image is sufficient.

  ┌─────────────────────────────────────────────────────────────────────────┐
  │  FIX                                                                    │
  └─────────────────────────────────────────────────────────────────────────┘

  Guard against the underflow with an explicit check before the arithmetic:

      if (extent.e_lblk == start)
          break;
      path = handle->path + handle->level;
  +   if (extent.e_lblk < start) {
  +       /* parent starts before child: malformed tree; bail */
  +       retval = EXT2_ET_EXTENT_INVALID_LENGTH;
  +       goto done;
  +   }
      extent.e_len += (extent.e_lblk - start);

================================================================================
  FINDINGS REVIEW
================================================================================

  The preliminary scan raised several additional issues. After careful review
  of the actual source, here is the verdict on each:

  ┌──────────────────────────────────┬──────────────────────────────────────┐
  │ Claim                            │ Verdict                              │
  ├──────────────────────────────────┼──────────────────────────────────────┤
  │ extents.c:96 — blk64_t overflow  │ NOT A BUG. Both __u32 values are     │
  │ in extent merge accumulation     │ promoted to blk64_t before addition. │
  │                                  │ The (1ULL<<32) check is correct.     │
  ├──────────────────────────────────┼──────────────────────────────────────┤
  │ icount.c:220 — sprintf overflow  │ NOT A BUG. Allocation at line 216    │
  │ with long tdb_dir path           │ is strlen(tdb_dir)+64; format string │
  │                                  │ produces at most +53 bytes. Safe.   │
  ├──────────────────────────────────┼──────────────────────────────────────┤
  │ dir_iterate.c — infinite loop    │ NOT EXPLOITABLE. rec_len < 8 check   │
  │ on rec_len = 0 or 1              │ at line 84 returns 0 immediately.    │
  │                                  │ Zero-length entries abort cleanly.   │
  ├──────────────────────────────────┼──────────────────────────────────────┤
  │ extents.c:241 — loop re-entry    │ BENIGN by design. The ex--; i--      │
  │ use-after-free in UNINIT split   │ pattern re-processes the same array  │
  │                                  │ slot (e_len reduced each pass). The  │
  │                                  │ e_len==0 guard at line 232 exits.    │
  ├──────────────────────────────────┼──────────────────────────────────────┤
  │ extent.c:1607 — save_length      │ NEEDS FUZZING. Context unclear from  │
  │ minus underflow in split_node    │ static analysis alone. Recommend     │
  │                                  │ targeted AFL++ run on this function. │
  └──────────────────────────────────┴──────────────────────────────────────┘

================================================================================
  ATTACK SURFACE NOTES
================================================================================

  WHY THIS MATTERS MORE THAN TYPICAL PARSER BUGS

  e2fsck is routinely run automatically:
    - systemd-fsck triggers it on dirty ext4 partitions at boot
    - Desktop OS automounters call it on USB drives
    - Cloud providers run it during disk attach / snapshot restore
    - CI pipelines often call "e2fsck -y" to clean up test images

  In all of these contexts, the filesystem image is the attack input and
  e2fsck runs as root. Filesystem-level bugs here can silently destroy
  data, corrupt on-disk metadata that the kernel later trusts, or (with
  further chaining) achieve privilege escalation via inode bitmap manipulation.

  THE TIMING OF ORPHAN PROCESSING IS CRITICAL

  release_orphan_inodes() is called at the very start of e2fsck, from
  check_super_block() — before Pass 1, before bitmaps are validated,
  before any consistency has been established. The attacker's inode
  destructions happen against an unvalidated filesystem state.

  CHECKSUMS DO NOT PROTECT YOU HERE

  Finding F1 is exploitable both with and without metadata_csum:
    - Without: checksum check is skipped entirely.
    - With:    attacker sets s_checksum_seed to any value, computes
               CRC32c(seed || ino || gen || blk || buf) correctly.
               The seed lives in the same superblock the attacker crafts.
  Checksums here protect against accidental corruption, not adversarial input.
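
  The point about seeds can be illustrated with a plain CRC32c. The sketch
  below chains a bitwise CRC32c over the fields in the order given by the
  formula above. It is an illustration under assumptions, not the libext2fs
  routine: crc32c_update and orphan_block_csum are hypothetical names, the
  field widths (32-bit ino and gen, 64-bit blk) are assumed, and the fields
  are hashed in host byte order for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32c (Castagnoli, reflected polynomial 0x82F63B78).
 * No pre- or post-inversion is applied, so calls can be chained.
 */
static uint32_t crc32c_update(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(data != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Hypothetical: seed || ino || gen || blk || buf, widths assumed. */
static uint32_t orphan_block_csum(uint32_t seed, uint32_t ino, uint32_t gen,
                                  uint64_t blk, const void *buf, size_t len)
{
    uint32_t crc = crc32c_update(seed, &ino, sizeof(ino));

    crc = crc32c_update(crc, &gen, sizeof(gen));
    crc = crc32c_update(crc, &blk, sizeof(blk));
    return crc32c_update(crc, buf, len);
}
```

  Nothing in this computation is secret. Anyone who knows the seed and the
  block contents can produce a matching value, which is why a CRC detects
  accidental damage but cannot authenticate the data.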

================================================================================
  QUICK REFERENCE — EXACT DIFF LOCATIONS
================================================================================

  F1  e2fsck/super.c       line 424      add range check before
                                          release_orphan_inode() call

  F2a e2fsck/super.c       ~line 446     validate s_orphan_file_inum
                                          >= EXT2_FIRST_INODE on entry
                                          to process_orphan_file()

  F2b e2fsck/pass1.c       line 1853     add lower-bound guard to the
                                          s_orphan_file_inum elif branch

  F3  lib/ext2fs/extent.c  line 816      guard against extent.e_lblk < start
                                          before the += arithmetic

================================================================================
  END OF REPORT
================================================================================

Thread overview: 5+ messages
2026-04-03 11:29 [SECURITY] e2fsprogs v1.47.4 Vulnerabilities — Orphan File & Extent Handling 4fqr
2026-04-03 13:48 ` Theodore Tso
2026-04-03 16:17 ` Andreas Dilger
2026-04-03 18:34   ` Theodore Tso
2026-04-03 23:11   ` Theodore Tso
