Re: [PATCH V4 7/9] dax: fix holder_ops race in fs_put_dax()

From: sashiko-bot@kernel.org
To: "John Groves" <john@jagalactic.com>
Cc: linux-cxl@vger.kernel.org
Subject: Re: [PATCH V4 7/9] dax: fix holder_ops race in fs_put_dax()
Date: Sun, 07 Jun 2026 19:49:16 +0000
Message-ID: <20260607194917.1CE561F00893@smtp.kernel.org>
In-Reply-To: <0100019ea3941018-519230fa-2897-41b8-9677-dabc8d1124ca-000000@email.amazonses.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

New issues:
- [High] fs_put_dax() unconditionally clears dax_dev->holder_ops before verifying ownership, corrupting the operations of a legitimate holder if called incorrectly.

Pre-existing issues:
- [High] dax_holder_notify_failure() reads dax_dev->holder_ops without READ_ONCE(), allowing compiler double-fetches and a TOCTOU NULL pointer dereference.
- [High] fs_dax_get() drops the SRCU read lock before modifying device state, allowing it to race with kill_dax() and acquire a dead device.
- [High] fs_put_dax() clears holder information but fails to wait for pre-existing SRCU readers, leading to a UAF or NULL pointer dereference in notify callbacks.
- [High] kill_dax() checks holder_data before clearing DAXDEV_ALIVE, creating a TOCTOU race where a newly attached holder is wiped without notification.
--

commit c385d20d67e08b6d8549b98dcf43d6d8a692ac06
Author: John Groves <John@Groves.net>

dax: fix holder_ops race in fs_put_dax()

Clear holder_ops before holder_data so that a concurrent fs_dax_get()
cannot have its newly installed holder_ops overwritten. cmpxchg()
provides release ordering on weakly-ordered architectures, ensuring the
WRITE_ONCE(holder_ops, NULL) store is visible to any CPU that observes
the holder_data release.

> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 25cf99dd9360b..96f778dcde50b 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -116,11 +116,47 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);

[ ... ]

> +		/*
> +		 * Clear holder_ops before releasing holder_data. A concurrent
> +		 * dax_holder_notify_failure() that sees NULL ops returns
> +		 * -EOPNOTSUPP cleanly. A concurrent fs_dax_get() that acquires

[Severity: High]
This isn't a bug introduced by this patch, but this comment relies on flawed
behavior in the failure notification path.

Looking at dax_holder_notify_failure():

drivers/dax/super.c:dax_holder_notify_failure() {
    if (!dax_dev->holder_ops) {
        rc = -EOPNOTSUPP;
        goto out;
    }

    rc = dax_dev->holder_ops->notify_failure(dax_dev, off, len, mf_flags);
}

Since holder_ops is read without READ_ONCE(), can the compiler emit two
separate memory loads? If fs_put_dax() sets holder_ops to NULL between the
check and the indirect function call, it seems the second load would read
NULL, resulting in a NULL pointer dereference.
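
A minimal userspace sketch of the single-load pattern this question points
toward. The struct and function names ending in _sketch are hypothetical
stand-ins, not the kernel definitions, and a relaxed C11 atomic load is assumed
here to play the role READ_ONCE() would play in the kernel:

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct holder_ops_sketch {
	int (*notify_failure)(void *dev, unsigned long long off,
			      unsigned long long len, int mf_flags);
};

struct dax_dev_sketch {
	_Atomic(const struct holder_ops_sketch *) holder_ops;
};

/*
 * Load holder_ops exactly once into a local, then both test and call
 * through that local, so a concurrent store of NULL cannot slip in
 * between the check and the indirect call.
 */
static int notify_failure_sketch(struct dax_dev_sketch *dev,
				 unsigned long long off,
				 unsigned long long len, int mf_flags)
{
	const struct holder_ops_sketch *ops =
		atomic_load_explicit(&dev->holder_ops, memory_order_relaxed);

	if (!ops)
		return -EOPNOTSUPP;

	return ops->notify_failure(dev, off, len, mf_flags);
}

/* Trivial handler used only by the demo below. */
static int ok_handler(void *dev, unsigned long long off,
		      unsigned long long len, int mf_flags)
{
	(void)dev; (void)off; (void)len; (void)mf_flags;
	return 0;
}

static const struct holder_ops_sketch demo_ops = {
	.notify_failure = ok_handler,
};

/* Exercises both paths: no ops installed, then ops installed. */
static int demo_snapshot(void)
{
	struct dax_dev_sketch dev;

	atomic_init(&dev.holder_ops, NULL);
	assert(notify_failure_sketch(&dev, 0, 1, 0) == -EOPNOTSUPP);

	atomic_store(&dev.holder_ops, &demo_ops);
	assert(notify_failure_sketch(&dev, 0, 1, 0) == 0);
	return 0;
}
```

The snapshot only removes the double fetch. It says nothing about how long the
object behind ops stays valid, which is the separate SRCU question raised
further down.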

> +		 * holder_data after the cmpxchg below is guaranteed to observe
> +		 * holder_ops=NULL first (cmpxchg provides release ordering), so
> +		 * its subsequent store of new ops will not be overwritten.
> +		 */
> +		WRITE_ONCE(dax_dev->holder_ops, NULL);
> +		prev = cmpxchg(&dax_dev->holder_data, holder, NULL);

[Severity: High]
Does this unconditionally clear the holder operations before verifying 
ownership?

If an errant caller invokes fs_put_dax() with a non-matching holder, the 
cmpxchg will correctly fail and preserve holder_data. However, the legitimate 
holder's operations have already been overwritten with NULL here. Could this 
corrupt the rightful holder's state and silently break future memory failure 
notifications?
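
One possible shape for an ownership check ahead of the clear, as a userspace
sketch only. All names ending in _sketch are hypothetical, C11 atomics stand in
for READ_ONCE()/WRITE_ONCE()/cmpxchg(), the -EINVAL return is an arbitrary
choice for illustration, and the sketch assumes that only the current holder
ever releases its own claim:

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct holder_ops_sketch {
	int (*notify_failure)(void *dev);
};

struct dax_dev_sketch {
	_Atomic(void *) holder_data;
	_Atomic(const struct holder_ops_sketch *) holder_ops;
};

/*
 * Refuse to touch holder_ops unless the caller is the registered holder.
 * Only then clear ops, followed by the compare-and-swap that releases
 * holder_data, keeping the ops-before-data order the patch describes.
 */
static int put_holder_sketch(struct dax_dev_sketch *dev, void *holder)
{
	void *expected = holder;

	if (atomic_load(&dev->holder_data) != holder)
		return -EINVAL;	/* not the owner: leave everything alone */

	atomic_store(&dev->holder_ops, NULL);
	if (!atomic_compare_exchange_strong(&dev->holder_data, &expected, NULL))
		return -EINVAL;

	return 0;
}

/* A stranger's put is rejected; the owner's put clears both fields. */
static int demo_put_checks_owner(void)
{
	static const struct holder_ops_sketch ops = { NULL };
	struct dax_dev_sketch dev;
	int owner, stranger;

	atomic_init(&dev.holder_data, (void *)&owner);
	atomic_init(&dev.holder_ops, &ops);

	assert(put_holder_sketch(&dev, &stranger) == -EINVAL);
	assert(atomic_load(&dev.holder_ops) == &ops);
	assert(atomic_load(&dev.holder_data) == (void *)&owner);

	assert(put_holder_sketch(&dev, &owner) == 0);
	assert(atomic_load(&dev.holder_ops) == NULL);
	assert(atomic_load(&dev.holder_data) == NULL);
	return 0;
}
```

Whether a pre-check like this is acceptable alongside the ordering argument in
the patch comment is a question for the patch author; the sketch only shows
that the mismatched-holder case can leave holder_ops intact.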

[ ... ]

> +		WARN_ON(prev && prev != holder);
> +	}
>  	put_dax(dax_dev);
>  }

[Severity: High]
This is a pre-existing issue, but does fs_put_dax() need to synchronize with
SRCU readers before returning?

When a filesystem unmounts, it calls fs_put_dax() to clear holder_data and
holder_ops, but returns without calling synchronize_srcu().

If dax_holder_notify_failure() is running concurrently, the holder's callback
retrieves the mount pointer:

fs/xfs/xfs_notify_failure.c:xfs_dax_notify_failure() {
    struct xfs_mount *mp = dax_holder(dax_dev);

    if (!(mp->m_super->s_flags & SB_BORN)) {
        ...
    }
}

If fs_put_dax() has just cleared holder_data, dax_holder() returns NULL, and
the mp->m_super access immediately dereferences the NULL pointer. Could this
cause a crash on unmount?
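
For the NULL half of this concern, a callback-side guard could look roughly
like the userspace sketch below. Every type and name ending in _sketch is a
hypothetical stand-in, SB_BORN_SKETCH is a made-up flag value, and the error
code on the not-born path is an arbitrary placeholder because the excerpt above
elides that branch:

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
#define SB_BORN_SKETCH (1UL << 0)

struct super_block_sketch { unsigned long s_flags; };
struct mount_sketch { struct super_block_sketch *m_super; };
struct dax_dev_sketch { _Atomic(void *) holder_data; };

/*
 * Fetch the holder once and bail out if it has already been cleared,
 * instead of dereferencing it unconditionally.
 */
static int notify_failure_cb_sketch(struct dax_dev_sketch *dev)
{
	struct mount_sketch *mp = atomic_load(&dev->holder_data);

	if (!mp)
		return -EOPNOTSUPP;	/* holder already released */

	if (!(mp->m_super->s_flags & SB_BORN_SKETCH))
		return -EIO;		/* placeholder error (assumption) */

	return 0;
}

/* Covers the cleared-holder path and the normal path. */
static int demo_null_guard(void)
{
	struct super_block_sketch sb = { .s_flags = SB_BORN_SKETCH };
	struct mount_sketch mnt = { .m_super = &sb };
	struct dax_dev_sketch dev;

	atomic_init(&dev.holder_data, NULL);
	assert(notify_failure_cb_sketch(&dev) == -EOPNOTSUPP);

	atomic_store(&dev.holder_data, (void *)&mnt);
	assert(notify_failure_cb_sketch(&dev) == 0);
	return 0;
}
```

A guard like this covers only the NULL dereference. It does not keep the mount
structure alive while a callback is still running, so the use-after-free half
of the concern would still depend on waiting for readers before teardown.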

[Severity: High]
This is a pre-existing issue, but looking at fs_dax_get() which is related to
these ownership changes:

drivers/dax/super.c:fs_dax_get() {
    id = dax_read_lock();
    if (!dax_dev || !dax_alive(dax_dev) || !igrab(&dax_dev->inode)) {
        dax_read_unlock(id);
        return -ENODEV;
    }
    dax_read_unlock(id);
    ... 
    if (cmpxchg(&dax_dev->holder_data, NULL, holder))
}

Does dropping the SRCU read lock before modifying the device state allow a
race with kill_dax()?

If kill_dax() executes concurrently, clears the alive flag, and wipes 
holder_data, could the cmpxchg in fs_dax_get() blindly succeed and set 
operations on a dead device?
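
One way to picture closing this window is to re-check liveness after the claim
and roll back if the device died in between. The userspace sketch below is an
illustration under assumptions, not the kernel's code: names ending in _sketch
are hypothetical, the error codes are arbitrary, and race_hook_t is a made-up
test hook that simulates kill_dax() running between the check and the claim.
Holding the read-side lock across the cmpxchg would be a different option that
is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel structure (assumption). */
struct dax_dev_sketch {
	atomic_bool alive;
	_Atomic(void *) holder_data;
};

/* Hypothetical hook run between the liveness check and the claim. */
typedef void (*race_hook_t)(struct dax_dev_sketch *dev);

static int get_holder_sketch(struct dax_dev_sketch *dev, void *holder,
			     race_hook_t hook)
{
	void *expected = NULL;

	if (!atomic_load(&dev->alive))
		return -ENODEV;

	if (hook)
		hook(dev);	/* simulated concurrent teardown */

	if (!atomic_compare_exchange_strong(&dev->holder_data, &expected, holder))
		return -EBUSY;

	/* Re-check after the claim; undo it if the device died meanwhile. */
	if (!atomic_load(&dev->alive)) {
		expected = holder;
		atomic_compare_exchange_strong(&dev->holder_data, &expected, NULL);
		return -ENODEV;
	}
	return 0;
}

/* Mimics teardown: mark the device dead, then wipe the holder. */
static void kill_hook_sketch(struct dax_dev_sketch *dev)
{
	atomic_store(&dev->alive, false);
	atomic_store(&dev->holder_data, NULL);
}

static int demo_get_recheck(void)
{
	struct dax_dev_sketch dev;
	int holder;

	/* Teardown lands inside the window: the claim is rolled back. */
	atomic_init(&dev.alive, true);
	atomic_init(&dev.holder_data, NULL);
	assert(get_holder_sketch(&dev, &holder, kill_hook_sketch) == -ENODEV);
	assert(atomic_load(&dev.holder_data) == NULL);

	/* No teardown: the claim sticks. */
	atomic_store(&dev.alive, true);
	assert(get_holder_sketch(&dev, &holder, NULL) == 0);
	assert(atomic_load(&dev.holder_data) == (void *)&holder);
	return 0;
}
```

The re-check only prevents a caller from keeping a claim on a device it can
already see is dead. A teardown that starts after the re-check is the separate
ordering question raised for kill_dax() below.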

[Severity: High]
This is a pre-existing issue, but could there be a race during device removal
in kill_dax()?

drivers/dax/super.c:kill_dax() {
    if (dax_dev->holder_data != NULL)
        dax_holder_notify_failure(dax_dev, 0, U64_MAX,
                MF_MEM_PRE_REMOVE);

    clear_bit(DAXDEV_ALIVE, &dax_dev->flags);
}

Because holder_data is checked before clearing the alive flag, could a 
new filesystem execute fs_dax_get_by_bdev() concurrently, see dax_alive() is
true, and successfully register its holder_data? 

If kill_dax() then unconditionally wipes holder_data and holder_ops after
this, the new filesystem would miss the pre-remove notification entirely.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/0100019ea3929225-a0f8e6f7-30ae-4f8e-ae6f-19129666c4c3-000000@email.amazonses.com?part=7

