From: sashiko-bot@kernel.org
To: "John Groves" <john@jagalactic.com>
Cc: linux-cxl@vger.kernel.org
Subject: Re: [PATCH V4 7/9] dax: fix holder_ops race in fs_put_dax()
Date: Sun, 07 Jun 2026 19:49:16 +0000 [thread overview]
Message-ID: <20260607194917.1CE561F00893@smtp.kernel.org> (raw)
In-Reply-To: <0100019ea3941018-519230fa-2897-41b8-9677-dabc8d1124ca-000000@email.amazonses.com>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] fs_put_dax() unconditionally clears dax_dev->holder_ops before verifying ownership, corrupting the operations of a legitimate holder if called incorrectly.
Pre-existing issues:
- [High] dax_holder_notify_failure() reads dax_dev->holder_ops without READ_ONCE(), allowing compiler double-fetches and a TOCTOU NULL pointer dereference.
- [High] fs_dax_get() drops the SRCU read lock before modifying device state, allowing it to race with kill_dax() and acquire a dead device.
- [High] fs_put_dax() clears holder information but fails to wait for pre-existing SRCU readers, leading to a UAF or NULL pointer dereference in notify callbacks.
- [High] kill_dax() checks holder_data before clearing DAXDEV_ALIVE, creating a TOCTOU race where a newly attached holder is wiped without notification.
--
commit c385d20d67e08b6d8549b98dcf43d6d8a692ac06
Author: John Groves <John@Groves.net>
dax: fix holder_ops race in fs_put_dax()
Clear holder_ops before holder_data so that a concurrent fs_dax_get()
cannot have its newly installed holder_ops overwritten. cmpxchg()
provides release ordering on weakly-ordered architectures, ensuring the
WRITE_ONCE(holder_ops, NULL) store is visible to any CPU that observes
the holder_data release.
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 25cf99dd9360b..96f778dcde50b 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -116,11 +116,47 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
[ ... ]
> + /*
> + * Clear holder_ops before releasing holder_data. A concurrent
> + * dax_holder_notify_failure() that sees NULL ops returns
> + * -EOPNOTSUPP cleanly. A concurrent fs_dax_get() that acquires
[Severity: High]
This isn't a bug introduced by this patch, but this comment relies on a flawed
behavior in the failure notification path.
Looking at dax_holder_notify_failure():
drivers/dax/super.c:dax_holder_notify_failure() {
if (!dax_dev->holder_ops) {
rc = -EOPNOTSUPP;
goto out;
}
rc = dax_dev->holder_ops->notify_failure(dax_dev, off, len, mf_flags);
}
Since holder_ops is read without the READ_ONCE macro, can the compiler emit
two separate memory loads? If fs_put_dax() sets holder_ops to NULL between
the check and the indirect function call, it seems the second load would read
NULL, resulting in a NULL pointer dereference.
> + * holder_data after the cmpxchg below is guaranteed to observe
> + * holder_ops=NULL first (cmpxchg provides release ordering), so
> + * its subsequent store of new ops will not be overwritten.
> + */
> + WRITE_ONCE(dax_dev->holder_ops, NULL);
> + prev = cmpxchg(&dax_dev->holder_data, holder, NULL);
[Severity: High]
Does this unconditionally clear the holder operations before verifying
ownership?
If an errant caller invokes fs_put_dax() with a non-matching holder, the
cmpxchg will correctly fail and preserve holder_data. However, the legitimate
holder's operations have already been overwritten with NULL here. Could this
corrupt the rightful holder's state and silently break future memory failure
notifications?
[ ... ]
> + WARN_ON(prev && prev != holder);
> + }
> put_dax(dax_dev);
> }
[Severity: High]
This is a pre-existing issue, but does fs_put_dax() need to synchronize with
SRCU readers before returning?
When a filesystem unmounts, it calls fs_put_dax() to clear holder_data and
holder_ops, but returns without calling synchronize_srcu.
If dax_holder_notify_failure() is executing asynchronously, it retrieves the
mount pointer:
fs/xfs/xfs_notify_failure.c:xfs_dax_notify_failure() {
struct xfs_mount *mp = dax_holder(dax_dev);
if (!(mp->m_super->s_flags & SB_BORN)) {
...
}
}
If fs_put_dax() has just cleared holder_data, dax_holder() returns NULL, and
this immediately dereferences the NULL pointer. Could this cause a crash on
unmount?
[Severity: High]
This is a pre-existing issue, but looking at fs_dax_get() which is related to
these ownership changes:
drivers/dax/super.c:fs_dax_get() {
id = dax_read_lock();
if (!dax_dev || !dax_alive(dax_dev) || !igrab(&dax_dev->inode)) {
dax_read_unlock(id);
return -ENODEV;
}
dax_read_unlock(id);
...
if (cmpxchg(&dax_dev->holder_data, NULL, holder))
}
Does dropping the SRCU read lock before modifying the device state allow a
race with kill_dax()?
If kill_dax() executes concurrently, clears the alive flag, and wipes
holder_data, could the cmpxchg in fs_dax_get() blindly succeed and set
operations on a dead device?
[Severity: High]
This is a pre-existing issue, but could there be a race during device removal
in kill_dax()?
drivers/dax/super.c:kill_dax() {
if (dax_dev->holder_data != NULL)
dax_holder_notify_failure(dax_dev, 0, U64_MAX,
MF_MEM_PRE_REMOVE);
clear_bit(DAXDEV_ALIVE, &dax_dev->flags);
}
Because holder_data is checked before clearing the alive flag, could a
new filesystem execute fs_dax_get_by_bdev() concurrently, see dax_alive() is
true, and successfully register its holder_data?
If kill_dax() then unconditionally wipes holder_data and holder_ops after
this, the new filesystem would miss the pre-remove notification entirely.
--
Sashiko AI review · https://sashiko.dev/#/patchset/0100019ea3929225-a0f8e6f7-30ae-4f8e-ae6f-19129666c4c3-000000@email.amazonses.com?part=7
next prev parent reply other threads:[~2026-06-07 19:49 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260607193224.94244-1-john@jagalactic.com>
2026-06-07 19:32 ` [PATCH V4 0/9] Fixes to the previously-merged drivers/dax/fsdev series John Groves
2026-06-07 19:33 ` [PATCH V4 1/9] dax: fix misleading comment about share/index union in dax_folio_reset_order() John Groves
2026-06-07 19:45 ` sashiko-bot
2026-06-07 19:33 ` [PATCH V4 2/9] dax/fsdev: fix multi-range offset in memory_failure handler John Groves
2026-06-07 19:49 ` sashiko-bot
2026-06-08 10:56 ` Richard Cheng
2026-06-07 19:33 ` [PATCH V4 3/9] dax/fsdev: clear vmemmap_shift when binding static pgmap John Groves
2026-06-07 19:49 ` sashiko-bot
2026-06-07 19:33 ` [PATCH V4 4/9] dax/fsdev: don't leave a dangling dev_dax->pgmap on probe failure John Groves
2026-06-07 19:44 ` sashiko-bot
2026-06-07 19:33 ` [PATCH V4 5/9] dax/fsdev: use __va(phys) for kaddr in direct_access John Groves
2026-06-07 19:44 ` sashiko-bot
2026-06-07 19:34 ` [PATCH V4 6/9] dax/fsdev: fail probe on invalid pgmap offset John Groves
2026-06-07 19:43 ` sashiko-bot
2026-06-07 19:34 ` [PATCH V4 7/9] dax: fix holder_ops race in fs_put_dax() John Groves
2026-06-07 19:49 ` sashiko-bot [this message]
2026-06-08 10:52 ` Richard Cheng
2026-06-07 19:34 ` [PATCH V4 8/9] dax: replace exported dax_dev_get() with non-allocating dax_dev_find() John Groves
2026-06-07 19:49 ` sashiko-bot
2026-06-08 10:48 ` Richard Cheng
2026-06-07 19:34 ` [PATCH V4 9/9] dax: fsdev.c minor formatting cleanup John Groves
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260607194917.1CE561F00893@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=john@jagalactic.com \
--cc=linux-cxl@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.