Re: [PATCH V5 2/9] dax/fsdev: fix multi-range offset in memory_failure handler

Linux CXL
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "John Groves" <john@jagalactic.com>
Cc: linux-cxl@vger.kernel.org
Subject: Re: [PATCH V5 2/9] dax/fsdev: fix multi-range offset in memory_failure handler
Date: Thu, 11 Jun 2026 17:51:35 +0000	[thread overview]
Message-ID: <20260611175136.356021F00893@smtp.kernel.org> (raw)
In-Reply-To: <0100019eb7bda506-6ba24207-b1c0-4eeb-9b04-61940cf3f80c-000000@email.amazonses.com>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

New issues:
- [Critical] Incorrect memory failure offset calculation for static DAX devices, causing the wrong filesystem blocks to be marked corrupted.

Pre-existing issues:
- [High] Use-After-Free in dax_pgoff_to_phys() due to concurrent reallocation of dev_dax->ranges array via sysfs.
--

commit 410f6bffae27c2c1b51ab4214947b135248b8594
Author: John Groves <John@Groves.net>

dax/fsdev: fix multi-range offset in memory_failure handler

This commit updates the offset calculation in the memory_failure handler for
multi-range DAX devices. It changes the logic to iterate over the device
pagemap's immutable ranges array instead of the potentially dynamically
reallocated dev_dax ranges array.

>     Walk the pagemap's own range array (pgmap->ranges[]) rather than
>     dev_dax->ranges[]. The pgmap copy is the immutable snapshot populated at
>     probe and is never mutated afterwards, whereas dev_dax->ranges[] can be
>     krealloc()'d by a concurrent sysfs mapping_store() (under dax_region_rwsem,
>     which this ->memory_failure callback does not hold). For dynamic devices the
>     two arrays are identical, so the reported offset is unchanged for the
>     multi-range case this targets.

[Severity: High]
This isn't a bug introduced by this patch, but since the commit message
mentions that dev_dax->ranges can be krealloc()'d by a concurrent sysfs
mapping_store(), does this expose a use-after-free risk in
dax_pgoff_to_phys()?

Unlike size_store() which checks if the device is bound to a driver,
mapping_store() appears to lack this check and can invoke
alloc_dev_dax_range() to reallocate the array while the device is
actively in use:

drivers/dax/bus.c:mapping_store() {
    ...
    rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
    ...
}

Meanwhile, filesystem I/O operations can locklessly iterate over the ranges
array:

drivers/dax/bus.c:dax_pgoff_to_phys() {
    ...
    for (i = 0; i < dev_dax->nr_range; i++) {
        struct dev_dax_range *dax_range = &dev_dax->ranges[i];
    ...
}

Could a concurrent sysfs write to the mapping attribute cause a
use-after-free if dax_pgoff_to_phys() is iterating the old ranges
array that was freed by krealloc()?

> diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c
> index 188b2526bee45..2c5de3d80a618 100644
> --- a/drivers/dax/fsdev.c
> +++ b/drivers/dax/fsdev.c

[ ... ]

> @@ -135,11 +135,26 @@ static void fsdev_clear_ops(void *data)
>   * The core mm code in free_zone_device_folio() handles the wake_up_var()
>   * directly for this memory type.
>   */
> +static u64 fsdev_pfn_to_offset(struct dev_pagemap *pgmap, unsigned long pfn)
> +{
> +	phys_addr_t phys = PFN_PHYS(pfn);
> +	u64 offset = 0;
> +
> +	for (int i = 0; i < pgmap->nr_range; i++) {
> +		struct range *range = &pgmap->ranges[i];
> +
> +		if (phys >= range->start && phys <= range->end)
> +			return offset + (phys - range->start);

[Severity: Critical]
Does this calculation incorrectly include the data_offset padding for static
DAX devices?

For static DAX devices, the device's physical range starts with a metadata
padding area (data_offset). The old code calculated the offset based on
dev_dax->ranges[0].range.start, which represents the start of the actual
user data.

By using pgmap->ranges[0].start instead, which includes this metadata
padding, the logical offset might be inflated by data_offset. Could this
cause the filesystem to receive an incorrect offset in
dax_holder_notify_failure(), potentially resulting in poisoning the wrong
filesystem blocks while leaving the actual corrupted blocks unmarked?

> +		offset += range_len(range);
> +	}
> +	return -1ULL;
> +}
> +

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/0100019eb7bcda4b-3f8edae9-d7a4-4bfa-aaea-fcef77fdbbc3-000000@email.amazonses.com?part=2

next prev parent reply	other threads:[~2026-06-11 17:51 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260611173057.65868-1-john@jagalactic.com>
2026-06-11 17:31 ` [PATCH V5 0/9] Fixes to the previously-merged drivers/dax/fsdev series John Groves
2026-06-11 17:31   ` [PATCH V5 1/9] dax: fix misleading comment about share/index union in dax_folio_reset_order() John Groves
2026-06-11 17:31   ` [PATCH V5 2/9] dax/fsdev: fix multi-range offset in memory_failure handler John Groves
2026-06-11 17:51     ` sashiko-bot [this message]
2026-06-12  3:08     ` Richard Cheng
2026-06-15 13:13       ` John Groves
2026-06-11 17:32   ` [PATCH V5 3/9] dax/fsdev: clear vmemmap_shift when binding static pgmap John Groves
2026-06-11 17:55     ` sashiko-bot
2026-06-12  2:56     ` Richard Cheng
2026-06-15 13:16       ` John Groves
2026-06-11 17:32   ` [PATCH V5 4/9] dax/fsdev: don't leave a dangling dev_dax->pgmap on probe failure John Groves
2026-06-11 18:04     ` sashiko-bot
2026-06-11 17:32   ` [PATCH V5 5/9] dax/fsdev: use __va(phys) for kaddr in direct_access John Groves
2026-06-11 17:32   ` [PATCH V5 6/9] dax/fsdev: fail probe on invalid pgmap offset John Groves
2026-06-11 18:09     ` Gupta, Pankaj
2026-06-15 13:23       ` John Groves
2026-06-11 18:13     ` sashiko-bot
2026-06-11 17:32   ` [PATCH V5 7/9] dax: read holder_ops once in dax_holder_notify_failure() John Groves
2026-06-11 18:13     ` sashiko-bot
2026-06-12  3:02     ` Richard Cheng
2026-06-15 13:22       ` John Groves
2026-06-11 17:32   ` [PATCH V5 8/9] dax: fix holder_ops race in fs_put_dax() John Groves
2026-06-11 18:28     ` sashiko-bot
2026-06-11 17:33   ` [PATCH V5 9/9] dax: fsdev.c minor formatting cleanup John Groves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260611175136.356021F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=john@jagalactic.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox