Re: [PATCH v5 1/3] md: suspend array when sync_action=reshape

Linux RAID subsystem development
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Chen Cheng" <chencheng@fnnas.com>
Cc: yukuai@fygo.io, linux-raid@vger.kernel.org
Subject: Re: [PATCH v5 1/3] md: suspend array when sync_action=reshape
Date: Mon, 22 Jun 2026 12:25:54 +0000	[thread overview]
Message-ID: <20260622122554.E43991F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260622121312.1775322-2-chencheng@fnnas.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] Unmatched mddev_resume() on mddev_lock() failure causes mddev->suspended underflow.
- [High] The retry loop bypasses array re-suspension but leaves the suspended flag set, causing unprotected reshape and double-resume.
- [High] Suspending the array before calling flush_work() creates a circular memory reclaim deadlock.

Pre-existing issues:
- [High] Cross-task corruption of PF_MEMALLOC_NOIO flags when mddev_suspend() and mddev_resume() are interleaved.
--

commit ec573079e939aa92bfb3fd89247846ae57883ef2
Author: Chen Cheng <chencheng@fnnas.com>

md: suspend array when sync_action=reshape

This patch suspends the md array before starting a reshape operation to
prevent new requests from allocating objects from the old pool during
the transition. The array is then resumed on exit.

> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 096bb64e87bd5..e139f36e30b92 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5258,26 +5258,37 @@ static ssize_t
>  action_store(struct mddev *mddev, const char *page, size_t len)
>  {
>  	int ret;
> +	bool suspended = false;
>  	enum sync_action action;
>  
>  	if (!mddev->pers || !mddev->pers->sync_request)
>  		return -EINVAL;
>  
> +	action = md_sync_action_by_name(page);
> +	if (action == ACTION_RESHAPE) {
> +		ret = mddev_suspend(mddev, true);

[Severity: High]
This is a pre-existing issue, but does calling mddev_suspend() and
mddev_resume() from concurrent contexts risk corrupting the PF_MEMALLOC_NOIO
task flags?

When mddev_suspend() is called, it stores the task-specific NOIO flag into
the global mddev structure:

drivers/md/md.c:mddev_suspend() {
    ...
    mddev->noio_flag = memalloc_noio_save();
    ...
}

If two tasks concurrently enter mddev_suspend(), the second task increments
the suspend counter without overwriting mddev->noio_flag. If the first task
then resumes (e.g., on a lock failure) it decrements the counter but returns 
early because the array is still suspended:

drivers/md/md.c:__mddev_resume() {
    if (mddev->suspended) {
        mutex_unlock(&mddev->suspend_mutex);
        return;
    }
    ...
    memalloc_noio_restore(mddev->noio_flag);
}

This prevents the first task from restoring its NOIO flag. When the second
task later resumes and drops the counter to 0, it incorrectly restores the
first task's flags onto its own context. Could this leave the first task
stuck in NOIO mode indefinitely?

> +		if (ret)
> +			return ret;
> +		suspended = true;
> +	}
>  retry:
>  	if (work_busy(&mddev->sync_work))
>  		flush_work(&mddev->sync_work);

[Severity: High]
Can holding the array suspended while synchronously waiting for the worker
thread expose the worker's memory allocations to a circular deadlock?

If md_start_sync() allocates memory with GFP_KERNEL in the kworker thread
(for example, when calling md_register_thread()), it can trigger direct
memory reclaim. Because the worker thread did not transition the array to
suspended itself, it does not inherit the PF_MEMALLOC_NOIO flag.

Consequently, memory reclaim can issue I/O to the md device. Since the array
is now suspended, md_handle_request() will block this I/O indefinitely. The
worker thread hangs waiting for I/O, and action_store() hangs here in
flush_work() waiting for the worker thread, permanently deadlocking the
array. Is it safe to suspend the array before flushing the work?

>  
>  	ret = mddev_lock(mddev);
> -	if (ret)
> +	if (ret) {
> +		mddev_resume(mddev);
>  		return ret;
> +	}

[Severity: High]
Does this unconditional mddev_resume() risk underflowing the mddev->suspended
counter if the array was not actually suspended?

If action_store() is called with an action other than ACTION_RESHAPE, the
array is not suspended. If the subsequent mddev_lock() fails (e.g., if it is
interrupted by a signal), the error path unconditionally calls mddev_resume().

Since the array was not suspended, __mddev_resume() decrements
mddev->suspended from 0 to -1. Because mddev->suspended is a signed integer
and -1 evaluates to true, subsequent calls to mddev_suspend() will see
mddev->suspended != 0, increment it back to 0, and return immediately without
actually pausing I/O. Could this permanently break the suspension mechanism
for the array?

>  
>  	if (work_busy(&mddev->sync_work)) {
> -		mddev_unlock(mddev);
> +		if (suspended)
> +			mddev_unlock_and_resume(mddev);
> +		else
> +			mddev_unlock(mddev);
>  		goto retry;
>  	}

[Severity: High]
Will the retry loop bypass array re-suspension while leaving the local
suspended flag set to true?

If work_busy(&mddev->sync_work) is true, the code unlocks and resumes the
array, then jumps back to the retry label. However, the retry label is placed
after the mddev_suspend() block. 

The code will not re-suspend the array, but the local 'suspended' boolean
remains true. As a result, the reshape operation proceeds on an active,
unsuspended array. Furthermore, when the function exits, it will call
mddev_unlock_and_resume() again because 'suspended' is true, leading to a
double-resume and another mddev->suspended underflow.

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260622121312.1775322-1-chencheng@fnnas.com?part=1

next prev parent reply	other threads:[~2026-06-22 12:25 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-22 12:13 [PATCH v5 0/3] md/raid10: fix r10bio width mismatches across reshape Chen Cheng
2026-06-22 12:13 ` [PATCH v5 1/3] md: suspend array when sync_action=reshape Chen Cheng
2026-06-22 12:25   ` sashiko-bot [this message]
2026-06-22 12:13 ` [PATCH v5 2/3] md/raid10: resize r10bio_pool for reshape Chen Cheng
2026-06-22 12:42   ` sashiko-bot
2026-06-22 12:13 ` [PATCH v5 3/3] md/raid10: free r10bio before ending master_bio in raid_end_bio_io() Chen Cheng
2026-06-22 12:29   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260622122554.E43991F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=chencheng@fnnas.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=yukuai@fygo.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox