linux-raid.vger.kernel.org archive mirror
From: Logan Gunthorpe <logang@deltatee.com>
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	Song Liu <song@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Donald Buczek <buczek@molgen.mpg.de>,
	Guoqing Jiang <guoqing.jiang@linux.dev>, Xiao Ni <xni@redhat.com>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	Christoph Hellwig <hch@lst.de>
Subject: [PATCH v3 03/11] md/raid5: Ensure array is suspended for calls to log_exit()
Date: Thu,  2 Jun 2022 12:18:09 -0600
Message-ID: <20220602181818.50729-4-logang@deltatee.com>
In-Reply-To: <20220602181818.50729-1-logang@deltatee.com>

The raid5-cache code relies on there being no IO in flight when
log_exit() is called. There are two places where this is not
guaranteed, so add mddev_suspend() and mddev_resume() calls around
log_exit() at these sites.

The site in raid5_remove_disk() has a comment saying that it is
called in raid5d and thus cannot wait for pending writes; however, that
does not appear to be correct anymore (if it ever was), as
raid5_remove_disk() is called from hot_remove_disk(), which appears to
be called only from md_ioctl(). Thus, the comment and the racy check
are both removed and replaced with calls to suspend/resume.

The site in raid5_change_consistency_policy() is in the error path,
and another similar call site already has suspend/resume calls just
below it, so it should be equally safe to make that change here.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..3ad37dd4c5cd 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7938,18 +7938,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 
 	print_raid5_conf(conf);
 	if (test_bit(Journal, &rdev->flags) && conf->log) {
-		/*
-		 * we can't wait pending write here, as this is called in
-		 * raid5d, wait will deadlock.
-		 * neilb: there is no locking about new writes here,
-		 * so this cannot be safe.
-		 */
-		if (atomic_read(&conf->active_stripes) ||
-		    atomic_read(&conf->r5c_cached_full_stripes) ||
-		    atomic_read(&conf->r5c_cached_partial_stripes)) {
-			return -EBUSY;
-		}
+		mddev_suspend(mddev);
 		log_exit(conf);
+		mddev_resume(mddev);
 		return 0;
 	}
 	if (rdev == rcu_access_pointer(p->rdev))
@@ -8697,8 +8688,11 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 			err = log_init(conf, NULL, true);
 			if (!err) {
 				err = resize_stripes(conf, conf->pool_size);
-				if (err)
+				if (err) {
+					mddev_suspend(mddev);
 					log_exit(conf);
+					mddev_resume(mddev);
+				}
 			}
 		} else
 			err = -EINVAL;
-- 
2.30.2
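
The reasoning in the commit message can be illustrated with a small
user-space toy model. This is only a sketch, not kernel code: every
name in it (struct toy_array, toy_suspend(), toy_log_exit(), and so on)
is hypothetical, and the drain loop merely stands in for waiting until
in-flight IO has completed. It shows why a one-off counter check is not
enough by itself and why wrapping the teardown in suspend/resume makes
the "no IO in flight" precondition hold.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an array with a journal; not the kernel's structs. */
struct toy_array {
	int  inflight;   /* IO requests currently being processed */
	bool suspended;  /* new IO is refused while this is set */
	bool has_log;    /* models conf->log being non-NULL */
};

/* Submit one IO; refused while the array is suspended. */
static bool toy_submit_io(struct toy_array *a)
{
	if (a->suspended)
		return false;
	a->inflight++;
	return true;
}

/* Complete one previously submitted IO. */
static void toy_complete_io(struct toy_array *a)
{
	assert(a->inflight > 0);
	a->inflight--;
}

/*
 * Models the effect of suspending the array: stop new IO first, then
 * drain what is already in flight. The loop is a simplification of
 * "wait for completion" (assumption of this sketch).
 */
static void toy_suspend(struct toy_array *a)
{
	a->suspended = true;
	while (a->inflight > 0)
		toy_complete_io(a);
}

/* Allow new IO again. */
static void toy_resume(struct toy_array *a)
{
	a->suspended = false;
}

/* Models log teardown: only valid when nothing is in flight. */
static int toy_log_exit(struct toy_array *a)
{
	if (a->inflight != 0)
		return -1;
	a->has_log = false;
	return 0;
}

/* The pattern from the patch: suspend, tear down the log, resume. */
static int toy_remove_journal(struct toy_array *a)
{
	int err;

	toy_suspend(a);
	err = toy_log_exit(a);
	toy_resume(a);
	return err;
}
```

Calling toy_log_exit() directly while requests are outstanding fails in
this model, whereas toy_remove_journal() always reaches the teardown
with inflight == 0 and leaves the array accepting IO again afterwards.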

