From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Sasha Levin <sashal@kernel.org>,
	Guoqing Jiang <guoqing.jiang@cloud.ionos.com>,
	snitzer@kernel.org, linux-raid@vger.kernel.org,
	Song Liu <song@kernel.org>,
	dm-devel@redhat.com, Donald Buczek <buczek@molgen.mpg.de>,
	agk@redhat.com
Subject: [dm-devel] [PATCH AUTOSEL 4.19 18/27] md: don't unregister sync_thread with reconfig_mutex held
Date: Tue,  7 Jun 2022 14:01:22 -0400
Message-ID: <20220607180133.481701-18-sashal@kernel.org>
In-Reply-To: <20220607180133.481701-1-sashal@kernel.org>

From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

[ Upstream commit 8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934 ]

Unregistering sync_thread doesn't need to hold reconfig_mutex, since it
doesn't reconfigure the array.

Holding the mutex there can also deadlock raid5 as follows:

1. Process A tries to reap the sync thread with reconfig_mutex held after
   "idle" is echoed to sync_action.
2. The raid5 sync thread is blocked when there are too many active stripes.
3. SB_CHANGE_PENDING is set (because write IO arrives from the upper
   layer), which prevents the number of active stripes from decreasing.
4. SB_CHANGE_PENDING can't be cleared because md_check_recovery is unable
   to take reconfig_mutex.

More details in the link:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

Also add a parameter to md_reap_sync_thread, since it can be called by
dm-raid, which doesn't hold reconfig_mutex.

Reported-and-tested-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/md/dm-raid.c |  2 +-
 drivers/md/md.c      | 14 +++++++++-----
 drivers/md/md.h      |  2 +-
 3 files changed, 11 insertions(+), 7 deletions(-)
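
The locking pattern described in the commit message can be sketched in
userspace with pthreads. This is only an illustration and not part of the
patch: struct fake_mddev, fake_sync_worker(), fake_reap_sync_thread() and
fake_idle_request() are hypothetical names, and the worker here takes the
mutex directly, whereas in the kernel the dependency is indirect (the sync
thread waits on stripes that only drain once md_check_recovery can take
reconfig_mutex).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct mddev; not kernel code. */
struct fake_mddev {
	pthread_mutex_t reconfig_mutex;	/* models mddev->reconfig_mutex */
	pthread_t sync_thread;		/* models mddev->sync_thread */
	int stripes_released;		/* set once the worker makes progress */
};

/*
 * Simplification: the worker must take the mutex before it can exit.
 * In the kernel the dependency goes through md_check_recovery instead.
 */
static void *fake_sync_worker(void *arg)
{
	struct fake_mddev *m = arg;

	pthread_mutex_lock(&m->reconfig_mutex);
	m->stripes_released = 1;
	pthread_mutex_unlock(&m->reconfig_mutex);
	return NULL;
}

/* Mirrors the shape of the patched md_reap_sync_thread(). */
static void fake_reap_sync_thread(struct fake_mddev *m, bool reconfig_mutex_held)
{
	int rc;

	if (reconfig_mutex_held)
		pthread_mutex_unlock(&m->reconfig_mutex);
	/* Joining with the mutex still held would hang: the worker needs it. */
	rc = pthread_join(m->sync_thread, NULL);
	assert(rc == 0);
	(void)rc;
	if (reconfig_mutex_held)
		pthread_mutex_lock(&m->reconfig_mutex);
}

/* Models a caller such as action_store(): lock, reap, unlock. */
int fake_idle_request(void)
{
	struct fake_mddev m = { .stripes_released = 0 };
	int rc, released;

	rc = pthread_mutex_init(&m.reconfig_mutex, NULL);
	assert(rc == 0);
	pthread_mutex_lock(&m.reconfig_mutex);
	rc = pthread_create(&m.sync_thread, NULL, fake_sync_worker, &m);
	assert(rc == 0);
	(void)rc;

	fake_reap_sync_thread(&m, true);	/* caller holds the mutex */
	released = m.stripes_released;

	pthread_mutex_unlock(&m.reconfig_mutex);
	pthread_mutex_destroy(&m.reconfig_mutex);
	return released;
}
```

Calling fake_idle_request() returns once the worker has run. A caller that
does not hold the mutex, like the dm-raid path, would pass false so the
helper neither drops nor retakes a lock it never owned.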

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index b16332917220..0be03123d6a2 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3710,7 +3710,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
 	if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
 		if (mddev->sync_thread) {
 			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-			md_reap_sync_thread(mddev);
+			md_reap_sync_thread(mddev, false);
 		}
 	} else if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
 		return -EBUSY;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7e0477e883c7..da7954eff4d7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4611,7 +4611,7 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 			flush_workqueue(md_misc_wq);
 			if (mddev->sync_thread) {
 				set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-				md_reap_sync_thread(mddev);
+				md_reap_sync_thread(mddev, true);
 			}
 			mddev_unlock(mddev);
 		}
@@ -5871,7 +5871,7 @@ static void __md_stop_writes(struct mddev *mddev)
 	flush_workqueue(md_misc_wq);
 	if (mddev->sync_thread) {
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-		md_reap_sync_thread(mddev);
+		md_reap_sync_thread(mddev, true);
 	}
 
 	del_timer_sync(&mddev->safemode_timer);
@@ -8893,7 +8893,7 @@ void md_check_recovery(struct mddev *mddev)
 			 * ->spare_active and clear saved_raid_disk
 			 */
 			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-			md_reap_sync_thread(mddev);
+			md_reap_sync_thread(mddev, true);
 			clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
 			clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 			clear_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags);
@@ -8928,7 +8928,7 @@ void md_check_recovery(struct mddev *mddev)
 			goto unlock;
 		}
 		if (mddev->sync_thread) {
-			md_reap_sync_thread(mddev);
+			md_reap_sync_thread(mddev, true);
 			goto unlock;
 		}
 		/* Set RUNNING before clearing NEEDED to avoid
@@ -9001,12 +9001,16 @@ void md_check_recovery(struct mddev *mddev)
 }
 EXPORT_SYMBOL(md_check_recovery);
 
-void md_reap_sync_thread(struct mddev *mddev)
+void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held)
 {
 	struct md_rdev *rdev;
 
+	if (reconfig_mutex_held)
+		mddev_unlock(mddev);
 	/* resync has finished, collect result */
 	md_unregister_thread(&mddev->sync_thread);
+	if (reconfig_mutex_held)
+		mddev_lock_nointr(mddev);
 	if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
 	    !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
 	    mddev->degraded != mddev->raid_disks) {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index cce62bbc2bcb..027b220dec14 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -674,7 +674,7 @@ extern struct md_thread *md_register_thread(
 extern void md_unregister_thread(struct md_thread **threadp);
 extern void md_wakeup_thread(struct md_thread *thread);
 extern void md_check_recovery(struct mddev *mddev);
-extern void md_reap_sync_thread(struct mddev *mddev);
+extern void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held);
 extern int mddev_init_writes_pending(struct mddev *mddev);
 extern bool md_write_start(struct mddev *mddev, struct bio *bi);
 extern void md_write_inc(struct mddev *mddev, struct bio *bi);
-- 
2.35.1
