From: Logan Gunthorpe <logang@deltatee.com>
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
Song Liu <song@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
Donald Buczek <buczek@molgen.mpg.de>,
Guoqing Jiang <guoqing.jiang@linux.dev>, Xiao Ni <xni@redhat.com>,
Stephen Bates <sbates@raithlin.com>,
Martin Oliveira <Martin.Oliveira@eideticom.com>,
David Sloan <David.Sloan@eideticom.com>,
Logan Gunthorpe <logang@deltatee.com>,
Christoph Hellwig <hch@lst.de>
Subject: [PATCH v2 17/17] md: Notify sysfs sync_completed in md_reap_sync_thread()
Date: Thu, 26 May 2022 10:36:04 -0600
Message-ID: <20220526163604.32736-18-logang@deltatee.com>
In-Reply-To: <20220526163604.32736-1-logang@deltatee.com>
The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)

When the background mdadm process hangs, it is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple of times when
the reshape finishes, but it is woken up before MD_RECOVERY_RUNNING
is cleared, so sync_completed_show() reports 0 instead of "none".

To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes
it to continue and write to suspend_lo/suspend_hi to allow IO to
continue.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
drivers/md/md.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2be429874d18..2c07c9508222 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9476,6 +9476,7 @@ void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held)
 	wake_up(&resync_wait);
 	/* flag recovery needed just to double check */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 	md_new_event();
 	if (mddev->event_work.func)
--
2.30.2
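
For readers unfamiliar with the userspace side of this race, the following is a minimal sketch of how a process might wait on a sysfs attribute such as /sys/block/md0/md/sync_completed. It is not mdadm's actual code: the helper names sync_completed_is_none() and wait_for_sync_none() are hypothetical, and the timeout is an addition (the commit message notes that mdadm waits without one). The sketch re-reads the attribute after every wakeup, which is why a notification sent before MD_RECOVERY_RUNNING is cleared is not enough: the waiter reads a progress value, goes back to sleep, and needs a later notification to see "none".

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the attribute text reports "none" (no sync running). */
static int sync_completed_is_none(const char *text)
{
	return strncmp(text, "none", 4) == 0;
}

/*
 * Hypothetical helper (not mdadm code): wait until the sysfs attribute
 * at `path` reads "none". `timeout_ms` bounds each individual wait, not
 * the total time. Returns 1 when "none" was read, 0 on timeout and -1
 * on error.
 */
static int wait_for_sync_none(const char *path, int timeout_ms)
{
	char buf[64];
	struct pollfd pfd;
	ssize_t n;
	int ret;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	for (;;) {
		/* Re-read from the start; reading also re-arms the notification. */
		if (lseek(fd, 0, SEEK_SET) < 0 ||
		    (n = read(fd, buf, sizeof(buf) - 1)) < 0) {
			ret = -1;
			break;
		}
		buf[n] = '\0';
		if (sync_completed_is_none(buf)) {
			ret = 1;
			break;
		}

		/* A sysfs notification wakes pollers with POLLPRI | POLLERR. */
		pfd.fd = fd;
		pfd.events = POLLPRI | POLLERR;
		pfd.revents = 0;
		ret = poll(&pfd, 1, timeout_ms);
		if (ret <= 0)	/* 0: timed out, -1: poll() failed */
			break;
	}

	close(fd);
	return ret;
}
```

With a negative timeout_ms the helper blocks indefinitely, which mirrors the hang described above when the final notification never arrives.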
Thread overview: 29+ messages
2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 01/17] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper Logan Gunthorpe
2022-05-30 5:56 ` Christoph Hellwig
2022-05-26 16:35 ` [PATCH v2 03/17] md/raid5-cache: Refactor r5l_start() to take a struct r5conf Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 04/17] md/raid5-cache: Refactor r5l_flush_stripe_to_raid() " Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 05/17] md/raid5-cache: Refactor r5l_wake_reclaim() " Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 06/17] md/raid5-cache: Refactor remaining functions to take a r5conf Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 07/17] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 08/17] md/raid5-cache: Pass the log through to r5c_finish_cache_stripe() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 09/17] md/raid5-cache: Don't pass conf to r5c_calculate_new_cp() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 10/17] md/raid5-cache: Take struct r5l_log in r5c_log_required_to_flush_cache() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 11/17] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h Logan Gunthorpe
2022-05-30 5:59 ` Christoph Hellwig
2022-05-30 15:48 ` Logan Gunthorpe
2022-06-01 22:36 ` Song Liu
2022-06-01 22:42 ` Logan Gunthorpe
2022-06-01 22:50 ` Song Liu
2022-05-26 16:36 ` [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses Logan Gunthorpe
2022-05-30 6:01 ` Christoph Hellwig
2022-05-30 15:57 ` Logan Gunthorpe
2022-05-26 16:36 ` [PATCH v2 14/17] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
2022-05-26 16:36 ` [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
2022-05-30 6:01 ` Christoph Hellwig
2022-05-30 15:43 ` Logan Gunthorpe
2022-05-26 16:36 ` [PATCH v2 16/17] md: Ensure resync is reported after it starts Logan Gunthorpe
2022-05-30 6:02 ` Christoph Hellwig
2022-05-26 16:36 ` [PATCH v2 17/17] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe