[md PATCH 0/2] Fix two bug in the new write

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [md PATCH 0/2] Fix two bug in the new write_pending counting.
@ 2017-08-08  6:56 NeilBrown
  2017-08-08  6:56 ` [md PATCH 2/2] md: fix test in md_write_start() NeilBrown
  2017-08-08  6:56 ` [md PATCH 1/2] md: always clear ->safemode when md_check_recovery gets the mddev lock NeilBrown
  0 siblings, 2 replies; 4+ messages in thread
From: NeilBrown @ 2017-08-08  6:56 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Dominik Brodowski, David R, linux-raid

Two separate bugs caused by the same patch.
One causes a hard lockup that was reported on the list.
Other found by inspection.
Both suitable for -stable.

Please review the second one closely.  I think it is correct now, but
then I thought it was correct before too :-(

Thanks,
NeilBrown


---

NeilBrown (2):
      md: always clear ->safemode when md_check_recovery gets the mddev lock.
      md: fix test in md_write_start()


 drivers/md/md.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--
Signature


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [md PATCH 2/2] md: fix test in md_write_start()
  2017-08-08  6:56 [md PATCH 0/2] Fix two bug in the new write_pending counting NeilBrown
@ 2017-08-08  6:56 ` NeilBrown
  2017-08-08  6:56 ` [md PATCH 1/2] md: always clear ->safemode when md_check_recovery gets the mddev lock NeilBrown
  1 sibling, 0 replies; 4+ messages in thread
From: NeilBrown @ 2017-08-08  6:56 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-raid

md_write_start() needs to clear the in_sync flag is it is set, or if
there might be a race with set_in_sync() such that the later will
set it very soon.  In the later case it is sufficient to take the
spinlock to synchronize with set_in_sync(), and then set the flag
if needed.

The current test is incorrect.
It should be:
  if "flag is set" or "race is possible"

"flag is set" is trivially "mddev->in_sync".
"race is possible" should be tested by "mddev->sync_checkers".

If sync_checkers is 0, then there can be no race.  set_in_sync() will
wait in percpu_ref_switch_to_atomic_sync() for an RCU grace period,
and as md_write_start() holds the rcu_read_lock(), set_in_sync() will
be sure ot see the update to writes_pending.

If sync_checkers is > 0, there could be race.  If md_write_start()
happened entirely between
		if (!mddev->in_sync &&
		    percpu_ref_is_zero(&mddev->writes_pending)) {
and
			mddev->in_sync = 1;
in set_in_sync(), then it would not see that is_sync had been set,
and set_in_sync() would not see that writes_pending had been
incremented.

This bug means that in_sync is sometimes not set when it should be.
Consequently there is a small chance that the array will be marked as
"clean" when in fact it is inconsistent.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
cc: stable@vger.kernel.org (v4.12+)
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/md.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index d84aceede1cb..e4ba95f6cd59 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7996,7 +7996,7 @@ bool md_write_start(struct mddev *mddev, struct bio *bi)
 	if (mddev->safemode == 1)
 		mddev->safemode = 0;
 	/* sync_checkers is always 0 when writes_pending is in per-cpu mode */
-	if (mddev->in_sync || !mddev->sync_checkers) {
+	if (mddev->in_sync || mddev->sync_checkers) {
 		spin_lock(&mddev->lock);
 		if (mddev->in_sync) {
 			mddev->in_sync = 0;

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [md PATCH 1/2] md: always clear ->safemode when md_check_recovery gets the mddev lock.
  2017-08-08  6:56 [md PATCH 0/2] Fix two bug in the new write_pending counting NeilBrown
  2017-08-08  6:56 ` [md PATCH 2/2] md: fix test in md_write_start() NeilBrown
@ 2017-08-08  6:56 ` NeilBrown
  2017-08-08 14:40   ` Shaohua Li
  1 sibling, 1 reply; 4+ messages in thread
From: NeilBrown @ 2017-08-08  6:56 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Dominik Brodowski, David R, linux-raid

If ->safemode == 1, md_check_recovery() will try to get the mddev lock
and perform various other checks.
If mddev->in_sync is zero, it will call set_in_sync, and clear
->safemode.  However if mddev->in_sync is not zero, ->safemode will not
be cleared.

When md_check_recovery() drops the mddev lock, the thread is woken
up again.  Normally it would just check if there was anything else to
do, find nothing, and go to sleep.  However as ->safemode was not
cleared, it will take the mddev lock again, then wake itself up
when unlocking.

This results in an infinite loop, repeatedly calling
md_check_recovery(), which RCU or the soft-lockup detector
will eventually complain about.

Prior to commit 4ad23a976413 ("MD: use per-cpu counter for
writes_pending"), safemode would only be set to one when the
writes_pending counter reached zero, and would be cleared again
when writes_pending is incremented.  Since that patch, safemode
is set more freely, but is not reliably cleared.

So in md_check_recovery() clear ->safemode before checking ->in_sync.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
Cc: stable@vger.kernel.org (4.12+)
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reported-by: David R <david@unsolicited.net>
Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/md.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index c99634612fc4..d84aceede1cb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8656,6 +8656,9 @@ void md_check_recovery(struct mddev *mddev)
 	if (mddev_trylock(mddev)) {
 		int spares = 0;

+		if (mddev->safemode == 1)
+			mddev->safemode = 0;
+
 		if (mddev->ro) {
 			struct md_rdev *rdev;
 			if (!mddev->external && mddev->in_sync)

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [md PATCH 1/2] md: always clear ->safemode when md_check_recovery gets the mddev lock.
  2017-08-08  6:56 ` [md PATCH 1/2] md: always clear ->safemode when md_check_recovery gets the mddev lock NeilBrown
@ 2017-08-08 14:40   ` Shaohua Li
  0 siblings, 0 replies; 4+ messages in thread
From: Shaohua Li @ 2017-08-08 14:40 UTC (permalink / raw)
  To: NeilBrown; +Cc: Dominik Brodowski, David R, linux-raid

On Tue, Aug 08, 2017 at 04:56:36PM +1000, Neil Brown wrote:
> If ->safemode == 1, md_check_recovery() will try to get the mddev lock
> and perform various other checks.
> If mddev->in_sync is zero, it will call set_in_sync, and clear
> ->safemode.  However if mddev->in_sync is not zero, ->safemode will not
> be cleared.
> 
> When md_check_recovery() drops the mddev lock, the thread is woken
> up again.  Normally it would just check if there was anything else to
> do, find nothing, and go to sleep.  However as ->safemode was not
> cleared, it will take the mddev lock again, then wake itself up
> when unlocking.
> 
> This results in an infinite loop, repeatedly calling
> md_check_recovery(), which RCU or the soft-lockup detector
> will eventually complain about.
> 
> Prior to commit 4ad23a976413 ("MD: use per-cpu counter for
> writes_pending"), safemode would only be set to one when the
> writes_pending counter reached zero, and would be cleared again
> when writes_pending is incremented.  Since that patch, safemode
> is set more freely, but is not reliably cleared.
> 
> So in md_check_recovery() clear ->safemode before checking ->in_sync.

Nice catch! Applied both patches.

I spent hours to check why md_check_recovery loops, apparently I missed
set_in_sync is only called when in_sync is not set, silly me.

Thanks,
Shaohua
 
> Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
> Cc: stable@vger.kernel.org (4.12+)
> Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
> Reported-by: David R <david@unsolicited.net>
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
>  drivers/md/md.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index c99634612fc4..d84aceede1cb 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -8656,6 +8656,9 @@ void md_check_recovery(struct mddev *mddev)
>  	if (mddev_trylock(mddev)) {
>  		int spares = 0;
>  
> +		if (mddev->safemode == 1)
> +			mddev->safemode = 0;
> +
>  		if (mddev->ro) {
>  			struct md_rdev *rdev;
>  			if (!mddev->external && mddev->in_sync)
> 
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-08-08 14:40 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-08  6:56 [md PATCH 0/2] Fix two bug in the new write_pending counting NeilBrown
2017-08-08  6:56 ` [md PATCH 2/2] md: fix test in md_write_start() NeilBrown
2017-08-08  6:56 ` [md PATCH 1/2] md: always clear ->safemode when md_check_recovery gets the mddev lock NeilBrown
2017-08-08 14:40   ` Shaohua Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).