From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Roman Gushchin <klamm@yandex-team.ru>,
	NeilBrown <neilb@suse.com>, Shaohua Li <shli@kernel.org>
Subject: [PATCH 3.14 34/37] md/raid5: fix locking in handle_stripe_clean_event()
Date: Fri,  6 Nov 2015 11:24:49 -0800
Message-ID: <20151106192412.947316376@linuxfoundation.org>
In-Reply-To: <20151106192410.681850286@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Roman Gushchin <klamm@yandex-team.ru>

commit b8a9d66d043ffac116100775a469f05f5158c16f upstream.

After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
__find_stripe() is called under conf->hash_locks + hash.
But handle_stripe_clean_event() calls remove_hash() under
conf->device_lock.

Under some circumstances the hash chain can become circular,
and we get an infinite loop with interrupts disabled and the hash
lock held in __find_stripe(). This leads to a hard lockup on multiple
CPUs and a subsequent system crash.
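
To illustrate the rule the fix restores, here is a minimal user-space
sketch, not the raid5 code: every reader and writer of a hash chain must
take the same per-bucket lock. The test-and-set lock, the one-bucket-per-lock
layout, the modulo hash and the helper names (lock_acquire, insert_stripe,
find_stripe, remove_stripe) are assumptions made for illustration only;
the kernel uses spin_lock_irq() on conf->hash_locks + hash.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_HASH_LOCKS 8	/* hypothetical: one bucket per lock, for brevity */

/* Test-and-set lock standing in for the kernel's spinlock_t. */
struct sketch_lock {
	atomic_flag held;
};

void lock_acquire(struct sketch_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* spin until the current holder releases the lock */
}

void lock_release(struct sketch_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct stripe {
	struct stripe *next;
	unsigned long sector;
	int hash_lock_index;	/* which lock protects this stripe's chain */
};

struct conf {
	struct sketch_lock hash_locks[NR_HASH_LOCKS];
	struct stripe *buckets[NR_HASH_LOCKS];
};

void conf_init(struct conf *conf)
{
	for (int i = 0; i < NR_HASH_LOCKS; i++) {
		atomic_flag_clear(&conf->hash_locks[i].held);
		conf->buckets[i] = NULL;
	}
}

void insert_stripe(struct conf *conf, struct stripe *sh)
{
	int hash = (int)(sh->sector % NR_HASH_LOCKS);	/* assumed hash */

	sh->hash_lock_index = hash;
	lock_acquire(conf->hash_locks + hash);
	sh->next = conf->buckets[hash];
	conf->buckets[hash] = sh;
	lock_release(conf->hash_locks + hash);
}

/* The lookup walks the chain under the per-bucket lock. */
struct stripe *find_stripe(struct conf *conf, unsigned long sector)
{
	int hash = (int)(sector % NR_HASH_LOCKS);
	struct stripe *sh;

	lock_acquire(conf->hash_locks + hash);
	for (sh = conf->buckets[hash]; sh; sh = sh->next)
		if (sh->sector == sector)
			break;
	lock_release(conf->hash_locks + hash);
	return sh;
}

/*
 * Removal must take the same per-bucket lock as the lookup. Taking an
 * unrelated lock here would let the chain change under a concurrent walker.
 */
void remove_stripe(struct conf *conf, struct stripe *sh)
{
	int hash = sh->hash_lock_index;
	struct stripe **pp;

	assert(hash >= 0 && hash < NR_HASH_LOCKS);
	lock_acquire(conf->hash_locks + hash);
	for (pp = &conf->buckets[hash]; *pp; pp = &(*pp)->next) {
		if (*pp == sh) {
			*pp = sh->next;
			sh->next = NULL;
			break;
		}
	}
	lock_release(conf->hash_locks + hash);
}
```

In this sketch the index is saved in the stripe at insert time and reused
at removal, much as the patch reads sh->hash_lock_index into a local before
taking the lock.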

I was able to reproduce this behavior on a raid6 array of 6 SSDs.
The devices_handle_discard_safely option should be set to enable trim
support. The following script was used:

for i in `seq 1 32`; do
    dd if=/dev/zero of=large$i bs=10M count=100 &
done

neilb: original was against a 3.x kernel.  I forward-ported
  to 4.3-rc.  This version is suitable for any kernel since
  Commit: 59fc630b8b5f ("RAID5: batch adjacent full stripe write")
  (v4.1+).  I'll post a version for earlier kernels to stable.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Fixes: 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org> # 3.13 - 4.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/raid5.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3029,6 +3029,8 @@ static void handle_stripe_clean_event(st
 		}
 	if (!discard_pending &&
 	    test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
+		int hash = sh->hash_lock_index;
+
 		clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
 		clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
 		if (sh->qd_idx >= 0) {
@@ -3042,9 +3044,9 @@ static void handle_stripe_clean_event(st
 		 * no updated data, so remove it from hash list and the stripe
 		 * will be reinitialized
 		 */
-		spin_lock_irq(&conf->device_lock);
+		spin_lock_irq(conf->hash_locks + hash);
 		remove_hash(sh);
-		spin_unlock_irq(&conf->device_lock);
+		spin_unlock_irq(conf->hash_locks + hash);
 		if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
 			set_bit(STRIPE_HANDLE, &sh->state);
 


