All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Roman Gushchin <klamm@yandex-team.ru>,
	NeilBrown <neilb@suse.com>, Shaohua Li <shli@kernel.org>
Subject: [PATCH 3.14 34/37] md/raid5: fix locking in handle_stripe_clean_event()
Date: Fri,  6 Nov 2015 11:24:49 -0800	[thread overview]
Message-ID: <20151106192412.947316376@linuxfoundation.org> (raw)
In-Reply-To: <20151106192410.681850286@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Roman Gushchin <klamm@yandex-team.ru>

commit b8a9d66d043ffac116100775a469f05f5158c16f upstream.

After commit 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
__find_stripe() is called under conf->hash_locks + hash.
But handle_stripe_clean_event() calls remove_hash() under
conf->device_lock.

Under some cirscumstances the hash chain can be circuited,
and we get an infinite loop with disabled interrupts and locked hash
lock in __find_stripe(). This leads to hard lockup on multiple CPUs
and following system crash.

I was able to reproduce this behavior on raid6 over 6 ssd disks.
The devices_handle_discard_safely option should be set to enable trim
support. The following script was used:

for i in `seq 1 32`; do
    dd if=/dev/zero of=large$i bs=10M count=100 &
done

neilb: original was against a 3.x kernel.  I forward-ported
  to 4.3-rc.  This verison is suitable for any kernel since
  Commit: 59fc630b8b5f ("RAID5: batch adjacent full stripe write")
  (v4.1+).  I'll post a version for earlier kernels to stable.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Fixes: 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org> # 3.13 - 4.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/raid5.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3029,6 +3029,8 @@ static void handle_stripe_clean_event(st
 		}
 	if (!discard_pending &&
 	    test_bit(R5_Discard, &sh->dev[sh->pd_idx].flags)) {
+		int hash = sh->hash_lock_index;
+
 		clear_bit(R5_Discard, &sh->dev[sh->pd_idx].flags);
 		clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
 		if (sh->qd_idx >= 0) {
@@ -3042,9 +3044,9 @@ static void handle_stripe_clean_event(st
 		 * no updated data, so remove it from hash list and the stripe
 		 * will be reinitialized
 		 */
-		spin_lock_irq(&conf->device_lock);
+		spin_lock_irq(conf->hash_locks + hash);
 		remove_hash(sh);
-		spin_unlock_irq(&conf->device_lock);
+		spin_unlock_irq(conf->hash_locks + hash);
 		if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
 			set_bit(STRIPE_HANDLE, &sh->state);
 



  parent reply	other threads:[~2015-11-06 19:54 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-06 19:24 [PATCH 3.14 00/37] 3.14.57-stable review Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 01/37] ath9k: declare required extra tx headroom Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 02/37] iwlwifi: dvm: fix D3 firmware PN programming Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 03/37] iwlwifi: fix firmware filename for 3160 Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 04/37] iwlwifi: mvm: fix D3 firmware PN programming Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 05/37] iwlwifi: pci: add a few more PCI subvendor IDs for the 7265 series Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 06/37] iommu/amd: Dont clear DTE flags when modifying it Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 07/37] powerpc/rtas: Validate rtas.entry before calling enter_rtas() Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 08/37] ASoC: wm8904: Correct number of EQ registers Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 09/37] x86/setup: Extend low identity map to cover whole kernel range Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 10/37] mm: make sendfile(2) killable Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 14/37] drm/nouveau/gem: return only valid domain when theres only one Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 16/37] drm/radeon: dont try to recreate sysfs entries on resume Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 18/37] rbd: require stable pages if message data CRCs are enabled Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 19/37] rbd: dont leak parent_spec in rbd_dev_probe_parent() Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 20/37] rbd: prevent kernel stack blow up on rbd map Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 21/37] Revert "ARM64: unwind: Fix PC calculation" Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 22/37] dm btree remove: fix a bug when rebalancing nodes after removal Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 23/37] dm btree: fix leak of bufio-backed block in btree_split_beneath error path Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 24/37] xhci: handle no ping response error properly Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 25/37] xhci: Add spurious wakeup quirk for LynxPoint-LP controllers Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 26/37] xen-blkfront: check for null drvdata in blkback_changed (XenbusStateClosing) Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 27/37] module: Fix locking in symbol_put_addr() Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 28/37] crypto: api - Only abort operations on fatal signal Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 29/37] md/raid1: submit_bio_wait() returns 0 on success Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 30/37] md/raid10: " Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 31/37] Revert "md: allow a partially recovered device to be hot-added to an array." Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 33/37] IB/cm: Fix rb-tree duplicate free and use-after-free Greg Kroah-Hartman
2015-11-06 19:24 ` Greg Kroah-Hartman [this message]
2015-11-06 19:24 ` [PATCH 3.14 35/37] serial: 8250_pci: Add support for 16 port Exar boards Greg Kroah-Hartman
2015-11-23 14:23   ` Soeren Grunewald
2016-02-23 23:51     ` Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 36/37] serial: 8250_pci: Add support for 12 " Greg Kroah-Hartman
2015-11-06 19:24 ` [PATCH 3.14 37/37] xen: fix backport of previous kexec patch Greg Kroah-Hartman
2015-11-07  1:41 ` [PATCH 3.14 00/37] 3.14.57-stable review Guenter Roeck
2015-11-07  2:54 ` Shuah Khan
     [not found] ` <56402111.42371c0a.e127c.749b@mx.google.com>
2015-11-09  4:41   ` Kevin Hilman
2015-11-09 17:09     ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151106192412.947316376@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=klamm@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.com \
    --cc=shli@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.