From: Chris Wright <chrisw@sous-sol.org>
To: linux-kernel@vger.kernel.org, stable@kernel.org, jejb@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Domenico Andreoli <cavokz@gmail.com>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Dan Williams <dan.j.williams@intel.com>,
	NeilBrown <neilb@suse.de>
Subject: md: close a livelock window in handle_parity_checks5
Date: Wed, 16 Apr 2008 18:02:18 -0700
Message-ID: <20080417010358.329332938@sous-sol.org>
In-Reply-To: 20080417010122.148289106@sous-sol.org

[-- Attachment #1: md-close-a-livelock-window-in-handle_parity_checks5.patch --]
[-- Type: text/plain, Size: 3991 bytes --]

-stable review patch.  If anyone has any objections, please let us know.
---------------------

From: Dan Williams <dan.j.williams@intel.com>

upstream commit: bd2ab67030e9116f1e4aae1289220255412b37fd

If a failure is detected after a parity check operation has been initiated
but before it completes, handle_parity_checks5 will never quiesce operations
on the stripe.

Explicitly handle this case by "canceling" the parity check, i.e.  clear the
STRIPE_OP_CHECK flags and queue the stripe on the handle list again to refresh
any non-uptodate blocks.

Kernel versions >= 2.6.23 are susceptible.

Cc: <stable@kernel.org>
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
---
 drivers/md/raid5.c |   53 ++++++++++++++++++++++++++++++-----------------------
 1 file changed, 30 insertions(+), 23 deletions(-)
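
Reviewer's sketch (not part of the patch): the reordering described in the
changelog can be modeled in userspace roughly as below. Everything here is
hypothetical: struct toy_stripe, toy_checks_old() and toy_checks_new() are
stand-ins invented for illustration, plain booleans replace the
STRIPE_OP_CHECK bits and STRIPE_INSYNC, and the repair and compute-block
paths are left out entirely. It is only meant to show why retiring a
completed check before testing for failures lets the stripe quiesce.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the stripe state; NOT the kernel's struct stripe_head. */
struct toy_stripe {
	bool check_pending;   /* models STRIPE_OP_CHECK in ops.pending  */
	bool check_complete;  /* models STRIPE_OP_CHECK in ops.complete */
	bool insync;          /* models STRIPE_INSYNC                   */
	int  zero_sum_result; /* 0 means the parity matched             */
};

/* Old ordering: a completed check is only retired when failed == 0,
 * so a failure that arrives mid-check leaves check_pending set. */
static void toy_checks_old(struct toy_stripe *sh, int failed)
{
	if (failed == 0) {
		if (!sh->check_pending) {
			sh->check_pending = true;       /* start a check */
		} else if (sh->check_complete) {
			sh->check_complete = false;
			sh->check_pending = false;
			if (sh->zero_sum_result == 0)
				sh->insync = true;
		}
	}
}

/* New ordering: always retire a completed check first, then decide
 * whether to start another one.  Returns true if the check had to be
 * "canceled" because a failure showed up while it was in flight. */
static bool toy_checks_new(struct toy_stripe *sh, int failed)
{
	bool canceled = false;

	if (sh->check_complete) {
		sh->check_complete = false;
		sh->check_pending = false;
		if (failed == 0) {
			if (sh->zero_sum_result == 0)
				sh->insync = true;
		} else {
			canceled = true;                /* insync stays clear */
		}
	}

	if (failed == 0 && !sh->insync && !sh->check_pending)
		sh->check_pending = true;               /* start a new check */

	return canceled;
}

/* Returns 0 if the model behaves as the changelog describes. */
static int toy_demo(void)
{
	struct toy_stripe a = { true, true, false, 0 };
	struct toy_stripe b = { true, true, false, 0 };

	toy_checks_old(&a, 1);          /* failure arrived mid-check */
	assert(a.check_pending);        /* old code never clears it  */

	assert(toy_checks_new(&b, 1));  /* new code cancels the check */
	assert(!b.check_pending && !b.insync);
	return 0;
}
```

Calling toy_demo() from a small main() exercises both orderings with a
failure injected after the check has completed but before it was retired.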

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2348,25 +2348,15 @@ static void handle_issuing_new_write_req
 static void handle_parity_checks5(raid5_conf_t *conf, struct stripe_head *sh,
 				struct stripe_head_state *s, int disks)
 {
+	int canceled_check = 0;
+
 	set_bit(STRIPE_HANDLE, &sh->state);
-	/* Take one of the following actions:
-	 * 1/ start a check parity operation if (uptodate == disks)
-	 * 2/ finish a check parity operation and act on the result
-	 * 3/ skip to the writeback section if we previously
-	 *    initiated a recovery operation
-	 */
-	if (s->failed == 0 &&
-	    !test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
-		if (!test_and_set_bit(STRIPE_OP_CHECK, &sh->ops.pending)) {
-			BUG_ON(s->uptodate != disks);
-			clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
-			sh->ops.count++;
-			s->uptodate--;
-		} else if (
-		       test_and_clear_bit(STRIPE_OP_CHECK, &sh->ops.complete)) {
-			clear_bit(STRIPE_OP_CHECK, &sh->ops.ack);
-			clear_bit(STRIPE_OP_CHECK, &sh->ops.pending);
 
+	/* complete a check operation */
+	if (test_and_clear_bit(STRIPE_OP_CHECK, &sh->ops.complete)) {
+	    clear_bit(STRIPE_OP_CHECK, &sh->ops.ack);
+	    clear_bit(STRIPE_OP_CHECK, &sh->ops.pending);
+		if (s->failed == 0) {
 			if (sh->ops.zero_sum_result == 0)
 				/* parity is correct (on disc,
 				 * not in buffer any more)
@@ -2391,7 +2381,8 @@ static void handle_parity_checks5(raid5_
 					s->uptodate++;
 				}
 			}
-		}
+		} else
+			canceled_check = 1; /* STRIPE_INSYNC is not set */
 	}
 
 	/* check if we can clear a parity disk reconstruct */
@@ -2404,12 +2395,28 @@ static void handle_parity_checks5(raid5_
 		clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending);
 	}
 
-	/* Wait for check parity and compute block operations to complete
-	 * before write-back
+	/* start a new check operation if there are no failures, the stripe is
+	 * not insync, and a repair is not in flight
 	 */
-	if (!test_bit(STRIPE_INSYNC, &sh->state) &&
-		!test_bit(STRIPE_OP_CHECK, &sh->ops.pending) &&
-		!test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending)) {
+	if (s->failed == 0 &&
+	    !test_bit(STRIPE_INSYNC, &sh->state) &&
+	    !test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
+		if (!test_and_set_bit(STRIPE_OP_CHECK, &sh->ops.pending)) {
+			BUG_ON(s->uptodate != disks);
+			clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
+			sh->ops.count++;
+			s->uptodate--;
+		}
+	}
+
+	/* Wait for check parity and compute block operations to complete
+	 * before write-back.  If a failure occurred while the check operation
+	 * was in flight we need to cycle this stripe through handle_stripe
+	 * since the parity block may not be uptodate
+	 */
+	if (!canceled_check && !test_bit(STRIPE_INSYNC, &sh->state) &&
+	    !test_bit(STRIPE_OP_CHECK, &sh->ops.pending) &&
+	    !test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending)) {
 		struct r5dev *dev;
 		/* either failed parity check, or recovery is happening */
 		if (s->failed == 0)

-- 
