From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Alexander Lyakas <alex.bolshoy@gmail.com>,
NeilBrown <neilb@suse.de>,
Jack Wang <jinpu.wang@profitbricks.com>
Subject: [ 18/34] md/raid1,raid10: use freeze_array in place of raise_barrier in various places.
Date: Sun, 18 Aug 2013 13:34:31 -0700 [thread overview]
Message-ID: <20130818203300.933588263@linuxfoundation.org> (raw)
In-Reply-To: <20130818203259.653403173@linuxfoundation.org>
3.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown <neilb@suse.de>
commit e2d59925221cd562e07fee38ec8839f7209ae603 upstream.
Various places in raid1 and raid10 are calling raise_barrier when they
really should call freeze_array.
The former is only intended to be called from "make_request".
The later has extra checks for 'nr_queued' and makes a call to
flush_pending_writes(), so it is safe to call it from within the
management thread.
Using raise_barrier will sometimes deadlock. Using freeze_array
should not.
As 'freeze_array' currently expects one request to be pending (in
handle_read_error - the only previous caller), we need to pass
it the number of pending requests (extra) to ignore.
The deadlock was made particularly noticeable by commits
050b66152f87c7 (raid10) and 6b740b8d79252f13 (raid1) which
appeared in 3.4, so the fix is appropriate for any -stable
kernel since then.
This patch probably won't apply directly to some early kernels and
will need to be applied by hand.
Cc: stable@vger.kernel.org
Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
[adjust context to make it can be apply on top of 3.4 ]
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/md/raid1.c | 22 +++++++++++-----------
drivers/md/raid10.c | 14 +++++++-------
2 files changed, 18 insertions(+), 18 deletions(-)
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -812,17 +812,17 @@ static void allow_barrier(struct r1conf
wake_up(&conf->wait_barrier);
}
-static void freeze_array(struct r1conf *conf)
+static void freeze_array(struct r1conf *conf, int extra)
{
/* stop syncio and normal IO and wait for everything to
* go quite.
* We increment barrier and nr_waiting, and then
- * wait until nr_pending match nr_queued+1
+ * wait until nr_pending match nr_queued+extra
* This is called in the context of one normal IO request
* that has failed. Thus any sync request that might be pending
* will be blocked by nr_pending, and we need to wait for
* pending IO requests to complete or be queued for re-try.
- * Thus the number queued (nr_queued) plus this request (1)
+ * Thus the number queued (nr_queued) plus this request (extra)
* must match the number of pending IOs (nr_pending) before
* we continue.
*/
@@ -830,7 +830,7 @@ static void freeze_array(struct r1conf *
conf->barrier++;
conf->nr_waiting++;
wait_event_lock_irq(conf->wait_barrier,
- conf->nr_pending == conf->nr_queued+1,
+ conf->nr_pending == conf->nr_queued+extra,
conf->resync_lock,
flush_pending_writes(conf));
spin_unlock_irq(&conf->resync_lock);
@@ -1432,8 +1432,8 @@ static int raid1_add_disk(struct mddev *
* we wait for all outstanding requests to complete.
*/
synchronize_sched();
- raise_barrier(conf);
- lower_barrier(conf);
+ freeze_array(conf, 0);
+ unfreeze_array(conf);
clear_bit(Unmerged, &rdev->flags);
}
md_integrity_add_rdev(rdev, mddev);
@@ -1481,11 +1481,11 @@ static int raid1_remove_disk(struct mdde
*/
struct md_rdev *repl =
conf->mirrors[conf->raid_disks + number].rdev;
- raise_barrier(conf);
+ freeze_array(conf, 0);
clear_bit(Replacement, &repl->flags);
p->rdev = repl;
conf->mirrors[conf->raid_disks + number].rdev = NULL;
- lower_barrier(conf);
+ unfreeze_array(conf);
clear_bit(WantReplacement, &rdev->flags);
} else
clear_bit(WantReplacement, &rdev->flags);
@@ -2100,7 +2100,7 @@ static void handle_read_error(struct r1c
* frozen
*/
if (mddev->ro == 0) {
- freeze_array(conf);
+ freeze_array(conf, 1);
fix_read_error(conf, r1_bio->read_disk,
r1_bio->sector, r1_bio->sectors);
unfreeze_array(conf);
@@ -2855,7 +2855,7 @@ static int raid1_reshape(struct mddev *m
return -ENOMEM;
}
- raise_barrier(conf);
+ freeze_array(conf, 0);
/* ok, everything is stopped */
oldpool = conf->r1bio_pool;
@@ -2887,7 +2887,7 @@ static int raid1_reshape(struct mddev *m
mddev->delta_disks = 0;
conf->last_used = 0; /* just make sure it is in-range */
- lower_barrier(conf);
+ unfreeze_array(conf);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -952,17 +952,17 @@ static void allow_barrier(struct r10conf
wake_up(&conf->wait_barrier);
}
-static void freeze_array(struct r10conf *conf)
+static void freeze_array(struct r10conf *conf, int extra)
{
/* stop syncio and normal IO and wait for everything to
* go quiet.
* We increment barrier and nr_waiting, and then
- * wait until nr_pending match nr_queued+1
+ * wait until nr_pending match nr_queued+extra
* This is called in the context of one normal IO request
* that has failed. Thus any sync request that might be pending
* will be blocked by nr_pending, and we need to wait for
* pending IO requests to complete or be queued for re-try.
- * Thus the number queued (nr_queued) plus this request (1)
+ * Thus the number queued (nr_queued) plus this request (extra)
* must match the number of pending IOs (nr_pending) before
* we continue.
*/
@@ -970,7 +970,7 @@ static void freeze_array(struct r10conf
conf->barrier++;
conf->nr_waiting++;
wait_event_lock_irq(conf->wait_barrier,
- conf->nr_pending == conf->nr_queued+1,
+ conf->nr_pending == conf->nr_queued+extra,
conf->resync_lock,
flush_pending_writes(conf));
@@ -1619,8 +1619,8 @@ static int raid10_add_disk(struct mddev
* we wait for all outstanding requests to complete.
*/
synchronize_sched();
- raise_barrier(conf, 0);
- lower_barrier(conf);
+ freeze_array(conf, 0);
+ unfreeze_array(conf);
clear_bit(Unmerged, &rdev->flags);
}
md_integrity_add_rdev(rdev, mddev);
@@ -2410,7 +2410,7 @@ static void handle_read_error(struct mdd
r10_bio->devs[slot].bio = NULL;
if (mddev->ro == 0) {
- freeze_array(conf);
+ freeze_array(conf, 1);
fix_read_error(conf, mddev, r10_bio);
unfreeze_array(conf);
} else
next prev parent reply other threads:[~2013-08-18 20:53 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-18 20:34 [ 00/34] 3.4.59-stable review Greg Kroah-Hartman
2013-08-18 20:34 ` [ 01/34] perf/arm: Fix armpmu_map_hw_event() Greg Kroah-Hartman
2013-08-18 20:34 ` [ 02/34] fs/proc/task_mmu.c: fix buffer overflow in add_page_map() Greg Kroah-Hartman
2013-08-18 20:34 ` [ 03/34] drm/i915/lvds: ditch ->prepare special case Greg Kroah-Hartman
2013-08-18 20:34 ` [ 04/34] MIPS: Expose missing pci_io{map,unmap} declarations Greg Kroah-Hartman
2013-08-18 20:34 ` [ 05/34] microblaze: Update microblaze defconfigs Greg Kroah-Hartman
2013-08-18 20:34 ` [ 06/34] sound: Fix make allmodconfig on MIPS Greg Kroah-Hartman
2013-08-18 20:34 ` [ 07/34] sound: Fix make allmodconfig on MIPS correctly Greg Kroah-Hartman
2013-08-18 20:34 ` [ 08/34] HID: microsoft: do not use compound literal - fix build Greg Kroah-Hartman
2013-08-18 20:34 ` [ 09/34] vm: add no-mmu vm_iomap_memory() stub Greg Kroah-Hartman
2013-08-18 20:34 ` [ 10/34] cris: posix_types.h, include asm-generic/posix_types.h Greg Kroah-Hartman
2013-08-18 20:34 ` [ 11/34] cris: Remove old legacy "-traditional" flag from arch-v10/lib/Makefile Greg Kroah-Hartman
2013-08-18 20:34 ` [ 12/34] CRIS: Add _sdata to vmlinux.lds.S Greg Kroah-Hartman
2013-08-18 20:34 ` [ 13/34] futex: Take hugepages into account when generating futex_key Greg Kroah-Hartman
2013-08-18 20:34 ` [ 14/34] frv: Use correct size for task_struct allocation Greg Kroah-Hartman
2013-08-18 20:34 ` [ 15/34] frv: Use core allocator for task_struct Greg Kroah-Hartman
2013-08-18 20:34 ` [ 16/34] powerpc/numa: Avoid stupid uninitialized warning from gcc Greg Kroah-Hartman
2013-08-18 20:34 ` [ 17/34] alpha: makefile: dont enforce small data model for kernel builds Greg Kroah-Hartman
2013-08-18 20:34 ` Greg Kroah-Hartman [this message]
2013-08-18 20:34 ` [ 19/34] sparc32: add ucmpdi2 Greg Kroah-Hartman
2013-08-18 20:34 ` [ 20/34] sparc32: Add ucmpdi2.o to obj-y instead of lib-y Greg Kroah-Hartman
2013-08-18 20:34 ` [ 21/34] MIPS: Rewrite pfn_valid to work in modules, too Greg Kroah-Hartman
2013-08-18 20:34 ` [ 22/34] af_key: initialize satype in key_notify_policy_flush() Greg Kroah-Hartman
2013-08-18 20:34 ` [ 23/34] iwl4965: set power mode early Greg Kroah-Hartman
2013-08-18 20:34 ` [ 24/34] iwl4965: reset firmware after rfkill off Greg Kroah-Hartman
2013-08-18 20:34 ` [ 25/34] can: pcan_usb: fix wrong memcpy() bytes length Greg Kroah-Hartman
2013-08-18 20:34 ` [ 26/34] genetlink: fix family dump race Greg Kroah-Hartman
2013-08-18 20:34 ` [ 27/34] usb: add two quirky touchscreen Greg Kroah-Hartman
2013-08-18 20:34 ` [ 28/34] USB: mos7720: fix broken control requests Greg Kroah-Hartman
2013-08-18 20:34 ` [ 29/34] xtensa: fix linker script transformation for .text.unlikely Greg Kroah-Hartman
2013-08-18 20:34 ` [ 30/34] xtensa: replace xtensa-specific _f{data,text} by _s{data,text} Greg Kroah-Hartman
2013-08-18 20:34 ` [ 31/34] ARM: 7809/1: perf: fix event validation for software group leaders Greg Kroah-Hartman
2013-08-18 20:34 ` [ 32/34] m68k: Truncate base in do_div() Greg Kroah-Hartman
2013-08-18 20:34 ` [ 33/34] m68k/atari: ARAnyM - Fix NatFeat module support Greg Kroah-Hartman
2013-08-18 20:34 ` [ 34/34] jbd2: Fix use after free after error in jbd2_journal_dirty_metadata() Greg Kroah-Hartman
2013-08-19 1:49 ` [ 00/34] 3.4.59-stable review Guenter Roeck
2013-08-19 18:02 ` Shuah Khan
2013-08-19 19:35 ` Greg Kroah-Hartman
2013-08-19 20:14 ` Stefan Lippers-Hollmann
2013-08-19 22:22 ` Shuah Khan
2013-08-19 22:30 ` Greg Kroah-Hartman
2013-08-20 7:36 ` Berg, Johannes
2013-08-20 7:36 ` Berg, Johannes
2013-08-20 15:24 ` Greg Kroah-Hartman
2013-08-20 15:32 ` Berg, Johannes
2013-08-20 15:53 ` Hugh Dickins
2013-08-20 16:03 ` Greg Kroah-Hartman
2013-08-20 16:25 ` Hugh Dickins
2013-08-20 16:43 ` Steven Rostedt
2013-08-20 16:43 ` Shuah Khan
2013-08-19 22:31 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130818203300.933588263@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=alex.bolshoy@gmail.com \
--cc=jinpu.wang@profitbricks.com \
--cc=linux-kernel@vger.kernel.org \
--cc=neilb@suse.de \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.