From: NeilBrown <neilb@suse.de>
To: linux-raid@vger.kernel.org
Subject: [md PATCH 05/22] md/raid5: don't complete make_request on barrier until writes are scheduled
Date: Fri, 04 Dec 2009 17:48:02 +1100 [thread overview]
Message-ID: <20091204064802.10264.6658.stgit@notabene.brown> (raw)
In-Reply-To: <20091204064559.10264.37619.stgit@notabene.brown>
The post-barrier-flush is sent be md as soon as make_request on the
barrier write completes. For raid5, the data might not be in the
per-device queues yet. So for barrier requests, wait for any
pre-reading to be done so that the request will be in the per-device
queues.
We use the 'preread_active' count to check that nothing is still in
the preread phase, and delay the decrement of this count until after
write requests have been submitted to the underlying devices.
Signed-off-by: NeilBrown <neilb@suse.de>
---
drivers/md/raid5.c | 51 +++++++++++++++++++++++++++++++++++++++------------
1 files changed, 39 insertions(+), 12 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ecf89c8..8c772b2 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2947,6 +2947,7 @@ static void handle_stripe5(struct stripe_head *sh)
struct r5dev *dev;
mdk_rdev_t *blocked_rdev = NULL;
int prexor;
+ int dec_preread_active = 0;
memset(&s, 0, sizeof(s));
pr_debug("handling stripe %llu, state=%#lx cnt=%d, pd_idx=%d check:%d "
@@ -3096,12 +3097,8 @@ static void handle_stripe5(struct stripe_head *sh)
set_bit(STRIPE_INSYNC, &sh->state);
}
}
- if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) {
- atomic_dec(&conf->preread_active_stripes);
- if (atomic_read(&conf->preread_active_stripes) <
- IO_THRESHOLD)
- md_wakeup_thread(conf->mddev->thread);
- }
+ if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+ dec_preread_active = 1;
}
/* Now to consider new write requests and what else, if anything
@@ -3208,6 +3205,16 @@ static void handle_stripe5(struct stripe_head *sh)
ops_run_io(sh, &s);
+ if (dec_preread_active) {
+ /* We delay this until after ops_run_io so that if make_request
+ * is waiting on a barrier, it won't continue until the writes
+ * have actually been submitted.
+ */
+ atomic_dec(&conf->preread_active_stripes);
+ if (atomic_read(&conf->preread_active_stripes) <
+ IO_THRESHOLD)
+ md_wakeup_thread(conf->mddev->thread);
+ }
return_io(return_bi);
}
@@ -3221,6 +3228,7 @@ static void handle_stripe6(struct stripe_head *sh)
struct r6_state r6s;
struct r5dev *dev, *pdev, *qdev;
mdk_rdev_t *blocked_rdev = NULL;
+ int dec_preread_active = 0;
pr_debug("handling stripe %llu, state=%#lx cnt=%d, "
"pd_idx=%d, qd_idx=%d\n, check:%d, reconstruct:%d\n",
@@ -3380,12 +3388,8 @@ static void handle_stripe6(struct stripe_head *sh)
set_bit(STRIPE_INSYNC, &sh->state);
}
}
- if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state)) {
- atomic_dec(&conf->preread_active_stripes);
- if (atomic_read(&conf->preread_active_stripes) <
- IO_THRESHOLD)
- md_wakeup_thread(conf->mddev->thread);
- }
+ if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+ dec_preread_active = 1;
}
/* Now to consider new write requests and what else, if anything
@@ -3494,6 +3498,18 @@ static void handle_stripe6(struct stripe_head *sh)
ops_run_io(sh, &s);
+
+ if (dec_preread_active) {
+ /* We delay this until after ops_run_io so that if make_request
+ * is waiting on a barrier, it won't continue until the writes
+ * have actually been submitted.
+ */
+ atomic_dec(&conf->preread_active_stripes);
+ if (atomic_read(&conf->preread_active_stripes) <
+ IO_THRESHOLD)
+ md_wakeup_thread(conf->mddev->thread);
+ }
+
return_io(return_bi);
}
@@ -3996,6 +4012,9 @@ static int make_request(struct request_queue *q, struct bio * bi)
finish_wait(&conf->wait_for_overlap, &w);
set_bit(STRIPE_HANDLE, &sh->state);
clear_bit(STRIPE_DELAYED, &sh->state);
+ if (mddev->barrier &&
+ !test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
+ atomic_inc(&conf->preread_active_stripes);
release_stripe(sh);
} else {
/* cannot get stripe for read-ahead, just give-up */
@@ -4015,6 +4034,14 @@ static int make_request(struct request_queue *q, struct bio * bi)
bio_endio(bi, 0);
}
+
+ if (mddev->barrier) {
+ /* We need to wait for the stripes to all be handled.
+ * So: wait for preread_active_stripes to drop to 0.
+ */
+ wait_event(mddev->thread->wqueue,
+ atomic_read(&conf->preread_active_stripes) == 0);
+ }
return 0;
}
next prev parent reply other threads:[~2009-12-04 6:48 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-04 6:48 [md PATCH 00/22] MD patches queued for 2.6.33 NeilBrown
2009-12-04 6:48 ` [md PATCH 02/22] md: adjust resync_min usefully when resync aborts NeilBrown
2009-12-04 6:48 ` [md PATCH 03/22] md: don't reset curr_resync_completed after an interrupted resync NeilBrown
2009-12-04 6:48 ` NeilBrown [this message]
2010-01-21 21:07 ` [md PATCH 05/22] md/raid5: don't complete make_request on barrieruntil writes are scheduled Tirumala Reddy Marri
2010-01-28 2:44 ` Neil Brown
2009-12-04 6:48 ` [md PATCH 11/22] md: change daemon_sleep to be in 'jiffies' rather than 'seconds' NeilBrown
2009-12-04 6:48 ` [md PATCH 01/22] md/bitmap: protect against bitmap removal while being updated NeilBrown
2009-12-04 6:48 ` [md PATCH 09/22] md: collect bitmap-specific fields into one structure NeilBrown
2009-12-04 6:48 ` [md PATCH 04/22] md: support barrier requests on all personalities NeilBrown
2009-12-08 13:54 ` Andre Noll
2009-12-10 6:25 ` Neil Brown
2009-12-11 11:46 ` Andre Noll
2009-12-04 6:48 ` [md PATCH 13/22] md: support bitmap offset appropriate for external-metadata arrays NeilBrown
2009-12-04 6:48 ` [md PATCH 21/22] md: move compat_ioctl handling into md.c NeilBrown
2009-12-04 6:48 ` [md PATCH 22/22] md: integrate spares into array at earliest opportunity NeilBrown
2009-12-04 6:48 ` [md PATCH 19/22] md: add MODULE_DESCRIPTION for all md related modules NeilBrown
2009-12-04 6:48 ` [md PATCH 16/22] md/bitmap: update dirty flag when bitmap bits are explicitly set NeilBrown
2009-12-04 6:48 ` [md PATCH 15/22] md: Support write-intent bitmaps with externally managed metadata NeilBrown
2009-12-04 6:48 ` [md PATCH 17/22] md/raid10: print more useful messages on device failure NeilBrown
2009-12-04 6:48 ` [md PATCH 14/22] md: support updating bitmap parameters via sysfs NeilBrown
2009-12-08 10:29 ` Andre Noll
2009-12-10 6:14 ` Neil Brown
2009-12-11 11:46 ` Andre Noll
2009-12-04 6:48 ` [md PATCH 07/22] md/raid1: add takeover support for raid5->raid1 NeilBrown
2009-12-04 6:48 ` [md PATCH 10/22] md: move offset, daemon_sleep and chunksize out of bitmap structure NeilBrown
2009-12-04 6:48 ` [md PATCH 06/22] md: add honouring of suspend_{lo,hi} to raid1 NeilBrown
2009-12-04 6:48 ` [md PATCH 12/22] md: remove needless setting of thread->timeout in raid10_quiesce NeilBrown
2009-12-04 6:48 ` [md PATCH 20/22] md: revise Kconfig help for MD_MULTIPATH NeilBrown
2009-12-04 6:48 ` [md PATCH 18/22] raid: improve MD/raid10 handling of correctable read errors NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091204064802.10264.6658.stgit@notabene.brown \
--to=neilb@suse.de \
--cc=linux-raid@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).