linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 006 of 12] md: Fix some small races in bitmap plugging in raid5.
Date: Tue, 27 Jun 2006 17:05:37 +1000	[thread overview]
Message-ID: <1060627070537.26022@suse.de> (raw)
In-Reply-To: 20060627170010.25835.patches@notabene


The comment gives more details, but I didn't quite have the
sequencing write, so there was room for races to leave bits
unset in the on-disk bitmap for short periods of time.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid5.c |   30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2006-06-27 12:17:33.000000000 +1000
+++ ./drivers/md/raid5.c	2006-06-27 12:17:33.000000000 +1000
@@ -18,6 +18,30 @@
  * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+/*
+ * BITMAP UNPLUGGING:
+ *
+ * The sequencing for updating the bitmap reliably is a little
+ * subtle (and I got it wrong the first time) so it deserves some
+ * explanation.
+ *
+ * We group bitmap updates into batches.  Each batch has a number.
+ * We may write out several batches at once, but that isn't very important.
+ * conf->bm_write is the number of the last batch successfully written.
+ * conf->bm_flush is the number of the last batch that was closed to
+ *    new additions.
+ * When we discover that we will need to write to any block in a stripe
+ * (in add_stripe_bio) we update the in-memory bitmap and record in sh->bm_seq
+ * the number of the batch it will be in. This is bm_flush+1.
+ * When we are ready to do a write, if that batch hasn't been written yet,
+ *   we plug the array and queue the stripe for later.
+ * When an unplug happens, we increment bm_flush, thus closing the current
+ *   batch.
+ * When we notice that bm_flush > bm_write, we write out all pending updates
+ * to the bitmap, and advance bm_write to where bm_flush was.
+ * This may occasionally write a bit out twice, but is sure never to
+ * miss any bits.
+ */
 
 #include <linux/config.h>
 #include <linux/module.h>
@@ -93,7 +117,7 @@ static void __release_stripe(raid5_conf_
 				list_add_tail(&sh->lru, &conf->delayed_list);
 				blk_plug_device(conf->mddev->queue);
 			} else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
-				   conf->seq_write == sh->bm_seq) {
+				   sh->bm_seq - conf->seq_write > 0) {
 				list_add_tail(&sh->lru, &conf->bitmap_list);
 				blk_plug_device(conf->mddev->queue);
 			} else {
@@ -1274,9 +1298,9 @@ static int add_stripe_bio(struct stripe_
 		(unsigned long long)sh->sector, dd_idx);
 
 	if (conf->mddev->bitmap && firstwrite) {
-		sh->bm_seq = conf->seq_write;
 		bitmap_startwrite(conf->mddev->bitmap, sh->sector,
 				  STRIPE_SECTORS, 0);
+		sh->bm_seq = conf->seq_flush+1;
 		set_bit(STRIPE_BIT_DELAY, &sh->state);
 	}
 
@@ -2920,7 +2944,7 @@ static void raid5d (mddev_t *mddev)
 	while (1) {
 		struct list_head *first;
 
-		if (conf->seq_flush - conf->seq_write > 0) {
+		if (conf->seq_flush != conf->seq_write) {
 			int seq = conf->seq_flush;
 			spin_unlock_irq(&conf->device_lock);
 			bitmap_unplug(mddev->bitmap);

  parent reply	other threads:[~2006-06-27  7:05 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-27  7:05 [PATCH 000 of 12] md: Introduction NeilBrown
2006-06-27  7:05 ` [PATCH 001 of 12] md: Possible fix for unplug problem NeilBrown
2006-06-27  7:05 ` [PATCH 002 of 12] md: Set desc_nr correctly for version-1 superblocks NeilBrown
2006-06-27  7:05 ` [PATCH 003 of 12] md: Delay starting md threads until array is completely setup NeilBrown
2006-06-27  7:05 ` [PATCH 004 of 12] md: Fix resync speed calculation for restarted resyncs NeilBrown
2006-06-27  7:05 ` [PATCH 005 of 12] md: Fix a plug/unplug race in raid5 NeilBrown
2006-06-27  7:05 ` NeilBrown [this message]
2006-06-27  7:05 ` [PATCH 007 of 12] md: Fix usage of wrong variable in raid1 NeilBrown
2006-06-27  7:05 ` [PATCH 008 of 12] md: Unify usage of symbolic names for perms NeilBrown
2006-06-27  7:05 ` [PATCH 009 of 12] md: Require CAP_SYS_ADMIN for (re-)configuring md devices via sysfs NeilBrown
2006-06-27  7:05 ` [PATCH 010 of 12] md: Remove a variable that is now unused NeilBrown
2006-06-27  7:06 ` [PATCH 011 of 12] md: Fix "Will Configure" message when interpreting md= kernel parameter NeilBrown
2006-06-27  7:06 ` [PATCH 012 of 12] md: Include sector number in messages about corrected read errors NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1060627070537.26022@suse.de \
    --to=neilb@suse.de \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).