linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 000 of 3] md: Introduction - assorted fixed for 2.6.16
@ 2006-03-30  5:52 NeilBrown
  2006-03-30  5:52 ` [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery NeilBrown
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: NeilBrown @ 2006-03-30  5:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel

Following are three patches for md.  The first fixes a problem that
can cause corruption in fairly unusual circumstances (re-adding a
device to a raid1 and suffering write-errors that are subsequntly
fixed and the device is re-added again).

The other two fix minor problems

The are suitable to go straight in to 2.6.17-rc.

NeilBrown

 [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
 [PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock.
 [PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
  2006-03-30  5:52 [PATCH 000 of 3] md: Introduction - assorted fixed for 2.6.16 NeilBrown
@ 2006-03-30  5:52 ` NeilBrown
  2006-03-30  6:12   ` Andrew Morton
  2006-03-30  5:52 ` [PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock NeilBrown
  2006-03-30  5:52 ` [PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache NeilBrown
  2 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2006-03-30  5:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


Currently a device failure during recovery leaves bits set in the
bitmap.  This normally isn't a problem as the offending device will be
rejected because of errors.  However if device re-adding is being used
with non-persistent bitmaps, this can be a problem.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2006-03-30 16:48:29.000000000 +1100
+++ ./drivers/md/raid1.c	2006-03-30 16:48:40.000000000 +1100
@@ -1135,8 +1135,19 @@ static int end_sync_write(struct bio *bi
 			mirror = i;
 			break;
 		}
-	if (!uptodate)
+	if (!uptodate) {
+		int sync_blocks = 0;
+		sector_t s = r1_bio->sector;
+		long sectors_to_go = r1_bio->sectors;
+		/* make sure these bits doesn't get cleared. */
+		do {
+			bitmap_end_sync(mddev->bitmap, r1_bio->sector,
+					&sync_blocks, 1);
+			s += sync_blocks;
+			sectors_to_go -= sync_blocks;
+		} while (sectors_to_go > 0);
 		md_error(mddev, conf->mirrors[mirror].rdev);
+	}
 
 	update_head_pos(mirror, r1_bio);
 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock.
  2006-03-30  5:52 [PATCH 000 of 3] md: Introduction - assorted fixed for 2.6.16 NeilBrown
  2006-03-30  5:52 ` [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery NeilBrown
@ 2006-03-30  5:52 ` NeilBrown
  2006-03-30  5:52 ` [PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache NeilBrown
  2 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2006-03-30  5:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


And remove the comments that were put in inplace of a fix too....

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2006-03-30 16:48:30.000000000 +1100
+++ ./drivers/md/md.c	2006-03-30 16:48:47.000000000 +1100
@@ -214,13 +214,11 @@ static void mddev_put(mddev_t *mddev)
 		return;
 	if (!mddev->raid_disks && list_empty(&mddev->disks)) {
 		list_del(&mddev->all_mddevs);
-		/* that blocks */
+		spin_unlock(&all_mddevs_lock);
 		blk_cleanup_queue(mddev->queue);
-		/* that also blocks */
 		kobject_unregister(&mddev->kobj);
-		/* result blows... */
-	}
-	spin_unlock(&all_mddevs_lock);
+	} else
+		spin_unlock(&all_mddevs_lock);
 }
 
 static mddev_t * mddev_find(dev_t unit)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache
  2006-03-30  5:52 [PATCH 000 of 3] md: Introduction - assorted fixed for 2.6.16 NeilBrown
  2006-03-30  5:52 ` [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery NeilBrown
  2006-03-30  5:52 ` [PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock NeilBrown
@ 2006-03-30  5:52 ` NeilBrown
  2 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2006-03-30  5:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


Signed-off-by: Brad Campbell <brad@wasp.net.au>
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid6main.c |    2 ++
 1 file changed, 2 insertions(+)

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2006-03-30 16:48:30.000000000 +1100
+++ ./drivers/md/raid6main.c	2006-03-30 16:48:52.000000000 +1100
@@ -2151,6 +2151,8 @@ static int run(mddev_t *mddev)
 	}
 
 	/* Ok, everything is just fine now */
+	sysfs_create_group(&mddev->kobj, &raid6_attrs_group);
+
 	mddev->array_size =  mddev->size * (mddev->raid_disks - 2);
 
 	mddev->queue->unplug_fn = raid6_unplug_device;

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
  2006-03-30  5:52 ` [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery NeilBrown
@ 2006-03-30  6:12   ` Andrew Morton
  2006-03-30  6:48     ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-03-30  6:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, linux-kernel

NeilBrown <neilb@suse.de> wrote:
>
> +	if (!uptodate) {
>  +		int sync_blocks = 0;
>  +		sector_t s = r1_bio->sector;
>  +		long sectors_to_go = r1_bio->sectors;
>  +		/* make sure these bits doesn't get cleared. */
>  +		do {
>  +			bitmap_end_sync(mddev->bitmap, r1_bio->sector,
>  +					&sync_blocks, 1);
>  +			s += sync_blocks;
>  +			sectors_to_go -= sync_blocks;
>  +		} while (sectors_to_go > 0);
>   		md_error(mddev, conf->mirrors[mirror].rdev);
>  +	}

Can mddev->bitmap be NULL?

If so, will the above loop do the right thing when this:

void bitmap_end_sync(struct bitmap *bitmap, sector_t offset, int *blocks, int aborted)
{
	bitmap_counter_t *bmc;
	unsigned long flags;
/*
	if (offset == 0) printk("bitmap_end_sync 0 (%d)\n", aborted);
*/	if (bitmap == NULL) {
		*blocks = 1024;
		return;
	}

triggers?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery.
  2006-03-30  6:12   ` Andrew Morton
@ 2006-03-30  6:48     ` Neil Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Neil Brown @ 2006-03-30  6:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel

On Wednesday March 29, akpm@osdl.org wrote:
> NeilBrown <neilb@suse.de> wrote:
> >
> > +	if (!uptodate) {
> >  +		int sync_blocks = 0;
> >  +		sector_t s = r1_bio->sector;
> >  +		long sectors_to_go = r1_bio->sectors;
> >  +		/* make sure these bits doesn't get cleared. */
> >  +		do {
> >  +			bitmap_end_sync(mddev->bitmap, r1_bio->sector,
> >  +					&sync_blocks, 1);
> >  +			s += sync_blocks;
> >  +			sectors_to_go -= sync_blocks;
> >  +		} while (sectors_to_go > 0);
> >   		md_error(mddev, conf->mirrors[mirror].rdev);
> >  +	}
> 
> Can mddev->bitmap be NULL?

Yes, normally it is.

> 
> If so, will the above loop do the right thing when this:
> 
> void bitmap_end_sync(struct bitmap *bitmap, sector_t offset, int *blocks, int aborted)
> {
> 	bitmap_counter_t *bmc;
> 	unsigned long flags;
> /*
> 	if (offset == 0) printk("bitmap_end_sync 0 (%d)\n", aborted);
> */	if (bitmap == NULL) {
> 		*blocks = 1024;
> 		return;
> 	}
> 
> triggers?

Yes.  sync_blocks will be 1024 (a nice big number) and the loop will
exit quite quickly having done nothing (which is what it needs to do
in that case).
Ofcourse, if someone submits a bio for multiple thousands of sectors
it will loop needlessly a few times, but do we ever generate bios that
are even close to a megabyte?
If so, that 1024 can be safely increased to 1<<20, and possibly higher
but I would need to check.

Thanks for asking
NeilBrown

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-03-30  6:48 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-30  5:52 [PATCH 000 of 3] md: Introduction - assorted fixed for 2.6.16 NeilBrown
2006-03-30  5:52 ` [PATCH 001 of 3] md: Don't clear bits in bitmap when writing to one device fails during recovery NeilBrown
2006-03-30  6:12   ` Andrew Morton
2006-03-30  6:48     ` Neil Brown
2006-03-30  5:52 ` [PATCH 002 of 3] md: Remove some code that can sleep from under a spinlock NeilBrown
2006-03-30  5:52 ` [PATCH 003 of 3] md: Raid-6 did not create sysfs entries for stripe cache NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).