RAID1 ramdisk patch

public inbox for linux-kernel@vger.kernel.org

* RAID1 ramdisk patch
@ 2005-09-05  0:46 Wilco Baan Hofman
  2005-09-05  1:27 ` Neil Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Wilco Baan Hofman @ 2005-09-05  0:46 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 680 bytes --]

Hi all,

I have written a small patch for use with a HDD-backed ramdisk in the md 
raid1 driver. The raid1 driver usually does read balancing on the disks, 
but I feel that if it encounters a single ram disk in the array that 
should be the preferred read disk. The application of this would be for 
example a 2GB ram disk in raid1 with a 2GB partition, where the ram disk 
is used for reading and both 'disks' used for writing.

Attached is a bit of code which checks for a ram-disk and sets it as 
preferred disk. It also checks if the ram disk is in sync before 
allowing the read.

PS. I am not subscribed to this list, so please CC me on any reply.

Regards,

Wilco Baan Hofman

[-- Attachment #2: syn-raid1ramdisk-20050905.patch --]
[-- Type: text/plain, Size: 2947 bytes --]

diff -urN linux-2.6.13-rc6.orig/include/linux/raid/raid1.h linux-2.6.13-rc6/include/linux/raid/raid1.h
--- linux-2.6.13-rc6.orig/include/linux/raid/raid1.h	2005-08-07 20:18:56.000000000 +0200
+++ linux-2.6.13-rc6/include/linux/raid/raid1.h	2005-09-04 11:41:24.000000000 +0200
@@ -32,6 +32,7 @@
 	int			raid_disks;
 	int			working_disks;
 	int			last_used;
+	int			preferred_read_disk;
 	sector_t		next_seq_sect;
 	spinlock_t		device_lock;
diff -urN linux-2.6.13-rc6.orig/drivers/md/raid1.c linux-2.6.13-rc6/drivers/md/raid1.c 
--- linux-2.6.13-rc6.orig/drivers/md/raid1.c	2005-08-07 20:18:56.000000000 +0200
+++ linux-2.6.13-rc6/drivers/md/raid1.c	2005-09-05 01:54:26.000000000 +0200
@@ -21,6 +21,8 @@
  * Additions to bitmap code, (C) 2003-2004 Paul Clements, SteelEye Technology:
  * - persistent bitmap code
  *
+ * Special handling of ramdisk (C) 2005 Wilco Baan Hofman <wilco@baanhofman.nl>
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2, or (at your option)
@@ -399,8 +401,6 @@
 			goto rb_out;
 		}
 	}
-	disk = new_disk;
-	/* now disk == new_disk == starting point for search */
 
 	/*
 	 * Don't change to another disk for sequential reads:
@@ -409,7 +409,18 @@
 		goto rb_out;
 	if (this_sector == conf->mirrors[new_disk].head_position)
 		goto rb_out;
-
+	
+	/* [SYN] If the preferred disk exists, return it */
+	if (conf->preferred_read_disk != -1 &&
+			(new_rdev=conf->mirrors[conf->preferred_read_disk].rdev) != NULL &&
+		        new_rdev->in_sync) {
+		new_disk = conf->preferred_read_disk;
+		goto rb_out;
+	}
+	
+	disk = new_disk;
+	/* now disk == new_disk == starting point for search */
+	
 	current_distance = abs(this_sector - conf->mirrors[disk].head_position);
 
 	/* Find the disk whose head is closest */
@@ -1292,10 +1303,11 @@
 static int run(mddev_t *mddev)
 {
 	conf_t *conf;
-	int i, j, disk_idx;
+	int i, j, disk_idx, ram_count;
 	mirror_info_t *disk;
 	mdk_rdev_t *rdev;
 	struct list_head *tmp;
+	char b[BDEVNAME_SIZE];
 
 	if (mddev->level != 1) {
 		printk("raid1: %s: raid level not set to mirroring (%d)\n",
@@ -1417,6 +1429,30 @@
 	mddev->queue->unplug_fn = raid1_unplug;
 	mddev->queue->issue_flush_fn = raid1_issue_flush;
 
+	/* [SYN] if there is a ram disk, that will be the preferred disk.
+	 * .. unless there are multiple ram disks. */
+	conf->preferred_read_disk = -1;
+	for (i = 0,
+	     ram_count = 0; 
+	     i < mddev->raid_disks; 
+	     i++) {
+	
+		bdevname(conf->mirrors[i].rdev->bdev, b);
+		if (strncmp(b, "ram", 3) == 0) {
+			if (ram_count) {
+				conf->preferred_read_disk = -1;
+				break;
+			}
+			conf->preferred_read_disk = i;
+			ram_count++;
+		}
+	}
+	if (conf->preferred_read_disk >= 0) {
+		printk(KERN_INFO 
+			"raid1: One ram disk (%s) found, setting it preferred read disk.\n", b);
+	}
+
+	
 	return 0;
 
 out_no_mem:
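
The selection rule described in the cover text (prefer the ram disk only when exactly one is present, and read from it only while it is in sync) can be modelled in userspace. This is a minimal sketch, not the kernel code: `struct mirror_model`, `find_preferred_read_disk()` and `choose_read_disk()` are hypothetical names introduced for illustration. The sketch also skips empty slots, on the assumption that a mirror slot may have no device attached.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one mirror slot; not the kernel's mirror_info_t. */
struct mirror_model {
	const char *name;	/* block device name, e.g. "ram0"; NULL if the slot is empty */
	int in_sync;		/* non-zero once the device holds current data */
};

/*
 * Return the index of the only device whose name starts with "ram",
 * or -1 if there is no such device or more than one.
 */
int find_preferred_read_disk(const struct mirror_model *m, int n)
{
	int preferred = -1;

	assert(m != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		if (m[i].name == NULL)
			continue;		/* empty slot: nothing to inspect */
		if (strncmp(m[i].name, "ram", 3) != 0)
			continue;		/* not a ram disk */
		if (preferred != -1)
			return -1;		/* second ram disk: no preference */
		preferred = i;
	}
	return preferred;
}

/*
 * Read-side check: use the preferred disk only while it is in sync,
 * otherwise keep the disk picked by ordinary read balancing.
 */
int choose_read_disk(const struct mirror_model *m, int n,
		     int preferred, int balanced)
{
	assert(m != NULL && n >= 0);
	if (preferred >= 0 && preferred < n &&
	    m[preferred].name != NULL && m[preferred].in_sync)
		return preferred;
	return balanced;
}
```

Because the sketch records the index rather than a name buffer, a caller that wants to log the chosen device can print `m[preferred].name` directly.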


* Re: RAID1 ramdisk patch
  2005-09-05  0:46 RAID1 ramdisk patch Wilco Baan Hofman
@ 2005-09-05  1:27 ` Neil Brown
  2005-09-05  7:40   ` Wilco Baan Hofman
  2005-11-16 13:36   ` segfault mdadm --write-behind, 2.6.14-mm2 (was: Re: RAID1 ramdisk patch) Sander
  0 siblings, 2 replies; 19+ messages in thread
From: Neil Brown @ 2005-09-05  1:27 UTC (permalink / raw)
  To: Wilco Baan Hofman; +Cc: linux-kernel

On Monday September 5, wilco@baanhofman.nl wrote:
> Hi all,
> 
> I have written a small patch for use with a HDD-backed ramdisk in the md 
> raid1 driver. The raid1 driver usually does read balancing on the disks, 
> but I feel that if it encounters a single ram disk in the array that 
> should be the preferred read disk. The application of this would be for 
> example a 2GB ram disk in raid1 with a 2GB partition, where the ram disk 
> is used for reading and both 'disks' used for writing.
> 
> Attached is a bit of code which checks for a ram-disk and sets it as 
> preferred disk. It also checks if the ram disk is in sync before 
> allowing the read.

Hi,
 equivalent functionality is now available in 2.6-mm and is referred
 to as 'write mostly'.
 If you use mdadm-2.0 and mark a device as --write-mostly, then all
 read requests will go to the other device(s) if possible,
 e.g.
   mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/ramdisk \
      --write-mostly /dev/realdisk

 Does this suit your needs?

 You can also arrange for writes to the write-mostly device to be
 'write-behind', so that the filesystem doesn't wait for those writes to
 complete.  This can reduce write latency (though not increase write
 throughput) at a very small cost in reliability (if the RAM dies, the
 disk may not be 100% up-to-date).
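
The read-side effect of 'write mostly' can be sketched in userspace as follows. This is only a model of the rule stated above (reads go to the other devices if possible), not the md driver's code, which also balances reads among the eligible devices; `struct member_model` and `pick_read_member()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one array member. */
struct member_model {
	int in_sync;		/* non-zero if the member holds current data */
	int write_mostly;	/* non-zero if the member was marked --write-mostly */
};

/*
 * Return the index of the member to read from: the first in-sync member
 * that is not write-mostly, else the first in-sync write-mostly member,
 * else -1 if no member can serve the read.
 */
int pick_read_member(const struct member_model *m, int n)
{
	int fallback = -1;

	assert(m != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		if (!m[i].in_sync)
			continue;		/* stale member: never read from it */
		if (!m[i].write_mostly)
			return i;		/* ordinary member: preferred for reads */
		if (fallback < 0)
			fallback = i;		/* remember a write-mostly member as last resort */
	}
	return fallback;
}
```

With a ram disk as the ordinary member and the real disk marked write-mostly, the model returns the ram disk for reads and falls back to the real disk only while the ram disk is out of sync.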

NeilBrown



* Re: RAID1 ramdisk patch
  2005-09-05  1:27 ` Neil Brown
@ 2005-09-05  7:40   ` Wilco Baan Hofman
  2005-11-16 13:36   ` segfault mdadm --write-behind, 2.6.14-mm2 (was: Re: RAID1 ramdisk patch) Sander
  1 sibling, 0 replies; 19+ messages in thread
From: Wilco Baan Hofman @ 2005-09-05  7:40 UTC (permalink / raw)
  To: linux-kernel

Neil Brown wrote:

>On Monday September 5, wilco@baanhofman.nl wrote:
>  
>
>>Hi all,
>>
>>I have written a small patch for use with a HDD-backed ramdisk in the md 
>>raid1 driver. The raid1 driver usually does read balancing on the disks, 
>>but I feel that if it encounters a single ram disk in the array that 
>>should be the preferred read disk. The application of this would be for 
>>example a 2GB ram disk in raid1 with a 2GB partition, where the ram disk 
>>is used for reading and both 'disks' used for writing.
>>
>>Attached is a bit of code which checks for a ram-disk and sets it as 
>>preferred disk. It also checks if the ram disk is in sync before 
>>allowing the read.
>>    
>>
>
>Hi,
> equivalent functionality is now available in 2.6-mm and is referred
> to as 'write mostly'.
> If you use mdadm-2.0 and mark a device as --write-mostly, then all
> read requests will go to the other device(s) if possible,
> e.g.
>   mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/ramdisk \
>      --write-mostly /dev/realdisk
>
> Does this suit your needs?
>
> You can also arrange for the write to the writemostly device to be
> 'write-behind' so that the filesystem doesn't wait for the write to
> complete.  This can reduce write-latency (though not increase write
> throughput) at a very small cost of reliability (if the RAM dies, the
> disk may not be 100% up-to-date).
>
>NeilBrown
>
>  
>
I was looking for exactly that, but couldn't find it.

At this point I don't see why it wouldn't. If that also syncs from the
partition, then it's basically the same functionality, just written from a
different perspective.

To use it I'll have to deviate from stock Linux and use a non-packaged
mdadm, but that is better than applying my patch at every kernel update ;-)

Thanks, I'll look into it.

Wilco Baan Hofman


* segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-09-05  1:27 ` Neil Brown
  2005-09-05  7:40   ` Wilco Baan Hofman
@ 2005-11-16 13:36   ` Sander
  2005-11-16 22:20     ` Andrew Morton
  1 sibling, 1 reply; 19+ messages in thread
From: Sander @ 2005-11-16 13:36 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-kernel

Neil Brown wrote (ao):
>  If you use mdadm-2.0 and mark a device as --write-mostly, then all
>  read requests will go to the other device(s) if possible,
>  e.g.
>    mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/ramdisk \
>       --write-mostly /dev/realdisk
> 
>  Does this suit your needs?
> 
>  You can also arrange for the write to the writemostly device to be
>  'write-behind' so that the filesystem doesn't wait for the write to
>  complete.  This can reduce write-latency (though not increase write
>  throughput) at a very small cost of reliability (if the RAM dies, the
>  disk may not be 100% up-to-date).

With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
try this:

mdadm -C /dev/md1 -l1 -n2 --bitmap=/storage/md1.bitmap /dev/loop0 \
--write-behind /dev/loop1

loop0 is attached to a file on tmpfs, and loop1 is attached
to a file on a lvm2 volume (reiser4, if that matters).

I can create and use the array with:

mdadm -C /dev/md1 -l1 -n2 /dev/loop0 /dev/loop1

and

mdadm -C /dev/md1 -l1 -n2 /dev/loop0 --write-mostly /dev/loop1

mdadm is compiled with:
gcc (GCC) 4.0.3 20051023 (prerelease) (Debian 4.0.2-3)

Can/should I provide more info?

	With kind regards, Sander

This is what I get if I reboot, create the images with dd,
attach them with losetup and try to create the array with mdadm:


[42949575.730000] loop: loaded (max 8 devices)
[42949584.840000] md: bind<loop0>
[42949584.840000] md: bind<loop1>
[42949584.840000] md: md1: raid array is not clean -- starting background reconstruction
[42949584.840000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
[42949584.840000] md1: bitmap file is out of date, doing full recovery
[42949584.840000] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[42949584.840000]  printing eip:
[42949584.840000] c01c33dd
[42949584.840000] *pde = 00000000
[42949584.840000] Oops: 0000 [#1]
[42949584.840000] last sysfs file: /devices/pci0000:00/0000:00:11.0/i2c-0/name
[42949584.840000] Modules linked in: loop dm_mod i2c_viapro i2c_core
[42949584.840000] CPU:    0
[42949584.840000] EIP:    0060:[<c01c33dd>]    Not tainted VLI
[42949584.840000] EFLAGS: 00010286   (2.6.14-mm2)
[42949584.840000] EIP is at prepare_write_unix_file+0x1d/0xab
[42949584.840000] eax: 00000000   ebx: c01c33c0   ecx: 00000000   edx: c104ce60
[42949584.840000] esi: c104ce60   edi: f2f2f4a0   ebp: 00000000   esp: c2d6bd90
[42949584.840000] ds: 007b   es: 007b   ss: 0068
[42949584.840000] Process mdadm (pid: 749, threadinfo=c2d6b000 task=c3784580)
[42949584.840000] Stack: 30303034 00000000 c104ce60 c01c33c0 c104ce60 f2f2f4a0 00000001 c02b00f2
[42949584.840000]        00001000 00000f00 f2f2f4a0 c2674000 c104ce60 c02b1154 c03a97dc f7c278cc
[42949584.840000]        c2d6bddc c02b05b4 c03a975c f7c278cc 00000000 00000000 00000000 00031f20
[42949584.840000] Call Trace:
[42949584.840000]  [<c01c33c0>] prepare_write_unix_file+0x0/0xab
[42949584.840000]  [<c02b00f2>] write_page+0x52/0x140
[42949584.840000]  [<c02b1154>] bitmap_init_from_disk+0x384/0x450
[42949584.840000]  [<c02b05b4>] bitmap_read_sb+0x84/0x2f0
[42949584.840000]  [<c02b21f3>] bitmap_create+0x1a3/0x2a0
[42949584.840000]  [<c02ab95a>] do_md_run+0x2ba/0x500
[42949584.840000]  [<c02ac8a7>] add_new_disk+0x157/0x3b0
[42949584.840000]  [<c0179034>] mpage_writepages+0x124/0x3d0
[42949584.840000]  [<c013c23e>] __pagevec_free+0x3e/0x60
[42949584.840000]  [<c013eff9>] release_pages+0x29/0x160
[42949584.840000]  [<c02adb81>] md_ioctl+0x5a1/0x630
[42949584.840000]  [<c0137918>] find_get_pages+0x18/0x40
[42949584.840000]  [<c02ad5e0>] md_ioctl+0x0/0x630
[42949584.840000]  [<c01ede74>] blkdev_driver_ioctl+0x54/0x60
[42949584.840000]  [<c01edfb4>] blkdev_ioctl+0x134/0x180
[42949584.840000]  [<c015e158>] block_ioctl+0x18/0x20
[42949584.840000]  [<c015e140>] block_ioctl+0x0/0x20
[42949584.840000]  [<c01674ff>] do_ioctl+0x1f/0x70
[42949584.840000]  [<c016769c>] vfs_ioctl+0x5c/0x1e0
[42949584.840000]  [<c0156c91>] __fput+0xe1/0x140
[42949584.840000]  [<c016785d>] sys_ioctl+0x3d/0x70
[42949584.840000]  [<c0102f49>] syscall_call+0x7/0xb
[42949584.840000] Code: 02 00 00 eb 89 89 f6 8d bc 27 00 00 00 00 83 ec 1c 89 5c 24 0c 89 7c 24 14 89 6c 24 18 89 c5 89 74 24 10 89 54 24 08 89 4c 24 04 <8b> 40 08 8b 40 08 8b 80 94 00 00 00 e8 92 20 fd ff 3d 18 fc ff
[42949584.840000]


-- 
Humilis IT Services and Solutions
http://www.humilis.net


* Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-16 13:36   ` segfault mdadm --write-behind, 2.6.14-mm2 (was: Re: RAID1 ramdisk patch) Sander
@ 2005-11-16 22:20     ` Andrew Morton
  2005-11-16 23:08       ` Neil Brown
  2005-11-18 14:18       ` segfault mdadm --write-behind, 2.6.14-mm2 Vladimir V. Saveliev
  0 siblings, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2005-11-16 22:20 UTC (permalink / raw)
  To: sander; +Cc: neilb, linux-kernel, reiserfs-dev

Sander <sander@humilis.net> wrote:
>
> Neil Brown wrote (ao):
> >  If you use mdadm-2.0 and mark a device as --write-mostly, then all
> >  read requests will go to the other device(s) if possible,
> >  e.g.
> >    mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/ramdisk \
> >       --write-mostly /dev/realdisk
> > 
> >  Does this suit your needs?
> > 
> >  You can also arrange for the write to the writemostly device to be
> >  'write-behind' so that the filesystem doesn't wait for the write to
> >  complete.  This can reduce write-latency (though not increase write
> >  throughput) at a very small cost of reliability (if the RAM dies, the
> >  disk may not be 100% up-to-date).
> 
> With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
> try this:

It oopsed in reiser4.  reiserfs-dev added to Cc...

> mdadm -C /dev/md1 -l1 -n2 --bitmap=/storage/md1.bitmap /dev/loop0 \
> --write-behind /dev/loop1
> 
> loop0 is attached to a file on tmpfs, and loop1 is attached
> to a file on a lvm2 volume (reiser4, if that matters).
> 
> I can create and use the array with:
> 
> mdadm -C /dev/md1 -l1 -n2 /dev/loop0 /dev/loop1
> 
> and
> 
> mdadm -C /dev/md1 -l1 -n2 /dev/loop0 --write-mostly /dev/loop1
> 
> mdadm is compiled with:
> gcc (GCC) 4.0.3 20051023 (prerelease) (Debian 4.0.2-3)
> 
> Can/should I provide more info?
> 
> 	With kind regards, Sander
> 


* Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-16 22:20     ` Andrew Morton
@ 2005-11-16 23:08       ` Neil Brown
  2005-11-17  7:50         ` Sander
  2005-11-18 14:18       ` segfault mdadm --write-behind, 2.6.14-mm2 Vladimir V. Saveliev
  1 sibling, 1 reply; 19+ messages in thread
From: Neil Brown @ 2005-11-16 23:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: sander, linux-kernel, reiserfs-dev

On Wednesday November 16, akpm@osdl.org wrote:
> Sander <sander@humilis.net> wrote:
> >
> > 
> > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
> > try this:
> 
> It oopsed in reiser4.  reiserfs-dev added to Cc...
> 

Hmm... It appears that md/bitmap is calling prepare_write and
commit_write with 'file' as NULL - this works for some filesystems,
but not for reiser4.

Does this patch help?

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/bitmap.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff ./drivers/md/bitmap.c~current~ ./drivers/md/bitmap.c
--- ./drivers/md/bitmap.c~current~	2005-11-17 10:05:18.000000000 +1100
+++ ./drivers/md/bitmap.c	2005-11-17 10:05:40.000000000 +1100
@@ -326,9 +326,9 @@ static int write_page(struct bitmap *bit
 		}
 	}
 
-	ret = page->mapping->a_ops->prepare_write(NULL, page, 0, PAGE_SIZE);
+	ret = page->mapping->a_ops->prepare_write(bitmap->file, page, 0, PAGE_SIZE);
 	if (!ret)
-		ret = page->mapping->a_ops->commit_write(NULL, page, 0,
+		ret = page->mapping->a_ops->commit_write(bitmap->file, page, 0,
 			PAGE_SIZE);
 	if (ret) {
 		unlock_page(page);
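
The failure mode behind this fix can be modelled in userspace. This is a sketch under stated assumptions: `struct file_model`, `struct page_model` and the three functions are hypothetical and are not the kernel's address_space operations. The callback that needs `file` returns an error for NULL here so that the example is safe to run; in the report above the equivalent access ended in a NULL pointer dereference instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's file and page objects. */
struct file_model { int owner_id; };
struct page_model { int index; };

typedef int (*prepare_write_fn)(struct file_model *file,
				struct page_model *page);

/* A callback that never looks at `file`: passing NULL is harmless. */
int prepare_ignoring_file(struct file_model *file, struct page_model *page)
{
	(void)file;
	return page != NULL ? 0 : -1;
}

/* A callback that needs `file`: a NULL argument cannot be served. */
int prepare_using_file(struct file_model *file, struct page_model *page)
{
	if (file == NULL || page == NULL)
		return -1;		/* the model reports an error rather than crashing */
	return file->owner_id >= 0 ? 0 : -1;
}

/* Caller in the role of write_page(): forwards whatever `file` it is given. */
int write_page_model(prepare_write_fn prepare, struct file_model *file,
		     struct page_model *page)
{
	assert(prepare != NULL);
	return prepare(file, page);
}
```

Passing the real file handle, as the patch does with `bitmap->file`, satisfies both kinds of callback, whereas passing NULL only works with the first.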


* Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-16 23:08       ` Neil Brown
@ 2005-11-17  7:50         ` Sander
  2005-11-17 10:12           ` Sander
  0 siblings, 1 reply; 19+ messages in thread
From: Sander @ 2005-11-17  7:50 UTC (permalink / raw)
  To: Neil Brown; +Cc: Andrew Morton, sander, linux-kernel, reiserfs-dev

Neil Brown wrote (ao):
> On Wednesday November 16, akpm@osdl.org wrote:
> > Sander <sander@humilis.net> wrote:
> > > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
> > > try this:
> > 
> > It oopsed in reiser4.  reiserfs-dev added to Cc...
> > 
> 
> Hmm... It appears that md/bitmap is calling prepare_write and
> commit_write with 'file' as NULL - this works for some filesystems,
> but not for reiser4.
> 
> Does this patch help?

Something changed, but it doesn't seem to have fixed it:

# mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: RUN_ARRAY failed: No such file or directory

(Google didn't turn up this exact error, only a lot of hits
 without the 'No such file or directory' part.)

[42949645.530000] md: bind<loop0>
[42949645.540000] md: bind<loop1>
[42949645.540000] md: md1: raid array is not clean -- starting background reconstruction
[42949645.540000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
[42949645.540000] md1: bitmap file is out of date, doing full recovery
[42949645.560000] md1: bitmap initialized from disk: read 0/7 pages, set 0 bits, status: 1
[42949645.560000] md1: failed to create bitmap (1)
[42949645.560000] md: pers->run() failed ...
[42949645.560000] md: md1 stopped.
[42949645.560000] md: unbind<loop1>
[42949645.560000] md: export_rdev(loop1)
[42949645.560000] md: unbind<loop0>
[42949645.560000] md: export_rdev(loop0)

# ls -l /storage/raid1.bitmap
-rw-r--r-- 1 root root 25856 Nov 17 08:37 /storage/raid1.bitmap

(file is there, let's try again)

~# mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 08:37:58 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 08:37:58 2005
Continue creating array? yes
mdadm: bitmap file /storage/raid1.bitmap already exists, use --force to overwrite

(OK, try with a new bitmap file)

# mdadm -C /dev/md1 --bitmap=/storage/raid.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 08:37:58 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 08:37:58 2005
Continue creating array? yes
mdadm: RUN_ARRAY failed: No such file or directory

(doesn't work, let's force the first one)

# mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -f -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 08:40:50 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 08:40:50 2005
Continue creating array? yes
Segmentation fault


For some reason, the dmesg output is quite a bit longer now.

[42949831.700000] Bad page state at free_hot_cold_page (in process 'mdadm', page c1043220)
[42949831.700000] flags:0x80000001 mapping:00000000 mapcount:0 count:0
[42949831.700000] Backtrace:
[42949831.700000]  [<c013b320>] bad_page+0x70/0xb0
[42949831.700000]  [<c013bab1>] free_hot_cold_page+0x51/0xd0
[42949831.700000]  [<c013f5da>] truncate_inode_pages_range+0x11a/0x310
[42949831.700000]  [<c01a2ac0>] reiser4_invalidate_pages+0x90/0xc0
[42949831.700000]  [<c01ba5ed>] kill_hook_extent+0x17d/0x5b0
[42949831.700000]  [<c01ac29c>] plugin_by_unsafe_id+0x1c/0x110
[42949831.700000]  [<c01ba470>] kill_hook_extent+0x0/0x5b0
[42949831.700000]  [<c01cd7fd>] call_kill_hooks+0x9d/0xc0
[42949831.700000]  [<c01cd8f0>] kill_head+0x0/0x40
[42949831.700000]  [<c01cdf76>] prepare_for_compact+0x536/0x540
[42949831.700000]  [<c0192a0e>] lock_tail+0x1e/0x40
[42949831.700000]  [<c01ac29c>] plugin_by_unsafe_id+0x1c/0x110
[42949831.700000]  [<c01cd820>] kill_units+0x0/0x80
[42949831.700000]  [<c01cd8f0>] kill_head+0x0/0x40
[42949831.700000]  [<c0192933>] longterm_unlock_znode+0xa3/0x160
[42949831.700000]  [<c0192bf3>] longterm_lock_znode+0x163/0x250
[42949831.700000]  [<c018ce4b>] jload_gfp+0x5b/0x140
[42949831.700000]  [<c01cdfb1>] kill_node40+0x31/0xc0
[42949831.700000]  [<c0191a88>] carry_cut+0x48/0x60
[42949831.700000]  [<c018f458>] carry_on_level+0x38/0xc0
[42949831.700000]  [<c018f302>] carry+0x82/0x1a0
[42949831.700000]  [<c018f704>] add_carry+0x24/0x40
[42949831.700000]  [<c018f51d>] post_carry+0x3d/0xa0
[42949831.710000]  [<c0194886>] kill_node_content+0xf6/0x160
[42949831.710000]  [<c0194e39>] cut_tree_worker_common+0x159/0x350
[42949831.710000]  [<c0194ce0>] cut_tree_worker_common+0x0/0x350
[42949831.710000]  [<c0195155>] cut_tree_object+0x125/0x240
[42949831.710000]  [<c0196d29>] reiser4_grab_reserved+0x49/0x190
[42949831.710000]  [<c018d04f>] jrelse+0xf/0x20
[42949831.710000]  [<c01bfc81>] cut_file_items+0xb1/0x180
[42949831.710000]  [<c01a0108>] add_empty_leaf+0xa8/0x220
[42949831.710000]  [<c01bfdab>] shorten_file+0x4b/0x260
[42949831.710000]  [<c01bfb40>] update_file_size+0x0/0x90
[42949831.710000]  [<c01c2f03>] setattr_truncate+0x73/0x210
[42949831.710000]  [<c01ad384>] permission_common+0x24/0x40
[42949831.710000]  [<c01ad360>] permission_common+0x0/0x40
[42949831.710000]  [<c0162b78>] permission+0x48/0x90
[42949831.710000]  [<c0163119>] __link_path_walk+0x89/0xc40
[42949831.710000]  [<c01c30fe>] setattr_unix_file+0x5e/0xc0
[42949831.710000]  [<c016f58f>] notify_change+0xcf/0x2d5
[42949831.710000]  [<c0163d3f>] link_path_walk+0x6f/0xe0
[42949831.710000]  [<c0153e9b>] do_truncate+0x4b/0x70
[42949831.710000]  [<c0162b78>] permission+0x48/0x90
[42949831.710000]  [<c0164704>] may_open+0x184/0x1d0
[42949831.710000]  [<c01647d5>] open_namei+0x85/0x560
[42949831.710000]  [<c0154fe2>] filp_open+0x22/0x50
[42949831.710000]  [<c01551ad>] get_unused_fd+0x4d/0xb0
[42949831.710000]  [<c01552c1>] do_sys_open+0x41/0xd0
[42949831.710000]  [<c0102f49>] syscall_call+0x7/0xb
[42949831.710000] Trying to fix it up, but a reboot is needed
[42949831.710000] ------------[ cut here ]------------
[42949831.710000] kernel BUG at mm/filemap.c:480!
[42949831.710000] invalid operand: 0000 [#1]
[42949831.710000] last sysfs file: /devices/pci0000:00/0000:00:11.0/i2c-0/name
[42949831.710000] Modules linked in: loop dm_mod i2c_viapro i2c_core
[42949831.710000] CPU:    0
[42949831.710000] EIP:    0060:[<c013763d>]    Tainted: G    B VLI
[42949831.710000] EFLAGS: 00010246   (2.6.14-mm2) 
[42949831.710000] EIP is at unlock_page+0xd/0x30
[42949831.710000] eax: 00000000   ebx: c1043220   ecx: c03cad30   edx: c1652218
[42949831.710000] esi: 00000001   edi: 00000000   ebp: 00000006   esp: c26c298c
[42949831.710000] ds: 007b   es: 007b   ss: 0068
[42949831.710000] Process mdadm (pid: 785, threadinfo=c26c2000 task=c6f64050)
[42949831.710000] Stack: c1043220 c013f5e1 0000000e 00007000 f2fb87ec 00000000 00000000 00000007 
[42949831.710000]        00000000 c1043220 c1045260 c1040240 c1040260 c1042820 c1042800 c10415e0 
[42949831.710000]        00007000 00000000 00000000 00000000 00000006 f2fb8810 00000001 00006fff 
[42949831.710000] Call Trace:
[42949831.710000]  [<c013f5e1>] truncate_inode_pages_range+0x121/0x310
[42949831.710000]  [<c01a2ac0>] reiser4_invalidate_pages+0x90/0xc0
[42949831.710000]  [<c01ba5ed>] kill_hook_extent+0x17d/0x5b0
[42949831.710000]  [<c01ac29c>] plugin_by_unsafe_id+0x1c/0x110
[42949831.710000]  [<c01ba470>] kill_hook_extent+0x0/0x5b0
[42949831.710000]  [<c01cd7fd>] call_kill_hooks+0x9d/0xc0
[42949831.710000]  [<c01cd8f0>] kill_head+0x0/0x40
[42949831.710000]  [<c01cdf76>] prepare_for_compact+0x536/0x540
[42949831.710000]  [<c0192a0e>] lock_tail+0x1e/0x40
[42949831.710000]  [<c01ac29c>] plugin_by_unsafe_id+0x1c/0x110
[42949831.710000]  [<c01cd820>] kill_units+0x0/0x80
[42949831.710000]  [<c01cd8f0>] kill_head+0x0/0x40
[42949831.710000]  [<c0192933>] longterm_unlock_znode+0xa3/0x160
[42949831.710000]  [<c0192bf3>] longterm_lock_znode+0x163/0x250
[42949831.710000]  [<c018ce4b>] jload_gfp+0x5b/0x140
[42949831.710000]  [<c01cdfb1>] kill_node40+0x31/0xc0
[42949831.710000]  [<c0191a88>] carry_cut+0x48/0x60
[42949831.710000]  [<c018f458>] carry_on_level+0x38/0xc0
[42949831.710000]  [<c018f302>] carry+0x82/0x1a0
[42949831.710000]  [<c018f704>] add_carry+0x24/0x40
[42949831.710000]  [<c018f51d>] post_carry+0x3d/0xa0
[42949831.710000]  [<c0194886>] kill_node_content+0xf6/0x160
[42949831.710000]  [<c0194e39>] cut_tree_worker_common+0x159/0x350
[42949831.710000]  [<c0194ce0>] cut_tree_worker_common+0x0/0x350
[42949831.710000]  [<c0195155>] cut_tree_object+0x125/0x240
[42949831.710000]  [<c0196d29>] reiser4_grab_reserved+0x49/0x190
[42949831.710000]  [<c018d04f>] jrelse+0xf/0x20
[42949831.710000]  [<c01bfc81>] cut_file_items+0xb1/0x180
[42949831.710000]  [<c01a0108>] add_empty_leaf+0xa8/0x220
[42949831.710000]  [<c01bfdab>] shorten_file+0x4b/0x260
[42949831.710000]  [<c01bfb40>] update_file_size+0x0/0x90
[42949831.710000]  [<c01c2f03>] setattr_truncate+0x73/0x210
[42949831.710000]  [<c01ad384>] permission_common+0x24/0x40
[42949831.710000]  [<c01ad360>] permission_common+0x0/0x40
[42949831.710000]  [<c0162b78>] permission+0x48/0x90
[42949831.710000]  [<c0163119>] __link_path_walk+0x89/0xc40
[42949831.710000]  [<c01c30fe>] setattr_unix_file+0x5e/0xc0
[42949831.710000]  [<c016f58f>] notify_change+0xcf/0x2d5
[42949831.710000]  [<c0163d3f>] link_path_walk+0x6f/0xe0
[42949831.710000]  [<c0153e9b>] do_truncate+0x4b/0x70
[42949831.710000]  [<c0162b78>] permission+0x48/0x90
[42949831.710000]  [<c0164704>] may_open+0x184/0x1d0
[42949831.710000]  [<c01647d5>] open_namei+0x85/0x560
[42949831.710000]  [<c0154fe2>] filp_open+0x22/0x50
[42949831.710000]  [<c01551ad>] get_unused_fd+0x4d/0xb0
[42949831.710000]  [<c01552c1>] do_sys_open+0x41/0xd0
[42949831.710000]  [<c0102f49>] syscall_call+0x7/0xb
[42949831.710000] Code: e8 69 ff ff ff 89 da b9 20 6f 13 c0 c7 04 24 02 00 00 00 e8 e6 77 22 00 83 c4 20 5b c3 90 53 89 c3 0f ba 30 00 19 c0 85 c0 75 08 <0f> 0b e0 01 f8 6a 38 c0 89 d8 e8 34 ff ff ff 89 da 31 c9 5b e9 
[42949831.710000]  

-- 
Humilis IT Services and Solutions
http://www.humilis.net


* Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-17  7:50         ` Sander
@ 2005-11-17 10:12           ` Sander
  2005-11-17 10:15             ` Sander
  0 siblings, 1 reply; 19+ messages in thread
From: Sander @ 2005-11-17 10:12 UTC (permalink / raw)
  To: Sander; +Cc: Neil Brown, Andrew Morton, linux-kernel, reiserfs-dev

Sander wrote (ao):
> Neil Brown wrote (ao):
> > On Wednesday November 16, akpm@osdl.org wrote:
> > > Sander <sander@humilis.net> wrote:
> > > > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
> > > > try this:
> > > 
> > > It oopsed in reiser4.  reiserfs-dev added to Cc...
> > > 
> > 
> > Hmm... It appears that md/bitmap is calling prepare_write and
> > commit_write with 'file' as NULL - this works for some filesystems,
> > but not for reiser4.
> > 
> > Does this patch help?
> 
> Something changed, but it doesn't seem to have fixed it:
> 
> # mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
> mdadm: RUN_ARRAY failed: No such file or directory

FWIW, the following happens when I point --bitmap to /tmp/raid1.bitmap,
which is on tmpfs. It also happens when I attach both loop0 and loop1 to
files on tmpfs.

This would suggest that reiser4 is not solely at fault?

The difference, by the way, is that I can reboot with 'shutdown -r now'
instead of sysrq, and that mdadm hangs:

# mdadm -C /dev/md1 --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: RUN_ARRAY failed: No such file or directory

# mdadm -C /dev/md1 -f --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
Continue creating array? yes
[hang, no prompt, no reaction to ctrl-c, etc]


[42949549.780000] md: bind<loop0>
[42949549.780000] md: bind<loop1>
[42949549.780000] md: md1: raid array is not clean -- starting background reconstruction
[42949549.790000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
[42949549.790000] md1: bitmap file is out of date, doing full recovery
[42949549.790000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 524288
[42949549.790000] Bad page state at free_hot_cold_page (in process 'mdadm', page c10dcc20)
[42949549.790000] flags:0x80000019 mapping:f5155c84 mapcount:0 count:0
[42949549.790000] Backtrace:
[42949549.790000]  [<c013b320>] bad_page+0x70/0xb0
[42949549.790000]  [<c013bab1>] free_hot_cold_page+0x51/0xd0
[42949549.790000]  [<c02b0a90>] bitmap_file_put+0x30/0x70
[42949549.790000]  [<c02b1f8e>] bitmap_free+0x1e/0xb0
[42949549.790000]  [<c02b2126>] bitmap_create+0xd6/0x2a0
[42949549.790000]  [<c02ab95a>] do_md_run+0x2ba/0x500
[42949549.790000]  [<c02ac8a7>] add_new_disk+0x157/0x3b0
[42949549.790000]  [<c0179034>] mpage_writepages+0x124/0x3d0
[42949549.790000]  [<c013c23e>] __pagevec_free+0x3e/0x60
[42949549.790000]  [<c013eff9>] release_pages+0x29/0x160
[42949549.790000]  [<c02adb81>] md_ioctl+0x5a1/0x630
[42949549.790000]  [<c0137918>] find_get_pages+0x18/0x40
[42949549.790000]  [<c02ad5e0>] md_ioctl+0x0/0x630
[42949549.790000]  [<c01ede74>] blkdev_driver_ioctl+0x54/0x60
[42949549.790000]  [<c01edfb4>] blkdev_ioctl+0x134/0x180
[42949549.790000]  [<c015e158>] block_ioctl+0x18/0x20
[42949549.790000]  [<c015e140>] block_ioctl+0x0/0x20
[42949549.790000]  [<c01674ff>] do_ioctl+0x1f/0x70
[42949549.790000]  [<c016769c>] vfs_ioctl+0x5c/0x1e0
[42949549.790000]  [<c0156c91>] __fput+0xe1/0x140
[42949549.790000]  [<c016785d>] sys_ioctl+0x3d/0x70
[42949549.790000]  [<c0102f49>] syscall_call+0x7/0xb
[42949549.790000] Trying to fix it up, but a reboot is needed
[42949549.790000] md1: failed to create bitmap (524288)
[42949549.790000] md: pers->run() failed ...
[42949549.790000] md: md1 stopped.
[42949549.790000] md: unbind<loop1>
[42949549.790000] md: export_rdev(loop1)
[42949549.790000] md: unbind<loop0>
[42949549.790000] md: export_rdev(loop0)


-- 
Humilis IT Services and Solutions
http://www.humilis.net

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-17 10:12           ` Sander
@ 2005-11-17 10:15             ` Sander
  2005-11-21 23:07               ` Please help me understand ->writepage. Was " Neil Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Sander @ 2005-11-17 10:15 UTC (permalink / raw)
  To: Sander; +Cc: Neil Brown, Andrew Morton, linux-kernel, reiserfs-dev

Sander wrote (ao):
# Sander wrote (ao):
# # Neil Brown wrote (ao):
# # > On Wednesday November 16, akpm@osdl.org wrote:
# # > > Sander <sander@humilis.net> wrote:
# # > > > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
# # > > > try this:
# # > > 
# # > > It oopsed in reiser4.  reiserfs-dev added to Cc...
# # > > 
# # > 
# # > Hmm... It appears that md/bitmap is calling prepare_write and
# # > commit_write with 'file' as NULL - this works for some filesystems,
# # > but not for reiser4.
# # > 
# # > Does this patch help?
# # 
# # Something changed, but it didn't fix it, it seems:
# # 
# # # mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# # mdadm: RUN_ARRAY failed: No such file or directory
# 
# FWIW, the following happens when I point --bitmap to /tmp/raid1.bitmap
# which is tmpfs, and also happens when I attach both loop0 and loop1 to
# files on tmpfs.
# 
# This would suggest that reiser4 is not solely at fault?
# 
# The difference btw is that I can reboot with 'shutdown -r now'
# instead of sysrq. And that mdadm hangs:
# 
# # mdadm -C /dev/md1 --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# mdadm: RUN_ARRAY failed: No such file or directory
# 
# # mdadm -C /dev/md1 -f --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# mdadm: /dev/loop0 appears to be part of a raid array:
#     level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
# mdadm: /dev/loop1 appears to be part of a raid array:
#     level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
# Continue creating array? yes
# [hang, no prompt, no reaction to ctrl-c, etc]

And even more info. It seems mdadm spins:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                             
  749 root      25   0  1696  568  492 R 99.9  0.1   8:32.50 mdadm

Would sysrq-t be useful?


# [42949549.780000] md: bind<loop0>
# [42949549.780000] md: bind<loop1>
# [42949549.780000] md: md1: raid array is not clean -- starting background reconstruction
# [42949549.790000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
# [42949549.790000] md1: bitmap file is out of date, doing full recovery
# [42949549.790000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 524288
# [42949549.790000] Bad page state at free_hot_cold_page (in process 'mdadm', page c10dcc20)
# [42949549.790000] flags:0x80000019 mapping:f5155c84 mapcount:0 count:0
# [42949549.790000] Backtrace:
# [42949549.790000]  [<c013b320>] bad_page+0x70/0xb0
# [42949549.790000]  [<c013bab1>] free_hot_cold_page+0x51/0xd0
# [42949549.790000]  [<c02b0a90>] bitmap_file_put+0x30/0x70
# [42949549.790000]  [<c02b1f8e>] bitmap_free+0x1e/0xb0
# [42949549.790000]  [<c02b2126>] bitmap_create+0xd6/0x2a0
# [42949549.790000]  [<c02ab95a>] do_md_run+0x2ba/0x500
# [42949549.790000]  [<c02ac8a7>] add_new_disk+0x157/0x3b0
# [42949549.790000]  [<c0179034>] mpage_writepages+0x124/0x3d0
# [42949549.790000]  [<c013c23e>] __pagevec_free+0x3e/0x60
# [42949549.790000]  [<c013eff9>] release_pages+0x29/0x160
# [42949549.790000]  [<c02adb81>] md_ioctl+0x5a1/0x630
# [42949549.790000]  [<c0137918>] find_get_pages+0x18/0x40
# [42949549.790000]  [<c02ad5e0>] md_ioctl+0x0/0x630
# [42949549.790000]  [<c01ede74>] blkdev_driver_ioctl+0x54/0x60
# [42949549.790000]  [<c01edfb4>] blkdev_ioctl+0x134/0x180
# [42949549.790000]  [<c015e158>] block_ioctl+0x18/0x20
# [42949549.790000]  [<c015e140>] block_ioctl+0x0/0x20
# [42949549.790000]  [<c01674ff>] do_ioctl+0x1f/0x70
# [42949549.790000]  [<c016769c>] vfs_ioctl+0x5c/0x1e0
# [42949549.790000]  [<c0156c91>] __fput+0xe1/0x140
# [42949549.790000]  [<c016785d>] sys_ioctl+0x3d/0x70
# [42949549.790000]  [<c0102f49>] syscall_call+0x7/0xb
# [42949549.790000] Trying to fix it up, but a reboot is needed
# [42949549.790000] md1: failed to create bitmap (524288)
# [42949549.790000] md: pers->run() failed ...
# [42949549.790000] md: md1 stopped.
# [42949549.790000] md: unbind<loop1>
# [42949549.790000] md: export_rdev(loop1)
# [42949549.790000] md: unbind<loop0>
# [42949549.790000] md: export_rdev(loop0)

-- 
Humilis IT Services and Solutions
http://www.humilis.net

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: segfault mdadm --write-behind, 2.6.14-mm2
  2005-11-16 22:20     ` Andrew Morton
  2005-11-16 23:08       ` Neil Brown
@ 2005-11-18 14:18       ` Vladimir V. Saveliev
  1 sibling, 0 replies; 19+ messages in thread
From: Vladimir V. Saveliev @ 2005-11-18 14:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: sander, neilb, linux-kernel, reiserfs-dev

Hello

Andrew Morton wrote:
> Sander <sander@humilis.net> wrote:
>>Neil Brown wrote (ao):
>>> If you use mdadm-2.0 and mark a device as --write-mostly, then all
>>> read requests will go to the other device(s) if possible.
>>> e.g.
>>>   mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/ramdisk \
>>>      --write-mostly /dev/realdisk
>>>
>>> Does this suit your needs?
>>>
>>> You can also arrange for the write to the writemostly device to be
>>> 'write-behind' so that the filesystem doesn't wait for the write to
>>> complete.  This can reduce write-latency (though not increase write
>>> throughput) at a very small cost of reliability (if the RAM dies, the
>>> disk may not be 100% up-to-date).
>>With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
>>try this:
> 
> It oopsed in reiser4.  reiserfs-dev added to Cc...
> 
>>mdadm -C /dev/md1 -l1 -n2 --bitmap=/storage/md1.bitmap /dev/loop0 \
>>--write-behind /dev/loop1
>>
>>loop0 is attached to a file on tmpfs, and loop1 is attached
>>to a file on a lvm2 volume (reiser4, if that matters).
>>

I tried ext2 on lvm2 and that did not help.
So, for now I would assume that the problem is not in reiser4 but somewhere else.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-17 10:15             ` Sander
@ 2005-11-21 23:07               ` Neil Brown
  2005-11-21 23:30                 ` Jeff Garzik
  2005-11-21 23:51                 ` Andrew Morton
  0 siblings, 2 replies; 19+ messages in thread
From: Neil Brown @ 2005-11-21 23:07 UTC (permalink / raw)
  To: sander; +Cc: Andrew Morton, linux-kernel, reiserfs-dev

On Thursday November 17, sander@humilis.net wrote:
> Sander wrote (ao):
> # Sander wrote (ao):
> # # Neil Brown wrote (ao):
> # # > On Wednesday November 16, akpm@osdl.org wrote:
> # # > > Sander <sander@humilis.net> wrote:
> # # > > > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
> # # > > > try this:
> # # > > 
> # # > > It oopsed in reiser4.  reiserfs-dev added to Cc...
> # # > > 
> # # > 
> # # > Hmm... It appears that md/bitmap is calling prepare_write and
> # # > commit_write with 'file' as NULL - this works for some filesystems,
> # # > but not for reiser4.
> # # > 
> # # > Does this patch help?
> # # 
> # # Something changed, but it didn't fix it, it seems:
> # # 
> # # # mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
> # # mdadm: RUN_ARRAY failed: No such file or directory
> # 
> # FWIW, the following happens when I point --bitmap to /tmp/raid1.bitmap
> # which is tmpfs, and also happens when I attach both loop0 and loop1 to
> # files on tmpfs.
> # 
> # This would suggest that reiser4 is not solely at fault?
> # 

No, there is something very wrong in md/bitmap.c's handling of writing
to a file.  It was developed for, and tested on, ext3 and doesn't seem
to work anywhere else.... and I don't understand enough to fix it.

Help ???

What md/bitmap wants to do is effectively memory map the file, make
updates to pages occasionally, flush those pages out to storage, and
wait for the flush to complete.  It doesn't exactly memory map.  It
just reads all the pages and keeps them in an array (holding a
reference to each).

To write the pages out it effectively does ->prepare_write,
->commit_write, and then ->writepage.
I'm not sure that prepare/commit is needed, but they don't seem to be
the problem.  writepage is.

For tmpfs at least, writepage disconnects the page from the pagecache
(via move_to_swap_cache), so the page that we are holding is no longer
part of the file and, significantly, page->mapping becomes NULL.
This suggests that the ->writepage usage is broken.
However I tried to see what 'msync' does for real memory mapped files,
and it eventually calls ->writepage too.  So how does that work??
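
As a rough illustration of the failure described above, here is a small
userspace model (not kernel code). Every name in it (toy_mapping, toy_page,
toy_writepage, toy_page_still_in_file) is made up for this sketch; it only
assumes that a tmpfs-like ->writepage may detach the page from its file, so
a holder of a cached page pointer has to re-check the mapping afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; illustration only. */
struct toy_mapping {
	int detaches_on_writepage;	/* 1 models tmpfs moving the page to swap cache */
};

struct toy_page {
	struct toy_mapping *mapping;	/* NULL once the page has left the file */
	int dirty;
};

/* Model of ->writepage: a tmpfs-like mapping disconnects the page. */
static int toy_writepage(struct toy_page *page)
{
	assert(page->mapping != NULL);
	if (page->mapping->detaches_on_writepage)
		page->mapping = NULL;
	page->dirty = 0;
	return 0;
}

/* A caller that keeps the page pointer must check this before reusing it. */
static int toy_page_still_in_file(const struct toy_page *page)
{
	return page->mapping != NULL;
}
```

With a "tmpfs-like" mapping the page pointer held by the caller is stale
after one toy_writepage() call, while with an "ext3-like" mapping it stays
attached, which matches the behaviour seen in the reports.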

Any advice would be most welcome!

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-21 23:07               ` Please help me understand ->writepage. Was " Neil Brown
@ 2005-11-21 23:30                 ` Jeff Garzik
  2005-11-21 23:51                 ` Andrew Morton
  1 sibling, 0 replies; 19+ messages in thread
From: Jeff Garzik @ 2005-11-21 23:30 UTC (permalink / raw)
  To: Neil Brown; +Cc: sander, Andrew Morton, linux-kernel, reiserfs-dev

On Tue, Nov 22, 2005 at 10:07:41AM +1100, Neil Brown wrote:
> To write the pages out it effectively does ->prepare_write,
> ->commit_write, and then ->writepage.
> I'm not sure that prepare/commit is needed, but they don't seem to be
> the problem.  writepage is.

That's a bit weird.  Typically you have two separate callpaths,
non-page-aligned (prepare_write + commit_write) or writepage(s).
Not both.

	Jeff




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-21 23:07               ` Please help me understand ->writepage. Was " Neil Brown
  2005-11-21 23:30                 ` Jeff Garzik
@ 2005-11-21 23:51                 ` Andrew Morton
  2005-11-22  3:12                   ` Neil Brown
  1 sibling, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2005-11-21 23:51 UTC (permalink / raw)
  To: Neil Brown; +Cc: sander, linux-kernel, reiserfs-dev

Neil Brown <neilb@suse.de> wrote:
>
> Help ???

Indeed.  tmpfs is crackpottery.

>  What md/bitmap wants to do is effectively memory map the file, make
>  updates to pages occasionally, flush those pages out to storage, and
>  wait for the flush to complete.  It doesn't exactly memory map.  It
>  just reads all the pages and keeps them in an array (holding a
>  reference to each).
> 
>  To write the pages out it effectively does ->prepare_write,
>  ->commit_write, and then ->writepage.
>  I'm not sure that prepare/commit is needed, but they don't seem to be
>  the problem.  writepage is.
> 
>  For tmpfs at least, writepage disconnects the page from the pagecache
>  (via move_to_swap_cache), so the page that we are holding is no longer
>  part of the file and, significantly, page->mapping becomes NULL.
>  This suggests that the ->writepage usage is broken.
>  However I tried to see what 'msync' does for real memory mapped files,
>  and it eventually calls ->writepage too.  So how does that work??
> 
>  Any advice would be most welcome!

Skip the writepage if !mapping_cap_writeback_dirty(page->mapping), I guess.
Or, if appropriate, just sync the file.  Use filemap_fdatawrite() or even
refactor do_fsync() and use most of that.
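
A minimal userspace sketch of the "skip the writepage" idea, assuming a
per-mapping capability flag in the spirit of mapping_cap_writeback_dirty().
The flag value and all toy_* names are hypothetical and are not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP_NO_WRITEBACK 0x1u	/* hypothetical flag: no backing store to write to */

struct toy_mapping {
	unsigned int capabilities;
	int writepage_calls;	/* counts how often writeback was attempted */
};

struct toy_page {
	struct toy_mapping *mapping;
};

/* Analogue of mapping_cap_writeback_dirty(): can dirty pages be written back? */
static int toy_cap_writeback_dirty(const struct toy_mapping *m)
{
	return !(m->capabilities & TOY_CAP_NO_WRITEBACK);
}

/* Returns 1 if writeback was issued, 0 if it was skipped. */
static int toy_write_page(struct toy_page *page)
{
	assert(page != NULL && page->mapping != NULL);
	if (!toy_cap_writeback_dirty(page->mapping))
		return 0;	/* e.g. a RAM-backed file: nothing to flush */
	page->mapping->writepage_calls++;
	return 1;
}
```

The check sits in the caller rather than in the writeback routine itself, so
the page is never handed to a ->writepage that would detach it.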

Also, write_page() doesn't need to run set_page_dirty(); ->commit_write()
will do that.

Several kmap()s in there which can become kmap_atomic().

bitmap_init_from_disk() might be leaking bitmap->filemap on kmalloc-failed
error path.

bitmap->filemap_attr can be allocated with kzalloc() now.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-21 23:51                 ` Andrew Morton
@ 2005-11-22  3:12                   ` Neil Brown
  2005-11-22  3:47                     ` Andrew Morton
                                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Neil Brown @ 2005-11-22  3:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: sander, linux-kernel, reiserfs-dev

On Monday November 21, akpm@osdl.org wrote:
> Neil Brown <neilb@suse.de> wrote:
> >
> > Help ???
> 
> Indeed.  tmpfs is crackpottery.

Ok, that explains a lot... :-)

> > 
> >  Any advice would be most welcome!
> 
> Skip the writepage if !mapping_cap_writeback_dirty(page->mapping), I guess.
> Or, if appropriate, just sync the file.  Use filemap_fdatawrite() or even
> refactor do_fsync() and use most of that.

Uhm, what would you think of testing mapping_cap_writeback_dirty in
write_one_page??  If you don't like it, I can take it into write_page.

> 
> Also, write_page() doesn't need to run set_page_dirty(); ->commit_write()
> will do that.

Ok.... but I think I'm dropping prepare_write / commit_write.

> 
> Several kmap()s in there which can become kmap_atomic().

I've made them all kmap_atomic.

> 
> bitmap_init_from_disk() might be leaking bitmap->filemap on kmalloc-failed
> error path.

It looks that way, but actually not.  bitmap_create requires that
bitmap_destroy always be called afterwards, even on an error.  Not the
best interface, I'd agree...

> 
> bitmap->filemap_attr can be allocated with kzalloc() now.
Yes, thanks.

So Sander, could you try this patch for main against reiser4?  It
seems to work on ext3 and tmpfs and has some chance of not mucking up
on reiser4.

Thanks,
NeilBrown


===File /home/src/mm/.patches/applied/014MdBitmapFix========
Status: devel

Hopefully make md/bitmaps work on filesystems other than ext3



Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/bitmap.c |   64 +++++++++++++++++++-------------------------------
 ./mm/page-writeback.c |    4 +++
 2 files changed, 29 insertions(+), 39 deletions(-)

diff ./drivers/md/bitmap.c~current~ ./drivers/md/bitmap.c
--- ./drivers/md/bitmap.c~current~	2005-11-22 14:06:53.000000000 +1100
+++ ./drivers/md/bitmap.c	2005-11-22 14:07:05.000000000 +1100
@@ -310,7 +310,6 @@ static int write_sb_page(mddev_t *mddev,
  */
 static int write_page(struct bitmap *bitmap, struct page *page, int wait)
 {
-	int ret = -ENOMEM;
 
 	if (bitmap->file == NULL)
 		return write_sb_page(bitmap->mddev, bitmap->offset, page, wait);
@@ -326,15 +325,6 @@ static int write_page(struct bitmap *bit
 		}
 	}
 
-	ret = page->mapping->a_ops->prepare_write(bitmap->file, page, 0, PAGE_SIZE);
-	if (!ret)
-		ret = page->mapping->a_ops->commit_write(bitmap->file, page, 0,
-			PAGE_SIZE);
-	if (ret) {
-		unlock_page(page);
-		return ret;
-	}
-
 	set_page_dirty(page); /* force it to be written out */
 
 	if (!wait) {
@@ -406,11 +396,11 @@ int bitmap_update_sb(struct bitmap *bitm
 		return 0;
 	}
 	spin_unlock_irqrestore(&bitmap->lock, flags);
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 	sb->events = cpu_to_le64(bitmap->mddev->events);
 	if (!bitmap->mddev->degraded)
 		sb->events_cleared = cpu_to_le64(bitmap->mddev->events);
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 	return write_page(bitmap, bitmap->sb_page, 1);
 }
 
@@ -421,7 +411,7 @@ void bitmap_print_sb(struct bitmap *bitm
 
 	if (!bitmap || !bitmap->sb_page)
 		return;
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 	printk(KERN_DEBUG "%s: bitmap file superblock:\n", bmname(bitmap));
 	printk(KERN_DEBUG "         magic: %08x\n", le32_to_cpu(sb->magic));
 	printk(KERN_DEBUG "       version: %d\n", le32_to_cpu(sb->version));
@@ -440,7 +430,7 @@ void bitmap_print_sb(struct bitmap *bitm
 	printk(KERN_DEBUG "     sync size: %llu KB\n",
 			(unsigned long long)le64_to_cpu(sb->sync_size)/2);
 	printk(KERN_DEBUG "max write behind: %d\n", le32_to_cpu(sb->write_behind));
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 }
 
 /* read the superblock from the bitmap file and initialize some bitmap fields */
@@ -466,7 +456,7 @@ static int bitmap_read_sb(struct bitmap 
 		return err;
 	}
 
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 
 	if (bytes_read < sizeof(*sb)) { /* short read */
 		printk(KERN_INFO "%s: bitmap file superblock truncated\n",
@@ -535,7 +525,7 @@ success:
 		bitmap->events_cleared = bitmap->mddev->events;
 	err = 0;
 out:
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 	if (err)
 		bitmap_print_sb(bitmap);
 	return err;
@@ -560,7 +550,7 @@ static void bitmap_mask_state(struct bit
 	}
 	page_cache_get(bitmap->sb_page);
 	spin_unlock_irqrestore(&bitmap->lock, flags);
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 	switch (op) {
 		case MASK_SET: sb->state |= bits;
 				break;
@@ -568,7 +558,7 @@ static void bitmap_mask_state(struct bit
 				break;
 		default: BUG();
 	}
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 	page_cache_release(bitmap->sb_page);
 }
 
@@ -621,8 +611,7 @@ static void bitmap_file_unmap(struct bit
 	spin_unlock_irqrestore(&bitmap->lock, flags);
 
 	while (pages--)
-		if (map[pages]->index != 0) /* 0 is sb_page, release it below */
-			page_cache_release(map[pages]);
+		page_cache_release(map[pages]);
 	kfree(map);
 	kfree(attr);
 
@@ -771,7 +760,7 @@ static void bitmap_file_set_bit(struct b
 		set_bit(bit, kaddr);
 	else
 		ext2_set_bit(bit, kaddr);
-	kunmap_atomic(kaddr, KM_USER0);
+	kunmap_atomic(page, KM_USER0);
 	PRINTK("set file bit %lu page %lu\n", bit, page->index);
 
 	/* record page number so it gets flushed to disk when unplug occurs */
@@ -854,6 +843,7 @@ static int bitmap_init_from_disk(struct 
 	unsigned long bytes, offset, dummy;
 	int outofdate;
 	int ret = -ENOSPC;
+	void *paddr;
 
 	chunks = bitmap->chunks;
 	file = bitmap->file;
@@ -887,12 +877,10 @@ static int bitmap_init_from_disk(struct 
 	if (!bitmap->filemap)
 		goto out;
 
-	bitmap->filemap_attr = kmalloc(sizeof(long) * num_pages, GFP_KERNEL);
+	bitmap->filemap_attr = kzalloc(sizeof(long) * num_pages, GFP_KERNEL);
 	if (!bitmap->filemap_attr)
 		goto out;
 
-	memset(bitmap->filemap_attr, 0, sizeof(long) * num_pages);
-
 	oldindex = ~0L;
 
 	for (i = 0; i < chunks; i++) {
@@ -901,8 +889,6 @@ static int bitmap_init_from_disk(struct 
 		bit = file_page_offset(i);
 		if (index != oldindex) { /* this is a new page, read it in */
 			/* unmap the old page, we're done with it */
-			if (oldpage != NULL)
-				kunmap(oldpage);
 			if (index == 0) {
 				/*
 				 * if we're here then the superblock page
@@ -910,6 +896,7 @@ static int bitmap_init_from_disk(struct 
 				 * we've already read it in, so just use it
 				 */
 				page = bitmap->sb_page;
+				page_cache_get(page);
 				offset = sizeof(bitmap_super_t);
 			} else if (file) {
 				page = read_page(file, index, &dummy);
@@ -925,18 +912,18 @@ static int bitmap_init_from_disk(struct 
 
 			oldindex = index;
 			oldpage = page;
-			kmap(page);
 
 			if (outofdate) {
 				/*
 				 * if bitmap is out of date, dirty the
 			 	 * whole page and write it out
 				 */
-				memset(page_address(page) + offset, 0xff,
+				paddr = kmap_atomic(page, KM_USER0);
+				memset(paddr + offset, 0xff,
 				       PAGE_SIZE - offset);
+				kunmap_atomic(paddr, KM_USER0);
 				ret = write_page(bitmap, page, 1);
 				if (ret) {
-					kunmap(page);
 					/* release, page not in filemap yet */
 					page_cache_release(page);
 					goto out;
@@ -945,10 +932,12 @@ static int bitmap_init_from_disk(struct 
 
 			bitmap->filemap[bitmap->file_pages++] = page;
 		}
+		paddr = kmap_atomic(page, KM_USER0);
 		if (bitmap->flags & BITMAP_HOSTENDIAN)
-			b = test_bit(bit, page_address(page));
+			b = test_bit(bit, paddr);
 		else
-			b = ext2_test_bit(bit, page_address(page));
+			b = ext2_test_bit(bit, paddr);
+		kunmap_atomic(paddr, KM_USER0);
 		if (b) {
 			/* if the disk bit is set, set the memory bit */
 			bitmap_set_memory_bits(bitmap, i << CHUNK_BLOCK_SHIFT(bitmap),
@@ -963,9 +952,6 @@ static int bitmap_init_from_disk(struct 
 	ret = 0;
 	bitmap_mask_state(bitmap, BITMAP_STALE, MASK_UNSET);
 
-	if (page) /* unmap the last page */
-		kunmap(page);
-
 	if (bit_cnt) { /* Kick recovery if any bits were set */
 		set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery);
 		md_wakeup_thread(bitmap->mddev->thread);
@@ -1021,6 +1007,7 @@ int bitmap_daemon_work(struct bitmap *bi
 	int err = 0;
 	int blocks;
 	int attr;
+	void *paddr;
 
 	if (bitmap == NULL)
 		return 0;
@@ -1077,14 +1064,12 @@ int bitmap_daemon_work(struct bitmap *bi
 					set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
 					spin_unlock_irqrestore(&bitmap->lock, flags);
 				}
-				kunmap(lastpage);
 				page_cache_release(lastpage);
 				if (err)
 					bitmap_file_kick(bitmap);
 			} else
 				spin_unlock_irqrestore(&bitmap->lock, flags);
 			lastpage = page;
-			kmap(page);
 /*
 			printk("bitmap clean at page %lu\n", j);
 */
@@ -1107,10 +1092,12 @@ int bitmap_daemon_work(struct bitmap *bi
 						  -1);
 
 				/* clear the bit */
+				paddr = kmap_atomic(page, KM_USER0);
 				if (bitmap->flags & BITMAP_HOSTENDIAN)
-					clear_bit(file_page_offset(j), page_address(page));
+					clear_bit(file_page_offset(j), paddr);
 				else
-					ext2_clear_bit(file_page_offset(j), page_address(page));
+					ext2_clear_bit(file_page_offset(j), paddr);
+				kunmap_atomic(paddr, KM_USER0);
 			}
 		}
 		spin_unlock_irqrestore(&bitmap->lock, flags);
@@ -1118,7 +1105,6 @@ int bitmap_daemon_work(struct bitmap *bi
 
 	/* now sync the final page */
 	if (lastpage != NULL) {
-		kunmap(lastpage);
 		spin_lock_irqsave(&bitmap->lock, flags);
 		if (get_page_attr(bitmap, lastpage) &BITMAP_PAGE_NEEDWRITE) {
 			clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);

diff ./mm/page-writeback.c~current~ ./mm/page-writeback.c
--- ./mm/page-writeback.c~current~	2005-11-22 14:06:53.000000000 +1100
+++ ./mm/page-writeback.c	2005-11-22 14:07:05.000000000 +1100
@@ -583,6 +583,10 @@ int write_one_page(struct page *page, in
 	};
 
 	BUG_ON(!PageLocked(page));
+	if (!mapping_cap_writeback_dirty(mapping)) {
+		unlock_page(page);
+		return ret;
+	}
 
 	if (wait)
 		wait_on_page_writeback(page);
============================================================

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-22  3:12                   ` Neil Brown
@ 2005-11-22  3:47                     ` Andrew Morton
  2005-11-22 10:34                     ` Sander
  2005-11-22 12:00                     ` Please help me understand ->writepage. " Anton Altaparmakov
  2 siblings, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2005-11-22  3:47 UTC (permalink / raw)
  To: Neil Brown; +Cc: sander, linux-kernel, reiserfs-dev

Neil Brown <neilb@suse.de> wrote:
>
> Uhm, what would you think of testing mapping_cap_writeback_dirty in
>  write_one_page??  If you don't like it, I can take it into write_page.

write_one_page() is a little library function for filesystems to call, and
filesystems implicitly know whether or not they have backing store.  So
probably it's best to do this test in the (unusual) caller.

> > Also, write_page() doesn't need to run set_page_dirty(); ->commit_write()
> > will do that.
>
> Ok.... but I think I'm dropping prepare_write / commit_write.
>

Those functions do some pretty handy things, like creating disk blocks
within the file to back the page.  If someone comes along and ftruncate()s
the bitmap file while you're not looking, what happens?  Generally we use
i_sem for this sort of thing.

If you know that the page is still mapped into the file then yes, you can do

	lock_page()
	kmap_atomic()
	<modify>
	kunmap_atomic()
	flush_dcache_page()
	set_page_dirty()
	unlock_page()
	write_one_page(wait==1)

but that's rather a lot of work.

bitmap_unplug() looks risky - calling filesystem functions (like
lock_page()) from inside an unplug function.  Can this all be called from
the vmscan->writepage path?

It might be simpler and more maintainable to maintain the bitmap in normal
kernel memory, sync it to disk via higher-level entrypoints like
sys_write(), vfs_write(), sys_sync(), do_sync(), etc.
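
A userspace analogue of that last suggestion might look like the sketch
below: the bitmap lives in ordinary memory and is pushed to its file with
plain write and sync calls (POSIX pwrite() and fsync() here), rather than by
manipulating page-cache pages. The toy_* names and the 4096-byte size are
assumptions made for the example, not anything taken from md.

```c
#define _XOPEN_SOURCE 700	/* for pwrite()/pread() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BITMAP_BYTES 4096	/* hypothetical size: one page worth of bits */

struct toy_bitmap {
	uint8_t bits[TOY_BITMAP_BYTES];	/* ordinary memory, not page-cache pages */
	int fd;				/* backing file */
};

static void toy_bitmap_init(struct toy_bitmap *bm, int fd)
{
	memset(bm->bits, 0, sizeof(bm->bits));
	bm->fd = fd;
}

static void toy_set_bit(struct toy_bitmap *bm, unsigned int bit)
{
	assert(bit < TOY_BITMAP_BYTES * 8);
	bm->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Write the whole bitmap, then wait until it has reached stable storage. */
static int toy_bitmap_sync(struct toy_bitmap *bm)
{
	size_t done = 0;

	while (done < sizeof(bm->bits)) {
		ssize_t n = pwrite(bm->fd, bm->bits + done,
				   sizeof(bm->bits) - done, (off_t)done);
		if (n <= 0)
			return -1;	/* treat errors and zero-length writes as failure */
		done += (size_t)n;
	}
	return fsync(bm->fd);
}
```

This trades some efficiency (the whole bitmap is rewritten on each sync) for
not depending on how any particular filesystem implements ->writepage.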

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-22  3:12                   ` Neil Brown
  2005-11-22  3:47                     ` Andrew Morton
@ 2005-11-22 10:34                     ` Sander
  2005-11-24  5:41                       ` Please help me understand reiser4_writepage. " Neil Brown
  2005-11-22 12:00                     ` Please help me understand ->writepage. " Anton Altaparmakov
  2 siblings, 1 reply; 19+ messages in thread
From: Sander @ 2005-11-22 10:34 UTC (permalink / raw)
  To: Neil Brown; +Cc: Andrew Morton, sander, linux-kernel, reiserfs-dev

Neil Brown wrote (ao):
> On Monday November 21, akpm@osdl.org wrote:
> > bitmap->filemap_attr can be allocated with kzalloc() now.
> Yes, thanks.
> 
> So Sander, could you try this patch for main against reiser4?  It
> seems to work on ext3 and tmpfs and has some chance of not mucking up
> on reiser4.

It doesn't crash or segfault anymore. It works with the bitmap file on
tmpfs, but not yet on reiser4.

This is kernel 2.6.15-rc1-mm2 with your (Neil Brown's) patch.


loop0 is connected to a file on tmpfs
loop1 to a file on reiser4
/storage/raid1.bitmap is also on reiser4

# mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: RUN_ARRAY failed: No such file or directory

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10]
md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1003904 blocks [4/4] [UUUU]

unused devices: <none>

# mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Tue Nov 22 11:09:15 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Tue Nov 22 11:09:15 2005
Continue creating array? yes
mdadm: bitmap file /storage/raid1.bitmap already exists, use --force to overwrite

# mdadm -C /dev/md1 -f --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Tue Nov 22 11:09:15 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Tue Nov 22 11:09:15 2005
Continue creating array? yes
mdadm: RUN_ARRAY failed: Success

# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10] 
md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1003904 blocks [4/4] [UUUU]
      
unused devices: <none>


dmesg:
[42949583.660000] loop: loaded (max 8 devices)
[42949655.110000] md: bind<loop0>
[42949655.110000] md: bind<loop1>
[42949655.110000] md: md1: raid array is not clean -- starting background reconstruction
[42949655.110000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
[42949655.110000] md1: bitmap file is out of date, doing full recovery
[42949655.680000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 1
[42949655.680000] md1: failed to create bitmap (1)
[42949655.680000] md: pers->run() failed ...
[42949655.680000] md: md1 stopped.
[42949655.680000] md: unbind<loop1>
[42949655.680000] md: export_rdev(loop1)
[42949655.680000] md: unbind<loop0>
[42949655.680000] md: export_rdev(loop0)
[42949671.480000] md: bind<loop0>
[42949671.480000] md: bind<loop1>
[42949671.480000] md: md1: raid array is not clean -- starting background reconstruction
[42949671.480000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
[42949671.480000] md1: bitmap file is out of date, doing full recovery
[42949671.770000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 1
[42949671.770000] md1: failed to create bitmap (1)
[42949671.770000] md: pers->run() failed ...
[42949671.770000] md: md1 stopped.
[42949671.770000] md: unbind<loop1>
[42949671.770000] md: export_rdev(loop1)
[42949671.770000] md: unbind<loop0>
[42949671.770000] md: export_rdev(loop0)


It does work with the bitmap file on tmpfs:

# mdadm -C /dev/md1 -f --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
mdadm: /dev/loop0 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Tue Nov 22 11:20:48 2005
mdadm: /dev/loop1 appears to be part of a raid array:
    level=raid1 devices=2 ctime=Tue Nov 22 11:20:48 2005
Continue creating array? yes
mdadm: array /dev/md1 started.

silo1:~# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10] 
md1 : active raid1 loop1[1] loop0[0]
      509056 blocks [2/2] [UU]
      [>....................]  resync =  1.2% (6528/509056) finish=2.5min speed=3264K/sec
      bitmap: 63/63 pages [252KB], 4KB chunk, file: /tmp/raid1.bitmap

md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1003904 blocks [4/4] [UUUU]
      
unused devices: <none>


Is there anything you need me to test further?

Thanks for the patch!

	Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net


* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-22  3:12                   ` Neil Brown
  2005-11-22  3:47                     ` Andrew Morton
  2005-11-22 10:34                     ` Sander
@ 2005-11-22 12:00                     ` Anton Altaparmakov
  2005-11-24  5:29                       ` Neil Brown
  2 siblings, 1 reply; 19+ messages in thread
From: Anton Altaparmakov @ 2005-11-22 12:00 UTC (permalink / raw)
  To: Neil Brown; +Cc: Andrew Morton, sander, linux-kernel, reiserfs-dev

On Tue, 22 Nov 2005, Neil Brown wrote:
> On Monday November 21, akpm@osdl.org wrote:
> > Neil Brown <neilb@suse.de> wrote:
> > >
> > > Help ???
> > 
> > Indeed.  tmpfs is crackpottery.
> 
> Ok, that explains a lot... :-)
> 
> > >  Any advice would be most welcome!
> > 
> > Skip the writepage if !mapping_cap_writeback_dirty(page->mapping), I guess.
> > Or, if appropriate, just sync the file.  Use filemap_fdatawrite() or even
> > refactor do_fsync() and use most of that.
> 
> Uhm, what would you think of testing mapping_cap_writeback_dirty in
> write_one_page??  If you don't like it, I can take it into write_page.
> 
> > Also, write_page() doesn't need to run set_page_dirty(); ->commit_write()
> > will do that.
> 
> Ok.... but I think I'm dropping prepare_write / commit_write.

That is a good idea given some file systems do not implement them.

> > Several kmap()s in there which can become kmap_atomic().
> 
> I've made them all kmap_atomic.

Except you did it wrong...  See below...

> > bitmap_init_from_disk() might be leaking bitmap->filemap on kmalloc-failed
> > error path.
> 
> It looks that way, but actually not.  bitmap_create requires that
> bitmap_destroy always be called afterwards, even on an error.  Not the
> best interface I'd agree...
> 
> > bitmap->filemap_attr can be allocated with kzalloc() now.
> Yes, thanks.
> 
> So Sander, could you try this patch for me against reiser4?  It
> seems to work on ext3 and tmpfs and has some chance of not mucking up
> on reiser4.
> 
> Thanks,
> NeilBrown
> 
> ===File /home/src/mm/.patches/applied/014MdBitmapFix========
> Status: devel
> 
> Hopefully make md/bitmaps work on files other than ext3
> 
> 
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./drivers/md/bitmap.c |   64 +++++++++++++++++++-------------------------------
>  ./mm/page-writeback.c |    4 +++
>  2 files changed, 29 insertions(+), 39 deletions(-)
> 
> diff ./drivers/md/bitmap.c~current~ ./drivers/md/bitmap.c
> --- ./drivers/md/bitmap.c~current~	2005-11-22 14:06:53.000000000 +1100
> +++ ./drivers/md/bitmap.c	2005-11-22 14:07:05.000000000 +1100
> @@ -310,7 +310,6 @@ static int write_sb_page(mddev_t *mddev,
>   */
>  static int write_page(struct bitmap *bitmap, struct page *page, int wait)
>  {
> -	int ret = -ENOMEM;
>  
>  	if (bitmap->file == NULL)
>  		return write_sb_page(bitmap->mddev, bitmap->offset, page, wait);
> @@ -326,15 +325,6 @@ static int write_page(struct bitmap *bit
>  		}
>  	}
>  
> -	ret = page->mapping->a_ops->prepare_write(bitmap->file, page, 0, PAGE_SIZE);
> -	if (!ret)
> -		ret = page->mapping->a_ops->commit_write(bitmap->file, page, 0,
> -			PAGE_SIZE);
> -	if (ret) {
> -		unlock_page(page);
> -		return ret;
> -	}
> -
>  	set_page_dirty(page); /* force it to be written out */
>  
>  	if (!wait) {
> @@ -406,11 +396,11 @@ int bitmap_update_sb(struct bitmap *bitm
>  		return 0;
>  	}
>  	spin_unlock_irqrestore(&bitmap->lock, flags);
> -	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
> +	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
>  	sb->events = cpu_to_le64(bitmap->mddev->events);
>  	if (!bitmap->mddev->degraded)
>  		sb->events_cleared = cpu_to_le64(bitmap->mddev->events);
> -	kunmap(bitmap->sb_page);
> +	kunmap_atomic(bitmap->sb_page, KM_USER0);

You need to pass in the address not the page, i.e.:

	kunmap_atomic(sb, KM_USER0);
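
To illustrate the point about unmapping by address rather than by page, here is a minimal userspace sketch. It is only a model: the mock_* names, struct mock_page and MOCK_PAGE_SIZE are hypothetical stand-ins invented for this example, not the kernel's kmap_atomic()/kunmap_atomic(), and the KM_USER0 slot argument is left out.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model only; every mock_* name here is hypothetical. */
#define MOCK_PAGE_SIZE 4096

struct mock_page {
	unsigned long index;                /* keeps &page distinct from page->data */
	unsigned char data[MOCK_PAGE_SIZE];
};

/* Address handed out by the most recent mapping, NULL when nothing is mapped. */
static void *mock_mapped_addr;

/* Map a page and return the address through which its contents are accessed. */
static void *mock_kmap_atomic(struct mock_page *page)
{
	mock_mapped_addr = page->data;
	return mock_mapped_addr;
}

/*
 * Unmap by ADDRESS, i.e. the value that the map call returned.
 * Returns 0 on success, -1 if the argument is not the mapped address.
 */
static int mock_kunmap_atomic(void *addr)
{
	if (addr == NULL || addr != mock_mapped_addr)
		return -1;
	mock_mapped_addr = NULL;
	return 0;
}

/* Fill the tail of a page with 0xff, pairing map and unmap on the same address. */
static int mock_fill_page(struct mock_page *page, size_t offset)
{
	unsigned char *paddr = mock_kmap_atomic(page);

	memset(paddr + offset, 0xff, MOCK_PAGE_SIZE - offset);
	return mock_kunmap_atomic(paddr);   /* paddr, not page */
}
```

In this model, passing the page pointer to mock_kunmap_atomic() returns -1, while passing the pointer obtained from mock_kmap_atomic() returns 0.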

>  	return write_page(bitmap, bitmap->sb_page, 1);
>  }
>  
> @@ -421,7 +411,7 @@ void bitmap_print_sb(struct bitmap *bitm
>  
>  	if (!bitmap || !bitmap->sb_page)
>  		return;
> -	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
> +	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
>  	printk(KERN_DEBUG "%s: bitmap file superblock:\n", bmname(bitmap));
>  	printk(KERN_DEBUG "         magic: %08x\n", le32_to_cpu(sb->magic));
>  	printk(KERN_DEBUG "       version: %d\n", le32_to_cpu(sb->version));
> @@ -440,7 +430,7 @@ void bitmap_print_sb(struct bitmap *bitm
>  	printk(KERN_DEBUG "     sync size: %llu KB\n",
>  			(unsigned long long)le64_to_cpu(sb->sync_size)/2);
>  	printk(KERN_DEBUG "max write behind: %d\n", le32_to_cpu(sb->write_behind));
> -	kunmap(bitmap->sb_page);
> +	kunmap_atomic(bitmap->sb_page, KM_USER0);

Again, this should be:

	kunmap_atomic(sb, KM_USER0);

>  }
>  
>  /* read the superblock from the bitmap file and initialize some bitmap fields */
> @@ -466,7 +456,7 @@ static int bitmap_read_sb(struct bitmap 
>  		return err;
>  	}
>  
> -	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
> +	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
>  
>  	if (bytes_read < sizeof(*sb)) { /* short read */
>  		printk(KERN_INFO "%s: bitmap file superblock truncated\n",
> @@ -535,7 +525,7 @@ success:
>  		bitmap->events_cleared = bitmap->mddev->events;
>  	err = 0;
>  out:
> -	kunmap(bitmap->sb_page);
> +	kunmap_atomic(bitmap->sb_page, KM_USER0);

Again:	kunmap_atomic(sb, KM_USER0);

>  	if (err)
>  		bitmap_print_sb(bitmap);
>  	return err;
> @@ -560,7 +550,7 @@ static void bitmap_mask_state(struct bit
>  	}
>  	page_cache_get(bitmap->sb_page);
>  	spin_unlock_irqrestore(&bitmap->lock, flags);
> -	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
> +	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
>  	switch (op) {
>  		case MASK_SET: sb->state |= bits;
>  				break;
> @@ -568,7 +558,7 @@ static void bitmap_mask_state(struct bit
>  				break;
>  		default: BUG();
>  	}
> -	kunmap(bitmap->sb_page);
> +	kunmap_atomic(bitmap->sb_page, KM_USER0);

Again:	kunmap_atomic(sb, KM_USER0);

>  	page_cache_release(bitmap->sb_page);
>  }
>  
> @@ -621,8 +611,7 @@ static void bitmap_file_unmap(struct bit
>  	spin_unlock_irqrestore(&bitmap->lock, flags);
>  
>  	while (pages--)
> -		if (map[pages]->index != 0) /* 0 is sb_page, release it below */
> -			page_cache_release(map[pages]);
> +		page_cache_release(map[pages]);
>  	kfree(map);
>  	kfree(attr);
>  
> @@ -771,7 +760,7 @@ static void bitmap_file_set_bit(struct b
>  		set_bit(bit, kaddr);
>  	else
>  		ext2_set_bit(bit, kaddr);
> -	kunmap_atomic(kaddr, KM_USER0);
> +	kunmap_atomic(page, KM_USER0);

This one was correct, you broke it.  (-:

>  	PRINTK("set file bit %lu page %lu\n", bit, page->index);
>  
>  	/* record page number so it gets flushed to disk when unplug occurs */
> @@ -854,6 +843,7 @@ static int bitmap_init_from_disk(struct 
>  	unsigned long bytes, offset, dummy;
>  	int outofdate;
>  	int ret = -ENOSPC;
> +	void *paddr;
>  
>  	chunks = bitmap->chunks;
>  	file = bitmap->file;
> @@ -887,12 +877,10 @@ static int bitmap_init_from_disk(struct 
>  	if (!bitmap->filemap)
>  		goto out;
>  
> -	bitmap->filemap_attr = kmalloc(sizeof(long) * num_pages, GFP_KERNEL);
> +	bitmap->filemap_attr = kzalloc(sizeof(long) * num_pages, GFP_KERNEL);
>  	if (!bitmap->filemap_attr)
>  		goto out;
>  
> -	memset(bitmap->filemap_attr, 0, sizeof(long) * num_pages);
> -
>  	oldindex = ~0L;
>  
>  	for (i = 0; i < chunks; i++) {
> @@ -901,8 +889,6 @@ static int bitmap_init_from_disk(struct 
>  		bit = file_page_offset(i);
>  		if (index != oldindex) { /* this is a new page, read it in */
>  			/* unmap the old page, we're done with it */
> -			if (oldpage != NULL)
> -				kunmap(oldpage);
>  			if (index == 0) {
>  				/*
>  				 * if we're here then the superblock page
> @@ -910,6 +896,7 @@ static int bitmap_init_from_disk(struct 
>  				 * we've already read it in, so just use it
>  				 */
>  				page = bitmap->sb_page;
> +				page_cache_get(page);
>  				offset = sizeof(bitmap_super_t);
>  			} else if (file) {
>  				page = read_page(file, index, &dummy);
> @@ -925,18 +912,18 @@ static int bitmap_init_from_disk(struct 
>  
>  			oldindex = index;
>  			oldpage = page;
> -			kmap(page);
>  
>  			if (outofdate) {
>  				/*
>  				 * if bitmap is out of date, dirty the
>  			 	 * whole page and write it out
>  				 */
> -				memset(page_address(page) + offset, 0xff,
> +				paddr = kmap_atomic(page, KM_USER0);
> +				memset(paddr + offset, 0xff,
>  				       PAGE_SIZE - offset);
> +				kunmap_atomic(page, KM_USER0);

Again:				kunmap_atomic(paddr, KM_USER0);

>  				ret = write_page(bitmap, page, 1);
>  				if (ret) {
> -					kunmap(page);
>  					/* release, page not in filemap yet */
>  					page_cache_release(page);
>  					goto out;
> @@ -945,10 +932,12 @@ static int bitmap_init_from_disk(struct 
>  
>  			bitmap->filemap[bitmap->file_pages++] = page;
>  		}
> +		paddr = kmap_atomic(page, KM_USER0);
>  		if (bitmap->flags & BITMAP_HOSTENDIAN)
> -			b = test_bit(bit, page_address(page));
> +			b = test_bit(bit, paddr);
>  		else
> -			b = ext2_test_bit(bit, page_address(page));
> +			b = ext2_test_bit(bit, paddr);
> +		kunmap_atomic(page, KM_USER0);

Again:		kunmap_atomic(paddr, KM_USER0);

>  		if (b) {
>  			/* if the disk bit is set, set the memory bit */
>  			bitmap_set_memory_bits(bitmap, i << CHUNK_BLOCK_SHIFT(bitmap),
> @@ -963,9 +952,6 @@ static int bitmap_init_from_disk(struct 
>  	ret = 0;
>  	bitmap_mask_state(bitmap, BITMAP_STALE, MASK_UNSET);
>  
> -	if (page) /* unmap the last page */
> -		kunmap(page);
> -
>  	if (bit_cnt) { /* Kick recovery if any bits were set */
>  		set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery);
>  		md_wakeup_thread(bitmap->mddev->thread);
> @@ -1021,6 +1007,7 @@ int bitmap_daemon_work(struct bitmap *bi
>  	int err = 0;
>  	int blocks;
>  	int attr;
> +	void *paddr;
>  
>  	if (bitmap == NULL)
>  		return 0;
> @@ -1077,14 +1064,12 @@ int bitmap_daemon_work(struct bitmap *bi
>  					set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
>  					spin_unlock_irqrestore(&bitmap->lock, flags);
>  				}
> -				kunmap(lastpage);
>  				page_cache_release(lastpage);
>  				if (err)
>  					bitmap_file_kick(bitmap);
>  			} else
>  				spin_unlock_irqrestore(&bitmap->lock, flags);
>  			lastpage = page;
> -			kmap(page);
>  /*
>  			printk("bitmap clean at page %lu\n", j);
>  */
> @@ -1107,10 +1092,12 @@ int bitmap_daemon_work(struct bitmap *bi
>  						  -1);
>  
>  				/* clear the bit */
> +				paddr = kmap_atomic(page, KM_USER0);
>  				if (bitmap->flags & BITMAP_HOSTENDIAN)
> -					clear_bit(file_page_offset(j), page_address(page));
> +					clear_bit(file_page_offset(j), paddr);
>  				else
> -					ext2_clear_bit(file_page_offset(j), page_address(page));
> +					ext2_clear_bit(file_page_offset(j), paddr);
> +				kunmap_atomic(page, KM_USER0);

Again:				kunmap_atomic(paddr, KM_USER0);

>  			}
>  		}
>  		spin_unlock_irqrestore(&bitmap->lock, flags);
> @@ -1118,7 +1105,6 @@ int bitmap_daemon_work(struct bitmap *bi
>  
>  	/* now sync the final page */
>  	if (lastpage != NULL) {
> -		kunmap(lastpage);
>  		spin_lock_irqsave(&bitmap->lock, flags);
>  		if (get_page_attr(bitmap, lastpage) &BITMAP_PAGE_NEEDWRITE) {
>  			clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
> 
> diff ./mm/page-writeback.c~current~ ./mm/page-writeback.c
> --- ./mm/page-writeback.c~current~	2005-11-22 14:06:53.000000000 +1100
> +++ ./mm/page-writeback.c	2005-11-22 14:07:05.000000000 +1100
> @@ -583,6 +583,10 @@ int write_one_page(struct page *page, in
>  	};
>  
>  	BUG_ON(!PageLocked(page));
> +	if (!mapping_cap_writeback_dirty(mapping)) {
> +		unlock_page(page);
> +		return ret;
> +	}
>  
>  	if (wait)
>  		wait_on_page_writeback(page);

Hope this helps.

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


* Re: Please help me understand ->writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-22 12:00                     ` Please help me understand ->writepage. " Anton Altaparmakov
@ 2005-11-24  5:29                       ` Neil Brown
  0 siblings, 0 replies; 19+ messages in thread
From: Neil Brown @ 2005-11-24  5:29 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: Andrew Morton, sander, linux-kernel, reiserfs-dev

On Tuesday November 22, aia21@cam.ac.uk wrote:
> On Tue, 22 Nov 2005, Neil Brown wrote:
> > I've made them all kmap_atomic.
> 
> Except you did it wrong...  See below...
> 
> > -	kunmap(bitmap->sb_page);
> > +	kunmap_atomic(bitmap->sb_page, KM_USER0);
> 
> You need to pass in the address not the page, i.e.:
> 

How.. umm... intuitive :-(
Thanks, I'll fix that.

> 
> Hope this helps.
> 

It does.  I really appreciate getting feedback on my code.... I'm
sometimes tempted to slip in a few bugs so that when people point them
out to me I know they have read the rest of the code and that
increases my confidence in it (I haven't actually done this... yet).

:-)

NeilBrown


* Re: Please help me understand reiser4_writepage. Was Re: segfault mdadm --write-behind, 2.6.14-mm2  (was: Re: RAID1 ramdisk patch)
  2005-11-22 10:34                     ` Sander
@ 2005-11-24  5:41                       ` Neil Brown
  0 siblings, 0 replies; 19+ messages in thread
From: Neil Brown @ 2005-11-24  5:41 UTC (permalink / raw)
  To: sander; +Cc: Andrew Morton, linux-kernel, reiserfs-dev

On Tuesday November 22, sander@humilis.net wrote:
> 
> It doesn't crash or segfault anymore. It works with the bitmap file on
> tmpfs, but not yet on reiser4.
> 
> This is kernel 2.6.15-rc1-mm2 with your (Neil Brown's) patch.
> 
...
> [42949655.680000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 1
....

Ok, this is interesting... 'status: 1'.
That should be either 0 or a negative errno.

That is printed in bitmap_init_from_disk in drivers/md/bitmap.c.

'ret' can only be '1' if that value is returned from 'write_page'.
write_page (same file) can only return '1' if that is returned by
write_one_page (mm/page-writeback.c).
write_one_page can only return '1' from a_ops->writepage, which is
presumably reiser4_writepage in fs/reiser4/page_cache.c.

This will only return an unchecked value from write_page_by_ent (if
REISER4_USE_ENTD is defined) or emergency_flush.
emergency_flush is in fs/reiser4/emergency_flush.c and it does indeed
return 1 in some circumstances, though I don't really know what
circumstances.

So there may well be something that md/bitmap is doing wrongly, but
reiser4_writepage should not be returning 1 in any case.
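
The chain above can be modelled in userspace to show how a stray positive value travels unchanged up to the caller. This is a rough sketch under stated assumptions: every mock_* name is hypothetical, and mock_normalize_status() is just one possible defensive check invented for the example, not something proposed in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: by convention returns 0 or a negative errno. */
typedef int (*mock_writepage_fn)(void *page);

/* Model of a caller that forwards the callback's status unchanged. */
static int mock_write_page(mock_writepage_fn writepage, void *page)
{
	return writepage(page);
}

/* Hypothetical defensive helper: coerce a positive status into -EIO. */
static int mock_normalize_status(int ret)
{
	return ret > 0 ? -EIO : ret;
}

/* Three sample callbacks: success, an out-of-convention '1', a real error. */
static int mock_writepage_ok(void *page)  { (void)page; return 0; }
static int mock_writepage_odd(void *page) { (void)page; return 1; }
static int mock_writepage_err(void *page) { (void)page; return -ENOSPC; }
```

With mock_writepage_odd the caller sees 1, matching the 'status: 1' in the log; wrapping the call in mock_normalize_status() turns that into -EIO and leaves 0 and negative errno values untouched.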

Could someone on reiserfs-dev help me understand when
reiser4_writepage returns '1' and what I might be doing to trigger
that?

Thanks,
NeilBrown



Thread overview: 19+ messages
2005-09-05  0:46 RAID1 ramdisk patch Wilco Baan Hofman
2005-09-05  1:27 ` Neil Brown
2005-09-05  7:40   ` Wilco Baan Hofman
2005-11-16 13:36   ` segfault mdadm --write-behind, 2.6.14-mm2 (was: Re: RAID1 ramdisk patch) Sander
2005-11-16 22:20     ` Andrew Morton
2005-11-16 23:08       ` Neil Brown
2005-11-17  7:50         ` Sander
2005-11-17 10:12           ` Sander
2005-11-17 10:15             ` Sander
2005-11-21 23:07               ` Please help me understand ->writepage. Was " Neil Brown
2005-11-21 23:30                 ` Jeff Garzik
2005-11-21 23:51                 ` Andrew Morton
2005-11-22  3:12                   ` Neil Brown
2005-11-22  3:47                     ` Andrew Morton
2005-11-22 10:34                     ` Sander
2005-11-24  5:41                       ` Please help me understand reiser4_writepage. " Neil Brown
2005-11-22 12:00                     ` Please help me understand ->writepage. " Anton Altaparmakov
2005-11-24  5:29                       ` Neil Brown
2005-11-18 14:18       ` segfault mdadm --write-behind, 2.6.14-mm2 Vladimir V. Saveliev
