public inbox for linux-kernel@vger.kernel.org
* AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
@ 2006-10-16 15:42 Badari Pulavarty
  2006-10-16 17:51 ` Zach Brown
  0 siblings, 1 reply; 11+ messages in thread
From: Badari Pulavarty @ 2006-10-16 15:42 UTC (permalink / raw)
  To: Zach Brown; +Cc: lkml, akpm

Hi Zach,


While looking at test.kernel.org failures, I noticed that most
of fsx tests with AIO or DIO (or combination) are failing.

I haven't dug deep, but I am assuming it's something to do
with your latest cleanup work :(

Can you take a look ? Failures look nasty ..

http://test.kernel.org/abat/55567/005.fsx-linux.test/results/


Thanks,
Badari



* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 15:42 AIO, DIO fsx tests failures on 2.6.19-rc1-mm1 Badari Pulavarty
@ 2006-10-16 17:51 ` Zach Brown
  2006-10-16 17:59   ` Badari Pulavarty
  0 siblings, 1 reply; 11+ messages in thread
From: Zach Brown @ 2006-10-16 17:51 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: lkml, akpm


> Can you take a look ? Failures look nasty ..

Yeah, it's right at the tippy top of my list.

- z


* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 17:51 ` Zach Brown
@ 2006-10-16 17:59   ` Badari Pulavarty
  2006-10-16 20:13     ` Zach Brown
  0 siblings, 1 reply; 11+ messages in thread
From: Badari Pulavarty @ 2006-10-16 17:59 UTC (permalink / raw)
  To: Zach Brown; +Cc: lkml, akpm

On Mon, 2006-10-16 at 10:51 -0700, Zach Brown wrote:
> > Can you take a look ? Failures look nasty ..
> 
> Yeah, it's right at the tippy top of my list.
> 
> - z

Here is the easiest case to fix first :)
simple DIO wrote more than asked for :(

elm3b29:~ # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
jnk
mapped writes DISABLED
truncating to largest ever: 0x32740
truncating to largest ever: 0x39212
truncating to largest ever: 0x3bae9
short write: 0x17000 bytes instead of 0x14000   <<<<<<
LOG DUMP (47 total operations):
1(1 mod 256): WRITE     0x1f000 thru 0x35fff    (0x17000 bytes) HOLE
2(2 mod 256): WRITE     0x10000 thru 0x2dfff    (0x1e000 bytes)
3(3 mod 256): READ      0xe800 thru 0xefff      (0x800 bytes)
4(4 mod 256): READ      0x800 thru 0x137ff      (0x13000 bytes)
5(5 mod 256): READ      0x2e800 thru 0x357ff    (0x7000 bytes)
6(6 mod 256): READ      0x23800 thru 0x357ff    (0x12000 bytes)
7(7 mod 256): READ      0x2f800 thru 0x357ff    (0x6000 bytes)
8(8 mod 256): READ      0x7000 thru 0xbfff      (0x5000 bytes)
9(9 mod 256): TRUNCATE DOWN     from 0x36000 to 0x32740
10(10 mod 256): READ    0xa800 thru 0x287ff     (0x1e000 bytes)
11(11 mod 256): WRITE   0x29000 thru 0x34fff    (0xc000 bytes) EXTEND
12(12 mod 256): TRUNCATE DOWN   from 0x35000 to 0x1977a
13(13 mod 256): READ    0xa800 thru 0xe7ff      (0x4000 bytes)
14(14 mod 256): TRUNCATE UP     from 0x1977a to 0x39212
15(15 mod 256): WRITE   0x28000 thru 0x3cfff    (0x15000 bytes) EXTEND
16(16 mod 256): READ    0xb000 thru 0xcfff      (0x2000 bytes)
17(17 mod 256): READ    0x38800 thru 0x3c7ff    (0x4000 bytes)
18(18 mod 256): TRUNCATE DOWN   from 0x3d000 to 0x2a2b5
19(19 mod 256): READ    0xf800 thru 0x167ff     (0x7000 bytes)
20(20 mod 256): READ    0x15000 thru 0x19fff    (0x5000 bytes)
21(21 mod 256): TRUNCATE DOWN   from 0x2a2b5 to 0xa87
22(22 mod 256): TRUNCATE UP     from 0xa87 to 0x20108
23(23 mod 256): READ    0x4800 thru 0x15fff     (0x11800 bytes)
24(24 mod 256): READ    0x7000 thru 0x13fff     (0xd000 bytes)
25(25 mod 256): READ    0x0 thru 0x137ff        (0x13800 bytes)
26(26 mod 256): SKIPPED (no operation)
27(27 mod 256): WRITE   0x15000 thru 0x23fff    (0xf000 bytes) EXTEND
28(28 mod 256): WRITE   0x23000 thru 0x38fff    (0x16000 bytes) EXTEND
29(29 mod 256): WRITE   0x32000 thru 0x32fff    (0x1000 bytes)
30(30 mod 256): WRITE   0x10000 thru 0x16fff    (0x7000 bytes)
31(31 mod 256): TRUNCATE DOWN   from 0x39000 to 0x2fae1
32(32 mod 256): READ    0x13000 thru 0x2efff    (0x1c000 bytes)
33(33 mod 256): WRITE   0x3e000 thru 0x3efff    (0x1000 bytes) HOLE
34(34 mod 256): TRUNCATE DOWN   from 0x3f000 to 0x18d8b
35(35 mod 256): WRITE   0x7000 thru 0x10fff     (0xa000 bytes)
36(36 mod 256): READ    0x9000 thru 0x187ff     (0xf800 bytes)
37(37 mod 256): TRUNCATE DOWN   from 0x18d8b to 0x12111
38(38 mod 256): TRUNCATE UP     from 0x12111 to 0x2e931
39(39 mod 256): WRITE   0x38000 thru 0x3efff    (0x7000 bytes) HOLE
40(40 mod 256): WRITE   0x2c000 thru 0x3dfff    (0x12000 bytes)
41(41 mod 256): SKIPPED (no operation)
42(42 mod 256): TRUNCATE DOWN   from 0x3f000 to 0x15e10
43(43 mod 256): TRUNCATE UP     from 0x15e10 to 0x3bae9
44(44 mod 256): TRUNCATE DOWN   from 0x3bae9 to 0x3b4a3
45(45 mod 256): TRUNCATE DOWN   from 0x3b4a3 to 0x27d16
46(46 mod 256): WRITE   0x27000 thru 0x2ffff    (0x9000 bytes) EXTEND
47(47 mod 256): WRITE   0x10000 thru 0x23fff    (0x14000 bytes)
Correct content saved for comparison
(maybe hexdump "jnk" vs "jnk.fsxgood")
Segmentation fault




* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 17:59   ` Badari Pulavarty
@ 2006-10-16 20:13     ` Zach Brown
  2006-10-16 20:38       ` Badari Pulavarty
  0 siblings, 1 reply; 11+ messages in thread
From: Zach Brown @ 2006-10-16 20:13 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: lkml, akpm


> Here is the easiest case to fix first :)
> simple DIO wrote more than asked for :(
> 
> elm3b29:~ # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
> jnk
> mapped writes DISABLED
> truncating to largest ever: 0x32740
> truncating to largest ever: 0x39212
> truncating to largest ever: 0x3bae9
> short write: 0x17000 bytes instead of 0x14000   <<<<<<

So the answer is that -rc1-mm1 doesn't quite have the most recent
version of this patch.  Grab the final patch at the end of this post
from Andrew:

	http://lkml.org/lkml/2006/10/11/234

It fixes up a misunderstanding that came from
generic_file_buffered_write()'s habit of adding its 'written' input into
the amount of bytes it announces having written in its return value.
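
A minimal sketch of that misunderstanding, as a user-space C model rather
than kernel code. The function names are hypothetical, and the
0x3000/0x11000 split between the direct and buffered parts is an assumption
picked so the totals line up with the fsx message above; the thread does not
say how the write was actually split or exactly what the older patch did
with the return value.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for the convention described above: the callee is
 * handed the number of bytes already written and returns the running
 * total, not just the bytes it wrote itself.
 */
ssize_t buffered_write_model(size_t count, ssize_t written)
{
	assert(written >= 0);
	return written + (ssize_t)count;
}

/*
 * Assumed buggy caller: treats the return value as "bytes written by this
 * call" and adds the direct-written bytes on top, counting them twice.
 */
ssize_t total_double_counted(size_t direct, size_t remaining)
{
	ssize_t ret = buffered_write_model(remaining, (ssize_t)direct);

	return (ssize_t)direct + ret;
}

/* Caller that knows the return value is already cumulative. */
ssize_t total_correct(size_t direct, size_t remaining)
{
	return buffered_write_model(remaining, (ssize_t)direct);
}
```

With those assumed numbers, total_double_counted(0x3000, 0x11000) reports
0x17000 for a 0x14000-byte request, while total_correct(0x3000, 0x11000)
reports 0x14000.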

From mm-commits it looks like -mm2 will have the full patch.

- z


* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 20:13     ` Zach Brown
@ 2006-10-16 20:38       ` Badari Pulavarty
  2006-10-16 20:59         ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Badari Pulavarty @ 2006-10-16 20:38 UTC (permalink / raw)
  To: Zach Brown; +Cc: lkml, akpm

On Mon, 2006-10-16 at 13:13 -0700, Zach Brown wrote:
> > Here is the easiest case to fix first :)
> > simple DIO wrote more than asked for :(
> > 
> > elm3b29:~ # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
> > jnk
> > mapped writes DISABLED
> > truncating to largest ever: 0x32740
> > truncating to largest ever: 0x39212
> > truncating to largest ever: 0x3bae9
> > short write: 0x17000 bytes instead of 0x14000   <<<<<<
> 
> So the answer is that -rc1-mm1 doesn't quite have the most recent
> version of this patch.  Grab the final patch at the end of this post
> from Andrew:
> 
> 	http://lkml.org/lkml/2006/10/11/234
> 
> It fixes up a misunderstanding that came from
> generic_file_buffered_write()'s habit of adding its 'written' input into
> the amount of bytes it announces having written in its return value.
> 
> From mm-commits it looks like -mm2 will have the full patch.
> 

Hmm.. with that patch applied, I still have fsx failures.
This time read() returning -EINVAL. Are there any other fixes
missing in -mm ?

Thanks,
Badari

elm3b29:~ # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
jnkI
mapped writes DISABLED
truncating to largest ever: 0x32740
truncating to largest ever: 0x39212
truncating to largest ever: 0x3bae9
truncating to largest ever: 0x3c1e3
truncating to largest ever: 0x3d1cd
truncating to largest ever: 0x3e8b8
truncating to largest ever: 0x3ed14
truncating to largest ever: 0x3f9c2
truncating to largest ever: 0x3ff9f
doread: read: Invalid argument
Segmentation fault




* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 20:38       ` Badari Pulavarty
@ 2006-10-16 20:59         ` Andrew Morton
  2006-10-16 21:21           ` Zach Brown
  2006-10-17 22:31           ` Badari Pulavarty
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2006-10-16 20:59 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Zach Brown, lkml

On Mon, 16 Oct 2006 13:38:19 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> > 
> > So the answer is that -rc1-mm1 doesn't quite have the most recent
> > version of this patch.  Grab the final patch at the end of this post
> > from Andrew:
> > 
> > 	http://lkml.org/lkml/2006/10/11/234
> > 
> > It fixes up a misunderstanding that came from
> > generic_file_buffered_write()'s habit of adding its 'written' input into
> > the amount of bytes it announces having written in its return value.
> > 
> > From mm-commits it looks like -mm2 will have the full patch.
> > 
> 
> Hmm.. with that patch applied, I still have fsx failures.
> This time read() returning -EINVAL. Are there any other fixes
> missing in -mm ?

Probably.  I need to get off butt and prepare rc2-mm1.

The below is the full patch against 2.6.19-rc2.  Please test this version.


From: Jeff Moyer <jmoyer@redhat.com>

When direct-io falls back to buffered write, it will just leave the dirty data
floating about in pagecache, pending regular writeback.

But normal direct-io semantics are that IO is synchronous, and that it leaves
no pagecache behind.

So change the fallback-to-buffered-write code to sync the file region and to
then strip away the pagecache, just as a regular direct-io write would do.

Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 mm/filemap.c |   51 +++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 45 insertions(+), 6 deletions(-)

diff -puN mm/filemap.c~direct-io-sync-and-invalidate-file-region-when-falling-back-to-buffered-write mm/filemap.c
--- a/mm/filemap.c~direct-io-sync-and-invalidate-file-region-when-falling-back-to-buffered-write
+++ a/mm/filemap.c
@@ -2222,7 +2222,7 @@ __generic_file_aio_write_nolock(struct k
 				unsigned long nr_segs, loff_t *ppos)
 {
 	struct file *file = iocb->ki_filp;
-	const struct address_space * mapping = file->f_mapping;
+	struct address_space * mapping = file->f_mapping;
 	size_t ocount;		/* original count */
 	size_t count;		/* after file limit checks */
 	struct inode 	*inode = mapping->host;
@@ -2275,8 +2275,11 @@ __generic_file_aio_write_nolock(struct k
 
 	/* coalesce the iovecs and go direct-to-BIO for O_DIRECT */
 	if (unlikely(file->f_flags & O_DIRECT)) {
-		written = generic_file_direct_write(iocb, iov,
-				&nr_segs, pos, ppos, count, ocount);
+		loff_t endbyte;
+		ssize_t written_buffered;
+
+		written = generic_file_direct_write(iocb, iov, &nr_segs, pos,
+							ppos, count, ocount);
 		if (written < 0 || written == count)
 			goto out;
 		/*
@@ -2285,10 +2288,46 @@ __generic_file_aio_write_nolock(struct k
 		 */
 		pos += written;
 		count -= written;
-	}
+		written_buffered = generic_file_buffered_write(iocb, iov,
+						nr_segs, pos, ppos, count,
+						written);
+		/*
+		 * If generic_file_buffered_write() returned a synchronous error
+		 * then we want to return the number of bytes which were
+		 * direct-written, or the error code if that was zero.  Note
+		 * that this differs from normal direct-io semantics, which
+		 * will return -EFOO even if some bytes were written.
+		 */
+		if (written_buffered < 0) {
+			err = written_buffered;
+			goto out;
+		}
 
-	written = generic_file_buffered_write(iocb, iov, nr_segs,
-			pos, ppos, count, written);
+		/*
+		 * We need to ensure that the page cache pages are written to
+		 * disk and invalidated to preserve the expected O_DIRECT
+		 * semantics.
+		 */
+		endbyte = pos + written_buffered - written - 1;
+		err = do_sync_file_range(file, pos, endbyte,
+					 SYNC_FILE_RANGE_WAIT_BEFORE|
+					 SYNC_FILE_RANGE_WRITE|
+					 SYNC_FILE_RANGE_WAIT_AFTER);
+		if (err == 0) {
+			written = written_buffered;
+			invalidate_mapping_pages(mapping,
+						 pos >> PAGE_CACHE_SHIFT,
+						 endbyte >> PAGE_CACHE_SHIFT);
+		} else {
+			/*
+			 * We don't know how much we wrote, so just return
+			 * the number of bytes which were direct-written
+			 */
+		}
+	} else {
+		written = generic_file_buffered_write(iocb, iov, nr_segs,
+				pos, ppos, count, written);
+	}
 out:
 	current->backing_dev_info = NULL;
 	return written ? written : err;
_
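
For readers following along outside the kernel, here is a rough user-space
analogue of what the fallback path above does: write, push that byte range
to disk, then drop the cached pages. This is only a sketch under stated
assumptions. fallback_endbyte() and write_sync_drop() are hypothetical
names, sync_file_range(2) is Linux-specific and takes a byte count where
do_sync_file_range() in the patch takes an end offset, and
posix_fadvise(POSIX_FADV_DONTNEED) is only advisory, so it is not
equivalent to invalidate_mapping_pages().

```c
#define _GNU_SOURCE		/* for sync_file_range() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Mirrors "endbyte = pos + written_buffered - written - 1" from the patch:
 * pos is where the buffered part starts, written_total is the cumulative
 * count returned by the buffered write, written_direct is the part that
 * went out via direct I/O.
 */
long long fallback_endbyte(long long pos, long long written_direct,
			   long long written_total)
{
	assert(written_total > written_direct);
	return pos + written_total - written_direct - 1;
}

/*
 * Buffered write followed by a range sync and a page cache drop.
 * Returns the number of bytes written, or a negative errno value.
 */
ssize_t write_sync_drop(int fd, const void *buf, size_t len, off_t off)
{
	ssize_t n = pwrite(fd, buf, len, off);
	int err;

	if (n <= 0)
		return n < 0 ? -errno : 0;

	/* Wait for the range to reach disk, as the patch does. */
	if (sync_file_range(fd, off, n,
			    SYNC_FILE_RANGE_WAIT_BEFORE |
			    SYNC_FILE_RANGE_WRITE |
			    SYNC_FILE_RANGE_WAIT_AFTER) < 0)
		return -errno;

	/* Ask the kernel to drop the now-clean pages (advisory only). */
	err = posix_fadvise(fd, off, n, POSIX_FADV_DONTNEED);
	return err ? -err : n;
}
```

With the buffered part starting at 0x13000 after 0x3000 direct-written bytes
of a 0x14000-byte request (assumed numbers), fallback_endbyte() yields
0x23fff, which is page index 0x13 through 0x23 with 4 KiB pages.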



* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 20:59         ` Andrew Morton
@ 2006-10-16 21:21           ` Zach Brown
  2006-10-17 22:31           ` Badari Pulavarty
  1 sibling, 0 replies; 11+ messages in thread
From: Zach Brown @ 2006-10-16 21:21 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Andrew Morton, lkml


>> Hmm.. with that patch applied, I still have fsx failures.
>> This time read() returning -EINVAL. Are there any other fixes
>> missing in -mm ?

For what it's worth, the fsx I have here isn't raising errors with the
latest patch on 1k, 2k, and 4k ext3 on a stinky old IDE drive on an old
UP P3.

# /tmp/fsx -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
/mnt/ext3-hdb4/fsx-file
...
truncating to largest ever: 0x3ff9f
truncating to largest ever: 0x3ffa9
All operations completed A-OK!

- z


* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-16 20:59         ` Andrew Morton
  2006-10-16 21:21           ` Zach Brown
@ 2006-10-17 22:31           ` Badari Pulavarty
  2006-10-17 23:10             ` Andrew Morton
  2006-10-18 15:44             ` Chris Mason
  1 sibling, 2 replies; 11+ messages in thread
From: Badari Pulavarty @ 2006-10-17 22:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Zach Brown, lkml

Andrew Morton wrote:
> On Mon, 16 Oct 2006 13:38:19 -0700
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
>
>   
>>> So the answer is that -rc1-mm1 doesn't quite have the most recent
>>> version of this patch.  Grab the final patch at the end of this post
>>> from Andrew:
>>>
>>> 	http://lkml.org/lkml/2006/10/11/234
>>>
>>> It fixes up a misunderstanding that came from
>>> generic_file_buffered_write()'s habit of adding its 'written' input into
>>> the amount of bytes it announces having written in its return value.
>>>
>>> From mm-commits it looks like -mm2 will have the full patch.
>>>
>>>       
>> Hmm.. with that patch applied, I still have fsx failures.
>> This time read() returning -EINVAL. Are there any other fixes
>> missing in -mm ?
>>     
>
> Probably.  I need to get off butt and prepare rc2-mm1.
>
> The below is the full patch against 2.6.19-rc2.  Please test this version.
>
>
> From: Jeff Moyer <jmoyer@redhat.com>
>
> When direct-io falls back to buffered write, it will just leave the dirty data
> floating about in pagecache, pending regular writeback.
>
> But normal direct-io semantics are that IO is synchronous, and that it leaves
> no pagecache behind.
>
> So change the fallback-to-buffered-write code to sync the file region and to
> then strip away the pagecache, just as a regular direct-io write would do.
>
>   

Okay. Finally tracked down the problem I am running into.
This happens only on reiserfs

# /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
jnk
mapped writes DISABLED
truncating to largest ever: 0x32740
truncating to largest ever: 0x39212
truncating to largest ever: 0x3bae9
truncating to largest ever: 0x3c1e3
truncating to largest ever: 0x3d1cd
truncating to largest ever: 0x3e8b8
truncating to largest ever: 0x3ed14
truncating to largest ever: 0x3f9c2
truncating to largest ever: 0x3ff9f
doread: read: Invalid argument
Segmentation fault

Here is the strace for it
..
ftruncate(3, 2721)                      = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
lseek(3, 0, SEEK_END)                   = 2721
fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
lseek(3, 0, SEEK_END)                   = 2721
fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
lseek(3, 0, SEEK_END)                   = 2721
lseek(3, 0, SEEK_SET)                   = 0
read(3, 0x50a800, 2048)                 = -1 EINVAL (Invalid argument)

reiserfs getblock() is returning -EINVAL. There is a comment in the code
about tail handling and returning EINVAL. BTW, this is not a -mm
issue, it happens on mainline too...

Thanks,
Badari



* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-17 22:31           ` Badari Pulavarty
@ 2006-10-17 23:10             ` Andrew Morton
  2006-10-17 23:38               ` Badari Pulavarty
  2006-10-18 15:44             ` Chris Mason
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2006-10-17 23:10 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Zach Brown, lkml

On Tue, 17 Oct 2006 15:31:49 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> Andrew Morton wrote:
> > On Mon, 16 Oct 2006 13:38:19 -0700
> > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> >
> >   
> >>> So the answer is that -rc1-mm1 doesn't quite have the most recent
> >>> version of this patch.  Grab the final patch at the end of this post
> >>> from Andrew:
> >>>
> >>> 	http://lkml.org/lkml/2006/10/11/234
> >>>
> >>> It fixes up a misunderstanding that came from
> >>> generic_file_buffered_write()'s habit of adding its 'written' input into
> >>> the amount of bytes it announces having written in its return value.
> >>>
> >>> From mm-commits it looks like -mm2 will have the full patch.
> >>>
> >>>       
> >> Hmm.. with that patch applied, I still have fsx failures.
> >> This time read() returning -EINVAL. Are there any other fixes
> >> missing in -mm ?
> >>     
> >
> > Probably.  I need to get off butt and prepare rc2-mm1.
> >
> > The below is the full patch against 2.6.19-rc2.  Please test this version.
> >
> >
> > From: Jeff Moyer <jmoyer@redhat.com>
> >
> > When direct-io falls back to buffered write, it will just leave the dirty data
> > floating about in pagecache, pending regular writeback.
> >
> > But normal direct-io semantics are that IO is synchronous, and that it leaves
> > no pagecache behind.
> >
> > So change the fallback-to-buffered-write code to sync the file region and to
> > then strip away the pagecache, just as a regular direct-io write would do.
> >
> >   
> 
> Okay. Finally tracked down the problem I am running into.
> This happens only on reiserfs
> 
> # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
> jnk
> mapped writes DISABLED
> truncating to largest ever: 0x32740
> truncating to largest ever: 0x39212
> truncating to largest ever: 0x3bae9
> truncating to largest ever: 0x3c1e3
> truncating to largest ever: 0x3d1cd
> truncating to largest ever: 0x3e8b8
> truncating to largest ever: 0x3ed14
> truncating to largest ever: 0x3f9c2
> truncating to largest ever: 0x3ff9f
> doread: read: Invalid argument
> Segmentation fault
> 
> Here is the strace for it
> ..
> ftruncate(3, 2721)                      = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
> lseek(3, 0, SEEK_END)                   = 2721
> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
> lseek(3, 0, SEEK_END)                   = 2721
> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
> lseek(3, 0, SEEK_END)                   = 2721
> lseek(3, 0, SEEK_SET)                   = 0
> read(3, 0x50a800, 2048)                 = -1 EINVAL (Invalid argument)
> 
> reiserfs getblock() is returning -EINVAL. There is a comment in the code
> about tail handling and returning EINVAL. BTW, this is not a -mm
> issue, it happens on mainline too...
> 

Does it fail in mainline, or only in
mainline+direct-io-sync-and-invalidate-file-region-when-falling-back-to-buffered-write.patch?



* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-17 23:10             ` Andrew Morton
@ 2006-10-17 23:38               ` Badari Pulavarty
  0 siblings, 0 replies; 11+ messages in thread
From: Badari Pulavarty @ 2006-10-17 23:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Zach Brown, lkml

Andrew Morton wrote:
> On Tue, 17 Oct 2006 15:31:49 -0700
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
>
>   
>> Andrew Morton wrote:
>>     
>>> On Mon, 16 Oct 2006 13:38:19 -0700
>>> Badari Pulavarty <pbadari@us.ibm.com> wrote:
>>>
>>>   
>>>       
>>>>> So the answer is that -rc1-mm1 doesn't quite have the most recent
>>>>> version of this patch.  Grab the final patch at the end of this post
>>>>> from Andrew:
>>>>>
>>>>> 	http://lkml.org/lkml/2006/10/11/234
>>>>>
>>>>> It fixes up a misunderstanding that came from
>>>>> generic_file_buffered_write()'s habit of adding its 'written' input into
>>>>> the amount of bytes it announces having written in its return value.
>>>>>
>>>>> From mm-commits it looks like -mm2 will have the full patch.
>>>>>
>>>>>       
>>>>>           
>>>> Hmm.. with that patch applied, I still have fsx failures.
>>>> This time read() returning -EINVAL. Are there any other fixes
>>>> missing in -mm ?
>>>>     
>>>>         
>>> Probably.  I need to get off butt and prepare rc2-mm1.
>>>
>>> The below is the full patch against 2.6.19-rc2.  Please test this version.
>>>
>>>
>>> From: Jeff Moyer <jmoyer@redhat.com>
>>>
>>> When direct-io falls back to buffered write, it will just leave the dirty data
>>> floating about in pagecache, pending regular writeback.
>>>
>>> But normal direct-io semantics are that IO is synchronous, and that it leaves
>>> no pagecache behind.
>>>
>>> So change the fallback-to-buffered-write code to sync the file region and to
>>> then strip away the pagecache, just as a regular direct-io write would do.
>>>
>>>   
>>>       
>> Okay. Finally tracked down the problem I am running into.
>> This happens only on reiserfs
>>
>> # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
>> jnk
>> mapped writes DISABLED
>> truncating to largest ever: 0x32740
>> truncating to largest ever: 0x39212
>> truncating to largest ever: 0x3bae9
>> truncating to largest ever: 0x3c1e3
>> truncating to largest ever: 0x3d1cd
>> truncating to largest ever: 0x3e8b8
>> truncating to largest ever: 0x3ed14
>> truncating to largest ever: 0x3f9c2
>> truncating to largest ever: 0x3ff9f
>> doread: read: Invalid argument
>> Segmentation fault
>>
>> Here is the strace for it
>> ..
>> ftruncate(3, 2721)                      = 0
>> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
>> lseek(3, 0, SEEK_END)                   = 2721
>> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
>> lseek(3, 0, SEEK_END)                   = 2721
>> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
>> lseek(3, 0, SEEK_END)                   = 2721
>> lseek(3, 0, SEEK_SET)                   = 0
>> read(3, 0x50a800, 2048)                 = -1 EINVAL (Invalid argument)
>>
>> reiserfs getblock() is returning -EINVAL. There is a comment in the code
>> about tail handling and returning EINVAL. BTW, this is not a -mm
>> issue, it happens on mainline too...
>>
>>     
>
> Does it fail in mainline, or only in
> mainline+direct-io-sync-and-invalidate-file-region-when-falling-back-to-buffered-write.patch?
>
>   
It fails on mainline (2.6.19-rc1). I will double check my tree just in case..

Thanks,
Badari



* Re: AIO, DIO fsx tests failures on 2.6.19-rc1-mm1
  2006-10-17 22:31           ` Badari Pulavarty
  2006-10-17 23:10             ` Andrew Morton
@ 2006-10-18 15:44             ` Chris Mason
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Mason @ 2006-10-18 15:44 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Andrew Morton, Zach Brown, lkml

On Tue, Oct 17, 2006 at 03:31:49PM -0700, Badari Pulavarty wrote:
> Okay. Finally tracked down the problem I am running into.
> This happens only on reiserfs
> 
> # /root/fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W
> jnk
> mapped writes DISABLED
> doread: read: Invalid argument
> Segmentation fault
> 
> Here is the strace for it
> ..
> ftruncate(3, 2721)                      = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
> lseek(3, 0, SEEK_END)                   = 2721
> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
> lseek(3, 0, SEEK_END)                   = 2721
> fstat(3, {st_mode=S_IFREG|0644, st_size=2721, ...}) = 0
> lseek(3, 0, SEEK_END)                   = 2721
> lseek(3, 0, SEEK_SET)                   = 0
> read(3, 0x50a800, 2048)                 = -1 EINVAL (Invalid argument)
> 
> reiserfs getblock() is returning -EINVAL. There is a comment in the code
> about tail handling and returning EINVAL. BTW, this is not a -mm
> issue, it happens on mainline too...

Yes, reiserfs doesn't allow O_DIRECT on tails.  You'll have to mount -o
notail for this test.
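
A small reproducer along the lines of the strace above, as a hedged sketch:
it writes a 2721-byte file through the page cache, reopens it with O_DIRECT
and tries a 2048-byte read. direct_read_small_file() is a hypothetical
helper, the 4096-byte buffer alignment is an assumption, and this has not
been run on reiserfs; whether the file actually ends up stored as a tail
depends on the filesystem and mount options.

```c
#define _GNU_SOURCE		/* for O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns the number of bytes read, or a negative errno value. */
long direct_read_small_file(const char *path, size_t file_size,
			    size_t read_len)
{
	void *buf = NULL;
	char *data;
	ssize_t n;
	long ret;
	int fd, err;

	assert(file_size > 0 && read_len > 0);

	/* Create the file with ordinary buffered I/O. */
	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0)
		return -errno;
	data = malloc(file_size);
	if (!data) {
		close(fd);
		return -ENOMEM;
	}
	memset(data, 'x', file_size);
	n = write(fd, data, file_size);
	ret = (n < 0) ? -errno : 0;
	free(data);
	if (ret == 0 && (size_t)n != file_size)
		ret = -EIO;
	if (ret == 0 && fsync(fd) < 0)
		ret = -errno;
	close(fd);
	if (ret < 0)
		return ret;

	/* Reopen with O_DIRECT and read into an aligned buffer. */
	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd < 0)
		return -errno;
	err = posix_memalign(&buf, 4096, read_len);
	if (err) {
		close(fd);
		return -err;
	}
	n = read(fd, buf, read_len);
	ret = (n < 0) ? -errno : (long)n;
	free(buf);
	close(fd);
	return ret;
}
```

Called as direct_read_small_file(path, 2721, 2048) on a reiserfs mount
without -o notail, the expectation from this thread is -EINVAL from the
read; elsewhere it may return 2048, or fail at open() if the filesystem
does not support O_DIRECT.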

-chris
