From: Suparna Bhattacharya <suparna@in.ibm.com>
To: Daniel McNeil <daniel@osdl.org>
Cc: Andrew Morton <akpm@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, "linux-aio@kvack.org" <linux-aio@kvack.org>
Subject: Re: 2.6.0-test9-mm3 - AIO test results
Date: Mon, 17 Nov 2003 10:55:18 +0530
Message-ID: <20031117052518.GA11184@in.ibm.com>
In-Reply-To: <1068761038.1805.35.camel@ibm-c.pdx.osdl.net>

On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> Andrew,
> 
> I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> I tested using the test programs aiocp and aiodio_sparse.
> (see http://developer.osdl.org/daniel/AIO/)
> 
> Using aiocp with i/o sizes from 1k to 512k to copy files worked
> without any errors or kernel debug messages.
> 
> With 64k i/o, the aiodio_sparse program completes without any errors.
> There are no kernel error messages, so that is good.
> 
> There are still problems with non power of 2 i/o sizes using AIO and
> O_DIRECT.  It hangs with aio's that do not seem to complete.  The test
> does exit when hitting ^c and there are no kernel messages.  Test output
> below:

Could you check whether the following patch fixes the problem for you?

Regards
Suparna

--------------------------------------------------------------

With this patch, when the DIO code falls back to buffered i/o after
having submitted part of the i/o, buffered i/o is issued only for
the remaining part of the request (i.e. the part not already
covered by DIO).

diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
--- pure-mm3/fs/direct-io.c	2003-11-14 09:09:06.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c	2003-11-17 09:00:47.000000000 +0530
@@ -74,6 +74,7 @@
 					   been performed at the start of a
 					   write */
 	int pages_in_io;		/* approximate total IO pages */
+	size_t	size;			/* total request size (doesn't change)*/
 	sector_t block_in_file;		/* Current offset into the underlying
 					   file in dio_block units. */
 	unsigned blocks_available;	/* At block_in_file.  changes */
@@ -226,7 +227,7 @@
 			dio_complete(dio, dio->block_in_file << dio->blkbits,
 					dio->result);
 			/* Complete AIO later if falling back to buffered i/o */
-			if (dio->result != -ENOTBLK) {
+			if (dio->result >= dio->size || dio->rw == READ) {
 				aio_complete(dio->iocb, dio->result, 0);
 				kfree(dio);
 			} else {
@@ -889,6 +890,7 @@
 	dio->blkbits = blkbits;
 	dio->blkfactor = inode->i_blkbits - blkbits;
 	dio->start_zero_done = 0;
+	dio->size = 0;
 	dio->block_in_file = offset >> blkbits;
 	dio->blocks_available = 0;
 	dio->cur_page = NULL;
@@ -925,7 +927,7 @@
 
 	for (seg = 0; seg < nr_segs; seg++) {
 		user_addr = (unsigned long)iov[seg].iov_base;
-		bytes = iov[seg].iov_len;
+		dio->size += bytes = iov[seg].iov_len;
 
 		/* Index into the first page of the first block */
 		dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
@@ -956,6 +958,13 @@
 		}
 	} /* end iovec loop */
 
+	if (ret == -ENOTBLK && rw == WRITE) {
+		/*
+		 * The remaining part of the request will be
+		 * handled by buffered I/O when we return
+		 */
+		ret = 0;
+	}
 	/*
 	 * There may be some unwritten disk at the end of a part-written
 	 * fs-block-sized block.  Go zero that now.
@@ -986,19 +995,13 @@
 	 */
 	if (dio->is_async) {
 		if (ret == 0)
-			ret = dio->result;	/* Bytes written */
-		if (ret == -ENOTBLK) {
-			/*
-			 * The request will be reissued via buffered I/O
-			 * when we return; Any I/O already issued
-			 * effectively becomes redundant.
-			 */
-			dio->result = ret;
+			ret = dio->result;
+		if (ret > 0 && dio->result < dio->size && rw == WRITE) {
 			dio->waiter = current;
 		}
 		finished_one_bio(dio);		/* This can free the dio */
 		blk_run_queues();
-		if (ret == -ENOTBLK) {
+		if (dio->waiter) {
 			/*
 			 * Wait for already issued I/O to drain out and
 			 * release its references to user-space pages
@@ -1032,7 +1035,8 @@
 		}
 		dio_complete(dio, offset, ret);
 		/* We could have also come here on an AIO file extend */
-		if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
+		if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 && 
+			dio->result < dio->size))
 			aio_complete(iocb, ret, 0);
 		kfree(dio);
 	}
diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
--- pure-mm3/mm/filemap.c	2003-11-14 09:15:08.000000000 +0530
+++ linux-2.6.0-test9-mm3/mm/filemap.c	2003-11-15 11:11:16.000000000 +0530
@@ -1895,14 +1895,16 @@
 		 */
 		if (written >= 0 && file->f_flags & O_SYNC)
 			status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
-		if (written >= 0 && !is_sync_kiocb(iocb))
+		if (written >= count && !is_sync_kiocb(iocb))
 			written = -EIOCBQUEUED;
-		if (written != -ENOTBLK)
+		if (written < 0 || written >= count)
 			goto out_status;
 		/*
 		 * direct-io write to a hole: fall through to buffered I/O
+		 * for completing the rest of the request.
 		 */
-		written = 0;
+		pos += written;
+		count -= written;
 	}
 
 	buf = iov->iov_base;

