public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Suparna Bhattacharya <suparna@in.ibm.com>
To: linux-aio@kvack.org, linux-kernel@vger.kernel.org
Cc: linux-osdl@osdl.org
Subject: Re: [PATCH 20/22] AIO poll
Date: Fri, 2 Jul 2004 22:09:46 +0530	[thread overview]
Message-ID: <20040702163946.GJ3450@in.ibm.com> (raw)
In-Reply-To: <20040702130030.GA4256@in.ibm.com>

On Fri, Jul 02, 2004 at 06:30:30PM +0530, Suparna Bhattacharya wrote:
> The patchset contains modifications and fixes to the AIO core
> to support the full retry model, an implementation of AIO
> support for buffered filesystem AIO reads and O_SYNC writes
> (the latter courtesy O_SYNC speedup changes from Andrew Morton),
> an implementation of AIO reads and writes to pipes (from
> Chris Mason) and AIO poll (again from Chris Mason).
> 
> Full retry infrastructure and fixes
> [1] aio-retry.patch
> [2] 4g4g-aio-hang-fix.patch
> [3] aio-retry-elevated-refcount.patch
> [4] aio-splice-runlist.patch
> 
> FS AIO read
> [5] aio-wait-page.patch
> [6] aio-fs_read.patch
> [7] aio-upfront-readahead.patch
> 
> AIO for pipes
> [8] aio-cancel-fix.patch
> [9] aio-read-immediate.patch
> [10] aio-pipe.patch
> [11] aio-context-switch.patch
> 
> Concurrent O_SYNC write speedups using radix-tree walks
> [12] writepages-range.patch
> [13] fix-writeback-range.patch
> [14] fix-writepages-range.patch
> [15] fdatawrite-range.patch
> [16] O_SYNC-speedup.patch
> 
> AIO O_SYNC write
> [17] aio-wait_on_page_writeback_range.patch
> [18] aio-O_SYNC.patch
> [19] O_SYNC-write-fix.patch
> 
> AIO poll
> [20] aio-poll.patch
> 

Todo: fixup to merge with recent kiocb->private changes

-- 
Suparna Bhattacharya (suparna@in.ibm.com)
Linux Technology Center
IBM Software Lab, India
---------------------------------------------------

From: Chris Mason

AIO poll support
 
I've attached the patch, it's not quite as obvious as the pipe code, but
not too bad.  I'm not sure if I'm using struct kiocb->private the way it
was intended, but I don't see any other code touching it, so it should
be ok.

On Mon 2004-02-23 at 14:05, Suparna Bhattacharya wrote:
> I was wondering if a particular fop->poll routine, could possibly
> invoke __pollwait for more than one wait queue (I don't know if such
> a case even exists). That kind of a thing would work OK with the existing
> poll logic, but not in our case, because we'd end up queueing the same
> wait queue on two queues which would be a problem.

Oh, I see what you mean.  I looked at a few of the poll_wait callers,
and it seems safe, but there are too many for a full audit right now.
The attached patch fixes the page allocation problem and adds a check to
make sure we don't abuse current->io_wait.  An oops is better than
random corruption at least.
                                                                                
I ran it through my basic test and pipetest, the pipetest results are
below.  The pipetest epoll usage needs updating, so I can only compare
against regular poll.
                                                                                
./pipetest --aio-poll 10000 1 5
using 10000 pipe pairs, 1 message threads, 5 generations, 12 bufsize
Ok! Mode aio-poll: 5 passes in 0.000073 seconds
passes_per_sec: 68493.15
coffee:/usr/src/aio # ./pipetest 10000 1 5
using 10000 pipe pairs, 1 message threads, 5 generations, 12 bufsize
Ok! Mode poll: 5 passes in 0.083066 seconds
passes_per_sec: 60.19

Here are some optimizations.  aio-poll-3 avoids wake_up when it can use
finish_wait instead, and adds a fast path to aio-poll for when data is
already available.
                                                                                

 fs/aio.c                |   17 +++++++
 fs/select.c             |  104 +++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/aio.h     |    1 
 include/linux/aio_abi.h |    2 
 4 files changed, 122 insertions(+), 2 deletions(-)

--- aio/fs/aio.c	2004-06-21 10:58:18.206817768 -0700
+++ aio-poll/fs/aio.c	2004-06-21 10:58:39.700550224 -0700
@@ -1361,6 +1361,16 @@ static ssize_t aio_fsync(struct kiocb *i
 }
 
 /*
+ * Retry method for aio_poll (also used for first time submit)
+ * Responsible for updating iocb state as retries progress
+ */
+static ssize_t aio_poll(struct kiocb *iocb)
+{
+	unsigned events = (unsigned)(iocb->ki_buf);
+	return generic_aio_poll(iocb, events);
+}
+
+/*
  * aio_setup_iocb:
  *	Performs the initial checks and aio retry method
  *	setup for the kiocb at the time of io submission.
@@ -1405,6 +1415,13 @@ ssize_t aio_setup_iocb(struct kiocb *kio
 		if (file->f_op->aio_fsync)
 			kiocb->ki_retry = aio_fsync;
 		break;
+	case IOCB_CMD_POLL:
+		ret = -EINVAL;
+		if (file->f_op->poll) {
+			memset(kiocb->private, 0, sizeof(kiocb->private));
+			kiocb->ki_retry = aio_poll;
+		}
+		break;
 	default:
 		dprintk("EINVAL: io_submit: no operation provided\n");
 		ret = -EINVAL;
--- aio/fs/select.c	2004-06-15 22:18:54.000000000 -0700
+++ aio-poll/fs/select.c	2004-06-21 10:58:39.702549920 -0700
@@ -21,6 +21,7 @@
 #include <linux/personality.h> /* for STICKY_TIMEOUTS */
 #include <linux/file.h>
 #include <linux/fs.h>
+#include <linux/aio.h>
 
 #include <asm/uaccess.h>
 
@@ -39,6 +40,12 @@ struct poll_table_page {
 	struct poll_table_entry entries[0];
 };
 
+struct aio_poll_table {
+	int init;
+	struct poll_wqueues wq;
+	struct poll_table_page table;
+};
+
 #define POLL_TABLE_FULL(table) \
 	((unsigned long)((table)->entry+1) > PAGE_SIZE + (unsigned long)(table))
 
@@ -109,12 +116,34 @@ void __pollwait(struct file *filp, wait_
 	/* Add a new entry */
 	{
 		struct poll_table_entry * entry = table->entry;
+		wait_queue_t *wait;
+		wait_queue_t *aio_wait = current->io_wait;
+
+		if (aio_wait) {
+			/* for aio, there can only be one wait_address.
+			 * we might be adding it again via a retry call
+			 * if so, just return.
+			 * if not, bad things are happening
+			 */
+			if (table->entry != table->entries) {
+				if (table->entries[0].wait_address != wait_address)
+					BUG();
+				return;
+			}
+		}
+
 		table->entry = entry+1;
 	 	get_file(filp);
 	 	entry->filp = filp;
 		entry->wait_address = wait_address;
 		init_waitqueue_entry(&entry->wait, current);
-		add_wait_queue(wait_address,&entry->wait);
+
+		/* if we're in aioland, use current->io_wait */
+		if (aio_wait)
+			wait = aio_wait;
+		else
+			wait = &entry->wait;
+		add_wait_queue(wait_address,wait);
 	}
 }
 
@@ -533,3 +562,76 @@ out_fds:
 	poll_freewait(&table);
 	return err;
 }
+
+static void aio_poll_freewait(struct aio_poll_table *ap, struct kiocb *iocb)
+{
+	struct poll_table_page * p = ap->wq.table;
+	if (p) {
+		struct poll_table_entry * entry = p->entry;
+		if (entry > p->entries) {
+			/*
+			 * there is only one entry for aio polls
+			 */
+			entry = p->entries;
+			if (iocb)
+				finish_wait(entry->wait_address,&iocb->ki_wait);
+			else
+				wake_up(entry->wait_address);
+			fput(entry->filp);
+		}
+	}
+	ap->init = 0;
+}
+
+static int
+aio_poll_cancel(struct kiocb *iocb, struct io_event *evt)
+{
+	struct aio_poll_table *aio_table;
+	aio_table = (struct aio_poll_table *)iocb->private;
+	
+	evt->obj = (u64)(unsigned long)iocb-> ki_obj.user;
+	evt->data = iocb->ki_user_data;
+	evt->res = iocb->ki_nbytes - iocb->ki_left;
+	if (evt->res == 0)
+	        evt->res = -EINTR;
+	evt->res2 = 0;
+	if (aio_table->init)
+		aio_poll_freewait(aio_table, NULL);
+	aio_put_req(iocb);
+	return 0;
+}
+
+ssize_t generic_aio_poll(struct kiocb *iocb, unsigned events)
+{
+	struct aio_poll_table *aio_table;
+	unsigned mask;
+	struct file *file = iocb->ki_filp;
+	aio_table = (struct aio_poll_table *)iocb->private;
+
+	/* fast path */
+	mask = file->f_op->poll(file, NULL);
+	mask &= events | POLLERR | POLLHUP;
+	if (mask)
+		return mask;
+
+	if ((sizeof(*aio_table) + sizeof(struct poll_table_entry)) >
+	    sizeof(iocb->private))
+		BUG();
+
+	if (!aio_table->init) {
+		aio_table->init = 1;
+		poll_initwait(&aio_table->wq);
+		aio_table->wq.table = &aio_table->table;
+		aio_table->table.next = NULL;
+		aio_table->table.entry = aio_table->table.entries;
+	}
+	iocb->ki_cancel = aio_poll_cancel;
+
+	mask = file->f_op->poll(file, &aio_table->wq.pt);
+	mask &= events | POLLERR | POLLHUP;
+	if (mask) {
+		aio_poll_freewait(aio_table, iocb);
+		return mask;
+	}
+	return -EIOCBRETRY;
+}
--- aio/include/linux/aio.h	2004-06-21 10:58:18.206817768 -0700
+++ aio-poll/include/linux/aio.h	2004-06-21 10:58:39.702549920 -0700
@@ -201,4 +201,5 @@ static inline struct kiocb *list_kiocb(s
 extern atomic_t aio_nr;
 extern unsigned aio_max_nr;
 
+extern ssize_t generic_aio_poll(struct kiocb *, unsigned);
 #endif /* __LINUX__AIO_H */
--- aio/include/linux/aio_abi.h	2004-06-15 22:18:58.000000000 -0700
+++ aio-poll/include/linux/aio_abi.h	2004-06-21 10:58:39.703549768 -0700
@@ -38,8 +38,8 @@ enum {
 	IOCB_CMD_FDSYNC = 3,
 	/* These two are experimental.
 	 * IOCB_CMD_PREADX = 4,
-	 * IOCB_CMD_POLL = 5,
 	 */
+	IOCB_CMD_POLL = 5,
 	IOCB_CMD_NOOP = 6,
 };
 

  parent reply	other threads:[~2004-07-02 16:33 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-07-02 13:00 [PATCH 0/22] fsaio, pipe aio and aio poll upgraded to 2.6.7 Suparna Bhattacharya
2004-07-02 13:07 ` [PATCH 1/22] High-level AIO retry infrastructure and fixes Suparna Bhattacharya
2004-07-02 13:11 ` [PATCH 2/22] use_mm fix (helps AIO hangs on 4:4 split) Suparna Bhattacharya
2004-07-02 13:14 ` [PATCH 3/22] Refcounting fixes Suparna Bhattacharya
2004-07-02 13:15 ` [PATCH 4/22] Splice ioctx runlist for fairness Suparna Bhattacharya
2004-07-02 13:16 ` [PATCH 5/22] AIO wait on page support Suparna Bhattacharya
2004-07-02 13:18 ` [PATCH 6/22] FS AIO read Suparna Bhattacharya
2004-07-02 13:19 ` [PATCH 7/22] Upfront readahead to help streaming AIO reads Suparna Bhattacharya
2004-07-02 13:20 ` [PATCH 8/22] AIO cancellation fix Suparna Bhattacharya
2004-07-02 13:23 ` [PATCH 9/22] AIO immediate read (needed for AIO pipes & sockets) Suparna Bhattacharya
2004-07-02 13:23 ` [PATCH 10/22] AIO pipe support Suparna Bhattacharya
2004-07-02 13:26 ` [PATCH 11/22] Reduce AIO worker context switches Suparna Bhattacharya
2004-07-02 16:05 ` [PATCH 12/22] Writeback page range hint Suparna Bhattacharya
2004-07-02 16:18 ` [PATCH 13/22] Fix writeback page range to use exact limits Suparna Bhattacharya
2004-07-02 16:22 ` [PATCH 14/22] mpage writepages range limit fix Suparna Bhattacharya
2004-07-02 16:25 ` [PATCH 15/22] filemap_fdatawrite range interface Suparna Bhattacharya
2004-07-02 16:27 ` [PATCH 16/22] Concurrent O_SYNC write support Suparna Bhattacharya
2004-07-02 16:31 ` [PATCH 17/22] AIO wait on writeback Suparna Bhattacharya
2004-07-02 16:33 ` [PATCH 18/22] AIO O_SYNC write Suparna Bhattacharya
2004-07-02 16:34 ` [PATCH 19/22] Fix math error in AIO wait on writeback Suparna Bhattacharya
2004-07-02 16:39 ` Suparna Bhattacharya [this message]
2004-07-29 15:19   ` [PATCH 20/22] AIO poll Jeff Moyer
2004-07-29 16:02     ` Avi Kivity
2004-07-29 16:16       ` Arjan van de Ven
2004-07-29 16:37         ` Benjamin LaHaise
2004-07-29 17:23         ` William Lee Irwin III
2004-07-29 17:10       ` William Lee Irwin III
2004-07-29 17:24         ` Avi Kivity
2004-07-29 17:26           ` William Lee Irwin III
2004-07-29 17:30             ` Avi Kivity
2004-07-29 17:32               ` William Lee Irwin III
2004-07-02 16:42 ` [PATCH 21/22] fix: flush workqueue on put_ioctx Suparna Bhattacharya
2004-07-02 16:44 ` [PATCH 22/22] Fix stalls with the AIO context switch patch Suparna Bhattacharya
2004-07-05  9:24 ` [PATCH 0/22] fsaio, pipe aio and aio poll upgraded to 2.6.7 Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040702163946.GJ3450@in.ibm.com \
    --to=suparna@in.ibm.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-osdl@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox