public inbox for linux-kernel@vger.kernel.org
* 1st glance at kiobuf overhead in kernel aio vs pread vs user aio
@ 2001-02-02 19:32 bcrl
  2001-02-02 20:18 ` Ingo Molnar
  0 siblings, 1 reply; 7+ messages in thread
From: bcrl @ 2001-02-02 19:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-aio, kiobuf-io-devel

Hey folks,

First off, sorry for spamming all the mailing lists, but I want to make
sure that everyone interested in kiobufs, aio and the like sees this.
Since the mass of discussion going on about kiobufs started, I ran a few
tests of the behaviour of various code when reading from a cached ~700MB
file.  The first thing I've noticed is that I have to slim down the posix
compatibility code for aio =).  In any case, here are some graphs of
interest:

	http://www.kvack.org/~blah/aio_plot5.png
	http://www.kvack.org/~blah/aio_plot5_nouser.png

The graph plots log2(buffersize) against microseconds to read 700MB of
file into the buffer.  The machine used was a 4-way Xeon with 1MB of
cache.  The 1GB data points were taken while running with no highmem
support, and the 4GB points with highmem but no PAE.  Of the graphs, the
second is probably the more interesting, since it removes the userland aio
data points, which squash things quite a bit.

Note that the aio code makes use of map_user_kiobuf for all access to/from
user space and avoids context switches on page cache hits.  Setting up the
data structures is probably responsible for a lot of the base overhead,
especially in glibc; to this end I'll post updated results from using the
aio syscalls directly, as well as after changing the kernel aio read path
to improve cache locality.
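For context, a non-runnable sketch of the 2.4 kiovec calls involved: map_user_kiobuf pins the caller's pages so the kernel can do I/O against them directly, with no copy and no context switch on a cache hit.  The helper name and the elided submission step are illustrative, not taken from the patch.

```c
/* Illustrative kernel-side sketch (2.4-era kiovec API); error handling
 * and the actual I/O submission are elided. */
#include <linux/iobuf.h>

int pin_user_buffer(int rw, unsigned long uaddr, size_t len)
{
	struct kiobuf *iobuf;
	int err;

	err = alloc_kiovec(1, &iobuf);
	if (err)
		return err;

	/* faults in and pins the user pages behind uaddr..uaddr+len */
	err = map_user_kiobuf(rw, iobuf, uaddr, len);
	if (!err) {
		/* ... submit I/O against iobuf->maplist[] here ... */
		unmap_kiobuf(iobuf);
	}
	free_kiovec(1, &iobuf);
	return err;
}
```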

The plateaus visible at 2**18 and 2**20 onward would be the transition
from L2 cache to main memory bandwidth; buffer sizes less than 1 page may
result in a similar picture.  The overhead of kmaps for highmem looks to
be fairly low (~5%), and aio is ~9% at 64K to ~5% at 1MB and larger.  My
goal is to reduce aio's overhead to less than 1%.

If you want to take a peek at the aio code, you can grab it from
http://www.kvack.org/~blah/aio/aio-v2.4.0-20010123.diff .  There are a few
changes still pending, and I'll look into improving the performance with
smaller buffers over the weekend.  I'll try reducing the cache damage done
with the aio code as compared to pread, and isolating the costs of setting
up/tearing down a kiobuf versus reusing one.  To this end, I'm going to
implement aio sendfile and use the kiobuf device idea from Stephen.
Comments/thoughts/patches appreciated...  Cheers,

		-ben

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: 1st glance at kiobuf overhead in kernel aio vs pread vs user aio
  2001-02-02 19:32 1st glance at kiobuf overhead in kernel aio vs pread vs user aio bcrl
@ 2001-02-02 20:18 ` Ingo Molnar
  2001-02-02 20:45   ` [Kiobuf-io-devel] " Benjamin LaHaise
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2001-02-02 20:18 UTC (permalink / raw)
  To: bcrl; +Cc: Linux Kernel List, linux-aio, kiobuf-io-devel, Stephen C. Tweedie


Ben,

- first of all, great patch! I've got a conceptual question: exactly how
does the AIO code prevent filesystem-related scheduling in the issuing
process' context? I'd like to use (and test) your AIO code for TUX, but i
do not see where it's guaranteed that the process that does the aio does
not block - from the patch this is not yet clear to me. (Right now TUX
uses separate 'async IO' kernel threads to avoid this problem.) Or if it's
not yet possible, what are the plans to handle this?

- another conceptual question. async IO doesn't have much use if many files
are used and open() is synchronous (which it is right now). Thus for TUX
i've added ATOMICLOOKUP to the VFS - and 'missed' (i.e. not yet
dentry-cached) VFS lookups are passed to the async IO threads as well. Do
you have any plans to add file-open() as an integral part of the async IO
framework as well?

once these issues are solved (or are they already?), i'd love to drop the
ad-hoc kernel-thread based async IO implementation of TUX and 'use the
real thing'. (which will also probably perform better) [Btw., context
switches are not that much of an issue in kernel-space, due to lazy TLB
switching. So basically in kernel-space the async IO threads are barely
more than a function call.]

	Ingo



* Re: [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernel aio vs pread vs user aio
  2001-02-02 20:18 ` Ingo Molnar
@ 2001-02-02 20:45   ` Benjamin LaHaise
  2001-02-02 23:14     ` Ingo Molnar
  2001-02-02 23:19     ` [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaio " Rajagopal Ananthanarayanan
  0 siblings, 2 replies; 7+ messages in thread
From: Benjamin LaHaise @ 2001-02-02 20:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linux Kernel List, linux-aio, kiobuf-io-devel, Stephen C. Tweedie

Hey Ingo,

On Fri, 2 Feb 2001, Ingo Molnar wrote:

> - first of all, great patch! I've got a conceptual question: exactly how
> does the AIO code prevent filesystem-related scheduling in the issuing
> process' context? I'd like to use (and test) your AIO code for TUX, but i
> do not see where it's guaranteed that the process that does the aio does
> not block - from the patch this is not yet clear to me. (Right now TUX
> uses separate 'async IO' kernel threads to avoid this problem.) Or if it's
> not yet possible, what are the plans to handle this?

Thanks!  Right now the code does the page cache allocations and lookups
in the caller's thread; the write path then attempts to lock all pages
sequentially during io using the async page locking function
wtd_lock_page.  I've tried to keep this close to some of the ideas proposed
by Jeff Merkey, and have implemented async page and buffer locking
mechanisms so far.  The down() in the write path is still synchronous,
mostly because I want some feedback before going much further down this
path.  The read path verifies the uptodate state of individual pages, and
if it encounters one which is not, it queues the request for the worker
thread, which calls readpage on all the pages that need updating.

> - another conceptual question. async IO doesn't have much use if many files
> are used and open() is synchronous (which it is right now). Thus for TUX
> i've added ATOMICLOOKUP to the VFS - and 'missed' (i.e. not yet
> dentry-cached) VFS lookups are passed to the async IO threads as well. Do
> you have any plans to add file-open() as an integral part of the async IO
> framework as well?

I hadn't thought of that, but I don't think it would be too hard to
implement.  We need to decide the degree to which state machine based code
should be used within the kernel.  For many things it can potentially have
a lower overhead than existing kernel code simply because the stack usage
is much flatter.

> once these issues are solved (or are they already?), i'd love to drop the
> ad-hoc kernel-thread based async IO implementation of TUX and 'use the
> real thing'. (which will also probably perform better) [Btw., context
> switches are not that much of an issue in kernel-space, due to lazy TLB
> switching. So basically in kernel-space the async IO threads are barely
> more than a function call.]

;-)  I still want to get the network glue done to merge with the zerocopy
patches as buffer management for things like large LDAP servers isn't
going to work that well with the close-to-posix aio_read.  Of course, this
depends on the aio sendfile code that's coming...

		-ben



* Re: [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernel aio vs pread vs user aio
  2001-02-02 20:45   ` [Kiobuf-io-devel] " Benjamin LaHaise
@ 2001-02-02 23:14     ` Ingo Molnar
  2001-02-02 23:19     ` [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaio " Rajagopal Ananthanarayanan
  1 sibling, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2001-02-02 23:14 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Linux Kernel List, linux-aio, kiobuf-io-devel, Stephen C. Tweedie


On Fri, 2 Feb 2001, Benjamin LaHaise wrote:

> Thanks! Right now the code does the page cache allocations and
> lookups in the caller's thread, [...]

(the killer is not the memory allocation(s); if there is enough RAM, we
can get a free page without having to block.)

The real problem is the implicit ->bmap() in readpage(). IMO this is the
tough part in AIO. There can be zillions of sub-IOs generated during
filesystem ->bmap(). Doing the data reads asynchronously is just about 30%
of the work, and as long as there is even a *single* inode-related
blocking point in the synchronous async IO path, the whole scheme remains
useless for practical applications. It does not matter that 90% of the IO
is asynchronous - if we are blocking on the remaining 10% then the whole
operation degrades to synchronous!

To make this work correctly, lowlevel filesystem code must be modified in
nontrivial ways. If this is easy then please give me a quick description
of how this is going to be done.

Plus there is the issue of not blocking in __wait_request() either.

	Ingo



* Re: [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaio vs  pread vs user aio
  2001-02-02 20:45   ` [Kiobuf-io-devel] " Benjamin LaHaise
  2001-02-02 23:14     ` Ingo Molnar
@ 2001-02-02 23:19     ` Rajagopal Ananthanarayanan
  2001-02-02 23:30       ` Ingo Molnar
  1 sibling, 1 reply; 7+ messages in thread
From: Rajagopal Ananthanarayanan @ 2001-02-02 23:19 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Ingo Molnar, Linux Kernel List, linux-aio, kiobuf-io-devel,
	Stephen C. Tweedie

Benjamin LaHaise wrote:
> 
> Hey Ingo,
> 
> On Fri, 2 Feb 2001, Ingo Molnar wrote:
> 
> > - first of all, great patch! I've got a conceptual question: exactly how
> > does the AIO code prevent filesystem-related scheduling in the issuing
> > process' context? I'd like to use (and test) your AIO code for TUX, but i
> > do not see where it's guaranteed that the process that does the aio does
> > not block - from the patch this is not yet clear to me. (Right now TUX
> > uses separate 'async IO' kernel threads to avoid this problem.) Or if it's
> > not yet possible, what are the plans to handle this?
> 
> Thanks!  Right now the code does the page cache allocations and lookups
> in the caller's thread; the write path then attempts to lock all pages
> sequentially during io using the async page locking function
> wtd_lock_page.  I've tried to keep this close to some of the ideas proposed
> by Jeff Merkey, and have implemented async page and buffer locking
> mechanisms so far.  The down() in the write path is still synchronous,
> mostly because I want some feedback before going much further down this
> path.  The read path verifies the uptodate state of individual pages, and
> if it encounters one which is not, it queues the request for the worker
> thread, which calls readpage on all the pages that need updating.

[ Ben, good to see you have a patch to send - something I've been requesting
  from you for some time now ;-) ]

Do you really have worker threads? In my reading of the patch it seems
that the wtd is serviced by keventd. And by using mapped kiobufs you've
avoided issues such as:

	a. (not) requiring the requestor's process context to perform the
	   copy (copy-out on read, for example)
	b. preventing the requestor's (user) pages from being unmapped
	   while __iodesc_read_finish is executing.

These are two major improvements I'm glad to see over my earlier KAIO patch
(obURL: http://oss.sgi.com/projects/kaio/) ... of course, several abstractions,
including kiobufs & more generic task queues in 2.4 have made this easier,
which is a good thing.

I see several similarities to the KAIO patch too, such as splitting the
generic read routine (which you have now expanded to include the write
routine also), and the handling of RAW devices.

A nice addition in your patch is the introduction of the kiobuf as a
common container of pages, which in the KAIO patch was handled with an
ad-hoc (page *) vector for the non-RAW case and kiobufs for the RAW case.

One point which is not clear is how one would implement aio_suspend(...),
which waits for any ONE of N aiocb's to complete. The aio_complete(...)
routine in your patch expects a particular idx to wait on, so I assume
that, as is, only one aiocb can be waited upon. Am I correct? This
particular case is solved in the KAIO patch ...

Can you also put out a library that goes with the kernel patch?
I can imagine what it would look like, but ...

Cheers,

ananth.

--------------------------------------------------------------------------
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
--------------------------------------------------------------------------


* Re: [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaio vs  pread vs user aio
  2001-02-02 23:19     ` [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaio " Rajagopal Ananthanarayanan
@ 2001-02-02 23:30       ` Ingo Molnar
  2001-02-03  0:37         ` [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaiovs " Rajagopal Ananthanarayanan
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2001-02-02 23:30 UTC (permalink / raw)
  To: Rajagopal Ananthanarayanan
  Cc: Benjamin LaHaise, Linux Kernel List, linux-aio, kiobuf-io-devel,
	Stephen C. Tweedie


On Fri, 2 Feb 2001, Rajagopal Ananthanarayanan wrote:

> Do you really have worker threads? In my reading of the patch it seems
> that the wtd is serviced by keventd. [...]

i think worker threads (or any 'helper' threads) should be avoided. It can
be done without any extra process context, and it should be done that way.
Why all the trouble with async IO requests if requests are going to end up
in a worker thread's context anyway? (which will be a serializing point,
otherwise why does it end up there?)

	Ingo



* Re: [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaiovs   pread vs user aio
  2001-02-02 23:30       ` Ingo Molnar
@ 2001-02-03  0:37         ` Rajagopal Ananthanarayanan
  0 siblings, 0 replies; 7+ messages in thread
From: Rajagopal Ananthanarayanan @ 2001-02-03  0:37 UTC (permalink / raw)
  To: mingo
  Cc: Benjamin LaHaise, Linux Kernel List, linux-aio, kiobuf-io-devel,
	Stephen C. Tweedie

Ingo Molnar wrote:
> 
> On Fri, 2 Feb 2001, Rajagopal Ananthanarayanan wrote:
> 
> > Do you really have worker threads? In my reading of the patch it seems
> > that the wtd is serviced by keventd. [...]
> 
> i think worker threads (or any 'helper' threads) should be avoided. It can
> be done without any extra process context, and it should be done that way.
> Why all the trouble with async IO requests if requests are going to end up
> in a worker thread's context anyway? (which will be a serializing point,
> otherwise why does it end up there?)
> 

Good point. Can you expand on how you plan to service pending
chunks of work (e.g. issuing readpage() on some pages) without
the use of threads?

thanks,


--------------------------------------------------------------------------
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
--------------------------------------------------------------------------


end of thread, other threads:[~2001-02-03  0:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-02-02 19:32 1st glance at kiobuf overhead in kernel aio vs pread vs user aio bcrl
2001-02-02 20:18 ` Ingo Molnar
2001-02-02 20:45   ` [Kiobuf-io-devel] " Benjamin LaHaise
2001-02-02 23:14     ` Ingo Molnar
2001-02-02 23:19     ` [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaio " Rajagopal Ananthanarayanan
2001-02-02 23:30       ` Ingo Molnar
2001-02-03  0:37         ` [Kiobuf-io-devel] Re: 1st glance at kiobuf overhead in kernelaiovs " Rajagopal Ananthanarayanan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox