From: Kent Overstreet <koverstreet@google.com>
To: Jens Axboe <jaxboe@fusionio.com>
Cc: Jack Wang <jack.wang.usish@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-aio@kvack.org" <linux-aio@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"zab@redhat.com" <zab@redhat.com>,
	"bcrl@kvack.org" <bcrl@kvack.org>,
	"jmoyer@redhat.com" <jmoyer@redhat.com>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 00/26] AIO performance improvements/cleanups, v2
Date: Sat, 15 Dec 2012 01:25:26 -0800
Message-ID: <20121215092526.GA10411@moria.home.lan>
In-Reply-To: <50CAD6D9.5070703@fusionio.com>

On Fri, Dec 14, 2012 at 08:35:53AM +0100, Jens Axboe wrote:
> On 2012-12-14 03:26, Jack Wang wrote:
> > 2012/12/14 Jens Axboe <jaxboe@fusionio.com>:
> >> On Mon, Dec 03 2012, Kent Overstreet wrote:
> >>> Last posting: http://thread.gmane.org/gmane.linux.kernel.aio.general/3169
> >>>
> >>> Changes since the last posting should all be noted in the individual
> >>> patch descriptions.
> >>>
> >>>  * Zach pointed out the aio_read_evt() patch was calling functions that
> >>>    could sleep in TASK_INTERRUPTIBLE state, that patch is rewritten.
> >>>  * Ben pointed out some synchronize_rcu() usage was problematic,
> >>>    converted it to call_rcu()
> >>>  * The flush_dcache_page() patch is new
> >>>  * Changed the "use cancellation list lazily" patch so as to remove
> >>>    ki_flags from struct kiocb.
> >>
> >> Kent, I ran a few tests, and the below patches still don't seem as fast
> >> as the approach I took. To keep it fair, I used your aio branch and
> >> applied my dio speedups too. As a sanity check, I ran with your branch
> >> alone as well. The quick results below - kaio is kent-aio, just your
> >> branch. kaio-dio is with the direct IO speedups too. jaio is my branch,
> >> which already has the dio changes too.
> >>
> >> Devices         Branch          IOPS
> >> 1               kaio            ~915K
> >> 1               kaio-dio        ~930K
> >> 1               jaio           ~1220K
> >> 6               kaio           ~3050K
> >> 6               kaio-dio       ~3080K
> >> 6               jaio           ~3500K
> >>
> >> The box runs out of CPU driving power, which is why it doesn't scale
> >> linearly, otherwise I know that jaio at least does. It's basically
> >> completion limited for the 6 device test at the moment.
> >>
> >> I'll run some profiling tomorrow morning and get you some better
> >> results. Just thought I'd share these at least.
> >>
> >> --
> >> Jens Axboe
> >>
> > 
> > Really good performance, woo.
> > 
> > I think the device tested is a really fast PCIe SSD built by fusionio
> > with fusionio's in-house block driver?
> 
> It is pci-e flash storage, but it is not fusion-io.
> 
> > Any comparison numbers with current mainline?
> 
> Sure, I should have included that. Here's the table again, this time
> with mainline as well.
> 
> Devices         Branch          IOPS
> 1               mainline        ~870K
> 1               kaio            ~915K
> 1               kaio-dio        ~930K
> 1               jaio           ~1220K
> 6               kaio           ~3050K
> 6               kaio-dio       ~3080K
> 6               jaio           ~3500K
> 6               mainline       ~2850K

Cool, thanks for the numbers!
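
For reference, the relative gains over mainline implied by the table
above can be worked out with a few lines of C. This is only an
illustrative sketch: the IOPS figures are copied from the table, while
pct_gain(), one_dev and six_dev are names made up here and nothing in
this thread was measured with this code.

```c
#include <assert.h>

/*
 * Percentage gain of `value` over `base`. Both arguments are IOPS in
 * thousands, as in the table. Hypothetical helper, for illustration only.
 */
double pct_gain(double base, double value)
{
	assert(base > 0.0);
	return (value - base) / base * 100.0;
}

/* Rows from the table, in order: mainline, kaio, kaio-dio, jaio (K IOPS). */
const double one_dev[4] = { 870.0, 915.0, 930.0, 1220.0 };
const double six_dev[4] = { 2850.0, 3050.0, 3080.0, 3500.0 };
```

Calling pct_gain(one_dev[0], one_dev[3]) gives roughly 40 for jaio on
one device, and pct_gain(one_dev[0], one_dev[1]) roughly 5 for kaio. On
six devices the same calls give roughly 23 and 7.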

I suspect the difference is due to contention on the completion side
of the ringbuffer. You didn't enable my batched completion stuff, did you?

I suspect the numbers would look quite a bit different with that,
based on my own profiling. If the driver for the device you're testing
on is open source, I'd be happy to do the conversion (it's a 5 minute
job).

Also, I don't think our approaches really conflict - it's been a while
since I looked at your patch, but you're getting rid of the aio
ringbuffer and using a linked list instead, right? My batched completion
stuff should still benefit that case.
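
To make the batching idea concrete, here is a minimal userspace sketch.
It is not the code from the patch series: every name in it (ring_sketch,
io_event_sketch, complete_one, complete_batch) is made up for
illustration, and a C11 atomic_flag spinlock stands in for the
completion lock. The only point it tries to show is that the per-event
path takes the lock once per completed request, while the batched path
takes it once for the whole batch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SLOTS 64

struct io_event_sketch {
	unsigned long data;	/* caller cookie */
	long res;		/* result code */
};

struct ring_sketch {
	atomic_flag lock;		/* stands in for the completion lock */
	size_t count;			/* events currently queued */
	size_t tail;			/* next slot to fill */
	unsigned long lock_taken;	/* number of lock acquisitions so far */
	struct io_event_sketch ev[RING_SLOTS];
};

void ring_init(struct ring_sketch *r)
{
	assert(r != NULL);
	atomic_flag_clear(&r->lock);
	r->count = 0;
	r->tail = 0;
	r->lock_taken = 0;
}

static void ring_lock(struct ring_sketch *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
	r->lock_taken++;	/* counted while holding the lock */
}

static void ring_unlock(struct ring_sketch *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Caller holds the lock. Returns 1 if queued, 0 if the ring is full. */
static int ring_push_locked(struct ring_sketch *r, struct io_event_sketch e)
{
	if (r->count == RING_SLOTS)
		return 0;
	r->ev[r->tail] = e;
	r->tail = (r->tail + 1) % RING_SLOTS;
	r->count++;
	return 1;
}

/* Unbatched: one lock round trip per completed request. */
int complete_one(struct ring_sketch *r, struct io_event_sketch e)
{
	int ok;

	ring_lock(r);
	ok = ring_push_locked(r, e);
	ring_unlock(r);
	return ok;
}

/* Batched: one lock round trip for up to n completed requests. */
size_t complete_batch(struct ring_sketch *r,
		      const struct io_event_sketch *evs, size_t n)
{
	size_t done = 0;

	ring_lock(r);
	while (done < n && ring_push_locked(r, evs[done]))
		done++;
	ring_unlock(r);
	return done;	/* number of events actually queued */
}
```

Completing eight events through complete_one() acquires the lock eight
times; handing the same eight events to complete_batch() acquires it
once. Under contention from several CPUs, that is where I would expect
the saving to come from.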

Though - hrm, I'd have expected getting rid of the cancellation linked
list to make a bigger difference, and both our patchsets do that.

What device are you testing on, and what's your fio script? I may just
have to buy some hardware so I can test this myself.

Thread overview: 73+ messages
2012-12-03 20:58 [PATCH 00/26] AIO performance improvements/cleanups, v2 Kent Overstreet
2012-12-03 20:58 ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 01/26] mm: remove old aio use_mm() comment Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 02/26] aio: remove dead code from aio.h Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 03/26] gadget: remove only user of aio retry Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 04/26] aio: remove retry-based AIO Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-27 10:11   ` Fubo Chen
2012-12-27 10:11     ` Fubo Chen
2012-12-03 20:58 ` [PATCH 05/26] char: add aio_{read,write} to /dev/{null,zero} Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 06/26] aio: Kill return value of aio_complete() Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 07/26] aio: kiocb_cancel() Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 08/26] aio: Move private stuff out of aio.h Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 09/26] aio: dprintk() -> pr_debug() Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 10/26] aio: do fget() after aio_get_req() Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 11/26] aio: Make aio_put_req() lockless Kent Overstreet
2012-12-03 20:58 ` [PATCH 12/26] aio: Refcounting cleanup Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 13/26] aio: Convert read_events() to hrtimers Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 14/26] aio: Make aio_read_evt() more efficient Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 15/26] aio: Use flush_dcache_page() Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 16/26] aio: Use cancellation list lazily Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 17/26] aio: Change reqs_active to include unreaped completions Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 18/26] aio: Kill batch allocation Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 19/26] aio: Kill struct aio_ring_info Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 20/26] aio: Give shared kioctx fields their own cachelines Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 21/26] aio: reqs_active -> reqs_available Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 22/26] aio: percpu reqs_available Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 23/26] Generic dynamic per cpu refcounting Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-27 13:47   ` Fubo Chen
2012-12-27 13:47     ` Fubo Chen
2012-12-03 20:58 ` [PATCH 24/26] aio: Percpu ioctx refcount Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 25/26] aio: use xchg() instead of completion_lock Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-03 20:58 ` [PATCH 26/26] aio: Don't include aio.h in sched.h Kent Overstreet
2012-12-03 20:58   ` Kent Overstreet
2012-12-13 21:18 ` [PATCH 00/26] AIO performance improvements/cleanups, v2 Jens Axboe
2012-12-13 21:18   ` Jens Axboe
2012-12-14  2:26   ` Jack Wang
2012-12-14  2:26     ` Jack Wang
2012-12-14  7:35     ` Jens Axboe
2012-12-15  9:25       ` Kent Overstreet [this message]
2012-12-15  9:25         ` Kent Overstreet
2012-12-15  9:46         ` Jens Axboe
2012-12-15 10:36           ` Kent Overstreet
2012-12-15 10:36             ` Kent Overstreet
2012-12-15 12:59             ` Jens Axboe
2012-12-15 12:59               ` Jens Axboe
2012-12-15 13:16             ` Jens Axboe
2012-12-18 19:16               ` Kent Overstreet
2012-12-18 19:16                 ` Kent Overstreet
2012-12-19  6:45                 ` Kent Overstreet
