Re: [PATCH v3 next/akpm] aio: convert the ioctx list to radix tree

From: Kent Overstreet <koverstreet@google.com>
To: Octavian Purdila <octavian.purdila@intel.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-aio@kvack.org, linux-s390@vger.kernel.org, bcrl@kvack.org,
	schwidefsky@de.ibm.com, kirill.shutemov@linux.intel.com,
	zab@redhat.com, Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH v3 next/akpm] aio: convert the ioctx list to radix tree
Date: Wed, 12 Jun 2013 11:14:40 -0700
Message-ID: <20130612181440.GC6151@google.com>
In-Reply-To: <1366026055-28604-1-git-send-email-octavian.purdila@intel.com>

On Mon, Apr 15, 2013 at 02:40:55PM +0300, Octavian Purdila wrote:
> When using a large number of threads performing AIO operations the
> IOCTX list may get a significant number of entries which will cause
> significant overhead. For example, when running this fio script:
> 
> rw=randrw; size=256k ;directory=/mnt/fio; ioengine=libaio; iodepth=1
> blocksize=1024; numjobs=512; thread; loops=100
> 
> on an EXT2 filesystem mounted on top of a ramdisk we can observe up to
> 30% CPU time spent by lookup_ioctx:
> 
>  32.51%  [guest.kernel]  [g] lookup_ioctx
>   9.19%  [guest.kernel]  [g] __lock_acquire.isra.28
>   4.40%  [guest.kernel]  [g] lock_release
>   4.19%  [guest.kernel]  [g] sched_clock_local
>   3.86%  [guest.kernel]  [g] local_clock
>   3.68%  [guest.kernel]  [g] native_sched_clock
>   3.08%  [guest.kernel]  [g] sched_clock_cpu
>   2.64%  [guest.kernel]  [g] lock_release_holdtime.part.11
>   2.60%  [guest.kernel]  [g] memcpy
>   2.33%  [guest.kernel]  [g] lock_acquired
>   2.25%  [guest.kernel]  [g] lock_acquire
>   1.84%  [guest.kernel]  [g] do_io_submit
> 
> This patch converts the ioctx list to a radix tree. For a performance
> comparison the above FIO script was run on a 2-socket, 8-core
> machine. These are the results (average and %rsd of 10 runs) for the
> original list based implementation and for the radix tree based
> implementation:
> 
> cores             1          2          4          8         16         32
> list      109376 ms   69119 ms   35682 ms   22671 ms   19724 ms   16408 ms
> %rsd          0.69%      1.15%      1.17%      1.21%      1.71%      1.43%
> radix      73651 ms   41748 ms   23028 ms   16766 ms   15232 ms   13787 ms
> %rsd          1.19%      0.98%      0.69%      1.13%      0.72%      0.75%
> % of radix
> relative     66.12%     65.59%     66.63%     72.31%     77.26%     83.66%
> to list
> 
> To consider the impact of the patch on the typical case of having
> only one ctx per process the following FIO script was run:
> 
> rw=randrw; size=100m ;directory=/mnt/fio; ioengine=libaio; iodepth=1
> blocksize=1024; numjobs=1; thread; loops=100
> 
> on the same system and the results are the following:
> 
> list        58892 ms
> %rsd         0.91%
> radix       59404 ms
> %rsd         0.81%
> % of radix
> relative    100.87%
> to list

So, I was just doing some benchmarking/profiling to get ready to send
out the aio patches I've got for 3.11 - and it looks like your patch is
causing a ~1.5% throughput regression in my testing :/

I'm just benchmarking random 4k reads with fio, with a single job.
Looking at the profile it appears to all be radix_tree_lookup() - that's
more expensive than I'd expect for a tree with one element.

It's a shame we don't have resizable RCU hash tables, that's really what
we want for this. Actually, I think I might know how to make that work
by using cuckoo hashing...

Might also be worth trying a single element cache of the most recently
used ioctx. Anyways, I don't want to nack your patch over this (the
overhead this is fixing can be quite a bit worse) but I'd like to try
and see if we can fix or reduce the regression in the single ioctx case.
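
To make the single element cache idea concrete, here is a minimal userspace
sketch in C. It is not kernel code and not taken from any patch in this
thread: struct ctx, struct ctx_table, slow_lookup(), cached_lookup() and
remove_ctx() are hypothetical names, the linear scan only stands in for the
list or radix tree, and locking/RCU is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kioctx: only the lookup key matters here. */
struct ctx {
    unsigned long id;
};

#define MAX_CTX 8

/* Hypothetical per-process table: a slow backing store plus a one-entry cache. */
struct ctx_table {
    struct ctx *slots[MAX_CTX];  /* stand-in for the list or radix tree */
    struct ctx *last;            /* most recently used context, or NULL */
    unsigned long slow_lookups;  /* counts how often the slow path ran */
};

/* Slow path: linear scan standing in for the real lookup structure. */
static struct ctx *slow_lookup(struct ctx_table *t, unsigned long id)
{
    t->slow_lookups++;
    for (size_t i = 0; i < MAX_CTX; i++)
        if (t->slots[i] && t->slots[i]->id == id)
            return t->slots[i];
    return NULL;
}

/* Fast path: check the cached entry first, fall back to the slow path. */
static struct ctx *cached_lookup(struct ctx_table *t, unsigned long id)
{
    struct ctx *c = t->last;

    if (c && c->id == id)
        return c;

    c = slow_lookup(t, id);
    if (c)
        t->last = c;  /* remember the hit for the next call */
    return c;
}

/* Removal must also clear the cache so a stale pointer is never returned. */
static void remove_ctx(struct ctx_table *t, struct ctx *c)
{
    for (size_t i = 0; i < MAX_CTX; i++)
        if (t->slots[i] == c)
            t->slots[i] = NULL;
    if (t->last == c)
        t->last = NULL;
}
```

With one context per process, every call after the first is answered from
t->last and never touches the backing structure, which is the case the
regression shows up in. With many contexts the cache mostly misses and the
cost falls back to whatever the backing structure provides.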
