From: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
To: Benjamin LaHaise <bcrl@kvack.org>,
Kent Overstreet <kent.overstreet@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
linux-kernel@vger.kernel.org
Subject: aio: questions with ioctx_alloc() and large num_possible_cpus()
Date: Tue, 4 Oct 2016 19:55:12 -0300
Message-ID: <db64a8f3-cd15-2ccc-e120-ba0466f811d0@linux.vnet.ibm.com>
Hi Benjamin, Kent, and others,
Could you please comment on this possible problem?
Any feedback is appreciated.
Since commit e1bdd5f27a5b ("aio: percpu reqs_available") the maximum
number of aio nr_events may be a function of num_possible_cpus() and
actually be /inversely proportional/ to it (i.e., more CPUs lead to
less system-wide aio nr_events). This is a problem on larger systems.
That's because if "nr_events < num_possible_cpus() * 4" (for example,
nr_events == 1), the request counts as "num_possible_cpus() * 4" toward
aio_nr and against aio_max_nr:
    static struct kioctx *ioctx_alloc(unsigned nr_events)
    ...
            nr_events = max(nr_events, num_possible_cpus() * 4);
            nr_events *= 2;
    ...
            /* limit the number of system wide aios */
    ....
            if (aio_nr + nr_events > (aio_max_nr * 2UL) ||
    ...
                    err = -EAGAIN;
    ...
            aio_nr += ctx->max_reqs;
    ...
The problem is easy to see on a common POWER8 system: with 160 CPUs
(2 sockets * 10 cores/socket * 8 threads/core = 160 CPUs), the number of
AIO contexts that can be created with "io_setup(1, )" is limited to 102,
even though the default aio_max_nr is 64k:
# cat /sys/devices/system/cpu/possible
0-159
# cat /proc/sys/fs/aio-max-nr
65536
# echo $(( 65536 / (160 * 4) ))
102
Test-case snippet and output:

    for (i = 0; i < 65536; i++)
            if ((rc = io_setup(1, &ioctx[i])))
                    break;

    printf("rc = %d, i = %d\n", rc, i);

    > rc = -11, i = 102
(Another problem is that the aio-nr sysctl can grow larger than
aio-max-nr, since it is checked against "aio_max_nr * 2".)
So, I've been trying to understand and fix this, but I soon got stuck
on the options because I didn't quite understand a few points. If you
could provide some insight on the following, that would be really
helpful:
- why "num_possible_cpus() * 4", and why "max(nr_events, <it>)" ?
Is it just related to req_batch, as a reasonable constant, or are
there other implications (e.g., related to "up to half of slots on
other cpu's percpu counters", which would be nice to understand
too)?
- "struct kioctx" says max_reqs is
" is what userspace passed to io_setup(), it's not used for
anything but counting against the global max_reqs quota. "
However, we see it incremented by the modified nr_events, thus
not really the value from userspace anymore, and used to derive
nr_events in aio_setup_ring(). Is the comment wrong nowadays,
or is the code usage of max_reqs wrong/abusing it, or... ? :)
- what is aio-nr really expected to count: nr_events (i.e., the
value actually requested by userspace), or the number of times
io_setup(N, ) returned successfully (say, io contexts), regardless
of the sum of their nr_events?
- any other comments/suggestions are appreciated.
Thanks in advance,
--
Mauricio Faria de Oliveira
IBM Linux Technology Center