* [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring()
@ 2017-05-24 19:35 Mauricio Faria de Oliveira
0 siblings, 0 replies; 5+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-05-24 19:35 UTC (permalink / raw)
To: bcrl, viro; +Cc: jmoyer, linux-aio, linux-fsdevel
In order to decouple ctx->max_reqs and ctx->nr_events, first remove
the aio_setup_ring() dependency on ctx->max_reqs -- which currently
is assigned the value of nr_events -- and make the function receive
the value of nr_events directly.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
fs/aio.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index f52d925ee259..7c3c01f352c1 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -441,10 +441,9 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
#endif
};
-static int aio_setup_ring(struct kioctx *ctx)
+static int aio_setup_ring(struct kioctx *ctx, unsigned nr_events)
{
struct aio_ring *ring;
- unsigned nr_events = ctx->max_reqs;
struct mm_struct *mm = current->mm;
unsigned long size, unused;
int nr_pages;
@@ -753,7 +752,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
if (!ctx->cpu)
goto err;
- err = aio_setup_ring(ctx);
+ err = aio_setup_ring(ctx, nr_events);
if (err < 0)
goto err;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events
@ 2017-05-24 19:36 Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring() Mauricio Faria de Oliveira
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-05-24 19:36 UTC (permalink / raw)
To: bcrl, viro; +Cc: jmoyer, linux-aio, linux-fsdevel
Currently, using smaller nr_events in io_setup(), the max number of
allocation requests is inversely proportional to num_possible_cpus()
(which is a problem on larger/non-virtualized systems with many CPUs).
This happens because, internally, nr_events must be increased for the
percpu allocation scheme (according to the number of CPUs), but it is
being reflected on the max_reqs value (aio-nr), which is used to count
against the global limit (aio-max-nr).
This patchset decouples ctx->max_reqs and ctx->nr_events, so that the
extra nr_events which are internally required do not reflect into the
'aio-nr' counter which is externally visible & checked to 'aio-max-nr'.
This ensures that userspace can allocate most of the aio-max-nr value
that is made available/visible to it -- regardless of the number of
possible CPUs in the system.
P.S.: this patchset has been 'relatively acked' by Benjamin LaHaise [2],
but for timing difficulties, it hasn't been queued nor re-submitted
in a while; sorry.
v2:
- adjust/move comment in PATCH 2/2 to commit message (Jeff Moyer [3])
Test-case:
// 64k request (ie, aio-max-nr) to io_setup with nr_events = 1
for (i = 0; i < 65536; i++)
if (rc = io_setup(1, &ioctx[i]))
break;
printf("rc = %d, i = %d\n", rc, i);
- on v4.12-rc2, it fails w/ -EAGAIN on request #102 (160 CPUs),
and aio-nr value is ~2x the aio-max-nr value (wrong):
# ./test-case &
rc = -11, i = 102
# grep . /proc/sys/fs/aio-*
/proc/sys/fs/aio-max-nr:65536
/proc/sys/fs/aio-nr:130560
- on v4.12-rc2 with patches, it fails w/ -ENOMEM on request #65515,
thus going way further toward the actual max limit:
# ./test-case &
rc = -12, i = 65515
# grep . /proc/sys/fs/aio-*
/proc/sys/fs/aio-max-nr:65536
/proc/sys/fs/aio-nr:65515
Links:
[1] http://marc.info/?l=linux-aio&m=147562172703469&w=2
[2] http://marc.info/?l=linux-aio&m=147757894603034&w=2
[3] http://marc.info/?l=linux-aio&m=147769074702475&w=2
Mauricio Faria de Oliveira (2):
aio: make nr_events a parameter for aio_setup_ring()
aio: use ctx->max_reqs only for counting against the global limit
fs/aio.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring()
2017-05-24 19:36 [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
@ 2017-05-24 19:36 ` Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 2/2] aio: use ctx->max_reqs only for counting against the global limit Mauricio Faria de Oliveira
2017-06-09 0:26 ` [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
2 siblings, 0 replies; 5+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-05-24 19:36 UTC (permalink / raw)
To: bcrl, viro; +Cc: jmoyer, linux-aio, linux-fsdevel
In order to decouple ctx->max_reqs and ctx->nr_events, first remove
the aio_setup_ring() dependency on ctx->max_reqs -- which currently
is assigned the value of nr_events -- and make the function receive
the value of nr_events directly.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
fs/aio.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index f52d925ee259..7c3c01f352c1 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -441,10 +441,9 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
#endif
};
-static int aio_setup_ring(struct kioctx *ctx)
+static int aio_setup_ring(struct kioctx *ctx, unsigned nr_events)
{
struct aio_ring *ring;
- unsigned nr_events = ctx->max_reqs;
struct mm_struct *mm = current->mm;
unsigned long size, unused;
int nr_pages;
@@ -753,7 +752,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
if (!ctx->cpu)
goto err;
- err = aio_setup_ring(ctx);
+ err = aio_setup_ring(ctx, nr_events);
if (err < 0)
goto err;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [RESEND PATCH v2 2/2] aio: use ctx->max_reqs only for counting against the global limit
2017-05-24 19:36 [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring() Mauricio Faria de Oliveira
@ 2017-05-24 19:36 ` Mauricio Faria de Oliveira
2017-06-09 0:26 ` [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
2 siblings, 0 replies; 5+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-05-24 19:36 UTC (permalink / raw)
To: bcrl, viro; +Cc: jmoyer, linux-aio, linux-fsdevel
Decouple ctx->max_reqs and ctx->nr_events; each one represents
a different side of the same coin -- userspace and kernelspace,
respectively.
Briefly, ctx->max_reqs represents what is userspace/externally
accessible by userspace; and ctx->nr_events represents what is
kernelspace/internally needed by the percpu allocation scheme.
With the percpu scheme, the original value of ctx->max_reqs from
userspace is changed (but still used to count against aio_max_nr)
based on num_possible_cpus(), and it may increase significantly
on systems with great num_possible_cpus() for smaller nr_events.
This eventually prevents userspace applications from getting the
actual value of aio_max_nr in the total requested nr_events.
ctx->max_reqs
=============
The ctx->max_reqs value once again aligns with its description:
* This is what userspace passed to io_setup(), it's not used for
* anything but counting against the global max_reqs quota.
It stores the original value of nr_events that userspace passed
to io_setup() (it's not increased to make room for requirements
of the percpu allocation scheme) - and is used to increment and
decrement the 'aio_nr' value, and to check against 'aio_max_nr'.
So, regardless of how many additional nr_events are internally
required for the percpu allocation scheme (e.g. make it 4x the
number of possible CPUs, and double it), userspace can get all
of the 'aio-max-nr' value that is made available/visible to it.
Another benefit is a consistent value in '/proc/sys/fs/aio-nr':
the sum of all values as requested by userspace, and it's less
than or equal to '/proc/sys/fs/aio-max-nr' again (not 2x it).
ctx->nr_events
==============
The ctx->nr_events value is the actual size of the ringbuffer/
number of slots, which may be more than what userspace passed
to io_setup() (depending on the requested value for nr_events
and/or calculations made in aio_setup_ring()) - as determined
by the percpu allocation scheme for its correct/fast behavior.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
fs/aio.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 7c3c01f352c1..4967b0e1ef1a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -706,6 +706,12 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
int err = -ENOMEM;
/*
+ * Store the original value of nr_events from userspace for counting
+ * against the global limit (aio_max_nr).
+ */
+ unsigned max_reqs = nr_events;
+
+ /*
* We keep track of the number of available ringbuffer slots, to prevent
* overflow (reqs_available), and we also use percpu counters for this.
*
@@ -723,14 +729,14 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
return ERR_PTR(-EINVAL);
}
- if (!nr_events || (unsigned long)nr_events > (aio_max_nr * 2UL))
+ if (!nr_events || (unsigned long)max_reqs > aio_max_nr)
return ERR_PTR(-EAGAIN);
ctx = kmem_cache_zalloc(kioctx_cachep, GFP_KERNEL);
if (!ctx)
return ERR_PTR(-ENOMEM);
- ctx->max_reqs = nr_events;
+ ctx->max_reqs = max_reqs;
spin_lock_init(&ctx->ctx_lock);
spin_lock_init(&ctx->completion_lock);
@@ -763,8 +769,8 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
/* limit the number of system wide aios */
spin_lock(&aio_nr_lock);
- if (aio_nr + nr_events > (aio_max_nr * 2UL) ||
- aio_nr + nr_events < aio_nr) {
+ if (aio_nr + ctx->max_reqs > aio_max_nr ||
+ aio_nr + ctx->max_reqs < aio_nr) {
spin_unlock(&aio_nr_lock);
err = -EAGAIN;
goto err_ctx;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events
2017-05-24 19:36 [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring() Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 2/2] aio: use ctx->max_reqs only for counting against the global limit Mauricio Faria de Oliveira
@ 2017-06-09 0:26 ` Mauricio Faria de Oliveira
2 siblings, 0 replies; 5+ messages in thread
From: Mauricio Faria de Oliveira @ 2017-06-09 0:26 UTC (permalink / raw)
To: bcrl, viro, jmoyer; +Cc: linux-aio, linux-fsdevel
On 05/24/2017 04:36 PM, Mauricio Faria de Oliveira wrote:
> Mauricio Faria de Oliveira (2):
> aio: make nr_events a parameter for aio_setup_ring()
> aio: use ctx->max_reqs only for counting against the global limit
I'll resend this in a simpler patch, with subject:
"aio: fix the increment of aio-nr and counting against aio-max-nr"
Hope this helps.
Sorry for the extra email.
--
Mauricio Faria de Oliveira
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-06-09 0:27 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-24 19:36 [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring() Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 2/2] aio: use ctx->max_reqs only for counting against the global limit Mauricio Faria de Oliveira
2017-06-09 0:26 ` [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
-- strict thread matches above, loose matches on Subject: below --
2017-05-24 19:35 [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring() Mauricio Faria de Oliveira
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).