From: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
To: bcrl@kvack.org, viro@zeniv.linux.org.uk
Cc: jmoyer@redhat.com, linux-aio@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [RESEND PATCH v2 2/2] aio: use ctx->max_reqs only for counting against the global limit
Date: Wed, 24 May 2017 16:36:25 -0300
Message-ID: <1495654585-22790-3-git-send-email-mauricfo@linux.vnet.ibm.com>
In-Reply-To: <1495654585-22790-1-git-send-email-mauricfo@linux.vnet.ibm.com>

Decouple ctx->max_reqs and ctx->nr_events; each one represents
a different side of the same coin -- userspace and kernelspace,
respectively.

Briefly, ctx->max_reqs is the value that userspace requested
and can observe; and ctx->nr_events is the value that the kernel
needs internally for the percpu allocation scheme.

With the percpu scheme, the original value of ctx->max_reqs from
userspace is changed (but still used to count against aio_max_nr)
based on num_possible_cpus(), and it may increase significantly
on systems with a large num_possible_cpus() when the requested
nr_events is small.

This can prevent userspace applications from obtaining the full
value of aio_max_nr as the total of their requested nr_events.
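
To make the effect concrete, here is a small userspace model of the
pre-patch accounting. It is only a sketch, not kernel code: the helper
names (percpu_nr_events, old_max_contexts) are hypothetical, and the
scaling rule assumed is the one mentioned further down in this message
(at least 4x the number of possible CPUs, then doubled).

```c
/* Userspace model only; not kernel code. Helper names are hypothetical. */

/* Internal nr_events: at least 4 slots per possible CPU, then doubled. */
static unsigned long percpu_nr_events(unsigned long requested,
				      unsigned long possible_cpus)
{
	unsigned long n = requested;

	if (n < possible_cpus * 4)
		n = possible_cpus * 4;
	return n * 2;
}

/*
 * Pre-patch accounting: each context is charged the scaled value,
 * and the global check is made against aio_max_nr * 2.
 * Returns how many contexts of 'requested' events fit under the limit.
 */
static unsigned long old_max_contexts(unsigned long requested,
				      unsigned long possible_cpus,
				      unsigned long aio_max_nr)
{
	return (aio_max_nr * 2) / percpu_nr_events(requested, possible_cpus);
}
```

With illustrative values (1 requested event, 160 possible CPUs, and an
aio_max_nr of 65536), percpu_nr_events() gives 1280 and
old_max_contexts() gives 102, so only 102 events in total could be
requested before the limit is hit.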

ctx->max_reqs
=============

The ctx->max_reqs value once again aligns with its description:

  * This is what userspace passed to io_setup(), it's not used for
  * anything but counting against the global max_reqs quota.

It stores the original value of nr_events that userspace passed
to io_setup() (it's not increased to make room for requirements
of the percpu allocation scheme) - and is used to increment and
decrement the 'aio_nr' value, and to check against 'aio_max_nr'.

So, regardless of how many additional nr_events are internally
required for the percpu allocation scheme (e.g. at least 4x the
number of possible CPUs, then doubled), userspace can request
up to the full 'aio-max-nr' value that is visible to it.

Another benefit is a consistent value in '/proc/sys/fs/aio-nr':
the sum of all values as requested by userspace, and it's less
than or equal to '/proc/sys/fs/aio-max-nr' again (not 2x it).
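
The accounting described in this section can be modelled in userspace
roughly as follows. This is a sketch under stated assumptions: the
struct and function names are hypothetical, only the two limit checks
touched by this patch are mirrored, and nr_events is assumed to be
non-zero.

```c
#include <errno.h>

/* Userspace model only; names are hypothetical, not kernel symbols. */
struct aio_model {
	unsigned long aio_nr;		/* models /proc/sys/fs/aio-nr */
	unsigned long aio_max_nr;	/* models /proc/sys/fs/aio-max-nr */
};

/*
 * Charge exactly what userspace requested (max_reqs), regardless of
 * how many slots the ring would get internally. Assumes nr_events > 0.
 */
static int model_io_setup(struct aio_model *m, unsigned nr_events)
{
	unsigned long max_reqs = nr_events;

	/* A single request may not exceed the global limit. */
	if (max_reqs > m->aio_max_nr)
		return -EAGAIN;

	/* The running total may not exceed the limit, nor wrap around. */
	if (m->aio_nr + max_reqs > m->aio_max_nr ||
	    m->aio_nr + max_reqs < m->aio_nr)
		return -EAGAIN;

	m->aio_nr += max_reqs;
	return 0;
}
```

In this model, repeated single-event requests succeed until aio_nr
equals aio_max_nr, and the next request fails with -EAGAIN.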

ctx->nr_events
==============

The ctx->nr_events value is the actual size of the ringbuffer/
number of slots, which may be more than what userspace passed
to io_setup() (depending on the requested value for nr_events
and/or calculations made in aio_setup_ring()) - as determined
by the percpu allocation scheme for its correct/fast behavior.
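
As a rough illustration of why nr_events can exceed the request, the
sketch below models the rounding assumed to happen in
aio_setup_ring(): a couple of spare slots are added, the ring is
rounded up to whole pages, and nr_events is recomputed from the space
that results. The function name and all parameters (page size, ring
header size, event size) are assumptions chosen for illustration and
are not taken from this patch.

```c
#include <stddef.h>

/*
 * Userspace model only; ring_slots() is a hypothetical helper.
 * hdr_size and event_size stand in for the sizes of the ring header
 * and of one event slot; the caller supplies them.
 */
static size_t ring_slots(size_t nr_events, size_t page_size,
			 size_t hdr_size, size_t event_size)
{
	size_t size, nr_pages;

	nr_events += 2;		/* assumed spare slots for head/tail */

	size = hdr_size + event_size * nr_events;

	/* Round the ring up to a whole number of pages. */
	nr_pages = (size + page_size - 1) / page_size;

	/* Use all of the space that the rounding made available. */
	return (page_size * nr_pages - hdr_size) / event_size;
}
```

For example, with an assumed 4096-byte page and 32 bytes for both the
header and each event, a request for 1280 slots yields 1407 slots in
this model.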

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
 fs/aio.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 7c3c01f352c1..4967b0e1ef1a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -706,6 +706,12 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 	int err = -ENOMEM;
 
 	/*
+	 * Store the original value of nr_events from userspace for counting
+	 * against the global limit (aio_max_nr).
+	 */
+	unsigned max_reqs = nr_events;
+
+	/*
 	 * We keep track of the number of available ringbuffer slots, to prevent
 	 * overflow (reqs_available), and we also use percpu counters for this.
 	 *
@@ -723,14 +729,14 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 		return ERR_PTR(-EINVAL);
 	}
 
-	if (!nr_events || (unsigned long)nr_events > (aio_max_nr * 2UL))
+	if (!nr_events || (unsigned long)max_reqs > aio_max_nr)
 		return ERR_PTR(-EAGAIN);
 
 	ctx = kmem_cache_zalloc(kioctx_cachep, GFP_KERNEL);
 	if (!ctx)
 		return ERR_PTR(-ENOMEM);
 
-	ctx->max_reqs = nr_events;
+	ctx->max_reqs = max_reqs;
 
 	spin_lock_init(&ctx->ctx_lock);
 	spin_lock_init(&ctx->completion_lock);
@@ -763,8 +769,8 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 
 	/* limit the number of system wide aios */
 	spin_lock(&aio_nr_lock);
-	if (aio_nr + nr_events > (aio_max_nr * 2UL) ||
-	    aio_nr + nr_events < aio_nr) {
+	if (aio_nr + ctx->max_reqs > aio_max_nr ||
+	    aio_nr + ctx->max_reqs < aio_nr) {
 		spin_unlock(&aio_nr_lock);
 		err = -EAGAIN;
 		goto err_ctx;
-- 
1.8.3.1

Thread overview: 4+ messages
2017-05-24 19:36 [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
2017-05-24 19:36 ` [RESEND PATCH v2 1/2] aio: make nr_events a parameter for aio_setup_ring() Mauricio Faria de Oliveira
2017-05-24 19:36 ` Mauricio Faria de Oliveira [this message]
2017-06-09  0:26 ` [RESEND PATCH v2 0/2] aio: decouple ctx's max_reqs and nr_events Mauricio Faria de Oliveira
