From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:58092 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750871AbdGENxj (ORCPT ); Wed, 5 Jul 2017 09:53:39 -0400 Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v65DrSkq120669 for ; Wed, 5 Jul 2017 09:53:33 -0400 Received: from e24smtp04.br.ibm.com (e24smtp04.br.ibm.com [32.104.18.25]) by mx0a-001b2d01.pphosted.com with ESMTP id 2bgtyf95eb-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Wed, 05 Jul 2017 09:53:32 -0400 Received: from localhost by e24smtp04.br.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 5 Jul 2017 10:53:24 -0300 Received: from d24av05.br.ibm.com (d24av05.br.ibm.com [9.18.232.44]) by d24relay03.br.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v65Dn3Gm30867632 for ; Wed, 5 Jul 2017 10:49:03 -0300 Received: from d24av05.br.ibm.com (localhost [127.0.0.1]) by d24av05.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id v65ArMmB006655 for ; Wed, 5 Jul 2017 07:53:22 -0300 From: Mauricio Faria de Oliveira To: bcrl@kvack.org, viro@zeniv.linux.org.uk Cc: linux-aio@kvack.org, linux-fsdevel@vger.kernel.org, jmoyer@redhat.com, lekshmi.cpillai@in.ibm.com, nguyenp@us.ibm.com Subject: [PATCH RESEND] fs: aio: fix the increment of aio-nr and counting against aio-max-nr Date: Wed, 5 Jul 2017 10:53:16 -0300 Message-Id: <1499262796-4022-1-git-send-email-mauricfo@linux.vnet.ibm.com> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Currently, aio-nr is incremented in steps of 'num_possible_cpus() * 8' for io_setup(nr_events, ..) with 'nr_events < num_possible_cpus() * 4': ioctx_alloc() ... nr_events = max(nr_events, num_possible_cpus() * 4); nr_events *= 2; ... ctx->max_reqs = nr_events; ... aio_nr += ctx->max_reqs; .... This limits the number of aio contexts actually available to much less than aio-max-nr, and is increasingly worse with greater number of CPUs. For example, with 64 CPUs, only 256 aio contexts are actually available (with aio-max-nr = 65536) because the increment is 512 in that scenario. Note: 65536 [max aio contexts] / (64*4*2) [increment per aio context] is 128, but make it 256 (double) as counting against 'aio-max-nr * 2': ioctx_alloc() ... if (aio_nr + nr_events > (aio_max_nr * 2UL) || ... goto err_ctx; ... This patch uses the original value of nr_events (from userspace) to increment aio-nr and count against aio-max-nr, which resolves those. Signed-off-by: Mauricio Faria de Oliveira Reported-by: Lekshmi C. Pillai Tested-by: Lekshmi C. Pillai Tested-by: Paul Nguyen --- fs/aio.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index f52d925ee259..3908480d7ccd 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -441,10 +441,9 @@ static int aio_migratepage(struct address_space *mapping, struct page *new, #endif }; -static int aio_setup_ring(struct kioctx *ctx) +static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) { struct aio_ring *ring; - unsigned nr_events = ctx->max_reqs; struct mm_struct *mm = current->mm; unsigned long size, unused; int nr_pages; @@ -707,6 +706,12 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) int err = -ENOMEM; /* + * Store the original nr_events -- what userspace passed to io_setup(), + * for counting against the global limit -- before it changes. + */ + unsigned int max_reqs = nr_events; + + /* * We keep track of the number of available ringbuffer slots, to prevent * overflow (reqs_available), and we also use percpu counters for this. * @@ -724,14 +729,14 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) return ERR_PTR(-EINVAL); } - if (!nr_events || (unsigned long)nr_events > (aio_max_nr * 2UL)) + if (!nr_events || (unsigned long)max_reqs > aio_max_nr) return ERR_PTR(-EAGAIN); ctx = kmem_cache_zalloc(kioctx_cachep, GFP_KERNEL); if (!ctx) return ERR_PTR(-ENOMEM); - ctx->max_reqs = nr_events; + ctx->max_reqs = max_reqs; spin_lock_init(&ctx->ctx_lock); spin_lock_init(&ctx->completion_lock); @@ -753,7 +758,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) if (!ctx->cpu) goto err; - err = aio_setup_ring(ctx); + err = aio_setup_ring(ctx, nr_events); if (err < 0) goto err; @@ -764,8 +769,8 @@ static struct kioctx *ioctx_alloc(unsigned nr_events) /* limit the number of system wide aios */ spin_lock(&aio_nr_lock); - if (aio_nr + nr_events > (aio_max_nr * 2UL) || - aio_nr + nr_events < aio_nr) { + if (aio_nr + ctx->max_reqs > aio_max_nr || + aio_nr + ctx->max_reqs < aio_nr) { spin_unlock(&aio_nr_lock); err = -EAGAIN; goto err_ctx; -- 1.8.3.1