Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting


From: Andrey Ryabinin <aryabinin@virtuozzo.com>
To: Tim Chen <tim.c.chen@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andi Kleen <ak@linux.intel.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vladimir Davydov <vdavydov@virtuozzo.com>,
	Konstantin Khlebnikov <koct9i@gmail.com>
Subject: Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting
Date: Thu, 11 Feb 2016 16:54:09 +0300	[thread overview]
Message-ID: <56BC9281.6090505@virtuozzo.com> (raw)
In-Reply-To: <1455150256.715.60.camel@schen9-desk2.jf.intel.com>



On 02/11/2016 03:24 AM, Tim Chen wrote:
> On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
> 
>>
>> If a process is unmapping 4MB then it's pretty crazy for us to be
>> hitting the percpu_counter 32 separate times for that single operation.
>>
>> Is there some way in which we can batch up the modifications within the
>> caller and update the counter less frequently?  Perhaps even in a
>> single hit?
> 
> I think the problem is the batch size is too small and we overflow
> the local counter into the global counter for 4M allocations.
> The reason for the small batch size was because we use
> percpu_counter_read_positive in __vm_enough_memory and it is not precise
> and the error could grow with large batch size.
> 
> Let's switch to the precise __percpu_counter_compare that is 
> unaffected by batch size.  It will do precise comparison and only add up
> the local per cpu counters when the global count is not precise
> enough.  
> 

I'm not certain about this. Running for_each_online_cpu() under a spinlock looks somewhat doubtful.
And if we are close to the limit, we will be hitting the slowpath all the time.
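
To make the concern concrete, here is a minimal userspace sketch of the
fast-path/slow-path comparison being discussed. It is not the kernel code:
the struct, the pcpu_* names, the fixed NR_CPUS value and the slow_hits
field are all hypothetical, added only for illustration, and locking is left
out entirely. It assumes the comparison trusts the approximate global count
only when it is further than batch * (number of CPUS) from the limit, and
otherwise walks every per-CPU delta.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4	/* assumption: fixed CPU count for the sketch */

/* Hypothetical userspace stand-in for a per-CPU counter. */
struct pcpu_counter {
	long long count;		/* approximate global value */
	long long local[NR_CPUS];	/* per-CPU deltas not yet folded in */
	long slow_hits;			/* illustration only: exact sums taken */
};

/* Add to one CPU's delta; fold it into the global count at the batch size. */
static void pcpu_add(struct pcpu_counter *c, int cpu, long long amount,
		     long long batch)
{
	long long v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	v = c->local[cpu] + amount;
	if (llabs(v) >= batch) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Exact value: walks every CPU. In the kernel this walk is done under a lock. */
static long long pcpu_sum(struct pcpu_counter *c)
{
	long long sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	c->slow_hits++;
	return sum;
}

/* Return -1, 0 or 1 as the counter is below, equal to or above rhs. */
static int pcpu_compare(struct pcpu_counter *c, long long rhs, long long batch)
{
	long long count = c->count;

	/* Fast path: the worst-case error cannot change the answer. */
	if (llabs(count - rhs) > batch * NR_CPUS)
		return count > rhs ? 1 : -1;

	/* Slow path: too close to rhs, so the exact sum is needed. */
	count = pcpu_sum(c);
	if (count > rhs)
		return 1;
	if (count < rhs)
		return -1;
	return 0;
}
```

With a batch of 32 and 4 CPUs, any comparison against a limit within 128 of
the global count takes the slow path. A larger batch widens that window, so
once committed memory sits near the limit every check ends up summing all
CPUs.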


> So maybe something like the following patch with a relaxed batch size.
> I have not tested this patch much other than compiling and booting
> the kernel.  I wonder if this works for Andrey. We could relax the batch
> size further, but that will mean that we will incur the overhead
> of summing the per cpu counters earlier when the global count gets close
> to the allowed limit.
> 
> Thanks.
> 
> Tim
> 
 


