From: Dave Hansen <dave@sr71.net>
To: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cl@linux.com
Subject: Re: [RFC][PATCH] mm: percpu pages: up batch size to fix arithmetic?? errror
Date: Wed, 11 Sep 2013 17:20:07 -0700
Message-ID: <523108B7.7050101@sr71.net>
In-Reply-To: <5230FB0A.70901@linux.vnet.ibm.com>

BTW, in my little test, the median ->count was 10, and the mean was 45.

On 09/11/2013 04:21 PM, Cody P Schafer wrote:
> Also, we may want to consider shrinking pcp->high down from 6*pcp->batch
> given that the original "6*" choice was based upon ->batch actually
> being 1/4th of the average pageset size, where now it appears closer to
> being the average.

One other thing: we actually had a hot _and_ a cold pageset at that
point, and we now share one pageset for hot and cold pages.  After
looking at it for a bit today, I'm not sure how much the history
matters.  We probably need to take a fresh look at what we want.

Anybody disagree with this?

1. We want ->batch to be large enough that if all the CPUs in a zone
   are doing allocations constantly, there is very little contention on
   the zone_lock.
2. If ->high gets too large, we'll end up keeping too much memory in
   the pcp and __alloc_pages_direct_reclaim() will end up calling the
   (expensive) drain_all_pages() too often.
3. We want ->high to approximate the size of the cache which is
   private to a given cpu.  But, that's complicated by the L3 caches
   and hyperthreading today.
4. ->high can be a _bit_ larger than the CPU cache without it being a
   real problem since not _all_ the pages being freed will be fully
   resident in the cache.  Some will be cold, some will only have a few
   of their cachelines resident.
5. A 0.75MB ->high seems a bit low for CPUs with 30MB of L3 cache on
   the socket (although 20 threads share that).

I'll take one of my big systems, run it with a range of ->high
settings, and see if it makes any difference.
