linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
	Anton Blanchard <anton@samba.org>,
	linuxppc-dev@lists.ozlabs.org, mgorman@suse.de
Subject: Re: [PATCH] powerpc/mm: Fix RECLAIM_DISTANCE
Date: Tue, 31 Jan 2017 16:40:29 +1100	[thread overview]
Message-ID: <20170131054029.GA9937@gwshan> (raw)
In-Reply-To: <20170131050104.GB25724@gwshan>

On Tue, Jan 31, 2017 at 04:01:05PM +1100, Gavin Shan wrote:
>On Tue, Jan 31, 2017 at 08:11:39AM +1100, Michael Ellerman wrote:
>>Gavin Shan <gwshan@linux.vnet.ibm.com> writes:
>>
>>I'd like to see some test results from multi-node systems.
>>
>>I'd also like to understand what has changed since we changed
>>RECLAIM_DISTANCE in the first place, ie. why did it used to work and now
>>doesn't?
>>
>
>[Ccing Mel]
>
>Michael, thanks for review. I would like to explain a bit more. The issue
>addressed by the patch is irrelevant to the number of NUMA nodes. There
>is one procfs entry ("/proc/sys/vm/zone_reclaim_mode") which corresponds
>to variable @node_reclaim_mode (their names don't match!). it can have
>belowing bits or any combination of them. Its default value is RECLAIM_OFF (0).
>Note RECLAIM_ZONE was obsoleted and I will send one patch to remove it
>later.
>
>#define RECLAIM_OFF 0
>#define RECLAIM_ZONE (1<<0)     /* Run shrink_inactive_list on the zone */
>#define RECLAIM_WRITE (1<<1)    /* Writeout pages during reclaim */
>#define RECLAIM_UNMAP (1<<2)    /* Unmap pages during reclaim */
>
>When @node_reclaim_mode is set to (RECLAIM_WRITE | RECLAIM_UNMAP), node_reclaim()
>isn't called on the preferred node as the condition is false: zone_allows_reclaim(
>node-A, node-A). As I observed, the distance from node-A to node-A is 10, equal to
>RECLAIM_DISTANCE.
>
>static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>{
>        return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <
>                                RECLAIM_DISTANCE;
>}
>
>
>__alloc_pages_nodemask
>   get_page_from_freelist     <- WATERMARK_LOW
>      zone_watermark_fast     <- Assume the allocation is breaking WATERMARK_LOW
>         node_reclaim         <- @node_reclaim_node isn't 0 and
>                                 zone_allows_reclaim(preferred_zone, current_zone) returns true
>         __node_reclaim       <- SWAP, WRITEPAGE and UNMAP setting from @node_reclaim_node
>         shrink_node
>      buffered_rmqueue
>   __alloc_pages_slowpath
>      get_page_from_freelist        <- WATERMARK_MIN
>      __alloc_pages_direct_compact  <- If it's costly allocation (order > 3)
>      wake_all_kswapds
>      get_page_from_freelist        <- NO_WATERMARK, CPU local node is set to preferred one
>      __alloc_pages_direct_reclaim
>         __perform_reclaim
>         try_to_free_pages          <- WRITEPAGE + UNMAP + SWAP
>         do_try_to_free_pages
>         shrink_zones               <- Stop until priority (12) reaches to 0 or reclaimed enough
>         shrink_node
>      __alloc_pages_direct_compact
>
>
>Also, RECLAIM_DISTANCE is set to 30 in include/linux/topology.h. It's used when arch
>doesn't provide one. It's why I set this macro to 30 in this patch. This issue is 
>introduced by commit 5f7a75acdb2 ("mm: page_alloc: do not cache reclaim distances").
>In the patch, it had wrong replacement. So I would correct the wrong replacement
>alternatively. Or both of them. Which way do you think is the best? Maybe Mel also
>has thoughts.
>
>     39  static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>     40  {
>     41 -       return node_isset(local_zone->node, zone->zone_pgdat->reclaim_nodes);
>     42 -}
>     43 -
>     44 -static void __paginginit init_zone_allows_reclaim(int nid)
>     45 -{
>     46 -       int i;
>     47 -
>     48 -       for_each_node_state(i, N_MEMORY)
>     49 -               if (node_distance(nid, i) <= RECLAIM_DISTANCE)
>     50 -                       node_set(i, NODE_DATA(nid)->reclaim_nodes);
>     51 +       return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <
>     52 +                               RECLAIM_DISTANCE;
>     53  }
>

Sorry, to make it clear. The patch replaced "<=" with "<" wrongly :)

Thanks,
Gavin

  reply	other threads:[~2017-01-31  5:41 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-23 23:32 [PATCH] powerpc/mm: Fix RECLAIM_DISTANCE Gavin Shan
2017-01-25  3:57 ` Balbir Singh
2017-01-25  4:58   ` Gavin Shan
2017-01-27 12:49     ` Balbir Singh
2017-01-30  1:02       ` Anton Blanchard
2017-01-30  4:38         ` Gavin Shan
2017-01-30 21:11           ` Michael Ellerman
2017-01-31  5:01             ` Gavin Shan
2017-01-31  5:40               ` Gavin Shan [this message]
2017-02-07 23:40               ` Gavin Shan
2017-01-31  4:33         ` Gavin Shan
2017-01-31  4:58           ` Anton Blanchard
2017-01-31  5:30             ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170131054029.GA9937@gwshan \
    --to=gwshan@linux.vnet.ibm.com \
    --cc=anton@samba.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mgorman@suse.de \
    --cc=mpe@ellerman.id.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).