From: Donet Tom <donettom@linux.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Aneesh Kumar <aneesh.kumar@kernel.org>,
Huang Ying <ying.huang@intel.com>,
Michal Hocko <mhocko@kernel.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Mel Gorman <mgorman@suse.de>, Feng Tang <feng.tang@intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Rik van Riel <riel@surriel.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Matthew Wilcox <willy@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>,
Dan Williams <dan.j.williams@intel.com>,
Hugh Dickins <hughd@google.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Suren Baghdasaryan <surenb@google.com>,
Donet Tom <donettom@linux.ibm.com>
Subject: [PATCH v3 0/2] Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy
Date: Thu, 21 Mar 2024 06:29:49 -0500 [thread overview]
Message-ID: <cover.1711002865.git.donettom@linux.ibm.com> (raw)
This patchset is to optimize the cross-socket memory access with
MPOL_PREFERRED_MANY policy.
To test this patch we ran the following test on a 3 node system.
Node 0 - 2GB - Tier 1
Node 1 - 11GB - Tier 1
Node 6 - 10GB - Tier 2
Below changes are made to memcached to set the memory policy,
It select Node0 and Node1 as preferred nodes.
#include <numaif.h>
#include <numa.h>
unsigned long nodemask;
int ret;
nodemask = 0x03;
ret = set_mempolicy(MPOL_PREFERRED_MANY | MPOL_F_NUMA_BALANCING,
&nodemask, 10);
/* If MPOL_F_NUMA_BALANCING isn't supported,
* fall back to MPOL_PREFERRED_MANY */
if (ret < 0 && errno == EINVAL){
printf("set mem policy normal\n");
ret = set_mempolicy(MPOL_PREFERRED_MANY, &nodemask, 10);
}
if (ret < 0) {
perror("Failed to call set_mempolicy");
exit(-1);
}
Test Procedure:
===============
1. Make sure memory tiering and demotion are enabled.
2. Start memcached.
# ./memcached -b 100000 -m 204800 -u root -c 1000000 -t 7
-d -s "/tmp/memcached.sock"
3. Run memtier_benchmark to store 3200000 keys.
#./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
--threads=1 --pipeline=1 --ratio=1:0 --key-pattern=S:S --key-minimum=1
--key-maximum=3200000 -n allkeys -c 1 -R -x 1 -d 1024
4. Start a memory eater on node 0 and 1. This will demote all memcached
pages to node 6.
5. Make sure all the memcached pages got demoted to lower tier by reading
/proc/<memcaced PID>/numa_maps.
# cat /proc/2771/numa_maps
---
default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
default anon=1009 dirty=1009 active=0 N6=1009 kernelpagesize_kB=64
---
6. Kill memory eater.
7. Read the pgpromote_success counter.
8. Start reading the keys by running memtier_benchmark.
#./memtier_benchmark -S "/tmp/memcached.sock" --protocol=memcache_binary
--pipeline=1 --distinct-client-seed --ratio=0:3 --key-pattern=R:R
--key-minimum=1 --key-maximum=3200000 -n allkeys
--threads=64 -c 1 -R -x 6
9. Read the pgpromote_success counter.
Test Results:
=============
Without Patch
------------------
1. pgpromote_success before test
Node 0: pgpromote_success 11
Node 1: pgpromote_success 140974
pgpromote_success after test
Node 0: pgpromote_success 11
Node 1: pgpromote_success 140974
2. Memtier-benchmark result.
AGGREGATED AVERAGE RESULTS (6 runs)
==================================================================
Type Ops/sec Hits/sec Misses/sec Avg. Latency p50 Latency
------------------------------------------------------------------
Sets 0.00 --- --- --- ---
Gets 305792.03 305791.93 0.10 0.18949 0.16700
Waits 0.00 --- --- --- ---
Totals 305792.03 305791.93 0.10 0.18949 0.16700
======================================
p99 Latency p99.9 Latency KB/sec
-------------------------------------
--- --- 0.00
0.44700 1.71100 11542.69
--- --- ---
0.44700 1.71100 11542.69
With Patch
---------------
1. pgpromote_success before test
Node 0: pgpromote_success 5
Node 1: pgpromote_success 89386
pgpromote_success after test
Node 0: pgpromote_success 57895
Node 1: pgpromote_success 141463
2. Memtier-benchmark result.
AGGREGATED AVERAGE RESULTS (6 runs)
====================================================================
Type Ops/sec Hits/sec Misses/sec Avg. Latency p50 Latency
--------------------------------------------------------------------
Sets 0.00 --- --- --- ---
Gets 521942.24 521942.07 0.17 0.11459 0.10300
Waits 0.00 --- --- --- ---
Totals 521942.24 521942.07 0.17 0.11459 0.10300
=======================================
p99 Latency p99.9 Latency KB/sec
---------------------------------------
--- --- 0.00
0.23100 0.31900 19701.68
--- --- ---
0.23100 0.31900 19701.68
Test Result Analysis:
=====================
1. With patch we could observe pages are getting promoted.
2. Memtier-benchmark results shows that, with the patch,
performance has increased more than 50%.
Ops/sec without fix - 305792.03
Ops/sec with fix - 521942.24
Changes:
V3:
- Added "* @vmf: structure describing the fault" comment for
mpol_misplaced() to fix the warning.
https://lore.kernel.org/oe-kbuild-all/202403202229.WZeAnUuO-lkp@intel.com/
v2:
- Rebased on latest upstream (v6.8-rc7)
- Used 'numa_node_id()' to get the current execution node ID, Added
'lockdep_assert_held' to make sure that the 'mpol_misplaced()' is
called with ptl held.
- The migration condition has been updated; now, migration will only
occur if the execution node is present in the policy nodemask.
-v1: https://lore.kernel.org/linux-mm/9c3f7b743477560d1c5b12b8c111a584a2cc92ee.1708097962.git.donettom@linux.ibm.com/#t
Donet Tom (2):
mm/mempolicy: Use numa_node_id() instead of cpu_to_node()
mm/numa_balancing:Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy
include/linux/mempolicy.h | 5 +++--
mm/huge_memory.c | 2 +-
mm/internal.h | 2 +-
mm/memory.c | 8 +++++---
mm/mempolicy.c | 36 +++++++++++++++++++++++++++---------
5 files changed, 37 insertions(+), 16 deletions(-)
--
2.39.3
next reply other threads:[~2024-03-21 11:31 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-21 11:29 Donet Tom [this message]
2024-03-21 11:29 ` [PATCH v3 1/2] mm/mempolicy: Use numa_node_id() instead of cpu_to_node() Donet Tom
2024-03-21 11:29 ` [PATCH v3 2/2] mm/numa_balancing:Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy Donet Tom
2024-03-22 8:32 ` Huang, Ying
2024-03-22 10:05 ` Donet Tom
2024-03-25 2:48 ` Huang, Ying
2024-03-25 5:00 ` Donet Tom
2024-03-25 5:02 ` Donet Tom
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1711002865.git.donettom@linux.ibm.com \
--to=donettom@linux.ibm.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=feng.tang@intel.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=riel@surriel.com \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=wangkefeng.wang@huawei.com \
--cc=willy@infradead.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.