From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	Wei Xu <weixugc@google.com>, Yang Shi <shy828301@gmail.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Tim C Chen <tim.c.chen@intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Hesham Almatary <hesham.almatary@huawei.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Alistair Popple <apopple@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	jvgediya.oss@gmail.com, Jagdish Gediya <jvgediya@linux.ibm.com>
Subject: Re: [PATCH v11 1/8] mm/demotion: Add support for explicit memory tiers
Date: Fri, 29 Jul 2022 12:54:36 +0530
Message-ID: <874jz0s7nv.fsf@linux.ibm.com>
In-Reply-To: <871qu4mo5a.fsf@yhuang6-desk2.ccr.corp.intel.com>

"Huang, Ying" <ying.huang@intel.com> writes:

> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>
>> In the current kernel, memory tiers are defined implicitly via a demotion path
>> relationship between NUMA nodes, which is created during the kernel
>> initialization and updated when a NUMA node is hot-added or hot-removed. The
>> current implementation puts all nodes with CPU into the highest tier, and builds
>> the tier hierarchy tier-by-tier by establishing the per-node demotion targets
>> based on the distances between nodes.
>>
>> The current memory tier implementation needs to be improved for several
>> important use cases:
>>
>> The current tier initialization code always initializes each memory-only NUMA
>> node into a lower tier. But a memory-only NUMA node may have a high performance
>> memory device (e.g. a DRAM-backed memory-only node on a virtual machine) that
>> should be put into a higher tier.
>>
>> The current tier hierarchy always puts CPU nodes into the top tier. But on a
>> system with HBM or GPU devices, the memory-only NUMA nodes mapping these devices
>> should be in the top tier, and DRAM nodes with CPUs are better placed in the
>> next lower tier.
>>
>> With the current kernel, a higher-tier node can only be demoted to the nodes
>> with the shortest distance on the next lower tier, as defined by the demotion
>> path, not to any other node from any lower tier. This strict demotion order
>> does not work in all use cases (e.g. some use cases may want to allow
>> cross-socket demotion to another node in the same demotion tier as a fallback
>> when the preferred demotion node is out of space). This demotion order is also
>> inconsistent with the page allocation fallback order when all the nodes in a
>> higher tier are out of space: the page allocation can fall back to any node
>> from any lower tier, whereas the demotion order doesn't allow that.
>>
>> This patch series addresses the above by defining memory tiers explicitly.
>>
>> The Linux kernel presents memory devices as NUMA nodes, and each memory device
>> is of a specific type. The memory type of a device is represented by its
>> abstract distance. A memory tier corresponds to a range of abstract distances.
>> This allows for classifying memory devices with a specific performance range
>> into a memory tier.
>>
>> This patch configures the range/chunk size to be 128. The default DRAM
>> abstract distance is 512. We can have 4 memory tiers below the default DRAM
>> abstract distance which cover the range 0 - 127, 128 - 255, 256 - 383, 384 - 511.
>> Slower memory devices like persistent memory will have abstract distance below
>> the default DRAM level and hence will be placed in these 4 lower tiers.
>
> For abstract distance, a lower value means higher performance and a higher
> value means lower performance. So the abstract distance of PMEM should
> be larger than that of DRAM.

I noticed that after sending v11 and have already sent v12 with the fix,
which can be found at
https://lore.kernel.org/linux-mm/20220729061349.968148-1-aneesh.kumar@linux.ibm.com
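
To make the chunking arithmetic in the quoted changelog concrete, here is
a minimal sketch; it is not the code from the patch. It assumes only the
two numbers given above (chunk size 128, default DRAM abstract distance
512) and the convention from the review comment that a lower abstract
distance means higher performance. The names CHUNK_SIZE, ADISTANCE_DRAM,
adistance_to_chunk(), chunk_start() and chunk_end() are hypothetical and
chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical names; the values are the ones quoted in the changelog. */
#define CHUNK_SIZE     128               /* width of one tier's abstract distance range */
#define ADISTANCE_DRAM (4 * CHUNK_SIZE)  /* 512, the default DRAM abstract distance */

/* Index of the chunk (tier range) that contains the given abstract distance. */
int adistance_to_chunk(int adistance)
{
	assert(adistance >= 0);	/* abstract distances are assumed non-negative here */
	return adistance / CHUNK_SIZE;
}

/* First abstract distance covered by the given chunk. */
int chunk_start(int chunk)
{
	return chunk * CHUNK_SIZE;
}

/* Last abstract distance covered by the given chunk (inclusive). */
int chunk_end(int chunk)
{
	return chunk_start(chunk) + CHUNK_SIZE - 1;
}
```

With these definitions, adistance_to_chunk(511) is 3 and
adistance_to_chunk(ADISTANCE_DRAM) is 4, so the four chunks below the
default DRAM value cover 0 - 127, 128 - 255, 256 - 383 and 384 - 511.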
>
>> A kernel parameter is provided to override the default memory tier.
>
> Forgot to delete?

Yes. That is also fixed in v12.

-aneesh