* Re: [PATCH v7 11/12] mm/demotion: Add documentation for memory tiering
From: Bagas Sanjaya @ 2022-06-25 2:56 UTC
To: kernel test robot
Cc: Aneesh Kumar K.V, linux-mm, akpm, kbuild-all, Wei Xu, Huang Ying,
Yang Shi, Davidlohr Bueso, Tim C Chen, Michal Hocko,
Linux Kernel Mailing List, Hesham Almatary, Dave Hansen,
Jonathan Cameron, Alistair Popple, Dan Williams, Jagdish Gediya,
linux-doc
On Thu, Jun 23, 2022 at 05:21:17AM +0800, kernel test robot wrote:
> If you fix the issue, kindly add following tag where applicable
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> >> Documentation/admin-guide/mm/memory-tiering.rst:5: (SEVERE/4) Title overline & underline mismatch.
>
> vim +5 Documentation/admin-guide/mm/memory-tiering.rst
>
> 4
> > 5 ===========
> 6 Memory tiers
> 7 ============
> 8
>
Here is the fixup. Thanks.
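For context, the report above is about a section title whose overline is one character shorter than its underline. A minimal sketch of that check in Python follows. The helper names `is_adornment` and `find_title_mismatches` are made up for illustration, the check only looks at overline/title/underline triples that use the same adornment character, and it is not a substitute for running the htmldocs build.

```python
def is_adornment(line):
    """True if the line is a run of one repeated punctuation character."""
    s = line.rstrip()
    return bool(s) and len(set(s)) == 1 and not s[0].isalnum() and not s[0].isspace()


def find_title_mismatches(text):
    """Return (line_number, overline, underline) for each mismatched title.

    Only three-line titles (overline, title, underline) are examined, and
    only when both adornment lines use the same character.
    """
    lines = [line.rstrip() for line in text.splitlines()]
    problems = []
    for i in range(len(lines) - 2):
        over, title, under = lines[i], lines[i + 1], lines[i + 2]
        if not (is_adornment(over) and is_adornment(under)):
            continue
        if not title.strip() or is_adornment(title):
            continue
        if over[0] == under[0] and len(over) != len(under):
            problems.append((i + 1, over, under))
    return problems


# The heading as submitted (11-character overline) and as fixed (12).
broken = "===========\nMemory tiers\n============\n"
fixed = "============\nMemory tiers\n============\n"
print(find_title_mismatches(broken))
print(find_title_mismatches(fixed))
```

Run against the heading as submitted, the sketch reports the first line of the heading; run against the fixed heading, it reports nothing.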
---- >8 ----
From ee8b97451b6ad1869f4d426e2d3825ac20a6e15d Mon Sep 17 00:00:00 2001
From: Bagas Sanjaya <bagasdotme@gmail.com>
Date: Sat, 25 Jun 2022 09:48:28 +0700
Subject: [PATCH] fixup for "mm/demotion: Add documentation for memory tiering"
Extend the title heading overline by one (=) to match the underline.
Fixes: 64fc925cf27dac ("mm/demotion: Add documentation for memory tiering")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
---
Documentation/admin-guide/mm/memory-tiering.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/mm/memory-tiering.rst b/Documentation/admin-guide/mm/memory-tiering.rst
index 142c36651f5dd2..0a75e0dab1fd8e 100644
--- a/Documentation/admin-guide/mm/memory-tiering.rst
+++ b/Documentation/admin-guide/mm/memory-tiering.rst
@@ -2,7 +2,7 @@
.. _admin_guide_memory_tiering:
-===========
+============
Memory tiers
============
--
An old man doll... just what I always wanted! - Clara
* Re: [PATCH v7 11/12] mm/demotion: Add documentation for memory tiering
From: Bagas Sanjaya @ 2022-06-25 4:13 UTC
To: Aneesh Kumar K.V
Cc: linux-mm, akpm, Wei Xu, Huang Ying, Yang Shi, Davidlohr Bueso,
Tim C Chen, Michal Hocko, Linux Kernel Mailing List,
Hesham Almatary, Dave Hansen, Jonathan Cameron, Alistair Popple,
Dan Williams, Jagdish Gediya, linux-doc
On Wed, Jun 22, 2022 at 01:55:12PM +0530, Aneesh Kumar K.V wrote:
> From: Jagdish Gediya <jvgediya@linux.ibm.com>
>
Hi Aneesh and Jagdish,
The documentation can be improved, see below.
> All N_MEMORY nodes are divided into 3 memory tiers with tier ID value
> MEMORY_TIER_HBM_GPU, MEMORY_TIER_DRAM and MEMORY_TIER_PMEM. By default,
> all nodes are assigned to default memory tier.
>
> Demotion path for all N_MEMORY nodes is prepared based on the tier ID value
> of memory tiers.
>
> This patch adds documentation for memory tiering introduction, its sysfs
> interfaces and how demotion is performed based on memory tiers.
>
I think the patch message should just be:
"Add documentation for memory tiering. It also covers its sysfs
interfaces and how demotion is performed based on memory tiers."
> +===========
> +Memory tiers
> +============
> +
> +This document describes explicit memory tiering support along with
> +demotion based on memory tiers.
> +
This causes an htmldocs error, for which I have sent the fixup at [1].
> +Memory nodes are divided into 3 types of memory tiers with tier ID
> +value as shown based on their hardware characteristics.
> +
> +
> +MEMORY_TIER_HBM_GPU
> +MEMORY_TIER_DRAM
> +MEMORY_TIER_PMEM
> +
Use bullet list.
> +Sysfs interfaces
> +================
> +
> +Nodes belonging to specific tier can be read from,
> +/sys/devices/system/memtier/memtierN/nodelist (Read-Only)
> +
> +Where N is 0 - 2.
The "where" sentence can be merged into the previous sentence.
> +
> +Example 1:
> +For a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
> +node 2 is a PMEM node an ideal tier layout will be
> +
> +$ cat /sys/devices/system/memtier/memtier0/nodelist
> +1
> +$ cat /sys/devices/system/memtier/memtier1/nodelist
> +0
> +$ cat /sys/devices/system/memtier/memtier2/nodelist
> +2
> +
The code snippets should have been inside literal code blocks.
> +Example 2:
> +For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
> +nodes.
> +
> +$ cat /sys/devices/system/memtier/memtier0/nodelist
> +cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
> +directory
> +$ cat /sys/devices/system/memtier/memtier1/nodelist
> +0-1
> +$ cat /sys/devices/system/memtier/memtier2/nodelist
> +2-3
> +
Use literal code block.
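As an aside, the `nodelist` values in these examples ("1", "0-1", "2-3") use the usual kernel list notation. A small Python sketch of reading them is below. The `memtier` paths are the ones proposed in the quoted patch and may not exist on a given kernel, and the function names `parse_nodelist` and `read_tier_nodes` are hypothetical.

```python
def parse_nodelist(text):
    """Expand a list string such as "0-1,4" into [0, 1, 4]."""
    nodes = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            nodes.extend(range(int(start), int(end) + 1))
        else:
            nodes.append(int(part))
    return nodes


def read_tier_nodes(tier, base="/sys/devices/system/memtier"):
    """Return the nodes in memtier<tier>, or None if the tier does not exist.

    The base path is taken from the quoted patch (an assumption, not a
    guaranteed interface).
    """
    path = f"{base}/memtier{tier}/nodelist"
    try:
        with open(path) as f:
            return parse_nodelist(f.read())
    except FileNotFoundError:
        return None


print(parse_nodelist("0-1"))
print(parse_nodelist("2"))
```

Returning None for a missing tier mirrors the "No such file or directory" case in example 2, where memtier0 is absent.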
> +Default memory tier can be read from,
> +/sys/devices/system/memtier/default_tier (Read-Only)
> +
> +e.g.
> +$ cat /sys/devices/system/memtier/default_tier
> +memtier200
> +
> +Max memory tier ID supported can be read from,
> +/sys/devices/system/memtier/max_tier (Read-Only)
> +
> +e.g.
> +$ cat /sys/devices/system/memtier/max_tier
> +400
> +
> +Individual node's memory tier can be read of set using,
> +/sys/devices/system/node/nodeN/memtier (Read-Write)
> +
> +where N = node id
> +
> +When this interface is written, Node is moved from the old memory tier
> +to new memory tier and demotion targets for all N_MEMORY nodes are
> +built again.
> +
> +For example 1 mentioned above,
> +$ cat /sys/devices/system/node/node0/memtier
> +1
> +$ cat /sys/devices/system/node/node1/memtier
> +0
> +$ cat /sys/devices/system/node/node2/memtier
> +2
> +
The same suggestions above apply here, too.
> +Enable/Disable demotion
> +-----------------------
> +
> +By default demotion is disabled, it can be enabled/disabled using
> +below sysfs interface,
> +
> +$ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
> +
Use literal code block.
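For what it is worth, the toggle described above amounts to writing a boolean to one file. A hedged Python sketch is below. The path comes from the quoted text, the helper names are hypothetical, writing to the real file needs root, and the `demo` function deliberately uses a scratch file rather than sysfs.

```python
import os
import tempfile

# Path as given in the quoted documentation.
DEMOTION_SWITCH = "/sys/kernel/mm/numa/demotion_enabled"


def set_demotion(enabled, path=DEMOTION_SWITCH):
    """Write "1" or "0" to the switch file (needs root on a real system)."""
    with open(path, "w") as f:
        f.write("1" if enabled else "0")


def get_demotion(path=DEMOTION_SWITCH):
    """Return True if the switch file reads as enabled ("1" or "true")."""
    with open(path) as f:
        return f.read().strip() in ("1", "true")


def demo():
    """Exercise both helpers against a scratch file instead of sysfs."""
    with tempfile.TemporaryDirectory() as tmp:
        scratch = os.path.join(tmp, "demotion_enabled")
        set_demotion(True, scratch)
        on = get_demotion(scratch)
        set_demotion(False, scratch)
        off = get_demotion(scratch)
    return on, off


print(demo())
```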
> +preferred and allowed demotion nodes
> +------------------------------------
> +
> +Preferred nodes for a specific N_MEMORY node are the best nodes
> +from the next possible lower memory tier. Allowed nodes for any
> +node are all the nodes available in all possible lower memory
> +tiers.
> +
> +Example:
> +
> +For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
> +nodes,
> +
> +node distances:
> +node   0    1    2    3
> +   0  10   20   30   40
> +   1  20   10   40   30
> +   2  30   40   10   40
> +   3  40   30   40   10
> +
Use reST table.
> +memory_tiers[0] = <empty>
> +memory_tiers[1] = 0-1
> +memory_tiers[2] = 2-3
> +
> +node_demotion[0].preferred = 2
> +node_demotion[0].allowed = 2, 3
> +node_demotion[1].preferred = 3
> +node_demotion[1].allowed = 3, 2
> +node_demotion[2].preferred = <empty>
> +node_demotion[2].allowed = <empty>
> +node_demotion[3].preferred = <empty>
> +node_demotion[3].allowed = <empty>
> +
What are these above? Node properties? BTW, use literal code block.
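To make the question concrete, here is a rough Python sketch of how the values quoted above could be derived from the distance table and the tier layout. It is not the kernel's implementation. It assumes that tiers are listed from fastest to slowest (as in the quoted example, where index 1 holds DRAM and index 2 holds PMEM), that "best" means smallest distance, and that the allowed list is ordered by distance. The function name `build_demotion_targets` is hypothetical.

```python
def build_demotion_targets(tiers, distance):
    """Compute preferred and allowed demotion nodes for every node.

    tiers: list of node lists, ordered from fastest to slowest tier
           (an assumption based on the quoted example).
    distance: distance[a][b] is the NUMA distance from node a to node b.
    """
    targets = {}
    for idx, tier in enumerate(tiers):
        # Non-empty tiers below the current one.
        lower_tiers = [t for t in tiers[idx + 1:] if t]
        for node in tier:
            preferred, allowed = [], []
            if lower_tiers:
                next_tier = lower_tiers[0]
                best = min(distance[node][n] for n in next_tier)
                # Preferred: closest node(s) in the next lower tier.
                preferred = [n for n in next_tier if distance[node][n] == best]
                # Allowed: every node in any lower tier, nearest first.
                lower_nodes = [n for t in lower_tiers for n in t]
                allowed = sorted(lower_nodes, key=lambda n: distance[node][n])
            targets[node] = {"preferred": preferred, "allowed": allowed}
    return targets


# Distance table and tier layout from the quoted example.
distance = [
    [10, 20, 30, 40],
    [20, 10, 40, 30],
    [30, 40, 10, 40],
    [40, 30, 40, 10],
]
tiers = [[], [0, 1], [2, 3]]
targets = build_demotion_targets(tiers, distance)
for node in sorted(targets):
    print(node, targets[node])
```

Under those assumptions the sketch reproduces the quoted values: node 0 prefers node 2, node 1 prefers node 3, and the PMEM nodes have no demotion targets.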
In case the suggestions above are unclear, here is the diff:
---- >8 ----
diff --git a/Documentation/admin-guide/mm/memory-tiering.rst b/Documentation/admin-guide/mm/memory-tiering.rst
index 0a75e0dab1fd8e..10ec5aab6ddd53 100644
--- a/Documentation/admin-guide/mm/memory-tiering.rst
+++ b/Documentation/admin-guide/mm/memory-tiering.rst
@@ -14,13 +14,13 @@ Introduction
Many systems have multiple types of memory devices e.g. GPU, DRAM and
PMEM. The memory subsystem of these systems can be called a memory
-tiering system because the performance of the different types of
+tiering system because the performance of each type of
memory is different. Memory tiers are defined based on the hardware
capabilities of memory nodes. Each memory tier is assigned a tier ID
value that determines the memory tier position in demotion order.
The memory tier assignment of each node is independent of each
-other. Moving a node from one tier to another tier doesn't affect
+other. Moving a node from one tier to another doesn't affect
the tier assignment of any other node.
Memory tiers are used to build the demotion targets for nodes. A node
@@ -32,10 +32,9 @@ Memory tier rank
Memory nodes are divided into 3 types of memory tiers with tier ID
value as shown based on their hardware characteristics.
-
-MEMORY_TIER_HBM_GPU
-MEMORY_TIER_DRAM
-MEMORY_TIER_PMEM
+ * MEMORY_TIER_HBM_GPU
+ * MEMORY_TIER_DRAM
+ * MEMORY_TIER_PMEM
Memory tiers initialization and (re)assignments
===============================================
@@ -49,68 +48,73 @@ hotplug, the memory tier with default tier ID is assigned to the memory node.
Sysfs interfaces
================
-Nodes belonging to specific tier can be read from,
-/sys/devices/system/memtier/memtierN/nodelist (Read-Only)
+Nodes belonging to specific tier can be read from
+/sys/devices/system/memtier/memtierN/nodelist, where N is 0 - 2 (read-only)
-Where N is 0 - 2.
+Examples:
-Example 1:
-For a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
-node 2 is a PMEM node an ideal tier layout will be
+1. On a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
+ node 2 is a PMEM node an ideal tier layout will be:
-$ cat /sys/devices/system/memtier/memtier0/nodelist
-1
-$ cat /sys/devices/system/memtier/memtier1/nodelist
-0
-$ cat /sys/devices/system/memtier/memtier2/nodelist
-2
+ .. code-block::
-Example 2:
-For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
-nodes.
+ $ cat /sys/devices/system/memtier/memtier0/nodelist
+ 1
+ $ cat /sys/devices/system/memtier/memtier1/nodelist
+ 0
+ $ cat /sys/devices/system/memtier/memtier2/nodelist
+ 2
-$ cat /sys/devices/system/memtier/memtier0/nodelist
-cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
-directory
-$ cat /sys/devices/system/memtier/memtier1/nodelist
-0-1
-$ cat /sys/devices/system/memtier/memtier2/nodelist
-2-3
+2. On a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
+ nodes:
-Default memory tier can be read from,
-/sys/devices/system/memtier/default_tier (Read-Only)
+ .. code-block::
-e.g.
-$ cat /sys/devices/system/memtier/default_tier
-memtier200
+ $ cat /sys/devices/system/memtier/memtier0/nodelist
+ cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
+ directory
+ $ cat /sys/devices/system/memtier/memtier1/nodelist
+ 0-1
+ $ cat /sys/devices/system/memtier/memtier2/nodelist
+ 2-3
-Max memory tier ID supported can be read from,
-/sys/devices/system/memtier/max_tier (Read-Only)
+Default memory tier can be read from
+/sys/devices/system/memtier/default_tier (read-only), e.g.:
-e.g.
-$ cat /sys/devices/system/memtier/max_tier
-400
+.. code-block::
-Individual node's memory tier can be read of set using,
-/sys/devices/system/node/nodeN/memtier (Read-Write)
+ $ cat /sys/devices/system/memtier/default_tier
+ memtier200
-where N = node id
+Max memory tier ID supported can be read from
+/sys/devices/system/memtier/max_tier (read-only), e.g.:
-When this interface is written, Node is moved from the old memory tier
+.. code-block::
+
+ $ cat /sys/devices/system/memtier/max_tier
+ 400
+
+Individual node's memory tier can be read or set using
+/sys/devices/system/node/nodeN/memtier (read-write), where N = node id.
+
+When this interface is written, node is moved from the old memory tier
to new memory tier and demotion targets for all N_MEMORY nodes are
built again.
-For example 1 mentioned above,
-$ cat /sys/devices/system/node/node0/memtier
-1
-$ cat /sys/devices/system/node/node1/memtier
-0
-$ cat /sys/devices/system/node/node2/memtier
-2
+For example 1 mentioned above:
+
+.. code-block::
+
+ $ cat /sys/devices/system/node/node0/memtier
+ 1
+ $ cat /sys/devices/system/node/node1/memtier
+ 0
+ $ cat /sys/devices/system/node/node2/memtier
+ 2
Additional memory tiers can be created by writing a tier ID value to this file.
-This results in a new memory tier creation and moving the specific NUMA node to
-that memory tier.
+This results into creating a new tier and moving the specific NUMA node to
+that tier.
Demotion
========
@@ -128,19 +132,20 @@ be used.
Instead of a page being discarded during reclaim, it can be moved to
persistent memory. Allowing page migration during reclaim enables
-these systems to migrate pages from fast(higher) tiers to slow(lower)
-tiers when the fast(higher) tier is under pressure.
+these systems to migrate pages from fast (higher) tiers to slow (lower)
+tiers when the fast (higher) tier is under pressure.
Enable/Disable demotion
-----------------------
-By default demotion is disabled, it can be enabled/disabled using
-below sysfs interface,
+By default demotion is disabled. It can be toggled by:
-$ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
+.. code-block::
-preferred and allowed demotion nodes
+ $ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
+
+Preferred and allowed demotion nodes
------------------------------------
Preferred nodes for a specific N_MEMORY node are the best nodes
@@ -148,35 +153,40 @@ from the next possible lower memory tier. Allowed nodes for any
node are all the nodes available in all possible lower memory
tiers.
-Example:
+For example, on a system where Node 0 & 1 are CPU + DRAM nodes,
+node 2 & 3 are PMEM nodes:
-For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
-nodes,
+ * node distances
-node distances:
-node   0    1    2    3
-   0  10   20   30   40
-   1  20   10   40   30
-   2  30   40   10   40
-   3  40   30   40   10
+ ==== == == == ==
+ node 0  1  2  3
+ ==== == == == ==
+ 0    10 20 30 40
+ 1    20 10 40 30
+ 2    30 40 10 40
+ 3    40 30 40 10
+ ==== == == == ==
-memory_tiers[0] = <empty>
-memory_tiers[1] = 0-1
-memory_tiers[2] = 2-3
+ * node properties
-node_demotion[0].preferred = 2
-node_demotion[0].allowed = 2, 3
-node_demotion[1].preferred = 3
-node_demotion[1].allowed = 3, 2
-node_demotion[2].preferred = <empty>
-node_demotion[2].allowed = <empty>
-node_demotion[3].preferred = <empty>
-node_demotion[3].allowed = <empty>
+ .. code-block::
+
+ memory_tiers[0] = <empty>
+ memory_tiers[1] = 0-1
+ memory_tiers[2] = 2-3
+
+ node_demotion[0].preferred = 2
+ node_demotion[0].allowed = 2, 3
+ node_demotion[1].preferred = 3
+ node_demotion[1].allowed = 3, 2
+ node_demotion[2].preferred = <empty>
+ node_demotion[2].allowed = <empty>
+ node_demotion[3].preferred = <empty>
+ node_demotion[3].allowed = <empty>
Memory allocation for demotion
------------------------------
-If a page needs to be demoted from any node, the kernel 1st tries
-to allocate a new page from the node's preferred node and fallbacks to
-node's allowed targets in allocation fallback order.
-
+If a page needs to be demoted from any node, the kernel first tries
+to allocate a new page from the node's preferred target node and fallbacks
+to node's allowed targets in allocation fallback order.
Thanks.
[1]: https://lore.kernel.org/linux-doc/YrZ5cTFOSuWxlF2t@debian.me/
--
An old man doll... just what I always wanted! - Clara
* Re: [PATCH v7 11/12] mm/demotion: Add documentation for memory tiering
From: Aneesh Kumar K.V @ 2022-06-27 4:40 UTC
To: Bagas Sanjaya
Cc: linux-mm, akpm, Wei Xu, Huang Ying, Yang Shi, Davidlohr Bueso,
Tim C Chen, Michal Hocko, Linux Kernel Mailing List,
Hesham Almatary, Dave Hansen, Jonathan Cameron, Alistair Popple,
Dan Williams, Jagdish Gediya, linux-doc
Bagas Sanjaya <bagasdotme@gmail.com> writes:
> On Wed, Jun 22, 2022 at 01:55:12PM +0530, Aneesh Kumar K.V wrote:
>> From: Jagdish Gediya <jvgediya@linux.ibm.com>
>>
>
> Hi Aneesh and Jagdish,
>
> The documentation can be improved, see below.
>
>> All N_MEMORY nodes are divided into 3 memory tiers with tier ID value
>> MEMORY_TIER_HBM_GPU, MEMORY_TIER_DRAM and MEMORY_TIER_PMEM. By default,
>> all nodes are assigned to default memory tier.
>>
>> Demotion path for all N_MEMORY nodes is prepared based on the tier ID value
>> of memory tiers.
>>
>> This patch adds documentation for memory tiering introduction, its sysfs
>> interfaces and how demotion is performed based on memory tiers.
>>
>
> I think the patch message should just be:
> "Add documentation for memory tiering. It also covers its sysfs
> interfaces and how demotion is performed based on memory tiers."
>
>> +===========
>> +Memory tiers
>> +============
>> +
>> +This document describes explicit memory tiering support along with
>> +demotion based on memory tiers.
>> +
>
> This causes an htmldocs error, for which I have sent the fixup at [1].
>
>> +Memory nodes are divided into 3 types of memory tiers with tier ID
>> +value as shown based on their hardware characteristics.
>> +
>> +
>> +MEMORY_TIER_HBM_GPU
>> +MEMORY_TIER_DRAM
>> +MEMORY_TIER_PMEM
>> +
>
> Use bullet list.
>
>> +Sysfs interfaces
>> +================
>> +
>> +Nodes belonging to specific tier can be read from,
>> +/sys/devices/system/memtier/memtierN/nodelist (Read-Only)
>> +
>> +Where N is 0 - 2.
>
> The "where" sentence can be merged into the previous sentence.
>
>> +
>> +Example 1:
>> +For a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
>> +node 2 is a PMEM node an ideal tier layout will be
>> +
>> +$ cat /sys/devices/system/memtier/memtier0/nodelist
>> +1
>> +$ cat /sys/devices/system/memtier/memtier1/nodelist
>> +0
>> +$ cat /sys/devices/system/memtier/memtier2/nodelist
>> +2
>> +
>
> The code snippets should have been inside literal code blocks.
>
>> +Example 2:
>> +For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
>> +nodes.
>> +
>> +$ cat /sys/devices/system/memtier/memtier0/nodelist
>> +cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
>> +directory
>> +$ cat /sys/devices/system/memtier/memtier1/nodelist
>> +0-1
>> +$ cat /sys/devices/system/memtier/memtier2/nodelist
>> +2-3
>> +
>
> Use literal code block.
>
>> +Default memory tier can be read from,
>> +/sys/devices/system/memtier/default_tier (Read-Only)
>> +
>> +e.g.
>> +$ cat /sys/devices/system/memtier/default_tier
>> +memtier200
>> +
>> +Max memory tier ID supported can be read from,
>> +/sys/devices/system/memtier/max_tier (Read-Only)
>> +
>> +e.g.
>> +$ cat /sys/devices/system/memtier/max_tier
>> +400
>> +
>> +Individual node's memory tier can be read of set using,
>> +/sys/devices/system/node/nodeN/memtier (Read-Write)
>> +
>> +where N = node id
>> +
>> +When this interface is written, Node is moved from the old memory tier
>> +to new memory tier and demotion targets for all N_MEMORY nodes are
>> +built again.
>> +
>> +For example 1 mentioned above,
>> +$ cat /sys/devices/system/node/node0/memtier
>> +1
>> +$ cat /sys/devices/system/node/node1/memtier
>> +0
>> +$ cat /sys/devices/system/node/node2/memtier
>> +2
>> +
>
> The same suggestions above apply here, too.
>
>> +Enable/Disable demotion
>> +-----------------------
>> +
>> +By default demotion is disabled, it can be enabled/disabled using
>> +below sysfs interface,
>> +
>> +$ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
>> +
>
> Use literal code block.
>
>> +preferred and allowed demotion nodes
>> +------------------------------------
>> +
>> +Preferred nodes for a specific N_MEMORY node are the best nodes
>> +from the next possible lower memory tier. Allowed nodes for any
>> +node are all the nodes available in all possible lower memory
>> +tiers.
>> +
>> +Example:
>> +
>> +For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
>> +nodes,
>> +
>> +node distances:
>> +node   0    1    2    3
>> +   0  10   20   30   40
>> +   1  20   10   40   30
>> +   2  30   40   10   40
>> +   3  40   30   40   10
>> +
>
> Use reST table.
>
>> +memory_tiers[0] = <empty>
>> +memory_tiers[1] = 0-1
>> +memory_tiers[2] = 2-3
>> +
>> +node_demotion[0].preferred = 2
>> +node_demotion[0].allowed = 2, 3
>> +node_demotion[1].preferred = 3
>> +node_demotion[1].allowed = 3, 2
>> +node_demotion[2].preferred = <empty>
>> +node_demotion[2].allowed = <empty>
>> +node_demotion[3].preferred = <empty>
>> +node_demotion[3].allowed = <empty>
>> +
>
> What are these above? Node properties? BTW, use literal code block.
>
> In case the suggestions above are unclear, here is the diff:
I got the following error when applying the diff below:
patch: **** malformed patch at line 180: @@ -148,35 +153,40 @@ from the next possible lower memory tier. Allowed nodes for any
That said, I did modify the documentation based on your feedback, and it is
much better than what I had. Thanks for the review. I will send v8 with the
changes folded in. I also added the line below to the commit message; I hope that is OK.
[update doc format by Bagas Sanjaya <bagasdotme@gmail.com>]
>
> ---- >8 ----
>
> diff --git a/Documentation/admin-guide/mm/memory-tiering.rst b/Documentation/admin-guide/mm/memory-tiering.rst
> index 0a75e0dab1fd8e..10ec5aab6ddd53 100644
> --- a/Documentation/admin-guide/mm/memory-tiering.rst
> +++ b/Documentation/admin-guide/mm/memory-tiering.rst
> @@ -14,13 +14,13 @@ Introduction
>
> Many systems have multiple types of memory devices e.g. GPU, DRAM and
> PMEM. The memory subsystem of these systems can be called a memory
> -tiering system because the performance of the different types of
> +tiering system because the performance of each type of
> memory is different. Memory tiers are defined based on the hardware
> capabilities of memory nodes. Each memory tier is assigned a tier ID
> value that determines the memory tier position in demotion order.
>
> The memory tier assignment of each node is independent of each
> -other. Moving a node from one tier to another tier doesn't affect
> +other. Moving a node from one tier to another doesn't affect
> the tier assignment of any other node.
>
> Memory tiers are used to build the demotion targets for nodes. A node
> @@ -32,10 +32,9 @@ Memory tier rank
> Memory nodes are divided into 3 types of memory tiers with tier ID
> value as shown based on their hardware characteristics.
>
> -
> -MEMORY_TIER_HBM_GPU
> -MEMORY_TIER_DRAM
> -MEMORY_TIER_PMEM
> + * MEMORY_TIER_HBM_GPU
> + * MEMORY_TIER_DRAM
> + * MEMORY_TIER_PMEM
>
> Memory tiers initialization and (re)assignments
> ===============================================
> @@ -49,68 +48,73 @@ hotplug, the memory tier with default tier ID is assigned to the memory node.
> Sysfs interfaces
> ================
>
> -Nodes belonging to specific tier can be read from,
> -/sys/devices/system/memtier/memtierN/nodelist (Read-Only)
> +Nodes belonging to specific tier can be read from
> +/sys/devices/system/memtier/memtierN/nodelist, where N is 0 - 2 (read-only)
>
> -Where N is 0 - 2.
> +Examples:
>
> -Example 1:
> -For a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
> -node 2 is a PMEM node an ideal tier layout will be
> +1. On a system where Node 0 is CPU + DRAM nodes, Node 1 is HBM node,
> + node 2 is a PMEM node an ideal tier layout will be:
>
> -$ cat /sys/devices/system/memtier/memtier0/nodelist
> -1
> -$ cat /sys/devices/system/memtier/memtier1/nodelist
> -0
> -$ cat /sys/devices/system/memtier/memtier2/nodelist
> -2
> + .. code-block::
>
> -Example 2:
> -For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
> -nodes.
> + $ cat /sys/devices/system/memtier/memtier0/nodelist
> + 1
> + $ cat /sys/devices/system/memtier/memtier1/nodelist
> + 0
> + $ cat /sys/devices/system/memtier/memtier2/nodelist
> + 2
>
> -$ cat /sys/devices/system/memtier/memtier0/nodelist
> -cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
> -directory
> -$ cat /sys/devices/system/memtier/memtier1/nodelist
> -0-1
> -$ cat /sys/devices/system/memtier/memtier2/nodelist
> -2-3
> +2. On a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
> + nodes:
>
> -Default memory tier can be read from,
> -/sys/devices/system/memtier/default_tier (Read-Only)
> + .. code-block::
>
> -e.g.
> -$ cat /sys/devices/system/memtier/default_tier
> -memtier200
> + $ cat /sys/devices/system/memtier/memtier0/nodelist
> + cat: /sys/devices/system/memtier/memtier0/nodelist: No such file or
> + directory
> + $ cat /sys/devices/system/memtier/memtier1/nodelist
> + 0-1
> + $ cat /sys/devices/system/memtier/memtier2/nodelist
> + 2-3
>
> -Max memory tier ID supported can be read from,
> -/sys/devices/system/memtier/max_tier (Read-Only)
> +Default memory tier can be read from
> +/sys/devices/system/memtier/default_tier (read-only), e.g.:
>
> -e.g.
> -$ cat /sys/devices/system/memtier/max_tier
> -400
> +.. code-block::
>
> -Individual node's memory tier can be read of set using,
> -/sys/devices/system/node/nodeN/memtier (Read-Write)
> + $ cat /sys/devices/system/memtier/default_tier
> + memtier200
>
> -where N = node id
> +Max memory tier ID supported can be read from
> +/sys/devices/system/memtier/max_tier (read-only), e.g.:
>
> -When this interface is written, Node is moved from the old memory tier
> +.. code-block::
> +
> + $ cat /sys/devices/system/memtier/max_tier
> + 400
> +
> +Individual node's memory tier can be read or set using
> +/sys/devices/system/node/nodeN/memtier (read-write), where N = node id.
> +
> +When this interface is written, node is moved from the old memory tier
> to new memory tier and demotion targets for all N_MEMORY nodes are
> built again.
>
> -For example 1 mentioned above,
> -$ cat /sys/devices/system/node/node0/memtier
> -1
> -$ cat /sys/devices/system/node/node1/memtier
> -0
> -$ cat /sys/devices/system/node/node2/memtier
> -2
> +For example 1 mentioned above:
> +
> +.. code-block::
> +
> + $ cat /sys/devices/system/node/node0/memtier
> + 1
> + $ cat /sys/devices/system/node/node1/memtier
> + 0
> + $ cat /sys/devices/system/node/node2/memtier
> + 2
>
> Additional memory tiers can be created by writing a tier ID value to this file.
> -This results in a new memory tier creation and moving the specific NUMA node to
> -that memory tier.
> +This results into creating a new tier and moving the specific NUMA node to
> +that tier.
>
> Demotion
> ========
> @@ -128,19 +132,20 @@ be used.
>
> Instead of a page being discarded during reclaim, it can be moved to
> persistent memory. Allowing page migration during reclaim enables
> -these systems to migrate pages from fast(higher) tiers to slow(lower)
> -tiers when the fast(higher) tier is under pressure.
> +these systems to migrate pages from fast (higher) tiers to slow (lower)
> +tiers when the fast (higher) tier is under pressure.
>
>
> Enable/Disable demotion
> -----------------------
>
> -By default demotion is disabled, it can be enabled/disabled using
> -below sysfs interface,
> +By default demotion is disabled. It can be toggled by:
>
> -$ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
> +.. code-block::
>
> -preferred and allowed demotion nodes
> + $ echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
> +
> +Preferred and allowed demotion nodes
> ------------------------------------
>
> Preferred nodes for a specific N_MEMORY node are the best nodes
> @@ -148,35 +153,40 @@ from the next possible lower memory tier. Allowed nodes for any
> node are all the nodes available in all possible lower memory
> tiers.
>
> -Example:
> +For example, on a system where Node 0 & 1 are CPU + DRAM nodes,
> +node 2 & 3 are PMEM nodes:
>
> -For a system where Node 0 & 1 are CPU + DRAM nodes, node 2 & 3 are PMEM
> -nodes,
> + * node distances
>
> -node distances:
> -node   0    1    2    3
> -   0  10   20   30   40
> -   1  20   10   40   30
> -   2  30   40   10   40
> -   3  40   30   40   10
> + ==== == == == ==
> + node 0  1  2  3
> + ==== == == == ==
> + 0    10 20 30 40
> + 1    20 10 40 30
> + 2    30 40 10 40
> + 3    40 30 40 10
> + ==== == == == ==
>
> -memory_tiers[0] = <empty>
> -memory_tiers[1] = 0-1
> -memory_tiers[2] = 2-3
> + * node properties
>
> -node_demotion[0].preferred = 2
> -node_demotion[0].allowed = 2, 3
> -node_demotion[1].preferred = 3
> -node_demotion[1].allowed = 3, 2
> -node_demotion[2].preferred = <empty>
> -node_demotion[2].allowed = <empty>
> -node_demotion[3].preferred = <empty>
> -node_demotion[3].allowed = <empty>
> + .. code-block::
> +
> + memory_tiers[0] = <empty>
> + memory_tiers[1] = 0-1
> + memory_tiers[2] = 2-3
> +
> + node_demotion[0].preferred = 2
> + node_demotion[0].allowed = 2, 3
> + node_demotion[1].preferred = 3
> + node_demotion[1].allowed = 3, 2
> + node_demotion[2].preferred = <empty>
> + node_demotion[2].allowed = <empty>
> + node_demotion[3].preferred = <empty>
> + node_demotion[3].allowed = <empty>
>
> Memory allocation for demotion
> ------------------------------
>
> -If a page needs to be demoted from any node, the kernel 1st tries
> -to allocate a new page from the node's preferred node and fallbacks to
> -node's allowed targets in allocation fallback order.
> -
> +If a page needs to be demoted from any node, the kernel first tries
> +to allocate a new page from the node's preferred target node and fallbacks
> +to node's allowed targets in allocation fallback order.
>
>
> Thanks.
>
> [1]: https://lore.kernel.org/linux-doc/YrZ5cTFOSuWxlF2t@debian.me/
>
> --
> An old man doll... just what I always wanted! - Clara