From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Wei Xu <weixugc@google.com>
Cc: "ying.huang@intel.com" <ying.huang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Thelen <gthelen@google.com>, Yang Shi <shy828301@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Jagdish Gediya <jvgediya@linux.ibm.com>,
	Michal Hocko <mhocko@kernel.org>,
	Tim C Chen <tim.c.chen@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Alistair Popple <apopple@nvidia.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Feng Tang <feng.tang@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Dan Williams <dan.j.williams@intel.com>,
	David Rientjes <rientjes@google.com>,
	Linux MM <linux-mm@kvack.org>,
	Brice Goglin <brice.goglin@gmail.com>,
	Hesham Almatary <hesham.almatary@huawei.com>
Subject: Re: RFC: Memory Tiering Kernel Interfaces (v2)
Date: Thu, 12 May 2022 13:06:10 +0530
Message-ID: <87y1z7jj85.fsf@linux.ibm.com>
In-Reply-To: <CAAPL-u-g86QqHaHGGtVJMER8ENC2dpekK+2qOkxoRFmC0F_80g@mail.gmail.com>

Wei Xu <weixugc@google.com> writes:

> On Thu, May 12, 2022 at 12:12 AM Aneesh Kumar K V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 5/12/22 12:33 PM, ying.huang@intel.com wrote:
>> > On Wed, 2022-05-11 at 23:22 -0700, Wei Xu wrote:
>> >> Sysfs Interfaces
>> >> ================
>> >>
>> >> * /sys/devices/system/memtier/memtierN/nodelist
>> >>
>> >>    where N = 0, 1, 2 (the kernel supports only 3 tiers for now).
>> >>
>> >>    Format: node_list
>> >>
>> >>    Read-only.  When read, list the memory nodes in the specified tier.
>> >>
>> >>    Tier 0 is the highest tier, while tier 2 is the lowest tier.
>> >>
>> >>    The absolute value of a tier id number has no specific meaning.
>> >>    What matters is the relative order of the tier id numbers.
>> >>
>> >>    When a memory tier has no nodes, the kernel can hide its memtier
>> >>    sysfs files.
>> >>
>> >> * /sys/devices/system/node/nodeN/memtier
>> >>
>> >>    where N = 0, 1, ...
>> >>
>> >>    Format: int or empty
>> >>
>> >>    When read, list the memory tier that the node belongs to.  Its value
>> >>    is empty for a CPU-only NUMA node.
>> >>
>> >>    When written, the kernel moves the node into the specified memory
>> >>    tier if the move is allowed.  The tier assignments of all other nodes
>> >>    are not affected.
>> >>
>> >>    Initially, we can make this interface read-only.
>> >
>> > It seems that "/sys/devices/system/node/nodeN/memtier" has all the
>> > information we need.  Do we really need
>> > "/sys/devices/system/memtier/memtierN/nodelist"?
>> >
>> > The same list can be obtained with a simple shell command line,
>> >
>> > $ grep . /sys/devices/system/node/node*/memtier | sort -n -k 2 -t ':'
>> >
>>
>> It would be really useful to fetch the memory tier node list in one
>> place rather than reading multiple sysfs directories. If memory tiers
>> have no other attributes, we could make
>> "/sys/devices/system/memtier/memtierN" itself a file holding the NUMA
>> node list and so avoid /sys/devices/system/memtier/memtierN/nodelist.
>>
>> -aneesh
>
> Implementing memtierN as just a file is harder and doesn't follow
> the existing sysfs pattern, either.  Besides, having memtierN as a
> directory keeps it extensible.

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 6248326f944d..251f38ec3816 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -1097,12 +1097,49 @@ static struct attribute *node_state_attrs[] = {
 	NULL
 };
 
+#define MAX_TIER 3
+nodemask_t memory_tier[MAX_TIER];
+
+#define _TIER_ATTR_RO(name, tier_index)					\
+	{ __ATTR(name, 0444, show_tier, NULL), tier_index, NULL }
+
+struct memory_tier_attr {
+	struct device_attribute attr;
+	int tier_index;
+	int (*write)(nodemask_t nodes);
+};
+
+static ssize_t show_tier(struct device *dev,
+			 struct device_attribute *attr, char *buf)
+{
+	struct memory_tier_attr *mt = container_of(attr, struct memory_tier_attr, attr);
+
+	return sysfs_emit(buf, "%*pbl\n",
+			  nodemask_pr_args(&memory_tier[mt->tier_index]));
+}
+
 static const struct attribute_group memory_root_attr_group = {
 	.attrs = node_state_attrs,
 };
 
+
+#define TOP_TIER 0
+static struct memory_tier_attr memory_tiers[] = {
+	[0] = _TIER_ATTR_RO(memory_top_tier, TOP_TIER),
+};
+
+static struct attribute *memory_tier_attrs[] = {
+	&memory_tiers[0].attr.attr,
+	NULL
+};
+
+static const struct attribute_group memory_tier_attr_group = {
+	.attrs = memory_tier_attrs,
+};
+
 static const struct attribute_group *cpu_root_attr_groups[] = {
 	&memory_root_attr_group,
+	&memory_tier_attr_group,
 	NULL,
 };
 

As long as we have the ability to see the nodelist, I am good with the
proposal.
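
For illustration only, here is roughly what user space would have to do
to rebuild a per-tier node list if only the per-node memtier files from
the RFC existed. This is a minimal sketch under stated assumptions: the
sysfs files are the proposed ones and are not present in any released
kernel, and the function name tier_nodelist and its optional root
directory argument are hypothetical, added so the script can be tried
against a fake directory tree.

```shell
#!/bin/sh
# tier_nodelist [ROOT]: print one "tier: node,node,..." line per memory tier,
# built from the proposed (hypothetical) ROOT/nodeN/memtier files.
# ROOT defaults to /sys/devices/system/node.
tier_nodelist() {
	root=${1:-/sys/devices/system/node}
	for f in "$root"/node*/memtier; do
		# Skip nodes without a readable memtier file (also covers an unmatched glob).
		[ -r "$f" ] || continue
		tier=$(cat "$f")
		# Per the RFC, an empty value means a CPU-only node; skip it.
		[ -n "$tier" ] || continue
		# Turn ".../nodeN/memtier" into "N".
		node=${f%/memtier}
		node=${node##*/node}
		printf '%s %s\n' "$tier" "$node"
	done | sort -n -k1,1 -k2,2 | awk '
		# Start a new output line whenever the tier id changes.
		NR == 1 || $1 != tier {
			if (NR > 1) printf "\n"
			tier = $1
			printf "%s: %s", tier, $2
			next
		}
		# Same tier as the previous line: append the node id.
		{ printf ",%s", $2 }
		END { if (NR > 0) printf "\n" }'
}
```

Calling tier_nodelist with no argument would read the real sysfs tree
once such files exist; passing a directory lets it run against test
data. Having to walk every node directory like this is the inconvenience
a kernel-provided node list avoids.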

-aneesh


