linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
To: Ying Huang <ying.huang@intel.com>
Cc: Greg Thelen <gthelen@google.com>, Yang Shi <shy828301@gmail.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Tim C Chen <tim.c.chen@intel.com>,
	Brice Goglin <brice.goglin@gmail.com>,
	Michal Hocko <mhocko@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Hesham Almatary <hesham.almatary@huawei.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Alistair Popple <apopple@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Feng Tang <feng.tang@intel.com>,
	Jagdish Gediya <jvgediya@linux.ibm.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org
Subject: Re: [RFC PATCH v4 1/7] mm/demotion: Add support for explicit memory tiers
Date: Mon, 6 Jun 2022 14:32:17 +0530	[thread overview]
Message-ID: <1301311f-12f0-0fda-1245-82bb4c3f5e93@linux.ibm.com> (raw)
In-Reply-To: <d36eabfdc062aeb5aee18ab7ac0bca36b19f8521.camel@intel.com>

On 6/6/22 2:22 PM, Ying Huang wrote:
....
>>>> I can move the patch "mm/demotion/dax/kmem: Set node's memory tier to
>>>> MEMORY_TIER_PMEM" before switching the demotion logic so that on systems
>>>> with two memory tiers (DRAM and pmem) the demotion continues to work
>>>> as expected after patch 3 ("mm/demotion: Build demotion targets based on
>>>> explicit memory tiers"). With that, there will not be any regression in
>>>> between the patch series.
>>>>
>>>
>>> Thanks!  Please do that.  And I think you can add sysfs interface after
>>> that patch too.  That is, in [1/7]
>>>
>>
>> I am not sure why you insist on moving sysfs interfaces later. They are
>> introduced based on the helper added. It make patch review easier to
>> look at both the helpers and the user of the helper together in a patch.
> 
> Yes.  We should introduce a function and its user in one patch for
> review.  But this doesn't mean that we should introduce the user space
> interface as the first step.  I think the user space interface should
> output correct information when we expose it.
> 

If you look at this patchset we are not exposing any wrong information.

patch 1 -> adds ability to register the memory tiers and expose details 
of registered memory tier. At this point the patchset only support DRAM 
tier and hence only one tier is shown

patch 2 -> adds per node memtier attribute. So only DRAM nodes shows the 
details, because the patchset yet has not introduced a slower memory 
tier like PMEM.

patch 4 -> introducing demotion. Will make that patch 5

patch 5 -> add dax kmem numa nodes as slower memory tier. Now this 
becomes patch 4 at which point we will correctly show two memory tiers 
in the system.


>>> +struct memory_tier {
>>> +	nodemask_t nodelist;
>>> +};
>>>
>>> And struct device can be added after the kernel has switched the
>>> implementation based on explicit memory tiers.
>>>
>>> +struct memory_tier {
>>> +	struct device dev;
>>> +	nodemask_t nodelist;
>>> +};
>>>
>>
>>
>> Can you elaborate on this? or possibly review the v5 series indicating
>> what change you are suggesting here?
>>
>>
>>> But I don't think it's a good idea to have "struct device" embedded in
>>> "struct memory_tier".  We don't have "struct device" embedded in "struct
>>> pgdata_list"...
>>>
>>
>> I avoided creating an array for memory_tier (memory_tier[]) so that we
>> can keep it dynamic. Keeping dev embedded in struct memory_tier simplify
>> the life cycle management of that dynamic list. We free the struct
>> memory_tier allocation via device release function (memtier->dev.release
>> = memory_tier_device_release )
>>
>> Why do you think it is not a good idea?
> 
> I think that we shouldn't bind our kernel internal implementation with
> user space interface too much.  Yes.  We can expose kernel internal
> implementation to user space in a direct way.  I suggest you to follow
> the style of "struct pglist_data" and "struct node".  If we decouple
> "struct memory_tier" and "struct memory_tier_dev" (or some other name),
> we can refer to "struct memory_tier" without depending on all device
> core.  Memory tier should be accessible inside the kernel even without a
> user interface.  And memory tier isn't a device in concept.
> 

memory_tiers are different from pglist_data and struct node in that we 
also allow the creation of them from userspace. That is the life time of 
a memory tier is driven from userspace and it is much easier to manage 
them via sysfs file lifetime mechanism rather than inventing an 
independent and more complex way of doing the same.

> For life cycle management, I think that we can do that without sysfs
> too.
> 

unless there are specific details that you think will be broken by 
embedding struct device inside struct memory_tier, IMHO I still consider 
the embedded implementation much simpler and in accordance with other 
kernel design patterns.

-aneesh


  reply	other threads:[~2022-06-06  9:02 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-26 21:22 RFC: Memory Tiering Kernel Interfaces (v3) Wei Xu
2022-05-27  2:58 ` Ying Huang
2022-05-27 14:05   ` Hesham Almatary
2022-05-27 16:25     ` Wei Xu
2022-05-27 12:25 ` [RFC PATCH v4 0/7] mm/demotion: Memory tiers and demotion Aneesh Kumar K.V
2022-05-27 12:25   ` [RFC PATCH v4 1/7] mm/demotion: Add support for explicit memory tiers Aneesh Kumar K.V
2022-05-27 13:59     ` Jonathan Cameron
2022-06-02  6:07     ` Ying Huang
2022-06-06  2:49       ` Ying Huang
2022-06-06  3:56         ` Aneesh Kumar K V
2022-06-06  5:33           ` Ying Huang
2022-06-06  6:01             ` Aneesh Kumar K V
2022-06-06  6:27               ` Aneesh Kumar K.V
2022-06-06  7:53                 ` Ying Huang
2022-06-06  8:01                   ` Aneesh Kumar K V
2022-06-06  8:52                     ` Ying Huang
2022-06-06  9:02                       ` Aneesh Kumar K V [this message]
2022-06-08  1:24                         ` Ying Huang
2022-06-08  7:16     ` Ying Huang
2022-06-08  8:24       ` Aneesh Kumar K V
2022-06-08  8:27         ` Ying Huang
2022-05-27 12:25   ` [RFC PATCH v4 2/7] mm/demotion: Expose per node memory tier to sysfs Aneesh Kumar K.V
2022-05-27 14:15     ` Jonathan Cameron
2022-06-03  8:40       ` Aneesh Kumar K V
2022-06-06 14:59         ` Jonathan Cameron
2022-06-06 16:01           ` Aneesh Kumar K V
2022-06-06 16:16             ` Jonathan Cameron
2022-06-06 16:39               ` Aneesh Kumar K V
2022-06-06 17:46                 ` Aneesh Kumar K.V
2022-06-07 14:32                   ` Jonathan Cameron
2022-06-08  7:18     ` Ying Huang
2022-06-08  8:25       ` Aneesh Kumar K V
2022-06-08  8:29         ` Ying Huang
2022-05-27 12:25   ` [RFC PATCH v4 3/7] mm/demotion: Build demotion targets based on explicit memory tiers Aneesh Kumar K.V
2022-05-27 14:31     ` Jonathan Cameron
2022-05-30  3:35     ` [mm/demotion] 8ebccd60c2: BUG:sleeping_function_called_from_invalid_context_at_mm/compaction.c kernel test robot
2022-05-27 12:25   ` [RFC PATCH v4 4/7] mm/demotion/dax/kmem: Set node's memory tier to MEMORY_TIER_PMEM Aneesh Kumar K.V
2022-06-01  6:29     ` Bharata B Rao
2022-06-01 13:49       ` Aneesh Kumar K V
2022-06-02  6:36         ` Bharata B Rao
2022-06-03  9:04           ` Aneesh Kumar K V
2022-06-06 10:11             ` Bharata B Rao
2022-06-06 10:16               ` Aneesh Kumar K V
2022-06-06 11:54                 ` Aneesh Kumar K.V
2022-06-06 12:09                   ` Bharata B Rao
2022-06-06 13:00                     ` Aneesh Kumar K V
2022-05-27 12:25   ` [RFC PATCH v4 5/7] mm/demotion: Add support to associate rank with memory tier Aneesh Kumar K.V
2022-05-27 14:45     ` Jonathan Cameron
2022-05-27 15:45       ` Aneesh Kumar K V
2022-05-30 12:36         ` Jonathan Cameron
2022-06-02  6:41     ` Ying Huang
2022-05-27 12:25   ` [RFC PATCH v4 6/7] mm/demotion: Add support for removing node from demotion memory tiers Aneesh Kumar K.V
2022-06-02  6:43     ` Ying Huang
2022-05-27 12:25   ` [RFC PATCH v4 7/7] mm/demotion: Demote pages according to allocation fallback order Aneesh Kumar K.V
2022-05-27 15:03     ` Jonathan Cameron
2022-06-02  7:35     ` Ying Huang
2022-06-03 15:09       ` Aneesh Kumar K V
2022-06-06  0:43         ` Ying Huang
2022-06-06  4:07           ` Aneesh Kumar K V
2022-06-06  5:26             ` Ying Huang
2022-06-06  6:21               ` Aneesh Kumar K.V
2022-06-06  7:42                 ` Ying Huang
2022-06-06  8:02                   ` Aneesh Kumar K V
2022-06-06  8:06                     ` Ying Huang
2022-06-06 17:07               ` Yang Shi
2022-05-27 13:40 ` RFC: Memory Tiering Kernel Interfaces (v3) Aneesh Kumar K V
2022-05-27 16:30   ` Wei Xu
2022-05-29  4:31     ` Ying Huang
2022-05-30 12:50       ` Jonathan Cameron
2022-05-31  1:57         ` Ying Huang
2022-06-07 19:25         ` Tim Chen
2022-06-08  4:41           ` Aneesh Kumar K V

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1301311f-12f0-0fda-1245-82bb4c3f5e93@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=brice.goglin@gmail.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=dave@stgolabs.net \
    --cc=feng.tang@intel.com \
    --cc=gthelen@google.com \
    --cc=hesham.almatary@huawei.com \
    --cc=jvgediya@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rientjes@google.com \
    --cc=shy828301@gmail.com \
    --cc=tim.c.chen@intel.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).