From: Gregory Price <gregory.price@memverge.com>
To: Yuanchu Xie <yuanchu@google.com>
Cc: David Hildenbrand <david@redhat.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Khalid Aziz <khalid.aziz@oracle.com>,
Henry Huang <henry.hj@antgroup.com>, Yu Zhao <yuzhao@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Huang Ying <ying.huang@intel.com>, Wei Xu <weixugc@google.com>,
David Rientjes <rientjes@google.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Shuah Khan <shuah@kernel.org>,
Yosry Ahmed <yosryahmed@google.com>,
Matthew Wilcox <willy@infradead.org>,
Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>,
Kairui Song <kasong@tencent.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Vasily Averin <vasily.averin@linux.dev>,
Nhat Pham <nphamcs@gmail.com>, Miaohe Lin <linmiaohe@huawei.com>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Abel Wu <wuyun.abel@bytedance.com>,
"Vishal Moola (Oracle)" <vishal.moola@gmail.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH v3 0/8] mm: workingset reporting
Date: Wed, 27 Mar 2024 17:44:20 -0400
Message-ID: <ZgSTNCP5f+T5VtBI@memverge.com>
In-Reply-To: <20240327213108.2384666-1-yuanchu@google.com>

On Wed, Mar 27, 2024 at 02:30:59PM -0700, Yuanchu Xie wrote:
>
> Promotion/Demotion
> Similar to proactive reclaim, a workingset report enables demotion to a
> slower tier of memory.
> For promotion, the workingset report interfaces need to be extended to
> report hotness and gather hotness information from the devices[1].
>
> [1]
> https://www.opencompute.org/documents/ocp-cms-hotness-tracking-requirements-white-paper-pdf-1
>
> Sysfs and Cgroup Interfaces
> ==========
> The interfaces are detailed in the patches that introduce them. The main
> idea here is we break down the workingset per-node per-memcg into time
> intervals (ms), e.g.
>
> 1000 anon=137368 file=24530
> 20000 anon=34342 file=0
> 30000 anon=353232 file=333608
> 40000 anon=407198 file=206052
> 9223372036854775807 anon=4925624 file=892892
>
> I realize this does not generalize well to hotness information, but I
> lack the intuition for an abstraction that presents hotness in a useful
> way. Based on a recent proposal for move_phys_pages[2], it seems like
> userspace tiering software would like to move specific physical pages,
> instead of informing the kernel "move x number of hot pages to y
> device". Please advise.
>
> [2]
> https://lore.kernel.org/lkml/20240319172609.332900-1-gregory.price@memverge.com/
>
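
As an aside on the quoted report format, here is a minimal userspace
sketch for reading it. It assumes each line has the shape
"<interval upper bound in ms> anon=<n> file=<n>", as in the example above.
The names Bin, parse_report and colder_than are hypothetical, and reading
each bin as "idle for up to N ms, non-cumulative" is an assumption; the
units of the anon/file counters are not stated in this mail.

```python
import re
from typing import List, NamedTuple, Tuple


class Bin(NamedTuple):
    upper_ms: int  # upper bound of the interval, in milliseconds
    anon: int      # anon counter for this interval (units not stated here)
    file: int      # file counter for this interval (units not stated here)


# One report line: "<upper_ms> anon=<n> file=<n>"
_LINE = re.compile(r"^(\d+)\s+anon=(\d+)\s+file=(\d+)$")


def parse_report(text: str) -> List[Bin]:
    """Parse the report text into a list of bins, in file order."""
    bins = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = _LINE.match(line)
        if m is None:
            raise ValueError(f"unrecognized report line: {line!r}")
        bins.append(Bin(*(int(g) for g in m.groups())))
    return bins


def colder_than(bins: List[Bin], threshold_ms: int) -> Tuple[int, int]:
    """Sum (anon, file) over bins whose upper bound exceeds threshold_ms.

    Assumes bins are non-cumulative, so a bin with upper bound U covers
    only the range between the previous bound and U.
    """
    selected = [b for b in bins if b.upper_ms > threshold_ms]
    return sum(b.anon for b in selected), sum(b.file for b in selected)


SAMPLE = """\
1000 anon=137368 file=24530
20000 anon=34342 file=0
30000 anon=353232 file=333608
40000 anon=407198 file=206052
9223372036854775807 anon=4925624 file=892892
"""

bins = parse_report(SAMPLE)
anon_cold, file_cold = colder_than(bins, 30000)
```

The last interval's bound, 9223372036854775807, is 2^63 - 1, so under the
assumption above that bin acts as the catch-all for everything older than
the previous bound.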
Please note that this proposed interface (move_phys_pages) is very
unlikely to be accepted upstream due to side-channel concerns. Instead,
it's more likely that the tiering component will expose a "promote X
pages from tier A to tier B" interface, and the kernel component would
then use/consume hotness information to determine which pages to promote.
(This is just one example; there are many more realistic designs.)

So if there is a way to expose workingset data to the mm/memory_tiers.c
component instead of via sysfs/cgroup, that is preferable.
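
To make the contrast with move_phys_pages concrete, here is a toy
userspace model of a count-based request. It is only a sketch of the
idea: Page, promote, the tier names and the integer hotness score are all
hypothetical and do not correspond to any existing kernel interface. The
point it illustrates is that the caller supplies a count and two tiers,
the selection uses hotness data held on the other side of the interface,
and no page identity or physical address is handed back.

```python
from typing import Dict, NamedTuple


class Page(NamedTuple):
    tier: str     # hypothetical tier label, e.g. "slow" or "fast"
    hotness: int  # hypothetical hotness score; higher means hotter


def promote(pages: Dict[int, Page], count: int, src: str, dst: str) -> int:
    """Move up to `count` of the hottest pages from tier src to tier dst.

    Only the number of pages moved is returned, so the caller never
    learns which pages were selected.
    """
    candidates = sorted(
        (pid for pid, page in pages.items() if page.tier == src),
        key=lambda pid: pages[pid].hotness,
        reverse=True,
    )
    chosen = candidates[:max(count, 0)]
    for pid in chosen:
        pages[pid] = pages[pid]._replace(tier=dst)
    return len(chosen)


# Model state that would live on the kernel side of the interface.
state = {
    0: Page("slow", 5),
    1: Page("slow", 50),
    2: Page("slow", 20),
    3: Page("fast", 1),
}
moved = promote(state, 2, "slow", "fast")
```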

The 'move_phys_pages' interface is more of an experimental interface to
test the effectiveness of this approach without having to plumb out the
entire system. Any userland interface should definitely not be designed
to generate physical address information for consumption unless it is
hard-locked behind admin caps.

Regards,
Gregory