From: Gregory Price <gregory.price@memverge.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "tj@kernel.org" <tj@kernel.org>,
	John Groves <john@jagalactic.com>,
	Gregory Price <gourry.memverge@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"ying.huang@intel.com" <ying.huang@intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"lizefan.x@bytedance.com" <lizefan.x@bytedance.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"roman.gushchin@linux.dev" <roman.gushchin@linux.dev>,
	"shakeelb@google.com" <shakeelb@google.com>,
	"muchun.song@linux.dev" <muchun.song@linux.dev>,
	"jgroves@micron.com" <jgroves@micron.com>
Subject: Re: [RFC PATCH v4 0/3] memcg weighted interleave mempolicy control
Date: Sun, 12 Nov 2023 21:22:20 -0500
Message-ID: <ZVGIXN83qG7jQmuj@memverge.com>
In-Reply-To: <6550144fb048d_46f0294be@dwillia2-mobl3.amr.corp.intel.com.notmuch>

On Sat, Nov 11, 2023 at 03:54:55PM -0800, Dan Williams wrote:
> tj@kernel.org wrote:
> > Hello,
> > 
> > On Fri, Nov 10, 2023 at 10:42:39PM -0500, Gregory Price wrote:
> > > On Fri, Nov 10, 2023 at 05:05:50PM -1000, tj@kernel.org wrote:
> 
> > Here, even if CXL actually becomes popular, how many are going to use memory
> > hotplug and need to dynamically rebalance memory in actively running
> > workloads? What's the scenario? Are there going to be an army of data center
> > technicians going around plugging and unplugging CXL devices depending on
> > system memory usage?
> 
> While I have personal skepticism that all of the infrastructure in the
> CXL specification is going to become popular, one mechanism that seems
> poised to cross that threshold is "dynamic capacity". So it is not the
> case that techs are running around hot-adjusting physical memory. A host
> will have a cable hop to a shared memory pool in the rack where it can
> be dynamically provisioned across hosts.
> 
> However, even then the bounds of what is dynamic is going to be
> constrained to a fixed address space with likely predictable performance
> characteristics for that address range. That potentially allows for a
> system wide memory interleave policy to be viable. That might be the
> place to start and mirrors, at a coarser granularity, what hardware
> interleaving can do.
> 
> [..]

Funny enough, this is exactly why I skipped cgroups and went directly to 
implementing the weights as an attribute of numa nodes. It cuts out a
middle-man and lets you apply weights globally.

BUT the policy is still ultimately opt-in, so you don't really get a
global effect, just a global control.  Just given that lesson, yeah
it's better to reduce the scope to mempolicy first.
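
To make the "global control, opt-in effect" point concrete, here is a
minimal userspace sketch. It is not the code from this series: MAX_NODES,
struct node_weights, struct il_state and next_node() are all hypothetical
names made up for illustration. The weights live in one table keyed by
node, while the cursor is per task and only exists for tasks that opted
in to the policy.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* hypothetical node count, for illustration only */

/* Hypothetical global table: one weight per NUMA node. */
struct node_weights {
	unsigned int weight[MAX_NODES];
};

/* Hypothetical per-task cursor; only tasks that opted in carry one. */
struct il_state {
	int node;		/* node currently being allocated from */
	unsigned int used;	/* allocations already placed on that node */
};

/*
 * Pick the node for the next allocation.  Node n receives weight[n]
 * consecutive allocations before the cursor advances.  Nodes with a
 * zero weight are skipped.  Returns -1 if every weight is zero.
 */
static int next_node(const struct node_weights *w, struct il_state *s)
{
	int tries;

	assert(s->node >= 0 && s->node < MAX_NODES);

	for (tries = 0; tries <= MAX_NODES; tries++) {
		if (s->used < w->weight[s->node]) {
			s->used++;
			return s->node;
		}
		/* Current node's share is exhausted, move to the next one. */
		s->node = (s->node + 1) % MAX_NODES;
		s->used = 0;
	}
	return -1;
}
```

With weights {3, 1, 0, 0} an opted-in task would see nodes 0, 0, 0, 1 and
then wrap back to node 0.  A task without the policy never consults the
table at all, which is the gap between a global control and a global
effect.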

Getting to global interleave weights from there... more complicated.

The simplest way I can think of to test system-wide weighted interleave
is to have the init task create a default mempolicy and have all tasks
inherit it.  That feels like a big, dumb hammer - but it might work.

Comparatively, implementing a mempolicy in the root cgroup and having
tasks use that directly "feels" better, though lessons from this patch
- iterating cgroup parent trees on allocations feels not great.
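
As a rough illustration of why the parent walk is unappealing, here is a
sketch of a hierarchical lookup.  Again this is not the patch's code:
struct weights, struct cg_node and effective_weights() are hypothetical,
and the real memcg structures look nothing like this.  The point is only
that the lookup cost grows with cgroup depth and would be paid on the
allocation path.

```c
#include <stddef.h>

/* Hypothetical weight table, one entry per node. */
struct weights {
	unsigned int w[4];
};

/* Hypothetical cgroup node: NULL weights means "inherit from parent". */
struct cg_node {
	struct cg_node *parent;
	const struct weights *weights;
};

/*
 * Walk from cg toward the root until a level that defines weights is
 * found.  *steps (if non-NULL) reports how many levels were visited,
 * i.e. the per-allocation cost of the lookup.  Returns NULL if no
 * level defines weights.
 */
static const struct weights *effective_weights(const struct cg_node *cg,
					       unsigned int *steps)
{
	const struct weights *found = NULL;
	unsigned int n = 0;

	while (cg) {
		n++;
		if (cg->weights) {
			found = cg->weights;
			break;
		}
		cg = cg->parent;
	}
	if (steps)
		*steps = n;
	return found;
}
```

If only the root defines weights, a task three levels down visits three
nodes per lookup.  Caching the result would avoid that, but then the
cache has to be invalidated whenever any ancestor changes.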

Barring that, if a cgroup.mempolicy and a default mempolicy for init
aren't realistic, I don't see a good path to fruition for a global
interleave approach that doesn't require nastier allocator changes.

In the meantime, unless there are other pro-cgroup voices, I'm going to
pivot back to my initial approach of doing it in mempolicy, though I
may explore extending mempolicy into procfs at the same time.

~Gregory
