From: Randy Dunlap <rdunlap@infradead.org>
To: Gregory Price <gourry@gourry.net>, linux-cxl@vger.kernel.org
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
kernel-team@meta.com, dave@stgolabs.net,
jonathan.cameron@huawei.com, dave.jiang@intel.com,
alison.schofield@intel.com, vishal.l.verma@intel.com,
ira.weiny@intel.com, dan.j.williams@intel.com, corbet@lwn.net
Subject: Re: [RFC PATCH v2 14/18] cxl: docs/allocation/reclaim
Date: Sat, 10 May 2025 19:31:17 -0700
Message-ID: <2abca2ae-d53a-4324-b74f-0f189b41f1ae@infradead.org>
In-Reply-To: <20250430181048.1197475-15-gourry@gourry.net>

On 4/30/25 11:10 AM, Gregory Price wrote:
> Document a bit about how reclaim interacts with various CXL
> configurations.
>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> .../driver-api/cxl/allocation/reclaim.rst | 51 +++++++++++++++++++
> Documentation/driver-api/cxl/index.rst | 1 +
> 2 files changed, 52 insertions(+)
> create mode 100644 Documentation/driver-api/cxl/allocation/reclaim.rst
>
> diff --git a/Documentation/driver-api/cxl/allocation/reclaim.rst b/Documentation/driver-api/cxl/allocation/reclaim.rst
> new file mode 100644
> index 000000000000..f37c8b1cc3bd
> --- /dev/null
> +++ b/Documentation/driver-api/cxl/allocation/reclaim.rst
> @@ -0,0 +1,51 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======
> +Reclaim
> +=======
> +Another way CXL memory can be utilized *indirectly* is via the reclaim system
> +in :code:`mm/vmscan.c`. Reclaim is engaged when memory capacity on the system
> +becomes pressured based on global and cgroup-local `watermark` settings.
> +
> +In this section we won't discuss the `watermark` configurations, just how CXL
> +memory can be consumed by various pieces of reclaim system.
> +
> +Demotion
> +========
> +By default, the reclaim system will prefer swap (or zswap) when reclaiming
> +memory. Enabling :code:`kernel/mm/numa/demotion_enabled` will cause vmscan
> +to opportunistically prefer distant NUMA nodes to swap or zswap, if capacity
> +is available.
> +
> +Demotion engages the :code:`mm/memory_tier.c` component to determine the
> +next demotion node. The next demotion node is based on the :code:`HMAT`
> +or :code:`CDAT` performance data.
> +
> +cpusets.mems_allowed quirk
> +--------------------------
> +In Linux v6.15 and below, demotion does not respect :code:`cpusets.mems_allowed`
> +when migrating pages. As a result, if demotion is enabled, vmscan cannot
> +guarantee isolation of a container's memory from nodes not set in mems_allowed.
> +
> +In Linux v6.XX and up, demotion does attempt to respect
> +:code:`cpusets.mems_allowed`; however, certain classes of shared memory
> +originally instantiated by another cgroup (such as common libraries - e.g.
> +libc) may still be demoted. As a result, the mems_allowed interface still
> +cannot provide perfect isolation from the remote nodes.
> +
> +ZSwap and Node Preference
> +=========================
> +In Linux v6.15 and below, ZSwap allocates memory from the local node of the
> +processor for the new pages being compressed. Since pages being compressed
> +are typically cold, the result is a cold page becomes promoted - only to
> +be later demoted as it ages off the LRU.
> +
> +In Linux v6.XX, ZSwap tries to prefer the node of the page being compressed
> +as the allocation target for the compression page. This helps prevernt
prevent
> +thrashing.
> +
> +Demotion with ZSwap
> +===================
> +When enabling both Demotion and ZSwap, you create a situation where ZSwap
> +will prefer the slowest form of CXL memory by default until that tier of
> +memory is exausted.
exhausted.
--
~Randy