public inbox for linux-cxl@vger.kernel.org
From: Ira Weiny <ira.weiny@intel.com>
To: Gregory Price <gourry@gourry.net>, Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Fan Ni <fan.ni@samsung.com>,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	"Alison Schofield" <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	<linux-cxl@vger.kernel.org>, <nvdimm@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>, Li Ming <ming.li@zohomail.com>
Subject: Re: [PATCH v9 00/19] DCD: Add support for Dynamic Capacity Devices (DCD)
Date: Tue, 3 Feb 2026 16:04:23 -0600
Message-ID: <698270e76775_44a22100c4@iweiny-mobl.notmuch>
In-Reply-To: <aYEHmjmv-Z_WyrqV@gourry-fedora-PF4VCD3F>

Gregory Price wrote:
> On Sun, Apr 13, 2025 at 05:52:08PM -0500, Ira Weiny wrote:
> > A git tree of this series can be found here:
> > 
> > 	https://github.com/weiny2/linux-kernel/tree/dcd-v6-2025-04-13
> > 
> > This is now based on 6.15-rc2.
> > 
> 
> Extreme necro-bump for this set, but i wonder what folks opinion is on
> DCD support if we expose a new region control pattern ala:
> 
> https://lore.kernel.org/linux-cxl/20260129210442.3951412-1-gourry@gourry.net/
> 
> The major difference would be elimination of sparse-DAX, which i know

Sparse-dax is somewhat of a misnomer.  sparse regions may have been a
better name for it.  That is really what we are speaking of.  It is the
idea that we have regions which don't necessarily have memory backing the
size of the region.

In the DCD series I wrote, dax devices could only be created after
extents appeared.

> has been a concern, in favor of a per-region-driver policy on how to
> manage hot-add/remove events.

I think a concern would be that each region driver is implementing a
'policy' which requires new drivers for new policies.

My memory is very weak on all this stuff...

My general architecture was trying to expose the extent ranges to user
space and allow userspace to build them into ranges with whatever policy
they wanted.

The tests[1] were all written to create dax devices on top of the extents
in certain ways to link together those extents.

[1] https://github.com/weiny2/ndctl/blob/dcd-region3-2025-04-13/test/cxl-dcd.sh

I did not like the 'implicit' nature of the association of dax device with
extent.  But it maintained backwards compatibility with non-sparse
regions...

My vision for tags was that eventually dax device creation could have a
tag specified prior and would only allocate from extents with that tag.
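
As a rough illustration of that tag-filtered allocation idea, here is a
minimal sketch in plain Python.  It is a model only, not code from the
series and not any real kernel or ndctl interface: the Extent type, its
fields, and allocate_from_tag() are hypothetical names made up for this
example.

```python
from dataclasses import dataclass
from typing import List, Optional
from uuid import UUID


@dataclass(frozen=True)
class Extent:
    """Hypothetical model of a region extent (not a kernel structure)."""
    offset: int            # byte offset within the region
    length: int            # size in bytes
    tag: Optional[UUID]    # None models an untagged extent


def allocate_from_tag(extents: List[Extent], tag: UUID, size: int) -> List[Extent]:
    """Pick extents carrying `tag`, in offset order, until `size` bytes are covered.

    Extents with a different tag (or no tag) are never considered.
    Raises ValueError if the tagged extents cannot satisfy the request.
    """
    chosen: List[Extent] = []
    remaining = size
    for ext in sorted(extents, key=lambda e: e.offset):
        if remaining <= 0:
            break
        if ext.tag != tag:
            continue
        chosen.append(ext)
        remaining -= ext.length
    if remaining > 0:
        raise ValueError("not enough capacity with the requested tag")
    return chosen
```

The point of the sketch is only that the tag acts as a filter on which
extents may back a device; how the remaining bytes of a partially used
extent would be handled is left out.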

> 
> Things I've discussed with folks in different private contexts
> 
> sysram usecase:
> ----
>   echo regionN > decoder0.0/create_dc_region
>   /* configure decoders */
>   echo regionN > cxl/drivers/sysram/bind
> 
> tagged extents arrive and leave as a group, no sparseness
>     extents cannot share a tag unless they arrive together
>     e.g. set(A) & set(B) must have different tags
>     add and expose daxN.M/uuid as the tag for collective management

I'm not following this.  If set(A) arrives can another set(A) arrive
later?

How long does the kernel wait for all the 'A's to arrive?  Or must they
all be in a single set of extents linked by the 'more' bit?

Regardless IMO if user space was monitoring the extents with tag A they
can decide if and when all those extents have arrived and can build on top
of that.
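
To make the "user space decides when set(A) is complete" idea concrete,
here is a minimal sketch in plain Python.  Everything in it is an
assumption for illustration: TagTracker, on_extent_added(), and the
at_least() policy are hypothetical names, tags are plain strings, and
nothing here talks to a real device or sysfs.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# An extent is modelled as an (offset, length) pair in bytes.
ExtentRange = Tuple[int, int]
Policy = Callable[[List[ExtentRange]], bool]


class TagTracker:
    """Collects extents per tag and asks a caller-supplied policy
    whether the set seen so far is complete."""

    def __init__(self, is_complete: Policy):
        self._is_complete = is_complete
        self._pending: Dict[str, List[ExtentRange]] = defaultdict(list)

    def on_extent_added(self, tag: str, offset: int,
                        length: int) -> Optional[List[ExtentRange]]:
        """Record one extent.  Return the whole set once the policy says
        it is complete, otherwise None."""
        self._pending[tag].append((offset, length))
        if self._is_complete(self._pending[tag]):
            # Hand the set over and forget it; a later extent with the
            # same tag starts a new pending set.
            return self._pending.pop(tag)
        return None


def at_least(total_bytes: int) -> Policy:
    """Example policy: a set is complete once it covers total_bytes."""
    return lambda exts: sum(length for _, length in exts) >= total_bytes
```

With a policy such as at_least(300), a 100-byte extent tagged "A" leaves
the set pending, and a further 200-byte "A" extent completes it.  A
different policy (a timeout, an expected extent count) could be swapped
in without touching the tracker.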

> 
> Can decide whether linux wants to support untagged extents
>     cxl_sysram could choose to track and hotplug untagged extents

'cxl_sysram' is the sysram region driver, right?

Are we expecting to have tagged and untagged extents on the same DCD
region?

I'm ok not supporting that.  I just want to be clear about what you are
suggesting.

Would the cxl_sysram region driver be attached to the DCD partition?  Then
it would have some DCD functionality built in...  I guess make a common
extent processing lib for the 2 drivers?

I feel like that is a lot of policy being built into the kernel.  The
alternative is for the DCD region driver to simply tell user space 'Hey,
there is a new extent here' and for user space to then online that as
sysram, which keeps the policy decision in user space.

Segueing into the N_PRIVATE work.  Couldn't we assign that memory, via
userspace, to a NUMA node with N_PRIVATE only memory...  Then it is
onlined in a way that any app which is allocating from that node would
get that memory.  And keep it out of kernel space?

But keep all that policy in user space when an extent appears.  Not baked
into a particular driver.

>     directly without going through DAX. Partial release would be
>     possible on a per-extent granularity in this case.
> ----
> 
> 
> virtio usecase:  (making some stuff up here)
> ----
>   echo regionN > decoder0.0/create_dc_region
>   /* configure decoders */
>   echo regionN > cxl/drivers/virtio/bind
> 
> tags are required and may imply specific VM routing
>     may or may not use DAX under the hood
> 
> extents may be tracked individually and add/removed individually
>     if using DAX, this implies 1 device per extent.
>     This probably requires a minimum extent size to be reasonable.
> 
> Does not expose the memory as SysRAM, instead builds new interface
>     to handle memory management message routing to/from the VMM
>     (N_MEMORY_PRIVATE?)
> ----
> 
> 
> devdax usecase (FAMFS?)
> ---- 
>   echo regionN > decoder0.0/create_dc_region
>   /* configure decoders */
>   echo regionN > cxl/drivers/devdax/bind
> 
> All sets of extents appear as new DAX devices
> Tags are exposed via daxN.M/uuid
> Tags are required
>    otherwise you can't make sense of what that devdax represents
> ---
> 
> Begs the question:
>    Do we require tags as a baseline feature for all modes?

Previously no.  But I've often thought of no tag as just a special case of
tag == 0.  But we agreed at one time that untagged extents would have a
special 'no tag' meaning, such that it was just memory to be used
however...
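
A tiny sketch of the "no tag is just tag == 0" reading, in plain Python.
It assumes the tag arrives as 16 raw bytes and that an all-zero value
means "untagged"; normalize_tag() is a hypothetical helper name used
only for this example.

```python
from typing import Optional
from uuid import UUID

NIL_TAG = UUID(int=0)  # all-zero UUID, treated here as "no tag"


def normalize_tag(raw: bytes) -> Optional[UUID]:
    """Map a 16-byte tag to a UUID, or to None when it is all zeros.

    Raises ValueError (from UUID) if `raw` is not exactly 16 bytes.
    """
    tag = UUID(bytes=raw)
    return None if tag == NIL_TAG else tag
```

Whether untagged extents then get the special "just memory" treatment or
are handled like any other tag value stays a policy choice on top of
this.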

>    No tag - no service.
>    Heavily implied:  Tags are globally unique (uuid)
> 
> But I think this resolves a lot of the disparate disagreements on "what
> to do with tags" and how to manage sparseness - just split the policy
> into each individual use-case's respective driver.

I think what I'm worried about is where that policy resides.

I think it is best to have a DCD region driver which simply exposes
extents and allows user space to control how those extents are used.  I
think some of what you have above works like that but I want to be careful
baking in policy.

> 
> If a sufficiently unique use-case comes along that doesn't fit the
> existing categories - a new region-driver may be warranted.

Again I don't like the idea of needing new drivers for new policies.  That
goes against how things should work in the kernel.

Ira

