Re: [PATCH v10 01/26] cxl: make memdev creation type agnostic

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Dan Williams <dan.j.williams@intel.com>
To: Alejandro Lucero Palau <alucerop@amd.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<linux-cxl@vger.kernel.org>, <netdev@vger.kernel.org>,
	<edward.cree@amd.com>, <davem@davemloft.net>, <kuba@kernel.org>,
	<pabeni@redhat.com>, <edumazet@google.com>,
	<dave.jiang@intel.com>
Subject: Re: [PATCH v10 01/26] cxl: make memdev creation type agnostic
Date: Tue, 18 Feb 2025 18:29:48 -0800	[thread overview]
Message-ID: <67b5421cd91d2_2d2c294ad@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <33aceef3-9e01-4b28-bc3e-7dc11b59a1f6@amd.com>

Alejandro Lucero Palau wrote:
> 
> On 2/6/25 19:37, Dan Williams wrote:
> > alucerop@ wrote:
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> In preparation for Type2 support, change memdev creation making
> >> type based on argument.
> >>
> >> Integrate initialization of dvsec and serial fields in the related
> >> cxl_dev_state within same function creating the memdev.
> >>
> >> Move the code from mbox file to memdev file.
> >>
> >> Add new header files with type2 required definitions for memdev
> >> state creation.
> >>
> >> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >> ---
> >>   drivers/cxl/core/mbox.c   | 20 --------------------
> >>   drivers/cxl/core/memdev.c | 23 +++++++++++++++++++++++
> >>   drivers/cxl/cxlmem.h      | 18 +++---------------
> >>   drivers/cxl/cxlpci.h      | 17 +----------------
> >>   drivers/cxl/pci.c         | 16 +++++++++-------
> >>   include/cxl/cxl.h         | 26 ++++++++++++++++++++++++++
> >>   include/cxl/pci.h         | 23 +++++++++++++++++++++++
> >>   7 files changed, 85 insertions(+), 58 deletions(-)
> >>   create mode 100644 include/cxl/cxl.h
> >>   create mode 100644 include/cxl/pci.h
> >>
> >> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> >> index 4d22bb731177..96155b8af535 100644
> >> --- a/drivers/cxl/core/mbox.c
> >> +++ b/drivers/cxl/core/mbox.c
> >> @@ -1435,26 +1435,6 @@ int cxl_mailbox_init(struct cxl_mailbox *cxl_mbox, struct device *host)
> >>   }
> >>   EXPORT_SYMBOL_NS_GPL(cxl_mailbox_init, "CXL");
> >>   
> >> -struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
> >> -{
> >> -	struct cxl_memdev_state *mds;
> >> -
> >> -	mds = devm_kzalloc(dev, sizeof(*mds), GFP_KERNEL);
> >> -	if (!mds) {
> >> -		dev_err(dev, "No memory available\n");
> >> -		return ERR_PTR(-ENOMEM);
> >> -	}
> >> -
> >> -	mutex_init(&mds->event.log_lock);
> >> -	mds->cxlds.dev = dev;
> >> -	mds->cxlds.reg_map.host = dev;
> >> -	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
> >> -	mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
> >> -
> >> -	return mds;
> >> -}
> >> -EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, "CXL");
> >> -
> >>   void __init cxl_mbox_init(void)
> >>   {
> >>   	struct dentry *mbox_debugfs;
> >> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> >> index 63c6c681125d..456d505f1bc8 100644
> >> --- a/drivers/cxl/core/memdev.c
> >> +++ b/drivers/cxl/core/memdev.c
> >> @@ -632,6 +632,29 @@ static void detach_memdev(struct work_struct *work)
> >>   
> >>   static struct lock_class_key cxl_memdev_key;
> >>   
> >> +struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev, u64 serial,
> >> +						 u16 dvsec, enum cxl_devtype type)
> >> +{
> >> +	struct cxl_memdev_state *mds;
> >> +
> >> +	mds = devm_kzalloc(dev, sizeof(*mds), GFP_KERNEL);
> >> +	if (!mds) {
> >> +		dev_err(dev, "No memory available\n");
> >> +		return ERR_PTR(-ENOMEM);
> >> +	}
> >> +
> >> +	mutex_init(&mds->event.log_lock);
> >> +	mds->cxlds.dev = dev;
> >> +	mds->cxlds.reg_map.host = dev;
> >> +	mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
> >> +	mds->cxlds.cxl_dvsec = dvsec;
> >> +	mds->cxlds.serial = serial;
> >> +	mds->cxlds.type = type;
> >> +
> >> +	return mds;
> >> +}
> >> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, "CXL");
> > I was envisioning that accelerators only consider 'struct cxl_dev_state'
> > and that 'struct cxl_memdev_state' is exclusively for
> > CXL_DEVTYPE_CLASSMEM memory expander use case.
> 
> 
> That was the original idea and what I have followed since the RFC, but 
> since the patchset has gone through some assumptions which turned wrong, 
> I seized the "revolution" for changing this as well.
> 
> 
> A type2 is a memdev, and what makes it different is the exposure, so I 
> can not see why an accel driver, at least a Type2, should not use a 
> cxl_memdev_state struct. This simplifies the type2 support and after 
> all, a Type2 could require the exact same things like a type3, like 
> mbox, perf, poison, ... .

I disagree, I think it avoids the discipline of maintaining Accelerator
CXL.mem infrastructure alongside the sometimes super-set sometimes
disjoint-set of generic CXL memory expander support.

Specifically, the reason I think the implementation is worse off reusing
cxl_memdev_state for accelerators is because accelerators are not
subject to the same requirements as "memory device" expanders that emit
the class code from CXL 3.1 8.1.12.1 "PCI Header - Class Code Register
(Offset 09h)".

The whole point of the CXL_DEVTYPE_CLASSMEM vs CXL_DEVTYPE_DEVMEM
distinction was for cases where accelerators are not mandated to support
the same functionality as a generic expander.

It is not until patch12 that this set notices that to_cxl_memdev_state()
has been returning NULL for accelerator created 'cxl_memdev_state'
instances. However, patch12 is confused because to_cxl_memdev_state()
has no reason to exist if it can be assumed that 'struct
cxl_memdev_state' always wraps 'struct cxl_dev_state'.

The fact that it requires thought and care to decide how accelerators
can share code paths with the generic memory class device case is a
*feature*.

So either 'enum cxl_devtype' needs to be removed altogether (would need
a strong argument that is currently absent from this set), or we need to
carefully think about the optional functionality that an accelerator may
want to reuse from expanders. As it stands, most of cxl_memdev_state
makes little sense for an accelerator:

> struct cxl_memdev_state {
>         struct cxl_dev_state cxlds;
>         size_t lsa_size; 

Why would an accelerator ever worry about PMEM?

>         char firmware_version[0x10];

Certainly accelerators have their own firmware versioning and update
flows independent of the CXL standard?

>         DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
>         DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);

Mailbox is not mandatory and Memory device mandatory commands are not
mandatory for accelerators?

>         u64 total_bytes;
>         u64 volatile_only_bytes;
>         u64 persistent_only_bytes;
>         u64 partition_align_bytes;
>         u64 active_volatile_bytes;
>         u64 active_persistent_bytes;
>         u64 next_volatile_bytes;
>         u64 next_persistent_bytes;

Already had a long discussion about accelerator DPA space enumeration.

>         struct cxl_event_state event;
>         struct cxl_poison_state poison;

These might be candidates for accelerator reuse, but I would suggest
promoting them out of 'struct cxl_memdev_state' to an optional
capability of 'struct cxl_dev_state'.

>         struct cxl_security_state security;

PMEM Security is 2-degrees of optionality away from what an accelerator
would ever need to consider.

>         struct cxl_fw_state fw;

Not even sure that cxl_memdev_state needs the ops passed to
firmware_upload_register() make little use of 'struct cxl_memdev_state'
outside of using it to look up the 'struct cxl_mailbox'.

> };


> >   Something roughly like
> > the below. Note, this borrows from the fwctl_alloc_device() example
> > which captures the spirit of registering a c4ore object wrapped by an end
> > driver provided structure).
> >
> > #define cxl_dev_state_create(parent, serial, dvsec, type, drv_struct, member)  \
> >          ({                                                                     \
> >                  static_assert(__same_type(struct cxl_dev_state,                \
> >                                            ((drv_struct *)NULL)->member));      \
> >                  static_assert(offsetof(drv_struct, member) == 0);              \
> >                  (drv_struct *)_cxl_dev_state_create(parent, serial, dvsec,     \
> >                                                      type, sizeof(drv_struct)); \
> >          })
> 
> 
> If you prefer the accel driver keeping a struct embedding the core cxl 
> object, that is fine. I can not see a reason for not doing it, although 
> I can not see a reason either for imposing this.

The cxl_pci driver would use it to align on a single way to wrap its class device
driver context around 'struct cxl_dev_state'. So it is more about
setting common expectations across cxl_pci and accelerator drivers for
how they wrap 'struct cxl_dev_state'.

[..]
> > struct cxl_memdev_state *cxl_memdev_state_create(parent, serial, dvsec)
> > {
> >          struct cxl_memdev_state *mds = cxl_dev_state_create(
> >                  parent, serial, dvsec, CXL_DEVTYPE_CLASSMEM,
> >                  struct cxl_memdev_state, cxlds);
> >
> >          if (IS_ERR(mds))
> >                  return mds;
> >          
> >          mutex_init(&mds->event.log_lock);
> >          mds->cxlds.dev = dev;
> >          mds->cxlds.reg_map.host = dev;
> >          mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
> >          mds->cxlds.cxl_dvsec = dvsec;
> >          mds->cxlds.serial = serial;
> >          mds->cxlds.type = type;
> >
> >          return mds;
> > }
> >
> > If an accelerator wants to share infrastructure that is currently housed
> > in 'struct cxl_memdev_state', then that functionality should first move
> > to 'struct cxl_dev_state'.
> 
> If you see the full patchset, you will realize the accel driver does not 
> use the cxl objects except for doing further initialization with them. 
> In other words, we keep the idea of avoiding the accel driver 
> manipulating those structs freely, and having an API for accel drivers. 
> Using cxl_memdev_struct now implies to change some core functions for 
> using that struct as the argument what should not be a problem.

To date, 'struct cxl_memdev_state' has been a dumping ground for random
context that the class driver needs. The consequences of that dumping
have been low as the only potential burden would be self-contained to
the only user of 'struct cxl_memdev_state', cxl_pci. The creation of
'struct cxl_dev_state' was motivated by the observation that the arrival
of accelerators ends that honeymoon period. The implementation needs to
be conscientious about not spreading cruft amongst multiple disparate
accelerator drivers and their use cases.

The cxl_pci class device should be able to change the cxl_memdev_state
structure at will without worry or care for understanding accelerator
use cases.

> We will need to think about this when type2 cache support comes, which 
> will mean type1 support as well. But it is hard to think about what will 
> be required then, because it is not clear yet how we should implement 
> that support. So for the impending need of having Type2 support for 
> CXL.mem, I really think this is all what we need by now.

I want to avoid the liability of accelerator drivers silently growing
attachment to functionality in 'struct cxl_memdev_state', and architect
a shared data structure / library interface for those pieces to reuse.

next prev parent reply	other threads:[~2025-02-19  2:29 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-05 15:19 [PATCH v10 00/26] cxl: add type2 device basic support alucerop
2025-02-05 15:19 ` [PATCH v10 01/26] cxl: make memdev creation type agnostic alucerop
2025-02-06 19:37   ` Dan Williams
2025-02-17 12:32     ` Alejandro Lucero Palau
2025-02-19  2:29       ` Dan Williams [this message]
2025-02-20 18:17         ` Alejandro Lucero Palau
2025-02-17 13:05     ` Alejandro Lucero Palau
2025-02-13  3:57   ` Alison Schofield
2025-02-17 12:49     ` Alejandro Lucero Palau
2025-02-17 13:06     ` Alejandro Lucero Palau
2025-02-14 17:02   ` Jonathan Cameron
2025-02-17 13:08     ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 02/26] sfc: add basic cxl initialization alucerop
2025-02-06  1:37   ` Edward Cree
2025-02-07 12:48   ` Simon Horman
2025-02-17 13:10     ` Alejandro Lucero Palau
2025-02-07 13:03   ` Simon Horman
2025-02-17 13:11     ` Alejandro Lucero Palau
2025-02-18 13:32       ` Simon Horman
2025-02-05 15:19 ` [PATCH v10 03/26] cxl: move pci generic code alucerop
2025-02-05 21:33   ` Ira Weiny
2025-02-06 17:49     ` Alejandro Lucero Palau
2025-02-14 17:11       ` Jonathan Cameron
2025-02-17 13:13         ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 04/26] cxl: move register/capability check to driver alucerop
2025-02-07 12:52   ` Simon Horman
2025-02-17 13:17     ` Alejandro Lucero Palau
2025-02-14 17:21   ` Jonathan Cameron
2025-02-17 13:18     ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 05/26] cxl: add function for type2 cxl regs setup alucerop
2025-02-05 21:35   ` Ira Weiny
2025-02-06 17:50     ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 06/26] sfc: use cxl api for regs setup and checking alucerop
2025-02-05 21:31   ` Ira Weiny
2025-02-06 17:47     ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 07/26] cxl: add support for setting media ready by an accel driver alucerop
2025-02-05 21:42   ` Ira Weiny
2025-02-06 17:58     ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 08/26] sfc: set cxl media ready alucerop
2025-02-05 15:19 ` [PATCH v10 09/26] cxl: support device identification without mailbox alucerop
2025-02-05 21:45   ` Ira Weiny
2025-02-06 18:10     ` Alejandro Lucero Palau
2025-02-06 19:23       ` Ira Weiny
2025-02-17 13:41         ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 10/26] cxl: modify dpa setup process for supporting type2 alucerop
2025-02-05 15:19 ` [PATCH v10 11/26] sfc: initialize dpa resources alucerop
2025-02-05 15:19 ` [PATCH v10 12/26] cxl: prepare memdev creation for type2 alucerop
2025-02-05 15:19 ` [PATCH v10 13/26] sfc: create type2 cxl memdev alucerop
2025-02-05 15:19 ` [PATCH v10 14/26] cxl: define a driver interface for HPA free space enumeration alucerop
2025-02-07 12:55   ` Simon Horman
2025-02-17 13:44     ` Alejandro Lucero Palau
2025-02-13  4:08   ` Alison Schofield
2025-02-17 13:49     ` Alejandro Lucero Palau
2025-02-05 15:19 ` [PATCH v10 15/26] sfc: obtain root decoder with enough HPA free space alucerop
2025-02-05 22:47   ` Ira Weiny
2025-02-17 13:54     ` Alejandro Lucero Palau
2025-02-18  0:03       ` Ira Weiny
2025-02-05 15:19 ` [PATCH v10 16/26] cxl: define a driver interface for DPA allocation alucerop
2025-02-06 19:11   ` kernel test robot
2025-02-07 13:46   ` Simon Horman
2025-02-17 14:08     ` Alejandro Lucero Palau
2025-02-18 13:34       ` Simon Horman
2025-02-18 14:09         ` Simon Horman
2025-02-05 15:19 ` [PATCH v10 17/26] sfc: get endpoint decoder alucerop
2025-02-05 15:19 ` [PATCH v10 18/26] cxl: make region type based on endpoint type alucerop
2025-02-05 15:19 ` [PATCH v10 19/26] cxl/region: factor out interleave ways setup alucerop
2025-02-05 15:19 ` [PATCH v10 20/26] cxl/region: factor out interleave granularity setup alucerop
2025-02-05 15:19 ` [PATCH v10 21/26] cxl: allow region creation by type2 drivers alucerop
2025-02-06 20:06   ` kernel test robot
2025-02-07 13:23   ` Simon Horman
2025-02-05 15:19 ` [PATCH v10 22/26] cxl: add region flag for precluding a device memory to be used for dax alucerop
2025-02-05 15:19 ` [PATCH v10 23/26] sfc: create cxl region alucerop
2025-02-05 15:19 ` [PATCH v10 24/26] cxl: add function for obtaining region range alucerop
2025-02-05 15:19 ` [PATCH v10 25/26] sfc: update MCDI protocol headers alucerop
2025-02-05 15:19 ` [PATCH v10 26/26] sfc: support pio mapping based on cxl alucerop
2025-02-13  1:51 ` [PATCH v10 00/26] cxl: add type2 device basic support Alison Schofield

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=67b5421cd91d2_2d2c294ad@dwillia2-xfh.jf.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=alucerop@amd.com \
    --cc=dave.jiang@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=edward.cree@amd.com \
    --cc=kuba@kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox