public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Alejandro Lucero Palau <alucerop@amd.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<alejandro.lucero-palau@amd.com>, <linux-cxl@vger.kernel.org>,
	<netdev@vger.kernel.org>, <martin.habets@xilinx.com>,
	<edward.cree@amd.com>, <davem@davemloft.net>, <kuba@kernel.org>,
	<pabeni@redhat.com>, <edumazet@google.com>,
	<dave.jiang@intel.com>
Subject: Re: [PATCH v8 01/27] cxl: add type2 device basic support
Date: Tue, 14 Jan 2025 14:52:35 -0800	[thread overview]
Message-ID: <6786eab3a124c_20f3294f4@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <92e3b256-b660-5074-f3aa-c8ab158fcb8b@amd.com>

Alejandro Lucero Palau wrote:
> 
> On 1/8/25 14:32, Alejandro Lucero Palau wrote:
> >
> > On 1/8/25 01:33, Dan Williams wrote:
> >> Dan Williams wrote:
> >>> alejandro.lucero-palau@ wrote:
> >>>> From: Alejandro Lucero <alucerop@amd.com>
> >>>>
> >>>> Differentiate CXL memory expanders (type 3) from CXL device 
> >>>> accelerators
> >>>> (type 2) with a new function for initializing cxl_dev_state.
> >>>>
> >>>> Create accessors to cxl_dev_state to be used by accel drivers.
> >>>>
> >>>> Based on previous work by Dan Williams [1]
> >>>>
> >>>> Link: [1] 
> >>>> https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
> >>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >>>> Co-developed-by: Dan Williams <dan.j.williams@intel.com>
> >>>> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> >>>> Reviewed-by: Fan Ni <fan.ni@samsung.com>
> >>> This patch causes
> >> Whoops, forgot to complete this thought. Someting in this series causes:
> >>
> >> depmod: ERROR: Cycle detected: ecdh_generic
> >> depmod: ERROR: Cycle detected: tpm
> >> depmod: ERROR: Cycle detected: cxl_mock -> cxl_core -> cxl_mock
> >> depmod: ERROR: Cycle detected: encrypted_keys
> >> depmod: ERROR: Found 2 modules in dependency cycles!
> >>
> >> I think the non CXL ones are false likely triggered by the CXL causing
> >> depmod to exit early.
> >>
> >> Given cxl-test is unfamiliar territory to many submitters I always offer
> >> to fix up the breakage. I came up with the below incremental patch to
> >> fold in that also addresses my other feedback.
> >>
> >> Now the depmod error is something Alison saw too, and while I can also
> >> see it on patch1 if I do:
> >>
> >> - apply whole series
> >> - build => see the error
> >> - rollback patch1
> >> - build => see the error
> >>
> >> ...a subsequent build the error goes away, so I think that transient
> >> behavior is a quirk of how cxl-test is built, but some later patch in
> >> that series makes the failure permanent.
> >>
> >> In any event I figured that out after creating the below fixup and
> >> realizing that it does not fix the cxl-test build issue:
> >
> >
> > Ok. but it is a good way of showing what you had in your mind about 
> > the suggested changes.
> >
> > I'll use it for v10.
> >
> > Thanks
> >
> 
> Hi Dan,
> 
> 
> There's a problem with this approach and it is the need of the driver 
> having access to internal cxl structs like cxl_dev_state.

Apologies for stepping away for a few days, there was a chance to get
the long pending DAX page reference count series into v6.14 that I
needed to devote some review cycles.

> Your patch does not cover it but for an accel driver that struct needs 
> to be allocated before using the new cxl_dev_state_init.

Why does the cxl core need to wrap malloc on behalf of the driver?

> I think we reached an agreement in initial discussions about avoiding 
> this need through an API for accel drivers indirectly doing whatever is 
> needed regarding internal CXL structs. Initially it was stated this 
> being necessary for avoiding drivers doing wrong things but Jonathan 
> pointed out the main concern being changing those internal structs in 
> the future could benefit from this approach. Whatever the reason, that 
> was the assumption.

I think there is a benefit from a driver being able to do someting like:

struct my_cxl_accelerator_context {
    ...
    struct cxl_dev_state cxlds;
    ...
};

Even if the rule is that direct consumption of 'struct cxl_dev_state'
outside of the cxl core is unwanted.

C does not make this easy, so it is either make the definition of
'struct cxl_dev_state' public to CXL accelerator drivers so that they
know the size, or add an allocation API that takes in the extra size
that accelerator needs to allocate the core CXL context.

Unless and until we run into a real life problem of accelerator drivers
misusing 'struct cxl_dev_state' I think I prefer the explicit approach
of make the data structure embeddable and only require the core to do
the initialization.

> I could add a function for accel drivers doing the allocation as with 
> current v9 code, and then using your changes for having common code.

Let me go look at what you have there, but the design principle of the
CXL core is a library and enabling (but not requiring) users to have a container_of()
relationship between the core context and their local context feels the
most maintainable at this point.

> Also, I completely agree with merging the serial and dvsec 
> initializations through arguments to cxl_dev_state_init, but we need the 
> cxl_set_resource function for accel drivers. The current code for adding 
> resources with memdev is relying on mbox commands, and although we could 
> change that code for supporting accel drivers without an mbox, I would 
> say the function/code added is simple enough for not requiring that 
> effort. Note my goal is for an accel device without an mbox, but we will 
> see devices with one in the future, so I bet for leaving any change 
> there to that moment.

...but the way it was adding those resources was just wrong. This also
collides with some of the changes proposed for DCD partition management.
I needs a rethink in terms of a data structure and API to describe the
DPA address map of a CXL.mem capable device to the core and the core
should not be hard-coding a memory-expander-class device definition to
that layout.

I am imagining something similar to the way that resource range
resources are transmitted to platform device registration.

  parent reply	other threads:[~2025-01-14 22:52 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-16 16:10 [PATCH v8 00/27] cxl: add type2 device basic support alejandro.lucero-palau
2024-12-16 16:10 ` [PATCH v8 01/27] " alejandro.lucero-palau
2024-12-24 16:35   ` Jonathan Cameron
2024-12-27  6:56     ` Alejandro Lucero Palau
2025-01-07 16:35   ` Alison Schofield
2025-01-07 23:42   ` Dan Williams
2025-01-08  1:33     ` Dan Williams
2025-01-08 14:32       ` Alejandro Lucero Palau
2025-01-14 14:35         ` Alejandro Lucero Palau
2025-01-14 16:40           ` Alejandro Lucero Palau
2025-01-14 22:52           ` Dan Williams [this message]
2025-01-15 16:01             ` Alejandro Lucero Palau
2025-01-16  6:16               ` Dan Williams
2025-01-16 10:02                 ` Alejandro Lucero Palau
2025-02-05 20:05             ` Dan Williams
2025-02-06 17:37               ` Alejandro Lucero Palau
2025-02-07  1:57                 ` Dan Williams
2025-01-24 13:38       ` Alejandro Lucero Palau
2025-01-08 14:11     ` Alejandro Lucero Palau
2025-01-14 23:48       ` Dan Williams
2024-12-16 16:10 ` [PATCH v8 02/27] sfc: add cxl support using new CXL API alejandro.lucero-palau
2024-12-24 17:04   ` Jonathan Cameron
2024-12-27  7:00     ` Alejandro Lucero Palau
2025-01-08  1:56   ` Dan Williams
2025-01-08 14:53     ` Alejandro Lucero Palau
2025-01-14 23:59       ` Dan Williams
2024-12-16 16:10 ` [PATCH v8 03/27] cxl: add capabilities field to cxl_dev_state and cxl_port alejandro.lucero-palau
2024-12-24 17:08   ` Jonathan Cameron
2024-12-27  7:07     ` Alejandro Lucero Palau
2025-01-02 12:49       ` Jonathan Cameron
2025-01-03  7:16         ` Alejandro Lucero Palau
2025-01-03 10:47           ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 04/27] cxl/pci: add check for validating capabilities alejandro.lucero-palau
2024-12-24 17:15   ` Jonathan Cameron
2024-12-27  7:47     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 05/27] cxl: move pci generic code alejandro.lucero-palau
2024-12-24 17:19   ` Jonathan Cameron
2024-12-27  7:53     ` Alejandro Lucero Palau
2025-01-08  5:19   ` Dan Williams
2025-01-08 14:39     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 06/27] cxl: add function for type2 cxl regs setup alejandro.lucero-palau
2024-12-24 17:22   ` Jonathan Cameron
2024-12-27  8:04     ` Alejandro Lucero Palau
2024-12-30  9:01       ` Alejandro Lucero Palau
2025-01-06 10:41   ` Dan Carpenter
2025-01-06 15:19     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 07/27] sfc: use cxl api for regs setup and checking alejandro.lucero-palau
2024-12-24 17:23   ` Jonathan Cameron
2024-12-27  8:05     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 08/27] cxl: add functions for resource request/release by a driver alejandro.lucero-palau
2024-12-24 17:25   ` Jonathan Cameron
2024-12-27  8:06     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 09/27] sfc: request cxl ram resource alejandro.lucero-palau
2024-12-24 17:27   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 10/27] resource: harden resource_contains alejandro.lucero-palau
2024-12-24 17:27   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 11/27] cxl: add function for setting media ready by a driver alejandro.lucero-palau
2024-12-24 17:29   ` Jonathan Cameron
2024-12-27  8:08     ` Alejandro Lucero Palau
2025-01-02 12:45       ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 12/27] sfc: set cxl media ready alejandro.lucero-palau
2024-12-16 16:10 ` [PATCH v8 13/27] cxl: prepare memdev creation for type2 alejandro.lucero-palau
2024-12-24 17:32   ` Jonathan Cameron
2024-12-27  8:28     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 14/27] sfc: create type2 cxl memdev alejandro.lucero-palau
2024-12-24 17:33   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 15/27] cxl: define a driver interface for HPA free space enumeration alejandro.lucero-palau
2024-12-24 17:42   ` Jonathan Cameron
2024-12-27 10:05     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 16/27] sfc: obtain root decoder with enough HPA free space alejandro.lucero-palau
2024-12-18 11:17   ` Edward Cree
2024-12-24 17:43   ` Jonathan Cameron
2024-12-25 20:21   ` kernel test robot
2024-12-16 16:10 ` [PATCH v8 17/27] cxl: define a driver interface for DPA allocation alejandro.lucero-palau
2024-12-24 17:53   ` Jonathan Cameron
2024-12-27 10:23     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 18/27] sfc: get endpoint decoder alejandro.lucero-palau
2024-12-17 10:42   ` Simon Horman
2024-12-18  8:22     ` Alejandro Lucero Palau
2025-01-07 11:34       ` Simon Horman
2024-12-16 16:10 ` [PATCH v8 19/27] cxl: make region type based on endpoint type alejandro.lucero-palau
2024-12-24 17:54   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 20/27] cxl/region: factor out interleave ways setup alejandro.lucero-palau
2024-12-24 17:56   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 21/27] cxl/region: factor out interleave granularity setup alejandro.lucero-palau
2024-12-24 17:56   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 22/27] cxl: allow region creation by type2 drivers alejandro.lucero-palau
2024-12-24 18:01   ` Jonathan Cameron
2024-12-27 10:27     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 23/27] cxl: add region flag for precluding a device memory to be used for dax alejandro.lucero-palau
2024-12-24 18:04   ` Jonathan Cameron
2024-12-27  8:46     ` Alejandro Lucero Palau
2024-12-16 16:10 ` [PATCH v8 24/27] sfc: create cxl region alejandro.lucero-palau
2024-12-24 18:05   ` Jonathan Cameron
2024-12-25 23:58   ` kernel test robot
2024-12-16 16:10 ` [PATCH v8 25/27] cxl: add function for obtaining region range alejandro.lucero-palau
2024-12-24 18:07   ` Jonathan Cameron
2024-12-16 16:10 ` [PATCH v8 26/27] sfc: update MCDI protocol headers alejandro.lucero-palau
2024-12-16 16:10 ` [PATCH v8 27/27] sfc: support pio mapping based on cxl alejandro.lucero-palau
2024-12-17 10:47   ` Simon Horman
2024-12-18  8:32     ` Alejandro Lucero Palau
2024-12-30 12:16       ` Alejandro Lucero Palau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6786eab3a124c_20f3294f4@dwillia2-xfh.jf.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=alejandro.lucero-palau@amd.com \
    --cc=alucerop@amd.com \
    --cc=dave.jiang@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=edward.cree@amd.com \
    --cc=kuba@kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=martin.habets@xilinx.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox