netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Alejandro Lucero Palau <alucerop@amd.com>
Cc: <alejandro.lucero-palau@amd.com>, <linux-cxl@vger.kernel.org>,
	<netdev@vger.kernel.org>, <dan.j.williams@intel.com>,
	<edward.cree@amd.com>, <davem@davemloft.net>, <kuba@kernel.org>,
	<pabeni@redhat.com>, <edumazet@google.com>,
	<dave.jiang@intel.com>
Subject: Re: [PATCH v22 11/25] cxl/hdm: Add support for getting region from committed decoder
Date: Fri, 19 Dec 2025 11:02:17 +0000	[thread overview]
Message-ID: <20251219110217.000063b0@huawei.com> (raw)
In-Reply-To: <1e98adcc-feeb-41cb-b1fe-618597cb0be4@amd.com>

On Thu, 18 Dec 2025 15:27:29 +0000
Alejandro Lucero Palau <alucerop@amd.com> wrote:

> On 12/18/25 15:03, Jonathan Cameron wrote:
> > On Thu, 18 Dec 2025 11:52:58 +0000
> > Alejandro Lucero Palau <alucerop@amd.com> wrote:
> >  
> >> Hi Jonathan,
> >>
> >>
> >> On 12/15/25 13:50, Jonathan Cameron wrote:  
> >>> On Fri, 5 Dec 2025 11:52:34 +0000
> >>> <alejandro.lucero-palau@amd.com> wrote:
> >>>     
> >>>> From: Alejandro Lucero <alucerop@amd.com>
> >>>>
> >>>> A Type2 device configured by the BIOS can already have its HDM
> >>>> committed. Add a cxl_get_committed_decoder() function for cheking  
> >>> checking if this is so after memdev creation.
> >>>     
> >>>> so after memdev creation. A CXL region should have been created
> >>>> during memdev initialization, therefore a Type2 driver can ask for
> >>>> such a region for working with the HPA. If the HDM is not committed,
> >>>> a Type2 driver will create the region after obtaining proper HPA
> >>>> and DPA space.
> >>>>
> >>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>  
> >>> Hi Alejandro,
> >>>
> >>> I'm in two minds about this.  In general there are devices that have
> >>> been configured by the BIOS because they are already in use. I'm not sure
> >>> the driver you are working with here is necessarily set up to survive
> >>> that sort of live setup without interrupting data flows.  
> >>
> >> This is not mainly about my driver/device but something PJ and Dan agree
> >> on support along this type2 patchset.
> >>
> >> You can see the v21 discussions, but basically PJ can not have his
> >> driver using the committed decoders from BIOS. So this change addresses
> >> that situation which my driver/device can also benefit from as current
> >> BIOS available is committing decoders regardless of UEFI flags like
> >> EFI_RESERVED_TYPE.
> >>
> >>
> >> Neither in my case nor in PJ case the device will be in use before
> >> kernel is executing, although PJ should confirm this.  
> > There was some discussion in that thread of whether the decoders are locked.
> > If they aren't (and if the device is not in use, or some other hard constraint
> > isn't requiring it, in my view they definitely shouldn't be!) I'd at least
> > like to consider the option of a 'cleanup pass' to tear them down and give
> > the driver a clean slate to build on. Kind of similar to what we do in
> > making PCI reeumerate in the kernel if we really don't like what the bios did.  
> 
> 
> I do not mind to support that option, but could we do it as a follow-up?

Sure. I'm wondering a bit on whether it's a global flag similar to the
one for full PCI bus reenumeration or more like the stuff that repairs corners
of PCI enumeration if the kernel doesn't like what it finds.

> 
> 
> > Might not be possible if there is another higher numbered decoder in use
> > though :(
> >  
> >>  
> >>> If it is fair enough to support this, otherwise my inclination is tear
> >>> down whatever the bios did and start again (unless locked - in which
> >>> case go grumble at your BIOS folk). Reasoning being that we then only
> >>> have to handle the equivalent of the hotplug flow in both cases rather
> >>> than having to handle 2.  
> >>
> >> Well, the automatic discovery region used for Type3 can be reused for
> >> Type2 in this scenario, so we do not need to tear down what the BIOS
> >> did. However, the argument is what we should do when the driver exits
> >> which the current functionality added with the patchset being tearing
> >> down the device and CXL bridge decoders. Dan seems to be keen on not
> >> doing this tear down even if the HDMs are not locked.  
> > That's the question that makes this interesting.  What is reasoning for
> > leaving bios stuff around in type 2 cases? I'd definitely like 'a way'
> > to blow it away even if another option keeps it in place.
> > A bios configures for what it can see at boot not necessarily what shows
> > up later.  Similar cases exist in PCI such as resizeable BARs.
> > The OS knows a lot more about the workload than the bios ever does and
> > may choose to reconfigure because of hotplugged devices.  
> 
> 
> The main reason seems to be an assumption from BIOSes that only 
> advertise CFMWS is there exists a CXL.mem enabled ... with the CXL Host 

Just to confirm, do you mean CXL.mem is enabled for the device? I.e.
memory is in use at boot?  If that config bit is set then we have
to leave it alone as we have very little idea what traffic is in flight.
Or just that there is some memory advertised by the device.

> Bridge CFMWS being equal to the total CXL.mem advertises by those 
> devices discovered. This is something I have been talking about in 
> discord and internally because I think that creates problems with 
> hotplugging and future FAM support, or maybe current DCD.

For DCD it shouldn't matter as long as there is space for all the DC
regions.  Whether that is backed by the device shouldn't be something
the bios cares about.  For the others fully agree its a wrong bios
writer assumption that we should try to get them to stop making!

> 
> 
> One case, theoretical but I think quite possible, is a device requiring 
> the CXL.mem not using the full capacity in all modes, likely because 
> that device memory used for other purposes and kept hidden from the 
> host. So the one knowing what to do should be the driver and dependent 
> on the device and likely some other data maybe even configurable from 
> user space.

Yes. This is kind of similar to some of the things that happen with
resizeable BARs in PCI.

> 
> 
> So yes, I agree with you that the kernel should be able to do things far 
> better than the BIOS ...

I'm sure everyone reading this email agrees policy in the OS where possible
not the BIOS :)

Jonathan


  reply	other threads:[~2025-12-19 11:02 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-05 11:52 [PATCH v22 00/25] Type2 device basic support alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 01/25] cxl/mem: Arrange for always-synchronous memdev attach alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 02/25] cxl/port: Arrange for always synchronous endpoint attach alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 03/25] cxl/mem: Introduce a memdev creation ->probe() operation alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 04/25] cxl: Add type2 device basic support alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 05/25] sfc: add cxl support alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 06/25] cxl: Move pci generic code alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 07/25] cxl/sfc: Map cxl component regs alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 08/25] cxl/sfc: Initialize dpa without a mailbox alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 09/25] cxl: Prepare memdev creation for type2 alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 10/25] sfc: create type2 cxl memdev alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 11/25] cxl/hdm: Add support for getting region from committed decoder alejandro.lucero-palau
2025-12-15 13:50   ` Jonathan Cameron
2025-12-18 11:52     ` Alejandro Lucero Palau
2025-12-18 15:03       ` Jonathan Cameron
2025-12-18 15:27         ` Alejandro Lucero Palau
2025-12-19 11:02           ` Jonathan Cameron [this message]
2025-12-05 11:52 ` [PATCH v22 12/25] cxl: Add function for obtaining region range alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 13/25] cxl: Export functions for unwinding cxl by accelerators alejandro.lucero-palau
2025-12-15 13:53   ` Jonathan Cameron
2025-12-18 12:07     ` Alejandro Lucero Palau
2025-12-05 11:52 ` [PATCH v22 14/25] sfc: obtain decoder and region if committed by firmware alejandro.lucero-palau
2025-12-15 13:57   ` Jonathan Cameron
2025-12-18 12:14     ` Alejandro Lucero Palau
2025-12-05 11:52 ` [PATCH v22 15/25] cxl: Define a driver interface for HPA free space enumeration alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 16/25] sfc: get root decoder alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 17/25] cxl: Define a driver interface for DPA allocation alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 18/25] sfc: get endpoint decoder alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 19/25] cxl: Make region type based on endpoint type alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 20/25] cxl/region: Factor out interleave ways setup alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 21/25] cxl/region: Factor out interleave granularity setup alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 22/25] cxl: Allow region creation by type2 drivers alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 23/25] cxl: Avoid dax creation for accelerators alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 24/25] sfc: create cxl region alejandro.lucero-palau
2025-12-05 11:52 ` [PATCH v22 25/25] sfc: support pio mapping based on cxl alejandro.lucero-palau
2025-12-07  7:12 ` [PATCH v22 00/25] Type2 device basic support dan.j.williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251219110217.000063b0@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=alejandro.lucero-palau@amd.com \
    --cc=alucerop@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=edward.cree@amd.com \
    --cc=kuba@kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).