Linux CXL
 help / color / mirror / Atom feed
From: Gregory Price <gregory.price@memverge.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org, Dave Jiang <dave.jiang@intel.com>
Subject: Re: [PATCH] cxl/hdm: Extend DVSEC range register emulation for region enumeration
Date: Wed, 29 Mar 2023 12:04:21 -0400	[thread overview]
Message-ID: <ZCRhhUDcmypVKu0X@memverge.com> (raw)
In-Reply-To: <168012575521.221280.14177293493678527326.stgit@dwillia2-xfh.jf.intel.com>

On Wed, Mar 29, 2023 at 02:35:55PM -0700, Dan Williams wrote:
> One motivation for mapping range registers to decoder objects is
> to use those settings for region autodiscovery.
> 
> The need to map a region for devices programmed to use range registers
> is especially urgent now that the kernel no longer routes "Soft
> Reserved" ranges in the memory map to device-dax by default. The CXL
> memory range loses all access mechanisms.
> 
> Complete the implementation by filling out ways and granularity, marking
> the DPA reservation, and setting the endpoint-decoder state to signal
> autodiscovery.
> 
> Fixes: 09d09e04d2fc ("cxl/dax: Create dax devices for CXL RAM regions")
> Tested-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/hdm.c |   30 ++++++++++++++++++++++++------
>  1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 9884b6d4d930..5339c0719177 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -738,20 +738,26 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld)
>  	return 0;
>  }
>  
> -static int cxl_setup_hdm_decoder_from_dvsec(struct cxl_port *port,
> -					    struct cxl_decoder *cxld, int which,
> -					    struct cxl_endpoint_dvsec_info *info)
> +static int cxl_setup_hdm_decoder_from_dvsec(
> +	struct cxl_port *port, struct cxl_decoder *cxld, u64 *dpa_base,
> +	int which, struct cxl_endpoint_dvsec_info *info)
>  {
> +	struct cxl_endpoint_decoder *cxled;
> +	struct range *range;
> +	int rc;
> +
>  	if (!is_cxl_endpoint(port))
>  		return -EOPNOTSUPP;
>  
> -	if (!range_len(&info->dvsec_range[which]))
> +	cxled = to_cxl_endpoint_decoder(&cxld->dev);
> +	range = &info->dvsec_range[which];
> +	if (!range_len(range))
>  		return -ENOENT;
>  
>  	cxld->target_type = CXL_DECODER_EXPANDER;
>  	cxld->commit = NULL;
>  	cxld->reset = NULL;
> -	cxld->hpa_range = info->dvsec_range[which];
> +	cxld->hpa_range = *range;
>  
>  	/*
>  	 * Set the emulated decoder as locked pending additional support to
> @@ -760,6 +766,17 @@ static int cxl_setup_hdm_decoder_from_dvsec(struct cxl_port *port,
>  	cxld->flags |= CXL_DECODER_F_ENABLE | CXL_DECODER_F_LOCK;
>  	port->commit_end = cxld->id;
>  
> +	rc = devm_cxl_dpa_reserve(cxled, *dpa_base, range_len(range), 0);

I have an example host where *dpa_base ends up being 0x0 here, and as a
result later down the line the region fails to register with this:

[   21.974117] cxl_pci 0000:3f:00.0: mem0:decoder1.0 no CXL window for range 0x0:0x1fffffffff

this traces back to here:

int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
                                struct cxl_endpoint_dvsec_info *info)
{
        u64 dpa_base = 0;    <- dpa_base is set to 0 and never updated
...
        for (i = 0; i < cxlhdm->decoder_count; i++) {
...
                rc = init_hdm_decoder(port, cxld, target_map, hdm, i,
                                      &dpa_base, info);
}


static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
                            int *target_map, void __iomem *hdm, int which,
                            u64 *dpa_base, struct cxl_endpoint_dvsec_info *info)
{
... passed directly into here
        if (should_emulate_decoders(info))
                return cxl_setup_hdm_decoder_from_dvsec(port, cxld, dpa_base,
                                                        which, info);
...
}

static int cxl_setup_hdm_decoder_from_dvsec(
        struct cxl_port *port, struct cxl_decoder *cxld, u64 *dpa_base,
        int which, struct cxl_endpoint_dvsec_info *info)
{
        rc = devm_cxl_dpa_reserve(cxled, *dpa_base, range_len(range), 0);
        if (rc) {
                dev_err(&port->dev,
                        "decoder%d.%d: Failed to reserve DPA range %#llx - %#llx\n (%d)",
                        port->id, cxld->id, *dpa_base,
                        *dpa_base + range_len(range) - 1, rc);
                return rc;
        }
        *dpa_base += range_len(range);
        cxled->state = CXL_DECODER_STATE_AUTO;

        return 0;
}


(full cxl log with bonus prints i added)
[   21.607034] cxl_pci 0000:3f:00.0: No component registers (-19)
[   21.641831] cxl_pci 0000:3f:00.0: DOE: [d80] failed to cache protocols : -5
[   21.642866] cxl_pci 0000:3f:00.0: Failed to create MB object for MB @ d80
[   21.643686] cxl_pci 0000:3f:00.0: Failed to request region 0x0000000000001fff-0x000000000010201e
[   21.644371] cxl add_dpa_res: (0, 1fffffffff)
[   21.645540] cxl add_dpa_res: (2000000000, 1fffffffff)
[   21.965692] cxl hdm dvsec range: (0, 1fffffffff)
[   21.967045] cxl emulating decoders: dpa_base(0)   <-  *dpa_base
[   21.967778] cxl_add_to_region: searching for root decoder with address range(0, 1fffffffff)
[   21.972824] cxl match decoder: found root decoder, r1(1050000000, 304fffffff) r2(0, 1fffffffff)
[   21.974117] cxl_pci 0000:3f:00.0: mem0:decoder1.0 no CXL window for range 0x0:0x1fffffffff
[   21.974905] cxl discover_region: failed to add to region: 0x0-0x1fffffffff



Ultimately having trouble deciding if this is something broken with
bios, the setup_hdm_decoder code, or the discover_region code.  I'm not
PCI guru, but we should expect the rdm dvsec range to be in the range of
the root decoder window / CFMW:  r1(1050000000, 304fffffff)


confirming this is the issue, i *forced* the dvsec range register to read
out base+0x1050000000


diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 7328a2552411..6fc6df0f7b5a 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -340,12 +344,16 @@ int cxl_dvsec_rr_decode(struct device *dev, int d,
                if (rc)
                        return rc;

                base |= temp & CXL_DVSEC_MEM_BASE_LOW_MASK;

                info->dvsec_range[i] = (struct range) {
-                       .start = base,
-                       .end = base + size - 1
+                       .start = 0x1050000000 + base,
+                       .end = 0x1050000000 + base + size - 1
                };


And this resulted in everything working "as one would expect"

[user@host0 ~]# ls /sys/bus/cxl/devices/
dax_region0  decoder0.0  decoder1.0  endpoint1  mem0  region0  root0

[user@host0 ~]# numactl --hardware
available: 2 nodes (0-1)
node 1 cpus:
node 1 size: 0 MB
node 1 free: 0 MB
node distances:
node   0   1
  0:  10  50
  1:  255  10

[user@host0 ~]# echo online_movable > /sys/bus/node/devices/node1/memory33/state
[user@host0 ~]# numactl --hardware
available: 2 nodes (0-1)
node 1 size: 2048 MB
node 1 free: 2048 MB
node distances:
node   0   1
  0:  10  50
  1:  255  10


Basically the question is:

Is the DVSEC Range Register expected to be programmed by bios, and are
not being programmed correctly?

Or is there something else missing here to correct for the CMFW base?



If it's the former, then this patch set is gtg and i'm happy to add my
Tested-by tag.  If it's the latter, can we hotfix it before release?



> +	if (rc) {
> +		dev_err(&port->dev,
> +			"decoder%d.%d: Failed to reserve DPA range %#llx - %#llx\n (%d)",
> +			port->id, cxld->id, *dpa_base,
> +			*dpa_base + range_len(range) - 1, rc);
> +		return rc;
> +	}
> +	*dpa_base += range_len(range);
> +	cxled->state = CXL_DECODER_STATE_AUTO;
> +
>  	return 0;
>  }
>  
> @@ -779,7 +796,8 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
>  	} target_list;
>  
>  	if (should_emulate_decoders(info))
> -		return cxl_setup_hdm_decoder_from_dvsec(port, cxld, which, info);
> +		return cxl_setup_hdm_decoder_from_dvsec(port, cxld, dpa_base,
> +							which, info);
>  
>  	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
>  	base = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which));
> 

  reply	other threads:[~2023-03-30  3:25 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-29 21:35 [PATCH] cxl/hdm: Extend DVSEC range register emulation for region enumeration Dan Williams
2023-03-29 16:04 ` Gregory Price [this message]
2023-03-30  4:21   ` Dan Williams
2023-03-29 17:20     ` Gregory Price
2023-05-16  6:43       ` Gregory Price
2023-03-29 17:21     ` Gregory Price
2023-03-30  6:33       ` Dan Williams
2023-03-30  4:27         ` Gregory Price
2023-04-16  4:05           ` Gregory Price
2023-04-18  5:51             ` Dan Williams
2023-04-20  0:50               ` Gregory Price
2023-03-30  0:06 ` Dave Jiang
2023-03-30 17:00 ` Jonathan Cameron
2023-03-30 18:24   ` Dan Williams
2023-04-03 23:06 ` [PATCH v2] " Dan Williams
2023-04-03 23:44   ` Dave Jiang
2023-04-04  0:08     ` Dan Williams
2023-04-04  0:16       ` Dave Jiang
2023-04-04  9:19   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZCRhhUDcmypVKu0X@memverge.com \
    --to=gregory.price@memverge.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox