From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "Williams, Dan J" <dan.j.williams@intel.com>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Cc: "Schofield, Alison" <alison.schofield@intel.com>,
"Jiang, Dave" <dave.jiang@intel.com>,
"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
"Weiny, Ira" <ira.weiny@intel.com>
Subject: Re: [ndctl PATCH v2 06/10] cxl: add a 'create-region' command
Date: Thu, 11 Aug 2022 21:53:36 +0000 [thread overview]
Message-ID: <128f99454b44519c6ab356b680206232b5e597ba.camel@intel.com> (raw)
In-Reply-To: <62f559b843501_3ce6829439@dwillia2-xfh.jf.intel.com.notmuch>
On Thu, 2022-08-11 at 12:34 -0700, Dan Williams wrote:
> Vishal Verma wrote:
> > Add a 'create-region' command to cxl-cli that walks the platform's CXL
> > hierarchy to find an appropriate root decoder based on any options
> > provided, and uses libcxl APIs to create a 'region' that is comprehended
> > by libnvdimm and ndctl.
> >
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> > ---
> > Documentation/cxl/bus-option.txt | 5 +
> > Documentation/cxl/cxl-create-region.txt | 114 +++++
> > Documentation/cxl/region-description.txt | 7 +
> > cxl/builtin.h | 1 +
> > cxl/filter.h | 4 +-
> > cxl/cxl.c | 1 +
> > cxl/region.c | 594 +++++++++++++++++++++++
> > Documentation/cxl/meson.build | 2 +
> > cxl/meson.build | 1 +
> > 9 files changed, 728 insertions(+), 1 deletion(-)
> > create mode 100644 Documentation/cxl/bus-option.txt
> > create mode 100644 Documentation/cxl/cxl-create-region.txt
> > create mode 100644 Documentation/cxl/region-description.txt
> > create mode 100644 cxl/region.c
> >
> > diff --git a/Documentation/cxl/bus-option.txt b/Documentation/cxl/bus-option.txt
> > new file mode 100644
> > index 0000000..02e2f08
> > --- /dev/null
> > +++ b/Documentation/cxl/bus-option.txt
> > @@ -0,0 +1,5 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +-b::
> > +--bus=::
> > + Restrict the operation to the specified bus.
> > diff --git a/Documentation/cxl/cxl-create-region.txt b/Documentation/cxl/cxl-create-region.txt
> > new file mode 100644
> > index 0000000..15dc742
> > --- /dev/null
> > +++ b/Documentation/cxl/cxl-create-region.txt
> > @@ -0,0 +1,114 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +cxl-create-region(1)
> > +====================
> > +
> > +NAME
> > +----
> > +cxl-create-region - Assemble a CXL region by setting up attributes of its
> > +constituent CXL memdevs.
> > +
> > +SYNOPSIS
> > +--------
> > +[verse]
> > +'cxl create-region [<options>]'
> > +
> > +include::region-description.txt[]
> > +
> > +For create-region, a size can optionally be specified, but if not, the maximum
> > +possible size for each memdev will be used up to the available decode capacity
> > +in the system for the given memory type. For persistent regions a UUID can
> > +optionally be specified, but if not, one will be generated.
> > +
> > +If the region-creation operation is successful, a region object will be
> > +emitted on stdout in JSON format (see examples). If the specified arguments
> > +cannot be satisfied with a legal configuration, then an appropriate error will
> > +be emitted on stderr.
> > +
> > +EXAMPLE
> > +-------
> > +----
> > +# cxl create-region -m -d decoder0.1 -w 2 -g 1024 mem0 mem1
> > +{
> > + "region":"region0",
> > + "resource":"0xc90000000",
> > + "size":"512.00 MiB (536.87 MB)",
> > + "interleave_ways":2,
> > + "interleave_granularity":1024,
> > + "mappings":[
> > + {
> > + "position":1,
> > + "decoder":"decoder4.0"
> > + },
> > + {
> > + "position":0,
> > + "decoder":"decoder3.0"
> > + }
> > + ]
> > +}
> > +created 1 region
> > +----
> > +
> > +OPTIONS
> > +-------
> > +<target(s)>::
> > +The CXL targets that should be used to form the region. This is optional,
> > +as they can be chosen automatically based on other options chosen. The number of
> > +'target' arguments must match the '--ways' option (if provided). The
> > +targets may be memdevs, or endpoints. The options below control what type of
> > +targets are being used.
> > +
> > +include::bus-option.txt[]
> > +
> > +-m::
> > +--memdevs::
> > + Indicate that the non-option arguments for 'target(s)' refer to memdev
> > + names.
>
> Are they names or filter parameters ala 'cxl list -m'? I.e. do you
> foresee being able to do something like:
Hm, in the current implementation they really are just names - I just
use the remaining argv[] as the targets array, but..
>
> "cxl create-region -p $port -m all"
>
> ...to just select all the memdevs that are descendants of $port in the
> future? More of a curiosity about future possibilities then a request
> for change.
This sounds like a useful thing to do - perhaps with the next bit of
porcelain additions.
>
> > +
> > +-e::
> > +--ep-decoders::
> > + Indicate that the non-option arguments for 'target(s)' refer to endpoint
> > + decoder names.
>
> I wonder if this should have a note about it being for test-only
> purposes? Given the strict CXL decoder allocation rules I wonder if
> anyone can use this in practice? There might be some synergy with 'cxl
> reserve-dpa' where this option could be used to say "do not allocate new
> decoders, and do not reserve more DPA, just try to use the DPA that was
> already reserved in the following decoders".
>
> We might even delete this option for now unless I am missing the marquee
> use case for it? Because unless someone knows what they are doing they
> are almost always going to be wrong.
Yeah I agree - it is definitely not straightforward to use, and I don't
see a practical use case. I think I only had it because the ABI wanted
decoders, and very early implementations had me explicitly asking for
/everything/.
I'm okay adding a note that this shouldn't normally be used, or
removing it entirely.
>
> > +
> > +-s::
> > +--size=::
> > + Specify the total size for the new region. This is optional, and by
> > + default, the maximum possible size will be used.
>
> How about add:
>
> "The maximum possible size is gated by both the contiguous free HPA
> space remaining in the root decoder, and the available DPA space in the
> component memdevs."
Yep.
>
> > +
> > +-t::
> > +--type=::
> > + Specify the region type - 'pmem' or 'ram'. Defaults to 'pmem'.
> > +
> > +-U::
> > +--uuid=::
> > + Specify a UUID for the new region. This shouldn't usually need to be
> > + specified, as one will be generated by default.
> > +
> > +-w::
> > +--ways=::
> > + The number of interleave ways for the new region's interleave. This
> > + should be equal to the number of memdevs specified in --memdevs, if
> > + --memdevs is being supplied. If --memdevs is not specified, an
> > + appropriate number of memdevs will be chosen based on the number of
> > + ways specified.
>
> The reverse is also true, right? That if -w is not specified then the
> number of ways is determined by the number of targets specified, or by
> other default target searches. I guess notes about those enhanced
> default modes can wait until more 'create-region' porcelain arrives.
Hm actually in the current state, /only/ the reverse is true, so this
description was certainly a bit forward looking. I'll fix to say what
it actually does today.
>
> > +
> > +-g::
> > +--granularity=::
> > + The interleave granularity for the new region. Must match the selected
> > + root decoder's (if provided) granularity.
>
> This just has me thinking that the kernel needs to up-level its code
> comments and changelogs on the "why" for this restriction to somewhere
> this man page can reference.
>
> However second sentence should be:
>
> "If the root decoder is interleaved across more than one host-bridge
> then this value must match that granularity. Otherwise, for
> non-interleaved decode windows, any granularity can be
> specified as long as all devices support that setting."
>
> As I type that it raises 2 questions:
>
> 1/ If someone does "cxl create-region -g 1024" with no other arguments,
> will it fallback to a decoder that can support that setting if its first
> choice can not?
Well there's no automatic root decoder selection at all in this series,
but for future porcelain, I'd imagine it should try to match exactly
any args that were specified, and fail if /all/ of those can't be
matched.
e.g.:
decoder1 - 2-way - IG:256, and
decoder2 - 1-way - IG:1024
and we get 'cxl create-region -g 1024', we should pick decoder 2,
create a x1 interleave under it. Does that make sense?
>
> 2/ Per Dave's recent series [1], cxl-cli is missing any consideration
> that a given endpoint may not have the support for the interleave
> address bits that are necessary for a given 'create-region' request.
> Does not need to be solved now, but should be queued up next.
>
> 2a/ Same for interleave-ways, there are endpoint capabilities that need
> to be accounted.
Yep agreed.
>
> [1]: https://lore.kernel.org/linux-cxl/165999244272.493131.1975513183227389633.stgit@djiang5-desk4.jf.intel.com/
>
[..]
> >
> > +
> > +
> > +static int parse_create_options(int argc, const char **argv,
> > + struct parsed_params *p)
> > +{
> > + int i;
> > +
> > + if (!param.root_decoder) {
> > + log_err(&rl, "no root decoder specified\n");
> > + return -EINVAL;
> > + }
> > +
> > + if (param.type) {
> > + if (strcmp(param.type, "ram") == 0)
> > + p->mode = CXL_DECODER_MODE_RAM;
> > + else if (strcmp(param.type, "volatile") == 0)
> > + p->mode = CXL_DECODER_MODE_RAM;
> > + else if (strcmp(param.type, "pmem") == 0)
> > + p->mode = CXL_DECODER_MODE_PMEM;
> > + else {
> > + log_err(&rl, "unsupported type: %s\n", param.type);
> > + return -EINVAL;
>
> This probably wants a common helper that can be shared with
> __reserve_dpa() and add_cxl_decoder() that do the same conversion. I.e.
> cxl_decoder_mode_name() needs a buddy. That can be a follow on change.
> ISTR I might have already noted that, so forgive the duplicate comment.
Ah I do remember this, I probably just missed this cleanup - I'll add
it for v3.
>
> > + }
> > + } else
> > + p->mode = CXL_DECODER_MODE_PMEM;
> > +
> > + if (param.size) {
> > + p->size = parse_size64(param.size);
> > + if (p->size == ULLONG_MAX) {
> > + log_err(&rl, "Invalid size: %s\n", param.size);
> > + return -EINVAL;
> > + }
> > + }
> > +
> > + if (param.ways) {
> > + unsigned long ways = strtoul(param.ways, NULL, 0);
> > +
> > + if (ways == ULONG_MAX || (int)ways <= 0) {
> > + log_err(&rl, "Invalid interleave ways: %s\n",
> > + param.ways);
> > + return -EINVAL;
> > + }
> > + p->ways = ways;
> > + } else if (argc) {
> > + p->ways = argc;
>
> This is where:
>
> cxl create-region -p $port -m all
>
> ...would not work, but maybe requiring explicit targets is ok. There's
> so many potential moving pieces in a CXL topology maybe we do not want
> to go there with this flexibility.
Hm yeah - true. It was certainly convenient (ab)using argc and argv for
this, but if we did add the flexibility of specifying a filter, or even
multiple filters in the future, the targets array would need proper
reconstruction anyway, and counting the objects in it would come
alongwith it.
>
> >
[..]
> > +
> > +static int cxl_region_determine_granularity(struct cxl_region *region,
> > + struct parsed_params *p)
> > +{
> > + const char *devname = cxl_region_get_devname(region);
> > + unsigned int granularity, ways;
> > +
> > + /* Default granularity will be the root decoder's granularity */
> > + granularity = cxl_decoder_get_interleave_granularity(p->root_decoder);
> > + if (granularity == 0 || granularity == UINT_MAX) {
> > + log_err(&rl, "%s: unable to determine root decoder granularity\n",
> > + devname);
> > + return -ENXIO;
> > + }
> > +
> > + /* If no user-supplied granularity, just use the default */
> > + if (!p->granularity)
> > + return granularity;
> > +
> > + ways = cxl_decoder_get_interleave_ways(p->root_decoder);
> > + if (ways == 0 || ways == UINT_MAX) {
> > + log_err(&rl, "%s: unable to determine root decoder ways\n",
> > + devname);
> > + return -ENXIO;
> > + }
> > +
> > + /* For ways == 1, any user-supplied granularity is fine */
>
> ...modulo a future patch that checks the device's interleave capability.
>
> Rest of the patch looks good. After fixing up the option descriptions
> per above you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Thanks!
next prev parent reply other threads:[~2022-08-11 21:53 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-10 23:09 [ndctl PATCH v2 00/10] cxl: add region management Vishal Verma
2022-08-10 23:09 ` [ndctl PATCH v2 01/10] libcxl: add a depth attribute to cxl_port Vishal Verma
2022-08-10 23:09 ` [ndctl PATCH v2 02/10] cxl/port: Consolidate the debug option in cxl-port man pages Vishal Verma
2022-08-10 23:09 ` [ndctl PATCH v2 03/10] libcxl: Introduce libcxl region and mapping objects Vishal Verma
2022-08-10 23:09 ` [ndctl PATCH v2 04/10] cxl-cli: add region listing support Vishal Verma
2022-08-11 2:24 ` Dan Williams
2022-08-10 23:09 ` [ndctl PATCH v2 05/10] libcxl: add low level APIs for region creation Vishal Verma
2022-08-11 3:05 ` Dan Williams
2022-08-11 4:08 ` Verma, Vishal L
2022-08-11 18:42 ` Dan Williams
2022-08-11 21:34 ` Verma, Vishal L
2022-08-10 23:09 ` [ndctl PATCH v2 06/10] cxl: add a 'create-region' command Vishal Verma
2022-08-11 19:34 ` Dan Williams
2022-08-11 21:53 ` Verma, Vishal L [this message]
2022-08-11 23:02 ` Dan Williams
2022-08-10 23:09 ` [ndctl PATCH v2 07/10] cxl: add commands to {enable,disable,destroy}-region Vishal Verma
2022-08-10 23:09 ` [ndctl PATCH v2 08/10] cxl/list: make memdevs and regions the default listing Vishal Verma
2022-08-10 23:09 ` [ndctl PATCH v2 09/10] test: add a cxl-create-region test Vishal Verma
2022-08-11 19:47 ` Dan Williams
2022-08-10 23:09 ` [ndctl PATCH v2 10/10] cxl/decoder: add a max_available_extent attribute Vishal Verma
2022-08-11 20:22 ` Dan Williams
2022-08-11 21:57 ` Verma, Vishal L
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=128f99454b44519c6ab356b680206232b5e597ba.camel@intel.com \
--to=vishal.l.verma@intel.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=nvdimm@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox