From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, <linux-cxl@vger.kernel.org>,
<ira.weiny@intel.com>, <vishal.l.verma@intel.com>,
<alison.schofield@intel.com>, <dave@stgolabs.net>,
<jgg@nvidia.com>, <shiju.jose@huawei.com>
Subject: Re: [PATCH v1 14/19] cxl: Add support for fwctl RPC command to enable CXL feature commands
Date: Tue, 4 Feb 2025 10:04:08 +0000 [thread overview]
Message-ID: <20250204100408.000048fd@huawei.com> (raw)
In-Reply-To: <67a170ca464e3_2d2c294d@dwillia2-xfh.jf.intel.com.notmuch>
On Mon, 3 Feb 2025 17:43:38 -0800
Dan Williams <dan.j.williams@intel.com> wrote:
> Jonathan Cameron wrote:
> [..]
> > > > > I think it's fine to let userspace see that exclusive features are
> > > > > present, just need to return EBUSY if userspace actually tries to use
> > > > > them.
> > > >
> > > > To me, a poke it and see interface is really ugly.
> > >
> > > That smells more like a matter of documentation. "Doctor it hurts when I
> > > try to use the documented kernel-exclusive commands?"
> >
> > To me this is a nasty interface design.
> > If I'm writing a tool to enumerate what is exposed etc then it will
> > have to poke every get command just to list if an interface is available.
> > Hopefully none of them have side effects!
>
> The kernel exclusive list is documented. How did this tool get written
> in such a way to understand how to get data out of the interface but
> without reading the documentation on how to consume that data?
Today's kernel exclusive list is documented. Kind of tricky to know what
is on that list in a few years time.
>
> > > > 2. Query type interface. So a way to actually ask if a given feature is
> > > > usable.
> > >
> > > Not sure we really need a programmatic way to read the documentation.
> > >
> > > The CXL_MEM_COMMAND_FLAG_EXCLUSIVE flag is for cases where the
> > > exclusivity is transient. For these features the exclusivity is
> > > permanent, and I hope we never need to cross that
> > > transient-exclusivity-bridge for Features.
> >
> > It's permanent today, but I can definitely see that not always being
> > the case - we may well have future kernel does things in X fashion but
> > for legacy support disable that CONFIG option. Not nice but definitely
> > plausible.
>
> Then we cross that bridge and build some new ABI to communicate
> transient exclusivity, and that new state of the world will be
> documented as to how to discover that new capability.
>
> Something like a Linux specific feature that returns a list of transient
> and permanently exclusive Features.
>
> In the meantime no need to hide useful information from userspace.
Ok. Seems like a more complex long term solution, but meh I don't care
that much.
>
> > > > 3. What we have here. To me the simplest solution is hide what we can't
> > > > be used.
> > >
> > > It is inconsistent that we do not do this for the other kernel exclusive
> > > commands in userspace retreived Command Effects Log. The ABI here is raw
> > > Get Supported Features payload.
> >
> > If they were exposed via similar paths I'd agree consistency matters
> > but I 'hope' no one is going to have a tool that mixes fwctl and the
> > legacy path. In my head we add all the useful commands to fwctl
> > and that legacy path ends up effectively deprecated.
> >
> > Anyhow, I don't feel that strongly about this, it's just a case
> > of doesn't smell of roses to me.
> >
> > >
> > > > > > + /* These effects supported for all scope */
> > > > > > + if ((effects & CXL_CMD_CONFIG_CHANGE_COLD_RESET ||
> > > > > > + (effects & CXL_CMD_EFFECTS_EXTEND &&
> > > > > > + (effects & CXL_CMD_CONFIG_CHANGE_CONV_RESET ||
> > > > > > + effects & CXL_CMD_CONFIG_CHANGE_CXL_RESET))) &&
> > > > > > + scope >= FWCTL_RPC_DEBUG_WRITE)
> > > > > > + return true;
> > > > >
> > > > > Looks good for the known bits, but this needs to return false for the
> > > > > currently reserved bits because the driver can not assume a security
> > > > > model for future effects. If a future spec adds
> > > > > FWCTL_RPC_DEBUG_WRITE-safe effects, a new kernel is needed to allow
> > > > > those Feature commands through.
> > > > >
> > > > > Sidenote: I wonder why the spec wasted one of its bits on an extend bit,
> > > > > but here we are. The 'extend' concept is typically something like
> > > > > "bit15: go look at this other field in this payload as this 16-bit field
> > > > > was exhausted", not "bit9: the bits above this originally defined 16 bit
> > > > > field now has more bits", oh well.
> > > >
> > > > It's odd but corner case of going from 'unknown' state for the remaining
> > > > pair of bits to 0 means this and 1 means this.
> > >
> > > I don't understand. 0 means no effect to worry about whether it is
> > > defined or not.
> > >
> > > > Naming though doesn't match the spec that calls it CEL[11:10] valid.
> > > > Would be good to name it closer to that as we may well have something
> > > > in bits 12 and 15 in future and it doesn't refer to them.
> > >
> > > Hopefully we can head off another "valid2" mistake, and I don't think
> > > Linux needs to define anything for this bit. That bit's definition is:
> > >
> > > "Bit[9]: 1 is recommended, 0 is permitted (CEL[11:10] Valid)"
> > >
> > > ...which translates to "useless". If 11 or 10 are set, I don't care what
> > > value 9 has.
> > >
> > > If 12:15 are set, I don't care if there is a future valid2
> > > bit gating whether or not to use them. Valid bits are for cases that go
> > > outside of what Reserved 0 compatibility rules can convey, and I think
> > > Reserved 0 compatiblity fully covers us in this case.
> >
> > Seems the spec authors disagreed. (obviously I can't comment on that
> > discussion).
> >
> > Using just what anyone can see (if they have the spec)
> > It was a clear spec hole and there wasn't an obvious default for 0 to
> > mean so it was a 'read your device docs and act appropriately' case
> > before this stuff was added.
>
> I see v2 is still trying to pretend this bit matters.
>
> Are you saying that because the unused bits were marked 0 instead of
> "Reserved" that software needs to play this game of checking an extend
> bit?
>
> What breaks if software treats those bits as Reserved0? What breaks if
> software ignores bit9?
More generally we can't assume it doesn't happen if those bits are 0.
In this particular case perhaps it doesn't matter.
Hmm. The aim is to decide if the permissions DEBUG_WRITE is enough.
We have already verified it doesn't happen immediately. So now
the question is does it result in a change conventional reset.
Without bit 9 we have no idea either way. So question is do
we assume it does, or assume it doesn't?
+ /* These effects supported for all scope */
+ if ((effects & CXL_CMD_CONFIG_CHANGE_COLD_RESET ||
+ (effects & CXL_CMD_EFFECTS_EXTEND &&
+ (effects & CXL_CMD_CONFIG_CHANGE_CONV_RESET ||
+ effects & CXL_CMD_CONFIG_CHANGE_CXL_RESET))) &&
+ scope >= FWCTL_RPC_DEBUG_WRITE)
+ return true;
+
+ return false;
Right now we return false if it was conventional reset but
we don't know that because the hardware doesn't say either way.
I'm stuck on what the right thing to do is. My assumption
is that cold reset will always apply if conv or cxl reset
do. So maybe who cares and we shouldn't check these two
at all?
Fun corner: there must be something that causes an effect
to happen (or in what sense is this a 'set') and if we have
excluded everything else maybe we should assume that
it must be one of these two resets?
>
> > There may be corner cases where the right answer if we know the
> > feature is not persisted over a reset but instead panic or take
> > some heavy weight action. Same can be true the other way around
> > in that we may have to do something heavy to manually reset something
> > we don't want to persist over reset. Hopefully not but we'll see.
>
> ...but how is that relevant to the FWCTL use case which just wants to
> know if the operation is going to have an immediate effect that changes
> a parameter behind the kernel's back?
In that case, should we just let everything through and not check for
cold reset either?
Jonathan
>
next prev parent reply other threads:[~2025-02-04 10:04 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-22 23:50 [PATCH v1 0/19] cxl: Add CXL feature commands support via fwctl Dave Jiang
2025-01-22 23:50 ` [PATCH v1 01/19] cxl: Refactor user ioctl command path from mds to mailbox Dave Jiang
2025-01-22 23:50 ` [PATCH v1 02/19] cxl: Add skeletal features driver Dave Jiang
2025-01-23 3:59 ` Dan Williams
2025-01-23 15:49 ` Dave Jiang
2025-01-23 19:57 ` Dan Williams
2025-01-23 17:24 ` Jonathan Cameron
2025-01-22 23:50 ` [PATCH v1 03/19] cxl: Enumerate feature commands Dave Jiang
2025-01-23 17:33 ` Jonathan Cameron
2025-01-23 23:55 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 04/19] cxl: Add Get Supported Features command for kernel usage Dave Jiang
2025-01-23 17:43 ` Jonathan Cameron
2025-01-24 0:30 ` Dan Williams
2025-01-24 15:01 ` Jason Gunthorpe
2025-01-27 11:10 ` Jonathan Cameron
2025-01-28 0:54 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 05/19] cxl: Add features driver attribute to emit number of features supported Dave Jiang
2025-01-23 17:44 ` Jonathan Cameron
2025-01-24 0:35 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 06/19] cxl/test: Add Get Supported Features mailbox command support Dave Jiang
2025-01-23 17:47 ` Jonathan Cameron
2025-01-24 0:42 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 07/19] cxl/mbox: Add GET_FEATURE mailbox command Dave Jiang
2025-01-23 17:50 ` Jonathan Cameron
2025-01-24 22:58 ` Dan Williams
2025-01-29 0:14 ` Dave Jiang
2025-01-22 23:50 ` [PATCH v1 08/19] cxl/mbox: Add SET_FEATURE " Dave Jiang
2025-01-23 17:52 ` Jonathan Cameron
2025-01-24 23:01 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 09/19] cxl: Setup exclusive CXL features that are reserved for the kernel Dave Jiang
2025-01-23 17:59 ` Jonathan Cameron
2025-01-24 23:05 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 10/19] cxl: Add FWCTL support to the CXL features driver Dave Jiang
2025-01-23 18:04 ` Jonathan Cameron
2025-01-23 18:53 ` Jason Gunthorpe
2025-01-24 23:14 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 11/19] cxl: Add support for get driver information Dave Jiang
2025-01-23 18:09 ` Jonathan Cameron
2025-01-25 1:26 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 12/19] cxl: Move cxl_mem.h under uapi to cxl exclusive directory Dave Jiang
2025-01-23 18:10 ` Jonathan Cameron
2025-01-25 1:29 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 13/19] cxl: Move cxl feature command structs to user header Dave Jiang
2025-01-23 18:12 ` Jonathan Cameron
2025-01-23 18:13 ` Jonathan Cameron
2025-01-25 1:34 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 14/19] cxl: Add support for fwctl RPC command to enable CXL feature commands Dave Jiang
2025-01-23 18:21 ` Jonathan Cameron
2025-01-25 2:08 ` Dan Williams
2025-01-27 10:51 ` Jonathan Cameron
2025-01-28 0:40 ` Dan Williams
2025-01-28 12:01 ` Jonathan Cameron
2025-01-28 15:55 ` Dave Jiang
2025-01-30 13:42 ` Jonathan Cameron
2025-02-04 1:43 ` Dan Williams
2025-02-04 10:04 ` Jonathan Cameron [this message]
2025-02-04 22:26 ` Dan Williams
2025-02-05 17:36 ` Jonathan Cameron
2025-02-05 18:02 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 15/19] cxl: Add support to handle user feature commands for get feature Dave Jiang
2025-01-23 18:25 ` Jonathan Cameron
2025-01-25 2:23 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 16/19] cxl: Add support to handle user feature commands for set feature Dave Jiang
2025-01-23 18:26 ` Jonathan Cameron
2025-01-25 2:29 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 17/19] cxl/test: Add Get Feature support to cxl_test Dave Jiang
2025-01-25 2:33 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 18/19] cxl/test: Add Set " Dave Jiang
2025-01-25 2:36 ` Dan Williams
2025-01-22 23:50 ` [PATCH v1 19/19] fwctl/cxl: Add documentation to FWCTL CXL Dave Jiang
2025-01-25 2:55 ` Dan Williams
2025-01-23 17:03 ` [PATCH v1 0/19] cxl: Add CXL feature commands support via fwctl Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250204100408.000048fd@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=jgg@nvidia.com \
--cc=linux-cxl@vger.kernel.org \
--cc=shiju.jose@huawei.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox