qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: fan <nifan.cxl@gmail.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: fan <nifan.cxl@gmail.com>, Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, linux-cxl@vger.kernel.org,
	gregory.price@memverge.com, ira.weiny@intel.com,
	dan.j.williams@intel.com, a.manzanares@samsung.com,
	dave@stgolabs.net, nmtadam.samsung@gmail.com,
	jim.harris@samsung.com, Jorgen.Hansen@wdc.com,
	wj28.lee@gmail.com, Fan Ni <fan.ni@samsung.com>,
	mst@redhat.com
Subject: Re: [PATCH v7 09/12] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents
Date: Wed, 1 May 2024 15:36:47 -0700	[thread overview]
Message-ID: <ZjLD_wSqrj_z6mM7@debian> (raw)
In-Reply-To: <20240501155812.00002ec3@Huawei.com>

Hi Markus, Michael and Jonathan,

FYI. I have updated this patch based on the feedbacks so far, and posted here:
https://lore.kernel.org/linux-cxl/20240418232902.583744-1-fan.ni@samsung.com/T/#ma25b6657597d39df23341dc43c22a8c49818e5f9

Comments are welcomed and appreciated.

Fan


On Wed, May 01, 2024 at 03:58:12PM +0100, Jonathan Cameron wrote:
> 
> 
> 
> > > >> > +# @hid: host id  
> > > >> 
> > > >> @host-id, unless "HID" is established terminology in CXL DCD land.  
> > > >
> > > > host-id works.  
> > > >> 
> > > >> What is a host ID?  
> > > >
> > > > It is an id identifying the host to which the capacity is being added.  
> > > 
> > > How are these IDs assigned?  
> > 
> > All the arguments passed to the command here are defined in CXL spec. I
> > will add reference to the spec.
> > 
> > Based on the spec, for LD-FAM (Fabric attached memory represented as
> > logical device), host id is the LD-ID of the host interface to which
> > the capacity is being added. LD-ID is a unique number (16-bit) assigned
> > to a host interface.
> 
> Key here is the host doesn't know it.  This ID exists purely for rooting
> to the appropriate host interface either via choosing a port on a
> multihead Single Logical Device (SLD) (so today it's always 0 as we only
> have one head) or if we ever implement a switch capable of handling MLDs
> then the switch will handle routing of host PCIe accesses so it lands
> on the interface defined by this ID (and the event turns up in that event log.
> 
>             Host A         Host B - could in theory be a RP on host A ;)
>               |              |  Doesn't exist (yet, but there are partial.
>              _|______________|_ patches for this on list.
>             | LD 0         LD 1|
>             |                  |
>             |   Multi Head     |
>             |   Single Logical |
>             |  Device (MH-SLD) |
>             |__________________|
> Host view similar to the switch case, but just two direct
> connected devices.
> 
> Or Switch and MLD case - we aren't emulating this yet at all
> 
>      Wiring / real topology                 Host View 
>          
>       Host A     Host B              Host A       Host B
>         |          |                   |            |
>      ___|__________|___               _|_          _|_
>     |   \  SWITCH /    |             |SW0|        | | |
>     |    \       /     |             | | |        | | |
>     |    LD0   LD1     |             | | |        | | |
>     |      \   /       |             | | |        | | |
>     |        |         |             | | |        | | |
>     |________|_________|             |_|_|        |_|_|
>              |                         |            |
>       Traffic tagged with LD           |            |
>              |                         |            |
>      ________|________________     ____|___     ____|___
>     | Multilogical Device MLD |   |        |   |        |
>     |        |                |   | Simple |   | Another|
>     |       / \               |   | CXL    |   | CXL    |
>     |      /   \              |   | Memory |   | Memory |
>     |    Interfaces           |   | Device |   | Device |
>     |   LD0     LD1           |   |        |   |        |
>     |_________________________|   |________|   |________|
> 
> Note the hosts just see separate devices and switches with the fun exception that the
> memory may actually be available to both at the same time.
> 
> Control plane for the switches and MLD see what is actually going on.
> 
> At this stage upshot is we could just default this to zero and add an optional
> parameter to set it later.
> 
> 
> 
> ...
> 
> > > >> > +# @extents: Extents to release
> > > >> > +#
> > > >> > +# Since : 9.1
> > > >> > +##
> > > >> > +{ 'command': 'cxl-release-dynamic-capacity',
> > > >> > +  'data': { 'path': 'str',
> > > >> > +            'hid': 'uint16',
> > > >> > +            'flags': 'uint8',
> > > >> > +            'region-id': 'uint8',
> > > >> > +            'tag': 'str',
> > > >> > +            'extents': [ 'CXLDCExtentRecord' ]
> > > >> > +           }
> > > >> > +}  
> > > >> 
> > > >> During review of v5, you wrote:
> > > >> 
> > > >>     For add command, the host will send a mailbox command to response to
> > > >>     the add request to the device to indicate whether it accepts the add
> > > >>     capacity offer or not.
> > > >>     
> > > >>     For release command, the host send a mailbox command (not always a
> > > >>     response since the host can proactively release capacity if it does
> > > >>     not need it any more) to device to ask device release the capacity.
> > > >> 
> > > >> Can you briefly sketch the protocol?  Peers and messages involved.
> > > >> Possibly as a state diagram.  
> > > >
> > > > Need to think about it. If we can polish the text nicely, maybe the
> > > > sketch is not needed. My concern is that the sketch may
> > > > introduce unwanted complexity as we expose too much details. The two
> > > > commands provide ways to add/release dynamic capacity to/from a host,
> > > > that is all. All the other information, like what the host will do, or
> > > > how the device will react, are consequence of the command, not sure
> > > > whether we want to include here.  
> > > 
> > > The protocol sketch is for me, not necessarily the doc comment.  I'd
> > > like to understand at high level how this stuff works, because only then
> > > can I meaningfully review the docs.  
> > 
> > --------------------------------
> > For add command, saying a user sends a request to FM to ask to add
> > extent A of the device (managed by FM) to host 0.
> > The function cxl-add-dynamic-capacity simulates what FM needs to do.
> 
> This gets a little fiddly as an explanation.  I'd argue this is more or
> less at the level of the FM to device command flow so it's the device
> verifying etc. (you could explain this interface as talking to an FM
> that is talking to the device, but that just feels complicated to me).
> 
> > 1. Verify extent A is valid (behaviour defined by the spec), return
> > error if not; otherwise,
> > 2. Add a record to the device's event log (indicating the intent to
> > add extent A to host 0), update device internal extent tracking status,
> > signal an interrupt to host 0;
> > (The above step 1 & 2 are performed in the QMP interface, following
> > operations are QMP irrelevant, only host and device involved.)
> 
> In this patch.
> 
> > 3. Once the interrupt is received, host 0 fetch the event record from
> > the device's event log through some mailbox command (out of scope
> > of this patch series).
> 
> It's in patch 8.
> 
> > 4. Host 0 decides whether it accepts extent A or not. Whether accept or
> > reject, host needs to send a response (add-response mailbox command) to
> > the device so the device can update its internal extent tracking
> > status accordingly.
> > The device return a value to the host showing whether the response is
> > successful or failed.
> 
> (assuming the host isn't buggy this always succeeds)
> 
> > 5. Based on the mailbox command return value, the host process
> > accordingly.
> 
> Memory now useable by host if it accepted it successfully.
> 
> > 6. The host sends a mailbox command to the device to clear the event
> > record in the device's event log. 
> > 
> > ---------------------------------
> > For release command, saying a user sends a request to FM to ask host 0
> > to release extent A and return it back to the device (managed by FM).
> > 
> > The function cxl-release-dynamic-capacity simulates what FM needs to do.
> > 1. Verify extent A is valid (defined by the spec), return error if not;
> > otherwise,
> > 2. Add a record to the event log (indicating the intent to
> > release extent A from host 0), signal an interrupt to host 0;
> > (The above step 1 & 2 are performed in the QMP interface, following
> > operations are QMP irrelevant, only host and device involved.
> > 3. Once the interrupt is received, host 0 fetch the event record from
> > the device's event log through some mailbox command (out of scope
> > of this patch series).
> > 4. Host 0 decides whether it can release extent A or not. Whether can or
> > cannot release, host needs to send a release (mailbox command) to the device
> > so the device can update its internal extent tracking status accordingly.
> > The device returns a value to host 0 showing whether the release is
> > successful or failed.
> > 5. Based on the returned value, the host process accordingly.
> > 6. The host sends mailbox command to clear the event record in the
> > device's event log. 
> > 
> > For release command, it is more complicated. Based on the release flag
> > passed to FM, FM can behaviour differently. For example, if the
> > forced-removal flag is set, FM can directly get the extent back from a
> > host for other uses without waiting for the host to send command to the
> > device. For the above step 2, their may be not event record to the event
> > log (no supported in this patch series yet).
> I thought we weren't doing force remove yet?  So for that we could
> set default value as normal release until we add that support perhaps.
> 
> > 
> > Also, for the release interface here, it simulates FM initializes the
> > release request.
> > There is another case where the host can proactively release extents it
> > do not need any more back to device. However, this case is out of the
> > scope of this release interface.
> > 
> > Hope the above text helps a little for the context here.
> > Let me know if further clarification is needed.
> 
> Only thing I'd add is that for now (because we don't need it for testing
> the kernel flows) is that this does not provide any way for external
> agents (e.g. our 'fabric manager' to find out what the state is - i.e.
> if the extents have been accepted by the host etc). That stuff is all
> defined by the spec, but not yet in the QMP interface.  At somepoint
> we may want to add that as a state query type interface.
> 
> Jonathan
> 
> p.s. Our emails raced yesterday, so great you put together this explanation
> of the flows before I got to it :)
> 
> > 
> > Thanks,
> > Fan
> > 
> > 
> > 
> > >   
> > > > @Jonathan, Any thoughts on this?  
> > > 
> > > Thanks!
> > >   
> 


  reply	other threads:[~2024-05-01 22:37 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-18 23:10 [PATCH v7 00/12] Enabling DCD emulation support in Qemu nifan.cxl
2024-04-18 23:10 ` [PATCH v7 01/12] hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of identify memory device command nifan.cxl
2024-04-19 16:40   ` Gregory Price
2024-04-18 23:10 ` [PATCH v7 02/12] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support nifan.cxl
2024-04-19 16:44   ` Gregory Price
2024-04-18 23:10 ` [PATCH v7 03/12] include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 memory devices nifan.cxl
2024-04-19 16:45   ` Gregory Price
2024-04-18 23:10 ` [PATCH v7 04/12] hw/mem/cxl_type3: Add support to create DC regions to " nifan.cxl
2024-04-19 16:47   ` Gregory Price
2024-05-14  8:14   ` Zhijian Li (Fujitsu) via
2024-05-16 17:06     ` fan
2024-04-18 23:10 ` [PATCH v7 05/12] hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size instead of mr as argument nifan.cxl
2024-04-19 16:39   ` Gregory Price
2024-04-18 23:10 ` [PATCH v7 06/12] hw/mem/cxl_type3: Add host backend and address space handling for DC regions nifan.cxl
2024-04-19 17:27   ` Gregory Price
2024-04-22 11:55     ` Jonathan Cameron via
2024-04-22 11:52   ` Jonathan Cameron via
2024-05-14  8:28   ` Zhijian Li (Fujitsu) via
2024-05-16 17:07     ` fan
2024-04-18 23:10 ` [PATCH v7 07/12] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support nifan.cxl
2024-04-19 16:52   ` Gregory Price
2024-04-18 23:10 ` [PATCH v7 08/12] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response nifan.cxl
2024-04-19 18:12   ` Gregory Price
2024-04-18 23:11 ` [PATCH v7 09/12] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents nifan.cxl
2024-04-19 18:13   ` Gregory Price
2024-04-22 12:01   ` Jonathan Cameron via
2024-04-26  9:12   ` Markus Armbruster
2024-04-26 17:31     ` fan
2024-04-29  7:58       ` Markus Armbruster
2024-04-30 17:17         ` fan
2024-05-01 14:58           ` Jonathan Cameron via
2024-05-01 22:36             ` fan [this message]
2024-06-04  9:18             ` Markus Armbruster
2024-06-04 11:54               ` Jonathan Cameron via
2024-06-04 12:13                 ` Jonathan Cameron via
2024-06-04 12:28                 ` Markus Armbruster
2024-04-30 17:21         ` Jonathan Cameron via
2024-05-01 22:29         ` fan
2024-05-20 16:50           ` Jonathan Cameron via
2024-05-20 17:55             ` fan
2024-05-21 23:32             ` fan
2024-05-23 15:31               ` Jonathan Cameron via
2024-05-21 23:38             ` fan
2024-05-23 15:32               ` Jonathan Cameron via
2024-05-14  2:35   ` Zhijian Li (Fujitsu) via
2024-04-18 23:11 ` [PATCH v7 10/12] hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions nifan.cxl
2024-04-19 16:57   ` Gregory Price
2024-04-18 23:11 ` [PATCH v7 11/12] hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support nifan.cxl
2024-04-19 18:20   ` Gregory Price
2024-04-18 23:11 ` [PATCH v7 12/12] hw/mem/cxl_type3: Allow to release extent superset in QMP interface nifan.cxl
2024-04-19 18:20   ` Gregory Price
2024-04-19 18:24 ` [PATCH v7 00/12] Enabling DCD emulation support in Qemu Gregory Price
2024-04-19 18:43   ` fan
2024-04-20 20:35     ` Gregory Price
2024-04-22 12:04       ` Jonathan Cameron via
2024-04-22 14:23         ` Jonathan Cameron via
2024-04-22 15:07           ` Jonathan Cameron via
2024-04-22 15:42         ` Gregory Price
2024-05-16 17:05   ` fan
2024-05-17 12:18     ` Jonathan Cameron via
2024-05-17 16:03       ` fan
2024-05-28 18:10     ` Gregory Price
2024-05-14  2:16 ` Zhijian Li (Fujitsu) via
2024-05-16 17:12   ` fan
2024-05-17  2:20     ` Zhijian Li (Fujitsu) via

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZjLD_wSqrj_z6mM7@debian \
    --to=nifan.cxl@gmail.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=Jorgen.Hansen@wdc.com \
    --cc=a.manzanares@samsung.com \
    --cc=armbru@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave@stgolabs.net \
    --cc=fan.ni@samsung.com \
    --cc=gregory.price@memverge.com \
    --cc=ira.weiny@intel.com \
    --cc=jim.harris@samsung.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=nmtadam.samsung@gmail.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wj28.lee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).