From: Ira Weiny <ira.weiny@intel.com>
To: Ira Weiny <ira.weiny@intel.com>,
	Dave Jiang <dave.jiang@intel.com>, Fan Ni <fan.ni@samsung.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Navneet Singh <navneet.singh@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Alison Schofield <alison.schofield@intel.com>,
	"Vishal Verma" <vishal.l.verma@intel.com>,
	Ira Weiny <ira.weiny@intel.com>, <linux-btrfs@vger.kernel.org>,
	<linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Jonathan Corbet <corbet@lwn.net>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	"Li, Ming" <ming4.li@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v2 00/25] DCD: Add support for Dynamic Capacity Devices (DCD)
Date: Fri, 16 Aug 2024 09:17:28 -0500
Message-ID: <66bf5f78a2330_22328529480@iweiny-mobl.notmuch>
In-Reply-To: <20240816-dcd-type2-upstream-v2-0-20189a10ad7d@intel.com>

Please ignore this series __and__ the RESEND.

The series did not get sent properly.  My SMTP server failed partway
through the send:

  [PATCH v2 22/25] cxl/region: Read existing extents on region creation
CRITICAL: Error running /usr/bin/msmtp -i: msmtp: cannot locate host smtpauth.intel.com: No address associated with hostname
msmtp: could not send mail (account default from /home/iweiny/.msmtprc)

Then I used b4 --resend v2, but glossed over the fact that it was going
to do something very bad: send a very old version.

https://lore.kernel.org/all/20240816-dcd-type2-upstream-v2-0-b4044aadf2bd@intel.com/

So please ignore that too.  :-(

At this point I'm going to send v3.


<fingers crossed>
Ira



Ira Weiny wrote:
> A git tree of this series can be found here:
> 
> 	https://github.com/weiny2/linux-kernel/tree/dcd-v4-2024-08-15
> 
> This series requires the CXL memory notifier lock change:
> 
> 	https://lore.kernel.org/all/20240814-fix-notifiers-v2-1-6bab38192c7c@intel.com/
> 
> Background
> ==========
> 
> A Dynamic Capacity Device (DCD) (CXL 3.1 sec 9.13.3) is a CXL memory
> device that allows memory capacity within a region to change
> dynamically without the need for resetting the device, reconfiguring
> HDM decoders, or reconfiguring software DAX regions.
> 
> One of the biggest use cases for Dynamic Capacity is to allow hosts to
> share memory dynamically within a data center without increasing the
> per-host attached memory.
> 
> The general flow for the addition or removal of memory is to have an
> orchestrator coordinate the use of the memory.  Generally there are 5
> actors in such a system: the Orchestrator, the Fabric Manager, the
> Logical Device, the Host Kernel, and a Host User.
> 
> Typical workflows are shown below.
> 
> Orchestrator      FM         Device       Host Kernel    Host User
> 
>     |             |           |            |              |
>     |-------------- Create region ----------------------->|
>     |             |           |            |              |
>     |             |           |            |<-- Create ---|
>     |             |           |            |    Region    |
>     |<------------- Signal done --------------------------|
>     |             |           |            |              |
>     |-- Add ----->|-- Add --->|--- Add --->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<- Accept -|<- Accept  -|              |
>     |             |   Extent  |   Extent   |              |
>     |             |           |            |<- Create ----|
>     |             |           |            |   DAX dev    |-- Use memory
>     |             |           |            |              |   |
>     |             |           |            |              |   |
>     |             |           |            |<- Release ---| <-+
>     |             |           |            |   DAX dev    |
>     |             |           |            |              |
>     |<------------- Signal done --------------------------|
>     |             |           |            |              |
>     |-- Remove -->|- Release->|- Release ->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<- Release-|<- Release -|              |
>     |             |   Extent  |   Extent   |              |
>     |             |           |            |              |
>     |-- Add ----->|-- Add --->|--- Add --->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<- Accept -|<- Accept  -|              |
>     |             |   Extent  |   Extent   |              |
>     |             |           |            |<- Create ----|
>     |             |           |            |   DAX dev    |-- Use memory
>     |             |           |            |              |   |
>     |             |           |            |<- Release ---| <-+
>     |             |           |            |   DAX dev    |
>     |<------------- Signal done --------------------------|
>     |             |           |            |              |
>     |-- Remove -->|- Release->|- Release ->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<- Release-|<- Release -|              |
>     |             |   Extent  |   Extent   |              |
>     |             |           |            |              |
>     |-- Add ----->|-- Add --->|--- Add --->|              |
>     |  Capacity   |  Extent   |   Extent   |              |
>     |             |           |            |<- Create ----|
>     |             |           |            |   DAX dev    |-- Use memory
>     |             |           |            |              |   |
>     |-- Remove -->|- Release->|- Release ->|              |   |
>     |  Capacity   |  Extent   |   Extent   |              |   |
>     |             |           |            |              |   |
>     |             |           |     (Release Ignored)     |   |
>     |             |           |            |              |   |
>     |             |           |            |<- Release ---| <-+
>     |             |           |            |   DAX dev    |
>     |<------------- Signal done --------------------------|
>     |             |           |            |              |
>     |             |- Release->|- Release ->|              |
>     |             |  Extent   |   Extent   |              |
>     |             |           |            |              |
>     |             |<- Release-|<- Release -|              |
>     |             |   Extent  |   Extent   |              |
>     |             |           |            |<- Destroy ---|
>     |             |           |            |   Region     |
>     |             |           |            |              |
> 
> Previous versions of this series[0] resulted in architectural comments
> as well as confusion about the architecture caused by the organization
> of the patch series itself.
> 
> This version has reordered the patches to clarify the architecture.
> It also further streamlines extent handling.
> 
> The series still requires the creation of regions and DAX devices to be
> synchronized with the Orchestrator and Fabric Manager.  The host kernel
> will reject an add extent event if the region is not created yet.  It
> will also ignore a release if a DAX device has been created and
> references the extent.
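
To make the two synchronization rules above concrete, here is a minimal
user-space sketch.  It is only a model: the model_* names and the
reference counter are assumptions for illustration and are not taken
from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes; not the kernel's return codes. */
enum model_verdict {
	MODEL_ACCEPT,	/* add: extent is accepted */
	MODEL_REJECT,	/* add: region not created yet */
	MODEL_RELEASE,	/* release: extent is given back */
	MODEL_IGNORE,	/* release: extent still referenced */
};

struct model_state {
	bool region_created;	/* has the host user created the region? */
	unsigned int dax_refs;	/* DAX devices referencing the extent */
};

/* An add extent event is rejected until the region exists. */
static enum model_verdict model_on_add(const struct model_state *s)
{
	return s->region_created ? MODEL_ACCEPT : MODEL_REJECT;
}

/* A release is ignored while any DAX device references the extent. */
static enum model_verdict model_on_release(const struct model_state *s)
{
	return s->dax_refs ? MODEL_IGNORE : MODEL_RELEASE;
}
```

In this model an ignored release leaves nothing behind, so the release
has to be requested again once the DAX device is gone.
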
> 
> These synchronizations are not anticipated to be an issue with real
> applications.
> 
> In order to allow for capacity to be added and removed, a new concept of
> a sparse DAX region is introduced.  A sparse DAX region may have 0 or
> more bytes of available space.  The total space depends on the number
> and size of the extents which have been added.
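
As a rough illustration of the sparse region accounting, the sketch
below sums the accepted extents and subtracts what has already been
handed out.  The structures and the 'allocated' field are assumptions
made for this example, not the data structures used in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of one accepted extent. */
struct model_extent {
	uint64_t offset;	/* offset within the region, in bytes */
	uint64_t length;	/* size of the extent, in bytes */
};

struct model_sparse_region {
	const struct model_extent *extents;	/* extents accepted so far */
	size_t nr_extents;
	uint64_t allocated;	/* bytes already given to DAX devices */
};

/* Total space is the sum of all accepted extents; it may be 0. */
static uint64_t model_region_total(const struct model_sparse_region *r)
{
	uint64_t total = 0;

	for (size_t i = 0; i < r->nr_extents; i++)
		total += r->extents[i].length;
	return total;
}

/* Space still available for new or resized DAX devices. */
static uint64_t model_region_available(const struct model_sparse_region *r)
{
	uint64_t total = model_region_total(r);

	assert(r->allocated <= total);
	return total - r->allocated;
}
```

A region with no extents reports 0 bytes in both helpers, which matches
the "0 or more bytes" wording above.
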
> 
> Initially it is anticipated that users of the memory will carefully
> coordinate the surfacing of additional capacity with the creation of DAX
> devices which use that capacity.  Therefore, the allocation of the
> memory to DAX devices does not allow for specific associations between
> DAX device and extent.  This keeps allocations very similar to existing
> DAX region behavior.
> 
> Great care was taken to keep the extent tracking simple.  Some xarrays
> needed to be added but extra software objects were kept to a minimum.
> 
> Region extents continue to be tracked as sub-devices of the DAX region.
> This ensures that region destruction cleans up all extent allocations
> properly.
> 
> Due to these major changes all reviews were removed from the larger
> patches.  A few of the straightforward patches have kept the tags.
> 
> In summary the major functionality of this series includes:
> 
> - Getting the dynamic capacity (DC) configuration information from CXL
>   devices
> 
> - Configuring the DC partitions reported by hardware
> 
> - Enhancing the CXL and DAX regions for dynamic capacity support
> 	a. Maintain a logical separation between hardware extents and
> 	   software managed region extents.  This provides an
> 	   abstraction between the layers and should allow for
> 	   interleaving in the future
> 
> - Get hardware extent lists for endpoint decoders upon
>   region creation.
> 
> - Adjust extent/region memory available on the following events.
> 	a. Add capacity events
> 	b. Release capacity events
> 
> - Host response for add capacity
> 	a. Do not accept the extent if:
> 		the region does not exist
> 		or an error occurs realizing the extent
> 	b. If the region does exist
> 		realize a DAX region extent with 1:1 mapping (no
> 		interleave yet)
> 	c. Support the more bit by processing a list of extents marked
> 	   with the more bit together before setting up a response.
> 
> - Host response for remove capacity
> 	a. If no DAX device references the extent, release the extent
> 	b. If a reference does exist, ignore the request.
> 	   (Require FM to issue release again.)
> 
> - Modify DAX device creation/resize to account for extents within a
>   sparse DAX region
> 
> - Trace Dynamic Capacity events for debugging
> 
> - Add cxl-test infrastructure to allow for faster unit testing
>   (See new ndctl branch for cxl-dcd.sh test[1])
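
The "more" bit handling in the add capacity item above can be sketched
roughly as follows: extents are queued while the bit is set and a single
response is produced when the last extent of the group arrives.  This is
a user-space model only.  The model_* names, the fixed group bound, and
the 'realizable' flag (standing in for "the region exists and realizing
the extent succeeded") are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_GROUP 16	/* arbitrary bound for this sketch */

struct model_add_event {
	uint64_t start;
	uint64_t length;
	bool more;		/* further extents of this group follow */
	bool realizable;	/* stand-in: region exists, realize succeeds */
};

struct model_add_ctx {
	struct model_add_event group[MODEL_MAX_GROUP];
	size_t nr_group;	/* extents queued for the pending response */
	size_t nr_accepted;	/* extents accepted in the last response */
	size_t nr_responses;	/* responses sent so far */
};

/*
 * Queue one add event.  Returns true when a response was sent, which
 * only happens once the final extent of a group (more == false) arrives.
 */
static bool model_handle_add(struct model_add_ctx *ctx,
			     struct model_add_event ev)
{
	/* Extents beyond the sketch's bound are dropped, i.e. not accepted. */
	if (ctx->nr_group < MODEL_MAX_GROUP)
		ctx->group[ctx->nr_group++] = ev;

	if (ev.more)
		return false;	/* wait for the rest of the group */

	ctx->nr_accepted = 0;
	for (size_t i = 0; i < ctx->nr_group; i++) {
		if (ctx->group[i].realizable)
			ctx->nr_accepted++;
	}
	ctx->nr_group = 0;
	ctx->nr_responses++;
	return true;
}
```

Extents that cannot be realized are simply left out of the response in
this model; the rest of the group is still accepted.
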
> 
> Fan Ni's upstream QEMU DCD support was used for testing.
> 
> Remaining work:
> 
> 	1) Integrate the QoS work from Dave Jiang
> 	2) Interleave support
> 
> Possible additional work depending on requirements:
> 
> 	1) Allow mapping to specific extents (perhaps based on
> 	   label/tag)
> 	2) Release extents when DAX devices are released if a release
> 	   was previously seen from the device
> 	3) Accept a new extent which extends (but overlaps) one or more
> 	   existing extents
> 	4) Rework DAX device interfaces, memfd has been explored a bit
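
Item 3 suggests that a new extent overlapping an existing one is not
accepted today.  A minimal sketch of such an overlap test on inclusive
ranges is below.  The struct and function are hypothetical user-space
stand-ins and are not claimed to match the range_overlaps() helper added
by the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range with an inclusive end address. */
struct model_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Two inclusive ranges overlap when each starts at or before the other ends. */
static bool model_range_overlaps(const struct model_range *a,
				 const struct model_range *b)
{
	return a->start <= b->end && b->start <= a->end;
}
```

With inclusive ends, two back-to-back extents such as [0x0, 0xfff] and
[0x1000, 0x1fff] do not overlap.
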
> 
> [0] v1: https://lore.kernel.org/all/20240324-dcd-type2-upstream-v1-0-b7b00d623625@intel.com/
> [1] https://github.com/weiny2/ndctl/tree/dcd-region2-2024-08-15
> 
> ---
> Major changes:
> - Jonathan: support the more bit
> - djbw: Allow more than 1 region per DC partition
> - All: Address the many comments on the series.
> - iweiny: rebase
> - iweiny: Rework the series to make it easier to review and understand
>           the flow
> - Link to v1: https://lore.kernel.org/r/20240324-dcd-type2-upstream-v1-0-b7b00d623625@intel.com
> 
> ---
> Ira Weiny (13):
>       range: Add range_overlaps()
>       printk: Add print format (%par) for struct range
>       dax: Document dax dev range tuple
>       cxl/pci: Delay event buffer allocation
>       cxl/region: Refactor common create region code
>       cxl/events: Split event msgnum configuration from irq setup
>       cxl/pci: Factor out interrupt policy check
>       cxl/core: Return endpoint decoder information from region search
>       dax/bus: Factor out dev dax resize logic
>       dax/region: Create resources on sparse DAX regions
>       cxl/region: Read existing extents on region creation
>       tools/testing/cxl: Make event logs dynamic
>       tools/testing/cxl: Add DC Regions to mock mem data
> 
> Navneet Singh (12):
>       cxl/mbox: Flag support for Dynamic Capacity Devices (DCD)
>       cxl/mem: Read dynamic capacity configuration from the device
>       cxl/core: Separate region mode from decoder mode
>       cxl/region: Add dynamic capacity decoder and region modes
>       cxl/hdm: Add dynamic capacity size support to endpoint decoders
>       cxl/port: Add endpoint decoder DC mode support to sysfs
>       cxl/mem: Expose DCD partition capabilities in sysfs
>       cxl/region: Add sparse DAX region support
>       cxl/mem: Configure dynamic capacity interrupts
>       cxl/extent: Process DCD events and realize region extents
>       cxl/region/extent: Expose region extent information in sysfs
>       cxl/mem: Trace Dynamic capacity Event Record
> 
>  Documentation/ABI/testing/sysfs-bus-cxl   |  68 ++-
>  Documentation/core-api/printk-formats.rst |  14 +
>  drivers/cxl/core/Makefile                 |   2 +-
>  drivers/cxl/core/core.h                   |  33 +-
>  drivers/cxl/core/extent.c                 | 467 ++++++++++++++
>  drivers/cxl/core/hdm.c                    | 206 ++++++-
>  drivers/cxl/core/mbox.c                   | 578 +++++++++++++++++-
>  drivers/cxl/core/memdev.c                 | 101 ++-
>  drivers/cxl/core/port.c                   |  13 +-
>  drivers/cxl/core/region.c                 | 173 ++++--
>  drivers/cxl/core/trace.h                  |  65 ++
>  drivers/cxl/cxl.h                         | 122 +++-
>  drivers/cxl/cxlmem.h                      | 128 +++-
>  drivers/cxl/pci.c                         | 123 +++-
>  drivers/dax/bus.c                         | 352 +++++++++--
>  drivers/dax/bus.h                         |   4 +-
>  drivers/dax/cxl.c                         |  73 ++-
>  drivers/dax/dax-private.h                 |  39 +-
>  drivers/dax/hmem/hmem.c                   |   2 +-
>  drivers/dax/pmem.c                        |   2 +-
>  fs/btrfs/ordered-data.c                   |  10 +-
>  include/linux/cxl-event.h                 |  32 +
>  include/linux/range.h                     |   7 +
>  lib/vsprintf.c                            |  37 ++
>  tools/testing/cxl/Kbuild                  |   3 +-
>  tools/testing/cxl/test/mem.c              | 981 ++++++++++++++++++++++++++----
>  26 files changed, 3327 insertions(+), 308 deletions(-)
> ---
> base-commit: 3cef9316df4cda21b5bf25e4230221b02050dfa1
> change-id: 20230604-dcd-type2-upstream-0cd15f6216fd
> 
> Best regards,
> -- 
> Ira Weiny <ira.weiny@intel.com>
> 