From: Fan Ni <nifan.cxl@gmail.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Navneet Singh <navneet.singh@intel.com>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
linux-btrfs@vger.kernel.org, linux-cxl@vger.kernel.org,
linux-doc@vger.kernel.org, nvdimm@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 23/28] dax/bus: Factor out dev dax resize logic
Date: Mon, 14 Oct 2024 09:56:15 -0700
Message-ID: <Zw1NL3_otWVTUF4c@fan>
In-Reply-To: <20241007-dcd-type2-upstream-v4-23-c261ee6eeded@intel.com>
On Mon, Oct 07, 2024 at 06:16:29PM -0500, Ira Weiny wrote:
> Dynamic Capacity regions must limit dev dax resources to those areas
> which have extents backing real memory. Such DAX regions are dubbed
> 'sparse' regions. In order to manage where memory is available four
> alternatives were considered:
>
> 1) Create a single region resource child on region creation which
> reserves the entire region. Then as extents are added punch holes in
> this reservation. This requires new resource manipulation to punch
> the holes and still requires an additional iteration over the extent
> areas which may already have existing dev dax resources used.
>
> 2) Maintain an ordered xarray of extents which can be queried while
> processing the resize logic. The issue is that existing region->res
> children may artificially limit the allocation size sent to
> alloc_dev_dax_range(). I.e., the resource children can't be directly
> used in the resize logic to find where space in the region is. This
> also poses a problem of managing the available size in 2 places.
>
> 3) Maintain a separate resource tree with extents. This option is the
> same as 2) but with a different data structure. Most ideally there
> should be a unified representation of the resource tree not two places
> to look for space.
>
> 4) Create region resource children for each extent. Manage the dax dev
> resize logic in the same way as before but use a region child
> (extent) resource as the parents to find space within each extent.
>
> Option 4 can leverage the existing resize algorithm to find space within
> the extents. It manages the available space in a singular resource tree
> which is less complicated for finding space.
>
> In preparation for this change, factor out the dev_dax_resize logic.
> For static regions use dax_region->res as the parent to find space for
> the dax ranges. Future patches will use the same algorithm with
> individual extent resources as the parent.
>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
LGTM based on the code logic, but I am not familiar with dax resource management.
Fan
> ---
> Changes:
> [Jonathan: Fix handling of alloc]
> ---
> drivers/dax/bus.c | 129 +++++++++++++++++++++++++++++++++---------------------
> 1 file changed, 79 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index d8cb5195a227..f0e3f8c787df 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -844,11 +844,9 @@ static int devm_register_dax_mapping(struct dev_dax *dev_dax, int range_id)
> return 0;
> }
>
> -static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
> - resource_size_t size)
> +static int alloc_dev_dax_range(struct resource *parent, struct dev_dax *dev_dax,
> + u64 start, resource_size_t size)
> {
> - struct dax_region *dax_region = dev_dax->region;
> - struct resource *res = &dax_region->res;
> struct device *dev = &dev_dax->dev;
> struct dev_dax_range *ranges;
> unsigned long pgoff = 0;
> @@ -866,14 +864,14 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
> return 0;
> }
>
> - alloc = __request_region(res, start, size, dev_name(dev), 0);
> + alloc = __request_region(parent, start, size, dev_name(dev), 0);
> if (!alloc)
> return -ENOMEM;
>
> ranges = krealloc(dev_dax->ranges, sizeof(*ranges)
> * (dev_dax->nr_range + 1), GFP_KERNEL);
> if (!ranges) {
> - __release_region(res, alloc->start, resource_size(alloc));
> + __release_region(parent, alloc->start, resource_size(alloc));
> return -ENOMEM;
> }
>
> @@ -1026,50 +1024,45 @@ static bool adjust_ok(struct dev_dax *dev_dax, struct resource *res)
> return true;
> }
>
> -static ssize_t dev_dax_resize(struct dax_region *dax_region,
> - struct dev_dax *dev_dax, resource_size_t size)
> +/**
> + * dev_dax_resize_static - Expand the device into the unused portion of the
> + * region. This may involve adjusting the end of an existing resource, or
> + * allocating a new resource.
> + *
> + * @parent: parent resource to allocate this range in
> + * @dev_dax: DAX device to be expanded
> + * @to_alloc: amount of space to alloc; must be <= space available in @parent
> + *
> + * Return the amount of space allocated or -ERRNO on failure
> + */
> +static ssize_t dev_dax_resize_static(struct resource *parent,
> + struct dev_dax *dev_dax,
> + resource_size_t to_alloc)
> {
> - resource_size_t avail = dax_region_avail_size(dax_region), to_alloc;
> - resource_size_t dev_size = dev_dax_size(dev_dax);
> - struct resource *region_res = &dax_region->res;
> - struct device *dev = &dev_dax->dev;
> struct resource *res, *first;
> - resource_size_t alloc = 0;
> int rc;
>
> - if (dev->driver)
> - return -EBUSY;
> - if (size == dev_size)
> - return 0;
> - if (size > dev_size && size - dev_size > avail)
> - return -ENOSPC;
> - if (size < dev_size)
> - return dev_dax_shrink(dev_dax, size);
> -
> - to_alloc = size - dev_size;
> - if (dev_WARN_ONCE(dev, !alloc_is_aligned(dev_dax, to_alloc),
> - "resize of %pa misaligned\n", &to_alloc))
> - return -ENXIO;
> -
> - /*
> - * Expand the device into the unused portion of the region. This
> - * may involve adjusting the end of an existing resource, or
> - * allocating a new resource.
> - */
> -retry:
> - first = region_res->child;
> - if (!first)
> - return alloc_dev_dax_range(dev_dax, dax_region->res.start, to_alloc);
> + first = parent->child;
> + if (!first) {
> + rc = alloc_dev_dax_range(parent, dev_dax,
> + parent->start, to_alloc);
> + if (rc)
> + return rc;
> + return to_alloc;
> + }
>
> - rc = -ENOSPC;
> for (res = first; res; res = res->sibling) {
> struct resource *next = res->sibling;
> + resource_size_t alloc;
>
> /* space at the beginning of the region */
> - if (res == first && res->start > dax_region->res.start) {
> - alloc = min(res->start - dax_region->res.start, to_alloc);
> - rc = alloc_dev_dax_range(dev_dax, dax_region->res.start, alloc);
> - break;
> + if (res == first && res->start > parent->start) {
> + alloc = min(res->start - parent->start, to_alloc);
> + rc = alloc_dev_dax_range(parent, dev_dax,
> + parent->start, alloc);
> + if (rc)
> + return rc;
> + return alloc;
> }
>
> alloc = 0;
> @@ -1078,21 +1071,55 @@ static ssize_t dev_dax_resize(struct dax_region *dax_region,
> alloc = min(next->start - (res->end + 1), to_alloc);
>
> /* space at the end of the region */
> - if (!alloc && !next && res->end < region_res->end)
> - alloc = min(region_res->end - res->end, to_alloc);
> + if (!alloc && !next && res->end < parent->end)
> + alloc = min(parent->end - res->end, to_alloc);
>
> if (!alloc)
> continue;
>
> if (adjust_ok(dev_dax, res)) {
> rc = adjust_dev_dax_range(dev_dax, res, resource_size(res) + alloc);
> - break;
> + if (rc)
> + return rc;
> + return alloc;
> }
> - rc = alloc_dev_dax_range(dev_dax, res->end + 1, alloc);
> - break;
> + rc = alloc_dev_dax_range(parent, dev_dax, res->end + 1, alloc);
> + if (rc)
> + return rc;
> + return alloc;
> }
> - if (rc)
> - return rc;
> +
> + /* available was already calculated and should never be an issue */
> + dev_WARN_ONCE(&dev_dax->dev, 1, "space not found?");
> + return 0;
> +}
> +
> +static ssize_t dev_dax_resize(struct dax_region *dax_region,
> + struct dev_dax *dev_dax, resource_size_t size)
> +{
> + resource_size_t avail = dax_region_avail_size(dax_region), to_alloc;
> + resource_size_t dev_size = dev_dax_size(dev_dax);
> + struct device *dev = &dev_dax->dev;
> + resource_size_t alloc;
> +
> + if (dev->driver)
> + return -EBUSY;
> + if (size == dev_size)
> + return 0;
> + if (size > dev_size && size - dev_size > avail)
> + return -ENOSPC;
> + if (size < dev_size)
> + return dev_dax_shrink(dev_dax, size);
> +
> + to_alloc = size - dev_size;
> + if (dev_WARN_ONCE(dev, !alloc_is_aligned(dev_dax, to_alloc),
> + "resize of %pa misaligned\n", &to_alloc))
> + return -ENXIO;
> +
> +retry:
> + alloc = dev_dax_resize_static(&dax_region->res, dev_dax, to_alloc);
> + if (alloc <= 0)
> + return alloc;
> to_alloc -= alloc;
> if (to_alloc)
> goto retry;
> @@ -1198,7 +1225,8 @@ static ssize_t mapping_store(struct device *dev, struct device_attribute *attr,
>
> to_alloc = range_len(&r);
> if (alloc_is_aligned(dev_dax, to_alloc))
> - rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
> + rc = alloc_dev_dax_range(&dax_region->res, dev_dax, r.start,
> + to_alloc);
> up_write(&dax_dev_rwsem);
> up_write(&dax_region_rwsem);
>
> @@ -1466,7 +1494,8 @@ static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
> device_initialize(dev);
> dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id);
>
> - rc = alloc_dev_dax_range(dev_dax, dax_region->res.start, data->size);
> + rc = alloc_dev_dax_range(&dax_region->res, dev_dax, dax_region->res.start,
> + data->size);
> if (rc)
> goto err_range;
>
>
> --
> 2.46.0
>
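
As an aside, the gap search that dev_dax_resize_static() performs over
parent->child, together with the retry loop in dev_dax_resize(), can be
sketched in plain userspace C. This is a minimal illustration under stated
assumptions, not the kernel implementation: struct span, find_gap() and
grow() are hypothetical names, the children live in a sorted array instead of
a sibling list, and the adjust_ok()/adjust_dev_dax_range() merge path,
alignment checks and locking are all left out. It also assumes the parent
does not end at UINT64_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource: an inclusive [start, end] span. */
struct span {
	uint64_t start;
	uint64_t end;
};

/*
 * Find the first free gap inside @parent. @kids holds @n child spans,
 * sorted by start and non-overlapping. On success store the gap start in
 * *gap_start and return min(gap size, to_alloc). Return 0 if @parent is full.
 */
uint64_t find_gap(const struct span *parent, const struct span *kids,
		  size_t n, uint64_t to_alloc, uint64_t *gap_start)
{
	uint64_t cursor = parent->start;
	uint64_t gap;
	size_t i;

	assert(to_alloc > 0);

	for (i = 0; i < n; i++) {
		/* space before the first child, or between two children */
		if (kids[i].start > cursor) {
			gap = kids[i].start - cursor;
			*gap_start = cursor;
			return gap < to_alloc ? gap : to_alloc;
		}
		cursor = kids[i].end + 1;
	}

	/* space at the end of the parent */
	if (cursor <= parent->end) {
		gap = parent->end - cursor + 1;
		*gap_start = cursor;
		return gap < to_alloc ? gap : to_alloc;
	}
	return 0;
}

/*
 * Analogue of the retry loop: keep claiming gaps until @to_alloc is
 * satisfied. Returns 0 on success, -1 if space or array capacity runs out.
 * Spans claimed before a failure are left in place.
 */
int grow(const struct span *parent, struct span *kids, size_t *n,
	 size_t cap, uint64_t to_alloc)
{
	while (to_alloc) {
		uint64_t at, got;
		size_t i;

		if (*n == cap)
			return -1;
		got = find_gap(parent, kids, *n, to_alloc, &at);
		if (!got)
			return -1;

		/* insert the new span, keeping kids sorted by start */
		for (i = *n; i > 0 && kids[i - 1].start > at; i--)
			kids[i] = kids[i - 1];
		kids[i].start = at;
		kids[i].end = at + got - 1;
		(*n)++;
		to_alloc -= got;
	}
	return 0;
}
```

With a parent of [0, 99] and children [10, 19] and [40, 49], a request for 30
bytes would be satisfied in two passes: 10 bytes at offset 0, then 20 bytes at
offset 20. grow() returns a signed int so that failure stays distinguishable
from a byte count, which is the same reason the helper in the patch returns
ssize_t.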
--
Fan Ni