From: Dave Jiang <dave.jiang@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: <linux-cxl@vger.kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>, <dan.j.williams@intel.com>,
	<ira.weiny@intel.com>, <vishal.l.verma@intel.com>,
	<alison.schofield@intel.com>, <dave@stgolabs.net>,
	<brice.goglin@gmail.com>, <nifan.cxl@gmail.com>
Subject: Re: [PATCH v2 3/3] cxl: Add memory hotplug notifier for cxl region
Date: Fri, 22 Dec 2023 11:17:15 -0700
Message-ID: <af8ef32d-bb8f-468c-aeee-ade0e3c1ff39@intel.com>
In-Reply-To: <20231219151507.0000226f@Huawei.com>



On 12/19/23 08:15, Jonathan Cameron wrote:
> On Fri, 15 Dec 2023 16:16:11 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> When the CXL region is formed, the driver computes the performance
>> data for the region. However this data is not available at the node data
>> collection that has been populated by the HMAT during kernel
>> initialization. Add a memory hotplug notifier to update the performance
>> data to the node hmem_attrs to expose the newly calculated region
>> performance data. The CXL region is created under specific CFMWS. The
>> node for the CFMWS is created during SRAT parsing by acpi_parse_cfmws().
>> The notifier will run once only and turn itself off after the initial
>> run. Additional regions may overwrite the initial data, but since this is
>> for the same poximity domain it's a don't care for now.
> 
> proximity
> 
>>
>> node_set_perf_attrs() is exported to allow update of perf attribs for a
>> node. Given that only CXL is using this, export only to CXL namespace.
>>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Rafael J. Wysocki <rafael@kernel.org>
>> Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> What is end result of this?
> 
> /sys/devices/system/node/node/access0/ 
> /sys/devices/system/node/node/access1/ 
> With just the bandwidths and latencies?
> No targets or initiators under accessX/targets or accessX/initiators?

# tree ./devices/system/node/node2/access0
./devices/system/node/node2/access0
├── initiators
│   ├── node1 -> ../../../node1
│   ├── read_bandwidth
│   ├── read_latency
│   ├── write_bandwidth
│   └── write_latency
├── power
│   ├── async
│   ├── runtime_active_kids
│   ├── runtime_enabled
│   ├── runtime_status
│   └── runtime_usage
├── targets
└── uevent


> 
> Or have those been set up earlier?  In which case do we handle
> the worse bandwidth being inside the host CPU?

I think it gets set up via the memory online callback notifier that the region driver registered.

> 
>> ---
>> v2:
>> - Fix notifier return values (Dan)
>> - Use devm_add_action_or_reset() instead of adding a remove callback (Dan)
>> - Add Ying review tag
>> ---
>>  drivers/base/node.c       |    1 +
>>  drivers/cxl/core/region.c |   42 ++++++++++++++++++++++++++++++++++++++++++
>>  drivers/cxl/cxl.h         |    3 +++
>>  3 files changed, 46 insertions(+)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index cb2b6cc7f6e6..f5b5a3f11894 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -215,6 +215,7 @@ void node_set_perf_attrs(unsigned int nid, struct access_coordinate *coord,
>>  		}
>>  	}
>>  }
>> +EXPORT_SYMBOL_NS_GPL(node_set_perf_attrs, CXL);
> This feels ugly as namespaces are usually about what is providing the facility, not
> a 'who can use it' control.
> 
> Also, I'm aware of at least one other user who will want this in the not
> too distant future.  So if we want to namespace it, I'd prefer a NODE namespace
> or something along those lines.

I'll just make it a normal export if we are anticipating another user.

> 
>>  
>>  /**
>>   * struct node_cache_info - Internal tracking for memory node caches
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index d97fa5f32e86..1765bf716484 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -4,6 +4,7 @@
>>  #include <linux/genalloc.h>
>>  #include <linux/device.h>
>>  #include <linux/module.h>
>> +#include <linux/memory.h>
>>  #include <linux/slab.h>
>>  #include <linux/uuid.h>
>>  #include <linux/sort.h>
>> @@ -2960,6 +2961,42 @@ static int is_system_ram(struct resource *res, void *arg)
>>  	return 1;
>>  }
>>  
>> +static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
>> +					  unsigned long action, void *arg)
>> +{
>> +	struct cxl_region *cxlr = container_of(nb, struct cxl_region,
>> +					       memory_notifier);
>> +	struct cxl_region_params *p = &cxlr->params;
>> +	struct cxl_endpoint_decoder *cxled = p->targets[0];
>> +	struct cxl_decoder *cxld = &cxled->cxld;
>> +	struct memory_notify *mnb = arg;
>> +	int nid = mnb->status_change_nid;
>> +	int region_nid;
>> +
>> +	if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
>> +		return NOTIFY_DONE;
>> +
>> +	region_nid = phys_to_target_node(cxld->hpa_range.start);
>> +	if (nid != region_nid)
>> +		return NOTIFY_DONE;
>> +
>> +	/* Don't set if there's no coordinate information */
>> +	if (!cxlr->coord.write_bandwidth)
>> +		return NOTIFY_DONE;
> 
> Could future proof a bit to allow for RO memory by using read_bandwidth here.

Yes. I didn't realize there would be RO memory. I just assumed that bandwidth would always be > 0 for a valid set of data.
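As a rough illustration of the check being discussed, here is a userspace sketch of a "has performance data" test that would not reject read-only memory. The struct is a stand-in that mirrors the field names of the kernel's struct access_coordinate, coord_has_perf_data() is a hypothetical helper name, and treating either non-zero bandwidth as valid is an assumption; the review only suggests keying on read_bandwidth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in mirroring the kernel's struct access_coordinate fields. */
struct access_coordinate {
	unsigned int read_bandwidth;
	unsigned int write_bandwidth;
	unsigned int read_latency;
	unsigned int write_latency;
};

/*
 * Hypothetical helper (not part of the patch): report whether any
 * bandwidth value was calculated for the region. Accepting either
 * direction is an assumption; the patch as posted tests only
 * write_bandwidth, which is zero for read-only memory.
 */
static bool coord_has_perf_data(const struct access_coordinate *coord)
{
	assert(coord != NULL);
	return coord->read_bandwidth != 0 || coord->write_bandwidth != 0;
}
```

With a check like this, the callback would return NOTIFY_DONE only when neither bandwidth has been populated.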

> 
>> +
>> +	node_set_perf_attrs(nid, &cxlr->coord, 0);
>> +	node_set_perf_attrs(nid, &cxlr->coord, 1);
> 
> Hmm. The assumption that the access attributes from non-CPU requesters are the same
> as those from the CPU bothers me a little.

I wasn't too sure about updating both access classes. Should I only update access 0?
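To make the two options concrete, here is a userspace model of the callback's filtering with the access1 update made optional. Everything prefixed SKETCH_/sketch_ and the stub are hypothetical stand-ins for NUMA_NO_NODE, the MEM_ONLINE action and node_set_perf_attrs(); the filtering order follows the patch, while the update_access1 switch is only an assumption about how an "access 0 only" variant might look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NO_NODE   (-1)	/* stand-in for NUMA_NO_NODE */
#define SKETCH_NR_ACCESS 2	/* access0 and access1 */

/* Stand-in for the memory notifier action values. */
enum sketch_action { SKETCH_MEM_ONLINE, SKETCH_MEM_OFFLINE };

/* Userspace stand-in mirroring the kernel's struct access_coordinate fields. */
struct access_coordinate {
	unsigned int read_bandwidth;
	unsigned int write_bandwidth;
	unsigned int read_latency;
	unsigned int write_latency;
};

/* Records which access classes the stub below was asked to update. */
static bool class_updated[SKETCH_NR_ACCESS];

/* Stub standing in for node_set_perf_attrs(nid, coord, access). */
static void set_perf_attrs_stub(int nid, const struct access_coordinate *coord,
				unsigned int access)
{
	(void)nid;
	(void)coord;
	assert(access < SKETCH_NR_ACCESS);
	class_updated[access] = true;
}

/*
 * Model of the callback: returns true where the patch returns NOTIFY_OK
 * and false where it returns NOTIFY_DONE.
 */
static bool region_perf_update(enum sketch_action action, int event_nid,
			       int region_nid,
			       const struct access_coordinate *coord,
			       bool update_access1)
{
	assert(coord != NULL);

	/* Only react to a memory-online event that changes a node's status. */
	if (event_nid == SKETCH_NO_NODE || action != SKETCH_MEM_ONLINE)
		return false;

	/* Ignore events for nodes other than the region's target node. */
	if (event_nid != region_nid)
		return false;

	/* Same "no coordinate information" test as the posted patch. */
	if (!coord->write_bandwidth)
		return false;

	set_perf_attrs_stub(event_nid, coord, 0);
	if (update_access1)
		set_perf_attrs_stub(event_nid, coord, 1);

	return true;
}
```

Passing update_access1 = false leaves access1 with whatever values were populated earlier, which is the behavior the question above is asking about.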
> 
>> +
>> +	return NOTIFY_OK;
>> +}
>> +
>> +static void remove_coord_notifier(void *data)
>> +{
>> +	struct cxl_region *cxlr = data;
>> +
>> +	unregister_memory_notifier(&cxlr->memory_notifier);
>> +}
>> +
>>  static int cxl_region_probe(struct device *dev)
>>  {
>>  	struct cxl_region *cxlr = to_cxl_region(dev);
>> @@ -2985,6 +3022,11 @@ static int cxl_region_probe(struct device *dev)
>>  		goto out;
>>  	}
>>  
>> +	cxlr->memory_notifier.notifier_call = cxl_region_perf_attrs_callback;
>> +	cxlr->memory_notifier.priority = HMAT_CALLBACK_PRI;
>> +	register_memory_notifier(&cxlr->memory_notifier);
>> +	rc = devm_add_action_or_reset(&cxlr->dev, remove_coord_notifier, cxlr);
>> +
>>  	/*
>>  	 * From this point on any path that changes the region's state away from
>>  	 * CXL_CONFIG_COMMIT is also responsible for releasing the driver.
> 
>>
> 


