linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Keith Busch <keith.busch@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-mm@kvack.org, linux-api@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Rafael Wysocki <rafael@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCHv7 03/10] acpi/hmat: Parse and report heterogeneous memory
Date: Mon, 11 Mar 2019 10:28:20 +0000	[thread overview]
Message-ID: <20190311102820.00000f2c@huawei.com> (raw)
In-Reply-To: <20190227225038.20438-4-keith.busch@intel.com>

On Wed, 27 Feb 2019 15:50:31 -0700
Keith Busch <keith.busch@intel.com> wrote:

> Systems may provide different memory types and export this information
> in the ACPI Heterogeneous Memory Attribute Table (HMAT). Parse these
> tables provided by the platform and report the memory access and caching
> attributes to the kernel messages.
> 
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
Hi Keith

A few trivial comments in this one. All stuff we can fix if a firmware
runs into the limits of representation.

One other question that comes to mind is:
The move to picoseconds was perhaps optimistic on the part of the
ACPI spec, but do we want to go for future proofing and go with them?
Costs us a few more zeros in a sysfs file after all.  Internal storage
could stay at nano seconds perhaps.

All small things though so either way...

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/acpi/Kconfig       |   1 +
>  drivers/acpi/Makefile      |   1 +
>  drivers/acpi/hmat/Kconfig  |   7 ++
>  drivers/acpi/hmat/Makefile |   1 +
>  drivers/acpi/hmat/hmat.c   | 237 +++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 247 insertions(+)
>  create mode 100644 drivers/acpi/hmat/Kconfig
>  create mode 100644 drivers/acpi/hmat/Makefile
>  create mode 100644 drivers/acpi/hmat/hmat.c
> 
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 4e015c77e48e..283ee94224c6 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -475,6 +475,7 @@ config ACPI_REDUCED_HARDWARE_ONLY
>  	  If you are unsure what to do, do not enable this option.
>  
>  source "drivers/acpi/nfit/Kconfig"
> +source "drivers/acpi/hmat/Kconfig"
>  
>  source "drivers/acpi/apei/Kconfig"
>  source "drivers/acpi/dptf/Kconfig"
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index bb857421c2e8..5d361e4e3405 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -80,6 +80,7 @@ obj-$(CONFIG_ACPI_PROCESSOR)	+= processor.o
>  obj-$(CONFIG_ACPI)		+= container.o
>  obj-$(CONFIG_ACPI_THERMAL)	+= thermal.o
>  obj-$(CONFIG_ACPI_NFIT)		+= nfit/
> +obj-$(CONFIG_ACPI_HMAT)		+= hmat/
>  obj-$(CONFIG_ACPI)		+= acpi_memhotplug.o
>  obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
>  obj-$(CONFIG_ACPI_BATTERY)	+= battery.o
> diff --git a/drivers/acpi/hmat/Kconfig b/drivers/acpi/hmat/Kconfig
> new file mode 100644
> index 000000000000..2f7111b7af62
> --- /dev/null
> +++ b/drivers/acpi/hmat/Kconfig
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +config ACPI_HMAT
> +	bool "ACPI Heterogeneous Memory Attribute Table Support"
> +	depends on ACPI_NUMA
> +	help
> +	 If set, this option has the kernel parse and report the
> +	 platform's ACPI HMAT (Heterogeneous Memory Attributes Table).
> diff --git a/drivers/acpi/hmat/Makefile b/drivers/acpi/hmat/Makefile
> new file mode 100644
> index 000000000000..e909051d3d00
> --- /dev/null
> +++ b/drivers/acpi/hmat/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_ACPI_HMAT) := hmat.o
> diff --git a/drivers/acpi/hmat/hmat.c b/drivers/acpi/hmat/hmat.c
> new file mode 100644
> index 000000000000..99f711420f6d
> --- /dev/null
> +++ b/drivers/acpi/hmat/hmat.c
> @@ -0,0 +1,237 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019, Intel Corporation.
> + *
> + * Heterogeneous Memory Attributes Table (HMAT) representation
> + *
> + * This program parses and reports the platform's HMAT tables, and registers
> + * the applicable attributes with the node's interfaces.
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/bitops.h>
> +#include <linux/device.h>
> +#include <linux/init.h>
> +#include <linux/list.h>
> +#include <linux/node.h>
> +#include <linux/sysfs.h>
> +
> +static __initdata u8 hmat_revision;
> +
> +static __init const char *hmat_data_type(u8 type)
> +{
> +	switch (type) {
> +	case ACPI_HMAT_ACCESS_LATENCY:
> +		return "Access Latency";
> +	case ACPI_HMAT_READ_LATENCY:
> +		return "Read Latency";
> +	case ACPI_HMAT_WRITE_LATENCY:
> +		return "Write Latency";
> +	case ACPI_HMAT_ACCESS_BANDWIDTH:
> +		return "Access Bandwidth";
> +	case ACPI_HMAT_READ_BANDWIDTH:
> +		return "Read Bandwidth";
> +	case ACPI_HMAT_WRITE_BANDWIDTH:
> +		return "Write Bandwidth";
> +	default:
> +		return "Reserved";
> +	}
> +}
> +
> +static __init const char *hmat_data_type_suffix(u8 type)
> +{
> +	switch (type) {
> +	case ACPI_HMAT_ACCESS_LATENCY:
> +	case ACPI_HMAT_READ_LATENCY:
> +	case ACPI_HMAT_WRITE_LATENCY:
> +		return " nsec";
> +	case ACPI_HMAT_ACCESS_BANDWIDTH:
> +	case ACPI_HMAT_READ_BANDWIDTH:
> +	case ACPI_HMAT_WRITE_BANDWIDTH:
> +		return " MB/s";
> +	default:
> +		return "";
> +	}
> +}
> +
> +static __init u32 hmat_normalize(u16 entry, u64 base, u8 type)
> +{
> +	u32 value;
> +
> +	/*
> +	 * Check for invalid and overflow values
> +	 */
> +	if (entry == 0xffff || !entry)
> +		return 0;
> +	else if (base > (UINT_MAX / (entry)))

In this particular case, the issue is not that the value provided is impossible
but rather that the kernel is currently using too small a type to hold it.
If I've understood that right, perhaps a warning here would make sense?

Also, for consistency, U32_MAX.


> +		return 0;
> +
> +	/*
> +	 * Divide by the base unit for version 1, convert latency from
> +	 * picosenonds to nanoseconds if revision 2.
> +	 */
> +	value = entry * base;
> +	if (hmat_revision == 1) {
> +		if (value < 10)
> +			return 0;
> +		value = DIV_ROUND_UP(value, 10);
> +	} else if (hmat_revision == 2) {
> +		switch (type) {
> +		case ACPI_HMAT_ACCESS_LATENCY:
> +		case ACPI_HMAT_READ_LATENCY:
> +		case ACPI_HMAT_WRITE_LATENCY:
> +			value = DIV_ROUND_UP(value, 1000);

In this case our use of a u32 is resulting in a value that is clamped
way below the full u32 range.  Perhaps locally do the computation at
higher precision then clamp down at the end?

> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +	return value;
> +}
> +
> +static __init int hmat_parse_locality(union acpi_subtable_headers *header,
> +				      const unsigned long end)
> +{
> +	struct acpi_hmat_locality *hmat_loc = (void *)header;
> +	unsigned int init, targ, total_size, ipds, tpds;
> +	u32 *inits, *targs, value;
> +	u16 *entries;
> +	u8 type;
> +
> +	if (hmat_loc->header.length < sizeof(*hmat_loc)) {
> +		pr_notice("HMAT: Unexpected locality header length: %d\n",
> +			 hmat_loc->header.length);
> +		return -EINVAL;
> +	}
> +
> +	type = hmat_loc->data_type;
> +	ipds = hmat_loc->number_of_initiator_Pds;
> +	tpds = hmat_loc->number_of_target_Pds;
> +	total_size = sizeof(*hmat_loc) + sizeof(*entries) * ipds * tpds +
> +		     sizeof(*inits) * ipds + sizeof(*targs) * tpds;
> +	if (hmat_loc->header.length < total_size) {

Perhaps overly paranoid, but could we make that a != to catch potential
broken tables?

> +		pr_notice("HMAT: Unexpected locality header length:%d, minimum required:%d\n",
> +			 hmat_loc->header.length, total_size);
> +		return -EINVAL;
> +	}
> +
> +	pr_info("HMAT: Locality: Flags:%02x Type:%s Initiator Domains:%d Target Domains:%d Base:%lld\n",
> +		hmat_loc->flags, hmat_data_type(type), ipds, tpds,
> +		hmat_loc->entry_base_unit);
> +
> +	inits = (u32 *)(hmat_loc + 1);
> +	targs = inits + ipds;
> +	entries = (u16 *)(targs + tpds);
> +	for (init = 0; init < ipds; init++) {
> +		for (targ = 0; targ < tpds; targ++) {
> +			value = hmat_normalize(entries[init * tpds + targ],
> +					       hmat_loc->entry_base_unit,
> +					       type);
> +			pr_info("  Initiator-Target[%d-%d]:%d%s\n",
> +				inits[init], targs[targ], value,
> +				hmat_data_type_suffix(type));
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static __init int hmat_parse_cache(union acpi_subtable_headers *header,
> +				   const unsigned long end)
> +{
> +	struct acpi_hmat_cache *cache = (void *)header;
> +	u32 attrs;
> +
> +	if (cache->header.length < sizeof(*cache)) {
> +		pr_notice("HMAT: Unexpected cache header length: %d\n",
> +			 cache->header.length);
> +		return -EINVAL;
> +	}
> +
> +	attrs = cache->cache_attributes;
> +	pr_info("HMAT: Cache: Domain:%d Size:%llu Attrs:%08x SMBIOS Handles:%d\n",
> +		cache->memory_PD, cache->cache_size, attrs,
> +		cache->number_of_SMBIOShandles);
> +
> +	return 0;
> +}
> +
> +static int __init hmat_parse_address_range(union acpi_subtable_headers *header,
> +					   const unsigned long end)
> +{
> +	struct acpi_hmat_proximity_domain *p = (void *)header;
> +	struct memory_target *target = NULL;
> +
> +	if (p->header.length != sizeof(*p)) {
> +		pr_notice("HMAT: Unexpected address range header length: %d\n",
> +			 p->header.length);
> +		return -EINVAL;
> +	}
> +
> +	if (hmat_revision == 1)
> +		pr_info("HMAT: Memory (%#llx length %#llx) Flags:%04x Processor Domain:%d Memory Domain:%d\n",
> +			p->reserved3, p->reserved4, p->flags, p->processor_PD,
> +			p->memory_PD);
> +	else
> +		pr_info("HMAT: Memory Flags:%04x Processor Domain:%d Memory Domain:%d\n",
> +			p->flags, p->processor_PD, p->memory_PD);
> +
> +	return 0;
> +}
> +
> +static int __init hmat_parse_subtable(union acpi_subtable_headers *header,
> +				      const unsigned long end)
> +{
> +	struct acpi_hmat_structure *hdr = (void *)header;
> +
> +	if (!hdr)
> +		return -EINVAL;
> +
> +	switch (hdr->type) {
> +	case ACPI_HMAT_TYPE_ADDRESS_RANGE:
> +		return hmat_parse_address_range(header, end);
> +	case ACPI_HMAT_TYPE_LOCALITY:
> +		return hmat_parse_locality(header, end);
> +	case ACPI_HMAT_TYPE_CACHE:
> +		return hmat_parse_cache(header, end);
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +static __init int hmat_init(void)
> +{
> +	struct acpi_table_header *tbl;
> +	enum acpi_hmat_type i;
> +	acpi_status status;
> +
> +	if (srat_disabled())
> +		return 0;
> +
> +	status = acpi_get_table(ACPI_SIG_HMAT, 0, &tbl);
> +	if (ACPI_FAILURE(status))
> +		return 0;
> +
> +	hmat_revision = tbl->revision;
> +	switch (hmat_revision) {
> +	case 1:
> +	case 2:
> +		break;
> +	default:
> +		pr_notice("Ignoring HMAT: Unknown revision:%d\n", hmat_revision);
> +		goto out_put;
> +	}
> +
> +	for (i = ACPI_HMAT_TYPE_ADDRESS_RANGE; i < ACPI_HMAT_TYPE_RESERVED; i++) {
> +		if (acpi_table_parse_entries(ACPI_SIG_HMAT,
> +					     sizeof(struct acpi_table_hmat), i,
> +					     hmat_parse_subtable, 0) < 0) {
> +			pr_notice("Ignoring HMAT: Invalid table");
> +			goto out_put;
> +		}
> +	}
> +out_put:
> +	acpi_put_table(tbl);
> +	return 0;
> +}
> +subsys_initcall(hmat_init);

  parent reply	other threads:[~2019-03-11 10:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-27 22:50 [PATCHv7 00/10] Heterogenous memory node attributes Keith Busch
2019-02-27 22:50 ` [PATCHv7 01/10] acpi: Create subtable parsing infrastructure Keith Busch
2019-02-27 22:50 ` [PATCHv7 02/10] acpi: Add HMAT to generic parsing tables Keith Busch
2019-02-27 22:50 ` [PATCHv7 03/10] acpi/hmat: Parse and report heterogeneous memory Keith Busch
2019-03-08 17:25   ` Jonathan Cameron
2019-03-11 10:28   ` Jonathan Cameron [this message]
2019-02-27 22:50 ` [PATCHv7 04/10] node: Link memory nodes to their compute nodes Keith Busch
2019-03-11 10:34   ` Jonathan Cameron
2019-02-27 22:50 ` [PATCHv7 05/10] node: Add heterogenous memory access attributes Keith Busch
2019-02-27 22:50 ` [PATCHv7 06/10] node: Add memory-side caching attributes Keith Busch
2019-03-08 16:21   ` Jonathan Cameron
2019-02-27 22:50 ` [PATCHv7 07/10] acpi/hmat: Register processor domain to its memory Keith Busch
2019-03-11 11:20   ` Jonathan Cameron
2019-03-11 19:52     ` Keith Busch
2019-02-27 22:50 ` [PATCHv7 08/10] acpi/hmat: Register performance attributes Keith Busch
2019-03-11 11:21   ` Jonathan Cameron
2019-02-27 22:50 ` [PATCHv7 09/10] acpi/hmat: Register memory side cache attributes Keith Busch
2019-02-27 22:50 ` [PATCHv7 10/10] doc/mm: New documentation for memory performance Keith Busch
2019-03-11 11:38   ` Jonathan Cameron
2019-03-11 20:16     ` Keith Busch
2019-03-12 13:37       ` Jonathan Cameron
2019-03-11 11:47 ` [PATCHv7 00/10] Heterogenous memory node attributes Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190311102820.00000f2c@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=keith.busch@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).