From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Keith Busch <keith.busch@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
linux-mm@kvack.org, linux-api@vger.kernel.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Rafael Wysocki <rafael@kernel.org>,
Dave Hansen <dave.hansen@intel.com>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCHv7 03/10] acpi/hmat: Parse and report heterogeneous memory
Date: Mon, 11 Mar 2019 10:28:20 +0000
Message-ID: <20190311102820.00000f2c@huawei.com>
In-Reply-To: <20190227225038.20438-4-keith.busch@intel.com>
On Wed, 27 Feb 2019 15:50:31 -0700
Keith Busch <keith.busch@intel.com> wrote:
> Systems may provide different memory types and export this information
> in the ACPI Heterogeneous Memory Attribute Table (HMAT). Parse these
> tables provided by the platform and report the memory access and caching
> attributes to the kernel messages.
>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Keith Busch <keith.busch@intel.com>
Hi Keith
A few trivial comments on this one. They are all things we can fix later
if a firmware ever runs into the limits of representation.

One other question that comes to mind:
The move to picoseconds was perhaps optimistic on the part of the
ACPI spec, but do we want to future-proof and follow it? After all, it
only costs us a few more zeros in a sysfs file. Internal storage could
perhaps stay in nanoseconds.

All small things though, so either way...
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/acpi/Kconfig | 1 +
> drivers/acpi/Makefile | 1 +
> drivers/acpi/hmat/Kconfig | 7 ++
> drivers/acpi/hmat/Makefile | 1 +
> drivers/acpi/hmat/hmat.c | 237 +++++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 247 insertions(+)
> create mode 100644 drivers/acpi/hmat/Kconfig
> create mode 100644 drivers/acpi/hmat/Makefile
> create mode 100644 drivers/acpi/hmat/hmat.c
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 4e015c77e48e..283ee94224c6 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -475,6 +475,7 @@ config ACPI_REDUCED_HARDWARE_ONLY
> If you are unsure what to do, do not enable this option.
>
> source "drivers/acpi/nfit/Kconfig"
> +source "drivers/acpi/hmat/Kconfig"
>
> source "drivers/acpi/apei/Kconfig"
> source "drivers/acpi/dptf/Kconfig"
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index bb857421c2e8..5d361e4e3405 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -80,6 +80,7 @@ obj-$(CONFIG_ACPI_PROCESSOR) += processor.o
> obj-$(CONFIG_ACPI) += container.o
> obj-$(CONFIG_ACPI_THERMAL) += thermal.o
> obj-$(CONFIG_ACPI_NFIT) += nfit/
> +obj-$(CONFIG_ACPI_HMAT) += hmat/
> obj-$(CONFIG_ACPI) += acpi_memhotplug.o
> obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
> obj-$(CONFIG_ACPI_BATTERY) += battery.o
> diff --git a/drivers/acpi/hmat/Kconfig b/drivers/acpi/hmat/Kconfig
> new file mode 100644
> index 000000000000..2f7111b7af62
> --- /dev/null
> +++ b/drivers/acpi/hmat/Kconfig
> @@ -0,0 +1,7 @@
> +# SPDX-License-Identifier: GPL-2.0
> +config ACPI_HMAT
> + bool "ACPI Heterogeneous Memory Attribute Table Support"
> + depends on ACPI_NUMA
> + help
> + If set, this option has the kernel parse and report the
> + platform's ACPI HMAT (Heterogeneous Memory Attributes Table).
> diff --git a/drivers/acpi/hmat/Makefile b/drivers/acpi/hmat/Makefile
> new file mode 100644
> index 000000000000..e909051d3d00
> --- /dev/null
> +++ b/drivers/acpi/hmat/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_ACPI_HMAT) := hmat.o
> diff --git a/drivers/acpi/hmat/hmat.c b/drivers/acpi/hmat/hmat.c
> new file mode 100644
> index 000000000000..99f711420f6d
> --- /dev/null
> +++ b/drivers/acpi/hmat/hmat.c
> @@ -0,0 +1,237 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019, Intel Corporation.
> + *
> + * Heterogeneous Memory Attributes Table (HMAT) representation
> + *
> + * This program parses and reports the platform's HMAT tables, and registers
> + * the applicable attributes with the node's interfaces.
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/bitops.h>
> +#include <linux/device.h>
> +#include <linux/init.h>
> +#include <linux/list.h>
> +#include <linux/node.h>
> +#include <linux/sysfs.h>
> +
> +static __initdata u8 hmat_revision;
> +
> +static __init const char *hmat_data_type(u8 type)
> +{
> + switch (type) {
> + case ACPI_HMAT_ACCESS_LATENCY:
> + return "Access Latency";
> + case ACPI_HMAT_READ_LATENCY:
> + return "Read Latency";
> + case ACPI_HMAT_WRITE_LATENCY:
> + return "Write Latency";
> + case ACPI_HMAT_ACCESS_BANDWIDTH:
> + return "Access Bandwidth";
> + case ACPI_HMAT_READ_BANDWIDTH:
> + return "Read Bandwidth";
> + case ACPI_HMAT_WRITE_BANDWIDTH:
> + return "Write Bandwidth";
> + default:
> + return "Reserved";
> + }
> +}
> +
> +static __init const char *hmat_data_type_suffix(u8 type)
> +{
> + switch (type) {
> + case ACPI_HMAT_ACCESS_LATENCY:
> + case ACPI_HMAT_READ_LATENCY:
> + case ACPI_HMAT_WRITE_LATENCY:
> + return " nsec";
> + case ACPI_HMAT_ACCESS_BANDWIDTH:
> + case ACPI_HMAT_READ_BANDWIDTH:
> + case ACPI_HMAT_WRITE_BANDWIDTH:
> + return " MB/s";
> + default:
> + return "";
> + }
> +}
> +
> +static __init u32 hmat_normalize(u16 entry, u64 base, u8 type)
> +{
> + u32 value;
> +
> + /*
> + * Check for invalid and overflow values
> + */
> + if (entry == 0xffff || !entry)
> + return 0;
> + else if (base > (UINT_MAX / (entry)))
In this particular case, the issue is not that the value provided is impossible,
but rather that the kernel is currently using too small a type to hold it.
If I've understood that right, perhaps a warning here would make sense?
Also, for consistency with the u32 types used elsewhere, use U32_MAX rather
than UINT_MAX.
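As a rough illustration of that suggestion, here is a userspace sketch of
the check with a warning added. It is not the kernel code: scale_entry() is
a hypothetical stand-in for the first half of hmat_normalize(), UINT32_MAX
stands in for U32_MAX, and fprintf() stands in for pr_warn(). The message
text is made up.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the first half of hmat_normalize().
 * In the kernel the types would be u16/u64/u32, the limit U32_MAX
 * and the message a pr_warn().
 */
static uint32_t scale_entry(uint16_t entry, uint64_t base)
{
	/* Same early-out as the patch: nothing usable in the table */
	if (entry == 0 || entry == 0xffff)
		return 0;

	/*
	 * The value may be perfectly valid; it just does not fit in the
	 * 32-bit type used here, so say so instead of dropping it silently.
	 */
	if (base > UINT32_MAX / entry) {
		fprintf(stderr,
			"HMAT: entry %u * base %llu exceeds 32 bits, ignoring\n",
			(unsigned int)entry, (unsigned long long)base);
		return 0;
	}

	return (uint32_t)(entry * base);
}
```

With this shape a firmware that hits the limit at least leaves a trace in
the log, which makes the "fix it later if someone runs into it" approach
easier to live with.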
> + return 0;
> +
> + /*
> + * Divide by the base unit for version 1, convert latency from
> + * picosenonds to nanoseconds if revision 2.
> + */
> + value = entry * base;
> + if (hmat_revision == 1) {
> + if (value < 10)
> + return 0;
> + value = DIV_ROUND_UP(value, 10);
> + } else if (hmat_revision == 2) {
> + switch (type) {
> + case ACPI_HMAT_ACCESS_LATENCY:
> + case ACPI_HMAT_READ_LATENCY:
> + case ACPI_HMAT_WRITE_LATENCY:
> + value = DIV_ROUND_UP(value, 1000);
In this case our use of a u32 for the intermediate product caps the
picosecond value at U32_MAX before the divide by 1000, so the result ends up
limited to way below the full u32 range. Perhaps do the computation locally
in a wider type (u64), then clamp down at the end?
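A minimal userspace sketch of what that could look like, assuming a 64-bit
intermediate and a clamp to the 32-bit maximum at the end. The helper name
latency_ps_to_ns() is hypothetical, and whether to clamp or to return 0 on
overflow is a policy choice this sketch simply picks one side of.

```c
#include <stdint.h>

/*
 * Hypothetical helper for the revision 2 latency case: the table value is
 * entry * base in picoseconds, the result is in nanoseconds.
 */
static uint32_t latency_ps_to_ns(uint16_t entry, uint64_t base)
{
	uint64_t value;

	/* Same early-out as the patch */
	if (entry == 0 || entry == 0xffff)
		return 0;

	/* Even the 64-bit product can wrap for an absurd base unit */
	if (base > UINT64_MAX / entry)
		return 0;

	value = (uint64_t)entry * base;

	/* Round up to nanoseconds without an addition that could overflow */
	value = value / 1000 + (value % 1000 != 0);

	/* Only now narrow to 32 bits, clamping rather than wrapping */
	return value > UINT32_MAX ? UINT32_MAX : (uint32_t)value;
}
```

For example, entry 5000 with a base unit of 1000000 is 5000000000 ps. That
does not fit in 32 bits, so the patch as posted would report 0, whereas the
sketch reports 5000000 nsec.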
> + break;
> + default:
> + break;
> + }
> + }
> + return value;
> +}
> +
> +static __init int hmat_parse_locality(union acpi_subtable_headers *header,
> + const unsigned long end)
> +{
> + struct acpi_hmat_locality *hmat_loc = (void *)header;
> + unsigned int init, targ, total_size, ipds, tpds;
> + u32 *inits, *targs, value;
> + u16 *entries;
> + u8 type;
> +
> + if (hmat_loc->header.length < sizeof(*hmat_loc)) {
> + pr_notice("HMAT: Unexpected locality header length: %d\n",
> + hmat_loc->header.length);
> + return -EINVAL;
> + }
> +
> + type = hmat_loc->data_type;
> + ipds = hmat_loc->number_of_initiator_Pds;
> + tpds = hmat_loc->number_of_target_Pds;
> + total_size = sizeof(*hmat_loc) + sizeof(*entries) * ipds * tpds +
> + sizeof(*inits) * ipds + sizeof(*targs) * tpds;
> + if (hmat_loc->header.length < total_size) {
Perhaps overly paranoid, but could we make that a != to catch potentially
broken tables?
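To make the idea concrete, a userspace sketch of a strict length check
follows. locality_length_ok() and its hdr_size parameter are hypothetical:
hdr_size stands in for sizeof(struct acpi_hmat_locality). The size is
computed in 64 bits so that large domain counts cannot wrap it, which goes
a little beyond the != itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 only if the subtable length is exactly
 * header + initiator list + target list + entry matrix.
 */
static int locality_length_ok(uint32_t length, size_t hdr_size,
			      uint32_t ipds, uint32_t tpds)
{
	uint64_t expected;

	/*
	 * Each listed domain takes 4 bytes, so counts this large can never
	 * match. This also bounds the multiplication below.
	 */
	if (ipds > length / 4 || tpds > length / 4)
		return 0;

	expected = (uint64_t)hdr_size +
		   sizeof(uint32_t) * (uint64_t)ipds +
		   sizeof(uint32_t) * (uint64_t)tpds +
		   sizeof(uint16_t) * (uint64_t)ipds * tpds;

	/* Strict match: a longer subtable is treated as broken too */
	return length == expected;
}
```

With an assumed 32 byte header, 2 initiators and 3 targets, only a length of
64 passes. The strict form would also reject tables that carry trailing
padding, so it is only appropriate if such tables are not expected.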
> + pr_notice("HMAT: Unexpected locality header length:%d, minimum required:%d\n",
> + hmat_loc->header.length, total_size);
> + return -EINVAL;
> + }
> +
> + pr_info("HMAT: Locality: Flags:%02x Type:%s Initiator Domains:%d Target Domains:%d Base:%lld\n",
> + hmat_loc->flags, hmat_data_type(type), ipds, tpds,
> + hmat_loc->entry_base_unit);
> +
> + inits = (u32 *)(hmat_loc + 1);
> + targs = inits + ipds;
> + entries = (u16 *)(targs + tpds);
> + for (init = 0; init < ipds; init++) {
> + for (targ = 0; targ < tpds; targ++) {
> + value = hmat_normalize(entries[init * tpds + targ],
> + hmat_loc->entry_base_unit,
> + type);
> + pr_info(" Initiator-Target[%d-%d]:%d%s\n",
> + inits[init], targs[targ], value,
> + hmat_data_type_suffix(type));
> + }
> + }
> +
> + return 0;
> +}
> +
> +static __init int hmat_parse_cache(union acpi_subtable_headers *header,
> + const unsigned long end)
> +{
> + struct acpi_hmat_cache *cache = (void *)header;
> + u32 attrs;
> +
> + if (cache->header.length < sizeof(*cache)) {
> + pr_notice("HMAT: Unexpected cache header length: %d\n",
> + cache->header.length);
> + return -EINVAL;
> + }
> +
> + attrs = cache->cache_attributes;
> + pr_info("HMAT: Cache: Domain:%d Size:%llu Attrs:%08x SMBIOS Handles:%d\n",
> + cache->memory_PD, cache->cache_size, attrs,
> + cache->number_of_SMBIOShandles);
> +
> + return 0;
> +}
> +
> +static int __init hmat_parse_address_range(union acpi_subtable_headers *header,
> + const unsigned long end)
> +{
> + struct acpi_hmat_proximity_domain *p = (void *)header;
> + struct memory_target *target = NULL;
> +
> + if (p->header.length != sizeof(*p)) {
> + pr_notice("HMAT: Unexpected address range header length: %d\n",
> + p->header.length);
> + return -EINVAL;
> + }
> +
> + if (hmat_revision == 1)
> + pr_info("HMAT: Memory (%#llx length %#llx) Flags:%04x Processor Domain:%d Memory Domain:%d\n",
> + p->reserved3, p->reserved4, p->flags, p->processor_PD,
> + p->memory_PD);
> + else
> + pr_info("HMAT: Memory Flags:%04x Processor Domain:%d Memory Domain:%d\n",
> + p->flags, p->processor_PD, p->memory_PD);
> +
> + return 0;
> +}
> +
> +static int __init hmat_parse_subtable(union acpi_subtable_headers *header,
> + const unsigned long end)
> +{
> + struct acpi_hmat_structure *hdr = (void *)header;
> +
> + if (!hdr)
> + return -EINVAL;
> +
> + switch (hdr->type) {
> + case ACPI_HMAT_TYPE_ADDRESS_RANGE:
> + return hmat_parse_address_range(header, end);
> + case ACPI_HMAT_TYPE_LOCALITY:
> + return hmat_parse_locality(header, end);
> + case ACPI_HMAT_TYPE_CACHE:
> + return hmat_parse_cache(header, end);
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +static __init int hmat_init(void)
> +{
> + struct acpi_table_header *tbl;
> + enum acpi_hmat_type i;
> + acpi_status status;
> +
> + if (srat_disabled())
> + return 0;
> +
> + status = acpi_get_table(ACPI_SIG_HMAT, 0, &tbl);
> + if (ACPI_FAILURE(status))
> + return 0;
> +
> + hmat_revision = tbl->revision;
> + switch (hmat_revision) {
> + case 1:
> + case 2:
> + break;
> + default:
> + pr_notice("Ignoring HMAT: Unknown revision:%d\n", hmat_revision);
> + goto out_put;
> + }
> +
> + for (i = ACPI_HMAT_TYPE_ADDRESS_RANGE; i < ACPI_HMAT_TYPE_RESERVED; i++) {
> + if (acpi_table_parse_entries(ACPI_SIG_HMAT,
> + sizeof(struct acpi_table_hmat), i,
> + hmat_parse_subtable, 0) < 0) {
> + pr_notice("Ignoring HMAT: Invalid table");
> + goto out_put;
> + }
> + }
> +out_put:
> + acpi_put_table(tbl);
> + return 0;
> +}
> +subsys_initcall(hmat_init);