* Re: [Devel] [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support
  2017-12-15  0:52 ` Rafael J. Wysocki
@ 2017-12-15 20:53 ` Ross Zwisler
  -1 siblings, 0 replies; 153+ messages in thread
From: Ross Zwisler @ 2017-12-15 20:53 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 1130 bytes --]

On Fri, Dec 15, 2017 at 01:52:03AM +0100, Rafael J. Wysocki wrote:
> > diff --git a/drivers/acpi/hmat/core.c b/drivers/acpi/hmat/core.c
> > new file mode 100644
> > index 000000000000..61b90dadf84b
> > --- /dev/null
> > +++ b/drivers/acpi/hmat/core.c
> > @@ -0,0 +1,536 @@
> > +/*
> > + * Heterogeneous Memory Attributes Table (HMAT) representation in sysfs
> > + *
> > + * Copyright (c) 2017, Intel Corporation.
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> > + * more details.
> > + */
> 
> Minor nit for starters: you should use SPDX license identifiers in
> new files and if you do so, the license boilerplate is not necessary
> any more.

Okay, I'll fix that up.
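For reference, with an SPDX identifier the whole boilerplate block collapses to a single line. A sketch of what the top of core.c could look like (the `//` comment style follows the kernel's SPDX convention for .c files):

```c
// SPDX-License-Identifier: GPL-2.0
/*
 * Heterogeneous Memory Attributes Table (HMAT) representation in sysfs
 *
 * Copyright (c) 2017, Intel Corporation.
 */
```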

^ permalink raw reply	[flat|nested] 153+ messages in thread
* Re: [Devel] [PATCH v3 0/3] create sysfs representation of ACPI HMAT
@ 2017-12-23  1:14 ` Rafael J. Wysocki
  0 siblings, 0 replies; 153+ messages in thread
From: Rafael J. Wysocki @ 2017-12-23  1:14 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 3545 bytes --]

On Sat, Dec 23, 2017 at 12:57 AM, Dan Williams <dan.j.williams(a)intel.com> wrote:
> On Fri, Dec 22, 2017 at 3:22 PM, Ross Zwisler
> <ross.zwisler(a)linux.intel.com> wrote:
>> On Fri, Dec 22, 2017 at 02:53:42PM -0800, Dan Williams wrote:
>>> On Thu, Dec 21, 2017 at 12:31 PM, Brice Goglin <brice.goglin(a)gmail.com> wrote:
>>> > On 20/12/2017 at 23:41, Ross Zwisler wrote:
>>> [..]
>>> > Hello
>>> >
>>> > I can confirm that HPC runtimes are going to use these patches (at least
>>> > all runtimes that use hwloc for topology discovery, but that's the vast
>>> > majority of HPC anyway).
>>> >
>>> > We really didn't like KNL exposing a hacky SLIT table [1]. We had to
>>> > explicitly detect that specific crazy table to find out which NUMA nodes
>>> > were local to which cores, and to find out which NUMA nodes were
>>> > HBM/MCDRAM or DDR. And then we had to hide the SLIT values from the
>>> > application because the reported latencies didn't match reality. Quite
>>> > annoying.
>>> >
>>> > With Ross' patches, we can easily get what we need:
>>> > * which NUMA nodes are local to which CPUs? /sys/devices/system/node/
>>> > can only report a single local node per CPU (doesn't work for KNL and
>>> > upcoming architectures with HBM+DDR+...)
>>> > * which NUMA nodes are slow/fast (for both bandwidth and latency)
>>> > And we can still look at SLIT under /sys/devices/system/node if really
>>> > needed.
>>> >
>>> > And of course having this in sysfs is much better than parsing ACPI
>>> > tables that are only accessible to root :)
>>>
>>> On this point, it's not clear to me that we should allow these sysfs
>>> entries to be world readable. Given /proc/iomem now hides physical
>>> address information from non-root we at least need to be careful not
>>> to undo that with new sysfs HMAT attributes.
>>
>> This enabling does not expose any physical addresses to userspace.  It only
>> provides performance numbers from the HMAT and associates them with existing
>> NUMA nodes.  Are you worried that exposing performance numbers to non-root
>> users via sysfs poses a security risk?
>
> It's an information disclosure that's not clear we need to make to
> non-root processes.
>
> I'm more worried about userspace growing dependencies on the absolute
> numbers when those numbers can change from platform to platform.
> Differentiated memory on one platform may be the common memory pool on
> another.
>
> To me this has parallels with storage device hinting where
> specifications like T10 have a complex enumeration of all the
> performance hints that can be passed to the device, but the Linux
> enabling effort aims for a sanitized set of relative hints that make
> sense. It's more flexible if userspace specifies a relative intent
> rather than an absolute performance target. Putting all the HMAT
> information into sysfs gives userspace more information than it could
> possibly do anything reasonable with, at least outside of specialized
> apps that are hand-tuned for a given hardware platform.

That's a valid point IMO.

It is sort of tempting to expose everything to user space verbatim,
especially early in the enabling process when the kernel has not yet
found suitable ways to utilize the given information, but the very act
of exposing it may affect what can be done with it in the future.

User space interfaces need to stay around and be supported forever, at
least potentially, so adding every one of them is a serious
commitment.

Thanks,
Rafael

* Re: [Devel] [PATCH v3 0/3] create sysfs representation of ACPI HMAT
  2017-12-21  1:41             ` Elliott, Robert (Persistent Memory)
@ 2017-12-22 21:46 ` Ross Zwisler
  -1 siblings, 0 replies; 153+ messages in thread
From: Ross Zwisler @ 2017-12-22 21:46 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 3796 bytes --]

On Thu, Dec 21, 2017 at 01:41:15AM +0000, Elliott, Robert (Persistent Memory) wrote:
> 
> 
> > -----Original Message-----
> > From: Linux-nvdimm [mailto:linux-nvdimm-bounces(a)lists.01.org] On Behalf Of
> > Ross Zwisler
> ...
> > 
> > On Wed, Dec 20, 2017 at 10:19:37AM -0800, Matthew Wilcox wrote:
> ...
> > > initiator is a CPU?  I'd have expected you to expose a memory controller
> > > abstraction rather than re-use storage terminology.
> > 
> > Yea, I agree that at first blush it seems weird.  It turns out that
> > looking at it in sort of a storage initiator/target way is beneficial,
> > though, because it allows us to cut down on the number of data values
> > we need to represent.
> > 
> > For example the SLIT, which doesn't differentiate between initiator and
> > target proximity domains (and thus nodes) always represents a system
> > with N proximity domains using a NxN distance table.  This makes sense
> > if every node contains both CPUs and memory.
> > 
> > With the introduction of the HMAT, though, we can have memory-only
> > nodes and we can explicitly associate them with their local
> > CPU.  This is necessary so that we can separate memory with different
> > performance characteristics (HBM vs normal memory vs persistent memory,
> > for example) that are all attached to the same CPU.
> > 
> > So, say we now have a system with 4 CPUs, and each of those CPUs has 3
> > different types of memory attached to it.  We now have 16 total proximity
> > domains, 4 CPU and 12 memory.
> 
> The CPU cores that make up a node can have performance restrictions of
> their own; for example, they might max out at 10 GB/s even though the
> memory controller supports 120 GB/s (meaning you need to use 12 cores
> on the node to fully exercise memory).  It'd be helpful to report this,
> so software can decide how many cores to use for bandwidth-intensive work.
> 
> > If we represent this with the SLIT we end up with a 16 X 16 distance table
> > (256 entries), most of which don't matter because they are memory-to-
> > memory distances which don't make sense.
> > 
> > In the HMAT, though, we separate out the initiators and the targets and
> > put them into separate lists.  (See 5.2.27.4 System Locality Latency and
> > Bandwidth Information Structure in ACPI 6.2 for details.)  So, this same
> > config in the HMAT only has 4*12=48 performance values of each type, all
> > of which convey meaningful information.
> > 
> > The HMAT indeed even uses the storage "initiator" and "target"
> > terminology. :)
> 
> Centralized DMA engines (e.g., as used by the "DMA based blk-mq pmem
> driver") have performance differences too.  A CPU might include
> CPU cores that reach 10 GB/s, DMA engines that reach 60 GB/s, and
> memory controllers that reach 120 GB/s.  I guess these would be
> represented as extra initiators on the node?

For both of your comments I think all of this comes down to how you want to
represent your platform in the HMAT.  The sysfs representation just shows you
what is in the HMAT.

Each initiator node is just a single NUMA node (think of it as a NUMA node
which has the characteristic that it can initiate memory requests), so I don't
think there is a way to have "extra initiators on the node".  I think what
you're talking about is separating the DMA engines and CPU cores into separate
NUMA nodes, both of which are initiators.  I think this is probably fine as it
conveys useful info.

I don't think the HMAT has a concept of bandwidth scaling with the number of
CPU cores used - it just has a single bandwidth number (well, one for read and
one for write) per initiator/target pair.  I don't think we want to add this,
either - the HMAT is already very complex.

* Re: [Devel] [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support
  2017-12-14  2:10 ` Ross Zwisler
@ 2017-12-15  0:52 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 153+ messages in thread
From: Rafael J. Wysocki @ 2017-12-15  0:52 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 6608 bytes --]

On Thu, Dec 14, 2017 at 3:10 AM, Ross Zwisler
<ross.zwisler(a)linux.intel.com> wrote:
> Add a new sysfs subsystem, /sys/devices/system/hmat, which surfaces
> information about memory initiators and memory targets to the user.  These
> initiators and targets are described by the ACPI SRAT and HMAT tables.
>
> A "memory initiator" in this case is a NUMA node containing one or more
> devices such as CPUs or separate memory I/O devices that can initiate
> memory requests.  A "memory target" is a NUMA node containing at least
> one CPU-accessible physical address range.
>
> The key piece of information surfaced by this patch is the mapping between
> the ACPI table "proximity domain" numbers, held in the "firmware_id"
> attribute, and Linux NUMA node numbers.  Every ACPI proximity domain will
> end up being a unique NUMA node in Linux, but the numbers may get reordered
> and Linux can create extra NUMA nodes that don't map back to ACPI proximity
> domains.  The firmware_id value is needed if anyone ever wants to look at
> the ACPI HMAT and SRAT tables directly and make sense of how they map to
> NUMA nodes in Linux.
>
> Initiators are found at /sys/devices/system/hmat/mem_initX, and the
> attributes for a given initiator look like this:
>
>   # tree mem_init0
>   mem_init0
>   ├── firmware_id
>   ├── node0 -> ../../node/node0
>   ├── power
>   │   ├── async
>   │   ...
>   ├── subsystem -> ../../../../bus/hmat
>   └── uevent
>
> Where "mem_init0" on my system represents the CPU acting as a memory
> initiator at NUMA node 0.  Users can discover which CPUs are part of this
> memory initiator by following the node0 symlink and looking at cpumap,
> cpulist and the cpu* symlinks.
>
> Targets are found at /sys/devices/system/hmat/mem_tgtX, and the attributes
> for a given target look like this:
>
>   # tree mem_tgt2
>   mem_tgt2
>   ├── firmware_id
>   ├── is_cached
>   ├── node2 -> ../../node/node2
>   ├── power
>   │   ├── async
>   │   ...
>   ├── subsystem -> ../../../../bus/hmat
>   └── uevent
>
> Users can discover information about the memory owned by this memory target
> by following the node2 symlink and looking at meminfo, vmstat and at the
> memory* memory section symlinks.
>
> Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
> ---
>  MAINTAINERS                   |   6 +
>  drivers/acpi/Kconfig          |   1 +
>  drivers/acpi/Makefile         |   1 +
>  drivers/acpi/hmat/Kconfig     |   7 +
>  drivers/acpi/hmat/Makefile    |   2 +
>  drivers/acpi/hmat/core.c      | 536 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/acpi/hmat/hmat.h      |  47 ++++
>  drivers/acpi/hmat/initiator.c |  43 ++++
>  drivers/acpi/hmat/target.c    |  55 +++++
>  9 files changed, 698 insertions(+)
>  create mode 100644 drivers/acpi/hmat/Kconfig
>  create mode 100644 drivers/acpi/hmat/Makefile
>  create mode 100644 drivers/acpi/hmat/core.c
>  create mode 100644 drivers/acpi/hmat/hmat.h
>  create mode 100644 drivers/acpi/hmat/initiator.c
>  create mode 100644 drivers/acpi/hmat/target.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 82ad0eabce4f..64ebec0708de 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6366,6 +6366,12 @@ S:       Supported
>  F:     drivers/scsi/hisi_sas/
>  F:     Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
>
> +HMAT - ACPI Heterogeneous Memory Attribute Table Support
> +M:     Ross Zwisler <ross.zwisler(a)linux.intel.com>
> +L:     linux-mm(a)kvack.org
> +S:     Supported
> +F:     drivers/acpi/hmat/
> +
>  HMM - Heterogeneous Memory Management
>  M:     Jérôme Glisse <jglisse(a)redhat.com>
>  L:     linux-mm(a)kvack.org
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 46505396869e..21cdd1288430 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -466,6 +466,7 @@ config ACPI_REDUCED_HARDWARE_ONLY
>           If you are unsure what to do, do not enable this option.
>
>  source "drivers/acpi/nfit/Kconfig"
> +source "drivers/acpi/hmat/Kconfig"
>
>  source "drivers/acpi/apei/Kconfig"
>  source "drivers/acpi/dptf/Kconfig"
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index 41954a601989..ed5eab6b0412 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -75,6 +75,7 @@ obj-$(CONFIG_ACPI_PROCESSOR)  += processor.o
>  obj-$(CONFIG_ACPI)             += container.o
>  obj-$(CONFIG_ACPI_THERMAL)     += thermal.o
>  obj-$(CONFIG_ACPI_NFIT)                += nfit/
> +obj-$(CONFIG_ACPI_HMAT)                += hmat/
>  obj-$(CONFIG_ACPI)             += acpi_memhotplug.o
>  obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
>  obj-$(CONFIG_ACPI_BATTERY)     += battery.o
> diff --git a/drivers/acpi/hmat/Kconfig b/drivers/acpi/hmat/Kconfig
> new file mode 100644
> index 000000000000..954ad4701005
> --- /dev/null
> +++ b/drivers/acpi/hmat/Kconfig
> @@ -0,0 +1,7 @@
> +config ACPI_HMAT
> +       bool "ACPI Heterogeneous Memory Attribute Table Support"
> +       depends on ACPI_NUMA
> +       depends on SYSFS
> +       help
> +         Exports a sysfs representation of the ACPI Heterogeneous Memory
> +         Attributes Table (HMAT).
> diff --git a/drivers/acpi/hmat/Makefile b/drivers/acpi/hmat/Makefile
> new file mode 100644
> index 000000000000..edf4bcb1c97d
> --- /dev/null
> +++ b/drivers/acpi/hmat/Makefile
> @@ -0,0 +1,2 @@
> +obj-$(CONFIG_ACPI_HMAT) := hmat.o
> +hmat-y := core.o initiator.o target.o
> diff --git a/drivers/acpi/hmat/core.c b/drivers/acpi/hmat/core.c
> new file mode 100644
> index 000000000000..61b90dadf84b
> --- /dev/null
> +++ b/drivers/acpi/hmat/core.c
> @@ -0,0 +1,536 @@
> +/*
> + * Heterogeneous Memory Attributes Table (HMAT) representation in sysfs
> + *
> + * Copyright (c) 2017, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */

Minor nit for starters: you should use SPDX license identifiers in
new files and if you do so, the license boilerplate is not necessary
any more.

Thanks,
Rafael

* Re: [Devel] [PATCH v3 1/3] acpi: HMAT support in acpi_parse_entries_array()
  2017-12-14  2:10   ` Ross Zwisler
@ 2017-12-15  0:49 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 153+ messages in thread
From: Rafael J. Wysocki @ 2017-12-15  0:49 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 723 bytes --]

On Thu, Dec 14, 2017 at 3:10 AM, Ross Zwisler
<ross.zwisler(a)linux.intel.com> wrote:
> The current implementation of acpi_parse_entries_array() assumes that each
> subtable has a standard ACPI subtable entry of type struct
> acpi_subtable_header.  This standard subtable header has a one byte type
> followed by a one byte length.
>
> The HMAT subtables have to allow for a longer length so they have subtable
> headers of type struct acpi_hmat_structure which has a 2 byte type and a 4
> byte length.
>
> Enhance the subtable parsing in acpi_parse_entries_array() so that it can
> handle these new HMAT subtables.
>
> Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>

This one is fine by me.

* [Devel] [PATCH v3 3/3] hmat: add performance attributes
@ 2017-12-14  2:10 ` Ross Zwisler
  0 siblings, 0 replies; 153+ messages in thread
From: Ross Zwisler @ 2017-12-14  2:10 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 16704 bytes --]

Add performance information found in the HMAT to the sysfs representation.
This information lives as an attribute group named "local_init" in the
memory target:

  # tree mem_tgt2
  mem_tgt2
  ├── firmware_id
  ├── is_cached
  ├── local_init
  │   ├── mem_init0 -> ../../mem_init0
  │   ├── mem_init1 -> ../../mem_init1
  │   ├── read_bw_MBps
  │   ├── read_lat_nsec
  │   ├── write_bw_MBps
  │   └── write_lat_nsec
  ├── node2 -> ../../node/node2
  ├── power
  │   ├── async
  │   ...
  ├── subsystem -> ../../../../bus/hmat
  └── uevent

This attribute group surfaces latency and bandwidth performance for a
memory target and its local initiators.  For example:

  # grep . mem_tgt2/local_init/* 2>/dev/null
  mem_tgt2/local_init/read_bw_MBps:30720
  mem_tgt2/local_init/read_lat_nsec:100
  mem_tgt2/local_init/write_bw_MBps:30720
  mem_tgt2/local_init/write_lat_nsec:100

The initiators also have a symlink to their local targets:

  # ls -l mem_init0/mem_tgt2
  lrwxrwxrwx. 1 root root 0 Dec 13 16:45 mem_init0/mem_tgt2 -> ../mem_tgt2

We create performance attribute groups only for local (initiator,target)
pairings, where the first local initiator for a given target is defined by
the "Processor Proximity Domain" field in the HMAT's Memory Subsystem
Address Range Structure table.  Once we have that local initiator we scan
the performance data and link any other initiators with the same local
performance to the given memory target.

A given target only has one set of local performance values, so each target
will have at most one "local_init" attribute group, though that group can
contain links to multiple initiators that all have local performance.  A
given memory initiator may have multiple local memory targets, so multiple
"mem_tgtX" links may exist for a given initiator.

If a given memory target is cached we give performance numbers only for the
media itself, and rely on the "is_cached" attribute to represent the
fact that there is a caching layer.

We only expose a subset of the performance information presented in the
HMAT via sysfs as a compromise, driven by the fact that these usages will
be the highest performing and because representing all possible paths
could cause an unmanageable explosion of sysfs entries.

If we dump everything from the HMAT into sysfs we end up with
O(num_targets * num_initiators * num_caching_levels) attributes.  Each of
these attributes only takes up 2 bytes in a System Locality Latency and
Bandwidth Information Structure, but if we have to create a directory entry
for each it becomes much more expensive.

For example, very large systems today can have on the order of thousands of
NUMA nodes.  Say we have a system which used to have 1,000 NUMA nodes that
each had both a CPU and local memory.  The HMAT allows us to separate the
CPUs and memory into separate NUMA nodes, so we can end up with 1,000 CPU
initiator NUMA nodes and 1,000 memory target NUMA nodes.  If we represented
the performance information for each possible CPU/memory pair in sysfs we
would end up with 1,000,000 attribute groups.

This is a lot to pass in a set of packed data tables, but I think we'll
break sysfs if we try to create millions of attributes, regardless of how
we nest them in a directory hierarchy.

By only representing performance information for local (initiator,target)
pairings, we reduce the number of sysfs entries to O(num_targets).

Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
---
 drivers/acpi/hmat/Makefile          |   2 +-
 drivers/acpi/hmat/core.c            | 263 +++++++++++++++++++++++++++++++++++-
 drivers/acpi/hmat/hmat.h            |  17 +++
 drivers/acpi/hmat/perf_attributes.c |  56 ++++++++
 4 files changed, 336 insertions(+), 2 deletions(-)
 create mode 100644 drivers/acpi/hmat/perf_attributes.c

diff --git a/drivers/acpi/hmat/Makefile b/drivers/acpi/hmat/Makefile
index edf4bcb1c97d..b5d1d83684da 100644
--- a/drivers/acpi/hmat/Makefile
+++ b/drivers/acpi/hmat/Makefile
@@ -1,2 +1,2 @@
 obj-$(CONFIG_ACPI_HMAT) := hmat.o
-hmat-y := core.o initiator.o target.o
+hmat-y := core.o initiator.o target.o perf_attributes.o
diff --git a/drivers/acpi/hmat/core.c b/drivers/acpi/hmat/core.c
index 61b90dadf84b..89e84658fc73 100644
--- a/drivers/acpi/hmat/core.c
+++ b/drivers/acpi/hmat/core.c
@@ -23,11 +23,225 @@
 #include <linux/slab.h>
 #include "hmat.h"
 
+#define NO_VALUE	-1
+#define LOCAL_INIT	"local_init"
+
 static LIST_HEAD(target_list);
 static LIST_HEAD(initiator_list);
+LIST_HEAD(locality_list);
 
 static bool bad_hmat;
 
+/* Performance attributes for an initiator/target pair. */
+static int get_performance_data(u32 init_pxm, u32 tgt_pxm,
+		struct acpi_hmat_locality *hmat_loc)
+{
+	int num_init = hmat_loc->number_of_initiator_Pds;
+	int num_tgt = hmat_loc->number_of_target_Pds;
+	int init_idx = NO_VALUE;
+	int tgt_idx = NO_VALUE;
+	u32 *initiators, *targets;
+	u16 *entries, val;
+	int i;
+
+	/* the initiator array is after the struct acpi_hmat_locality fields */
+	initiators = (u32 *)(hmat_loc + 1);
+	targets = &initiators[num_init];
+	entries = (u16 *)&targets[num_tgt];
+
+	for (i = 0; i < num_init; i++) {
+		if (initiators[i] == init_pxm) {
+			init_idx = i;
+			break;
+		}
+	}
+
+	if (init_idx == NO_VALUE)
+		return NO_VALUE;
+
+	for (i = 0; i < num_tgt; i++) {
+		if (targets[i] == tgt_pxm) {
+			tgt_idx = i;
+			break;
+		}
+	}
+
+	if (tgt_idx == NO_VALUE)
+		return NO_VALUE;
+
+	val = entries[init_idx*num_tgt + tgt_idx];
+	if (val < 10 || val == 0xFFFF)
+		return NO_VALUE;
+
+	return (int)(val * hmat_loc->entry_base_unit) / 10;
+}
+
+/*
+ * 'direction' is either READ or WRITE
+ * Latency is reported in nanoseconds and bandwidth is reported in MB/s.
+ */
+static int hmat_get_attribute(int init_pxm, int tgt_pxm, int direction,
+		enum hmat_attr_type type)
+{
+	struct memory_locality *loc;
+	int value;
+
+	list_for_each_entry(loc, &locality_list, list) {
+		struct acpi_hmat_locality *hmat_loc = loc->hmat_loc;
+
+		if (direction == READ && type == LATENCY &&
+		    (hmat_loc->data_type == ACPI_HMAT_ACCESS_LATENCY ||
+		     hmat_loc->data_type == ACPI_HMAT_READ_LATENCY)) {
+			value = get_performance_data(init_pxm, tgt_pxm,
+					hmat_loc);
+			if (value != NO_VALUE)
+				return value;
+		}
+
+		if (direction == WRITE && type == LATENCY &&
+		    (hmat_loc->data_type == ACPI_HMAT_ACCESS_LATENCY ||
+		     hmat_loc->data_type == ACPI_HMAT_WRITE_LATENCY)) {
+			value = get_performance_data(init_pxm, tgt_pxm,
+					hmat_loc);
+			if (value != NO_VALUE)
+				return value;
+		}
+
+		if (direction == READ && type == BANDWIDTH &&
+		    (hmat_loc->data_type == ACPI_HMAT_ACCESS_BANDWIDTH ||
+		     hmat_loc->data_type == ACPI_HMAT_READ_BANDWIDTH)) {
+			value = get_performance_data(init_pxm, tgt_pxm,
+					hmat_loc);
+			if (value != NO_VALUE)
+				return value;
+		}
+
+		if (direction == WRITE && type == BANDWIDTH &&
+		    (hmat_loc->data_type == ACPI_HMAT_ACCESS_BANDWIDTH ||
+		     hmat_loc->data_type == ACPI_HMAT_WRITE_BANDWIDTH)) {
+			value = get_performance_data(init_pxm, tgt_pxm,
+					hmat_loc);
+			if (value != NO_VALUE)
+				return value;
+		}
+	}
+
+	return NO_VALUE;
+}
+
+/*
+ * 'direction' is either READ or WRITE
+ * Latency is reported in nanoseconds and bandwidth is reported in MB/s.
+ */
+int hmat_local_attribute(struct device *tgt_dev, int direction,
+		enum hmat_attr_type type)
+{
+	struct memory_target *tgt = to_memory_target(tgt_dev);
+	int tgt_pxm = tgt->ma->proximity_domain;
+	int init_pxm;
+
+	if (!tgt->local_init)
+		return NO_VALUE;
+
+	init_pxm = tgt->local_init->pxm;
+	return hmat_get_attribute(init_pxm, tgt_pxm, direction, type);
+}
+
+static bool is_local_init(int init_pxm, int tgt_pxm, int read_lat,
+		int write_lat, int read_bw, int write_bw)
+{
+	if (read_lat != hmat_get_attribute(init_pxm, tgt_pxm, READ, LATENCY))
+		return false;
+
+	if (write_lat != hmat_get_attribute(init_pxm, tgt_pxm, WRITE, LATENCY))
+		return false;
+
+	if (read_bw != hmat_get_attribute(init_pxm, tgt_pxm, READ, BANDWIDTH))
+		return false;
+
+	if (write_bw != hmat_get_attribute(init_pxm, tgt_pxm, WRITE, BANDWIDTH))
+		return false;
+
+	return true;
+}
+
+static const struct attribute_group performance_attribute_group = {
+	.attrs = performance_attributes,
+	.name = LOCAL_INIT,
+};
+
+static void remove_performance_attributes(struct memory_target *tgt)
+{
+	if (!tgt->local_init)
+		return;
+
+	/*
+	 * FIXME: Need to enhance the core sysfs code to remove all the links
+	 * in both the attribute group and in the device itself when those are
+	 * removed.
+	 */
+	sysfs_remove_group(&tgt->dev.kobj, &performance_attribute_group);
+}
+
+static int add_performance_attributes(struct memory_target *tgt)
+{
+	int read_lat, write_lat, read_bw, write_bw;
+	int tgt_pxm = tgt->ma->proximity_domain;
+	struct kobject *init_kobj, *tgt_kobj;
+	struct device *init_dev, *tgt_dev;
+	struct memory_initiator *init;
+	int ret;
+
+	if (!tgt->local_init)
+		return 0;
+
+	tgt_dev = &tgt->dev;
+	tgt_kobj = &tgt_dev->kobj;
+
+	/* Create entries for initiator/target pair in the target.  */
+	ret = sysfs_create_group(tgt_kobj, &performance_attribute_group);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Iterate through initiators and find all the ones that have the same
+	 * performance as the local initiator.
+	 */
+	read_lat = hmat_local_attribute(tgt_dev, READ, LATENCY);
+	write_lat = hmat_local_attribute(tgt_dev, WRITE, LATENCY);
+	read_bw = hmat_local_attribute(tgt_dev, READ, BANDWIDTH);
+	write_bw = hmat_local_attribute(tgt_dev, WRITE, BANDWIDTH);
+
+	list_for_each_entry(init, &initiator_list, list) {
+		init_dev = &init->dev;
+		init_kobj = &init_dev->kobj;
+
+		if (init == tgt->local_init ||
+			is_local_init(init->pxm, tgt_pxm, read_lat,
+				write_lat, read_bw, write_bw)) {
+			ret = sysfs_add_link_to_group(tgt_kobj, LOCAL_INIT,
+					init_kobj, dev_name(init_dev));
+			if (ret < 0)
+				goto err;
+
+			/*
+			 * Create a link in the local initiator to this
+			 * target.
+			 */
+			ret = sysfs_create_link(init_kobj, tgt_kobj,
+					kobject_name(tgt_kobj));
+			if (ret < 0)
+				goto err;
+		}
+
+	}
+	tgt->has_perf_attributes = true;
+	return 0;
+err:
+	remove_performance_attributes(tgt);
+	return ret;
+}
+
 static int link_node_for_kobj(unsigned int node, struct kobject *kobj)
 {
 	if (node_devices[node])
@@ -132,6 +346,9 @@ static void release_memory_target(struct device *dev)
 
 static void __init remove_memory_target(struct memory_target *tgt)
 {
+	if (tgt->has_perf_attributes)
+		remove_performance_attributes(tgt);
+
 	if (tgt->is_registered) {
 		remove_node_for_kobj(pxm_to_node(tgt->ma->proximity_domain),
 				&tgt->dev.kobj);
@@ -276,6 +493,38 @@ hmat_parse_address_range(struct acpi_subtable_header *header,
 	return -EINVAL;
 }
 
+static int __init hmat_parse_locality(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_hmat_locality *hmat_loc;
+	struct memory_locality *loc;
+
+	if (bad_hmat)
+		return 0;
+
+	hmat_loc = (struct acpi_hmat_locality *)header;
+	if (!hmat_loc) {
+		pr_err("HMAT: NULL table entry\n");
+		bad_hmat = true;
+		return -EINVAL;
+	}
+
+	/* We don't report cached performance information in sysfs. */
+	if (hmat_loc->flags == ACPI_HMAT_MEMORY ||
+			hmat_loc->flags == ACPI_HMAT_LAST_LEVEL_CACHE) {
+		loc = kzalloc(sizeof(*loc), GFP_KERNEL);
+		if (!loc) {
+			bad_hmat = true;
+			return -ENOMEM;
+		}
+
+		loc->hmat_loc = hmat_loc;
+		list_add_tail(&loc->list, &locality_list);
+	}
+
+	return 0;
+}
+
 static int __init hmat_parse_cache(struct acpi_subtable_header *header,
 		const unsigned long end)
 {
@@ -431,6 +680,7 @@ srat_parse_memory_affinity(struct acpi_subtable_header *header,
 static void hmat_cleanup(void)
 {
 	struct memory_initiator *init, *init_iter;
+	struct memory_locality *loc, *loc_iter;
 	struct memory_target *tgt, *tgt_iter;
 
 	list_for_each_entry_safe(tgt, tgt_iter, &target_list, list)
@@ -438,6 +688,11 @@ static void hmat_cleanup(void)
 
 	list_for_each_entry_safe(init, init_iter, &initiator_list, list)
 		remove_memory_initiator(init);
+
+	list_for_each_entry_safe(loc, loc_iter, &locality_list, list) {
+		list_del(&loc->list);
+		kfree(loc);
+	}
 }
 
 static int __init hmat_init(void)
@@ -488,13 +743,15 @@ static int __init hmat_init(void)
 	}
 
 	if (!acpi_table_parse(ACPI_SIG_HMAT, hmat_noop_parse)) {
-		struct acpi_subtable_proc hmat_proc[2];
+		struct acpi_subtable_proc hmat_proc[3];
 
 		memset(hmat_proc, 0, sizeof(hmat_proc));
 		hmat_proc[0].id = ACPI_HMAT_TYPE_ADDRESS_RANGE;
 		hmat_proc[0].handler = hmat_parse_address_range;
 		hmat_proc[1].id = ACPI_HMAT_TYPE_CACHE;
 		hmat_proc[1].handler = hmat_parse_cache;
+		hmat_proc[2].id = ACPI_HMAT_TYPE_LOCALITY;
+		hmat_proc[2].handler = hmat_parse_locality;
 
 		acpi_table_parse_entries_array(ACPI_SIG_HMAT,
 					sizeof(struct acpi_table_hmat),
@@ -516,6 +773,10 @@ static int __init hmat_init(void)
 		ret = register_memory_target(tgt);
 		if (ret)
 			goto err;
+
+		ret = add_performance_attributes(tgt);
+		if (ret)
+			goto err;
 	}
 
 	return 0;
diff --git a/drivers/acpi/hmat/hmat.h b/drivers/acpi/hmat/hmat.h
index 108aad1f8ad7..89200f5c4b38 100644
--- a/drivers/acpi/hmat/hmat.h
+++ b/drivers/acpi/hmat/hmat.h
@@ -16,6 +16,11 @@
 #ifndef _ACPI_HMAT_H_
 #define _ACPI_HMAT_H_
 
+enum hmat_attr_type {
+	LATENCY,
+	BANDWIDTH,
+};
+
 struct memory_initiator {
 	struct list_head list;
 	struct device dev;
@@ -39,9 +44,21 @@ struct memory_target {
 
 	bool is_cached;
 	bool is_registered;
+	bool has_perf_attributes;
 };
 #define to_memory_target(d) container_of((d), struct memory_target, dev)
 
+struct memory_locality {
+	struct list_head list;
+	struct acpi_hmat_locality *hmat_loc;
+};
+
 extern const struct attribute_group *memory_initiator_attribute_groups[];
 extern const struct attribute_group *memory_target_attribute_groups[];
+extern struct attribute *performance_attributes[];
+
+extern struct list_head locality_list;
+
+int hmat_local_attribute(struct device *tgt_dev, int direction,
+		enum hmat_attr_type type);
 #endif /* _ACPI_HMAT_H_ */
diff --git a/drivers/acpi/hmat/perf_attributes.c b/drivers/acpi/hmat/perf_attributes.c
new file mode 100644
index 000000000000..60f107b58822
--- /dev/null
+++ b/drivers/acpi/hmat/perf_attributes.c
@@ -0,0 +1,56 @@
+/*
+ * Heterogeneous Memory Attributes Table (HMAT) sysfs performance attributes
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/sysfs.h>
+#include "hmat.h"
+
+static ssize_t read_lat_nsec_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", hmat_local_attribute(dev, READ, LATENCY));
+}
+static DEVICE_ATTR_RO(read_lat_nsec);
+
+static ssize_t write_lat_nsec_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", hmat_local_attribute(dev, WRITE, LATENCY));
+}
+static DEVICE_ATTR_RO(write_lat_nsec);
+
+static ssize_t read_bw_MBps_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", hmat_local_attribute(dev, READ, BANDWIDTH));
+}
+static DEVICE_ATTR_RO(read_bw_MBps);
+
+static ssize_t write_bw_MBps_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n",
+			hmat_local_attribute(dev, WRITE, BANDWIDTH));
+}
+static DEVICE_ATTR_RO(write_bw_MBps);
+
+struct attribute *performance_attributes[] = {
+	&dev_attr_read_lat_nsec.attr,
+	&dev_attr_write_lat_nsec.attr,
+	&dev_attr_read_bw_MBps.attr,
+	&dev_attr_write_bw_MBps.attr,
+	NULL
+};
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 153+ messages in thread
* [Devel] [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support
@ 2017-12-14  2:10 ` Ross Zwisler
  0 siblings, 0 replies; 153+ messages in thread
From: Ross Zwisler @ 2017-12-14  2:10 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 24360 bytes --]

Add a new sysfs subsystem, /sys/devices/system/hmat, which surfaces
information about memory initiators and memory targets to the user.  These
initiators and targets are described by the ACPI SRAT and HMAT tables.

A "memory initiator" in this case is a NUMA node containing one or more
devices such as CPU or separate memory I/O devices that can initiate
memory requests.  A "memory target" is NUMA node containing at least one
CPU-accessible physical address range.

The key piece of information surfaced by this patch is the mapping between
the ACPI table "proximity domain" numbers, held in the "firmware_id"
attribute, and Linux NUMA node numbers.  Every ACPI proximity domain will
end up being a unique NUMA node in Linux, but the numbers may get reordered
and Linux can create extra NUMA nodes that don't map back to ACPI proximity
domains.  The firmware_id value is needed if anyone ever wants to look at
the ACPI HMAT and SRAT tables directly and make sense of how they map to
NUMA nodes in Linux.
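As an editorial illustration (not part of the patch), the firmware_id
attribute could be consumed from userspace roughly as follows.  The layout
mirrors the /sys/devices/system/hmat directories described above; the
temporary directory built here is a stand-in for the real sysfs tree, and
the helper name is hypothetical:

```python
import os
import re
import tempfile

def read_hmat_firmware_ids(hmat_root):
    """Map each mem_init*/mem_tgt* device to (NUMA node, ACPI proximity domain)."""
    mapping = {}
    for entry in sorted(os.listdir(hmat_root)):
        m = re.match(r"mem_(init|tgt)(\d+)", entry)
        if not m:
            continue
        node = int(m.group(2))  # the device id is the Linux NUMA node number
        with open(os.path.join(hmat_root, entry, "firmware_id")) as f:
            mapping[entry] = (node, int(f.read()))
    return mapping

# Build a fake sysfs tree mirroring the changelog's example devices.
root = tempfile.mkdtemp()
for name, fw_id in [("mem_init0", 0), ("mem_tgt2", 1)]:
    os.makedirs(os.path.join(root, name))
    with open(os.path.join(root, name, "firmware_id"), "w") as f:
        f.write(f"{fw_id}\n")

print(read_hmat_firmware_ids(root))
# -> {'mem_init0': (0, 0), 'mem_tgt2': (2, 1)}
```

This makes the point of the changelog concrete: mem_tgt2 is NUMA node 2 in
Linux, but its firmware_id (ACPI proximity domain) is 1.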

Initiators are found at /sys/devices/system/hmat/mem_initX, and the
attributes for a given initiator look like this:

  # tree mem_init0
  mem_init0
  ├── firmware_id
  ├── node0 -> ../../node/node0
  ├── power
  │   ├── async
  │   ...
  ├── subsystem -> ../../../../bus/hmat
  └── uevent

Where "mem_init0" on my system represents the CPU acting as a memory
initiator at NUMA node 0.  Users can discover which CPUs are part of this
memory initiator by following the node0 symlink and looking at cpumap,
cpulist and the cpu* symlinks.

Targets are found at /sys/devices/system/hmat/mem_tgtX, and the attributes
for a given target look like this:

  # tree mem_tgt2
  mem_tgt2
  ├── firmware_id
  ├── is_cached
  ├── node2 -> ../../node/node2
  ├── power
  │   ├── async
  │   ...
  ├── subsystem -> ../../../../bus/hmat
  └── uevent

Users can discover information about the memory owned by this memory target
by following the node2 symlink and looking at meminfo, vmstat and at the
memory* memory section symlinks.
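To sketch how a program might follow that nodeX symlink (again an
editorial illustration, not part of the patch, with a fake tree standing
in for the real sysfs hierarchy):

```python
import os
import tempfile

# Fake sysfs layout: .../node/node2 plus .../hmat/mem_tgt2 with a node2 symlink,
# matching the target directory listing in the changelog.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "node", "node2"))
with open(os.path.join(root, "node", "node2", "meminfo"), "w") as f:
    f.write("Node 2 MemTotal: 4194304 kB\n")
os.makedirs(os.path.join(root, "hmat", "mem_tgt2"))
os.symlink("../../node/node2", os.path.join(root, "hmat", "mem_tgt2", "node2"))

tgt = os.path.join(root, "hmat", "mem_tgt2")
# The lone nodeX entry is a symlink into the NUMA node directory.
node_link = next(e for e in os.listdir(tgt) if e.startswith("node"))
node_dir = os.path.realpath(os.path.join(tgt, node_link))
with open(os.path.join(node_dir, "meminfo")) as f:
    print(f.read().strip())
# -> Node 2 MemTotal: 4194304 kB
```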

Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
---
 MAINTAINERS                   |   6 +
 drivers/acpi/Kconfig          |   1 +
 drivers/acpi/Makefile         |   1 +
 drivers/acpi/hmat/Kconfig     |   7 +
 drivers/acpi/hmat/Makefile    |   2 +
 drivers/acpi/hmat/core.c      | 536 ++++++++++++++++++++++++++++++++++++++++++
 drivers/acpi/hmat/hmat.h      |  47 ++++
 drivers/acpi/hmat/initiator.c |  43 ++++
 drivers/acpi/hmat/target.c    |  55 +++++
 9 files changed, 698 insertions(+)
 create mode 100644 drivers/acpi/hmat/Kconfig
 create mode 100644 drivers/acpi/hmat/Makefile
 create mode 100644 drivers/acpi/hmat/core.c
 create mode 100644 drivers/acpi/hmat/hmat.h
 create mode 100644 drivers/acpi/hmat/initiator.c
 create mode 100644 drivers/acpi/hmat/target.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 82ad0eabce4f..64ebec0708de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6366,6 +6366,12 @@ S:	Supported
 F:	drivers/scsi/hisi_sas/
 F:	Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
 
+HMAT - ACPI Heterogeneous Memory Attribute Table Support
+M:	Ross Zwisler <ross.zwisler(a)linux.intel.com>
+L:	linux-mm(a)kvack.org
+S:	Supported
+F:	drivers/acpi/hmat/
+
 HMM - Heterogeneous Memory Management
 M:	Jérôme Glisse <jglisse(a)redhat.com>
 L:	linux-mm(a)kvack.org
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 46505396869e..21cdd1288430 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -466,6 +466,7 @@ config ACPI_REDUCED_HARDWARE_ONLY
 	  If you are unsure what to do, do not enable this option.
 
 source "drivers/acpi/nfit/Kconfig"
+source "drivers/acpi/hmat/Kconfig"
 
 source "drivers/acpi/apei/Kconfig"
 source "drivers/acpi/dptf/Kconfig"
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 41954a601989..ed5eab6b0412 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -75,6 +75,7 @@ obj-$(CONFIG_ACPI_PROCESSOR)	+= processor.o
 obj-$(CONFIG_ACPI)		+= container.o
 obj-$(CONFIG_ACPI_THERMAL)	+= thermal.o
 obj-$(CONFIG_ACPI_NFIT)		+= nfit/
+obj-$(CONFIG_ACPI_HMAT)		+= hmat/
 obj-$(CONFIG_ACPI)		+= acpi_memhotplug.o
 obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
 obj-$(CONFIG_ACPI_BATTERY)	+= battery.o
diff --git a/drivers/acpi/hmat/Kconfig b/drivers/acpi/hmat/Kconfig
new file mode 100644
index 000000000000..954ad4701005
--- /dev/null
+++ b/drivers/acpi/hmat/Kconfig
@@ -0,0 +1,7 @@
+config ACPI_HMAT
+	bool "ACPI Heterogeneous Memory Attribute Table Support"
+	depends on ACPI_NUMA
+	depends on SYSFS
+	help
+	  Exports a sysfs representation of the ACPI Heterogeneous Memory
+	  Attributes Table (HMAT).
diff --git a/drivers/acpi/hmat/Makefile b/drivers/acpi/hmat/Makefile
new file mode 100644
index 000000000000..edf4bcb1c97d
--- /dev/null
+++ b/drivers/acpi/hmat/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_ACPI_HMAT) := hmat.o
+hmat-y := core.o initiator.o target.o
diff --git a/drivers/acpi/hmat/core.c b/drivers/acpi/hmat/core.c
new file mode 100644
index 000000000000..61b90dadf84b
--- /dev/null
+++ b/drivers/acpi/hmat/core.c
@@ -0,0 +1,536 @@
+/*
+ * Heterogeneous Memory Attributes Table (HMAT) representation in sysfs
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <acpi/acpi_numa.h>
+#include <linux/acpi.h>
+#include <linux/cpu.h>
+#include <linux/device.h>
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include "hmat.h"
+
+static LIST_HEAD(target_list);
+static LIST_HEAD(initiator_list);
+
+static bool bad_hmat;
+
+static int link_node_for_kobj(unsigned int node, struct kobject *kobj)
+{
+	if (node_devices[node])
+		return sysfs_create_link(kobj, &node_devices[node]->dev.kobj,
+				kobject_name(&node_devices[node]->dev.kobj));
+	return 0;
+}
+
+static void remove_node_for_kobj(unsigned int node, struct kobject *kobj)
+{
+	if (node_devices[node])
+		sysfs_remove_link(kobj,
+				kobject_name(&node_devices[node]->dev.kobj));
+}
+
+#define HMAT_CLASS_NAME	"hmat"
+
+static struct bus_type hmat_subsys = {
+	/*
+	 * .dev_name is set before device_register() based on the type of
+	 * device we are registering.
+	 */
+	.name = HMAT_CLASS_NAME,
+};
+
+/* memory initiators */
+static void release_memory_initiator(struct device *dev)
+{
+	struct memory_initiator *init = to_memory_initiator(dev);
+
+	list_del(&init->list);
+	kfree(init);
+}
+
+static void __init remove_memory_initiator(struct memory_initiator *init)
+{
+	if (init->is_registered) {
+		remove_node_for_kobj(pxm_to_node(init->pxm), &init->dev.kobj);
+		device_unregister(&init->dev);
+	} else
+		release_memory_initiator(&init->dev);
+}
+
+static int __init register_memory_initiator(struct memory_initiator *init)
+{
+	int ret;
+
+	hmat_subsys.dev_name = "mem_init";
+	init->dev.bus = &hmat_subsys;
+	init->dev.id = pxm_to_node(init->pxm);
+	init->dev.release = release_memory_initiator;
+	init->dev.groups = memory_initiator_attribute_groups;
+
+	ret = device_register(&init->dev);
+	if (ret < 0)
+		return ret;
+
+	init->is_registered = true;
+	return link_node_for_kobj(pxm_to_node(init->pxm), &init->dev.kobj);
+}
+
+static struct memory_initiator * __init add_memory_initiator(int pxm)
+{
+	struct memory_initiator *init;
+
+	if (pxm_to_node(pxm) == NUMA_NO_NODE) {
+		pr_err("HMAT: No NUMA node for PXM %d\n", pxm);
+		bad_hmat = true;
+		return ERR_PTR(-EINVAL);
+	}
+
+	/*
+	 * Make sure we haven't already added an initiator for this proximity
+	 * domain.  We don't care about any other differences in the SRAT
+	 * tables (apic_id, etc), so we just use the data from the first table
+	 * we see for a given proximity domain.
+	 */
+	list_for_each_entry(init, &initiator_list, list)
+		if (init->pxm == pxm)
+			return NULL;
+
+	init = kzalloc(sizeof(*init), GFP_KERNEL);
+	if (!init) {
+		bad_hmat = true;
+		return ERR_PTR(-ENOMEM);
+	}
+
+	init->pxm = pxm;
+
+	list_add_tail(&init->list, &initiator_list);
+	return init;
+}
+
+/* memory targets */
+static void release_memory_target(struct device *dev)
+{
+	struct memory_target *tgt = to_memory_target(dev);
+
+	list_del(&tgt->list);
+	kfree(tgt);
+}
+
+static void __init remove_memory_target(struct memory_target *tgt)
+{
+	if (tgt->is_registered) {
+		remove_node_for_kobj(pxm_to_node(tgt->ma->proximity_domain),
+				&tgt->dev.kobj);
+		device_unregister(&tgt->dev);
+	} else
+		release_memory_target(&tgt->dev);
+}
+
+static int __init register_memory_target(struct memory_target *tgt)
+{
+	int ret;
+
+	if (!tgt->ma || !tgt->spa) {
+		pr_err("HMAT: Incomplete memory target found\n");
+		return -EINVAL;
+	}
+
+	hmat_subsys.dev_name = "mem_tgt";
+	tgt->dev.bus = &hmat_subsys;
+	tgt->dev.id = pxm_to_node(tgt->ma->proximity_domain);
+	tgt->dev.release = release_memory_target;
+	tgt->dev.groups = memory_target_attribute_groups;
+
+	ret = device_register(&tgt->dev);
+	if (ret < 0)
+		return ret;
+
+	tgt->is_registered = true;
+
+	return link_node_for_kobj(pxm_to_node(tgt->ma->proximity_domain),
+			&tgt->dev.kobj);
+}
+
+static int __init add_memory_target(struct acpi_srat_mem_affinity *ma)
+{
+	struct memory_target *tgt;
+
+	if (pxm_to_node(ma->proximity_domain) == NUMA_NO_NODE) {
+		pr_err("HMAT: No NUMA node for PXM %d\n", ma->proximity_domain);
+		bad_hmat = true;
+		return -EINVAL;
+	}
+
+	/*
+	 * Make sure we haven't already added a target for this proximity
+	 * domain.  We don't care about any other differences in the SRAT
+	 * tables (base_address, length), so we just use the data from the
+	 * first table we see for a given proximity domain.
+	 */
+	list_for_each_entry(tgt, &target_list, list)
+		if (tgt->ma->proximity_domain == ma->proximity_domain)
+			return 0;
+
+	tgt = kzalloc(sizeof(*tgt), GFP_KERNEL);
+	if (!tgt) {
+		bad_hmat = true;
+		return -ENOMEM;
+	}
+
+	tgt->ma = ma;
+
+	list_add_tail(&tgt->list, &target_list);
+	return 0;
+}
+
+/* ACPI parsing code, starting with the HMAT */
+static int __init hmat_noop_parse(struct acpi_table_header *table)
+{
+	/* real work done by the hmat_parse_* and srat_parse_* routines */
+	return 0;
+}
+
+static bool __init hmat_spa_matches_srat(struct acpi_hmat_address_range *spa,
+		struct acpi_srat_mem_affinity *ma)
+{
+	if (spa->physical_address_base != ma->base_address ||
+	    spa->physical_address_length != ma->length)
+		return false;
+
+	return true;
+}
+
+static void find_local_initiator(struct memory_target *tgt)
+{
+	struct memory_initiator *init;
+
+	if (!(tgt->spa->flags & ACPI_HMAT_PROCESSOR_PD_VALID) ||
+			pxm_to_node(tgt->spa->processor_PD) == NUMA_NO_NODE)
+		return;
+
+	list_for_each_entry(init, &initiator_list, list) {
+		if (init->pxm == tgt->spa->processor_PD) {
+			tgt->local_init = init;
+			return;
+		}
+	}
+}
+
+/* ACPI HMAT parsing routines */
+static int __init
+hmat_parse_address_range(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_hmat_address_range *spa;
+	struct memory_target *tgt;
+
+	if (bad_hmat)
+		return 0;
+
+	spa = (struct acpi_hmat_address_range *)header;
+	if (!spa) {
+		pr_err("HMAT: NULL table entry\n");
+		goto err;
+	}
+
+	if (spa->header.length != sizeof(*spa)) {
+		pr_err("HMAT: Unexpected header length: %d\n",
+				spa->header.length);
+		goto err;
+	}
+
+	list_for_each_entry(tgt, &target_list, list) {
+		if ((spa->flags & ACPI_HMAT_MEMORY_PD_VALID) &&
+				spa->memory_PD == tgt->ma->proximity_domain) {
+			/*
+			 * We only add a single HMAT target per proximity
+			 * domain so we wait for the one that matches the
+			 * single SRAT memory affinity structure per PXM we
+			 * saved in add_memory_target().
+			 */
+			if (hmat_spa_matches_srat(spa, tgt->ma)) {
+				tgt->spa = spa;
+				find_local_initiator(tgt);
+			}
+			return 0;
+		}
+	}
+
+	return 0;
+err:
+	bad_hmat = true;
+	return -EINVAL;
+}
+
+static int __init hmat_parse_cache(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_hmat_cache *cache;
+	struct memory_target *tgt;
+
+	if (bad_hmat)
+		return 0;
+
+	cache = (struct acpi_hmat_cache *)header;
+	if (!cache) {
+		pr_err("HMAT: NULL table entry\n");
+		goto err;
+	}
+
+	if (cache->header.length < sizeof(*cache)) {
+		pr_err("HMAT: Unexpected header length: %d\n",
+				cache->header.length);
+		goto err;
+	}
+
+	list_for_each_entry(tgt, &target_list, list) {
+		if (cache->memory_PD == tgt->ma->proximity_domain) {
+			tgt->is_cached = true;
+			return 0;
+		}
+	}
+
+	pr_err("HMAT: Couldn't find cached target PXM %d\n", cache->memory_PD);
+err:
+	bad_hmat = true;
+	return -EINVAL;
+}
+
+/*
+ * SRAT parsing.  We use srat_disabled() and pxm_to_node() so we don't redo
+ * any of the SRAT sanity checking done in drivers/acpi/numa.c.
+ */
+static int __init
+srat_parse_processor_affinity(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_srat_cpu_affinity *cpu;
+	struct memory_initiator *init;
+	u32 pxm;
+
+	if (bad_hmat)
+		return 0;
+
+	cpu = (struct acpi_srat_cpu_affinity *)header;
+	if (!cpu) {
+		pr_err("HMAT: NULL table entry\n");
+		bad_hmat = true;
+		return -EINVAL;
+	}
+
+	pxm = cpu->proximity_domain_lo;
+	if (acpi_srat_revision >= 2)
+		pxm |= *((unsigned int *)cpu->proximity_domain_hi) << 8;
+
+	if (!(cpu->flags & ACPI_SRAT_CPU_ENABLED))
+		return 0;
+
+	init = add_memory_initiator(pxm);
+	if (IS_ERR_OR_NULL(init))
+		return PTR_ERR(init);
+
+	init->cpu = cpu;
+	return 0;
+}
+
+static int __init
+srat_parse_x2apic_affinity(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_srat_x2apic_cpu_affinity *x2apic;
+	struct memory_initiator *init;
+
+	if (bad_hmat)
+		return 0;
+
+	x2apic = (struct acpi_srat_x2apic_cpu_affinity *)header;
+	if (!x2apic) {
+		pr_err("HMAT: NULL table entry\n");
+		bad_hmat = true;
+		return -EINVAL;
+	}
+
+	if (!(x2apic->flags & ACPI_SRAT_CPU_ENABLED))
+		return 0;
+
+	init = add_memory_initiator(x2apic->proximity_domain);
+	if (IS_ERR_OR_NULL(init))
+		return PTR_ERR(init);
+
+	init->x2apic = x2apic;
+	return 0;
+}
+
+static int __init
+srat_parse_gicc_affinity(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_srat_gicc_affinity *gicc;
+	struct memory_initiator *init;
+
+	if (bad_hmat)
+		return 0;
+
+	gicc = (struct acpi_srat_gicc_affinity *)header;
+	if (!gicc) {
+		pr_err("HMAT: NULL table entry\n");
+		bad_hmat = true;
+		return -EINVAL;
+	}
+
+	if (!(gicc->flags & ACPI_SRAT_GICC_ENABLED))
+		return 0;
+
+	init = add_memory_initiator(gicc->proximity_domain);
+	if (IS_ERR_OR_NULL(init))
+		return PTR_ERR(init);
+
+	init->gicc = gicc;
+	return 0;
+}
+
+static int __init
+srat_parse_memory_affinity(struct acpi_subtable_header *header,
+		const unsigned long end)
+{
+	struct acpi_srat_mem_affinity *ma;
+
+	if (bad_hmat)
+		return 0;
+
+	ma = (struct acpi_srat_mem_affinity *)header;
+	if (!ma) {
+		pr_err("HMAT: NULL table entry\n");
+		bad_hmat = true;
+		return -EINVAL;
+	}
+
+	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
+		return 0;
+
+	return add_memory_target(ma);
+}
+
+/*
+ * Remove our sysfs entries, unregister our devices and free allocated memory.
+ */
+static void hmat_cleanup(void)
+{
+	struct memory_initiator *init, *init_iter;
+	struct memory_target *tgt, *tgt_iter;
+
+	list_for_each_entry_safe(tgt, tgt_iter, &target_list, list)
+		remove_memory_target(tgt);
+
+	list_for_each_entry_safe(init, init_iter, &initiator_list, list)
+		remove_memory_initiator(init);
+}
+
+static int __init hmat_init(void)
+{
+	struct acpi_table_header *tbl;
+	struct memory_initiator *init;
+	struct memory_target *tgt;
+	acpi_status status = AE_OK;
+	int ret;
+
+	if (srat_disabled())
+		return 0;
+
+	/*
+	 * We take a permanent reference to both the HMAT and SRAT in ACPI
+	 * memory so we can keep pointers to their subtables.  These tables
+	 * already had references on them which would never be released, taken
+	 * by acpi_sysfs_init(), so this shouldn't negatively impact anything.
+	 */
+	status = acpi_get_table(ACPI_SIG_SRAT, 0, &tbl);
+	if (ACPI_FAILURE(status))
+		return 0;
+
+	status = acpi_get_table(ACPI_SIG_HMAT, 0, &tbl);
+	if (ACPI_FAILURE(status))
+		return 0;
+
+	ret = subsys_system_register(&hmat_subsys, NULL);
+	if (ret)
+		return ret;
+
+	if (!acpi_table_parse(ACPI_SIG_SRAT, hmat_noop_parse)) {
+		struct acpi_subtable_proc srat_proc[4];
+
+		memset(srat_proc, 0, sizeof(srat_proc));
+		srat_proc[0].id = ACPI_SRAT_TYPE_CPU_AFFINITY;
+		srat_proc[0].handler = srat_parse_processor_affinity;
+		srat_proc[1].id = ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY;
+		srat_proc[1].handler = srat_parse_x2apic_affinity;
+		srat_proc[2].id = ACPI_SRAT_TYPE_GICC_AFFINITY;
+		srat_proc[2].handler = srat_parse_gicc_affinity;
+		srat_proc[3].id = ACPI_SRAT_TYPE_MEMORY_AFFINITY;
+		srat_proc[3].handler = srat_parse_memory_affinity;
+
+		acpi_table_parse_entries_array(ACPI_SIG_SRAT,
+					sizeof(struct acpi_table_srat),
+					srat_proc, ARRAY_SIZE(srat_proc), 0);
+	}
+
+	if (!acpi_table_parse(ACPI_SIG_HMAT, hmat_noop_parse)) {
+		struct acpi_subtable_proc hmat_proc[2];
+
+		memset(hmat_proc, 0, sizeof(hmat_proc));
+		hmat_proc[0].id = ACPI_HMAT_TYPE_ADDRESS_RANGE;
+		hmat_proc[0].handler = hmat_parse_address_range;
+		hmat_proc[1].id = ACPI_HMAT_TYPE_CACHE;
+		hmat_proc[1].handler = hmat_parse_cache;
+
+		acpi_table_parse_entries_array(ACPI_SIG_HMAT,
+					sizeof(struct acpi_table_hmat),
+					hmat_proc, ARRAY_SIZE(hmat_proc), 0);
+	}
+
+	if (bad_hmat) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	list_for_each_entry(init, &initiator_list, list) {
+		ret = register_memory_initiator(init);
+		if (ret)
+			goto err;
+	}
+
+	list_for_each_entry(tgt, &target_list, list) {
+		ret = register_memory_target(tgt);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+err:
+	pr_err("HMAT: Error during initialization\n");
+	hmat_cleanup();
+	return ret;
+}
+
+static __exit void hmat_exit(void)
+{
+	hmat_cleanup();
+}
+
+module_init(hmat_init);
+module_exit(hmat_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
diff --git a/drivers/acpi/hmat/hmat.h b/drivers/acpi/hmat/hmat.h
new file mode 100644
index 000000000000..108aad1f8ad7
--- /dev/null
+++ b/drivers/acpi/hmat/hmat.h
@@ -0,0 +1,47 @@
+/*
+ * Heterogeneous Memory Attributes Table (HMAT) representation in sysfs
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef _ACPI_HMAT_H_
+#define _ACPI_HMAT_H_
+
+struct memory_initiator {
+	struct list_head list;
+	struct device dev;
+
+	/* only one of the following three will be set */
+	struct acpi_srat_cpu_affinity *cpu;
+	struct acpi_srat_x2apic_cpu_affinity *x2apic;
+	struct acpi_srat_gicc_affinity *gicc;
+
+	int pxm;
+	bool is_registered;
+};
+#define to_memory_initiator(d) container_of((d), struct memory_initiator, dev)
+
+struct memory_target {
+	struct list_head list;
+	struct device dev;
+	struct acpi_srat_mem_affinity *ma;
+	struct acpi_hmat_address_range *spa;
+	struct memory_initiator *local_init;
+
+	bool is_cached;
+	bool is_registered;
+};
+#define to_memory_target(d) container_of((d), struct memory_target, dev)
+
+extern const struct attribute_group *memory_initiator_attribute_groups[];
+extern const struct attribute_group *memory_target_attribute_groups[];
+#endif /* _ACPI_HMAT_H_ */
diff --git a/drivers/acpi/hmat/initiator.c b/drivers/acpi/hmat/initiator.c
new file mode 100644
index 000000000000..be2bf2b58940
--- /dev/null
+++ b/drivers/acpi/hmat/initiator.c
@@ -0,0 +1,43 @@
+/*
+ * Heterogeneous Memory Attributes Table (HMAT) sysfs initiator representation
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <acpi/acpi_numa.h>
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/sysfs.h>
+#include "hmat.h"
+
+static ssize_t firmware_id_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	struct memory_initiator *init = to_memory_initiator(dev);
+
+	return sprintf(buf, "%d\n", init->pxm);
+}
+static DEVICE_ATTR_RO(firmware_id);
+
+static struct attribute *memory_initiator_attributes[] = {
+	&dev_attr_firmware_id.attr,
+	NULL,
+};
+
+static struct attribute_group memory_initiator_attribute_group = {
+	.attrs = memory_initiator_attributes,
+};
+
+const struct attribute_group *memory_initiator_attribute_groups[] = {
+	&memory_initiator_attribute_group,
+	NULL,
+};
diff --git a/drivers/acpi/hmat/target.c b/drivers/acpi/hmat/target.c
new file mode 100644
index 000000000000..2a9b44d5f44c
--- /dev/null
+++ b/drivers/acpi/hmat/target.c
@@ -0,0 +1,55 @@
+/*
+ * Heterogeneous Memory Attributes Table (HMAT) sysfs target representation
+ *
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <acpi/acpi_numa.h>
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/sysfs.h>
+#include "hmat.h"
+
+/* attributes for memory targets */
+static ssize_t firmware_id_show(struct device *dev,
+			struct device_attribute *attr, char *buf)
+{
+	struct memory_target *tgt = to_memory_target(dev);
+
+	return sprintf(buf, "%d\n", tgt->ma->proximity_domain);
+}
+static DEVICE_ATTR_RO(firmware_id);
+
+static ssize_t is_cached_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct memory_target *tgt = to_memory_target(dev);
+
+	return sprintf(buf, "%d\n", tgt->is_cached);
+}
+static DEVICE_ATTR_RO(is_cached);
+
+static struct attribute *memory_target_attributes[] = {
+	&dev_attr_firmware_id.attr,
+	&dev_attr_is_cached.attr,
+	NULL
+};
+
+/* attributes which are present for all memory targets */
+static struct attribute_group memory_target_attribute_group = {
+	.attrs = memory_target_attributes,
+};
+
+const struct attribute_group *memory_target_attribute_groups[] = {
+	&memory_target_attribute_group,
+	NULL,
+};
-- 
2.14.3


* [Devel] [PATCH v3 0/3] create sysfs representation of ACPI HMAT
@ 2017-12-14  2:10 ` Ross Zwisler
  0 siblings, 0 replies; 153+ messages in thread
From: Ross Zwisler @ 2017-12-14  2:10 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 6166 bytes --]

This is the third revision of my patches adding a sysfs representation
of the ACPI Heterogeneous Memory Attribute Table (HMAT).  These patches
are based on v4.15-rc3 and a working tree can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/zwisler/linux.git/log/?h=hmat_v3

My goal is to get these patches merged for v4.16.

Changes from previous version (https://lkml.org/lkml/2017/7/6/749):

 - Changed "HMEM" to "HMAT" and "hmem" to "hmat" throughout to make sure
   that this effort doesn't get confused with Jerome's HMM work and to
   make it clear that this enabling is tightly tied to the ACPI HMAT
   table.  (John Hubbard)

 - Moved the link in the initiator (i.e. mem_init0/mem_tgt2) from
   pointing to the "mem_tgt2/local_init" attribute group to instead
   point at the mem_tgt2 target itself.  (Brice Goglin)

 - Simplified the contents of both the initiators and the targets so
   that we just symlink to the NUMA node and don't duplicate
   information.  For initiators this means that we no longer enumerate
   CPUs, and for targets this means that we don't provide physical
   address start and length information.  All of this is already
   available in the NUMA node directory itself (i.e.
   /sys/devices/system/node/node0), and it already accounts for the fact
   that both multiple CPUs and multiple memory regions can be owned by a
   given NUMA node.  Also removed some extra attributes (is_enabled,
   is_isolated) which I don't think are useful at this point in time.

I have tested this against many different configs that I implemented
using qemu.

---

==== Quick Summary ====

Platforms exist today which have multiple types of memory attached to a
single CPU.  These disparate memory ranges have some characteristics in
common, such as CPU cache coherence, but they can differ widely in
performance, in terms of both latency and bandwidth.

For example, consider a system that contains persistent memory, standard
DDR memory and High Bandwidth Memory (HBM), all attached to the same CPU.
There could potentially be an order of magnitude or more difference in
performance between the slowest and fastest memory attached to that CPU.

With the current Linux code NUMA nodes are CPU-centric, so all the memory
attached to a given CPU will be lumped into the same NUMA node.  This makes
it very difficult for userspace applications to understand the performance
of different memory ranges on a given CPU.

We solve this issue by providing userspace with performance information on
individual memory ranges.  This performance information is exposed via
sysfs:

  # grep . mem_tgt2/* mem_tgt2/local_init/* 2>/dev/null
  mem_tgt2/firmware_id:1
  mem_tgt2/is_cached:0
  mem_tgt2/local_init/read_bw_MBps:40960
  mem_tgt2/local_init/read_lat_nsec:50
  mem_tgt2/local_init/write_bw_MBps:40960
  mem_tgt2/local_init/write_lat_nsec:50

This allows applications to easily find the memory that they want to use.
We expect that the existing NUMA APIs will be enhanced to use this new
information so that applications can continue to use them to select their
desired memory.
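As an illustration of that use case (editorial sketch, not part of the
series; the attribute paths come from the grep output above, the helper
name is hypothetical, and a temporary directory stands in for sysfs):

```python
import os
import tempfile

def best_read_bw_target(hmat_root):
    """Pick the mem_tgt* with the highest local-initiator read bandwidth (MB/s)."""
    best = None
    for entry in os.listdir(hmat_root):
        if not entry.startswith("mem_tgt"):
            continue
        path = os.path.join(hmat_root, entry, "local_init", "read_bw_MBps")
        if not os.path.exists(path):
            continue  # target with no local (initiator,target) pairing
        with open(path) as f:
            bw = int(f.read())
        if best is None or bw > best[1]:
            best = (entry, bw)
    return best

# Fake tree: a slow target and a fast (HBM-like) target.
root = tempfile.mkdtemp()
for name, bw in [("mem_tgt1", 8192), ("mem_tgt2", 40960)]:
    os.makedirs(os.path.join(root, name, "local_init"))
    with open(os.path.join(root, name, "local_init", "read_bw_MBps"), "w") as f:
        f.write(f"{bw}\n")

print(best_read_bw_target(root))
# -> ('mem_tgt2', 40960)
```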

==== Lots of Details ====

This patch set provides a sysfs representation of parts of the
Heterogeneous Memory Attribute Table (HMAT), newly defined in ACPI 6.2.
One major conceptual change in ACPI 6.2 related to this work is that
proximity domains no longer need to contain a processor.  We can now
have memory-only proximity domains, which means that we can now have
memory-only Linux NUMA nodes.

Here is an example configuration where we have a single processor, one
range of regular memory and one range of HBM:

  +---------------+   +----------------+
  | Processor     |   | Memory         |
  | prox domain 0 +---+ prox domain 1  |
  | NUMA node 1   |   | NUMA node 2    |
  +-------+-------+   +----------------+
          |
  +-------+----------+
  | HBM              |
  | prox domain 2    |
  | NUMA node 0      |
  +------------------+

This gives us one initiator (the processor) and two targets (the two memory
ranges).  Each of these three has its own ACPI proximity domain and
associated Linux NUMA node.  Note also that while there is a 1:1 mapping
from each proximity domain to each NUMA node, the numbers don't necessarily
match up.  Additionally we can have extra NUMA nodes that don't map back to
ACPI proximity domains.

The above configuration could also have the processor and one of the two
memory ranges sharing a proximity domain and NUMA node, but for the
purposes of the HMAT the two memory ranges will need to be separated.

The overall goal of this series and of the HMAT is to allow users to
identify memory using its performance characteristics.  This is
complicated by the amount of HMAT data that could be present in very
large systems, so in this series we only surface performance information
for local (initiator,target) pairings.  The changelog for patch 3
discusses this in detail.

Ross Zwisler (3):
  acpi: HMAT support in acpi_parse_entries_array()
  hmat: add heterogeneous memory sysfs support
  hmat: add performance attributes

 MAINTAINERS                         |   6 +
 drivers/acpi/Kconfig                |   1 +
 drivers/acpi/Makefile               |   1 +
 drivers/acpi/hmat/Kconfig           |   7 +
 drivers/acpi/hmat/Makefile          |   2 +
 drivers/acpi/hmat/core.c            | 797 ++++++++++++++++++++++++++++++++++++
 drivers/acpi/hmat/hmat.h            |  64 +++
 drivers/acpi/hmat/initiator.c       |  43 ++
 drivers/acpi/hmat/perf_attributes.c |  56 +++
 drivers/acpi/hmat/target.c          |  55 +++
 drivers/acpi/tables.c               |  52 ++-
 11 files changed, 1073 insertions(+), 11 deletions(-)
 create mode 100644 drivers/acpi/hmat/Kconfig
 create mode 100644 drivers/acpi/hmat/Makefile
 create mode 100644 drivers/acpi/hmat/core.c
 create mode 100644 drivers/acpi/hmat/hmat.h
 create mode 100644 drivers/acpi/hmat/initiator.c
 create mode 100644 drivers/acpi/hmat/perf_attributes.c
 create mode 100644 drivers/acpi/hmat/target.c

-- 
2.14.3


^ permalink raw reply	[flat|nested] 153+ messages in thread

end of thread, other threads:[~2017-12-30  9:19 UTC | newest]

Thread overview: 153+ messages

2017-12-15 20:53 [Devel] [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support Ross Zwisler
  -- strict thread matches above, loose matches on Subject: below --
2017-12-23  1:14 [Devel] [PATCH v3 0/3] create sysfs representation of ACPI HMAT Rafael J. Wysocki
2017-12-22 21:46 [Devel] " Ross Zwisler
2017-12-15  0:52 [Devel] [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support Rafael J. Wysocki
2017-12-15  0:49 [Devel] [PATCH v3 1/3] acpi: HMAT support in acpi_parse_entries_array() Rafael J. Wysocki
2017-12-14  2:10 [Devel] [PATCH v3 3/3] hmat: add performance attributes Ross Zwisler
2017-12-14  2:10 [Devel] [PATCH v3 2/3] hmat: add heterogeneous memory sysfs support Ross Zwisler
2017-12-14  2:10 [Devel] [PATCH v3 0/3] create sysfs representation of ACPI HMAT Ross Zwisler
2017-12-14  2:10 ` [Devel] [PATCH v3 1/3] acpi: HMAT support in acpi_parse_entries_array() Ross Zwisler
     [not found]   ` <20171214021019.13579-2-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2017-12-15  1:10     ` Dan Williams
2017-12-16  1:53       ` Rafael J. Wysocki
2017-12-16  1:57         ` Dan Williams
2017-12-16  2:15           ` Rafael J. Wysocki
2017-12-14 13:00 ` [PATCH v3 0/3] create sysfs representation of ACPI HMAT Michal Hocko
2017-12-18 20:35   ` [Devel] " Ross Zwisler
2017-12-20 16:41     ` [Devel] " Ross Zwisler
     [not found]       ` <20171220164107.GA29103-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2017-12-21 13:18         ` Michal Hocko
     [not found]     ` <20171218203547.GA2366-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2017-12-20 18:19       ` Matthew Wilcox
2017-12-20 20:22         ` Dave Hansen
2017-12-20 21:16           ` Matthew Wilcox
2017-12-20 21:24             ` [Devel] " Ross Zwisler
2017-12-20 22:29               ` Dan Williams
2017-12-20 22:41                 ` [Devel] " Ross Zwisler
2017-12-21 20:31                   ` Brice Goglin
2017-12-22 22:53                     ` Dan Williams
2017-12-22 23:22                       ` [Devel] " Ross Zwisler
2017-12-22 23:57                         ` Dan Williams
     [not found]                       ` <CAPcyv4j9shdJFrvADa=qW4L-jPJJ4S_TJc_c=aRoW3EmSCCChQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-12-27  9:10                         ` Brice Goglin
     [not found]                           ` <71317994-af66-a1b2-4c7a-86a03253cf62-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-12-30  6:58                             ` Matthew Wilcox
     [not found]                               ` <20171230065845.GD27959-PfSpb0PWhxZc2C7mugBRk2EX/6BAtgUQ@public.gmane.org>
2017-12-30  9:19                                 ` Brice Goglin
2017-12-20 21:13         ` [Devel] " Ross Zwisler
2017-12-21  1:41           ` Elliott, Robert (Persistent Memory)
2017-12-21 12:50         ` Michael Ellerman
2017-12-22  3:09 ` Anshuman Khandual
2017-12-22 10:31   ` Kogut, Jaroslaw
2017-12-22 14:37     ` Anshuman Khandual
2017-12-22 17:13   ` Dave Hansen
2017-12-23  5:14     ` Anshuman Khandual
2017-12-22 22:13   ` [Devel] " Ross Zwisler
2017-12-23  6:56     ` Anshuman Khandual
2017-12-22 22:31   ` [Devel] " Ross Zwisler
     [not found]     ` <20171222223154.GC25711-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2017-12-25  2:05       ` Liubo(OS Lab)
