qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Black <daniel@linux.ibm.com>
To: Tao <tao3.xu@intel.com>
Cc: ehabkost@redhat.com, jingqi.liu@intel.com, fan.du@intel.com,
	qemu-devel@nongnu.org, jonathan.cameron@huawei.com,
	imammedo@redhat.com, dan.j.williams@intel.com
Subject: Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information
Date: Mon, 12 Aug 2019 15:13:40 +1000	[thread overview]
Message-ID: <20190812151340.567e1cb6@volution.ozlabs.ibm.com> (raw)
In-Reply-To: <20190809065731.9097-10-tao3.xu@intel.com>



Tao Xu, Liu Jingqi,

Thanks for doing these updates.

On Fri,  9 Aug 2019 14:57:29 +0800
Tao <tao3.xu@intel.com> wrote:

> From: Liu Jingqi <jingqi.liu@intel.com>
> 
> Add -numa hmat-lb option to provide System Locality Latency and
> Bandwidth Information. These memory attributes help to build
> System Locality Latency and Bandwidth Information Structure(s)
> in ACPI Heterogeneous Memory Attribute Table (HMAT).
> 
> Signed-off-by: Liu Jingqi <jingqi.liu@intel.com>
> Signed-off-by: Tao Xu <tao3.xu@intel.com>
> ---
>  hw/acpi/hmat.h        |   3 +
>  hw/core/numa.c        | 185
> ++++++++++++++++++++++++++++++++++++++++++ include/sysemu/numa.h |
> 2 + qapi/machine.json     |  95 +++++++++++++++++++++-
>  qemu-options.hx       |  44 +++++++++-
>  5 files changed, 326 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h
> index 6c32f12e78..b7c1e02cf0 100644
> --- a/hw/acpi/hmat.h
> +++ b/hw/acpi/hmat.h
> @@ -42,6 +42,9 @@
>  
>  #define HMAT_IS_LATENCY(type) (type <= HMAT_LB_DATA_WRITE_LATENCY)
>  
> +#define PICO_PER_USEC 1000000
> +#define PICO_PER_NSEC 1000
> +
>  struct HMAT_LB_Info {
>      /*
>       * Indicates total number of Proximity Domains
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index cfb6339810..9a494145f3 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -37,6 +37,7 @@
>  #include "qemu/option.h"
>  #include "qemu/config-file.h"
>  #include "qemu/cutils.h"
> +#include "hw/acpi/hmat.h"
>  
>  QemuOptsList qemu_numa_opts = {
>      .name = "numa",
> @@ -183,6 +184,184 @@ void parse_numa_distance(MachineState *ms,
> NumaDistOptions *dist, Error **errp)
> ms->numa_state->have_numa_distance = true; }
>  
> +void parse_numa_hmat_lb(MachineState *ms, NumaHmatLBOptions *node,
> +                        Error **errp)
> +{
..

Optional; you could support not connected (0xffff) for latency/bandwidth in
this parsing.

> +        if (*endptr == '\0') {
> +            base_lat = 1;
> +        } else if (*(endptr + 1) == 's') {
> +            switch (*endptr) {
> +            case 'p':
> +                base_lat = 1;
> +                break;
> +            case 'n':
> +                base_lat = PICO_PER_NSEC;
> +                break;
> +            case 'u':

Glad you picked up my mismatch of "u/micro".

> +        } else {
> +            error_setg(errp, "Invalid latency unit %s,"
> +                "vaild units are \"ps\" \"ns\" \"us\"",
>node->latency);

typo "valid"

> +        } else if (hmat_lb->base_lat != base_lat) {
> +            error_setg(errp, "Invalid latency unit %s,"
> +                " please unify the units.", node->latency);

This error is misleading. Should be something like "all latencies must be
specified in the same units"

> +        switch (toupper(*endptr)) {
> +        case '\0':
> +        case 'M':
> +            base_bw = 1;
> +            break;
> +        case 'G':
> +            base_bw = UINT64_C(1) << 10;
> +            break;

There was one more gap - Terra.

        case 'T':
           base_bw = UINT64_C(1) << 20;
           break;

> +        case 'P':
> +            base_bw = UINT64_C(1) << 20;
and:
               base_bw = UINT64_C(1) << 30;

> +            break;
> +        }


Currently Linux 5.3.0-rc3+ doesn't cope with real corrected "bandwidth=2P" so
maybe not worth it.

[    2.092060] HMAT: Locality: Flags:00 Type:Access Bandwidth Initiator
Domains:1 Target Domains:2 Base:1073741824 [    2.092326]   Initiator-Target[0-0]:-2147483648 MB/s

On values, testing for overflow is required. e.g:

 -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=4096T
bandwidth=4096T

[    2.047676] HMAT: Locality: Flags:00 Type:Access Bandwidth Initiator Domains:1 Target Domains:2 Base:1048576
[    2.048084]   Initiator-Target[0-0]:0 MB/s

Technically ACPI could support up to 4P with base/offset but you'd need to be a
lot trickier (i.e. base is highest common multiple of all entries and then see
if entry/base > 2^32-2 ) with base/entry values to arrive at this number.

+docs/commit message propagation of this.


> +        } else if (hmat_lb->base_lat != base_lat) {

Bug: Incorrectly copied - base_lat should be base_bw (twice)

> +            error_setg(errp, "Invalid bandwidth unit %s,"
> +                " please unify the units.", node->bandwidth);

This error is misleading. Should be something like "all bandwidths must be
specified in the same units"

> diff --git a/qemu-options.hx b/qemu-options.hx
> index c480781992..cda4607f3a 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx

> +@example
> +-m 2G \
> +-object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 -numa node,nodeid=0,memdev=ram-node0 \
> +-object memory-backend-ram,size=1024M,policy=bind,host-nodes=1,id=ram-node1 -numa node,nodeid=1,memdev=ram-node1 \
> +-smp 2 \
> +-numa cpu,node-id=0,socket-id=0 \
> +-numa cpu,node-id=0,socket-id=1 \
> +-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=5ns \
> +-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
> +-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=10ns \
> +-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=100M \
> +@end example

nit: remove slash on last line

Is this a valid example? I get

qemu-system-x86_64: -numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=11us: Invalid target=1, it hasn't a valid initiator proximity domain.

(I tested with host-nodes=1 changed to 0 as local machine is single node)

Technically on [PATCH v9 07/11]
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index abf99b1adc..431818dc82 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -67,11 +67,81 @@ static void build_hmat_mpda(GArray *table_data, uint16_t flags, int initiator,
     build_append_int_noprefix(table_data, 0, 8);
 }
 
+/*
+ * ACPI 6.3: 5.2.27.4 System Locality Latency and Bandwidth Information
+ * Structure: Table 5-142

nit: 5-146

Test as follows:

qemu-system-x86_64   -kernel /home/dan/repos/linux/vmlinux   -nographic -append  console=ttyS0  \
   -m 2G -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
   -numa node,nodeid=0,memdev=ram-node0 \
   -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node1 \
   -numa node,nodeid=1,memdev=ram-node1 -smp 2 -numa cpu,node-id=0,socket-id=0 \
   -numa cpu,node-id=0,socket-id=1  \
   -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=123us \
   -numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=200M \
   -numa hmat-cache,node-id=0,size=0x20000,total=1,level=1,assoc=direct,policy=write-back,line=8 \
   -numa hmat-cache,node-id=1,size=0x20000,total=1,level=1,assoc=direct,policy=write-back,line=8 \
| grep -A 5 HMAT


[    0.038912] ACPI: HMAT 0x000000007FFE16C5 000118 (v02 BOCHS  BXPCHMAT 00000001 BXPC 00000001)
[    0.040954] SRAT: PXM 0 -> APIC 0x00 -> Node 0
[    0.040999] SRAT: PXM 0 -> APIC 0x01 -> Node 0
[    0.041189] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
[    0.041250] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x3fffffff]
[    0.041276] ACPI: SRAT: Node 1 PXM 1 [mem 0x40000000-0x7fffffff]
--
[    1.984572] HMAT: Memory Flags:0001 Processor Domain:0 Memory Domain:0
[    1.984792] HMAT: Memory Flags:0000 Processor Domain:0 Memory Domain:1
[    1.985435] HMAT: Locality: Flags:00 Type:Access Latency Initiator Domains:1 Target Domains:2 Base:1000000
[    1.986424]   Initiator-Target[0-0]:123000 nsec
[    1.986664]   Initiator-Target[0-1]:0 nsec
[    1.986910] HMAT: Locality: Flags:00 Type:Access Bandwidth Initiator Domains:1 Target Domains:2 Base:1
[    1.987229]   Initiator-Target[0-0]:200 MB/s
[    1.987356]   Initiator-Target[0-1]:0 MB/s
[    1.987549] HMAT: Cache: Domain:0 Size:131072 Attrs:00081111 SMBIOS Handles:0
[    1.988393] HMAT: Cache: Domain:1 Size:131072 Attrs:00081111 SMBIOS Handles:0

Leaving default latency/bw as 0 if ok as spec says '0: the corresponding latency
or bandwidth information is not provided.'. Potentially the kernel could display this better.

Also note https://marc.info/?l=linux-acpi&m=156506549410279&w=2 submitted as
hmat_build_table_structs only calls build_hmat_mpda with flags=0 or HMAT_PROX_INIT_VALID (0x1) which is right looking at ACPI-6.3. An Ack/(Nack if I'm wrong) there would be good to have both kernel and this patch series working together.

for entire series:

Reviewed-by: Daniel Black <daniel@linux.ibm.com>



  reply	other threads:[~2019-08-12  5:14 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09  6:57 [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT) Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 01/11] hw/arm: simplify arm_load_dtb Tao
2019-08-13 21:55   ` Alistair Francis
2019-08-14  1:19     ` Andrew Jeffery
2019-08-13 21:55   ` Eduardo Habkost
2019-08-14 13:08     ` Cédric Le Goater
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 02/11] numa: move numa global variable nb_numa_nodes into MachineState Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 03/11] numa: move numa global variable have_numa_distance " Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 04/11] numa: move numa global variable numa_info " Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes Tao
2019-08-13 15:00   ` Igor Mammedov
2019-08-14  2:24     ` Tao Xu
2019-08-16 14:47       ` Igor Mammedov
2019-08-14  2:39     ` Dan Williams
2019-08-14  5:13       ` Tao Xu
2019-08-14 21:29         ` Dan Williams
2019-08-15  1:56           ` Tao Xu
2019-08-15  2:31             ` Dan Williams
2019-08-16 14:57               ` Igor Mammedov
2019-08-20  8:34                 ` Tao Xu
2019-08-27 13:12                   ` Igor Mammedov
2019-08-28  1:09                     ` Tao Xu
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 06/11] hmat acpi: Build Memory Proximity Domain Attributes Structure(s) Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 07/11] hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s) Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 08/11] hmat acpi: Build Memory Side Cache " Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information Tao
2019-08-12  5:13   ` Daniel Black [this message]
2019-08-12  6:11     ` Tao Xu
2019-08-13 15:11   ` Eric Blake
2019-08-14  2:58     ` Tao Xu
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 10/11] numa: Extend the CLI to provide memory side cache information Tao
2019-08-09  6:57 ` [Qemu-devel] [PATCH v9 11/11] tests/bios-tables-test: add test cases for ACPI HMAT Tao
2019-08-09 11:11 ` [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT) no-reply
2019-08-13  8:53 ` Tao Xu
2019-08-14 20:57   ` Eduardo Habkost
2019-08-15  0:53     ` Tao Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190812151340.567e1cb6@volution.ozlabs.ibm.com \
    --to=daniel@linux.ibm.com \
    --cc=dan.j.williams@intel.com \
    --cc=ehabkost@redhat.com \
    --cc=fan.du@intel.com \
    --cc=imammedo@redhat.com \
    --cc=jingqi.liu@intel.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tao3.xu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).