NVDIMM Device and Persistent Memory development
From: alison.schofield@intel.com
To: Dan Williams <dan.j.williams@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Ben Widawsky <bwidawsk@kernel.org>
Cc: Alison Schofield <alison.schofield@intel.com>,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org
Subject: [ndctl RFC 0/3] Support poison list retrieval
Date: Thu, 13 Oct 2022 16:39:00 -0700
Message-ID: <cover.1665699750.git.alison.schofield@intel.com>

From: Alison Schofield <alison.schofield@intel.com>

This is labeled RFC because it builds upon in-flight patchsets,
making it unlikely that others can try it out. It depends upon the
tracing support in Dave's monitor patchset [1] and the kernel
driver support for poison in this patchset [2].

The first patch adds a libcxl API for triggering the read of a
poison list from a memory device. Users of that API will need to
trace the kernel events to collect the error records.

Patches 2 & 3 add a --media-errors option to 'cxl list'. With that
option the poison list is read, the results are collected and parsed,
and the media error records are included in the JSON list output.

The JSON output of 'cxl list' does not include all the same fields
that are available in the 'cxl_poison' trace event.

Trace events of 'cxl_poison' always include these fields:
region: memdev: pcidev: hpa: dpa: length: source: flags: overflow_time:

'cxl list --media-errors' omits fields that are redundant in the
context of the cxl list command:
- Do not repeat the memdev, region, or pcidev that are
  already included in the list output.
- Only include 'hpa' when media errors are listed by region.
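
The field selection above can be sketched as a small filter. This is
only an illustration, not the code in patch 2: to_list_record() is a
hypothetical helper, and keeping 'memdev' in the per-region case is an
assumption inferred from the region example output further down.

```python
# Hypothetical helper, not part of ndctl. Field names are the ones the
# cover letter lists for the 'cxl_poison' trace event.
MEMDEV_FIELDS = ("dpa", "length", "source", "flags", "overflow_time")
# Assumption: per-region records keep 'memdev' (a region spans several
# memdevs) and add 'hpa'; region and pcidev are never repeated.
REGION_FIELDS = ("memdev", "hpa") + MEMDEV_FIELDS


def to_list_record(event, by_region):
    """Reduce one cxl_poison trace event (a dict) to a list record."""
    keep = REGION_FIELDS if by_region else MEMDEV_FIELDS
    return {key: event[key] for key in keep if key in event}


# Example event; the pcidev value is a placeholder, not real data.
event = {
    "region": "region5", "memdev": "mem2", "pcidev": "<placeholder>",
    "hpa": 0, "dpa": 0, "length": 64, "source": "Reserved",
    "flags": "", "overflow_time": 0,
}
print(to_list_record(event, by_region=False))
print(to_list_record(event, by_region=True))
```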

Examples:
# cxl list -m mem2 --media-errors
[
  {
    "memdev":"mem2",
    "pmem_size":1073741824,
    "ram_size":0,
    "serial":2,
    "host":"cxl_mem.2",
    "media_errors":{
      "nr media-errors":2,
      "media-error records":[
        {
          "dpa":64,
          "length":128,
          "source":"Injected",
          "flags":"Overflow,",
          "overflow_time":1656711046
        },
        {
          "dpa":192,
          "length":192,
          "source":"Internal",
          "flags":"Overflow,",
          "overflow_time":1656711046
        }
      ]
    }
  }
]
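
A consumer of the per-memdev output might walk the records like this.
It is a minimal sketch: poisoned_ranges() is a hypothetical name, the
sample is trimmed from the output above, and it assumes 'length' is a
byte count so that the last poisoned dpa is dpa + length - 1.

```python
import json

# Trimmed sample shaped like 'cxl list -m mem2 --media-errors' output.
SAMPLE = """[{"memdev":"mem2","media_errors":{"nr media-errors":2,
"media-error records":[
  {"dpa":64,"length":128,"source":"Injected"},
  {"dpa":192,"length":192,"source":"Internal"}]}}]"""


def poisoned_ranges(list_json):
    """Yield (memdev, first_dpa, last_dpa) for every media-error record."""
    for dev in json.loads(list_json):
        errors = dev.get("media_errors", {})
        for rec in errors.get("media-error records", []):
            # Assumption: 'length' counts bytes starting at 'dpa'.
            yield dev["memdev"], rec["dpa"], rec["dpa"] + rec["length"] - 1


for memdev, first, last in poisoned_ranges(SAMPLE):
    print(f"{memdev}: dpa {first:#x}-{last:#x}")
```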

# cxl list -r region5 --media-errors
[
  {
    "region":"region5",
    "resource":1035623989248,
    "size":2147483648,
    "interleave_ways":2,
    "interleave_granularity":4096,
    "decode_state":"commit",
    "media_errors":{
      "nr media-errors":2,
      "media-error records":[
        {
          "memdev":"mem2",
          "hpa":0,
          "dpa":0,
          "length":64,
          "source":"Reserved",
          "flags":"",
          "overflow_time":0
        },
        {
          "memdev":"mem5",
          "hpa":0,
          "dpa":384,
          "length":256,
          "source":"Injected",
          "flags":"",
          "overflow_time":0
        }
      ]
    }
  }
]
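
Because per-region records carry the memdev name, a script can regroup
them by device. A minimal sketch, with errors_by_memdev() as a
hypothetical name and the sample trimmed from the output above:

```python
import json
from collections import defaultdict

# Trimmed sample shaped like 'cxl list -r region5 --media-errors' output.
SAMPLE = """[{"region":"region5","media_errors":{"nr media-errors":2,
"media-error records":[
  {"memdev":"mem2","hpa":0,"dpa":0,"length":64,"source":"Reserved"},
  {"memdev":"mem5","hpa":0,"dpa":384,"length":256,"source":"Injected"}]}}]"""


def errors_by_memdev(list_json):
    """Map each memdev name to the media-error records reported for it."""
    grouped = defaultdict(list)
    for region in json.loads(list_json):
        errors = region.get("media_errors", {})
        for rec in errors.get("media-error records", []):
            grouped[rec["memdev"]].append(rec)
    return dict(grouped)


for memdev, records in sorted(errors_by_memdev(SAMPLE).items()):
    print(memdev, [(rec["dpa"], rec["length"]) for rec in records])
```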

[1] https://lore.kernel.org/nvdimm/166363103019.3861186.3067220004819656109.stgit@djiang5-desk3.ch.intel.com/
[2] https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/

Alison Schofield (3):
  libcxl: add interfaces for GET_POISON_LIST mailbox commands
  cxl/list: collect and parse the poison list records
  cxl/list: add --media-errors option to cxl list

 Documentation/cxl/cxl-list.txt |  66 +++++++++++
 cxl/filter.c                   |   2 +
 cxl/filter.h                   |   1 +
 cxl/json.c                     | 197 +++++++++++++++++++++++++++++++++
 cxl/lib/libcxl.c               |  40 +++++++
 cxl/lib/libcxl.sym             |   6 +
 cxl/libcxl.h                   |   2 +
 cxl/list.c                     |   2 +
 8 files changed, 316 insertions(+)

-- 
2.37.3

