public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Alison Schofield <alison.schofield@intel.com>,
	"Vishal Verma" <vishal.l.verma@intel.com>,
	Ben Widawsky <bwidawsk@kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 02/11] cxl/mem: Implement Get Event Records command
Date: Wed, 16 Nov 2022 15:19:36 +0000	[thread overview]
Message-ID: <20221116151936.0000662f@Huawei.com> (raw)
In-Reply-To: <20221110185758.879472-3-ira.weiny@intel.com>

On Thu, 10 Nov 2022 10:57:49 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL devices have multiple event logs which can be queried for CXL event
> records.  Devices are required to support the storage of at least one
> event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Record mailbox command; CXL
> rev 3.0 section 8.2.9.2.2.
> 
> Issue the Get Event Record mailbox command on driver load.  Trace each
> record found with a generic record trace.  Trace any overflow
> conditions.
> 
> The device can return up to 1MB worth of event records per query.  This
> presents complications with allocating a huge buffers to potentially
> capture all the records.  It is not anticipated that these event logs
> will be very deep and reading them does not need to be performant.
> Process only 3 records at a time.  3 records was chosen as it fits
> comfortably on the stack to prevent dynamic allocation while still
> cutting down on extra mailbox messages.
> 
> This patch traces a raw event record only and leaves the specific event
> record types to subsequent patches.
> 
> Macros are created to use for tracing the common CXL Event header
> fields.
> 
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Hi Ira,

A question inline about whether some of the conditions you are checking
for can actually happen. Otherwise looks good to me.

Jonathan

> 
> ---
> Change from RFC v2:
> 	Support reading 3 events at once.
> 	Reverse Jonathan's suggestion and check for positive number of
> 		records.  Because the record count may have been
> 		returned as something > 3 based on what the device
> 		thinks it can send back even though the core Linux mbox
> 		processing truncates the data.
> 	Alison and Dave Jiang
> 		Change header uuid type to uuid_t for better user space
> 		processing
> 	Smita
> 		Check status reg before reading log.
> 	Steven
> 		Prefix all trace points with 'cxl_'
> 		Use static branch <trace>_enabled() calls
> 	Jonathan
> 		s/CXL_EVENT_TYPE_INFO/0
> 		s/{first,last}/{first,last}_ts
> 		Remove Reserved field from header
> 		Fix header issue for cxl_event_log_type_str()
> 
> Change from RFC:
> 	Remove redundant error message in get event records loop
> 	s/EVENT_RECORD_DATA_LENGTH/CXL_EVENT_RECORD_DATA_LENGTH
> 	Use hdr_uuid for the header UUID field
> 	Use cxl_event_log_type_str() for the trace events
> 	Create macros for the header fields and common entries of each event
> 	Add reserved buffer output dump
> 	Report error if event query fails
> 	Remove unused record_cnt variable
> 	Steven - reorder overflow record
> 		Remove NOTE about checkpatch
> 	Jonathan
> 		check for exactly 1 record
> 		s/v3.0/rev 3.0
> 		Use 3 byte fields for 24bit fields
> 		Add 3.0 Maintenance Operation Class
> 		Add Dynamic Capacity log type
> 		Fix spelling
> 	Dave Jiang/Dan/Alison
> 		s/cxl-event/cxl
> 		trace/events/cxl-events => trace/events/cxl.h
> 		s/cxl_event_overflow/overflow
> 		s/cxl_event/generic_event
> ---
>  MAINTAINERS                  |   1 +
>  drivers/cxl/core/mbox.c      |  70 +++++++++++++++++++
>  drivers/cxl/cxl.h            |   8 +++
>  drivers/cxl/cxlmem.h         |  73 ++++++++++++++++++++
>  include/trace/events/cxl.h   | 127 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/cxl_mem.h |   1 +
>  6 files changed, 280 insertions(+)
>  create mode 100644 include/trace/events/cxl.h

> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 16176b9278b4..a908b95a7de4 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c

> +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +				    enum cxl_event_log_type type)
> +{
> +	struct cxl_get_event_payload payload;
> +	u16 pl_nr;
> +
> +	do {
> +		u8 log_type = type;
> +		int rc;
> +
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> +				       &log_type, sizeof(log_type),
> +				       &payload, sizeof(payload));
> +		if (rc) {
> +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> +				cxl_event_log_type_str(type), rc);
> +			return;
> +		}
> +
> +		pl_nr = le16_to_cpu(payload.record_count);
> +		if (trace_cxl_generic_event_enabled()) {
> +			u16 nr_rec = min_t(u16, pl_nr, CXL_GET_EVENT_NR_RECORDS);

Either I'm misreading the spec, or it can't be greater than NR_RECORDS.
"The number of event records in the Event Records list...."
Event Records being the field inside this payload which is not big enough to
take more than CXL_GET_EVENT_NR_RECORDS and the intro to Get Event Records
refers to the number being restricted by the mailbox output payload provided.

I'm in favor of defense against broken hardware, but don't paper over any
such error - scream about it.

> +			int i;
> +
> +			for (i = 0; i < nr_rec; i++)
> +				trace_cxl_generic_event(dev_name(cxlds->dev),
> +							type,
> +							&payload.record[i]);
> +		}
> +
> +		if (trace_cxl_overflow_enabled() &&
> +		    (payload.flags & CXL_GET_EVENT_FLAG_OVERFLOW))
> +			trace_cxl_overflow(dev_name(cxlds->dev), type, &payload);
> +
> +	} while (pl_nr > CXL_GET_EVENT_NR_RECORDS ||

Isn't pl_nr > CXL_GET_EVENT_NR_RECORDS a hardware bug? It's the number in returned
payload not the total number.

> +		 payload.flags & CXL_GET_EVENT_FLAG_MORE_RECORDS);
> +}


> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> new file mode 100644
> index 000000000000..60dec9a84918
> --- /dev/null
> +++ b/include/trace/events/cxl.h
> @@ -0,0 +1,127 @@


> +#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
> +	__assign_str(dev_name, (dname));					\
> +	__entry->log = (l);							\
> +	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
> +	__entry->hdr_length = (hdr).length;					\
> +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> +	__entry->hdr_maint_op_class = (hdr).maint_op_class
> +
Trivial: Maybe one blank line is enough?
> +
> +#define CXL_EVT_TP_printk(fmt, ...) \
> +	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
> +		"handle=%x related_handle=%x maint_op_class=%u"			\
> +		" : " fmt,							\
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> +		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
> +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> +		##__VA_ARGS__)


  parent reply	other threads:[~2022-11-16 15:20 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-10 18:57 [PATCH 00/11] CXL: Process event logs ira.weiny
2022-11-10 18:57 ` [PATCH 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
2022-11-15 21:41   ` Dave Jiang
2022-11-16 14:53   ` Jonathan Cameron
2022-11-16 23:48     ` Ira Weiny
2022-11-17 11:20       ` Jonathan Cameron
2022-11-10 18:57 ` [PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
2022-11-15 21:54   ` Dave Jiang
2022-11-16 15:19   ` Jonathan Cameron [this message]
2022-11-17  0:47     ` Ira Weiny
2022-11-17 10:43       ` Jonathan Cameron
2022-11-18 23:26         ` Ira Weiny
2022-11-21 10:47           ` Jonathan Cameron
2022-11-28 23:30             ` Ira Weiny
2022-11-29 12:26               ` Jonathan Cameron
2022-11-30  5:09                 ` Ira Weiny
2022-11-30 14:05                   ` Jonathan Cameron
2022-11-10 18:57 ` [PATCH 03/11] cxl/mem: Implement Clear " ira.weiny
2022-11-15 22:09   ` Dave Jiang
2022-11-16 15:24   ` Jonathan Cameron
2022-11-16 15:45     ` Jonathan Cameron
2022-11-17  1:12       ` Ira Weiny
2022-11-17  1:07     ` Ira Weiny
2022-11-10 18:57 ` [PATCH 04/11] cxl/mem: Clear events on driver load ira.weiny
2022-11-15 22:10   ` Dave Jiang
2022-11-10 18:57 ` [PATCH 05/11] cxl/mem: Trace General Media Event Record ira.weiny
2022-11-15 22:25   ` Dave Jiang
2022-11-16 15:31   ` Jonathan Cameron
2022-11-17  1:18     ` Ira Weiny
2022-11-10 18:57 ` [PATCH 06/11] cxl/mem: Trace DRAM " ira.weiny
2022-11-15 22:26   ` Dave Jiang
2022-11-10 18:57 ` [PATCH 07/11] cxl/mem: Trace Memory Module " ira.weiny
2022-11-15 22:39   ` Dave Jiang
2022-11-16 15:35   ` Jonathan Cameron
2022-11-17  1:23     ` Ira Weiny
2022-11-17 11:22       ` Jonathan Cameron
2022-11-30  9:30         ` Ira Weiny
2022-11-22 22:36   ` Steven Rostedt
2022-11-10 18:57 ` [PATCH 08/11] cxl/mem: Wire up event interrupts ira.weiny
2022-11-15 23:13   ` Dave Jiang
2022-11-17  1:38     ` Ira Weiny
2022-11-16 14:40   ` Jonathan Cameron
2022-11-30  9:11     ` Ira Weiny
2022-11-10 18:57 ` [PATCH 09/11] cxl/test: Add generic mock events ira.weiny
2022-11-16 16:00   ` Jonathan Cameron
2022-11-29 18:29     ` Ira Weiny
2022-11-10 18:57 ` [PATCH 10/11] cxl/test: Add specific events ira.weiny
2022-11-16 16:08   ` Jonathan Cameron
2022-11-10 18:57 ` [PATCH 11/11] cxl/test: Simulate event log overflow ira.weiny
2022-11-16 16:10   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221116151936.0000662f@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=bwidawsk@kernel.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave@stgolabs.net \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox