linux-acpi.vger.kernel.org archive mirror
* [PATCH v9 1/3] aerdrv: Trace Event for AER
@ 2013-01-03 22:34 Lance Ortiz
  2013-01-03 22:34 ` [PATCH v9 2/3] aerdrv: Enhanced AER logging Lance Ortiz
  2013-01-03 22:34 ` [PATCH v9 3/3] aerdrv: Cleanup log output for AER Lance Ortiz
  0 siblings, 2 replies; 6+ messages in thread
From: Lance Ortiz @ 2013-01-03 22:34 UTC
  To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

This header file defines a new trace event that is triggered when
an AER event occurs.  The following data is provided to the trace
event.

char * dev_name - The name of the slot where the device resides
                  ([domain:]bus:device.function).

u32 status - Either the correctable or uncorrectable register
             indicating what error or errors have been seen.

u8 severity - error severity 0:NONFATAL 1:FATAL 2:CORRECTED

The trace event will also provide a trace string that may look like:

"0000:05:00.0 PCIe Bus Error: severity=Uncorrected, Poisoned TLP"

v1-v2 Move header from include/ras/aer_event.h to
include/trace/events/ras.h
v3-v4 Cleaned up comments and commit header
v4-v5 More cleanup remove () from if statement in print.
      Renamed string define to be more specific.
v5-v6 change TRACE_SYSTEM define to be ras and not aer.
Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
---

 include/trace/events/ras.h |   77 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 77 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/ras.h

diff --git a/include/trace/events/ras.h b/include/trace/events/ras.h
new file mode 100644
index 0000000..88b8783
--- /dev/null
+++ b/include/trace/events/ras.h
@@ -0,0 +1,77 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM ras
+
+#if !defined(_TRACE_AER_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_AER_H
+
+#include <linux/tracepoint.h>
+#include <linux/edac.h>
+
+
+/*
+ * PCIe AER Trace event
+ *
+ * These events are generated when hardware detects a corrected or
+ * uncorrected event on a PCIe device. The event report has
+ * the following structure:
+ *
+ * char * dev_name -	The name of the slot where the device resides
+ *			([domain:]bus:device.function).
+ * u32 status -		Either the correctable or uncorrectable register
+ *			indicating what error or errors have been seen
+ * u8 severity -	error severity 0:NONFATAL 1:FATAL 2:CORRECTED
+ */
+
+#define aer_correctable_errors		\
+	{BIT(0),	"Receiver Error"},		\
+	{BIT(6),	"Bad TLP"},			\
+	{BIT(7),	"Bad DLLP"},			\
+	{BIT(8),	"REPLAY_NUM Rollover"},		\
+	{BIT(12),	"Replay Timer Timeout"},	\
+	{BIT(13),	"Advisory Non-Fatal"}
+
+#define aer_uncorrectable_errors		\
+	{BIT(4),	"Data Link Protocol"},		\
+	{BIT(12),	"Poisoned TLP"},		\
+	{BIT(13),	"Flow Control Protocol"},	\
+	{BIT(14),	"Completion Timeout"},		\
+	{BIT(15),	"Completer Abort"},		\
+	{BIT(16),	"Unexpected Completion"},	\
+	{BIT(17),	"Receiver Overflow"},		\
+	{BIT(18),	"Malformed TLP"},		\
+	{BIT(19),	"ECRC"},			\
+	{BIT(20),	"Unsupported Request"}
+
+TRACE_EVENT(aer_event,
+	TP_PROTO(const char *dev_name,
+		 const u32 status,
+		 const u8 severity),
+
+	TP_ARGS(dev_name, status, severity),
+
+	TP_STRUCT__entry(
+		__string(	dev_name,	dev_name	)
+		__field(	u32,		status		)
+		__field(	u8,		severity	)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name);
+		__entry->status		= status;
+		__entry->severity	= severity;
+	),
+
+	TP_printk("%s PCIe Bus Error: severity=%s, %s\n",
+		__get_str(dev_name),
+		__entry->severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
+			__entry->severity == HW_EVENT_ERR_FATAL ?
+			"Fatal" : "Uncorrected",
+		__entry->severity == HW_EVENT_ERR_CORRECTED ?
+		__print_flags(__entry->status, "|", aer_correctable_errors) :
+		__print_flags(__entry->status, "|", aer_uncorrectable_errors))
+);
+
+#endif /* _TRACE_AER_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
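
For readers who want to see what the TP_printk() format in this header
produces, here is a minimal user-space sketch. It is not kernel code:
decode_flags() and format_aer_event() are hypothetical stand-ins for
__print_flags() and the tracepoint's print step, the SEV_* constants are
assumed from the severity encoding given in the commit message
(0:NONFATAL 1:FATAL 2:CORRECTED), and status bits that are not in a table
are simply ignored here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BIT(n) (1u << (n))
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed severity encoding, taken from the commit message. */
enum { SEV_NONFATAL = 0, SEV_FATAL = 1, SEV_CORRECTED = 2 };

struct flag_name {
	uint32_t mask;
	const char *name;
};

/* Bit/name pairs following aer_correctable_errors. */
static const struct flag_name correctable[] = {
	{BIT(0), "Receiver Error"},
	{BIT(6), "Bad TLP"},
	{BIT(7), "Bad DLLP"},
	{BIT(8), "REPLAY_NUM Rollover"},
	{BIT(12), "Replay Timer Timeout"},
	{BIT(13), "Advisory Non-Fatal"},
};

/* Bit/name pairs following aer_uncorrectable_errors. */
static const struct flag_name uncorrectable[] = {
	{BIT(4), "Data Link Protocol"},
	{BIT(12), "Poisoned TLP"},
	{BIT(13), "Flow Control Protocol"},
	{BIT(14), "Completion Timeout"},
	{BIT(15), "Completer Abort"},
	{BIT(16), "Unexpected Completion"},
	{BIT(17), "Receiver Overflow"},
	{BIT(18), "Malformed TLP"},
	{BIT(19), "ECRC"},
	{BIT(20), "Unsupported Request"},
};

/* Join the names of all set bits with "|" (hypothetical helper). */
static void decode_flags(uint32_t status, const struct flag_name *tbl,
			 size_t n, char *out, size_t len)
{
	size_t i;

	out[0] = '\0';
	for (i = 0; i < n; i++) {
		if (!(status & tbl[i].mask))
			continue;
		if (out[0] != '\0')
			strncat(out, "|", len - strlen(out) - 1);
		strncat(out, tbl[i].name, len - strlen(out) - 1);
	}
}

/* Build the trace string, without the trailing newline. */
static int format_aer_event(char *out, size_t len, const char *dev_name,
			    uint32_t status, uint8_t severity)
{
	char flags[256];
	int corrected = (severity == SEV_CORRECTED);

	if (corrected)
		decode_flags(status, correctable, ARRAY_SIZE(correctable),
			     flags, sizeof(flags));
	else
		decode_flags(status, uncorrectable, ARRAY_SIZE(uncorrectable),
			     flags, sizeof(flags));

	return snprintf(out, len, "%s PCIe Bus Error: severity=%s, %s",
			dev_name,
			corrected ? "Corrected" :
			severity == SEV_FATAL ? "Fatal" : "Uncorrected",
			flags);
}
```

Calling format_aer_event() with device name "0000:05:00.0", status
BIT(12) and severity SEV_NONFATAL fills the buffer with
"0000:05:00.0 PCIe Bus Error: severity=Uncorrected, Poisoned TLP".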

* [PATCH v9 2/3] aerdrv: Enhanced AER logging
  2013-01-03 22:34 [PATCH v9 1/3] aerdrv: Trace Event for AER Lance Ortiz
@ 2013-01-03 22:34 ` Lance Ortiz
  2013-01-04 22:07   ` Bjorn Helgaas
  2013-01-03 22:34 ` [PATCH v9 3/3] aerdrv: Cleanup log output for AER Lance Ortiz
  1 sibling, 1 reply; 6+ messages in thread
From: Lance Ortiz @ 2013-01-03 22:34 UTC
  To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

This patch provides a more reliable and easier way for user-space
applications to access AER logs than reading them from the kernel
message buffer. It also provides a way to notify user-space when an AER
event occurs.

The aer driver is updated to generate an 'aer_event' trace event
when a PCIe error is reported over the AER interface.  The trace event is
added to both the interrupt-based aer path and the firmware-first path.
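
As an illustration of the user-space side, the following sketch splits
the message portion of an aer_event record into its fields. It assumes
the text follows the TP_printk() format from patch 1/3
("<dev> PCIe Bus Error: severity=<sev>, <errors>"); struct aer_msg and
parse_aer_msg() are hypothetical names, and any per-record header that
the tracing infrastructure prints before the message is expected to have
been stripped already.

```c
#include <stdio.h>
#include <string.h>

/* Fields recovered from one aer_event message (hypothetical layout). */
struct aer_msg {
	char dev[32];		/* e.g. "0000:05:00.0" */
	char severity[16];	/* "Corrected", "Fatal" or "Uncorrected" */
	char errors[128];	/* "|"-separated error names */
};

/* Returns 0 on success, -1 if the text does not match the layout. */
static int parse_aer_msg(const char *line, struct aer_msg *m)
{
	int n = sscanf(line,
		       "%31s PCIe Bus Error: severity=%15[^,], %127[^\n]",
		       m->dev, m->severity, m->errors);

	return n == 3 ? 0 : -1;
}
```

A monitoring tool could then match on the severity field instead of
scraping the kernel log for free-form text.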

v1-v2 fix compile errors in ifdefs.
v2-v3 Update to new location of trace header. Update print to remove
warning.
v3-v4 Reworked logic when getting ready to call cper_print_aer
v6-v7 Change print from pr_info to pr_err if !dev
v7-v8 Add pfx argument back into call to cper_print_aer()

Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
---

 drivers/acpi/apei/cper.c               |   19 ++++++++++++++++---
 drivers/pci/pcie/aer/aerdrv_errprint.c |    9 ++++++++-
 include/linux/aer.h                    |    4 ++--
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
index e6defd8..1e5d8a4 100644
--- a/drivers/acpi/apei/cper.c
+++ b/drivers/acpi/apei/cper.c
@@ -29,6 +29,7 @@
 #include <linux/time.h>
 #include <linux/cper.h>
 #include <linux/acpi.h>
+#include <linux/pci.h>
 #include <linux/aer.h>
 
 /*
@@ -249,6 +250,10 @@ static const char *cper_pcie_port_type_strs[] = {
 static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
 			    const struct acpi_hest_generic_data *gdata)
 {
+#ifdef CONFIG_ACPI_APEI_PCIEAER
+	struct pci_dev *dev;
+#endif
+
 	if (pcie->validation_bits & CPER_PCIE_VALID_PORT_TYPE)
 		printk("%s""port_type: %d, %s\n", pfx, pcie->port_type,
 		       pcie->port_type < ARRAY_SIZE(cper_pcie_port_type_strs) ?
@@ -281,10 +286,18 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
 	"%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
 	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
 #ifdef CONFIG_ACPI_APEI_PCIEAER
-	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) {
-		struct aer_capability_regs *aer_regs = (void *)pcie->aer_info;
-		cper_print_aer(pfx, gdata->error_severity, aer_regs);
+	dev = pci_get_domain_bus_and_slot(pcie->device_id.segment,
+			pcie->device_id.bus, pcie->device_id.function);
+	if (!dev) {
+		pr_err("PCI AER Cannot get PCI device %04x:%02x:%02x.%d\n",
+			pcie->device_id.segment, pcie->device_id.bus,
+			pcie->device_id.slot, pcie->device_id.function);
+		return;
 	}
+	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO)
+		cper_print_aer(pfx, dev, gdata->error_severity,
+				(struct aer_capability_regs *) pcie->aer_info);
+	pci_dev_put(dev);
 #endif
 }
 
diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index 3ea5173..d3e5fc5 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -23,6 +23,9 @@
 
 #include "aerdrv.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/ras.h>
+
 #define AER_AGENT_RECEIVER		0
 #define AER_AGENT_REQUESTER		1
 #define AER_AGENT_COMPLETER		2
@@ -194,6 +197,8 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 	if (info->id && info->error_dev_num > 1 && info->id == id)
 		printk("%s""  Error of this Agent(%04x) is reported first\n",
 			prefix, id);
+	trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
+			info->severity);
 }
 
 void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
@@ -217,7 +222,7 @@ int cper_severity_to_aer(int cper_severity)
 }
 EXPORT_SYMBOL_GPL(cper_severity_to_aer);
 
-void cper_print_aer(const char *prefix, int cper_severity,
+void cper_print_aer(const char *prefix, struct pci_dev *dev, int cper_severity,
 		    struct aer_capability_regs *aer)
 {
 	int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
@@ -259,5 +264,7 @@ void cper_print_aer(const char *prefix, int cper_severity,
 			*(tlp + 8), *(tlp + 15), *(tlp + 14),
 			*(tlp + 13), *(tlp + 12));
 	}
+	trace_aer_event(dev_name(&dev->dev), (status & ~mask),
+			aer_severity);
 }
 #endif
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 544abdb..ec10e1b 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -49,8 +49,8 @@ static inline int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 }
 #endif
 
-extern void cper_print_aer(const char *prefix, int cper_severity,
-			   struct aer_capability_regs *aer);
+extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
+			   int cper_severity, struct aer_capability_regs *aer);
 extern int cper_severity_to_aer(int cper_severity);
 extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
 			      int severity);
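
A note on the device lookup used in cper_print_pcie():
pci_get_domain_bus_and_slot() takes a combined device/function number
(devfn) as its third argument, the same encoding that
aer_recover_queue() takes. The sketch below mirrors the kernel's
PCI_DEVFN, PCI_SLOT and PCI_FUNC macros in user space; format_bdf() is a
hypothetical helper. A value that carries only the function number
decodes as device 0, so a caller that holds the device and function
numbers separately would presumably need to combine them first.

```c
#include <stdio.h>

/* Same encoding as the kernel macros: 5 bits of device, 3 of function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Format [domain:]bus:device.function from a combined devfn. */
static int format_bdf(char *out, size_t len, unsigned int domain,
		      unsigned int bus, unsigned int devfn)
{
	return snprintf(out, len, "%04x:%02x:%02x.%d", domain, bus,
			(unsigned int)PCI_SLOT(devfn), (int)PCI_FUNC(devfn));
}
```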

* [PATCH v9 3/3] aerdrv: Cleanup log output for AER
  2013-01-03 22:34 [PATCH v9 1/3] aerdrv: Trace Event for AER Lance Ortiz
  2013-01-03 22:34 ` [PATCH v9 2/3] aerdrv: Enhanced AER logging Lance Ortiz
@ 2013-01-03 22:34 ` Lance Ortiz
  2013-01-04 22:07   ` Bjorn Helgaas
  1 sibling, 1 reply; 6+ messages in thread
From: Lance Ortiz @ 2013-01-03 22:34 UTC
  To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

These changes make cper_print_aer more consistent with aer_print_error
and clean things up by eliminating the use of the prefix variable and
replacing it with dev_err.

v1-v2 fix some compile errors within the #ifdef
v3-v4 remove agent id stuff and kept print the same to avoid
compatibility issues
v7-v8 Updated to use dev_printk instead of prefix. Changed
log levels to KERN_ERR as per Mauro's suggestion.
v8-v9 Changed dev_printk to dev_err since all log levels are KERN_ERR.

Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
Acked-by: Tony Luck <tony.luck@intel.com>
---

 drivers/pci/pcie/aer/aerdrv_errprint.c |   54 +++++++++++++++-----------------
 1 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index d3e5fc5..5ab1425 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -124,12 +124,11 @@ static const char *aer_agent_string[] = {
 	"Transmitter ID"
 };
 
-static void __aer_print_error(const char *prefix,
+static void __aer_print_error(struct pci_dev *dev,
 			      struct aer_err_info *info)
 {
 	int i, status;
 	const char *errmsg = NULL;
-
 	status = (info->status & ~info->mask);
 
 	for (i = 0; i < 32; i++) {
@@ -144,26 +143,22 @@ static void __aer_print_error(const char *prefix,
 				aer_uncorrectable_error_string[i] : NULL;
 
 		if (errmsg)
-			printk("%s""   [%2d] %-22s%s\n", prefix, i, errmsg,
+			dev_err(&dev->dev, "   [%2d] %-22s%s\n", i, errmsg,
 				info->first_error == i ? " (First)" : "");
 		else
-			printk("%s""   [%2d] Unknown Error Bit%s\n", prefix, i,
-				info->first_error == i ? " (First)" : "");
+			dev_err(&dev->dev, "   [%2d] Unknown Error Bit%s\n",
+				i, info->first_error == i ? " (First)" : "");
 	}
 }
 
 void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 {
 	int id = ((dev->bus->number << 8) | dev->devfn);
-	char prefix[44];
-
-	snprintf(prefix, sizeof(prefix), "%s%s %s: ",
-		 (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR,
-		 dev_driver_string(&dev->dev), dev_name(&dev->dev));
 
 	if (info->status == 0) {
-		printk("%s""PCIe Bus Error: severity=%s, type=Unaccessible, "
-			"id=%04x(Unregistered Agent ID)\n", prefix,
+		dev_err(&dev->dev,
+			"PCIe Bus Error: severity=%s, type=Unaccessible, "
+			"id=%04x(Unregistered Agent ID)\n",
 			aer_error_severity_string[info->severity], id);
 	} else {
 		int layer, agent;
@@ -171,22 +166,24 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 		layer = AER_GET_LAYER_ERROR(info->severity, info->status);
 		agent = AER_GET_AGENT(info->severity, info->status);
 
-		printk("%s""PCIe Bus Error: severity=%s, type=%s, id=%04x(%s)\n",
-			prefix, aer_error_severity_string[info->severity],
+		dev_err(&dev->dev,
+			"PCIe Bus Error: severity=%s, type=%s, id=%04x(%s)\n",
+			aer_error_severity_string[info->severity],
 			aer_error_layer[layer], id, aer_agent_string[agent]);
 
-		printk("%s""  device [%04x:%04x] error status/mask=%08x/%08x\n",
-			prefix, dev->vendor, dev->device,
+		dev_err(&dev->dev,
+			"  device [%04x:%04x] error status/mask=%08x/%08x\n",
+			dev->vendor, dev->device,
 			info->status, info->mask);
 
-		__aer_print_error(prefix, info);
+		__aer_print_error(dev, info);
 
 		if (info->tlp_header_valid) {
 			unsigned char *tlp = (unsigned char *) &info->tlp;
-			printk("%s""  TLP Header:"
+			dev_err(&dev->dev, "  TLP Header:"
 				" %02x%02x%02x%02x %02x%02x%02x%02x"
 				" %02x%02x%02x%02x %02x%02x%02x%02x\n",
-				prefix, *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+				*(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
 				*(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
 				*(tlp + 11), *(tlp + 10), *(tlp + 9),
 				*(tlp + 8), *(tlp + 15), *(tlp + 14),
@@ -195,8 +192,9 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 	}
 
 	if (info->id && info->error_dev_num > 1 && info->id == id)
-		printk("%s""  Error of this Agent(%04x) is reported first\n",
-			prefix, id);
+		dev_err(&dev->dev,
+			   "  Error of this Agent(%04x) is reported first\n",
+			id);
 	trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
 			info->severity);
 }
@@ -244,21 +242,21 @@ void cper_print_aer(const char *prefix, struct pci_dev *dev, int cper_severity,
 	}
 	layer = AER_GET_LAYER_ERROR(aer_severity, status);
 	agent = AER_GET_AGENT(aer_severity, status);
-	printk("%s""aer_status: 0x%08x, aer_mask: 0x%08x\n",
-	       prefix, status, mask);
+	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
+	       status, mask);
 	cper_print_bits(prefix, status, status_strs, status_strs_size);
-	printk("%s""aer_layer=%s, aer_agent=%s\n", prefix,
+	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
 	       aer_error_layer[layer], aer_agent_string[agent]);
 	if (aer_severity != AER_CORRECTABLE)
-		printk("%s""aer_uncor_severity: 0x%08x\n",
-		       prefix, aer->uncor_severity);
+		dev_err(&dev->dev, "aer_uncor_severity: 0x%08x\n",
+		       aer->uncor_severity);
 	if (tlp_header_valid) {
 		const unsigned char *tlp;
 		tlp = (const unsigned char *)&aer->header_log;
-		printk("%s""aer_tlp_header:"
+		dev_err(&dev->dev, "aer_tlp_header:"
 			" %02x%02x%02x%02x %02x%02x%02x%02x"
 			" %02x%02x%02x%02x %02x%02x%02x%02x\n",
-			prefix, *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+			*(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
 			*(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
 			*(tlp + 11), *(tlp + 10), *(tlp + 9),
 			*(tlp + 8), *(tlp + 15), *(tlp + 14),
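
For reference, the byte order used by both TLP header prints can be
sketched in user space as follows. Within each group of four bytes the
highest-addressed byte is printed first, so on a little-endian host each
group reads as the hex value of the corresponding 32-bit word.
format_tlp_header() is a hypothetical helper name, not part of the
driver.

```c
#include <stdio.h>

/* Print 16 header bytes as four groups, bytes 3..0 of each group. */
static int format_tlp_header(char *out, size_t len, const unsigned char *tlp)
{
	return snprintf(out, len,
			"%02x%02x%02x%02x %02x%02x%02x%02x"
			" %02x%02x%02x%02x %02x%02x%02x%02x",
			tlp[3], tlp[2], tlp[1], tlp[0],
			tlp[7], tlp[6], tlp[5], tlp[4],
			tlp[11], tlp[10], tlp[9], tlp[8],
			tlp[15], tlp[14], tlp[13], tlp[12]);
}
```

Feeding it the bytes 0x00 through 0x0f in memory order yields
"03020100 07060504 0b0a0908 0f0e0d0c".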

* Re: [PATCH v9 3/3] aerdrv: Cleanup log output for AER
  2013-01-03 22:34 ` [PATCH v9 3/3] aerdrv: Cleanup log output for AER Lance Ortiz
@ 2013-01-04 22:07   ` Bjorn Helgaas
  2013-01-04 22:22     ` Luck, Tony
  0 siblings, 1 reply; 6+ messages in thread
From: Bjorn Helgaas @ 2013-01-04 22:07 UTC
  To: Lance Ortiz
  Cc: lance_ortiz, jiang.liu, tony.luck, bp, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

On Thu, Jan 3, 2013 at 3:34 PM, Lance Ortiz <lance.ortiz@hp.com> wrote:
> These changes make cper_print_aer more consistent with aer_print_error
> and clean things up by elimiating the use of the prefix variable and
> replacing it with dev_printk.
>
> v1-v2 fix some compile errors withinn the #ifdef
> v3-v4 remove agent id stuff and kept print the same to avoid
> compatibility issues
> v7-v8 Updated to use dev_printk instated of prefix. Changed
> log levels to KERN_ERR as per Mauro's suggestion.
> v8-v9 Changed dev_printk to dev_err since all log levels are KERN_ERR.
>
> Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
> Acked-by: Tony Luck <tony.luck@intel.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Nits (no need to fix them unless you have other reasons to touch this patch):

s/elimiating/eliminating/ above.

I remove the "v1-v2" notes when I merge patches because I don't think
they're useful any more.  But if Tony applies these, he can use his
judgment.

And a couple more below.

> ---
>
>  drivers/pci/pcie/aer/aerdrv_errprint.c |   54 +++++++++++++++-----------------
>  1 files changed, 26 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> index d3e5fc5..5ab1425 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -124,12 +124,11 @@ static const char *aer_agent_string[] = {
>         "Transmitter ID"
>  };
>
> -static void __aer_print_error(const char *prefix,
> +static void __aer_print_error(struct pci_dev *dev,
>                               struct aer_err_info *info)

This looks like it would fit on one line.

>  {
>         int i, status;
>         const char *errmsg = NULL;
> -

Don't remove this blank line.

>         status = (info->status & ~info->mask);
>
>         for (i = 0; i < 32; i++) {
> @@ -144,26 +143,22 @@ static void __aer_print_error(const char *prefix,
>                                 aer_uncorrectable_error_string[i] : NULL;
>
>                 if (errmsg)
> -                       printk("%s""   [%2d] %-22s%s\n", prefix, i, errmsg,
> +                       dev_err(&dev->dev, "   [%2d] %-22s%s\n", i, errmsg,
>                                 info->first_error == i ? " (First)" : "");
>                 else
> -                       printk("%s""   [%2d] Unknown Error Bit%s\n", prefix, i,
> -                               info->first_error == i ? " (First)" : "");
> +                       dev_err(&dev->dev, "   [%2d] Unknown Error Bit%s\n",
> +                               i, info->first_error == i ? " (First)" : "");
>         }
>  }
>
>  void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>  {
>         int id = ((dev->bus->number << 8) | dev->devfn);
> -       char prefix[44];
> -
> -       snprintf(prefix, sizeof(prefix), "%s%s %s: ",
> -                (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR,
> -                dev_driver_string(&dev->dev), dev_name(&dev->dev));
>
>         if (info->status == 0) {
> -               printk("%s""PCIe Bus Error: severity=%s, type=Unaccessible, "
> -                       "id=%04x(Unregistered Agent ID)\n", prefix,
> +               dev_err(&dev->dev,
> +                       "PCIe Bus Error: severity=%s, type=Unaccessible, "
> +                       "id=%04x(Unregistered Agent ID)\n",
>                         aer_error_severity_string[info->severity], id);

This means these messages will all be KERN_ERR, when they used to be
KERN_WARNING or KERN_ERR depending on the severity.  That's OK with
me, but mentioning in the changelog would make clear that this is what
you intended.
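
To make the behaviour change concrete: the removed snprintf() selected
KERN_WARNING for correctable errors and KERN_ERR for everything else,
whereas dev_err() always logs at KERN_ERR. A small user-space sketch of
the old selection follows; aer_log_level() and the LEVEL_* names are
hypothetical, the numeric levels are the kernel's values for KERN_ERR
and KERN_WARNING, and the severity values are assumed from the encoding
given in patch 1/3.

```c
/* Assumed AER severity encoding (0:NONFATAL 1:FATAL 2:CORRECTED). */
enum { AER_NONFATAL = 0, AER_FATAL = 1, AER_CORRECTABLE = 2 };

/* Kernel log levels: KERN_ERR is level 3, KERN_WARNING is level 4. */
#define LEVEL_ERR	3
#define LEVEL_WARNING	4

/* Old behaviour: warnings for correctable errors, errors otherwise. */
static int aer_log_level(int severity)
{
	return severity == AER_CORRECTABLE ? LEVEL_WARNING : LEVEL_ERR;
}
```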

>         } else {
>                 int layer, agent;
> @@ -171,22 +166,24 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>                 layer = AER_GET_LAYER_ERROR(info->severity, info->status);
>                 agent = AER_GET_AGENT(info->severity, info->status);
>
> -               printk("%s""PCIe Bus Error: severity=%s, type=%s, id=%04x(%s)\n",
> -                       prefix, aer_error_severity_string[info->severity],
> +               dev_err(&dev->dev,
> +                       "PCIe Bus Error: severity=%s, type=%s, id=%04x(%s)\n",
> +                       aer_error_severity_string[info->severity],
>                         aer_error_layer[layer], id, aer_agent_string[agent]);
>
> -               printk("%s""  device [%04x:%04x] error status/mask=%08x/%08x\n",
> -                       prefix, dev->vendor, dev->device,
> +               dev_err(&dev->dev,
> +                       "  device [%04x:%04x] error status/mask=%08x/%08x\n",
> +                       dev->vendor, dev->device,
>                         info->status, info->mask);
>
> -               __aer_print_error(prefix, info);
> +               __aer_print_error(dev, info);
>
>                 if (info->tlp_header_valid) {
>                         unsigned char *tlp = (unsigned char *) &info->tlp;
> -                       printk("%s""  TLP Header:"
> +                       dev_err(&dev->dev, "  TLP Header:"
>                                 " %02x%02x%02x%02x %02x%02x%02x%02x"
>                                 " %02x%02x%02x%02x %02x%02x%02x%02x\n",
> -                               prefix, *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
> +                               *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
>                                 *(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
>                                 *(tlp + 11), *(tlp + 10), *(tlp + 9),
>                                 *(tlp + 8), *(tlp + 15), *(tlp + 14),
> @@ -195,8 +192,9 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>         }
>
>         if (info->id && info->error_dev_num > 1 && info->id == id)
> -               printk("%s""  Error of this Agent(%04x) is reported first\n",
> -                       prefix, id);
> +               dev_err(&dev->dev,
> +                          "  Error of this Agent(%04x) is reported first\n",
> +                       id);
>         trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
>                         info->severity);
>  }
> @@ -244,21 +242,21 @@ void cper_print_aer(const char *prefix, struct pci_dev *dev, int cper_severity,
>         }
>         layer = AER_GET_LAYER_ERROR(aer_severity, status);
>         agent = AER_GET_AGENT(aer_severity, status);
> -       printk("%s""aer_status: 0x%08x, aer_mask: 0x%08x\n",
> -              prefix, status, mask);
> +       dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
> +              status, mask);
>         cper_print_bits(prefix, status, status_strs, status_strs_size);
> -       printk("%s""aer_layer=%s, aer_agent=%s\n", prefix,
> +       dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
>                aer_error_layer[layer], aer_agent_string[agent]);
>         if (aer_severity != AER_CORRECTABLE)
> -               printk("%s""aer_uncor_severity: 0x%08x\n",
> -                      prefix, aer->uncor_severity);
> +               dev_err(&dev->dev, "aer_uncor_severity: 0x%08x\n",
> +                      aer->uncor_severity);
>         if (tlp_header_valid) {
>                 const unsigned char *tlp;
>                 tlp = (const unsigned char *)&aer->header_log;
> -               printk("%s""aer_tlp_header:"
> +               dev_err(&dev->dev, "aer_tlp_header:"
>                         " %02x%02x%02x%02x %02x%02x%02x%02x"
>                         " %02x%02x%02x%02x %02x%02x%02x%02x\n",
> -                       prefix, *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
> +                       *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
>                         *(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
>                         *(tlp + 11), *(tlp + 10), *(tlp + 9),
>                         *(tlp + 8), *(tlp + 15), *(tlp + 14),
>

* Re: [PATCH v9 2/3] aerdrv: Enhanced AER logging
  2013-01-03 22:34 ` [PATCH v9 2/3] aerdrv: Enhanced AER logging Lance Ortiz
@ 2013-01-04 22:07   ` Bjorn Helgaas
  0 siblings, 0 replies; 6+ messages in thread
From: Bjorn Helgaas @ 2013-01-04 22:07 UTC
  To: Lance Ortiz
  Cc: lance_ortiz, jiang.liu, tony.luck, bp, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

On Thu, Jan 3, 2013 at 3:34 PM, Lance Ortiz <lance.ortiz@hp.com> wrote:
> This patch will provide a more reliable and easy way for user-space
> applications to have access to AER logs rather than reading them from the
> message buffer. It also provides a way to notify user-space when an AER
> event occurs.
>
> The aer driver is updated to generate a trace event of function 'aer_event'
> when a PCIe error is reported over the AER interface.  The trace event was
> added to both the interrupt based aer path and the firmware first path.
>
> v1-v2 fix compile errors in ifdefs.
> v2-v3 Update to new location of trace header. Update print to remove
> warning.
> v3-v4 Reworked logic when getting ready to call cper_print_aer
> v6-v7 Change print from pr_info to pr_err if !dev
> v7-v8 Add pfx argument back into call to cper_print_aer()
>
> Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> Acked-by: Tony Luck <tony.luck@intel.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>
>  drivers/acpi/apei/cper.c               |   19 ++++++++++++++++---
>  drivers/pci/pcie/aer/aerdrv_errprint.c |    9 ++++++++-
>  include/linux/aer.h                    |    4 ++--
>  3 files changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
> index e6defd8..1e5d8a4 100644
> --- a/drivers/acpi/apei/cper.c
> +++ b/drivers/acpi/apei/cper.c
> @@ -29,6 +29,7 @@
>  #include <linux/time.h>
>  #include <linux/cper.h>
>  #include <linux/acpi.h>
> +#include <linux/pci.h>
>  #include <linux/aer.h>
>
>  /*
> @@ -249,6 +250,10 @@ static const char *cper_pcie_port_type_strs[] = {
>  static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>                             const struct acpi_hest_generic_data *gdata)
>  {
> +#ifdef CONFIG_ACPI_APEI_PCIEAER
> +       struct pci_dev *dev;
> +#endif
> +
>         if (pcie->validation_bits & CPER_PCIE_VALID_PORT_TYPE)
>                 printk("%s""port_type: %d, %s\n", pfx, pcie->port_type,
>                        pcie->port_type < ARRAY_SIZE(cper_pcie_port_type_strs) ?
> @@ -281,10 +286,18 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>         "%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
>         pfx, pcie->bridge.secondary_status, pcie->bridge.control);
>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> -       if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) {
> -               struct aer_capability_regs *aer_regs = (void *)pcie->aer_info;
> -               cper_print_aer(pfx, gdata->error_severity, aer_regs);
> +       dev = pci_get_domain_bus_and_slot(pcie->device_id.segment,
> +                       pcie->device_id.bus, pcie->device_id.function);
> +       if (!dev) {
> +               pr_err("PCI AER Cannot get PCI device %04x:%02x:%02x.%d\n",
> +                       pcie->device_id.segment, pcie->device_id.bus,
> +                       pcie->device_id.slot, pcie->device_id.function);
> +               return;
>         }
> +       if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO)
> +               cper_print_aer(pfx, dev, gdata->error_severity,
> +                               (struct aer_capability_regs *) pcie->aer_info);
> +       pci_dev_put(dev);
>  #endif
>  }
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> index 3ea5173..d3e5fc5 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -23,6 +23,9 @@
>
>  #include "aerdrv.h"
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/ras.h>
> +
>  #define AER_AGENT_RECEIVER             0
>  #define AER_AGENT_REQUESTER            1
>  #define AER_AGENT_COMPLETER            2
> @@ -194,6 +197,8 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>         if (info->id && info->error_dev_num > 1 && info->id == id)
>                 printk("%s""  Error of this Agent(%04x) is reported first\n",
>                         prefix, id);
> +       trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
> +                       info->severity);
>  }
>
>  void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
> @@ -217,7 +222,7 @@ int cper_severity_to_aer(int cper_severity)
>  }
>  EXPORT_SYMBOL_GPL(cper_severity_to_aer);
>
> -void cper_print_aer(const char *prefix, int cper_severity,
> +void cper_print_aer(const char *prefix, struct pci_dev *dev, int cper_severity,
>                     struct aer_capability_regs *aer)
>  {
>         int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
> @@ -259,5 +264,7 @@ void cper_print_aer(const char *prefix, int cper_severity,
>                         *(tlp + 8), *(tlp + 15), *(tlp + 14),
>                         *(tlp + 13), *(tlp + 12));
>         }
> +       trace_aer_event(dev_name(&dev->dev), (status & ~mask),
> +                       aer_severity);
>  }
>  #endif
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index 544abdb..ec10e1b 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -49,8 +49,8 @@ static inline int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
>  }
>  #endif
>
> -extern void cper_print_aer(const char *prefix, int cper_severity,
> -                          struct aer_capability_regs *aer);
> +extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
> +                          int cper_severity, struct aer_capability_regs *aer);
>  extern int cper_severity_to_aer(int cper_severity);
>  extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>                               int severity);
>

* RE: [PATCH v9 3/3] aerdrv: Cleanup log output for AER
  2013-01-04 22:07   ` Bjorn Helgaas
@ 2013-01-04 22:22     ` Luck, Tony
  0 siblings, 0 replies; 6+ messages in thread
From: Luck, Tony @ 2013-01-04 22:22 UTC
  To: Bjorn Helgaas, Lance Ortiz
  Cc: lance_ortiz@hotmail.com, jiang.liu@huawei.com, bp@alien8.de,
	rostedt@goodmis.org, mchehab@redhat.com,
	linux-acpi@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org

I sent a "please pull" to Ingo/Peter/Thomas about an hour ago ... if they
push back (or ignore) we can fold your ack and nit-picks into another
version.

> s/elimiating/eliminating/ above.
Ugh ... nobody spotted this one ("many eyes" really does work!)

> I remove the "v1-v2" notes when I merge patches because I don't think
> they're useful any more.  But if Tony applies these, he can use his
> judgment.

Yes - I do too.  In fact it is easier on the maintainer if this sort of meta-commentary
goes *after* the "---" in the patch. Then it is available for review (where it is
most helpful), but tools will automatically drop it when applying the patch.

-Tony
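
As a sketch of that layout, here is how the 3/3 changelog could be
arranged, using only text from the patch itself. The version notes sit
between the "---" line and the diffstat, so they stay visible during
review but do not end up in the commit message when the patch is
applied:

```text
Subject: [PATCH v9 3/3] aerdrv: Cleanup log output for AER

These changes make cper_print_aer more consistent with aer_print_error
and clean things up by eliminating the use of the prefix variable.

Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
Acked-by: Tony Luck <tony.luck@intel.com>
---
v7-v8 Updated to use dev_printk instead of prefix. Changed
log levels to KERN_ERR as per Mauro's suggestion.
v8-v9 Changed dev_printk to dev_err since all log levels are KERN_ERR.

 drivers/pci/pcie/aer/aerdrv_errprint.c |   54 +++++++++++++++-----------------
 1 files changed, 26 insertions(+), 28 deletions(-)
```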
