* Re: [PATCH 1/7 v5] trace, RAS: Add basic RAS trace event
From: Borislav Petkov @ 2014-06-11 18:59 UTC
To: Chen, Gong; +Cc: tony.luck, m.chehab, rostedt, linux-acpi, lkml

On Wed, Jun 11, 2014 at 04:34:45AM -0400, Chen, Gong wrote:
> To avoid confusion and conflict of usage for RAS related trace event,
> add a unified RAS trace event stub.
>
> v5 -> v4: remove explicit RAS menuconfig.
> v4 -> v3: change dependency rule of RAS_TRACE.
> v3 -> v2: fix dependency in Kconfig.
> v2 -> v1: adjust Kconfig to take RAS as a separate subsystem.

Let's simplify it a little - I've dropped RAS_TRACE for now. We can
carve it out later, when needed.

---
From: "Chen, Gong" <gong.chen@linux.intel.com>
Subject: [PATCH 1/7 v5] trace, RAS: Add basic RAS trace event

To avoid confusion and conflict of usage for RAS related trace event,
add a unified RAS trace event stub.

Start a RAS subsystem menu which will be fleshed out in time, when more
features get added to it.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1402475691-30045-2-git-send-email-gong.chen@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/Kconfig        |  2 ++
 drivers/Makefile       |  1 +
 drivers/edac/Kconfig   |  1 +
 drivers/edac/edac_mc.c |  3 ---
 drivers/ras/Kconfig    |  6 ++++++
 drivers/ras/Makefile   |  1 +
 drivers/ras/ras.c      | 12 ++++++++++++
 7 files changed, 23 insertions(+), 3 deletions(-)
 create mode 100644 drivers/ras/Kconfig
 create mode 100644 drivers/ras/Makefile
 create mode 100644 drivers/ras/ras.c

Index: linux/drivers/Kconfig
===================================================================
--- linux.orig/drivers/Kconfig	2014-06-11 17:14:23.782437196 +0200
+++ linux/drivers/Kconfig	2014-06-11 17:14:23.770437196 +0200
@@ -176,4 +176,6 @@ source "drivers/powercap/Kconfig"
 
 source "drivers/mcb/Kconfig"
 
+source "drivers/ras/Kconfig"
+
 endmenu
Index: linux/drivers/Makefile
===================================================================
--- linux.orig/drivers/Makefile	2014-06-11 17:14:23.782437196 +0200
+++ linux/drivers/Makefile	2014-06-11 17:14:23.770437196 +0200
@@ -158,3 +158,4 @@ obj-$(CONFIG_NTB) += ntb/
 obj-$(CONFIG_FMC) += fmc/
 obj-$(CONFIG_POWERCAP) += powercap/
 obj-$(CONFIG_MCB) += mcb/
+obj-$(CONFIG_RAS) += ras/
Index: linux/drivers/edac/Kconfig
===================================================================
--- linux.orig/drivers/edac/Kconfig	2014-06-11 17:14:23.782437196 +0200
+++ linux/drivers/edac/Kconfig	2014-06-11 17:24:18.142427373 +0200
@@ -72,6 +72,7 @@ config EDAC_MCE_INJ
 
 config EDAC_MM_EDAC
 	tristate "Main Memory EDAC (Error Detection And Correction) reporting"
+	select RAS
 	help
 	  Some systems are able to detect and correct errors in main
 	  memory. EDAC can report statistics on memory error
Index: linux/drivers/edac/edac_mc.c
===================================================================
--- linux.orig/drivers/edac/edac_mc.c	2014-06-11 17:14:23.782437196 +0200
+++ linux/drivers/edac/edac_mc.c	2014-06-11 17:14:23.770437196 +0200
@@ -33,9 +33,6 @@
 #include <asm/edac.h>
 #include "edac_core.h"
 #include "edac_module.h"
-
-#define CREATE_TRACE_POINTS
-#define TRACE_INCLUDE_PATH ../../include/ras
 #include <ras/ras_event.h>
 
 /* lock to memory controller's control array */
Index: linux/drivers/ras/Kconfig
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux/drivers/ras/Kconfig	2014-06-11 17:24:00.846427659 +0200
@@ -0,0 +1,2 @@
+config RAS
+	bool
Index: linux/drivers/ras/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux/drivers/ras/Makefile	2014-06-11 17:14:23.774437196 +0200
@@ -0,0 +1 @@
+obj-$(CONFIG_RAS) += ras.o
Index: linux/drivers/ras/ras.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux/drivers/ras/ras.c	2014-06-11 17:14:23.774437196 +0200
@@ -0,0 +1,12 @@
+/*
+ * Copyright (C) 2014 Intel Corporation
+ *
+ * Authors:
+ *	Chen, Gong <gong.chen@linux.intel.com>
+ */
+
+#define CREATE_TRACE_POINTS
+#define TRACE_INCLUDE_PATH ../../include/ras
+#include <ras/ras_event.h>
+
+EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
--

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: [PATCH 2/7 v3] trace, AER: Move trace into unified interface
From: Borislav Petkov @ 2014-06-11 19:00 UTC
To: Chen, Gong; +Cc: tony.luck, m.chehab, rostedt, linux-acpi, lkml

On Wed, Jun 11, 2014 at 04:34:46AM -0400, Chen, Gong wrote:
> AER uses a separate trace interface by now. To make it
> consistent, move it into unified RAS trace interface.
>
> v3 -> v2: change dependency rule of RAS_TRACE.
> v2 -> v1: remove unnecessary dependency in drivers/ras/Kconfig.
>
> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> ---
>  drivers/pci/pcie/aer/Kconfig           |  1 +
>  drivers/pci/pcie/aer/aerdrv_errprint.c |  4 +-
>  include/ras/ras_event.h                | 64 ++++++++++++++++++++++++++++
>  include/trace/events/ras.h             | 77 ----------------------------------
>  4 files changed, 66 insertions(+), 80 deletions(-)
>  delete mode 100644 include/trace/events/ras.h
>
> diff --git a/drivers/pci/pcie/aer/Kconfig b/drivers/pci/pcie/aer/Kconfig
> index 50e94e0..c611384 100644
> --- a/drivers/pci/pcie/aer/Kconfig
> +++ b/drivers/pci/pcie/aer/Kconfig
> @@ -5,6 +5,7 @@
>  config PCIEAER
>  	boolean "Root Port Advanced Error Reporting support"
>  	depends on PCIEPORTBUS
> +	select RAS_TRACE
>  	default y
>  	help
>  	  This enables PCI Express Root Port Advanced Error Reporting

With this hunk changed to

Index: b/drivers/pci/pcie/aer/Kconfig
===================================================================
--- a/drivers/pci/pcie/aer/Kconfig	2014-06-11 17:33:57.298417802 +0200
+++ b/drivers/pci/pcie/aer/Kconfig	2014-06-11 17:34:16.302417487 +0200
@@ -5,6 +5,7 @@
 config PCIEAER
 	boolean "Root Port Advanced Error Reporting support"
 	depends on PCIEPORTBUS
+	select RAS
 	default y
 	help
 	  This enables PCI Express Root Port Advanced Error Reporting
--

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: [PATCH 4/7 v2] RAS, debugfs: Add debugfs interface for RAS subsystem
From: Borislav Petkov @ 2014-06-11 19:01 UTC
To: Chen, Gong; +Cc: tony.luck, m.chehab, rostedt, linux-acpi, lkml

On Wed, Jun 11, 2014 at 04:34:48AM -0400, Chen, Gong wrote:
> Implement a new debugfs interface for RAS subsystem.
> A file named daemon_active is added there accordingly.
> This file is used to track if user space daemon enables
> perf/trace interface or not. One can track which daemon
> opens it via "lsof /path/to/debugfs/ras/daemon_active".
>
> v2 -> v1: Change file access mode from 0444 to 0400.
>
> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> ---
>  drivers/ras/Makefile  |  2 +-
>  drivers/ras/debugfs.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/ras/ras.c     | 14 +++++++++++++
>  include/linux/ras.h   | 15 ++++++++++++++
>  4 files changed, 87 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/ras/debugfs.c
>  create mode 100644 include/linux/ras.h
>
> diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
> index 223e806..d7f7334 100644
> --- a/drivers/ras/Makefile
> +++ b/drivers/ras/Makefile
> @@ -1 +1 @@
> -obj-$(CONFIG_RAS) += ras.o
> +obj-$(CONFIG_RAS) += ras.o debugfs.o
> diff --git a/drivers/ras/debugfs.c b/drivers/ras/debugfs.c
> new file mode 100644
> index 0000000..d0bc389
> --- /dev/null
> +++ b/drivers/ras/debugfs.c
> @@ -0,0 +1,57 @@
> +#include <linux/debugfs.h>
> +
> +struct dentry *ras_debugfs_dir;
> +EXPORT_SYMBOL_GPL(ras_debugfs_dir);

No need to export this.

Revised version below:

---
From: "Chen, Gong" <gong.chen@linux.intel.com>

Implement a new debugfs interface for RAS subsystem. A file named
daemon_active is added there accordingly. This file is used to track if
user space daemon accesses perf/trace interface or not. One can track
which daemon opens it via "lsof /path/to/debugfs/ras/daemon_active".

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1402475691-30045-5-git-send-email-gong.chen@linux.intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/ras/Makefile  |  2 +-
 drivers/ras/debugfs.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/ras/ras.c     | 14 +++++++++++++
 include/linux/ras.h   | 15 ++++++++++++++
 4 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 drivers/ras/debugfs.c
 create mode 100644 include/linux/ras.h

Index: linux/drivers/ras/Makefile
===================================================================
--- linux.orig/drivers/ras/Makefile	2014-06-11 17:54:21.738397566 +0200
+++ linux/drivers/ras/Makefile	2014-06-11 17:54:21.726397566 +0200
@@ -1 +1 @@
-obj-$(CONFIG_RAS) += ras.o
+obj-$(CONFIG_RAS) += ras.o debugfs.o
Index: linux/drivers/ras/debugfs.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux/drivers/ras/debugfs.c	2014-06-11 17:58:47.214393178 +0200
@@ -0,0 +1,56 @@
+#include <linux/debugfs.h>
+
+static struct dentry *ras_debugfs_dir;
+
+static atomic_t trace_count = ATOMIC_INIT(0);
+
+int ras_userspace_consumers(void)
+{
+	return atomic_read(&trace_count);
+}
+EXPORT_SYMBOL_GPL(ras_userspace_consumers);
+
+static int trace_show(struct seq_file *m, void *v)
+{
+	return atomic_read(&trace_count);
+}
+
+static int trace_open(struct inode *inode, struct file *file)
+{
+	atomic_inc(&trace_count);
+	return single_open(file, trace_show, NULL);
+}
+
+static int trace_release(struct inode *inode, struct file *file)
+{
+	atomic_dec(&trace_count);
+	return single_release(inode, file);
+}
+
+static const struct file_operations trace_fops = {
+	.open    = trace_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = trace_release,
+};
+
+int __init ras_add_daemon_trace(void)
+{
+	struct dentry *fentry;
+
+	if (!ras_debugfs_dir)
+		return -ENOENT;
+
+	fentry = debugfs_create_file("daemon_active", S_IRUSR, ras_debugfs_dir,
+				     NULL, &trace_fops);
+	if (!fentry)
+		return -ENODEV;
+
+	return 0;
+
+}
+
+void __init ras_debugfs_init(void)
+{
+	ras_debugfs_dir = debugfs_create_dir("ras", NULL);
+}
Index: linux/drivers/ras/ras.c
===================================================================
--- linux.orig/drivers/ras/ras.c	2014-06-11 17:54:21.738397566 +0200
+++ linux/drivers/ras/ras.c	2014-06-11 17:54:21.730397566 +0200
@@ -5,8 +5,22 @@
  *	Chen, Gong <gong.chen@linux.intel.com>
  */
 
+#include <linux/init.h>
+#include <linux/ras.h>
+
 #define CREATE_TRACE_POINTS
 #define TRACE_INCLUDE_PATH ../../include/ras
 #include <ras/ras_event.h>
 
+static int __init ras_init(void)
+{
+	int rc = 0;
+
+	ras_debugfs_init();
+	rc = ras_add_daemon_trace();
+
+	return rc;
+}
+subsys_initcall(ras_init);
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
Index: linux/include/linux/ras.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux/include/linux/ras.h	2014-06-11 17:58:43.350393242 +0200
@@ -0,0 +1,14 @@
+#ifndef __RAS_H__
+#define __RAS_H__
+
+#ifdef CONFIG_DEBUG_FS
+int ras_userspace_consumers(void);
+void ras_debugfs_init(void);
+int ras_add_daemon_trace(void);
+#else
+static inline int ras_userspace_consumers(void) { return 0; }
+static inline void ras_debugfs_init(void) { return; }
+static inline int ras_add_daemon_trace(void) { return 0; }
+#endif
+
+#endif

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
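
The open/release accounting behind daemon_active in the revised patch can
be illustrated outside the kernel. What follows is only a user-space
sketch using C11 atomics, not the kernel code: consumer_count,
consumer_open(), consumer_release() and userspace_consumers() are
hypothetical stand-ins for trace_count, trace_open(), trace_release() and
ras_userspace_consumers(), and no debugfs or seq_file machinery is
modelled.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the patch's trace_count. */
static atomic_int consumer_count = 0;

/* Mirrors trace_open(): every open of daemon_active adds one consumer. */
static void consumer_open(void)
{
	atomic_fetch_add(&consumer_count, 1);
}

/* Mirrors trace_release(): closing the file drops that consumer again. */
static void consumer_release(void)
{
	int prev = atomic_fetch_sub(&consumer_count, 1);

	/* A release without a matching open would be a caller bug. */
	assert(prev > 0);
}

/*
 * Mirrors ras_userspace_consumers(): non-zero for as long as at least
 * one process holds the file open.
 */
static int userspace_consumers(void)
{
	return atomic_load(&consumer_count);
}
```

The point of the design is that the count follows the lifetime of the
file descriptor, so a daemon that exits or crashes is accounted for as
soon as its descriptor is closed, without any explicit deregistration.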

* Re: [PATCH 5/7 v7] trace, RAS: Add eMCA trace event interface
From: Borislav Petkov @ 2014-06-11 19:02 UTC
To: Chen, Gong; +Cc: tony.luck, m.chehab, rostedt, linux-acpi, lkml

On Wed, Jun 11, 2014 at 04:34:49AM -0400, Chen, Gong wrote:
> Add trace interface to elaborate all H/W error related information.
>
> v7 -> v6: compact trace info to save trace buffer space.
> v6 -> v5: format adjustment.
> v5 -> v4: Add physical mask(LSB) in trace.
> v4 -> v3: change ras trace dependency rule.
> v3 -> v2: minor adjustment according to the suggestion from Boris.
> v2 -> v1: spinlock is not needed anymore.
>
> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> ---
>  drivers/acpi/Kconfig        |  4 ++-
>  drivers/acpi/acpi_extlog.c  | 27 ++++++++++++++++---
>  drivers/firmware/efi/cper.c | 48 +++++++++++++++++++++++++++++++---
>  drivers/ras/ras.c           |  1 +
>  include/linux/cper.h        | 21 +++++++++++++++
>  include/ras/ras_event.h     | 63 +++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 156 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index a34a228..099a2d5 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -370,6 +370,7 @@ config ACPI_EXTLOG
>  	tristate "Extended Error Log support"
>  	depends on X86_MCE && X86_LOCAL_APIC
>  	select UEFI_CPER
> +	select RAS_TRACE
>  	default n
>  	help
>  	  Certain usages such as Predictive Failure Analysis (PFA) require
> @@ -384,6 +385,7 @@ config ACPI_EXTLOG
>
>  	  Enhanced MCA Logging allows firmware to provide additional error
>  	  information to system software, synchronous with MCE or CMCI. This
> -	  driver adds support for that functionality.
> +	  driver adds support for that functionality with corresponding
> +	  tracepoint which carries that information to userspace.
>
>  endif	# ACPI
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index 1853341..e61da95 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -16,6 +16,7 @@
>  #include <asm/mce.h>
>
>  #include "apei/apei-internal.h"
> +#include <ras/ras_event.h>
>
>  #define EXT_ELOG_ENTRY_MASK	GENMASK_ULL(51, 0) /* elog entry address mask */
>
> @@ -137,8 +138,12 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>  	struct mce *mce = (struct mce *)data;
>  	int bank = mce->bank;
>  	int cpu = mce->extcpu;
> -	struct acpi_generic_status *estatus;
> -	int rc;
> +	struct acpi_generic_status *estatus, *tmp;
> +	struct acpi_generic_data *gdata;
> +	const uuid_le *fru_id = &NULL_UUID_LE;
> +	char *fru_text = "";
> +	uuid_le *sec_type;
> +	static u32 err_seq;
>
>  	estatus = extlog_elog_entry_check(cpu, bank);
>  	if (estatus == NULL)
> @@ -148,7 +153,23 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>  	/* clear record status to enable BIOS to update it again */
>  	estatus->block_status = 0;
>
> -	rc = print_extlog_rcd(NULL, (struct acpi_generic_status *)elog_buf, cpu);
> +	tmp = (struct acpi_generic_status *)elog_buf;
> +	print_extlog_rcd(NULL, tmp, cpu);
> +
> +	/* log event via trace */
> +	err_seq++;
> +	gdata = (struct acpi_generic_data *)(tmp + 1);
> +	if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID)
> +		fru_id = (uuid_le *)gdata->fru_id;
> +	if (gdata->validation_bits & CPER_SEC_VALID_FRU_TEXT)
> +		fru_text = gdata->fru_text;
> +	sec_type = (uuid_le *)gdata->section_type;
> +	if (!uuid_le_cmp(*sec_type, CPER_SEC_PLATFORM_MEM)) {
> +		struct cper_sec_mem_err *mem = (void *)(gdata + 1);
> +		if (gdata->error_data_length >= sizeof(*mem))
> +			trace_extlog_mem_event(mem, err_seq, fru_id, fru_text,
> +					       (u8)gdata->error_severity);
> +	}
>
>  	return NOTIFY_STOP;
>  }
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 83b56b61..85d6d30 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -207,7 +207,7 @@ const char *cper_mem_err_type_str(unsigned int etype)
>  }
>  EXPORT_SYMBOL_GPL(cper_mem_err_type_str);
>
> -int cper_mem_err_location(const struct cper_sec_mem_err *mem, char *msg)
> +int cper_mem_err_location(struct cper_mem_err_compact *mem, char *msg)
>  {
>  	u32 len, n;
>
> @@ -249,7 +249,7 @@ int cper_mem_err_location(const struct cper_sec_mem_err *mem, char *msg)
>  	return n;
>  }
>
> -int cper_dimm_err_location(const struct cper_sec_mem_err *mem, char *msg)
> +int cper_dimm_err_location(struct cper_mem_err_compact *mem, char *msg)
>  {
>  	u32 len, n;
>  	const char *bank = NULL, *device = NULL;
> @@ -271,8 +271,47 @@ int cper_dimm_err_location(const struct cper_sec_mem_err *mem, char *msg)
>  	return n;
>  }
>
> +void cper_mem_err_pack(const struct cper_sec_mem_err *mem, void *data)
> +{
> +	struct cper_mem_err_compact *cmem = (struct cper_mem_err_compact *)data;
> +
> +	cmem->validation_bits = mem->validation_bits;
> +	cmem->node = mem->node;
> +	cmem->card = mem->card;
> +	cmem->module = mem->module;
> +	cmem->bank = mem->bank;
> +	cmem->device = mem->device;
> +	cmem->row = mem->row;
> +	cmem->column = mem->column;
> +	cmem->bit_pos = mem->bit_pos;
> +	cmem->requestor_id = mem->requestor_id;
> +	cmem->responder_id = mem->responder_id;
> +	cmem->target_id = mem->target_id;
> +	cmem->rank = mem->rank;
> +	cmem->mem_array_handle = mem->mem_array_handle;
> +	cmem->mem_dev_handle = mem->mem_dev_handle;
> +}
> +EXPORT_SYMBOL_GPL(cper_mem_err_pack);

Why do we export this one and the one below? What .config warrants this?

CONFIG_ACPI_EXTLOG=m doesn't need them, AFAICT.

> +const char *cper_mem_err_unpack(struct trace_seq *p, void *data)
> +{
> +	struct cper_mem_err_compact *cmem = (struct cper_mem_err_compact *)data;
> +	const char *ret = p->buffer + p->len;
> +
> +	if (cper_mem_err_location(cmem, rcd_decode_str))
> +		trace_seq_printf(p, "%s", rcd_decode_str);
> +	if (cper_dimm_err_location(cmem, rcd_decode_str))
> +		trace_seq_printf(p, "%s", rcd_decode_str);
> +	trace_seq_putc(p, '\0');
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(cper_mem_err_unpack);
> +
>  static void cper_print_mem(const char *pfx, const struct cper_sec_mem_err *mem)
>  {
> +	struct cper_mem_err_compact cmem;
> +
>  	if (mem->validation_bits & CPER_MEM_VALID_ERROR_STATUS)
>  		printk("%s""error_status: 0x%016llx\n", pfx, mem->error_status);
>  	if (mem->validation_bits & CPER_MEM_VALID_PA)
> @@ -281,14 +320,15 @@ static void cper_print_mem(const char *pfx, const struct cper_sec_mem_err *mem)
>  	if (mem->validation_bits & CPER_MEM_VALID_PA_MASK)
>  		printk("%s""physical_address_mask: 0x%016llx\n",
>  		       pfx, mem->physical_addr_mask);
> -	if (cper_mem_err_location(mem, rcd_decode_str))
> +	cper_mem_err_pack(mem, &cmem);
> +	if (cper_mem_err_location(&cmem, rcd_decode_str))
>  		printk("%s%s\n", pfx, rcd_decode_str);
>  	if (mem->validation_bits & CPER_MEM_VALID_ERROR_TYPE) {
>  		u8 etype = mem->error_type;
>  		printk("%s""error_type: %d, %s\n", pfx, etype,
>  		       cper_mem_err_type_str(etype));
>  	}
> -	if (cper_dimm_err_location(mem, rcd_decode_str))
> +	if (cper_dimm_err_location(&cmem, rcd_decode_str))
>  		printk("%s%s\n", pfx, rcd_decode_str);
>  }
>
> diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
> index 4cac43a..da227a3 100644
> --- a/drivers/ras/ras.c
> +++ b/drivers/ras/ras.c
> @@ -23,4 +23,5 @@ static int __init ras_init(void)
>  }
>  subsys_initcall(ras_init);
>
> +EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
> diff --git a/include/linux/cper.h b/include/linux/cper.h
> index ed088b9..3548160 100644
> --- a/include/linux/cper.h
> +++ b/include/linux/cper.h
> @@ -22,6 +22,7 @@
>  #define LINUX_CPER_H
>
>  #include <linux/uuid.h>
> +#include <linux/trace_seq.h>
>
>  /* CPER record signature and the size */
>  #define CPER_SIG_RECORD "CPER"
> @@ -363,6 +364,24 @@ struct cper_sec_mem_err {
>  	__u16 mem_dev_handle;		/* module handle in UEFI 2.4 */
>  };
>
> +struct cper_mem_err_compact {
> +	__u64 validation_bits;
> +	__u16 node;
> +	__u16 card;
> +	__u16 module;
> +	__u16 bank;
> +	__u16 device;
> +	__u16 row;
> +	__u16 column;
> +	__u16 bit_pos;
> +	__u64 requestor_id;
> +	__u64 responder_id;
> +	__u64 target_id;
> +	__u16 rank;
> +	__u16 mem_array_handle;
> +	__u16 mem_dev_handle;
> +};
> +
>  struct cper_sec_pcie {
>  	__u64 validation_bits;
>  	__u32 port_type;
> @@ -406,5 +425,7 @@ const char *cper_severity_str(unsigned int);
>  const char *cper_mem_err_type_str(unsigned int);
>  void cper_print_bits(const char *prefix, unsigned int bits,
>  		     const char * const strs[], unsigned int strs_size);
> +void cper_mem_err_pack(const struct cper_sec_mem_err *, void *);
> +const char *cper_mem_err_unpack(struct trace_seq *, void *);
>
>  #endif
> diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> index acbcbb8..c5e58db 100644
> --- a/include/ras/ras_event.h
> +++ b/include/ras/ras_event.h
> @@ -9,6 +9,69 @@
>  #include <linux/edac.h>
>  #include <linux/ktime.h>
>  #include <linux/aer.h>
> +#include <linux/cper.h>
> +
> +/*
> + * MCE Extended Error Log trace event
> + *
> + * These events are generated when hardware detects a corrected or
> + * uncorrected event.
> + */
> +
> +/* memory trace event */
> +
> +TRACE_EVENT(extlog_mem_event,
> +	TP_PROTO(struct cper_sec_mem_err *mem,
> +		 u32 err_seq,
> +		 const uuid_le *fru_id,
> +		 const char *fru_text,
> +		 u8 sev),
> +
> +	TP_ARGS(mem, err_seq, fru_id, fru_text, sev),
> +
> +	TP_STRUCT__entry(
> +		__field(u32, err_seq)
> +		__field(u8, etype)
> +		__field(u8, sev)
> +		__field(u64, pa)
> +		__field(u8, pa_mask_lsb)
> +		__array(u8, fru_id, 40)

How did you come up with this magic number? Why isn't that sizeof(uuid_le)?

> +		__string(fru_text, fru_text)
> +		__array(u8, data, sizeof(struct cper_mem_err_compact))
> +	),
> +
> +	TP_fast_assign(
> +		__entry->err_seq = err_seq;
> +		if (mem->validation_bits & CPER_MEM_VALID_ERROR_TYPE)
> +			__entry->etype = mem->error_type;
> +		else
> +			__entry->etype = ~0;
> +		__entry->sev = sev;
> +		if (mem->validation_bits & CPER_MEM_VALID_PA)
> +			__entry->pa = mem->physical_addr;
> +		else
> +			__entry->pa = ~0ull;
> +
> +		if (mem->validation_bits & CPER_MEM_VALID_PA_MASK)
> +			__entry->pa_mask_lsb =
> +				(u8)__ffs64(mem->physical_addr_mask);

No need for the linebreak here - just let it stick out.

> +		else
> +			__entry->pa_mask_lsb = ~0;
> +		snprintf(__entry->fru_id, 39, "%pUl", fru_id);

Yeah, I didn't catch the reasoning behind why we need to convert the FRU
into a string and not leave it simply as u8[16]...

> +		__assign_str(fru_text, fru_text);
> +		cper_mem_err_pack(mem, __entry->data);
> +	),
> +
> +	TP_printk("{%d} %s error: %s physical addr: %016llx (mask lsb: %x) %sFRU: %s %.20s",
> +		  __entry->err_seq,
> +		  cper_severity_str(__entry->sev),
> +		  cper_mem_err_type_str(__entry->etype),
> +		  __entry->pa,
> +		  __entry->pa_mask_lsb,
> +		  cper_mem_err_unpack(p, __entry->data),
> +		  __entry->fru_id,
> +		  __get_str(fru_text))
> +);
>
>  /*
>   * Hardware Events Report
> --
> 2.0.0.rc2
>
>

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: [PATCH 5/7 v7] trace, RAS: Add eMCA trace event interface
From: Chen, Gong @ 2014-06-12 2:42 UTC
To: Borislav Petkov; +Cc: tony.luck, m.chehab, rostedt, linux-acpi, lkml

On Wed, Jun 11, 2014 at 09:02:15PM +0200, Borislav Petkov wrote:
> > +EXPORT_SYMBOL_GPL(cper_mem_err_pack);
>
> Why do we export this one and the one below? What .config warrants this?
>
> CONFIG_ACPI_EXTLOG=m doesn't need them, AFAICT.
>
Right. acpi_extlog doesn't use them. They can be exported later, when needed.

> > +	TP_STRUCT__entry(
> > +		__field(u32, err_seq)
> > +		__field(u8, etype)
> > +		__field(u8, sev)
> > +		__field(u64, pa)
> > +		__field(u8, pa_mask_lsb)
> > +		__array(u8, fru_id, 40)
>
> How did you come up with this magic number? Why isn't that sizeof(uuid_le)?
Because I want to convert it into a string.

> > +		snprintf(__entry->fru_id, 39, "%pUl", fru_id);
>
> Yeah, I didn't catch the reasoning behind why we need to convert the FRU
> into a string and not leave it simply as u8[16]...
Fair enough. It can be compressed a little bit more.
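
To put numbers on the fru_id discussion above: a raw UUID is 16 bytes,
while its textual form is 32 hex digits plus 4 dashes, so 36 characters
or 37 bytes with the terminating NUL. The 40-byte array in the quoted
patch therefore leaves a few bytes of slack, and storing the raw u8[16]
and formatting it only at print time costs less than half of that in the
trace buffer. The following is a minimal user-space sketch, not kernel
code: uuid_le_to_str(), UUID_RAW_LEN and UUID_STR_BUF are hypothetical
names, and the byte order is assumed to match what the kernel's "%pUl"
produces (the first three fields byte-swapped).

```c
#include <stdint.h>
#include <stdio.h>

#define UUID_RAW_LEN 16	/* bytes in a binary UUID, i.e. sizeof(uuid_le) */
#define UUID_STR_BUF 37	/* 32 hex digits + 4 dashes + terminating NUL */

/*
 * Hypothetical helper: format a little-endian UUID as lower-case text.
 * Returns the number of characters written, excluding the NUL.
 */
static int uuid_le_to_str(const uint8_t raw[UUID_RAW_LEN],
			  char out[UUID_STR_BUF])
{
	/* Assumed "%pUl" byte order: first three fields are swapped. */
	static const int idx[UUID_RAW_LEN] = {
		3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15
	};
	int i, n = 0;

	for (i = 0; i < UUID_RAW_LEN; i++) {
		n += snprintf(out + n, UUID_STR_BUF - n, "%02x", raw[idx[i]]);
		/* Dashes follow the 4th, 6th, 8th and 10th byte. */
		if (i == 3 || i == 5 || i == 7 || i == 9)
			n += snprintf(out + n, UUID_STR_BUF - n, "-");
	}

	return n;
}
```

With this split, the trace entry would only need to carry UUID_RAW_LEN
bytes, and a buffer of UUID_STR_BUF bytes would be needed just where the
event is printed.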