* [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types
From: Ben Cheatham @ 2024-02-14 20:07 UTC
To: dan.j.williams, jonathan.cameron, rafael, james.morse, tony.luck,
bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi, benjamin.cheatham
v12 Changes:
- Rebase onto v6.8-rc4
- Squash Kconfig patch into patch 2/3 (Jonathan)
- Change CONFIG_CXL_EINJ from "depends on ACPI_APEI_EINJ >= CXL_BUS"
to "depends on ACPI_APEI_EINJ = CXL_BUS"
- Drop "ACPI, APEI" part of commit message title and use just EINJ
instead (Dan)
- Add protocol error types to "einj_types" documentation (Jonathan)
- Change 0xffff... constants to use GENMASK()
- Drop param* variables and use constants instead in cxl error
inject functions (Jonathan)
- Add is_cxl_error_type() helper function in einj.c (Jonathan)
- Remove a stray function declaration in einj-cxl.h (Jonathan)
- Comment #else/#endifs with corresponding #if/#ifdef in
einj-cxl.h (Jonathan)
v11 Changes:
- Drop patch 2/6 (Add CXL protocol error defines) and put the
defines in patch 4/6 instead (Dan)
- Add Dan's reviewed-by
The new CXL error types will use the Memory Address field in the
SET_ERROR_TYPE_WITH_ADDRESS structure in order to target a CXL 1.1
compliant memory-mapped downstream port. The value of the memory address
will be in the port's MMIO range, and it will not represent physical
(normal or persistent) memory.
Add the functionality for injecting CXL 1.1 errors to the EINJ module,
but not through the EINJ legacy interface under /sys/kernel/debug/apei/einj.
Instead, make the error types available under /sys/kernel/debug/cxl.
This allows for validating the MMIO address for a CXL 1.1 error type
while also not making the user responsible for finding it.
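As a rough illustration of the intended usage from userspace, the sketch
below builds the per-dport einj_inject path and writes an error type to it.
It is not part of this series. The debugfs layout and the 0x8000 value
(CXL.mem Protocol Correctable) come from the ABI text in patch 2/3; the
helper names einj_inject_path() and cxl_einj_inject() are hypothetical, and
the assumption that the file accepts a "0x..." string is untested here.

```c
#include <stdio.h>

/* debugfs directory used by this series (see patch 2/3) */
#define CXL_DEBUGFS_DIR "/sys/kernel/debug/cxl"

/*
 * Hypothetical helper: build "<CXL_DEBUGFS_DIR>/<dport_dev>/einj_inject"
 * into buf. Returns 0 on success, -1 if the path does not fit.
 */
int einj_inject_path(char *buf, size_t len, const char *dport_dev)
{
	int n = snprintf(buf, len, "%s/%s/einj_inject", CXL_DEBUGFS_DIR, dport_dev);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write an error type (e.g. 0x8000, CXL.mem Protocol
 * Correctable) to the dport's einj_inject file.
 * Returns 0 on success, -1 on any failure.
 */
int cxl_einj_inject(const char *dport_dev, unsigned int type)
{
	char path[256];
	FILE *f;
	int rc = 0;

	if (einj_inject_path(path, sizeof(path), dport_dev))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	/* Assumption: the kernel side parses a hex string with a 0x prefix. */
	if (fprintf(f, "0x%x\n", type) < 0)
		rc = -1;
	if (fclose(f) != 0)
		rc = -1;

	return rc;
}
```

The caller only names the dport (a device name from /sys/bus/pci/devices);
it never supplies the MMIO address, which is the point made above.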
Ben Cheatham (3):
EINJ: Migrate to a platform driver
cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
EINJ, Documentation: Update EINJ kernel doc
Documentation/ABI/testing/debugfs-cxl | 30 +++
.../firmware-guide/acpi/apei/einj.rst | 19 ++
MAINTAINERS | 1 +
drivers/acpi/apei/einj.c | 202 ++++++++++++++++--
drivers/cxl/Kconfig | 12 ++
drivers/cxl/core/port.c | 41 ++++
include/linux/einj-cxl.h | 40 ++++
7 files changed, 332 insertions(+), 13 deletions(-)
create mode 100644 include/linux/einj-cxl.h
--
2.34.1
* [PATCH v12 1/3] EINJ: Migrate to a platform driver
From: Ben Cheatham @ 2024-02-14 20:07 UTC
To: dan.j.williams, jonathan.cameron, rafael, james.morse, tony.luck, bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi, benjamin.cheatham

Change the EINJ module to install a platform device/driver on module
init and move the module init() and exit() functions to driver probe and
remove. This change allows the EINJ module to load regardless of whether
setting up EINJ succeeds, which allows dependent modules to still load
(i.e. the CXL core).

Since EINJ may no longer be initialized when the module loads, any
functions that are called from dependent/external modules should check
the einj_initialized variable before calling any EINJ functions.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 drivers/acpi/apei/einj.c | 44 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
index 89fb9331c611..4782ae0a0832 100644
--- a/drivers/acpi/apei/einj.c
+++ b/drivers/acpi/apei/einj.c
@@ -21,6 +21,7 @@
 #include <linux/nmi.h>
 #include <linux/delay.h>
 #include <linux/mm.h>
+#include <linux/platform_device.h>
 #include <asm/unaligned.h>

 #include "apei-internal.h"
@@ -137,6 +138,11 @@ static struct apei_exec_ins_type einj_ins_type[] = {
  */
 static DEFINE_MUTEX(einj_mutex);

+/*
+ * Exported APIs use this flag to exit early if einj_probe() failed.
+ */
+static bool einj_initialized __ro_after_init;
+
 static void *einj_param;

 static void einj_exec_ctx_init(struct apei_exec_context *ctx)
@@ -703,7 +709,7 @@ static int einj_check_table(struct acpi_table_einj *einj_tab)
 	return 0;
 }

-static int __init einj_init(void)
+static int __init einj_probe(struct platform_device *pdev)
 {
 	int rc;
 	acpi_status status;
@@ -805,7 +811,7 @@ static int __init einj_init(void)
 	return rc;
 }

-static void __exit einj_exit(void)
+static void __exit einj_remove(struct platform_device *pdev)
 {
 	struct apei_exec_context ctx;

@@ -826,6 +832,40 @@ static void __exit einj_exit(void)
 	acpi_put_table((struct acpi_table_header *)einj_tab);
 }

+static struct platform_device *einj_dev;
+struct platform_driver einj_driver = {
+	.remove_new = einj_remove,
+	.driver = {
+		.name = "acpi-einj",
+	},
+};
+
+static int __init einj_init(void)
+{
+	struct platform_device_info einj_dev_info = {
+		.name = "acpi-einj",
+		.id = -1,
+	};
+	int rc;
+
+	einj_dev = platform_device_register_full(&einj_dev_info);
+	if (IS_ERR_OR_NULL(einj_dev))
+		return PTR_ERR(einj_dev);
+
+	rc = platform_driver_probe(&einj_driver, einj_probe);
+	einj_initialized = rc == 0;
+
+	return 0;
+}
+
+static void __exit einj_exit(void)
+{
+	if (einj_initialized)
+		platform_driver_unregister(&einj_driver);
+
+	platform_device_del(einj_dev);
+}
+
 module_init(einj_init);
 module_exit(einj_exit);

--
2.34.1
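The load-time behaviour described in the commit message above can be
modelled in plain userspace C. This is only a sketch under stated
assumptions: model_probe(), model_einj_init() and model_exported_api() are
hypothetical stand-ins for einj_probe(), einj_init() and any exported EINJ
helper. The -ENXIO return mirrors what patch 2/3 uses when einj_initialized
is false; the -ENODEV probe failure is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the flag added by this patch. */
static bool einj_initialized;

/* Hypothetical stand-in for einj_probe(): fails when no EINJ table exists. */
static int model_probe(bool table_present)
{
	return table_present ? 0 : -ENODEV;
}

/*
 * Stand-in for einj_init(): record whether probe succeeded, but report
 * success either way so that dependent modules can still load.
 */
int model_einj_init(bool table_present)
{
	int rc = model_probe(table_present);

	einj_initialized = (rc == 0);
	return 0;
}

/* Stand-in for an exported helper: bail out early if probe failed. */
int model_exported_api(void)
{
	if (!einj_initialized)
		return -ENXIO;

	return 0;
}
```

With this shape, a failed probe does not fail the module load; callers find
out only when they invoke an exported helper and get -ENXIO back.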
* Re: [PATCH v12 1/3] EINJ: Migrate to a platform driver
From: kernel test robot @ 2024-02-16 2:03 UTC
To: Ben Cheatham, dan.j.williams, jonathan.cameron, rafael, james.morse,
tony.luck, bp
Cc: oe-kbuild-all, dave, dave.jiang, alison.schofield, vishal.l.verma,
ira.weiny, linux-cxl, linux-acpi, benjamin.cheatham

Hi Ben,

kernel test robot noticed the following build warnings:

[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on linus/master rafael-pm/acpi-bus v6.8-rc4 next-20240215]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Ben-Cheatham/EINJ-Migrate-to-a-platform-driver/20240215-041027
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20240214200709.777166-2-Benjamin.Cheatham%40amd.com
patch subject: [PATCH v12 1/3] EINJ: Migrate to a platform driver
config: arm64-defconfig (https://download.01.org/0day-ci/archive/20240216/202402160943.hkW7hhUs-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240216/202402160943.hkW7hhUs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202402160943.hkW7hhUs-lkp@intel.com/

All warnings (new ones prefixed by >>, old ones prefixed by <<):

WARNING: modpost: missing MODULE_DESCRIPTION() in vmlinux.o
>> WARNING: modpost: vmlinux: section mismatch in reference: einj_driver+0x10 (section: .data) -> einj_remove (section: .exit.text)
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/crypto/crct10dif-ce.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/crypto/aes-neon-bs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/btrfs/btrfs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/af_alg.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/ecc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/curve25519-generic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/xor.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libchacha.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libarc4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libdes.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pwm/pwm-imx27.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pwm/pwm-mediatek.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pwm/pwm-visconti.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pci/controller/pcie-mediatek-gen3.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/clk/qcom/lpass-gfm-sm8250.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/soc/mediatek/mtk-cmdq-helper.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/soc/qcom/spm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/xen/xen-pciback/xen-pciback.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/bridge/lontium-lt9611.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/bridge/lontium-lt9611uxc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/base/regmap/regmap-slimbus.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/misc/fastrpc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spi/spi-omap2-mcspi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/pcs/pcs_xpcs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/mscc/mscc_ocelot_switch_lib.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/qualcomm/emac/qcom-emac.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/qualcomm/rmnet/rmnet.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/stmicro/stmmac/dwmac-altr-socfpga.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/host/xhci-pci-renesas.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_acm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/u_serial.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_serial.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_obex.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/u_ether.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_ncm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_ecm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_eem.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_ecm_subset.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_rndis.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_mass_storage.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/function/usb_f_fs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/gadget/libcomposite.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/tuners/tda9887.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/rc/rc-core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/common/uvc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/v4l2-core/v4l2-async.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/v4l2-core/v4l2-fwnode.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/edac/layerscape_edac_mod.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ufs/host/ufs-qcom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/crypto/sa2ul.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mailbox/mtk-cmdq-mailbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/rpmsg/rpmsg_char.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/buffer/kfifo_buf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mm-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mq-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mn-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hte/hte-tegra194-test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/fsl/imx-pcm-dma.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/qcom/snd-soc-qcom-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/qcom/snd-soc-qcom-sdw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/qcom/qdsp6/snd-q6dsp-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mtk-adsp-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mt8195/snd-sof-mt8195.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-utils.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-of.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_tunnel.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/tunnel4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/sit.o

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v12 1/3] EINJ: Migrate to a platform driver
From: kernel test robot @ 2024-02-16 13:43 UTC
To: Ben Cheatham, dan.j.williams, jonathan.cameron, rafael, james.morse,
tony.luck, bp
Cc: llvm, oe-kbuild-all, dave, dave.jiang, alison.schofield,
vishal.l.verma, ira.weiny, linux-cxl, linux-acpi, benjamin.cheatham

Hi Ben,

kernel test robot noticed the following build warnings:

[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on linus/master rafael-pm/acpi-bus v6.8-rc4 next-20240216]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Ben-Cheatham/EINJ-Migrate-to-a-platform-driver/20240215-041027
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20240214200709.777166-2-Benjamin.Cheatham%40amd.com
patch subject: [PATCH v12 1/3] EINJ: Migrate to a platform driver
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20240216/202402162148.vGukSly8-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240216/202402162148.vGukSly8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202402162148.vGukSly8-lkp@intel.com/

All warnings (new ones prefixed by >>, old ones prefixed by <<):

WARNING: modpost: missing MODULE_DESCRIPTION() in vmlinux.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/events/intel/intel-uncore.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/events/intel/intel-cstate.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/events/rapl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/kernel/cpu/mce/mce-inject.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/crypto/crc32-pclmul.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/kvm/kvm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/kvm/kvm-intel.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/kvm/kvm-amd.o
WARNING: modpost: missing MODULE_DESCRIPTION() in mm/hwpoison-inject.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nfs/nfsv4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp737.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp775.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp850.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp852.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp855.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp857.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp860.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp861.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp862.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp863.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp864.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp865.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp866.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp869.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp874.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp932.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_euc-jp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp936.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp949.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp950.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp1250.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp1251.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-1.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-2.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-3.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-5.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-6.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-7.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_cp1255.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-9.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-13.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-14.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_iso8859-15.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_koi8-r.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_koi8-u.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_koi8-ru.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_utf8.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-celtic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-centeuro.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-croatian.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-cyrillic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-gaelic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-greek.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-iceland.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-inuit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-romanian.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-roman.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/mac-turkish.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/nls/nls_ucs2_utils.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/binfmt_misc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/cramfs/cramfs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/fat/fat.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/isofs/isofs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/smb/common/cifs_arc4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/smb/common/cifs_md4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/btrfs/btrfs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/cast_common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/ecc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in crypto/xor.o
WARNING: modpost: missing MODULE_DESCRIPTION() in block/t10-pi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libchacha.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libarc4.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libdes.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/crypto/libpoly1305.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_kmp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_bm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/ts_fsm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pci/pci-pf-stub.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/video/backlight/platform_lcd.o
>> WARNING: modpost: drivers/acpi/apei/einj: section mismatch in reference: einj_driver+0x10 (section: .data) -> einj_remove (section: .exit.text)
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/acpi/acpi_tad.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/acpi/platform_profile.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/acpi/nfit/nfit.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/dma/dmatest.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/dma/ioat/ioatdma.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/virtio/virtio_dma_buf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/xen/xen-evtchn.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/xen/xen-privcmd.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/tty/n_hdlc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/tty/n_gsm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/intel-gtt.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/lp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/ppdev.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/tlclk.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/tiny/bochs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/tiny/cirrus.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/i915/kvmgt.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/base/regmap/regmap-i2c.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/base/regmap/regmap-spi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/brd.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/loop.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/null_blk/null_blk.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/dax/hmem/dax_hmem.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/dax/device_dax.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/dax/kmem.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/dax/dax_pmem.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/isci/isci.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cdrom/cdrom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/usb/serial/usb_debug.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/tuners/tda9887.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/rc/rc-core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/dvb-frontends/au8522_decoder.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/dvb-frontends/mb86a16.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/v4l2-core/v4l2-async.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/media/v4l2-core/v4l2-fwnode.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwmon/asus_atk0110.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/intel_soc_dts_iosf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rfim.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_req.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_hint.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_power_floor.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mmc/core/mmc_core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mmc/core/sdio_uart.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-a4tech.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-apple.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-aureal.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-belkin.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-cherry.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-chicony.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-cypress.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-dr.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-elecom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-ezkey.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-gyration.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-ite.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-kensington.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-keytouch.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-kye.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-lcpower.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-lenovo.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-logitech.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-lg-g15.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-logitech-dj.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-logitech-hidpp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-microsoft.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-monterey.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-ortek.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-pl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-petalynx.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-primax.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-saitek.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-samsung.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-sjoy.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-speedlink.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-steelseries.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-sunplus.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-gaff.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-tmff.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-tivo.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-topseed.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-twinhan.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-xinmo.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-zpff.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-zydacron.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-waltop.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/intel-ish-hid/intel-ishtp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/x86/intel/intel-hid.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/x86/intel/intel-vbtn.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/x86/intel/intel-rst.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/x86/amilo-rfkill.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/x86/classmate-laptop.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwtracing/intel_th/intel_th_msu_sink.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/parport/parport.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/libnvdimm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_pmem.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_btt.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvdimm/nd_e820.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/uio/uio.o

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
From: Ben Cheatham @ 2024-02-14 20:07 UTC
To: dan.j.williams, jonathan.cameron, rafael, james.morse, tony.luck, bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi, benjamin.cheatham

Implement CXL helper functions in the EINJ module for getting/injecting
available CXL protocol error types and export them to debugfs under
/sys/kernel/debug/cxl.

The kernel/debug/cxl/einj_types file will print the available CXL
protocol errors in the same format as the available_error_type file
provided by the EINJ module. The kernel/debug/cxl/$dport_dev/einj_inject
file is functionally the same as the error_type and error_inject files
provided by the EINJ module, i.e. writing an error type into
$dport_dev/einj_inject will inject said error type into the CXL dport
represented by $dport_dev.

Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 Documentation/ABI/testing/debugfs-cxl |  30 +++++
 MAINTAINERS                           |   1 +
 drivers/acpi/apei/einj.c              | 158 ++++++++++++++++++++++++--
 drivers/cxl/Kconfig                   |  12 ++
 drivers/cxl/core/port.c               |  41 +++++++
 include/linux/einj-cxl.h              |  40 +++++++
 6 files changed, 271 insertions(+), 11 deletions(-)
 create mode 100644 include/linux/einj-cxl.h

diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
index fe61d372e3fa..4c0f62f881ca 100644
--- a/Documentation/ABI/testing/debugfs-cxl
+++ b/Documentation/ABI/testing/debugfs-cxl
@@ -33,3 +33,33 @@ Description:
 		device cannot clear poison from the address, -ENXIO is returned.
 		The clear_poison attribute is only visible for devices
 		supporting the capability.
+
+What:		/sys/kernel/debug/cxl/einj_types
+Date:		January, 2024
+KernelVersion:	v6.9
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Prints the CXL protocol error types made available by
+		the platform in the format "0x<error number> <error type>".
+		The possible error types are (as of ACPI v6.5):
+			0x1000	CXL.cache Protocol Correctable
+			0x2000	CXL.cache Protocol Uncorrectable non-fatal
+			0x4000	CXL.cache Protocol Uncorrectable fatal
+			0x8000	CXL.mem Protocol Correctable
+			0x10000	CXL.mem Protocol Uncorrectable non-fatal
+			0x20000	CXL.mem Protocol Uncorrectable fatal
+
+		The <error number> can be written to einj_inject to inject
+		<error type> into a chosen dport.
+
+What:		/sys/kernel/debug/cxl/$dport_dev/einj_inject
+Date:		January, 2024
+KernelVersion:	v6.9
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) Writing an integer to this file injects the corresponding
+		CXL protocol error into $dport_dev ($dport_dev will be a device
+		name from /sys/bus/pci/devices). The integer to type mapping for
+		injection can be found by reading from einj_types. If the dport
+		was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
+		a CXL 2.0 error is injected.
diff --git a/MAINTAINERS b/MAINTAINERS
index 73d898383e51..0980ed116dc2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5291,6 +5291,7 @@ S:	Maintained
 F:	drivers/cxl/
 F:	include/linux/cxl-event.h
 F:	include/uapi/linux/cxl_mem.h
+F:	include/linux/einj-cxl.h
 F:	tools/testing/cxl/

 COMPUTE EXPRESS LINK PMU (CPMU)
diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
index 4782ae0a0832..ab729a12dd7f 100644
--- a/drivers/acpi/apei/einj.c
+++ b/drivers/acpi/apei/einj.c
@@ -21,6 +21,7 @@
 #include <linux/nmi.h>
 #include <linux/delay.h>
 #include <linux/mm.h>
+#include <linux/einj-cxl.h>
 #include <linux/platform_device.h>
 #include <asm/unaligned.h>

@@ -37,6 +38,20 @@
 #define MEM_ERROR_MASK		(ACPI_EINJ_MEMORY_CORRECTABLE | \
 				ACPI_EINJ_MEMORY_UNCORRECTABLE | \
 				ACPI_EINJ_MEMORY_FATAL)
+#ifndef ACPI_EINJ_CXL_CACHE_CORRECTABLE
+#define ACPI_EINJ_CXL_CACHE_CORRECTABLE     BIT(12)
+#define ACPI_EINJ_CXL_CACHE_UNCORRECTABLE   BIT(13)
+#define ACPI_EINJ_CXL_CACHE_FATAL           BIT(14)
+#define ACPI_EINJ_CXL_MEM_CORRECTABLE       BIT(15)
+#define ACPI_EINJ_CXL_MEM_UNCORRECTABLE     BIT(16)
+#define ACPI_EINJ_CXL_MEM_FATAL             BIT(17)
+#endif
+#define CXL_ERROR_MASK		(ACPI_EINJ_CXL_CACHE_CORRECTABLE | \
+				ACPI_EINJ_CXL_CACHE_UNCORRECTABLE | \
+				ACPI_EINJ_CXL_CACHE_FATAL | \
+				ACPI_EINJ_CXL_MEM_CORRECTABLE | \
+				ACPI_EINJ_CXL_MEM_UNCORRECTABLE | \
+				ACPI_EINJ_CXL_MEM_FATAL)

 /*
  * ACPI version 5 provides a SET_ERROR_TYPE_WITH_ADDRESS action.
@@ -560,8 +575,11 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
 	if (type & ACPI5_VENDOR_BIT) {
 		if (vendor_flags != SETWA_FLAGS_MEM)
 			goto inject;
-	} else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM))
+	} else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM)) {
 		goto inject;
+	} else if ((type & CXL_ERROR_MASK) && (flags & SETWA_FLAGS_MEM)) {
+		goto inject;
+	}

 	/*
 	 * Disallow crazy address masks that give BIOS leeway to pick
@@ -613,13 +631,16 @@ static struct { u32 mask; const char *str; } const einj_error_type_string[] = {
 	{ BIT(9), "Platform Correctable" },
 	{ BIT(10), "Platform Uncorrectable non-fatal" },
 	{ BIT(11), "Platform Uncorrectable fatal"},
+	{ BIT(31), "Vendor Defined Error Types" },
+};
+
+static struct { u32 mask; const char *str; } const einj_cxl_error_type_string[] = {
 	{ BIT(12), "CXL.cache Protocol Correctable" },
 	{ BIT(13), "CXL.cache Protocol Uncorrectable non-fatal" },
 	{ BIT(14), "CXL.cache Protocol Uncorrectable fatal" },
 	{ BIT(15), "CXL.mem Protocol Correctable" },
 	{ BIT(16), "CXL.mem Protocol Uncorrectable non-fatal" },
 	{ BIT(17), "CXL.mem Protocol Uncorrectable fatal" },
-	{ BIT(31), "Vendor Defined Error Types" },
 };

 static int available_error_type_show(struct seq_file *m, void *v)
@@ -640,29 +661,46 @@ static int available_error_type_show(struct seq_file *m, void *v)

 DEFINE_SHOW_ATTRIBUTE(available_error_type);

-static int error_type_get(void *data, u64 *val)
+int einj_cxl_available_error_type_show(struct seq_file *m, void *v)
 {
-	*val = error_type;
+	int cxl_err, rc;
+	u32 available_error_type = 0;
+
+	if (!einj_initialized)
+		return -ENXIO;
+
+	rc = einj_get_available_error_type(&available_error_type);
+	if (rc)
+		return rc;
+
+	for (int pos = 0; pos < ARRAY_SIZE(einj_cxl_error_type_string); pos++) {
+		cxl_err = ACPI_EINJ_CXL_CACHE_CORRECTABLE << pos;
+
+		if (available_error_type & cxl_err)
+			seq_printf(m, "0x%08x\t%s\n",
+				   einj_cxl_error_type_string[pos].mask,
einj_cxl_error_type_string[pos].str); + } return 0; } +EXPORT_SYMBOL_NS_GPL(einj_cxl_available_error_type_show, CXL); -static int error_type_set(void *data, u64 val) +static int validate_error_type(u64 type) { + u32 tval, vendor, available_error_type = 0; int rc; - u32 available_error_type = 0; - u32 tval, vendor; /* Only low 32 bits for error type are valid */ - if (val & GENMASK_ULL(63, 32)) + if (type & GENMASK_ULL(63, 32)) return -EINVAL; /* * Vendor defined types have 0x80000000 bit set, and * are not enumerated by ACPI_EINJ_GET_ERROR_TYPE */ - vendor = val & ACPI5_VENDOR_BIT; - tval = val & 0x7fffffff; + vendor = type & ACPI5_VENDOR_BIT; + tval = type & GENMASK(30, 0); /* Only one error type can be specified */ if (tval & (tval - 1)) @@ -671,9 +709,101 @@ static int error_type_set(void *data, u64 val) rc = einj_get_available_error_type(&available_error_type); if (rc) return rc; - if (!(val & available_error_type)) + if (!(type & available_error_type)) return -EINVAL; } + + return 0; +} + +static int cxl_dport_get_sbdf(struct pci_dev *dport_dev, u64 *sbdf) +{ + struct pci_bus *pbus; + struct pci_host_bridge *bridge; + u64 seg = 0, bus; + + pbus = dport_dev->bus; + bridge = pci_find_host_bridge(pbus); + + if (!bridge) + return -ENODEV; + + if (bridge->domain_nr != PCI_DOMAIN_NR_NOT_SET) + seg = bridge->domain_nr; + + bus = pbus->number; + *sbdf = (seg << 24) | (bus << 16) | dport_dev->devfn; + + return 0; +} + +static bool is_cxl_error_type(u64 type) +{ + return (type & CXL_ERROR_MASK) && (!(type & ACPI5_VENDOR_BIT)); +} + +int einj_cxl_inject_rch_error(u64 rcrb, u64 type) +{ + int rc; + + if (!einj_initialized) + return -ENXIO; + + /* Only CXL error types can be specified */ + if (!is_cxl_error_type(type)) + return -EINVAL; + + rc = validate_error_type(type); + if (rc) + return rc; + + return einj_error_inject(type, 0x2, rcrb, GENMASK_ULL(63, 12), 0, 0); +} +EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_rch_error, CXL); + +int einj_cxl_inject_error(struct pci_dev 
*dport, u64 type) +{ + u64 param4 = 0; + int rc; + + if (!einj_initialized) + return -ENXIO; + + /* Only CXL error types can be specified */ + if (!is_cxl_error_type(type)) + return -EINVAL; + + rc = validate_error_type(type); + if (rc) + return rc; + + rc = cxl_dport_get_sbdf(dport, ¶m4); + if (rc) + return rc; + + return einj_error_inject(type, 0x4, 0, 0, 0, param4); +} +EXPORT_SYMBOL_NS_GPL(einj_cxl_inject_error, CXL); + +static int error_type_get(void *data, u64 *val) +{ + *val = error_type; + + return 0; +} + +static int error_type_set(void *data, u64 val) +{ + int rc; + + /* CXL error types have to be injected from cxl debugfs */ + if (is_cxl_error_type(val)) + return -EINVAL; + + rc = validate_error_type(val); + if (rc) + return rc; + error_type = val; return 0; @@ -709,6 +839,12 @@ static int einj_check_table(struct acpi_table_einj *einj_tab) return 0; } +bool einj_is_initialized(void) +{ + return einj_initialized; +} +EXPORT_SYMBOL_GPL(einj_is_initialized); + static int __init einj_probe(struct platform_device *pdev) { int rc; diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 67998dbd1d46..d1fc3ce31fbb 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -157,4 +157,16 @@ config CXL_PMU monitoring units and provide standard perf based interfaces. If unsure say 'm'. + +config CXL_EINJ + bool "CXL Error INJection Support" + default ACPI_APEI_EINJ + depends on ACPI_APEI_EINJ = CXL_BUS + help + Support for CXL protocol Error INJection through debugfs/cxl. + Availability and which errors are supported is dependent on + the host platform. Look to ACPI v6.5 section 18.6.4 and kernel + EINJ documentation for more information. 
+ + If unsure say 'n' endif diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index e59d9d37aa65..3700700c4e0c 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -3,6 +3,7 @@ #include <linux/platform_device.h> #include <linux/memregion.h> #include <linux/workqueue.h> +#include <linux/einj-cxl.h> #include <linux/debugfs.h> #include <linux/device.h> #include <linux/module.h> @@ -793,6 +794,39 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport, return rc; } +DEFINE_SHOW_ATTRIBUTE(einj_cxl_available_error_type); + +static int cxl_einj_inject(void *data, u64 type) +{ + struct cxl_dport *dport = data; + + if (dport->rch) + return einj_cxl_inject_rch_error(dport->rcrb.base, type); + + return einj_cxl_inject_error(to_pci_dev(dport->dport_dev), type); +} +DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, "%llx\n"); + +static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport) +{ + struct dentry *dir; + + if (!einj_is_initialized()) + return; + + /* + * dport_dev needs to be a PCIe port for CXL 2.0+ ports because + * EINJ expects a dport SBDF to be specified for 2.0 error injection. 
+ */ + if (!dport->rch && !dev_is_pci(dport->dport_dev)) + return; + + dir = cxl_debugfs_create_dir(dev_name(dport->dport_dev)); + + debugfs_create_file("einj_inject", 0200, dir, dport, + &cxl_einj_inject_fops); +} + static struct cxl_port *__devm_cxl_add_port(struct device *host, struct device *uport_dev, resource_size_t component_reg_phys, @@ -1149,6 +1183,8 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev, if (dev_is_pci(dport_dev)) dport->link_latency = cxl_pci_get_latency(to_pci_dev(dport_dev)); + cxl_debugfs_create_dport_dir(dport); + return dport; } @@ -2221,6 +2257,11 @@ static __init int cxl_core_init(void) cxl_debugfs = debugfs_create_dir("cxl", NULL); + if (einj_is_initialized()) { + debugfs_create_file("einj_types", 0400, cxl_debugfs, NULL, + &einj_cxl_available_error_type_fops); + } + cxl_mbox_init(); rc = cxl_memdev_init(); diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h new file mode 100644 index 000000000000..92c0e2e37ad9 --- /dev/null +++ b/include/linux/einj-cxl.h @@ -0,0 +1,40 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * CXL protocol Error INJection support. + * + * Copyright (c) 2023 Advanced Micro Devices, Inc. + * All Rights Reserved. 
+ * + * Author: Ben Cheatham <benjamin.cheatham@amd.com> + */ +#ifndef CXL_EINJ_H +#define CXL_EINJ_H + +#include <linux/pci.h> + +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ) +int einj_cxl_available_error_type_show(struct seq_file *m, void *v); +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type); +int einj_cxl_inject_rch_error(u64 rcrb, u64 type); +bool einj_is_initialized(void); +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ) +static inline int einj_cxl_available_error_type_show(struct seq_file *m, + void *v) +{ + return -ENXIO; +} + +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type) +{ + return -ENXIO; +} + +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type) +{ + return -ENXIO; +} + +static inline bool einj_is_initialized(void) { return false; } +#endif // CONFIG_ACPI_APEI_EINJ + +#endif // CXL_EINJ_H -- 2.34.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
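Two pieces of arithmetic in the patch above are easy to misread in diff form: the "is this a single CXL error type" check (is_cxl_error_type() combined with the single-bit test in validate_error_type()) and the SBDF packing done by cxl_dport_get_sbdf(). The following is a minimal userspace sketch of both, not the kernel code itself. The function names cxl_type_is_valid() and sbdf_pack() are hypothetical, the mask value is simply bits 12 to 17 written out, and the check against the firmware-reported available types is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 12..17 of the EINJ error type, mirroring CXL_ERROR_MASK in the patch. */
#define CXL_ERROR_MASK	UINT64_C(0x3f000)
/* Bit 31 marks vendor-defined types (ACPI5_VENDOR_BIT in einj.c). */
#define VENDOR_BIT	(UINT64_C(1) << 31)

/*
 * Hypothetical stand-in for the type checks: only the low 32 bits may be
 * used, the vendor bit must be clear, and exactly one CXL bit may be set.
 * The real code additionally consults the types the platform advertises.
 */
static bool cxl_type_is_valid(uint64_t type)
{
	if (type >> 32)
		return false;
	if (type & VENDOR_BIT)
		return false;
	if (!(type & CXL_ERROR_MASK))
		return false;
	/* x & (x - 1) clears the lowest set bit; zero means one bit was set. */
	return (type & (type - 1)) == 0;
}

/*
 * Pack segment/bus/device/function the way cxl_dport_get_sbdf() does:
 * (seg << 24) | (bus << 16) | devfn, with devfn = (dev << 3) | fn.
 */
static uint64_t sbdf_pack(uint64_t seg, uint64_t bus, uint64_t dev, uint64_t fn)
{
	uint64_t devfn = ((dev & 0x1f) << 3) | (fn & 0x7);

	return (seg << 24) | (bus << 16) | devfn;
}
```

Under these assumptions, a dport named 0000:e0:01.1 corresponds to sbdf_pack(0, 0xe0, 0x01, 1), and 0x8000 (CXL.mem Protocol Correctable) passes the type check while 0x9000 (two CXL bits at once) does not.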
* Re: [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
2024-02-14 20:07 ` [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions Ben Cheatham
@ 2024-02-15 2:25 ` Dan Williams
2024-02-15 15:01 ` Ben Cheatham
2024-02-15 9:33 ` Jonathan Cameron
1 sibling, 1 reply; 20+ messages in thread
From: Dan Williams @ 2024-02-15 2:25 UTC (permalink / raw)
To: Ben Cheatham, dan.j.williams, jonathan.cameron, rafael, james.morse,
tony.luck, bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi, benjamin.cheatham

Ben Cheatham wrote:
> Implement CXL helper functions in the EINJ module for getting/injecting
> available CXL protocol error types and export them to sysfs under
> kernel/debug/cxl.
>
> The kernel/debug/cxl/einj_types file will print the available CXL
> protocol errors in the same format as the available_error_types
> file provided by the EINJ module. The
> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
> error_type and error_inject files provided by the EINJ module, i.e.:
> writing an error type into $dport_dev/einj_inject will inject said error
> type into the CXL dport represented by $dport_dev.
>
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
[..]
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 67998dbd1d46..d1fc3ce31fbb 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -157,4 +157,16 @@ config CXL_PMU
>  	  monitoring units and provide standard perf based interfaces.
>
>  	  If unsure say 'm'.
> +
> +config CXL_EINJ
> +	bool "CXL Error INJection Support"
> +	default ACPI_APEI_EINJ
> +	depends on ACPI_APEI_EINJ = CXL_BUS

So I do not see CONFIG_CXL_EINJ used anywhere, not in a Makefile, not in
a header file. My expectation is that if this variable is not set then
no symbols from einj.ko are consumed by cxl_core.ko.

> +	help
> +	  Support for CXL protocol Error INJection through debugfs/cxl.
> +	  Availability and which errors are supported is dependent on
> +	  the host platform. Look to ACPI v6.5 section 18.6.4 and kernel
> +	  EINJ documentation for more information.
> +
> +	  If unsure say 'n'
>  endif
[..]
> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h
> new file mode 100644
> index 000000000000..92c0e2e37ad9
> --- /dev/null
> +++ b/include/linux/einj-cxl.h
> @@ -0,0 +1,40 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * CXL protocol Error INJection support.
> + *
> + * Copyright (c) 2023 Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Ben Cheatham <benjamin.cheatham@amd.com>
> + */
> +#ifndef CXL_EINJ_H
> +#define CXL_EINJ_H
> +
> +#include <linux/pci.h>
> +
> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ)

Per above this needs to be IS_ENABLED(CONFIG_CXL_EINJ), otherwise what's
the point of the config symbol?

> +int einj_cxl_available_error_type_show(struct seq_file *m, void *v);
> +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type);
> +int einj_cxl_inject_rch_error(u64 rcrb, u64 type);
> +bool einj_is_initialized(void);
> +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
> +static inline int einj_cxl_available_error_type_show(struct seq_file *m,
> +						     void *v)
> +{
> +	return -ENXIO;
> +}
> +
> +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type)
> +{
> +	return -ENXIO;
> +}
> +
> +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type)
> +{
> +	return -ENXIO;
> +}
> +
> +static inline bool einj_is_initialized(void) { return false; }
> +#endif // CONFIG_ACPI_APEI_EINJ
> +
> +#endif // CXL_EINJ_H
> --
> 2.34.1
>
>

* Re: [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
2024-02-15 2:25 ` Dan Williams
@ 2024-02-15 15:01 ` Ben Cheatham
2024-02-16 0:15 ` Dan Williams
0 siblings, 1 reply; 20+ messages in thread
From: Ben Cheatham @ 2024-02-15 15:01 UTC (permalink / raw)
To: Dan Williams, jonathan.cameron, rafael, james.morse, tony.luck, bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi

On 2/14/24 8:25 PM, Dan Williams wrote:
> Ben Cheatham wrote:
>> Implement CXL helper functions in the EINJ module for getting/injecting
>> available CXL protocol error types and export them to sysfs under
>> kernel/debug/cxl.
>>
>> The kernel/debug/cxl/einj_types file will print the available CXL
>> protocol errors in the same format as the available_error_types
>> file provided by the EINJ module. The
>> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
>> error_type and error_inject files provided by the EINJ module, i.e.:
>> writing an error type into $dport_dev/einj_inject will inject said error
>> type into the CXL dport represented by $dport_dev.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> [..]
>> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
>> index 67998dbd1d46..d1fc3ce31fbb 100644
>> --- a/drivers/cxl/Kconfig
>> +++ b/drivers/cxl/Kconfig
>> @@ -157,4 +157,16 @@ config CXL_PMU
>>  	  monitoring units and provide standard perf based interfaces.
>>
>>  	  If unsure say 'm'.
>> +
>> +config CXL_EINJ
>> +	bool "CXL Error INJection Support"
>> +	default ACPI_APEI_EINJ
>> +	depends on ACPI_APEI_EINJ = CXL_BUS
>
> So I do not see CONFIG_CXL_EINJ used anywhere, not in a Makefile, not in
> a header file. My expectation is that if this variable is not set then
> no symbols from einj.ko are consumed by cxl_core.ko.
>

Yeah, you're right. More on this below.

>> +	help
>> +	  Support for CXL protocol Error INJection through debugfs/cxl.
>> +	  Availability and which errors are supported is dependent on
>> +	  the host platform. Look to ACPI v6.5 section 18.6.4 and kernel
>> +	  EINJ documentation for more information.
>> +
>> +	  If unsure say 'n'
>>  endif
> [..]
>> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h
>> new file mode 100644
>> index 000000000000..92c0e2e37ad9
>> --- /dev/null
>> +++ b/include/linux/einj-cxl.h
>> @@ -0,0 +1,40 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * CXL protocol Error INJection support.
>> + *
>> + * Copyright (c) 2023 Advanced Micro Devices, Inc.
>> + * All Rights Reserved.
>> + *
>> + * Author: Ben Cheatham <benjamin.cheatham@amd.com>
>> + */
>> +#ifndef CXL_EINJ_H
>> +#define CXL_EINJ_H
>> +
>> +#include <linux/pci.h>
>> +
>> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
>
> Per above this needs to be IS_ENABLED(CONFIG_CXL_EINJ), otherwise what's
> the point of the config symbol?
>

So I've tried changing this to IS_ENABLED(CONFIG_CXL_EINJ) and always
get redefinition errors that I can't figure out how to get around
cleanly. I should've elaborated more in the last revision, but part of
changing the dependency rule from ACPI_APEI_EINJ >= CXL_BUS to
ACPI_APEI_EINJ = CXL_BUS is that the above guard stays as
IS_ENABLED(CONFIG_ACPI_APEI_EINJ).

I'm pretty sure the only thing this symbol is doing is enforcing the
above dependency. I would love to be able to remove it at this point,
but doing so would require moving the dependency to either the EINJ or
CXL core modules, which sounds worse. I could implement one of the other
solutions I outlined last revision, but I don't particularly like any of
those (and I know you don't either :)).

I think the solution here is to move the einj_cxl functions into a new
file, gate that file by CONFIG_CXL_EINJ (or change the name to
CONFIG_EINJ_CXL to match einj-cxl.h), and add declarations of the
functions in the EINJ module used by said functions to
drivers/acpi/apei/apei-internal.h. I'm not sure of another approach at
this point, but if you have suggestions I'd be very happy to hear
them!

Thanks,
Ben

>> +int einj_cxl_available_error_type_show(struct seq_file *m, void *v);
>> +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type);
>> +int einj_cxl_inject_rch_error(u64 rcrb, u64 type);
>> +bool einj_is_initialized(void);
>> +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
>> +static inline int einj_cxl_available_error_type_show(struct seq_file *m,
>> +						     void *v)
>> +{
>> +	return -ENXIO;
>> +}
>> +
>> +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type)
>> +{
>> +	return -ENXIO;
>> +}
>> +
>> +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type)
>> +{
>> +	return -ENXIO;
>> +}
>> +
>> +static inline bool einj_is_initialized(void) { return false; }
>> +#endif // CONFIG_ACPI_APEI_EINJ
>> +
>> +#endif // CXL_EINJ_H
>> --
>> 2.34.1
>>
>>
>
>
>

* Re: [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
2024-02-15 15:01 ` Ben Cheatham
@ 2024-02-16 0:15 ` Dan Williams
0 siblings, 0 replies; 20+ messages in thread
From: Dan Williams @ 2024-02-16 0:15 UTC (permalink / raw)
To: Ben Cheatham, Dan Williams, jonathan.cameron, rafael, james.morse,
tony.luck, bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi

Ben Cheatham wrote:
[..]
> >> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h
> >> new file mode 100644
> >> index 000000000000..92c0e2e37ad9
> >> --- /dev/null
> >> +++ b/include/linux/einj-cxl.h
> >> @@ -0,0 +1,40 @@
> >> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> >> +/*
> >> + * CXL protocol Error INJection support.
> >> + *
> >> + * Copyright (c) 2023 Advanced Micro Devices, Inc.
> >> + * All Rights Reserved.
> >> + *
> >> + * Author: Ben Cheatham <benjamin.cheatham@amd.com>
> >> + */
> >> +#ifndef CXL_EINJ_H
> >> +#define CXL_EINJ_H
> >> +
> >> +#include <linux/pci.h>
> >> +
> >> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
> >
> > Per above this needs to be IS_ENABLED(CONFIG_CXL_EINJ), otherwise what's
> > the point of the config symbol?
> >
>
[..]
> I think the solution here is to move the einj_cxl functions into a new
> file, gate that file by CONFIG_CXL_EINJ (or change the name to
> CONFIG_EINJ_CXL to match einj-cxl.h), and add declarations of the
> functions in the EINJ module used by said functions to
> drivers/acpi/apei/apei-internal.h. I'm not sure of another approach at
> this point, but if you have suggestions I'd be very happy to hear
> them!

Yes, this is the way.

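To make the direction agreed on above more concrete, here is a rough sketch of how the header could be gated on the CXL-specific symbol, so that a build without it consumes no EINJ symbols and only sees the -ENXIO stubs. This is an illustration under stated assumptions, not the header from any later revision: it is written for userspace, the plain `#if CONFIG_CXL_EINJ` test stands in for the kernel's IS_ENABLED(CONFIG_CXL_EINJ), and struct pci_dev is left opaque. The closing comments use C style, as requested in the review.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-in for IS_ENABLED(CONFIG_CXL_EINJ): defaults to off.
 * Building with -DCONFIG_CXL_EINJ=1 would select the real declarations,
 * which then have to be provided by a separate source file.
 */
#ifndef CONFIG_CXL_EINJ
#define CONFIG_CXL_EINJ 0
#endif

struct pci_dev;	/* opaque in this sketch */

#if CONFIG_CXL_EINJ
int einj_cxl_inject_error(struct pci_dev *dport_dev, uint64_t type);
int einj_cxl_inject_rch_error(uint64_t rcrb, uint64_t type);
bool einj_is_initialized(void);
#else /* !CONFIG_CXL_EINJ */
static inline int einj_cxl_inject_error(struct pci_dev *dport_dev,
					uint64_t type)
{
	(void)dport_dev;
	(void)type;
	return -ENXIO;
}

static inline int einj_cxl_inject_rch_error(uint64_t rcrb, uint64_t type)
{
	(void)rcrb;
	(void)type;
	return -ENXIO;
}

static inline bool einj_is_initialized(void) { return false; }
#endif /* CONFIG_CXL_EINJ */
```

With the symbol off, callers such as the debugfs setup code compile unchanged: einj_is_initialized() is constant false, so no einj_types or einj_inject files would be created.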
* Re: [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
2024-02-14 20:07 ` [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions Ben Cheatham
2024-02-15 2:25 ` Dan Williams
@ 2024-02-15 9:33 ` Jonathan Cameron
2024-02-15 15:01 ` Ben Cheatham
1 sibling, 1 reply; 20+ messages in thread
From: Jonathan Cameron @ 2024-02-15 9:33 UTC (permalink / raw)
To: Ben Cheatham
Cc: dan.j.williams, rafael, james.morse, tony.luck, bp, dave, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, linux-cxl, linux-acpi

On Wed, 14 Feb 2024 14:07:08 -0600
Ben Cheatham <Benjamin.Cheatham@amd.com> wrote:

> Implement CXL helper functions in the EINJ module for getting/injecting
> available CXL protocol error types and export them to sysfs under
> kernel/debug/cxl.
>
> The kernel/debug/cxl/einj_types file will print the available CXL
> protocol errors in the same format as the available_error_types
> file provided by the EINJ module. The
> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
> error_type and error_inject files provided by the EINJ module, i.e.:
> writing an error type into $dport_dev/einj_inject will inject said error
> type into the CXL dport represented by $dport_dev.
>
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>

Hi Ben,

Just one trivial thing to add to Dan's comments.

> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h
> new file mode 100644
> index 000000000000..92c0e2e37ad9
> --- /dev/null
> +++ b/include/linux/einj-cxl.h
> @@ -0,0 +1,40 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * CXL protocol Error INJection support.
> + *
> + * Copyright (c) 2023 Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Ben Cheatham <benjamin.cheatham@amd.com>
> + */
> +#ifndef CXL_EINJ_H
> +#define CXL_EINJ_H
> +
> +#include <linux/pci.h>
> +
> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
> +int einj_cxl_available_error_type_show(struct seq_file *m, void *v);
> +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type);
> +int einj_cxl_inject_rch_error(u64 rcrb, u64 type);
> +bool einj_is_initialized(void);
> +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
Whilst C++ Style comments are allowed for a few specific case, you should stick
to local style. In the include directory a quick grep gave me loads of
#else /* */
lines and only 2
#else //

So C style comments preferred.

> +static inline int einj_cxl_available_error_type_show(struct seq_file *m,
> +						     void *v)
> +{
> +	return -ENXIO;
> +}
> +
> +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type)
> +{
> +	return -ENXIO;
> +}
> +
> +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type)
> +{
> +	return -ENXIO;
> +}
> +
> +static inline bool einj_is_initialized(void) { return false; }
> +#endif // CONFIG_ACPI_APEI_EINJ
> +
> +#endif // CXL_EINJ_H

* Re: [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions
2024-02-15 9:33 ` Jonathan Cameron
@ 2024-02-15 15:01 ` Ben Cheatham
0 siblings, 0 replies; 20+ messages in thread
From: Ben Cheatham @ 2024-02-15 15:01 UTC (permalink / raw)
To: Jonathan Cameron
Cc: dan.j.williams, rafael, james.morse, tony.luck, bp, dave, dave.jiang,
alison.schofield, vishal.l.verma, ira.weiny, linux-cxl, linux-acpi

On 2/15/24 3:33 AM, Jonathan Cameron wrote:
> On Wed, 14 Feb 2024 14:07:08 -0600
> Ben Cheatham <Benjamin.Cheatham@amd.com> wrote:
>
>> Implement CXL helper functions in the EINJ module for getting/injecting
>> available CXL protocol error types and export them to sysfs under
>> kernel/debug/cxl.
>>
>> The kernel/debug/cxl/einj_types file will print the available CXL
>> protocol errors in the same format as the available_error_types
>> file provided by the EINJ module. The
>> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
>> error_type and error_inject files provided by the EINJ module, i.e.:
>> writing an error type into $dport_dev/einj_inject will inject said error
>> type into the CXL dport represented by $dport_dev.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
>
> Hi Ben,
>
> Just one trivial thing to add to Dan's comments.
>
>> diff --git a/include/linux/einj-cxl.h b/include/linux/einj-cxl.h
>> new file mode 100644
>> index 000000000000..92c0e2e37ad9
>> --- /dev/null
>> +++ b/include/linux/einj-cxl.h
>> @@ -0,0 +1,40 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * CXL protocol Error INJection support.
>> + *
>> + * Copyright (c) 2023 Advanced Micro Devices, Inc.
>> + * All Rights Reserved.
>> + *
>> + * Author: Ben Cheatham <benjamin.cheatham@amd.com>
>> + */
>> +#ifndef CXL_EINJ_H
>> +#define CXL_EINJ_H
>> +
>> +#include <linux/pci.h>
>> +
>> +#if IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
>> +int einj_cxl_available_error_type_show(struct seq_file *m, void *v);
>> +int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type);
>> +int einj_cxl_inject_rch_error(u64 rcrb, u64 type);
>> +bool einj_is_initialized(void);
>> +#else // !IS_ENABLED(CONFIG_ACPI_APEI_EINJ)
> Whilst C++ Style comments are allowed for a few specific case, you should stick
> to local style. In the include directory a quick grep gave me loads of
> #else /* */
> lines and only 2
> #else //
>
> So C style comments preferred.
>

You are 100% correct. I did a grep for examples and didn't notice that I
did it from the base linux directory, so of course I found a bunch of
examples of C++ style comments :/. I'll fix it in the next revision.

Thanks,
Ben

>
>> +static inline int einj_cxl_available_error_type_show(struct seq_file *m,
>> +						     void *v)
>> +{
>> +	return -ENXIO;
>> +}
>> +
>> +static inline int einj_cxl_inject_error(struct pci_dev *dport_dev, u64 type)
>> +{
>> +	return -ENXIO;
>> +}
>> +
>> +static inline int einj_cxl_inject_rch_error(u64 rcrb, u64 type)
>> +{
>> +	return -ENXIO;
>> +}
>> +
>> +static inline bool einj_is_initialized(void) { return false; }
>> +#endif // CONFIG_ACPI_APEI_EINJ
>> +
>> +#endif // CXL_EINJ_H
>

* [PATCH v12 3/3] EINJ, Documentation: Update EINJ kernel doc
2024-02-14 20:07 [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types Ben Cheatham
2024-02-14 20:07 ` [PATCH v12 1/3] EINJ: Migrate to a platform driver Ben Cheatham
2024-02-14 20:07 ` [PATCH v12 2/3] cxl/core, EINJ: Add EINJ CXL debugfs files and EINJ helper functions Ben Cheatham
@ 2024-02-14 20:07 ` Ben Cheatham
2024-02-20 19:02 ` Davidlohr Bueso
2024-02-15 1:11 ` [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types Tony Luck
3 siblings, 1 reply; 20+ messages in thread
From: Ben Cheatham @ 2024-02-14 20:07 UTC (permalink / raw)
To: dan.j.williams, jonathan.cameron, rafael, james.morse, tony.luck, bp
Cc: dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi, benjamin.cheatham

Update EINJ kernel document to include how to inject CXL protocol error
types, build the kernel to include CXL error types, and give an example
injection.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 .../firmware-guide/acpi/apei/einj.rst         | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
index d6b61d22f525..f179adf7b61c 100644
--- a/Documentation/firmware-guide/acpi/apei/einj.rst
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -181,6 +181,25 @@ You should see something like this in dmesg::
   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
 
+CXL error types are supported from ACPI 6.5 onwards. These error types
+are not available in the legacy interface at /sys/kernel/debug/apei/einj,
+and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
+called "einj_types" that is analogous to available_error_type under
+debug/apei/einj. There is also an "einj_inject" file in each $dport_dev
+directory under debug/cxl that will inject a given error into the dport
+represented by $dport_dev. For example, to inject a CXL.mem protocol
+correctable error into $dport_dev=0000:e0:01.1::
+
+    # cd /sys/kernel/debug/cxl/
+    # cat einj_types                # See which errors can be injected
+	0x00008000  CXL.mem Protocol Correctable
+	0x00010000  CXL.mem Protocol Uncorrectable non-fatal
+	0x00020000  CXL.mem Protocol Uncorrectable fatal
+    # cd 0000:e0:01.1               # Navigate to dport to inject into
+    # echo 0x8000 > einj_inject     # Inject error
+
+To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
+
 Special notes for injection into SGX enclaves:
 
 There may be a separate BIOS setup option to enable SGX injection.
-- 
2.34.1

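The shell steps in the documentation patch above can also be driven from a small test program. The sketch below is a hedged userspace illustration, not part of the series: einj_write_type() is a hypothetical helper name, and the path a caller passes (for example /sys/kernel/debug/cxl/0000:e0:01.1/einj_inject) is assumed to exist only on a platform whose firmware exposes CXL EINJ support and with debugfs mounted.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write an error type, formatted as hex, to the file
 * at @path, much like "echo 0x8000 > einj_inject". Returns 0 on success
 * and -1 on any failure.
 */
static int einj_write_type(const char *path, unsigned long long type)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "0x%llx\n", type) > 0;

	/*
	 * stdio buffers the write, so a rejection by the kernel (for
	 * example -EINVAL for an unsupported type) only shows up when the
	 * buffer is flushed; check fclose() as well.
	 */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would first read einj_types to confirm that the platform lists the type, then call einj_write_type() with the chosen dport's einj_inject path and, for instance, 0x8000 for a CXL.mem Protocol Correctable error.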
* Re: [PATCH v12 3/3] EINJ, Documentation: Update EINJ kernel doc
2024-02-14 20:07 ` [PATCH v12 3/3] EINJ, Documentation: Update EINJ kernel doc Ben Cheatham
@ 2024-02-20 19:02 ` Davidlohr Bueso
2024-02-20 19:59 ` Ben Cheatham
0 siblings, 1 reply; 20+ messages in thread
From: Davidlohr Bueso @ 2024-02-20 19:02 UTC (permalink / raw)
To: Ben Cheatham
Cc: dan.j.williams, jonathan.cameron, rafael, james.morse, tony.luck, bp,
dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny,
linux-cxl, linux-acpi

On Wed, 14 Feb 2024, Ben Cheatham wrote:

>Update EINJ kernel document to include how to inject CXL protocol error
>types, build the kernel to include CXL error types, and give an example
>injection.
>
>Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>

Would vote for folding into 2/3, but otherwise looks good with a minor
suggestion.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

>---
> .../firmware-guide/acpi/apei/einj.rst         | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
>diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>index d6b61d22f525..f179adf7b61c 100644
>--- a/Documentation/firmware-guide/acpi/apei/einj.rst
>+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>@@ -181,6 +181,25 @@ You should see something like this in dmesg::
>   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
>   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
>
>+CXL error types are supported from ACPI 6.5 onwards. These error types
                                                    ^ and target a CXL Port
>+are not available in the legacy interface at /sys/kernel/debug/apei/einj,
>+and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
>+called "einj_types" that is analogous to available_error_type under
>+debug/apei/einj. There is also an "einj_inject" file in each $dport_dev
>+directory under debug/cxl that will inject a given error into the dport
>+represented by $dport_dev. For example, to inject a CXL.mem protocol
>+correctable error into $dport_dev=0000:e0:01.1::
>+
>+    # cd /sys/kernel/debug/cxl/
>+    # cat einj_types                # See which errors can be injected
>+	0x00008000  CXL.mem Protocol Correctable
>+	0x00010000  CXL.mem Protocol Uncorrectable non-fatal
>+	0x00020000  CXL.mem Protocol Uncorrectable fatal
>+    # cd 0000:e0:01.1               # Navigate to dport to inject into
>+    # echo 0x8000 > einj_inject     # Inject error
>+
>+To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
>+
> Special notes for injection into SGX enclaves:
>
> There may be a separate BIOS setup option to enable SGX injection.
>--
>2.34.1
>

* Re: [PATCH v12 3/3] EINJ, Documentation: Update EINJ kernel doc 2024-02-20 19:02 ` Davidlohr Bueso @ 2024-02-20 19:59 ` Ben Cheatham 0 siblings, 0 replies; 20+ messages in thread From: Ben Cheatham @ 2024-02-20 19:59 UTC (permalink / raw) To: Davidlohr Bueso Cc: dan.j.williams, jonathan.cameron, rafael, james.morse, tony.luck, bp, dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny, linux-cxl, linux-acpi Thanks for taking a look David! On 2/20/24 1:02 PM, Davidlohr Bueso wrote: > On Wed, 14 Feb 2024, Ben Cheatham wrote: > >> Update EINJ kernel document to include how to inject CXL protocol error >> types, build the kernel to include CXL error types, and give an example >> injection. >> >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> >> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> > > Would vote for folding into 2/3, but otherwise looks good with a minor > suggestion. > I would, but I think 2/3 is already pretty large and this is more digestible to me. I've also reworked a large portion of that patch for v13 so it's probably better to keep it smaller. > Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> > >> --- >> .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ >> 1 file changed, 19 insertions(+) >> >> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst >> index d6b61d22f525..f179adf7b61c 100644 >> --- a/Documentation/firmware-guide/acpi/apei/einj.rst >> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst >> @@ -181,6 +181,25 @@ You should see something like this in dmesg:: >> [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 >> [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) >> >> +CXL error types are supported from ACPI 6.5 onwards. 
These error types > ^ and target a CXL Port > Will add. Thanks, Ben >> +are not available in the legacy interface at /sys/kernel/debug/apei/einj, >> +and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl >> +called "einj_type" that is analogous to available_error_type under debug/cxl. >> +There is also a "einj_inject" file in each $dport_dev directory under debug/cxl >> +that will inject a given error into the dport represented by $dport_dev. >> +For example, to inject a CXL.mem protocol correctable error into >> +$dport_dev=pci0000:0c:: >> + >> + # cd /sys/kernel/debug/cxl/ >> + # cat einj_type # See which error can be injected >> + 0x00008000 CXL.mem Protocol Correctable >> + 0x00010000 CXL.mem Protocol Uncorrectable non-fatal >> + 0x00020000 CXL.mem Protocol Uncorrectable fatal >> + # cd 0000:e0:01.1 # Navigate to dport to inject into >> + # echo 0x8000 > einj_inject # Inject error >> + >> +To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled. >> + >> Special notes for injection into SGX enclaves: >> >> There may be a separate BIOS setup option to enable SGX injection. >> -- >> 2.34.1 >> ^ permalink raw reply [flat|nested] 20+ messages in thread
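The einj_type listing quoted in the documentation patch above has one hexadecimal value followed by a description on each line. A minimal Python sketch of parsing such a listing, and of checking a value against it before writing that value to einj_inject, might look like the following. The helper names parse_einj_types() and is_injectable() are hypothetical and are not part of this patch series; the sample text is copied from the example in the patch, and real output will differ per platform.

```python
def parse_einj_types(text):
    """Map each advertised error type value to its description string."""
    types = {}
    for line in text.splitlines():
        fields = line.split(None, 1)  # split on the first run of whitespace
        if not fields:
            continue  # skip blank lines
        value = int(fields[0], 16)  # base 16 accepts the leading "0x"
        name = fields[1].strip() if len(fields) > 1 else ""
        types[value] = name
    return types


def is_injectable(types, requested):
    """Return True only if the platform advertised the requested type."""
    return requested in types


# Sample listing taken from the example in the documentation patch.
SAMPLE = """\
0x00008000 CXL.mem Protocol Correctable
0x00010000 CXL.mem Protocol Uncorrectable non-fatal
0x00020000 CXL.mem Protocol Uncorrectable fatal
"""

if __name__ == "__main__":
    advertised = parse_einj_types(SAMPLE)
    for value, name in sorted(advertised.items()):
        print(f"{value:#010x} {name}")
    print("0x8000 injectable:", is_injectable(advertised, 0x8000))
```

On a real system the text would come from reading the file under /sys/kernel/debug/cxl/ rather than from a constant; the sketch only covers the parsing step so that it can run without debugfs access.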
* Re: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-14 20:07 [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types Ben Cheatham ` (2 preceding siblings ...) 2024-02-14 20:07 ` [PATCH v12 3/3] EINJ, Documentation: Update EINJ kernel doc Ben Cheatham @ 2024-02-15 1:11 ` Tony Luck 2024-02-15 2:53 ` Dan Williams 3 siblings, 1 reply; 20+ messages in thread From: Tony Luck @ 2024-02-15 1:11 UTC (permalink / raw) To: Ben Cheatham Cc: dan.j.williams, jonathan.cameron, rafael, james.morse, bp, dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny, linux-cxl, linux-acpi On Wed, Feb 14, 2024 at 02:07:06PM -0600, Ben Cheatham wrote: > v12 Changes: > - Rebase onto v6.8-rc4 > - Squash Kconfig patch into patch 2/3 (Jonathan) > - Change CONFIG_CXL_EINJ from "depends on ACPI_APEI_EINJ >= CXL_BUS" > to "depends on ACPI_APEI_EINJ = CXL_BUS" > - Drop "ACPI, APEI" part of commit message title and use just EINJ > instead (Dan) > - Add protocol error types to "einj_types" documentation (Jonathan) > - Change 0xffff... constants to use GENMASK() > - Drop param* variables and use constants instead in cxl error > inject functions (Jonathan) > - Add is_cxl_error_type() helper function in einj.c (Jonathan) > - Remove a stray function declaration in einj-cxl.h (Jonathan) > - Comment #else/#endifs with corresponding #if/#ifdef in > einj-cxl.h (Jonathan) > > v11 Changes: > - Drop patch 2/6 (Add CXL protocol error defines) and put the > defines in patch 4/6 instead (Dan) > - Add Dan's reviewed-by > > The new CXL error types will use the Memory Address field in the > SET_ERROR_TYPE_WITH_ADDRESS structure in order to target a CXL 1.1 > compliant memory-mapped downstream port. The value of the memory address > will be in the port's MMIO range, and it will not represent physical > (normal or persistent) memory. > > Add the functionality for injecting CXL 1.1 errors to the EINJ module, > but not through the EINJ legacy interface under /sys/kernel/debug/apei/einj. 
> Instead, make the error types available under /sys/kernel/debug/cxl. > This allows for validating the MMIO address for a CXL 1.1 error type > while also not making the user responsible for finding it. I tried this series on an Intel Icelake (which as far as I know doesn't support CXL). Couple of oddities: 1) I built as a module (CONFIG_ACPI_APEI_EINJ=m) like I normally do. But this was autoloaded and EINJ initialized during boot: [ 33.909111] EINJ: Error INJection is initialized. I'm wondering if that might be a problem for anyone that likes to leave the einj module not loaded until they have some need to inject errors. 2) Even though my system doesn't have any CXL support, I found this: # cat /sys/kernel/debug/cxl/einj_types 0x00001000 CXL.cache Protocol Correctable What does this mean? Using ras-tools I injected some DDR memory errors. So legacy functionality still works OK. -Tony ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-15 1:11 ` [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types Tony Luck @ 2024-02-15 2:53 ` Dan Williams 2024-02-15 15:01 ` Ben Cheatham 2024-02-15 17:23 ` Luck, Tony 0 siblings, 2 replies; 20+ messages in thread From: Dan Williams @ 2024-02-15 2:53 UTC (permalink / raw) To: Tony Luck, Ben Cheatham Cc: dan.j.williams, jonathan.cameron, rafael, james.morse, bp, dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny, linux-cxl, linux-acpi Tony Luck wrote: > On Wed, Feb 14, 2024 at 02:07:06PM -0600, Ben Cheatham wrote: > > v12 Changes: > > - Rebase onto v6.8-rc4 > > - Squash Kconfig patch into patch 2/3 (Jonathan) > > - Change CONFIG_CXL_EINJ from "depends on ACPI_APEI_EINJ >= CXL_BUS" > > to "depends on ACPI_APEI_EINJ = CXL_BUS" > > - Drop "ACPI, APEI" part of commit message title and use just EINJ > > instead (Dan) > > - Add protocol error types to "einj_types" documentation (Jonathan) > > - Change 0xffff... constants to use GENMASK() > > - Drop param* variables and use constants instead in cxl error > > inject functions (Jonathan) > > - Add is_cxl_error_type() helper function in einj.c (Jonathan) > > - Remove a stray function declaration in einj-cxl.h (Jonathan) > > - Comment #else/#endifs with corresponding #if/#ifdef in > > einj-cxl.h (Jonathan) > > > > v11 Changes: > > - Drop patch 2/6 (Add CXL protocol error defines) and put the > > defines in patch 4/6 instead (Dan) > > - Add Dan's reviewed-by > > > > The new CXL error types will use the Memory Address field in the > > SET_ERROR_TYPE_WITH_ADDRESS structure in order to target a CXL 1.1 > > compliant memory-mapped downstream port. The value of the memory address > > will be in the port's MMIO range, and it will not represent physical > > (normal or persistent) memory. > > > > Add the functionality for injecting CXL 1.1 errors to the EINJ module, > > but not through the EINJ legacy interface under /sys/kernel/debug/apei/einj. 
> > Instead, make the error types available under /sys/kernel/debug/cxl. > > This allows for validating the MMIO address for a CXL 1.1 error type > > while also not making the user responsible for finding it. > > I tried this series on an Intel Icelake (which as far as I know > doesn't support CXL). Thanks Tony! > Couple of oddities: > > 1) I built as a module (CONFIG_ACPI_APEI_EINJ=m) like I normally do. > But this was autoloaded and EINJ initialized during boot: > > [ 33.909111] EINJ: Error INJection is initialized. In the current code it should only load if cxl_core.ko is also loaded. Can you share the output of lsmod to maybe see which module loaded that dependency? > I'm wondering if that might be a problem for anyone that likes to > leave the einj module not loaded until they have some need to > inject errors. That is a behavior change of this approach. Is it a problem? If it is I would say that we need to break out a new cxl_einj.ko module that when it loads walks the CXL topology and creates the debugfs files. Otherwise my assumption is that CONFIG_CXL_EINJ=y means that cxl_core.ko loads einj.ko unconditionally. I would save that work for a clear description of why einj.ko should not be resident. > 2) Even though my system doesn't have any CXL support, I found this: > > # cat /sys/kernel/debug/cxl/einj_types > 0x00001000 CXL.cache Protocol Correctable > > What does this mean? Strange, does: /sys/kernel/debug/einj/available_error_type ...show the same even before these patches? I.e. maybe this pre-CXL BIOS was using the 0x1000 encoding when it should not? > Using ras-tools I injected some DDR memory errors. So legacy > functionality still works OK. > > -Tony ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-15 2:53 ` Dan Williams @ 2024-02-15 15:01 ` Ben Cheatham 2024-02-15 17:23 ` Luck, Tony 1 sibling, 0 replies; 20+ messages in thread From: Ben Cheatham @ 2024-02-15 15:01 UTC (permalink / raw) To: Dan Williams, Tony Luck Cc: jonathan.cameron, rafael, james.morse, bp, dave, dave.jiang, alison.schofield, vishal.l.verma, ira.weiny, linux-cxl, linux-acpi On 2/14/24 8:53 PM, Dan Williams wrote: > Tony Luck wrote: >> On Wed, Feb 14, 2024 at 02:07:06PM -0600, Ben Cheatham wrote: >>> v12 Changes: >>> - Rebase onto v6.8-rc4 >>> - Squash Kconfig patch into patch 2/3 (Jonathan) >>> - Change CONFIG_CXL_EINJ from "depends on ACPI_APEI_EINJ >= CXL_BUS" >>> to "depends on ACPI_APEI_EINJ = CXL_BUS" >>> - Drop "ACPI, APEI" part of commit message title and use just EINJ >>> instead (Dan) >>> - Add protocol error types to "einj_types" documentation (Jonathan) >>> - Change 0xffff... constants to use GENMASK() >>> - Drop param* variables and use constants instead in cxl error >>> inject functions (Jonathan) >>> - Add is_cxl_error_type() helper function in einj.c (Jonathan) >>> - Remove a stray function declaration in einj-cxl.h (Jonathan) >>> - Comment #else/#endifs with corresponding #if/#ifdef in >>> einj-cxl.h (Jonathan) >>> >>> v11 Changes: >>> - Drop patch 2/6 (Add CXL protocol error defines) and put the >>> defines in patch 4/6 instead (Dan) >>> - Add Dan's reviewed-by >>> >>> The new CXL error types will use the Memory Address field in the >>> SET_ERROR_TYPE_WITH_ADDRESS structure in order to target a CXL 1.1 >>> compliant memory-mapped downstream port. The value of the memory address >>> will be in the port's MMIO range, and it will not represent physical >>> (normal or persistent) memory. >>> >>> Add the functionality for injecting CXL 1.1 errors to the EINJ module, >>> but not through the EINJ legacy interface under /sys/kernel/debug/apei/einj. 
>>> Instead, make the error types available under /sys/kernel/debug/cxl. >>> This allows for validating the MMIO address for a CXL 1.1 error type >>> while also not making the user responsible for finding it. >> >> I tried this series on an Intel Icelake (which as far as I know >> doesn't support CXL). > > Thanks Tony! > >> Couple of oddities: >> >> 1) I built as a module (CONFIG_ACPI_APEI_EINJ=m) like I normally do. >> But this was autoloaded and EINJ initialized during boot: >> >> [ 33.909111] EINJ: Error INJection is initialized. > > In the current code it should only load if cxl_core.ko is also loaded. > > Can you share the output of lsmod to maybe see which module loaded that > dependency? > >> I'm wondering if that might be a problem for anyone that likes to >> leave the einj module not loaded until they have some need to >> inject errors. > > That is a behavior change of this approach. Is it a problem? > > If it is I would say that we need to break out a new cxl_einj.ko module > that when it loads walks the CXL topology and creates the debugfs files. > Otherwise my assumption is that CONFIG_CXL_EINJ=y means that cxl_core.ko > loads einj.ko unconditionally. > > I would save that work for a clear description of why einj.ko should not > be resident. > >> 2) Even though my system doesn't have any CXL support, I found this: >> >> # cat /sys/kernel/debug/cxl/einj_types >> 0x00001000 CXL.cache Protocol Correctable >> >> What does this mean? > > Strange, does: > > /sys/kernel/debug/einj/available_error_type > > ...show the same even before these patches? I.e. maybe this pre-CXL BIOS was > using the 0x1000 encoding when it should not? > Dan's already alluded to this, but to elaborate: This series doesn't do anything different than before when getting available error types, it just puts the CXL types into the "einj_types" file instead. So it's possible that your platform doesn't have CXL support, but it is reporting a CXL error type for EINJ. 
This could be a BIOS error, or it could be that the BIOS is pre ACPIv6.5 and was using 0x1000 for a different error type and the kernel is interpreting it as a CXL error type. If you think something else is happening, I'd love to hear about it! Thanks, Ben >> Using ras-tools I injected some DDR memory errors. So legacy >> functionality still works OK. >> >> -Tony > > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-15 2:53 ` Dan Williams 2024-02-15 15:01 ` Ben Cheatham @ 2024-02-15 17:23 ` Luck, Tony 2024-02-15 18:15 ` Luck, Tony 2024-02-16 0:09 ` Dan Williams 1 sibling, 2 replies; 20+ messages in thread From: Luck, Tony @ 2024-02-15 17:23 UTC (permalink / raw) To: Williams, Dan J, Ben Cheatham Cc: jonathan.cameron@huawei.com, rafael@kernel.org, james.morse@arm.com, bp@alien8.de, dave@stogolabs.net, Jiang, Dave, Schofield, Alison, Verma, Vishal L, Weiny, Ira, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org > > Couple of oddities: > > > > 1) I built as a module (CONFIG_ACPI_APEI_EINJ=m) like I normally do. > > But this was autoloaded and EINJ initialized during boot: > > > > [ 33.909111] EINJ: Error INJection is initialized. > > In the current code it should only load if cxl_core.ko is also loaded. > > Can you share the output of lsmod to maybe see which module loaded that > dependency? > > > I'm wondering if that might be a problem for anyone that likes to > > leave the einj module not loaded until they have some need to > > inject errors. > > That is a behavior change of this approach. Is it a problem? > > If it is I would say that we need to break out a new cxl_einj.ko module > that when it loads walks the CXL topology and creates the debugfs files. > Otherwise my assumption is that CONFIG_CXL_EINJ=y means that cxl_core.ko > loads einj.ko unconditionally. > > I would save that work for a clear description of why einj.ko should not > be resident. Personally, it would save me having to type "modprobe einj" to run tests (and answer e-mails from validation folks telling me they missed this step). But others might feel this is unwanted. It looks like some distros build kernels with CONFIG_ACPI_APEI_EINJ=m. On the other hand, EINJ should be under control of a BIOS option that defaults to "off". So production systems won't enable it. But perhaps there will be a pr_warn() or pr_err() during boot. 
One of these will likely trip: pr_warn("EINJ table not found.\n"); pr_err("Failed to get EINJ table: %s\n", acpi_format_exception(status)); pr_warn(FW_BUG "Invalid EINJ table.\n"); pr_err("Error collecting EINJ resources.\n"); > > > 2) Even though my system doesn't have any CXL support, I found this: > > > > # cat /sys/kernel/debug/cxl/einj_types > > 0x00001000 CXL.cache Protocol Correctable > > > > What does this mean? > > Strange, does: > > /sys/kernel/debug/einj/available_error_type > > ...show the same even before these patches? I.e. maybe this pre-CXL BIOS was > using the 0x1000 encoding when it should not? I added a printk() to show the raw value returned by my BIOS: 0x80001038 So your guess is correct. My BIOS is setting the 0x1000 bit when it shouldn't. > > > Using ras-tools I injected some DDR memory errors. So legacy > > functionality still works OK. -Tony ^ permalink raw reply [flat|nested] 20+ messages in thread
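The raw value 0x80001038 reported above can be taken apart bit by bit to see why a CXL type shows up on a machine without CXL support. The following Python sketch decodes such a mask. The bit assignments follow the EINJ error type definitions in ACPI 6.5 as understood here, and the names EINJ_ERROR_TYPE_NAMES, CXL_ERROR_MASK, decode_error_types() and cxl_error_types() are hypothetical, chosen for illustration only; the kernel's own tables and its is_cxl_error_type() helper may be organised differently.

```python
# Bit position -> description, per the ACPI 6.5 EINJ error type definitions
# (an assumption of this sketch; check the specification before relying on it).
EINJ_ERROR_TYPE_NAMES = {
    0: "Processor Correctable",
    1: "Processor Uncorrectable non-fatal",
    2: "Processor Uncorrectable fatal",
    3: "Memory Correctable",
    4: "Memory Uncorrectable non-fatal",
    5: "Memory Uncorrectable fatal",
    6: "PCI Express Correctable",
    7: "PCI Express Uncorrectable non-fatal",
    8: "PCI Express Uncorrectable fatal",
    9: "Platform Correctable",
    10: "Platform Uncorrectable non-fatal",
    11: "Platform Uncorrectable fatal",
    12: "CXL.cache Protocol Correctable",
    13: "CXL.cache Protocol Uncorrectable non-fatal",
    14: "CXL.cache Protocol Uncorrectable fatal",
    15: "CXL.mem Protocol Correctable",
    16: "CXL.mem Protocol Uncorrectable non-fatal",
    17: "CXL.mem Protocol Uncorrectable fatal",
    31: "Vendor Defined Error Types",
}

# Bits 12..17 carry the CXL protocol error types.
CXL_ERROR_MASK = 0x3F000


def decode_error_types(mask):
    """Return (value, name) pairs for every known bit set in mask."""
    return [
        (1 << bit, name)
        for bit, name in sorted(EINJ_ERROR_TYPE_NAMES.items())
        if mask & (1 << bit)
    ]


def cxl_error_types(mask):
    """Return only the pairs that fall inside the CXL range."""
    return [(v, n) for v, n in decode_error_types(mask) if v & CXL_ERROR_MASK]


if __name__ == "__main__":
    raw = 0x80001038  # value reported in the message above
    for value, name in decode_error_types(raw):
        print(f"{value:#010x} {name}")
```

Under these assumptions the mask splits into the three memory error types, the vendor-defined bit, and bit 12, which is the single CXL.cache entry that appeared in einj_types.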
* RE: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-15 17:23 ` Luck, Tony @ 2024-02-15 18:15 ` Luck, Tony 2024-02-16 0:11 ` Dan Williams 2024-02-16 0:09 ` Dan Williams 1 sibling, 1 reply; 20+ messages in thread From: Luck, Tony @ 2024-02-15 18:15 UTC (permalink / raw) To: Williams, Dan J, Ben Cheatham Cc: jonathan.cameron@huawei.com, rafael@kernel.org, james.morse@arm.com, bp@alien8.de, dave@stogolabs.net, Jiang, Dave, Schofield, Alison, Verma, Vishal L, Weiny, Ira, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org > But perhaps there will be a pr_warn() or pr_err() during boot. One of these will likely trip: > pr_warn("EINJ table not found.\n"); > pr_err("Failed to get EINJ table: %s\n", acpi_format_exception(status)); > pr_warn(FW_BUG "Invalid EINJ table.\n"); > pr_err("Error collecting EINJ resources.\n"); Just tried on my system. The winner (for me) is: [ 27.989081] EINJ: EINJ table not found. If you decide that it is OK to auto-load, I think that needs severity downgraded to pr_info(). Users ask questions when they see warnings. -Tony ^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-15 18:15 ` Luck, Tony @ 2024-02-16 0:11 ` Dan Williams 0 siblings, 0 replies; 20+ messages in thread From: Dan Williams @ 2024-02-16 0:11 UTC (permalink / raw) To: Luck, Tony, Williams, Dan J, Ben Cheatham Cc: jonathan.cameron@huawei.com, rafael@kernel.org, james.morse@arm.com, bp@alien8.de, dave@stogolabs.net, Jiang, Dave, Schofield, Alison, Verma, Vishal L, Weiny, Ira, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org Luck, Tony wrote: > > But perhaps there will be a pr_warn() or pr_err() during boot. One of these will likely trip: > > > pr_warn("EINJ table not found.\n"); > > pr_err("Failed to get EINJ table: %s\n", acpi_format_exception(status)); > > pr_warn(FW_BUG "Invalid EINJ table.\n"); > > pr_err("Error collecting EINJ resources.\n"); > > Just tried on my system. The winner (for me) is: > > [ 27.989081] EINJ: EINJ table not found. > > If you decide that it is OK to auto-load, I think that needs severity downgraded to pr_info(). > > Users ask questions when they see warnings. Sounds good, I missed this before sending my last reply. ^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v12 0/3] cxl, EINJ: Update EINJ for CXL error types 2024-02-15 17:23 ` Luck, Tony 2024-02-15 18:15 ` Luck, Tony @ 2024-02-16 0:09 ` Dan Williams 1 sibling, 0 replies; 20+ messages in thread From: Dan Williams @ 2024-02-16 0:09 UTC (permalink / raw) To: Luck, Tony, Williams, Dan J, Ben Cheatham Cc: jonathan.cameron@huawei.com, rafael@kernel.org, james.morse@arm.com, bp@alien8.de, dave@stogolabs.net, Jiang, Dave, Schofield, Alison, Verma, Vishal L, Weiny, Ira, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org Luck, Tony wrote: > > I would save that work for a clear description of why einj.ko should not > > be resident. > > Personally, it would save me having to type "modprobe einj" to run tests (and > answer e-mails from validation folks telling me they missed this step). It would only autoload with cxl_core.ko though. > > But others might feel this is unwanted. It looks like some distros build kernels > with CONFIG_ACPI_APEI_EINJ=m. > > On the other hand, EINJ should be under control of a BIOS option that > defaults to "off". So production systems won't enable it. > > But perhaps there will be a pr_warn() or pr_err() during boot. One of these will likely trip: > > pr_warn("EINJ table not found.\n"); > pr_err("Failed to get EINJ table: %s\n", acpi_format_exception(status)); > pr_warn(FW_BUG "Invalid EINJ table.\n"); > pr_err("Error collecting EINJ resources.\n"); Oh, good point. However, should a debug module really be throwing pr_err() or pr_warn()? I.e. should those all move to pr_info() or pr_debug() since the error case is detected by the lack of debugfs files being published. At least that would be my preference over creating cxl_einj.ko. ^ permalink raw reply [flat|nested] 20+ messages in thread