On 3/20/20 4:31 PM, Ganesh Goudar wrote: > MCE handling on pSeries platform fails as recent rework to use common > code for pSeries and PowerNV in machine check error handling tries to > access per-cpu variables in realmode. The per-cpu variables may be > outside the RMO region on pSeries platform and needs translation to be > enabled for access. Just moving these per-cpu variable into RMO region > did'nt help because we queue some work to workqueues in real mode, which > again tries to touch per-cpu variables. Also fwnmi_release_errinfo() > cannot be called when translation is not enabled. > > This patch fixes this by enabling translation in the exception handler > when all required real mode handling is done. This change only affects > the pSeries platform. > > Without this fix below kernel crash is seen on injecting > SLB multihit: > > BUG: Unable to handle kernel data access on read at 0xc00000027b205950 > Faulting instruction address: 0xc00000000003b7e0 > Oops: Kernel access of bad area, sig: 11 [#1] > LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries > Modules linked in: mcetest_slb(OE+) af_packet(E) xt_tcpudp(E) ip6t_rpfilter(E) ip6t_REJECT(E) ipt_REJECT(E) xt_conntrack(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) xfs(E) ibmveth(E) vmx_crypto(E) gf128mul(E) uio_pdrv_genirq(E) uio(E) crct10dif_vpmsum(E) rtc_generic(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) raid6_pq(E) sr_mod(E) sd_mod(E) cdrom(E) ibmvscsi(E) scsi_transport_srp(E) crc32c_vpmsum(E) dm_mod(E) sg(E) scsi_mod(E) > CPU: 34 PID: 8154 Comm: insmod Kdump: loaded Tainted: G OE 5.5.0-mahesh #1 > NIP: c00000000003b7e0 LR: c0000000000f2218 CTR: 0000000000000000 > REGS: c000000007dcb960 TRAP: 0300 Tainted: G OE (5.5.0-mahesh) > MSR: 8000000000001003 CR: 28002428 XER: 20040000 > CFAR: c0000000000f2214 DAR: c00000027b205950 DSISR: 40000000 IRQMASK: 0 > GPR00: c0000000000f2218 c000000007dcbbf0 c000000001544800 c000000007dcbd70 > GPR04: 0000000000000001 c000000007dcbc98 c008000000d00258 c0080000011c0000 > GPR08: 0000000000000000 0000000300000003 c000000001035950 0000000003000048 > GPR12: 000000027a1d0000 c000000007f9c000 0000000000000558 0000000000000000 > GPR16: 0000000000000540 c008000001110000 c008000001110540 0000000000000000 > GPR20: c00000000022af10 c00000025480fd70 c008000001280000 c00000004bfbb300 > GPR24: c000000001442330 c00800000800000d c008000008000000 4009287a77000510 > GPR28: 0000000000000000 0000000000000002 c000000001033d30 0000000000000001 > NIP [c00000000003b7e0] save_mce_event+0x30/0x240 > LR [c0000000000f2218] pseries_machine_check_realmode+0x2c8/0x4f0 > Call Trace: > Instruction dump: > 3c4c0151 38429050 7c0802a6 60000000 fbc1fff0 fbe1fff8 f821ffd1 3d42ffaf > 3fc2ffaf e98d0030 394a1150 3bdef530 <7d6a62aa> 1d2b0048 2f8b0063 380b0001 > ---[ end trace 46fd63f36bbdd940 ]--- > > Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common event code") > Reviewed-by: Mahesh Salgaonkar > Reviewed-by: Nicholas Piggin > Signed-off-by: Ganesh Goudar Hi mpe, Do you have any comments on this patch ?