The MCA/INIT/CMC/CPE log decoding currently in arch/ia64/kernel/mca.c
has some problems:

  - It doesn't know much about OEM-specific sections.

  - At boot-time, it sometimes takes so long to print the log to the
    console that the BSP erroneously assumes an AP is stuck.  This
    sometimes causes *another* MCA.

  - The log goes ONLY to the console, where the output may be lost.

So here's some fodder for discussion.  I don't claim that this is
ready for prime time; I just want to get some feedback on whether
this is a reasonable approach.

The attached patch (against 2.4.21-rc1) makes the raw, binary error
records straight from SAL available via files in /proc:

    /proc/sal/cpu/{mca,init,cmc,cpe}

If you read the file, you get the raw data.  If you write "clear" to
it, you invalidate the current error record (which as I read the
spec, may potentially make another, pending record available to be
read).

The idea is that

  - An rc script run at boot-time can save all the logs in files,
    clearing each afterwards.

  - A user-level analysis tool can decode them as needed (perhaps
    also run from the same rc script above).

  - The user-level analyzer need not be open-source, if people are
    worried about IP in the OEM-specific sections.

  - A baseline open-source analyzer can provide at least the
    functionality available today in the kernel decoder.

So, attached are the kernel patch against 2.4.21-rc1 and a simple
user program ("salinfo") to decode the logs.

Note that the kernel patch removes the SAL clear_state_info calls
from mca.c, so the error records will be preserved until the user
program can read them.  This feels like the right thing to me (only
a user program can know that the logs have been saved somewhere
safe), but no doubt there are issues here.

The user-space analyzer is derived from the current kernel code in
mca.c and should produce identical output.  For now, I left all the
code in the kernel as well, but ultimately it could be removed.

Bjorn
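
For illustration only, the "save the log, then clear it" step that the rc script would perform might look roughly like the Python sketch below. It is not part of the attached patch or of salinfo. The function name save_and_clear and both path arguments are hypothetical; the only behaviour taken from the description above is that reading the /proc file yields the raw record and writing "clear" to it invalidates that record. The copy is flushed to disk before the clear is issued, on the assumption that the record should not be dropped until it has been stored safely.

```python
import os


def save_and_clear(record_path, dest_path):
    """Copy one raw SAL error record to dest_path, then clear it.

    record_path: hypothetical path to one of the /proc record files.
    dest_path:   hypothetical location for the saved copy.
    Returns the number of bytes saved, or 0 if no record was available.
    """
    # Reading the file returns the raw, binary record.
    with open(record_path, "rb") as src:
        data = src.read()

    if not data:
        # Nothing logged: save nothing and do not issue a clear.
        return 0

    # Store the copy and force it to disk *before* clearing, so the
    # record is only invalidated once it has been saved.
    with open(dest_path, "wb") as dst:
        dst.write(data)
        dst.flush()
        os.fsync(dst.fileno())

    # Writing "clear" invalidates the current record.
    with open(record_path, "wb") as ctl:
        ctl.write(b"clear")

    return len(data)
```

Since clearing a record may make another pending record available, a caller would presumably call this repeatedly (with a fresh dest_path each time) until it returns 0.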