Hi,

This is the latest OS_MCA handler, which tries to recover from
multibit-ECC/poisoned memory-read errors in user-land.

I have already posted some prototypes of the OS_MCA handler to IA64ML,
requesting comments. The most urgent problem was that I couldn't test
my patch enough because of the lack of tools such as error (MCA)
injection. However, with Tony's great cooperation, today's patch has
passed all of my tests on Intel's Tiger4.

Of course, I confirmed that the handler kills a user process that
encounters an MCA caused by a memory read, and that the system is kept
from going down after the MCA in that situation. Also, erroneous/poisoned
memory is isolated by means of the PG_reserved flag.
This handler actually recovers your system from a memory-read MCA.

This time, I propose a function pointer for OS_MCA, because it:
 - allows OS_MCA to be a module:
    - rmmod if you want
 - allows handler replacement at runtime:
    - easy to debug/test/update?
 - allows platform-specific handling:
    - increases the reliability of the generic kernel

I'd like to request comments on this function pointer.
If no one wants such a complicated trick, I will make a small fix to
my patch so that it works all the time as the default handler.

Here are the separate patches:
 1 - enable OS_MCA for errors other than TLB errors
 2 - OS_MCA handler for memory read recovery
     (well tested on Intel Tiger4.)

I'd also appreciate it if anyone with a good test environment could
apply my patch and report how it works.
(Reports on non-Tiger/non-Intel platforms are especially welcome.)

Thanks,
H.Seto

Signed-off-by: Hidetoshi Seto
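
For readers who want the function-pointer idea in concrete form, here is a
minimal user-space sketch in C. It is not taken from the patches: every name
in it (os_mca_register, os_mca_unregister, os_mca_dispatch, the result codes,
and the handler signature) is hypothetical and chosen only for illustration.
It shows a default handler that a module could replace at load time and
restore at rmmod time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; not the values used by the real patch. */
enum mca_result { MCA_NOT_RECOVERED = 0, MCA_RECOVERED = 1 };

/* Hypothetical handler signature: receives an opaque error record. */
typedef enum mca_result (*os_mca_handler_t)(const void *error_record);

/* Built-in fallback: with no recovery handler loaded, nothing is recovered. */
static enum mca_result default_os_mca(const void *error_record)
{
    (void)error_record;
    return MCA_NOT_RECOVERED;
}

/* The replaceable hook. A real kernel would also have to guard this swap
 * against an MCA arriving mid-update; this sketch ignores that. */
static os_mca_handler_t os_mca_handler = default_os_mca;

/* Install a handler (e.g. at module load). Returns 0, or -1 on NULL. */
int os_mca_register(os_mca_handler_t handler)
{
    if (handler == NULL)
        return -1;
    os_mca_handler = handler;
    return 0;
}

/* Restore the default handler (e.g. at rmmod). */
void os_mca_unregister(void)
{
    os_mca_handler = default_os_mca;
}

/* Called from the (imaginary) MCA path: forwards to the current handler. */
enum mca_result os_mca_dispatch(const void *error_record)
{
    return os_mca_handler(error_record);
}

/* Stand-in for a memory-read recovery handler: it only checks that a
 * record was supplied, and does no real recovery work. */
enum mca_result memory_read_recovery(const void *error_record)
{
    return error_record != NULL ? MCA_RECOVERED : MCA_NOT_RECOVERED;
}

/* Usage: default handler, then replacement, then back to the default. */
int os_mca_selftest(void)
{
    int record = 0;

    assert(os_mca_dispatch(&record) == MCA_NOT_RECOVERED);
    assert(os_mca_register(memory_read_recovery) == 0);
    assert(os_mca_dispatch(&record) == MCA_RECOVERED);
    os_mca_unregister();
    assert(os_mca_dispatch(&record) == MCA_NOT_RECOVERED);
    return 0;
}
```

The point of the indirection is that the caller only ever goes through
os_mca_dispatch(), so a platform-specific or experimental handler can be
swapped in without touching the generic path.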