From mboxrd@z Thu Jan 1 00:00:00 1970
From: dwmw2@infradead.org (David Woodhouse)
Date: Tue, 21 Mar 2017 09:42:23 +0000
Subject: [PATCH v33 00/14] add kdump support
In-Reply-To: <20170321073452.GA17298@linaro.org>
References: <20170315095656.24992-1-takahiro.akashi@linaro.org>
 <1489750991.17202.40.camel@infradead.org>
 <1489759373.17202.44.camel@infradead.org>
 <20170317153358.GI5940@leverpostej>
 <1489765628.17202.59.camel@infradead.org>
 <20170317162421.GK5940@leverpostej>
 <20170321073452.GA17298@linaro.org>
Message-ID: <1490089343.5036.92.camel@infradead.org>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Tue, 2017-03-21 at 16:34 +0900, AKASHI Takahiro wrote:
> Yes, it is intentional. I removed 'offline' code in my v14 (2016/3/4).
> As you assumed, I'd expect 'online' status of all CPUs to be kept
> unchanged in the core dump.

I wonder if it would be better to take a *copy* of it and put it back
after we're done taking the CPUs down?

As things stand, we now have *three* different methods of taking down
all the CPUs... and *none* of them allow a platform to override it with
an NMI-based or STONITH-based method, which seems like something of an
oversight.

> If you can agree, I would like to modify this disputed warning code to:
>
> +	BUG_ON(!in_kexec_crash && (stuck_cpus || (num_online_cpus() > 1)));
> +	WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()),
> +		"Some CPUs may be stale, kdump will be unreliable.\n");

That works; thanks.

FWIW I'm currently blaming my platform's firmware for my sporadic
crash-on-CPU#1 failures. If your testing includes crashes on non-boot
CPUs (perhaps using the sysrq hack I posted) and it reliably passes for
you, then let's ignore that for now.