From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Américo Wang
Cc: holzheu@linux.vnet.ibm.com, Vivek Goyal, akpm@linux-foundation.org,
    schwidefsky@de.ibm.com, heiko.carstens@de.ibm.com,
    kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: kdump: crash_kexec()-smp_send_stop() race in panic
Date: Mon, 24 Oct 2011 10:07:19 -0700
References: <1319468137.3615.16.camel@br98xy6r>
In-Reply-To: ("Américo Wang"'s message of "Mon, 24 Oct 2011 23:23:33 +0800")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

Américo Wang
writes:

> On Mon, Oct 24, 2011 at 11:14 PM, Eric W. Biederman wrote:
>> Michael Holzheu writes:
>>
>>> Hello Vivek,
>>>
>>> In our tests we ran into the following scenario:
>>>
>>> Two CPUs have called panic at the same time. The first CPU called
>>> crash_kexec() and the second CPU called smp_send_stop() in panic()
>>> before crash_kexec() finished on the first CPU. So the second CPU
>>> stopped the first CPU and therefore kdump failed.
>>>
>>> 1st CPU:
>>> panic()->crash_kexec()->mutex_trylock(&kexec_mutex)-> do kdump
>>>
>>> 2nd CPU:
>>> panic()->crash_kexec()->kexec_mutex already held by 1st CPU
>>>        ->smp_send_stop()-> stop CPU 1 (stop kdump)
>>>
>>> How should we fix this problem? One possibility could be to do
>>> smp_send_stop() before we call crash_kexec().
>>>
>>> What do you think?
>>
>> smp_send_stop is insufficiently reliable to be used before crash_kexec.
>>
>> My first reaction would be to test oops_in_progress and wait until
>> oops_in_progress == 1 before calling smp_send_stop.
>>
>
> +1
>
> One of my colleague mentioned the same problem with me inside
> RH, given the fact that the race condition window is small, it would
> not be easy to reproduce this scenario.

As for reproducing it, I have a hunch you could hack up something
horrible with smp_call_function and kprobes.

On a little more reflection, we can't wait until oops_in_progress goes
to 1 before calling smp_send_stop, because if crash_kexec is not
involved we will never call smp_send_stop.

So my second thought is to introduce another atomic variable,
panic_in_progress, visible only in panic.  The cpu that increments
panic_in_progress first can call smp_send_stop.  The rest of
the cpus can just go into a busy wait.  That should stop nasty
fights about who is going to come out of smp_send_stop first.

Eric
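
The panic_in_progress idea above can be sketched roughly as follows.
This is a userspace illustration with C11 atomics, not kernel code and
not a tested patch: panic_claim(), panic_path() and fake_smp_send_stop()
are hypothetical names made up for the example, and the losing path
returns where the real thing would busy wait forever.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Counter named after the variable proposed in the mail (an assumption,
 * not an existing symbol); in the kernel it would be private to panic(). */
static atomic_int panic_in_progress = 0;

/* Hypothetical stand-in for smp_send_stop(); it only counts its calls. */
static int stop_calls;
static void fake_smp_send_stop(void)
{
    stop_calls++;
}

/*
 * Returns true for exactly one caller. atomic_fetch_add() returns the
 * value held before the increment, so only the first caller sees 0.
 */
static bool panic_claim(void)
{
    return atomic_fetch_add(&panic_in_progress, 1) == 0;
}

/*
 * Simplified panic path: the winner stops the other cpus, every later
 * caller backs off. Returns true if this caller was the winner.
 */
static bool panic_path(void)
{
    if (!panic_claim()) {
        /* A real implementation would spin here and never return. */
        return false;
    }
    fake_smp_send_stop();
    return true;
}
```

With this shape only one cpu ever reaches the stop call, so a second
panicking cpu cannot stop the one that is busy with the crash dump.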