From: "Wei, Jiangang"
To: "ebiederm@xmission.com"
CC: "kexec@lists.infradead.org", "linux-kernel@vger.kernel.org", "Cao, Jin", "tglx@linutronix.de", "bhe@redhat.com", "xpang@redhat.com", "kernel@kyup.com", "x86@kernel.org", "hpa@zytor.com", "mingo@redhat.com", "Izumi, Taku"
Subject: Re: [PATCH v2 0/3] Fix dump-capture kernel hangs with notsc
Date: Tue, 2 Aug 2016 07:45:08 +0000
Message-ID: <1470123720.2274.53.camel@localhost>
References: <1469501995-2991-1-git-send-email-weijg.fnst@cn.fujitsu.com> <1470033660.7811.99.camel@localhost> <87twf4xvpd.fsf@x220.int.ebiederm.org>
In-Reply-To: <87twf4xvpd.fsf@x220.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Eric,

First of all, thanks for your reply.

On Mon, 2016-08-01 at 12:09 -0500, Eric W. Biederman wrote:
> "Wei, Jiangang" writes:
>
> > Ping ...
> > May I ask for some community attention to this series?
> > I purpose is fixing the dump-capture kernel hangs in
> > calibrate_delay_converge() while specifying notsc.
>
> Did you not see my reply to patch 3/3?

Yes, I read your email and replied to it
(https://lkml.org/lkml/2016/7/26/112).
I raised several questions in that reply, but got no feedback.

>
> The short version of my feedback is that you seem to be fixing a case
> that should not exist. So the good fix is to skip completely past
> virtual wire mode and into full apic mode as soon as possible.

I am afraid we disagree on a few points.

1) The case where the dump-capture kernel boots with the APIC disabled
is very real, and the bug reproduces 100% of the time.
I want to emphasize that nothing guarantees the interrupt mode of the
APIC or the state of the local APIC, especially for the dump-capture
kernel, which does not go through the BIOS phase.
That is why I add more checks in init_bsp_APIC() instead of relying
only on the MP tables, which were generated before the first kernel
booted.

To make the point: the BIOS must disable interrupts to all processors
and set the APICs to the system initial state before giving control to
the operating system.
That means the APICs are not reset to their initial state when there is
no BIOS phase.

2) Your proposal (switching into full APIC mode as soon as possible)
seems to contradict the Intel spec:
"An MP operating system is booted under either one of the two
PC/AT-compatible modes. Later the operating system switches to
Symmetric I/O Mode **as it enters multiprocessor mode**."
In other words, the BSP should be in PIC mode or virtual wire mode
during the startup stage.

3) The APIC initialization code may need an overhaul, but that is
outside the scope of this patch. I am focusing on fixing the kdump
failure with notsc.
The APIC initialization code has not been modified for a long time and
can be regarded as stable. Overhauling it increases the chance of
introducing a bug.
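
As a rough illustration of point 1), and not the code from this series:
the state of the local APIC can be read from the hardware itself rather
than inferred from the MP tables. The minimal user-space sketch below
only decodes two register values passed in as plain integers. The enum,
the macro names and the function lapic_state_from_regs() are
hypothetical names chosen for this example; the bit positions
(IA32_APIC_BASE bit 11, spurious-interrupt vector register bit 8) are
the architectural ones.

```c
#include <stdint.h>

/* Bit positions are architectural; the macro names are illustrative. */
#define APICBASE_GLOBAL_ENABLE (1u << 11) /* IA32_APIC_BASE MSR (0x1b), bit 11 */
#define SPIV_SOFTWARE_ENABLE   (1u << 8)  /* spurious-interrupt vector reg, bit 8 */

/* Hypothetical classification of the local APIC state. */
enum lapic_state {
	LAPIC_HW_DISABLED, /* globally disabled through the MSR */
	LAPIC_SW_DISABLED, /* MSR-enabled, but software-disabled in the SVR */
	LAPIC_ENABLED      /* enabled at both levels */
};

/*
 * Decode the local APIC state from raw register values.
 * The caller is assumed to have read the values already; this sketch
 * performs no MSR or MMIO access itself.
 */
enum lapic_state lapic_state_from_regs(uint64_t apicbase_msr, uint32_t spiv)
{
	if (!(apicbase_msr & APICBASE_GLOBAL_ENABLE))
		return LAPIC_HW_DISABLED;
	if (!(spiv & SPIV_SOFTWARE_ENABLE))
		return LAPIC_SW_DISABLED;
	return LAPIC_ENABLED;
}
```

With the typical reset values on the BSP (IA32_APIC_BASE = 0xfee00900,
SVR = 0xff), lapic_state_from_regs() reports LAPIC_SW_DISABLED. A
dump-capture kernel cannot assume those reset values, which is the
reason for checking the registers at all.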

If there is anything wrong with my understanding, please point it out.

Thanks,
wei

>
> For a subset of cases the code already supports that.
>
> Eric
>
>