kexec.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: "Wei, Jiangang" <weijg.fnst@cn.fujitsu.com>
To: "xlpang@redhat.com" <xlpang@redhat.com>
Cc: "fenghua.yu@intel.com" <fenghua.yu@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"ebiederm@xmission.com" <ebiederm@xmission.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>
Subject: Re: [PATCH v2] kexec: Fix kdump failure with notsc
Date: Tue, 12 Jul 2016 09:09:38 +0000	[thread overview]
Message-ID: <1468314444.15074.65.camel@localhost> (raw)
In-Reply-To: <578493AD.3000403@redhat.com>

On Tue, 2016-07-12 at 14:52 +0800, Xunlei Pang wrote:
> On 2016/07/07 at 18:17, Wei Jiangang wrote:
> > If we specify the 'notsc' boot parameter for the dump-capture kernel,
> > and then trigger a crash(panic) by using "ALT-SysRq-c" or "echo c >
> > /proc/sysrq-trigger",
> > the dump-capture kernel will hang in calibrate_delay_converge():
> >
> >     /* wait for "start of" clock tick */
> >     ticks = jiffies;
> >     while (ticks == jiffies)
> >         ; /* nothing */
> >
> > serial log of the hang is as follows:
> >
> >     tsc: Fast TSC calibration using PIT
> >     tsc: Detected 2099.947 MHz processor
> >     Calibrating delay loop...
> >
> > The reason is that the dump-capture kernel hangs in while loops and
> > waits for jiffies to be updated, but no timer interrupts is passed
> > to BSP by APIC.
> >
> > In fact, the local APIC was disabled in reboot and crash path by
> > lapic_shutdown(). We need to put APIC in legacy mode in kexec jump path
> > (put the system into PIT during the crash kernel),
> > so that the dump-capture kernel can get timer interrupts.
> >
> > BTW,
> > I found the buggy commit 522e66464467 ("x86/apic: Disable I/O APIC
> > before shutdown of the local APIC") via bisection.
> >
> > Originally, I want to revert it.
> > But Ingo Molnar comments that "By reverting the change can paper over
> > the bug, but re-introduce the bug that can result in certain CPUs hanging
> > if IO-APIC sends an APIC message if the lapic is disabled prematurely"
> > And I think it's pertinent.
> >
> > Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
> > ---
> >  arch/x86/include/asm/apic.h        | 5 +++++
> >  arch/x86/kernel/apic/apic.c        | 9 +++++++++
> >  arch/x86/kernel/machine_kexec_32.c | 5 ++---
> >  arch/x86/kernel/machine_kexec_64.c | 6 +++---
> >  4 files changed, 19 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> > index bc27611fa58f..5d7e635e580a 100644
> > --- a/arch/x86/include/asm/apic.h
> > +++ b/arch/x86/include/asm/apic.h
> > @@ -128,6 +128,7 @@ extern void clear_local_APIC(void);
> >  extern void disconnect_bsp_APIC(int virt_wire_setup);
> >  extern void disable_local_APIC(void);
> >  extern void lapic_shutdown(void);
> > +extern int lapic_disabled(void);
> >  extern void sync_Arb_IDs(void);
> >  extern void init_bsp_APIC(void);
> >  extern void setup_local_APIC(void);
> > @@ -165,6 +166,10 @@ extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 msg_type, u8 mask);
> >  
> >  #else /* !CONFIG_X86_LOCAL_APIC */
> >  static inline void lapic_shutdown(void) { }
> > +static inline int lapic_disabled(void)
> > +{
> > +	return 0;
> > +}
> >  #define local_apic_timer_c2_ok		1
> >  static inline void init_apic_mappings(void) { }
> >  static inline void disable_local_APIC(void) { }
> > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> > index 60078a67d7e3..d1df250994bb 100644
> > --- a/arch/x86/kernel/apic/apic.c
> > +++ b/arch/x86/kernel/apic/apic.c
> > @@ -133,6 +133,9 @@ static inline void imcr_apic_to_pic(void)
> >  }
> >  #endif
> >  
> > +/* Local APIC is disabled by the kernel for crash or reboot path */
> > +static int disabled_local_apic;
> > +
> >  /*
> >   * Knob to control our willingness to enable the local APIC.
> >   *
> > @@ -1097,10 +1100,16 @@ void lapic_shutdown(void)
> >  #endif
> >  		disable_local_APIC();
> >  
> > +	disabled_local_apic = 1;
> >  
> >  	local_irq_restore(flags);
> >  }
> >  
> > +int lapic_disabled(void)
> > +{
> > +	return disabled_local_apic;
> > +}
> > +
> >  /**
> >   * sync_Arb_IDs - synchronize APIC bus arbitration IDs
> >   */
> > diff --git a/arch/x86/kernel/machine_kexec_32.c b/arch/x86/kernel/machine_kexec_32.c
> > index 469b23d6acc2..c934a7868e6b 100644
> > --- a/arch/x86/kernel/machine_kexec_32.c
> > +++ b/arch/x86/kernel/machine_kexec_32.c
> > @@ -202,14 +202,13 @@ void machine_kexec(struct kimage *image)
> >  	local_irq_disable();
> >  	hw_breakpoint_disable();
> >  
> > -	if (image->preserve_context) {
> > +	if (image->preserve_context || lapic_disabled()) {
> >  #ifdef CONFIG_X86_IO_APIC
> >  		/*
> >  		 * We need to put APICs in legacy mode so that we can
> >  		 * get timer interrupts in second kernel. kexec/kdump
> >  		 * paths already have calls to disable_IO_APIC() in
> > -		 * one form or other. kexec jump path also need
> > -		 * one.
> > +		 * one form or other. kexec jump path also need one.
> >  		 */
> >  		disable_IO_APIC();
> 
> Hi Wei,
> 
> As the comment says, kexec/kdump paths already have disable_IO_APIC(), why again here?
Frankly, I am not very clear about it as well...

Originally the codes and comment added by Huang Ying
<ying.huang@intel.com>
There may be some special reason...

The comment also said that it's used to put APICs in legacy mode so that
we can get timer interrupts in second kernel. and we also need it in
kexec jump path.

In fact, the jiffies calibration needs timer interrupts in second
kernel. and the lapic and timer are not ready when second kernel waits
them to update the jiffies value.

so I just want to reuse them and enable legacy mode (PIT timer) for
notsc case.
kernel.

Thanks,
wei
> 
> Regards,
> Xunlei
> 
> >  #endif
> > diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
> > index 5a294e48b185..d3598cdd6437 100644
> > --- a/arch/x86/kernel/machine_kexec_64.c
> > +++ b/arch/x86/kernel/machine_kexec_64.c
> > @@ -23,6 +23,7 @@
> >  #include <asm/pgtable.h>
> >  #include <asm/tlbflush.h>
> >  #include <asm/mmu_context.h>
> > +#include <asm/apic.h>
> >  #include <asm/io_apic.h>
> >  #include <asm/debugreg.h>
> >  #include <asm/kexec-bzimage64.h>
> > @@ -269,14 +270,13 @@ void machine_kexec(struct kimage *image)
> >  	local_irq_disable();
> >  	hw_breakpoint_disable();
> >  
> > -	if (image->preserve_context) {
> > +	if (image->preserve_context || lapic_disabled()) {
> >  #ifdef CONFIG_X86_IO_APIC
> >  		/*
> >  		 * We need to put APICs in legacy mode so that we can
> >  		 * get timer interrupts in second kernel. kexec/kdump
> >  		 * paths already have calls to disable_IO_APIC() in
> > -		 * one form or other. kexec jump path also need
> > -		 * one.
> > +		 * one form or other. kexec jump path also need one.
> >  		 */
> >  		disable_IO_APIC();
> >  #endif
> 
> 
> 



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  parent reply	other threads:[~2016-07-12  9:10 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-07 10:17 [PATCH v2] kexec: Fix kdump failure with notsc Wei Jiangang
2016-07-07 17:55 ` Eric W. Biederman
2016-07-08  4:48   ` Wei, Jiangang
2016-07-08  7:38   ` Ingo Molnar
2016-07-11 10:30     ` Wei, Jiangang
2016-07-13  7:46       ` Wei, Jiangang
2016-07-13  9:05         ` Baoquan He
2016-07-08  8:21 ` Nikolay Borisov
2016-07-12  6:52 ` Xunlei Pang
2016-07-12  8:21   ` Baoquan He
2016-07-12  9:10     ` Wei, Jiangang
2016-07-12  9:09   ` Wei, Jiangang [this message]
2016-07-12  9:29     ` Baoquan He

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1468314444.15074.65.camel@localhost \
    --to=weijg.fnst@cn.fujitsu.com \
    --cc=ebiederm@xmission.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xlpang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).