From: Li Zhong <zhong@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
PowerPC email list <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore()
Date: Mon, 24 Sep 2012 14:42:42 +0800 [thread overview]
Message-ID: <1348468962.2475.44.camel@ThinkPad-T420> (raw)
In-Reply-To: <1348464657.1132.86.camel@pasglop>
On Mon, 2012-09-24 at 15:30 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2012-09-24 at 11:56 +0800, Li Zhong wrote:
> > This patch tries to fix a WARNING in irq.c(below), when doing cpu offline/online.
> >
> > The reason is that if the preferred offline state is CPU_STATE_INACTIVE, when
> > cpu offline, pseries_mach_cpu_die() calls extended_cede_processor() to cede
> > the processor. After the hv call returns, the MSR_EE is enabled, while
> > the irq_happened in paca should already be set as PACA_IRQ_HARD_DIS.
> >
> > Then when the cpu is put online again, the warning is reported when
> > start_secondary() tries to enable irq. [ Sometimes, we don't see this warning,
> > that's because when we come to local_irq_enable(), there might already have been
> > some interrupts occurred, e.g. PACA_IRQ_DEC is already set. ]
> >
> > The patch tries to clear MSR_EE by calling __hard_irq_disable() after cede
> > returns, and before calling start_secondary_resume().
>
Hi Ben,
> Is that enough ? Do we know for sure we won't get a stray IPI or
> interrupt which then will reach us with an inconsistent HW/SW enable
> state ?
I thought when coming to here, there should be no IPI or interrupt for
this cpu ...
> We might need to "sanitize" the enable state in the PACA before we
> actually enter NAP or in the return from NAP code, like we do for normal
> idle code...
Do you mean something like this in extended_cede_processor() to make
sure HW/SW enable state consistent?
+ prep_irq_for_idle();
rc = cede_processor();
+#ifdef CONFIG_TRACE_IRQFLAGS
+ /* Ensure that H_CEDE returns with IRQs on */
+ if (WARN_ON(!(mfmsr() & MSR_EE)))
+ __hard_irq_enable();
+#endif
So, normally, we will have irqs SW/HW enabled consistently. But if the
above happens (some missing IPI/interrupt ), I'm not sure whether we can
leave it to be handled after cpu online again?
Please correct me if I misunderstood something.
Thanks, Zhong
> Cheers,
> Ben.
>
> > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240
> > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth
> > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001
> > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1)
> > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000
> > [ 56.618913] SOFTE: 1
> > [ 56.618916] CFAR: c00000000067a5dc
> > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5
> > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001
> > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000
> > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000
> > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c
> > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000
> > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001
> > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0
> > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c
> > [ 56.619025] Call Trace:
> > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable)
> > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c
> > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14
> > [ 56.619052] Instruction dump:
> > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048
> > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000
> > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]---
> >
> > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> > ---
> > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++
> > 1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> > index 64c97d8..8de539a 100644
> > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void)
> > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) {
> > unregister_slb_shadow(hwcpu);
> >
> > + __hard_irq_disable();
> > +
> > /*
> > * Call to start_secondary_resume() will not return.
> > * Kernel stack will be reset and start_secondary()
>
>
next prev parent reply other threads:[~2012-09-24 6:42 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-24 3:56 [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() Li Zhong
2012-09-24 5:30 ` Benjamin Herrenschmidt
2012-09-24 6:42 ` Li Zhong [this message]
2012-09-26 10:10 ` Li Zhong
2012-10-17 23:54 ` Benjamin Herrenschmidt
2012-10-18 7:30 ` Li Zhong
2012-10-22 7:23 ` Li Zhong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1348468962.2475.44.camel@ThinkPad-T420 \
--to=zhong@linux.vnet.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).