* [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() @ 2012-09-24 3:56 Li Zhong 2012-09-24 5:30 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 7+ messages in thread From: Li Zhong @ 2012-09-24 3:56 UTC (permalink / raw) To: PowerPC email list; +Cc: Paul E. McKenney, Paul Mackerras This patch tries to fix a WARNING in irq.c(below), when doing cpu offline/online. The reason is that if the preferred offline state is CPU_STATE_INACTIVE, when cpu offline, pseries_mach_cpu_die() calls extended_cede_processor() to cede the processor. After the hv call returns, the MSR_EE is enabled, while the irq_happened in paca should already be set as PACA_IRQ_HARD_DIS. Then when the cpu is put online again, the warning is reported when start_secondary() tries to enable irq. [ Sometimes, we don't see this warning, that's because when we come to local_irq_enable(), there might already have been some interrupts occurred, e.g. PACA_IRQ_DEC is already set. ] The patch tries to clear MSR_EE by calling __hard_irq_disable() after cede returns, and before calling start_secondary_resume(). [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 [ 56.618913] SOFTE: 1 [ 56.618916] CFAR: c00000000067a5dc [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c [ 56.619025] Call Trace: [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 [ 56.619052] Instruction dump: [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> --- arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 64c97d8..8de539a 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { unregister_slb_shadow(hwcpu); + __hard_irq_disable(); + /* * Call to start_secondary_resume() will not return. * Kernel stack will be reset and start_secondary() -- 1.7.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() 2012-09-24 3:56 [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() Li Zhong @ 2012-09-24 5:30 ` Benjamin Herrenschmidt 2012-09-24 6:42 ` Li Zhong 2012-09-26 10:10 ` Li Zhong 0 siblings, 2 replies; 7+ messages in thread From: Benjamin Herrenschmidt @ 2012-09-24 5:30 UTC (permalink / raw) To: Li Zhong; +Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list On Mon, 2012-09-24 at 11:56 +0800, Li Zhong wrote: > This patch tries to fix a WARNING in irq.c(below), when doing cpu offline/online. > > The reason is that if the preferred offline state is CPU_STATE_INACTIVE, when > cpu offline, pseries_mach_cpu_die() calls extended_cede_processor() to cede > the processor. After the hv call returns, the MSR_EE is enabled, while > the irq_happened in paca should already be set as PACA_IRQ_HARD_DIS. > > Then when the cpu is put online again, the warning is reported when > start_secondary() tries to enable irq. [ Sometimes, we don't see this warning, > that's because when we come to local_irq_enable(), there might already have been > some interrupts occurred, e.g. PACA_IRQ_DEC is already set. ] > > The patch tries to clear MSR_EE by calling __hard_irq_disable() after cede > returns, and before calling start_secondary_resume(). Is that enough ? Do we know for sure we won't get a stray IPI or interrupt which then will reach us with an inconsistent HW/SW enable state ? We might need to "sanitize" the enable state in the PACA before we actually enter NAP or in the return from NAP code, like we do for normal idle code... Cheers, Ben. > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 > [ 56.618913] SOFTE: 1 > [ 56.618916] CFAR: c00000000067a5dc > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c > [ 56.619025] Call Trace: > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 > [ 56.619052] Instruction dump: > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- > > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> > --- > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > index 64c97d8..8de539a 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { > unregister_slb_shadow(hwcpu); > > + __hard_irq_disable(); > + > /* > * Call to start_secondary_resume() will not return. > * Kernel stack will be reset and start_secondary() ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() 2012-09-24 5:30 ` Benjamin Herrenschmidt @ 2012-09-24 6:42 ` Li Zhong 2012-09-26 10:10 ` Li Zhong 1 sibling, 0 replies; 7+ messages in thread From: Li Zhong @ 2012-09-24 6:42 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list On Mon, 2012-09-24 at 15:30 +1000, Benjamin Herrenschmidt wrote: > On Mon, 2012-09-24 at 11:56 +0800, Li Zhong wrote: > > This patch tries to fix a WARNING in irq.c(below), when doing cpu offline/online. > > > > The reason is that if the preferred offline state is CPU_STATE_INACTIVE, when > > cpu offline, pseries_mach_cpu_die() calls extended_cede_processor() to cede > > the processor. After the hv call returns, the MSR_EE is enabled, while > > the irq_happened in paca should already be set as PACA_IRQ_HARD_DIS. > > > > Then when the cpu is put online again, the warning is reported when > > start_secondary() tries to enable irq. [ Sometimes, we don't see this warning, > > that's because when we come to local_irq_enable(), there might already have been > > some interrupts occurred, e.g. PACA_IRQ_DEC is already set. ] > > > > The patch tries to clear MSR_EE by calling __hard_irq_disable() after cede > > returns, and before calling start_secondary_resume(). > Hi Ben, > Is that enough ? Do we know for sure we won't get a stray IPI or > interrupt which then will reach us with an inconsistent HW/SW enable > state ? I thought when coming to here, there should be no IPI or interrupt for this cpu ... > We might need to "sanitize" the enable state in the PACA before we > actually enter NAP or in the return from NAP code, like we do for normal > idle code... Do you mean something like this in extended_cede_processor() to make sure HW/SW enable state consistent? + prep_irq_for_idle(); rc = cede_processor(); +#ifdef CONFIG_TRACE_IRQFLAGS + /* Ensure that H_CEDE returns with IRQs on */ + if (WARN_ON(!(mfmsr() & MSR_EE))) + __hard_irq_enable(); +#endif So, normally, we will have irqs SW/HW enabled consistently. But if the above happens (some missing IPI/interrupt ), I'm not sure whether we can leave it to be handled after cpu online again? Please correct me if I misunderstood something. Thanks, Zhong > Cheers, > Ben. > > > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 > > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth > > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 > > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) > > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 > > [ 56.618913] SOFTE: 1 > > [ 56.618916] CFAR: c00000000067a5dc > > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 > > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 > > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 > > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 > > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 > > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 > > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 > > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c > > [ 56.619025] Call Trace: > > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) > > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c > > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 > > [ 56.619052] Instruction dump: > > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 > > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 > > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- > > > > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> > > --- > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > index 64c97d8..8de539a 100644 > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) > > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { > > unregister_slb_shadow(hwcpu); > > > > + __hard_irq_disable(); > > + > > /* > > * Call to start_secondary_resume() will not return. > > * Kernel stack will be reset and start_secondary() > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() 2012-09-24 5:30 ` Benjamin Herrenschmidt 2012-09-24 6:42 ` Li Zhong @ 2012-09-26 10:10 ` Li Zhong 2012-10-17 23:54 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 7+ messages in thread From: Li Zhong @ 2012-09-26 10:10 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list On Mon, 2012-09-24 at 15:30 +1000, Benjamin Herrenschmidt wrote: > On Mon, 2012-09-24 at 11:56 +0800, Li Zhong wrote: > > This patch tries to fix a WARNING in irq.c(below), when doing cpu offline/online. > > > > The reason is that if the preferred offline state is CPU_STATE_INACTIVE, when > > cpu offline, pseries_mach_cpu_die() calls extended_cede_processor() to cede > > the processor. After the hv call returns, the MSR_EE is enabled, while > > the irq_happened in paca should already be set as PACA_IRQ_HARD_DIS. > > > > Then when the cpu is put online again, the warning is reported when > > start_secondary() tries to enable irq. [ Sometimes, we don't see this warning, > > that's because when we come to local_irq_enable(), there might already have been > > some interrupts occurred, e.g. PACA_IRQ_DEC is already set. ] > > > > The patch tries to clear MSR_EE by calling __hard_irq_disable() after cede > > returns, and before calling start_secondary_resume(). > > Is that enough ? Do we know for sure we won't get a stray IPI or > interrupt which then will reach us with an inconsistent HW/SW enable > state ? > > We might need to "sanitize" the enable state in the PACA before we > actually enter NAP or in the return from NAP code, like we do for normal > idle code... Hi Ben, After some further reading of the code, I updated the code as following, but I'm not very sure, and guess it most probably has some issues ... Could you please help to review and give your comments? In extended_cede_processor(), if there is still lazy irq pending, I used local_irq_enable() to make sure the irq replayed and flags cleared, but I'm not sure whether it is a proper way. In pseries_mach_cpu_die(), I added local_irq_disable() after cede, and prepare for the start_secondary_resume(), but I'm not sure whether we also need a hard_irq_disable() here. I'm still a little confused by the meaning of PACA_IRQ_HARD_DIS in irq_happened. From the checking at the warning point, it seems only irq_happened equaling 0x1(PACA_IRQ_HARD_DIS) means hard irqs are disabled. Is it possible to set this bit at anyplace the hard irqs are disabled, so then we could check whether this bit is set to know whether hard irqs are disabled? Then it seems that in MASKED_INTERRUPT, we need set this bit where MSR_EE is cleared for something other than decrementer. Maybe I missed too much things? Thanks, Zhong =================== diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 64c97d8..b5f7597 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -130,6 +130,8 @@ static void pseries_mach_cpu_die(void) extended_cede_processor(cede_latency_hint); } + local_irq_disable(); + if (!get_lppaca()->shared_proc) get_lppaca()->donate_dedicated_cpu = 0; get_lppaca()->idle = 0; diff --git a/arch/powerpc/platforms/pseries/plpar_wrappers.h b/arch/powerpc/platforms/pseries/plpar_wrappers.h index 13e8cc4..07560d8 100644 --- a/arch/powerpc/platforms/pseries/plpar_wrappers.h +++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h @@ -2,6 +2,7 @@ #define _PSERIES_PLPAR_WRAPPERS_H #include <linux/string.h> +#include <linux/irqflags.h> #include <asm/hvcall.h> #include <asm/paca.h> @@ -41,7 +42,19 @@ static inline long extended_cede_processor(unsigned long latency_hint) u8 old_latency_hint = get_cede_latency_hint(); set_cede_latency_hint(latency_hint); + + while (!prep_irq_for_idle()) { + local_irq_enable(); + local_irq_disable(); + } + rc = cede_processor(); +#ifdef CONFIG_TRACE_IRQFLAGS + /* Ensure that H_CEDE returns with IRQs on */ + if (WARN_ON(!(mfmsr() & MSR_EE))) + __hard_irq_enable(); +#endif + set_cede_latency_hint(old_latency_hint); return rc; =================== > > Cheers, > Ben. > > > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 > > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth > > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 > > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) > > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 > > [ 56.618913] SOFTE: 1 > > [ 56.618916] CFAR: c00000000067a5dc > > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 > > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 > > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 > > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 > > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 > > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 > > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 > > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c > > [ 56.619025] Call Trace: > > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) > > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c > > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 > > [ 56.619052] Instruction dump: > > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 > > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 > > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- > > > > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> > > --- > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > index 64c97d8..8de539a 100644 > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) > > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { > > unregister_slb_shadow(hwcpu); > > > > + __hard_irq_disable(); > > + > > /* > > * Call to start_secondary_resume() will not return. > > * Kernel stack will be reset and start_secondary() > > ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() 2012-09-26 10:10 ` Li Zhong @ 2012-10-17 23:54 ` Benjamin Herrenschmidt 2012-10-18 7:30 ` Li Zhong 2012-10-22 7:23 ` Li Zhong 0 siblings, 2 replies; 7+ messages in thread From: Benjamin Herrenschmidt @ 2012-10-17 23:54 UTC (permalink / raw) To: Li Zhong; +Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list On Wed, 2012-09-26 at 18:10 +0800, Li Zhong wrote: ../... Sorry got distracted, got back on this patch today: > > We might need to "sanitize" the enable state in the PACA before we > > actually enter NAP or in the return from NAP code, like we do for normal > > idle code... > > Hi Ben, > > After some further reading of the code, I updated the code as following, > but I'm not very sure, and guess it most probably has some issues ... > Could you please help to review and give your comments? I think it's still not right, see below > In extended_cede_processor(), if there is still lazy irq pending, I used > local_irq_enable() to make sure the irq replayed and flags cleared, but > I'm not sure whether it is a proper way. Right but that will break things for idle. In idle, if you have re-enabled interrupts, you need to return and essentially abort the attempt at going to nap mode, because the interrupt might have set need resched. That's why we normally just check if something's pending and return, letting the upper levels re-enable interrupts and do all the dirty work for us. Now, hotplug might differ here in what it needs, but in any case, extended_cede_processor doesn't seem to be the right place to handle it, at best that function should return if it thinks there's something that needs to be done and let the upper layers deal with it appropriately. > In pseries_mach_cpu_die(), I added local_irq_disable() after cede, and > prepare for the start_secondary_resume(), but I'm not sure whether we > also need a hard_irq_disable() here. You probably do if it's going to go back to the start secondary path. It shouldn't hurt in any case as long as start_secondary_resume() eventually does a local_irq_enable(). > I'm still a little confused by the meaning of PACA_IRQ_HARD_DIS in > irq_happened. From the checking at the warning point, it seems only > irq_happened equaling 0x1(PACA_IRQ_HARD_DIS) means hard irqs are > disabled. No. They are disabled if any bit in there corresponding to a "level" interrupt is set as well. Only the "edge" interrupts (and decrementer which we treat as edge and reset to a high value when it kicks) are ignored for the sake of HW irq state. The reason we have this IRQ_HARD_DIS bit is to indicate that a 'manual' hard disabling occurred (by opposition to one happening as a result of an external interrupt). We need that so we can avoid doing an mfspr() in local_irq_enable() and entirely rely on the content of irq_happened to know whether interrupts are hard enabled or hard disabled. We do that because mfspr is a fairly expensive instruction. But that means that we need to make sure we always have a consistent content in irq_happened. That's also why I've added all those sanity checks if you enable IRQ tracing. > Is it possible to set this bit at anyplace the hard irqs are disabled, > so then we could check whether this bit is set to know whether hard irqs > are disabled? Then it seems that in MASKED_INTERRUPT, we need set this > bit where MSR_EE is cleared for something other than decrementer. Maybe > I missed too much things? Either that bit or PACA_IRQ_EE. Both indicate that interrupts are hard disabled. There are some rare cases where do do change MSR:EE without touching those bits, only when we're going to restore it shortly afterward in the kernel asm exception entry/exit path for example. Cheers, Ben. > Thanks, Zhong > > =================== > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > index 64c97d8..b5f7597 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > @@ -130,6 +130,8 @@ static void pseries_mach_cpu_die(void) > extended_cede_processor(cede_latency_hint); > } > > + local_irq_disable(); > + > if (!get_lppaca()->shared_proc) > get_lppaca()->donate_dedicated_cpu = 0; > get_lppaca()->idle = 0; > diff --git a/arch/powerpc/platforms/pseries/plpar_wrappers.h b/arch/powerpc/platforms/pseries/plpar_wrappers.h > index 13e8cc4..07560d8 100644 > --- a/arch/powerpc/platforms/pseries/plpar_wrappers.h > +++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h > @@ -2,6 +2,7 @@ > #define _PSERIES_PLPAR_WRAPPERS_H > > #include <linux/string.h> > +#include <linux/irqflags.h> > > #include <asm/hvcall.h> > #include <asm/paca.h> > @@ -41,7 +42,19 @@ static inline long extended_cede_processor(unsigned long latency_hint) > u8 old_latency_hint = get_cede_latency_hint(); > > set_cede_latency_hint(latency_hint); > + > + while (!prep_irq_for_idle()) { > + local_irq_enable(); > + local_irq_disable(); > + } > + > rc = cede_processor(); > +#ifdef CONFIG_TRACE_IRQFLAGS > + /* Ensure that H_CEDE returns with IRQs on */ > + if (WARN_ON(!(mfmsr() & MSR_EE))) > + __hard_irq_enable(); > +#endif > + > set_cede_latency_hint(old_latency_hint); > > return rc; > =================== > > > > > > Cheers, > > Ben. > > > > > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 > > > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth > > > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 > > > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) > > > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 > > > [ 56.618913] SOFTE: 1 > > > [ 56.618916] CFAR: c00000000067a5dc > > > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 > > > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 > > > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 > > > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 > > > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c > > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 > > > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 > > > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 > > > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c > > > [ 56.619025] Call Trace: > > > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) > > > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c > > > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 > > > [ 56.619052] Instruction dump: > > > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 > > > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 > > > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- > > > > > > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> > > > --- > > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ > > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > index 64c97d8..8de539a 100644 > > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) > > > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { > > > unregister_slb_shadow(hwcpu); > > > > > > + __hard_irq_disable(); > > > + > > > /* > > > * Call to start_secondary_resume() will not return. > > > * Kernel stack will be reset and start_secondary() > > > > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() 2012-10-17 23:54 ` Benjamin Herrenschmidt @ 2012-10-18 7:30 ` Li Zhong 2012-10-22 7:23 ` Li Zhong 1 sibling, 0 replies; 7+ messages in thread From: Li Zhong @ 2012-10-18 7:30 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list On Thu, 2012-10-18 at 10:54 +1100, Benjamin Herrenschmidt wrote: > On Wed, 2012-09-26 at 18:10 +0800, Li Zhong wrote: > > ../... > > Sorry got distracted, got back on this patch today: > > > > We might need to "sanitize" the enable state in the PACA before we > > > actually enter NAP or in the return from NAP code, like we do for normal > > > idle code... > > > > Hi Ben, > > > > After some further reading of the code, I updated the code as following, > > but I'm not very sure, and guess it most probably has some issues ... > > Could you please help to review and give your comments? Hi Ben, > I think it's still not right, see below As expected :) > > In extended_cede_processor(), if there is still lazy irq pending, I used > > local_irq_enable() to make sure the irq replayed and flags cleared, but > > I'm not sure whether it is a proper way. > > Right but that will break things for idle. In idle, if you have > re-enabled interrupts, you need to return and essentially abort the > attempt at going to nap mode, because the interrupt might have set need > resched. That's why we normally just check if something's pending and > return, letting the upper levels re-enable interrupts and do all the > dirty work for us. It seems that extended_cede_processor() is only called in hotplug, and idle calls check_and_cede_processor(). > Now, hotplug might differ here in what it needs, but in any case, > extended_cede_processor doesn't seem to be the right place to handle it, > at best that function should return if it thinks there's something that > needs to be done and let the upper layers deal with it appropriately. OK, so can I move the while loop into pseries_mach_cpu_die(), before calling extended_cede_processor() ? - as in the attached code. > > In pseries_mach_cpu_die(), I added local_irq_disable() after cede, and > > prepare for the start_secondary_resume(), but I'm not sure whether we > > also need a hard_irq_disable() here. > > You probably do if it's going to go back to the start secondary path. It > shouldn't hurt in any case as long as start_secondary_resume() > eventually does a local_irq_enable(). OK, I added it. Could you please to take a look at the updated code? thanks. ======================= diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 64c97d8..9b9394e 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -127,9 +127,16 @@ static void pseries_mach_cpu_die(void) get_lppaca()->donate_dedicated_cpu = 1; while (get_preferred_offline_state(cpu) == CPU_STATE_INACTIVE) { + while (!prep_irq_for_idle()) { + local_irq_enable(); + local_irq_disable(); + } + extended_cede_processor(cede_latency_hint); } + local_irq_disable(); + if (!get_lppaca()->shared_proc) get_lppaca()->donate_dedicated_cpu = 0; get_lppaca()->idle = 0; @@ -137,6 +144,7 @@ static void pseries_mach_cpu_die(void) if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { unregister_slb_shadow(hwcpu); + hard_irq_disable(); /* * Call to start_secondary_resume() will not return. * Kernel stack will be reset and start_secondary() diff --git a/arch/powerpc/platforms/pseries/plpar_wrappers.h b/arch/powerpc/platforms/pseries/plpar_wrappers.h index 13e8cc4..caed0d1 100644 --- a/arch/powerpc/platforms/pseries/plpar_wrappers.h +++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h @@ -2,6 +2,7 @@ #define _PSERIES_PLPAR_WRAPPERS_H #include <linux/string.h> +#include <linux/irqflags.h> #include <asm/hvcall.h> #include <asm/paca.h> @@ -41,7 +42,14 @@ static inline long extended_cede_processor(unsigned long latency_hint) u8 old_latency_hint = get_cede_latency_hint(); set_cede_latency_hint(latency_hint); + rc = cede_processor(); +#ifdef CONFIG_TRACE_IRQFLAGS + /* Ensure that H_CEDE returns with IRQs on */ + if (WARN_ON(!(mfmsr() & MSR_EE))) + __hard_irq_enable(); +#endif + set_cede_latency_hint(old_latency_hint); return rc; ======================= > > I'm still a little confused by the meaning of PACA_IRQ_HARD_DIS in > > irq_happened. From the checking at the warning point, it seems only > > irq_happened equaling 0x1(PACA_IRQ_HARD_DIS) means hard irqs are > > disabled. > > No. They are disabled if any bit in there corresponding to a "level" > interrupt is set as well. Only the "edge" interrupts (and decrementer > which we treat as edge and reset to a high value when it kicks) are > ignored for the sake of HW irq state. > > The reason we have this IRQ_HARD_DIS bit is to indicate that a 'manual' > hard disabling occurred (by opposition to one happening as a result of > an external interrupt). > > We need that so we can avoid doing an mfspr() in local_irq_enable() and > entirely rely on the content of irq_happened to know whether interrupts > are hard enabled or hard disabled. > > We do that because mfspr is a fairly expensive instruction. But that > means that we need to make sure we always have a consistent content in > irq_happened. That's also why I've added all those sanity checks if you > enable IRQ tracing. > > > Is it possible to set this bit at anyplace the hard irqs are disabled, > > so then we could check whether this bit is set to know whether hard irqs > > are disabled? Then it seems that in MASKED_INTERRUPT, we need set this > > bit where MSR_EE is cleared for something other than decrementer. Maybe > > I missed too much things? > > Either that bit or PACA_IRQ_EE. Both indicate that interrupts are hard > disabled. > > There are some rare cases where do do change MSR:EE without touching > those bits, only when we're going to restore it shortly afterward in the > kernel asm exception entry/exit path for example. Thank you very much for the detailed education. I'll reread the code, and might bother you again if have further questions. Thanks, Zhong > Cheers, > Ben. > > > Thanks, Zhong > > > > =================== > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > index 64c97d8..b5f7597 100644 > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > @@ -130,6 +130,8 @@ static void pseries_mach_cpu_die(void) > > extended_cede_processor(cede_latency_hint); > > } > > > > + local_irq_disable(); > > + > > if (!get_lppaca()->shared_proc) > > get_lppaca()->donate_dedicated_cpu = 0; > > get_lppaca()->idle = 0; > > diff --git a/arch/powerpc/platforms/pseries/plpar_wrappers.h b/arch/powerpc/platforms/pseries/plpar_wrappers.h > > index 13e8cc4..07560d8 100644 > > --- a/arch/powerpc/platforms/pseries/plpar_wrappers.h > > +++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h > > @@ -2,6 +2,7 @@ > > #define _PSERIES_PLPAR_WRAPPERS_H > > > > #include <linux/string.h> > > +#include <linux/irqflags.h> > > > > #include <asm/hvcall.h> > > #include <asm/paca.h> > > @@ -41,7 +42,19 @@ static inline long extended_cede_processor(unsigned long latency_hint) > > u8 old_latency_hint = get_cede_latency_hint(); > > > > set_cede_latency_hint(latency_hint); > > + > > + while (!prep_irq_for_idle()) { > > + local_irq_enable(); > > + local_irq_disable(); > > + } > > + > > rc = cede_processor(); > > +#ifdef CONFIG_TRACE_IRQFLAGS > > + /* Ensure that H_CEDE returns with IRQs on */ > > + if (WARN_ON(!(mfmsr() & MSR_EE))) > > + __hard_irq_enable(); > > +#endif > > + > > set_cede_latency_hint(old_latency_hint); > > > > return rc; > > =================== > > > > > > > > > > Cheers, > > > Ben. > > > > > > > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 > > > > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth > > > > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 > > > > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) > > > > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 > > > > [ 56.618913] SOFTE: 1 > > > > [ 56.618916] CFAR: c00000000067a5dc > > > > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 > > > > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 > > > > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 > > > > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 > > > > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c > > > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > > > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 > > > > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 > > > > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 > > > > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c > > > > [ 56.619025] Call Trace: > > > > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) > > > > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c > > > > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 > > > > [ 56.619052] Instruction dump: > > > > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 > > > > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 > > > > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- > > > > > > > > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> > > > > --- > > > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ > > > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > > index 64c97d8..8de539a 100644 > > > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) > > > > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { > > > > unregister_slb_shadow(hwcpu); > > > > > > > > + __hard_irq_disable(); > > > > + > > > > /* > > > > * Call to start_secondary_resume() will not return. > > > > * Kernel stack will be reset and start_secondary() > > > > > > > > > > ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() 2012-10-17 23:54 ` Benjamin Herrenschmidt 2012-10-18 7:30 ` Li Zhong @ 2012-10-22 7:23 ` Li Zhong 1 sibling, 0 replies; 7+ messages in thread From: Li Zhong @ 2012-10-22 7:23 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Paul Mackerras, Paul E. McKenney, PowerPC email list On Thu, 2012-10-18 at 10:54 +1100, Benjamin Herrenschmidt wrote: > On Wed, 2012-09-26 at 18:10 +0800, Li Zhong wrote: > > ../... > > Sorry got distracted, got back on this patch today: > > > > We might need to "sanitize" the enable state in the PACA before we > > > actually enter NAP or in the return from NAP code, like we do for normal > > > idle code... > > > > Hi Ben, > > > > After some further reading of the code, I updated the code as following, > > but I'm not very sure, and guess it most probably has some issues ... > > Could you please help to review and give your comments? > > I think it's still not right, see below > > > In extended_cede_processor(), if there is still lazy irq pending, I used > > local_irq_enable() to make sure the irq replayed and flags cleared, but > > I'm not sure whether it is a proper way. > > Right but that will break things for idle. In idle, if you have > re-enabled interrupts, you need to return and essentially abort the > attempt at going to nap mode, because the interrupt might have set need > resched. That's why we normally just check if something's pending and > return, letting the upper levels re-enable interrupts and do all the > dirty work for us. > > Now, hotplug might differ here in what it needs, but in any case, > extended_cede_processor doesn't seem to be the right place to handle it, > at best that function should return if it thinks there's something that > needs to be done and let the upper layers deal with it appropriately. > > > In pseries_mach_cpu_die(), I added local_irq_disable() after cede, and > > prepare for the start_secondary_resume(), but I'm not sure whether we > > also need a hard_irq_disable() here. > > You probably do if it's going to go back to the start secondary path. It > shouldn't hurt in any case as long as start_secondary_resume() > eventually does a local_irq_enable(). > > > I'm still a little confused by the meaning of PACA_IRQ_HARD_DIS in > > irq_happened. From the checking at the warning point, it seems only > > irq_happened equaling 0x1(PACA_IRQ_HARD_DIS) means hard irqs are > > disabled. Hi Ben, Below are my current understandings and a few more questions, please correct me if there are any misunderstandings. Thank you. > No. They are disabled if any bit in there corresponding to a "level" > interrupt is set as well. Only the "edge" interrupts (and decrementer > which we treat as edge and reset to a high value when it kicks) are > ignored for the sake of HW irq state. >From the code, it seems that the hardware_interrupt/external_input (corresponding to PACA_IRQ_EE) are "level" interrupts. For "level" interrupts, is it because we will see it again, so we need to hard disable? or else we might enter into an infinite loop? > The reason we have this IRQ_HARD_DIS bit is to indicate that a 'manual' > hard disabling occurred (by opposition to one happening as a result of > an external interrupt). The external interrupt causing the hard disabling, is done in the code of masked_##_H##interrupt: for book3s, or masked_interrupt_book3e PACA_IRQ_EE 1 for book3e ? > We need that so we can avoid doing an mfspr() in local_irq_enable() and > entirely rely on the content of irq_happened to know whether interrupts > are hard enabled or hard disabled. Is this about the code in arch_local_irq_restore()? so if (! irq_happended), we could return directly as we know it is hard enabled. And here if (unlikely(irq_happened != PACA_IRQ_HARD_DIS)) __hard_irq_disable(); We don't need to hard disable if (irq_happened == PACA_IRQ_HARD_DIS), as we know it is hard disabled ('manual'). Then here, can we save a few more mtmsrd by also checking PACA_IRQ_EE bit? like following: - if (unlikely(irq_happened != PACA_IRQ_HARD_DIS)) + if (unlikely(!(irq_happened & (PACA_IRQ_HARD_DIS | PACA_IRQ_EE)))) > We do that because mfspr is a fairly expensive instruction. But that > means that we need to make sure we always have a consistent content in > irq_happened. That's also why I've added all those sanity checks if you > enable IRQ tracing. See. > > Is it possible to set this bit at anyplace the hard irqs are disabled, > > so then we could check whether this bit is set to know whether hard irqs > > are disabled? Then it seems that in MASKED_INTERRUPT, we need set this > > bit where MSR_EE is cleared for something other than decrementer. Maybe > > I missed too much things? > > Either that bit or PACA_IRQ_EE. Both indicate that interrupts are hard > disabled. > > There are some rare cases where do do change MSR:EE without touching > those bits, only when we're going to restore it shortly afterward in the > kernel asm exception entry/exit path for example. I don't get it very clearly here. I might need some more time to read and understand all the related asm codes. Currently, it seems to me in EXCEPTION_COMMON, SOFT_DISABLE_INTS is called to set PACA_IRQ_HARD_DIS, and other bits might be set when __SOFTEN_TEST (or masked_interrupt_book3e) is called. And in the exception exit path, something like SOFT_DISABLE_INTS, .restore_interrupts, restore_check_irq_replay, etc are called to handle the irq_happened bits. Thanks, Zhong > Cheers, > Ben. > > > Thanks, Zhong > > > > =================== > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > index 64c97d8..b5f7597 100644 > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > @@ -130,6 +130,8 @@ static void pseries_mach_cpu_die(void) > > extended_cede_processor(cede_latency_hint); > > } > > > > + local_irq_disable(); > > + > > if (!get_lppaca()->shared_proc) > > get_lppaca()->donate_dedicated_cpu = 0; > > get_lppaca()->idle = 0; > > diff --git a/arch/powerpc/platforms/pseries/plpar_wrappers.h b/arch/powerpc/platforms/pseries/plpar_wrappers.h > > index 13e8cc4..07560d8 100644 > > --- a/arch/powerpc/platforms/pseries/plpar_wrappers.h > > +++ b/arch/powerpc/platforms/pseries/plpar_wrappers.h > > @@ -2,6 +2,7 @@ > > #define _PSERIES_PLPAR_WRAPPERS_H > > > > #include <linux/string.h> > > +#include <linux/irqflags.h> > > > > #include <asm/hvcall.h> > > #include <asm/paca.h> > > @@ -41,7 +42,19 @@ static inline long extended_cede_processor(unsigned long latency_hint) > > u8 old_latency_hint = get_cede_latency_hint(); > > > > set_cede_latency_hint(latency_hint); > > + > > + while (!prep_irq_for_idle()) { > > + local_irq_enable(); > > + local_irq_disable(); > > + } > > + > > rc = cede_processor(); > > +#ifdef CONFIG_TRACE_IRQFLAGS > > + /* Ensure that H_CEDE returns with IRQs on */ > > + if (WARN_ON(!(mfmsr() & MSR_EE))) > > + __hard_irq_enable(); > > +#endif > > + > > set_cede_latency_hint(old_latency_hint); > > > > return rc; > > =================== > > > > > > > > > > Cheers, > > > Ben. > > > > > > > [ 56.618846] WARNING: at arch/powerpc/kernel/irq.c:240 > > > > [ 56.618851] Modules linked in: rcutorture ipv6 dm_mod ext3 jbd mbcache sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt ibmveth > > > > [ 56.618883] NIP: c00000000000ff94 LR: c00000000067a5e0 CTR: 0000000000000001 > > > > [ 56.618889] REGS: c0000001fef6bbe0 TRAP: 0700 Not tainted (3.6.0-rc1-autokern1) > > > > [ 56.618894] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 42000082 XER: 20000000 > > > > [ 56.618913] SOFTE: 1 > > > > [ 56.618916] CFAR: c00000000067a5dc > > > > [ 56.618920] TASK = c0000001feed79a0[0] 'swapper/5' THREAD: c0000001fef68000 CPU: 5 > > > > GPR00: 0000000000000001 c0000001fef6be60 c000000000f9ca08 0000000000000001 > > > > GPR04: 0000000000000001 0000000000000008 0000000000000001 0000000000000000 > > > > GPR08: 0000000000000000 c0000001feed79a0 0008a80000000000 0000000000000000 > > > > GPR12: 0000000022000082 c00000000f330f00 c0000001fef6bf90 000000000f394b4c > > > > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > > > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > > > GPR24: c000000000fe8f80 0000000000000008 0000000000000028 0000000000000000 > > > > GPR28: 0000000000000000 0000000000000020 c000000000f1ab40 0000000000000001 > > > > [ 56.619014] NIP [c00000000000ff94] .arch_local_irq_restore+0x34/0xa0 > > > > [ 56.619020] LR [c00000000067a5e0] .start_secondary+0x368/0x37c > > > > [ 56.619025] Call Trace: > > > > [ 56.619030] [c0000001fef6be60] [c000000001ba0500] 0xc000000001ba0500 (unreliable) > > > > [ 56.619038] [c0000001fef6bed0] [c00000000067a5e0] .start_secondary+0x368/0x37c > > > > [ 56.619046] [c0000001fef6bf90] [c000000000009380] .start_secondary_resume+0x10/0x14 > > > > [ 56.619052] Instruction dump: > > > > [ 56.619056] f8010010 f821ff91 986d022a 2fa30000 419e0054 880d022b 78000621 41820048 > > > > [ 56.619071] 2f800001 40de0064 7c0000a6 78008fe2 <0b000000> 2fa00000 40de0050 38000000 > > > > [ 56.619088] ---[ end trace 0199c0d783d7f9ba ]--- > > > > > > > > Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> > > > > --- > > > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 2 ++ > > > > 1 files changed, 2 insertions(+), 0 deletions(-) > > > > > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > > index 64c97d8..8de539a 100644 > > > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > > > @@ -137,6 +137,8 @@ static void pseries_mach_cpu_die(void) > > > > if (get_preferred_offline_state(cpu) == CPU_STATE_ONLINE) { > > > > unregister_slb_shadow(hwcpu); > > > > > > > > + __hard_irq_disable(); > > > > + > > > > /* > > > > * Call to start_secondary_resume() will not return. > > > > * Kernel stack will be reset and start_secondary() > > > > > > > > > > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-10-22 7:23 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-09-24 3:56 [RFC PATCH powerpc] Fix a lazy irq related WARING in arch_local_irq_restore() Li Zhong 2012-09-24 5:30 ` Benjamin Herrenschmidt 2012-09-24 6:42 ` Li Zhong 2012-09-26 10:10 ` Li Zhong 2012-10-17 23:54 ` Benjamin Herrenschmidt 2012-10-18 7:30 ` Li Zhong 2012-10-22 7:23 ` Li Zhong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).