* Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
@ 2017-09-19 18:24 Guenter Roeck
2017-09-20 3:05 ` Michael Ellerman
0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2017-09-19 18:24 UTC (permalink / raw)
To: Christophe Leroy
Cc: Michael Ellerman, linux-kernel, Benjamin Herrenschmidt,
linuxppc-dev, Paul Mackerras
Hi,
I see the following traceback when running an SMP image based on
85xx/mpc85xx_cds_defconfig in qemu.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
task: cf830000 task.stack: cf82e000
NIP: c00a93c8 LR: c00a9634 CTR: 00000001
REGS: cf82fde0 TRAP: 0700 Not tainted (4.14.0-rc1-00009-g0666f56)
MSR: 00021000 <CE,ME> CR: 24000082 XER: 00000000
GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
LR [c00a9634] smp_call_function+0x3c/0x50
Call Trace:
[cf82fe90] [00000010] 0x10 (unreliable)
[cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
[cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
[cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
[cf82ff20] [c001484c] free_initmem+0x20/0x4c
[cf82ff30] [c000316c] kernel_init+0x1c/0x108
[cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78
---[ end trace 7da7bdcf8b15ddb3 ]---
A complete log is available at:
http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
after freeing unused memory on PPC32"). Bisect log is attached. A quick look
suggests that mark_initmem_nx() is called with interrupts disabled, which
triggers the traceback.
Guenter
---
# bad: [ebb2c2437d8008d46796902ff390653822af6cc4] Merge tag 'mmc-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
# good: [e7d0c41ecc2e372a81741a30894f556afec24315] Merge tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect start 'HEAD' 'e7d0c41ecc2e'
# bad: [c0da4fa0d1a54495d6055c009ac46b76d1da2c86] Merge tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect bad c0da4fa0d1a54495d6055c009ac46b76d1da2c86
# good: [aae3dbb4776e7916b6cd442d00159bea27a695c1] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good aae3dbb4776e7916b6cd442d00159bea27a695c1
# bad: [3645e6d0dc80be4376f87acc9ee527768387c909] Merge tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
git bisect bad 3645e6d0dc80be4376f87acc9ee527768387c909
# bad: [bac65d9d87b383471d8d29128319508d71b74180] Merge tag 'powerpc-4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect bad bac65d9d87b383471d8d29128319508d71b74180
# good: [57e88b43b81301d9b28f124a5576ac43a1cf9e8d] Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 57e88b43b81301d9b28f124a5576ac43a1cf9e8d
# bad: [f9065c83ccf4a6c1ff5419d216ad8276e99bee6c] powerpc/configs: Explicitly drop CONFIG_INPUT_MOUSEDEV
git bisect bad f9065c83ccf4a6c1ff5419d216ad8276e99bee6c
# good: [ea16e83aec40f9110be9cb0c3398ef41ae890ca6] powerpc/cpm1: link to CONFIG_CPM1 instead of CONFIG_8xx
git bisect good ea16e83aec40f9110be9cb0c3398ef41ae890ca6
# bad: [8bfa42ab84910841336218265fcee94fd1e6285a] powerpc: Add const to bin_attribute structures
git bisect bad 8bfa42ab84910841336218265fcee94fd1e6285a
# good: [36992606eee8016c36ad2576687e97422f2f35ed] powerpc/chrp: Store the intended structure
git bisect good 36992606eee8016c36ad2576687e97422f2f35ed
# bad: [86b19520e7ef5539eb081c76fe2f5c955180205f] powerpc/mm: declare some local functions static
git bisect bad 86b19520e7ef5539eb081c76fe2f5c955180205f
# good: [87be3e2d31c01d3858bff43ab663769db03aab17] powerpc/8xx: Do not allow Pinned TLBs with STRICT_KERNEL_RWX or DEBUG_PAGEALLOC
git bisect good 87be3e2d31c01d3858bff43ab663769db03aab17
# good: [e611939fc8ec13387018df88083de7102a438730] powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs
git bisect good e611939fc8ec13387018df88083de7102a438730
# bad: [95902e6c8864d39b09134dcaa3c99d8161d1deea] powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32
git bisect bad 95902e6c8864d39b09134dcaa3c99d8161d1deea
# bad: [3184cc4b6f6a1dc0c1745aafe2b14b1206ef3187] powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32
git bisect bad 3184cc4b6f6a1dc0c1745aafe2b14b1206ef3187
# first bad commit: [3184cc4b6f6a1dc0c1745aafe2b14b1206ef3187] powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32
* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
2017-09-19 18:24 Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu Guenter Roeck
@ 2017-09-20 3:05 ` Michael Ellerman
2017-09-20 3:45 ` Guenter Roeck
0 siblings, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2017-09-20 3:05 UTC (permalink / raw)
To: Guenter Roeck, Christophe Leroy
Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras
Guenter Roeck <linux@roeck-us.net> writes:
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
> MSR: 00021000 <CE,ME> CR: 24000082 XER: 00000000
> ---[ end trace 7da7bdcf8b15ddb3 ]---

Thanks.

I guess the system still runs OK otherwise, you're just seeing the warning?

> A complete log is available at:
> http://kerneltests.org/builders/qemu-ppc-master/builds/814/steps/qemubuildcommand/logs/stdio
>
> Bisect points to commit 3184cc4b6f6a1dc0 ("powerpc/mm: Fix kernel RAM protection
> after freeing unused memory on PPC32"). Bisect log is attached. A quick look
> suggests that mark_initmem_nx() is called with interrupts disabled, which
> triggers the traceback.

Hmm. Yes the MSR says you have interrupts disabled (EE missing).

But I don't see why. start_kernel() did local_irq_enable(), so I don't
understand why we got to mark_initmem_nx() with them disabled. I'll hope
that Christophe has some idea.

cheers
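The "EE missing" observation can be checked by hand against the MSR value in the trace. The following is a minimal sketch, assuming the Book E bit layout (CE = 0x00020000, EE = 0x00008000, ME = 0x00001000); the helper names `msr_has()` and `msr_irqs_enabled()` are hypothetical and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR bit masks, assuming the Book E layout. */
#define MSR_CE 0x00020000u /* critical interrupt enable */
#define MSR_EE 0x00008000u /* external interrupt enable */
#define MSR_ME 0x00001000u /* machine check enable */

/* Hypothetical helper: true if every bit in mask is set in msr. */
static bool msr_has(uint32_t msr, uint32_t mask)
{
	return (msr & mask) == mask;
}

/* Hypothetical helper: external interrupts are enabled only when EE is set. */
static bool msr_irqs_enabled(uint32_t msr)
{
	return msr_has(msr, MSR_EE);
}
```

With the traced value, `msr_irqs_enabled(0x00021000u)` is false, while `msr_has(0x00021000u, MSR_CE | MSR_ME)` is true. That matches the `<CE,ME>` annotation printed next to the MSR in the trace.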
* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
2017-09-20 3:05 ` Michael Ellerman
@ 2017-09-20 3:45 ` Guenter Roeck
2017-09-21 18:44 ` Christophe LEROY
0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2017-09-20 3:45 UTC (permalink / raw)
To: Michael Ellerman, Christophe Leroy
Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras
On 09/19/2017 08:05 PM, Michael Ellerman wrote:
> Thanks.
>
> I guess the system still runs OK otherwise, you're just seeing the warning?

Yes, though I am not sure if that is because there is only one active CPU
(there is still only one if I say "-smp 4" on the qemu command line).

> Hmm. Yes the MSR says you have interrupts disabled (EE missing).
>
> But I don't see why. start_kernel() did local_irq_enable(), so I don't
> understand why we got to mark_initmem_nx() with them disabled. I'll hope
> that Christophe has some idea.

Good question. I only see this with one of 9 ppc emulations, with
85xx/mpc85xx_cds_defconfig +CONFIG_DEVTMPFS=y +CONFIG_SMP=y. Maybe there is
a platform specific init function which leaves interrupts disabled.
Question is which one that might be.

Guenter
* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
2017-09-20 3:45 ` Guenter Roeck
@ 2017-09-21 18:44 ` Christophe LEROY
2017-09-24 16:05 ` Guenter Roeck
0 siblings, 1 reply; 6+ messages in thread
From: Christophe LEROY @ 2017-09-21 18:44 UTC (permalink / raw)
To: Guenter Roeck, Michael Ellerman
Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras
On 20/09/2017 at 05:45, Guenter Roeck wrote:
> Good question. I only see this with one of 9 ppc emulations, with
> 85xx/mpc85xx_cds_defconfig +CONFIG_DEVTMPFS=y +CONFIG_SMP=y. Maybe there is
> a platform specific init function which leaves interrupts disabled.
> Question is which one that might be.

Unfortunately no, I have no idea. My three platforms (860, 885 and 8321) are
not SMP, so that warning would not appear, but I added a WARN_ON(1) just
before calling mark_initmem_nx(), and I can confirm that MSR has EE set on
all three at that time.

So as you suggest, there must be some platform-specific code leaving the
interrupts disabled.

Christophe
* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
2017-09-21 18:44 ` Christophe LEROY
@ 2017-09-24 16:05 ` Guenter Roeck
2017-09-25 6:36 ` Christophe LEROY
0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2017-09-24 16:05 UTC (permalink / raw)
To: Christophe LEROY, Michael Ellerman
Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras
On 09/21/2017 11:44 AM, Christophe LEROY wrote:
> Unfortunately no, I have no idea. My three platforms (860, 885 and 8321) are
> not SMP, so that warning would not appear, but I added a WARN_ON(1) just
> before calling mark_initmem_nx(), and I can confirm that MSR has EE set on
> all three at that time.

You should still be able to compile and run an SMP kernel.
mpc85xx_cds_defconfig without CONFIG_SMP=y does not show the warning either.

Turns out interrupts are disabled in change_page_attr(), called by
mark_initmem_nx(). change_page_attr() calls flush_tlb_kernel_range() with
interrupts disabled. This only happens if CONFIG_PPC_MMU_NOHASH=y.
Given that, I would assume that this will be seen with every 32 bit ppc
build which has CONFIG_SMP=y and CONFIG_PPC_MMU_NOHASH=y.

Maybe the problem was really introduced with commit e611939fc8ec1
("powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs").
From the context it appears that flush_tlb_kernel_range() should not be
called with interrupts disabled. Indeed, moving flush_tlb_kernel_range()
outside the irq disabled code fixes the problem for me.

Thanks,
Guenter
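The reordering described in the message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` name is a hypothetical stand-in, the interrupt state is a plain boolean, and the "flush" merely counts a warning when it is entered with interrupts off. It is not the kernel's change_page_attr() and not the patch that was eventually applied.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state; not kernel code. */
static bool model_irqs_disabled; /* models the local interrupt state */
static int model_warn_count;     /* number of simulated warnings */

/* Models local_irq_save(): remember the old state, then disable. */
static bool model_irq_save(void)
{
	bool old = model_irqs_disabled;

	model_irqs_disabled = true;
	return old;
}

/* Models local_irq_restore(): put back the saved state. */
static void model_irq_restore(bool old)
{
	model_irqs_disabled = old;
}

/* Models a cross-CPU TLB flush that warns when interrupts are off. */
static void model_flush_tlb_kernel_range(void)
{
	if (model_irqs_disabled)
		model_warn_count++;
}

/* Placeholder for the per-page attribute update; does nothing here. */
static void model_update_page_attrs(int numpages)
{
	(void)numpages;
}

/* Ordering described in the thread: flush inside the irq-off region. */
static int flush_inside_irq_off(int numpages)
{
	bool flags;

	model_warn_count = 0;
	flags = model_irq_save();
	model_update_page_attrs(numpages);
	model_flush_tlb_kernel_range();
	model_irq_restore(flags);
	return model_warn_count;
}

/* Reordered variant: restore interrupts first, then flush. */
static int flush_after_irq_restore(int numpages)
{
	bool flags;

	model_warn_count = 0;
	flags = model_irq_save();
	model_update_page_attrs(numpages);
	model_irq_restore(flags);
	model_flush_tlb_kernel_range();
	return model_warn_count;
}
```

In this model, `flush_inside_irq_off(4)` reports one warning and `flush_after_irq_restore(4)` reports none. The real check exists because a cross-CPU call waits for other CPUs to respond, which can deadlock if the caller has interrupts disabled.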
* Re: Traceback due to 'powerpc/mm: Fix kernel RAM protection...' when running ppc image in qemu
2017-09-24 16:05 ` Guenter Roeck
@ 2017-09-25 6:36 ` Christophe LEROY
0 siblings, 0 replies; 6+ messages in thread
From: Christophe LEROY @ 2017-09-25 6:36 UTC (permalink / raw)
To: Guenter Roeck, Michael Ellerman
Cc: linux-kernel, Benjamin Herrenschmidt, linuxppc-dev, Paul Mackerras
On 24/09/2017 at 18:05, Guenter Roeck wrote:
> You should still be able to compile and run an SMP kernel.

SMP doesn't support the 8xx, and the 83xx has hash MMU.

> mpc85xx_cds_defconfig without CONFIG_SMP=y does not show the warning either.

Yes that's normal, as smp_call_function() is not called in that case,
hence my test with a WARN_ON(1) just before calling mark_initmem_nx().

> Turns out interrupts are disabled in change_page_attr(), called by
> mark_initmem_nx().

Oops, you're right, I missed it.

> change_page_attr() calls flush_tlb_kernel_range() with interrupts disabled.
> This only happens if CONFIG_PPC_MMU_NOHASH=y.
> Given that, I would assume that this will be seen with every 32 bit ppc
> build which has CONFIG_SMP=y and CONFIG_PPC_MMU_NOHASH=y.
>
> Maybe the problem was really introduced with commit e611939fc8ec1
> ("powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs").
> From the context it appears that flush_tlb_kernel_range() should not be
> called with interrupts disabled.

Right, it looks like that warning was introduced by this commit.
However, looking at flush_tlb_page(), which was the function called before
that commit, there was most likely also an issue with SMP, because
flush_tlb_page() called with a NULL vma results in a warning in the SMP
NOHASH version of flush_tlb_page().

> Indeed, moving flush_tlb_kernel_range() outside the irq disabled code fixes
> the problem for me.

Yes, that seems likely to be the solution.

Thanks
Christophe