* [PATCHv2 for soc 0/4] Enabling socfpga on hardware @ 2013-01-31 17:05 dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW dinguyen at altera.com ` (3 more replies) 0 siblings, 4 replies; 36+ messages in thread From: dinguyen at altera.com @ 2013-01-31 17:05 UTC (permalink / raw) To: linux-arm-kernel From: Dinh Nguyen <dinguyen@altera.com> V2: - Remove patch that adds clock entries in the socfpga.dtsi as this should accompany a rework in drivers/clk and will done in a different patch series. - Removed I-cache invalidate from v7_invalidate_l1 - Defined cpu1-start-addr as a device tree entry - Removed the need to use CONFIG_VMSPLIT_2G Dinh Nguyen (4): arm: socfpga: Add new device tree source for actual socfpga HW arm: socfpga: Add entries to enable make dtbs socfpga arm: Add v7_invalidate_l1 to cache-v7.S arm: socfpga: Add SMP support for actual socfpga harware .../bindings/arm/altera/socfpga-system.txt | 2 + arch/arm/boot/dts/Makefile | 2 + arch/arm/boot/dts/socfpga.dtsi | 22 +++---- arch/arm/boot/dts/socfpga_cyclone5.dts | 34 ++++++++++- arch/arm/boot/dts/socfpga_vt.dts | 64 ++++++++++++++++++++ arch/arm/mach-imx/headsmp.S | 47 -------------- arch/arm/mach-shmobile/headsmp.S | 48 --------------- arch/arm/mach-socfpga/core.h | 4 +- arch/arm/mach-socfpga/headsmp.S | 16 +++-- arch/arm/mach-socfpga/platsmp.c | 3 +- arch/arm/mach-socfpga/socfpga.c | 9 +++ arch/arm/mach-tegra/headsmp.S | 43 ------------- arch/arm/mm/cache-v7.S | 46 ++++++++++++++ 13 files changed, 182 insertions(+), 158 deletions(-) create mode 100644 arch/arm/boot/dts/socfpga_vt.dts -- 1.7.9.5 ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW 2013-01-31 17:05 [PATCHv2 for soc 0/4] Enabling socfpga on hardware dinguyen at altera.com @ 2013-01-31 17:05 ` dinguyen at altera.com 2013-02-01 3:46 ` Olof Johansson 2013-01-31 17:05 ` [PATCHv2 for soc 2/4] arm: socfpga: Add entries to enable make dtbs socfpga dinguyen at altera.com ` (2 subsequent siblings) 3 siblings, 1 reply; 36+ messages in thread From: dinguyen at altera.com @ 2013-01-31 17:05 UTC (permalink / raw) To: linux-arm-kernel From: Dinh Nguyen <dinguyen@altera.com> Up to this point, support for socfpga has only been on a virtual platform. Now that actual hardware is available, we add the appropriate device tree source files. Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Tested-by: Pavel Machek <pavel@denx.de> Reviewed-by: Pavel Machek <pavel@denx.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Olof Johansson <olof@lixom.net> --- arch/arm/boot/dts/socfpga.dtsi | 22 ++++++------ arch/arm/boot/dts/socfpga_cyclone5.dts | 30 ++++++++++++++-- arch/arm/boot/dts/socfpga_vt.dts | 60 ++++++++++++++++++++++++++++++++ arch/arm/mach-socfpga/socfpga.c | 1 + 4 files changed, 99 insertions(+), 14 deletions(-) create mode 100644 arch/arm/boot/dts/socfpga_vt.dts diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi index 19aec42..936d230 100644 --- a/arch/arm/boot/dts/socfpga.dtsi +++ b/arch/arm/boot/dts/socfpga.dtsi @@ -25,6 +25,10 @@ ethernet0 = &gmac0; serial0 = &uart0; serial1 = &uart1; + timer0 = &timer0; + timer1 = &timer1; + timer2 = &timer2; + timer3 = &timer3; }; cpus { @@ -98,47 +102,41 @@ interrupts = <1 13 0xf04>; }; - timer0: timer at ffc08000 { + timer0: timer0 at ffc08000 { compatible = "snps,dw-apb-timer-sp"; interrupts = <0 167 4>; - clock-frequency = <200000000>; reg = <0xffc08000 0x1000>; }; - timer1: timer at ffc09000 { + timer1: timer1 at ffc09000 { compatible = "snps,dw-apb-timer-sp"; interrupts = <0 168 4>; - clock-frequency = <200000000>; reg = <0xffc09000 0x1000>; }; - timer2: timer at ffd00000 { + timer2: timer2 at ffd00000 { compatible = "snps,dw-apb-timer-osc"; interrupts = <0 169 4>; - clock-frequency = <200000000>; reg = <0xffd00000 0x1000>; }; - timer3: timer at ffd01000 { + timer3: timer3 at ffd01000 { compatible = "snps,dw-apb-timer-osc"; interrupts = <0 170 4>; - clock-frequency = <200000000>; reg = <0xffd01000 0x1000>; }; - uart0: uart at ffc02000 { + uart0: serial0 at ffc02000 { compatible = "snps,dw-apb-uart"; reg = <0xffc02000 0x1000>; - clock-frequency = <7372800>; interrupts = <0 162 4>; reg-shift = <2>; reg-io-width = <4>; }; - uart1: uart at ffc03000 { + uart1: serial1 at ffc03000 { compatible = "snps,dw-apb-uart"; reg = <0xffc03000 0x1000>; - clock-frequency = <7372800>; interrupts = <0 163 4>; reg-shift = <2>; reg-io-width = <4>; diff --git a/arch/arm/boot/dts/socfpga_cyclone5.dts b/arch/arm/boot/dts/socfpga_cyclone5.dts index ab7e4a9..7ad3cc6 100644 --- a/arch/arm/boot/dts/socfpga_cyclone5.dts +++ b/arch/arm/boot/dts/socfpga_cyclone5.dts @@ -20,7 +20,7 @@ / { model = "Altera SOCFPGA Cyclone V"; - compatible = "altr,socfpga-cyclone5"; + compatible = "altr,socfpga-cyclone5", "altr,socfpga"; chosen { bootargs = "console=ttyS0,57600"; @@ -29,6 +29,32 @@ memory { name = "memory"; device_type = "memory"; - reg = <0x0 0x10000000>; /* 256MB */ + reg = <0x0 0x40000000>; /* 1GB */ + }; + + soc { + timer0 at ffc08000 { + clock-frequency = <100000000>; + }; + + timer1 at ffc09000 { + clock-frequency = <100000000>; + }; + + timer2 at ffd00000 { + clock-frequency = <25000000>; + }; + + timer3 at ffd01000 { + clock-frequency = <25000000>; + }; + + serial0 at ffc02000 { + clock-frequency = <100000000>; + }; + + serial1 at ffc03000 { + clock-frequency = <100000000>; + }; }; }; diff --git a/arch/arm/boot/dts/socfpga_vt.dts b/arch/arm/boot/dts/socfpga_vt.dts new file mode 100644 index 0000000..a0c6c65 --- /dev/null +++ b/arch/arm/boot/dts/socfpga_vt.dts @@ -0,0 +1,60 @@ +/* + * Copyright (C) 2013 Altera Corporation <www.altera.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see <http://www.gnu.org/licenses/>. + */ + +/dts-v1/; +/include/ "socfpga.dtsi" + +/ { + model = "Altera SOCFPGA VT"; + compatible = "altr,socfpga-vt", "altr,socfpga"; + + chosen { + bootargs = "console=ttyS0,57600"; + }; + + memory { + name = "memory"; + device_type = "memory"; + reg = <0x0 0x40000000>; /* 1 GB */ + }; + + soc { + timer0 at ffc08000 { + clock-frequency = <7000000>; + }; + + timer1 at ffc09000 { + clock-frequency = <7000000>; + }; + + timer2 at ffd00000 { + clock-frequency = <7000000>; + }; + + timer3 at ffd01000 { + clock-frequency = <7000000>; + }; + + serial0 at ffc02000 { + clock-frequency = <7372800>; + }; + + serial1 at ffc03000 { + clock-frequency = <7372800>; + }; + }; +}; diff --git a/arch/arm/mach-socfpga/socfpga.c b/arch/arm/mach-socfpga/socfpga.c index 6732924..198f491 100644 --- a/arch/arm/mach-socfpga/socfpga.c +++ b/arch/arm/mach-socfpga/socfpga.c @@ -99,6 +99,7 @@ static void __init socfpga_cyclone5_init(void) static const char *altera_dt_match[] = { "altr,socfpga", "altr,socfpga-cyclone5", + "altr,socfpga-vt", NULL }; -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW 2013-01-31 17:05 ` [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW dinguyen at altera.com @ 2013-02-01 3:46 ` Olof Johansson 2013-02-01 15:23 ` Dinh Nguyen 0 siblings, 1 reply; 36+ messages in thread From: Olof Johansson @ 2013-02-01 3:46 UTC (permalink / raw) To: linux-arm-kernel On Thu, Jan 31, 2013 at 11:05:40AM -0600, dinguyen at altera.com wrote: > From: Dinh Nguyen <dinguyen@altera.com> > > Up to this point, support for socfpga has only been on a virtual > platform. Now that actual hardware is available, we add the appropriate > device tree source files. > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > Tested-by: Pavel Machek <pavel@denx.de> > Reviewed-by: Pavel Machek <pavel@denx.de> > Cc: Russell King <linux@arm.linux.org.uk> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Olof Johansson <olof@lixom.net> Are you planning on sending us a merge request, or do you want us to apply this to a branch in arm-soc for you? (Note comment below though) > diff --git a/arch/arm/boot/dts/socfpga_cyclone5.dts b/arch/arm/boot/dts/socfpga_cyclone5.dts > index ab7e4a9..7ad3cc6 100644 > --- a/arch/arm/boot/dts/socfpga_cyclone5.dts > +++ b/arch/arm/boot/dts/socfpga_cyclone5.dts > @@ -20,7 +20,7 @@ > > / { > model = "Altera SOCFPGA Cyclone V"; > - compatible = "altr,socfpga-cyclone5"; > + compatible = "altr,socfpga-cyclone5", "altr,socfpga"; > > chosen { > bootargs = "console=ttyS0,57600"; [...] > diff --git a/arch/arm/boot/dts/socfpga_vt.dts b/arch/arm/boot/dts/socfpga_vt.dts > new file mode 100644 > index 0000000..a0c6c65 > --- /dev/null > +++ b/arch/arm/boot/dts/socfpga_vt.dts [...] > + > +/ { > + model = "Altera SOCFPGA VT"; > + compatible = "altr,socfpga-vt", "altr,socfpga"; > + > + chosen { > + bootargs = "console=ttyS0,57600"; > + }; > + [...] > diff --git a/arch/arm/mach-socfpga/socfpga.c b/arch/arm/mach-socfpga/socfpga.c > index 6732924..198f491 100644 > --- a/arch/arm/mach-socfpga/socfpga.c > +++ b/arch/arm/mach-socfpga/socfpga.c > @@ -99,6 +99,7 @@ static void __init socfpga_cyclone5_init(void) > static const char *altera_dt_match[] = { > "altr,socfpga", > "altr,socfpga-cyclone5", > + "altr,socfpga-vt", > NULL > }; Since you have altr,socfpga in the compatible for the new board, you don't need to add an explicit check for the more specific one here. Same for cyclone5, you should even be able to remove that. -Olof ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW 2013-02-01 3:46 ` Olof Johansson @ 2013-02-01 15:23 ` Dinh Nguyen 0 siblings, 0 replies; 36+ messages in thread From: Dinh Nguyen @ 2013-02-01 15:23 UTC (permalink / raw) To: linux-arm-kernel Hi Olof, On Thu, 2013-01-31 at 19:46 -0800, Olof Johansson wrote: > On Thu, Jan 31, 2013 at 11:05:40AM -0600, dinguyen at altera.com wrote: > > From: Dinh Nguyen <dinguyen@altera.com> > > > > Up to this point, support for socfpga has only been on a virtual > > platform. Now that actual hardware is available, we add the appropriate > > device tree source files. > > > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > > Tested-by: Pavel Machek <pavel@denx.de> > > Reviewed-by: Pavel Machek <pavel@denx.de> > > Cc: Russell King <linux@arm.linux.org.uk> > > Cc: Arnd Bergmann <arnd@arndb.de> > > Cc: Olof Johansson <olof@lixom.net> > > Are you planning on sending us a merge request, or do you want us to apply this > to a branch in arm-soc for you? If you can apply to arm-soc, that would great. Thanks! Let me address your comment in a v3. Dinh > > (Note comment below though) > > > diff --git a/arch/arm/boot/dts/socfpga_cyclone5.dts b/arch/arm/boot/dts/socfpga_cyclone5.dts > > index ab7e4a9..7ad3cc6 100644 > > --- a/arch/arm/boot/dts/socfpga_cyclone5.dts > > +++ b/arch/arm/boot/dts/socfpga_cyclone5.dts > > @@ -20,7 +20,7 @@ > > > > / { > > model = "Altera SOCFPGA Cyclone V"; > > - compatible = "altr,socfpga-cyclone5"; > > + compatible = "altr,socfpga-cyclone5", "altr,socfpga"; > > > > chosen { > > bootargs = "console=ttyS0,57600"; > > [...] > > > diff --git a/arch/arm/boot/dts/socfpga_vt.dts b/arch/arm/boot/dts/socfpga_vt.dts > > new file mode 100644 > > index 0000000..a0c6c65 > > --- /dev/null > > +++ b/arch/arm/boot/dts/socfpga_vt.dts > [...] > > + > > +/ { > > + model = "Altera SOCFPGA VT"; > > + compatible = "altr,socfpga-vt", "altr,socfpga"; > > + > > + chosen { > > + bootargs = "console=ttyS0,57600"; > > + }; > > + > [...] > > diff --git a/arch/arm/mach-socfpga/socfpga.c b/arch/arm/mach-socfpga/socfpga.c > > index 6732924..198f491 100644 > > --- a/arch/arm/mach-socfpga/socfpga.c > > +++ b/arch/arm/mach-socfpga/socfpga.c > > @@ -99,6 +99,7 @@ static void __init socfpga_cyclone5_init(void) > > static const char *altera_dt_match[] = { > > "altr,socfpga", > > "altr,socfpga-cyclone5", > > + "altr,socfpga-vt", > > NULL > > }; > > Since you have altr,socfpga in the compatible for the new board, you don't need > to add an explicit check for the more specific one here. Same for cyclone5, you > should even be able to remove that. > > > -Olof > ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 2/4] arm: socfpga: Add entries to enable make dtbs socfpga 2013-01-31 17:05 [PATCHv2 for soc 0/4] Enabling socfpga on hardware dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW dinguyen at altera.com @ 2013-01-31 17:05 ` dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware dinguyen at altera.com 3 siblings, 0 replies; 36+ messages in thread From: dinguyen at altera.com @ 2013-01-31 17:05 UTC (permalink / raw) To: linux-arm-kernel From: Dinh Nguyen <dinguyen@altera.com> Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Tested-by: Pavel Machek <pavel@denx.de> Reviewed-by: Pavel Machek <pavel@denx.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Olof Johansson <olof@lixom.net> Cc: Pavel Machek <pavel@denx.de> --- arch/arm/boot/dts/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile index 5ebb44f..1b8276c 100644 --- a/arch/arm/boot/dts/Makefile +++ b/arch/arm/boot/dts/Makefile @@ -124,6 +124,8 @@ dtb-$(CONFIG_ARCH_SHMOBILE) += emev2-kzm9d.dtb \ r8a7740-armadillo800eva.dtb \ sh73a0-kzm9g.dtb \ sh7372-mackerel.dtb +dtb-$(CONFIG_ARCH_SOCFPGA) += socfpga_cyclone5.dtb \ + socfpga_vt.dtb dtb-$(CONFIG_ARCH_SPEAR13XX) += spear1310-evb.dtb \ spear1340-evb.dtb dtb-$(CONFIG_ARCH_SPEAR3XX)+= spear300-evb.dtb \ -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-01-31 17:05 [PATCHv2 for soc 0/4] Enabling socfpga on hardware dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 2/4] arm: socfpga: Add entries to enable make dtbs socfpga dinguyen at altera.com @ 2013-01-31 17:05 ` dinguyen at altera.com 2013-01-31 18:11 ` Stephen Warren ` (2 more replies) 2013-01-31 17:05 ` [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware dinguyen at altera.com 3 siblings, 3 replies; 36+ messages in thread From: dinguyen at altera.com @ 2013-01-31 17:05 UTC (permalink / raw) To: linux-arm-kernel From: Dinh Nguyen <dinguyen@altera.com> mach-socfpga is another platform that needs to use v7_invalidate_l1 to bringup additional cores. There was a comment that the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Tested-by: Pavel Machek <pavel@denx.de> Reviewed-by: Pavel Machek <pavel@denx.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Olof Johansson <olof@lixom.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Stephen Warren <swarren@wwwdotorg.org> --- arch/arm/mach-imx/headsmp.S | 47 ------------------------------------- arch/arm/mach-shmobile/headsmp.S | 48 -------------------------------------- arch/arm/mach-tegra/headsmp.S | 43 ---------------------------------- arch/arm/mm/cache-v7.S | 46 ++++++++++++++++++++++++++++++++++++ 4 files changed, 46 insertions(+), 138 deletions(-) diff --git a/arch/arm/mach-imx/headsmp.S b/arch/arm/mach-imx/headsmp.S index 7e49deb..921fc15 100644 --- a/arch/arm/mach-imx/headsmp.S +++ b/arch/arm/mach-imx/headsmp.S @@ -17,53 +17,6 @@ .section ".text.head", "ax" -/* - * The secondary kernel init calls v7_flush_dcache_all before it enables - * the L1; however, the L1 comes out of reset in an undefined state, so - * the clean + invalidate performed by v7_flush_dcache_all causes a bunch - * of cache lines with uninitialized data and uninitialized tags to get - * written out to memory, which does really unpleasant things to the main - * processor. We fix this by performing an invalidate, rather than a - * clean + invalidate, before jumping into the kernel. - * - * This funciton is cloned from arch/arm/mach-tegra/headsmp.S, and needs - * to be called for both secondary cores startup and primary core resume - * procedures. Ideally, it should be moved into arch/arm/mm/cache-v7.S. - */ -ENTRY(v7_invalidate_l1) - mov r0, #0 - mcr p15, 0, r0, c7, c5, 0 @ invalidate I cache - mcr p15, 2, r0, c0, c0, 0 - mrc p15, 1, r0, c0, c0, 0 - - ldr r1, =0x7fff - and r2, r1, r0, lsr #13 - - ldr r1, =0x3ff - - and r3, r1, r0, lsr #3 @ NumWays - 1 - add r2, r2, #1 @ NumSets - - and r0, r0, #0x7 - add r0, r0, #4 @ SetShift - - clz r1, r3 @ WayShift - add r4, r3, #1 @ NumWays -1: sub r2, r2, #1 @ NumSets-- - mov r3, r4 @ Temp = NumWays -2: subs r3, r3, #1 @ Temp-- - mov r5, r3, lsl r1 - mov r6, r2, lsl r0 - orr r5, r5, r6 @ Reg = (Temp<<WayShift)|(NumSets<<SetShift) - mcr p15, 0, r5, c7, c6, 2 - bgt 2b - cmp r2, #0 - bgt 1b - dsb - isb - mov pc, lr -ENDPROC(v7_invalidate_l1) - #ifdef CONFIG_SMP ENTRY(v7_secondary_startup) bl v7_invalidate_l1 diff --git a/arch/arm/mach-shmobile/headsmp.S b/arch/arm/mach-shmobile/headsmp.S index b202c12..96001fd 100644 --- a/arch/arm/mach-shmobile/headsmp.S +++ b/arch/arm/mach-shmobile/headsmp.S @@ -16,54 +16,6 @@ __CPUINIT -/* Cache invalidation nicked from arch/arm/mach-imx/head-v7.S, thanks! - * - * The secondary kernel init calls v7_flush_dcache_all before it enables - * the L1; however, the L1 comes out of reset in an undefined state, so - * the clean + invalidate performed by v7_flush_dcache_all causes a bunch - * of cache lines with uninitialized data and uninitialized tags to get - * written out to memory, which does really unpleasant things to the main - * processor. We fix this by performing an invalidate, rather than a - * clean + invalidate, before jumping into the kernel. - * - * This funciton is cloned from arch/arm/mach-tegra/headsmp.S, and needs - * to be called for both secondary cores startup and primary core resume - * procedures. Ideally, it should be moved into arch/arm/mm/cache-v7.S. - */ -ENTRY(v7_invalidate_l1) - mov r0, #0 - mcr p15, 0, r0, c7, c5, 0 @ invalidate I cache - mcr p15, 2, r0, c0, c0, 0 - mrc p15, 1, r0, c0, c0, 0 - - ldr r1, =0x7fff - and r2, r1, r0, lsr #13 - - ldr r1, =0x3ff - - and r3, r1, r0, lsr #3 @ NumWays - 1 - add r2, r2, #1 @ NumSets - - and r0, r0, #0x7 - add r0, r0, #4 @ SetShift - - clz r1, r3 @ WayShift - add r4, r3, #1 @ NumWays -1: sub r2, r2, #1 @ NumSets-- - mov r3, r4 @ Temp = NumWays -2: subs r3, r3, #1 @ Temp-- - mov r5, r3, lsl r1 - mov r6, r2, lsl r0 - orr r5, r5, r6 @ Reg = (Temp<<WayShift)|(NumSets<<SetShift) - mcr p15, 0, r5, c7, c6, 2 - bgt 2b - cmp r2, #0 - bgt 1b - dsb - isb - mov pc, lr -ENDPROC(v7_invalidate_l1) - ENTRY(shmobile_invalidate_start) bl v7_invalidate_l1 b secondary_startup diff --git a/arch/arm/mach-tegra/headsmp.S b/arch/arm/mach-tegra/headsmp.S index 4a317fa..fb082c4 100644 --- a/arch/arm/mach-tegra/headsmp.S +++ b/arch/arm/mach-tegra/headsmp.S @@ -18,49 +18,6 @@ .section ".text.head", "ax" __CPUINIT -/* - * Tegra specific entry point for secondary CPUs. - * The secondary kernel init calls v7_flush_dcache_all before it enables - * the L1; however, the L1 comes out of reset in an undefined state, so - * the clean + invalidate performed by v7_flush_dcache_all causes a bunch - * of cache lines with uninitialized data and uninitialized tags to get - * written out to memory, which does really unpleasant things to the main - * processor. We fix this by performing an invalidate, rather than a - * clean + invalidate, before jumping into the kernel. - */ -ENTRY(v7_invalidate_l1) - mov r0, #0 - mcr p15, 2, r0, c0, c0, 0 - mrc p15, 1, r0, c0, c0, 0 - - ldr r1, =0x7fff - and r2, r1, r0, lsr #13 - - ldr r1, =0x3ff - - and r3, r1, r0, lsr #3 @ NumWays - 1 - add r2, r2, #1 @ NumSets - - and r0, r0, #0x7 - add r0, r0, #4 @ SetShift - - clz r1, r3 @ WayShift - add r4, r3, #1 @ NumWays -1: sub r2, r2, #1 @ NumSets-- - mov r3, r4 @ Temp = NumWays -2: subs r3, r3, #1 @ Temp-- - mov r5, r3, lsl r1 - mov r6, r2, lsl r0 - orr r5, r5, r6 @ Reg = (Temp<<WayShift)|(NumSets<<SetShift) - mcr p15, 0, r5, c7, c6, 2 - bgt 2b - cmp r2, #0 - bgt 1b - dsb - isb - mov pc, lr -ENDPROC(v7_invalidate_l1) - ENTRY(tegra_secondary_startup) bl v7_invalidate_l1 diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S index 7539ec2..15451ee 100644 --- a/arch/arm/mm/cache-v7.S +++ b/arch/arm/mm/cache-v7.S @@ -19,6 +19,52 @@ #include "proc-macros.S" /* + * The secondary kernel init calls v7_flush_dcache_all before it enables + * the L1; however, the L1 comes out of reset in an undefined state, so + * the clean + invalidate performed by v7_flush_dcache_all causes a bunch + * of cache lines with uninitialized data and uninitialized tags to get + * written out to memory, which does really unpleasant things to the main + * processor. We fix this by performing an invalidate, rather than a + * clean + invalidate, before jumping into the kernel. + * + * This function is cloned from arch/arm/mach-tegra/headsmp.S, and needs + * to be called for both secondary cores startup and primary core resume + * procedures. + */ +ENTRY(v7_invalidate_l1) + mov r0, #0 + mcr p15, 2, r0, c0, c0, 0 + mrc p15, 1, r0, c0, c0, 0 + + ldr r1, =0x7fff + and r2, r1, r0, lsr #13 + + ldr r1, =0x3ff + + and r3, r1, r0, lsr #3 @ NumWays - 1 + add r2, r2, #1 @ NumSets + + and r0, r0, #0x7 + add r0, r0, #4 @ SetShift + + clz r1, r3 @ WayShift + add r4, r3, #1 @ NumWays +1: sub r2, r2, #1 @ NumSets-- + mov r3, r4 @ Temp = NumWays +2: subs r3, r3, #1 @ Temp-- + mov r5, r3, lsl r1 + mov r6, r2, lsl r0 + orr r5, r5, r6 @ Reg = (Temp<<WayShift)|(NumSets<<SetShift) + mcr p15, 0, r5, c7, c6, 2 + bgt 2b + cmp r2, #0 + bgt 1b + dsb + isb + mov pc, lr +ENDPROC(v7_invalidate_l1) + +/* * v7_flush_icache_all() * * Flush the whole I-cache. -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-01-31 17:05 ` [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S dinguyen at altera.com @ 2013-01-31 18:11 ` Stephen Warren 2013-02-01 3:47 ` Olof Johansson 2013-02-01 11:29 ` Santosh Shilimkar 2 siblings, 0 replies; 36+ messages in thread From: Stephen Warren @ 2013-01-31 18:11 UTC (permalink / raw) To: linux-arm-kernel On 01/31/2013 10:05 AM, dinguyen at altera.com wrote: > From: Dinh Nguyen <dinguyen@altera.com> > > mach-socfpga is another platform that needs to use > v7_invalidate_l1 to bringup additional cores. There was a comment that > the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S Tested-by: Stephen Warren <swarren@nvidia.com> Acked-by: Stephen Warren <swarren@nvidia.com> (on Tegra30 Cardhu, with this patch applied on top of Tegra's for-next branch, played audio and did repeated CPU hotplugs) There will be a minor merge conflict when this patch is merged with the Tegra tree changes, but it's trivial to resolve; some other code was removed from this file right next to this function. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-01-31 17:05 ` [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S dinguyen at altera.com 2013-01-31 18:11 ` Stephen Warren @ 2013-02-01 3:47 ` Olof Johansson 2013-02-01 11:29 ` Santosh Shilimkar 2 siblings, 0 replies; 36+ messages in thread From: Olof Johansson @ 2013-02-01 3:47 UTC (permalink / raw) To: linux-arm-kernel Hi, On Thu, Jan 31, 2013 at 11:05:42AM -0600, dinguyen at altera.com wrote: > From: Dinh Nguyen <dinguyen@altera.com> > > mach-socfpga is another platform that needs to use > v7_invalidate_l1 to bringup additional cores. There was a comment that > the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > Acked-by: Simon Horman <horms+renesas@verge.net.au> > Tested-by: Pavel Machek <pavel@denx.de> > Reviewed-by: Pavel Machek <pavel@denx.de> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Russell King <linux@arm.linux.org.uk> > Cc: Olof Johansson <olof@lixom.net> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Rob Herring <rob.herring@calxeda.com> > Cc: Sascha Hauer <kernel@pengutronix.de> > Cc: Magnus Damm <magnus.damm@gmail.com> > Cc: Stephen Warren <swarren@wwwdotorg.org> This should be merged through Russell's tree. Please feed it to his patch tracker. -Olof ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-01-31 17:05 ` [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S dinguyen at altera.com 2013-01-31 18:11 ` Stephen Warren 2013-02-01 3:47 ` Olof Johansson @ 2013-02-01 11:29 ` Santosh Shilimkar 2013-02-01 11:32 ` Russell King - ARM Linux 2013-02-01 12:11 ` Lorenzo Pieralisi 2 siblings, 2 replies; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 11:29 UTC (permalink / raw) To: linux-arm-kernel + Lorenzo, On Thursday 31 January 2013 10:35 PM, dinguyen at altera.com wrote: > From: Dinh Nguyen <dinguyen@altera.com> > > mach-socfpga is another platform that needs to use > v7_invalidate_l1 to bringup additional cores. There was a comment that > the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > Acked-by: Simon Horman <horms+renesas@verge.net.au> > Tested-by: Pavel Machek <pavel@denx.de> > Reviewed-by: Pavel Machek <pavel@denx.de> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Russell King <linux@arm.linux.org.uk> > Cc: Olof Johansson <olof@lixom.net> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Rob Herring <rob.herring@calxeda.com> > Cc: Sascha Hauer <kernel@pengutronix.de> > Cc: Magnus Damm <magnus.damm@gmail.com> > Cc: Stephen Warren <swarren@wwwdotorg.org> > --- > arch/arm/mach-imx/headsmp.S | 47 ------------------------------------- > arch/arm/mach-shmobile/headsmp.S | 48 -------------------------------------- > arch/arm/mach-tegra/headsmp.S | 43 ---------------------------------- > arch/arm/mm/cache-v7.S | 46 ++++++++++++++++++++++++++++++++++++ > 4 files changed, 46 insertions(+), 138 deletions(-) > [..] > diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S > index 7539ec2..15451ee 100644 > --- a/arch/arm/mm/cache-v7.S > +++ b/arch/arm/mm/cache-v7.S > @@ -19,6 +19,52 @@ > #include "proc-macros.S" > > /* > + * The secondary kernel init calls v7_flush_dcache_all before it enables > + * the L1; however, the L1 comes out of reset in an undefined state, so > + * the clean + invalidate performed by v7_flush_dcache_all causes a bunch > + * of cache lines with uninitialized data and uninitialized tags to get > + * written out to memory, which does really unpleasant things to the main > + * processor. We fix this by performing an invalidate, rather than a > + * clean + invalidate, before jumping into the kernel. > + * > + * This function is cloned from arch/arm/mach-tegra/headsmp.S, and needs > + * to be called for both secondary cores startup and primary core resume > + * procedures. > + */ > +ENTRY(v7_invalidate_l1) Now since we are moving the code under common place, probably we should update this a function a bit so that it invalidates the CPU cache till line of unification. Just to be consistent with other flush API. Lorenzo, Does that make sense ? Regards, Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 11:29 ` Santosh Shilimkar @ 2013-02-01 11:32 ` Russell King - ARM Linux 2013-02-01 11:44 ` Santosh Shilimkar 2013-02-01 12:11 ` Lorenzo Pieralisi 1 sibling, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 11:32 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 04:59:44PM +0530, Santosh Shilimkar wrote: > Now since we are moving the code under common place, probably we should > update this a function a bit so that it invalidates the CPU cache till > line of unification. Just to be consistent with other flush API. Hmm. Do you really want a CPU being brought up to do that to the PoU, every time that it is brought up? I thought you wanted to get rid of that kind of stuff from the hotplug paths so that a CPU being brought up/taken down doesn't affect the caches for the other CPUs within the inner sharable domain. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 11:32 ` Russell King - ARM Linux @ 2013-02-01 11:44 ` Santosh Shilimkar 2013-02-01 12:48 ` Russell King - ARM Linux 0 siblings, 1 reply; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 11:44 UTC (permalink / raw) To: linux-arm-kernel On Friday 01 February 2013 05:02 PM, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 04:59:44PM +0530, Santosh Shilimkar wrote: >> Now since we are moving the code under common place, probably we should >> update this a function a bit so that it invalidates the CPU cache till >> line of unification. Just to be consistent with other flush API. > > Hmm. Do you really want a CPU being brought up to do that to the PoU, > every time that it is brought up? I thought you wanted to get rid of > that kind of stuff from the hotplug paths so that a CPU being brought > up/taken down doesn't affect the caches for the other CPUs within the > inner sharable domain. > You are right. We already git rid of the flush of all cache levels in hotplug and wakeup paths and now it is restricted till the PoU. Assuming for the current v7 machines, PoU is L2, invalidating the cache *till* PoU means only CPU local cache. So the API will in a way invalidate only local cache. May be I am missing your point here. Regards, Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 11:44 ` Santosh Shilimkar @ 2013-02-01 12:48 ` Russell King - ARM Linux 2013-02-01 13:04 ` Santosh Shilimkar 0 siblings, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 12:48 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 05:14:11PM +0530, Santosh Shilimkar wrote: > On Friday 01 February 2013 05:02 PM, Russell King - ARM Linux wrote: >> On Fri, Feb 01, 2013 at 04:59:44PM +0530, Santosh Shilimkar wrote: >>> Now since we are moving the code under common place, probably we should >>> update this a function a bit so that it invalidates the CPU cache till >>> line of unification. Just to be consistent with other flush API. >> >> Hmm. Do you really want a CPU being brought up to do that to the PoU, >> every time that it is brought up? I thought you wanted to get rid of >> that kind of stuff from the hotplug paths so that a CPU being brought >> up/taken down doesn't affect the caches for the other CPUs within the >> inner sharable domain. >> > You are right. We already git rid of the flush of all cache levels > in hotplug and wakeup paths and now it is restricted till the PoU. > > Assuming for the current v7 machines, PoU is L2, invalidating the cache > *till* PoU means only CPU local cache. So the API will in a way > invalidate only local cache. Err, you want to _invalidate_ the caches down to the point of I/D/TLB unification? Are you really sure you want to do that on a system here other CPUs are running? Even going down to the LoUIS, that point is the point at which the _other_ CPUs may be sharing caches. And invalidating those caches while the other CPUs are running on secondary CPU startup will be VERY VERY VERY bad. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 12:48 ` Russell King - ARM Linux @ 2013-02-01 13:04 ` Santosh Shilimkar 2013-02-01 13:20 ` Russell King - ARM Linux 0 siblings, 1 reply; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 13:04 UTC (permalink / raw) To: linux-arm-kernel On Friday 01 February 2013 06:18 PM, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 05:14:11PM +0530, Santosh Shilimkar wrote: >> On Friday 01 February 2013 05:02 PM, Russell King - ARM Linux wrote: >>> On Fri, Feb 01, 2013 at 04:59:44PM +0530, Santosh Shilimkar wrote: >>>> Now since we are moving the code under common place, probably we should >>>> update this a function a bit so that it invalidates the CPU cache till >>>> line of unification. Just to be consistent with other flush API. >>> >>> Hmm. Do you really want a CPU being brought up to do that to the PoU, >>> every time that it is brought up? I thought you wanted to get rid of >>> that kind of stuff from the hotplug paths so that a CPU being brought >>> up/taken down doesn't affect the caches for the other CPUs within the >>> inner sharable domain. >>> >> You are right. We already git rid of the flush of all cache levels >> in hotplug and wakeup paths and now it is restricted till the PoU. >> >> Assuming for the current v7 machines, PoU is L2, invalidating the cache >> *till* PoU means only CPU local cache. So the API will in a way >> invalidate only local cache. > > Err, you want to _invalidate_ the caches down to the point of I/D/TLB > unification? Are you really sure you want to do that on a system > here other CPUs are running? > > Even going down to the LoUIS, that point is the point at which the > _other_ CPUs may be sharing caches. > > And invalidating those caches while the other CPUs are running on > secondary CPU startup will be VERY VERY VERY bad. > Absolutly and my intention was never to invalidate all the cache levels. When I said lous, I mean till that point and not including that and next cache levels. May be my terminology isn't accurate. Regards, Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 13:04 ` Santosh Shilimkar @ 2013-02-01 13:20 ` Russell King - ARM Linux 2013-02-01 14:09 ` Santosh Shilimkar 0 siblings, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 13:20 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 06:34:06PM +0530, Santosh Shilimkar wrote: > On Friday 01 February 2013 06:18 PM, Russell King - ARM Linux wrote: >> On Fri, Feb 01, 2013 at 05:14:11PM +0530, Santosh Shilimkar wrote: >>> On Friday 01 February 2013 05:02 PM, Russell King - ARM Linux wrote: >>>> On Fri, Feb 01, 2013 at 04:59:44PM +0530, Santosh Shilimkar wrote: >>>>> Now since we are moving the code under common place, probably we should >>>>> update this a function a bit so that it invalidates the CPU cache till >>>>> line of unification. Just to be consistent with other flush API. >>>> >>>> Hmm. Do you really want a CPU being brought up to do that to the PoU, >>>> every time that it is brought up? I thought you wanted to get rid of >>>> that kind of stuff from the hotplug paths so that a CPU being brought >>>> up/taken down doesn't affect the caches for the other CPUs within the >>>> inner sharable domain. >>>> >>> You are right. We already git rid of the flush of all cache levels >>> in hotplug and wakeup paths and now it is restricted till the PoU. >>> >>> Assuming for the current v7 machines, PoU is L2, invalidating the cache >>> *till* PoU means only CPU local cache. So the API will in a way >>> invalidate only local cache. >> >> Err, you want to _invalidate_ the caches down to the point of I/D/TLB >> unification? Are you really sure you want to do that on a system >> here other CPUs are running? >> >> Even going down to the LoUIS, that point is the point at which the >> _other_ CPUs may be sharing caches. >> >> And invalidating those caches while the other CPUs are running on >> secondary CPU startup will be VERY VERY VERY bad. >> > Absolutly and my intention was never to invalidate all the cache > levels. When I said lous, I mean till that point and not including > that and next cache levels. May be my terminology isn't accurate. Confused. Can't find the term "lous" in your previous mails. I'll assume you mean LoUIS. I'm saying that's totally wrong because going down to that point _includes_ the other CPUs in the system. What we should be doing on secondary CPU bringup for CPUs where the caches are in an unknown state is invalidating those caches which are local to _that_ _CPU_ _only_. That is not "all cache levels down to LoUIS". Here's an example. CPU0-3 are in the inner sharable domain. +----------+ +----------+ +----------+ +----------+ | CPU0 | | CPU1 | | CPU2 | | CPU3 | +----------+ +----------+ +----------+ +----------+ | | | | | | | | +--v-+--v-+ +--v-+--v-+ +--v-+--v-+ +--v-+--v-+ |CL0I|CL0D| |CL0I|CL0D| |CL0I|CL0D| |CL0I|CL0D| <-- cache level 0 +----+----+ +----+----+ +----+----+ +----+----+ | | | | | | | | | | +---+ +--+ | | | +------------------+ | | | | | +-------|--------------+ | | | | +-------+ | | | | | | | +-----------------|-----+ | | | | | | +-----------+ | | | | +---|--------|--------|--------+ | | | +--v--------v--------v--------v--+ +--v--------v--------v--------v--+ | CL1I | | CL1D | level 1 +--------------------------------+ +--------------------------------+ | | +----------------v-----------------------------------v---------------+ | CL2 | level 2 +--------------------------------------------------------------------+ Therefore, because the point of unification for the inner sharable domain is defined as the point at which the I, D and TLB streams (TLB not shown) are combined, this happens@CL2, and does *not* include CL2. CL2 is where the two paths see the same data. So, the LoUIS includes caches in level 0 and level 1. However, CL1 _is_ shared between CPU0-3 - it is part of the inner sharable domain. If you invalidate CL1, then you're destroying data held there by the other CPUs. So, invalidating down to the LoUIS on secondary CPU bringup is _wrong_. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 13:20 ` Russell King - ARM Linux @ 2013-02-01 14:09 ` Santosh Shilimkar 0 siblings, 0 replies; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 14:09 UTC (permalink / raw) To: linux-arm-kernel On Friday 01 February 2013 06:50 PM, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 06:34:06PM +0530, Santosh Shilimkar wrote: >> On Friday 01 February 2013 06:18 PM, Russell King - ARM Linux wrote: >>> On Fri, Feb 01, 2013 at 05:14:11PM +0530, Santosh Shilimkar wrote: >>>> On Friday 01 February 2013 05:02 PM, Russell King - ARM Linux wrote: >>>>> On Fri, Feb 01, 2013 at 04:59:44PM +0530, Santosh Shilimkar wrote: >>>>>> Now since we are moving the code under common place, probably we should >>>>>> update this a function a bit so that it invalidates the CPU cache till >>>>>> line of unification. Just to be consistent with other flush API. >>>>> >>>>> Hmm. Do you really want a CPU being brought up to do that to the PoU, >>>>> every time that it is brought up? I thought you wanted to get rid of >>>>> that kind of stuff from the hotplug paths so that a CPU being brought >>>>> up/taken down doesn't affect the caches for the other CPUs within the >>>>> inner sharable domain. >>>>> >>>> You are right. We already git rid of the flush of all cache levels >>>> in hotplug and wakeup paths and now it is restricted till the PoU. >>>> >>>> Assuming for the current v7 machines, PoU is L2, invalidating the cache >>>> *till* PoU means only CPU local cache. So the API will in a way >>>> invalidate only local cache. >>> >>> Err, you want to _invalidate_ the caches down to the point of I/D/TLB >>> unification? Are you really sure you want to do that on a system >>> here other CPUs are running? >>> >>> Even going down to the LoUIS, that point is the point at which the >>> _other_ CPUs may be sharing caches. >>> >>> And invalidating those caches while the other CPUs are running on >>> secondary CPU startup will be VERY VERY VERY bad. >>> >> Absolutly and my intention was never to invalidate all the cache >> levels. When I said lous, I mean till that point and not including >> that and next cache levels. May be my terminology isn't accurate. > > Confused. Can't find the term "lous" in your previous mails. I'll > assume you mean LoUIS. I'm saying that's totally wrong because going > down to that point _includes_ the other CPUs in the system. > Sorry for the typo. I mean LoUIS. > What we should be doing on secondary CPU bringup for CPUs where the > caches are in an unknown state is invalidating those caches which are > local to _that_ _CPU_ _only_. That is not "all cache levels down to > LoUIS". > Restricting it to local CPU cache is also my point. My example was bit narrow with current A9 and A15 designs where there is only L1 and L2 cache and hence not considered the below case. > Here's an example. CPU0-3 are in the inner sharable domain. > > +----------+ +----------+ +----------+ +----------+ > | CPU0 | | CPU1 | | CPU2 | | CPU3 | > +----------+ +----------+ +----------+ +----------+ > | | | | | | | | > +--v-+--v-+ +--v-+--v-+ +--v-+--v-+ +--v-+--v-+ > |CL0I|CL0D| |CL0I|CL0D| |CL0I|CL0D| |CL0I|CL0D| <-- cache level 0 > +----+----+ +----+----+ +----+----+ +----+----+ > | | | | | | | | > | | +---+ +--+ | | | +------------------+ > | | | | | +-------|--------------+ | > | | | +-------+ | | | > | | | | +-----------------|-----+ | | > | | | | +-----------+ | | | > | +---|--------|--------|--------+ | | | > +--v--------v--------v--------v--+ +--v--------v--------v--------v--+ > | CL1I | | CL1D | level 1 > +--------------------------------+ +--------------------------------+ > | | > +----------------v-----------------------------------v---------------+ > | CL2 | level 2 > +--------------------------------------------------------------------+ > > Therefore, because the point of unification for the inner sharable domain > is defined as the point at which the I, D and TLB streams (TLB not shown) > are combined, this happens at CL2, and does *not* include CL2. CL2 is > where the two paths see the same data. > > So, the LoUIS includes caches in level 0 and level 1. > Thats right for above example. I haven't seen such design so far and hence my view was narrow. > However, CL1 _is_ shared between CPU0-3 - it is part of the inner sharable > domain. If you invalidate CL1, then you're destroying data held there by > the other CPUs. So, invalidating down to the LoUIS on secondary CPU > bringup is _wrong_. > As I said above, I didn't considered the case you mentioned here. Thanks for bring up this example. I am aligned with you on _NO_ on invaliding cache level from a CPU which is shared across multiple CPUs. Regards, Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 11:29 ` Santosh Shilimkar 2013-02-01 11:32 ` Russell King - ARM Linux @ 2013-02-01 12:11 ` Lorenzo Pieralisi 2013-02-01 12:24 ` Santosh Shilimkar 2013-02-01 12:54 ` Russell King - ARM Linux 1 sibling, 2 replies; 36+ messages in thread From: Lorenzo Pieralisi @ 2013-02-01 12:11 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 11:29:44AM +0000, Santosh Shilimkar wrote: > + Lorenzo, > > On Thursday 31 January 2013 10:35 PM, dinguyen at altera.com wrote: > > From: Dinh Nguyen <dinguyen@altera.com> > > > > mach-socfpga is another platform that needs to use > > v7_invalidate_l1 to bringup additional cores. There was a comment that > > the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S > > > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > > Acked-by: Simon Horman <horms+renesas@verge.net.au> > > Tested-by: Pavel Machek <pavel@denx.de> > > Reviewed-by: Pavel Machek <pavel@denx.de> > > Cc: Arnd Bergmann <arnd@arndb.de> > > Cc: Russell King <linux@arm.linux.org.uk> > > Cc: Olof Johansson <olof@lixom.net> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > Cc: Rob Herring <rob.herring@calxeda.com> > > Cc: Sascha Hauer <kernel@pengutronix.de> > > Cc: Magnus Damm <magnus.damm@gmail.com> > > Cc: Stephen Warren <swarren@wwwdotorg.org> > > --- > > arch/arm/mach-imx/headsmp.S | 47 ------------------------------------- > > arch/arm/mach-shmobile/headsmp.S | 48 -------------------------------------- > > arch/arm/mach-tegra/headsmp.S | 43 ---------------------------------- > > arch/arm/mm/cache-v7.S | 46 ++++++++++++++++++++++++++++++++++++ > > 4 files changed, 46 insertions(+), 138 deletions(-) > > > [..] > > > diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S > > index 7539ec2..15451ee 100644 > > --- a/arch/arm/mm/cache-v7.S > > +++ b/arch/arm/mm/cache-v7.S > > @@ -19,6 +19,52 @@ > > #include "proc-macros.S" > > > > /* > > + * The secondary kernel init calls v7_flush_dcache_all before it enables > > + * the L1; however, the L1 comes out of reset in an undefined state, so > > + * the clean + invalidate performed by v7_flush_dcache_all causes a bunch > > + * of cache lines with uninitialized data and uninitialized tags to get > > + * written out to memory, which does really unpleasant things to the main > > + * processor. We fix this by performing an invalidate, rather than a > > + * clean + invalidate, before jumping into the kernel. > > + * > > + * This function is cloned from arch/arm/mach-tegra/headsmp.S, and needs > > + * to be called for both secondary cores startup and primary core resume > > + * procedures. > > + */ > > +ENTRY(v7_invalidate_l1) > Now since we are moving the code under common place, probably we should > update this a function a bit so that it invalidates the CPU cache till > line of unification. Just to be consistent with other flush API. > > Lorenzo, > Does that make sense ? Well, on latest processors (A15, A7) caches are invalidated on reset unless the chip is programmed to skip that on reset (ie L2 retained). But it makes sense, for sure it should not be called v7_invalidate_l1, but invalidate_louis, and instead of forcing level 0 we should be reading LoUIS and invalidate up to that level as we do in the clean and invalidate function. Is it worth adding a v7 cache function where the only difference wrt the clean and invalidate operation is a coprocessor opcode ? Probably not IMHO, why add the set/way loop again ? It is never called from C code, so I do not think there is a point in adding a C API either. Lorenzo ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 12:11 ` Lorenzo Pieralisi @ 2013-02-01 12:24 ` Santosh Shilimkar 2013-02-01 12:54 ` Russell King - ARM Linux 1 sibling, 0 replies; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 12:24 UTC (permalink / raw) To: linux-arm-kernel On Friday 01 February 2013 05:41 PM, Lorenzo Pieralisi wrote: > On Fri, Feb 01, 2013 at 11:29:44AM +0000, Santosh Shilimkar wrote: >> + Lorenzo, >> >> On Thursday 31 January 2013 10:35 PM, dinguyen at altera.com wrote: >>> From: Dinh Nguyen <dinguyen@altera.com> >>> >>> mach-socfpga is another platform that needs to use >>> v7_invalidate_l1 to bringup additional cores. There was a comment that >>> the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S >>> >>> Signed-off-by: Dinh Nguyen <dinguyen@altera.com> >>> Acked-by: Simon Horman <horms+renesas@verge.net.au> >>> Tested-by: Pavel Machek <pavel@denx.de> >>> Reviewed-by: Pavel Machek <pavel@denx.de> >>> Cc: Arnd Bergmann <arnd@arndb.de> >>> Cc: Russell King <linux@arm.linux.org.uk> >>> Cc: Olof Johansson <olof@lixom.net> >>> Cc: Thomas Gleixner <tglx@linutronix.de> >>> Cc: Rob Herring <rob.herring@calxeda.com> >>> Cc: Sascha Hauer <kernel@pengutronix.de> >>> Cc: Magnus Damm <magnus.damm@gmail.com> >>> Cc: Stephen Warren <swarren@wwwdotorg.org> >>> --- >>> arch/arm/mach-imx/headsmp.S | 47 ------------------------------------- >>> arch/arm/mach-shmobile/headsmp.S | 48 -------------------------------------- >>> arch/arm/mach-tegra/headsmp.S | 43 ---------------------------------- >>> arch/arm/mm/cache-v7.S | 46 ++++++++++++++++++++++++++++++++++++ >>> 4 files changed, 46 insertions(+), 138 deletions(-) >>> >> [..] >> >>> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S >>> index 7539ec2..15451ee 100644 >>> --- a/arch/arm/mm/cache-v7.S >>> +++ b/arch/arm/mm/cache-v7.S >>> @@ -19,6 +19,52 @@ >>> #include "proc-macros.S" >>> >>> /* >>> + * The secondary kernel init calls v7_flush_dcache_all before it enables >>> + * the L1; however, the L1 comes out of reset in an undefined state, so >>> + * the clean + invalidate performed by v7_flush_dcache_all causes a bunch >>> + * of cache lines with uninitialized data and uninitialized tags to get >>> + * written out to memory, which does really unpleasant things to the main >>> + * processor. We fix this by performing an invalidate, rather than a >>> + * clean + invalidate, before jumping into the kernel. >>> + * >>> + * This function is cloned from arch/arm/mach-tegra/headsmp.S, and needs >>> + * to be called for both secondary cores startup and primary core resume >>> + * procedures. >>> + */ >>> +ENTRY(v7_invalidate_l1) >> Now since we are moving the code under common place, probably we should >> update this a function a bit so that it invalidates the CPU cache till >> line of unification. Just to be consistent with other flush API. >> >> Lorenzo, >> Does that make sense ? > > Well, on latest processors (A15, A7) caches are invalidated on reset unless > the chip is programmed to skip that on reset (ie L2 retained). > > But it makes sense, for sure it should not be called v7_invalidate_l1, > but invalidate_louis, and instead of forcing level 0 we should be > reading LoUIS and invalidate up to that level as we do in the clean and > invalidate function. > That is exactly what I was thinking. > Is it worth adding a v7 cache function where the only difference wrt the clean > and invalidate operation is a coprocessor opcode ? Probably not IMHO, why add > the set/way loop again ? > Probably same function can be re-used with the parameter passing to differentiate the inv and flush. > It is never called from C code, so I do not think there is a point in > adding a C API either. > I agree. C API isn't needed. Regards, Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 12:11 ` Lorenzo Pieralisi 2013-02-01 12:24 ` Santosh Shilimkar @ 2013-02-01 12:54 ` Russell King - ARM Linux 2013-02-01 14:10 ` Lorenzo Pieralisi 1 sibling, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 12:54 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 12:11:38PM +0000, Lorenzo Pieralisi wrote: > Well, on latest processors (A15, A7) caches are invalidated on reset unless > the chip is programmed to skip that on reset (ie L2 retained). > > But it makes sense, for sure it should not be called v7_invalidate_l1, > but invalidate_louis, and instead of forcing level 0 we should be > reading LoUIS and invalidate up to that level as we do in the clean and > invalidate function. No. Think about it. c7, c6, 2 _invalidates_ the cache. That means any data contained within the cache is discarded. Data is not written back. If you do this down to the LoUIS, that includes all cache levels in the inner sharable domain. The inner sharable domain includes the other CPUs in the system which may already be running (certainly the boot CPU will be running). Are you _really_ sure you want to be invalidating _valid_ data held in caches in the inner sharable domain by other CPUs, rather than just the cache associated with the CPU which is being brought online? ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 12:54 ` Russell King - ARM Linux @ 2013-02-01 14:10 ` Lorenzo Pieralisi 2013-02-01 14:19 ` Russell King - ARM Linux 0 siblings, 1 reply; 36+ messages in thread From: Lorenzo Pieralisi @ 2013-02-01 14:10 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 12:54:17PM +0000, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 12:11:38PM +0000, Lorenzo Pieralisi wrote: > > Well, on latest processors (A15, A7) caches are invalidated on reset unless > > the chip is programmed to skip that on reset (ie L2 retained). > > > > But it makes sense, for sure it should not be called v7_invalidate_l1, > > but invalidate_louis, and instead of forcing level 0 we should be > > reading LoUIS and invalidate up to that level as we do in the clean and > > invalidate function. > > No. Think about it. c7, c6, 2 _invalidates_ the cache. That means any > data contained within the cache is discarded. Data is not written back. > > If you do this down to the LoUIS, that includes all cache levels in the > inner sharable domain. The inner sharable domain includes the other CPUs > in the system which may already be running (certainly the boot CPU will > be running). On all v7 ARM systems I know of LoUIS correspond to L1 and using LoUIS is a plaster that works for current systems (where by LoUIS I mean all cache levels down to but not inclusive of, LoUIS). What you are saying is true for cpu_suspend code and hotplug as well, there we are cleaning and invalidating down to the LoUIS, which might not be needed since LoUIS does not mean that all cache levels included are _local_ to the CPU. There is no way on ARM systems to detect what cache levels are local to a CPU, so we decided to use LoUIS for that, as a temporary solution to avoid cluttering the code with DT bindings that link cache topology and CPU topology. And on all ARM systems in the mainline "operating on all levels down to but not inclusive of LoUIS" is equivalent to cache levels local to a CPU, if it is not I am missing something and I apologize. > Are you _really_ sure you want to be invalidating _valid_ data held in > caches in the inner sharable domain by other CPUs, rather than just the > cache associated with the CPU which is being brought online? We know that LoUIS corresponds to cache levels local to a CPU for now and we know that it will fail as soon as there are caches in the inner shareability domain that belong to multiple CPUs. Again, that's valid for the clean and invalidate functions as well (though there it is an optimization problem, here I agree is a validity issue and more serious). There is no way to detect cache levels local to a CPU, I tried to do that before posting the new LoUIS API, and all I can say is that we need device tree bindings to do that cleanly, architecturally it is just not detectable. I am happy to discuss a way forward Russell, your concerns are correct, we are aware of them and happy to improve the code to make it work consistently. Thanks, Lorenzo ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 14:10 ` Lorenzo Pieralisi @ 2013-02-01 14:19 ` Russell King - ARM Linux 2013-02-01 14:31 ` Russell King - ARM Linux 2013-02-01 14:34 ` Lorenzo Pieralisi 0 siblings, 2 replies; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 14:19 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 02:10:52PM +0000, Lorenzo Pieralisi wrote: > On Fri, Feb 01, 2013 at 12:54:17PM +0000, Russell King - ARM Linux wrote: > > On Fri, Feb 01, 2013 at 12:11:38PM +0000, Lorenzo Pieralisi wrote: > > > Well, on latest processors (A15, A7) caches are invalidated on reset unless > > > the chip is programmed to skip that on reset (ie L2 retained). > > > > > > But it makes sense, for sure it should not be called v7_invalidate_l1, > > > but invalidate_louis, and instead of forcing level 0 we should be > > > reading LoUIS and invalidate up to that level as we do in the clean and > > > invalidate function. > > > > No. Think about it. c7, c6, 2 _invalidates_ the cache. That means any > > data contained within the cache is discarded. Data is not written back. > > > > If you do this down to the LoUIS, that includes all cache levels in the > > inner sharable domain. The inner sharable domain includes the other CPUs > > in the system which may already be running (certainly the boot CPU will > > be running). > > On all v7 ARM systems I know of LoUIS correspond to L1 and using LoUIS is a > plaster that works for current systems (where by LoUIS I mean all cache > levels down to but not inclusive of, LoUIS). All that I'm saying is that suggesting that v7_invalidate_l1 should go down to LoUIS is wrong, in much the same way as _invalidating_ only the first level of cache. However, invalidating the first level of cache only is safer than invalidating down to LoUIS. The only path which needs this is the secondary CPU bringup path; that's the only path we have where some platforms give us a CPU which has only just come out of reset, and so the caches for that CPU may be in an unknown state. And it only happens to a small subset of platforms. Currently, that small subset of platforms only need the first level of cache invalidating. (Most platforms don't need this; most platforms this would be a waste of time - and people seem to care about hotplug times.) So, that's what we should do. And v7_invalidate_l1 is a perfectly reasonable name for a function to do that for the V7 architecture. As has been pointed out, there's several duplications of that. That's fine, let's move them into the v7 cache code. But... let's not change how they work and go through a pointless design exercise for invalidating more levels of cache when we know that no platform needs it. If/when we do end up with a platform which requires it, we can deal with it then. But let's not lead people down the route of thinking that LoUIS is suitable here when it isn't. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 14:19 ` Russell King - ARM Linux @ 2013-02-01 14:31 ` Russell King - ARM Linux 2013-02-01 14:43 ` Santosh Shilimkar 2013-02-01 14:34 ` Lorenzo Pieralisi 1 sibling, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 14:31 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 02:19:14PM +0000, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 02:10:52PM +0000, Lorenzo Pieralisi wrote: > > On Fri, Feb 01, 2013 at 12:54:17PM +0000, Russell King - ARM Linux wrote: > > > On Fri, Feb 01, 2013 at 12:11:38PM +0000, Lorenzo Pieralisi wrote: > > > > Well, on latest processors (A15, A7) caches are invalidated on reset unless > > > > the chip is programmed to skip that on reset (ie L2 retained). > > > > > > > > But it makes sense, for sure it should not be called v7_invalidate_l1, > > > > but invalidate_louis, and instead of forcing level 0 we should be > > > > reading LoUIS and invalidate up to that level as we do in the clean and > > > > invalidate function. > > > > > > No. Think about it. c7, c6, 2 _invalidates_ the cache. That means any > > > data contained within the cache is discarded. Data is not written back. > > > > > > If you do this down to the LoUIS, that includes all cache levels in the > > > inner sharable domain. The inner sharable domain includes the other CPUs > > > in the system which may already be running (certainly the boot CPU will > > > be running). > > > > On all v7 ARM systems I know of LoUIS correspond to L1 and using LoUIS is a > > plaster that works for current systems (where by LoUIS I mean all cache > > levels down to but not inclusive of, LoUIS). > > All that I'm saying is that suggesting that v7_invalidate_l1 should go > down to LoUIS is wrong, in much the same way as _invalidating_ only the > first level of cache. > > However, invalidating the first level of cache only is safer than > invalidating down to LoUIS. > > The only path which needs this is the secondary CPU bringup path; that's > the only path we have where some platforms give us a CPU which has only > just come out of reset, and so the caches for that CPU may be in an > unknown state. And it only happens to a small subset of platforms. > > Currently, that small subset of platforms only need the first level of > cache invalidating. (Most platforms don't need this; most platforms > this would be a waste of time - and people seem to care about hotplug > times.) > > So, that's what we should do. And v7_invalidate_l1 is a perfectly > reasonable name for a function to do that for the V7 architecture. > > As has been pointed out, there's several duplications of that. That's > fine, let's move them into the v7 cache code. But... let's not change > how they work and go through a pointless design exercise for > invalidating more levels of cache when we know that no platform needs > it. > > If/when we do end up with a platform which requires it, we can deal > with it then. But let's not lead people down the route of thinking > that LoUIS is suitable here when it isn't. Just to further provide some insight into the reasoning: Invalidating data out of a working cache risks data corruption; maybe the data being invalidated is filesystem metadata which was about to be cleaned and written back to storage. That risks filesystem corruption. Invalidating fewer levels than are actually required is different: we may leave dirty cache lines behind which may be evicted, but there's also the chance that the CPU will end up _reading_ from its uninitialized caches and may crash before that happens. So, the risks are: 1. invalidate more levels than are necessary and risk discarding data which other CPUs are using, which may be important data. 2. invalidate less levels than are necessary and risk writing out data from the CPU cache, which may or may not happen _before_ the CPU crashes due to reading invalid data. Out of those two, (2) sounds to me to be the safer approach. Plus, I can't think of a reason why you'd want to put on a SMP system more than one layer of CPU local caches... to do so would seem to me to be an exercise in coherency complexity... So, I suspect that in the real world, we will _never_ see any system which has more than one layer of caches local to the CPU. But we may see a system with a cache architecture similar to the one I drew in my email to Santosh. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 14:31 ` Russell King - ARM Linux @ 2013-02-01 14:43 ` Santosh Shilimkar 2013-02-01 14:49 ` Russell King - ARM Linux 0 siblings, 1 reply; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 14:43 UTC (permalink / raw) To: linux-arm-kernel On Friday 01 February 2013 08:01 PM, Russell King - ARM Linux wrote: > Just to further provide some insight into the reasoning: > > Invalidating data out of a working cache risks data corruption; maybe > the data being invalidated is filesystem metadata which was about to > be cleaned and written back to storage. That risks filesystem > corruption. > > Invalidating fewer levels than are actually required is different: we > may leave dirty cache lines behind which may be evicted, but there's > also the chance that the CPU will end up _reading_ from its > uninitialized caches and may crash before that happens. > > So, the risks are: > 1. invalidate more levels than are necessary and risk discarding data > which other CPUs are using, which may be important data. > 2. invalidate less levels than are necessary and risk writing out > data from the CPU cache, which may or may not happen _before_ the > CPU crashes due to reading invalid data. > > Out of those two, (2) sounds to me to be the safer approach. > > Plus, I can't think of a reason why you'd want to put on a SMP system > more than one layer of CPU local caches... to do so would seem to me to > be an exercise in coherency complexity... So, I suspect that in the > real world, we will _never_ see any system which has more than one > layer of caches local to the CPU. But we may see a system with a > cache architecture similar to the one I drew in my email to Santosh. > I still scratching my head on why you would even have a CPU design with two L2 shared caches for a 4 CPU system. If you ever design such a system, you need to ensure that 1. Both L2 are used in exclusive mode 2. Both L2 cache has coherency hardware connected to keep them in sync for shared data. For 1, one would just increase the size of L2 and have only 1 memory. 2 Doesn't bring much advantage unless and until your L3 is too far away for access in terms of CPU access cycles. Regards Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 14:43 ` Santosh Shilimkar @ 2013-02-01 14:49 ` Russell King - ARM Linux 2013-02-01 14:53 ` Santosh Shilimkar 0 siblings, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 14:49 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 08:13:34PM +0530, Santosh Shilimkar wrote: > On Friday 01 February 2013 08:01 PM, Russell King - ARM Linux wrote: >> Just to further provide some insight into the reasoning: >> >> Invalidating data out of a working cache risks data corruption; maybe >> the data being invalidated is filesystem metadata which was about to >> be cleaned and written back to storage. That risks filesystem >> corruption. >> >> Invalidating fewer levels than are actually required is different: we >> may leave dirty cache lines behind which may be evicted, but there's >> also the chance that the CPU will end up _reading_ from its >> uninitialized caches and may crash before that happens. >> >> So, the risks are: >> 1. invalidate more levels than are necessary and risk discarding data >> which other CPUs are using, which may be important data. >> 2. invalidate less levels than are necessary and risk writing out >> data from the CPU cache, which may or may not happen _before_ the >> CPU crashes due to reading invalid data. >> >> Out of those two, (2) sounds to me to be the safer approach. >> >> Plus, I can't think of a reason why you'd want to put on a SMP system >> more than one layer of CPU local caches... to do so would seem to me to >> be an exercise in coherency complexity... So, I suspect that in the >> real world, we will _never_ see any system which has more than one >> layer of caches local to the CPU. But we may see a system with a >> cache architecture similar to the one I drew in my email to Santosh. >> > I still scratching my head on why you would even have a CPU design > with two L2 shared caches for a 4 CPU system. > > If you ever design such a system, you need to ensure that > > 1. Both L2 are used in exclusive mode > 2. Both L2 cache has coherency hardware connected to keep them in sync > for shared data. > > For 1, one would just increase the size of L2 and have only 1 memory. > > 2 Doesn't bring much advantage unless and until your L3 is too far > away for access in terms of CPU access cycles. I don't think you quite understood my diagram. There aren't two separate L2 data caches (CL1I and CL1D). I'm showing the L2 cache as having a harvard structure (separate instruction and data) with no coherency between them - and because they're harvard structured, that means the unification level must be _below_ that point. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 14:49 ` Russell King - ARM Linux @ 2013-02-01 14:53 ` Santosh Shilimkar 0 siblings, 0 replies; 36+ messages in thread From: Santosh Shilimkar @ 2013-02-01 14:53 UTC (permalink / raw) To: linux-arm-kernel On Friday 01 February 2013 08:19 PM, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 08:13:34PM +0530, Santosh Shilimkar wrote: >> On Friday 01 February 2013 08:01 PM, Russell King - ARM Linux wrote: >>> Just to further provide some insight into the reasoning: >>> >>> Invalidating data out of a working cache risks data corruption; maybe >>> the data being invalidated is filesystem metadata which was about to >>> be cleaned and written back to storage. That risks filesystem >>> corruption. >>> >>> Invalidating fewer levels than are actually required is different: we >>> may leave dirty cache lines behind which may be evicted, but there's >>> also the chance that the CPU will end up _reading_ from its >>> uninitialized caches and may crash before that happens. >>> >>> So, the risks are: >>> 1. invalidate more levels than are necessary and risk discarding data >>> which other CPUs are using, which may be important data. >>> 2. invalidate less levels than are necessary and risk writing out >>> data from the CPU cache, which may or may not happen _before_ the >>> CPU crashes due to reading invalid data. >>> >>> Out of those two, (2) sounds to me to be the safer approach. >>> >>> Plus, I can't think of a reason why you'd want to put on a SMP system >>> more than one layer of CPU local caches... to do so would seem to me to >>> be an exercise in coherency complexity... So, I suspect that in the >>> real world, we will _never_ see any system which has more than one >>> layer of caches local to the CPU. But we may see a system with a >>> cache architecture similar to the one I drew in my email to Santosh. >>> >> I still scratching my head on why you would even have a CPU design >> with two L2 shared caches for a 4 CPU system. >> >> If you ever design such a system, you need to ensure that >> >> 1. Both L2 are used in exclusive mode >> 2. Both L2 cache has coherency hardware connected to keep them in sync >> for shared data. >> >> For 1, one would just increase the size of L2 and have only 1 memory. >> >> 2 Doesn't bring much advantage unless and until your L3 is too far >> away for access in terms of CPU access cycles. > > I don't think you quite understood my diagram. There aren't two separate > L2 data caches (CL1I and CL1D). I'm showing the L2 cache as having a > harvard structure (separate instruction and data) with no coherency > between them - and because they're harvard structured, that means the > unification level must be _below_ that point. > Now I get it. Yes I missed the I and D separation in the diagram. Thanks a lot for drawing. Regards, Santosh ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-01 14:19 ` Russell King - ARM Linux 2013-02-01 14:31 ` Russell King - ARM Linux @ 2013-02-01 14:34 ` Lorenzo Pieralisi 1 sibling, 0 replies; 36+ messages in thread From: Lorenzo Pieralisi @ 2013-02-01 14:34 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 02:19:14PM +0000, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 02:10:52PM +0000, Lorenzo Pieralisi wrote: > > On Fri, Feb 01, 2013 at 12:54:17PM +0000, Russell King - ARM Linux wrote: > > > On Fri, Feb 01, 2013 at 12:11:38PM +0000, Lorenzo Pieralisi wrote: > > > > Well, on latest processors (A15, A7) caches are invalidated on reset unless > > > > the chip is programmed to skip that on reset (ie L2 retained). > > > > > > > > But it makes sense, for sure it should not be called v7_invalidate_l1, > > > > but invalidate_louis, and instead of forcing level 0 we should be > > > > reading LoUIS and invalidate up to that level as we do in the clean and > > > > invalidate function. > > > > > > No. Think about it. c7, c6, 2 _invalidates_ the cache. That means any > > > data contained within the cache is discarded. Data is not written back. > > > > > > If you do this down to the LoUIS, that includes all cache levels in the > > > inner sharable domain. The inner sharable domain includes the other CPUs > > > in the system which may already be running (certainly the boot CPU will > > > be running). > > > > On all v7 ARM systems I know of LoUIS correspond to L1 and using LoUIS is a > > plaster that works for current systems (where by LoUIS I mean all cache > > levels down to but not inclusive of, LoUIS). > > All that I'm saying is that suggesting that v7_invalidate_l1 should go > down to LoUIS is wrong, in much the same way as _invalidating_ only the > first level of cache. > > However, invalidating the first level of cache only is safer than > invalidating down to LoUIS. > > The only path which needs this is the secondary CPU bringup path; that's > the only path we have where some platforms give us a CPU which has only > just come out of reset, and so the caches for that CPU may be in an > unknown state. And it only happens to a small subset of platforms. > > Currently, that small subset of platforms only need the first level of > cache invalidating. (Most platforms don't need this; most platforms > this would be a waste of time - and people seem to care about hotplug > times.) > > So, that's what we should do. And v7_invalidate_l1 is a perfectly > reasonable name for a function to do that for the V7 architecture. > > As has been pointed out, there's several duplications of that. That's > fine, let's move them into the v7 cache code. But... let's not change > how they work and go through a pointless design exercise for > invalidating more levels of cache when we know that no platform needs > it. > > If/when we do end up with a platform which requires it, we can deal > with it then. But let's not lead people down the route of thinking > that LoUIS is suitable here when it isn't. Your point is fair Russell, and I agree with that in the context you provided. Last thing I want to mention is that duplicating code that loops through sets/ways is not that great, but I also understand that refactoring the code to share the loop in v7_flush_dcache_all might do more harm than good and introduce bugs, so let's merge the invalidate_l1 function as it stands. Thank you, Lorenzo ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-01-31 17:05 [PATCHv2 for soc 0/4] Enabling socfpga on hardware dinguyen at altera.com ` (2 preceding siblings ...) 2013-01-31 17:05 ` [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S dinguyen at altera.com @ 2013-01-31 17:05 ` dinguyen at altera.com 2013-02-01 3:50 ` Olof Johansson 2013-02-01 10:50 ` Pavel Machek 3 siblings, 2 replies; 36+ messages in thread From: dinguyen at altera.com @ 2013-01-31 17:05 UTC (permalink / raw) To: linux-arm-kernel From: Dinh Nguyen <dinguyen@altera.com> Because the CPU1 start address is different for socfpga-vt and socfpga-cyclone5, we add code to use the correct CPU1 start addr. Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Olof Johansson <olof@lixom.net> Cc: Pavel Machek <pavel@denx.de> --- .../bindings/arm/altera/socfpga-system.txt | 2 ++ arch/arm/boot/dts/socfpga_cyclone5.dts | 4 ++++ arch/arm/boot/dts/socfpga_vt.dts | 4 ++++ arch/arm/mach-socfpga/core.h | 4 +++- arch/arm/mach-socfpga/headsmp.S | 16 ++++++++++++---- arch/arm/mach-socfpga/platsmp.c | 3 ++- arch/arm/mach-socfpga/socfpga.c | 8 ++++++++ 7 files changed, 35 insertions(+), 6 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/altera/socfpga-system.txt b/Documentation/devicetree/bindings/arm/altera/socfpga-system.txt index 07c65e3..f4d04a0 100644 --- a/Documentation/devicetree/bindings/arm/altera/socfpga-system.txt +++ b/Documentation/devicetree/bindings/arm/altera/socfpga-system.txt @@ -3,9 +3,11 @@ Altera SOCFPGA System Manager Required properties: - compatible : "altr,sys-mgr" - reg : Should contain 1 register ranges(address and length) +- cpu1-start-addr : CPU1 start address in hex. Example: sysmgr at ffd08000 { compatible = "altr,sys-mgr"; reg = <0xffd08000 0x1000>; + cpu1-start-addr = <0xffd080c4>; }; diff --git a/arch/arm/boot/dts/socfpga_cyclone5.dts b/arch/arm/boot/dts/socfpga_cyclone5.dts index 7ad3cc6..3ae8a83 100644 --- a/arch/arm/boot/dts/socfpga_cyclone5.dts +++ b/arch/arm/boot/dts/socfpga_cyclone5.dts @@ -56,5 +56,9 @@ serial1 at ffc03000 { clock-frequency = <100000000>; }; + + sysmgr at ffd08000 { + cpu1-start-addr = <0xffd080c4>; + }; }; }; diff --git a/arch/arm/boot/dts/socfpga_vt.dts b/arch/arm/boot/dts/socfpga_vt.dts index a0c6c65..1036eba 100644 --- a/arch/arm/boot/dts/socfpga_vt.dts +++ b/arch/arm/boot/dts/socfpga_vt.dts @@ -56,5 +56,9 @@ serial1 at ffc03000 { clock-frequency = <7372800>; }; + + sysmgr at ffd08000 { + cpu1-start-addr = <0xffd08010>; + }; }; }; diff --git a/arch/arm/mach-socfpga/core.h b/arch/arm/mach-socfpga/core.h index 9941caa..5b76dd4 100644 --- a/arch/arm/mach-socfpga/core.h +++ b/arch/arm/mach-socfpga/core.h @@ -20,7 +20,7 @@ #ifndef __MACH_CORE_H #define __MACH_CORE_H -extern void secondary_startup(void); +extern void v7_secondary_startup(void); extern void __iomem *socfpga_scu_base_addr; extern void socfpga_init_clocks(void); @@ -29,6 +29,8 @@ extern void socfpga_sysmgr_init(void); extern struct smp_operations socfpga_smp_ops; extern char secondary_trampoline, secondary_trampoline_end; +extern unsigned long cpu1start_addr; + #define SOCFPGA_SCU_VIRT_BASE 0xfffec000 #endif diff --git a/arch/arm/mach-socfpga/headsmp.S b/arch/arm/mach-socfpga/headsmp.S index f09b128..3c83582 100644 --- a/arch/arm/mach-socfpga/headsmp.S +++ b/arch/arm/mach-socfpga/headsmp.S @@ -13,13 +13,21 @@ __CPUINIT .arch armv7-a -#define CPU1_START_ADDR 0xffd08010 - ENTRY(secondary_trampoline) - movw r0, #:lower16:CPU1_START_ADDR - movt r0, #:upper16:CPU1_START_ADDR + movw r2, #:lower16:cpu1start_addr + movt r2, #:upper16:cpu1start_addr + + /* The socfpga VT cannot handle a 0xC0000000 page offset when loading + the cpu1start_addr, we bit clear it. Tested on HW and VT. */ + bic r2, r2, #0x40000000 + ldr r0, [r2] ldr r1, [r0] bx r1 ENTRY(secondary_trampoline_end) + +ENTRY(v7_secondary_startup) + bl v7_invalidate_l1 + b secondary_startup +ENDPROC(v7_secondary_startup) diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c index 68dd1b6..c428519 100644 --- a/arch/arm/mach-socfpga/platsmp.c +++ b/arch/arm/mach-socfpga/platsmp.c @@ -49,7 +49,8 @@ static int __cpuinit socfpga_boot_secondary(unsigned int cpu, struct task_struct memcpy(phys_to_virt(0), &secondary_trampoline, trampoline_size); - __raw_writel(virt_to_phys(secondary_startup), (sys_manager_base_addr+0x10)); + __raw_writel(virt_to_phys(v7_secondary_startup), + (sys_manager_base_addr + (cpu1start_addr & 0x000000ff))); flush_cache_all(); smp_wmb(); diff --git a/arch/arm/mach-socfpga/socfpga.c b/arch/arm/mach-socfpga/socfpga.c index 198f491..5a56370 100644 --- a/arch/arm/mach-socfpga/socfpga.c +++ b/arch/arm/mach-socfpga/socfpga.c @@ -29,6 +29,7 @@ void __iomem *socfpga_scu_base_addr = ((void __iomem *)(SOCFPGA_SCU_VIRT_BASE)); void __iomem *sys_manager_base_addr; void __iomem *rst_manager_base_addr; +unsigned long cpu1start_addr; static struct map_desc scu_io_desc __initdata = { .virtual = SOCFPGA_SCU_VIRT_BASE, @@ -72,6 +73,13 @@ void __init socfpga_sysmgr_init(void) struct device_node *np; np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); + + if (of_property_read_u32(np, "cpu1-start-addr", + (u32 *) &cpu1start_addr)) { + early_printk("Need cpu1-start-addr in device tree.\n"); + panic("Need cpu1-start-addr in device tree.\n"); + } + sys_manager_base_addr = of_iomap(np, 0); np = of_find_compatible_node(NULL, NULL, "altr,rst-mgr"); -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-01-31 17:05 ` [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware dinguyen at altera.com @ 2013-02-01 3:50 ` Olof Johansson 2013-02-01 10:46 ` Pavel Machek 2013-02-01 10:50 ` Pavel Machek 1 sibling, 1 reply; 36+ messages in thread From: Olof Johansson @ 2013-02-01 3:50 UTC (permalink / raw) To: linux-arm-kernel Hi, On Thu, Jan 31, 2013 at 11:05:43AM -0600, dinguyen at altera.com wrote: > From: Dinh Nguyen <dinguyen@altera.com> > > Because the CPU1 start address is different for socfpga-vt and > socfpga-cyclone5, we add code to use the correct CPU1 start addr. > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > Cc: Russell King <linux@arm.linux.org.uk> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Olof Johansson <olof@lixom.net> > Cc: Pavel Machek <pavel@denx.de> > --- > .../bindings/arm/altera/socfpga-system.txt | 2 ++ > arch/arm/boot/dts/socfpga_cyclone5.dts | 4 ++++ > arch/arm/boot/dts/socfpga_vt.dts | 4 ++++ > arch/arm/mach-socfpga/core.h | 4 +++- > arch/arm/mach-socfpga/headsmp.S | 16 ++++++++++++---- > arch/arm/mach-socfpga/platsmp.c | 3 ++- > arch/arm/mach-socfpga/socfpga.c | 8 ++++++++ > 7 files changed, 35 insertions(+), 6 deletions(-) > @@ -72,6 +73,13 @@ void __init socfpga_sysmgr_init(void) > struct device_node *np; > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > + > + if (of_property_read_u32(np, "cpu1-start-addr", > + (u32 *) &cpu1start_addr)) { > + early_printk("Need cpu1-start-addr in device tree.\n"); > + panic("Need cpu1-start-addr in device tree.\n"); > + } > + > sys_manager_base_addr = of_iomap(np, 0); Wouldn't it be easier to diagnose this failure if you just printed the error and continued booting without the second CPU? An early panic is usually really hard to debug since you might not get early console without extra work. -Olof ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-01 3:50 ` Olof Johansson @ 2013-02-01 10:46 ` Pavel Machek 2013-02-01 15:27 ` Dinh Nguyen 0 siblings, 1 reply; 36+ messages in thread From: Pavel Machek @ 2013-02-01 10:46 UTC (permalink / raw) To: linux-arm-kernel Hi! > > Because the CPU1 start address is different for socfpga-vt and > > socfpga-cyclone5, we add code to use the correct CPU1 start addr. > > @@ -72,6 +73,13 @@ void __init socfpga_sysmgr_init(void) > > struct device_node *np; > > > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > > + > > + if (of_property_read_u32(np, "cpu1-start-addr", > > + (u32 *) &cpu1start_addr)) { > > + early_printk("Need cpu1-start-addr in device tree.\n"); > > + panic("Need cpu1-start-addr in device tree.\n"); > > + } > > + > > sys_manager_base_addr = of_iomap(np, 0); > > Wouldn't it be easier to diagnose this failure if you just printed the error > and continued booting without the second CPU? An early panic is usually really > hard to debug since you might not get early console without extra work. I actually thought about that... but could not think of non-ugly way of doing that. I hope dts will normally be "right" for any production system... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-01 10:46 ` Pavel Machek @ 2013-02-01 15:27 ` Dinh Nguyen 2013-02-01 15:31 ` Russell King - ARM Linux 0 siblings, 1 reply; 36+ messages in thread From: Dinh Nguyen @ 2013-02-01 15:27 UTC (permalink / raw) To: linux-arm-kernel Hi Olof, On Fri, 2013-02-01 at 11:46 +0100, ZY - pavel wrote: > Hi! > > > > Because the CPU1 start address is different for socfpga-vt and > > > socfpga-cyclone5, we add code to use the correct CPU1 start addr. > > > > @@ -72,6 +73,13 @@ void __init socfpga_sysmgr_init(void) > > > struct device_node *np; > > > > > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > > > + > > > + if (of_property_read_u32(np, "cpu1-start-addr", > > > + (u32 *) &cpu1start_addr)) { > > > + early_printk("Need cpu1-start-addr in device tree.\n"); > > > + panic("Need cpu1-start-addr in device tree.\n"); > > > + } > > > + > > > sys_manager_base_addr = of_iomap(np, 0); > > > > Wouldn't it be easier to diagnose this failure if you just printed the error > > and continued booting without the second CPU? An early panic is usually really > > hard to debug since you might not get early console without extra work. > > I actually thought about that... but could not think of non-ugly way > of doing that. I hope dts will normally be "right" for any production > system... I think a panic is better just for the reason that if someone is expecting SMP, but missed the warning message, and later finds out that the secondary core never came up, it would save some debugging time. Since I have to send out a v3 from the 1st patch anyways, let me verify that I can get the early warning. Thanks, Dinh > Pavel Confidentiality Notice. This message may contain information that is confidential or otherwise protected from disclosure. If you are not the intended recipient, you are hereby notified that any use, disclosure, dissemination, distribution, or copying of this message, or any attachments, is strictly prohibited. If you have received this message in error, please advise the sender by reply e-mail, and delete the message and any attachments. Thank you. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-01 15:27 ` Dinh Nguyen @ 2013-02-01 15:31 ` Russell King - ARM Linux 2013-02-01 16:39 ` Dinh Nguyen 0 siblings, 1 reply; 36+ messages in thread From: Russell King - ARM Linux @ 2013-02-01 15:31 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 01, 2013 at 07:27:46AM -0800, Dinh Nguyen wrote: > Hi Olof, > On Fri, 2013-02-01 at 11:46 +0100, ZY - pavel wrote: > > Hi! > > > > > > Because the CPU1 start address is different for socfpga-vt and > > > > socfpga-cyclone5, we add code to use the correct CPU1 start addr. > > > > > > @@ -72,6 +73,13 @@ void __init socfpga_sysmgr_init(void) > > > > struct device_node *np; > > > > > > > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > > > > + > > > > + if (of_property_read_u32(np, "cpu1-start-addr", > > > > + (u32 *) &cpu1start_addr)) { > > > > + early_printk("Need cpu1-start-addr in device tree.\n"); > > > > + panic("Need cpu1-start-addr in device tree.\n"); > > > > + } > > > > + > > > > sys_manager_base_addr = of_iomap(np, 0); > > > > > > Wouldn't it be easier to diagnose this failure if you just printed the error > > > and continued booting without the second CPU? An early panic is usually really > > > hard to debug since you might not get early console without extra work. > > > > I actually thought about that... but could not think of non-ugly way > > of doing that. I hope dts will normally be "right" for any production > > system... > > I think a panic is better just for the reason that if someone is > expecting SMP, but missed the warning message, and later finds out that > the secondary core never came up, it would save some debugging time. > > Since I have to send out a v3 from the 1st patch anyways, let me verify > that I can get the early warning. The choice is between a panic() at a point where the only way to find out is to throw in printascii() or a working printk, and ending up with an unbootable kernel, vs continuing the boot and having an almost working system which can be logged into and the messages viewed. If you have an application which relies on the second CPU coming up, why not have it verify that the second CPU came up (it's quite easy to do - there's POSIX standard libc calls to get the number of online CPUs). ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-01 15:31 ` Russell King - ARM Linux @ 2013-02-01 16:39 ` Dinh Nguyen 2013-02-02 19:24 ` Pavel Machek 0 siblings, 1 reply; 36+ messages in thread From: Dinh Nguyen @ 2013-02-01 16:39 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2013-02-01 at 15:31 +0000, Russell King - ARM Linux wrote: > On Fri, Feb 01, 2013 at 07:27:46AM -0800, Dinh Nguyen wrote: > > Hi Olof, > > On Fri, 2013-02-01 at 11:46 +0100, ZY - pavel wrote: > > > Hi! > > > > > > > > Because the CPU1 start address is different for socfpga-vt and > > > > > socfpga-cyclone5, we add code to use the correct CPU1 start addr. > > > > > > > > @@ -72,6 +73,13 @@ void __init socfpga_sysmgr_init(void) > > > > > struct device_node *np; > > > > > > > > > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > > > > > + > > > > > + if (of_property_read_u32(np, "cpu1-start-addr", > > > > > + (u32 *) &cpu1start_addr)) { > > > > > + early_printk("Need cpu1-start-addr in device tree.\n"); > > > > > + panic("Need cpu1-start-addr in device tree.\n"); > > > > > + } > > > > > + > > > > > sys_manager_base_addr = of_iomap(np, 0); > > > > > > > > Wouldn't it be easier to diagnose this failure if you just printed the error > > > > and continued booting without the second CPU? An early panic is usually really > > > > hard to debug since you might not get early console without extra work. > > > > > > I actually thought about that... but could not think of non-ugly way > > > of doing that. I hope dts will normally be "right" for any production > > > system... > > > > I think a panic is better just for the reason that if someone is > > expecting SMP, but missed the warning message, and later finds out that > > the secondary core never came up, it would save some debugging time. > > > > Since I have to send out a v3 from the 1st patch anyways, let me verify > > that I can get the early warning. > > The choice is between a panic() at a point where the only way to find > out is to throw in printascii() or a working printk, and ending up with > an unbootable kernel, vs continuing the boot and having an almost > working system which can be logged into and the messages viewed. > > If you have an application which relies on the second CPU coming up, > why not have it verify that the second CPU came up (it's quite easy > to do - there's POSIX standard libc calls to get the number of online > CPUs). Point taken...thanks Russell. Dinh > ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-01 16:39 ` Dinh Nguyen @ 2013-02-02 19:24 ` Pavel Machek 2013-02-02 21:37 ` Dinh Nguyen 0 siblings, 1 reply; 36+ messages in thread From: Pavel Machek @ 2013-02-02 19:24 UTC (permalink / raw) To: linux-arm-kernel Hi! > > > > I actually thought about that... but could not think of non-ugly way > > > > of doing that. I hope dts will normally be "right" for any production > > > > system... > > > > > > I think a panic is better just for the reason that if someone is > > > expecting SMP, but missed the warning message, and later finds out that > > > the secondary core never came up, it would save some debugging time. > > > > > > Since I have to send out a v3 from the 1st patch anyways, let me verify > > > that I can get the early warning. > > > > The choice is between a panic() at a point where the only way to find > > out is to throw in printascii() or a working printk, and ending up with > > an unbootable kernel, vs continuing the boot and having an almost > > working system which can be logged into and the messages viewed. > > > > If you have an application which relies on the second CPU coming up, > > why not have it verify that the second CPU came up (it's quite easy > > to do - there's POSIX standard libc calls to get the number of online > > CPUs). > > Point taken...thanks Russell. Well, I don't think we normally check dtbs for validity with user-helpful error messages, but it is relatively easy in this case. ---cut here--- Continue booting with second core disabled if cpu1-start-addr is not present in .dtb. Signed-off-by: Pavel Machek <pavel@denx.de> diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach-socfpga/platsmp.c index 81e0da0..90facdd 100644 --- a/arch/arm/mach-socfpga/platsmp.c +++ b/arch/arm/mach-socfpga/platsmp.c @@ -82,6 +82,9 @@ static void __init socfpga_smp_init_cpus(void) ncores = 1; } #endif + if (!cpu1start_addr) + ncores = 1; + for (i = 0; i < ncores; i++) set_cpu_possible(i, true); diff --git a/arch/arm/mach-socfpga/socfpga.c b/arch/arm/mach-socfpga/socfpga.c index 334c330..c3cd88b 100644 --- a/arch/arm/mach-socfpga/socfpga.c +++ b/arch/arm/mach-socfpga/socfpga.c @@ -74,10 +74,9 @@ static void __init socfpga_sysmgr_init(void) np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); - if (of_property_read_u32(np, "cpu1-start-addr", (u32 *) &cpu1start_addr)) { - early_printk("Need cpu1-start-addr in device tree.\n"); - panic("Need cpu1-start-addr in device tree.\n"); - } + if (of_property_read_u32(np, "cpu1-start-addr", (u32 *) &cpu1start_addr)) + printk(KERN_ALERT "Need cpu1-start-addr in device tree for SMP operation.\n"); + sys_manager_base_addr = of_iomap(np, 0); np = of_find_compatible_node(NULL, NULL, "altr,rst-mgr"); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-02 19:24 ` Pavel Machek @ 2013-02-02 21:37 ` Dinh Nguyen 2013-02-03 18:36 ` Pavel Machek 0 siblings, 1 reply; 36+ messages in thread From: Dinh Nguyen @ 2013-02-02 21:37 UTC (permalink / raw) To: linux-arm-kernel Hi Pavel, > -----Original Message----- > From: ZY - pavel > Sent: Saturday, February 02, 2013 1:24 PM > To: Dinh Nguyen > Cc: Russell King - ARM Linux; olof at lixom.net; linux-arm- > kernel at lists.infradead.org; arnd at arndb.de > Subject: Re: [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for > actual socfpga harware > > Hi! > > > > > > I actually thought about that... but could not think of non- > ugly way > > > > > of doing that. I hope dts will normally be "right" for any > production > > > > > system... > > > > > > > > I think a panic is better just for the reason that if someone is > > > > expecting SMP, but missed the warning message, and later finds > out that > > > > the secondary core never came up, it would save some debugging > time. > > > > > > > > Since I have to send out a v3 from the 1st patch anyways, let me > verify > > > > that I can get the early warning. > > > > > > The choice is between a panic() at a point where the only way to > find > > > out is to throw in printascii() or a working printk, and ending up > with > > > an unbootable kernel, vs continuing the boot and having an almost > > > working system which can be logged into and the messages viewed. > > > > > > If you have an application which relies on the second CPU coming > up, > > > why not have it verify that the second CPU came up (it's quite easy > > > to do - there's POSIX standard libc calls to get the number of > online > > > CPUs). > > > > Point taken...thanks Russell. > > Well, I don't think we normally check dtbs for validity with > user-helpful error messages, but it is relatively easy in this case. > > ---cut here--- > > Continue booting with second core disabled if cpu1-start-addr is not > present in .dtb. > > Signed-off-by: Pavel Machek <pavel@denx.de> > > diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach- > socfpga/platsmp.c > index 81e0da0..90facdd 100644 > --- a/arch/arm/mach-socfpga/platsmp.c > +++ b/arch/arm/mach-socfpga/platsmp.c > @@ -82,6 +82,9 @@ static void __init socfpga_smp_init_cpus(void) > ncores = 1; > } > #endif > + if (!cpu1start_addr) > + ncores = 1; > + This will not work because of commit 5587164eea4aad88fcb79d9b21dc8f14fea598cd I sent out a V3 series of this patch, CPU1 will simply fail to come online if cpu1-start-addr is not defined. Dinh > for (i = 0; i < ncores; i++) > set_cpu_possible(i, true); > > diff --git a/arch/arm/mach-socfpga/socfpga.c b/arch/arm/mach- > socfpga/socfpga.c > index 334c330..c3cd88b 100644 > --- a/arch/arm/mach-socfpga/socfpga.c > +++ b/arch/arm/mach-socfpga/socfpga.c > @@ -74,10 +74,9 @@ static void __init socfpga_sysmgr_init(void) > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > > - if (of_property_read_u32(np, "cpu1-start-addr", (u32 *) > &cpu1start_addr)) { > - early_printk("Need cpu1-start-addr in device tree.\n"); > - panic("Need cpu1-start-addr in device tree.\n"); > - } > + if (of_property_read_u32(np, "cpu1-start-addr", (u32 *) > &cpu1start_addr)) > + printk(KERN_ALERT "Need cpu1-start-addr in device tree for > SMP operation.\n"); > + > sys_manager_base_addr = of_iomap(np, 0); > > np = of_find_compatible_node(NULL, NULL, "altr,rst-mgr"); > > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) > http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html Confidentiality Notice. This message may contain information that is confidential or otherwise protected from disclosure. If you are not the intended recipient, you are hereby notified that any use, disclosure, dissemination, distribution, or copying of this message, or any attachments, is strictly prohibited. If you have received this message in error, please advise the sender by reply e-mail, and delete the message and any attachments. Thank you. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-02 21:37 ` Dinh Nguyen @ 2013-02-03 18:36 ` Pavel Machek 2013-02-04 16:12 ` Dinh Nguyen 0 siblings, 1 reply; 36+ messages in thread From: Pavel Machek @ 2013-02-03 18:36 UTC (permalink / raw) To: linux-arm-kernel Hi! > > > Point taken...thanks Russell. > > > > Well, I don't think we normally check dtbs for validity with > > user-helpful error messages, but it is relatively easy in this case. > > > > ---cut here--- > > > > Continue booting with second core disabled if cpu1-start-addr is not > > present in .dtb. > > > > Signed-off-by: Pavel Machek <pavel@denx.de> > > > > diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach- > > socfpga/platsmp.c > > index 81e0da0..90facdd 100644 > > --- a/arch/arm/mach-socfpga/platsmp.c > > +++ b/arch/arm/mach-socfpga/platsmp.c > > @@ -82,6 +82,9 @@ static void __init socfpga_smp_init_cpus(void) > > ncores = 1; > > } > > #endif > > + if (!cpu1start_addr) > > + ncores = 1; > > + > > This will not work because of commit 5587164eea4aad88fcb79d9b21dc8f14fea598cd > I sent out a V3 series of this patch, CPU1 will simply fail to come online if > cpu1-start-addr is not defined. Are you sure? As far as I can see you still need such a line in v3 of the patch: @@ -72,6 +73,11 @@ void __init socfpga_sysmgr_init(void) struct device_node *np; np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); + + if (of_property_read_u32(np, "cpu1-start-addr", + (u32 *) &cpu1start_addr)) + pr_err("SMP: Need cpu1-start-addr in device tree.\n"); + sys_manager_base_addr = of_iomap(np, 0); np = of_find_compatible_node(NULL, NULL, "altr,rst-mgr"); ...so cpu1-start-addr is not there, cpu1start_addr is NULL but you continue booting. ENTRY(secondary_trampoline) - movw r0, #:lower16:CPU1_START_ADDR - movt r0, #:upper16:CPU1_START_ADDR + movw r2, #:lower16:cpu1start_addr + movt r2, #:upper16:cpu1start_addr + + /* The socfpga VT cannot handle a 0xC0000000 page offset when loading + the cpu1start_addr, we bit clear it. Tested on HW and VT. */ + bic r2, r2, #0x40000000 + ldr r0, [r2] ldr r1, [r0] bx r1 ...and you'll dereference the NULL pointer here, no? Sorry for the noise, this really is not all that important... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-02-03 18:36 ` Pavel Machek @ 2013-02-04 16:12 ` Dinh Nguyen 0 siblings, 0 replies; 36+ messages in thread From: Dinh Nguyen @ 2013-02-04 16:12 UTC (permalink / raw) To: linux-arm-kernel Hi Pavel, On Sun, 2013-02-03 at 19:36 +0100, ZY - pavel wrote: > Hi! > > > > > Point taken...thanks Russell. > > > > > > Well, I don't think we normally check dtbs for validity with > > > user-helpful error messages, but it is relatively easy in this case. > > > > > > ---cut here--- > > > > > > Continue booting with second core disabled if cpu1-start-addr is not > > > present in .dtb. > > > > > > Signed-off-by: Pavel Machek <pavel@denx.de> > > > > > > diff --git a/arch/arm/mach-socfpga/platsmp.c b/arch/arm/mach- > > > socfpga/platsmp.c > > > index 81e0da0..90facdd 100644 > > > --- a/arch/arm/mach-socfpga/platsmp.c > > > +++ b/arch/arm/mach-socfpga/platsmp.c > > > @@ -82,6 +82,9 @@ static void __init socfpga_smp_init_cpus(void) > > > ncores = 1; > > > } > > > #endif > > > + if (!cpu1start_addr) > > > + ncores = 1; > > > + > > > > This will not work because of commit 5587164eea4aad88fcb79d9b21dc8f14fea598cd > > I sent out a V3 series of this patch, CPU1 will simply fail to come online if > > cpu1-start-addr is not defined. > > Are you sure? As far as I can see you still need such a line in v3 of > the patch: > > @@ -72,6 +73,11 @@ void __init socfpga_sysmgr_init(void) > struct device_node *np; > > np = of_find_compatible_node(NULL, NULL, "altr,sys-mgr"); > + > + if (of_property_read_u32(np, "cpu1-start-addr", > + (u32 *) &cpu1start_addr)) > + pr_err("SMP: Need cpu1-start-addr in device tree.\n"); > + > sys_manager_base_addr = of_iomap(np, 0); > > np = of_find_compatible_node(NULL, NULL, "altr,rst-mgr"); > > ...so cpu1-start-addr is not there, cpu1start_addr is NULL but you > continue booting. > > ENTRY(secondary_trampoline) > - movw r0, #:lower16:CPU1_START_ADDR > - movt r0, #:upper16:CPU1_START_ADDR > + movw r2, #:lower16:cpu1start_addr > + movt r2, #:upper16:cpu1start_addr > + > + /* The socfpga VT cannot handle a 0xC0000000 page offset when > loading > + the cpu1start_addr, we bit clear it. Tested on HW and > VT. */ > + bic r2, r2, #0x40000000 > > + ldr r0, [r2] > ldr r1, [r0] > bx r1 > > ...and you'll dereference the NULL pointer here, no? You're right, somehow my initial test did not catch this error. For v4, I'm just going to wrap everything in sofpga_boot_secodardy in a if (cpu1start_addr){} Dinh > > Sorry for the noise, this really is not all that important... > Pavel ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware 2013-01-31 17:05 ` [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware dinguyen at altera.com 2013-02-01 3:50 ` Olof Johansson @ 2013-02-01 10:50 ` Pavel Machek 1 sibling, 0 replies; 36+ messages in thread From: Pavel Machek @ 2013-02-01 10:50 UTC (permalink / raw) To: linux-arm-kernel On Thu 2013-01-31 11:05:43, dinguyen at altera.com wrote: > From: Dinh Nguyen <dinguyen@altera.com> > > Because the CPU1 start address is different for socfpga-vt and > socfpga-cyclone5, we add code to use the correct CPU1 start addr. > > Signed-off-by: Dinh Nguyen <dinguyen@altera.com> > Cc: Russell King <linux@arm.linux.org.uk> > Cc: Arnd Bergmann <arnd@arndb.de> > Cc: Olof Johansson <olof@lixom.net> > Cc: Pavel Machek <pavel@denx.de> Signed-off-by: Pavel Machek <pavel@denx.de> -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2013-02-04 16:12 UTC | newest] Thread overview: 36+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-01-31 17:05 [PATCHv2 for soc 0/4] Enabling socfpga on hardware dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 1/4] arm: socfpga: Add new device tree source for actual socfpga HW dinguyen at altera.com 2013-02-01 3:46 ` Olof Johansson 2013-02-01 15:23 ` Dinh Nguyen 2013-01-31 17:05 ` [PATCHv2 for soc 2/4] arm: socfpga: Add entries to enable make dtbs socfpga dinguyen at altera.com 2013-01-31 17:05 ` [PATCHv2 for soc 3/4] arm: Add v7_invalidate_l1 to cache-v7.S dinguyen at altera.com 2013-01-31 18:11 ` Stephen Warren 2013-02-01 3:47 ` Olof Johansson 2013-02-01 11:29 ` Santosh Shilimkar 2013-02-01 11:32 ` Russell King - ARM Linux 2013-02-01 11:44 ` Santosh Shilimkar 2013-02-01 12:48 ` Russell King - ARM Linux 2013-02-01 13:04 ` Santosh Shilimkar 2013-02-01 13:20 ` Russell King - ARM Linux 2013-02-01 14:09 ` Santosh Shilimkar 2013-02-01 12:11 ` Lorenzo Pieralisi 2013-02-01 12:24 ` Santosh Shilimkar 2013-02-01 12:54 ` Russell King - ARM Linux 2013-02-01 14:10 ` Lorenzo Pieralisi 2013-02-01 14:19 ` Russell King - ARM Linux 2013-02-01 14:31 ` Russell King - ARM Linux 2013-02-01 14:43 ` Santosh Shilimkar 2013-02-01 14:49 ` Russell King - ARM Linux 2013-02-01 14:53 ` Santosh Shilimkar 2013-02-01 14:34 ` Lorenzo Pieralisi 2013-01-31 17:05 ` [PATCHv2 for soc 4/4] arm: socfpga: Add SMP support for actual socfpga harware dinguyen at altera.com 2013-02-01 3:50 ` Olof Johansson 2013-02-01 10:46 ` Pavel Machek 2013-02-01 15:27 ` Dinh Nguyen 2013-02-01 15:31 ` Russell King - ARM Linux 2013-02-01 16:39 ` Dinh Nguyen 2013-02-02 19:24 ` Pavel Machek 2013-02-02 21:37 ` Dinh Nguyen 2013-02-03 18:36 ` Pavel Machek 2013-02-04 16:12 ` Dinh Nguyen 2013-02-01 10:50 ` Pavel Machek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).