From mboxrd@z Thu Jan 1 00:00:00 1970 From: Arnab Basu Date: Fri, 22 Aug 2014 12:32:07 +0530 Subject: [U-Boot] [Patch v2 3/5] armv8/fsl-lsch3: Release secondary cores from boot hold off with Boot Page In-Reply-To: References: <1408480082-4617-1-git-send-email-yorksun@freescale.com> <1408480082-4617-3-git-send-email-yorksun@freescale.com> <20140821134724.GM21734@leverpostej> <53F63B55.3040900@freescale.com> Message-ID: <53F6EAEF.5050302@freescale.com> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Hi Mark On 08/22/2014 12:20 AM, Sharma Bhupesh-B45370 wrote: > Hi Mark, > > >> -----Original Message----- >> From: u-boot-bounces at lists.denx.de [mailto:u-boot-bounces at lists.denx.de] >> On Behalf Of York Sun >> Sent: Friday, August 22, 2014 12:03 AM >> To: Mark Rutland; Basu Arnab-B45036 >> Cc: trini at ti.com; u-boot at lists.denx.de; Wood Scott-B07421 >> Subject: Re: [U-Boot] [Patch v2 3/5] armv8/fsl-lsch3: Release secondary >> cores from boot hold off with Boot Page >> >> On 08/21/2014 06:47 AM, Mark Rutland wrote: >>> Hi York, >>> >>> I have mostly minor comments this time; this is looking pretty good. >>> Thanks for taking the time to review this. >>> On Tue, Aug 19, 2014 at 09:28:00PM +0100, York Sun wrote: >>>> Secondary cores need to be released from holdoff by boot release >>>> registers. With GPP bootrom, they can boot from main memory directly. >>>> Individual spin table is used for each core. If a single release >>>> address is needed, defining macro CONFIG_FSL_SMP_RELEASE_ALL will use >>>> the CPU_RELEASE_ADDR. Spin table and the boot page is reserved in >>>> device tree so OS won't overwrite. >>>> >>>> Signed-off-by: York Sun >>>> Signed-off-by: Arnab Basu >>>> --- >>>> v2: Removed copying boot page. Use u-boot image as is in memory. >>>> Added dealing with different size of addr_cell in device tree. >>>> Added dealing with big- and little-endian. >>>> Added flushing spin table after cpu_release(). >>>> >>>> arch/arm/cpu/armv8/fsl-lsch3/Makefile | 2 + >>>> arch/arm/cpu/armv8/fsl-lsch3/cpu.c | 13 ++ >>>> arch/arm/cpu/armv8/fsl-lsch3/cpu.h | 1 + >>>> arch/arm/cpu/armv8/fsl-lsch3/fdt.c | 56 +++++++ >>>> arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S | 128 ++++++++++++- >> -- >>>> arch/arm/cpu/armv8/fsl-lsch3/mp.c | 172 >> +++++++++++++++++++++ >>>> arch/arm/cpu/armv8/fsl-lsch3/mp.h | 36 +++++ >>>> arch/arm/cpu/armv8/transition.S | 63 +------- >>>> arch/arm/include/asm/arch-fsl-lsch3/config.h | 3 +- >>>> arch/arm/include/asm/arch-fsl-lsch3/immap_lsch3.h | 35 +++++ >>>> arch/arm/include/asm/macro.h | 92 +++++++++++ >>>> arch/arm/lib/gic_64.S | 10 +- >>>> common/board_f.c | 2 +- >>>> 13 files changed, 525 insertions(+), 88 deletions(-) create mode >>>> 100644 arch/arm/cpu/armv8/fsl-lsch3/fdt.c >>>> create mode 100644 arch/arm/cpu/armv8/fsl-lsch3/mp.c create mode >>>> 100644 arch/arm/cpu/armv8/fsl-lsch3/mp.h >>>> >>>> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/Makefile >>>> b/arch/arm/cpu/armv8/fsl-lsch3/Makefile >>>> index 9249537..f920eeb 100644 >>>> --- a/arch/arm/cpu/armv8/fsl-lsch3/Makefile >>>> +++ b/arch/arm/cpu/armv8/fsl-lsch3/Makefile >>>> @@ -7,3 +7,5 @@ >>>> obj-y += cpu.o >>>> obj-y += lowlevel.o >>>> obj-y += speed.o >>>> +obj-$(CONFIG_MP) += mp.o >>>> +obj-$(CONFIG_OF_LIBFDT) += fdt.o >>>> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/cpu.c >>>> b/arch/arm/cpu/armv8/fsl-lsch3/cpu.c >>>> index c129d03..47b947f 100644 >>>> --- a/arch/arm/cpu/armv8/fsl-lsch3/cpu.c >>>> +++ b/arch/arm/cpu/armv8/fsl-lsch3/cpu.c >>>> @@ -11,6 +11,7 @@ >>>> #include >>>> #include >>>> #include "cpu.h" >>>> +#include "mp.h" >>>> #include "speed.h" >>>> #include >>>> >>>> @@ -434,3 +435,15 @@ int cpu_eth_init(bd_t *bis) #endif >>>> return error; >>>> } >>>> + >>>> + >>>> +int arch_early_init_r(void) >>>> +{ >>>> + int rv; >>>> + rv = fsl_lsch3_wake_seconday_cores(); >>>> + >>>> + if (rv) >>>> + printf("Did not wake secondary cores\n"); >>>> + >>>> + return 0; >>>> +} >>>> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/cpu.h >>>> b/arch/arm/cpu/armv8/fsl-lsch3/cpu.h >>>> index 28544d7..2e3312b 100644 >>>> --- a/arch/arm/cpu/armv8/fsl-lsch3/cpu.h >>>> +++ b/arch/arm/cpu/armv8/fsl-lsch3/cpu.h >>>> @@ -5,3 +5,4 @@ >>>> */ >>>> >>>> int fsl_qoriq_core_to_cluster(unsigned int core); >>>> +u32 cpu_mask(void); >>>> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/fdt.c >>>> b/arch/arm/cpu/armv8/fsl-lsch3/fdt.c >>>> new file mode 100644 >>>> index 0000000..2dbcdcb >>>> --- /dev/null >>>> +++ b/arch/arm/cpu/armv8/fsl-lsch3/fdt.c >>>> @@ -0,0 +1,56 @@ >>>> +/* >>>> + * Copyright 2014 Freescale Semiconductor, Inc. >>>> + * >>>> + * SPDX-License-Identifier: GPL-2.0+ >>>> + */ >>>> + >>>> +#include >>>> +#include >>>> +#include >>>> +#include "mp.h" >>>> + >>>> +#ifdef CONFIG_MP >>>> +void ft_fixup_cpu(void *blob) >>>> +{ >>>> + int off; >>>> + __maybe_unused u64 spin_tbl_addr = (u64)get_spin_tbl_addr(); >>>> + fdt32_t *reg; >>>> + int addr_cells; >>>> + u64 val; >>>> + size_t *boot_code_size = &(__secondary_boot_code_size); >>>> + >>>> + off = fdt_node_offset_by_prop_value(blob, -1, "device_type", >>>> + "cpus", 4); >>> >>> I didn't think /cpus had device_type = "cpus". I can't see any >>> instances in any DTs I have to hand. Can we not find /cpus by path? >> >> I will let Arnab to comment on this. He is coordinating with Linux device >> tree. > > Since I contribute to the DTS for FSL ARMv8 SoC, here is my rationale behind the same. > I have used the standard ARM cpu device-tree binding documentation as a reference (see [1]) > which defined the device_type which it mentions should be set to cpu. > > Please let me know if I am missing something. > > [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/cpus.txt > I see the confusion here, "device_type" is required for the "cpu" node but not its container the "cpus" node. I will change this. >>> >>>> + of_bus_default_count_cells(blob, off, &addr_cells, NULL); >>>> + >>>> + off = fdt_node_offset_by_prop_value(blob, -1, "device_type", >> "cpu", 4); >>>> + while (off != -FDT_ERR_NOTFOUND) { >>>> + reg = (fdt32_t *)fdt_getprop(blob, off, "reg", 0); >>>> + if (reg) { >>>> + val = spin_tbl_addr; #ifndef >>>> +CONFIG_FSL_SMP_RELEASE_ALL >>>> + val += id_to_core(of_read_number(reg, >> addr_cells)) >>>> + * SPIN_TABLE_ELEM_SIZE; #endif >>>> + val = cpu_to_fdt64(val); >>>> + fdt_setprop_string(blob, off, "enable-method", >>>> + "spin-table"); >>>> + fdt_setprop(blob, off, "cpu-release-addr", >>>> + &val, sizeof(val)); >>>> + } else { >>>> + puts("cpu NULL\n"); >>> >>> Could we change that to "Warning: found cpu node without reg property" >>> or something like that which makes the problem clear? >> >> Yes, we can. >> >>> >>>> + } >>>> + off = fdt_node_offset_by_prop_value(blob, off, >> "device_type", >>>> + "cpu", 4); >>>> + } >>>> + >>>> + fdt_add_mem_rsv(blob, (uintptr_t)&secondary_boot_code, >>>> + *boot_code_size); } #endif >>>> + >>>> +void ft_cpu_setup(void *blob, bd_t *bd) { #ifdef CONFIG_MP >>>> + ft_fixup_cpu(blob); >>>> +#endif >>>> +} >>>> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S >>>> b/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S >>>> index ad32b6c..629e6d4 100644 >>>> --- a/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S >>>> +++ b/arch/arm/cpu/armv8/fsl-lsch3/lowlevel.S >>>> @@ -8,7 +8,9 @@ >>>> >>>> #include >>>> #include >>>> +#include >>>> #include >>>> +#include "mp.h" >>>> >>>> ENTRY(lowlevel_init) >>>> mov x29, lr /* Save LR */ >>>> @@ -37,29 +39,119 @@ ENTRY(lowlevel_init) >>>> >>>> branch_if_master x0, x1, 1f >>>> >>>> + ldr x0, =secondary_boot_func >>>> + blr x0 >>>> + >>>> +1: >>>> +2: >>> >>> Isn't the '2' label redundant? >> >> We have some internal code dealing with trust zone between the 1 and 2. It >> is not likely to be used in long term since we are trying to move them >> into security monitor. I can drop label 2 here. > Yes, now that I look at it the labels in this file could do with some cleanup. Will do. > U-boot can still be booted in EL3, as it can be well booted w/o a ATF or EL3 capable > s/w running before the same. That's why we have CONFIG_EL3 and CONFIG_EL2 code legs > in the u-boot ARMv8 code. > >> >>> >>>> + mov lr, x29 /* Restore LR */ >>>> + ret >>>> +ENDPROC(lowlevel_init) >>>> + >>>> + /* Keep literals not used by the secondary boot code outside it >> */ >>>> + .ltorg >>>> + >>>> + /* Using 64 bit alignment since the spin table is accessed as >> data */ >>>> + .align 4 >>>> + .global secondary_boot_code >>>> + /* Secondary Boot Code starts here */ >>>> +secondary_boot_code: >>>> + .global __spin_table >>>> +__spin_table: >>>> + .space CONFIG_MAX_CPUS*SPIN_TABLE_ELEM_SIZE >>>> + >>>> + .align 2 >>>> +ENTRY(secondary_boot_func) >>>> /* >>>> - * Slave should wait for master clearing spin table. >>>> - * This sync prevent salves observing incorrect >>>> - * value of spin table and jumping to wrong place. >>>> + * MPIDR_EL1 Fields: >>>> + * MPIDR[1:0] = AFF0_CPUID <- Core ID (0,1) >>>> + * MPIDR[7:2] = AFF0_RES >>>> + * MPIDR[15:8] = AFF1_CLUSTERID <- Cluster ID (0,1,2,3) >>>> + * MPIDR[23:16] = AFF2_CLUSTERID >>>> + * MPIDR[24] = MT >>>> + * MPIDR[29:25] = RES0 >>>> + * MPIDR[30] = U >>>> + * MPIDR[31] = ME >>>> + * MPIDR[39:32] = AFF3 >>>> + * >>>> + * Linear Processor ID (LPID) calculation from MPIDR_EL1: >>>> + * (We only use AFF0_CPUID and AFF1_CLUSTERID for now >>>> + * until AFF2_CLUSTERID and AFF3 have non-zero values) >>>> + * >>>> + * LPID = MPIDR[15:8] | MPIDR[1:0] >>>> */ >>>> -#if defined(CONFIG_GICV2) || defined(CONFIG_GICV3) -#ifdef >>>> CONFIG_GICV2 >>>> - ldr x0, =GICC_BASE >>>> + mrs x0, mpidr_el1 >>>> + ubfm x1, x0, #8, #15 >>>> + ubfm x2, x0, #0, #1 >>>> + orr x10, x2, x1, lsl #2 /* x10 has LPID */ >>>> + ubfm x9, x0, #0, #15 /* x9 contains MPIDR[15:0] */ >>>> + /* >>>> + * offset of the spin table element for this core from start of >> spin >>>> + * table (each elem is padded to 64 bytes) >>>> + */ >>>> + lsl x1, x10, #6 >>>> + ldr x0, =__spin_table >>>> + /* physical address of this cpus spin table element */ >>>> + add x11, x1, x0 >>>> + >>>> + str x9, [x11, #16] /* LPID */ >>>> + mov x4, #1 >>>> + str x4, [x11, #8] /* STATUS */ >>>> + dsb sy >>>> +#if defined(CONFIG_GICV3) >>>> + gic_wait_for_interrupt_m x0 >>>> #endif >>>> - bl gic_wait_for_interrupt >>> >>> Why do we only need to wait for GICv3? We previously did so for GICv2. >> >> I think this is leftover from the attempt to boot all cores >> simultaneously. I will let Arnab to comment. >> There is a "#elif" that I'm sure used to be here but seems to have got lost somehow. For CONFIG_GICV2 "gic_wait_for_interrupt_m" takes 2 arguments. We should be waiting in both cases. Will fix. >>> >>>> + >>>> + bl secondary_switch_to_el2 >>>> +#ifdef CONFIG_ARMV8_SWITCH_TO_EL1 >>>> + secondary_switch_to_el1 >>>> #endif >>> >>> Missing bl? >> >> Looks so. >> >>> >>> Is there any reason to have the SWITCH_TO_EL1 option other than for >>> debugging? >> >> Good question. I will let Arnab to comment here. >> >>> >>> EL2 is the preferred EL to boot at for Linux and Xen (it gives far >>> more flexibility), and if dropping to EL1 is necessary I think it >>> would make more sense as a run-time option than a compile-time option. >>> I don't think we plan to boot Linux in EL1. This is primarily here to maintain uniformity with "arch/arm/lib/bootm.c". If I remove it from here and it was ever defined in the config, then the boot core would enter Linux in EL1 while the secondaries entered Linux in EL2. I don't know if that breaks anything... The run-time option seems interesting and it would definitely work for the primary core which could access the u-boot env variables and such but the secondaries are executing assembly and the communication between cores is fairly primitive (sgi's and sev's etc) so this might require a little bit of work. If you have any thoughts on how we can go about it, I would be glad to do some research, but that seems to be the topic for a separate patchset I guess. >>>> >>>> - /* >>>> - * All processors will enter EL2 and optionally EL1. >>>> +slave_cpu: >>>> + wfe >>>> +#ifdef CONFIG_FSL_SMP_RELEASE_ALL >>>> + /* All cores are released from the address in the 1st spin >> table >>>> + * element >>>> */ >>>> - bl armv8_switch_to_el2 >>>> -#ifdef CONFIG_ARMV8_SWITCH_TO_EL1 >>>> - bl armv8_switch_to_el1 >>>> + ldr x1, =__spin_table >>>> + ldr x0, [x1] >>>> +#else >>>> + ldr x0, [x11] >>>> +#endif >>>> + cbz x0, slave_cpu >>> >>> Similarly is there any reason to have the option of a single release >>> addr if we can support unique addresses? >> >> I think it was used by Linux for some ARM parts. I personally not a fun of >> using single release. But if it makes everyone happy, I can keep it. > > We followed the standard ARMv8 foundation model DTS initially which along with others > supported a single release address for all the cores. So, we wanted to comply to the same. > Yes this is left over code which should (and will) be cleaned up. >> >>> >>> [...] >>> >>>> diff --git a/arch/arm/cpu/armv8/fsl-lsch3/mp.c >>>> b/arch/arm/cpu/armv8/fsl-lsch3/mp.c >>>> new file mode 100644 >>>> index 0000000..b29eda9 >>>> --- /dev/null >>>> +++ b/arch/arm/cpu/armv8/fsl-lsch3/mp.c >>>> @@ -0,0 +1,172 @@ >>>> +/* >>>> + * Copyright 2014 Freescale Semiconductor, Inc. >>>> + * >>>> + * SPDX-License-Identifier: GPL-2.0+ >>>> + */ >>>> + >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> +#include >>>> +#include "mp.h" >>>> + >>>> +DECLARE_GLOBAL_DATA_PTR; >>>> + >>>> +void *get_spin_tbl_addr(void) >>>> +{ >>>> + void *ptr = (void *)&__spin_table; >>>> + >>>> + return ptr; >>>> +} >>> >>> I don't think the cast is necessary here. As far as I can see this >>> could be just: >>> >>> void *get_spin_tbl_addr(void) >>> { >>> return &__spin_table; >>> } >>> >> >> Let me try that. >> >>> [...] >>> >>>> diff --git a/arch/arm/include/asm/macro.h >>>> b/arch/arm/include/asm/macro.h index f77e4b8..0009c28 100644 >>>> --- a/arch/arm/include/asm/macro.h >>>> +++ b/arch/arm/include/asm/macro.h >>>> @@ -105,6 +105,98 @@ lr .req x30 >>>> cbz \xreg1, \master_label >>>> .endm >>>> >>>> +.macro armv8_switch_to_el2_m, xreg1 >>>> + mov \xreg1, #0x5b1 /* Non-secure EL0/EL1 | HVC | 64bit EL2 >> */ >>>> + msr scr_el3, \xreg1 >>> >>> When dropping to EL1 from EL2 we disable HVC via HCR_EL2; presumably >>> due to lack of a handler. Would it make sense to do similarly here and >>> disable SMC here until we have a user (e.g. PSCI)? >> >> I will let Arnab to comment here. > SMC's are disabled (we are setting bit 7, the SMD bit). The comment does not capture this. I'll fix it. Thanks Arnab > In my view u-boot should be similar to UEFI and both should drop only to EL2 > before booting either Linux or Hypervisor. UEFI code doesn't switch to EL1 > before invoking Linux. Thoughts? > >> >> >> York >> _______________________________________________ >> U-Boot mailing list >> U-Boot at lists.denx.de >> http://lists.denx.de/mailman/listinfo/u-boot