From mboxrd@z Thu Jan 1 00:00:00 1970
From: Edgar E. Iglesias
Date: Wed, 2 Sep 2020 17:25:15 +0200
Subject: [PATCH] arm64: Add support for bigger u-boot when CONFIG_POSITION_INDEPENDENT=y
In-Reply-To: <0c4c6194-9a87-99e9-4fed-92b5a705ca4a@arm.com>
References: <8438394ae435af2b900b965622969dce96701b88.1599045314.git.michal.simek@xilinx.com> <67147ba5-4bcc-2f2d-d979-17d4798198e0@arm.com> <20200902145319.GX14249@toto> <0c4c6194-9a87-99e9-4fed-92b5a705ca4a@arm.com>
Message-ID: <20200902152515.GY14249@toto>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On Wed, Sep 02, 2020 at 04:18:48PM +0100, André Przywara wrote:
> On 02/09/2020 15:53, Edgar E. Iglesias wrote:
> > On Wed, Sep 02, 2020 at 03:43:08PM +0100, André Przywara wrote:
> >> On 02/09/2020 12:15, Michal Simek wrote:
> >
> > Hi,
> >
> >>
> >>> From: "Edgar E. Iglesias"
> >>>
> >>> When the U-Boot binary exceeds 1MB with CONFIG_POSITION_INDEPENDENT=y,
> >>> a compilation error is shown:
> >>> /mnt/disk/u-boot/arch/arm/cpu/armv8/start.S:71:(.text+0x3c): relocation
> >>> truncated to fit: R_AARCH64_ADR_PREL_LO21 against symbol `__rel_dyn_end'
> >>> defined in .bss_start section in u-boot.
> >>>
> >>> It is caused by the adr instruction, which permits the calculation of
> >>> any byte address within +- 1MB of the current PC.
> >>> Because U-Boot is bigger than 1MB, the calculation fails.
> >>>
> >>> The patch uses adrp/add instructions, where adrp shifts a signed, 21-bit
> >>> immediate left by 12 bits (4k page) and adds it to the value of the
> >>> program counter with the bottom 12 bits cleared to zero. The add
> >>> instruction then provides the lower 12 bits, which is the offset within
> >>> the 4k page.
> >>> These two instructions together compose a full 32bit offset, which
> >>> should be more than enough to cover the whole u-boot size.
> >>>
> >>> Signed-off-by: Edgar E. Iglesias
> >>> Signed-off-by: Michal Simek
> >>
> >> It's a bit scary that you need more than 1MB, but indeed what you do
> >> below is the canonical pattern to get the full range of PC relative
> >> addressing (this is used heavily in Trusted Firmware, for instance).
> >>
> >> The only thing to keep in mind is that this assumes that the load
> >> address of the binary is 4K aligned, so that the low 12 bits of the
> >> symbol stay the same. I wonder if we should enforce this somehow? But
> >> the load address is not controlled by the build process (the whole
> >> purpose of PIE), so that's not doable just in the build system?
> >
> > There shouldn't be any need for 4K alignment. Could you elaborate on
> > why you think there is?
>
> That seems to be slightly tricky, and I tried to get some confirmation,
> but here goes my reasoning. Maybe you can confirm this:
>
> - adrp takes the relative offset, but only of the upper 20 bits (because
>   that's all we can encode). It clears the lower 12 bits of the register.
> - the "add" is not PC relative anymore, so it just takes the lower 12
>   bits of the "absolute" linker symbol.

I was under the impression that this would use a PC-relative lower 12bit
relocation, but you are correct. I disassembled the result:

  40:   91000042    add x2, x2, #0x0
            40: R_AARCH64_ADD_ABS_LO12_NC   __rel_dyn_start

> So this assumes that the lower 12 bits of the actual address in memory
> and the lower 12 bits of the linker's view match.
> An example:
> 00024: adrp x0, SYMBOL
> 00028: add x0, x0, :lo12:SYMBOL
>
> SYMBOL:
> 42058: ...
>
> The toolchain will generate:
> adrp x0, #0x42; add x0, x0, #0x058
>
> Now you load the code to 0x8000.0800 (NOT 4K aligned). SYMBOL is now at
> 0x80042858.
> The adrp will use the PC (0x8000.0824) & ~0xfff + offs => 0x8004.2000.
> The add will just add 0x58, so you end up with x0 being 0x80042058,
> which is not the right address.
>
> Does this make sense?

Yes, it makes sense.
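
For the record, the arithmetic in the example above can be reproduced with
a small C model. This is only a sketch and not part of the patch: the
function names are made up for illustration, and it assumes the image is
linked at address 0, so that link-time addresses equal offsets into the
binary.

```c
#include <stdint.h>

#define PAGE_MASK_4K 0xfffULL

/*
 * Hypothetical model of "adrp xN, SYMBOL" (R_AARCH64_ADR_PREL_PG_HI21):
 * the linker encodes the page delta between the instruction and the
 * symbol, and the CPU adds it to the page of the runtime PC.
 */
uint64_t adrp_result(uint64_t link_pc, uint64_t link_sym, uint64_t load_base)
{
    uint64_t page_delta = (link_sym & ~PAGE_MASK_4K) - (link_pc & ~PAGE_MASK_4K);
    uint64_t run_pc = load_base + link_pc;   /* assumes link base 0 */

    return (run_pc & ~PAGE_MASK_4K) + page_delta;
}

/*
 * Hypothetical model of "add xN, xN, :lo12:SYMBOL"
 * (R_AARCH64_ADD_ABS_LO12_NC): the immediate is the low 12 bits of the
 * link-time symbol address and is not PC relative.
 */
uint64_t adrp_add_result(uint64_t link_pc, uint64_t link_sym, uint64_t load_base)
{
    return adrp_result(link_pc, link_sym, load_base) + (link_sym & PAGE_MASK_4K);
}
```

With link_pc = 0x24, link_sym = 0x42058 and load_base = 0x80000800 the
model yields 0x80042058, while the symbol really sits at 0x80042858. With
a 4K-aligned load_base such as 0x80001000 the two agree.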
>
> > Perhaps the commit message is a little confusing. The toolchain will
> > compute the pc-relative offset from this particular location to the
> > symbol and apply the relocations accordingly.
>
> Yes, but the PC relative offset applies only to the upper 20 bits,
> because it's only adrp that has PC relative semantics.
>
>
> >>
> >> Shall we at least document this? I guess typical load addresses are
> >> actually quite well aligned, so it might not be an issue in practice.
> >>

Yes, it's probably worth documenting, perhaps with an early bail-out if
the load address is not 4K aligned...

Thanks,
Edgar
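
A rough idea of what such an early bail-out could test, written in C for
readability. This is a sketch under assumptions, not a proposed
implementation: the function name and parameters are hypothetical, and
the real check would more likely be a couple of instructions in the
assembly startup code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when an adrp/add pair resolves
 * correctly, i.e. when the runtime address of the image differs from its
 * link-time address by a multiple of 4K, so that the low 12 bits of
 * every symbol are the same in both views.
 */
bool adrp_lo12_safe(uint64_t runtime_addr, uint64_t link_addr)
{
    return ((runtime_addr - link_addr) & 0xfffULL) == 0;
}
```

For an image linked at 0, a load address of 0x80000800 fails the check,
while 0x80000000 or 0x80001000 passes.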