Message-ID: <1223d394-cd70-49be-2d27-f245c15709ef@arm.com>
Date: Tue, 14 Jun 2022 14:52:17 +0530
Subject: Re: [PATCH v4 03/26] arm64: head: move assignment of idmap_t0sz to C code
To: Ard Biesheuvel, linux-arm-kernel@lists.infradead.org
Cc: linux-hardening@vger.kernel.org, Marc Zyngier, Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown
References: <20220613144550.3760857-1-ardb@kernel.org> <20220613144550.3760857-4-ardb@kernel.org>
From: Anshuman Khandual
In-Reply-To: <20220613144550.3760857-4-ardb@kernel.org>

On 6/13/22 20:15, Ard Biesheuvel wrote:
> Setting idmap_t0sz involves fiddling with the caches if done with the
> MMU off. Since we will be creating an initial ID map with the MMU and
> caches off, and the permanent ID map with the MMU and caches on, let's
> move this assignment of idmap_t0sz out of the startup code, and replace
> it with a macro that simply issues the three instructions needed to
> calculate the value wherever it is needed before the MMU is turned on.
>
> Signed-off-by: Ard Biesheuvel
> ---
>  arch/arm64/include/asm/assembler.h   | 14 ++++++++++++++
>  arch/arm64/include/asm/mmu_context.h |  2 +-
>  arch/arm64/kernel/head.S             | 13 +------------
>  arch/arm64/mm/mmu.c                  |  5 ++++-
>  arch/arm64/mm/proc.S                 |  2 +-
>  5 files changed, 21 insertions(+), 15 deletions(-)
>
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 8c5a61aeaf8e..9468f45c07a6 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -359,6 +359,20 @@ alternative_cb_end
>          bfi     \valreg, \t1sz, #TCR_T1SZ_OFFSET, #TCR_TxSZ_WIDTH
>          .endm
>
> +/*
> + * idmap_get_t0sz - get the T0SZ value needed to cover the ID map
> + *
> + * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
> + * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
> + * this number conveniently equals the number of leading zeroes in
> + * the physical address of _end.
> + */
> +        .macro  idmap_get_t0sz, reg
> +        adrp    \reg, _end
> +        orr     \reg, \reg, #(1 << VA_BITS_MIN) - 1
> +        clz     \reg, \reg
> +        .endm

Is there any particular reason to evaluate the idmap T0SZ from '_end' and
VA_BITS_MIN, instead of '__idmap_text_end', as was the case previously?

> +
>  /*
>   * tcr_compute_pa_size - set TCR.(I)PS to the highest supported
>   * ID_AA64MMFR0_EL1.PARange value
> diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> index 6770667b34a3..6ac0086ebb1a 100644
> --- a/arch/arm64/include/asm/mmu_context.h
> +++ b/arch/arm64/include/asm/mmu_context.h
> @@ -60,7 +60,7 @@ static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
>   * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
>   * physical memory, in which case it will be smaller.
>   */
> -extern u64 idmap_t0sz;
> +extern int idmap_t0sz;
>  extern u64 idmap_ptrs_per_pgd;
>
>  /*
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index dc07858eb673..7f361bc72d12 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -299,22 +299,11 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
>           * physical address space. So for the ID map, use an extended virtual
>           * range in that case, and configure an additional translation level
>           * if needed.
> -         *
> -         * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
> -         * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
> -         * this number conveniently equals the number of leading zeroes in
> -         * the physical address of __idmap_text_end.
>           */
> -        adrp    x5, __idmap_text_end
> -        clz     x5, x5
> +        idmap_get_t0sz x5
>          cmp     x5, TCR_T0SZ(VA_BITS_MIN) // default T0SZ small enough?
>          b.ge    1f                        // .. then skip VA range extension
>
> -        adr_l   x6, idmap_t0sz
> -        str     x5, [x6]
> -        dmb     sy
> -        dc      ivac, x6                  // Invalidate potentially stale cache line

Right, as there is no 'idmap_t0sz' variable to update, the cache
maintenance can be dropped.

> -
>  #if (VA_BITS < 48)
>  #define EXTRA_SHIFT     (PGDIR_SHIFT + PAGE_SHIFT - 3)
>  #define EXTRA_PTRS      (1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT))
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 17b339c1a326..103bf4ae408d 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -43,7 +43,7 @@
>  #define NO_CONT_MAPPINGS        BIT(1)
>  #define NO_EXEC_MAPPINGS        BIT(2)  /* assumes FEAT_HPDS is not used */
>
> -u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN);
> +int idmap_t0sz __ro_after_init;

I guess this is just to reduce the memory footprint of 'idmap_t0sz'.

>  u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
>
>  #if VA_BITS > 48
> @@ -785,6 +785,9 @@ void __init paging_init(void)
>                          (u64)&vabits_actual + sizeof(vabits_actual));
>  #endif
>
> +        idmap_t0sz = min(63UL - __fls(__pa_symbol(_end)),
> +                         TCR_T0SZ(VA_BITS_MIN));
> +

Just curious, but doesn't this also need some sync for the update to be
visible across the system?

#define cpu_set_idmap_tcr_t0sz()        __cpu_set_tcr_t0sz(idmap_t0sz)

static inline void cpu_install_idmap(void)
{
        cpu_set_reserved_ttbr0();
        local_flush_tlb_all();
        cpu_set_idmap_tcr_t0sz();

        cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm);
}

>          map_kernel(pgdp);
>          map_mem(pgdp);
>
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 972ce8d7f2c5..97cd67697212 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -470,7 +470,7 @@ SYM_FUNC_START(__cpu_setup)
>          add     x9, x9, #64
>          tcr_set_t1sz    tcr, x9
>  #else
> -        ldr_l   x9, idmap_t0sz
> +        idmap_get_t0sz x9
>  #endif
>          tcr_set_t0sz    tcr, x9
>

Avoiding one cache maintenance operation in __create_page_tables() now
makes us evaluate idmap_t0sz again in __cpu_setup(), and also compute and
store idmap_t0sz in paging_init(). This change moves idmap_t0sz outside
the asm functions, but from a performance perspective, is there an
improvement?
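
For reference, since the value is now computed in two places (the
idmap_get_t0sz macro and paging_init()), here is a rough user-space sketch
of both calculations side by side. It is not kernel code: VA_BITS_MIN is
assumed to be 48 purely for illustration, the function names are made up,
__builtin_clzll() (a GCC/Clang builtin) stands in for both 'clz' and
__fls(), and the page alignment done by 'adrp' is ignored because the
'orr' sets all the low bits anyway.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only; the kernel derives this from its config. */
#define VA_BITS_MIN	48
#define TCR_T0SZ(x)	(64 - (x))

/*
 * Hypothetical helper mirroring the idmap_get_t0sz macro:
 * orr with the (1 << VA_BITS_MIN) - 1 mask, then count leading zeroes.
 * The mask guarantees the clz argument is never zero.
 */
static int t0sz_macro_style(uint64_t pa_end)
{
	uint64_t v = pa_end | ((UINT64_C(1) << VA_BITS_MIN) - 1);

	return __builtin_clzll(v);
}

/*
 * Hypothetical helper mirroring the paging_init() expression:
 * min(63 - __fls(pa), TCR_T0SZ(VA_BITS_MIN)). Requires pa_end != 0.
 */
static int t0sz_c_style(uint64_t pa_end)
{
	int top_bit = 63 - __builtin_clzll(pa_end);	/* index of highest set bit */
	int t0sz = 63 - top_bit;
	int def = TCR_T0SZ(VA_BITS_MIN);

	return t0sz < def ? t0sz : def;
}
```

Under these assumptions both forms clamp to TCR_T0SZ(VA_BITS_MIN) when
_end sits below 1 << VA_BITS_MIN, and both shrink T0SZ only when _end is
placed higher than that in physical memory.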