Date: Thu, 30 Jun 2022 16:46:55 +0300
From: Mike Rapoport
To: Guanghui Feng
Cc: baolin.wang@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
	akpm@linux-foundation.org, david@redhat.com, jianyong.wu@arm.com,
	james.morse@arm.com, quic_qiancai@quicinc.com, christophe.leroy@csgroup.eu,
	jonathan@marek.ca, mark.rutland@arm.com, thunder.leizhen@huawei.com,
	anshuman.khandual@arm.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, geert+renesas@glider.be, ardb@kernel.org,
	linux-mm@kvack.org, yaohongbo@linux.alibaba.com,
	alikernel-developer@linux.alibaba.com
Subject: Re: [PATCH v3] arm64: mm: fix linear mapping mem access performance degradation
Message-ID: (elided)
References: <1656586222-98555-1-git-send-email-guanghuifeng@linux.alibaba.com>
In-Reply-To: <1656586222-98555-1-git-send-email-guanghuifeng@linux.alibaba.com>
List-Id: linux-arm-kernel@lists.infradead.org
Hi,

On Thu, Jun 30, 2022 at 06:50:22PM +0800, Guanghui Feng wrote:
> arm64 can build 2M/1G block/section mappings. When using the DMA/DMA32
> zones (crashkernel enabled, rodata full disabled, kfence disabled), the
> mem_map has to use non-block/section mappings, because reserving the
> crashkernel requires shrinking the region at page granularity. This
> degrades performance for large contiguous memory accesses in the kernel
> (memcpy/memmove, etc.).
>
> There have been several related changes and discussions:
> commit 031495635b46 ("arm64: Do not defer reserve_crashkernel() for
> platforms with no DMA memory zones")
> commit 0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into
> mem_init()")
> commit 2687275a5843 ("arm64: Force NO_BLOCK_MAPPINGS if crashkernel
> reservation is required")
>
> This patch makes the mem_map use block/section mappings even with a
> crashkernel. First, build block/section mappings (normally 2M or 1G) for
> all available memory and reserve the crashkernel memory. Then walk the
> page tables and split the block/section mappings into non-block/section
> mappings (normally 4K) [[[only]]] for the crashkernel memory. The linear
> mapping therefore uses block/section mappings as much as possible, which
> reduces dTLB misses considerably and yields roughly a 10-20% improvement
> in memory access performance.

...

> Signed-off-by: Guanghui Feng
> ---
>  arch/arm64/include/asm/mmu.h |   1 +
>  arch/arm64/mm/init.c         |   8 +-
>  arch/arm64/mm/mmu.c          | 231 ++++++++++++++++++++++++++++++-------------
>  3 files changed, 168 insertions(+), 72 deletions(-)

...
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 626ec32..4b779cf 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -42,6 +42,7 @@
>  #define NO_BLOCK_MAPPINGS	BIT(0)
>  #define NO_CONT_MAPPINGS	BIT(1)
>  #define NO_EXEC_MAPPINGS	BIT(2)	/* assumes FEAT_HPDS is not used */
> +#define NO_SEC_REMAPPINGS	BIT(3)	/* rebuild with non block/sec mapping */
>  
>  u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN);
>  u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
> @@ -156,11 +157,12 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
>  }
>  
>  static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
> -		     phys_addr_t phys, pgprot_t prot)
> +		     phys_addr_t phys, pgprot_t prot, int flags)
>  {
>  	pte_t *ptep;
>  
> -	ptep = pte_set_fixmap_offset(pmdp, addr);
> +	ptep = (flags & NO_SEC_REMAPPINGS) ? pte_offset_kernel(pmdp, addr) :
> +					     pte_set_fixmap_offset(pmdp, addr);
>  	do {
>  		pte_t old_pte = READ_ONCE(*ptep);
>  
> @@ -176,7 +178,8 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
>  		phys += PAGE_SIZE;
>  	} while (ptep++, addr += PAGE_SIZE, addr != end);
>  
> -	pte_clear_fixmap();
> +	if (!(flags & NO_SEC_REMAPPINGS))
> +		pte_clear_fixmap();
>  }
>  
>  static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
> @@ -208,16 +211,59 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
>  		next = pte_cont_addr_end(addr, end);
>  
>  		/* use a contiguous mapping if the range is suitably aligned */
> -		if ((((addr | next | phys) & ~CONT_PTE_MASK) == 0) &&
> +		if (!(flags & NO_SEC_REMAPPINGS) &&
> +		    (((addr | next | phys) & ~CONT_PTE_MASK) == 0) &&
>  		    (flags & NO_CONT_MAPPINGS) == 0)
>  			__prot = __pgprot(pgprot_val(prot) | PTE_CONT);
>  
> -		init_pte(pmdp, addr, next, phys, __prot);
> +		init_pte(pmdp, addr, next, phys, __prot, flags);
>  
>  		phys += next - addr;
>  	} while (addr = next, addr != end);
>  }
>  
> +static void init_pmd_remap(pud_t *pudp, unsigned long addr, unsigned long end,
> +			   phys_addr_t phys, pgprot_t prot,
> +			   phys_addr_t (*pgtable_alloc)(int), int flags)
> +{
> +	unsigned long next;
> +	pmd_t *pmdp;
> +	phys_addr_t map_offset;
> +	pmdval_t pmdval;
> +
> +	pmdp = pmd_offset(pudp, addr);
> +	do {
> +		next = pmd_addr_end(addr, end);
> +
> +		if (!pmd_none(*pmdp) && pmd_sect(*pmdp)) {
> +			phys_addr_t pte_phys = pgtable_alloc(PAGE_SHIFT);
> +			pmd_clear(pmdp);
> +			pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN;
> +			if (flags & NO_EXEC_MAPPINGS)
> +				pmdval |= PMD_TABLE_PXN;
> +			__pmd_populate(pmdp, pte_phys, pmdval);
> +			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +
> +			map_offset = addr - (addr & PMD_MASK);
> +			if (map_offset)
> +				alloc_init_cont_pte(pmdp, addr & PMD_MASK, addr,
> +						    phys - map_offset, prot,
> +						    pgtable_alloc,
> +						    flags & (~NO_SEC_REMAPPINGS));
> +
> +			if (next < (addr & PMD_MASK) + PMD_SIZE)
> +				alloc_init_cont_pte(pmdp, next,
> +						    (addr & PUD_MASK) + PUD_SIZE,
> +						    next - addr + phys,
> +						    prot, pgtable_alloc,
> +						    flags & (~NO_SEC_REMAPPINGS));
> +		}
> +		alloc_init_cont_pte(pmdp, addr, next, phys, prot,
> +				    pgtable_alloc, flags);
> +		phys += next - addr;
> +	} while (pmdp++, addr = next, addr != end);
> +}

There is still too much duplicated code here and in init_pud_remap().
Did you consider something like this:

void __init map_crashkernel(void)
{
	int flags = NO_EXEC_MAPPINGS | NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
	u64 size;

	/*
	 * check if crash kernel supported, reserved etc
	 */

	size = crashk_res.end + 1 - crashk_res.start;

	__remove_pgd_mapping(swapper_pg_dir,
			     __phys_to_virt(crashk_res.start), size);

	__create_pgd_mapping(swapper_pg_dir, crashk_res.start,
			     __phys_to_virt(crashk_res.start),
			     size, PAGE_KERNEL,
			     early_pgtable_alloc, flags);
}

...

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel