From: Mark Rutland
To: linux-arm-kernel@lists.infradead.org
Cc: ardb@kernel.org, catalin.marinas@arm.com, james.morse@arm.com, joey.gouly@arm.com, mark.rutland@arm.com, maz@kernel.org, will@kernel.org
Subject: [PATCH 9/9] HACK: arm64: alternatives: dump summary of alternatives
Date: Thu, 1 Sep 2022 16:14:03 +0100
Message-Id: <20220901151403.1735836-10-mark.rutland@arm.com>
In-Reply-To: <20220901151403.1735836-1-mark.rutland@arm.com>
References: <20220901151403.1735836-1-mark.rutland@arm.com>

NOTE: THIS PATCH IS NOT INTENDED FOR UPSTREAM.

To figure out whether it's worth making further changes to alternatives
(e.g. whether it's worth replacing regular entries with callbacks), it
would be useful to know the makeup of alternatives in a given kernel
Image or module.

This patch makes the alternatives code dump a summary of kernel
alternatives at boot time, and module alternatives at module load time.

For example, a defconfig v6.0-rc3+ kernel built with GCC 12.1.0 reports:

| alternatives: Alternatives summary:
|  entries:  28000 (336000 bytes)
|  regular:  17280
|  callback:  10720
|  instructions:  32052 (128208 bytes)
|  replacements:  20962 ( 83848 bytes)
| alternatives: cpucap  1 => entries:   925 orig:  1295, repl:     0, cb:   925
| alternatives: cpucap  2 => entries:    10 orig:    10, repl:    10, cb:     0
| alternatives: cpucap  4 => entries:     2 orig:     2, repl:     2, cb:     0
| alternatives: cpucap  5 => entries:    49 orig:   142, repl:   142, cb:     0
| alternatives: cpucap 10 => entries:    36 orig:    36, repl:    36, cb:     0
| alternatives: cpucap 11 => entries:     9 orig:    12, repl:    12, cb:     0
| alternatives: cpucap 12 => entries:     3 orig:     6, repl:     6, cb:     0
| alternatives: cpucap 13 => entries:    17 orig:    17, repl:    17, cb:     0
| alternatives: cpucap 14 => entries:     3 orig:     3, repl:     3, cb:     0
| alternatives: cpucap 16 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 18 => entries:     7 orig:    13, repl:    13, cb:     0
| alternatives: cpucap 19 => entries:     2 orig:     2, repl:     2, cb:     0
| alternatives: cpucap 20 => entries:    17 orig:    17, repl:    17, cb:     0
| alternatives: cpucap 24 => entries:  1128 orig:  1128, repl:  1128, cb:     0
| alternatives: cpucap 26 => entries: 10780 orig: 13953, repl:  4158, cb:  9795
| alternatives: cpucap 27 => entries:    39 orig:    39, repl:    39, cb:     0
| alternatives: cpucap 28 => entries:     4 orig:     8, repl:     8, cb:     0
| alternatives: cpucap 29 => entries:    15 orig:    15, repl:    15, cb:     0
| alternatives: cpucap 30 => entries:    15 orig:    27, repl:    27, cb:     0
| alternatives: cpucap 31 => entries:     3 orig:     3, repl:     3, cb:     0
| alternatives: cpucap 32 => entries:    59 orig:   118, repl:   118, cb:     0
| alternatives: cpucap 33 => entries:     6 orig:     6, repl:     6, cb:     0
| alternatives: cpucap 36 => entries:    20 orig:    20, repl:    20, cb:     0
| alternatives: cpucap 37 => entries:  2727 orig:  2727, repl:  2727, cb:     0
| alternatives: cpucap 38 => entries:     3 orig:     3, repl:     3, cb:     0
| alternatives: cpucap 40 => entries:    25 orig:    29, repl:    29, cb:     0
| alternatives: cpucap 41 => entries:    11 orig:    21, repl:    21, cb:     0
| alternatives: cpucap 42 => entries:   142 orig:   152, repl:   152, cb:     0
| alternatives: cpucap 44 => entries:    63 orig:    63, repl:    63, cb:     0
| alternatives: cpucap 45 => entries:     4 orig:     4, repl:     4, cb:     0
| alternatives: cpucap 46 => entries:     5 orig:     5, repl:     5, cb:     0
| alternatives: cpucap 47 => entries:     2 orig:     2, repl:     2, cb:     0
| alternatives: cpucap 50 => entries:     3 orig:     3, repl:     3, cb:     0
| alternatives: cpucap 51 => entries:   105 orig:   105, repl:   105, cb:     0
| alternatives: cpucap 52 => entries:    57 orig:    59, repl:    59, cb:     0
| alternatives: cpucap 53 => entries:     3 orig:     3, repl:     3, cb:     0
| alternatives: cpucap 54 => entries:     5 orig:     5, repl:     5, cb:     0
| alternatives: cpucap 55 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 59 => entries:    28 orig:    28, repl:    28, cb:     0
| alternatives: cpucap 60 => entries:     2 orig:     2, repl:     2, cb:     0
| alternatives: cpucap 61 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 65 => entries:     2 orig:     2, repl:     2, cb:     0
| alternatives: cpucap 68 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 70 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 71 => entries:     1 orig:     3, repl:     3, cb:     0
| alternatives: cpucap 72 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 73 => entries:    32 orig:    32, repl:    32, cb:     0
| alternatives: cpucap 74 => entries:     4 orig:     4, repl:     4, cb:     0
| alternatives: cpucap 75 => entries:     5 orig:     5, repl:     5, cb:     0
| alternatives: cpucap 76 => entries: 11391 orig: 11391, repl: 11391, cb:     0
| alternatives: cpucap 77 => entries:     1 orig:     1, repl:     1, cb:     0
| alternatives: cpucap 78 => entries:    64 orig:   224, repl:   224, cb:     0
| alternatives: cpucap 79 => entries:   141 orig:   282, repl:   282, cb:     0
| alternatives: cpucap 80 => entries:    19 orig:    19, repl:    19, cb:     0

From this, it's worth noting:

* cpucap 1 is ARM64_ALWAYS_SYSTEM.

* cpucap 24 is ARM64_HAS_IRQ_PRIO_MASKING.
  Due to the existing structure of the alternatives, alternative entries
  are created for the irqflags.h code even when
  CONFIG_ARM64_PSEUDO_NMI=n, creating ~14KiB of alt_instr entries, and
  ~4KiB of replacement instructions.

  This could be avoided by reworking the irqflags.h code to use the new
  alternative_has_feature_*() helpers.

* cpucap 26 is ARM64_HAS_LSE_ATOMICS, and most entries are using the
  shared NOP patcher. The other entries are for inline cmpxchg
  sequences.

* cpucap 37 is ARM64_HAS_VIRT_HOST_EXTN.

* cpucap 76 is ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE, which could be
  rewritten to use a callback to patch LDR to LDAR (or vice-versa), were
  the insn framework extended, to save ~44KiB of replacement
  instructions.

NOTE: THIS PATCH IS NOT INTENDED FOR UPSTREAM.

Signed-off-by: Mark Rutland
---
 arch/arm64/kernel/alternative.c | 67 ++++++++++++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 122c59ce2772b..2f55d03adbe80 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -138,15 +138,74 @@ static void clean_dcache_range_nopatch(u64 start, u64 end)
 	} while (cur += d_size, cur < end);
 }
 
+#define for_each_region_alt(region, alt)		\
+	for (struct alt_instr *alt = (region)->begin;	\
+	     (alt) < (region)->end;			\
+	     (alt)++)
+
+void summarize_alternatives(const struct alt_region *region)
+{
+	unsigned int entries[ARM64_NCAPS] = { 0 };
+	unsigned int orig_len[ARM64_NCAPS] = { 0 };
+	unsigned int repl_len[ARM64_NCAPS] = { 0 };
+	unsigned int callbacks[ARM64_NCAPS] = { 0 };
+
+	unsigned int total_entries = 0;
+	unsigned int total_orig = 0;
+	unsigned int total_repl = 0;
+	unsigned int total_callbacks = 0;
+
+	for_each_region_alt(region, alt) {
+		int cap = ALT_CAP(alt);
+
+		entries[cap]++;
+		total_entries++;
+
+		orig_len[cap] += alt->orig_len;
+		total_orig += alt->orig_len;
+
+		repl_len[cap] += alt->alt_len;
+		total_repl += alt->alt_len;
+
+		if (ALT_HAS_CB(alt)) {
+			callbacks[cap]++;
+			total_callbacks++;
+		}
+	}
+
+	pr_info("Alternatives summary:\n"
+		" entries: %6u (%6zu bytes)\n"
+		" regular: %6d\n"
+		" callback: %6d\n"
+		" instructions: %6u (%6u bytes)\n"
+		" replacements: %6u (%6u bytes)\n",
+		total_entries, total_entries * sizeof (struct alt_instr),
+		total_entries - total_callbacks,
+		total_callbacks,
+		total_orig / AARCH64_INSN_SIZE, total_orig,
+		total_repl / AARCH64_INSN_SIZE, total_repl);
+
+	for (int i = 0; i < ARM64_NCAPS; i++) {
+		if (!entries[i])
+			continue;
+
+		pr_info("cpucap %2d => entries: %5d orig: %5d, repl: %5d, cb: %5d\n",
+			i,
+			entries[i],
+			orig_len[i] / AARCH64_INSN_SIZE,
+			repl_len[i] / AARCH64_INSN_SIZE,
+			callbacks[i]);
+	}
+}
+
 static void __nocfi __apply_alternatives(const struct alt_region *region,
 					 bool is_module,
 					 unsigned long *feature_mask)
 {
-	struct alt_instr *alt;
 	__le32 *origptr, *updptr;
 	alternative_cb_t alt_cb;
 
-	for (alt = region->begin; alt < region->end; alt++) {
+	for_each_region_alt(region, alt) {
 		int nr_inst;
 		int cap = ALT_CAP(alt);
 
@@ -245,6 +304,8 @@ void __init apply_boot_alternatives(void)
 	/* If called on non-boot cpu things could go wrong */
 	WARN_ON(smp_processor_id() != 0);
 
+	summarize_alternatives(&kernel_alternatives);
+
 	pr_info("applying boot alternatives\n");
 
 	__apply_alternatives(&kernel_alternatives, false,
@@ -262,6 +323,8 @@ void apply_alternatives_module(void *start, size_t length)
 
 	bitmap_fill(all_capabilities, ARM64_NCAPS);
 
+	summarize_alternatives(&region);
+
 	__apply_alternatives(&region, true, &all_capabilities[0]);
 }
 #endif
-- 
2.30.2