From: Mark Rutland
To: linux-arm-kernel@lists.infradead.org
Cc: mark.rutland@arm.com, pierre.gondois@arm.com, valentin.schneider@arm.com, vschneid@redhat.com, will@kernel.org
Subject: [PATCH 0/3] arm_pmu: acpi: avoid allocations in atomic context
Date: Fri, 30 Sep 2022 12:18:41 +0100
Message-Id: <20220930111844.1522365-1-mark.rutland@arm.com>

This series attempts to make the arm_pmu ACPI probing code play nicely
with PREEMPT_RT by moving work out of atomic context.

The arm_pmu ACPI probing code tries to do a number of things in atomic
context, which is generally not good, and especially problematic for
PREEMPT_RT, as reported by Valentin and Pierre:

  https://lore.kernel.org/all/20210810134127.1394269-2-valentin.schneider@arm.com/
  https://lore.kernel.org/linux-arm-kernel/20220912155105.1443303-1-pierre.gondois@arm.com/

We'd previously tried to bodge around this, e.g. in commits:

* 0dc1a1851af1d593 ("arm_pmu: add armpmu_alloc_atomic()")
* 167e61438da0664c ("arm_pmu: acpi: request IRQs up-front")

...
but this isn't good enough for PREEMPT_RT, and as reported by Pierre the
probing code can trigger splats:

| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 24, name: cpuhp/0
| preempt_count: 0, expected: 0
| RCU nest depth: 0, expected: 0
| 3 locks held by cpuhp/0/24:
| #0: ffffd8a22c8870d0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun (linux/kernel/cpu.c:754)
| #1: ffffd8a22c887120 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun (linux/kernel/cpu.c:754)
| #2: ffff083e7f0d97b8 ((&c->lock)){+.+.}-{3:3}, at: ___slab_alloc (linux/mm/slub.c:2954)
| irq event stamp: 42
| hardirqs last enabled at (41): finish_task_switch (linux/./arch/arm64/include/asm/irqflags.h:35)
| hardirqs last disabled at (42): cpuhp_thread_fun (linux/kernel/cpu.c:776 (discriminator 1))
| softirqs last enabled at (0): copy_process (linux/./include/linux/lockdep.h:191)
| softirqs last disabled at (0): 0x0
| CPU: 0 PID: 24 Comm: cpuhp/0 Tainted: G W 5.19.0-rc3-rt4-custom-piegon01-rt_0 #142
| Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18
| Call trace:
| dump_backtrace (linux/arch/arm64/kernel/stacktrace.c:200)
| show_stack (linux/arch/arm64/kernel/stacktrace.c:207)
| dump_stack_lvl (linux/lib/dump_stack.c:107)
| dump_stack (linux/lib/dump_stack.c:114)
| __might_resched (linux/kernel/sched/core.c:9929)
| rt_spin_lock (linux/kernel/locking/rtmutex.c:1732 (discriminator 4))
| ___slab_alloc (linux/mm/slub.c:2954)
| __slab_alloc.isra.0 (linux/mm/slub.c:3116)
| kmem_cache_alloc_trace (linux/mm/slub.c:3207)
| __armpmu_alloc (linux/./include/linux/slab.h:600)
| armpmu_alloc_atomic (linux/drivers/perf/arm_pmu.c:927)
| arm_pmu_acpi_cpu_starting (linux/drivers/perf/arm_pmu_acpi.c:204)
| cpuhp_invoke_callback (linux/kernel/cpu.c:192)
| cpuhp_thread_fun (linux/kernel/cpu.c:777 (discriminator 3))
| smpboot_thread_fn (linux/kernel/smpboot.c:164 (discriminator 3))
| kthread (linux/kernel/kthread.c:376)
| ret_from_fork (linux/arch/arm64/kernel/entry.S:868)

Thomas Gleixner suggested that we could pre-allocate structures to avoid
this issue:

  https://lore.kernel.org/all/87y299oyyq.ffs@tglx/

... and Pierre implemented that:

  https://lore.kernel.org/linux-arm-kernel/20220912155105.1443303-1-pierre.gondois@arm.com/

... but in practice this gets pretty hairy due to having to manage the
lifetime of those pre-allocated objects across various stages of the
probing flow.

This series reworks the code to perform all the allocation and
registration with perf at boot time, by scanning the set of online CPUs
and registering a PMU for each unique MIDR (which we use today to
identify distinct PMUs). This avoids the need for allocation in the
hotplug paths, and brings the ACPI probing code into line with the
DT/platform probing code.

When a CPU is late hotplugged, either:

(a) It matches an existing PMU's MIDR, and will be associated with that
    PMU.

(b) It does not match an existing PMU's MIDR, and will not be associated
    with a PMU (and a warning is logged to dmesg).

    Aside from the warning, this matches the existing behaviour, as we
    only register CPU PMUs with perf at boot time, and not for late
    hotplugged CPUs.
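
As a rough illustration of the flow described above, here is a minimal
userspace sketch in C. It is not the code from this series: the model_*
names, the fixed-size table and the 32-bit CPU mask are all hypothetical
stand-ins for the kernel's arm_pmu structures and cpumasks. It only
shows the shape of the idea: the boot-time scan creates one PMU per
distinct MIDR, and the hotplug path looks up an existing PMU and never
allocates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CPUS 32	/* limited by the 32-bit cpu_mask below */
#define MAX_PMUS 32

/* Hypothetical model of a PMU: one instance per distinct MIDR. */
struct model_pmu {
	uint32_t midr;		/* MIDR shared by every CPU of this PMU */
	uint32_t cpu_mask;	/* bit N set => CPU N is associated */
};

struct model_pmu_table {
	struct model_pmu pmus[MAX_PMUS];
	size_t nr_pmus;
};

/* Return the PMU registered for this MIDR, or NULL if there is none. */
struct model_pmu *model_find_pmu(struct model_pmu_table *t, uint32_t midr)
{
	for (size_t i = 0; i < t->nr_pmus; i++) {
		if (t->pmus[i].midr == midr)
			return &t->pmus[i];
	}
	return NULL;
}

/*
 * Boot-time scan: models the sleepable context where allocation is
 * fine. Creates one PMU per distinct MIDR among the online CPUs.
 */
void model_probe_boot(struct model_pmu_table *t, const uint32_t *midr,
		      const bool *online, size_t nr_cpus)
{
	assert(nr_cpus <= MAX_CPUS);

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		struct model_pmu *pmu;

		if (!online[cpu])
			continue;

		pmu = model_find_pmu(t, midr[cpu]);
		if (!pmu) {
			/* "Allocate and register" a new PMU. */
			assert(t->nr_pmus < MAX_PMUS);
			pmu = &t->pmus[t->nr_pmus++];
			pmu->midr = midr[cpu];
			pmu->cpu_mask = 0;
		}
		pmu->cpu_mask |= UINT32_C(1) << cpu;
	}
}

/*
 * Late-hotplug path: models the atomic context. It only looks up an
 * existing PMU and never allocates. Returns true if the CPU was
 * associated, false (after logging a warning) otherwise.
 */
bool model_cpu_starting(struct model_pmu_table *t, unsigned int cpu,
			uint32_t midr)
{
	struct model_pmu *pmu;

	assert(cpu < MAX_CPUS);

	pmu = model_find_pmu(t, midr);
	if (!pmu) {
		fprintf(stderr, "Unable to associate CPU%u with a PMU\n", cpu);
		return false;
	}
	pmu->cpu_mask |= UINT32_C(1) << cpu;
	return true;
}
```

Calling model_probe_boot() once with the boot-time MIDRs and then
model_cpu_starting() for each late-onlined CPU corresponds to cases (a)
and (b) above: a matching MIDR extends an existing mask, and a
non-matching one leaves the table unchanged.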

I've tested the series in a VM, using ACPI and faked MIDR values to test
a few homogeneous and heterogeneous configurations, using the 'maxcpus'
kernel argument to test the late-hotplug behaviour:

* On a system where all CPUs have the same MIDR, late-onlining a CPU
  causes it to be associated with a matching PMU:

  | # ls /sys/bus/event_source/devices/
  | armv8_pmuv3_0  breakpoint  software  tracepoint
  | # cat /sys/bus/event_source/devices/armv8_pmuv3_0/cpus
  | 0-7
  | # echo 1 > /sys/devices/system/cpu/cpu10/online
  | Detected PIPT I-cache on CPU10
  | GICv3: CPU10: found redistributor a region 0:0x00000000081e0000
  | GICv3: CPU10: using allocated LPI pending table @0x00000000402b0000
  | CPU10: Booted secondary processor 0x000000000a [0x431f0af1]
  | # ls /sys/bus/event_source/devices/
  | armv8_pmuv3_0  breakpoint  software  tracepoint
  | # cat /sys/bus/event_source/devices/armv8_pmuv3_0/cpus
  | 0-7,10

* On a system where all CPUs have a unique MIDR, each of the boot-time
  CPUs gets a unique PMU:

  | # ls /sys/bus/event_source/devices/
  | armv8_pmuv3_0  armv8_pmuv3_3  armv8_pmuv3_6  software
  | armv8_pmuv3_1  armv8_pmuv3_4  armv8_pmuv3_7  tracepoint
  | armv8_pmuv3_2  armv8_pmuv3_5  breakpoint

* On a system where all CPUs have a unique MIDR, late-onlining a CPU
  results in that CPU not being associated with a PMU, but the CPU is
  successfully onlined:

  | # echo 1 > /sys/devices/system/cpu/cpu8/online
  | Detected PIPT I-cache on CPU8
  | GICv3: CPU8: found redistributor 8 region 0:0x00000000081a0000
  | GICv3: CPU8: using allocated LPI pending table @0x0000000040290000
  | Unable to associate CPU8 with a PMU
  | CPU8: Booted secondary processor 0x0000000008 [0x431f0af1]

Thanks,
Mark.

Mark Rutland (3):
  arm_pmu: acpi: factor out PMU<->CPU association
  arm_pmu: factor out PMU matching
  arm_pmu: rework ACPI probing

 drivers/perf/arm_pmu.c       |  17 +-----
 drivers/perf/arm_pmu_acpi.c  | 113 ++++++++++++++++++++---------------
 include/linux/perf/arm_pmu.h |   1 -
 3 files changed, 69 insertions(+), 62 deletions(-)

-- 
2.30.2