From: Robin Murphy
To: will@kernel.org
Cc: mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org, linux-perf-users@vger.kernel.org, zeev@amazon.com, blakgeof@amazon.com, avivb@amazon.com, ilkka@os.amperecomputing.com
Subject: [PATCH v6 1/2] perf/arm-cmn: Move DTM index data out of hw_perf_event
Date: Tue, 30 Jun 2026 14:01:17 +0100
Message-ID: <4be7785df09ec798173400c09ded197e246e2c4a.1782824005.git.robin.murphy@arm.com>

The amount of data we need to store all the per-DTM counter and
watchpoint allocations is already testing the limits of hw_perf_event,
and future CMNs are only likely to keep growing larger, so move these
arrays out to separate memory allocations. As part of that we can use an
explicit union for allocating cycle counters to dtc_cycles events, which
is arguably nicer anyway.

Reviewed-by: Ilkka Koskinen
Signed-off-by: Robin Murphy
---
v6: Use bitmap functions consistently for cleaner allocation/zeroing

 drivers/perf/arm-cmn.c | 91 ++++++++++++++++++++++++++++--------------
 1 file changed, 61 insertions(+), 30 deletions(-)

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index 6e5cc4086a9e..9392838408ff 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -598,17 +598,14 @@ static void arm_cmn_debugfs_init(struct arm_cmn *cmn, int id) {}
 
 struct arm_cmn_hw_event {
 	struct arm_cmn_node *dn;
-	u64 dtm_idx[DIV_ROUND_UP(CMN_MAX_NODES_PER_EVENT * 2, 64)];
+	union {
+		unsigned long *dtm_idx;
+		int cc_idx;
+	};
+	unsigned long *wp_idx;
 	s8 dtc_idx[CMN_MAX_DTCS];
 	u8 num_dns;
 	u8 dtm_offset;
-
-	/*
-	 * WP config registers are divided to UP and DOWN events. We need to
-	 * keep to track only one of them.
-	 */
-	DECLARE_BITMAP(wp_idx, CMN_MAX_XPS);
-
 	bool wide_sel;
 	enum cmn_filter_select filter_sel;
 };
@@ -626,25 +623,44 @@ static struct arm_cmn_hw_event *to_cmn_hw(struct perf_event *event)
 	return (struct arm_cmn_hw_event *)&event->hw;
 }
 
-static void arm_cmn_set_index(u64 x[], unsigned int pos, unsigned int val)
+#define BPL2 (BITS_PER_LONG / 2)
+
+static void arm_cmn_set_dtm_idx(struct arm_cmn_hw_event *hw, unsigned int pos, unsigned int val)
 {
-	x[pos / 32] |= (u64)val << ((pos % 32) * 2);
+	hw->dtm_idx[pos / BPL2] |= (unsigned long)val << ((pos % BPL2) * 2);
 }
 
-static unsigned int arm_cmn_get_index(u64 x[], unsigned int pos)
+static unsigned int arm_cmn_get_dtm_idx(struct arm_cmn_hw_event *hw, unsigned int pos)
 {
-	return (x[pos / 32] >> ((pos % 32) * 2)) & 3;
+	return (hw->dtm_idx[pos / BPL2] >> ((pos % BPL2) * 2)) & 3;
 }
 
-static void arm_cmn_set_wp_idx(unsigned long *wp_idx, unsigned int pos, bool val)
+static unsigned long *arm_cmn_alloc_dtm_idx(void)
+{
+	return bitmap_zalloc(CMN_MAX_NODES_PER_EVENT * 2, GFP_KERNEL);
+}
+
+static void arm_cmn_set_wp_idx(struct arm_cmn_hw_event *hw, unsigned int pos, bool val)
 {
 	if (val)
-		set_bit(pos, wp_idx);
+		set_bit(pos, hw->wp_idx);
 }
 
-static unsigned int arm_cmn_get_wp_idx(unsigned long *wp_idx, unsigned int pos)
+static unsigned int arm_cmn_get_wp_idx(struct arm_cmn_hw_event *hw, unsigned int pos)
 {
-	return test_bit(pos, wp_idx);
+	return test_bit(pos, hw->wp_idx);
+}
+
+static unsigned long *arm_cmn_alloc_wp_idx(void)
+{
+	return bitmap_zalloc(CMN_MAX_XPS, GFP_KERNEL);
+}
+
+static void arm_cmn_clear_idx(struct arm_cmn_hw_event *hw)
+{
+	bitmap_zero(hw->dtm_idx, CMN_MAX_NODES_PER_EVENT * 2);
+	if (hw->wp_idx)
+		bitmap_zero(hw->wp_idx, CMN_MAX_XPS);
 }
 
 struct arm_cmn_event_attr {
@@ -1377,7 +1393,7 @@ static int arm_cmn_get_assigned_wp_idx(struct perf_event *event,
 				       struct arm_cmn_hw_event *hw,
 				       unsigned int pos)
 {
-	return CMN_EVENT_EVENTID(event) + arm_cmn_get_wp_idx(hw->wp_idx, pos);
+	return CMN_EVENT_EVENTID(event) + arm_cmn_get_wp_idx(hw, pos);
 }
 
 static void arm_cmn_claim_wp_idx(struct arm_cmn_dtm *dtm,
@@ -1388,7 +1404,7 @@ static void arm_cmn_claim_wp_idx(struct arm_cmn_dtm *dtm,
 	struct arm_cmn_hw_event *hw = to_cmn_hw(event);
 
 	dtm->wp_event[wp_idx] = hw->dtc_idx[dtc];
-	arm_cmn_set_wp_idx(hw->wp_idx, pos, wp_idx - CMN_EVENT_EVENTID(event));
+	arm_cmn_set_wp_idx(hw, pos, wp_idx - CMN_EVENT_EVENTID(event));
 }
 
 static u32 arm_cmn_wp_config(struct perf_event *event, int wp_idx)
@@ -1459,7 +1475,7 @@ static u64 arm_cmn_read_dtm(struct arm_cmn *cmn, struct arm_cmn_hw_event *hw,
 			dtm = &cmn->dtms[dn->dtm] + hw->dtm_offset;
 			reg = readq_relaxed(dtm->base + offset);
 		}
-		dtm_idx = arm_cmn_get_index(hw->dtm_idx, i);
+		dtm_idx = arm_cmn_get_dtm_idx(hw, i);
 		count += (u16)(reg >> (dtm_idx * 16));
 	}
 	return count;
@@ -1506,7 +1522,7 @@ static void arm_cmn_event_read(struct perf_event *event)
 	unsigned long flags;
 
 	if (CMN_EVENT_TYPE(event) == CMN_TYPE_DTC) {
-		delta = arm_cmn_read_cc(cmn->dtc + hw->dtc_idx[0]);
+		delta = arm_cmn_read_cc(cmn->dtc + hw->cc_idx);
 		local64_add(delta, &event->count);
 		return;
 	}
@@ -1573,7 +1589,7 @@ static void arm_cmn_event_start(struct perf_event *event, int flags)
 	int i;
 
 	if (type == CMN_TYPE_DTC) {
-		struct arm_cmn_dtc *dtc = cmn->dtc + hw->dtc_idx[0];
+		struct arm_cmn_dtc *dtc = cmn->dtc + hw->cc_idx;
 
 		writel_relaxed(CMN_DT_DTC_CTL_DT_EN | CMN_DT_DTC_CTL_CG_DISABLE,
 			       dtc->base + CMN_DT_DTC_CTL);
@@ -1591,7 +1607,7 @@ static void arm_cmn_event_start(struct perf_event *event, int flags)
 			writeq_relaxed(mask, base + CMN_DTM_WPn_MASK(wp_idx));
 		}
 	} else for_each_hw_dn(hw, dn, i) {
-		int dtm_idx = arm_cmn_get_index(hw->dtm_idx, i);
+		int dtm_idx = arm_cmn_get_dtm_idx(hw, i);
 
 		arm_cmn_set_event_sel_lo(dn, dtm_idx, CMN_EVENT_EVENTID(event),
 					 hw->wide_sel);
@@ -1607,7 +1623,7 @@ static void arm_cmn_event_stop(struct perf_event *event, int flags)
 	int i;
 
 	if (type == CMN_TYPE_DTC) {
-		struct arm_cmn_dtc *dtc = cmn->dtc + hw->dtc_idx[0];
+		struct arm_cmn_dtc *dtc = cmn->dtc + hw->cc_idx;
 
 		dtc->cc_active = false;
 		writel_relaxed(CMN_DT_DTC_CTL_DT_EN, dtc->base + CMN_DT_DTC_CTL);
@@ -1620,7 +1636,7 @@ static void arm_cmn_event_stop(struct perf_event *event, int flags)
 			writeq_relaxed(~0ULL, base + CMN_DTM_WPn_VAL(wp_idx));
 		}
 	} else for_each_hw_dn(hw, dn, i) {
-		int dtm_idx = arm_cmn_get_index(hw->dtm_idx, i);
+		int dtm_idx = arm_cmn_get_dtm_idx(hw, i);
 
 		arm_cmn_set_event_sel_lo(dn, dtm_idx, 0, hw->wide_sel);
 	}
@@ -1764,6 +1780,14 @@ static enum cmn_filter_select arm_cmn_filter_sel(const struct arm_cmn *cmn,
 }
 
 
+static void arm_cmn_event_destroy(struct perf_event *event)
+{
+	struct arm_cmn_hw_event *hw = to_cmn_hw(event);
+
+	bitmap_free(hw->dtm_idx);
+	bitmap_free(hw->wp_idx);
+}
+
 static int arm_cmn_event_init(struct perf_event *event)
 {
 	struct arm_cmn *cmn = to_cmn(event->pmu);
@@ -1788,6 +1812,11 @@ static int arm_cmn_event_init(struct perf_event *event)
 	if (type == CMN_TYPE_DTC)
 		return arm_cmn_validate_group(cmn, event);
 
+	event->destroy = arm_cmn_event_destroy;
+	hw->dtm_idx = arm_cmn_alloc_dtm_idx();
+	if (!hw->dtm_idx)
+		return -ENOMEM;
+
 	eventid = CMN_EVENT_EVENTID(event);
 	/* For watchpoints we need the actual XP node here */
 	if (type == CMN_TYPE_WP) {
@@ -1798,6 +1827,9 @@ static int arm_cmn_event_init(struct perf_event *event)
 		/* ...but the DTM may depend on which port we're watching */
 		if (cmn->multi_dtm)
 			hw->dtm_offset = CMN_EVENT_WP_DEV_SEL(event) / 2;
+		hw->wp_idx = arm_cmn_alloc_wp_idx();
+		if (!hw->wp_idx)
+			return -ENOMEM;
 	} else if (type == CMN_TYPE_XP &&
 		   (cmn->part == PART_CMN700 || cmn->part == PART_CMN_S3)) {
 		hw->wide_sel = true;
@@ -1848,7 +1880,7 @@ static void arm_cmn_event_clear(struct arm_cmn *cmn, struct perf_event *event,
 
 	while (i--) {
 		struct arm_cmn_dtm *dtm = &cmn->dtms[hw->dn[i].dtm] + hw->dtm_offset;
-		unsigned int dtm_idx = arm_cmn_get_index(hw->dtm_idx, i);
+		unsigned int dtm_idx = arm_cmn_get_dtm_idx(hw, i);
 
 		if (type == CMN_TYPE_WP) {
 			int wp_idx = arm_cmn_get_assigned_wp_idx(event, hw, i);
@@ -1862,8 +1894,7 @@ static void arm_cmn_event_clear(struct arm_cmn *cmn, struct perf_event *event,
 		dtm->pmu_config_low &= ~CMN__PMEVCNT_PAIRED(dtm_idx);
 		writel_relaxed(dtm->pmu_config_low, dtm->base + CMN_DTM_PMU_CONFIG);
 	}
-	memset(hw->dtm_idx, 0, sizeof(hw->dtm_idx));
-	memset(hw->wp_idx, 0, sizeof(hw->wp_idx));
+	arm_cmn_clear_idx(hw);
 
 	for_each_hw_dtc_idx(hw, j, idx)
 		cmn->dtc[j].counters[idx] = NULL;
@@ -1883,7 +1914,7 @@ static int arm_cmn_event_add(struct perf_event *event, int flags)
 			return -ENOSPC;
 
 		cmn->dtc[i].cycles = event;
-		hw->dtc_idx[0] = i;
+		hw->cc_idx = i;
 
 		if (flags & PERF_EF_START)
 			arm_cmn_event_start(event, 0);
@@ -1948,7 +1979,7 @@ static int arm_cmn_event_add(struct perf_event *event, int flags)
 				goto free_dtms;
 		}
 
-		arm_cmn_set_index(hw->dtm_idx, i, dtm_idx);
+		arm_cmn_set_dtm_idx(hw, i, dtm_idx);
 
 		dtm->input_sel[dtm_idx] = input_sel;
 		shift = CMN__PMEVCNTn_GLOBAL_NUM_SHIFT(dtm_idx);
@@ -1981,7 +2012,7 @@ static void arm_cmn_event_del(struct perf_event *event, int flags)
 
 	arm_cmn_event_stop(event, PERF_EF_UPDATE);
 
 	if (type == CMN_TYPE_DTC)
-		cmn->dtc[hw->dtc_idx[0]].cycles = NULL;
+		cmn->dtc[hw->cc_idx].cycles = NULL;
 	else
 		arm_cmn_event_clear(cmn, event, hw->num_dns);
 }
-- 
2.54.0.dirty
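
For anyone reading along, the 2-bit packing that arm_cmn_set_dtm_idx() and arm_cmn_get_dtm_idx() perform on a heap-allocated unsigned long array can be tried out in userspace. What follows is only a rough sketch under stated assumptions, not the driver code: idx_alloc() is a hypothetical calloc()-based stand-in for bitmap_zalloc(), MAX_NODES is a made-up value standing in for CMN_MAX_NODES_PER_EVENT, and idx_demo() is purely illustrative.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Userspace equivalents of the kernel macros used by the patch. */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BPL2		(BITS_PER_LONG / 2)	/* 2-bit fields per word */

/* Hypothetical stand-in for CMN_MAX_NODES_PER_EVENT. */
#define MAX_NODES	256

/* Hypothetical stand-in for bitmap_zalloc(): zeroed storage for nbits bits. */
static unsigned long *idx_alloc(size_t nbits)
{
	size_t nlongs = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;

	return calloc(nlongs, sizeof(unsigned long));
}

/* OR a 2-bit value into slot pos; like the driver, assumes the slot is zero. */
static void set_idx(unsigned long *map, unsigned int pos, unsigned int val)
{
	map[pos / BPL2] |= (unsigned long)val << ((pos % BPL2) * 2);
}

/* Read back the 2-bit value stored in slot pos. */
static unsigned int get_idx(const unsigned long *map, unsigned int pos)
{
	return (map[pos / BPL2] >> ((pos % BPL2) * 2)) & 3;
}

/* Round-trip check across word boundaries; returns 0 on success. */
int idx_demo(void)
{
	unsigned long *map = idx_alloc(MAX_NODES * 2);

	if (!map)
		return -1;

	set_idx(map, 0, 3);		/* first slot of word 0 */
	set_idx(map, BPL2, 1);		/* first slot of word 1 */
	set_idx(map, MAX_NODES - 1, 2);	/* very last slot */

	assert(get_idx(map, 0) == 3);
	assert(get_idx(map, 1) == 0);	/* neighbouring slot untouched */
	assert(get_idx(map, BPL2) == 1);
	assert(get_idx(map, MAX_NODES - 1) == 2);

	free(map);
	return 0;
}
```

Calling idx_demo() from a small main() exercises the round trip. Deriving the slot count per word from BITS_PER_LONG rather than hardcoding 32 is what lets the same arithmetic work whether unsigned long is 32 or 64 bits wide.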