From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8BA2A143895; Tue, 13 May 2025 20:50:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747169440; cv=none; b=QpaCIC5QZjYezGO1EWllsQ36R+AkT+6F3Pw0x0IDqVO91lzmRUDDtDO0ECdzEM3fijKDhjekkHnAUMgWDEkAAsPm2Zuyi14AzHAgJbnjLkh5ZsLIHeSgfUWCUoIGcfRZIeWnaId56AYBWbuAcg3DoW1j3Qga7IlHDAEJyugwEPA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747169440; c=relaxed/simple; bh=k24S/q0mljfkMX65LqmfPHCdtiBSQ90MpBoOiBvQ1gU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ZJf5CmwFRr5eU4WaS5tfgM74SdRDm/BcTmkGE1ZT+JbjRN01cfrlpcEIIQ/c7IozC+tQ4p0+KlPwBWcF1g5+629imC/Q3cQvDPHb+tjZE2UQweEN/971Po4ru7aiUiVWPmjR60sx/PfHsC3BogcbLhEoEdO9tjoFI/+KpZ9yjYk= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jV7KEtcD; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jV7KEtcD" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A4C76C4CEE4; Tue, 13 May 2025 20:50:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1747169440; bh=k24S/q0mljfkMX65LqmfPHCdtiBSQ90MpBoOiBvQ1gU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=jV7KEtcDUD8pAG+ZEpDgUU/G24NIcHzb3Omk1Po2t2JvLNSHX4EqurmTaLzY9APMF uYYHzE2ZeHQHabbwKMwCubMc81ZHwKS4N0+wLzq06sG0RHb5BpmOdhKJU8G93nPxex fZLk9F7if9gTaY78lbbjRU3oPd4H38h7j4qGS5LpDkLTsY61eAHB9yM4099O/e3QG0 +3l8P51W+GfsJG944oYAyjbLThST9fZl4nJplQo/JcwTIbG32OtC2twsLkEU7Iq7/s uUh+E5fFQ+hCGdD6GgCa9Fyqve4E7Lqra5alEWhnfic8lmo7EjLNX6s/bglEP1SPdm Vxd+uSISHocCA== Date: Tue, 13 May 2025 17:50:37 -0300 From: Arnaldo Carvalho de Melo To: Ian Rogers Cc: Gautam Menghani , namhyung@kernel.org, peterz@infradead.org, mingo@redhat.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, adrian.hunter@intel.com, kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, maddy@linux.ibm.com Subject: Re: [PATCH v2 4/4] perf python: Add counting.py as example for counting perf events Message-ID: References: <20250512055748.479786-1-gautam@linux.ibm.com> <20250512055748.479786-5-gautam@linux.ibm.com> Precedence: bulk X-Mailing-List: linux-perf-users@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: On Mon, May 12, 2025 at 12:38:23PM -0700, Ian Rogers wrote: > On Mon, May 12, 2025 at 10:49 AM Arnaldo Carvalho de Melo > wrote: > > > > On Mon, May 12, 2025 at 10:23:39AM -0700, Ian Rogers wrote: > > > On Sun, May 11, 2025 at 10:58 PM Gautam Menghani wrote: > > > > Add counting.py - a python version of counting.c to demonstrate > > > > measuring and reading of counts for given perf events. > > > > > > Signed-off-by: Gautam Menghani > > > > --- > > > > v1 -> v2: > > > > 1. Use existing iteration support instead of next > > > > 2. Read the counters on all cpus > > > > 3. Use existing helper functions > > > > > > > > tools/perf/python/counting.py | 34 ++++++++++++++++++++++++++++++++++ > > > > 1 file changed, 34 insertions(+) > > > > create mode 100755 tools/perf/python/counting.py > > > > > > diff --git a/tools/perf/python/counting.py b/tools/perf/python/counting.py > > > > new file mode 100755 > > > > index 000000000000..e535e3ae8bdf > > > > --- /dev/null > > > > +++ b/tools/perf/python/counting.py > > > > @@ -0,0 +1,34 @@ > > > > +#!/usr/bin/env python3 > > > > +# SPDX-License-Identifier: GPL-2.0 > > > > +# -*- python -*- > > > > +# -*- coding: utf-8 -*- > > > > + > > > > +import perf > > > > + > > > > +def main(): > > > > + cpus = perf.cpu_map() > > > > + thread_map = perf.thread_map(-1) > > > > + evlist = perf.parse_events("cpu-clock,task-clock", cpus, thread_map) > > > > > Thanks Gautam! I think this is really good. Perhaps the events could > > > be a command line option, but I can see why you want to keep this > > > similar to counting.c. > > > > > > + > > > > + for ev in evlist: > > > > + ev.read_format = perf.FORMAT_TOTAL_TIME_ENABLED | perf.FORMAT_TOTAL_TIME_RUNNING > > > > + > > > > + evlist.open() > > > > + evlist.enable() > > > > + > > > > + count = 100000 > > > > + while count > 0: > > > > + count -= 1 > > > > + > > > > + evlist.disable() > > > > + > > > > + for evsel in evlist: > > > > + for cpu in cpus: > > > > + for thread in range(len(thread_map)): > > > > > I kind of wish, for the reason of being intention revealing, this could just be: > > > > > for thread in thread_map: > > > > > I can see the problem though, the counts lack the thread_map and the > > > thread_map is needed to turn a thread back into an index. Perhaps when > > > the python counts is created we hold onto the evsel so that this is > > > possible. I also suspect that in the code: > > > > > for cpu in cpus: > > > > > The CPU number is being used rather than its index, which is a similar > > > story/problem. > > > > Lemme see the rest of this code... > > > > +static PyObject *pyrf_evsel__read(struct pyrf_evsel *pevsel, > > + PyObject *args, PyObject *kwargs) > > +{ > > + struct evsel *evsel = &pevsel->evsel; > > + int cpu_map_idx = 0, thread = 0; > > + struct perf_counts_values counts; > > + struct pyrf_counts_values *count_values = PyObject_New(struct pyrf_counts_values, > > + &pyrf_counts_values__type); > > + > > + if (!PyArg_ParseTuple(args, "ii", &cpu_map_idx, &thread)) > > + return NULL; > > + > > + perf_evsel__read(&(evsel->core), cpu_map_idx, thread, &counts); > > + count_values->values = counts; > > + return (PyObject *)count_values; > > +} > > > > Yeah, it is expecting the cpu_map_idx but the cpu number is being used, > > that is a bug. > > > > The way perf_evsel__read() is implemented: > > > > int perf_evsel__read(struct perf_evsel *evsel, int cpu_map_idx, int thread, > > struct perf_counts_values *count) > > > > It expects a cpu_map index, not a cpu and then a thread that in its > > prototype seems to imply its not an index? But it is an index as it ends > > up being the 'y' for: > > > > xyarray__entry(_evsel->mmap, _cpu_map_idx, _thread)) > > > > :-/ > > Yeah. In the C code we've pretty much committed to notions of cpu map > index and CPU. We're more ambiguous with threads, but generally thread > is actually thread index into the thread map. As you say it is for the > xyarray so that we can densely pack things by index rather than having > huge gaps, say between PIDs. For the python we don't have to have a > 1:1 mapping with the C code, so I was wondering if we could just > remove the notions of index and have them be implementation details? Agreed, even in the C case I find it confusing to sometimes deal with indexes and sometimes with real thread/cpu numbers, but if we try and at least keep the variables/parameter naming reflecting that, then it should be bearable. > This would lead to an unfortunate O(log n) translation between > thread/CPU and index (perf_cpu_map__idx) at the API boundary in > python.c. Maybe with some more thinking we can get something better? But I don't have the bw now to think about it. > > So probably its best to do it using indexes and when needing to know the > > pid or cpu number then use some helper to get the entry at the given > > entry? At least for the perf_evsel__read() API that seems to be the > > case, right? > > > Arnaldo, could you give some input on what to do wrt indices, threads > > > and CPUs at the API level? Perhaps we need a refactor and objects for > > > perf CPU and perf thread, similar to the use of struct perf_cpu in the > > > C code. The original API all pre-dates that change. The issue is that > > > changing the API could break existing scripts and we can only fix > > > those that ship with perf. > > So it was more for catching new/dead threads without having to process /proc. > Yeah. I think the sampling API is okay. The nice thing with Gautum's > patches is adding support for counters for use-cases like perf stat. Right, I like the effort he is making into having perf more usable in python, and I encourage him to think about the issues you raised so that we can come to some good abstractions. - Arnaldo