Linux Perf Users
 help / color / mirror / Atom feed
* [PATCH 00/14] perf c2c: add a function view
@ 2026-06-26  7:03 Jiebin Sun
  2026-06-26  7:03 ` [PATCH 01/14] perf c2c: extract shared data structures into c2c.h Jiebin Sun
                   ` (13 more replies)
  0 siblings, 14 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

This series adds a new "function view" to perf c2c report, on top of the
existing cacheline view. It does not change the cacheline view; it adds a
second, complementary way to look at the same cache-to-cache (C2C) data.

Motivation
==========

The existing perf c2c report is organized by cacheline: it lists the most
contended cachelines and the symbols touching each one. That answers
"which cachelines are hot", but two common follow-up questions are harder
to answer directly:

  - Which functions are contending with each other?
  - For a given function, which other functions does it share cachelines
    with, and over which specific cachelines?

The function view reorganizes the same data around functions to answer
these directly, which is useful when chasing false sharing and cross-core
contention at the source-function level.

What it does
============

In the perf c2c TUI, press TAB in the cacheline view to switch to the
function view. It presents a 3-level hierarchy:

  Level 1: primary functions, sorted by Cycles % (estimated load cycles:
           HITM, peer-snoop and other-load cycles -- on systems whose
           default display mode is peer, such as Arm64, the peer-snoop
           component dominates)
  Level 2: other functions that share cachelines with the level-1 function
  Level 3: the specific shared cachelines for each function pair

Keys in the function view:
  TAB/ESC/q   return to the cacheline view
  d           show cacheline details for the selected entry
  e / +       expand / collapse the selected entry
  ?           help

The cacheline view and the --stdio output are unchanged.

Example
=======

A level-1 function is expanded (press 'e') to reveal the functions it
shares cachelines with, and one of those is expanded again to reveal the
specific shared cachelines:

  Shared Data Functions Table   (27 entries, sorted on Cycles %)
   Cycles  Store
        %  count  Code address        Symbol           Cacheline
  ----------------------------------------------------------------------
  - 39.03%   541  - 0xffffffffa2fc5b08  - [k] cpupri_set
             450  - 0xffffffffa2fa28a5  - [k] pull_rt_task
             450                                      0xff2d0082809da080

Reading the three levels:

  - Level 1: cpupri_set is the top contended function, accounting for
    39.03% of the estimated load cycles. The table is sorted by this
    Cycles % column.
  - Level 2: expanding cpupri_set (press 'e') lists the functions it
    shares cachelines with, sorted by store count. Here pull_rt_task is
    the contending function, with 450 stores into the shared data.
  - Level 3: expanding pull_rt_task lists the specific cachelines the two
    functions contend over -- in this case the single cacheline at
    0xff2d0082809da080.

The view reads top-down as "cpupri_set is hottest; it shares data with
pull_rt_task; the contention is on cacheline ...da080" -- the false-
sharing chain that the cacheline view otherwise makes you reconstruct by
hand.

Implementation
==============

The function view is built as a separate hist_browser in
tools/perf/ui/browsers/c2c-function.c. Shared types and helpers used by
both views are factored out of builtin-c2c.c into a new c2c.h. The
hierarchy is constructed from the existing cacheline histograms into a
dedicated set of hists, keyed by (symbol, instruction address), and
rendered with custom column formatters.

The series is split into 14 small, self-contained patches so each step
can be reviewed and builds on its own.

Testing
=======

  - Each of the 14 commits builds individually and as a full series.
  - perf c2c report --stdio (cacheline view) output is unchanged versus
    the baseline: identical trace-event totals, shared-cacheline counts,
    and HITM tallies.
  - The function view was exercised on c2c recordings; the level-1
    ordering and the level-2/3 sharing breakdown match the underlying
    cacheline data.

Jiebin Sun (14):
  perf c2c: extract shared data structures into c2c.h
  perf c2c: add function view browser skeleton
  perf c2c: add function view type definitions and helpers
  perf c2c: add column format infrastructure for function view
  perf c2c: add column entry functions for function view
  perf c2c: add comparison functions for function view sorting
  perf c2c: add dimension definitions and format creation
  perf c2c: add HPP list parsing for function view histograms
  perf c2c: add stats merging and memory management helpers
  perf c2c: add hierarchy entry creation and lookup functions
  perf c2c: add function view hierarchy builder
  perf c2c: add function view browser UI
  perf c2c: add TAB key to switch to function view
  perf c2c: document function view in perf-c2c man page

 tools/perf/Documentation/perf-c2c.txt |   15 +
 tools/perf/builtin-c2c.c              |  128 +-
 tools/perf/c2c.h                      |  140 +++
 tools/perf/ui/browsers/Build          |    1 +
 tools/perf/ui/browsers/c2c-function.c | 1547 +++++++++++++++++++++++++
 5 files changed, 1713 insertions(+), 118 deletions(-)
 create mode 100644 tools/perf/c2c.h
 create mode 100644 tools/perf/ui/browsers/c2c-function.c

base-commit: 40db90ac9f66c8246c1746c56d397283d161655c
--
2.52.0

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 01/14] perf c2c: extract shared data structures into c2c.h
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:13   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 02/14] perf c2c: add function view browser skeleton Jiebin Sun
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Move c2c_hists, compute_stats, c2c_hist_entry, and perf_c2c structure
definitions from builtin-c2c.c into a new shared header c2c.h. This
allows the upcoming function view browser (c2c-function.c) to reuse these
types.

Make the global perf_c2c instance 'c2c' non-static and export
perf_c2c__browse_cacheline() so they can be accessed from the new
function view module.

No functional change.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/builtin-c2c.c | 124 ++--------------------------------
 tools/perf/c2c.h         | 139 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+), 118 deletions(-)
 create mode 100644 tools/perf/c2c.h

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c9584dbedf77..33271554e354 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -38,6 +38,7 @@
 #include "mem-events.h"
 #include "mem-info.h"
 #include "mem2node.h"
+#include "c2c.h"
 #include "pmus.h"
 #include "session.h"
 #include "sort.h"
@@ -52,75 +53,9 @@
 #include "util/symbol.h"
 #include "util/util.h"
 
-struct c2c_hists {
-	struct hists		hists;
-	struct perf_hpp_list	list;
-	struct c2c_stats	stats;
-};
-
-struct compute_stats {
-	struct stats		 lcl_hitm;
-	struct stats		 rmt_hitm;
-	struct stats		 lcl_peer;
-	struct stats		 rmt_peer;
-	struct stats		 load;
-};
-
-struct c2c_hist_entry {
-	struct c2c_hists	*hists;
-	struct evsel		*evsel;
-	struct c2c_stats	 stats;
-	unsigned long		*cpuset;
-	unsigned long		*nodeset;
-	struct c2c_stats	*node_stats;
-	unsigned int		 cacheline_idx;
-
-	struct compute_stats	 cstats;
-
-	unsigned long		 paddr;
-	unsigned long		 paddr_cnt;
-	bool			 paddr_zero;
-	char			*nodestr;
-
-	/*
-	 * must be at the end,
-	 * because of its callchain dynamic entry
-	 */
-	struct hist_entry	he;
-};
 
 static char const *coalesce_default = "iaddr";
 
-struct perf_c2c {
-	struct perf_tool	tool;
-	struct c2c_hists	hists;
-	struct mem2node		mem2node;
-
-	unsigned long		**nodes;
-	int			 nodes_cnt;
-	int			 cpus_cnt;
-	int			*cpu2node;
-	int			 node_info;
-
-	bool			 show_src;
-	bool			 show_all;
-	bool			 use_stdio;
-	bool			 stats_only;
-	bool			 symbol_full;
-	bool			 stitch_lbr;
-
-	/* Shared cache line stats */
-	struct c2c_stats	shared_clines_stats;
-	int			shared_clines;
-
-	int			 display;
-
-	const char		*coalesce;
-	char			*cl_sort;
-	char			*cl_resort;
-	char			*cl_output;
-};
-
 enum {
 	DISPLAY_LCL_HITM,
 	DISPLAY_RMT_HITM,
@@ -141,9 +76,9 @@ static const struct option c2c_options[] = {
 	OPT_END()
 };
 
-static struct perf_c2c c2c;
+struct perf_c2c c2c;
 
-static void *c2c_he_zalloc(size_t size)
+void *c2c_he_zalloc(size_t size)
 {
 	struct c2c_hist_entry *c2c_he;
 
@@ -458,36 +393,6 @@ static const char * const __usage_report[] = {
 
 static const char * const *report_c2c_usage = __usage_report;
 
-#define C2C_HEADER_MAX 2
-
-struct c2c_header {
-	struct {
-		const char *text;
-		int	    span;
-	} line[C2C_HEADER_MAX];
-};
-
-struct c2c_dimension {
-	struct c2c_header	 header;
-	const char		*name;
-	int			 width;
-	struct sort_entry	*se;
-
-	int64_t (*cmp)(struct perf_hpp_fmt *fmt,
-		       struct hist_entry *, struct hist_entry *);
-	int   (*entry)(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
-		       struct hist_entry *he);
-	int   (*color)(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
-		       struct hist_entry *he);
-};
-
-struct c2c_fmt {
-	struct perf_hpp_fmt	 fmt;
-	struct c2c_dimension	*dim;
-};
-
-#define SYMBOL_WIDTH 30
-
 static struct c2c_dimension dim_symbol;
 static struct c2c_dimension dim_srcline;
 
@@ -1389,23 +1294,6 @@ cl_idx_empty_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	return scnprintf(hpp->buf, hpp->size, "%*s", width, "");
 }
 
-#define HEADER_LOW(__h)			\
-	{				\
-		.line[1] = {		\
-			.text = __h,	\
-		},			\
-	}
-
-#define HEADER_BOTH(__h0, __h1)		\
-	{				\
-		.line[0] = {		\
-			.text = __h0,	\
-		},			\
-		.line[1] = {		\
-			.text = __h1,	\
-		},			\
-	}
-
 #define HEADER_SPAN(__h0, __h1, __s)	\
 	{				\
 		.line[0] = {		\
@@ -1928,7 +1816,7 @@ static struct c2c_dimension *dimensions[] = {
 	NULL,
 };
 
-static void fmt_free(struct perf_hpp_fmt *fmt)
+void fmt_free(struct perf_hpp_fmt *fmt)
 {
 	struct c2c_fmt *c2c_fmt;
 
@@ -1936,7 +1824,7 @@ static void fmt_free(struct perf_hpp_fmt *fmt)
 	free(c2c_fmt);
 }
 
-static bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b)
+bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b)
 {
 	struct c2c_fmt *c2c_a = container_of(a, struct c2c_fmt, fmt);
 	struct c2c_fmt *c2c_b = container_of(b, struct c2c_fmt, fmt);
@@ -2710,7 +2598,7 @@ c2c_cacheline_browser__new(struct hists *hists, struct hist_entry *he)
 	return browser;
 }
 
-static int perf_c2c__browse_cacheline(struct hist_entry *he)
+int perf_c2c__browse_cacheline(struct hist_entry *he)
 {
 	struct c2c_hist_entry *c2c_he;
 	struct c2c_hists *c2c_hists;
diff --git a/tools/perf/c2c.h b/tools/perf/c2c.h
new file mode 100644
index 000000000000..3e974dd1d7ee
--- /dev/null
+++ b/tools/perf/c2c.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PERF_C2C_H_
+#define _PERF_C2C_H_ 1
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <linux/types.h>
+#include "util/stat.h"
+#include "util/hist.h"
+#include "util/mem-events.h"
+#include "util/mem2node.h"
+#include "util/tool.h"
+
+struct sort_entry;
+
+struct compute_stats {
+	struct stats		 lcl_hitm;
+	struct stats		 rmt_hitm;
+	struct stats		 lcl_peer;
+	struct stats		 rmt_peer;
+	struct stats		 load;
+};
+
+struct c2c_hists {
+	struct hists		hists;
+	struct perf_hpp_list	list;
+	struct c2c_stats	stats;
+};
+
+struct c2c_hist_entry {
+	struct c2c_hists	*hists;
+	struct evsel		*evsel;
+	struct c2c_stats	 stats;
+	unsigned long		*cpuset;
+	unsigned long		*nodeset;
+	struct c2c_stats	*node_stats;
+	unsigned int		 cacheline_idx;
+
+	struct compute_stats	 cstats;
+
+	unsigned long		 paddr;
+	unsigned long		 paddr_cnt;
+	bool			 paddr_zero;
+	char			*nodestr;
+
+	/*
+	 * must be at the end,
+	 * because of its callchain dynamic entry
+	 */
+	struct hist_entry	he;
+};
+
+struct perf_c2c {
+	struct perf_tool	tool;
+	struct c2c_hists	hists;
+	struct mem2node		mem2node;
+
+	unsigned long		**nodes;
+	int			 nodes_cnt;
+	int			 cpus_cnt;
+	int			*cpu2node;
+	int			 node_info;
+
+	bool			 show_src;
+	bool			 show_all;
+	bool			 use_stdio;
+	bool			 stats_only;
+	bool			 symbol_full;
+	bool			 stitch_lbr;
+
+	/* Shared cache line stats */
+	struct c2c_stats	shared_clines_stats;
+	int			shared_clines;
+
+	int			 display;
+
+	const char		*coalesce;
+	char			*cl_sort;
+	char			*cl_resort;
+	char			*cl_output;
+};
+
+extern struct perf_c2c c2c;
+
+#define C2C_HEADER_MAX 2
+#define SYMBOL_WIDTH 30
+
+#define HEADER_LOW(__h)			\
+	{				\
+		.line[1] = {		\
+			.text = __h,	\
+		},			\
+	}
+
+#define HEADER_BOTH(__h0, __h1)		\
+	{				\
+		.line[0] = {		\
+			.text = __h0,	\
+		},			\
+		.line[1] = {		\
+			.text = __h1,	\
+		},			\
+	}
+
+struct c2c_header {
+	struct {
+		const char *text;
+		int	    span;
+	} line[C2C_HEADER_MAX];
+};
+
+struct c2c_dimension {
+	struct c2c_header	 header;
+	const char		*name;
+	int			 width;
+	struct sort_entry	*se;
+
+	int64_t	(*cmp)(struct perf_hpp_fmt *fmt,
+		       struct hist_entry *left, struct hist_entry *right);
+	int	(*entry)(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			 struct hist_entry *he);
+	int	(*color)(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+			 struct hist_entry *he);
+};
+
+struct c2c_fmt {
+	struct perf_hpp_fmt	 fmt;
+	struct c2c_dimension	*dim;
+};
+
+void *c2c_he_zalloc(size_t size);
+void fmt_free(struct perf_hpp_fmt *fmt);
+bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b);
+
+#ifdef HAVE_SLANG_SUPPORT
+int perf_c2c__browse_cacheline(struct hist_entry *he);
+#endif
+
+#endif /* _PERF_C2C_H_ */
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 02/14] perf c2c: add function view browser skeleton
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
  2026-06-26  7:03 ` [PATCH 01/14] perf c2c: extract shared data structures into c2c.h Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:11   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 03/14] perf c2c: add function view type definitions and helpers Jiebin Sun
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add c2c-function.c with a stub perf_c2c__browse_function_view() entry
point and register it in the build system. Declare the function
prototype in c2c.h.

This lays the groundwork for the function-level cacheline sharing
analysis browser that will be implemented in subsequent patches.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/c2c.h                      |  1 +
 tools/perf/ui/browsers/Build          |  1 +
 tools/perf/ui/browsers/c2c-function.c | 44 +++++++++++++++++++++++++++
 3 files changed, 46 insertions(+)
 create mode 100644 tools/perf/ui/browsers/c2c-function.c

diff --git a/tools/perf/c2c.h b/tools/perf/c2c.h
index 3e974dd1d7ee..385709536a1a 100644
--- a/tools/perf/c2c.h
+++ b/tools/perf/c2c.h
@@ -134,6 +134,7 @@ bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b);
 
 #ifdef HAVE_SLANG_SUPPORT
 int perf_c2c__browse_cacheline(struct hist_entry *he);
+int perf_c2c__browse_function_view(struct hists *hists);
 #endif
 
 #endif /* _PERF_C2C_H_ */
diff --git a/tools/perf/ui/browsers/Build b/tools/perf/ui/browsers/Build
index a07489e44765..ae67a2161f7d 100644
--- a/tools/perf/ui/browsers/Build
+++ b/tools/perf/ui/browsers/Build
@@ -5,3 +5,4 @@ perf-ui-y += map.o
 perf-ui-y += scripts.o
 perf-ui-y += header.o
 perf-ui-y += res_sample.o
+perf-ui-y += c2c-function.o
diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
new file mode 100644
index 000000000000..040266288be3
--- /dev/null
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * C2C Function Browser - function-level cacheline sharing analysis
+ *
+ * Planned UI: 3-level hierarchy showing which functions share cachelines (not implemented yet):
+ *   Level 1: Primary functions sorted by Cycles % (estimated load cycles)
+ *   Level 2: Other functions sharing cachelines with the level-1 function
+ *   Level 3: Specific shared cachelines between each pair of functions
+ *
+ * Uses c2c_hist_entry->hists to build the hierarchy without adding any
+ * per-entry state to the existing c2c data structures.
+ */
+
+#include <errno.h>
+#include <inttypes.h>
+#include <stdlib.h>
+#include <string.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/zalloc.h>
+
+#include "../browser.h"
+#include "../helpline.h"
+#include "../keysyms.h"
+#include "../libslang.h"
+#include "../ui.h"
+#include "../../util/addr_location.h"
+#include "../../util/cacheline.h"
+#include "../../util/debug.h"
+#include "../../util/hist.h"
+#include "../../util/map.h"
+#include "../../util/mem-events.h"
+#include "../../util/mem-info.h"
+#include "../../util/sort.h"
+#include "../../util/symbol.h"
+#include "../../util/thread.h"
+#include "../../c2c.h"
+#include "hists.h"
+
+int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
+{
+	ui__warning("C2C function view is not implemented yet.\n");
+	return -ENOSYS;
+}
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 03/14] perf c2c: add function view type definitions and helpers
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
  2026-06-26  7:03 ` [PATCH 01/14] perf c2c: extract shared data structures into c2c.h Jiebin Sun
  2026-06-26  7:03 ` [PATCH 02/14] perf c2c: add function view browser skeleton Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:14   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 04/14] perf c2c: add column format infrastructure for function view Jiebin Sun
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add the foundational type definitions and helper functions for the
function view browser:

- perf_c2c_ext: extended context holding function-grouped histograms
- c2c_function_browser: browser state wrapping hist_browser
- symbol_name_equal(): symbol comparison by name
- hist_entry__iaddr(): instruction address extraction helper
- c2c_hitm_count(): inline accessor for a c2c_stats HITM total

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 29 +++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index 040266288be3..dbc912a4a242 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -37,6 +37,35 @@
 #include "../../c2c.h"
 #include "hists.h"
 
+struct perf_c2c_ext {
+	struct c2c_hists	function_hists;
+	/* Cached across all level-1 entries; 0 means "not yet computed". */
+	u64			total_cycles;
+};
+
+static struct perf_c2c_ext c2c_ext __maybe_unused;
+
+struct c2c_function_browser {
+	struct hist_browser	hb;
+};
+
+static __maybe_unused inline u64 c2c_hitm_count(const struct c2c_stats *stats)
+{
+	return stats->tot_hitm;
+}
+
+static __maybe_unused inline bool symbol_name_equal(struct symbol *a, struct symbol *b)
+{
+	return a && b && arch__compare_symbol_names(a->name, b->name) == 0;
+}
+
+static __maybe_unused inline u64 hist_entry__iaddr(struct hist_entry *he)
+{
+	if (he->mem_info)
+		return mem_info__iaddr(he->mem_info)->addr;
+	return he->ip;
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 04/14] perf c2c: add column format infrastructure for function view
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (2 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 03/14] perf c2c: add function view type definitions and helpers Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:03 ` [PATCH 05/14] perf c2c: add column entry functions " Jiebin Sun
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add the column format plumbing functions used by all function view
dimensions:

- symbol_width(): constrain symbol column width
- c2c_width(): dispatch column width based on dimension type
- c2c_header(): render multi-line column headers

These are referenced by column entry functions and dimension
definitions added in subsequent patches.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 117 ++++++++++++++++++++++++++
 1 file changed, 117 insertions(+)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index dbc912a4a242..d718cab6537d 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -66,6 +66,123 @@ static __maybe_unused inline u64 hist_entry__iaddr(struct hist_entry *he)
 	return he->ip;
 }
 
+static __maybe_unused int symbol_width(struct hists *hists, struct sort_entry *se)
+{
+	int width = hists__col_len(hists, se->se_width_idx);
+
+	if (!c2c.symbol_full && width > SYMBOL_WIDTH)
+		width = SYMBOL_WIDTH;
+
+	return width;
+}
+
+static struct c2c_dimension dim_symbol_view;
+
+/*
+ * c2c_width - Calculate width for a C2C column in function view
+ */
+static __maybe_unused int c2c_width(struct perf_hpp_fmt *fmt,
+		     struct perf_hpp *hpp __maybe_unused,
+		     struct hists *hists)
+{
+	struct c2c_fmt *c2c_fmt;
+	struct c2c_dimension *dim;
+
+	c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+	dim = c2c_fmt->dim;
+
+	if (dim == &dim_symbol_view)
+		return symbol_width(hists, dim->se);
+
+	return dim->se ? hists__col_len(hists, dim->se->se_width_idx) :
+			 dim->width;
+}
+
+static __maybe_unused int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		      struct hists *hists, int line, int *span)
+{
+	struct c2c_fmt *c2c_fmt;
+	struct c2c_dimension *dim;
+	const char *text = NULL;
+	int width = c2c_width(fmt, hpp, hists);
+
+	c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+	dim = c2c_fmt->dim;
+
+	if (dim->se) {
+		text = dim->header.line[line].text;
+		/* Use the last line from sort_entry if not defined. */
+		if (!text && line == hists->hpp_list->nr_header_lines - 1)
+			text = dim->se->se_header;
+	} else {
+		text = dim->header.line[line].text;
+
+		if (span) {
+			if (*span) {
+				(*span)--;
+				return 0;
+			}
+
+			*span = dim->header.line[line].span;
+		}
+	}
+
+	if (text == NULL)
+		text = "";
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", width, text);
+}
+
+/*
+ * Return the estimated total cycles for a c2c_hist_entry
+ * (rmt_hitm + lcl_hitm + rmt_peer + lcl_peer + other loads).
+ */
+static __maybe_unused u64 c2c_hist_entry__cycles(struct c2c_hist_entry *c2c_he)
+{
+	double cycles_rmt, cycles_lcl, cycles_load;
+	u64 other_load, total_hitm;
+
+	cycles_rmt = avg_stats(&c2c_he->cstats.rmt_hitm) * c2c_he->stats.rmt_hitm;
+	cycles_lcl = avg_stats(&c2c_he->cstats.lcl_hitm) * c2c_he->stats.lcl_hitm;
+	total_hitm = c2c_he->stats.tot_hitm;
+	other_load = (c2c_he->stats.load >= total_hitm) ? c2c_he->stats.load - total_hitm : 0;
+	cycles_load = avg_stats(&c2c_he->cstats.load) * other_load;
+
+	return (u64)(cycles_rmt + cycles_lcl + cycles_load);
+}
+
+/* Sum c2c_hist_entry__cycles() across all level-1 entries. */
+static __maybe_unused u64 c2c_ext__total_cycles(void)
+{
+	struct rb_node *nd;
+	u64 total = 0;
+
+	for (nd = rb_first_cached(&c2c_ext.function_hists.hists.entries); nd;
+	     nd = rb_next(nd)) {
+		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
+		struct c2c_hist_entry *c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+		total += c2c_hist_entry__cycles(c2c_he);
+	}
+	return total;
+}
+
+/* Sum child entries' store counts under a level-1 hist_entry. */
+static __maybe_unused u64 hist_entry__child_stores(struct hist_entry *he)
+{
+	struct rb_node *nd;
+	u64 sum = 0;
+
+	for (nd = rb_first_cached(&he->hroot_out); nd; nd = rb_next(nd)) {
+		struct hist_entry *child = rb_entry(nd, struct hist_entry, rb_node);
+		struct c2c_hist_entry *c2c_child =
+			container_of(child, struct c2c_hist_entry, he);
+
+		sum += (u64)c2c_child->stats.store;
+	}
+	return sum;
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 05/14] perf c2c: add column entry functions for function view
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (3 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 04/14] perf c2c: add column format infrastructure for function view Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:03 ` [PATCH 06/14] perf c2c: add comparison functions for function view sorting Jiebin Sun
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add rendering functions for each column in the function view:

- total_stores_entry(): render store count, summing children for L1
- cacheline_symbol_entry(): render cacheline address for L3 entries
- iaddr_symbol_entry(): render code address with fold indicators
- symbol_view_entry(): render symbol name with fold indicators
- cycles_percent_entry(): render estimated load-cycle percentage for L1
  (HITM, peer-snoop and other-load cycles)

Each entry function handles the 3-level hierarchy by checking
parent_he depth to decide what to display.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 132 ++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index d718cab6537d..5393f603c864 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -183,6 +183,138 @@ static __maybe_unused u64 hist_entry__child_stores(struct hist_entry *he)
 	return sum;
 }
 
+static __maybe_unused int
+total_stores_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		   struct hist_entry *he)
+{
+	struct c2c_hist_entry *c2c_he = container_of(he, struct c2c_hist_entry, he);
+	int width = c2c_width(fmt, hpp, he->hists);
+	u64 total;
+
+	/* L1 shows the sum of sharing-function stores; L2/L3 show their own. */
+	total = he->parent_he ? (u64)c2c_he->stats.store : hist_entry__child_stores(he);
+
+	return scnprintf(hpp->buf, hpp->size, "%*" PRIu64, width, total);
+}
+
+/*
+ * cacheline_symbol_entry - Render cacheline address for function view
+ */
+static __maybe_unused int
+cacheline_symbol_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		       struct hist_entry *he)
+{
+	int width = c2c_width(fmt, hpp, he->hists);
+	char buf[24];
+	u64 addr;
+
+	/* Only show the address on level-3 cacheline entries. */
+	if (!he->parent_he || !he->parent_he->parent_he || !he->mem_info)
+		return scnprintf(hpp->buf, hpp->size, "%*s", width, "");
+
+	addr = cl_address(mem_info__daddr(he->mem_info)->addr, chk_double_cl);
+	scnprintf(buf, sizeof(buf), "0x%" PRIx64, addr);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+/* Render the code (instruction) address for level-1 and level-2 entries. */
+static __maybe_unused int
+iaddr_symbol_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		   struct hist_entry *he)
+{
+	int width = c2c_width(fmt, hpp, he->hists);
+	int iaddr_width, ret;
+	char buf[24];
+	u64 addr;
+	char folded_sign;
+
+	/* Hide for cacheline (level-3) entries. */
+	if (he->parent_he && he->parent_he->parent_he)
+		return scnprintf(hpp->buf, hpp->size, "%*s", width, "");
+
+	addr = hist_entry__iaddr(he);
+
+	folded_sign = he->has_children ? (he->unfolded ? '-' : '+') : ' ';
+	ret = scnprintf(hpp->buf, hpp->size, "%c ", folded_sign);
+
+	iaddr_width = width - ret;
+	if (iaddr_width <= 0)
+		return ret;
+
+	scnprintf(buf, sizeof(buf), "0x%" PRIx64, addr);
+	ret += scnprintf(hpp->buf + ret, hpp->size - ret, "%*.*s", iaddr_width, iaddr_width, buf);
+	return ret;
+}
+
+/*
+ * symbol_view_entry - Render symbol name for function view with expansion indicators
+ */
+static __maybe_unused int
+symbol_view_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		  struct hist_entry *he)
+{
+	int width = c2c_width(fmt, hpp, he->hists);
+	int sym_width;
+	int ret;
+	char symbuf[512];
+	char folded_sign;
+
+	/* Hide Symbol for cacheline entries */
+	if (he->parent_he && he->parent_he->parent_he)
+		return scnprintf(hpp->buf, hpp->size, "%*s", width, "");
+
+	folded_sign = he->has_children ? (he->unfolded ? '-' : '+') : ' ';
+
+	ret = scnprintf(hpp->buf, hpp->size, "%c ", folded_sign);
+
+	sym_width = width - ret;
+
+	if (sym_width <= 0)
+		return ret;
+
+	/* sort_sym.se_snprintf is statically set and never cleared. */
+	sort_sym.se_snprintf(he, symbuf, sizeof(symbuf), sym_width);
+
+	ret += scnprintf(hpp->buf + ret, hpp->size - ret, "%-*.*s", sym_width, sym_width, symbuf);
+	return ret;
+}
+
+/*
+ * cycles_percent_entry - Render cycles percentage column
+ */
+static __maybe_unused int
+cycles_percent_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+		     struct hist_entry *he)
+{
+	struct c2c_hist_entry *c2c_he;
+	int width = c2c_width(fmt, hpp, he->hists);
+	u64 fn_cycles, total_cycles;
+	char folded_sign;
+	double pct;
+	int ret, pct_width;
+
+	/* Hide Cycles Percent for child functions and cachelines. */
+	if (he->parent_he)
+		return scnprintf(hpp->buf, hpp->size, "%*s", width, "");
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+	fn_cycles = c2c_hist_entry__cycles(c2c_he);
+	/* Populated by build_function_view_hierarchy() once the L1 tree is built. */
+	total_cycles = c2c_ext.total_cycles;
+	pct = total_cycles > 0 ? (double)fn_cycles / total_cycles * 100.0 : 0.0;
+
+	/* Add folded sign only for level-1 entries */
+	folded_sign = he->has_children ? (he->unfolded ? '-' : '+') : ' ';
+	ret = scnprintf(hpp->buf, hpp->size, "%c ", folded_sign);
+
+	pct_width = width - ret;
+	if (pct_width <= 0)
+		return ret;
+	ret += scnprintf(hpp->buf + ret, hpp->size - ret, "%*.2f%%", pct_width - 1, pct);
+	return ret;
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 06/14] perf c2c: add comparison functions for function view sorting
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (4 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 05/14] perf c2c: add column entry functions " Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:22   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 07/14] perf c2c: add dimension definitions and format creation Jiebin Sun
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add sort comparison functions for the function view columns:

- cycles_percent_cmp(): compare by weighted HITM cycle count
- iaddr_symbol_cmp(): compare by instruction address
- total_stores_cmp(): compare by store count
- empty_cmp(): no-op comparator for display-only columns

Use overflow-safe (a > b) - (a < b) pattern for unsigned
comparisons in cycles_percent_cmp() and total_stores_cmp().

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 75 +++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index 5393f603c864..172e1aa86eca 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -315,6 +315,81 @@ cycles_percent_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 	return ret;
 }
 
+/*
+ * cycles_percent_cmp - Comparison function for cycles percentage sorting
+ */
+static __maybe_unused int64_t
+cycles_percent_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+		   struct hist_entry *left, struct hist_entry *right)
+{
+	struct c2c_hist_entry *c2c_left = container_of(left, struct c2c_hist_entry, he);
+	struct c2c_hist_entry *c2c_right = container_of(right, struct c2c_hist_entry, he);
+	u64 cycles_left, cycles_right;
+
+	/* Cycles Percent is only shown for level-1 entries; others compare equal. */
+	if (left->parent_he || right->parent_he)
+		return 0;
+
+	cycles_left = c2c_hist_entry__cycles(c2c_left);
+	cycles_right = c2c_hist_entry__cycles(c2c_right);
+
+	return (cycles_left > cycles_right) - (cycles_left < cycles_right);
+}
+
+/*
+ * iaddr_symbol_cmp - Comparison function for instruction address sorting
+ */
+static __maybe_unused int64_t
+iaddr_symbol_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+		 struct hist_entry *left, struct hist_entry *right)
+{
+	u64 left_iaddr, right_iaddr;
+
+	/* IAddr is hidden for level-3 cacheline entries; they compare equal. */
+	if ((left->parent_he && left->parent_he->parent_he) ||
+	    (right->parent_he && right->parent_he->parent_he))
+		return 0;
+
+	left_iaddr = hist_entry__iaddr(left);
+	right_iaddr = hist_entry__iaddr(right);
+
+	/*
+	 * Descending by instruction address (same direction as sort__iaddr_cmp(),
+	 * which also returns r - l): the expression is +1 when left < right.
+	 * Uses hist_entry__iaddr(), which falls back to he->ip when mem_info is
+	 * NULL, so it matches what iaddr_symbol_entry() displays.
+	 */
+	return (left_iaddr < right_iaddr) - (left_iaddr > right_iaddr);
+}
+
+static __maybe_unused int64_t
+empty_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+	  struct hist_entry *left __maybe_unused,
+	  struct hist_entry *right __maybe_unused)
+{
+	return 0;
+}
+
+/*
+ * total_stores_cmp - Comparison function for total stores sorting
+ */
+static __maybe_unused int64_t
+total_stores_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+		 struct hist_entry *left, struct hist_entry *right)
+{
+	struct c2c_hist_entry *c2c_left = container_of(left, struct c2c_hist_entry, he);
+	struct c2c_hist_entry *c2c_right = container_of(right, struct c2c_hist_entry, he);
+	u64 left_store, right_store;
+
+	/* Match total_stores_entry(): L1 sums child stores, L2/L3 use their own. */
+	left_store = left->parent_he ? (u64)c2c_left->stats.store :
+				       hist_entry__child_stores(left);
+	right_store = right->parent_he ? (u64)c2c_right->stats.store :
+					 hist_entry__child_stores(right);
+
+	return (left_store > right_store) - (left_store < right_store);
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 07/14] perf c2c: add dimension definitions and format creation
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (5 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 06/14] perf c2c: add comparison functions for function view sorting Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:23   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 08/14] perf c2c: add HPP list parsing for function view histograms Jiebin Sun
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add the function view dimension definitions that bind column names to
their rendering and comparison functions:

- dim_cycles_percent: estimated load-cycles percentage (L1 only)
- dim_total_stores: store count per function
- dim_cacheline_symbol: cacheline address (L3 only)
- dim_iaddr_symbol: instruction code address
- dim_symbol_view: symbol name with fold indicators

Remove __maybe_unused from the entry and comparison functions now that
they are referenced by the dimension structs above.

Also add get_function_dimension() and get_function_format() to look up
dimensions by name and create perf_hpp_fmt wrappers. These two helpers
are intentionally left __maybe_unused here; they are wired into the
function view initialization in the following patch.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 143 +++++++++++++++++++++++---
 1 file changed, 131 insertions(+), 12 deletions(-)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index 172e1aa86eca..794a023239ce 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -66,7 +66,7 @@ static __maybe_unused inline u64 hist_entry__iaddr(struct hist_entry *he)
 	return he->ip;
 }
 
-static __maybe_unused int symbol_width(struct hists *hists, struct sort_entry *se)
+static int symbol_width(struct hists *hists, struct sort_entry *se)
 {
 	int width = hists__col_len(hists, se->se_width_idx);
 
@@ -81,7 +81,7 @@ static struct c2c_dimension dim_symbol_view;
 /*
  * c2c_width - Calculate width for a C2C column in function view
  */
-static __maybe_unused int c2c_width(struct perf_hpp_fmt *fmt,
+static int c2c_width(struct perf_hpp_fmt *fmt,
 		     struct perf_hpp *hpp __maybe_unused,
 		     struct hists *hists)
 {
@@ -98,7 +98,7 @@ static __maybe_unused int c2c_width(struct perf_hpp_fmt *fmt,
 			 dim->width;
 }
 
-static __maybe_unused int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 		      struct hists *hists, int line, int *span)
 {
 	struct c2c_fmt *c2c_fmt;
@@ -183,7 +183,7 @@ static __maybe_unused u64 hist_entry__child_stores(struct hist_entry *he)
 	return sum;
 }
 
-static __maybe_unused int
+static int
 total_stores_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 		   struct hist_entry *he)
 {
@@ -200,7 +200,7 @@ total_stores_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 /*
  * cacheline_symbol_entry - Render cacheline address for function view
  */
-static __maybe_unused int
+static int
 cacheline_symbol_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 		       struct hist_entry *he)
 {
@@ -219,7 +219,7 @@ cacheline_symbol_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 }
 
 /* Render the code (instruction) address for level-1 and level-2 entries. */
-static __maybe_unused int
+static int
 iaddr_symbol_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 		   struct hist_entry *he)
 {
@@ -250,7 +250,7 @@ iaddr_symbol_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 /*
  * symbol_view_entry - Render symbol name for function view with expansion indicators
  */
-static __maybe_unused int
+static int
 symbol_view_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 		  struct hist_entry *he)
 {
@@ -283,7 +283,7 @@ symbol_view_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 /*
  * cycles_percent_entry - Render cycles percentage column
  */
-static __maybe_unused int
+static int
 cycles_percent_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 		     struct hist_entry *he)
 {
@@ -318,7 +318,7 @@ cycles_percent_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 /*
  * cycles_percent_cmp - Comparison function for cycles percentage sorting
  */
-static __maybe_unused int64_t
+static int64_t
 cycles_percent_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 		   struct hist_entry *left, struct hist_entry *right)
 {
@@ -339,7 +339,7 @@ cycles_percent_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 /*
  * iaddr_symbol_cmp - Comparison function for instruction address sorting
  */
-static __maybe_unused int64_t
+static int64_t
 iaddr_symbol_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 		 struct hist_entry *left, struct hist_entry *right)
 {
@@ -362,7 +362,7 @@ iaddr_symbol_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 	return (left_iaddr < right_iaddr) - (left_iaddr > right_iaddr);
 }
 
-static __maybe_unused int64_t
+static int64_t
 empty_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 	  struct hist_entry *left __maybe_unused,
 	  struct hist_entry *right __maybe_unused)
@@ -373,7 +373,7 @@ empty_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 /*
  * total_stores_cmp - Comparison function for total stores sorting
  */
-static __maybe_unused int64_t
+static int64_t
 total_stores_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 		 struct hist_entry *left, struct hist_entry *right)
 {
@@ -390,6 +390,125 @@ total_stores_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 	return (left_store > right_store) - (left_store < right_store);
 }
 
+/*
+ * Function view dimensions
+ */
+static struct c2c_dimension dim_cycles_percent = {
+	.header		= HEADER_BOTH("Cycles", "%"),
+	.name		= "cycles_percent",
+	.cmp		= cycles_percent_cmp,
+	.entry		= cycles_percent_entry,
+	.width		= 9,
+};
+
+static struct c2c_dimension dim_total_stores = {
+	.header		= HEADER_BOTH("Store", "count"),
+	.name		= "total_stores",
+	.cmp		= total_stores_cmp,
+	.entry		= total_stores_entry,
+	.width		= 7,
+};
+
+static struct c2c_dimension dim_cacheline_symbol = {
+	.header		= HEADER_LOW("Cacheline"),
+	.name		= "cacheline_symbol",
+	.cmp		= empty_cmp,
+	.entry		= cacheline_symbol_entry,
+	.width		= 18,
+};
+
+static struct c2c_dimension dim_iaddr_symbol = {
+	.header		= HEADER_LOW("Code address"),
+	.name		= "iaddr_symbol",
+	.cmp		= iaddr_symbol_cmp,
+	.entry		= iaddr_symbol_entry,
+	.width		= 20,
+};
+
+static struct c2c_dimension dim_symbol_view = {
+	.header		= HEADER_LOW("Symbol"),
+	.name		= "symbol_view",
+	.se		= &sort_sym,
+	.entry		= symbol_view_entry,
+	.width		= SYMBOL_WIDTH,
+};
+
+static struct c2c_dimension *function_view_dimensions[] = {
+	&dim_iaddr_symbol,
+	&dim_cycles_percent,
+	&dim_total_stores,
+	&dim_cacheline_symbol,
+	&dim_symbol_view,
+	NULL,
+};
+
+static __maybe_unused struct c2c_dimension *get_function_dimension(const char *name)
+{
+	unsigned int i;
+
+	for (i = 0; function_view_dimensions[i]; i++) {
+		struct c2c_dimension *dim = function_view_dimensions[i];
+
+		if (!strcmp(dim->name, name))
+			return dim;
+	}
+
+	return NULL;
+}
+
+/* Wrappers so sort_entry-backed dimensions sort/collapse via their se. */
+static int64_t c2c_se_cmp(struct perf_hpp_fmt *fmt,
+			  struct hist_entry *a, struct hist_entry *b)
+{
+	struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+	struct c2c_dimension *dim = c2c_fmt->dim;
+
+	return dim->se->se_cmp(a, b);
+}
+
+static int64_t c2c_se_collapse(struct perf_hpp_fmt *fmt,
+			       struct hist_entry *a, struct hist_entry *b)
+{
+	struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+	struct c2c_dimension *dim = c2c_fmt->dim;
+	int64_t (*collapse_fn)(struct hist_entry *a, struct hist_entry *b);
+
+	collapse_fn = dim->se->se_collapse ?: dim->se->se_cmp;
+	return collapse_fn(a, b);
+}
+
+static __maybe_unused struct c2c_fmt *get_function_format(const char *name)
+{
+	struct c2c_dimension *dim = get_function_dimension(name);
+	struct c2c_fmt *c2c_fmt;
+	struct perf_hpp_fmt *fmt;
+
+	if (!dim)
+		return NULL;
+
+	c2c_fmt = zalloc(sizeof(*c2c_fmt));
+	if (!c2c_fmt)
+		return NULL;
+
+	fmt = &c2c_fmt->fmt;
+
+	c2c_fmt->dim = dim;
+	INIT_LIST_HEAD(&fmt->list);
+	INIT_LIST_HEAD(&fmt->sort_list);
+
+	fmt->cmp	= dim->se ? c2c_se_cmp : dim->cmp;
+	fmt->sort	= dim->se ? c2c_se_cmp : dim->cmp;
+	fmt->color	= dim->color;
+	fmt->entry	= dim->entry;
+	fmt->header	= c2c_header;
+	fmt->width	= c2c_width;
+	fmt->collapse	= dim->se ? c2c_se_collapse : dim->cmp;
+	fmt->equal	= fmt_equal;
+	fmt->free	= fmt_free;
+
+	return c2c_fmt;
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/14] perf c2c: add HPP list parsing for function view histograms
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (6 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 07/14] perf c2c: add dimension definitions and format creation Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:16   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 09/14] perf c2c: add stats merging and memory management helpers Jiebin Sun
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add the histogram initialization and column parsing infrastructure:

- c2c_function_hists__init_output(): register output columns
- c2c_function_hists__init_sort(): register sort columns
- function_hpp_list__parse(): parse comma-separated column strings
- c2c_function_hists__init(): initialize function histograms with sort
- c2c_function_hists__reinit(): reinitialize with new output/sort

These functions bridge the function view's custom dimensions with
the perf HPP list infrastructure, enabling dynamic column
configuration during hierarchy construction.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 144 +++++++++++++++++++++++++-
 1 file changed, 142 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index 794a023239ce..c11a81d93c21 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -442,7 +442,7 @@ static struct c2c_dimension *function_view_dimensions[] = {
 	NULL,
 };
 
-static __maybe_unused struct c2c_dimension *get_function_dimension(const char *name)
+static struct c2c_dimension *get_function_dimension(const char *name)
 {
 	unsigned int i;
 
@@ -477,7 +477,7 @@ static int64_t c2c_se_collapse(struct perf_hpp_fmt *fmt,
 	return collapse_fn(a, b);
 }
 
-static __maybe_unused struct c2c_fmt *get_function_format(const char *name)
+static struct c2c_fmt *get_function_format(const char *name)
 {
 	struct c2c_dimension *dim = get_function_dimension(name);
 	struct c2c_fmt *c2c_fmt;
@@ -509,6 +509,146 @@ static __maybe_unused struct c2c_fmt *get_function_format(const char *name)
 	return c2c_fmt;
 }
 
+static int
+c2c_function_hists__init_output(struct perf_hpp_list *hpp_list, char *name,
+				struct perf_env *env __maybe_unused)
+{
+	struct c2c_fmt *c2c_fmt = get_function_format(name);
+	int level = 0;
+
+	if (!c2c_fmt) {
+		reset_dimensions();
+		return output_field_add(hpp_list, name, &level);
+	}
+
+	/*
+	 * Mark symbol-backed columns so hists__has(hists, sym) is correct.
+	 * Only dim_symbol_view carries a sort_entry (.se); the function
+	 * view's field strings are fixed and always include symbol_view, so
+	 * this single check is sufficient (unlike the user-configurable
+	 * cacheline view, which must also test dim_iaddr).
+	 */
+	if (c2c_fmt->dim->se == &sort_sym)
+		hpp_list->sym = 1;
+
+	perf_hpp_list__column_register(hpp_list, &c2c_fmt->fmt);
+	return 0;
+}
+
+static int
+c2c_function_hists__init_sort(struct perf_hpp_list *hpp_list, char *name,
+			      struct perf_env *env)
+{
+	struct c2c_fmt *c2c_fmt = get_function_format(name);
+
+	if (!c2c_fmt) {
+		reset_dimensions();
+		return sort_dimension__add(hpp_list, name, /*evlist=*/NULL, env, /*level=*/0);
+	}
+
+	/* Mark symbol-backed sort keys so hists__has(hists, sym) is correct. */
+	if (c2c_fmt->dim->se == &sort_sym)
+		hpp_list->sym = 1;
+
+	perf_hpp_list__register_sort_field(hpp_list, &c2c_fmt->fmt);
+	return 0;
+}
+
+typedef int (*hpp_list_add_fn)(struct perf_hpp_list *hpp_list, char *name,
+			       struct perf_env *env);
+
+static int function_hpp_list__add_tokens(struct perf_hpp_list *hpp_list, char *list,
+					 struct perf_env *env, hpp_list_add_fn add)
+{
+	char *tok, *tmp;
+	int ret;
+
+	if (!list)
+		return 0;
+
+	for (tok = strtok_r(list, ", ", &tmp); tok; tok = strtok_r(NULL, ", ", &tmp)) {
+		ret = add(hpp_list, tok, env);
+		if (ret) {
+			if (ret == -EINVAL || ret == -ESRCH)
+				pr_err("Invalid c2c function-view field: %s", tok);
+			return ret;
+		}
+	}
+	return 0;
+}
+
+static int
+function_hpp_list__parse(struct perf_hpp_list *hpp_list,
+			 const char *output_str,
+			 const char *sort_str,
+			 struct perf_env *env)
+{
+	char *output = output_str ? strdup(output_str) : NULL;
+	char *sort   = sort_str   ? strdup(sort_str)   : NULL;
+	int ret = 0;
+
+	if ((output_str && !output) || (sort_str && !sort)) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	ret = function_hpp_list__add_tokens(hpp_list, output, env,
+					    c2c_function_hists__init_output);
+	if (ret)
+		goto out;
+
+	ret = function_hpp_list__add_tokens(hpp_list, sort, env,
+					    c2c_function_hists__init_sort);
+	if (ret)
+		goto out;
+
+	perf_hpp__setup_output_field(hpp_list);
+out:
+	free(output);
+	free(sort);
+	return ret;
+}
+
+static __maybe_unused int
+c2c_function_hists__init(struct c2c_hists *hists,
+			 const char *sort,
+			 int nr_header_lines,
+			 struct perf_env *env)
+{
+	__hists__init(&hists->hists, &hists->list);
+
+	perf_hpp_list__init(&hists->list);
+
+	hists->list.nr_header_lines = nr_header_lines;
+
+	return function_hpp_list__parse(&hists->list, /*output=*/NULL, sort, env);
+}
+
+static __maybe_unused int
+c2c_function_hists__reinit(struct c2c_hists *c2c_hists,
+			   const char *output,
+			   const char *sort,
+			   struct perf_env *env)
+{
+	int nr_header_lines = c2c_hists->list.nr_header_lines;
+
+	perf_hpp__reset_output_field(&c2c_hists->list);
+	INIT_LIST_HEAD(&c2c_hists->list.sorts);
+
+	/* Clear stale state flags so a different output/sort set starts fresh. */
+	c2c_hists->list.need_collapse = 0;
+	c2c_hists->list.parent = 0;
+	c2c_hists->list.sym = 0;
+	c2c_hists->list.dso = 0;
+	c2c_hists->list.socket = 0;
+	c2c_hists->list.thread = 0;
+	c2c_hists->list.comm = 0;
+	c2c_hists->list.comm_nodigit = 0;
+	c2c_hists->list.nr_header_lines = nr_header_lines;
+
+	return function_hpp_list__parse(&c2c_hists->list, output, sort, env);
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/14] perf c2c: add stats merging and memory management helpers
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (7 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 08/14] perf c2c: add HPP list parsing for function view histograms Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:17   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 10/14] perf c2c: add hierarchy entry creation and lookup functions Jiebin Sun
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add statistics aggregation and memory management functions for
function view histogram entries:

Stats merging:
- c2c_stats_merge(): combine two stats structs (Welford online merge)
- c2c_add_cstats(): merge all compute_stats sub-fields
- hist_entry__add_c2c_stats(): update hist_entry stat counters

Memory management:
- c2c_function_he_free(): free entry and recursively free children
- c2c_he__free_hierarchy(): recursive cleanup of hroot_out tree

Entry allocation reuses c2c_he_zalloc(), which was made shared via
c2c.h in an earlier patch.

These are used during hierarchy construction to allocate entries
and aggregate C2C statistics across cacheline-function pairs.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 158 ++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index c11a81d93c21..a42378f395f5 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -649,6 +649,164 @@ c2c_function_hists__reinit(struct c2c_hists *c2c_hists,
 	return function_hpp_list__parse(&c2c_hists->list, output, sort, env);
 }
 
+/* Welford online merge of two "stats" (from util/stat.h) accumulators. */
+static void c2c_stats_merge(struct stats *dest, const struct stats *src)
+{
+	double delta;
+
+	if (src->n == 0)
+		return;
+
+	if (dest->n == 0) {
+		*dest = *src;
+		return;
+	}
+
+	delta = src->mean - dest->mean;
+	dest->M2 += src->M2 + delta * delta * dest->n * src->n / (dest->n + src->n);
+	dest->mean = (dest->mean * dest->n + src->mean * src->n) / (dest->n + src->n);
+	dest->n += src->n;
+
+	/* Update min/max */
+	if (src->max > dest->max)
+		dest->max = src->max;
+	if (src->min < dest->min)
+		dest->min = src->min;
+}
+
+/* Merge compute_stats during function aggregation. */
+static __maybe_unused void c2c_add_cstats(struct compute_stats *dest,
+			   const struct compute_stats *src)
+{
+	c2c_stats_merge(&dest->rmt_hitm, &src->rmt_hitm);
+	c2c_stats_merge(&dest->lcl_hitm, &src->lcl_hitm);
+	c2c_stats_merge(&dest->rmt_peer, &src->rmt_peer);
+	c2c_stats_merge(&dest->lcl_peer, &src->lcl_peer);
+	c2c_stats_merge(&dest->load, &src->load);
+}
+
+static __maybe_unused bool hist_entry__add_c2c_stats(struct hist_entry *he,
+				      const struct c2c_stats *stats)
+{
+	u64 nr_events = c2c_hitm_count(stats) + stats->rmt_peer + stats->lcl_peer;
+	u64 weight1 = c2c_hitm_count(stats);
+
+	he->stat.nr_events += nr_events;
+	he->stat.period += nr_events;
+	he->stat.weight1 += weight1;
+
+	if (!symbol_conf.cumulate_callchain)
+		return true;
+
+	if (!he->stat_acc) {
+		he->stat_acc = calloc(1, sizeof(struct he_stat));
+		if (!he->stat_acc)
+			return false;
+	}
+
+	he->stat_acc->nr_events += nr_events;
+	he->stat_acc->period += nr_events;
+	he->stat_acc->weight1 += weight1;
+
+	return true;
+}
+
+static void c2c_he__free_hierarchy(struct hist_entry *he);
+
+/*
+ * Free a function-view histogram entry (hist_entry_ops::free).
+ */
+static void c2c_function_he_free(void *ptr)
+{
+	struct hist_entry *he = ptr;
+	struct c2c_hist_entry *c2c_he;
+
+	c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+	if (c2c_he->hists) {
+		perf_hpp__reset_output_field(&c2c_he->hists->list);
+		hists__delete_entries(&c2c_he->hists->hists);
+		zfree(&c2c_he->hists);
+	}
+
+	c2c_he__free_hierarchy(he);
+
+	zfree(&c2c_he->nodeset);
+	zfree(&c2c_he->cpuset);
+	zfree(&c2c_he->nodestr);
+	zfree(&c2c_he->node_stats);
+
+	free(c2c_he);
+}
+
+/*
+ * Free all child entries under @he, recursively (hroot_out sub-tree).
+ *
+ * Children are built by c2c_child_entry__alloc(), which BORROWS thread and
+ * ms (plain copy, no thread__get()/map__get()) and OWNS only mem_info (a
+ * clone), stat_acc and the c2c-specific fields (hists, cpuset, nodeset,
+ * nodestr, node_stats). We therefore must NOT call hist_entry__delete()
+ * here: it would thread__zput()/map_symbol__exit() the borrowed refs and
+ * underflow their refcounts. Free exactly the owned resources instead.
+ */
+static void c2c_he__free_hierarchy(struct hist_entry *he)
+{
+	struct rb_node *nd;
+	struct hist_entry *child_he;
+	struct c2c_hist_entry *child_c2c;
+
+	/*
+	 * Leaf entries alias hroot_out with sorted_chain (callchains) in a
+	 * union, so they have no child hierarchy to free here.
+	 */
+	if (he->leaf)
+		return;
+
+	if (RB_EMPTY_ROOT(&he->hroot_out.rb_root))
+		return;
+
+	nd = rb_first_cached(&he->hroot_out);
+	while (nd) {
+		struct rb_node *next = rb_next(nd);
+
+		child_he = rb_entry(nd, struct hist_entry, rb_node);
+		child_c2c = container_of(child_he, struct c2c_hist_entry, he);
+
+		if (child_he->stat_acc)
+			zfree(&child_he->stat_acc);
+
+		if (child_he->mem_info)
+			mem_info__put(child_he->mem_info);
+
+		if (child_c2c->hists) {
+			perf_hpp__reset_output_field(&child_c2c->hists->list);
+			hists__delete_entries(&child_c2c->hists->hists);
+			zfree(&child_c2c->hists);
+		}
+
+		zfree(&child_c2c->cpuset);
+		zfree(&child_c2c->nodeset);
+		zfree(&child_c2c->nodestr);
+		zfree(&child_c2c->node_stats);
+
+		c2c_he__free_hierarchy(child_he);
+
+		rb_erase_cached(&child_he->rb_node, &he->hroot_out);
+		free(child_c2c);
+
+		nd = next;
+	}
+
+	/* All children erased; clear the tree (and its cached leftmost). */
+	he->hroot_out = RB_ROOT_CACHED;
+}
+
+/* Entry operations for function view */
+static struct hist_entry_ops c2c_function_entry_ops __maybe_unused = {
+	.new	= c2c_he_zalloc,
+	.free	= c2c_function_he_free,
+};
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/14] perf c2c: add hierarchy entry creation and lookup functions
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (8 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 09/14] perf c2c: add stats merging and memory management helpers Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:19   ` sashiko-bot
  2026-06-26  7:03 ` [PATCH 11/14] perf c2c: add function view hierarchy builder Jiebin Sun
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add functions for creating and finding entries at each level of
the 3-level function view hierarchy:

- c2c_child_entry__alloc(): allocate a child hist_entry with all
  fields initialized from a source entry
- c2c_child_entry__insert(): insert a child into the parent's
  hroot_out tree
- c2c_function_hists__level1_entry(): find/create primary function entry
  using hists__add_entry_ops() for automatic deduplication
- c2c_function_hists__level2_entry(): find/create sharing function entry
  as child of level 1, keyed by (iaddr, symbol)
- c2c_function_hists__level3_entry(): find/create cacheline entry as
  child of level 2, keyed by cacheline address

Also add c2c_function_entry_ops binding the custom allocator/free
functions for function view histogram entries.

These helpers are introduced with __maybe_unused; the annotation is
removed in the following patch once the hierarchy builder calls them.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 249 +++++++++++++++++++++++++-
 1 file changed, 248 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index a42378f395f5..a881839299e3 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -802,11 +802,258 @@ static void c2c_he__free_hierarchy(struct hist_entry *he)
 }
 
 /* Entry operations for function view */
-static struct hist_entry_ops c2c_function_entry_ops __maybe_unused = {
+static struct hist_entry_ops c2c_function_entry_ops = {
 	.new	= c2c_he_zalloc,
 	.free	= c2c_function_he_free,
 };
 
+static struct c2c_hist_entry *
+c2c_child_entry__alloc(struct hist_entry *parent_he, struct hist_entry *src_he,
+		       int depth, u64 ip)
+{
+	struct c2c_hist_entry *child_c2c;
+	struct hist_entry *child_he;
+	size_t callchain_size;
+
+	callchain_size = symbol_conf.use_callchain ? sizeof(struct callchain_root) : 0;
+	child_he = c2c_he_zalloc(callchain_size);
+	if (!child_he)
+		return NULL;
+
+	child_c2c = container_of(child_he, struct c2c_hist_entry, he);
+	child_he->callchain_size = callchain_size;
+	if (callchain_size)
+		callchain_init(child_he->callchain);
+
+	memcpy(&child_he->ms, &src_he->ms, sizeof(struct map_symbol));
+
+	if (src_he->mem_info) {
+		child_he->mem_info = mem_info__clone(src_he->mem_info);
+		if (!child_he->mem_info)
+			goto out_free;
+	}
+
+	child_he->thread = src_he->thread;
+	child_he->cpumode = src_he->cpumode;
+	child_he->cpu = src_he->cpu;
+	child_he->socket = src_he->socket;
+	child_he->level = src_he->level;
+	child_he->ip = ip;
+
+	child_he->parent_he = parent_he;
+	child_he->depth = depth;
+	child_he->leaf = (depth >= 2);
+	child_he->hists = &c2c_ext.function_hists.hists;
+	child_he->filtered = false;
+	child_he->unfolded = false;
+	child_he->has_children = false;
+	child_he->has_no_entry = false;
+	child_he->nr_rows = 0;
+	child_he->row_offset = 0;
+
+	memset(&child_he->stat, 0, sizeof(child_he->stat));
+	child_he->hroot_in = RB_ROOT_CACHED;
+	child_he->hroot_out = RB_ROOT_CACHED;
+	INIT_LIST_HEAD(&child_he->pairs.node);
+	child_he->hpp_list = &c2c_ext.function_hists.list;
+	if (symbol_conf.cumulate_callchain) {
+		child_he->stat_acc = calloc(1, sizeof(struct he_stat));
+		if (!child_he->stat_acc)
+			goto out_free;
+	}
+
+	return child_c2c;
+
+out_free:
+	if (child_he->mem_info)
+		mem_info__put(child_he->mem_info);
+	zfree(&child_c2c->cpuset);
+	zfree(&child_c2c->nodeset);
+	zfree(&child_c2c->node_stats);
+	free(child_c2c);
+	return NULL;
+}
+
+static void
+c2c_child_entry__insert(struct hist_entry *parent_he, struct hist_entry *child_he,
+			struct rb_node **p, struct rb_node *rb_parent, bool leftmost)
+{
+	rb_link_node(&child_he->rb_node, rb_parent, p);
+	rb_insert_color_cached(&child_he->rb_node, &parent_he->hroot_out, leftmost);
+
+	parent_he->has_children = true;
+	parent_he->leaf = false;
+}
+
+static __maybe_unused struct hist_entry *
+c2c_function_hists__level1_entry(struct symbol *sym, u64 iaddr,
+				 struct hist_entry *detail_he,
+				 struct thread *synthetic_thread)
+{
+	struct addr_location al;
+	struct perf_sample sample = {};
+	struct mem_info *mi;
+	struct hist_entry *he;
+
+	mi = mem_info__new();
+	if (mi) {
+		mem_info__iaddr(mi)->addr = iaddr;
+		/* mem_info__put() will map_symbol__exit() these, so take refs. */
+		mem_info__iaddr(mi)->ms.thread = thread__get(detail_he->ms.thread);
+		mem_info__iaddr(mi)->ms.map = map__get(detail_he->ms.map);
+		mem_info__iaddr(mi)->ms.sym = sym;
+		mem_info__daddr(mi)->addr = 0;
+	}
+
+	addr_location__init(&al);
+	al.thread = thread__get(synthetic_thread);
+	al.map = map__get(detail_he->ms.map);
+	al.sym = sym;
+	al.addr = iaddr;
+	al.level = detail_he->level;
+	al.cpumode = detail_he->cpumode;
+	al.cpu = 0;
+	al.socket = 0;
+	al.filtered = 0;
+	al.latency = 0;
+
+	/*
+	 * Synthetic sample: period/weight are placeholders only. The real
+	 * c2c counters live in c2c_hist_entry::stats and are added via
+	 * hist_entry__add_c2c_stats(); no function-view column or sort key
+	 * reads he->stat.period/nr_events, so the +1 that __hists__add_entry()
+	 * accrues on each dedup hit has no effect on what is displayed.
+	 */
+	sample.period = 1;
+	sample.weight = 1;
+	sample.ip = iaddr;
+	sample.pid = thread__pid(synthetic_thread);
+	sample.tid = thread__tid(synthetic_thread);
+	sample.cpu = 0;
+
+	/* Add entry - histogram handles dedup */
+	he = hists__add_entry_ops(&c2c_ext.function_hists.hists,
+				  &c2c_function_entry_ops,
+				  &al, NULL, NULL, mi,
+				  NULL, &sample, true);
+
+	addr_location__exit(&al);
+	if (mi)
+		mem_info__put(mi);
+
+	if (he)
+		he->hpp_list = &c2c_ext.function_hists.list;
+
+	return he;
+}
+
+static __maybe_unused struct c2c_hist_entry *
+c2c_function_hists__level2_entry(struct c2c_hist_entry *level1_c2c,
+				 struct symbol *sym, u64 iaddr,
+				 struct hist_entry *detail_he)
+{
+	struct hist_entry *level1_he = &level1_c2c->he;
+	struct rb_node **p = &level1_he->hroot_out.rb_root.rb_node;
+	struct rb_node *parent = NULL;
+	struct c2c_hist_entry *level2_c2c;
+	bool leftmost = true;
+
+	/*
+	 * Order by (iaddr, symbol name). Symbols are looked up by name to
+	 * coalesce identically-named symbols from different DSO/JIT copies,
+	 * which matches the dedup policy used when building the hierarchy.
+	 */
+	while (*p) {
+		struct hist_entry *iter = rb_entry(*p, struct hist_entry, rb_node);
+		u64 iter_iaddr = hist_entry__iaddr(iter);
+		int cmp;
+
+		parent = *p;
+		if (iaddr < iter_iaddr) {
+			p = &parent->rb_left;
+			continue;
+		}
+		if (iaddr > iter_iaddr) {
+			p = &parent->rb_right;
+			leftmost = false;
+			continue;
+		}
+
+		if (!sym || !iter->ms.sym)
+			/* Order NULL-symbol entries deterministically (NULL last). */
+			cmp = (iter->ms.sym ? 1 : 0) - (sym ? 1 : 0);
+		else
+			cmp = arch__compare_symbol_names(sym->name, iter->ms.sym->name);
+
+		if (cmp < 0) {
+			p = &parent->rb_left;
+		} else if (cmp > 0) {
+			p = &parent->rb_right;
+			leftmost = false;
+		} else {
+			return container_of(iter, struct c2c_hist_entry, he);
+		}
+	}
+
+	level2_c2c = c2c_child_entry__alloc(level1_he, detail_he, 1, iaddr);
+	if (!level2_c2c)
+		return NULL;
+
+	/* Key this level by the looked-up symbol, not detail_he's. */
+	level2_c2c->he.ms.sym = sym;
+
+	/* Override iaddr (and symbol) in cloned mem_info for level 2 */
+	if (level2_c2c->he.mem_info) {
+		mem_info__iaddr(level2_c2c->he.mem_info)->addr = iaddr;
+		mem_info__iaddr(level2_c2c->he.mem_info)->ms.sym = sym;
+	}
+
+	c2c_child_entry__insert(level1_he, &level2_c2c->he, p, parent, leftmost);
+
+	return level2_c2c;
+}
+
+static __maybe_unused struct c2c_hist_entry *
+c2c_function_hists__level3_entry(struct c2c_hist_entry *level2_c2c, u64 cl_addr,
+				 struct c2c_hist_entry *cacheline_src_he)
+{
+	struct hist_entry *level2_he = &level2_c2c->he;
+	struct rb_node **p = &level2_he->hroot_out.rb_root.rb_node;
+	struct rb_node *parent = NULL;
+	struct c2c_hist_entry *level3_c2c;
+	bool leftmost = true;
+
+	while (*p) {
+		struct hist_entry *iter = rb_entry(*p, struct hist_entry, rb_node);
+		u64 iter_addr = 0;
+
+		if (iter->mem_info) {
+			u64 daddr = mem_info__daddr(iter->mem_info)->addr;
+
+			iter_addr = cl_address(daddr, chk_double_cl);
+		}
+
+		parent = *p;
+		if (cl_addr < iter_addr) {
+			p = &parent->rb_left;
+		} else if (cl_addr > iter_addr) {
+			p = &parent->rb_right;
+			leftmost = false;
+		} else {
+			return container_of(iter, struct c2c_hist_entry, he);
+		}
+	}
+
+	level3_c2c = c2c_child_entry__alloc(level2_he, &cacheline_src_he->he, 2,
+					    hist_entry__iaddr(&cacheline_src_he->he));
+	if (!level3_c2c)
+		return NULL;
+
+	c2c_child_entry__insert(level2_he, &level3_c2c->he, p, parent, leftmost);
+
+	return level3_c2c;
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/14] perf c2c: add function view hierarchy builder
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (9 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 10/14] perf c2c: add hierarchy entry creation and lookup functions Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:03 ` [PATCH 12/14] perf c2c: add function view browser UI Jiebin Sun
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Add the core algorithm that constructs the 3-level function view:

- c2c_he__resort_by_stores(): re-sort children by store count
- build_function_view_hierarchy(): single-pass hierarchy construction

The builder traverses all cacheline entries and for each pair of
functions (A, B) sharing a cacheline:
  1. Creates/finds Level 1 entry for function A
  2. Creates/finds Level 2 entry for function B under A
  3. Creates/finds Level 3 cacheline entry under B
  4. Aggregates C2C stats at all levels

After construction, it configures output columns, sorts Level 1
by cycles percentage, and re-sorts Level 2/3 by store count.

A per-cacheline set (function_seen[], heap-allocated and grown on
demand) tracks already-processed functions to prevent duplicate parent
processing when the same function appears multiple times within a
single cacheline.

Remove __maybe_unused from all helper functions now called by
the hierarchy builder.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 375 ++++++++++++++++++++++++--
 1 file changed, 357 insertions(+), 18 deletions(-)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index a881839299e3..452d64007604 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -14,6 +14,7 @@
 #include <errno.h>
 #include <inttypes.h>
 #include <stdlib.h>
+#include <tools/libc_compat.h> /* reallocarray */
 #include <string.h>
 #include <linux/list.h>
 #include <linux/rbtree.h>
@@ -49,17 +50,17 @@ struct c2c_function_browser {
 	struct hist_browser	hb;
 };
 
-static __maybe_unused inline u64 c2c_hitm_count(const struct c2c_stats *stats)
+static inline u64 c2c_hitm_count(const struct c2c_stats *stats)
 {
 	return stats->tot_hitm;
 }
 
-static __maybe_unused inline bool symbol_name_equal(struct symbol *a, struct symbol *b)
+static inline bool symbol_name_equal(struct symbol *a, struct symbol *b)
 {
 	return a && b && arch__compare_symbol_names(a->name, b->name) == 0;
 }
 
-static __maybe_unused inline u64 hist_entry__iaddr(struct hist_entry *he)
+static inline u64 hist_entry__iaddr(struct hist_entry *he)
 {
 	if (he->mem_info)
 		return mem_info__iaddr(he->mem_info)->addr;
@@ -137,22 +138,34 @@ static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
  * Return the estimated total cycles for a c2c_hist_entry
  * (rmt_hitm + lcl_hitm + rmt_peer + lcl_peer + other loads).
  */
-static __maybe_unused u64 c2c_hist_entry__cycles(struct c2c_hist_entry *c2c_he)
+static u64 c2c_hist_entry__cycles(struct c2c_hist_entry *c2c_he)
 {
-	double cycles_rmt, cycles_lcl, cycles_load;
-	u64 other_load, total_hitm;
+	double cycles_rmt, cycles_lcl, cycles_rmt_peer, cycles_lcl_peer, cycles_load;
+	u64 categorized, other_load;
 
+	/*
+	 * compute_stats() in builtin-c2c.c assigns each load sample to exactly
+	 * one cstats bucket (rmt_hitm, lcl_hitm, rmt_peer, lcl_peer or load),
+	 * while stats.load counts every load. Weight each category by its own
+	 * average and treat only the uncategorized remainder as plain loads, so
+	 * peer-snoop cycles are neither dropped nor charged the load average.
+	 */
 	cycles_rmt = avg_stats(&c2c_he->cstats.rmt_hitm) * c2c_he->stats.rmt_hitm;
 	cycles_lcl = avg_stats(&c2c_he->cstats.lcl_hitm) * c2c_he->stats.lcl_hitm;
-	total_hitm = c2c_he->stats.tot_hitm;
-	other_load = (c2c_he->stats.load >= total_hitm) ? c2c_he->stats.load - total_hitm : 0;
+	cycles_rmt_peer = avg_stats(&c2c_he->cstats.rmt_peer) * c2c_he->stats.rmt_peer;
+	cycles_lcl_peer = avg_stats(&c2c_he->cstats.lcl_peer) * c2c_he->stats.lcl_peer;
+
+	categorized = (u64)c2c_he->stats.tot_hitm + c2c_he->stats.tot_peer;
+	other_load = (c2c_he->stats.load >= categorized) ?
+		     c2c_he->stats.load - categorized : 0;
 	cycles_load = avg_stats(&c2c_he->cstats.load) * other_load;
 
-	return (u64)(cycles_rmt + cycles_lcl + cycles_load);
+	return (u64)(cycles_rmt + cycles_lcl + cycles_rmt_peer +
+		     cycles_lcl_peer + cycles_load);
 }
 
 /* Sum c2c_hist_entry__cycles() across all level-1 entries. */
-static __maybe_unused u64 c2c_ext__total_cycles(void)
+static u64 c2c_ext__total_cycles(void)
 {
 	struct rb_node *nd;
 	u64 total = 0;
@@ -168,11 +181,18 @@ static __maybe_unused u64 c2c_ext__total_cycles(void)
 }
 
 /* Sum child entries' store counts under a level-1 hist_entry. */
-static __maybe_unused u64 hist_entry__child_stores(struct hist_entry *he)
+static u64 hist_entry__child_stores(struct hist_entry *he)
 {
 	struct rb_node *nd;
 	u64 sum = 0;
 
+	/*
+	 * Leaf entries alias hroot_out with sorted_chain (callchains), so
+	 * there are no child hist_entry nodes to walk here.
+	 */
+	if (he->leaf)
+		return 0;
+
 	for (nd = rb_first_cached(&he->hroot_out); nd; nd = rb_next(nd)) {
 		struct hist_entry *child = rb_entry(nd, struct hist_entry, rb_node);
 		struct c2c_hist_entry *c2c_child =
@@ -609,7 +629,7 @@ function_hpp_list__parse(struct perf_hpp_list *hpp_list,
 	return ret;
 }
 
-static __maybe_unused int
+static int
 c2c_function_hists__init(struct c2c_hists *hists,
 			 const char *sort,
 			 int nr_header_lines,
@@ -624,7 +644,7 @@ c2c_function_hists__init(struct c2c_hists *hists,
 	return function_hpp_list__parse(&hists->list, /*output=*/NULL, sort, env);
 }
 
-static __maybe_unused int
+static int
 c2c_function_hists__reinit(struct c2c_hists *c2c_hists,
 			   const char *output,
 			   const char *sort,
@@ -675,7 +695,7 @@ static void c2c_stats_merge(struct stats *dest, const struct stats *src)
 }
 
 /* Merge compute_stats during function aggregation. */
-static __maybe_unused void c2c_add_cstats(struct compute_stats *dest,
+static void c2c_add_cstats(struct compute_stats *dest,
 			   const struct compute_stats *src)
 {
 	c2c_stats_merge(&dest->rmt_hitm, &src->rmt_hitm);
@@ -685,7 +705,7 @@ static __maybe_unused void c2c_add_cstats(struct compute_stats *dest,
 	c2c_stats_merge(&dest->load, &src->load);
 }
 
-static __maybe_unused bool hist_entry__add_c2c_stats(struct hist_entry *he,
+static bool hist_entry__add_c2c_stats(struct hist_entry *he,
 				      const struct c2c_stats *stats)
 {
 	u64 nr_events = c2c_hitm_count(stats) + stats->rmt_peer + stats->lcl_peer;
@@ -885,7 +905,7 @@ c2c_child_entry__insert(struct hist_entry *parent_he, struct hist_entry *child_h
 	parent_he->leaf = false;
 }
 
-static __maybe_unused struct hist_entry *
+static struct hist_entry *
 c2c_function_hists__level1_entry(struct symbol *sym, u64 iaddr,
 				 struct hist_entry *detail_he,
 				 struct thread *synthetic_thread)
@@ -947,7 +967,7 @@ c2c_function_hists__level1_entry(struct symbol *sym, u64 iaddr,
 	return he;
 }
 
-static __maybe_unused struct c2c_hist_entry *
+static struct c2c_hist_entry *
 c2c_function_hists__level2_entry(struct c2c_hist_entry *level1_c2c,
 				 struct symbol *sym, u64 iaddr,
 				 struct hist_entry *detail_he)
@@ -1013,7 +1033,7 @@ c2c_function_hists__level2_entry(struct c2c_hist_entry *level1_c2c,
 	return level2_c2c;
 }
 
-static __maybe_unused struct c2c_hist_entry *
+static struct c2c_hist_entry *
 c2c_function_hists__level3_entry(struct c2c_hist_entry *level2_c2c, u64 cl_addr,
 				 struct c2c_hist_entry *cacheline_src_he)
 {
@@ -1054,6 +1074,325 @@ c2c_function_hists__level3_entry(struct c2c_hist_entry *level2_c2c, u64 cl_addr,
 	return level3_c2c;
 }
 
+/*
+ * Re-sort child entries of @parent_he by total store count, descending.
+ */
+static void c2c_he__resort_by_stores(struct hist_entry *parent_he)
+{
+	struct rb_root_cached new_root = RB_ROOT_CACHED;
+	struct rb_node *nd;
+
+	if (!parent_he->has_children)
+		return;
+
+	/* Extract all nodes and re-insert sorted by total_stores */
+	while ((nd = rb_first_cached(&parent_he->hroot_out))) {
+		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
+		struct c2c_hist_entry *c2c_he = container_of(he, struct c2c_hist_entry, he);
+		struct rb_node **p = &new_root.rb_root.rb_node;
+		struct rb_node *parent = NULL;
+		bool leftmost = true;
+		int cmp;
+
+		/* Remove from current tree */
+		rb_erase_cached(&he->rb_node, &parent_he->hroot_out);
+
+		/* Insert sorted by store count, descending. */
+		while (*p) {
+			struct hist_entry *iter = rb_entry(*p, struct hist_entry, rb_node);
+			struct c2c_hist_entry *c2c_iter = container_of(iter,
+								       struct c2c_hist_entry,
+								       he);
+
+			parent = *p;
+			if (c2c_he->stats.store != c2c_iter->stats.store) {
+				cmp = c2c_he->stats.store > c2c_iter->stats.store ? -1 : 1;
+			} else {
+				/* Stable tie-break: instruction address, then name. */
+				u64 a = hist_entry__iaddr(he), b = hist_entry__iaddr(iter);
+
+				if (a != b)
+					cmp = a < b ? -1 : 1;
+				else if (he->ms.sym && iter->ms.sym)
+					cmp = arch__compare_symbol_names(he->ms.sym->name,
+									 iter->ms.sym->name);
+				else
+					cmp = (iter->ms.sym ? 1 : 0) - (he->ms.sym ? 1 : 0);
+			}
+
+			if (cmp < 0) {
+				p = &parent->rb_left;
+			} else {
+				p = &parent->rb_right;
+				leftmost = false;
+			}
+		}
+
+		rb_link_node(&he->rb_node, parent, p);
+		rb_insert_color_cached(&he->rb_node, &new_root, leftmost);
+	}
+
+	parent_he->hroot_out = new_root;
+}
+
+/* Initial per-cacheline capacity for the seen[] set; grown on demand. */
+#define DEFAULT_SYMBOLS_PER_CL 64
+
+struct function_seen {
+	struct symbol	*sym;
+	u64		 iaddr;
+};
+
+static bool function_seen__find(const struct function_seen *seen, int nr,
+				struct symbol *sym, u64 iaddr)
+{
+	int i;
+
+	for (i = 0; i < nr; i++) {
+		if (seen[i].iaddr == iaddr &&
+		    symbol_name_equal(seen[i].sym, sym))
+			return true;
+	}
+	return false;
+}
+
+/* Aggregate stats from the cacheline-side entry @c2c_b into level 2/3 @dst. */
+static bool c2c_he__add_sharing(struct c2c_hist_entry *dst, struct c2c_hist_entry *src)
+{
+	/* Do the fallible update first so a failure leaves dst unmodified. */
+	if (!hist_entry__add_c2c_stats(&dst->he, &src->stats))
+		return false;
+
+	c2c_add_stats(&dst->stats, &src->stats);
+	c2c_add_cstats(&dst->cstats, &src->cstats);
+	return true;
+}
+
+/*
+ * Process one cacheline and create/update the level-1/2/3 hierarchy entries
+ * for every pair of functions sharing it.
+ */
+static int c2c_function__process_cl(struct c2c_hist_entry *cacheline_he, u64 cl_addr,
+				    struct thread *synthetic_thread)
+{
+	struct rb_node *nd_a, *nd_b;
+	struct function_seen *seen = NULL;
+	int nr_seen = 0, nr_alloc = 0;
+	int ret = 0;
+
+	for (nd_a = rb_first_cached(&cacheline_he->hists->hists.entries); nd_a;
+	     nd_a = rb_next(nd_a)) {
+		struct hist_entry *he_a = rb_entry(nd_a, struct hist_entry, rb_node);
+		struct c2c_hist_entry *c2c_a;
+		struct hist_entry *level1_he;
+		struct c2c_hist_entry *level1_c2c;
+		u64 iaddr_a;
+
+		if (!he_a->ms.sym || he_a->filtered)
+			continue;
+
+		c2c_a = container_of(he_a, struct c2c_hist_entry, he);
+		iaddr_a = hist_entry__iaddr(he_a);
+
+		level1_he = c2c_function_hists__level1_entry(he_a->ms.sym, iaddr_a,
+							     he_a, synthetic_thread);
+		if (!level1_he) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		level1_c2c = container_of(level1_he, struct c2c_hist_entry, he);
+
+		/*
+		 * Aggregate every source entry into its level-1 (sym, iaddr)
+		 * parent. level1_he is keyed by (sym, iaddr), so all siblings
+		 * collapse into the same parent. When the cacheline view splits
+		 * one (sym, iaddr) into siblings (only under -d pid/tid/dso),
+		 * each sibling holds a DISJOINT slice of the traffic, so summing
+		 * them here is correct accumulation, not double counting. The
+		 * seen[] set below therefore guards only the inner B-loop (to
+		 * avoid building a level-2 subtree twice), never this L1 update.
+		 * Update he->stat first; on failure leave the aggregates untouched.
+		 */
+		if (!hist_entry__add_c2c_stats(level1_he, &c2c_a->stats)) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		c2c_add_stats(&level1_c2c->stats, &c2c_a->stats);
+		c2c_add_cstats(&level1_c2c->cstats, &c2c_a->cstats);
+		c2c_add_stats(&c2c_ext.function_hists.stats, &c2c_a->stats);
+
+		/* Skip the inner loop when this (symbol, iaddr) is already a parent. */
+		if (function_seen__find(seen, nr_seen, he_a->ms.sym, iaddr_a))
+			continue;
+
+		if (nr_seen == nr_alloc) {
+			struct function_seen *tmp;
+			int new_alloc = nr_alloc ? nr_alloc * 2 : DEFAULT_SYMBOLS_PER_CL;
+
+			tmp = reallocarray(seen, new_alloc, sizeof(*seen));
+			if (!tmp) {
+				ret = -ENOMEM;
+				goto out;
+			}
+			seen = tmp;
+			nr_alloc = new_alloc;
+		}
+		seen[nr_seen].sym = he_a->ms.sym;
+		seen[nr_seen].iaddr = iaddr_a;
+		nr_seen++;
+
+		for (nd_b = rb_first_cached(&cacheline_he->hists->hists.entries); nd_b;
+		     nd_b = rb_next(nd_b)) {
+			struct hist_entry *he_b = rb_entry(nd_b, struct hist_entry, rb_node);
+			struct c2c_hist_entry *c2c_b, *level2_c2c, *level3_c2c;
+			u64 iaddr_b;
+
+			if (!he_b->ms.sym || he_b->filtered)
+				continue;
+
+			c2c_b = container_of(he_b, struct c2c_hist_entry, he);
+			iaddr_b = hist_entry__iaddr(he_b);
+
+			/* Skip self. */
+			if (iaddr_a == iaddr_b &&
+			    symbol_name_equal(he_a->ms.sym, he_b->ms.sym))
+				continue;
+
+			level2_c2c = c2c_function_hists__level2_entry(level1_c2c, he_b->ms.sym,
+								      iaddr_b, he_b);
+			if (!level2_c2c || !c2c_he__add_sharing(level2_c2c, c2c_b)) {
+				ret = -ENOMEM;
+				goto out;
+			}
+
+			level3_c2c = c2c_function_hists__level3_entry(level2_c2c, cl_addr,
+								      cacheline_he);
+			if (!level3_c2c) {
+				ret = -ENOMEM;
+				goto out;
+			}
+
+			if (!c2c_he__add_sharing(level3_c2c, c2c_b)) {
+				ret = -ENOMEM;
+				goto out;
+			}
+		}
+	}
+
+out:
+	free(seen);
+	return ret;
+}
+
+/* Sort level-2/3 children by store count, then compute the global total. */
+static void c2c_function__finalize(void)
+{
+	struct rb_node *nd_l1;
+
+	for (nd_l1 = rb_first_cached(&c2c_ext.function_hists.hists.entries); nd_l1;
+	     nd_l1 = rb_next(nd_l1)) {
+		struct hist_entry *he_l1 = rb_entry(nd_l1, struct hist_entry, rb_node);
+		struct rb_node *nd_l2;
+
+		if (!he_l1->has_children)
+			continue;
+
+		c2c_he__resort_by_stores(he_l1);
+
+		for (nd_l2 = rb_first_cached(&he_l1->hroot_out); nd_l2;
+		     nd_l2 = rb_next(nd_l2)) {
+			struct hist_entry *he_l2 = rb_entry(nd_l2, struct hist_entry, rb_node);
+
+			if (he_l2->has_children)
+				c2c_he__resort_by_stores(he_l2);
+		}
+	}
+
+	c2c_ext.total_cycles = c2c_ext__total_cycles();
+}
+
+/*
+ * Build the three-level function view in a single pass over the cacheline
+ * entries:
+ *   L1: aggregate stats per primary function
+ *   L2: sharing functions referenced from each L1 function
+ *   L3: cachelines that pair L1 with L2
+ */
+static __maybe_unused int build_function_view_hierarchy(void)
+{
+	static const char output_fields[] =
+		"cycles_percent,total_stores,iaddr_symbol,symbol_view,cacheline_symbol";
+	struct rb_node *nd_cl;
+	int ret;
+
+	c2c_ext.total_cycles = 0;
+	memset(&c2c_ext.function_hists.stats, 0,
+	       sizeof(c2c_ext.function_hists.stats));
+
+	hists__delete_entries(&c2c_ext.function_hists.hists);
+	if (c2c_ext.function_hists.list.fields.next)
+		perf_hpp__reset_output_field(&c2c_ext.function_hists.list);
+
+	ret = c2c_function_hists__init(&c2c_ext.function_hists,
+				       "iaddr_symbol,symbol_view", 2, NULL);
+	if (ret)
+		return ret;
+
+	nd_cl = rb_first_cached(&c2c.hists.hists.entries);
+
+	/* An empty C2C report yields an empty (but valid) function view. */
+	for (; nd_cl; nd_cl = rb_next(nd_cl)) {
+		struct hist_entry *he_cl = rb_entry(nd_cl, struct hist_entry, rb_node);
+		struct c2c_hist_entry *cacheline_he = container_of(he_cl,
+								   struct c2c_hist_entry, he);
+		struct thread *synthetic_thread = he_cl->thread;
+		u64 cl_addr;
+
+		/*
+		 * Include any cacheline with sharing activity (HITM, peer,
+		 * stores or loads), not just HITM, so totals/sorting reflect
+		 * all aggregated traffic surfaced by the function view.
+		 */
+		if ((c2c_hitm_count(&cacheline_he->stats) == 0 &&
+		     cacheline_he->stats.tot_peer == 0 &&
+		     cacheline_he->stats.store == 0 &&
+		     cacheline_he->stats.load == 0) ||
+		    !cacheline_he->hists ||
+		    RB_EMPTY_ROOT(&cacheline_he->hists->hists.entries.rb_root) ||
+		    !he_cl->mem_info || !synthetic_thread)
+			continue;
+
+		cl_addr = cl_address(mem_info__daddr(he_cl->mem_info)->addr, chk_double_cl);
+		ret = c2c_function__process_cl(cacheline_he, cl_addr, synthetic_thread);
+		if (ret)
+			goto out_err;
+	}
+
+	ret = c2c_function_hists__reinit(&c2c_ext.function_hists, output_fields,
+					 "cycles_percent", NULL);
+	if (ret)
+		goto out_err;
+
+	hists__collapse_resort(&c2c_ext.function_hists.hists, NULL);
+	hists__output_resort(&c2c_ext.function_hists.hists, NULL);
+
+	c2c_function__finalize();
+
+	return 0;
+
+out_err:
+	/*
+	 * On error, migrate any entries still in entries_in to entries and
+	 * delete them, so a later rebuild does not strand them (the top-level
+	 * __hists__init() memset would otherwise lose the pointers).
+	 */
+	hists__collapse_resort(&c2c_ext.function_hists.hists, NULL);
+	hists__output_resort(&c2c_ext.function_hists.hists, NULL);
+	hists__delete_entries(&c2c_ext.function_hists.hists);
+	return ret;
+}
+
 int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
 {
 	ui__warning("C2C function view is not implemented yet.\n");
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 12/14] perf c2c: add function view browser UI
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (10 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 11/14] perf c2c: add function view hierarchy builder Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:03 ` [PATCH 13/14] perf c2c: add TAB key to switch to function view Jiebin Sun
  2026-06-26  7:03 ` [PATCH 14/14] perf c2c: document function view in perf-c2c man page Jiebin Sun
  13 siblings, 0 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Replace the stub perf_c2c__browse_function_view() with the full
interactive browser implementation:

- c2c_function_browser__new(): create browser with hierarchy support,
  disabling callchains for clean function-level display
- c2c_function_browser__browse_cacheline_detail(): drill into
  cacheline detail view via 'd' key
- c2c_function_browser__title(): display entry count in title bar
- c2c_function_browser__delete(): cleanup on exit

The browser saves/restores symbol_conf.use_callchain to avoid
corrupting the callchain display in the cacheline view after
returning from the function view.

The browser entry point perf_c2c__browse_function_view() is wired up by
the next patch, which adds the TAB key in the cacheline view to switch
into the function view.
Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/ui/browsers/c2c-function.c | 159 +++++++++++++++++++++++++-
 1 file changed, 153 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
index 452d64007604..11366fd5e2c1 100644
--- a/tools/perf/ui/browsers/c2c-function.c
+++ b/tools/perf/ui/browsers/c2c-function.c
@@ -2,7 +2,7 @@
 /*
  * C2C Function Browser - function-level cacheline sharing analysis
  *
- * Planned UI: 3-level hierarchy showing which functions share cachelines (not implemented yet):
+ * Displays a 3-level hierarchy showing which functions share cachelines:
  *   Level 1: Primary functions sorted by Cycles % (estimated load cycles)
  *   Level 2: Other functions sharing cachelines with the level-1 function
  *   Level 3: Specific shared cachelines between each pair of functions
@@ -44,7 +44,7 @@ struct perf_c2c_ext {
 	u64			total_cycles;
 };
 
-static struct perf_c2c_ext c2c_ext __maybe_unused;
+static struct perf_c2c_ext c2c_ext;
 
 struct c2c_function_browser {
 	struct hist_browser	hb;
@@ -1319,7 +1319,7 @@ static void c2c_function__finalize(void)
  *   L2: sharing functions referenced from each L1 function
  *   L3: cachelines that pair L1 with L2
  */
-static __maybe_unused int build_function_view_hierarchy(void)
+static int build_function_view_hierarchy(void)
 {
 	static const char output_fields[] =
 		"cycles_percent,total_stores,iaddr_symbol,symbol_view,cacheline_symbol";
@@ -1393,8 +1393,155 @@ static __maybe_unused int build_function_view_hierarchy(void)
 	return ret;
 }
 
-int perf_c2c__browse_function_view(struct hists *hists __maybe_unused)
+static int c2c_function_browser__title(struct hist_browser *browser,
+				       char *bf, size_t size)
 {
-	ui__warning("C2C function view is not implemented yet.\n");
-	return -ENOSYS;
+	scnprintf(bf, size,
+		  "Shared Data Functions Table     (%" PRIu64 " entries, sorted on Cycles %%)",
+		  browser->nr_non_filtered_entries);
+	return 0;
+}
+
+static struct c2c_function_browser *c2c_function_browser__new(struct hists *hists)
+{
+	struct c2c_function_browser *browser;
+
+	if (!hists)
+		return NULL;
+
+	browser = zalloc(sizeof(*browser));
+	if (!browser)
+		return NULL;
+
+	hist_browser__init(&browser->hb, hists);
+
+	browser->hb.title = c2c_function_browser__title;
+	browser->hb.c2c_filter = true;
+	browser->hb.show_headers = true;
+	/* Keep title line count consistent with forcing headers on. */
+	browser->hb.b.extra_title_lines = hists->hpp_list->nr_header_lines;
+	browser->hb.min_pcnt = 0.0;
+
+	/*
+	 * Note: symbol_conf.report_hierarchy is deliberately left unset.
+	 * The generic browser still descends into hroot_out children via
+	 * rb_hierarchy_next()/can_goto_child(), which key off he->unfolded,
+	 * so 'e'/'+' expands L1 -> L2 -> L3 correctly. Setting the flag would
+	 * additionally make hist_entry__delete() recurse hroot_out and free
+	 * each child, but our children borrow thread/ms (see
+	 * c2c_child_entry__alloc()), so that would underflow their refcounts.
+	 * Teardown is handled by c2c_he__free_hierarchy() instead.
+	 */
+	return browser;
+}
+
+/*
+ * c2c_function_browser__delete - Free function browser
+ */
+static void c2c_function_browser__delete(struct c2c_function_browser *browser)
+{
+	free(browser);
+}
+
+static int c2c_function_browser__browse_cacheline_detail(struct hist_entry *he_selection,
+							 struct hists *hists)
+{
+	struct rb_node *nd;
+	u64 cl_addr;
+
+	if (!he_selection || !he_selection->parent_he ||
+	    !he_selection->parent_he->parent_he || !he_selection->mem_info)
+		return -1;
+
+	cl_addr = cl_address(mem_info__daddr(he_selection->mem_info)->addr, chk_double_cl);
+
+	for (nd = rb_first_cached(&hists->entries); nd; nd = rb_next(nd)) {
+		struct hist_entry *he_cl = rb_entry(nd, struct hist_entry, rb_node);
+		u64 this_cl;
+
+		if (!he_cl->mem_info)
+			continue;
+
+		this_cl = cl_address(mem_info__daddr(he_cl->mem_info)->addr, chk_double_cl);
+		if (this_cl == cl_addr)
+			return perf_c2c__browse_cacheline(he_cl);
+	}
+
+	return -1;
+}
+
+/*
+ * perf_c2c__browse_function_view - Browse function view with TAB key support
+ * @hists: Main cacheline histograms
+ *
+ * Returns: 0 on success, negative error code on failure
+ */
+int perf_c2c__browse_function_view(struct hists *hists)
+{
+	struct c2c_function_browser *sym_browser;
+	bool saved_use_callchain = symbol_conf.use_callchain;
+	int key, ret;
+	static const char help[] =
+	" d             Display cacheline details for the selected entry\n"
+	" e/+           Expand/collapse the selected entry\n"
+	" TAB/ESC/q     Return to the cacheline view\n";
+
+	if (!hists)
+		return -EINVAL;
+
+	/* Disable callchain before building so no callchain structs are allocated. */
+	symbol_conf.use_callchain = false;
+
+	ret = build_function_view_hierarchy();
+	if (ret) {
+		ui__error("Failed to build function view hierarchy (ret=%d)\n", ret);
+		goto out;
+	}
+
+	sym_browser = c2c_function_browser__new(&c2c_ext.function_hists.hists);
+	if (!sym_browser) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/* Reset abort key so we can receive Ctrl-C as a key. */
+	SLang_reset_tty();
+	SLang_init_tty(0, 0, 0);
+	SLtty_set_suspend_state(true);
+
+	sym_browser->hb.nr_non_filtered_entries =
+		c2c_ext.function_hists.hists.nr_non_filtered_entries;
+
+	while (1) {
+		key = hist_browser__run(&sym_browser->hb, "? - help", true, 0);
+
+		switch (key) {
+		case 'q':
+		case K_TAB:
+		case K_ESC:
+			goto browser_done;
+		case 'd':
+			/* Cacheline detail honors the user's callchain setting. */
+			symbol_conf.use_callchain = saved_use_callchain;
+			c2c_function_browser__browse_cacheline_detail(sym_browser->hb.he_selection,
+								      hists);
+			/* Preserve any toggle made in the detail view, then
+			 * re-disable callchain for the function view.
+			 */
+			saved_use_callchain = symbol_conf.use_callchain;
+			symbol_conf.use_callchain = false;
+			break;
+		case '?':
+			ui_browser__help_window(&sym_browser->hb.b, help);
+			break;
+		default:
+			break;
+		}
+	}
+
+browser_done:
+	c2c_function_browser__delete(sym_browser);
+out:
+	symbol_conf.use_callchain = saved_use_callchain;
+	return ret;
 }
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 13/14] perf c2c: add TAB key to switch to function view
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (11 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 12/14] perf c2c: add function view browser UI Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  2026-06-26  7:03 ` [PATCH 14/14] perf c2c: document function view in perf-c2c man page Jiebin Sun
  13 siblings, 0 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Wire up the TAB key in the cacheline browser to launch the function
view browser via perf_c2c__browse_function_view(). This allows users
to switch between the traditional cacheline view and the new
function-level sharing analysis view.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/builtin-c2c.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 33271554e354..a924948bc507 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2693,6 +2693,7 @@ static int perf_c2c__hists_browse(struct hists *hists)
 	static const char help[] =
 	" d             Display cacheline details \n"
 	" ENTER         Toggle callchains (if present) \n"
+	" TAB           Switch to function view\n"
 	" q             Quit \n";
 
 	browser = perf_c2c_browser__new(hists);
@@ -2714,6 +2715,9 @@ static int perf_c2c__hists_browse(struct hists *hists)
 		case 'd':
 			perf_c2c__browse_cacheline(browser->he_selection);
 			break;
+		case '\t':
+			perf_c2c__browse_function_view(hists);
+			break;
 		case '?':
 			ui_browser__help_window(&browser->b, help);
 			break;
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 14/14] perf c2c: document function view in perf-c2c man page
  2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
                   ` (12 preceding siblings ...)
  2026-06-26  7:03 ` [PATCH 13/14] perf c2c: add TAB key to switch to function view Jiebin Sun
@ 2026-06-26  7:03 ` Jiebin Sun
  13 siblings, 0 replies; 23+ messages in thread
From: Jiebin Sun @ 2026-06-26  7:03 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Ingo Molnar,
	Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, Ian Rogers, James Clark,
	Jiri Olsa, Mark Rutland, Dapeng Mi, Thomas Falcon, Tianyou Li,
	Wangyang Guo, linux-perf-users, linux-kernel, Jiebin Sun

Describe the three-level function-view hierarchy (primary function,
sharing functions, shared cachelines) and the TAB/ESC/d/e key
bindings that drive it.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Wangyang Guo <wangyang.guo@intel.com>
---
 tools/perf/Documentation/perf-c2c.txt | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
index e57a122b8719..1be38167f339 100644
--- a/tools/perf/Documentation/perf-c2c.txt
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -365,6 +365,21 @@ TUI OUTPUT
 The TUI output provides interactive interface to navigate
 through cachelines list and to display offset details.
 
+Pressing the 'TAB' key in the cacheline view switches to the function
+view.  The function view shows a three-level hierarchy of the same data,
+organized around functions rather than cachelines:
+
+  Level 1: primary functions, sorted by Cycles % (estimated load cycles:
+           HITM, peer-snoop and other-load cycles -- on systems whose
+           default display mode is peer, such as Arm64, the peer-snoop
+           component dominates)
+  Level 2: other functions that share cachelines with the level-1 function
+  Level 3: the specific shared cachelines for each function pair
+
+In the function view the 'd' key opens the detail view of the selected
+cacheline, 'e'/'+' expands or collapses the current entry, and 'TAB',
+'ESC' or 'q' returns to the cacheline view.
+
 For details please refer to the help window by pressing '?' key.
 
 CREDITS
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 02/14] perf c2c: add function view browser skeleton
  2026-06-26  7:03 ` [PATCH 02/14] perf c2c: add function view browser skeleton Jiebin Sun
@ 2026-06-26  7:11   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:11 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The newly added `perf_c2c__browse_function_view` prototype is guarded by `HAVE_SLANG_SUPPORT` but lacks a dummy inline fallback stub in an `#else` block.
--

commit b142a11979c1d1a680751e23d2fc5169f9d5fc53
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add function view browser skeleton

This commit adds c2c-function.c with a stub perf_c2c__browse_function_view()
entry point and registers it in the build system. It declares the function
prototype in c2c.h to prepare for function-level cacheline sharing analysis.

> diff --git a/tools/perf/c2c.h b/tools/perf/c2c.h
> index 3e974dd1d7ee6..385709536a1ae 100644
> --- a/tools/perf/c2c.h
> +++ b/tools/perf/c2c.h
> @@ -134,6 +134,7 @@ bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b);
>  
>  #ifdef HAVE_SLANG_SUPPORT
>  int perf_c2c__browse_cacheline(struct hist_entry *he);
> +int perf_c2c__browse_function_view(struct hists *hists);

[Severity: Low]
Should this prototype have a dummy inline fallback stub in an #else block?
Without a dummy fallback stub, any unguarded call to
perf_c2c__browse_function_view() on a system lacking slang support might result
in a build failure.

>  #endif
>  
>  #endif /* _PERF_C2C_H_ */

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=2

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 01/14] perf c2c: extract shared data structures into c2c.h
  2026-06-26  7:03 ` [PATCH 01/14] perf c2c: extract shared data structures into c2c.h Jiebin Sun
@ 2026-06-26  7:13   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:13 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Medium] Missing feature fallback inline stubs for SLANG-dependent functions in header.
- [Medium] Asymmetric memory management API leading to potential resource leaks.
- [Medium] Global namespace pollution with generic function names.
--

commit e451cf21befaafcbe2925718151023dd5ccba2cc
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: extract shared data structures into c2c.h

This patch moves c2c_hists, compute_stats, c2c_hist_entry, and perf_c2c
structure definitions from builtin-c2c.c into a new shared header c2c.h.
This allows the upcoming function view browser to reuse these types.

> diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
> index c9584dbedf77a..33271554e354b 100644
> --- a/tools/perf/builtin-c2c.c
> +++ b/tools/perf/builtin-c2c.c
[ ... ]
> @@ -141,9 +76,9 @@ static const struct option c2c_options[] = {
>  	OPT_END()
>  };
>  
> -static struct perf_c2c c2c;
> +struct perf_c2c c2c;
>  
> -static void *c2c_he_zalloc(size_t size)
> +void *c2c_he_zalloc(size_t size)
>  {
>  	struct c2c_hist_entry *c2c_he;

[Severity: Medium]
By exporting c2c_he_zalloc(), is there a risk of memory leaks since a
corresponding free function is not exported?

Looking at the rest of c2c_he_zalloc() in builtin-c2c.c, it allocates several
sub-structures:

    c2c_he->cpuset = bitmap_zalloc(c2c.cpus_cnt);
    ...
    c2c_he->nodeset = bitmap_zalloc(c2c.nodes_cnt);
    ...
    c2c_he->node_stats = calloc(c2c.nodes_cnt, sizeof(*c2c_he->node_stats));

Any external module (like the new function view) that uses this allocator
won't have an encapsulated way to clean these up, forcing manual duplication
of the teardown logic to avoid leaks.

[ ... ]
> @@ -1928,7 +1816,7 @@ static struct c2c_dimension *dimensions[] = {
>  	NULL,
>  };
>  
> -static void fmt_free(struct perf_hpp_fmt *fmt)
> +void fmt_free(struct perf_hpp_fmt *fmt)
>  {
>  	struct c2c_fmt *c2c_fmt;
>  
> @@ -1936,7 +1824,7 @@ static void fmt_free(struct perf_hpp_fmt *fmt)
>  	free(c2c_fmt);
>  }
>  
> -static bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b)
> +bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b)

[Severity: Medium]
Could the removal of the static keyword here lead to symbol collisions?

Since fmt_free() and fmt_equal() lack a subsystem prefix (like c2c_ or
perf_c2c_), exporting these generic names to the global namespace increases
the risk of collisions during static linking or Link-Time Optimization with
other perf tool objects.

> diff --git a/tools/perf/c2c.h b/tools/perf/c2c.h
> new file mode 100644
> index 0000000000000..3e974dd1d7ee6
> --- /dev/null
> +++ b/tools/perf/c2c.h
[ ... ]
> +#ifdef HAVE_SLANG_SUPPORT
> +int perf_c2c__browse_cacheline(struct hist_entry *he);
> +#endif

[Severity: Medium]
Should this header provide a fallback stub for perf_c2c__browse_cacheline()
when HAVE_SLANG_SUPPORT is not defined?

The perf subsystem build guidelines require header files to provide
compatible dummy inline stubs (e.g., returning -ENOTSUPP) when feature
defines are absent. Omitting the #else block here might lead to missing
functionality or build breakages on systems lacking SLANG.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 03/14] perf c2c: add function view type definitions and helpers
  2026-06-26  7:03 ` [PATCH 03/14] perf c2c: add function view type definitions and helpers Jiebin Sun
@ 2026-06-26  7:14   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:14 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The `symbol_name_equal` helper relies on host-bound `arch__compare_symbol_names()`, which breaks cross-platform analysis for the c2c function view.
--

commit a7beef2e1248ae5cd6dc127b26c6803a87eb9dd4
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add function view type definitions and helpers

Add the foundational type definitions and helper functions for the
function view browser.

> diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
> index 040266288be3d..dbc912a4a242b 100644
> --- a/tools/perf/ui/browsers/c2c-function.c
> +++ b/tools/perf/ui/browsers/c2c-function.c
[ ... ]
> +static __maybe_unused inline bool symbol_name_equal(struct symbol *a, struct symbol *b)
> +{
> +	return a && b && arch__compare_symbol_names(a->name, b->name) == 0;

[Severity: Medium]
Does this break cross-platform analysis for the c2c function view?

Calling arch__compare_symbol_names() relies on a weak symbol bound to the host
architecture at compile time, since files in tools/perf/arch/ are conditionally
linked based on the build machine.

If a user runs perf c2c on an x86 host to analyze a perf.data file recorded on
PowerPC, this falls back to strcmp(). Can this result in failing to correctly
group identical symbols that have architecture-specific variations (like
PowerPC's .my_func vs my_func)?

Should this dynamically query the e_machine of the analyzed perf.data file
instead of relying on host-bound architecture functions?

> +}
> +
> +static __maybe_unused inline u64 hist_entry__iaddr(struct hist_entry *he)

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=3

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 08/14] perf c2c: add HPP list parsing for function view histograms
  2026-06-26  7:03 ` [PATCH 08/14] perf c2c: add HPP list parsing for function view histograms Jiebin Sun
@ 2026-06-26  7:16   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:16 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] OOM failures in get_function_format() are swallowed and misreported as invalid dimensions.
--

commit c2feb45b926c3d4847f0ee3b178a0bfe059a7079
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add HPP list parsing for function view histograms

This commit adds the histogram initialization and column parsing
infrastructure for the perf c2c function view. It bridges custom dimensions
with the perf HPP list infrastructure to enable dynamic column configuration.

> diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
> index 794a023239ce1..c11a81d93c21e 100644
> --- a/tools/perf/ui/browsers/c2c-function.c
> +++ b/tools/perf/ui/browsers/c2c-function.c
[ ... ]
> @@ -509,6 +509,146 @@ static __maybe_unused struct c2c_fmt *get_function_format(const char *name)
>  	return c2c_fmt;
>  }
>  
> +static int
> +c2c_function_hists__init_output(struct perf_hpp_list *hpp_list, char *name,
> +				struct perf_env *env __maybe_unused)
> +{
> +	struct c2c_fmt *c2c_fmt = get_function_format(name);
> +	int level = 0;
> +
> +	if (!c2c_fmt) {
> +		reset_dimensions();
> +		return output_field_add(hpp_list, name, &level);
> +	}

[Severity: Medium]
If zalloc() fails inside get_function_format(), it returns NULL. Does this
fallback path swallow the -ENOMEM and mistakenly treat it as an unknown
dimension?

Since it falls back to output_field_add() which will fail with -EINVAL, the
caller function_hpp_list__add_tokens() might print the misleading error
'Invalid c2c function-view field' instead of reporting the memory allocation
failure.

[ ... ]
> +static int
> +c2c_function_hists__init_sort(struct perf_hpp_list *hpp_list, char *name,
> +			      struct perf_env *env)
> +{
> +	struct c2c_fmt *c2c_fmt = get_function_format(name);
> +
> +	if (!c2c_fmt) {
> +		reset_dimensions();
> +		return sort_dimension__add(hpp_list, name, /*evlist=*/NULL, env, /*level=*/0);
> +	}

[Severity: Medium]
Similarly here, does this path also swallow an -ENOMEM from
get_function_format() and mistakenly treat it as an unknown dimension?

> +
> +	/* Mark symbol-backed sort keys so hists__has(hists, sym) is correct. */

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=8

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 09/14] perf c2c: add stats merging and memory management helpers
  2026-06-26  7:03 ` [PATCH 09/14] perf c2c: add stats merging and memory management helpers Jiebin Sun
@ 2026-06-26  7:17   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:17 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Memory leak of nested histogram entries because `hists__delete_entries()` is used instead of `hists__delete_all_entries()`.
--

commit d68acbe4d76a46f5dbb0cda5a255e3f5f919e159
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add stats merging and memory management helpers

This commit adds statistics aggregation and memory management functions for
function view histogram entries, including c2c_stats_merge(),
c2c_function_he_free(), and c2c_he__free_hierarchy() to handle merging and
recursive cleanup during hierarchy construction.

> diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
> index c11a81d93c21e..a42378f395f59 100644
> --- a/tools/perf/ui/browsers/c2c-function.c
> +++ b/tools/perf/ui/browsers/c2c-function.c
[ ... ]
> @@ -649,6 +649,164 @@ c2c_function_hists__reinit(struct c2c_hists *c2c_hists,
[ ... ]
> +/*
> + * Free a function-view histogram entry (hist_entry_ops::free).
> + */
> +static void c2c_function_he_free(void *ptr)
> +{
> +	struct hist_entry *he = ptr;
> +	struct c2c_hist_entry *c2c_he;
> +
> +	c2c_he = container_of(he, struct c2c_hist_entry, he);
> +
> +	if (c2c_he->hists) {
> +		perf_hpp__reset_output_field(&c2c_he->hists->list);
> +		hists__delete_entries(&c2c_he->hists->hists);

[Severity: Medium]
Does this code leak nested histogram entries that haven't reached the output
sorted stage?

hists__delete_entries() only iterates over and frees entries in the output
tree. Any entries left in the input or collapsed trees might not be freed.

Should this be hists__delete_all_entries() to prevent leaking these
unprocessed entries, similar to the cleanup handling in builtin-c2c.c?

> +		zfree(&c2c_he->hists);
> +	}
> +
> +	c2c_he__free_hierarchy(he);
> +
[ ... ]
> +static void c2c_he__free_hierarchy(struct hist_entry *he)
> +{
> +	struct rb_node *nd;
> +	struct hist_entry *child_he;
> +	struct c2c_hist_entry *child_c2c;
> +
[ ... ]
> +	nd = rb_first_cached(&he->hroot_out);
> +	while (nd) {
> +		struct rb_node *next = rb_next(nd);
> +
> +		child_he = rb_entry(nd, struct hist_entry, rb_node);
> +		child_c2c = container_of(child_he, struct c2c_hist_entry, he);
> +
> +		if (child_he->stat_acc)
> +			zfree(&child_he->stat_acc);
> +
> +		if (child_he->mem_info)
> +			mem_info__put(child_he->mem_info);
> +
> +		if (child_c2c->hists) {
> +			perf_hpp__reset_output_field(&child_c2c->hists->list);
> +			hists__delete_entries(&child_c2c->hists->hists);

[Severity: Medium]
Could this recursive hierarchy deletion also leak histogram entries?

Like the issue above in c2c_function_he_free(), using hists__delete_entries()
instead of hists__delete_all_entries() skips freeing any entries that are
still in the input or collapsed trees.

> +			zfree(&child_c2c->hists);
> +		}
> +

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=9

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 10/14] perf c2c: add hierarchy entry creation and lookup functions
  2026-06-26  7:03 ` [PATCH 10/14] perf c2c: add hierarchy entry creation and lookup functions Jiebin Sun
@ 2026-06-26  7:19   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:19 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Raw pointer assignment of reference-counted `thread` and `map_symbol` structs violates subsystem architectural rules.
--

commit 1ca4db3fb5b5554b1481105028f0603bdc285571
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add hierarchy entry creation and lookup functions

This commit adds functions for creating and finding entries at each level
of the 3-level function view hierarchy for perf c2c. It introduces
c2c_child_entry__alloc() for allocation and other helper functions to build
the hierarchy.

> diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
> index a42378f395f59..a881839299e37 100644
> --- a/tools/perf/ui/browsers/c2c-function.c
> +++ b/tools/perf/ui/browsers/c2c-function.c

[ ... ]

> +static struct c2c_hist_entry *
> +c2c_child_entry__alloc(struct hist_entry *parent_he, struct hist_entry *src_he,
> +		       int depth, u64 ip)
> +{
> +	struct c2c_hist_entry *child_c2c;
> +	struct hist_entry *child_he;
> +	size_t callchain_size;
> +
> +	callchain_size = symbol_conf.use_callchain ? sizeof(struct callchain_root) : 0;
> +	child_he = c2c_he_zalloc(callchain_size);
> +	if (!child_he)
> +		return NULL;
> +
> +	child_c2c = container_of(child_he, struct c2c_hist_entry, he);
> +	child_he->callchain_size = callchain_size;
> +	if (callchain_size)
> +		callchain_init(child_he->callchain);
> +
> +	memcpy(&child_he->ms, &src_he->ms, sizeof(struct map_symbol));
> +
> +	if (src_he->mem_info) {
> +		child_he->mem_info = mem_info__clone(src_he->mem_info);
> +		if (!child_he->mem_info)
> +			goto out_free;
> +	}
> +
> +	child_he->thread = src_he->thread;

[Severity: Medium]
Does this code bypass the reference tracking helpers for the thread and
map_symbol structs?

Using memcpy() for map_symbol and raw pointer assignment for the thread
handle circumvents the reference count checking safety mechanisms. Should
this use explicit lifecycle helpers like thread__get() and map__get() to
ensure proper reference count balancing?

> +	child_he->cpumode = src_he->cpumode;
> +	child_he->cpu = src_he->cpu;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=10

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 06/14] perf c2c: add comparison functions for function view sorting
  2026-06-26  7:03 ` [PATCH 06/14] perf c2c: add comparison functions for function view sorting Jiebin Sun
@ 2026-06-26  7:22   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:22 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] total_stores_cmp performs an O(M) traversal of child hists during the sorting phase, introducing an O(N * M log N) scaling bottleneck.
- [Low] Comment in iaddr_symbol_cmp incorrectly describes the sorting order as descending when the implementation sorts in ascending order.
--

commit f2c1f87b66997d6b6a124f1e2e94dfbb11092062
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add comparison functions for function view sorting

This patch adds sort comparison functions for the function view columns in 
perf c2c. It includes functions like cycles_percent_cmp, iaddr_symbol_cmp, 
and total_stores_cmp to support sorting.

> diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
> --- a/tools/perf/ui/browsers/c2c-function.c
> +++ b/tools/perf/ui/browsers/c2c-function.c
[ ... ]
> +static __maybe_unused int64_t
> +iaddr_symbol_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
> +		 struct hist_entry *left, struct hist_entry *right)
> +{
[ ... ]
> +	left_iaddr = hist_entry__iaddr(left);
> +	right_iaddr = hist_entry__iaddr(right);
> +
> +	/*
> +	 * Descending by instruction address (same direction as sort__iaddr_cmp(),
> +	 * which also returns r - l): the expression is +1 when left < right.

[Severity: Low]
This isn't a bug, but the comment describes the sorting order as descending,
while returning +1 when left < right instructs the sorting algorithm to place
smaller elements before larger ones, resulting in an ascending order. 

Should the comment be updated to accurately reflect the actual sort order?

> +	 * Uses hist_entry__iaddr(), which falls back to he->ip when mem_info is
> +	 * NULL, so it matches what iaddr_symbol_entry() displays.
> +	 */
> +	return (left_iaddr < right_iaddr) - (left_iaddr > right_iaddr);
> +}
[ ... ]
> +static __maybe_unused int64_t
> +total_stores_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
> +		 struct hist_entry *left, struct hist_entry *right)
> +{
> +	struct c2c_hist_entry *c2c_left = container_of(left, struct c2c_hist_entry, he);
> +	struct c2c_hist_entry *c2c_right = container_of(right, struct c2c_hist_entry, he);
> +	u64 left_store, right_store;
> +
> +	/* Match total_stores_entry(): L1 sums child stores, L2/L3 use their own. */
> +	left_store = left->parent_he ? (u64)c2c_left->stats.store :
> +				       hist_entry__child_stores(left);
> +	right_store = right->parent_he ? (u64)c2c_right->stats.store :
> +					 hist_entry__child_stores(right);

[Severity: Medium]
Will this cause a severe UI slowdown or hang on large profiles? 

During the sorting phase, the tree resort calls total_stores_cmp for node
comparisons. Since hist_entry__child_stores() performs a full traversal of
child entries, this introduces a scaling bottleneck on every comparison. 

Can the sum be cached instead of dynamically recalculating it by iterating
through children over and over during the tree sort?

> +
> +	return (left_store > right_store) - (left_store < right_store);
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=6

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 07/14] perf c2c: add dimension definitions and format creation
  2026-06-26  7:03 ` [PATCH 07/14] perf c2c: add dimension definitions and format creation Jiebin Sun
@ 2026-06-26  7:23   ` sashiko-bot
  0 siblings, 0 replies; 23+ messages in thread
From: sashiko-bot @ 2026-06-26  7:23 UTC (permalink / raw)
  To: Jiebin Sun; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] The c2c_se_cmp wrapper improperly forces fmt->sort to use dim->se->se_cmp instead of dim->se->se_sort.
--

commit 160d3ea9bf317fa78359c422d21e9cf0a3470f5a
Author: Jiebin Sun <jiebin.sun@intel.com>

perf c2c: add dimension definitions and format creation

This patch adds function view dimension definitions that bind column names
to their rendering and comparison functions. It also introduces lookup and
format creation helpers for the function view.

> diff --git a/tools/perf/ui/browsers/c2c-function.c b/tools/perf/ui/browsers/c2c-function.c
> index 172e1aa86ecab..794a023239ce1 100644
> --- a/tools/perf/ui/browsers/c2c-function.c
> +++ b/tools/perf/ui/browsers/c2c-function.c

[ ... ]

> @@ -390,6 +390,125 @@ total_stores_cmp(struct perf_hpp_fmt *fmt __maybe_unused,

[ ... ]

> +/* Wrappers so sort_entry-backed dimensions sort/collapse via their se. */
> +static int64_t c2c_se_cmp(struct perf_hpp_fmt *fmt,
> +			  struct hist_entry *a, struct hist_entry *b)
> +{
> +	struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
> +	struct c2c_dimension *dim = c2c_fmt->dim;
> +
> +	return dim->se->se_cmp(a, b);
> +}
> +
> +static int64_t c2c_se_collapse(struct perf_hpp_fmt *fmt,
> +			       struct hist_entry *a, struct hist_entry *b)
> +{
> +	struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
> +	struct c2c_dimension *dim = c2c_fmt->dim;
> +	int64_t (*collapse_fn)(struct hist_entry *a, struct hist_entry *b);
> +
> +	collapse_fn = dim->se->se_collapse ?: dim->se->se_cmp;
> +	return collapse_fn(a, b);
> +}
> +
> +static __maybe_unused struct c2c_fmt *get_function_format(const char *name)
> +{
> +	struct c2c_dimension *dim = get_function_dimension(name);
> +	struct c2c_fmt *c2c_fmt;
> +	struct perf_hpp_fmt *fmt;
> +
> +	if (!dim)
> +		return NULL;
> +
> +	c2c_fmt = zalloc(sizeof(*c2c_fmt));
> +	if (!c2c_fmt)
> +		return NULL;
> +
> +	fmt = &c2c_fmt->fmt;
> +
> +	c2c_fmt->dim = dim;
> +	INIT_LIST_HEAD(&fmt->list);
> +	INIT_LIST_HEAD(&fmt->sort_list);
> +
> +	fmt->cmp	= dim->se ? c2c_se_cmp : dim->cmp;
> +	fmt->sort	= dim->se ? c2c_se_cmp : dim->cmp;

[Severity: Medium]
Does this assignment incorrectly bypass the sort_entry's se_sort callback?

Because c2c_se_cmp strictly calls dim->se->se_cmp:

static int64_t c2c_se_cmp(struct perf_hpp_fmt *fmt,
                          struct hist_entry *a, struct hist_entry *b)
{
        ...
        return dim->se->se_cmp(a, b);
}

Using c2c_se_cmp for fmt->sort means we will use se_cmp instead of se_sort.
For example, sorting by the symbol column might sort by memory address offset
rather than alphabetically by symbol name.

Should there be a separate c2c_se_sort wrapper that delegates to
dim->se->se_sort, or similar logic?

> +	fmt->color	= dim->color;
> +	fmt->entry	= dim->entry;
> +	fmt->header	= c2c_header;

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260626070355.1556721-1-jiebin.sun@intel.com?part=7

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2026-06-26  7:23 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-26  7:03 [PATCH 00/14] perf c2c: add a function view Jiebin Sun
2026-06-26  7:03 ` [PATCH 01/14] perf c2c: extract shared data structures into c2c.h Jiebin Sun
2026-06-26  7:13   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 02/14] perf c2c: add function view browser skeleton Jiebin Sun
2026-06-26  7:11   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 03/14] perf c2c: add function view type definitions and helpers Jiebin Sun
2026-06-26  7:14   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 04/14] perf c2c: add column format infrastructure for function view Jiebin Sun
2026-06-26  7:03 ` [PATCH 05/14] perf c2c: add column entry functions " Jiebin Sun
2026-06-26  7:03 ` [PATCH 06/14] perf c2c: add comparison functions for function view sorting Jiebin Sun
2026-06-26  7:22   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 07/14] perf c2c: add dimension definitions and format creation Jiebin Sun
2026-06-26  7:23   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 08/14] perf c2c: add HPP list parsing for function view histograms Jiebin Sun
2026-06-26  7:16   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 09/14] perf c2c: add stats merging and memory management helpers Jiebin Sun
2026-06-26  7:17   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 10/14] perf c2c: add hierarchy entry creation and lookup functions Jiebin Sun
2026-06-26  7:19   ` sashiko-bot
2026-06-26  7:03 ` [PATCH 11/14] perf c2c: add function view hierarchy builder Jiebin Sun
2026-06-26  7:03 ` [PATCH 12/14] perf c2c: add function view browser UI Jiebin Sun
2026-06-26  7:03 ` [PATCH 13/14] perf c2c: add TAB key to switch to function view Jiebin Sun
2026-06-26  7:03 ` [PATCH 14/14] perf c2c: document function view in perf-c2c man page Jiebin Sun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox