linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] perf parse-events: Tidy name token matching
@ 2025-01-09 17:54 Ian Rogers
  2025-02-10 19:23 ` Ian Rogers
  2025-02-24  5:39 ` Namhyung Kim
  0 siblings, 2 replies; 9+ messages in thread
From: Ian Rogers @ 2025-01-09 17:54 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Andi Kleen,
	Dominique Martinet, linux-perf-users, linux-kernel

Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
before parsing") names (generally event names) excluded hyphen (minus)
symbols as the formation of legacy names with hyphens was handled in
the yacc code. That commit allowed hyphens supposedly making
name_minus unnecessary. However, changing name_minus to name has
issues in the term config tokens as then name ends up having priority
over numbers and name allows matching numbers since commit
5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
"). It is also permissable for a name to match with a colon (':') in
it when its in a config term list. To address this rename name_minus
to term_name, make the pattern match name's except for the colon, add
number matching into the config term region with a higher priority
than name matching. This addresses an inconsistency and allows greater
matching for names inside of term lists, for example, they may start
with a number.

Rename name_tag to quoted_name and update comments and helper
functions to avoid str detecting quoted strings which was already done
by the lexer.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/parse-events.l | 51 +++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index bf7f73548605..7ed86e3e34e3 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -53,21 +53,25 @@ static int str(yyscan_t scanner, int token)
 	YYSTYPE *yylval = parse_events_get_lval(scanner);
 	char *text = parse_events_get_text(scanner);
 
-	if (text[0] != '\'') {
-		yylval->str = strdup(text);
-	} else {
-		/*
-		 * If a text tag specified on the command line
-		 * contains opening single quite ' then it is
-		 * expected that the tag ends with single quote
-		 * as well, like this:
-		 *     name=\'CPU_CLK_UNHALTED.THREAD:cmask=1\'
-		 * quotes need to be escaped to bypass shell
-		 * processing.
-		 */
-		yylval->str = strndup(&text[1], strlen(text) - 2);
-	}
+	yylval->str = strdup(text);
+	return token;
+}
+
+static int quoted_str(yyscan_t scanner, int token)
+{
+	YYSTYPE *yylval = parse_events_get_lval(scanner);
+	char *text = parse_events_get_text(scanner);
 
+	/*
+	 * If a text tag specified on the command line
+	 * contains opening single quite ' then it is
+	 * expected that the tag ends with single quote
+	 * as well, like this:
+	 *     name=\'CPU_CLK_UNHALTED.THREAD:cmask=1\'
+	 * quotes need to be escaped to bypass shell
+	 * processing.
+	 */
+	yylval->str = strndup(&text[1], strlen(text) - 2);
 	return token;
 }
 
@@ -235,9 +239,16 @@ event		[^,{}/]+
 num_dec		[0-9]+
 num_hex		0x[a-fA-F0-9]{1,16}
 num_raw_hex	[a-fA-F0-9]{1,16}
-name		[a-zA-Z0-9_*?\[\]][a-zA-Z0-9_*?.\[\]!\-]*
-name_tag	[\'][a-zA-Z0-9_*?\[\]][a-zA-Z0-9_*?\-,\.\[\]:=]*[\']
-name_minus	[a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
+/* Regular pattern to match the token PE_NAME. */
+name_start      [a-zA-Z0-9_*?\[\]]
+name		{name_start}[a-zA-Z0-9_*?.\[\]!\-]*
+/* PE_NAME token when inside a config term list, allows ':'. */
+term_name	{name_start}[a-zA-Z0-9_*?.\[\]!\-:]*
+/*
+ * PE_NAME token when quoted, allows ':,.='.
+ * Matches the RHS of terms like: name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks'.
+ */
+quoted_name	[\']{name_start}[a-zA-Z0-9_*?.\[\]!\-:,\.=]*[\']
 drv_cfg_term	[a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
 /*
  * If you add a modifier you need to update check_modifier().
@@ -341,7 +352,9 @@ r0x{num_raw_hex}	{ return str(yyscanner, PE_RAW); }
 {lc_type}			{ return lc_str(yyscanner, _parse_state); }
 {lc_type}-{lc_op_result}	{ return lc_str(yyscanner, _parse_state); }
 {lc_type}-{lc_op_result}-{lc_op_result}	{ return lc_str(yyscanner, _parse_state); }
-{name_minus}		{ return str(yyscanner, PE_NAME); }
+{num_dec}		{ return value(_parse_state, yyscanner, 10); }
+{num_hex}		{ return value(_parse_state, yyscanner, 16); }
+{term_name}		{ return str(yyscanner, PE_NAME); }
 @{drv_cfg_term}		{ return drv_str(yyscanner, PE_DRV_CFG_TERM); }
 }
 
@@ -410,7 +423,7 @@ r{num_raw_hex}		{ return str(yyscanner, PE_RAW); }
 
 {modifier_event}	{ return modifiers(_parse_state, yyscanner); }
 {name}			{ return str(yyscanner, PE_NAME); }
-{name_tag}		{ return str(yyscanner, PE_NAME); }
+{quoted_name}		{ return quoted_str(yyscanner, PE_NAME); }
 "/"			{ BEGIN(config); return '/'; }
 ,			{ BEGIN(event); return ','; }
 :			{ return ':'; }
-- 
2.47.1.613.gc27f4b7a9f-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-01-09 17:54 [PATCH v1] perf parse-events: Tidy name token matching Ian Rogers
@ 2025-02-10 19:23 ` Ian Rogers
  2025-02-19 19:02   ` Ian Rogers
  2025-02-24  5:39 ` Namhyung Kim
  1 sibling, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2025-02-10 19:23 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Andi Kleen,
	Dominique Martinet, linux-perf-users, linux-kernel

On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
>
> Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> before parsing") names (generally event names) excluded hyphen (minus)
> symbols as the formation of legacy names with hyphens was handled in
> the yacc code. That commit allowed hyphens supposedly making
> name_minus unnecessary. However, changing name_minus to name has
> issues in the term config tokens as then name ends up having priority
> over numbers and name allows matching numbers since commit
> 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> "). It is also permissable for a name to match with a colon (':') in
> it when its in a config term list. To address this rename name_minus
> to term_name, make the pattern match name's except for the colon, add
> number matching into the config term region with a higher priority
> than name matching. This addresses an inconsistency and allows greater
> matching for names inside of term lists, for example, they may start
> with a number.
>
> Rename name_tag to quoted_name and update comments and helper
> functions to avoid str detecting quoted strings which was already done
> by the lexer.
>
> Signed-off-by: Ian Rogers <irogers@google.com>

Ping. This patch addresses name parsing inconsistencies, in particular
events may start with a number without a PMU, but not with. It also
aims to give better names to patterns than name_minus and name_tag
(with term_name and quoted_name respectively) that have drifted from
their original meaning and become to me less than intention revealing.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-02-10 19:23 ` Ian Rogers
@ 2025-02-19 19:02   ` Ian Rogers
  2025-02-19 20:49     ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2025-02-19 19:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Andi Kleen,
	Dominique Martinet, linux-perf-users, linux-kernel

On Mon, Feb 10, 2025 at 11:23 AM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
> >
> > Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> > before parsing") names (generally event names) excluded hyphen (minus)
> > symbols as the formation of legacy names with hyphens was handled in
> > the yacc code. That commit allowed hyphens supposedly making
> > name_minus unnecessary. However, changing name_minus to name has
> > issues in the term config tokens as then name ends up having priority
> > over numbers and name allows matching numbers since commit
> > 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> > "). It is also permissable for a name to match with a colon (':') in
> > it when its in a config term list. To address this rename name_minus
> > to term_name, make the pattern match name's except for the colon, add
> > number matching into the config term region with a higher priority
> > than name matching. This addresses an inconsistency and allows greater
> > matching for names inside of term lists, for example, they may start
> > with a number.
> >
> > Rename name_tag to quoted_name and update comments and helper
> > functions to avoid str detecting quoted strings which was already done
> > by the lexer.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
>
> Ping. This patch addresses name parsing inconsistencies, in particular
> events may start with a number without a PMU, but not with. It also
> aims to give better names to patterns than name_minus and name_tag
> (with term_name and quoted_name respectively) that have drifted from
> their original meaning and become to me less than intention revealing.

Ping.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-02-19 19:02   ` Ian Rogers
@ 2025-02-19 20:49     ` Namhyung Kim
  2025-02-19 22:11       ` Ian Rogers
  0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2025-02-19 20:49 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Andi Kleen, Dominique Martinet, linux-perf-users,
	linux-kernel

Hi Ian,

On Wed, Feb 19, 2025 at 11:02:40AM -0800, Ian Rogers wrote:
> On Mon, Feb 10, 2025 at 11:23 AM Ian Rogers <irogers@google.com> wrote:
> >
> > On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> > > before parsing") names (generally event names) excluded hyphen (minus)
> > > symbols as the formation of legacy names with hyphens was handled in
> > > the yacc code. That commit allowed hyphens supposedly making
> > > name_minus unnecessary. However, changing name_minus to name has
> > > issues in the term config tokens as then name ends up having priority
> > > over numbers and name allows matching numbers since commit
> > > 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> > > "). It is also permissable for a name to match with a colon (':') in
> > > it when its in a config term list. To address this rename name_minus
> > > to term_name, make the pattern match name's except for the colon, add
> > > number matching into the config term region with a higher priority
> > > than name matching. This addresses an inconsistency and allows greater
> > > matching for names inside of term lists, for example, they may start
> > > with a number.
> > >
> > > Rename name_tag to quoted_name and update comments and helper
> > > functions to avoid str detecting quoted strings which was already done
> > > by the lexer.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> >
> > Ping. This patch addresses name parsing inconsistencies, in particular
> > events may start with a number without a PMU, but not with. It also
> > aims to give better names to patterns than name_minus and name_tag
> > (with term_name and quoted_name respectively) that have drifted from
> > their original meaning and become to me less than intention revealing.
> 
> Ping.

Sorry for the delay.  Can you please give an example for better
understanding if there's a change in the behavior?

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-02-19 20:49     ` Namhyung Kim
@ 2025-02-19 22:11       ` Ian Rogers
  2025-02-19 22:27         ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2025-02-19 22:11 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Andi Kleen, Dominique Martinet, linux-perf-users,
	linux-kernel

On Wed, Feb 19, 2025 at 12:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Wed, Feb 19, 2025 at 11:02:40AM -0800, Ian Rogers wrote:
> > On Mon, Feb 10, 2025 at 11:23 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> > > > before parsing") names (generally event names) excluded hyphen (minus)
> > > > symbols as the formation of legacy names with hyphens was handled in
> > > > the yacc code. That commit allowed hyphens supposedly making
> > > > name_minus unnecessary. However, changing name_minus to name has
> > > > issues in the term config tokens as then name ends up having priority
> > > > over numbers and name allows matching numbers since commit
> > > > 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> > > > "). It is also permissable for a name to match with a colon (':') in
> > > > it when its in a config term list. To address this rename name_minus
> > > > to term_name, make the pattern match name's except for the colon, add
> > > > number matching into the config term region with a higher priority
> > > > than name matching. This addresses an inconsistency and allows greater
> > > > matching for names inside of term lists, for example, they may start
> > > > with a number.
> > > >
> > > > Rename name_tag to quoted_name and update comments and helper
> > > > functions to avoid str detecting quoted strings which was already done
> > > > by the lexer.
> > > >
> > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > >
> > > Ping. This patch addresses name parsing inconsistencies, in particular
> > > events may start with a number without a PMU, but not with. It also
> > > aims to give better names to patterns than name_minus and name_tag
> > > (with term_name and quoted_name respectively) that have drifted from
> > > their original meaning and become to me less than intention revealing.
> >
> > Ping.
>
> Sorry for the delay.  Can you please give an example for better
> understanding if there's a change in the behavior?

The example in:
https://lore.kernel.org/r/20240510-perf_digit-v4-3-db1553f3233b@codewreck.org
is `perf trace -e '9p:*'` which allows the number to start a
tracepoint name, but what is true for tracepoint names is also true
for event names. I lack the tracepoint but the patch here is making
that work if the event/tracepoint is specified with a PMU, so:

Before the input is just seen as broken:
```
$ perf stat -e 'tracepoint/9p:9p/' true
event syntax error: 'tracepoint/9p:9p/'
                               \___ Unrecognized input
Run 'perf list' for a list of valid events

Usage: perf stat [<options>] [<command>]

   -e, --event <event>   event selector. use 'perf list' to list
available events
```

After the input fails because the event wasn't found:
```
$ perf stat -e 'tracepoint/9p:9p/' true
event syntax error: 'tracepoint/9p:9p/'
                    \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'tracepoint'

event syntax error: 'tracepoint/9p:9p/'
                               \___ unknown term '9p:9p' for pmu 'tracepoint'

valid terms: config,config1,config2,config3,name,period,percore,metric-id

event syntax error: 'tracepoint/9p:9p/'
                               \___ unknown term '9p:9p' for pmu 'tracepoint'

valid terms: config,config1,config2,config3,name,period,percore,metric-id
Run 'perf list' for a list of valid events

Usage: perf stat [<options>] [<command>]

   -e, --event <event>   event selector. use 'perf list' to list
available events
```

But the patch is just about making the name term more consistent and
cleaner, the weirdness above wasn't its main point, I want the code to
be easy to read and understand.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-02-19 22:11       ` Ian Rogers
@ 2025-02-19 22:27         ` Namhyung Kim
  2025-02-19 22:43           ` Ian Rogers
  0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2025-02-19 22:27 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Andi Kleen, Dominique Martinet, linux-perf-users,
	linux-kernel

On Wed, Feb 19, 2025 at 02:11:43PM -0800, Ian Rogers wrote:
> On Wed, Feb 19, 2025 at 12:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Feb 19, 2025 at 11:02:40AM -0800, Ian Rogers wrote:
> > > On Mon, Feb 10, 2025 at 11:23 AM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> > > > > before parsing") names (generally event names) excluded hyphen (minus)
> > > > > symbols as the formation of legacy names with hyphens was handled in
> > > > > the yacc code. That commit allowed hyphens supposedly making
> > > > > name_minus unnecessary. However, changing name_minus to name has
> > > > > issues in the term config tokens as then name ends up having priority
> > > > > over numbers and name allows matching numbers since commit
> > > > > 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> > > > > "). It is also permissable for a name to match with a colon (':') in
> > > > > it when its in a config term list. To address this rename name_minus
> > > > > to term_name, make the pattern match name's except for the colon, add
> > > > > number matching into the config term region with a higher priority
> > > > > than name matching. This addresses an inconsistency and allows greater
> > > > > matching for names inside of term lists, for example, they may start
> > > > > with a number.
> > > > >
> > > > > Rename name_tag to quoted_name and update comments and helper
> > > > > functions to avoid str detecting quoted strings which was already done
> > > > > by the lexer.
> > > > >
> > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > >
> > > > Ping. This patch addresses name parsing inconsistencies, in particular
> > > > events may start with a number without a PMU, but not with. It also
> > > > aims to give better names to patterns than name_minus and name_tag
> > > > (with term_name and quoted_name respectively) that have drifted from
> > > > their original meaning and become to me less than intention revealing.
> > >
> > > Ping.
> >
> > Sorry for the delay.  Can you please give an example for better
> > understanding if there's a change in the behavior?
> 
> The example in:
> https://lore.kernel.org/r/20240510-perf_digit-v4-3-db1553f3233b@codewreck.org
> is `perf trace -e '9p:*'` which allows the number to start a
> tracepoint name, but what is true for tracepoint names is also true
> for event names. I lack the tracepoint but the patch here is making
> that work if the event/tracepoint is specified with a PMU, so:
> 
> Before the input is just seen as broken:
> ```
> $ perf stat -e 'tracepoint/9p:9p/' true
> event syntax error: 'tracepoint/9p:9p/'
>                                \___ Unrecognized input
> Run 'perf list' for a list of valid events
> 
> Usage: perf stat [<options>] [<command>]
> 
>    -e, --event <event>   event selector. use 'perf list' to list
> available events
> ```
> 
> After the input fails because the event wasn't found:
> ```
> $ perf stat -e 'tracepoint/9p:9p/' true
> event syntax error: 'tracepoint/9p:9p/'
>                     \___ Bad event or PMU
> 
> Unable to find PMU or event on a PMU of 'tracepoint'
> 
> event syntax error: 'tracepoint/9p:9p/'
>                                \___ unknown term '9p:9p' for pmu 'tracepoint'
> 
> valid terms: config,config1,config2,config3,name,period,percore,metric-id
> 
> event syntax error: 'tracepoint/9p:9p/'
>                                \___ unknown term '9p:9p' for pmu 'tracepoint'
> 
> valid terms: config,config1,config2,config3,name,period,percore,metric-id
> Run 'perf list' for a list of valid events
> 
> Usage: perf stat [<options>] [<command>]
> 
>    -e, --event <event>   event selector. use 'perf list' to list
> available events
> ```
> 
> But the patch is just about making the name term more consistent and
> cleaner, the weirdness above wasn't its main point, I want the code to
> be easy to read and understand.

Ok, so I guess there's no behavior change from the users perspective in
this patchset.  Do you plan to add support for the tracepoint name in
the config term (like tracepoint/9p:9p/) later?

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-02-19 22:27         ` Namhyung Kim
@ 2025-02-19 22:43           ` Ian Rogers
  2025-02-21  6:34             ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Rogers @ 2025-02-19 22:43 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Andi Kleen, Dominique Martinet, linux-perf-users,
	linux-kernel

On Wed, Feb 19, 2025 at 2:27 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Feb 19, 2025 at 02:11:43PM -0800, Ian Rogers wrote:
> > On Wed, Feb 19, 2025 at 12:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Ian,
> > >
> > > On Wed, Feb 19, 2025 at 11:02:40AM -0800, Ian Rogers wrote:
> > > > On Mon, Feb 10, 2025 at 11:23 AM Ian Rogers <irogers@google.com> wrote:
> > > > >
> > > > > On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
> > > > > >
> > > > > > Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> > > > > > before parsing") names (generally event names) excluded hyphen (minus)
> > > > > > symbols as the formation of legacy names with hyphens was handled in
> > > > > > the yacc code. That commit allowed hyphens supposedly making
> > > > > > name_minus unnecessary. However, changing name_minus to name has
> > > > > > issues in the term config tokens as then name ends up having priority
> > > > > > over numbers and name allows matching numbers since commit
> > > > > > 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> > > > > > "). It is also permissable for a name to match with a colon (':') in
> > > > > > it when its in a config term list. To address this rename name_minus
> > > > > > to term_name, make the pattern match name's except for the colon, add
> > > > > > number matching into the config term region with a higher priority
> > > > > > than name matching. This addresses an inconsistency and allows greater
> > > > > > matching for names inside of term lists, for example, they may start
> > > > > > with a number.
> > > > > >
> > > > > > Rename name_tag to quoted_name and update comments and helper
> > > > > > functions to avoid str detecting quoted strings which was already done
> > > > > > by the lexer.
> > > > > >
> > > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > >
> > > > > Ping. This patch addresses name parsing inconsistencies, in particular
> > > > > events may start with a number without a PMU, but not with. It also
> > > > > aims to give better names to patterns than name_minus and name_tag
> > > > > (with term_name and quoted_name respectively) that have drifted from
> > > > > their original meaning and become to me less than intention revealing.
> > > >
> > > > Ping.
> > >
> > > Sorry for the delay.  Can you please give an example for better
> > > understanding if there's a change in the behavior?
> >
> > The example in:
> > https://lore.kernel.org/r/20240510-perf_digit-v4-3-db1553f3233b@codewreck.org
> > is `perf trace -e '9p:*'` which allows the number to start a
> > tracepoint name, but what is true for tracepoint names is also true
> > for event names. I lack the tracepoint but the patch here is making
> > that work if the event/tracepoint is specified with a PMU, so:
> >
> > Before the input is just seen as broken:
> > ```
> > $ perf stat -e 'tracepoint/9p:9p/' true
> > event syntax error: 'tracepoint/9p:9p/'
> >                                \___ Unrecognized input
> > Run 'perf list' for a list of valid events
> >
> > Usage: perf stat [<options>] [<command>]
> >
> >    -e, --event <event>   event selector. use 'perf list' to list
> > available events
> > ```
> >
> > After the input fails because the event wasn't found:
> > ```
> > $ perf stat -e 'tracepoint/9p:9p/' true
> > event syntax error: 'tracepoint/9p:9p/'
> >                     \___ Bad event or PMU
> >
> > Unable to find PMU or event on a PMU of 'tracepoint'
> >
> > event syntax error: 'tracepoint/9p:9p/'
> >                                \___ unknown term '9p:9p' for pmu 'tracepoint'
> >
> > valid terms: config,config1,config2,config3,name,period,percore,metric-id
> >
> > event syntax error: 'tracepoint/9p:9p/'
> >                                \___ unknown term '9p:9p' for pmu 'tracepoint'
> >
> > valid terms: config,config1,config2,config3,name,period,percore,metric-id
> > Run 'perf list' for a list of valid events
> >
> > Usage: perf stat [<options>] [<command>]
> >
> >    -e, --event <event>   event selector. use 'perf list' to list
> > available events
> > ```
> >
> > But the patch is just about making the name term more consistent and
> > cleaner, the weirdness above wasn't its main point, I want the code to
> > be easy to read and understand.
>
> Ok, so I guess there's no behavior change from the users perspective in
> this patchset.  Do you plan to add support for the tracepoint name in
> the config term (like tracepoint/9p:9p/) later?

I think we treat tracepoints much as we do regular PMU perf events
except in the encoding of the config. There is also a sysfs PMU:
```
$ ls -al /sys/bus/event_source/devices/tracepoint
/
total 0
drwxr-xr-x  3 root root    0 Feb 19 14:35 .
drwxr-xr-x 78 root root    0 Feb 19 08:13 ..
-rw-r--r--  1 root root 4096 Feb 19 14:34 perf_event_mux_interval_ms
drwxr-xr-x  2 root root    0 Feb 19 08:13 power
lrwxrwxrwx  1 root root    0 Feb 19 08:13 subsystem -> ../../bus/event_source
-r--r--r--  1 root root 4096 Feb 19 10:53 type
-rw-r--r--  1 root root 4096 Feb 19 08:13 uevent
```
with the type reflecting the perf_event_attr type (3 aka
PERF_TYPE_TRACEPOINT). So I think much like with the hwmon_pmu.c it
makes sense to have a tracepoint_pmu.c and move logic like
parse-events add_tracepoint in there:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n523
in that case tracepoint/9p:9p/ would be a valid tracepoint event name.
For now this code is cleaning up that if you had a 9p on say the cpu
PMU, 9p would wildcard match with it but cpu/9p/ would be a parse
error - as the event name currently doesn't allow a number to start it
when it is part of the term list, what this patch fixes as part of
tidying up the code.

Thanks,
Ian

>
> Thanks,
> Namhyung
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-02-19 22:43           ` Ian Rogers
@ 2025-02-21  6:34             ` Namhyung Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2025-02-21  6:34 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Andi Kleen, Dominique Martinet, linux-perf-users,
	linux-kernel

On Wed, Feb 19, 2025 at 02:43:10PM -0800, Ian Rogers wrote:
> On Wed, Feb 19, 2025 at 2:27 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Wed, Feb 19, 2025 at 02:11:43PM -0800, Ian Rogers wrote:
> > > On Wed, Feb 19, 2025 at 12:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hi Ian,
> > > >
> > > > On Wed, Feb 19, 2025 at 11:02:40AM -0800, Ian Rogers wrote:
> > > > > On Mon, Feb 10, 2025 at 11:23 AM Ian Rogers <irogers@google.com> wrote:
> > > > > >
> > > > > > On Thu, Jan 9, 2025 at 9:54 AM Ian Rogers <irogers@google.com> wrote:
> > > > > > >
> > > > > > > Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> > > > > > > before parsing") names (generally event names) excluded hyphen (minus)
> > > > > > > symbols as the formation of legacy names with hyphens was handled in
> > > > > > > the yacc code. That commit allowed hyphens supposedly making
> > > > > > > name_minus unnecessary. However, changing name_minus to name has
> > > > > > > issues in the term config tokens as then name ends up having priority
> > > > > > > over numbers and name allows matching numbers since commit
> > > > > > > 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> > > > > > > "). It is also permissable for a name to match with a colon (':') in
> > > > > > > it when its in a config term list. To address this rename name_minus
> > > > > > > to term_name, make the pattern match name's except for the colon, add
> > > > > > > number matching into the config term region with a higher priority
> > > > > > > than name matching. This addresses an inconsistency and allows greater
> > > > > > > matching for names inside of term lists, for example, they may start
> > > > > > > with a number.
> > > > > > >
> > > > > > > Rename name_tag to quoted_name and update comments and helper
> > > > > > > functions to avoid str detecting quoted strings which was already done
> > > > > > > by the lexer.
> > > > > > >
> > > > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > > >
> > > > > > Ping. This patch addresses name parsing inconsistencies, in particular
> > > > > > events may start with a number without a PMU, but not with. It also
> > > > > > aims to give better names to patterns than name_minus and name_tag
> > > > > > (with term_name and quoted_name respectively) that have drifted from
> > > > > > their original meaning and become to me less than intention revealing.
> > > > >
> > > > > Ping.
> > > >
> > > > Sorry for the delay.  Can you please give an example for better
> > > > understanding if there's a change in the behavior?
> > >
> > > The example in:
> > > https://lore.kernel.org/r/20240510-perf_digit-v4-3-db1553f3233b@codewreck.org
> > > is `perf trace -e '9p:*'` which allows the number to start a
> > > tracepoint name, but what is true for tracepoint names is also true
> > > for event names. I lack the tracepoint but the patch here is making
> > > that work if the event/tracepoint is specified with a PMU, so:
> > >
> > > Before the input is just seen as broken:
> > > ```
> > > $ perf stat -e 'tracepoint/9p:9p/' true
> > > event syntax error: 'tracepoint/9p:9p/'
> > >                                \___ Unrecognized input
> > > Run 'perf list' for a list of valid events
> > >
> > > Usage: perf stat [<options>] [<command>]
> > >
> > >    -e, --event <event>   event selector. use 'perf list' to list
> > > available events
> > > ```
> > >
> > > After the input fails because the event wasn't found:
> > > ```
> > > $ perf stat -e 'tracepoint/9p:9p/' true
> > > event syntax error: 'tracepoint/9p:9p/'
> > >                     \___ Bad event or PMU
> > >
> > > Unable to find PMU or event on a PMU of 'tracepoint'
> > >
> > > event syntax error: 'tracepoint/9p:9p/'
> > >                                \___ unknown term '9p:9p' for pmu 'tracepoint'
> > >
> > > valid terms: config,config1,config2,config3,name,period,percore,metric-id
> > >
> > > event syntax error: 'tracepoint/9p:9p/'
> > >                                \___ unknown term '9p:9p' for pmu 'tracepoint'
> > >
> > > valid terms: config,config1,config2,config3,name,period,percore,metric-id
> > > Run 'perf list' for a list of valid events
> > >
> > > Usage: perf stat [<options>] [<command>]
> > >
> > >    -e, --event <event>   event selector. use 'perf list' to list
> > > available events
> > > ```
> > >
> > > But the patch is just about making the name term more consistent and
> > > cleaner, the weirdness above wasn't its main point, I want the code to
> > > be easy to read and understand.
> >
> > Ok, so I guess there's no behavior change from the users perspective in
> > this patchset.  Do you plan to add support for the tracepoint name in
> > the config term (like tracepoint/9p:9p/) later?
> 
> I think we treat tracepoints much as we do regular PMU perf events
> except in the encoding of the config. There is also a sysfs PMU:
> ```
> $ ls -al /sys/bus/event_source/devices/tracepoint
> /
> total 0
> drwxr-xr-x  3 root root    0 Feb 19 14:35 .
> drwxr-xr-x 78 root root    0 Feb 19 08:13 ..
> -rw-r--r--  1 root root 4096 Feb 19 14:34 perf_event_mux_interval_ms
> drwxr-xr-x  2 root root    0 Feb 19 08:13 power
> lrwxrwxrwx  1 root root    0 Feb 19 08:13 subsystem -> ../../bus/event_source
> -r--r--r--  1 root root 4096 Feb 19 10:53 type
> -rw-r--r--  1 root root 4096 Feb 19 08:13 uevent
> ```
> with the type reflecting the perf_event_attr type (3 aka
> PERF_TYPE_TRACEPOINT). So I think much like with the hwmon_pmu.c it
> makes sense to have a tracepoint_pmu.c and move logic like
> parse-events add_tracepoint in there:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n523
> in that case tracepoint/9p:9p/ would be a valid tracepoint event name.
> For now this code is cleaning up that if you had a 9p on say the cpu
> PMU, 9p would wildcard match with it but cpu/9p/ would be a parse
> error - as the event name currently doesn't allow a number to start it
> when it is part of the term list, what this patch fixes as part of
> tidying up the code.

Ok, the change itself looks fine to me.

Thanks,
Namhyung


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] perf parse-events: Tidy name token matching
  2025-01-09 17:54 [PATCH v1] perf parse-events: Tidy name token matching Ian Rogers
  2025-02-10 19:23 ` Ian Rogers
@ 2025-02-24  5:39 ` Namhyung Kim
  1 sibling, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2025-02-24  5:39 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Andi Kleen, Dominique Martinet, linux-perf-users,
	linux-kernel, Ian Rogers

On Thu, 09 Jan 2025 09:54:01 -0800, Ian Rogers wrote:
> Prior to commit 70c90e4a6b2f ("perf parse-events: Avoid scanning PMUs
> before parsing") names (generally event names) excluded hyphen (minus)
> symbols as the formation of legacy names with hyphens was handled in
> the yacc code. That commit allowed hyphens supposedly making
> name_minus unnecessary. However, changing name_minus to name has
> issues in the term config tokens as then name ends up having priority
> over numbers and name allows matching numbers since commit
> 5ceb57990bf4 ("perf parse: Allow tracepoint names to start with digits
> "). It is also permissable for a name to match with a colon (':') in
> it when its in a config term list. To address this rename name_minus
> to term_name, make the pattern match name's except for the colon, add
> number matching into the config term region with a higher priority
> than name matching. This addresses an inconsistency and allows greater
> matching for names inside of term lists, for example, they may start
> with a number.
> 
> [...]
Applied to perf-tools-next, thanks!

Best regards,
Namhyung



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-02-24  5:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-09 17:54 [PATCH v1] perf parse-events: Tidy name token matching Ian Rogers
2025-02-10 19:23 ` Ian Rogers
2025-02-19 19:02   ` Ian Rogers
2025-02-19 20:49     ` Namhyung Kim
2025-02-19 22:11       ` Ian Rogers
2025-02-19 22:27         ` Namhyung Kim
2025-02-19 22:43           ` Ian Rogers
2025-02-21  6:34             ` Namhyung Kim
2025-02-24  5:39 ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).