* [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser
@ 2025-01-13 15:00 Masahiro Yamada
2025-01-13 15:00 ` [PATCH 01/17] genksyms: rename m_abstract_declarator to abstract_declarator Masahiro Yamada
` (17 more replies)
0 siblings, 18 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
This series fixes several long-standing issues in genksyms.
 - The parser contains grammatical ambiguities, including both
   reduce/reduce and shift/reduce conflicts.

 - There are several hidden syntax errors.
   When a syntax error occurs, the type becomes UNKNOWN, and
   precise CRC calculation becomes impossible.
Masahiro Yamada (17):
genksyms: rename m_abstract_declarator to abstract_declarator
genksyms: rename cvar_qualifier to type_qualifier
genksyms: reduce type_qualifier directly to decl_specifier
genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
genksyms: fix last 3 shift/reduce conflicts
genksyms: remove Makefile hack
genksyms: restrict direct-abstract-declarator to take one
parameter-type-list
genksyms: restrict direct-declarator to take one parameter-type-list
genksyms: record attributes consistently for init-declarator
genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier
genksyms: fix syntax error for attribute before abstract_declarator
genksyms: fix syntax error for attribute before nested_declarator
genksyms: fix syntax error for attribute after abstact_declarator
genksyms: fix syntax error for attribute after 'struct'
genksyms: fix syntax error for attribute after 'union'
genksyms: fix syntax error for builtin (u)int*x*_t types
genksyms: fix syntax error for attribute before init-declarator
scripts/genksyms/Makefile | 18 -----
scripts/genksyms/genksyms.h | 3 +
scripts/genksyms/lex.l | 17 +++-
scripts/genksyms/parse.y | 150 ++++++++++++++++++++----------------
4 files changed, 101 insertions(+), 87 deletions(-)
--
2.43.0
* [PATCH 01/17] genksyms: rename m_abstract_declarator to abstract_declarator
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
This is called "abstract-declarator" in K&R. [1]
I am not sure what "m_" stands for, but the name is clear enough
without it.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 689cb6bb40b6..02f2f713ec5a 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -367,17 +367,17 @@ parameter_declaration_list:
;
parameter_declaration:
- decl_specifier_seq m_abstract_declarator
+ decl_specifier_seq abstract_declarator
{ $$ = $2 ? $2 : $1; }
;
-m_abstract_declarator:
- ptr_operator m_abstract_declarator
+abstract_declarator:
+ ptr_operator abstract_declarator
{ $$ = $2 ? $2 : $1; }
- | direct_m_abstract_declarator
+ | direct_abstract_declarator
;
-direct_m_abstract_declarator:
+direct_abstract_declarator:
/* empty */ { $$ = NULL; }
| IDENT
{ /* For version 2 checksums, we don't want to remember
@@ -391,13 +391,13 @@ direct_m_abstract_declarator:
{ remove_node($1);
$$ = $1;
}
- | direct_m_abstract_declarator '(' parameter_declaration_clause ')'
+ | direct_abstract_declarator '(' parameter_declaration_clause ')'
{ $$ = $4; }
- | direct_m_abstract_declarator '(' error ')'
+ | direct_abstract_declarator '(' error ')'
{ $$ = $4; }
- | direct_m_abstract_declarator BRACKET_PHRASE
+ | direct_abstract_declarator BRACKET_PHRASE
{ $$ = $2; }
- | '(' m_abstract_declarator ')'
+ | '(' abstract_declarator ')'
{ $$ = $3; }
| '(' error ')'
{ $$ = $3; }
--
2.43.0
* [PATCH 02/17] genksyms: rename cvar_qualifier to type_qualifier
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
I believe "cvar" stands for "Const, Volatile, Attribute, or Restrict".
This is called "type-qualifier" in K&R. [1]
Adopt this more generic naming.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 02f2f713ec5a..8f62b9f0d99c 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -223,7 +223,7 @@ storage_class_specifier:
type_specifier:
simple_type_specifier
- | cvar_qualifier
+ | type_qualifier
| TYPEOF_KEYW '(' parameter_declaration ')'
| TYPEOF_PHRASE
@@ -270,21 +270,21 @@ simple_type_specifier:
;
ptr_operator:
- '*' cvar_qualifier_seq_opt
+ '*' type_qualifier_seq_opt
{ $$ = $2 ? $2 : $1; }
;
-cvar_qualifier_seq_opt:
+type_qualifier_seq_opt:
/* empty */ { $$ = NULL; }
- | cvar_qualifier_seq
+ | type_qualifier_seq
;
-cvar_qualifier_seq:
- cvar_qualifier
- | cvar_qualifier_seq cvar_qualifier { $$ = $2; }
+type_qualifier_seq:
+ type_qualifier
+ | type_qualifier_seq type_qualifier { $$ = $2; }
;
-cvar_qualifier:
+type_qualifier:
CONST_KEYW | VOLATILE_KEYW | ATTRIBUTE_PHRASE
| RESTRICT_KEYW
{ /* restrict has no effect in prototypes so ignore it */
--
2.43.0
* [PATCH 03/17] genksyms: reduce type_qualifier directly to decl_specifier
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A type_qualifier (const, volatile, etc.) is not a type_specifier.
According to K&R [1], a type-qualifier should be directly reduced to
a declaration-specifier.
    <declaration-specifier> ::= <storage-class-specifier>
                              | <type-specifier>
                              | <type-qualifier>
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 8f62b9f0d99c..20cb3db7f149 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -211,6 +211,7 @@ decl_specifier:
$$ = $1;
}
| type_specifier
+ | type_qualifier
;
storage_class_specifier:
@@ -223,7 +224,6 @@ storage_class_specifier:
type_specifier:
simple_type_specifier
- | type_qualifier
| TYPEOF_KEYW '(' parameter_declaration ')'
| TYPEOF_PHRASE
--
2.43.0
* [PATCH 04/17] genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The comment in the parser describes the current problem:
/* This wasn't really a typedef name but an identifier that
shadows one. */
Consider the following simple C code:
typedef int foo;
void my_func(foo foo) {}
In the function parameter list (foo foo), the first 'foo' is a type
specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
However, the lexer cannot distinguish between the two. Since 'foo' is
already typedef'ed, the lexer returns TYPE for both instances, instead
of returning IDENT for the second one.
To support shadowed identifiers, IDENT can be reduced to either a
simple_type_specifier or a direct_abstract_declarator, which creates
a grammatical ambiguity.
Without analyzing the grammar context, it is very difficult to resolve
this correctly.
This commit introduces a flag, dont_want_type_specifier, which allows
the parser to inform the lexer whether an identifier is expected. When
dont_want_type_specifier is true, the type lookup is suppressed, and
the lexer returns IDENT regardless of any preceding typedef.
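
The parser-to-lexer hint described above can be sketched in isolation. This is only an illustration, not the genksyms code: the typedef table, classify() and parse_foo_foo() are hypothetical stand-ins, and only the flag name dont_want_type_specifier is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Token kinds, mirroring the TYPE/IDENT distinction discussed above. */
enum token { IDENT, TYPE };

/* Hypothetical stand-in for the typedef symbol table. */
static const char *const known_typedefs[] = { "foo" };

/* Hint written by the parser and read by the lexer (name from the patch). */
static bool dont_want_type_specifier;

static bool is_typedef_name(const char *name)
{
	size_t n = sizeof(known_typedefs) / sizeof(known_typedefs[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(known_typedefs[i], name) == 0)
			return true;
	return false;
}

/* Classify a word: the typedef lookup is skipped while the hint is set. */
static enum token classify(const char *name)
{
	assert(name != NULL);
	if (!dont_want_type_specifier && is_typedef_name(name))
		return TYPE;
	return IDENT;
}

/* Walk the parameter "foo foo"; true if the tokens are TYPE, then IDENT. */
static bool parse_foo_foo(void)
{
	enum token first, second;

	dont_want_type_specifier = false;	/* start of a parameter */
	first = classify("foo");
	if (first == TYPE)
		dont_want_type_specifier = true;	/* a type specifier was seen */
	second = classify("foo");
	dont_want_type_specifier = false;	/* declarator consumed */

	return first == TYPE && second == IDENT;
}
```

The point of the sketch is that the lexer stays context-free; the only context it sees is one boolean that the grammar actions toggle.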
After this commit, only 3 shift/reduce conflicts will remain.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/genksyms.h | 3 +++
scripts/genksyms/lex.l | 9 ++++++++-
scripts/genksyms/parse.y | 37 +++++++++++++++----------------------
3 files changed, 26 insertions(+), 23 deletions(-)
diff --git a/scripts/genksyms/genksyms.h b/scripts/genksyms/genksyms.h
index 8c45ada59ece..0c355075f0e6 100644
--- a/scripts/genksyms/genksyms.h
+++ b/scripts/genksyms/genksyms.h
@@ -12,6 +12,7 @@
#ifndef MODUTILS_GENKSYMS_H
#define MODUTILS_GENKSYMS_H 1
+#include <stdbool.h>
#include <stdio.h>
#include <list_types.h>
@@ -66,6 +67,8 @@ struct string_list *copy_list_range(struct string_list *start,
int yylex(void);
int yyparse(void);
+extern bool dont_want_type_specifier;
+
void error_with_pos(const char *, ...) __attribute__ ((format(printf, 1, 2)));
/*----------------------------------------------------------------------*/
diff --git a/scripts/genksyms/lex.l b/scripts/genksyms/lex.l
index a4d7495eaf75..e886133af578 100644
--- a/scripts/genksyms/lex.l
+++ b/scripts/genksyms/lex.l
@@ -12,6 +12,7 @@
%{
#include <limits.h>
+#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
@@ -113,6 +114,12 @@ MC_TOKEN ([~%^&*+=|<>/-]=)|(&&)|("||")|(->)|(<<)|(>>)
/* The second stage lexer. Here we incorporate knowledge of the state
of the parser to tailor the tokens that are returned. */
+/*
+ * The lexer cannot distinguish whether a typedef'ed string is a TYPE or an
+ * IDENT. We need a hint from the parser to handle this accurately.
+ */
+bool dont_want_type_specifier;
+
int
yylex(void)
{
@@ -207,7 +214,7 @@ repeat:
goto repeat;
}
}
- if (!suppress_type_lookup)
+ if (!suppress_type_lookup && !dont_want_type_specifier)
{
if (find_symbol(yytext, SYM_TYPEDEF, 1))
token = TYPE;
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 20cb3db7f149..dc575d467bbf 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -12,6 +12,7 @@
%{
#include <assert.h>
+#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include "genksyms.h"
@@ -148,6 +149,7 @@ simple_declaration:
current_name = NULL;
}
$$ = $3;
+ dont_want_type_specifier = false;
}
;
@@ -169,6 +171,7 @@ init_declarator_list:
is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
current_name = NULL;
$$ = $1;
+ dont_want_type_specifier = true;
}
| init_declarator_list ',' init_declarator
{ struct string_list *decl = *$3;
@@ -184,6 +187,7 @@ init_declarator_list:
is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
current_name = NULL;
$$ = $3;
+ dont_want_type_specifier = true;
}
;
@@ -210,7 +214,7 @@ decl_specifier:
remove_node($1);
$$ = $1;
}
- | type_specifier
+ | type_specifier { dont_want_type_specifier = true; $$ = $1; }
| type_qualifier
;
@@ -307,15 +311,7 @@ direct_declarator:
current_name = (*$1)->string;
$$ = $1;
}
- }
- | TYPE
- { if (current_name != NULL) {
- error_with_pos("unexpected second declaration name");
- YYERROR;
- } else {
- current_name = (*$1)->string;
- $$ = $1;
- }
+ dont_want_type_specifier = false;
}
| direct_declarator '(' parameter_declaration_clause ')'
{ $$ = $4; }
@@ -335,8 +331,7 @@ nested_declarator:
;
direct_nested_declarator:
- IDENT
- | TYPE
+ IDENT { $$ = $1; dont_want_type_specifier = false; }
| direct_nested_declarator '(' parameter_declaration_clause ')'
{ $$ = $4; }
| direct_nested_declarator '(' error ')'
@@ -362,8 +357,9 @@ parameter_declaration_list_opt:
parameter_declaration_list:
parameter_declaration
+ { $$ = $1; dont_want_type_specifier = false; }
| parameter_declaration_list ',' parameter_declaration
- { $$ = $3; }
+ { $$ = $3; dont_want_type_specifier = false; }
;
parameter_declaration:
@@ -375,6 +371,7 @@ abstract_declarator:
ptr_operator abstract_declarator
{ $$ = $2 ? $2 : $1; }
| direct_abstract_declarator
+ { $$ = $1; dont_want_type_specifier = false; }
;
direct_abstract_declarator:
@@ -385,12 +382,6 @@ direct_abstract_declarator:
remove_node($1);
$$ = $1;
}
- /* This wasn't really a typedef name but an identifier that
- shadows one. */
- | TYPE
- { remove_node($1);
- $$ = $1;
- }
| direct_abstract_declarator '(' parameter_declaration_clause ')'
{ $$ = $4; }
| direct_abstract_declarator '(' error ')'
@@ -440,9 +431,9 @@ member_specification:
member_declaration:
decl_specifier_seq_opt member_declarator_list_opt ';'
- { $$ = $3; }
+ { $$ = $3; dont_want_type_specifier = false; }
| error ';'
- { $$ = $2; }
+ { $$ = $2; dont_want_type_specifier = false; }
;
member_declarator_list_opt:
@@ -452,7 +443,9 @@ member_declarator_list_opt:
member_declarator_list:
member_declarator
- | member_declarator_list ',' member_declarator { $$ = $3; }
+ { $$ = $1; dont_want_type_specifier = true; }
+ | member_declarator_list ',' member_declarator
+ { $$ = $3; dont_want_type_specifier = true; }
;
member_declarator:
--
2.43.0
* [PATCH 05/17] genksyms: fix last 3 shift/reduce conflicts
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The ambiguity arises when decl_specifier_seq is followed by '(' because
the following two interpretations are possible:
- decl_specifier_seq direct_abstract_declarator '(' parameter_declaration_clause ')'
- decl_specifier_seq '(' abstract_declarator ')'
This issue occurs because the current parser allows an empty string to
be reduced to direct_abstract_declarator, which is incorrect.
K&R [1] explains the correct grammar:
    <parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
                              | {<declaration-specifier>}+ <abstract-declarator>
                              | {<declaration-specifier>}+

    <abstract-declarator> ::= <pointer>
                            | <pointer> <direct-abstract-declarator>
                            | <direct-abstract-declarator>

    <direct-abstract-declarator> ::= ( <abstract-declarator> )
                                   | {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
                                   | {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
We need to consider the difference between the following two examples:
[Example 1] ( <abstract-declarator> ) can become <direct-abstract-declarator>
void my_func(int (foo));
... is equivalent to:
void my_func(int foo);
[Example 2] ( <parameter-type-list> ) can become <direct-abstract-declarator>
typedef int foo;
void my_func(int (foo));
... is equivalent to:
void my_func(int (*callback)(int));
Please note that the function declaration is identical in both examples,
but the preceding typedef creates the distinction. I introduced a new
nonterminal, open_paren, to enable the type lookup immediately after the '('
token. Without this, we cannot distinguish between [Example 1] and
[Example 2].
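
The two readings can be checked with a C11 compiler. The following is a small sketch, assuming _Generic support; the names example1, example2 and the helper functions are made up for illustration and do not appear in genksyms.

```c
#include <assert.h>

/* Example 1: no typedef named foo is visible yet, so foo is a parameter name. */
void example1(int (foo));
void example1(int foo) { (void)foo; }

typedef int foo;

/* Example 2: foo now names a type, so (foo) is a parameter-type-list. */
void example2(int (foo));
void example2(int (*callback)(int)) { (void)callback; }

/* Each helper returns 1 when the function has the claimed type. */
int example1_takes_int(void)
{
	return _Generic(&example1, void (*)(int): 1, default: 0);
}

int example2_takes_callback(void)
{
	return _Generic(&example2, void (*)(int (*)(int)): 1, default: 0);
}

void check_examples(void)
{
	assert(example1_takes_int());
	assert(example2_takes_callback());
}
```

Each pair of declarations compiles only because the two spellings are compatible, which is the equivalence claimed in the examples above.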
With this commit, all conflicts are resolved.
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index dc575d467bbf..fafce939c32f 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -363,35 +363,47 @@ parameter_declaration_list:
;
parameter_declaration:
- decl_specifier_seq abstract_declarator
+ decl_specifier_seq abstract_declarator_opt
{ $$ = $2 ? $2 : $1; }
;
+abstract_declarator_opt:
+ /* empty */ { $$ = NULL; }
+ | abstract_declarator
+ ;
+
abstract_declarator:
- ptr_operator abstract_declarator
+ ptr_operator
+ | ptr_operator abstract_declarator
{ $$ = $2 ? $2 : $1; }
| direct_abstract_declarator
{ $$ = $1; dont_want_type_specifier = false; }
;
direct_abstract_declarator:
- /* empty */ { $$ = NULL; }
- | IDENT
+ IDENT
{ /* For version 2 checksums, we don't want to remember
private parameter names. */
remove_node($1);
$$ = $1;
}
- | direct_abstract_declarator '(' parameter_declaration_clause ')'
+ | direct_abstract_declarator open_paren parameter_declaration_clause ')'
{ $$ = $4; }
- | direct_abstract_declarator '(' error ')'
+ | direct_abstract_declarator open_paren error ')'
{ $$ = $4; }
| direct_abstract_declarator BRACKET_PHRASE
{ $$ = $2; }
- | '(' abstract_declarator ')'
+ | open_paren parameter_declaration_clause ')'
{ $$ = $3; }
- | '(' error ')'
+ | open_paren abstract_declarator ')'
{ $$ = $3; }
+ | open_paren error ')'
+ { $$ = $3; }
+ | BRACKET_PHRASE
+ ;
+
+open_paren:
+ '(' { $$ = $1; dont_want_type_specifier = false; }
;
function_definition:
--
2.43.0
* [PATCH 06/17] genksyms: remove Makefile hack
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
This workaround was introduced for suppressing the reduce/reduce conflict
warnings because the %expect-rr directive, which is applicable only to GLR
parsers, cannot be used for genksyms.
Since there are no longer any conflicts, this Makefile hack is now
unnecessary.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/Makefile | 18 ------------------
1 file changed, 18 deletions(-)
diff --git a/scripts/genksyms/Makefile b/scripts/genksyms/Makefile
index 312edccda736..4350311fb7b3 100644
--- a/scripts/genksyms/Makefile
+++ b/scripts/genksyms/Makefile
@@ -4,24 +4,6 @@ hostprogs-always-y += genksyms
genksyms-objs := genksyms.o parse.tab.o lex.lex.o
-# FIXME: fix the ambiguous grammar in parse.y and delete this hack
-#
-# Suppress shift/reduce, reduce/reduce conflicts warnings
-# unless W=1 is specified.
-#
-# Just in case, run "$(YACC) --version" without suppressing stderr
-# so that 'bison: not found' will be displayed if it is missing.
-ifeq ($(findstring 1,$(KBUILD_EXTRA_WARN)),)
-
-quiet_cmd_bison_no_warn = $(quiet_cmd_bison)
- cmd_bison_no_warn = $(YACC) --version >/dev/null; \
- $(cmd_bison) 2>/dev/null
-
-$(obj)/pars%.tab.c $(obj)/pars%.tab.h: $(src)/pars%.y FORCE
- $(call if_changed,bison_no_warn)
-
-endif
-
# -I needed for generated C source to include headers in source tree
HOSTCFLAGS_parse.tab.o := -I $(src)
HOSTCFLAGS_lex.lex.o := -I $(src)
--
2.43.0
* [PATCH 07/17] genksyms: restrict direct-abstract-declarator to take one parameter-type-list
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
While there is no more grammatical ambiguity in genksyms, the parser
logic is still inaccurate.
For example, genksyms accepts the following invalid C code:
void my_func(int ()(int));
This should result in a syntax error because () cannot be reduced to
<direct-abstract-declarator>.
( <abstract-declarator> ) can be reduced, but <abstract-declarator>
must not be empty in the following grammar from K&R [1]:
    <direct-abstract-declarator> ::= ( <abstract-declarator> )
                                   | {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
                                   | {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
Furthermore, genksyms accepts the following weird code:
void my_func(int (*callback)(int)(int)(int));
The parser allows <direct-abstract-declarator> to recursively absorb
multiple ( {<parameter-type-list>}? ), but this behavior is incorrect.
In the example above, (*callback) should be followed by at most one
(int).
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index fafce939c32f..03cdd8d53c13 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -381,20 +381,24 @@ abstract_declarator:
;
direct_abstract_declarator:
+ direct_abstract_declarator1
+ | direct_abstract_declarator1 open_paren parameter_declaration_clause ')'
+ { $$ = $4; }
+ | open_paren parameter_declaration_clause ')'
+ { $$ = $3; }
+ ;
+
+direct_abstract_declarator1:
IDENT
{ /* For version 2 checksums, we don't want to remember
private parameter names. */
remove_node($1);
$$ = $1;
}
- | direct_abstract_declarator open_paren parameter_declaration_clause ')'
+ | direct_abstract_declarator1 open_paren error ')'
{ $$ = $4; }
- | direct_abstract_declarator open_paren error ')'
- { $$ = $4; }
- | direct_abstract_declarator BRACKET_PHRASE
+ | direct_abstract_declarator1 BRACKET_PHRASE
{ $$ = $2; }
- | open_paren parameter_declaration_clause ')'
- { $$ = $3; }
| open_paren abstract_declarator ')'
{ $$ = $3; }
| open_paren error ')'
--
2.43.0
* [PATCH 08/17] genksyms: restrict direct-declarator to take one parameter-type-list
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
Similar to the previous commit, this change makes the parser logic a
little more accurate.
Currently, genksyms accepts the following invalid code:
struct foo {
int (*callback)(int)(int)(int);
};
A direct-declarator should not recursively absorb multiple
( parameter-type-list ) constructs.
In the example above, (*callback) should be followed by at most one
(int).
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 03cdd8d53c13..33a6aab53b69 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -331,12 +331,16 @@ nested_declarator:
;
direct_nested_declarator:
+ direct_nested_declarator1
+ | direct_nested_declarator1 '(' parameter_declaration_clause ')'
+ { $$ = $4; }
+ ;
+
+direct_nested_declarator1:
IDENT { $$ = $1; dont_want_type_specifier = false; }
- | direct_nested_declarator '(' parameter_declaration_clause ')'
+ | direct_nested_declarator1 '(' error ')'
{ $$ = $4; }
- | direct_nested_declarator '(' error ')'
- { $$ = $4; }
- | direct_nested_declarator BRACKET_PHRASE
+ | direct_nested_declarator1 BRACKET_PHRASE
{ $$ = $2; }
| '(' nested_declarator ')'
{ $$ = $3; }
--
2.43.0
* [PATCH 09/17] genksyms: record attributes consistently for init-declarator
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
I believe the missing action here is a bug.
For rules with no explicit action, the following default is used:
{ $$ = $1; }
However, in this case, $1 is the value of attribute_opt itself. As a
result, the value of attribute_opt is always NULL.
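
The effect of the default action can be reproduced without a parser generator. The sketch below simulates the successive reductions of attribute_opt in plain C; reduce_attribute_opt() is a hypothetical helper, not part of genksyms, and plain strings stand in for the parser's semantic values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simulate reducing
 *     attribute_opt: (empty) | attribute_opt ATTRIBUTE_PHRASE
 * over n ATTRIBUTE_PHRASE tokens and return the final semantic value.
 */
const char *reduce_attribute_opt(const char *const *tokens, size_t n,
				 bool explicit_action)
{
	const char *value = NULL;	/* empty rule: $$ = NULL */

	assert(tokens != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const char *d1 = value;		/* $1: attribute_opt so far */
		const char *d2 = tokens[i];	/* $2: ATTRIBUTE_PHRASE */

		/* The default action is $$ = $1; the fix uses $$ = $2. */
		value = explicit_action ? d2 : d1;
	}
	return value;
}
```

With explicit_action set to false, the NULL from the empty rule is carried through every reduction, however many attributes follow.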
The following test code demonstrates inconsistent behavior.
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
The attribute is recorded only when followed by an initializer.
This commit adds the correct action to propagate the value of the
ATTRIBUTE_PHRASE token.
With this change, the attribute in the example above is consistently
recorded for both 'x' and 'y'.
[Before]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.000488281
[After]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.000488281
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 33a6aab53b69..e3c160046143 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -480,7 +480,7 @@ member_bitfield_declarator:
attribute_opt:
/* empty */ { $$ = NULL; }
- | attribute_opt ATTRIBUTE_PHRASE
+ | attribute_opt ATTRIBUTE_PHRASE { $$ = $2; }
;
enum_body:
--
2.43.0
* [PATCH 10/17] genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
The __attribute__ keyword can appear in more contexts than 'const' or
'volatile'.
To avoid grammatical conflicts with future changes, ATTRIBUTE_PHRASE
should not be reduced into type_qualifier.
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index e3c160046143..cd933a95548d 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -216,6 +216,7 @@ decl_specifier:
}
| type_specifier { dont_want_type_specifier = true; $$ = $1; }
| type_qualifier
+ | ATTRIBUTE_PHRASE
;
storage_class_specifier:
@@ -285,11 +286,13 @@ type_qualifier_seq_opt:
type_qualifier_seq:
type_qualifier
+ | ATTRIBUTE_PHRASE
| type_qualifier_seq type_qualifier { $$ = $2; }
+ | type_qualifier_seq ATTRIBUTE_PHRASE { $$ = $2; }
;
type_qualifier:
- CONST_KEYW | VOLATILE_KEYW | ATTRIBUTE_PHRASE
+ CONST_KEYW | VOLATILE_KEYW
| RESTRICT_KEYW
{ /* restrict has no effect in prototypes so ignore it */
remove_node($1);
--
2.43.0
* [PATCH 11/17] genksyms: fix syntax error for attribute before abstract_declarator
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, yyerror()
reports the error through error_with_pos(), which is a no-op unless the
-w option is provided.
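
Why such errors stay hidden can be sketched as follows. This is a simplified illustration under assumed names: warnings_enabled, report_with_pos() and report_syntax_error() are hypothetical stand-ins for the -w flag, error_with_pos() and yyerror(), not the actual genksyms code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed stand-in for the state that the -w option would set. */
static bool warnings_enabled;

/* Returns the number of characters written, or 0 when suppressed. */
static int report_with_pos(const char *file, int line, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(file != NULL && fmt != NULL);
	if (!warnings_enabled)
		return 0;	/* silent: the syntax error goes unnoticed */

	n = fprintf(stderr, "%s:%d: ", file, line);
	va_start(args, fmt);
	n += vfprintf(stderr, fmt, args);
	va_end(args);
	n += fprintf(stderr, "\n");
	return n;
}

/* Sketch of a yyerror()-style hook that forwards to the gated reporter. */
static int report_syntax_error(const char *msg)
{
	return report_with_pos("./include/linux/efi.h", 1225, "%s", msg);
}
```

Parsing carries on after the hook returns either way, so without the flag the only visible symptom is the degraded type information.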
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ init/main.i
$ cat init/main.i | scripts/genksyms/genksyms -w
[ snip ]
./include/linux/efi.h:1225: syntax error
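Why the error stays hidden can be sketched in a few lines of C. This is a
simplified model, not the genksyms source: warnings_enabled stands in for
the -w switch, and report_with_pos() and reported_errors are hypothetical
names used only for illustration.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the state set by the -w option. */
static bool warnings_enabled;
/* Counts the messages that were actually printed. */
static int reported_errors;

/* Simplified model of a reporter that is a no-op unless warnings are on. */
void report_with_pos(const char *file, int line, const char *fmt, ...)
{
	va_list args;

	if (!warnings_enabled)
		return;	/* the syntax error is silently dropped */

	fprintf(stderr, "%s:%d: ", file, line);
	va_start(args, fmt);
	vfprintf(stderr, fmt, args);
	va_end(args);
	fputc('\n', stderr);
	reported_errors++;
}
```

With warnings_enabled left false, a call to report_with_pos() prints
nothing, which matches the behaviour described above when -w is omitted.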
The syntax error occurs in the following code in include/linux/efi.h:
efi_status_t
efi_call_acpi_prm_handler(efi_status_t (__efiapi *handler_addr)(u64, void *),
u64 param_buffer_addr, void *context);
The issue arises from __efiapi, which is defined as either
__attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows abstract_declarator to be prefixed with attributes.
To avoid conflicts, I tweaked the rule for decl_specifier_seq. Due to
this change, a standalone attribute cannot become decl_specifier_seq.
Otherwise, I do not know how to resolve the conflicts.
The following code, which was previously accepted by genksyms, will now
result in a syntax error:
void my_func(__attribute__((unused))x);
I do not think it is a big deal because GCC also fails to parse it.
$ echo 'void my_func(__attribute__((unused))x);' | gcc -c -x c -
<stdin>:1:37: error: unknown type name 'x'
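For contrast, here is a small sketch of the form that compilers do accept,
where the attribute is followed by a real type specifier. It should also
match the new 'attribute_opt decl_specifier' rule, although that is my
reading of the grammar rather than something tested against genksyms. The
function name ignore_hint is made up for illustration.

```c
/*
 * The attribute comes first, but a type specifier ('int') follows it,
 * so the parameter is not a standalone attribute.
 */
int ignore_hint(__attribute__((unused)) int hint, int value)
{
	return value * 2;
}
```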
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index cd933a95548d..54e16c2e0b4b 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -203,8 +203,9 @@ decl_specifier_seq_opt:
;
decl_specifier_seq:
- decl_specifier { decl_spec = *$1; }
+ attribute_opt decl_specifier { decl_spec = *$2; }
| decl_specifier_seq decl_specifier { decl_spec = *$2; }
+ | decl_specifier_seq ATTRIBUTE_PHRASE { decl_spec = *$2; }
;
decl_specifier:
@@ -216,7 +217,6 @@ decl_specifier:
}
| type_specifier { dont_want_type_specifier = true; $$ = $1; }
| type_qualifier
- | ATTRIBUTE_PHRASE
;
storage_class_specifier:
@@ -406,8 +406,8 @@ direct_abstract_declarator1:
{ $$ = $4; }
| direct_abstract_declarator1 BRACKET_PHRASE
{ $$ = $2; }
- | open_paren abstract_declarator ')'
- { $$ = $3; }
+ | open_paren attribute_opt abstract_declarator ')'
+ { $$ = $4; }
| open_paren error ')'
{ $$ = $3; }
| BRACKET_PHRASE
--
2.43.0
* [PATCH 12/17] genksyms: fix syntax error for attribute before nested_declarator
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (10 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 11/17] genksyms: fix syntax error for attribute before abstract_declarator Masahiro Yamada
@ 2025-01-13 15:00 ` Masahiro Yamada
2025-01-13 15:00 ` [PATCH 13/17] genksyms: fix syntax error for attribute after abstact_declarator Masahiro Yamada
` (5 subsequent siblings)
17 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ drivers/acpi/prmt.i
$ cat drivers/acpi/prmt.i | scripts/genksyms/genksyms -w
[ snip ]
drivers/acpi/prmt.c:56: syntax error
The syntax error occurs in the following code in drivers/acpi/prmt.c:
struct prm_handler_info {
[ snip ]
efi_status_t (__efiapi *handler_addr)(u64, void *);
[ snip ]
};
The issue arises from __efiapi, which is defined as either
__attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows nested_declarator to be prefixed with attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 54e16c2e0b4b..49d3e536b9a8 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -345,8 +345,8 @@ direct_nested_declarator1:
{ $$ = $4; }
| direct_nested_declarator1 BRACKET_PHRASE
{ $$ = $2; }
- | '(' nested_declarator ')'
- { $$ = $3; }
+ | '(' attribute_opt nested_declarator ')'
+ { $$ = $4; }
| '(' error ')'
{ $$ = $3; }
;
--
2.43.0
* [PATCH 13/17] genksyms: fix syntax error for attribute after abstact_declarator
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (11 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 12/17] genksyms: fix syntax error for attribute before nested_declarator Masahiro Yamada
@ 2025-01-13 15:00 ` Masahiro Yamada
2025-01-13 15:00 ` [PATCH 14/17] genksyms: fix syntax error for attribute after 'struct' Masahiro Yamada
` (4 subsequent siblings)
17 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ kernel/module/main.i
$ cat kernel/module/main.i | scripts/genksyms/genksyms -w
[ snip ]
kernel/module/main.c:97: syntax error
The syntax error occurs in the following code in kernel/module/main.c:
static void __mod_update_bounds(enum mod_mem_type type __maybe_unused, void *base,
unsigned int size, struct mod_tree_root *tree)
{
[ snip ]
}
The issue arises from __maybe_unused, which is defined as
__attribute__((__unused__)).
This commit allows direct_abstract_declarator to be followed by
attributes.
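A minimal C sketch of the pattern in question, with the attribute placed
after the parameter name. The macro maybe_unused is a local stand-in for
the kernel's __maybe_unused, and clamp_size is a hypothetical function
used only for illustration.

```c
/* Local stand-in for the kernel's __maybe_unused macro. */
#define maybe_unused __attribute__((__unused__))

/* 'kind' is annotated after its declarator and is never read. */
int clamp_size(int kind maybe_unused, int size, int limit)
{
	return size > limit ? limit : size;
}
```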
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 49d3e536b9a8..82774df50642 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -383,8 +383,8 @@ abstract_declarator:
ptr_operator
| ptr_operator abstract_declarator
{ $$ = $2 ? $2 : $1; }
- | direct_abstract_declarator
- { $$ = $1; dont_want_type_specifier = false; }
+ | direct_abstract_declarator attribute_opt
+ { $$ = $2; dont_want_type_specifier = false; }
;
direct_abstract_declarator:
--
2.43.0
* [PATCH 14/17] genksyms: fix syntax error for attribute after 'struct'
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (12 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 13/17] genksyms: fix syntax error for attribute after abstact_declarator Masahiro Yamada
@ 2025-01-13 15:00 ` Masahiro Yamada
2025-01-13 15:00 ` [PATCH 15/17] genksyms: fix syntax error for attribute after 'union' Masahiro Yamada
` (3 subsequent siblings)
17 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ arch/x86/kernel/cpu/mshyperv.i
$ cat arch/x86/kernel/cpu/mshyperv.i | scripts/genksyms/genksyms -w
[ snip ]
./arch/x86/include/asm/svm.h:122: syntax error
The syntax error occurs in the following code in arch/x86/include/asm/svm.h:
struct __attribute__ ((__packed__)) vmcb_control_area {
[ snip ]
};
The issue arises from __attribute__ immediately after the 'struct'
keyword.
This commit allows the 'struct' keyword to be followed by attributes.
The lexer must be adjusted because dont_want_brace_phrase should not be
decremented while processing attributes.
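The lexer adjustment can be modelled as a pure function. This is a sketch
under assumptions: the enum and the function name are hypothetical, and
only the condition mirrors the change to lex.l.

```c
/* Hypothetical token kinds; only the attribute case matters here. */
enum token_kind { TOK_IDENT, TOK_ATTRIBUTE_PHRASE, TOK_BRACE_PHRASE };

/*
 * Return the updated countdown after one token. An attribute token
 * leaves the countdown untouched, so the '{' that follows
 * 'struct __attribute__((...)) foo' is still seen in time.
 */
int next_brace_countdown(enum token_kind token, int countdown)
{
	if (token != TOK_ATTRIBUTE_PHRASE && countdown > 0)
		--countdown;
	return countdown;
}
```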
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/lex.l | 7 ++++++-
scripts/genksyms/parse.y | 10 +++++-----
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/scripts/genksyms/lex.l b/scripts/genksyms/lex.l
index e886133af578..a1f969dcf24f 100644
--- a/scripts/genksyms/lex.l
+++ b/scripts/genksyms/lex.l
@@ -438,7 +438,12 @@ fini:
if (suppress_type_lookup > 0)
--suppress_type_lookup;
- if (dont_want_brace_phrase > 0)
+
+ /*
+ * __attribute__() can be placed immediately after the 'struct' keyword.
+ * e.g.) struct __attribute__((__packed__)) foo { ... };
+ */
+ if (token != ATTRIBUTE_PHRASE && dont_want_brace_phrase > 0)
--dont_want_brace_phrase;
yylval = &next_node->next;
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 82774df50642..33639232a709 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -234,16 +234,16 @@ type_specifier:
/* References to s/u/e's defined elsewhere. Rearrange things
so that it is easier to expand the definition fully later. */
- | STRUCT_KEYW IDENT
- { remove_node($1); (*$2)->tag = SYM_STRUCT; $$ = $2; }
+ | STRUCT_KEYW attribute_opt IDENT
+ { remove_node($1); (*$3)->tag = SYM_STRUCT; $$ = $3; }
| UNION_KEYW IDENT
{ remove_node($1); (*$2)->tag = SYM_UNION; $$ = $2; }
| ENUM_KEYW IDENT
{ remove_node($1); (*$2)->tag = SYM_ENUM; $$ = $2; }
/* Full definitions of an s/u/e. Record it. */
- | STRUCT_KEYW IDENT class_body
- { record_compound($1, $2, $3, SYM_STRUCT); $$ = $3; }
+ | STRUCT_KEYW attribute_opt IDENT class_body
+ { record_compound($1, $3, $4, SYM_STRUCT); $$ = $4; }
| UNION_KEYW IDENT class_body
{ record_compound($1, $2, $3, SYM_UNION); $$ = $3; }
| ENUM_KEYW IDENT enum_body
@@ -254,7 +254,7 @@ type_specifier:
| ENUM_KEYW enum_body
{ add_symbol(NULL, SYM_ENUM, NULL, 0); $$ = $2; }
/* Anonymous s/u definitions. Nothing needs doing. */
- | STRUCT_KEYW class_body { $$ = $2; }
+ | STRUCT_KEYW attribute_opt class_body { $$ = $3; }
| UNION_KEYW class_body { $$ = $2; }
;
--
2.43.0
* [PATCH 15/17] genksyms: fix syntax error for attribute after 'union'
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (13 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 14/17] genksyms: fix syntax error for attribute after 'struct' Masahiro Yamada
@ 2025-01-13 15:00 ` Masahiro Yamada
2025-01-13 15:00 ` [PATCH 16/17] genksyms: fix syntax error for builtin (u)int*x*_t types Masahiro Yamada
` (2 subsequent siblings)
17 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ fs/lockd/svc.i
$ cat fs/lockd/svc.i | scripts/genksyms/genksyms -w
[ snip ]
./include/net/addrconf.h:35: syntax error
The syntax error occurs in the following code in include/net/addrconf.h:
union __packed {
[ snip ]
};
The issue arises from __packed, which expands to
__attribute__((__packed__)), immediately after the 'union' keyword.
This commit allows the 'union' keyword to be followed by attributes.
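As an illustration of the syntax, the following sketch places the
attribute directly after 'union'. The union tag and the helper function
are hypothetical, and __alignof__ is a GNU extension supported by GCC and
Clang.

```c
#include <stddef.h>

/* Attribute placed between the 'union' keyword and the tag. */
union __attribute__((__packed__)) word_or_bytes {
	int word;
	char bytes[sizeof(int)];
};

/* A packed union has no alignment requirement beyond one byte. */
size_t packed_union_alignment(void)
{
	return __alignof__(union word_or_bytes);
}
```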
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 33639232a709..a2cd035a78c9 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -236,16 +236,16 @@ type_specifier:
so that it is easier to expand the definition fully later. */
| STRUCT_KEYW attribute_opt IDENT
{ remove_node($1); (*$3)->tag = SYM_STRUCT; $$ = $3; }
- | UNION_KEYW IDENT
- { remove_node($1); (*$2)->tag = SYM_UNION; $$ = $2; }
+ | UNION_KEYW attribute_opt IDENT
+ { remove_node($1); (*$3)->tag = SYM_UNION; $$ = $3; }
| ENUM_KEYW IDENT
{ remove_node($1); (*$2)->tag = SYM_ENUM; $$ = $2; }
/* Full definitions of an s/u/e. Record it. */
| STRUCT_KEYW attribute_opt IDENT class_body
{ record_compound($1, $3, $4, SYM_STRUCT); $$ = $4; }
- | UNION_KEYW IDENT class_body
- { record_compound($1, $2, $3, SYM_UNION); $$ = $3; }
+ | UNION_KEYW attribute_opt IDENT class_body
+ { record_compound($1, $3, $4, SYM_UNION); $$ = $4; }
| ENUM_KEYW IDENT enum_body
{ record_compound($1, $2, $3, SYM_ENUM); $$ = $3; }
/*
@@ -255,7 +255,7 @@ type_specifier:
{ add_symbol(NULL, SYM_ENUM, NULL, 0); $$ = $2; }
/* Anonymous s/u definitions. Nothing needs doing. */
| STRUCT_KEYW attribute_opt class_body { $$ = $3; }
- | UNION_KEYW class_body { $$ = $2; }
+ | UNION_KEYW attribute_opt class_body { $$ = $3; }
;
simple_type_specifier:
--
2.43.0
* [PATCH 16/17] genksyms: fix syntax error for builtin (u)int*x*_t types
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (14 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 15/17] genksyms: fix syntax error for attribute after 'union' Masahiro Yamada
@ 2025-01-13 15:00 ` Masahiro Yamada
2025-01-13 15:00 ` [PATCH 17/17] genksyms: fix syntax error for attribute before init-declarator Masahiro Yamada
2025-01-14 20:33 ` [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Nicolas Schier
17 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, genksyms fails to parse the following code in
arch/arm64/lib/xor-neon.c:
static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
{
[ snip ]
}
The syntax error occurs because genksyms does not recognize uint64x2_t
as a type name.
This commit adds support for builtin types described in the Arm Neon
Intrinsics Reference.
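To see which names the new lexer pattern covers, the same expression can
be tried with POSIX regcomp(). This is only a sketch: the '^' and '$'
anchors are my addition for standalone matching, and is_neon_int_type is
a hypothetical helper, not part of genksyms.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if 'name' matches the pattern added to lex.l. */
bool is_neon_int_type(const char *name)
{
	static const char pattern[] =
		"^u?int(8|16|32|64)x(1|2|4|8|16)_t$";
	regex_t re;
	bool match;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	match = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}
```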
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/lex.l | 1 +
1 file changed, 1 insertion(+)
diff --git a/scripts/genksyms/lex.l b/scripts/genksyms/lex.l
index a1f969dcf24f..22aeb57649d9 100644
--- a/scripts/genksyms/lex.l
+++ b/scripts/genksyms/lex.l
@@ -51,6 +51,7 @@ MC_TOKEN ([~%^&*+=|<>/-]=)|(&&)|("||")|(->)|(<<)|(>>)
%%
+u?int(8|16|32|64)x(1|2|4|8|16)_t return BUILTIN_INT_KEYW;
/* Keep track of our location in the original source files. */
^#[ \t]+{INT}[ \t]+\"[^\"\n]+\".*\n return FILENAME;
--
2.43.0
* [PATCH 17/17] genksyms: fix syntax error for attribute before init-declarator
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (15 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 16/17] genksyms: fix syntax error for builtin (u)int*x*_t types Masahiro Yamada
@ 2025-01-13 15:00 ` Masahiro Yamada
2025-01-14 20:33 ` [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Nicolas Schier
17 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-13 15:00 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel, Masahiro Yamada
A longstanding issue with genksyms is that it has hidden syntax errors.
For example, genksyms fails to parse the following valid code:
int x, __attribute__((__section__(".init.data")))y;
Here, only 'y' is annotated by the attribute, although I am not aware
of actual uses of this pattern in the kernel tree.
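The per-declarator scope of such an attribute can be checked with a
different attribute that is easier to observe. The sketch below uses
aligned(16) instead of __section__, which is my substitution for
illustration; the variable and function names are hypothetical, and
__alignof__ is a GNU extension.

```c
#include <stddef.h>

/* Only 'second' is annotated; 'first' keeps the default alignment. */
char first, __attribute__((aligned(16))) second;

size_t first_alignment(void)
{
	return __alignof__(first);
}

size_t second_alignment(void)
{
	return __alignof__(second);
}
```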
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
$ echo 'int x, __attribute__((__section__(".init.data")))y;' | scripts/genksyms/genksyms -w
<stdin>:1: syntax error
This commit allows attributes to be placed between a comma and
init_declarator.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---
scripts/genksyms/parse.y | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index a2cd035a78c9..ee600a804fa1 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -173,9 +173,9 @@ init_declarator_list:
$$ = $1;
dont_want_type_specifier = true;
}
- | init_declarator_list ',' init_declarator
- { struct string_list *decl = *$3;
- *$3 = NULL;
+ | init_declarator_list ',' attribute_opt init_declarator
+ { struct string_list *decl = *$4;
+ *$4 = NULL;
free_list(*$2, NULL);
*$2 = decl_spec;
@@ -186,7 +186,7 @@ init_declarator_list:
add_symbol(current_name,
is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
current_name = NULL;
- $$ = $3;
+ $$ = $4;
dont_want_type_specifier = true;
}
;
--
2.43.0
* Re: [PATCH 04/17] genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
2025-01-13 15:00 ` [PATCH 04/17] genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts Masahiro Yamada
@ 2025-01-14 1:23 ` Masahiro Yamada
0 siblings, 0 replies; 20+ messages in thread
From: Masahiro Yamada @ 2025-01-14 1:23 UTC (permalink / raw)
To: linux-kbuild; +Cc: linux-kernel
On Tue, Jan 14, 2025 at 12:03 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> The genksyms parser has ambiguities in its grammar, which are currently
> suppressed by a workaround in scripts/genksyms/Makefile.
>
> Building genksyms with W=1 generates the following warnings:
>
> YACC scripts/genksyms/parse.tab.[ch]
> scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
> scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
> scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
>
> The comment in the parser describes the current problem:
>
> /* This wasn't really a typedef name but an identifier that
> shadows one. */
>
> Consider the following simple C code:
>
> typedef int foo;
> void my_func(foo foo) {}
>
> In the function parameter list (foo foo), the first 'foo' is a type
> specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
>
> However, the lexer cannot distinguish between the two. Since 'foo' is
> already typedef'ed, the lexer returns TYPE for both instances, instead
> of returning IDENT for the second one.
>
> To support shadowed identifiers, IDENT can be reduced to either a
s/IDENT/TYPE/ (the quoted line above should read "TYPE can be reduced
to either a")
> simple_type_specifier or a direct_abstract_declarator, which creates
> a grammatical ambiguity.
>
> Without analyzing the grammar context, it is very difficult to resolve
> this correctly.
>
> This commit introduces a flag, dont_want_type_specifier, which allows
> the parser to inform the lexer whether an identifier is expected. When
> dont_want_type_specifier is true, the type lookup is suppressed, and
> the lexer returns IDENT regardless of any preceding typedef.
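The decision described in this paragraph can be sketched as a small C
function. The enum and function names are hypothetical; only the role of
dont_want_type_specifier is taken from the description.

```c
#include <stdbool.h>

/* Hypothetical lexer results for a plain name. */
enum lex_result { LEX_IDENT, LEX_TYPE };

/*
 * Classify a name. When the parser has set dont_want_type_specifier,
 * the typedef lookup is skipped and the name is always an identifier.
 */
enum lex_result classify_name(bool is_typedef_name,
			      bool dont_want_type_specifier)
{
	if (dont_want_type_specifier)
		return LEX_IDENT;

	return is_typedef_name ? LEX_TYPE : LEX_IDENT;
}
```

For the parameter list (foo foo), the first 'foo' would be classified
with the flag clear and the second with the flag set.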
>
> After this commit, only 3 shift/reduce conflicts will remain.
>
> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
--
Best Regards
Masahiro Yamada
* Re: [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser
2025-01-13 15:00 [PATCH 00/17] genksyms: fix conflicts and syntax errors in parser Masahiro Yamada
` (16 preceding siblings ...)
2025-01-13 15:00 ` [PATCH 17/17] genksyms: fix syntax error for attribute before init-declarator Masahiro Yamada
@ 2025-01-14 20:33 ` Nicolas Schier
17 siblings, 0 replies; 20+ messages in thread
From: Nicolas Schier @ 2025-01-14 20:33 UTC (permalink / raw)
To: Masahiro Yamada; +Cc: linux-kbuild, linux-kernel, Nicolas Schier
On Tue, Jan 14, 2025 at 12:00:38AM +0900, Masahiro Yamada wrote:
>
> This series fixes several long-standing issues in genksyms.
>
> - The parser contains grammatical ambiguities, including both
> reduce/reduce and shift/reduce conflicts.
>
> - There are several hidden syntax errors
> When a syntax error occurs, the type becomes UNKNOWN, and
> precise CRC calculation becomes impossible.
>
>
>
> Masahiro Yamada (17):
> genksyms: rename m_abstract_declarator to abstract_declarator
> genksyms: rename cvar_qualifier to type_qualifier
> genksyms: reduce type_qualifier directly to decl_specifier
> genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
> genksyms: fix last 3 shift/reduce conflicts
> genksyms: remove Makefile hack
> genksyms: restrict direct-abstract-declarator to take one
> parameter-type-list
> genksyms: restrict direct-declarator to take one parameter-type-list
> genksyms: record attributes consistently for init-declarator
> genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier
> genksyms: fix syntax error for attribute before abstract_declarator
> genksyms: fix syntax error for attribute before nested_declarator
> genksyms: fix syntax error for attribute after abstact_declarator
> genksyms: fix syntax error for attribute after 'struct'
> genksyms: fix syntax error for attribute after 'union'
> genksyms: fix syntax error for builtin (u)int*x*_t types
> genksyms: fix syntax error for attribute before init-declarator
>
> scripts/genksyms/Makefile | 18 -----
> scripts/genksyms/genksyms.h | 3 +
> scripts/genksyms/lex.l | 17 +++-
> scripts/genksyms/parse.y | 150 ++++++++++++++++++++----------------
> 4 files changed, 101 insertions(+), 87 deletions(-)
>
> --
> 2.43.0
Thanks for the series, especially for the very detailed,
explanatory commit messages!
I looked through all the patches and they all look good to me
-- but my bison/parsing/lexing knowledge is rusty and quite
limited, therefore I cannot review properly.
Acked-by: Nicolas Schier <n.schier@avm.de>
Kind regards,
Nicolas