public inbox for linux-kernel@vger.kernel.org
From: Masahiro Yamada <masahiroy@kernel.org>
To: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Masahiro Yamada <masahiroy@kernel.org>
Subject: [PATCH 04/17] genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
Date: Tue, 14 Jan 2025 00:00:42 +0900
Message-ID: <20250113150253.3097820-5-masahiroy@kernel.org>
In-Reply-To: <20250113150253.3097820-1-masahiroy@kernel.org>

The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.

Building genksyms with W=1 generates the following warnings:

    YACC    scripts/genksyms/parse.tab.[ch]
  scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
  scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
  scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples

The comment in the parser describes the current problem:

    /* This wasn't really a typedef name but an identifier that
       shadows one.  */

Consider the following simple C code:

    typedef int foo;
    void my_func(foo foo) {}

In the function parameter list (foo foo), the first 'foo' is a type
specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
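
As a quick check that this really is valid C, here is a minimal sketch.
It is a variation on the snippet above: the return type and the function
body are assumptions added only so that the shadowing is observable.

```c
typedef int foo;

/*
 * The first 'foo' is the typedef name used as a type specifier.
 * The second 'foo' declares a parameter that shadows the typedef.
 */
int my_func(foo foo)
{
    /* Inside the body, 'foo' names the parameter, not the type. */
    return foo + 1;
}
```

Calling my_func(41) uses the parameter value, which shows that the second
'foo' behaves as an ordinary identifier.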

However, the lexer cannot distinguish between the two. Since 'foo' is
already typedef'ed, the lexer returns TYPE for both instances, instead
of returning IDENT for the second one.

To support shadowed identifiers, the grammar lets TYPE be reduced to
either a simple_type_specifier or a direct_abstract_declarator, which
creates a grammatical ambiguity.

The lexer cannot resolve this correctly on its own; it needs to know the
grammar context in which the name appears.

This commit introduces a flag, dont_want_type_specifier, which allows
the parser to inform the lexer whether an identifier is expected. When
dont_want_type_specifier is true, the type lookup is suppressed, and
the lexer returns IDENT regardless of any preceding typedef.
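
The idea can be sketched in isolation as follows. This is a toy model, not
the genksyms code: every toy_* name, the fixed typedef table, and the
two-word "parameter" helper are hypothetical stand-ins for yylex(),
find_symbol(), and the parser actions touched by this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the real ones are generated by the parser. */
enum toy_token { TOY_IDENT, TOY_TYPE };

/* Hint set by the toy parser, mirroring dont_want_type_specifier. */
static bool toy_dont_want_type_specifier;

/* Stand-in for the typedef lookup: a fixed table with one entry. */
static bool toy_is_typedef(const char *name)
{
    static const char *const typedefs[] = { "foo" };

    for (size_t i = 0; i < sizeof(typedefs) / sizeof(typedefs[0]); i++)
        if (strcmp(name, typedefs[i]) == 0)
            return true;
    return false;
}

/* Classify one name; the typedef lookup is skipped while the hint is set. */
static enum toy_token toy_classify(const char *name)
{
    if (!toy_dont_want_type_specifier && toy_is_typedef(name))
        return TOY_TYPE;
    return TOY_IDENT;
}

/*
 * Classify the two words of a parameter such as "foo foo". Once the first
 * word has been accepted as a type, the hint is set so that the second
 * word comes back as an identifier; the hint is cleared afterwards.
 */
static void toy_classify_parameter(const char *first, const char *second,
                                   enum toy_token out[2])
{
    toy_dont_want_type_specifier = false;
    out[0] = toy_classify(first);
    if (out[0] == TOY_TYPE)
        toy_dont_want_type_specifier = true;  /* a type specifier was seen */
    out[1] = toy_classify(second);
    toy_dont_want_type_specifier = false;     /* the declarator is complete */
}
```

With "foo" in the typedef table, toy_classify_parameter("foo", "foo", out)
yields TOY_TYPE for the first word and TOY_IDENT for the second, which is
the behaviour the flag is meant to provide for (foo foo).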

After this commit, only 3 shift/reduce conflicts will remain.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
---

 scripts/genksyms/genksyms.h |  3 +++
 scripts/genksyms/lex.l      |  9 ++++++++-
 scripts/genksyms/parse.y    | 37 +++++++++++++++----------------------
 3 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/scripts/genksyms/genksyms.h b/scripts/genksyms/genksyms.h
index 8c45ada59ece..0c355075f0e6 100644
--- a/scripts/genksyms/genksyms.h
+++ b/scripts/genksyms/genksyms.h
@@ -12,6 +12,7 @@
 #ifndef MODUTILS_GENKSYMS_H
 #define MODUTILS_GENKSYMS_H 1
 
+#include <stdbool.h>
 #include <stdio.h>
 
 #include <list_types.h>
@@ -66,6 +67,8 @@ struct string_list *copy_list_range(struct string_list *start,
 int yylex(void);
 int yyparse(void);
 
+extern bool dont_want_type_specifier;
+
 void error_with_pos(const char *, ...) __attribute__ ((format(printf, 1, 2)));
 
 /*----------------------------------------------------------------------*/
diff --git a/scripts/genksyms/lex.l b/scripts/genksyms/lex.l
index a4d7495eaf75..e886133af578 100644
--- a/scripts/genksyms/lex.l
+++ b/scripts/genksyms/lex.l
@@ -12,6 +12,7 @@
 %{
 
 #include <limits.h>
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
@@ -113,6 +114,12 @@ MC_TOKEN		([~%^&*+=|<>/-]=)|(&&)|("||")|(->)|(<<)|(>>)
 /* The second stage lexer.  Here we incorporate knowledge of the state
    of the parser to tailor the tokens that are returned.  */
 
+/*
+ * The lexer cannot distinguish whether a typedef'ed string is a TYPE or an
+ * IDENT. We need a hint from the parser to handle this accurately.
+ */
+bool dont_want_type_specifier;
+
 int
 yylex(void)
 {
@@ -207,7 +214,7 @@ repeat:
 		    goto repeat;
 		  }
 	      }
-	    if (!suppress_type_lookup)
+	    if (!suppress_type_lookup && !dont_want_type_specifier)
 	      {
 		if (find_symbol(yytext, SYM_TYPEDEF, 1))
 		  token = TYPE;
diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index 20cb3db7f149..dc575d467bbf 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -12,6 +12,7 @@
 %{
 
 #include <assert.h>
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 #include "genksyms.h"
@@ -148,6 +149,7 @@ simple_declaration:
 		    current_name = NULL;
 		  }
 		  $$ = $3;
+		  dont_want_type_specifier = false;
 		}
 	;
 
@@ -169,6 +171,7 @@ init_declarator_list:
 			     is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
 		  current_name = NULL;
 		  $$ = $1;
+		  dont_want_type_specifier = true;
 		}
 	| init_declarator_list ',' init_declarator
 		{ struct string_list *decl = *$3;
@@ -184,6 +187,7 @@ init_declarator_list:
 			     is_typedef ? SYM_TYPEDEF : SYM_NORMAL, decl, is_extern);
 		  current_name = NULL;
 		  $$ = $3;
+		  dont_want_type_specifier = true;
 		}
 	;
 
@@ -210,7 +214,7 @@ decl_specifier:
 		  remove_node($1);
 		  $$ = $1;
 		}
-	| type_specifier
+	| type_specifier	{ dont_want_type_specifier = true; $$ = $1; }
 	| type_qualifier
 	;
 
@@ -307,15 +311,7 @@ direct_declarator:
 		    current_name = (*$1)->string;
 		    $$ = $1;
 		  }
-		}
-	| TYPE
-		{ if (current_name != NULL) {
-		    error_with_pos("unexpected second declaration name");
-		    YYERROR;
-		  } else {
-		    current_name = (*$1)->string;
-		    $$ = $1;
-		  }
+		  dont_want_type_specifier = false;
 		}
 	| direct_declarator '(' parameter_declaration_clause ')'
 		{ $$ = $4; }
@@ -335,8 +331,7 @@ nested_declarator:
 	;
 
 direct_nested_declarator:
-	IDENT
-	| TYPE
+	IDENT	{ $$ = $1; dont_want_type_specifier = false; }
 	| direct_nested_declarator '(' parameter_declaration_clause ')'
 		{ $$ = $4; }
 	| direct_nested_declarator '(' error ')'
@@ -362,8 +357,9 @@ parameter_declaration_list_opt:
 
 parameter_declaration_list:
 	parameter_declaration
+		{ $$ = $1; dont_want_type_specifier = false; }
 	| parameter_declaration_list ',' parameter_declaration
-		{ $$ = $3; }
+		{ $$ = $3; dont_want_type_specifier = false; }
 	;
 
 parameter_declaration:
@@ -375,6 +371,7 @@ abstract_declarator:
 	ptr_operator abstract_declarator
 		{ $$ = $2 ? $2 : $1; }
 	| direct_abstract_declarator
+		{ $$ = $1; dont_want_type_specifier = false; }
 	;
 
 direct_abstract_declarator:
@@ -385,12 +382,6 @@ direct_abstract_declarator:
 		  remove_node($1);
 		  $$ = $1;
 		}
-	/* This wasn't really a typedef name but an identifier that
-	   shadows one.  */
-	| TYPE
-		{ remove_node($1);
-		  $$ = $1;
-		}
 	| direct_abstract_declarator '(' parameter_declaration_clause ')'
 		{ $$ = $4; }
 	| direct_abstract_declarator '(' error ')'
@@ -440,9 +431,9 @@ member_specification:
 
 member_declaration:
 	decl_specifier_seq_opt member_declarator_list_opt ';'
-		{ $$ = $3; }
+		{ $$ = $3; dont_want_type_specifier = false; }
 	| error ';'
-		{ $$ = $2; }
+		{ $$ = $2; dont_want_type_specifier = false; }
 	;
 
 member_declarator_list_opt:
@@ -452,7 +443,9 @@ member_declarator_list_opt:
 
 member_declarator_list:
 	member_declarator
-	| member_declarator_list ',' member_declarator	{ $$ = $3; }
+		{ $$ = $1; dont_want_type_specifier = true; }
+	| member_declarator_list ',' member_declarator
+		{ $$ = $3; dont_want_type_specifier = true; }
 	;
 
 member_declarator:
-- 
2.43.0


