* [PATCH] lexer: support // comments
From: Kris Van Hees @ 2025-07-22 22:12 UTC (permalink / raw)
  To: dtrace, dtrace-devel

Suggested-by: Ruud van der Pas <ruud.vanderpas@oracle.com>
Suggested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com>
---
 libdtrace/dt_lex.l                               | 14 +++++++++++++-
 .../lexer/err.D_SYNTAX.boc-in-line-comment.d     | 16 ++++++++++++++++
 .../lexer/err.D_SYNTAX.boc-in-line-comment.r     |  2 ++
 .../lexer/err.D_SYNTAX.eoc-in-line-comment.d     | 16 ++++++++++++++++
 .../lexer/err.D_SYNTAX.eoc-in-line-comment.r     |  2 ++
 .../lexer/err.D_SYNTAX.eof-in-line-comment.d     | 16 ++++++++++++++++
 .../lexer/err.D_SYNTAX.eof-in-line-comment.r     |  2 ++
 .../lexer/err.D_SYNTAX.lc-in-line-comment.d      | 16 ++++++++++++++++
 .../lexer/err.D_SYNTAX.lc-in-line-comment.r      |  2 ++
 test/unittest/lexer/tst.line-comment.d           | 16 ++++++++++++++++
 test/unittest/lexer/tst.line-comment.r           |  5 +++++
 11 files changed, 106 insertions(+), 1 deletion(-)
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.d
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.r
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.d
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.r
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.d
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.r
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.d
 create mode 100644 test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.r
 create mode 100644 test/unittest/lexer/tst.line-comment.d
 create mode 100644 test/unittest/lexer/tst.line-comment.r

diff --git a/libdtrace/dt_lex.l b/libdtrace/dt_lex.l
index 9d502912..a7234800 100644
--- a/libdtrace/dt_lex.l
+++ b/libdtrace/dt_lex.l
@@ -35,6 +35,7 @@ int yydebug;
  * S2 - D program outer scope (probe specifiers and declarations)
  * S3 - D control line parsing (i.e. after ^# is seen but before \n)
  * S4 - D control line scan (locate control directives only and invoke S3)
+ * S5 - D line comments (i.e. skip everything until end of line)
  * SIDENT - identifiers and comments only (after -> and .).  (We switch to
  *          SIDENT only from state S0: changing this would require new code
  *          to track the state to switch back to.)
@@ -46,7 +47,7 @@ int yydebug;
 %n 600		/* maximum states */
 %option yylineno
 
-%s S0 S1 S2 S3 S4 SIDENT
+%s S0 S1 S2 S3 S4 S5 SIDENT
 
 RGX_AGG		"@"[a-zA-Z_][0-9a-zA-Z_]*
 RGX_PSPEC	[-$:a-zA-Z_.?*\\\[\]!][-$:0-9a-zA-Z_.`?*\\\[\]!]*
@@ -408,6 +409,11 @@ if (yypcb->pcb_token != 0) {
 			BEGIN(S1);
 		}
 
+<S0,S2,SIDENT>"//"	{
+			yypcb->pcb_cstate = (YYSTATE);
+			BEGIN(S5);
+		}
+
 <S0>^{RGX_INTERP} |
 <S2>^{RGX_INTERP} ;	/* discard any #! lines */
 
@@ -548,6 +554,12 @@ if (yypcb->pcb_token != 0) {
 <S1>.|\n	; /* discard */
 <S1><<EOF>>	yyerror("end-of-file encountered before matching */\n");
 
+<S1>"/*"	yyerror("/* encountered inside a line comment\n");
+<S1>"*/"	yyerror("*/ encountered inside a line comment\n");
+<S1>"//"	yyerror("/*/encountered inside a line comment\n");
+<S5>\n		BEGIN(yypcb->pcb_cstate);
+<S5>.		; /* discard */
+
 <S2>{RGX_PSPEC}	{
 			/*
 			 * S2 has an ambiguity because RGX_PSPEC includes '*'
diff --git a/test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.d b/test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.d
new file mode 100644
index 00000000..61f5961f
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.d
@@ -0,0 +1,16 @@
+/*
+ * Oracle Linux DTrace.
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * Licensed under the Universal Permissive License v 1.0 as shown at
+ * http://oss.oracle.com/licenses/upl.
+ */
+
+/*
+ * ASSERTION: Line comments cannot contain begin-of-comment markers.
+ */
+
+// Comment /*
+BEGIN
+{
+	exit(0);
+}
diff --git a/test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.r b/test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.r
new file mode 100644
index 00000000..74d1d0ef
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.r
@@ -0,0 +1,2 @@
+-- @@stderr --
+dtrace: failed to compile script test/unittest/lexer/err.D_SYNTAX.boc-in-line-comment.d: [D_SYNTAX] line 12: /* encountered inside a line comment
diff --git a/test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.d b/test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.d
new file mode 100644
index 00000000..8fe2ff04
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.d
@@ -0,0 +1,16 @@
+/*
+ * Oracle Linux DTrace.
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * Licensed under the Universal Permissive License v 1.0 as shown at
+ * http://oss.oracle.com/licenses/upl.
+ */
+
+/*
+ * ASSERTION: Line comments cannot contain end-of-comment markers.
+ */
+
+// Comment */
+BEGIN
+{
+	exit(0);
+}
diff --git a/test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.r b/test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.r
new file mode 100644
index 00000000..64d5ae63
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.r
@@ -0,0 +1,2 @@
+-- @@stderr --
+dtrace: failed to compile script test/unittest/lexer/err.D_SYNTAX.eoc-in-line-comment.d: [D_SYNTAX] line 12: */ encountered inside a line comment
diff --git a/test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.d b/test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.d
new file mode 100644
index 00000000..a9207693
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.d
@@ -0,0 +1,16 @@
+/*
+ * Oracle Linux DTrace.
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * Licensed under the Universal Permissive License v 1.0 as shown at
+ * http://oss.oracle.com/licenses/upl.
+ */
+
+/*
+ * ASSERTION: End-of-file in a line comment is an error.
+ */
+
+BEGIN
+{
+	exit(0);
+}
+// Comment
\ No newline at end of file
diff --git a/test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.r b/test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.r
new file mode 100644
index 00000000..f46fce3e
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.r
@@ -0,0 +1,2 @@
+-- @@stderr --
+dtrace: failed to compile script test/unittest/lexer/err.D_SYNTAX.eof-in-line-comment.d: [D_SYNTAX] line 16: end-of-file encountered in line comment
diff --git a/test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.d b/test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.d
new file mode 100644
index 00000000..0332e1ae
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.d
@@ -0,0 +1,16 @@
+/*
+ * Oracle Linux DTrace.
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * Licensed under the Universal Permissive License v 1.0 as shown at
+ * http://oss.oracle.com/licenses/upl.
+ */
+
+/*
+ * ASSERTION: Line comments cannot contain a line commens marker.
+ */
+
+// Comment //
+BEGIN
+{
+	exit(0);
+}
diff --git a/test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.r b/test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.r
new file mode 100644
index 00000000..d1152afc
--- /dev/null
+++ b/test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.r
@@ -0,0 +1,2 @@
+-- @@stderr --
+dtrace: failed to compile script test/unittest/lexer/err.D_SYNTAX.lc-in-line-comment.d: [D_SYNTAX] line 12: // encountered inside a line comment
diff --git a/test/unittest/lexer/tst.line-comment.d b/test/unittest/lexer/tst.line-comment.d
new file mode 100644
index 00000000..4deb66f4
--- /dev/null
+++ b/test/unittest/lexer/tst.line-comment.d
@@ -0,0 +1,16 @@
+/*
+ * Oracle Linux DTrace.
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * Licensed under the Universal Permissive License v 1.0 as shown at
+ * http://oss.oracle.com/licenses/upl.
+ */
+
+/*
+ * ASSERTION: // comments are supported.
+ */
+
+// exit(1);
+BEGIN // exit(1);
+{ // exit(1);
+	exit(0); // exit(1);
+} // exit(1);
diff --git a/test/unittest/lexer/tst.line-comment.r b/test/unittest/lexer/tst.line-comment.r
new file mode 100644
index 00000000..d4e4c325
--- /dev/null
+++ b/test/unittest/lexer/tst.line-comment.r
@@ -0,0 +1,5 @@
+                   FUNCTION:NAME
+                          :BEGIN 
+
+-- @@stderr --
+dtrace: script 'test/unittest/lexer/tst.line-comment.d' matched 1 probe
-- 
2.45.2



* Re: [DTrace-devel] [PATCH] lexer: support // comments
From: Eugene Loh @ 2025-07-23  1:18 UTC (permalink / raw)
  To: Kris Van Hees, dtrace, dtrace-devel

1)  s/commens/comment/

2)  Am I doing something wrong with this patch?

$ cat x.d
/*
  *  // is this okay?
  */
BEGIN {
   exit(0);
}

I get stuff like:
         /*/encountered inside a line comment

Frankly, I don't get as far as this script;  I first hit such problems 
with all the copyright notices we have in D files -- that 
"http://oss...." stuff:

/*
  * Oracle Linux DTrace.
  * Copyright (c) 2012, 2023, Oracle and/or its affiliates. All rights reserved.
  * Licensed under the Universal Permissive License v 1.0 as shown at
  * http://oss.oracle.com/licenses/upl.
  */

3)  Where do all the rules in this patch (tested by err.*) come from?  
E.g., C seems more lenient:

$ cat x.c
// /* hello */
// comment /*
// comment //
int main(int c, char **v) {
   /*
    * //
    */
   return 0;
}
$ gcc x.c
$ echo $?
0



* Re: [DTrace-devel] [PATCH] lexer: support // comments
From: Kris Van Hees @ 2025-07-23  3:14 UTC (permalink / raw)
  To: Eugene Loh; +Cc: Kris Van Hees, dtrace, dtrace-devel

On Tue, Jul 22, 2025 at 09:18:16PM -0400, Eugene Loh wrote:
> 1)  s/commens/comment/

Thanks.

> 2)  Am I doing something wrong with this patch?
> 
> $ cat x.d
> /*
>  *  // is this okay?
>  */
> BEGIN {
>   exit(0);
> }
> 
> I get stuff like:
>         /*/encountered inside a line comment
> 
> Frankly, I don't get as far as this script;  I first hit such problems with
> all the copyright notices we have in D files -- that "http://oss...." stuff:
> 
> /*
>  * Oracle Linux DTrace.
>  * Copyright (c) 2012, 2023, Oracle and/or its affiliates. All rights reserved.
>  * Licensed under the Universal Permissive License v 1.0 as shown at
>  * http://oss.oracle.com/licenses/upl.
>  */

My mistake - I sent out the patch before I re-dumped the patch in its final
version, and in the version I sent out some rules are still marked as state
<S1> rather than state <S5> due to copy'n'paste.  That causes this strange
behaviour.
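
The mix-up described above can be modelled outside flex. The following is a minimal sketch in C, not the lexer itself: the state names, the scan() helper and its rules_in_block flag are hypothetical, and it assumes the pre-existing S1 rule that closes a block comment on "*/" still takes precedence. With the error rules attached to the block-comment state (as posted), any "//" inside /* ... */ is rejected, which is what trips on "http://oss...." in the copyright header; with the rules attached to the line-comment state (the presumed intent), block comments are unaffected.

```c
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for the lexer's start conditions (hypothetical names). */
enum state { CODE, BLOCK /* models S1 */, LINE /* models S5 */ };

/*
 * Scan src and return 0 if it is accepted, -1 if a comment marker is rejected.
 * rules_in_block = 1 models the posted patch (error rules attached to S1);
 * rules_in_block = 0 models the presumed intent (error rules attached to S5).
 */
static int scan(const char *src, int rules_in_block)
{
	enum state st = CODE;
	enum state guarded = rules_in_block ? BLOCK : LINE;
	size_t i = 0, n = strlen(src);

	while (i < n) {
		int two = i + 1 < n;
		int boc = two && src[i] == '/' && src[i + 1] == '*';
		int eoc = two && src[i] == '*' && src[i + 1] == '/';
		int lc  = two && src[i] == '/' && src[i + 1] == '/';

		if (st == CODE) {
			if (boc) { st = BLOCK; i += 2; continue; }
			if (lc)  { st = LINE;  i += 2; continue; }
			i++;
			continue;
		}
		if (st == BLOCK) {
			/* Assumed: the existing rule that ends the comment wins. */
			if (eoc) { st = CODE; i += 2; continue; }
			/* Misplaced rule: "//" inside a block comment is an error. */
			if (lc && guarded == BLOCK)
				return -1;
			i++;
			continue;
		}
		/* st == LINE: a newline ends the comment. */
		if (src[i] == '\n') { st = CODE; i++; continue; }
		if (guarded == LINE && (boc || eoc || lc))
			return -1;
		i++;
	}
	return 0;
}
```

Calling scan() on the x.d script above with rules_in_block set to 1 returns -1, while the same input with rules_in_block set to 0 is accepted; a line such as "// Comment /*" is rejected only in the second mode.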

> 3)  Where do all the rules in this patch (tested by err.*) come from?  E.g.,
> C seems more lenient:
> 
> $ cat x.c
> // /* hello */
> // comment /*
> // comment //
> int main(int c, char **v) {
>   /*
>    * //
>    */
>   return 0;
> }
> $ gcc x.c
> $ echo $?
> 0

Yes, C is more lenient.  I based the implementation on the behaviour of other
DTrace implementations that added //-comments.  I'd be open to making it more
lenient as well - I honestly didn't check out C's behaviour because the rules
I observed for other DTrace implementations seemed to be quite consistent with
what DTrace is doing with block comments (/* ... */).
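
To make the two policies concrete, here is a small sketch in C. The line_comment_len() helper and its strict flag are hypothetical and are not taken from the patch. It treats the text after an opening "//" either the lenient way (everything up to the newline is ignored, as in the gcc example above) or the strict way the err.* tests assert (any nested "/*", "*/" or "//" is rejected). Backslash-newline continuation, which C also has, is deliberately left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * body points just past the opening "//".  Returns the number of characters
 * in the comment (up to but excluding the newline), or -1 if strict mode
 * finds a nested comment marker.
 */
static long line_comment_len(const char *body, int strict)
{
	size_t len = strcspn(body, "\n");
	size_t i;

	if (!strict)
		return (long)len;	/* lenient: ignore everything to end of line */

	for (i = 0; i + 1 < len; i++) {
		char a = body[i], b = body[i + 1];

		/* Reject "/*", "//" and the block-comment terminator. */
		if ((a == '/' && (b == '*' || b == '/')) ||
		    (a == '*' && b == '/'))
			return -1;
	}
	return (long)len;
}
```

With strict set to 0 the body " comment /*" is simply skipped; with strict set to 1 the same body returns -1, mirroring the boc-in-line-comment test.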



* Re: [DTrace-devel] [PATCH] lexer: support // comments
From: Eugene Loh @ 2025-07-23 14:04 UTC (permalink / raw)
  To: Kris Van Hees; +Cc: dtrace, dtrace-devel

On 7/22/25 23:14, Kris Van Hees wrote:

> On Tue, Jul 22, 2025 at 09:18:16PM -0400, Eugene Loh wrote:
>> 3)  Where do all the rules in this patch (tested by err.*) come from?  E.g.,
>> C seems more lenient:
>>
>> $ cat x.c
>> // /* hello */
>> // comment /*
>> // comment //
>> int main(int c, char **v) {
>>    /*
>>     * //
>>     */
>>    return 0;
>> }
>> $ gcc x.c
>> $ echo $?
>> 0
> Yes, C is more lenient.  I based the implementation on the behaviour of other
> DTrace implementations that added //-comments.  I'd be open to making it more
> lenient as well - I honestly didn't check out C's behaviour because the rules
> I observed for other DTrace implementations seemed to be quite consistent with
> what DTrace is doing with block comments (/* ... */).

Worth mentioning in the commit message?
