From: Andrei Rybak <rybak.a.v@gmail.com>
To: Johannes Sixt <j6t@kdbg.org>, git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Paolo Bonzini <pbonzini@redhat.com>,
Tassilo Horn <tsdh@gnu.org>
Subject: [PATCH v3 0/3] userdiff: Java updates
Date: Wed, 8 Feb 2023 00:42:56 +0100
Message-ID: <20230207234259.452141-1-rybak.a.v@gmail.com>
In-Reply-To: <64601c4b-9ced-672f-a5fd-9a9b3b65859d@kdbg.org>
On 2023-02-05T22:33 Johannes Sixt wrote:
> Having seen all these examples, I think the following truncated
> expression might do the right thing for all cases that are valid Java:
>
> "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t].*)$"
Only the '\n' is missing at the end, but otherwise I concur, so here's a v3.
> i.e., we recognize a whitespace in order to identify the keyword, and
> then capture anything that follows without being specific. My reasoning
> is that "class", "enum", "interface", and "record" cannot occur in any
> other context than the beginning of a class definition. (But please do
> correct me; I know next to nothing about Java syntax.)
The word "class" can also occur as part of a class literal, for example:
    Class<String> c = String.class;
Valid uses of class literals won't interfere with our regex unless some
wild formatting is applied. This is technically valid Java:
    Class<String> c = String.
    class
    ;
If the line holding the lowercase "class" ends in a space or tab, the v3
regex will match it as a false positive. Class literals are described in
the JLS here:
    https://docs.oracle.com/javase/specs/jls/se17/html/jls-15.html#jls-15.8.2
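To make the cases above concrete, here is a minimal sketch that runs the v3
pattern against them. It assumes Python's `re` module as a stand-in for the
POSIX extended regex engine that git uses; the two should agree on this
pattern, but that is an assumption and not something checked against git
itself. The helper name `is_hunk_header` and the `non-sealed` sample line are
made up for illustration.

```python
import re

# v3 pattern from this series, without the trailing "\n" that separates
# patterns inside userdiff.c. Python's re stands in for POSIX ERE here.
V3_TYPE_DECL = re.compile(
    r"^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$"
)


def is_hunk_header(line):
    """Return True if the v3 pattern matches the given source line."""
    return V3_TYPE_DECL.match(line) is not None


samples = [
    "public class ParameterizedClass<A, B> {",  # ordinary declaration
    "non-sealed class Hypothetical {",          # hyphenated modifier (made-up line)
    "Class<String> c = String.class;",          # class literal on one line
    "class",                                    # split literal, no trailing blank
    "class ",                                   # split literal, trailing space
]

for line in samples:
    print(repr(line), is_hunk_header(line))
```

Only the last sample is a false positive: the keyword is followed by
whitespace, which is all the pattern needs to treat the line as a declaration.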
> As always,
> userdiff regular expressions can assume that only valid constructs are
> inspected.
Changes since v2:
- simplified regex that doesn't match class names at all and supports more
code styles
- updated the comment just above the regex in PATCH 2/3 to mention records
- more tests to cover the cases mentioned during review of v2
- reworded commit messages to reflect the above items
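The effect of the first item can be sketched by running the final v2 and v3
patterns (both copied from the range diff below) side by side on the first
lines of some of the new test files. As a hedge: this uses Python's `re`
instead of git's POSIX regex engine, and the helper `compare` is a
hypothetical name used only in this illustration.

```python
import re

# Final v2 pattern: requires a type name followed by whitespace, "<" or "(".
V2 = re.compile(
    r"^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)"
    r"[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$"
)

# v3 pattern: anything may follow the keyword and its trailing whitespace.
V3 = re.compile(
    r"^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$"
)


def compare(line):
    """Return (matched_by_v2, matched_by_v3) for one source line."""
    return (V2.match(line) is not None, V3.match(line) is not None)


first_lines = [
    "class RIGHT",                                 # brace on the next line
    "class RIGHT <TYPE, PARAMS, AFTER, SPACE> {",  # space before type parameters
    "public record RIGHT (String components, String after, String space) {",
    "class RIGHT<A, B> {",
]

for line in first_lines:
    print(repr(line), compare(line))
```

In this sketch only the brace-on-separate-line case separates the two
patterns: v2 needs something after the type name on the same line, while v3
does not look at the name at all.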
Range diff since v2:
1: c300745a58 ! 1: 9e859e3b79 userdiff: support Java type parameters
@@ Metadata
## Commit message ##
userdiff: support Java type parameters
- A class or interface in Java [1] can have type parameters immediately
- following the name in the declaration, surrounded by angle brackets
- (paired less than and greater than signs).[2] Example of a class with
- type parameters "A" and "N":
-
- public class ParameterizedClass<A, N> {
- private A field1;
- private N field2;
+ A class or interface in Java can have type parameters following the name
+ in the declared type, surrounded by angle brackets (paired less than and
+ greater than signs).[2] The type parameters -- `A` and `B` in the
+ examples -- may follow the class name immediately:
+
+ public class ParameterizedClass<A, B> {
}
- Support matching a parameterized class or interface declaration with
- type parameters immediately following the name of the type in the
- builtin userdiff pattern for Java. Do so by just allowing matching the
- first character after the name of the type to "<".
+ or may be separated by whitespace:
+
+ public class SpaceBeforeTypeParameters <A, B> {
+ }
- An alternative approach could be to match both the opening and the
- closing angle brackets and matching the content between them in various
- ways. Just use the simpler regex for now.
+ A part of the builtin userdiff pattern for Java matches declarations of
+ classes, enums, and interfaces. The regular expression requires at
+ least one whitespace character after the name of the declared type.
+ This disallows matching for opening angle bracket of type parameters
+ immediately after the name of the type. Mandatory whitespace after the
+ name of the type also disallows using the pattern in repositories with a
+ fairly common code style that puts braces for the body of a class on
+ separate lines:
+
+ class WithLineBreakBeforeOpeningBrace
+ {
+ }
+
+ Support matching Java code in more diverse code styles and declarations
+ of classes and interfaces with type parameters immediately following the
+ name of the type in the builtin userdiff pattern for Java. Do so by
+ just matching anything until the end of the line after the keywords for
+ the kind of type being declared.
[1] Since Java 5 released in 2004.
[2] Detailed description is available in the Java Language
@@ Commit message
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
+ ## t/t4018/java-class-brace-on-separate-line (new) ##
+@@
++class RIGHT
++{
++ static int ONE;
++ static int TWO;
++ static int ChangeMe;
++}
+
+ ## t/t4018/java-class-space-before-type-parameters (new) ##
+@@
++class RIGHT <TYPE, PARAMS, AFTER, SPACE> {
++ static int ONE;
++ static int TWO;
++ static int THREE;
++ private A ChangeMe;
++}
+
## t/t4018/java-class-type-parameters (new) ##
@@
+class RIGHT<A, B> {
@@ userdiff.c: PATTERNS("html",
"!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
/* Class, enum, and interface declarations */
- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*[ \t]+.*)$\n"
-+ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|<).*)$\n"
++ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+.*)$\n"
/* Method definitions; note that constructor signatures are not */
/* matched because they are indistinguishable from method calls. */
"^[ \t]*(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
2: a0e622a0f8 ! 2: 4f7be5f642 userdiff: support Java record types
@@ Commit message
the name of the record class is followed by a mandatory list of
components. The list is enclosed in parentheses, it may be empty, and
it may immediately follow the name of the class or type parameters, if
- any, without separating whitespace.
-
- Code examples:
+ any, with or without separating whitespace. For example:
public record Example(int i, String s) {
}
@@ Commit message
public record WithTypeParameters<A, B>(A a, B b, String s) {
}
+ record SpaceBeforeComponents (String comp1, int comp2) {
+ }
+
Support records in the builtin userdiff pattern for Java. Add "record"
- to the alternatives of keywords for kinds of class, and match an opening
- parenthesis as the first character right after the type name.
+ to the alternatives of keywords for kinds of class.
- An alternative approach could be to have an optional group that would
- match both the opening and the closing parentheses with some way of
- matching the declarations of the components. Just use the simpler
- regular expression for now.
+ Allowing matching various possibilities for the type parameters and/or
+ list of the components of a record has already been covered by the
+ preceding patch.
[1] detailed description is available in "JEP 395: Records"
https://openjdk.org/jeps/395
@@ t/t4018/java-record (new)
+ static int TWO;
+ static int THREE;
+ static int ChangeMe;
++}
+
+ ## t/t4018/java-record-space-before-components (new) ##
+@@
++public record RIGHT (String components, String after, String space) {
++ static int ONE;
++ static int TWO;
++ static int THREE;
++ static int ChangeMe;
+}
## t/t4018/java-record-type-parameters (new) ##
@@ t/t4018/java-record-type-parameters (new)
## userdiff.c ##
@@ userdiff.c: PATTERNS("html",
+ "[^<>= \t]+"),
PATTERNS("java",
"!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
- /* Class, enum, and interface declarations */
-- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|<).*)$\n"
-+ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n"
+- /* Class, enum, and interface declarations */
+- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface)[ \t]+.*)$\n"
++ /* Class, enum, interface, and record declarations */
++ "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$\n"
/* Method definitions; note that constructor signatures are not */
/* matched because they are indistinguishable from method calls. */
"^[ \t]*(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
3: b9c6a5dffd ! 3: ea6ce671ef userdiff: support Java sealed classes
@@ userdiff.c
@@ userdiff.c: PATTERNS("html",
PATTERNS("java",
"!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
- /* Class, enum, and interface declarations */
-- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n"
-+ "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+[A-Za-z][A-Za-z0-9_$]*([ \t]+|[<(]).*)$\n"
+ /* Class, enum, interface, and record declarations */
+- "^[ \t]*(([a-z]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$\n"
++ "^[ \t]*(([a-z-]+[ \t]+)*(class|enum|interface|record)[ \t]+.*)$\n"
/* Method definitions; note that constructor signatures are not */
/* matched because they are indistinguishable from method calls. */
"^[ \t]*(([A-Za-z_<>&][][?&<>.,A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
Andrei Rybak (3):
userdiff: support Java type parameters
userdiff: support Java record types
userdiff: support Java sealed classes
t/t4018/java-class-brace-on-separate-line | 6 ++++++
t/t4018/java-class-space-before-type-parameters | 6 ++++++
t/t4018/java-class-type-parameters | 6 ++++++
t/t4018/java-class-type-parameters-implements | 6 ++++++
t/t4018/java-interface-type-parameters | 6 ++++++
t/t4018/java-interface-type-parameters-extends | 6 ++++++
t/t4018/java-non-sealed | 8 ++++++++
t/t4018/java-record | 6 ++++++
t/t4018/java-record-space-before-components | 6 ++++++
t/t4018/java-record-type-parameters | 6 ++++++
t/t4018/java-sealed | 7 +++++++
t/t4018/java-sealed-permits | 6 ++++++
t/t4018/java-sealed-type-parameters | 6 ++++++
t/t4018/java-sealed-type-parameters-implements-permits | 6 ++++++
t/t4018/java-sealed-type-parameters-permits | 6 ++++++
userdiff.c | 4 ++--
16 files changed, 95 insertions(+), 2 deletions(-)
create mode 100644 t/t4018/java-class-brace-on-separate-line
create mode 100644 t/t4018/java-class-space-before-type-parameters
create mode 100644 t/t4018/java-class-type-parameters
create mode 100644 t/t4018/java-class-type-parameters-implements
create mode 100644 t/t4018/java-interface-type-parameters
create mode 100644 t/t4018/java-interface-type-parameters-extends
create mode 100644 t/t4018/java-non-sealed
create mode 100644 t/t4018/java-record
create mode 100644 t/t4018/java-record-space-before-components
create mode 100644 t/t4018/java-record-type-parameters
create mode 100644 t/t4018/java-sealed
create mode 100644 t/t4018/java-sealed-permits
create mode 100644 t/t4018/java-sealed-type-parameters
create mode 100644 t/t4018/java-sealed-type-parameters-implements-permits
create mode 100644 t/t4018/java-sealed-type-parameters-permits
--
2.39.1
Thread overview: 19+ messages
2023-02-03 23:23 [PATCH v1 0/3] userdiff: Java updates Andrei Rybak
2023-02-03 23:23 ` [PATCH v1 1/3] userdiff: support Java type parameters Andrei Rybak
2023-02-03 23:23 ` [PATCH v1 2/3] userdiff: support Java record types Andrei Rybak
2023-02-03 23:23 ` [PATCH v1 3/3] userdiff: support Java sealed classes Andrei Rybak
2023-02-04 9:22 ` [PATCH v1 0/3] userdiff: Java updates Tassilo Horn
2023-02-04 13:43 ` [PATCH v2 " Andrei Rybak
2023-02-04 13:43 ` [PATCH v2 1/3] userdiff: support Java type parameters Andrei Rybak
2023-02-04 13:43 ` [PATCH v2 2/3] userdiff: support Java record types Andrei Rybak
2023-02-04 13:43 ` [PATCH v2 3/3] userdiff: support Java sealed classes Andrei Rybak
2023-02-05 10:09 ` [PATCH v2 0/3] userdiff: Java updates Johannes Sixt
2023-02-05 19:27 ` Andrei Rybak
2023-02-05 21:33 ` Johannes Sixt
2023-02-07 23:42 ` Andrei Rybak [this message]
2023-02-07 23:42 ` [PATCH v3 1/3] userdiff: support Java type parameters Andrei Rybak
2023-02-08 0:04 ` Andrei Rybak
2023-02-07 23:42 ` [PATCH v3 2/3] userdiff: support Java record types Andrei Rybak
2023-02-07 23:42 ` [PATCH v3 3/3] userdiff: support Java sealed classes Andrei Rybak
2023-02-08 20:51 ` [PATCH v3 0/3] userdiff: Java updates Johannes Sixt
2023-02-08 20:55 ` Junio C Hamano