* [PATCH] userdiff: add builtin diff driver for TypeScript language
@ 2024-07-15 16:33 Matthew Hughes via GitGitGadget
2024-07-16 12:21 ` Matthew Hughes
2024-07-16 20:53 ` Johannes Sixt
0 siblings, 2 replies; 6+ messages in thread
From: Matthew Hughes via GitGitGadget @ 2024-07-15 16:33 UTC (permalink / raw)
To: git; +Cc: Matthew Hughes, Matthew Hughes
From: Matthew Hughes <matthewhughes934@gmail.com>
TypeScript[1] is an open-source programming language that builds on
JavaScript. This patch adds builtin diff driver support for this
language. As far as I can tell there is no official syntax specification
for the language (see[2] for some discussion) so this patch is based off
some existing work[3]. The docs[4] probably provide the best
reference as to what this driver should satisfy. See[5] for
discussion/motivation for this change from the TypeScript language team.
This is my first time developing a diff driver, so as such the
implementation borrows quite a bit from existing drivers. The funcname
attribute matches function and class definitions, the list of keywords
used to define functions was take from[3], I could not find an
exhaustive list for these. The word-regex borrows much from other
existing diff engines, with the addition of the rather unique
right-shifting operators (>>> and >>>=) available in JavaScript (and
hence Typescript)[6]
[1] https://www.typescriptlang.org/
[2] https://github.com/Microsoft/TypeScript/issues/15711
[3] https://github.com/git/git/pull/859
[4] https://www.typescriptlang.org/docs/
[5] https://github.com/microsoft/TypeScript/issues/36185
[6] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Unsigned_right_shift
Signed-off-by: Matthew Hughes <matthewhughes934@gmail.com>
---
userdiff: add builtin diff driver for TypeScript language
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1746%2Fmatthewhughes934%2Fadd-typescript-diff-driver-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1746/matthewhughes934/add-typescript-diff-driver-v1
Pull-Request: https://github.com/git/git/pull/1746
Documentation/gitattributes.txt | 2 ++
t/t4018/typescript-arrow-func | 7 +++++
t/t4018/typescript-class | 8 ++++++
t/t4018/typescript-func | 7 +++++
t/t4018/typescript-indented-func | 7 +++++
t/t4018/typescript-interface | 5 ++++
t/t4018/typescript-nested-arrow-func | 7 +++++
t/t4034-diff-words.sh | 1 +
t/t4034/typescript/expect | 42 ++++++++++++++++++++++++++++
t/t4034/typescript/post | 24 ++++++++++++++++
t/t4034/typescript/pre | 24 ++++++++++++++++
userdiff.c | 11 ++++++++
12 files changed, 145 insertions(+)
create mode 100644 t/t4018/typescript-arrow-func
create mode 100644 t/t4018/typescript-class
create mode 100644 t/t4018/typescript-func
create mode 100644 t/t4018/typescript-indented-func
create mode 100644 t/t4018/typescript-interface
create mode 100644 t/t4018/typescript-nested-arrow-func
create mode 100644 t/t4034/typescript/expect
create mode 100644 t/t4034/typescript/post
create mode 100644 t/t4034/typescript/pre
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e6150595af8..88ebfb2e6c9 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -907,6 +907,8 @@ patterns are available:
- `tex` suitable for source code for LaTeX documents.
+- `typescript` suitable for source code in the TypeScript language.
+
Customizing word diff
^^^^^^^^^^^^^^^^^^^^^
diff --git a/t/t4018/typescript-arrow-func b/t/t4018/typescript-arrow-func
new file mode 100644
index 00000000000..c884c2e6bdc
--- /dev/null
+++ b/t/t4018/typescript-arrow-func
@@ -0,0 +1,7 @@
+const RIGHT = (a: number) : number => {
+ const res = a + 1
+
+ // some comment
+ console.log("ChangeMe")
+ return res
+}
diff --git a/t/t4018/typescript-class b/t/t4018/typescript-class
new file mode 100644
index 00000000000..a373f7af1dd
--- /dev/null
+++ b/t/t4018/typescript-class
@@ -0,0 +1,8 @@
+class RIGHT {
+ // some
+ // comments
+ // ChangeMe
+ constructor() {
+ console.log("constructing...")
+ }
+}
diff --git a/t/t4018/typescript-func b/t/t4018/typescript-func
new file mode 100644
index 00000000000..94cc659a452
--- /dev/null
+++ b/t/t4018/typescript-func
@@ -0,0 +1,7 @@
+function RIGHT(a: number) : number {
+ const res = a + 1
+
+ // some comment
+ console.log("ChangeMe")
+ return res
+}
diff --git a/t/t4018/typescript-indented-func b/t/t4018/typescript-indented-func
new file mode 100644
index 00000000000..fee0373126e
--- /dev/null
+++ b/t/t4018/typescript-indented-func
@@ -0,0 +1,7 @@
+class Foo {
+ public RIGHT() {
+ // some
+ // comments
+ console.log("ChangeMe")
+ }
+}
diff --git a/t/t4018/typescript-interface b/t/t4018/typescript-interface
new file mode 100644
index 00000000000..78d14523446
--- /dev/null
+++ b/t/t4018/typescript-interface
@@ -0,0 +1,5 @@
+interface RIGHT {
+ name: string;
+ // some comment
+ ChangeMe: number;
+}
diff --git a/t/t4018/typescript-nested-arrow-func b/t/t4018/typescript-nested-arrow-func
new file mode 100644
index 00000000000..52dbc6a92e5
--- /dev/null
+++ b/t/t4018/typescript-nested-arrow-func
@@ -0,0 +1,7 @@
+class Foo {
+ RIGHT = () => {
+ // some
+ // comment
+ console.log("ChangeMe")
+ }
+}
diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index 74586f3813c..4e3cf415c23 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -338,6 +338,7 @@ test_language_driver python
test_language_driver ruby
test_language_driver scheme
test_language_driver tex
+test_language_driver typescript
test_expect_success 'word-diff with diff.sbe' '
cat >pre <<-\EOF &&
diff --git a/t/t4034/typescript/expect b/t/t4034/typescript/expect
new file mode 100644
index 00000000000..fde27423339
--- /dev/null
+++ b/t/t4034/typescript/expect
@@ -0,0 +1,42 @@
+<BOLD>diff --git a/pre b/post<RESET>
+<BOLD>index 6a5ba06c60..db4788b1a6 100644<RESET>
+<BOLD>--- a/pre<RESET>
+<BOLD>+++ b/post<RESET>
+<CYAN>@@ -1,24 +1,24 @@<RESET>
+console.log("Hello world<RED>!<RESET><GREEN>?<RESET>")
+foo===<RED>bar<RESET><GREEN>buz<RESET>
+foo!==<RED>bar<RESET><GREEN>buz<RESET>
+function fn() : <RED>number<RESET><GREEN>string<RESET> { return <RED>1<RESET><GREEN>"ok"<RESET> }
+<RED>a<RESET><GREEN>x<RESET>.someFunc()
+const <RED>a<RESET><GREEN>x<RESET> = () => {}
+<GREEN>(<RESET>1<GREEN>) (<RESET>-1e10<GREEN>) (<RESET>0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>y<RESET>'
+"<RED>x<RESET><GREEN>y<RESET>"
+[<RED>a<RESET><GREEN>x<RESET>] <RED>a<RESET><GREEN>x<RESET>.<RED>b<RESET><GREEN>y<RESET>
+!<RED>a<RESET><GREEN>x<RESET> ~<RED>a a<RESET><GREEN>x x<RESET>++ <RED>a<RESET><GREEN>x<RESET>-- <RED>a<RESET><GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>&<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b a<RESET><GREEN>y x<RESET>%<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET><<<RED>b a<RESET><GREEN>y x<RESET>>><RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>>>>=<RED>b a<RESET><GREEN>y x<RESET>>>><RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET><<RED>b a<RESET><GREEN>y x<RESET><=<RED>b a<RESET><GREEN>y x<RESET>><RED>b a<RESET><GREEN>y x<RESET>>=<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>==<RED>b a<RESET><GREEN>y x<RESET>!=<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>&<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>^<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>|<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>&&<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>||<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>?<RED>b<RESET><GREEN>y<RESET>:z
+<RED>a<RESET><GREEN>x<RESET>=<RED>b a<RESET><GREEN>y x<RESET>+=<RED>b a<RESET><GREEN>y x<RESET>-=<RED>b a<RESET><GREEN>y x<RESET>*=<RED>b a<RESET><GREEN>y x<RESET>/=<RED>b a<RESET><GREEN>y x<RESET>%=<RED>b a<RESET><GREEN>y x<RESET><<=<RED>b a<RESET><GREEN>y x<RESET>>>=<RED>b a<RESET><GREEN>y x<RESET>&=<RED>b a<RESET><GREEN>y x<RESET>^=<RED>b a<RESET><GREEN>y x<RESET>|=<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>,y
diff --git a/t/t4034/typescript/post b/t/t4034/typescript/post
new file mode 100644
index 00000000000..4aca03f8ce6
--- /dev/null
+++ b/t/t4034/typescript/post
@@ -0,0 +1,24 @@
+console.log("Hello world?")
+foo===buz
+foo!==buz
+function fn() : string { return "ok" }
+x.someFunc()
+const x = () => {}
+(1) (-1e10) (0xabcdef) 'y'
+"y"
+[x] x.y
+!x ~x x++ x-- x*y x&y
+x*y x/y x%y
+x+y x-y
+x<<y x>>y
+x>>>=y x>>>y
+x<y x<=y x>y x>=y
+x==y x!=y
+x&y
+x^y
+x|y
+x&&y
+x||y
+x?y:z
+x=y x+=y x-=y x*=y x/=y x%=y x<<=y x>>=y x&=y x^=y x|=y
+x,y
diff --git a/t/t4034/typescript/pre b/t/t4034/typescript/pre
new file mode 100644
index 00000000000..ef654a92c61
--- /dev/null
+++ b/t/t4034/typescript/pre
@@ -0,0 +1,24 @@
+console.log("Hello world!")
+foo===bar
+foo!==bar
+function fn() : number { return 1 }
+a.someFunc()
+const a = () => {}
+1 -1e10 0xabcdef 'x'
+"x"
+[a] a.b
+!a ~a a++ a-- a*b a&b
+a*b a/b a%b
+a+b a-b
+a<<b a>>b
+a>>>=b a>>>b
+a<b a<=b a>b a>=b
+a==b a!=b
+a&b
+a^b
+a|b
+a&&b
+a||b
+a?b:z
+a=b a+=b a-=b a*=b a/=b a%=b a<<=b a>>=b a&=b a^=b a|=b
+a,y
diff --git a/userdiff.c b/userdiff.c
index c4ebb9ff734..7247d351cde 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -333,6 +333,17 @@ PATTERNS("scheme",
"|([^][)(}{[ \t])+"),
PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
"\\\\[a-zA-Z@]+|\\\\.|([a-zA-Z0-9]|[^\x01-\x7f])+"),
+PATTERNS("typescript",
+ "^[\t ]*((class|constructor|public|private|protected|function|interface)[ \t].*)$\n"
+ // arrow funcs
+ "^[\t ]*((const|let|var)?[^()]*)=[\t ]*\\([^()]*\\)[\t ]*.*=>.*$",
+ /* -- */
+ "[a-zA-Z_][a-zA-Z0-9_]*"
+ // numeric constants
+ "|[-+0-9.e]+|0[xX]?[0-9a-fA-F]"
+ // operators
+ "|[-+*/<>%&^|=!]"
+ "|--|\\+\\+|//=?|<<=?|>>?=?"),
{ .name = "default", .binary = -1 },
};
#undef PATTERNS
base-commit: a7dae3bdc8b516d36f630b12bb01e853a667e0d9
--
gitgitgadget
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] userdiff: add builtin diff driver for TypeScript language
2024-07-15 16:33 [PATCH] userdiff: add builtin diff driver for TypeScript language Matthew Hughes via GitGitGadget
@ 2024-07-16 12:21 ` Matthew Hughes
2024-07-16 15:45 ` Junio C Hamano
2024-07-16 20:53 ` Johannes Sixt
1 sibling, 1 reply; 6+ messages in thread
From: Matthew Hughes @ 2024-07-16 12:21 UTC (permalink / raw)
To: Matthew Hughes via GitGitGadget; +Cc: git
On Mon, Jul 15, 2024 at 04:33:38PM +0000, Matthew Hughes via GitGitGadget wrote:
> diff --git a/userdiff.c b/userdiff.c
> index c4ebb9ff734..7247d351cde 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -333,6 +333,17 @@ PATTERNS("scheme",
> "|([^][)(}{[ \t])+"),
> PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
> "\\\\[a-zA-Z@]+|\\\\.|([a-zA-Z0-9]|[^\x01-\x7f])+"),
> +PATTERNS("typescript",
> + "^[\t ]*((class|constructor|public|private|protected|function|interface)[ \t].*)$\n"
> + // arrow funcs
> + "^[\t ]*((const|let|var)?[^()]*)=[\t ]*\\([^()]*\\)[\t ]*.*=>.*$",
> + /* -- */
> + "[a-zA-Z_][a-zA-Z0-9_]*"
> + // numeric constants
> + "|[-+0-9.e]+|0[xX]?[0-9a-fA-F]"
> + // operators
> + "|[-+*/<>%&^|=!]"
> + "|--|\\+\\+|//=?|<<=?|>>?=?"),
> { .name = "default", .binary = -1 },
> };
> #undef PATTERNS
>
> base-commit: a7dae3bdc8b516d36f630b12bb01e853a667e0d9
> --
> gitgitgadget
This needs some updates. For the arrow function, definitions can cover multiple
lines e.g.:
const bar = (
name: string
) => console.log(name)
The funcname pattern should also consider the `export` keyword, since both of
the following are valid:
export const bar = (
name: string
) => console.log(name)
export function foo() {}
Some docs: https://www.typescriptlang.org/docs/handbook/modules/reference.html#module-syntax
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] userdiff: add builtin diff driver for TypeScript language
2024-07-16 12:21 ` Matthew Hughes
@ 2024-07-16 15:45 ` Junio C Hamano
2024-07-16 19:33 ` Matthew Hughes
0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2024-07-16 15:45 UTC (permalink / raw)
To: Matthew Hughes; +Cc: Matthew Hughes via GitGitGadget, git
Matthew Hughes <matthewhughes934@gmail.com> writes:
> This needs some updates.
What does it mean?
The patterns that were posted were so broken that they are unusable
and harm the users by giving misleading information?
Or do the patterns work just fine in basic or tutorial cases, but
with more advanced or realistic uses of the language construct, they
highlight wrong lines as the function header and/or split at wrong
word boundaries that are obviously much less optimal than ideal that
any human users would find questionable?
In the latter case, how far from the ideal are the decisions done by
the current patterns, and what's the rough percentage of usual code
we see in the real world, for which the current patterns do not work
well?
What I am trying to gauge is if it is so broken that it should not
exist (in other words, you regret sending the patch to the list
before doing these updates), or is "already serviceable, but not
perfect yet". Waiting for perfection takes forever. If the latter,
letting the general public to use it to gather feedbacks by waiting
for the dust to settle before making such updates is often better.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] userdiff: add builtin diff driver for TypeScript language
2024-07-16 15:45 ` Junio C Hamano
@ 2024-07-16 19:33 ` Matthew Hughes
2024-07-16 21:10 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Matthew Hughes @ 2024-07-16 19:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Matthew Hughes via GitGitGadget, git
Oe Tue, Jul 16, 2024 at 08:45:08AM -0700, Junio C Hamano wrote:
> What does it mean?
>
> The patterns that were posted were so broken that they are unusable
> and harm the users by giving misleading information?
I think this would be a good summary. It's sufficient for some simpler cases
considered, and does even give some benefits e.g. for function headers for
nested functions. However, the cases where it fails can be significant, e.g.
hundreds of lines away from the correct function header for files with multiple
consecutive multi-line arrow functions.
> In the latter case, how far from the ideal are the decisions done by
> the current patterns, and what's the rough percentage of usual code
> we see in the real world, for which the current patterns do not work
> well?
I think just the missing `export` keyword handling would be equivalent to
missing all public functions in other programming languages, so that alone
would be a decent percentage.
> What I am trying to gauge is if it is so broken that it should not
> exist (in other words, you regret sending the patch to the list
> before doing these updates), or is "already serviceable, but not
> perfect yet". Waiting for perfection takes forever. If the latter,
> letting the general public to use it to gather feedbacks by waiting
> for the dust to settle before making such updates is often better.
I'm leaning towards the former case: that this patch was premature. I think
it's far enough from perfect that it would greatly benefit from me more
actively reaching out to the TypeScript language team and asking some devs
there try out the changes and gather some more input (and identify some more
missing cases, of which I now expect that are many) before getting
something out to general users.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] userdiff: add builtin diff driver for TypeScript language
2024-07-15 16:33 [PATCH] userdiff: add builtin diff driver for TypeScript language Matthew Hughes via GitGitGadget
2024-07-16 12:21 ` Matthew Hughes
@ 2024-07-16 20:53 ` Johannes Sixt
1 sibling, 0 replies; 6+ messages in thread
From: Johannes Sixt @ 2024-07-16 20:53 UTC (permalink / raw)
To: Matthew Hughes; +Cc: git, Matthew Hughes via GitGitGadget
We have had a submission for typescript just recently:
https://lore.kernel.org/git/20240404163827.5855-1-utsavp0213@gmail.com/
And two for Javascript
https://lore.kernel.org/git/20240301074048.188835-1-sergiusnyah@gmail.com/
https://lore.kernel.org/git/20220403132508.28196-1-a97410985new@gmail.com/
Please review these earlier submissions. If you think you can improve on
them, you are very welcome to do so. But you can also just resend one of
these series with a note that they are sufficiently mature and that you
support the submitted version.
I may very well mistaken, but I think that Typescript is a syntactical
superset of Javascript, so we need just one language driver and mention
in the documentation that it can be used for both languages.
Thanks,
-- Hannes
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] userdiff: add builtin diff driver for TypeScript language
2024-07-16 19:33 ` Matthew Hughes
@ 2024-07-16 21:10 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2024-07-16 21:10 UTC (permalink / raw)
To: Matthew Hughes; +Cc: Matthew Hughes via GitGitGadget, git
Matthew Hughes <matthewhughes934@gmail.com> writes:
> I'm leaning towards the former case: that this patch was premature.
OK. Then let's take sufficient time. After all, we are never in a
hurry ;-)
Thanks for giving an honest assessment. Will keep the topic in
'seen' without marking it for 'next' (at least until it gets
replaced with a version that is more suitable to the public).
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-07-16 21:10 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-15 16:33 [PATCH] userdiff: add builtin diff driver for TypeScript language Matthew Hughes via GitGitGadget
2024-07-16 12:21 ` Matthew Hughes
2024-07-16 15:45 ` Junio C Hamano
2024-07-16 19:33 ` Matthew Hughes
2024-07-16 21:10 ` Junio C Hamano
2024-07-16 20:53 ` Johannes Sixt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).