git.vger.kernel.org archive mirror
From: Moumita <dhar61595@gmail.com>
To: git@vger.kernel.org
Cc: Moumita Dhar <dhar61595@gmail.com>,
	"Johannes Sixt" <j6t@kdbg.org>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: [PATCH v5 1/1] userdiff: extend Bash pattern to cover more shell function forms
Date: Sun, 11 May 2025 18:28:08 +0530
Message-ID: <20250511125809.14180-2-dhar61595@gmail.com>
In-Reply-To: <20250511125809.14180-1-dhar61595@gmail.com>

From: Moumita Dhar <dhar61595@gmail.com>

The previous funcname regex required the function header to be followed
on the same line by the start of a compound command (`{`, `(`, `((`, or
`[[`), which caused two problems:

- It failed to capture valid functions where `{` was on the next line
  due to line continuation (`\`).
- It did not recognize functions with a single-command body, such as
  `x () echo hello`.

Replacing the function-body matching logic with `.*$` ensures that
everything on the function definition line is captured.
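
As a rough illustration of the effect (a sketch, not part of the patch),
the new funcname pattern can be approximated with `grep -E`. The names
`funcname_re` and `is_hunk_header` are made up for this example,
`[[:blank:]]` stands in for the `[ \t]` used in userdiff.c, and the
pattern is assumed to start with optional leading indentation, which
the hunk below does not show:

```shell
#!/bin/sh
# Approximation of the new Bash funcname regex, for experimentation only.
LC_ALL=C; export LC_ALL

ident='[a-zA-Z_][a-zA-Z0-9_]*'

# POSIX style: name followed by mandatory "()".
posix="${ident}[[:blank:]]*\\([[:blank:]]*\\)"

# Bashism: "function name", then either "()" or at least one blank.
bashism="function[[:blank:]]+${ident}(([[:blank:]]*\\([[:blank:]]*\\))|([[:blank:]]+))"

# Header followed by ".*$", so the rest of the line is captured too.
funcname_re="^[[:blank:]]*(((${posix})|(${bashism})).*\$)"

# Succeeds when the given line would be picked up as a hunk header.
is_hunk_header() {
    printf '%s\n' "$1" | grep -Eq -e "$funcname_re"
}

for line in 'x () echo hello' 'function RIGHT \' '    echo hello'; do
    if is_hunk_header "$line"; then
        printf 'header:     %s\n' "$line"
    else
        printf 'not header: %s\n' "$line"
    fi
done
```

The first two lines are forms the old pattern missed; the third is an
ordinary command and should not be treated as a header.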

Additionally, the word regex is refined to better recognize shell
syntax, including additional parameter expansion operators and
command-line options.
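
To get a feel for how the refined word regex splits a line, the sketch
below feeds it to `grep -Eo`, which prints one match per line. This is
only an approximation of Git's word diff, which tokenizes with the
regex and ignores anything it does not match. `word_re` and `tokenize`
are hypothetical names; the bracket alternatives are written as a
single bracket expression, and the final catch-all for any other
non-space character is an assumption about what Git appends to its
built-in word regexes:

```shell
#!/bin/sh
# Approximation of the new Bash word regex, for experimentation only.
LC_ALL=C; export LC_ALL

# Identifiers
word_re='[a-zA-Z_][a-zA-Z0-9_]*'
# Numeric constants
word_re="$word_re"'|[0-9]+(\.[0-9]*)?|[-+]?\.[0-9]+'
# Shell variables: $VAR and the opening of ${VAR}
word_re="$word_re"'|\$[a-zA-Z_][a-zA-Z0-9_]*|\$\{'
# Logical and comparison operators
word_re="$word_re"'|\|\||&&|<<|>>|==|!=|<=|>='
# Assignment and arithmetic operators
word_re="$word_re"'|[-+*/%&|^!=<>]=?'
# Parameter expansion operators
word_re="$word_re"'|:?=|:-|:\+|:\?|:|#|##|%|%%|\^\^?|,|,,?|!|@|:[0-9]+(:[0-9]+)?'
# Command-line options
word_re="$word_re"'|--?[a-zA-Z0-9_-]+'
# Brackets and grouping symbols (same set as in the patch)
word_re="$word_re"'|[][(){}]'
# Assumed catch-all: any other single non-space character
word_re="$word_re"'|[^[:space:]]'

# Print the tokens of one input line, one per output line.
tokenize() {
    printf '%s\n' "$1" | grep -Eo -e "$word_re"
}

tokenize '${a:-b}'
tokenize 'ls --x'
```

With leftmost-longest matching, `${a:-b}` should come out as `${`, `a`,
`:-`, `b`, `}`, so changing `:` to `:-` is reported as a change to the
operator alone, and `--x` stays a single word.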

Signed-off-by: Moumita Dhar <dhar61595@gmail.com>
---
 t/t4018/bash-bashism-style-multiline-function |  4 ++
 .../bash-hunk-header-complete-line-capture    |  4 ++
 t/t4018/bash-posix-style-multiline-function   |  4 ++
 .../bash-posix-style-single-command-function  |  3 ++
 t/t4034-diff-words.sh                         |  1 +
 t/t4034/bash/expect                           | 38 +++++++++++++++++++
 t/t4034/bash/post                             | 33 ++++++++++++++++
 t/t4034/bash/pre                              | 33 ++++++++++++++++
 userdiff.c                                    | 28 ++++++++++----
 9 files changed, 140 insertions(+), 8 deletions(-)
 create mode 100644 t/t4018/bash-bashism-style-multiline-function
 create mode 100644 t/t4018/bash-hunk-header-complete-line-capture
 create mode 100644 t/t4018/bash-posix-style-multiline-function
 create mode 100644 t/t4018/bash-posix-style-single-command-function
 create mode 100644 t/t4034/bash/expect
 create mode 100644 t/t4034/bash/post
 create mode 100644 t/t4034/bash/pre

diff --git a/t/t4018/bash-bashism-style-multiline-function b/t/t4018/bash-bashism-style-multiline-function
new file mode 100644
index 0000000000..284d50dd99
--- /dev/null
+++ b/t/t4018/bash-bashism-style-multiline-function
@@ -0,0 +1,4 @@
+function RIGHT \
+{    
+    echo 'ChangeMe'
+}
diff --git a/t/t4018/bash-hunk-header-complete-line-capture b/t/t4018/bash-hunk-header-complete-line-capture
new file mode 100644
index 0000000000..818c8c5a5f
--- /dev/null
+++ b/t/t4018/bash-hunk-header-complete-line-capture
@@ -0,0 +1,4 @@
+func() { # RIGHT
+
+    ChangeMe
+}
\ No newline at end of file
diff --git a/t/t4018/bash-posix-style-multiline-function b/t/t4018/bash-posix-style-multiline-function
new file mode 100644
index 0000000000..cc8727cbcd
--- /dev/null
+++ b/t/t4018/bash-posix-style-multiline-function
@@ -0,0 +1,4 @@
+RIGHT() \
+{
+    ChangeMe
+}
diff --git a/t/t4018/bash-posix-style-single-command-function b/t/t4018/bash-posix-style-single-command-function
new file mode 100644
index 0000000000..398ae1c5d2
--- /dev/null
+++ b/t/t4018/bash-posix-style-single-command-function
@@ -0,0 +1,3 @@
+RIGHT() echo "hello"
+
+    ChangeMe
diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index f51d3557f1..0be647c2fb 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -320,6 +320,7 @@ test_expect_success 'unset default driver' '
 
 test_language_driver ada
 test_language_driver bibtex
+test_language_driver bash
 test_language_driver cpp
 test_language_driver csharp
 test_language_driver css
diff --git a/t/t4034/bash/expect b/t/t4034/bash/expect
new file mode 100644
index 0000000000..17755e455f
--- /dev/null
+++ b/t/t4034/bash/expect
@@ -0,0 +1,38 @@
+<BOLD>diff --git a/pre b/post<RESET>
+<BOLD>index 09ac008..60ba6a2 100644<RESET>
+<BOLD>--- a/pre<RESET>
+<BOLD>+++ b/post<RESET>
+<CYAN>@@ -1,33 +1,33 @@<RESET>
+<RED>my_var<RESET><GREEN>new_var<RESET>=10
+x=<RED>123<RESET><GREEN>456<RESET>
+y=<RED>3.14<RESET><GREEN>2.71<RESET>
+z=<RED>.5<RESET><GREEN>.75<RESET>
+echo <RED>$USER<RESET><GREEN>$USERNAME<RESET>
+${<RED>HOME<RESET><GREEN>HOMEDIR<RESET>}
+((a<RED>+<RESET><GREEN>+=<RESET>b))
+((a<RED>*<RESET><GREEN>*=<RESET>b))
+((a<RED>/<RESET><GREEN>/=<RESET>b))
+((a<RED>%<RESET><GREEN>%=<RESET>b))
+((a<RED>|<RESET><GREEN>|=<RESET>b))
+((a<RED>^<RESET><GREEN>^=<RESET>b))
+((a<RED>=<RESET><GREEN>==<RESET>b))
+((a<RED>!<RESET><GREEN>!=<RESET>b))
+((a<RED><<RESET><GREEN><=<RESET>b))
+((a<RED>><RESET><GREEN>>=<RESET>b))
+$((a<RED><<RESET><GREEN><<<RESET>b))
+$((a<RED>><RESET><GREEN>>><RESET>b))
+$((a<RED>&<RESET><GREEN>&&<RESET>b))
+$((a<RED>|<RESET><GREEN>||<RESET>b))
+${a<RED>:<RESET><GREEN>:-<RESET>b}
+${a<RED>:<RESET><GREEN>:=<RESET>b}
+${a<RED>:<RESET><GREEN>:+<RESET>b}
+${a<RED>:<RESET><GREEN>:?<RESET>b}
+${a<RED>#<RESET><GREEN>##<RESET>*/}
+${a<RED>%<RESET><GREEN>%%<RESET>.*}
+${a<RED>^<RESET><GREEN>^^<RESET>}
+${a<RED>,<RESET><GREEN>,,<RESET>}
+${<GREEN>!<RESET>a}
+${a[<RED>*<RESET><GREEN>@<RESET>]}
+${a<RED>:2:3<RESET><GREEN>:4:6<RESET>}
+ls <RED>-a<RESET><GREEN>-x<RESET>
+ls <RED>--a<RESET><GREEN>--x<RESET>
diff --git a/t/t4034/bash/post b/t/t4034/bash/post
new file mode 100644
index 0000000000..669e218c30
--- /dev/null
+++ b/t/t4034/bash/post
@@ -0,0 +1,33 @@
+new_var=10
+x=456
+y=2.71
+z=.75
+echo $USERNAME
+${HOMEDIR}
+((a+=b))
+((a*=b))
+((a/=b))
+((a%=b))
+((a|=b))
+((a^=b))
+((a==b))
+((a!=b))
+((a<=b))
+((a>=b))
+$((a<<b))
+$((a>>b))
+$((a&&b))
+$((a||b))
+${a:-b}
+${a:=b}
+${a:+b}
+${a:?b}
+${a##*/}
+${a%%.*}
+${a^^}
+${a,,}
+${!a}
+${a[@]}
+${a:4:6}
+ls -x
+ls --x
diff --git a/t/t4034/bash/pre b/t/t4034/bash/pre
new file mode 100644
index 0000000000..ada8470bac
--- /dev/null
+++ b/t/t4034/bash/pre
@@ -0,0 +1,33 @@
+my_var=10
+x=123
+y=3.14
+z=.5
+echo $USER
+${HOME}
+((a+b))
+((a*b))
+((a/b))
+((a%b))
+((a|b))
+((a^b))
+((a=b))
+((a!b))
+((a<b))
+((a>b))
+$((a<b))
+$((a>b))
+$((a&b))
+$((a|b))
+${a:b}
+${a:b}
+${a:b}
+${a:b}
+${a#*/}
+${a%.*}
+${a^}
+${a,}
+${a}
+${a[*]}
+${a:2:3}
+ls -a
+ls --a
diff --git a/userdiff.c b/userdiff.c
index 340c4eb4f7..655c8fe0b1 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -59,20 +59,32 @@ PATTERNS("bash",
 	 "("
 	 "("
 	     /* POSIX identifier with mandatory parentheses */
-	     "[a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\))"
+	     "([a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\))"
 	 "|"
 	     /* Bashism identifier with optional parentheses */
-	     "(function[ \t]+[a-zA-Z_][a-zA-Z0-9_]*(([ \t]*\\([ \t]*\\))|([ \t]+))"
+	     "(function[ \t]+[a-zA-Z_][a-zA-Z0-9_]*(([ \t]*\\([ \t]*\\))|([ \t]+)))"
 	 ")"
-	 /* Optional whitespace */
-	 "[ \t]*"
-	 /* Compound command starting with `{`, `(`, `((` or `[[` */
-	 "(\\{|\\(\\(?|\\[\\[)"
+	 /* Everything after the function header is captured  */
+	 ".*$"
 	 /* End of captured text */
 	 ")",
 	 /* -- */
-	 /* Characters not in the default $IFS value */
-	 "[^ \t]+"),
+	 /* Identifiers: variable and function names */
+	  "[a-zA-Z_][a-zA-Z0-9_]*"
+	 /* Numeric constants: integers and decimals */
+	  "|[0-9]+(\\.[0-9]*)?|[-+]?\\.[0-9]+"
+	 /* Shell variables: $VAR, ${VAR} */
+	  "|\\$[a-zA-Z_][a-zA-Z0-9_]*|\\$\\{"
+	  /* Logical and comparison operators */
+	 "|\\|\\||&&|<<|>>|==|!=|<=|>="
+	 /* Assignment and arithmetic operators */
+	 "|[-+*/%&|^!=<>]=?"
+	 /* Additional parameter expansion operators */
+	 "|:?=|:-|:\\+|:\\?|:|#|##|%|%%|\\^\\^?|,|,,?|!|@|:[0-9]+(:[0-9]+)?"
+	 /* Command-line options (to avoid splitting -option) */
+	 "|--?[a-zA-Z0-9_-]+"
+	 /* Brackets and grouping symbols */
+	 "|\\(|\\)|\\{|\\}|\\[|\\]"),
 PATTERNS("bibtex",
 	 "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
 	 /* -- */
-- 
2.48.0


Thread overview: 37+ messages
2025-02-11 11:46 [PATCH 0/1] [GSOC 2025] [Newbie] userdiff: add built-in pattern for shell scripts Moumita
2025-02-11 11:46 ` [PATCH 1/1] Added built in function recognition for shell Moumita
2025-02-15 14:37   ` Johannes Sixt
2025-02-18 15:35 ` [PATCH v2 0/1] [PATCH v2 0/1] [GSOC 2025] [Newbie] userdiff: add built-in pattern for shell scripts Moumita
2025-02-18 15:35   ` [PATCH v2 1/1] userdiff: extend Bash pattern to cover more shell function forms Moumita
2025-02-18 19:30     ` Junio C Hamano
2025-02-22 18:15       ` Johannes Sixt
2025-02-24 16:28         ` Junio C Hamano
2025-02-18 23:38     ` Junio C Hamano
2025-02-22 18:14     ` Johannes Sixt
2025-02-18 17:30   ` [PATCH v2 0/1] [PATCH v2 0/1] [GSOC 2025] [Newbie] userdiff: add built-in pattern for shell scripts Eric Sunshine
2025-03-28 20:05   ` [PATCH v3 0/1] userdiff: improve Bash function and word regex patterns Moumita
2025-03-28 20:05     ` [PATCH v3 1/1] userdiff: extend Bash pattern to cover more shell function forms Moumita
2025-03-29 19:26     ` [PATCH v3 0/1] userdiff: improve Bash function and word regex patterns Junio C Hamano
2025-03-30 12:28       ` MOUMITA DHAR
2025-03-30 13:39     ` [PATCH v4 0/1][GSOC] userdiff:Added newlines at the end of the test cases Moumita
2025-03-30 13:39       ` [PATCH v4 1/1][GSOC] userdiff: extend Bash pattern to cover more shell function forms Moumita
2025-05-02 21:27         ` Junio C Hamano
2025-05-06 16:30         ` Johannes Sixt
2025-05-10 11:37           ` MOUMITA DHAR
2025-05-10 12:40             ` Johannes Sixt
2025-05-11 12:58       ` [PATCH v5 0/1] Added the closing ")" to make sure is not unbalanced and corrected the tests for word diff Moumita
2025-05-11 12:58         ` Moumita [this message]
2025-05-11 13:28         ` Moumita
2025-05-11 13:28           ` Moumita
2025-05-11 13:37         ` Moumita
2025-05-11 14:11         ` [PATCH v6 0/1] Added the newline after the test in t/4018 Moumita
2025-05-11 14:11           ` [PATCH v6 1/1] userdiff: extend Bash pattern to cover more shell function forms Moumita
2025-05-13 18:50             ` Junio C Hamano
2025-05-14  6:33               ` MOUMITA DHAR
2025-05-16  7:25             ` Johannes Sixt
2025-05-17 13:09               ` Junio C Hamano
2025-05-18  7:41                 ` Johannes Sixt
2025-05-16 14:45           ` [PATCH v7 0/1] Updated the word diff regex for Bash scripts Moumita
2025-05-16 14:45             ` [PATCH v7 1/1] userdiff: extend Bash pattern to cover more shell function forms Moumita
2025-05-16 17:45               ` Johannes Sixt
2025-05-16 21:56                 ` Junio C Hamano
