Openembedded Core Discussions
 help / color / mirror / Atom feed
From: Patrick Ohly <patrick.ohly@intel.com>
To: openembedded-core@lists.openembedded.org
Subject: verify-bashisms + shellcheck (was: Re: [PATCH 0/8] verify-bashisms: fix script and one issue found by it)
Date: Wed, 01 Feb 2017 10:17:18 +0100	[thread overview]
Message-ID: <1485940638.20333.187.camel@intel.com> (raw)
In-Reply-To: <cover.55ff1fc2ce63597ab9f5324450132e786a996ac5.1485867027.git-series.patrick.ohly@intel.com>

On Tue, 2017-01-31 at 13:50 +0100, Patrick Ohly wrote:
> The script broke when introducing tinfoil2. The patches also include
> several usability enhancements, like making the output less redundant
> and including information about the actual file and line where the
> offending script can be found.

I've further modified verify-bashisms so that it can optionally run the
scripts through shellcheck (https://www.shellcheck.net/).

I'm quite impressed with how much real issues shellcheck finds and I
also found the recommendations useful. However, it is too verbose to be
applied to all scripts in OE. For example, it warns about missing
quotation marks around variables a lot.

There is no simple "check for bashisms" command line flag or "enable
just these checks" mode. One can disable warnings, so perhaps excluding
a long list of know "harmless" checks would be a way how shellcheck
could replace checkbashisms.pl?

My current solution always calls checkbashisms.pl, and shellcheck only
when a function is annotated:

foobar () {
   echo hello world
}
foobar[shellcheck] = "sh"

This sets "sh" as expected shell flavor. I did it this way because I can
imagine that someone might want to have some functions with bash
features and then ensure that bash is used to execute the code.

I can also imagine that this varflag could be used to exclude certain
checks with "sh SC2001 SC2002 ...", although the same can also be done
with inline comments for specific lines:

foobar () {
   # shellcheck disable=SC2003
   i=$( expr $i + 1 )  
}

Using expr triggers a warning because usually $(( )) is a better
alternative. However, that currently can't be used in bitbake functions
because the shell parser chokes on it:

   NotImplementedError: $((

So I've disabled that check by default.

Any suggestions how to proceed with this?

Note that shellcheck is written in Haskell. Getting it installed
automatically via a recipe would imply adding Haskell support to
OE-core. OTOH it seems to be packaged quite widely, so assuming it to be
installed on the host seems okay.

The "verify-bashisms" name of the script no longer quite matches the
actual functionality when using shellcheck, but that's minor (?).

FWIW, current additional patch is here:

diff --git a/scripts/verify-bashisms b/scripts/verify-bashisms
index dab64ef5019..000ed3f1764 100755
--- a/scripts/verify-bashisms
+++ b/scripts/verify-bashisms
@@ -24,8 +24,9 @@ def is_whitelisted(s):
 
 SCRIPT_LINENO_RE = re.compile(r' line (\d+) ')
 BASHISM_WARNING = re.compile(r'^(possible bashism in.*)$', re.MULTILINE)
+SHELLCHECK_LINENO_RE = re.compile(r'^(In .* line )(\d+):$', re.MULTILINE)
 
-def process(filename, function, lineno, script):
+def process(filename, function, lineno, script, shellcheck):
     import tempfile
 
     if not script.startswith("#!"):
@@ -35,10 +36,10 @@ def process(filename, function, lineno, script):
     fn.write(script)
     fn.flush()
 
+    results = []
     try:
         subprocess.check_output(("checkbashisms.pl", fn.name), universal_newlines=True, stderr=subprocess.STDOUT)
-        # No bashisms, so just return
-        return
+        # No bashisms, so just continue
     except subprocess.CalledProcessError as e:
         # TODO check exit code is 1
 
@@ -48,33 +49,53 @@ def process(filename, function, lineno, script):
             # Probably starts with or contains only warnings. Dump verbatim
             # with one space indention. Can't do the splitting and whitelist
             # checking below.
-            return '\n'.join([filename,
-                              ' Unexpected output from checkbashisms.pl'] +
-                             [' ' + x for x in output.splitlines()])
-
-        # We know that the first line matches and that therefore the first
-        # list entry will be empty - skip it.
-        output = BASHISM_WARNING.split(output)[1:]
-        # Turn the output into a single string like this:
-        # /.../foobar.bb
-        #  possible bashism in updatercd_postrm line 2 (type):
-        #   if ${@use_updatercd(d)} && type update-rc.d >/dev/null 2>/dev/null; then
-        #  ...
-        #   ...
-        result = []
-        # Check the results against the whitelist
-        for message, source in zip(output[0::2], output[1::2]):
-            if not is_whitelisted(source):
-                if lineno is not None:
-                    message = SCRIPT_LINENO_RE.sub(lambda m: ' line %d ' % (int(m.group(1)) + int(lineno) - 1),
-                                                   message)
-                result.append(' ' + message.strip())
-                result.extend(['  %s' % x for x in source.splitlines()])
-        if result:
-            result.insert(0, filename)
-            return '\n'.join(result)
+            results.extend([' Unexpected output from checkbashisms.pl'] +
+                           [' ' + x for x in output.splitlines()])
         else:
-            return None
+            # We know that the first line matches and that therefore the first
+            # list entry will be empty - skip it.
+            output = BASHISM_WARNING.split(output)[1:]
+            # Turn the output into a single string like this:
+            # /.../foobar.bb
+            #  possible bashism in updatercd_postrm line 2 (type):
+            #   if ${@use_updatercd(d)} && type update-rc.d >/dev/null 2>/dev/null; then
+            #  ...
+            #   ...
+            # Check the results against the whitelist
+            for message, source in zip(output[0::2], output[1::2]):
+                if not is_whitelisted(source):
+                    if lineno is not None:
+                        message = SCRIPT_LINENO_RE.sub(lambda m: ' line %d ' % (int(m.group(1)) + int(lineno) - 1),
+                                                       message)
+                    results.append(' ' + message.strip())
+                    results.extend(['  %s' % x for x in source.splitlines()])
+
+    if shellcheck:
+        try:
+            excluded = []
+            # SC2003 warns about using expr instead of $(( )).
+            # bitbake's shell parser chokes on $(( )), so expr
+            # is what embedded scripts have to use - ignore the warning.
+            excluded.append("SC2003")
+            subprocess.check_output(["shellcheck", "--shell=%s" % shellcheck] +
+                                    ["-e%s" % x for x in excluded] +
+                                    [fn.name],
+                                    universal_newlines=True, stderr=subprocess.STDOUT)
+            # No shell warnings, so just continue.
+        except subprocess.CalledProcessError as e:
+            # Replace the temporary filename with the function,
+            # update line numbers and indent the output.
+            output = e.output.replace(fn.name, function)
+            results.extend([' ' +
+                            (x if lineno is None else
+                             SHELLCHECK_LINENO_RE.sub(lambda m: m.group(1) + str(int(m.group(2)) + int(lineno) - 1), x))
+                            for x in output.splitlines()])
+
+    if results:
+        results.insert(0, filename)
+        return '\n'.join(results)
+    else:
+        return None
 
 def get_tinfoil():
     scripts_path = os.path.dirname(os.path.realpath(__file__))
@@ -94,12 +115,16 @@ if __name__=='__main__':
         print("Cannot find checkbashisms.pl on $PATH, get it from https://anonscm.debian.org/cgit/collab-maint/devscripts.git/plain/scripts/checkbashisms.pl")
         sys.exit(1)
 
+    if shutil.which("shellcheck") is None:
+        print("Cannot find shellcheck on $PATH, get it from your distro or https://www.shellcheck.net/")
+        sys.exit(1)
+
     # The order of defining the worker function,
     # initializing the pool and connecting to the
     # bitbake server is crucial, don't change it.
     def func(item):
-        (filename, key, lineno), script = item
-        return process(filename, key, lineno, script)
+        (filename, key, lineno), (script, shellcheck) = item
+        return process(filename, key, lineno, script, shellcheck)
 
     import multiprocessing
     pool = multiprocessing.Pool()
@@ -129,16 +154,22 @@ if __name__=='__main__':
                 data = tinfoil.parse_recipe_file(realfn)
                 for key in data.keys():
                     if data.getVarFlag(key, "func") and not data.getVarFlag(key, "python"):
+                        # Do not expand by default, it could change line numbers.
                         script = data.getVar(key, False)
+                        if script and '${@' in script:
+                            # Embedded Python code confuses the checkers too much,
+                            # we have to expand.
+                            script = data.getVar(key, True)
                         if script:
                             filename = data.getVarFlag(key, "filename")
                             lineno = data.getVarFlag(key, "lineno")
+                            shellcheck = data.getVarFlag(key, "shellcheck")
                             # There's no point in checking a function multiple
                             # times just because different recipes include it.
                             # We identify unique scripts by file, name, and (just in case)
                             # line number.
                             attributes = (filename or realfn, key, lineno)
-                            scripts.setdefault(attributes, script)
+                            scripts.setdefault(attributes, (script, shellcheck))
 
 
     print("Scanning scripts...\n")




-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.





      parent reply	other threads:[~2017-02-01  9:17 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-31 12:50 [PATCH 0/8] verify-bashisms: fix script and one issue found by it Patrick Ohly
2017-01-31 12:50 ` [PATCH 1/8] verify-bashisms: fix typo Patrick Ohly
2017-01-31 12:50 ` [PATCH 2/8] verify-bashisms: point out where to get checkbashisms.pl Patrick Ohly
2017-01-31 12:50 ` [PATCH 3/8] verify-bashisms: explicitly shut down server Patrick Ohly
2017-01-31 12:50 ` [PATCH 4/8] verify-bashisms: fix problems with tinfoil2 Patrick Ohly
2017-01-31 12:50 ` [PATCH 5/8] verify-bashisms: revise update-rc.d whitelist entry Patrick Ohly
2017-01-31 12:50 ` [PATCH 6/8] verify-bashisms: check scripts only once, include original file and line Patrick Ohly
2017-01-31 12:50 ` [PATCH 7/8] populate_sdk_ext: fix == bashism Patrick Ohly
2017-01-31 12:50 ` [PATCH 8/8] verify-bashisms: support warnings with more than one line of source code Patrick Ohly
2017-02-01  9:17 ` Patrick Ohly [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1485940638.20333.187.camel@intel.com \
    --to=patrick.ohly@intel.com \
    --cc=openembedded-core@lists.openembedded.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox