From: Andrew Murray <andrew.murray@arm.com>
To: Dan Carpenter <dan.carpenter@oracle.com>, Catalin.Marinas@arm.com
Cc: smatch@vger.kernel.org
Subject: [RFC PATCH 4/7] smdb.py: add find_tagged and parse_warns_tagged commands
Date: Mon,  7 Oct 2019 16:35:42 +0100
Message-ID: <20191007153545.23231-5-andrew.murray@arm.com>
In-Reply-To: <20191007153545.23231-1-andrew.murray@arm.com>

The find_tagged command follows a given parameter up the call
stack as far as it can, for as long as the parameter contains user
data and its top byte may be non-zero. This helps to identify
functions that should untag a tagged address.

The parse_warns_tagged command parses the smatch_warns.txt file
and calls find_tagged for each tagged-address warning, thus
providing a summary of all issues found and their potential
causes.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
---
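Note (not part of the commit): the "tagged" test that find_tagged relies
on can be sketched in isolation. This is only an illustration, not the
code in the patch. It assumes a range list is already a list of
(min, max) integer pairs, which is how the patch indexes the result of
txt_to_rl(); the name is_tagged_ranges and the constant names are
hypothetical.

```python
# Hypothetical stand-alone version of the rl_is_tagged() logic in this patch.
# Assumption: a range list is a list of (min, max) integer pairs.

TOP_BYTE_CLEAR_MAX = 0x00FFFFFFFFFFFFFF  # largest value with a zero top byte
UNTAGGED_MIN = 0xFF80000000000000        # minimum that rl_has_min_untagged() looks for


def is_tagged_ranges(ranges):
    """Return True if some range reaches into the top byte and no range
    starts at UNTAGGED_MIN."""
    too_big = any(hi > TOP_BYTE_CLEAR_MAX for _, hi in ranges)
    has_untagged_min = any(lo == UNTAGGED_MIN for lo, _ in ranges)
    return too_big and not has_untagged_min


if __name__ == "__main__":
    # A full 64-bit range can carry a non-zero top byte.
    print(is_tagged_ranges([(0, 0xFFFFFFFFFFFFFFFF)]))
    # A range capped at 2^56 - 1 cannot.
    print(is_tagged_ranges([(0, 0x00FFFFFFFFFFFFFF)]))
```

Under these assumptions a range list only counts as tagged when both
conditions hold, which mirrors the two early returns in rl_is_tagged().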
 smatch_data/db/smdb.py | 103 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/smatch_data/db/smdb.py b/smatch_data/db/smdb.py
index 80c2df59af01..ba46b01a4d08 100755
--- a/smatch_data/db/smdb.py
+++ b/smatch_data/db/smdb.py
@@ -7,6 +7,7 @@
 import sqlite3
 import sys
 import re
+import subprocess
 
 try:
     con = sqlite3.connect('smatch_db.sqlite')
@@ -25,6 +26,8 @@ def usage():
     print "data_info <struct_type> <member> - information about a given data type"
     print "function_ptr <function> - which function pointers point to this"
     print "trace_param <function> <param> - trace where a parameter came from"
+    print "find_tagged <function> <param> - find the source of a tagged value (arm64)"
+    print "parse_warns_tagged <smatch_warns.txt> - parse warns file for summary of tagged issues (arm64)"
     print "locals <file> - print the local values in a file."
     sys.exit(1)
 
@@ -542,6 +545,99 @@ def function_type_value(struct_type, member):
     for txt in cur:
         print "%-30s | %-30s | %s | %s" %(txt[0], txt[1], txt[2], txt[3])
 
+def rl_too_big(txt):
+    rl = txt_to_rl(txt)
+    # A maximum above 2^56 - 1 means the top byte may be non-zero.
+    for idx in range(len(rl)):
+        cur_max = rl[idx][1]
+        if (cur_max > 0xFFFFFFFFFFFFFF):
+            return 1
+
+    return 0
+
+def rl_has_min_untagged(txt):
+    rl = txt_to_rl(txt)
+    # Look for a range whose minimum is exactly 0xff80000000000000.
+    for idx in range(len(rl)):
+        cur_min = rl[idx][0]
+        if (cur_min == 0xff80000000000000):
+            return 1
+
+    return 0
+
+def rl_is_tagged(txt):
+    if not rl_too_big(txt):
+        return 0
+
+    if rl_has_min_untagged(txt):
+        return 0
+
+    return 1
+
+def parse_warns_tagged(filename):
+    proc = subprocess.Popen(['cat %s | grep "potentially tagged" | sort | uniq' %(filename)], shell=True, stdout=subprocess.PIPE)
+    while True:
+        line = proc.stdout.readline()
+        if not line:
+            break
+
+        linepos = re.search("([^\s]+)", line).group(1)
+        groupre = re.search("potentially tagged address \(([^,]+), ([^,]+), ([^\)]+)\)", line)
+        if not groupre:
+            continue
+        func = groupre.group(1)
+        param = int(groupre.group(2))
+        var = groupre.group(3)
+
+        if ("end" in var or "size" in var or "len" in var):
+            continue
+
+        print "\n%s (func: %s, param: %d:%s) may be caused by:" %(linepos, func, param, var)
+
+        if (param != -1):
+            if not find_tagged(func, param, 0, []):
+                print "    %s (param %d) (can't walk call tree)" % (func, param)
+        else:
+            print "    %s (variable %s) (can't walk call tree)" % (func, var)
+
+def find_tagged(func, param, caller_call_id, printed):
+
+    callers = {}
+    cur = con.cursor()
+    ptrs = get_function_pointers(func)
+    found = 0
+
+    for ptr in ptrs:
+        cur.execute("select call_id, value from caller_info where function = '%s' and parameter=%d and type=%d" %(ptr, param, type_to_int("DATA_SOURCE")))
+
+        for row in cur:
+            if (row[1][0] == '$'):
+                if row[0] not in callers:
+                    callers[row[0]] = {}
+                callers[row[0]]["param"] = int(row[1][1])
+
+    for ptr in ptrs:
+        cur.execute("select caller, call_id, value from caller_info where function = '%s' and parameter=%d and type=%d" %(ptr, param, type_to_int("USER_DATA")))
+
+        for row in cur:
+            if not rl_is_tagged(row[2]):
+                continue
+            found = 1
+            if row[1] not in callers:
+                callers[row[1]] = {}
+            if "param" not in callers[row[1]]:
+                line = "    %s (param ?) -> %s (param %d)" % (row[0], func, param)
+                if line not in printed:
+                    printed.append(line)
+                    print line
+                continue
+            if row[0] not in printed:
+                printed.append(row[0])
+                if not find_tagged(row[0], callers[row[1]]["param"], row[1], printed):
+                    print "    %s (param %d)" % (row[0], param)
+
+    return found
+
 def trace_callers(func, param):
     sources = []
     prev_type = 0
@@ -641,6 +737,13 @@ elif sys.argv[1] == "data_info":
 elif sys.argv[1] == "call_tree":
     func = sys.argv[2]
     print_call_tree(func)
+elif sys.argv[1] == "find_tagged":
+    func = sys.argv[2]
+    param = int(sys.argv[3])
+    find_tagged(func, param, 0, [])
+elif sys.argv[1] == "parse_warns_tagged":
+    filename = sys.argv[2]
+    parse_warns_tagged(filename)
 elif sys.argv[1] == "where":
     if len(sys.argv) == 3:
         struct_type = "%"
-- 
2.21.0
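
Note for reviewers: the line parsing done by parse_warns_tagged can be
tried without a database. The sketch below reuses the regular expression
from the patch; the helper name parse_tagged_warning and the sample
warning line are hypothetical, and the exact wording smatch prints
before "potentially tagged address" is an assumption.

```python
import re

# Same pattern as in the patch: "(<function>, <param index>, <variable>)".
TAGGED_RE = re.compile(
    r"potentially tagged address \(([^,]+), ([^,]+), ([^\)]+)\)")


def parse_tagged_warning(line):
    """Return (location, func, param, var), or None if the line does not match."""
    match = TAGGED_RE.search(line)
    if match is None:
        return None
    # The first whitespace-delimited token is the file:line location.
    location = line.split()[0]
    return location, match.group(1), int(match.group(2)), match.group(3)


if __name__ == "__main__":
    # Hypothetical warning line, for illustration only.
    sample = ("mm/foo.c:123 bar() warn: comparison of a "
              "potentially tagged address (bar, 0, addr)")
    print(parse_tagged_warning(sample))
```

Returning None for non-matching lines lets a caller skip them instead of
failing on a missing match object.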
