public inbox for kdevops@lists.linux.dev
* [PATCH kdevops] scripts/coccinelle/generation: add example generation script
@ 2025-04-03  2:01 Luis Chamberlain
  2025-04-03  6:50 ` [cocci] " Markus Elfring
  2025-04-06 14:00 ` Markus Elfring
  0 siblings, 2 replies; 5+ messages in thread
From: Luis Chamberlain @ 2025-04-03  2:01 UTC (permalink / raw)
  To: kdevops; +Cc: cocci, julia.lawall, dave, jack, gost.dev, Luis Chamberlain

For complicated tasks it is sometimes difficult to write a
coccinelle rule. Such is the case when you want to use the iterator,
which can help check for nested sets of calls.

An example use case is checking whether a routine may be reached
from atomic context, or whether a routine ends up calling any known
sleeping routines.

This adds some initial examples we can try to enhance over time for
some very specific, specialized tasks. What we see is that it is hard
even to have code generate code (cocci Python code) which we can
easily maintain.

Fortunately AI does *grok* this. So, no shame: *different* gen AI agents
helped me get these to where they are. I suspect we'll need gen AI
agents to continue to maintain and enhance them as well.

Once we have high confidence in these, we should start using them in
tests to ensure we respect some golden rules. They all take a long
time to run because of the nested matching; it's complicated.

I wrote these to try to help with a bug we're trying to fix upstream [0]:
determining whether __find_get_block_slow() really can't block, and
determining exactly *why*.

[0] https://lore.kernel.org/all/20250330064732.3781046-1-mcgrof@kernel.org/

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 
Posting this in case it is useful to others. I'll be merging it into
kdevops now, but long term, if we find value, we can move it to Linux
too. It has low confidence for now and so needs much more work and a
bit more AI & human love.
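For reference, the heart of check_for_atomic_calls.py is a loop that
stamps out one caller-discovery rule per level. A simplified standalone
sketch of that idea (not the script itself, which nests this inside a
larger f-string and so needs doubled braces) looks like this:

```python
# Simplified sketch of the per-level rule generation: each level emits
# one Coccinelle rule that finds callers of the previous level's matches.
def caller_rule(level: int) -> str:
    # {{ and }} inside the f-string become literal braces in the rule body.
    return f"""
@caller{level} depends on after_start exists@
identifier virtual.transitive_caller;
identifier fn;
position p;
@@
fn@p(...) {{
  <+... transitive_caller(...); ...+>
}}
"""

max_depth = 3
cocci = "".join(caller_rule(level) for level in range(1, max_depth + 1))
print(cocci.count("depends on after_start"))  # -> 3, one rule per level
```

In the real script, a matching @script:python rule then registers each
hit via Coccinelle's Iteration() API so the next level can chase it.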

 .../generation/check_for_atomic_calls.py      | 443 ++++++++++++
 .../generation/check_for_sleepy_calls.py      | 678 ++++++++++++++++++
 2 files changed, 1121 insertions(+)
 create mode 100755 scripts/coccinelle/generation/check_for_atomic_calls.py
 create mode 100755 scripts/coccinelle/generation/check_for_sleepy_calls.py

diff --git a/scripts/coccinelle/generation/check_for_atomic_calls.py b/scripts/coccinelle/generation/check_for_atomic_calls.py
new file mode 100755
index 000000000000..5849f9643de5
--- /dev/null
+++ b/scripts/coccinelle/generation/check_for_atomic_calls.py
@@ -0,0 +1,443 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2025 Luis Chamberlain <mcgrof@kernel.org>
+# SPDX-License-Identifier: GPL-2.0-or-later OR copyleft-next-0.3.1
+
+# Generates a Coccinelle file which can be used to track down whether any
+# transitive caller of a target routine (up to --levels deep) runs in
+# atomic context.
+# Huge work in progress, but it's a start.
+#
+# This is work in progress.
+# Confidence: low
+
+import multiprocessing
+import argparse
+
+"""
+Generate a Coccinelle semantic patch that checks for atomic context
+in any transitive caller (up to N levels) of a target function.
+Example usage:
+    ./check_for_atomic_calls.py --levels 5 --target __find_get_block_slow --output atomic_check_find_block.cocci
+    make coccicheck MODE=report COCCI=find_block_check.cocci
+For an obvious atomic call:
+    ./check_for_atomic_calls.py --levels 5 --target netif_rx --output netif_rx.cocci
+    make coccicheck MODE=report COCCI=netif_rx.cocci
+"""
+parser = argparse.ArgumentParser(
+    description="Generate a Coccinelle checker for atomic context in transitive callers of a target function."
+)
+parser.add_argument(
+    "--levels", "-l",
+    type=int,
+    required=True,
+    help="Maximum number of transitive caller levels to follow (e.g., 5)"
+)
+parser.add_argument(
+    "--target", "-t",
+    type=str,
+    required=True,
+    help="Target function to trace (e.g., __find_get_block_slow)"
+)
+parser.add_argument(
+    "--output", "-o",
+    type=str,
+    required=True,
+    help="Output .cocci file to generate"
+)
+args = parser.parse_args()
+max_depth = args.levels
+target_func = args.target
+
+# Helper to get the number of processors for parallel jobs
+def get_nprocs():
+    try:
+        return multiprocessing.cpu_count()
+    except Exception:
+        return 1  # Fall back to 1 if we can't determine the CPU count
+
+outfile = args.output
+header = f"""// SPDX-License-Identifier: GPL-2.0
+/// Autogenerated by check_for_atomic_calls.py
+/// Detect atomic context in ANY transitive caller (up to {max_depth} levels)
+/// of `{target_func}`
+// Options: --no-includes --include-headers
+virtual after_start
+virtual report
+@initialize:python@
+@@
+seen = set()
+seen_atomic = set()
+seen_irq_regions = set()
+seen_spinlock_regions = set()
+
+def register_caller(fn, file):
+    if fn not in seen:
+        seen.add(fn)
+        it = Iteration()
+        if file is not None:
+            it.set_files([file])
+        it.add_virtual_rule("after_start")
+        it.add_virtual_identifier("transitive_caller", fn)
+        it.register()
+
+// Look for direct calls to the target function
+@seed@
+identifier fn;
+position p;
+@@
+fn@p(...) {{
+  <+... {target_func}(...); ...+>
+}}
+
+@script:python depends on seed@
+fn << seed.fn;
+p  << seed.p;
+@@
+print(f"🌱 SEED HIT: {{fn}} calls {target_func} at {{p[0].file}}:{{p[0].line}}")
+register_caller(fn, p[0].file)
+
+// Special pattern for IRQ handler detection in macro definitions
+@irq_handler_def@
+identifier fn;
+@@
+(
+DEFINE_IRQ_HANDLER(fn, ...)
+|
+DECLARE_TASKLET(fn, ...)
+)
+
+@script:python depends on irq_handler_def@
+fn << irq_handler_def.fn;
+@@
+print(f"⚡ DEFINED IRQ HANDLER: {{fn}}")
+register_caller(fn, None)
+
+// Look for irq-related function names that haven't been caught yet
+@irq_func_names@
+identifier fn =~ "(_irq|_intr|_isr|_napi|_poll|_tasklet|_softirq|_bh)$";
+@@
+
+fn(...) {{ ... }}
+
+@script:python depends on irq_func_names@
+fn << irq_func_names.fn;
+@@
+print(f"⚡ NAMED IRQ HANDLER: {{fn}}")
+register_caller(fn, None)
+
+// Look for functions with interrupt prefix naming patterns
+@interrupt_prefixed@
+identifier fn =~ "^(irq_|intr_|isr_|napi_|poll_|do_softirq|tasklet_)";
+@@
+
+fn(...) {{ ... }}
+
+@script:python depends on interrupt_prefixed@
+fn << interrupt_prefixed.fn;
+@@
+print(f"⚡ PREFIXED IRQ HANDLER: {{fn}}")
+register_caller(fn, None)
+"""
+with open(outfile, "w") as f:
+    f.write(header)
+    
+    # Generate all the caller chain rules
+    for level in range(1, max_depth + 1):
+        f.write(f"""
+// Level {level} caller discovery
+@caller{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+identifier fn;
+position p;
+@@
+fn@p(...) {{
+  <+... transitive_caller(...); ...+>
+}}
+
+@script:python depends on caller{level}@
+fn << caller{level}.fn;
+p << caller{level}.p;
+transitive_caller << virtual.transitive_caller;
+@@
+print(f"🔄 Chain level {level}: {{fn}} calls {{transitive_caller}} at {{p[0].file}}:{{p[0].line}}")
+register_caller(fn, p[0].file)
+""")
+
+    # Check for atomic context in each caller in our chain
+    for level in range(1, max_depth + 1):
+        # First, check for common atomic primitives
+        f.write(f"""
+// Level {level} atomic context check - Common atomic primitives
+@atomiccheck{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+position p1, p2;
+@@
+(
+spin_lock@p1(...)
+|
+spin_lock_irq@p1(...)
+|
+spin_lock_irqsave@p1(...)
+|
+spin_lock_bh@p1(...)
+|
+read_lock@p1(...)
+|
+read_lock_irq@p1(...)
+|
+read_lock_irqsave@p1(...)
+|
+read_lock_bh@p1(...)
+|
+write_lock@p1(...)
+|
+write_lock_irq@p1(...)
+|
+write_lock_irqsave@p1(...)
+|
+write_lock_bh@p1(...)
+|
+raw_spin_lock@p1(...)
+|
+raw_spin_lock_irq@p1(...)
+|
+raw_spin_lock_irqsave@p1(...)
+|
+raw_spin_lock_bh@p1(...)
+|
+local_irq_disable@p1()
+|
+local_irq_save@p1(...)
+|
+local_bh_disable@p1()
+|
+preempt_disable@p1()
+|
+in_atomic@p1()
+|
+in_atomic_preempt_off@p1()
+|
+in_interrupt@p1()
+|
+in_irq@p1()
+|
+in_serving_softirq@p1()
+|
+in_nmi@p1()
+|
+in_task@p1()
+|
+rcu_read_lock@p1()
+|
+irq_enter@p1()
+|
+napi_disable@p1(...)
+)
+...
+transitive_caller@p2(...)
+
+@script:python depends on atomiccheck{level}@
+p1 << atomiccheck{level}.p1;
+p2 << atomiccheck{level}.p2;
+transitive_caller << virtual.transitive_caller;
+@@
+key = (p1[0].file, p1[0].line, transitive_caller)
+if key not in seen_atomic:
+    seen_atomic.add(key)
+    print(f"⚠️  WARNING: atomic context at level {level}: {{p1[0].current_element}} at {{p1[0].file}}:{{p1[0].line}} may reach {{transitive_caller}}() → eventually {target_func}()")
+""")
+
+        # Check for lock-related functions directly calling our chain
+        f.write(f"""
+// Level {level} atomic context check - Lock-related functions
+@atomic_fn_check{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+identifier lock_fn;
+position p;
+@@
+// Common locking function patterns
+lock_fn(...) {{
+  ... when != unlock_irqrestore(...)
+      when != local_irq_restore(...)
+      when != spin_unlock(...)
+      when != rcu_read_unlock(...)
+  transitive_caller@p(...)
+  ...
+}}
+
+@script:python depends on atomic_fn_check{level}@
+lock_fn << atomic_fn_check{level}.lock_fn;
+p << atomic_fn_check{level}.p;
+transitive_caller << virtual.transitive_caller;
+@@
+# Only report functions with lock-related names
+atomic_keywords = ['lock', 'irq', 'atomic', 'bh', 'intr', 'preempt', 'disable', 'napi', 'rcu']
+if any(kw in lock_fn.lower() for kw in atomic_keywords):
+    print(f"⚠️  WARNING: potential atomic function at level {level}: {{lock_fn}} (name suggests lock handling) contains call to {{transitive_caller}}() at {{p[0].file}}:{{p[0].line}} → eventually {target_func}()")
+""")
+
+        # Check for spinlock regions
+        f.write(f"""
+// Level {level} spinlock region check
+@spinlock_region{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+expression E1, flags;
+position p1, p3;
+@@
+(
+spin_lock@p1(E1,...)
+|
+spin_lock_irq@p1(E1,...)
+|
+spin_lock_irqsave@p1(E1, flags)
+|
+spin_trylock@p1(E1)
+|
+raw_spin_lock@p1(E1,...)
+|
+raw_spin_lock_irq@p1(E1,...)
+|
+raw_spin_lock_irqsave@p1(E1, flags)
+|
+raw_spin_trylock@p1(E1)
+)
+... when != spin_unlock(E1,...)
+    when != spin_unlock_irq(E1,...)
+    when != spin_unlock_irqrestore(E1, flags)
+    when != raw_spin_unlock(E1,...)
+    when != raw_spin_unlock_irq(E1,...)
+    when != raw_spin_unlock_irqrestore(E1, flags)
+transitive_caller@p3(...)
+
+@script:python depends on spinlock_region{level}@
+p1 << spinlock_region{level}.p1;
+p3 << spinlock_region{level}.p3;
+transitive_caller << virtual.transitive_caller;
+@@
+key = (p1[0].file, p1[0].line, p3[0].line, transitive_caller)
+if key not in seen_spinlock_regions:
+    seen_spinlock_regions.add(key)
+    print(f"⚠️  WARNING: spinlock region at level {level}: {{p1[0].current_element}} at {{p1[0].file}}:{{p1[0].line}} contains call to {{transitive_caller}}() at line {{p3[0].line}} → eventually {target_func}()")
+""")
+
+        # Look for functions that can't sleep
+        f.write(f"""
+// Level {level} check - Can't sleep contexts
+@cant_sleep{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+position p1, p2;
+@@
+(
+GFP_ATOMIC@p1
+|
+cond_resched@p1()
+|
+__GFP_ATOMIC@p1
+|
+DECLARE_COMPLETION@p1(...)
+)
+...
+transitive_caller@p2(...)
+
+@script:python depends on cant_sleep{level}@
+p1 << cant_sleep{level}.p1;
+p2 << cant_sleep{level}.p2;
+transitive_caller << virtual.transitive_caller;
+@@
+print(f"⚠️  WARNING: Non-sleeping context at {{p1[0].file}}:{{p1[0].line}} but calls {{transitive_caller}}() at line {{p2[0].line}} → eventually {target_func}()")
+""")
+
+        # Check for network driver contexts
+        f.write(f"""
+// Level {level} check - Network driver contexts (commonly atomic)
+@netdriver{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+position p1, p2;
+@@
+(
+alloc_skb@p1(...)
+|
+netif_receive_skb@p1(...)
+|
+netdev_alloc_skb@p1(...)
+|
+napi_complete@p1(...)
+|
+napi_schedule@p1(...)
+|
+napi_gro_receive@p1(...)
+|
+skb_reserve@p1(...)
+|
+consume_skb@p1(...)
+|
+dev_consume_skb_any@p1(...)
+|
+skb_put@p1(...)
+|
+skb_push@p1(...)
+|
+netif_rx_ni@p1(...)
+)
+...
+transitive_caller@p2(...)
+
+@script:python depends on netdriver{level}@
+p1 << netdriver{level}.p1;
+p2 << netdriver{level}.p2;
+transitive_caller << virtual.transitive_caller;
+@@
+print(f"⚠️  WARNING: Network driver context at {{p1[0].file}}:{{p1[0].line}} but calls {{transitive_caller}}() at line {{p2[0].line}} → eventually {target_func}()")
+""")
+
+        # Check for functions that might call from atomic context by name
+        f.write(f"""
+// Level {level} check - Function with name suggesting atomic context
+@atomic_name{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+identifier atomic_fn =~ "(_irq|_intr|_isr|_napi|_poll|_bh|_softirq|_tasklet|_atomic)$";
+position p;
+@@
+atomic_fn(...) {{
+...
+transitive_caller@p(...)
+...
+}}
+
+@script:python depends on atomic_name{level}@
+atomic_fn << atomic_name{level}.atomic_fn;
+p << atomic_name{level}.p;
+transitive_caller << virtual.transitive_caller;
+@@
+print(f"⚠️  WARNING: Function with atomic-suggesting name {{atomic_fn}} calls {{transitive_caller}}() at {{p[0].file}}:{{p[0].line}} → eventually {target_func}()")
+""")
+
+        # Check for sleep-incompatible contexts but target function might sleep
+        f.write(f"""
+// Level {level} check - Target function called in context where might_sleep is used
+@might_sleep_check{level} depends on after_start exists@
+identifier virtual.transitive_caller;
+position p1, p2;
+@@
+(
+might_sleep@p1()
+|
+might_sleep_if@p1(...)
+|
+sched_might_sleep@p1()
+) 
+...
+transitive_caller@p2(...)
+
+@script:python depends on might_sleep_check{level}@
+p1 << might_sleep_check{level}.p1;
+p2 << might_sleep_check{level}.p2;
+transitive_caller << virtual.transitive_caller;
+@@
+print(f"⚠️  WARNING: Function has might_sleep() at {{p1[0].file}}:{{p1[0].line}} but also calls {{transitive_caller}}() at line {{p2[0].line}} → eventually {target_func}()")
+""")
+
+    f.write("\n")
+
+print(f"✅ Generated {outfile} with enhanced atomic checks for `{target_func}` up to {max_depth} levels. Run with: make coccicheck MODE=report COCCI={outfile} J={get_nprocs()}")
diff --git a/scripts/coccinelle/generation/check_for_sleepy_calls.py b/scripts/coccinelle/generation/check_for_sleepy_calls.py
new file mode 100755
index 000000000000..87bd5b264203
--- /dev/null
+++ b/scripts/coccinelle/generation/check_for_sleepy_calls.py
@@ -0,0 +1,678 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2025 Luis Chamberlain <mcgrof@kernel.org>
+# SPDX-License-Identifier: GPL-2.0-or-later OR copyleft-next-0.3.1
+
+# Generates a Coccinelle file which can be used to track down whether a
+# routine calls any functions known to sleep. Much work to be done with this.
+# Huge work in progress, but it's a start.
+#
+# This is work in progress.
+# Confidence: low
+
+import argparse
+import multiprocessing
+import os
+import tempfile
+import json
+
+"""
+Generate a Coccinelle semantic patch that checks if a given function
+calls any functions that might sleep. It generates a report and the
+goal is to collect stats at the end as well.
+
+Example usage:
+    ./check_for_sleepy_calls.py --function folio_mc_copy --sleepy-function cond_resched --expected --max-depth 1 --output specific_sleep_check.cocci
+    make coccicheck MODE=report COCCI=specific_sleep_check.cocci
+
+Or to check for sleep function called:
+    ./check_for_sleepy_calls.py --function __find_get_block_slow --max-depth 1   --output all_sleep_check.cocci
+    make coccicheck MODE=report COCCI=all_sleep_check.cocci
+"""
+
+parser = argparse.ArgumentParser(
+    description="Generate a Coccinelle checker to find sleeping functions called by a target function."
+)
+parser.add_argument(
+    "--function", "-f",
+    type=str,
+    required=True,
+    help="Target function to analyze (e.g., netif_rx_ni)"
+)
+parser.add_argument(
+    "--max-depth", "-d",
+    type=int,
+    default=3,
+    help="Maximum depth of function call chain to analyze (default: 3)"
+)
+parser.add_argument(
+    "--output", "-o",
+    type=str,
+    required=True,
+    help="Output .cocci file to generate"
+)
+parser.add_argument(
+    "--sleepy-function", "-s",
+    type=str,
+    default=None,
+    help="Specific function to check for that may cause sleeping (e.g., folio_wait_locked)"
+)
+parser.add_argument(
+    "--expected", "-e",
+    action="store_true",
+    default=False,
+    help="Indicate that the function is expected to have a sleep path (verified by manual inspection)"
+)
+args = parser.parse_args()
+target_func = args.function
+max_depth = args.max_depth
+outfile = args.output
+sleepy_func = args.sleepy_function
+expected_to_sleep = args.expected
+
+# Helper to get the number of processors for parallel jobs
+def get_nprocs():
+    try:
+        return multiprocessing.cpu_count()
+    except Exception:
+        return 1  # Fall back to 1 if we can't determine the CPU count
+
+# List of common functions known to sleep
+known_sleepy_functions = [
+    "msleep", "ssleep", "usleep_range", "schedule", "schedule_timeout",
+    "wait_event", "wait_for_completion", "mutex_lock", "down_read", "down_write",
+    "kthread_create", "kthread_run", "kmalloc", "__kmalloc", "kmem_cache_alloc", 
+    "vmalloc", "vzalloc", "kvmalloc", "kzalloc", "__vmalloc", "kvzalloc",
+    "sock_create", "sock_create_kern", "sock_create_lite", "sock_socket", 
+    "filp_open", "open_bdev_exclusive", "create_workqueue", 
+    "alloc_workqueue", "__alloc_workqueue_key", "request_threaded_irq",
+    "request_module",
+    "copy_from_user", "copy_to_user", "__copy_from_user", "__copy_to_user"
+]
+
+# If a specific sleepy function is provided, only check for that one
+if sleepy_func:
+    known_sleepy_functions = [sleepy_func]
+    print(f"Note: Only checking for calls to {sleepy_func}")
+
+# List of common GFP flags that indicate sleeping is allowed
+sleepy_gfp_flags = [
+    "GFP_KERNEL", "GFP_USER", "GFP_HIGHUSER",
+    "GFP_NOIO", "GFP_NOFS",
+    "__GFP_WAIT", "__GFP_IO", "__GFP_FS"
+]
+
+# Create a stats directory
+stats_dir = os.path.join(tempfile.gettempdir(), f"cocci_stats_{os.getpid()}")
+os.makedirs(stats_dir, exist_ok=True)
+
+# Generate the Coccinelle script
+with open(outfile, "w") as f:
+    title = f"Detect if function '{target_func}' calls any functions that might sleep"
+    if sleepy_func:
+        title = f"Detect if function '{target_func}' calls '{sleepy_func}'"
+    
+    f.write(f"""// SPDX-License-Identifier: GPL-2.0
+/// Autogenerated by check_for_sleepy_calls.py
+/// {title}
+// Options: --no-includes --include-headers
+
+virtual report
+
+@initialize:python@
+@@
+import re
+import os
+import json
+import hashlib
+import tempfile
+import time
+from datetime import datetime
+
+target_func = "{target_func}"
+sleepy_func = "{sleepy_func or ''}"
+expected_to_sleep = {str(expected_to_sleep).capitalize()}
+
+# Create a stats directory
+stats_dir = "{stats_dir}"
+os.makedirs(stats_dir, exist_ok=True)
+
+# Generate a unique ID for this process to avoid race conditions
+# Use a hash of time and process ID for uniqueness
+process_id = hashlib.md5(f"{{os.getpid()}}_{{time.time()}}".encode()).hexdigest()[:8]
+stats_file = os.path.join(stats_dir, f"stats_{{process_id}}.json")
+
+# Count of known sleepy functions we're checking for
+sleepy_funcs_count = {len(known_sleepy_functions) if not sleepy_func else 1}
+
+# Use sets to track what we've seen
+seen_calls = set()
+seen_sleep_points = set()
+seen_funcs = set([target_func])
+
+# Counters for work done
+total_funcs_analyzed = 0
+total_calls_checked = 0
+sleep_checks_performed = 0
+total_sleep_routines_checked = 0
+
+# Dictionary to track calls
+call_graph = dict()
+
+# List to track detailed sleep points for final report
+sleep_point_details = []
+
+def register_call(caller, callee, file, line):
+    global total_calls_checked
+    key = (caller, callee, file, line)
+    if key not in seen_calls:
+        seen_calls.add(key)
+        total_calls_checked += 1
+        
+        # Add to call graph for path tracking
+        if caller not in call_graph:
+            call_graph[caller] = []
+        call_graph[caller].append((callee, file, line))
+        
+        # Add to analysis queue
+        if callee not in seen_funcs:
+            seen_funcs.add(callee)
+        
+        return True
+    return False
+
+# Function to find path from target to a function
+def find_path_to(target, current=None, path=None, visited=None):
+    if current is None:
+        current = target_func
+    if path is None:
+        path = [current]
+    if visited is None:
+        visited = set([current])
+    
+    if current == target:
+        return path
+    
+    for callee, _, _ in call_graph.get(current, []):
+        if callee not in visited:
+            visited.add(callee)
+            new_path = path + [callee]
+            if callee == target:
+                return new_path
+            result = find_path_to(target, callee, new_path, visited)
+            if result:
+                return result
+    
+    return None
+
+# Function to register a sleep point
+def register_sleep_point(caller_func, sleep_func, file, line, reason=""):
+    global sleep_checks_performed
+    sleep_checks_performed += 1
+    
+    key = (caller_func, sleep_func, file, line)
+    if key not in seen_sleep_points:
+        seen_sleep_points.add(key)
+        
+        # Try to find call path
+        path = find_path_to(caller_func)
+        path_str = " → ".join(path) if path else caller_func
+        
+        # Use different symbol based on expected flag
+        symbol = "✅" if expected_to_sleep else "⚠️"
+        
+        if reason:
+            message = f"{{symbol}} {'VERIFIED' if expected_to_sleep else 'WARNING'}: {{caller_func}}() might sleep at {{file}}:{{line}} - {{reason}} (via {{sleep_func}})"
+            path_message = f"   Call path: {{path_str}} → {{sleep_func}}"
+            print(message)
+            print(path_message)
+        else:
+            message = f"{{symbol}} {'VERIFIED' if expected_to_sleep else 'WARNING'}: {{caller_func}}() might sleep at {{file}}:{{line}} (via {{sleep_func}})"
+            path_message = f"   Call path: {{path_str}} → {{sleep_func}}"
+            print(message)
+            print(path_message)
+        
+        # Store this for our final stats
+        if path:
+            path_value = path_str + " → " + sleep_func
+        else:
+            path_value = caller_func + " → " + sleep_func
+            
+        detail = dict()
+        detail["caller"] = caller_func
+        detail["sleep_func"] = sleep_func
+        detail["file"] = file
+        detail["line"] = line
+        detail["reason"] = reason
+        detail["path"] = path_value
+        sleep_point_details.append(detail)
+        
+        return True
+    return False
+
+# List to track functions we need to analyze next after this iteration
+functions_to_analyze = []
+
+def register_func_for_analysis(func_name):
+    global total_funcs_analyzed
+    if func_name not in seen_funcs:
+        seen_funcs.add(func_name)
+        functions_to_analyze.append(func_name)
+        total_funcs_analyzed += 1
+        print(f"🔍 Analyzing function: {{func_name}} (Total analyzed: {{total_funcs_analyzed}})")
+        return True
+    return False
+
+# Function to save statistics to our unique file
+def save_stats():
+    stats = dict()
+    stats["process_id"] = process_id
+    stats["timestamp"] = datetime.now().isoformat()
+    stats["target_func"] = target_func
+    stats["sleepy_func"] = sleepy_func
+    stats["total_funcs_analyzed"] = total_funcs_analyzed
+    stats["total_calls_checked"] = total_calls_checked
+    stats["total_sleep_routines_checked"] = total_sleep_routines_checked
+    stats["sleep_checks_performed"] = sleep_checks_performed
+    stats["seen_funcs_count"] = len(seen_funcs)
+    stats["seen_calls_count"] = len(seen_calls)
+    stats["seen_sleep_points_count"] = len(seen_sleep_points)
+    stats["sleep_point_details"] = sleep_point_details
+    
+    with open(stats_file, "w") as f:
+        json.dump(stats, f, indent=2)
+""")
+
+    # Define the rule to find direct calls to the target function
+    f.write(f"""
+// Find direct function calls made by target function
+@find_calls@
+identifier fn;
+position p;
+@@
+{target_func}(...) {{
+  <... 
+  fn@p(...);
+  ...>
+}}
+
+@script:python depends on find_calls@
+fn << find_calls.fn;
+p << find_calls.p;
+@@
+# register_call() bumps total_calls_checked for each new call site,
+# so no extra increment is needed here
+register_call(target_func, fn, p[0].file, p[0].line)
+register_func_for_analysis(fn)
+save_stats()
+""")
+
+    # Add direct checking for specific sleepy function if provided
+    if sleepy_func:
+        f.write(f"""
+// Direct check: Does target function call sleepy function directly?
+@direct_sleepy_call@
+position p;
+@@
+{target_func}(...) {{
+  <...
+  {sleepy_func}@p(...);
+  ...>
+}}
+
+@script:python depends on direct_sleepy_call@
+p << direct_sleepy_call.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+register_sleep_point(target_func, sleepy_func, p[0].file, p[0].line, "directly calls target sleep function")
+save_stats()
+""")
+    
+    # Generate rules for checking nested function calls
+    for depth in range(2, max_depth + 1):  # Start from 2 as level 1 is the direct call we already checked
+        # Find functions called by functions at the previous level
+        f.write(f"""
+// Level {depth} - Find functions called by level {depth-1} functions
+@find_calls_l{depth}@
+identifier fn1;
+identifier fn2;
+position p;
+@@
+fn1(...) {{
+  <...
+  fn2@p(...);
+  ...>
+}}
+
+@script:python depends on find_calls_l{depth}@
+fn1 << find_calls_l{depth}.fn1;
+fn2 << find_calls_l{depth}.fn2;
+p << find_calls_l{depth}.p;
+@@
+# register_call() bumps total_calls_checked for each new call site,
+# so no extra increment is needed here.
+# Only analyze calls from functions we're tracking
+if fn1 in seen_funcs and fn1 != fn2:  # Avoid self-recursion
+    register_call(fn1, fn2, p[0].file, p[0].line)
+    register_func_for_analysis(fn2)
+    save_stats()
+""")
+
+        if sleepy_func:
+            # If looking for a specific sleepy function, check at this level
+            f.write(f"""
+// Level {depth} - Find calls to sleepy function
+@sleepy_call_l{depth}@
+identifier fn;
+position p;
+@@
+fn(...) {{
+  <...
+  {sleepy_func}@p(...);
+  ...>
+}}
+
+@script:python depends on sleepy_call_l{depth}@
+fn << sleepy_call_l{depth}.fn;
+p << sleepy_call_l{depth}.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+# Only report if we're tracking this function
+if fn in seen_funcs:
+    register_sleep_point(fn, sleepy_func, p[0].file, p[0].line, f"level {depth} call to target sleep function")
+    save_stats()
+""")
+        else:
+            # If doing general sleep checking, check for known sleepy functions at this level
+            f.write(f"""
+// Level {depth} - Check for known sleepy functions
+@known_sleepers_l{depth}@
+identifier fn;
+position p;
+@@
+fn(...) {{
+  <...
+  (""")
+            # Add all known sleepy functions to the pattern
+            for i, sleepy_func_name in enumerate(known_sleepy_functions):
+                if i > 0:
+                    f.write(f"""
+  |""")
+                f.write(f"""
+  {sleepy_func_name}@p(...)""")
+            
+            f.write(f"""
+  )
+  ...>
+}}
+
+@script:python depends on known_sleepers_l{depth}@
+fn << known_sleepers_l{depth}.fn;
+p << known_sleepers_l{depth}.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+# Only report if we're tracking this function
+if fn in seen_funcs:
+    sleep_func = p[0].current_element
+    register_sleep_point(fn, sleep_func, p[0].file, p[0].line, f"level {depth} call to known sleeping function")
+    save_stats()
+""")
+
+    # Only add the other sleep detection rules if we're not constraining to a specific function
+    if not sleepy_func:
+        # Check for GFP_KERNEL and other sleepy allocation flags
+        f.write(f"""
+// Check for sleepy memory allocation flags
+@check_sleepy_alloc@
+position p;
+identifier fn;
+@@
+fn(...) {{
+  <...
+  (""")
+        # Add patterns for all sleepy GFP flags
+        for i, flag in enumerate(sleepy_gfp_flags):
+            if i > 0:
+                f.write(f"""
+  |""")
+            f.write(f"""
+  {flag}@p""")
+        f.write(f"""
+  )
+  ...>
+}}
+
+@script:python depends on check_sleepy_alloc@
+fn << check_sleepy_alloc.fn;
+p << check_sleepy_alloc.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+# Only report if we're tracking this function
+if fn in seen_funcs:
+    flag = p[0].current_element
+    register_sleep_point(fn, flag, p[0].file, p[0].line, "uses allocation flag that may sleep")
+    save_stats()
+""")
+
+        # Check for mutex locks
+        f.write(f"""
+// Check for mutex locks
+@check_mutex@
+position p;
+identifier fn;
+@@
+fn(...) {{
+  <...
+  (
+  mutex_lock@p(...)
+  |
+  mutex_lock_interruptible@p(...)
+  |
+  mutex_lock_killable@p(...)
+  |
+  down@p(...)
+  |
+  down_interruptible@p(...)
+  |
+  down_killable@p(...)
+  |
+  down_read@p(...)
+  |
+  down_write@p(...)
+  |
+  wait_for_completion@p(...)
+  |
+  wait_for_completion_interruptible@p(...)
+  |
+  wait_for_completion_killable@p(...)
+  |
+  wait_event@p(...)
+  |
+  wait_event_interruptible@p(...)
+  |
+  wait_event_killable@p(...)
+  )
+  ...>
+}}
+
+@script:python depends on check_mutex@
+fn << check_mutex.fn;
+p << check_mutex.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+# Only report if we're tracking this function
+if fn in seen_funcs:
+    lock_func = p[0].current_element
+    register_sleep_point(fn, lock_func, p[0].file, p[0].line, "uses mutex or completion that may sleep")
+    save_stats()
+""")
+
+        # Check for might_sleep calls
+        f.write(f"""
+// Check for explicit might_sleep calls
+@check_might_sleep@
+position p;
+identifier fn;
+@@
+fn(...) {{
+  <...
+  (
+  might_sleep@p(...)
+  |
+  might_sleep_if@p(...)
+  |
+  sched_might_sleep@p(...)
+  )
+  ...>
+}}
+
+@script:python depends on check_might_sleep@
+fn << check_might_sleep.fn;
+p << check_might_sleep.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+# Only report if we're tracking this function
+if fn in seen_funcs:
+    sleep_func = p[0].current_element
+    register_sleep_point(fn, sleep_func, p[0].file, p[0].line, "contains explicit might_sleep() call")
+    save_stats()
+""")
+
+        # Check for functions with names suggesting they might sleep
+        f.write(f"""
+// Check for functions with sleep-suggesting names
+@check_sleep_names@
+position p;
+identifier fn;
+identifier sleep_fn =~ "(_sleep|_timeout|_wait|_block|_sync|_lock|create_|alloc_|_kmalloc|_mutex)";
+@@
+fn(...) {{
+  <...
+  sleep_fn@p(...)
+  ...>
+}}
+
+@script:python depends on check_sleep_names@
+fn << check_sleep_names.fn;
+sleep_fn << check_sleep_names.sleep_fn;
+p << check_sleep_names.p;
+@@
+global total_sleep_routines_checked
+total_sleep_routines_checked += 1
+# Only report if we're tracking this function
+if fn in seen_funcs:
+    # Filter out known safe functions
+    if not (sleep_fn.startswith("spin_") or 
+            sleep_fn.startswith("rcu_") or 
+            sleep_fn.startswith("atomic_") or
+            sleep_fn.startswith("local_")):
+        register_sleep_point(fn, sleep_fn, p[0].file, p[0].line, "calls function with name suggesting it might sleep")
+        save_stats()
+""")
+
+    # Add a finalization rule that summarizes the findings
+    f.write(f"""
+@finalize:python@
+@@
+# Save any final stats before finishing
+save_stats()
+
+# Now collect all stats from all parallel processes
+def collect_and_merge_stats():
+    import glob
+    import json
+    import os
+    
+    # Aggregate stats from all files
+    merged_stats = dict()
+    merged_stats["total_funcs_analyzed"] = 0
+    merged_stats["total_calls_checked"] = 0
+    merged_stats["total_sleep_routines_checked"] = 0
+    merged_stats["sleep_checks_performed"] = 0
+    merged_stats["seen_funcs"] = set()
+    merged_stats["seen_calls"] = set()
+    merged_stats["seen_sleep_points"] = set()
+    merged_stats["sleep_point_details"] = []
+    
+    # Get all stats files
+    stats_files = glob.glob(os.path.join(stats_dir, "stats_*.json"))
+    
+    for file_path in stats_files:
+        try:
+            with open(file_path, "r") as f:
+                stats = json.load(f)
+                
+                # Merge counters: each parallel process writes its own stats
+                # file, so summing the per-file totals gives the grand total
+                merged_stats["total_funcs_analyzed"] += stats.get("total_funcs_analyzed", 0)
+                merged_stats["total_calls_checked"] += stats.get("total_calls_checked", 0)
+                merged_stats["total_sleep_routines_checked"] += stats.get("total_sleep_routines_checked", 0)
+                merged_stats["sleep_checks_performed"] += stats.get("sleep_checks_performed", 0)
+                
+                # Merge sleep point details (avoiding duplicates)
+                for detail in stats.get("sleep_point_details", []):
+                    # Create a unique key for this sleep point
+                    key = f"{{detail['caller']}}_{{detail['sleep_func']}}_{{detail['file']}}_{{detail['line']}}"
+                    if key not in merged_stats["seen_sleep_points"]:
+                        merged_stats["seen_sleep_points"].add(key)
+                        merged_stats["sleep_point_details"].append(detail)
+        except Exception as e:
+            print(f"Error processing stats file {{file_path}}: {{e}}")
+    
+    return merged_stats
+
+# Collect all the stats from all processes
+merged_stats = collect_and_merge_stats()
+total_sleep_points = len(merged_stats["seen_sleep_points"])
+
+# Print statistics
+print(f"\\n📊 STATISTICS:")
+print(f"   - Functions analyzed: {{merged_stats['total_funcs_analyzed']}}")
+print(f"   - Function calls checked: {{merged_stats['total_calls_checked']}}")
+print(f"   - Sleep checks performed: {{merged_stats['sleep_checks_performed']}}")
+print(f"   - Sleep points found: {{total_sleep_points}}")
+
+if total_sleep_points > 0:
+    print(f"\\n📊 SUMMARY: Found {{total_sleep_points}} potential sleep points across {{merged_stats['total_funcs_analyzed']}} analyzed functions")
+    if sleepy_func:
+        print(f"⚠️  Function '{{target_func}}' calls '{{sleepy_func}}' through a call chain!")
+    else:
+        print(f"⚠️  The function '{{target_func}}' might sleep when called from atomic contexts!")
+    
+    # Sort sleep points by call path for better readability in the summary
+    sorted_sleep_points = sorted(merged_stats["sleep_point_details"], key=lambda x: x["path"])
+    
+    # Print unique sleep paths
+    print("\\n📋 UNIQUE SLEEP PATHS:")
+    for idx, detail in enumerate(sorted_sleep_points, 1):
+        path = detail["path"]
+        print(f"   {{idx}}. {{path}}")
+else:
+    # Finding nothing is surprising when the target was expected to sleep
+    symbol = "⚠️" if expected_to_sleep else "✅"
+    if sleepy_func:
+        print(f"\\n\\n{{symbol}} No call path from '{{target_func}}' to '{{sleepy_func}}' was found.")
+    else:
+        print(f"\\n\\n{{symbol}} No potential sleep points were found in '{{target_func}}'.")
+    
+print("\\nNOTE: This analysis is conservative and may produce false positives.")
+print("      Always manually verify the findings.")
+
+# Clean up the temporary stats directory
+import shutil
+try:
+    shutil.rmtree(stats_dir)
+except Exception as e:
+    print(f"Note: Could not clean up stats directory: {{e}}")
+""")
+
+msg = f"✅ Generated {outfile} to check if '{target_func}' might sleep"
+if sleepy_func:
+    msg = f"✅ Generated {outfile} to check if '{target_func}' calls '{sleepy_func}'"
+print(msg)
+print(f"Run with: make coccicheck MODE=report COCCI={outfile} J={get_nprocs()}")
-- 
2.47.2


Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-03  2:01 [PATCH kdevops] scripts/coccinelle/generation: add example generation script Luis Chamberlain
2025-04-03  6:50 ` [cocci] " Markus Elfring
2025-04-06 14:00 ` Markus Elfring
2025-04-06 14:07   ` Julia Lawall
2025-04-06 15:00     ` Markus Elfring
