public inbox for linux-trace-kernel@vger.kernel.org
From: Wander Lairson Costa <wander@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>,
	Gabriele Monaco <gmonaco@redhat.com>,
	Nam Cao <namcao@linutronix.de>,
	Wander Lairson Costa <wander@redhat.com>,
	linux-kernel@vger.kernel.org (open list),
	linux-trace-kernel@vger.kernel.org (open list:RUNTIME
	VERIFICATION (RV))
Subject: [PATCH 20/26] rv/rvgen: refactor automata.py to use iterator-based parsing
Date: Mon, 19 Jan 2026 17:45:56 -0300
Message-ID: <20260119205601.105821-21-wander@redhat.com>
In-Reply-To: <20260119205601.105821-1-wander@redhat.com>

Refactor the DOT file parsing logic in automata.py to use Python's
iterator-based patterns instead of manual cursor indexing. The previous
implementation relied on while loops with explicit cursor management,
which made the code prone to off-by-one errors and would crash on
malformed input files containing empty lines.

The new implementation uses enumerate and itertools.islice to iterate
over lines, eliminating manual cursor tracking. Functions that search
for specific markers now use for loops with early returns and explicit
AutomataError exceptions for missing markers, rather than assuming the
markers exist. Additional bounds checking ensures that split line
arrays have sufficient elements before accessing specific indices,
preventing IndexError exceptions on malformed DOT files.
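
The marker search described above can be sketched in isolation as
follows. This is only an illustration, not the code from the patch:
the function name find_first_node_line and the sample input are
hypothetical, and AutomataError is redefined locally as a stand-in so
the snippet runs on its own.

```python
class AutomataError(OSError):
    """Local stand-in for the exception class used by rvgen."""


def find_first_node_line(dot_lines: list[str]) -> int:
    """Return the index of the first line whose first token is "{node"."""
    for cursor, line in enumerate(dot_lines):
        tokens = line.split()
        # A blank line splits to [], so check for tokens before indexing.
        if tokens and tokens[0] == "{node":
            return cursor
    # The loop ended without an early return: the marker is missing.
    raise AutomataError("Could not find a beginning state")


# Hypothetical input with a blank line before the first node.
sample = [
    "digraph state_automaton {\n",
    "\n",
    '\t{node [shape = circle] "enabled"};\n',
]
print(find_first_node_line(sample))
```

With a while loop and a manual cursor, the blank line would raise
IndexError on split()[0], and a missing marker would run past the end
of the list.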

The matrix creation and event variable extraction methods now use
functional patterns with map combined with itertools.islice,
making the intent clearer while maintaining the same behavior. Minor
improvements include using extend instead of append in a loop, adding
empty file validation, and replacing enumerate with range where the
enumerated value was unused.
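
The map plus islice pattern can be shown with a small standalone
sketch. The function name collect_events and the sample lines are
hypothetical; the label handling mirrors what the patch does for
transition lines, but this is not the method from automata.py.

```python
from itertools import islice


def collect_events(dot_lines: list[str], start: int) -> list[str]:
    """Collect sorted, unique event names from transition lines at `start`."""
    events: list[str] = []
    # islice skips to `start` without copying; map strips leading whitespace.
    for line in map(str.lstrip, islice(dot_lines, start, None)):
        if not line.startswith('"'):
            break  # first non-transition line ends the block
        tokens = line.split()
        if len(tokens) > 1 and tokens[1] == "->":
            # The label is the second-to-last token; a literal "\n"
            # inside it separates multiple event names.
            label = tokens[-2].replace('"', "").replace("\\n", " ")
            events.extend(label.split())
    return sorted(set(events))


# Hypothetical input: two transitions, one with a two-event label.
sample = [
    "digraph state_automaton {\n",
    '\t"a" -> "b" [ label = "ev_x" ];\n',
    '\t"b" -> "a" [ label = "ev_y\\nev_x" ];\n',
    "}\n",
]
print(collect_events(sample, 1))
```

Because islice yields nothing when start is at or past the end of the
list, the loop body is simply skipped instead of raising IndexError.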

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
---
 tools/verification/rvgen/rvgen/automata.py | 109 +++++++++++++--------
 1 file changed, 67 insertions(+), 42 deletions(-)

diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
index 083d0f5cfb773..a6889d0c26c3f 100644
--- a/tools/verification/rvgen/rvgen/automata.py
+++ b/tools/verification/rvgen/rvgen/automata.py
@@ -9,6 +9,7 @@
 #   Documentation/trace/rv/deterministic_automata.rst
 
 import ntpath
+from itertools import islice
 
 
 class AutomataError(OSError):
@@ -53,37 +54,54 @@ class Automata:
         return model_name
 
     def __open_dot(self):
-        cursor = 0
         dot_lines = []
         try:
             with open(self.__dot_path) as dot_file:
-                dot_lines = dot_file.read().splitlines()
+                dot_lines = dot_file.readlines()
         except OSError as exc:
             raise AutomataError(f"Cannot open the file: {self.__dot_path}") from exc
 
+        if not dot_lines:
+            raise AutomataError(f"{self.__dot_path} is empty")
+
         # checking the first line:
-        line = dot_lines[cursor].split()
+        line = dot_lines[0].split()
 
-        if (line[0] != "digraph") or (line[1] != "state_automaton"):
+        if len(line) < 2 or line[0] != "digraph" or line[1] != "state_automaton":
             raise AutomataError(f"Not a valid .dot format: {self.__dot_path}")
-        else:
-            cursor += 1
+
         return dot_lines
 
     def __get_cursor_begin_states(self):
-        cursor = 0
-        while self.__dot_lines[cursor].split()[0] != "{node":
-            cursor += 1
-        return cursor
+        for cursor, line in enumerate(self.__dot_lines):
+            split_line = line.split()
+
+            if split_line and split_line[0] == "{node":
+                return cursor
+
+        raise AutomataError("Could not find a beginning state")
 
     def __get_cursor_begin_events(self):
-        cursor = 0
-        while self.__dot_lines[cursor].split()[0] != "{node":
-            cursor += 1
-        while self.__dot_lines[cursor].split()[0] == "{node":
-            cursor += 1
-        # skip initial state transition
-        cursor += 1
+        state = 0
+        cursor = 0  # make pyright happy
+
+        for cursor, line in enumerate(self.__dot_lines):
+            line = line.split()
+            if not line:
+                continue
+
+            if state == 0:
+                if line[0] == "{node":
+                    state = 1
+            elif line[0] != "{node":
+                break
+        else:
+            raise AutomataError("Could not find beginning event")
+
+        cursor += 1  # skip initial state transition
+        if cursor == len(self.__dot_lines):
+            raise AutomataError("Dot file ended after event beginning")
+
         return cursor
 
     def __get_state_variables(self):
@@ -96,9 +114,12 @@ class Automata:
         cursor = self.__get_cursor_begin_states()
 
         # process nodes
-        while self.__dot_lines[cursor].split()[0] == "{node":
-            line = self.__dot_lines[cursor].split()
-            raw_state = line[-1]
+        for line in islice(self.__dot_lines, cursor, None):
+            split_line = line.split()
+            if not split_line or split_line[0] != "{node":
+                break
+
+            raw_state = split_line[-1]
 
             #  "enabled_fired"}; -> enabled_fired
             state = raw_state.replace('"', "").replace("};", "").replace(",", "_")
@@ -106,16 +127,14 @@ class Automata:
                 initial_state = state[len(self.init_marker) :]
             else:
                 states.append(state)
-                if "doublecircle" in self.__dot_lines[cursor]:
+                if "doublecircle" in line:
                     final_states.append(state)
                     has_final_states = True
 
-                if "ellipse" in self.__dot_lines[cursor]:
+                if "ellipse" in line:
                     final_states.append(state)
                     has_final_states = True
 
-            cursor += 1
-
         if initial_state is None:
             raise AutomataError("The automaton doesn't have a initial state")
 
@@ -130,26 +149,27 @@ class Automata:
         return states, initial_state, final_states
 
     def __get_event_variables(self):
+        events: list[str] = []
         # here we are at the begin of transitions, take a note, we will return later.
         cursor = self.__get_cursor_begin_events()
 
-        events = []
-        while self.__dot_lines[cursor].lstrip()[0] == '"':
+        for line in map(str.lstrip, islice(self.__dot_lines, cursor, None)):
+            if not line.startswith('"'):
+                break
+
             # transitions have the format:
             # "all_fired" -> "both_fired" [ label = "disable_irq" ];
             #  ------------ event is here ------------^^^^^
-            if self.__dot_lines[cursor].split()[1] == "->":
-                line = self.__dot_lines[cursor].split()
-                event = line[-2].replace('"', "")
+            split_line = line.split()
+            if len(split_line) > 1 and split_line[1] == "->":
+                event = split_line[-2].replace('"', "")
 
                 # when a transition has more than one labels, they are like this
                 # "local_irq_enable\nhw_local_irq_enable_n"
                 # so split them.
 
                 event = event.replace("\\n", " ")
-                for i in event.split():
-                    events.append(i)
-            cursor += 1
+                events.extend(event.split())
 
         return sorted(set(events))
 
@@ -171,32 +191,37 @@ class Automata:
 
         # declare the matrix....
         matrix = [
-            [self.invalid_state_str for x in range(nr_event)] for y in range(nr_state)
+            [self.invalid_state_str for _ in range(nr_event)] for _ in range(nr_state)
         ]
 
         # and we are back! Let's fill the matrix
         cursor = self.__get_cursor_begin_events()
 
-        while self.__dot_lines[cursor].lstrip()[0] == '"':
-            if self.__dot_lines[cursor].split()[1] == "->":
-                line = self.__dot_lines[cursor].split()
-                origin_state = line[0].replace('"', "").replace(",", "_")
-                dest_state = line[2].replace('"', "").replace(",", "_")
-                possible_events = line[-2].replace('"', "").replace("\\n", " ")
+        for line in map(str.lstrip,
+                        islice(self.__dot_lines, cursor, None)):
+
+            if not line or line[0] != '"':
+                break
+
+            split_line = line.split()
+
+            if len(split_line) > 2 and split_line[1] == "->":
+                origin_state = split_line[0].replace('"', "").replace(",", "_")
+                dest_state = split_line[2].replace('"', "").replace(",", "_")
+                possible_events = split_line[-2].replace('"', "").replace("\\n", " ")
                 for event in possible_events.split():
                     matrix[states_dict[origin_state]][events_dict[event]] = dest_state
-            cursor += 1
 
         return matrix
 
     def __store_init_events(self):
         events_start = [False] * len(self.events)
         events_start_run = [False] * len(self.events)
-        for i, _ in enumerate(self.events):
+        for i in range(len(self.events)):
             curr_event_will_init = 0
             curr_event_from_init = False
             curr_event_used = 0
-            for j, _ in enumerate(self.states):
+            for j in range(len(self.states)):
                 if self.function[j][i] != self.invalid_state_str:
                     curr_event_used += 1
                 if self.function[j][i] == self.initial_state:
-- 
2.52.0

