[PULL 26/27] scripts/decodetree: Implement named field support

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>
Subject: [PULL 26/27] scripts/decodetree: Implement named field support
Date: Tue, 30 May 2023 11:59:48 -0700	[thread overview]
Message-ID: <20230530185949.410208-27-richard.henderson@linaro.org> (raw)
In-Reply-To: <20230530185949.410208-1-richard.henderson@linaro.org>

From: Peter Maydell <peter.maydell@linaro.org>

Implement support for named fields, i.e.  where one field is defined
in terms of another, rather than directly in terms of bits extracted
from the instruction.

The new method referenced_fields() on all the Field classes returns a
list of fields that this field references.  This just passes through,
except for the new NamedField class.

We can then use referenced_fields() to:
 * construct a list of 'dangling references' for a format or
   pattern, which is the fields that the format/pattern uses but
   doesn't define itself
 * do a topological sort, so that we output "field = value"
   assignments in an order that means that we assign a field before
   we reference it in a subsequent assignment
 * check when we output the code for a pattern whether we need to
   fill in the format fields before or after the pattern fields, and
   do other error checking

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230523120447.728365-6-peter.maydell@linaro.org>
---
 scripts/decodetree.py | 145 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 139 insertions(+), 6 deletions(-)

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
index db019a25c6..13db585d04 100644
--- a/scripts/decodetree.py
+++ b/scripts/decodetree.py
@@ -290,6 +290,9 @@ def str_extract(self, lvalue_formatter):
         s = 's' if self.sign else ''
         return f'{s}extract{bitop_width}(insn, {self.pos}, {self.len})'
 
+    def referenced_fields(self):
+        return []
+
     def __eq__(self, other):
         return self.sign == other.sign and self.mask == other.mask
 
@@ -321,6 +324,12 @@ def str_extract(self, lvalue_formatter):
             pos += f.len
         return ret
 
+    def referenced_fields(self):
+        l = []
+        for f in self.subs:
+            l.extend(f.referenced_fields())
+        return l
+
     def __ne__(self, other):
         if len(self.subs) != len(other.subs):
             return True
@@ -347,6 +356,9 @@ def __str__(self):
     def str_extract(self, lvalue_formatter):
         return str(self.value)
 
+    def referenced_fields(self):
+        return []
+
     def __cmp__(self, other):
         return self.value - other.value
 # end ConstField
@@ -367,6 +379,9 @@ def str_extract(self, lvalue_formatter):
         return (self.func + '(ctx, '
                 + self.base.str_extract(lvalue_formatter) + ')')
 
+    def referenced_fields(self):
+        return self.base.referenced_fields()
+
     def __eq__(self, other):
         return self.func == other.func and self.base == other.base
 
@@ -388,6 +403,9 @@ def __str__(self):
     def str_extract(self, lvalue_formatter):
         return self.func + '(ctx)'
 
+    def referenced_fields(self):
+        return []
+
     def __eq__(self, other):
         return self.func == other.func
 
@@ -395,6 +413,32 @@ def __ne__(self, other):
         return not self.__eq__(other)
 # end ParameterField
 
+class NamedField:
+    """Class representing a field already named in the pattern"""
+    def __init__(self, name, sign, len):
+        self.mask = 0
+        self.sign = sign
+        self.len = len
+        self.name = name
+
+    def __str__(self):
+        return self.name
+
+    def str_extract(self, lvalue_formatter):
+        global bitop_width
+        s = 's' if self.sign else ''
+        lvalue = lvalue_formatter(self.name)
+        return f'{s}extract{bitop_width}({lvalue}, 0, {self.len})'
+
+    def referenced_fields(self):
+        return [self.name]
+
+    def __eq__(self, other):
+        return self.name == other.name
+
+    def __ne__(self, other):
+        return not self.__eq__(other)
+# end NamedField
 
 class Arguments:
     """Class representing the extracted fields of a format"""
@@ -418,7 +462,6 @@ def output_def(self):
             output('} ', self.struct_name(), ';\n\n')
 # end Arguments
 
-
 class General:
     """Common code between instruction formats and instruction patterns"""
     def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds, w):
@@ -432,6 +475,7 @@ def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds, w):
         self.fieldmask = fldm
         self.fields = flds
         self.width = w
+        self.dangling = None
 
     def __str__(self):
         return self.name + ' ' + str_match_bits(self.fixedbits, self.fixedmask)
@@ -439,10 +483,51 @@ def __str__(self):
     def str1(self, i):
         return str_indent(i) + self.__str__()
 
+    def dangling_references(self):
+        # Return a list of all named references which aren't satisfied
+        # directly by this format/pattern. This will be either:
+        #  * a format referring to a field which is specified by the
+        #    pattern(s) using it
+        #  * a pattern referring to a field which is specified by the
+        #    format it uses
+        #  * a user error (referring to a field that doesn't exist at all)
+        if self.dangling is None:
+            # Compute this once and cache the answer
+            dangling = []
+            for n, f in self.fields.items():
+                for r in f.referenced_fields():
+                    if r not in self.fields:
+                        dangling.append(r)
+            self.dangling = dangling
+        return self.dangling
+
     def output_fields(self, indent, lvalue_formatter):
+        # We use a topological sort to ensure that any use of NamedField
+        # comes after the initialization of the field it is referencing.
+        graph = {}
         for n, f in self.fields.items():
-            output(indent, lvalue_formatter(n), ' = ',
-                   f.str_extract(lvalue_formatter), ';\n')
+            refs = f.referenced_fields()
+            graph[n] = refs
+
+        try:
+            ts = TopologicalSorter(graph)
+            for n in ts.static_order():
+                # We only want to emit assignments for the keys
+                # in our fields list, not for anything that ends up
+                # in the tsort graph only because it was referenced as
+                # a NamedField.
+                try:
+                    f = self.fields[n]
+                    output(indent, lvalue_formatter(n), ' = ',
+                           f.str_extract(lvalue_formatter), ';\n')
+                except KeyError:
+                    pass
+        except CycleError as e:
+            # The second element of args is a list of nodes which form
+            # a cycle (there might be others too, but only one is reported).
+            # Pretty-print it to tell the user.
+            cycle = ' => '.join(e.args[1])
+            error(self.lineno, 'field definitions form a cycle: ' + cycle)
 # end General
 
 
@@ -477,10 +562,36 @@ def output_code(self, i, extracted, outerbits, outermask):
         ind = str_indent(i)
         arg = self.base.base.name
         output(ind, '/* ', self.file, ':', str(self.lineno), ' */\n')
+        # We might have named references in the format that refer to fields
+        # in the pattern, or named references in the pattern that refer
+        # to fields in the format. This affects whether we extract the fields
+        # for the format before or after the ones for the pattern.
+        # For simplicity we don't allow cross references in both directions.
+        # This is also where we catch the syntax error of referring to
+        # a nonexistent field.
+        fmt_refs = self.base.dangling_references()
+        for r in fmt_refs:
+            if r not in self.fields:
+                error(self.lineno, f'format refers to undefined field {r}')
+        pat_refs = self.dangling_references()
+        for r in pat_refs:
+            if r not in self.base.fields:
+                error(self.lineno, f'pattern refers to undefined field {r}')
+        if pat_refs and fmt_refs:
+            error(self.lineno, ('pattern that uses fields defined in format '
+                                'cannot use format that uses fields defined '
+                                'in pattern'))
+        if fmt_refs:
+            # pattern fields first
+            self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
+            assert not extracted, "dangling fmt refs but it was already extracted"
         if not extracted:
             output(ind, self.base.extract_name(),
                    '(ctx, &u.f_', arg, ', insn);\n')
-        self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
+        if not fmt_refs:
+            # pattern fields last
+            self.output_fields(ind, lambda n: 'u.f_' + arg + '.' + n)
+
         output(ind, 'if (', translate_prefix, '_', self.name,
                '(ctx, &u.f_', arg, ')) return true;\n')
 
@@ -626,8 +737,10 @@ def output_code(self, i, extracted, outerbits, outermask):
         ind = str_indent(i)
 
         # If we identified all nodes below have the same format,
-        # extract the fields now.
-        if not extracted and self.base:
+        # extract the fields now. But don't do it if the format relies
+        # on named fields from the insn pattern, as those won't have
+        # been initialised at this point.
+        if not extracted and self.base and not self.base.dangling_references():
             output(ind, self.base.extract_name(),
                    '(ctx, &u.f_', self.base.base.name, ', insn);\n')
             extracted = True
@@ -749,6 +862,7 @@ def parse_field(lineno, name, toks):
     """Parse one instruction field from TOKS at LINENO"""
     global fields
     global insnwidth
+    global re_C_ident
 
     # A "simple" field will have only one entry;
     # a "multifield" will have several.
@@ -763,6 +877,25 @@ def parse_field(lineno, name, toks):
             func = func[1]
             continue
 
+        if re.fullmatch(re_C_ident + ':s[0-9]+', t):
+            # Signed named field
+            subtoks = t.split(':')
+            n = subtoks[0]
+            le = int(subtoks[1])
+            f = NamedField(n, True, le)
+            subs.append(f)
+            width += le
+            continue
+        if re.fullmatch(re_C_ident + ':[0-9]+', t):
+            # Unsigned named field
+            subtoks = t.split(':')
+            n = subtoks[0]
+            le = int(subtoks[1])
+            f = NamedField(n, False, le)
+            subs.append(f)
+            width += le
+            continue
+
         if re.fullmatch('[0-9]+:s[0-9]+', t):
             # Signed field extract
             subtoks = t.split(':s')
-- 
2.34.1

next prev parent reply	other threads:[~2023-05-30 19:06 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-30 18:59 [PULL 00/27] tcg patch queue Richard Henderson
2023-05-30 18:59 ` [PULL 01/27] tcg: Fix register move type in tcg_out_ld_helper_ret Richard Henderson
2023-05-30 18:59 ` [PULL 02/27] accel/tcg: Fix check for page writeability in load_atomic16_or_exit Richard Henderson
2023-05-30 18:59 ` [PULL 03/27] meson: Split test for __int128_t type from __int128_t arithmetic Richard Henderson
2023-05-30 18:59 ` [PULL 04/27] qemu/atomic128: Add x86_64 atomic128-ldst.h Richard Henderson
2023-05-30 18:59 ` [PULL 05/27] tcg/i386: Support 128-bit load/store Richard Henderson
2023-05-30 18:59 ` [PULL 06/27] tcg/aarch64: Rename temporaries Richard Henderson
2023-05-30 18:59 ` [PULL 07/27] tcg/aarch64: Reserve TCG_REG_TMP1, TCG_REG_TMP2 Richard Henderson
2023-05-30 18:59 ` [PULL 08/27] tcg/aarch64: Simplify constraints on qemu_ld/st Richard Henderson
2023-05-30 18:59 ` [PULL 09/27] tcg/aarch64: Support 128-bit load/store Richard Henderson
2023-05-30 18:59 ` [PULL 10/27] tcg/ppc: " Richard Henderson
2023-05-30 18:59 ` [PULL 11/27] tcg/s390x: " Richard Henderson
2023-07-10  8:58   ` TCG broken on s390x hosts (was: [PULL 11/27] tcg/s390x: Support 128-bit load/store) Thomas Huth
2023-07-10  9:31     ` Richard Henderson
2023-07-10 11:10       ` Thomas Huth
2023-05-30 18:59 ` [PULL 12/27] accel/tcg: Extract load_atom_extract_al16_or_al8 to host header Richard Henderson
2023-05-30 18:59 ` [PULL 13/27] accel/tcg: Extract store_atom_insert_al16 " Richard Henderson
2023-05-30 18:59 ` [PULL 14/27] accel/tcg: Add x86_64 load_atom_extract_al16_or_al8 Richard Henderson
2023-05-30 18:59 ` [PULL 15/27] accel/tcg: Add aarch64 lse2 load_atom_extract_al16_or_al8 Richard Henderson
2023-05-30 18:59 ` [PULL 16/27] accel/tcg: Add aarch64 store_atom_insert_al16 Richard Henderson
2023-05-30 18:59 ` [PULL 17/27] tcg: Remove TCG_TARGET_TLB_DISPLACEMENT_BITS Richard Henderson
2023-05-30 18:59 ` [PULL 18/27] decodetree: Add --test-for-error Richard Henderson
2023-05-30 18:59 ` [PULL 19/27] decodetree: Fix recursion in prop_format and build_tree Richard Henderson
2023-05-30 18:59 ` [PULL 20/27] decodetree: Diagnose empty pattern group Richard Henderson
2023-05-30 18:59 ` [PULL 21/27] decodetree: Do not remove output_file from /dev Richard Henderson
2023-05-30 18:59 ` [PULL 22/27] tests/decode: Convert tests to meson Richard Henderson
2023-05-30 18:59 ` [PULL 23/27] docs: Document decodetree named field syntax Richard Henderson
2023-05-30 18:59 ` [PULL 24/27] scripts/decodetree: Pass lvalue-formatter function to str_extract() Richard Henderson
2023-05-30 18:59 ` [PULL 25/27] scripts/decodetree: Implement a topological sort Richard Henderson
2023-05-30 18:59 ` Richard Henderson [this message]
2023-05-30 18:59 ` [PULL 27/27] tests/decode: Add tests for various named-field cases Richard Henderson
2023-05-31  1:08 ` [PULL 00/27] tcg patch queue Richard Henderson
2023-05-31 16:12   ` Thomas Huth
     [not found]     ` <227e27e0-4035-8e17-2259-3098340f716e@linaro.org>
2023-05-31 22:20       ` Richard Henderson
2023-06-01 10:52     ` Mark Cave-Ayland

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:db019a25c dfblob:13db585d0 )
 OR (
bs:"[PULL 26/27] scripts/decodetree: Implement named field support" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230530185949.410208-27-richard.henderson@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).