qemu-devel.nongnu.org archive mirror
* [PATCH 0/3] Generate target/arm/cpu-sysregs.h.inc from AARCHMRS Registers.json
@ 2025-12-08 16:37 Eric Auger
  2025-12-08 16:37 ` [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py Eric Auger
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Eric Auger @ 2025-12-08 16:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

Introduce a Python script that generates ID register definitions
from the Registers.json file included in "AARCHMRS containing the
JSON files for Arm A-profile (2025-09)". The script generates the
content of target/arm/cpu-sysregs.h.inc.
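
As a rough illustration of the generated format, the sketch below turns
one decoded encoding into a DEF() line. The sample dictionary is made up
for this example; it only assumes, as the script in patch 1 does, that
each field carries a quoted binary string under 'value'. Unlike the
script, this sketch returns -1 for wildcard patterns instead of relying
on a skip list.

```python
FIELDS = ("op0", "op1", "CRn", "CRm", "op2")


def decode_field(encodings, field):
    """Return the integer value of a quoted binary string, or -1."""
    entry = encodings.get(field)
    value = entry.get("value") if isinstance(entry, dict) else None
    if not isinstance(value, str):
        return -1
    try:
        return int(value.strip("'"), 2)
    except ValueError:
        # Wildcard patterns such as '000x' are not plain binary numbers.
        return -1


def format_def(name, encodings):
    """Format one register as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)."""
    values = [decode_field(encodings, f) for f in FIELDS]
    return "DEF({}, {})".format(name, ", ".join(map(str, values)))


# Hypothetical input, shaped after the fields the script reads.
SAMPLE_ENCODINGS = {
    "op0": {"value": "'11'"},
    "op1": {"value": "'000'"},
    "CRn": {"value": "'0000'"},
    "CRm": {"value": "'0100'"},
    "op2": {"value": "'000'"},
}

print(format_def("ID_AA64PFR0_EL1", SAMPLE_ENCODINGS))
```

The printed line matches the existing ID_AA64PFR0_EL1 entry in
target/arm/cpu-sysregs.h.inc.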

Since [PATCH v8 00/14] arm: rework id register storage
(https://lore.kernel.org/all/20250617153931.1330449-1-cohuck@redhat.com/),
ID regs have been stored generically in an array. Automatic generation
makes it easy to extend the list of ID regs stored in that array.

Registers.json can be downloaded from the Arm Developer A-Profile
Architecture Exploration Tools page:
https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
(see the "Open Source 2025-09" item).

This series is a rework of:
[PATCH v8 12/14] arm/cpu: Add sysreg generation scripts
using a Python script instead of bash/awk and a different
input: Registers.json instead of the Linux sysreg file.

Soon we will offer end users the ability to overwrite some of these
ID registers through the KVM API.

Eric Auger (3):
  scripts: introduce scripts/update-aarch64-sysreg-code.py
  target/arm/cpu-sysregs.h.inc: Sort by name alphabetical order
  target/arm/cpu-sysregs.h.inc: Update with automatic generation

 scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
 target/arm/cpu-sysregs.h.inc          |  56 +++++++----
 2 files changed, 168 insertions(+), 21 deletions(-)
 create mode 100755 scripts/update-aarch64-sysreg-code.py

-- 
2.52.0




* [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py
  2025-12-08 16:37 [PATCH 0/3] Generate target/arm/cpu-sysregs.h.inc from AARCHMRS Registers.json Eric Auger
@ 2025-12-08 16:37 ` Eric Auger
  2025-12-09 11:12   ` Cornelia Huck
  2025-12-09 12:30   ` Philippe Mathieu-Daudé
  2025-12-08 16:37 ` [PATCH 2/3] target/arm/cpu-sysregs.h.inc: Sort by name alphabetical order Eric Auger
  2025-12-08 16:37 ` [PATCH 3/3] target/arm/cpu-sysregs.h.inc: Update with automatic generation Eric Auger
  2 siblings, 2 replies; 10+ messages in thread
From: Eric Auger @ 2025-12-08 16:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

Introduce a script that takes as input the Registers.json file
delivered in the AARCHMRS Features Model downloadable from the
Arm Developer A-Profile Architecture Exploration Tools page:
https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>).

We only care about IDregs with opcodes satisfying:
op0 = 3, op1 within [0, 3], crn = 0, crm within [0, 7], op2 within [0, 7]
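
The scope check can be sketched as a small predicate. This is only an
illustration: the helper name is hypothetical, and it mirrors the
condition used in the script below, which accepts op1 values 0, 1 and 3
only and treats -1 as "field could not be decoded".

```python
def in_idreg_scope(op0, op1, crn, crm, op2):
    """Return True if the encoding falls in the ID register space."""
    return (op0 == 3 and op1 in (0, 1, 3) and crn == 0
            and 0 <= crm <= 7 and 0 <= op2 <= 7)


# Encodings taken from target/arm/cpu-sysregs.h.inc
print(in_idreg_scope(3, 0, 0, 4, 0))    # ID_AA64PFR0_EL1
print(in_idreg_scope(3, 3, 0, 0, 1))    # CTR_EL0
print(in_idreg_scope(3, 2, 0, 0, 0))    # op1 == 2 is rejected
```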

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

This was tested with https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0.tar.gz

Discussion about undesired generated regs can be found in
https://lore.kernel.org/all/CAFEAcA9OXi4v+hdBMamQv85HYp2EqxOA5=nfsdZ5E3nf8RP_pw@mail.gmail.com/
---
 scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
 1 file changed, 133 insertions(+)
 create mode 100755 scripts/update-aarch64-sysreg-code.py

diff --git a/scripts/update-aarch64-sysreg-code.py b/scripts/update-aarch64-sysreg-code.py
new file mode 100755
index 0000000000..c7b31035d1
--- /dev/null
+++ b/scripts/update-aarch64-sysreg-code.py
@@ -0,0 +1,133 @@
+#!/usr/bin/env python3
+
+# This script takes as input the Registers.json file delivered in
+# the AARCHMRS Features Model downloadable from the Arm Developer
+# A-Profile Architecture Exploration Tools page:
+# https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
+# and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
+# under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
+#
+# Copyright (C) 2025 Red Hat, Inc.
+#
+# Authors: Eric Auger <eric.auger@redhat.com>
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+
+import json
+import os
+import sys
+
+# returns the int value of a given @opcode for a reg @encoding
+def get_opcode(encoding, opcode):
+    fvalue = encoding.get(opcode)
+    if fvalue:
+        value = fvalue.get('value')
+        if isinstance(value, str):
+            value = value.strip("'")
+            value = int(value,2)
+            return value
+    return -1
+
+def extract_idregs_from_registers_json(filename):
+    """
+    Load a Registers.json file and extract all ID registers, decode their
+    opcode and dump the information in target/arm/cpu-sysregs.h.inc
+
+    Args:
+        filename (str): The path to the Registers.json
+    returns:
+        idregs: list of ID regs and their encoding
+    """
+    if not os.path.exists(filename):
+        print(f"Error: {filename} could not be found!")
+        return {}
+
+    try:
+        with open(filename, 'r') as f:
+            register_data = json.load(f)
+
+    except json.JSONDecodeError:
+        print(f"Could not decode json from '{filename}'!")
+        return {}
+    except Exception as e:
+        print(f"Unexpected error while reading {filename}: {e}")
+        return {}
+
+    registers = [r for r in register_data if isinstance(r, dict) and \
+                r.get('_type') == 'Register']
+
+    idregs = {}
+
+    # Some regs have op code values like 000x, 001x. Anyway we don't need
+    # them. Besides some regs are undesired in the generated file such as
+    # CCSIDR_EL1 and CCSIDR2_EL1 which are arrays of regs. Also exclude
+    # VMPIDR_EL2 and VPIDR_EL2 which are outside of the IDreg scope we
+    # are interested in and are tricky to decode as their system accessor
+    # refer to MPIDR_EL1/MIDR_EL1 respectively
+
+    skiplist = ['ALLINT', 'PM', 'S1_', 'S3_', 'SVCR', \
+                'CCSIDR_EL1', 'CCSIDR2_EL1', 'VMPIDR_EL2', 'VPIDR_EL2']
+
+    for register in registers:
+        reg_name = register.get('name')
+
+        is_skipped = any(term in (reg_name or "").upper() for term in skiplist)
+
+        if reg_name and not is_skipped:
+            accessors = register.get('accessors', [])
+
+            for accessor in accessors:
+                type = accessor.get('_type')
+                if type in ['Accessors.SystemAccessor']:
+                    encoding_list = accessor.get('encoding')
+
+                    if isinstance(encoding_list, list) and encoding_list and \
+                       isinstance(encoding_list[0], dict):
+                        encoding_wrapper = encoding_list[0]
+                        encoding_source = encoding_wrapper.get('encodings', \
+                                                               encoding_wrapper)
+
+                        if isinstance(encoding_source, dict):
+                                op0 = get_opcode(encoding_source, 'op0')
+                                op1 = get_opcode(encoding_source, 'op1')
+                                op2 = get_opcode(encoding_source, 'op2')
+                                crn = get_opcode(encoding_source, 'CRn')
+                                crm = get_opcode(encoding_source, 'CRm')
+                                encoding_str=f"{op0} {op1} {crn} {crm} {op2}"
+
+                # ID regs are assumed within this scope
+                if op0 == 3 and (op1 == 0 or op1 == 1 or op1 == 3) and \
+                   crn == 0 and (crm >= 0 and crm <= 7) and (op2 >= 0 and op2 <= 7):
+                    idregs[reg_name] = encoding_str
+
+    return idregs
+
+if __name__ == "__main__":
+    # Single arg expectedr: the path to the Registers.json file
+    if len(sys.argv) < 2:
+        print("Usage: scripts/update-aarch64-sysreg-code.py <path_to_registers_json>")
+        sys.exit(1)
+    else:
+        json_file_path = sys.argv[1]
+
+    extracted_registers = extract_idregs_from_registers_json(json_file_path)
+
+    if extracted_registers:
+        output_list = extracted_registers.items()
+
+        # Sort by register name
+        sorted_output = sorted(output_list, key=lambda item: item[0])
+
+        # format lines as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
+        final_output = ""
+        for reg_name, encoding in sorted_output:
+            reformatted_encoding = encoding.replace(" ", ", ")
+            final_output += f"DEF({reg_name}, {reformatted_encoding})\n"
+
+        with open("target/arm/cpu-sysregs.h.inc", 'w') as f:
+            f.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
+            f.write("/* This file is autogenerated by ")
+            f.write("scripts/update-aarch64-sysreg-code.py */\n\n")
+            f.write(final_output)
+        print(f"updated target/arm/cpu-sysregs.h.inc")
-- 
2.52.0




* [PATCH 2/3] target/arm/cpu-sysregs.h.inc: Sort by name alphabetical order
  2025-12-08 16:37 [PATCH 0/3] Generate target/arm/cpu-sysregs.h.inc from AARCHMRS Registers.json Eric Auger
  2025-12-08 16:37 ` [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py Eric Auger
@ 2025-12-08 16:37 ` Eric Auger
  2025-12-08 16:37 ` [PATCH 3/3] target/arm/cpu-sysregs.h.inc: Update with automatic generation Eric Auger
  2 siblings, 0 replies; 10+ messages in thread
From: Eric Auger @ 2025-12-08 16:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

target/arm/cpu-sysregs.h.inc: Sort by name alphabetical order

Sort entries alphabetically by register name. This makes it easy
to diff against the future, automatically generated content.

No functional change intended.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 target/arm/cpu-sysregs.h.inc | 40 ++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/target/arm/cpu-sysregs.h.inc b/target/arm/cpu-sysregs.h.inc
index 2bb2861c62..3c892c4f30 100644
--- a/target/arm/cpu-sysregs.h.inc
+++ b/target/arm/cpu-sysregs.h.inc
@@ -1,12 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-DEF(ID_AA64PFR0_EL1, 3, 0, 0, 4, 0)
-DEF(ID_AA64PFR1_EL1, 3, 0, 0, 4, 1)
-DEF(ID_AA64PFR2_EL1, 3, 0, 0, 4, 2)
-DEF(ID_AA64SMFR0_EL1, 3, 0, 0, 4, 5)
-DEF(ID_AA64DFR0_EL1, 3, 0, 0, 5, 0)
-DEF(ID_AA64DFR1_EL1, 3, 0, 0, 5, 1)
+DEF(CLIDR_EL1, 3, 1, 0, 0, 1)
+DEF(CTR_EL0, 3, 3, 0, 0, 1)
 DEF(ID_AA64AFR0_EL1, 3, 0, 0, 5, 4)
 DEF(ID_AA64AFR1_EL1, 3, 0, 0, 5, 5)
+DEF(ID_AA64DFR0_EL1, 3, 0, 0, 5, 0)
+DEF(ID_AA64DFR1_EL1, 3, 0, 0, 5, 1)
 DEF(ID_AA64ISAR0_EL1, 3, 0, 0, 6, 0)
 DEF(ID_AA64ISAR1_EL1, 3, 0, 0, 6, 1)
 DEF(ID_AA64ISAR2_EL1, 3, 0, 0, 6, 2)
@@ -14,28 +12,30 @@ DEF(ID_AA64MMFR0_EL1, 3, 0, 0, 7, 0)
 DEF(ID_AA64MMFR1_EL1, 3, 0, 0, 7, 1)
 DEF(ID_AA64MMFR2_EL1, 3, 0, 0, 7, 2)
 DEF(ID_AA64MMFR3_EL1, 3, 0, 0, 7, 3)
-DEF(ID_PFR0_EL1, 3, 0, 0, 1, 0)
-DEF(ID_PFR1_EL1, 3, 0, 0, 1, 1)
-DEF(ID_DFR0_EL1, 3, 0, 0, 1, 2)
+DEF(ID_AA64PFR0_EL1, 3, 0, 0, 4, 0)
+DEF(ID_AA64PFR1_EL1, 3, 0, 0, 4, 1)
+DEF(ID_AA64PFR2_EL1, 3, 0, 0, 4, 2)
+DEF(ID_AA64SMFR0_EL1, 3, 0, 0, 4, 5)
+DEF(ID_AA64ZFR0_EL1, 3, 0, 0, 4, 4)
 DEF(ID_AFR0_EL1, 3, 0, 0, 1, 3)
-DEF(ID_MMFR0_EL1, 3, 0, 0, 1, 4)
-DEF(ID_MMFR1_EL1, 3, 0, 0, 1, 5)
-DEF(ID_MMFR2_EL1, 3, 0, 0, 1, 6)
-DEF(ID_MMFR3_EL1, 3, 0, 0, 1, 7)
+DEF(ID_DFR0_EL1, 3, 0, 0, 1, 2)
+DEF(ID_DFR1_EL1, 3, 0, 0, 3, 5)
 DEF(ID_ISAR0_EL1, 3, 0, 0, 2, 0)
 DEF(ID_ISAR1_EL1, 3, 0, 0, 2, 1)
 DEF(ID_ISAR2_EL1, 3, 0, 0, 2, 2)
 DEF(ID_ISAR3_EL1, 3, 0, 0, 2, 3)
 DEF(ID_ISAR4_EL1, 3, 0, 0, 2, 4)
 DEF(ID_ISAR5_EL1, 3, 0, 0, 2, 5)
-DEF(ID_MMFR4_EL1, 3, 0, 0, 2, 6)
 DEF(ID_ISAR6_EL1, 3, 0, 0, 2, 7)
+DEF(ID_MMFR0_EL1, 3, 0, 0, 1, 4)
+DEF(ID_MMFR1_EL1, 3, 0, 0, 1, 5)
+DEF(ID_MMFR2_EL1, 3, 0, 0, 1, 6)
+DEF(ID_MMFR3_EL1, 3, 0, 0, 1, 7)
+DEF(ID_MMFR4_EL1, 3, 0, 0, 2, 6)
+DEF(ID_MMFR5_EL1, 3, 0, 0, 3, 6)
+DEF(ID_PFR0_EL1, 3, 0, 0, 1, 0)
+DEF(ID_PFR1_EL1, 3, 0, 0, 1, 1)
+DEF(ID_PFR2_EL1, 3, 0, 0, 3, 4)
 DEF(MVFR0_EL1, 3, 0, 0, 3, 0)
 DEF(MVFR1_EL1, 3, 0, 0, 3, 1)
 DEF(MVFR2_EL1, 3, 0, 0, 3, 2)
-DEF(ID_PFR2_EL1, 3, 0, 0, 3, 4)
-DEF(ID_DFR1_EL1, 3, 0, 0, 3, 5)
-DEF(ID_MMFR5_EL1, 3, 0, 0, 3, 6)
-DEF(CLIDR_EL1, 3, 1, 0, 0, 1)
-DEF(ID_AA64ZFR0_EL1, 3, 0, 0, 4, 4)
-DEF(CTR_EL0, 3, 3, 0, 0, 1)
-- 
2.52.0




* [PATCH 3/3] target/arm/cpu-sysregs.h.inc: Update with automatic generation
  2025-12-08 16:37 [PATCH 0/3] Generate target/arm/cpu-sysregs.h.inc from AARCHMRS Registers.json Eric Auger
  2025-12-08 16:37 ` [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py Eric Auger
  2025-12-08 16:37 ` [PATCH 2/3] target/arm/cpu-sysregs.h.inc: Sort by name alphabetical order Eric Auger
@ 2025-12-08 16:37 ` Eric Auger
  2025-12-09 16:33   ` Cornelia Huck
  2 siblings, 1 reply; 10+ messages in thread
From: Eric Auger @ 2025-12-08 16:37 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

Generated definitions with scripts/update-aarch64-sysreg-code.py
based on "AARCHMRS containing the JSON files for Arm A-profile
architecture (2025-09)" Registers.json file.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 target/arm/cpu-sysregs.h.inc | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu-sysregs.h.inc b/target/arm/cpu-sysregs.h.inc
index 3c892c4f30..9bb27297b5 100644
--- a/target/arm/cpu-sysregs.h.inc
+++ b/target/arm/cpu-sysregs.h.inc
@@ -1,17 +1,27 @@
-/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* SPDX-License-Identifier: BSD-3-Clause */
+
+/* This file is autogenerated by scripts/update-aarch64-sysreg-code.py */
+
+DEF(AIDR_EL1, 3, 1, 0, 0, 7)
 DEF(CLIDR_EL1, 3, 1, 0, 0, 1)
 DEF(CTR_EL0, 3, 3, 0, 0, 1)
+DEF(DCZID_EL0, 3, 3, 0, 0, 7)
+DEF(GMID_EL1, 3, 1, 0, 0, 4)
 DEF(ID_AA64AFR0_EL1, 3, 0, 0, 5, 4)
 DEF(ID_AA64AFR1_EL1, 3, 0, 0, 5, 5)
 DEF(ID_AA64DFR0_EL1, 3, 0, 0, 5, 0)
 DEF(ID_AA64DFR1_EL1, 3, 0, 0, 5, 1)
+DEF(ID_AA64DFR2_EL1, 3, 0, 0, 5, 2)
+DEF(ID_AA64FPFR0_EL1, 3, 0, 0, 4, 7)
 DEF(ID_AA64ISAR0_EL1, 3, 0, 0, 6, 0)
 DEF(ID_AA64ISAR1_EL1, 3, 0, 0, 6, 1)
 DEF(ID_AA64ISAR2_EL1, 3, 0, 0, 6, 2)
+DEF(ID_AA64ISAR3_EL1, 3, 0, 0, 6, 3)
 DEF(ID_AA64MMFR0_EL1, 3, 0, 0, 7, 0)
 DEF(ID_AA64MMFR1_EL1, 3, 0, 0, 7, 1)
 DEF(ID_AA64MMFR2_EL1, 3, 0, 0, 7, 2)
 DEF(ID_AA64MMFR3_EL1, 3, 0, 0, 7, 3)
+DEF(ID_AA64MMFR4_EL1, 3, 0, 0, 7, 4)
 DEF(ID_AA64PFR0_EL1, 3, 0, 0, 4, 0)
 DEF(ID_AA64PFR1_EL1, 3, 0, 0, 4, 1)
 DEF(ID_AA64PFR2_EL1, 3, 0, 0, 4, 2)
@@ -36,6 +46,10 @@ DEF(ID_MMFR5_EL1, 3, 0, 0, 3, 6)
 DEF(ID_PFR0_EL1, 3, 0, 0, 1, 0)
 DEF(ID_PFR1_EL1, 3, 0, 0, 1, 1)
 DEF(ID_PFR2_EL1, 3, 0, 0, 3, 4)
+DEF(MIDR_EL1, 3, 0, 0, 0, 0)
+DEF(MPIDR_EL1, 3, 0, 0, 0, 5)
 DEF(MVFR0_EL1, 3, 0, 0, 3, 0)
 DEF(MVFR1_EL1, 3, 0, 0, 3, 1)
 DEF(MVFR2_EL1, 3, 0, 0, 3, 2)
+DEF(REVIDR_EL1, 3, 0, 0, 0, 6)
+DEF(SMIDR_EL1, 3, 1, 0, 0, 6)
-- 
2.52.0




* Re: [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py
  2025-12-08 16:37 ` [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py Eric Auger
@ 2025-12-09 11:12   ` Cornelia Huck
  2025-12-09 12:30   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 10+ messages in thread
From: Cornelia Huck @ 2025-12-09 11:12 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, eric.auger, qemu-devel, qemu-arm,
	peter.maydell, richard.henderson, sebott
  Cc: maz

On Mon, Dec 08 2025, Eric Auger <eric.auger@redhat.com> wrote:

> Introduce a script that takes as input the Registers.json file
> delivered in the AARCHMRS Features Model downloadable from the
> Arm Developer A-Profile Architecture Exploration Tools page:
> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
> and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
> under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>).
>
> We only care about IDregs with opcodes satisfying:
> op0 = 3, op1 within [0, 3], crn = 0, crm within [0, 7], op2 within [0, 7]
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> This was tested with https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0.tar.gz
>
> Discussion about undesired generated regs can be found in
> https://lore.kernel.org/all/CAFEAcA9OXi4v+hdBMamQv85HYp2EqxOA5=nfsdZ5E3nf8RP_pw@mail.gmail.com/
> ---
>  scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
>  1 file changed, 133 insertions(+)
>  create mode 100755 scripts/update-aarch64-sysreg-code.py
>
> diff --git a/scripts/update-aarch64-sysreg-code.py b/scripts/update-aarch64-sysreg-code.py
> new file mode 100755
> index 0000000000..c7b31035d1
> --- /dev/null
> +++ b/scripts/update-aarch64-sysreg-code.py

(...)

> +if __name__ == "__main__":
> +    # Single arg expectedr: the path to the Registers.json file
> +    if len(sys.argv) < 2:
> +        print("Usage: scripts/update-aarch64-sysreg-code.py <path_to_registers_json>")
> +        sys.exit(1)
> +    else:
> +        json_file_path = sys.argv[1]
> +
> +    extracted_registers = extract_idregs_from_registers_json(json_file_path)
> +
> +    if extracted_registers:
> +        output_list = extracted_registers.items()
> +
> +        # Sort by register name
> +        sorted_output = sorted(output_list, key=lambda item: item[0])
> +
> +        # format lines as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
> +        final_output = ""
> +        for reg_name, encoding in sorted_output:
> +            reformatted_encoding = encoding.replace(" ", ", ")
> +            final_output += f"DEF({reg_name}, {reformatted_encoding})\n"
> +
> +        with open("target/arm/cpu-sysregs.h.inc", 'w') as f:
> +            f.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
> +            f.write("/* This file is autogenerated by ")
> +            f.write("scripts/update-aarch64-sysreg-code.py */\n\n")

I'm wondering if there is an easy way to log the version of the json
file this has been generated against. If not, putting the information
into the commit message when updating is probably sufficient.
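
One possible shape for this, as a minimal sketch: let the caller pass a
free-form label (for instance the release name of the tarball) and
record it in the generated header. The write_header() helper and its
source_label parameter are hypothetical and not part of the patch;
nothing here assumes that Registers.json itself carries a version field.

```python
import io


def write_header(out, source_label=None):
    """Write the generated-file header, optionally naming the input."""
    out.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
    out.write("/* This file is autogenerated by ")
    out.write("scripts/update-aarch64-sysreg-code.py */\n")
    if source_label:
        # Hypothetical extra line recording what the file was built from.
        out.write(f"/* Generated from: {source_label} */\n")
    out.write("\n")


buf = io.StringIO()
write_header(buf, "AARCHMRS A-profile 2025-09")
print(buf.getvalue(), end="")
```

Without a label the header is byte-for-byte what the patch writes today,
so existing output would not change.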

> +            f.write(final_output)
> +        print(f"updated target/arm/cpu-sysregs.h.inc")




* Re: [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py
  2025-12-08 16:37 ` [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py Eric Auger
  2025-12-09 11:12   ` Cornelia Huck
@ 2025-12-09 12:30   ` Philippe Mathieu-Daudé
  2025-12-09 12:34     ` Philippe Mathieu-Daudé
  2025-12-09 13:40     ` Eric Auger
  1 sibling, 2 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-12-09 12:30 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

On 8/12/25 17:37, Eric Auger wrote:
> Introduce a script that takes as input the Registers.json file
> delivered in the AARCHMRS Features Model downloadable from the
> Arm Developer A-Profile Architecture Exploration Tools page:
> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
> and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
> under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>).

Great idea!

> 
> We only care about IDregs with opcodes satisfying:
> op0 = 3, op1 within [0, 3], crn = 0, crm within [0, 7], op2 within [0, 7]
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> 
> This was tested with https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0.tar.gz
> 
> Discussion about undesired generated regs can be found in
> https://lore.kernel.org/all/CAFEAcA9OXi4v+hdBMamQv85HYp2EqxOA5=nfsdZ5E3nf8RP_pw@mail.gmail.com/
> ---
>   scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
>   1 file changed, 133 insertions(+)
>   create mode 100755 scripts/update-aarch64-sysreg-code.py
> 
> diff --git a/scripts/update-aarch64-sysreg-code.py b/scripts/update-aarch64-sysreg-code.py
> new file mode 100755
> index 0000000000..c7b31035d1
> --- /dev/null
> +++ b/scripts/update-aarch64-sysreg-code.py
> @@ -0,0 +1,133 @@
> +#!/usr/bin/env python3
> +
> +# This script takes as input the Registers.json file delivered in
> +# the AARCHMRS Features Model downloadable from the Arm Developer
> +# A-Profile Architecture Exploration Tools page:
> +# https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
> +# and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
> +# under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
> +#
> +# Copyright (C) 2025 Red Hat, Inc.
> +#
> +# Authors: Eric Auger <eric.auger@redhat.com>
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +
> +import json
> +import os
> +import sys
> +

[*]

> +# returns the int value of a given @opcode for a reg @encoding
> +def get_opcode(encoding, opcode):
> +    fvalue = encoding.get(opcode)
> +    if fvalue:
> +        value = fvalue.get('value')
> +        if isinstance(value, str):
> +            value = value.strip("'")
> +            value = int(value,2)
> +            return value
> +    return -1
> +
> +def extract_idregs_from_registers_json(filename):
> +    """
> +    Load a Registers.json file and extract all ID registers, decode their
> +    opcode and dump the information in target/arm/cpu-sysregs.h.inc
> +
> +    Args:
> +        filename (str): The path to the Registers.json
> +    returns:
> +        idregs: list of ID regs and their encoding
> +    """
> +    if not os.path.exists(filename):
> +        print(f"Error: {filename} could not be found!")
> +        return {}
> +
> +    try:
> +        with open(filename, 'r') as f:
> +            register_data = json.load(f)
> +
> +    except json.JSONDecodeError:
> +        print(f"Could not decode json from '{filename}'!")
> +        return {}
> +    except Exception as e:
> +        print(f"Unexpected error while reading {filename}: {e}")
> +        return {}
> +
> +    registers = [r for r in register_data if isinstance(r, dict) and \
> +                r.get('_type') == 'Register']
> +
> +    idregs = {}
> +
> +    # Some regs have op code values like 000x, 001x. Anyway we don't need
> +    # them. Besides some regs are undesired in the generated file such as
> +    # CCSIDR_EL1 and CCSIDR2_EL1 which are arrays of regs. Also exclude
> +    # VMPIDR_EL2 and VPIDR_EL2 which are outside of the IDreg scope we
> +    # are interested in and are tricky to decode as their system accessor
> +    # refer to MPIDR_EL1/MIDR_EL1 respectively
> +
> +    skiplist = ['ALLINT', 'PM', 'S1_', 'S3_', 'SVCR', \
> +                'CCSIDR_EL1', 'CCSIDR2_EL1', 'VMPIDR_EL2', 'VPIDR_EL2']

Since we might have to update this array, I'd move it (and the big
comment preceding it) to [*].
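
A minimal sketch of what that could look like, assuming a module-level
constant and a small helper (both names are hypothetical; the entries
are the ones from the patch):

```python
# Substrings of register names to leave out of the generated file.
# Some regs have wildcard opcode values like 000x, CCSIDR_EL1 and
# CCSIDR2_EL1 are arrays of regs, and VMPIDR_EL2/VPIDR_EL2 are outside
# the ID reg scope.
SKIPLIST = ('ALLINT', 'PM', 'S1_', 'S3_', 'SVCR',
            'CCSIDR_EL1', 'CCSIDR2_EL1', 'VMPIDR_EL2', 'VPIDR_EL2')


def is_skipped(reg_name):
    """Return True if any skip-list term occurs in the register name."""
    name = (reg_name or "").upper()
    return any(term in name for term in SKIPLIST)


print(is_skipped("CCSIDR_EL1"))       # listed explicitly
print(is_skipped("ID_AA64PFR0_EL1"))  # not matched by any term
```

Note that the match is by substring, so a short term such as 'PM' also
excludes every register whose name merely contains it.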

> +
> +    for register in registers:
> +        reg_name = register.get('name')
> +
> +        is_skipped = any(term in (reg_name or "").upper() for term in skiplist)
> +
> +        if reg_name and not is_skipped:
> +            accessors = register.get('accessors', [])
> +
> +            for accessor in accessors:
> +                type = accessor.get('_type')
> +                if type in ['Accessors.SystemAccessor']:
> +                    encoding_list = accessor.get('encoding')
> +
> +                    if isinstance(encoding_list, list) and encoding_list and \
> +                       isinstance(encoding_list[0], dict):
> +                        encoding_wrapper = encoding_list[0]
> +                        encoding_source = encoding_wrapper.get('encodings', \
> +                                                               encoding_wrapper)
> +
> +                        if isinstance(encoding_source, dict):
> +                                op0 = get_opcode(encoding_source, 'op0')
> +                                op1 = get_opcode(encoding_source, 'op1')
> +                                op2 = get_opcode(encoding_source, 'op2')
> +                                crn = get_opcode(encoding_source, 'CRn')
> +                                crm = get_opcode(encoding_source, 'CRm')
> +                                encoding_str=f"{op0} {op1} {crn} {crm} {op2}"
> +
> +                # ID regs are assumed within this scope
> +                if op0 == 3 and (op1 == 0 or op1 == 1 or op1 == 3) and \
> +                   crn == 0 and (crm >= 0 and crm <= 7) and (op2 >= 0 and op2 <= 7):
> +                    idregs[reg_name] = encoding_str
> +
> +    return idregs
> +
> +if __name__ == "__main__":
> +    # Single arg expectedr: the path to the Registers.json file

Typo "expectedr".

> +    if len(sys.argv) < 2:
> +        print("Usage: scripts/update-aarch64-sysreg-code.py <path_to_registers_json>")
> +        sys.exit(1)
> +    else:
> +        json_file_path = sys.argv[1]
> +
> +    extracted_registers = extract_idregs_from_registers_json(json_file_path)
> +
> +    if extracted_registers:
> +        output_list = extracted_registers.items()
> +
> +        # Sort by register name
> +        sorted_output = sorted(output_list, key=lambda item: item[0])
> +
> +        # format lines as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
> +        final_output = ""
> +        for reg_name, encoding in sorted_output:
> +            reformatted_encoding = encoding.replace(" ", ", ")
> +            final_output += f"DEF({reg_name}, {reformatted_encoding})\n"
> +
> +        with open("target/arm/cpu-sysregs.h.inc", 'w') as f:
> +            f.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
> +            f.write("/* This file is autogenerated by ")
> +            f.write("scripts/update-aarch64-sysreg-code.py */\n\n")
> +            f.write(final_output)
> +        print(f"updated target/arm/cpu-sysregs.h.inc")

Fixed string (no formatting), so no need for the f- prefix.

Patch LGTM but it should have some unit test.
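
For instance, a small unittest case for get_opcode() could be a starting
point. This is only a sketch: get_opcode() is copied here from the patch
so the example is self-contained, and the test class and its layout are
assumptions, not an existing QEMU test.

```python
import unittest


def get_opcode(encoding, opcode):
    """Copy of the helper from the patch, for a self-contained example."""
    fvalue = encoding.get(opcode)
    if fvalue:
        value = fvalue.get('value')
        if isinstance(value, str):
            return int(value.strip("'"), 2)
    return -1


class GetOpcodeTest(unittest.TestCase):
    def test_binary_string_is_decoded(self):
        self.assertEqual(get_opcode({'op0': {'value': "'11'"}}, 'op0'), 3)

    def test_missing_field_returns_minus_one(self):
        self.assertEqual(get_opcode({}, 'op0'), -1)

    def test_non_string_value_returns_minus_one(self):
        self.assertEqual(get_opcode({'op0': {'value': None}}, 'op0'), -1)

    def test_wildcard_pattern_raises(self):
        # Patterns like '000x' are not valid base-2 numbers.
        with self.assertRaises(ValueError):
            get_opcode({'CRm': {'value': "'000x'"}}, 'CRm')
```

Saved as a test module, it can be run with "python3 -m unittest".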



* Re: [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py
  2025-12-09 12:30   ` Philippe Mathieu-Daudé
@ 2025-12-09 12:34     ` Philippe Mathieu-Daudé
  2025-12-09 13:40     ` Eric Auger
  1 sibling, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-12-09 12:34 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

On 9/12/25 13:30, Philippe Mathieu-Daudé wrote:
> On 8/12/25 17:37, Eric Auger wrote:
>> Introduce a script that takes as input the Registers.json file
>> delivered in the AARCHMRS Features Model downloadable from the
>> Arm Developer A-Profile Architecture Exploration Tools page:
>> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
>> and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
>> under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>).
> 
> Great idea!
> 
>>
>> We only care about IDregs with opcodes satisfying:
>> op0 = 3, op1 within [0, 3], crn = 0, crm within [0, 7], op2 within [0, 7]
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> This was tested with https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0.tar.gz
>>
>> Discussion about undesired generated regs can be found in
>> https://lore.kernel.org/all/CAFEAcA9OXi4v+hdBMamQv85HYp2EqxOA5=nfsdZ5E3nf8RP_pw@mail.gmail.com/
>> ---
>>   scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
>>   1 file changed, 133 insertions(+)
>>   create mode 100755 scripts/update-aarch64-sysreg-code.py
>>
>> diff --git a/scripts/update-aarch64-sysreg-code.py b/scripts/update-aarch64-sysreg-code.py
>> new file mode 100755
>> index 0000000000..c7b31035d1
>> --- /dev/null
>> +++ b/scripts/update-aarch64-sysreg-code.py
>> @@ -0,0 +1,133 @@
>> +#!/usr/bin/env python3
>> +
>> +# This script takes as input the Registers.json file delivered in
>> +# the AARCHMRS Features Model downloadable from the Arm Developer
>> +# A-Profile Architecture Exploration Tools page:
>> +# https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
>> +# and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
>> +# under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
>> +#
>> +# Copyright (C) 2025 Red Hat, Inc.
>> +#
>> +# Authors: Eric Auger <eric.auger@redhat.com>
>> +#
>> +# SPDX-License-Identifier: GPL-2.0-or-later
>> +
>> +
>> +import json
>> +import os
>> +import sys
>> +
> 
> [*]
> 
>> +# returns the int value of a given @opcode for a reg @encoding
>> +def get_opcode(encoding, opcode):
>> +    fvalue = encoding.get(opcode)
>> +    if fvalue:
>> +        value = fvalue.get('value')
>> +        if isinstance(value, str):
>> +            value = value.strip("'")
>> +            value = int(value,2)
>> +            return value
>> +    return -1
>> +
>> +def extract_idregs_from_registers_json(filename):
>> +    """
>> +    Load a Registers.json file and extract all ID registers, decode their
>> +    opcode and dump the information in target/arm/cpu-sysregs.h.inc
>> +
>> +    Args:
>> +        filename (str): The path to the Registers.json
>> +    returns:
>> +        idregs: list of ID regs and their encoding
>> +    """
>> +    if not os.path.exists(filename):
>> +        print(f"Error: {filename} could not be found!")
>> +        return {}
>> +
>> +    try:
>> +        with open(filename, 'r') as f:
>> +            register_data = json.load(f)
>> +
>> +    except json.JSONDecodeError:
>> +        print(f"Could not decode json from '{filename}'!")
>> +        return {}
>> +    except Exception as e:
>> +        print(f"Unexpected error while reading {filename}: {e}")
>> +        return {}
>> +
>> +    registers = [r for r in register_data if isinstance(r, dict) and \
>> +                r.get('_type') == 'Register']
>> +
>> +    idregs = {}
>> +
>> +    # Some regs have op code values like 000x, 001x. Anyway we don't need
>> +    # them. Besides some regs are undesired in the generated file such as
>> +    # CCSIDR_EL1 and CCSIDR2_EL1 which are arrays of regs. Also exclude
>> +    # VMPIDR_EL2 and VPIDR_EL2 which are outside of the IDreg scope we
>> +    # are interested in and are tricky to decode as their system accessor
>> +    # refer to MPIDR_EL1/MIDR_EL1 respectively
>> +
>> +    skiplist = ['ALLINT', 'PM', 'S1_', 'S3_', 'SVCR', \
>> +                'CCSIDR_EL1', 'CCSIDR2_EL1', 'VMPIDR_EL2', 'VPIDR_EL2']
> 
> Since we might have to update this array, I'd move it (and the big
> comment preceding) in [*].
> 
>> +
>> +    for register in registers:
>> +        reg_name = register.get('name')
>> +
>> +        is_skipped = any(term in (reg_name or "").upper() for term in skiplist)
>> +
>> +        if reg_name and not is_skipped:
>> +            accessors = register.get('accessors', [])
>> +
>> +            for accessor in accessors:
>> +                type = accessor.get('_type')
>> +                if type in ['Accessors.SystemAccessor']:
>> +                    encoding_list = accessor.get('encoding')
>> +
>> +                    if isinstance(encoding_list, list) and encoding_list and \
>> +                       isinstance(encoding_list[0], dict):
>> +                        encoding_wrapper = encoding_list[0]
>> +                        encoding_source = 
>> encoding_wrapper.get('encodings', \
>> +                                                               
>> encoding_wrapper)
>> +
>> +                        if isinstance(encoding_source, dict):
>> +                                op0 = get_opcode(encoding_source, 'op0')
>> +                                op1 = get_opcode(encoding_source, 'op1')
>> +                                op2 = get_opcode(encoding_source, 'op2')
>> +                                crn = get_opcode(encoding_source, 'CRn')
>> +                                crm = get_opcode(encoding_source, 'CRm')
>> +                                encoding_str=f"{op0} {op1} {crn} {crm} {op2}"
>> +
>> +                # ID regs are assumed within this scope
>> +                if op0 == 3 and (op1 == 0 or op1 == 1 or op1 == 3) and \
>> +                   crn == 0 and (crm >= 0 and crm <= 7) and (op2 >= 0 and op2 <= 7):
>> +                    idregs[reg_name] = encoding_str
>> +
>> +    return idregs
>> +
>> +if __name__ == "__main__":
>> +    # Single arg expectedr: the path to the Registers.json file
> 
> Typo "expectedr".
> 
>> +    if len(sys.argv) < 2:
>> +        print("Usage: scripts/update-aarch64-sysreg-code.py <path_to_registers_json>")
>> +        sys.exit(1)
>> +    else:
>> +        json_file_path = sys.argv[1]
>> +
>> +    extracted_registers = extract_idregs_from_registers_json(json_file_path)
>> +
>> +    if extracted_registers:
>> +        output_list = extracted_registers.items()
>> +
>> +        # Sort by register name
>> +        sorted_output = sorted(output_list, key=lambda item: item[0])
>> +
>> +        # format lines as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
>> +        final_output = ""
>> +        for reg_name, encoding in sorted_output:
>> +            reformatted_encoding = encoding.replace(" ", ", ")
>> +            final_output += f"DEF({reg_name}, {reformatted_encoding})\n"
>> +
>> +        with open("target/arm/cpu-sysregs.h.inc", 'w') as f:
>> +            f.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
>> +            f.write("/* This file is autogenerated by ")
>> +            f.write("scripts/update-aarch64-sysreg-code.py */\n\n")

Maybe worth adding the format in header:

      f.write("/* DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>) */\n")

>> +            f.write(final_output)
>> +        print(f"updated target/arm/cpu-sysregs.h.inc")
> 
> Fixed string (no formatting) so no need for f- prefix.
> 
> Patch LGTM but it should have some unit test.
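
For illustration, a minimal sketch of the output step with the format line
added to the header. write_sysregs_inc() is a hypothetical helper that is not
part of the patch; it writes to any file-like object, so the layout can be
checked without touching target/arm/cpu-sysregs.h.inc. The two sample
registers use encodings that appear in this series.

```python
import io


def write_sysregs_inc(out, idregs):
    """Write the generated header to a file-like object.

    Hypothetical helper for illustration; idregs maps a register name to
    the space-separated "op0 op1 crn crm op2" string built by the script.
    """
    out.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
    out.write("/* This file is autogenerated by ")
    out.write("scripts/update-aarch64-sysreg-code.py */\n\n")
    # Document the line format once, as suggested in the review.
    out.write("/* DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>) */\n")
    # Sort by register name, as the script does.
    for reg_name, encoding in sorted(idregs.items()):
        out.write(f"DEF({reg_name}, {encoding.replace(' ', ', ')})\n")


buf = io.StringIO()
write_sysregs_inc(buf, {"MIDR_EL1": "3 0 0 0 0", "CTR_EL0": "3 3 0 0 1"})
generated = buf.getvalue()
print(generated, end="")
```

Passing an open file instead of the StringIO buffer would give the same
content on disk.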




* Re: [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py
  2025-12-09 12:30   ` Philippe Mathieu-Daudé
  2025-12-09 12:34     ` Philippe Mathieu-Daudé
@ 2025-12-09 13:40     ` Eric Auger
  2025-12-09 13:57       ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 10+ messages in thread
From: Eric Auger @ 2025-12-09 13:40 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, eric.auger.pro, qemu-devel, qemu-arm,
	peter.maydell, richard.henderson, cohuck, sebott
  Cc: maz

Hi Philippe,

On 12/9/25 1:30 PM, Philippe Mathieu-Daudé wrote:
> On 8/12/25 17:37, Eric Auger wrote:
>> Introduce a script that takes as input the Registers.json file
>> delivered in the AARCHMRS Features Model downloadable from the
>> Arm Developer A-Profile Architecture Exploration Tools page:
>> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
>>
>> and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
>> under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>).
>
> Great idea!
>
>>
>> We only care about IDregs with opcodes satisfying:
>> op0 = 3, op1 within [0, 3], crn = 0, crm within [0, 7], op2 within
>> [0, 7]
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> This was tested with
>> https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0.tar.gz
>>
>> Discussion about undesired generated regs can be found in
>> https://lore.kernel.org/all/CAFEAcA9OXi4v+hdBMamQv85HYp2EqxOA5=nfsdZ5E3nf8RP_pw@mail.gmail.com/
>>
>> ---
>>   scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
>>   1 file changed, 133 insertions(+)
>>   create mode 100755 scripts/update-aarch64-sysreg-code.py
>>
>> diff --git a/scripts/update-aarch64-sysreg-code.py b/scripts/update-aarch64-sysreg-code.py
>> new file mode 100755
>> index 0000000000..c7b31035d1
>> --- /dev/null
>> +++ b/scripts/update-aarch64-sysreg-code.py
>> @@ -0,0 +1,133 @@
>> +#!/usr/bin/env python3
>> +
>> +# This script takes as input the Registers.json file delivered in
>> +# the AARCHMRS Features Model downloadable from the Arm Developer
>> +# A-Profile Architecture Exploration Tools page:
>> +# https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
>> +# and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
>> +# under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
>> +#
>> +# Copyright (C) 2025 Red Hat, Inc.
>> +#
>> +# Authors: Eric Auger <eric.auger@redhat.com>
>> +#
>> +# SPDX-License-Identifier: GPL-2.0-or-later
>> +
>> +
>> +import json
>> +import os
>> +import sys
>> +
>
> [*]
>
>> +# returns the int value of a given @opcode for a reg @encoding
>> +def get_opcode(encoding, opcode):
>> +    fvalue = encoding.get(opcode)
>> +    if fvalue:
>> +        value = fvalue.get('value')
>> +        if isinstance(value, str):
>> +            value = value.strip("'")
>> +            value = int(value,2)
>> +            return value
>> +    return -1
>> +
>> +def extract_idregs_from_registers_json(filename):
>> +    """
>> +    Load a Registers.json file and extract all ID registers, decode their
>> +    opcode and dump the information in target/arm/cpu-sysregs.h.inc
>> +
>> +    Args:
>> +        filename (str): The path to the Registers.json
>> +    returns:
>> +        idregs: list of ID regs and their encoding
>> +    """
>> +    if not os.path.exists(filename):
>> +        print(f"Error: {filename} could not be found!")
>> +        return {}
>> +
>> +    try:
>> +        with open(filename, 'r') as f:
>> +            register_data = json.load(f)
>> +
>> +    except json.JSONDecodeError:
>> +        print(f"Could not decode json from '{filename}'!")
>> +        return {}
>> +    except Exception as e:
>> +        print(f"Unexpected error while reading {filename}: {e}")
>> +        return {}
>> +
>> +    registers = [r for r in register_data if isinstance(r, dict) and \
>> +                r.get('_type') == 'Register']
>> +
>> +    idregs = {}
>> +
>> +    # Some regs have op code values like 000x, 001x. Anyway we don't need
>> +    # them. Besides some regs are undesired in the generated file such as
>> +    # CCSIDR_EL1 and CCSIDR2_EL1 which are arrays of regs. Also exclude
>> +    # VMPIDR_EL2 and VPIDR_EL2 which are outside of the IDreg scope we
>> +    # are interested in and are tricky to decode as their system accessor
>> +    # refer to MPIDR_EL1/MIDR_EL1 respectively
>> +
>> +    skiplist = ['ALLINT', 'PM', 'S1_', 'S3_', 'SVCR', \
>> +                'CCSIDR_EL1', 'CCSIDR2_EL1', 'VMPIDR_EL2', 'VPIDR_EL2']
>
> Since we might have to update this array, I'd move it (and the big
> comment preceding) in [*].
>
>> +
>> +    for register in registers:
>> +        reg_name = register.get('name')
>> +
>> +        is_skipped = any(term in (reg_name or "").upper() for term in skiplist)
>> +
>> +        if reg_name and not is_skipped:
>> +            accessors = register.get('accessors', [])
>> +
>> +            for accessor in accessors:
>> +                type = accessor.get('_type')
>> +                if type in ['Accessors.SystemAccessor']:
>> +                    encoding_list = accessor.get('encoding')
>> +
>> +                    if isinstance(encoding_list, list) and encoding_list and \
>> +                       isinstance(encoding_list[0], dict):
>> +                        encoding_wrapper = encoding_list[0]
>> +                        encoding_source = encoding_wrapper.get('encodings', \
>> +                                                               encoding_wrapper)
>> +
>> +                        if isinstance(encoding_source, dict):
>> +                                op0 = get_opcode(encoding_source, 'op0')
>> +                                op1 = get_opcode(encoding_source, 'op1')
>> +                                op2 = get_opcode(encoding_source, 'op2')
>> +                                crn = get_opcode(encoding_source, 'CRn')
>> +                                crm = get_opcode(encoding_source, 'CRm')
>> +                                encoding_str=f"{op0} {op1} {crn} {crm} {op2}"
>> +
>> +                # ID regs are assumed within this scope
>> +                if op0 == 3 and (op1 == 0 or op1 == 1 or op1 == 3) and \
>> +                   crn == 0 and (crm >= 0 and crm <= 7) and (op2 >= 0 and op2 <= 7):
>> +                    idregs[reg_name] = encoding_str
>> +
>> +    return idregs
>> +
>> +if __name__ == "__main__":
>> +    # Single arg expectedr: the path to the Registers.json file
>
> Typo "expectedr".
>
>> +    if len(sys.argv) < 2:
>> +        print("Usage: scripts/update-aarch64-sysreg-code.py <path_to_registers_json>")
>> +        sys.exit(1)
>> +    else:
>> +        json_file_path = sys.argv[1]
>> +
>> +    extracted_registers = extract_idregs_from_registers_json(json_file_path)
>> +
>> +    if extracted_registers:
>> +        output_list = extracted_registers.items()
>> +
>> +        # Sort by register name
>> +        sorted_output = sorted(output_list, key=lambda item: item[0])
>> +
>> +        # format lines as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
>> +        final_output = ""
>> +        for reg_name, encoding in sorted_output:
>> +            reformatted_encoding = encoding.replace(" ", ", ")
>> +            final_output += f"DEF({reg_name}, {reformatted_encoding})\n"
>> +
>> +        with open("target/arm/cpu-sysregs.h.inc", 'w') as f:
>> +            f.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
>> +            f.write("/* This file is autogenerated by ")
>> +            f.write("scripts/update-aarch64-sysreg-code.py */\n\n")
>> +            f.write(final_output)
>> +        print(f"updated target/arm/cpu-sysregs.h.inc")
>
> Fixed string (no formatting) so no need for f- prefix.
>
> Patch LGTM but it should have some unit test. 

Thank you for the review.

Not sure what you mean by unit test? One solution could be to diff the
result with the former bash/awk script I used for generation previously
[1], but I guess it would be awkward to upstream the awk script we did
not want in the first place. Otherwise we could check some random
opcodes, but it wouldn't mean the others are correct. To me the best way
to validate the python script is to do [1] once but not upstream that.
Reviewing the newly generated file against the previous content [3/3] is
the best way to test the result. I mean, using auto generation does not
prevent us from reviewing the generated stuff, especially because the
generation is triggered manually, like scripts/update-linux-headers.sh,
and should not happen very frequently.
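
To make that review step concrete, here is a rough sketch of how the DEF()
entries of the previous and the regenerated file could be compared. The
helper names (parse_defs, compare_defs) are made up for this mail and are not
part of the series; the sample lines are taken from [3/3].

```python
import re

# Matches one generated line: DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
DEF_RE = re.compile(
    r"^DEF\((\w+),\s*(\d+),\s*(\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)$")


def parse_defs(text):
    """Map register name -> (op0, op1, crn, crm, op2) for every DEF() line."""
    defs = {}
    for line in text.splitlines():
        match = DEF_RE.match(line.strip())
        if match:
            defs[match.group(1)] = tuple(int(g) for g in match.groups()[1:])
    return defs


def compare_defs(old_text, new_text):
    """Return sorted lists of (added, removed, changed) register names."""
    old, new = parse_defs(old_text), parse_defs(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return added, removed, changed


old_text = """DEF(CLIDR_EL1, 3, 1, 0, 0, 1)
DEF(CTR_EL0, 3, 3, 0, 0, 1)
"""
new_text = """DEF(AIDR_EL1, 3, 1, 0, 0, 7)
DEF(CLIDR_EL1, 3, 1, 0, 0, 1)
DEF(CTR_EL0, 3, 3, 0, 0, 1)
"""
added, removed, changed = compare_defs(old_text, new_text)
print("added:", added, "removed:", removed, "changed:", changed)
```

A non-empty "removed" or "changed" list would be the signal to look more
closely before committing the regenerated file.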

Thoughts?

Eric  




* Re: [PATCH 1/3] scripts: introduce scripts/update-aarch64-sysreg-code.py
  2025-12-09 13:40     ` Eric Auger
@ 2025-12-09 13:57       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 10+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-12-09 13:57 UTC (permalink / raw)
  To: eric.auger, eric.auger.pro, qemu-devel, qemu-arm, peter.maydell,
	richard.henderson, cohuck, sebott
  Cc: maz

On 9/12/25 14:40, Eric Auger wrote:
> Hi Philippe,
> 
> On 12/9/25 1:30 PM, Philippe Mathieu-Daudé wrote:
>> On 8/12/25 17:37, Eric Auger wrote:
>>> Introduce a script that takes as input the Registers.json file
>>> delivered in the AARCHMRS Features Model downloadable from the
>>> Arm Developer A-Profile Architecture Exploration Tools page:
>>> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
>>>
>>> and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
>>> under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>).
>>
>> Great idea!
>>
>>>
>>> We only care about IDregs with opcodes satisfying:
>>> op0 = 3, op1 within [0, 3], crn = 0, crm within [0, 7], op2 within
>>> [0, 7]
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>
>>> ---
>>>
>>> This was tested with
>>> https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_OPENSOURCE_A_profile_FAT-2025-09_ASL0.tar.gz
>>>
>>> Discussion about undesired generated regs can be found in
>>> https://lore.kernel.org/all/CAFEAcA9OXi4v+hdBMamQv85HYp2EqxOA5=nfsdZ5E3nf8RP_pw@mail.gmail.com/
>>>
>>> ---
>>>    scripts/update-aarch64-sysreg-code.py | 133 ++++++++++++++++++++++++++
>>>    1 file changed, 133 insertions(+)
>>>    create mode 100755 scripts/update-aarch64-sysreg-code.py
>>>
>>> diff --git a/scripts/update-aarch64-sysreg-code.py b/scripts/update-aarch64-sysreg-code.py
>>> new file mode 100755
>>> index 0000000000..c7b31035d1
>>> --- /dev/null
>>> +++ b/scripts/update-aarch64-sysreg-code.py
>>> @@ -0,0 +1,133 @@
>>> +#!/usr/bin/env python3
>>> +
>>> +# This script takes as input the Registers.json file delivered in
>>> +# the AARCHMRS Features Model downloadable from the Arm Developer
>>> +# A-Profile Architecture Exploration Tools page:
>>> +# https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
>>> +# and outputs the list of ID regs in target/arm/cpu-sysregs.h.inc
>>> +# under the form of DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
>>> +#
>>> +# Copyright (C) 2025 Red Hat, Inc.
>>> +#
>>> +# Authors: Eric Auger <eric.auger@redhat.com>
>>> +#
>>> +# SPDX-License-Identifier: GPL-2.0-or-later
>>> +
>>> +
>>> +import json
>>> +import os
>>> +import sys
>>> +
>>
>> [*]
>>
>>> +# returns the int value of a given @opcode for a reg @encoding
>>> +def get_opcode(encoding, opcode):
>>> +    fvalue = encoding.get(opcode)
>>> +    if fvalue:
>>> +        value = fvalue.get('value')
>>> +        if isinstance(value, str):
>>> +            value = value.strip("'")
>>> +            value = int(value,2)
>>> +            return value
>>> +    return -1
>>> +
>>> +def extract_idregs_from_registers_json(filename):
>>> +    """
>>> +    Load a Registers.json file and extract all ID registers, decode their
>>> +    opcode and dump the information in target/arm/cpu-sysregs.h.inc
>>> +
>>> +    Args:
>>> +        filename (str): The path to the Registers.json
>>> +    returns:
>>> +        idregs: list of ID regs and their encoding
>>> +    """
>>> +    if not os.path.exists(filename):
>>> +        print(f"Error: {filename} could not be found!")
>>> +        return {}
>>> +
>>> +    try:
>>> +        with open(filename, 'r') as f:
>>> +            register_data = json.load(f)
>>> +
>>> +    except json.JSONDecodeError:
>>> +        print(f"Could not decode json from '{filename}'!")
>>> +        return {}
>>> +    except Exception as e:
>>> +        print(f"Unexpected error while reading {filename}: {e}")
>>> +        return {}
>>> +
>>> +    registers = [r for r in register_data if isinstance(r, dict) and \
>>> +                r.get('_type') == 'Register']
>>> +
>>> +    idregs = {}
>>> +
>>> +    # Some regs have op code values like 000x, 001x. Anyway we don't need
>>> +    # them. Besides some regs are undesired in the generated file such as
>>> +    # CCSIDR_EL1 and CCSIDR2_EL1 which are arrays of regs. Also exclude
>>> +    # VMPIDR_EL2 and VPIDR_EL2 which are outside of the IDreg scope we
>>> +    # are interested in and are tricky to decode as their system accessor
>>> +    # refer to MPIDR_EL1/MIDR_EL1 respectively
>>> +
>>> +    skiplist = ['ALLINT', 'PM', 'S1_', 'S3_', 'SVCR', \
>>> +                'CCSIDR_EL1', 'CCSIDR2_EL1', 'VMPIDR_EL2', 'VPIDR_EL2']
>>
>> Since we might have to update this array, I'd move it (and the big
>> comment preceding) in [*].
>>
>>> +
>>> +    for register in registers:
>>> +        reg_name = register.get('name')
>>> +
>>> +        is_skipped = any(term in (reg_name or "").upper() for term in skiplist)
>>> +
>>> +        if reg_name and not is_skipped:
>>> +            accessors = register.get('accessors', [])
>>> +
>>> +            for accessor in accessors:
>>> +                type = accessor.get('_type')
>>> +                if type in ['Accessors.SystemAccessor']:
>>> +                    encoding_list = accessor.get('encoding')
>>> +
>>> +                    if isinstance(encoding_list, list) and encoding_list and \
>>> +                       isinstance(encoding_list[0], dict):
>>> +                        encoding_wrapper = encoding_list[0]
>>> +                        encoding_source = encoding_wrapper.get('encodings', \
>>> +                                                               encoding_wrapper)
>>> +
>>> +                        if isinstance(encoding_source, dict):
>>> +                                op0 = get_opcode(encoding_source, 'op0')
>>> +                                op1 = get_opcode(encoding_source, 'op1')
>>> +                                op2 = get_opcode(encoding_source, 'op2')
>>> +                                crn = get_opcode(encoding_source, 'CRn')
>>> +                                crm = get_opcode(encoding_source, 'CRm')
>>> +                                encoding_str=f"{op0} {op1} {crn} {crm} {op2}"
>>> +
>>> +                # ID regs are assumed within this scope
>>> +                if op0 == 3 and (op1 == 0 or op1 == 1 or op1 == 3) and \
>>> +                   crn == 0 and (crm >= 0 and crm <= 7) and (op2 >= 0 and op2 <= 7):
>>> +                    idregs[reg_name] = encoding_str
>>> +
>>> +    return idregs
>>> +
>>> +if __name__ == "__main__":
>>> +    # Single arg expectedr: the path to the Registers.json file
>>
>> Typo "expectedr".
>>
>>> +    if len(sys.argv) < 2:
>>> +        print("Usage: scripts/update-aarch64-sysreg-code.py <path_to_registers_json>")
>>> +        sys.exit(1)
>>> +    else:
>>> +        json_file_path = sys.argv[1]
>>> +
>>> +    extracted_registers = extract_idregs_from_registers_json(json_file_path)
>>> +
>>> +    if extracted_registers:
>>> +        output_list = extracted_registers.items()
>>> +
>>> +        # Sort by register name
>>> +        sorted_output = sorted(output_list, key=lambda item: item[0])
>>> +
>>> +        # format lines as DEF(<name>, <op0>, <op1>, <crn>, <crm>, <op2>)
>>> +        final_output = ""
>>> +        for reg_name, encoding in sorted_output:
>>> +            reformatted_encoding = encoding.replace(" ", ", ")
>>> +            final_output += f"DEF({reg_name}, {reformatted_encoding})\n"
>>> +
>>> +        with open("target/arm/cpu-sysregs.h.inc", 'w') as f:
>>> +            f.write("/* SPDX-License-Identifier: BSD-3-Clause */\n\n")
>>> +            f.write("/* This file is autogenerated by ")
>>> +            f.write("scripts/update-aarch64-sysreg-code.py */\n\n")
>>> +            f.write(final_output)
>>> +        print(f"updated target/arm/cpu-sysregs.h.inc")
>>
>> Fixed string (no formatting) so no need for f- prefix.
>>
>> Patch LGTM but it should have some unit test.
> 
> Thank you for the review.
> 
> Not sure what you mean by unit test? One solution could be to diff the
> result with the former bash/awk script I used for generation previously
> [1], but I guess it would be awkward to upstream the awk script we did
> not want in the first place. Otherwise we could check some random
> opcodes, but it wouldn't mean the others are correct. To me the best way
> to validate the python script is to do [1] once but not upstream that.
> Reviewing the newly generated file against the previous content [3/3] is
> the best way to test the result. I mean, using auto generation does not
> prevent us from reviewing the generated stuff, especially because the
> generation is triggered manually, like scripts/update-linux-headers.sh,
> and should not happen very frequently.
> 
> Thoughts?

I was thinking something minimal like what we use for decodetree,
see these 2 files:

  - tests/decode/meson.build
  - tests/decode/err_field10.decode
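
In the same spirit, a minimal sketch of what such a test could look like for
the get_opcode() helper. The function body is copied from the patch; the
fixture shape (each field holding a quoted binary string under 'value') is an
assumption inferred from how get_opcode() strips quotes and parses base 2, not
something checked against Registers.json here.

```python
import unittest


def get_opcode(encoding, opcode):
    # Copied from the patch: returns the int value of @opcode, or -1.
    fvalue = encoding.get(opcode)
    if fvalue:
        value = fvalue.get('value')
        if isinstance(value, str):
            value = value.strip("'")
            value = int(value, 2)
            return value
    return -1


# Assumed fixture: quoted binary strings, as get_opcode() expects.
MIDR_EL1_ENCODING = {
    'op0': {'value': "'11'"},
    'op1': {'value': "'000'"},
    'CRn': {'value': "'0000'"},
    'CRm': {'value': "'0000'"},
    'op2': {'value': "'000'"},
}


class GetOpcodeTest(unittest.TestCase):
    def test_binary_string_is_decoded(self):
        self.assertEqual(get_opcode(MIDR_EL1_ENCODING, 'op0'), 3)
        self.assertEqual(get_opcode(MIDR_EL1_ENCODING, 'CRm'), 0)

    def test_missing_field_returns_minus_one(self):
        self.assertEqual(get_opcode({}, 'op0'), -1)

    def test_wildcard_value_raises(self):
        # Values such as 000x are not valid base-2 literals.
        with self.assertRaises(ValueError):
            get_opcode({'op2': {'value': "'000x'"}}, 'op2')


suite = unittest.defaultTestLoader.loadTestsFromTestCase(GetOpcodeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The last case shows why registers with wildcard opcodes have to be kept away
from get_opcode(), which the skiplist in the patch takes care of.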



* Re: [PATCH 3/3] target/arm/cpu-sysregs.h.inc: Update with automatic generation
  2025-12-08 16:37 ` [PATCH 3/3] target/arm/cpu-sysregs.h.inc: Update with automatic generation Eric Auger
@ 2025-12-09 16:33   ` Cornelia Huck
  0 siblings, 0 replies; 10+ messages in thread
From: Cornelia Huck @ 2025-12-09 16:33 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, eric.auger, qemu-devel, qemu-arm,
	peter.maydell, richard.henderson, sebott
  Cc: maz

On Mon, Dec 08 2025, Eric Auger <eric.auger@redhat.com> wrote:

> Generated definitions with scripts/update-aarch64-sysreg-code.py
> based on "AARCHMRS containing the JSON files for Arm A-profile
> architecture (2025-09)" Registers.json file.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  target/arm/cpu-sysregs.h.inc | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/cpu-sysregs.h.inc b/target/arm/cpu-sysregs.h.inc
> index 3c892c4f30..9bb27297b5 100644
> --- a/target/arm/cpu-sysregs.h.inc
> +++ b/target/arm/cpu-sysregs.h.inc
> @@ -1,17 +1,27 @@
> -/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/* SPDX-License-Identifier: BSD-3-Clause */
> +
> +/* This file is autogenerated by scripts/update-aarch64-sysreg-code.py */
> +
> +DEF(AIDR_EL1, 3, 1, 0, 0, 7)

AIDR_EL1 (and MIDR_EL1/REVIDR_EL1) are used by the
(hopefully-soon-respun) writable id register series, so it's good that
they do not need to be added by hand anymore :)

>  DEF(CLIDR_EL1, 3, 1, 0, 0, 1)
>  DEF(CTR_EL0, 3, 3, 0, 0, 1)
> +DEF(DCZID_EL0, 3, 3, 0, 0, 7)

Also see
https://lore.kernel.org/qemu-devel/20251127170657.3335112-1-cohuck@redhat.com/T/#u

> +DEF(GMID_EL1, 3, 1, 0, 0, 4)
>  DEF(ID_AA64AFR0_EL1, 3, 0, 0, 5, 4)
>  DEF(ID_AA64AFR1_EL1, 3, 0, 0, 5, 5)
>  DEF(ID_AA64DFR0_EL1, 3, 0, 0, 5, 0)
>  DEF(ID_AA64DFR1_EL1, 3, 0, 0, 5, 1)
> +DEF(ID_AA64DFR2_EL1, 3, 0, 0, 5, 2)
> +DEF(ID_AA64FPFR0_EL1, 3, 0, 0, 4, 7)
>  DEF(ID_AA64ISAR0_EL1, 3, 0, 0, 6, 0)
>  DEF(ID_AA64ISAR1_EL1, 3, 0, 0, 6, 1)
>  DEF(ID_AA64ISAR2_EL1, 3, 0, 0, 6, 2)
> +DEF(ID_AA64ISAR3_EL1, 3, 0, 0, 6, 3)
>  DEF(ID_AA64MMFR0_EL1, 3, 0, 0, 7, 0)
>  DEF(ID_AA64MMFR1_EL1, 3, 0, 0, 7, 1)
>  DEF(ID_AA64MMFR2_EL1, 3, 0, 0, 7, 2)
>  DEF(ID_AA64MMFR3_EL1, 3, 0, 0, 7, 3)
> +DEF(ID_AA64MMFR4_EL1, 3, 0, 0, 7, 4)
>  DEF(ID_AA64PFR0_EL1, 3, 0, 0, 4, 0)
>  DEF(ID_AA64PFR1_EL1, 3, 0, 0, 4, 1)
>  DEF(ID_AA64PFR2_EL1, 3, 0, 0, 4, 2)
> @@ -36,6 +46,10 @@ DEF(ID_MMFR5_EL1, 3, 0, 0, 3, 6)
>  DEF(ID_PFR0_EL1, 3, 0, 0, 1, 0)
>  DEF(ID_PFR1_EL1, 3, 0, 0, 1, 1)
>  DEF(ID_PFR2_EL1, 3, 0, 0, 3, 4)
> +DEF(MIDR_EL1, 3, 0, 0, 0, 0)
> +DEF(MPIDR_EL1, 3, 0, 0, 0, 5)

I'm wondering if we need to add some handling for MPIDR_EL1.

>  DEF(MVFR0_EL1, 3, 0, 0, 3, 0)
>  DEF(MVFR1_EL1, 3, 0, 0, 3, 1)
>  DEF(MVFR2_EL1, 3, 0, 0, 3, 2)
> +DEF(REVIDR_EL1, 3, 0, 0, 0, 6)
> +DEF(SMIDR_EL1, 3, 1, 0, 0, 6)

SMIDR_EL1 is const 0 in tcg, and KVM currently does not support SME. So
I guess we should init the idreg to 0 for now?
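
As a quick sanity check on the newly generated entries, a small sketch that
applies the scope condition from the script in patch 1/3 to the registers
added above. in_idreg_scope() and NEW_DEFS are names made up for this mail;
the encodings are copied from the diff.

```python
def in_idreg_scope(op0, op1, crn, crm, op2):
    """Mirror of the ID register scope condition used by the script."""
    return (op0 == 3 and op1 in (0, 1, 3) and crn == 0
            and 0 <= crm <= 7 and 0 <= op2 <= 7)


# Encodings of some registers newly added by this patch.
NEW_DEFS = {
    "AIDR_EL1": (3, 1, 0, 0, 7),
    "DCZID_EL0": (3, 3, 0, 0, 7),
    "GMID_EL1": (3, 1, 0, 0, 4),
    "MIDR_EL1": (3, 0, 0, 0, 0),
    "MPIDR_EL1": (3, 0, 0, 0, 5),
    "REVIDR_EL1": (3, 0, 0, 0, 6),
    "SMIDR_EL1": (3, 1, 0, 0, 6),
}

out_of_scope = [name for name, enc in NEW_DEFS.items()
                if not in_idreg_scope(*enc)]
print("out of scope:", out_of_scope)
```

Note that the script accepts op1 values 0, 1 and 3 only, which is slightly
narrower than the "op1 within [0, 3]" wording in the commit message.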



