From: Ashutosh Dixit
To: igt-dev@lists.freedesktop.org
Subject: [PATCH i-g-t 01/27] lib/xe/oa: Import OA metric generation files from i915
Date: Fri, 7 Jun 2024 13:08:21 -0700
Message-ID: <20240607200847.1964629-2-ashutosh.dixit@intel.com>
In-Reply-To: <20240607200847.1964629-1-ashutosh.dixit@intel.com>
References: <20240607200847.1964629-1-ashutosh.dixit@intel.com>

Import OA metric generation files from i915 to provide a starting point for
Xe OA XML processing code.

Signed-off-by: Ashutosh Dixit
---
 lib/xe/oa-configs/README.md               |  115 +
 lib/xe/oa-configs/codegen.py              |  444 ++++
 lib/xe/oa-configs/guids.xml               | 2749 +++++++++++++++++++++
 lib/xe/oa-configs/mdapi-xml-convert.py    | 1196 +++++++++
 lib/xe/oa-configs/oa-equations-codegen.py |  261 ++
 lib/xe/oa-configs/oa-metricset-codegen.py |  257 ++
 lib/xe/oa-configs/oa-registers-codegen.py |  118 +
 lib/xe/oa-configs/oa_guid_registry.py     |  117 +
 lib/xe/oa-configs/update-guids.py         |  222 ++
 9 files changed, 5479 insertions(+)
 create mode 100644 lib/xe/oa-configs/README.md
 create mode 100644 lib/xe/oa-configs/codegen.py
 create mode 100644 lib/xe/oa-configs/guids.xml
 create mode 100755 lib/xe/oa-configs/mdapi-xml-convert.py
 create mode 100644 lib/xe/oa-configs/oa-equations-codegen.py
 create mode 100644 lib/xe/oa-configs/oa-metricset-codegen.py
 create mode 100644 lib/xe/oa-configs/oa-registers-codegen.py
 create mode 100644 lib/xe/oa-configs/oa_guid_registry.py
 create mode 100755 lib/xe/oa-configs/update-guids.py

diff --git a/lib/xe/oa-configs/README.md b/lib/xe/oa-configs/README.md
new file mode 100644
index 0000000000..513806b8e6
--- /dev/null
+++ b/lib/xe/oa-configs/README.md
@@ -0,0 +1,115 @@
+# About guids.xml
+
+This is the authoritative registry of unique identifiers for different OA unit
+hardware configurations. Userspace can reliably use these identifiers to map a
+configuration to corresponding normalization equations and counter metadata.
+
+If a hardware configuration ever changes in a backwards incompatible way
+(changing the semantics and/or layout of the raw counters) then it must be
+given a new GUID.
+
+mdapi-xml-convert.py will match metric sets with a GUID from this file based on
+an md5 hash of the hardware register configuration and skip a metric set with a
+warning if no GUID could be found.
+
+All new metric sets need to be allocated a GUID here before
+mdapi-xml-convert.py will output anything for that
+metric set. This ensures we don't automatically import new metric sets without
+some explicit review that that's appropriate.
+
+A failure to find a GUID for an older metric set most likely implies that the
+register configuration was changed. It's possible that the change is benign
+(e.g. a comment change) and in that case the mdapi_config_hash for the
+corresponding metric set below can be updated.
+
+The update-guids.py script is the recommended way of managing updates to this
+file by generating a temporary file with proposed updates that you can compare
+with the current guids.xml.
+
+
+# update-guids.py
+
+update-guids.py can help with:
+
+* Recognising new metrics from VPG's MDAPI XML files
+
+  *(NOTE: new guids.xml entries will initially be missing the
+  config_hash=MD5_HASH attribute until mdapi-xml-convert.py is used to generate
+  a corresponding oa-*.xml config description)*
+
+* Adding a config_hash=MD5_HASH attribute to recently added guids.xml entries
+  after mdapi-xml-convert.py has been run.
+
+* Allocating a GUID for a custom metric that doesn't have a counterpart in
+  VPG's MDAPI XML files.
+
+  For this case you can add a stub entry with only a name like `` to guids.xml and then running update-guids.py will output a
+  corresponding line with the addition of an id=UUID attribute.
+
+
+# How to sync the oa-\*.xml files with latest internal MDAPI XML files
+
+1. E.g. copy a new `MetricsXML_BDW.xml` to `mdapi/MetricsXML_BDW.xml`
+
+*Note that the `mdapi-xml-convert.py` script will only convert configs that
+have a corresponding GUID entry within `guids.xml`. This check helps avoid
+unintentionally publishing early, work-in-progress/pre-production configs.*
+
+The `guids.xml` registry maps each complex OA unit register configuration to a
+unique ID that userspace can recognise, so it can trust the semantics of raw
+counters read using that configuration. (Just for reference, this is particularly
+valuable for tools that capture raw metrics for later, offline processing since
+the IDs effectively provide a compressed description of how to interpret the
+data by providing an index into a database of high-level counter descriptions.)
+
+The registry associates each ID with a hash of the HW register config as found in
+MDAPI XML files ('mdapi_config_hash') and also with a hash of the HW config as
+found in oa-\*.xml files ('config_hash'). The hashes used for lookups in the
+registry also help detect when the register config for a pre-existing metric set
+is updated. Note: these hashes are only for the low-level hardware configuration
+so updates to counter descriptions used by frontend UIs won't affect indexing
+here.
+
+There is a chicken and egg situation when updating or adding new entries to
+guids.xml since we can't hash the configs in oa-\*.xml until successfully running
+mdapi-xml-convert.py which depends on a guids.xml registry entry first. The
+update-guids.py script will output registry entries without an oa-\*.xml config
+hash if not available and can be re-run after mdapi-xml-convert.py to add the
+missing hashes.
+
+2. Now run:
+```
+./update-guids.py --guids=guids.xml mdapi/MetricsXML_BDW.xml > guids.xml2
+```
+*(note the script expects to find oa-\*.xml files in the current directory)*
+
+Diff `guids.xml` and `guids.xml2` (easiest with a side-by-side diff editor) and
+review the registry changes. *Note: many lines will have a warning like `"Not
+found in MDAPI XML file[s]..."` if `update-guids.py` wasn't given all known
+MDAPI XML files but in this case they can be ignored for all non-BDW configs.*
+
+*Note: for any config that is already supported upstream in the xe_oa driver
+we need to be careful if the hash for a metric set changes in case the semantics
+for any raw counters were changed. The semantics of raw counters associated with
+a given GUID form part of the drm xe_oa uapi contract and must remain
+backwards compatible.*
+
+If the diff shows any `mdapi_config_hash` changes for pre-existing (especially
+upstream) configs you should review the MDAPI XML changes for the metric set and
+verify the change just relates to a bug fix. If more substantial changes were
+made, we may need to treat it as a new config. Handling the latter
+case is left as an exercise to the reader, since it hasn't happened so far :-D.
+Assuming all the changes and new entries look good they can be copied into
+`guids.xml`, removing any trailing comment left by `update-guids.py`.
+
+3. Now run mdapi-xml-convert.py:
+```
+./mdapi-xml-convert.py --guids=guids.xml mdapi/MetricsXML_BDW.xml > oa-bdw.xml
+```
+
+4. We can now update new entries in guids.xml with a 'config_hash':
+```
+./update-guids.py --guids=guids.xml mdapi/MetricsXML_BDW.xml > guids.xml2
+```
+*(and again diff, check the changes and copy across)*
diff --git a/lib/xe/oa-configs/codegen.py b/lib/xe/oa-configs/codegen.py
new file mode 100644
index 0000000000..020e76ef4b
--- /dev/null
+++ b/lib/xe/oa-configs/codegen.py
@@ -0,0 +1,444 @@
+import re
+import xml.etree.ElementTree as et
+
+class Codegen:
+
+    _file = None
+    _indent = 0
+
+    endl="\n"
+    use_tabs = False
+
+    def __init__(self, filename = None):
+        if filename != None:
+            self._file = open(filename, 'w')
+
+    def __call__(self, *args):
+        if self._file:
+            code = ' '.join(map(str, args))
+            for line in code.splitlines():
+                indent = ''.rjust(self._indent)
+
+                if self.use_tabs:
+                    indent = indent.replace("        ", "\t")
+
+                text = indent + line
+                self._file.write(text.rstrip() + self.endl)
+
+    #without indenting or new lines
+    def frag(self, *args):
+        code = ' '.join(map(str, args))
+        self._file.write(code)
+
+    def indent(self, n):
+        self._indent = self._indent + n
+    def outdent(self, n):
+        self._indent = self._indent - n
+
+
+class Counter:
+    def __init__(self, set, xml):
+        self.xml = xml
+        self.set = set
+        self.read_hash = None
+        self.max_hash = None
+
+        self.read_sym = "{0}__{1}__{2}__read".format(self.set.gen.chipset,
+                                                     self.set.underscore_name,
+                                                     self.xml.get('underscore_name'))
+
+        max_eq = self.xml.get('max_equation')
+        if not max_eq:
+            self.max_sym = "NULL /* undefined */"
+        elif max_eq == "100":
+            self.max_sym = "percentage_max_callback_" + self.xml.get('data_type')
+        else:
+            self.max_sym = "{0}__{1}__{2}__max".format(self.set.gen.chipset,
+                                                       self.set.underscore_name,
+                                                       self.xml.get('underscore_name'))
+
+    def get(self, prop):
+        return self.xml.get(prop)
+
+    def compute_hashes(self):
+        if self.read_hash is not None:
+            return
+
+        def replace_func(token):
+            if token[0] != "$":
+                return token
+            if token not in self.set.counter_vars:
+                return token
+            self.set.counter_vars[token].compute_hashes()
+            return self.set.counter_vars[token].read_hash
+
+        read_eq = self.xml.get('equation')
+        self.read_hash = ' '.join(map(replace_func, read_eq.split()))
+
+        max_eq = self.xml.get('max_equation')
+        if max_eq:
+            self.max_hash = ' '.join(map(replace_func, max_eq.split()))
+
+class Set:
+    def __init__(self, gen, xml):
+        self.gen = gen
+        self.xml = xml
+
+        self.counter_vars = {}
+        self.max_funcs = {}
+        self.read_funcs = {}
+        self.counter_hashes = {}
+
+        self.counters = []
+        xml_counters = self.xml.findall("counter")
+        for xml_counter in xml_counters:
+            counter = Counter(self, xml_counter)
+            self.counters.append(counter)
+            self.counter_vars["$" + counter.get('symbol_name')] = counter
+            self.max_funcs["$" + counter.get('symbol_name')] = counter.max_sym
+            self.read_funcs["$" + counter.get('symbol_name')] = counter.read_sym
+
+        for counter in self.counters:
+            counter.compute_hashes()
+
+    @property
+    def hw_config_guid(self):
+        return self.xml.get('hw_config_guid')
+
+    @property
+    def name(self):
+        return self.xml.get('name')
+
+    @property
+    def symbol_name(self):
+        return self.xml.get('symbol_name')
+
+    @property
+    def underscore_name(self):
+        return self.xml.get('underscore_name')
+
+    @property
+    def oa_format(self):
+        return self.xml.get('oa_format')
+
+    def findall(self, path):
+        return self.xml.findall(path)
+
+    def find(self, path):
+        return self.xml.find(path)
+
+
+hw_vars_mapping = {
+    "$EuCoresTotalCount": { 'c': "perf->devinfo.n_eus", 'desc': "The total number of execution units" },
+    "$EuSlicesTotalCount": { 'c': "perf->devinfo.n_eu_slices" },
+    "$EuSubslicesTotalCount": { 'c': "perf->devinfo.n_eu_sub_slices" },
+    "$EuDualSubslicesTotalCount": { 'c': "perf->devinfo.n_eu_sub_slices" },
+    "$EuDualSubslicesSlice0123Count": { 'c': "perf->devinfo.n_eu_sub_slices_half_slices" },
+    "$EuThreadsCount": { 'c': "perf->devinfo.eu_threads_count" },
+
+    "$VectorEngineTotalCount": { 'c': "perf->devinfo.n_eus", 'desc': "The total number of execution units" },
+    "$VectorEnginePerXeCoreCount": { 'c': "perf->devinfo.n_eu_sub_slices" },
+    "$VectorEngineThreadsCount": { 'c': "perf->devinfo.eu_threads_count" },
+
+    "$SliceMask": { 'c': "perf->devinfo.slice_mask" },
+    "$SliceTotalCount": { 'c': "perf->devinfo.n_eu_slices" },
+
+    "$SubsliceMask": { 'c': "perf->devinfo.subslice_mask" },
+    "$DualSubsliceMask": { 'c': "perf->devinfo.subslice_mask" },
+
+    "$GtSliceMask": { 'c': "perf->devinfo.slice_mask" },
+    "$GtSubsliceMask": { 'c': "perf->devinfo.subslice_mask" },
+    "$GtDualSubsliceMask": { 'c': "perf->devinfo.subslice_mask" },
+
+    "$GtXeCoreMask": { 'c': "perf->devinfo.slice_mask" },
+    "$XeCoreMask": { 'c': "perf->devinfo.slice_mask" },
+    "$XeCoreTotalCount": { 'c': 'perf->devinfo.n_eu_sub_slices' },
+
+    "$GpuTimestampFrequency": { 'c': "perf->devinfo.timestamp_frequency" },
+    "$GpuMinFrequency": { 'c': "perf->devinfo.gt_min_freq" },
+    "$GpuMaxFrequency": { 'c': "perf->devinfo.gt_max_freq" },
+    "$SkuRevisionId": { 'c': "perf->devinfo.revision" },
+    "$QueryMode": { 'c': "perf->devinfo.query_mode" },
+}
+
+def is_hw_var(name):
+    m = re.search(r'\$GtSlice([0-9]+)XeCore([0-9]+)$', name)
+    if m:
+        return True
+    m = re.search(r'\$GtSlice([0-9]+)$', name)
+    if m:
+        return True
+    m = re.search(r'\$GtSlice([0-9]+)DualSubslice([0-9]+)$', name)
+    if m:
+        return True
+    return name in hw_vars_mapping
+
+class Gen:
+    def __init__(self, filename, c):
+        self.filename = filename
+        self.xml = et.parse(self.filename)
+        self.chipset = self.xml.find('.//set').get('chipset').lower()
+        self.sets = []
+        self.c = c
+
+        for xml_set in self.xml.findall(".//set"):
+            self.sets.append(Set(self, xml_set))
+
+        self.ops = {}
+        # (n operands, emitter)
+        self.ops["FADD"] = (2, self.emit_fadd)
+        self.ops["FDIV"] = (2, self.emit_fdiv)
+        self.ops["FMAX"] = (2, self.emit_fmax)
+        self.ops["FMUL"] = (2, self.emit_fmul)
+        self.ops["FSUB"] = (2, self.emit_fsub)
+        self.ops["READ"] = (2, self.emit_read)
+        self.ops["UADD"] = (2, self.emit_uadd)
+        self.ops["UDIV"] = (2, self.emit_udiv)
+        self.ops["UMUL"] = (2, self.emit_umul)
+        self.ops["USUB"] = (2, self.emit_usub)
+        self.ops["UMIN"] = (2, self.emit_umin)
+        self.ops["<<"] = (2, self.emit_lshft)
+        self.ops[">>"] = (2, self.emit_rshft)
+        self.ops["AND"] = (2, self.emit_and)
+        self.ops["UGTE"] = (2, self.emit_ugte)
+        self.ops["UGT"] = (2, self.emit_ugt)
+        self.ops["ULTE"] = (2, self.emit_ulte)
+        self.ops["ULT"] = (2, self.emit_ult)
+
+        self.exp_ops = {}
+        # (n operands, splicer)
+        self.exp_ops["AND"] = (2, self.splice_bitwise_and)
+        self.exp_ops["UGTE"] = (2, self.splice_ugte)
+        self.exp_ops["UGT"] = (2, self.splice_ugt)
+        self.exp_ops["ULTE"] = (2, self.splice_ulte)
+        self.exp_ops["ULT"] = (2, self.splice_ult)
+        self.exp_ops["&&"] = (2, self.splice_logical_and)
+        self.exp_ops["<<"] = (2, self.splice_lshft)
+        self.exp_ops[">>"] = (2, self.splice_rshft)
+        self.exp_ops["UMUL"] = (2, self.splice_uml)
+
+        self.hw_vars = hw_vars_mapping
+
+    def emit_fadd(self, tmp_id, args):
+        self.c("double tmp{0} = {1} + {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    # Be careful to check for divide by zero...
+    def emit_fdiv(self, tmp_id, args):
+        self.c("double tmp{0} = {1};".format(tmp_id, args[1]))
+        self.c("double tmp{0} = {1};".format(tmp_id + 1, args[0]))
+        self.c("double tmp{0} = tmp{1} ? tmp{2} / tmp{1} : 0;".format(tmp_id + 2, tmp_id + 1, tmp_id))
+        return tmp_id + 3
+
+    def emit_fmax(self, tmp_id, args):
+        self.c("double tmp{0} = {1};".format(tmp_id, args[1]))
+        self.c("double tmp{0} = {1};".format(tmp_id + 1, args[0]))
+        self.c("double tmp{0} = MAX(tmp{1}, tmp{2});".format(tmp_id + 2, tmp_id, tmp_id + 1))
+        return tmp_id + 3
+
+    def emit_fmul(self, tmp_id, args):
+        self.c("double tmp{0} = {1} * {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_fsub(self, tmp_id, args):
+        self.c("double tmp{0} = {1} - {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_read(self, tmp_id, args):
+        type = args[1].lower()
+        self.c("uint64_t tmp{0} = accumulator[metric_set->{1}_offset + {2}];".format(tmp_id, type, args[0]))
+        return tmp_id + 1
+
+    def emit_uadd(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} + {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    # Be careful to check for divide by zero...
+    def emit_udiv(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1};".format(tmp_id, args[1]))
+        self.c("uint64_t tmp{0} = {1};".format(tmp_id + 1, args[0]))
+        self.c("uint64_t tmp{0} = tmp{1} ? tmp{2} / tmp{1} : 0;".format(tmp_id + 2, tmp_id + 1, tmp_id))
+        return tmp_id + 3
+
+    def emit_umul(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} * {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_usub(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} - {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_umin(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = MIN({1}, {2});".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_lshft(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} << {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_rshft(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} >> {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_and(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} & {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_ulte(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} <= {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_ult(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} < {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_ugte(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} >= {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def emit_ugt(self, tmp_id, args):
+        self.c("uint64_t tmp{0} = {1} > {2};".format(tmp_id, args[1], args[0]))
+        return tmp_id + 1
+
+    def brkt(self, subexp):
+        if " " in subexp:
+            return "(" + subexp + ")"
+        else:
+            return subexp
+
+    def splice_bitwise_and(self, args):
+        return self.brkt(args[1]) + " & " + self.brkt(args[0])
+
+    def splice_logical_and(self, args):
+        return self.brkt(args[1]) + " && " + self.brkt(args[0])
+
+    def splice_ulte(self, args):
+        return self.brkt(args[1]) + " <= " + self.brkt(args[0])
+
+    def splice_ult(self, args):
+        return self.brkt(args[1]) + " < " + self.brkt(args[0])
+
+    def splice_ugte(self, args):
+        return self.brkt(args[1]) + " >= " + self.brkt(args[0])
+
+    def splice_ugt(self, args):
+        return self.brkt(args[1]) + " > " + self.brkt(args[0])
+
+    def splice_lshft(self, args):
+        return '(' + self.brkt(args[1]) + " << " + self.brkt(args[0]) + ')'
+
+    def splice_rshft(self, args):
+        return '(' + self.brkt(args[1]) + " >> " + self.brkt(args[0]) + ')'
+
+    def splice_uml(self, args):
+        return self.brkt(args[1]) + " * " + self.brkt(args[0])
+
+    def resolve_variable(self, name, set):
+        if name in self.hw_vars:
+            return self.hw_vars[name]['c']
+        if name in set.counter_vars:
+            return set.read_funcs[name] + "(perf, metric_set, accumulator)"
+        m = re.search(r'\$GtSlice([0-9]+)$', name)
+        if m:
+            return 'intel_perf_devinfo_slice_available(&perf->devinfo, {0})'.format(m.group(1))
+        m = re.search(r'\$GtSlice([0-9]+)DualSubslice([0-9]+)$', name)
+        if m:
+            return 'intel_perf_devinfo_subslice_available(&perf->devinfo, {0}, {1})'.format(m.group(1), m.group(2))
+        m = re.search(r'\$GtSlice([0-9]+)XeCore([0-9]+)$', name)
+        if m:
+            return 'intel_perf_devinfo_subslice_available(&perf->devinfo, {0}, {1})'.format(m.group(1), m.group(2))
+        return None
+
+    def output_rpn_equation_code(self, set, counter, equation):
+        self.c("/* RPN equation: " + equation + " */")
+        tokens = equation.split()
+        stack = []
+        tmp_id = 0
+        tmp = None
+
+        for token in tokens:
+            stack.append(token)
+            while stack and stack[-1] in self.ops:
+                op = stack.pop()
+                argc, callback = self.ops[op]
+                args = []
+                for i in range(0, argc):
+                    operand = stack.pop()
+                    if operand[0] == "$":
+                        resolved_variable = self.resolve_variable(operand, set)
+                        if resolved_variable == None:
+                            raise Exception("Failed to resolve variable " + operand + " in equation " + equation + " for " + set.name + " :: " + counter.get('name'))
+                        operand = resolved_variable
+                    args.append(operand)
+
+                tmp_id = callback(tmp_id, args)
+
+                tmp = "tmp{0}".format(tmp_id - 1)
+                stack.append(tmp)
+
+        if len(stack) != 1:
+            raise Exception("Spurious empty rpn code for " + set.name + " :: " +
+                    counter.get('name') + ".\nThis is probably due to some unhandled RPN function, in the equation \"" +
+                    equation + "\"")
+
+        value = stack[-1]
+
+        if value[0] == "$":
+            resolved_variable = self.resolve_variable(value, set)
+            if resolved_variable == None:
+                raise Exception("Failed to resolve variable " + value + " in equation " + equation + " for " + set.name + " :: " + counter.get('name'))
+            value = resolved_variable
+
+        self.c("\nreturn " + value + ";")
+
+    def splice_rpn_expression(self, set, counter_name, expression):
+        tokens = expression.split()
+        stack = []
+
+        for token in tokens:
+            stack.append(token)
+            while stack and stack[-1] in self.exp_ops:
+                op = stack.pop()
+                argc, callback = self.exp_ops[op]
+                args = []
+                for i in range(0, argc):
+                    operand = stack.pop()
+                    if operand[0] == "$":
+                        resolved_variable = self.resolve_variable(operand, set)
+                        if resolved_variable == None:
+                            raise Exception("Failed to resolve variable " + operand + " in expression " + expression + " for " + set.name + " :: " + counter_name)
+                        operand = resolved_variable
+                    args.append(operand)
+
+                subexp = callback(args)
+
+                stack.append(subexp)
+
+        if len(stack) != 1:
+            raise Exception("Spurious empty rpn expression for " + set.name + " :: " +
+                    counter_name + ".\nThis is probably due to some unhandled RPN operation, in the expression \"" +
+                    expression + "\"")
+
+        value = stack[-1]
+
+        if value[0] == "$":
+            resolved_variable = self.resolve_variable(value, set)
+            if resolved_variable == None:
+                raise Exception("Failed to resolve variable " + value + " in expression " + expression + " for " + set.name + " :: " + counter_name)
+            value = resolved_variable
+
+        return value
+
+    def output_availability(self, set, availability, counter_name):
+        expression = self.splice_rpn_expression(set, counter_name, availability)
+        lines = expression.split(' && ')
+        n_lines = len(lines)
+        if n_lines == 1:
+            self.c("if (" + lines[0] + ") {")
+        else:
+            self.c("if (" + lines[0] + " &&")
+            self.c.indent(4)
+            for i in range(1, (n_lines - 1)):
+                self.c(lines[i] + " &&")
+            self.c(lines[(n_lines - 1)] + ") {")
+            self.c.outdent(4)
diff --git a/lib/xe/oa-configs/guids.xml b/lib/xe/oa-configs/guids.xml
new file mode 100644
index 0000000000..510450f87a
--- /dev/null
+++ b/lib/xe/oa-configs/guids.xml
@@ -0,0 +1,2749 @@
[guids.xml: 2749 added lines; XML content not preserved]
diff --git a/lib/xe/oa-configs/mdapi-xml-convert.py b/lib/xe/oa-configs/mdapi-xml-convert.py
new file mode 100755
index 0000000000..5903ef4d35
--- /dev/null
+++ b/lib/xe/oa-configs/mdapi-xml-convert.py
@@ -0,0 +1,1196 @@
+#!/usr/bin/env python3
+
+# SPDX-License-Identifier: MIT
+#
+# Copyright © 2024 Intel Corporation
+#
+#
+# "MDAPI" xml files are an XML schema for maintaining meta data about Gen
+# graphics Observability counters, where MD API is the name of a library shared
+# by Intel GPA and Intel VTune.
+#
+# These files aren't publicly documented and have some historical baggage that
+# adds some complexity as well as being inconsistent in a number of ways that
+# makes it quite a bit of effort to parse/use the data. We also don't have
+# guarantees about how this schema is maintained.
+#
+# We've taken the opportunity to find ways to simplify the input data and to
+# make it more consistent to hopefully reduce the effort involved in using the
+# data downstream.
+#
+
+
+import argparse
+import copy
+import hashlib
+from operator import itemgetter
+import re
+import sys
+import time
+import uuid
+
+import codegen
+
+import xml.etree.ElementTree as et
+import xml.sax.saxutils as saxutils
+
+import oa_guid_registry as oa_registry
+
+
+# MDAPI configs include writes to some non-config registers,
+# thus the blacklists...
+
+hsw_chipset_params = {
+    'a_offset': 12,
+    'b_offset': 192,
+    'c_offset': 224,
+    'oa_report_size': 256,
+    'registers': {
+        # TODO extend the symbol table for nicer output...
+ 0x2710: { 'name': 'OASTARTTRIG1' }, + 0x2714: { 'name': 'OASTARTTRIG1' }, + 0x2718: { 'name': 'OASTARTTRIG1' }, + 0x271c: { 'name': 'OASTARTTRIG1' }, + 0x2720: { 'name': 'OASTARTTRIG1' }, + 0x2724: { 'name': 'OASTARTTRIG6' }, + 0x2728: { 'name': 'OASTARTTRIG7' }, + 0x272c: { 'name': 'OASTARTTRIG8' }, + 0x2740: { 'name': 'OAREPORTTRIG1' }, + 0x2744: { 'name': 'OAREPORTTRIG2' }, + 0x2748: { 'name': 'OAREPORTTRIG3' }, + 0x274c: { 'name': 'OAREPORTTRIG4' }, + 0x2750: { 'name': 'OAREPORTTRIG5' }, + 0x2754: { 'name': 'OAREPORTTRIG6' }, + 0x2758: { 'name': 'OAREPORTTRIG7' }, + 0x275c: { 'name': 'OAREPORTTRIG8' }, + 0x2770: { 'name': 'OACEC0_0' }, + 0x2774: { 'name': 'OACEC0_1' }, + 0x2778: { 'name': 'OACEC1_0' }, + 0x277c: { 'name': 'OACEC1_1' }, + 0x2780: { 'name': 'OACEC2_0' }, + 0x2784: { 'name': 'OACEC2_1' }, + 0x2788: { 'name': 'OACEC3_0' }, + 0x278c: { 'name': 'OACEC3_1' }, + 0x2790: { 'name': 'OACEC4_0' }, + 0x2794: { 'name': 'OACEC4_1' }, + 0x2798: { 'name': 'OACEC5_0' }, + 0x279c: { 'name': 'OACEC5_1' }, + 0x27a0: { 'name': 'OACEC6_0' }, + 0x27a4: { 'name': 'OACEC6_1' }, + 0x27a8: { 'name': 'OACEC7_0' }, + 0x27ac: { 'name': 'OACEC7_1' }, + }, + 'config_reg_blacklist': { + 0x2364, # OASTATUS1 register + }, + 'register_offsets': { + 0x1f0: 'PERFCNT 0', + 0x1f8: 'PERFCNT 1', + }, +} + +gen8_11_chipset_params = { + 'a_offset': 16, + 'b_offset': 192, + 'c_offset': 224, + 'oa_report_size': 256, + 'config_reg_blacklist': { + 0x2364, # OACTXID + }, + 'register_offsets': { + 0x1f0: 'PERFCNT 0', + 0x1f8: 'PERFCNT 1', + } +} + +xehpsdv_chipset_params = { + 'a_offset': 16, + 'b_offset': 192, + 'c_offset': 224, + 'oa_report_size': 256, + 'config_reg_blacklist': { + 0x2364, # OACTXID + }, + 'register_offsets': { + 0x1b0: 'PERFCNT 0', + 0x1b8: 'PERFCNT 1', + } +} + +# There is no ReportType field in most Metrics XML files, Use 256B_GENERIC_NOA16 +# to denote the generic 256 byte format that is used by most chipsets +# Just treat the MPEC counter names as A counters here. 
If a format has both A +# and MPEC counters, then we need to change this. +mtl_chipset_oam_samedia_ll_params = { + 'a_offset': 32, + 'b_offset': 96, + 'c_offset': 128, + 'oa_report_size': 192, + 'config_reg_blacklist': { + 0x2364, # OACTXID + }, + 'register_offsets': { + 0x1b0: 'PERFCNT 0', + 0x1b8: 'PERFCNT 1', + } +} + +mtl_chipset_oam_samedia_params = { + 'a_offset': 32, + 'b_offset': 64, + 'c_offset': 96, + 'oa_report_size': 128, + 'config_reg_blacklist': { + 0x2364, # OACTXID + }, + 'register_offsets': { + 0x1b0: 'PERFCNT 0', + 0x1b8: 'PERFCNT 1', + } +} + +hsw_chipset_oa_formats = { + '256B_GENERIC_NOA16': hsw_chipset_params, +} + +gen8_11_chipset_oa_formats = { + '256B_GENERIC_NOA16': gen8_11_chipset_params, +} + +xehpsdv_chipset_oa_formats = { + '256B_GENERIC_NOA16': xehpsdv_chipset_params, +} + +mtl_chipset_oa_formats = { + '256B_GENERIC_NOA16': xehpsdv_chipset_params, + '192B_MPEC8LL_NOA16': mtl_chipset_oam_samedia_ll_params, + '128B_MPEC8_NOA16': mtl_chipset_oam_samedia_params, +} + +chipsets = { + 'HSW': hsw_chipset_oa_formats, + 'BDW': gen8_11_chipset_oa_formats, + 'CHV': gen8_11_chipset_oa_formats, + 'SKL': gen8_11_chipset_oa_formats, + 'BXT': gen8_11_chipset_oa_formats, + 'KBL': gen8_11_chipset_oa_formats, + 'GLK': gen8_11_chipset_oa_formats, + 'CFL': gen8_11_chipset_oa_formats, + 'CNL': gen8_11_chipset_oa_formats, + 'ICL': gen8_11_chipset_oa_formats, + 'EHL': gen8_11_chipset_oa_formats, + 'TGL': gen8_11_chipset_oa_formats, + 'RKL': gen8_11_chipset_oa_formats, + 'DG1': gen8_11_chipset_oa_formats, + 'ADL': gen8_11_chipset_oa_formats, + 'ACM': xehpsdv_chipset_oa_formats, + 'MTL': mtl_chipset_oa_formats, +} + +xehp_plus = ( 'ACM', 'MTL' ) + +register_types = { 'OA', 'NOA', 'FLEX', 'PM' } + +default_set_blacklist = { "RenderDX1x", # TODO: rename to something non 'DX' + # specific if this config is generally + # useful + "RenderBalance", # XXX: missing register config + "PipelineTimestamps", # Covered by API timestamp queries + } + +counter_blacklist = { 
+ "DramLlcThroughput", # TODO: The max equation of this counter + # requires the DRAM throughput value. Need to + # investigate how to get this value. +} + +def underscore(name): + s = re.sub('MHz', 'Mhz', name) + s = re.sub(r'\.', '_', s) + s = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', s) + s = re.sub('#', '_', s) + return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s).lower() + +def print_err(*args): + sys.stderr.write(' '.join(map(str,args)) + '\n') + +def read_value(chipset, offset, oa_format): + if offset in chipsets[chipset][oa_format]['register_offsets']: + return chipsets[chipset][oa_format]['register_offsets'][offset] + print_err("Unknown offset register at offset {0}".format(offset)) + assert 0 + + +def read_token_to_rpn_read_oam(chipset, token, raw_offsets, oa_format): + width, offset_str = token.split('@') + offset = int(offset_str, 16) + + if width == 'qw': + den = 8 + else: + den = 4 + + if raw_offsets: + # Location in the HW reports + a_offset = chipsets[chipset][oa_format]['a_offset'] + b_offset = chipsets[chipset][oa_format]['b_offset'] + c_offset = chipsets[chipset][oa_format]['c_offset'] + report_size = chipsets[chipset][oa_format]['oa_report_size'] + + if offset < a_offset: + if offset == 8: + return "GPU_TIME 0 READ" + elif offset == 24: + return "GPU_CLOCK 0 READ" + else: + assert 0 + elif offset < b_offset: + a_cnt_offset = int((offset - a_offset) / den) + return "A " + str(a_cnt_offset) + " READ" + elif offset < c_offset: + return "B " + str(int((offset - b_offset) / den)) + " READ" + elif offset < report_size: + return "C " + str(int((offset - c_offset) / den)) + " READ" + else: + return "{0} READ".format(read_value(chipset, offset, oa_format)) + else: + # Location in the accumulated deltas + idx = int(offset / 8) + if chipset in xehp_plus: + # For XEHPSDV the array of accumulated counters is + # assumed to start with a GPU_TIME then GPU_CLOCK, + # then 38 A counters, then 8 B counters and finally + # 8 C counters. 
+ if idx == 0: + return "GPU_TIME 0 READ" + elif idx == 1: + return "GPU_CLOCK 0 READ" + elif idx < 40: + return "A " + str(idx - 2) + " READ" + elif idx < 48: + return "B " + str(idx - 40) + " READ" + elif idx < 56: + return "C " + str(idx - 48) + " READ" + else: + return "{0} READ".format(read_value(chipset, offset, oa_format)) + + assert 0 + +def read_token_to_rpn_read_oag(chipset, token, raw_offsets, oa_format): + width, offset_str = token.split('@') + + # For Broadwell the raw read notation was extended for 40 bit + # counters: rd40@<32bit_part1_offset>:<8bit_part2_offset> + if width == "rd40": + offset_32_str, offset_8_str = offset_str.split(':') + offset_str = offset_32_str + + offset = int(offset_str, 16) + + if raw_offsets: + # Location in the HW reports + a_offset = chipsets[chipset][oa_format]['a_offset'] + b_offset = chipsets[chipset][oa_format]['b_offset'] + c_offset = chipsets[chipset][oa_format]['c_offset'] + report_size = chipsets[chipset][oa_format]['oa_report_size'] + + if offset < a_offset: + if offset == 4: + return "GPU_TIME 0 READ" + elif offset == 12: + assert chipset != "HSW" # Only for Gen8+ + return "GPU_CLOCK 0 READ" + else: + assert 0 + elif offset < b_offset: + a_cnt_offset = int((offset - a_offset) / 4) + if chipset in xehp_plus: + # Most A counters are in a contiguous array, except + # this A37. + if a_cnt_offset == 42: + return "A 37 READ" + return "A " + str(a_cnt_offset) + " READ" + else: + return "A " + str(a_cnt_offset) + " READ" + elif offset < c_offset: + return "B " + str(int((offset - b_offset) / 4)) + " READ" + elif offset < report_size: + return "C " + str(int((offset - c_offset) / 4)) + " READ" + else: + return "{0} READ".format(read_value(chipset, offset, oa_format)) + else: + # Location in the accumulated deltas + idx = int(offset / 8) + if chipset == "HSW": + # On Haswell accumulated counters are assumed to start + # with GPU_TIME followed by 45 A counters, then 8 B + # counters and finally 8 C counters. 
+ if idx < 1: + return "GPU_TIME 0 READ" + elif idx < 46: + return "A " + str(idx - 1) + " READ" + elif idx < 54: + return "B " + str(idx - 46) + " READ" + elif idx < 62: + return "C " + str(idx - 54) + " READ" + else: + return "{0} READ".format(read_value(chipset, offset, oa_format)) + elif chipset in xehp_plus: + # For XEHPSDV the array of accumulated counters is + # assumed to start with a GPU_TIME then GPU_CLOCK, + # then 38 A counters, then 8 B counters and finally + # 8 C counters. + if idx == 0: + return "GPU_TIME 0 READ" + elif idx == 1: + return "GPU_CLOCK 0 READ" + elif idx < 40: + return "A " + str(idx - 2) + " READ" + elif idx < 48: + return "B " + str(idx - 40) + " READ" + elif idx < 56: + return "C " + str(idx - 48) + " READ" + else: + return "{0} READ".format(read_value(chipset, offset, oa_format)) + else: + # For Gen8+ the array of accumulated counters is + # assumed to start with a GPU_TIME then GPU_CLOCK, + # then 36 A counters, then 8 B counters and finally + # 8 C counters. 
+ if idx == 0: + return "GPU_TIME 0 READ" + elif idx == 1: + return "GPU_CLOCK 0 READ" + elif idx < 38: + return "A " + str(idx - 2) + " READ" + elif idx < 46: + return "B " + str(idx - 38) + " READ" + elif idx < 54: + return "C " + str(idx - 46) + " READ" + else: + return "{0} READ".format(read_value(chipset, offset, oa_format)) + + assert 0 + + +def read_token_to_rpn_read(chipset, token, raw_offsets, oa_format): + if oa_format == '256B_GENERIC_NOA16': + return read_token_to_rpn_read_oag(chipset, token, raw_offsets, oa_format) + + if oa_format in ['192B_MPEC8LL_NOA16', '128B_MPEC8_NOA16']: + return read_token_to_rpn_read_oam(chipset, token, raw_offsets, oa_format) + + assert 0 + +def replace_read_tokens_with_rpn_read_ops(chipset, oa_format, equation, raw_offsets): + # MDAPI MetricSet equations use tokens like 'dw@0xff' for reading raw + # values from snapshots, but this doesn't seem convenient for a few + # reasons: + # + # 1) The offsets hide the particular a, b, or c counter they + # correspond to which in turn makes it awkward to experiment + # with different report sizes which trade off how many a, b and + # c counters are available + # + # 2) Raw reads could be represented as RPN operations too, and + # the consistency could make them slightly easier for tools to + # handle, E.g: + # + # "A 5 READ" = read A counter 5 + # + # We replace dw@ address tokens with GPU_TIME, A, B or C READ ops... 
+ # + + tokens = equation.split() + equation = "" + + for token in tokens: + if '@' in token: + read_exp = read_token_to_rpn_read(chipset, token, raw_offsets, oa_format) + equation = equation + " " + read_exp + else: + equation = equation + " " + token + + return equation + + +parser = argparse.ArgumentParser() +parser.add_argument("xml", nargs="+", help="XML description of metrics") +parser.add_argument("--guids", required=True, help="Metric set GUID registry") +parser.add_argument("--whitelist", help="Only output for given, space-separated, sets") +parser.add_argument("--blacklist", help="Don't generate anything for given metric sets") +parser.add_argument("--merge", help="Additional meta data to merge into the result") +parser.add_argument("--dry-run", action="store_true", + help="Not generate new XML but to check any errors") + +args = parser.parse_args() + +metrics = et.Element('metrics') +tree = et.ElementTree(metrics) + +def apply_aliases(text, aliases): + if aliases == None: + return text + + for alias in aliases.split(','): + (a, b) = alias.split('|') + text = re.sub(r"\b%s\b" % re.escape(a), b, text) + + a = a.lower() + b = b.lower() + text = re.sub(r"\b%s\b" % re.escape(a), b, text) + + return text + +def strip_dx_apis(text): + if text == None: + return "" + stripped = "" + apis = text.split() + for api in apis: + if api[:2] != "DX": + stripped = stripped + " " + api + + return stripped.strip() + +def add_gpu_core_clocks_if_missing(metric_set, counters, counter_deps): + if len(counters) < 1: + return + + for name,element in counters.items(): + if name == 'GpuCoreClocks': + return + + print_err("WARNING: add missing GpuCoreClocks counter for MetricSets=\"{0}\"".format(metric_set.get('ShortName'))) + counter = et.SubElement(metric_set, 'Metric') + counter.set("SymbolName", "GpuCoreClocks") + counter.set("SignalName", "oa.fixed") + counter.set("ShortName", "GPU Core Clocks") + counter.set("LongName", "The total number of GPU core clocks elapsed during the 
measurement.") + counter.set("Group", "GPU") + counter.set("UsageFlags", "Tier1 Frame Batch Draw") + counter.set("MetricType", "EVENT") + counter.set("ResultType", "UINT64") + counter.set("MetricUnits", "cycles") + counter.set("HWUnitType", "GPU") + counter.set("SnapshotReportReadEquation", "dw@0x0c") + counter.set("SnapshotReportDeltaFunction", "DELTA 32") + counter.set("DeltaReportReadEquation", "qw@0x08") + counter.set("NormalizationEquation", "") + + counters["GpuCoreClocks"] = counter + counter_deps["GpuCoreClocks"] = [] + +# For recursively appending counters in order of dependencies... +def append_deps_and_counter(mdapi_counter, mdapi_counters, deps, + sorted_array, sorted_set): + symbol_name = oa_registry.Registry.sanitize_symbol_name(mdapi_counter.get('SymbolName')) + + if symbol_name in sorted_set: + return + + for dep_name in deps[symbol_name]: + if dep_name in mdapi_counters: + append_deps_and_counter(mdapi_counters[dep_name], mdapi_counters, deps, + sorted_array, sorted_set) + + sorted_array.append(mdapi_counter) + sorted_set[symbol_name] = mdapi_counter + +def sort_counters(mdapi_counters, deps): + sorted_array = [] + sorted_set = {} # counters in here have been added to array + for symbol_name in mdapi_counters: + append_deps_and_counter(mdapi_counters[symbol_name], mdapi_counters, deps, + sorted_array, sorted_set) + + return sorted_array + +def expand_macros(equation): + equation = equation.replace('GpuDuration', "$Self 100 UMUL $GpuCoreClocks FDIV") + equation = equation.replace('EuAggrDuration', "$Self $EuCoresTotalCount UDIV 100 UMUL $GpuCoreClocks FDIV") + return equation + +def fixup_equation(equation): + if equation is None: + return None + return equation.replace('$SubliceMask', '$SubsliceMask') + +# The MDAPI XML files sometimes duplicate the same Flex EU/OA regs +# between configs with different AvailabilityEquations even though the +# availability checks are only expected to affect the MUX configs +# +# We iterate all the configs to filter 
out the FLEX/OA configs and +# double check that there's never any variations between repeated +# configs +# +def filter_single_config_registers_of_type(mdapi_metric_set, type, oa_format): + regs = [] + for mdapi_reg_config in mdapi_metric_set.findall("RegConfigStart"): + tmp_regs = [] + for mdapi_reg in mdapi_reg_config.findall("Register"): + reg = (int(mdapi_reg.get('offset'),16), int(mdapi_reg.get('value'),16)) + + if reg[0] in chipsets[chipset][oa_format]['config_reg_blacklist']: + continue + + if mdapi_reg.get('type') == type: + tmp_regs.append(reg) + + if len(tmp_regs) > 0: + bad = False + if len(regs) == 0: + regs = tmp_regs + elif len(regs) != len(tmp_regs): + bad = True + else: + for i in range(0, len(regs)): + if regs[i] != tmp_regs[i]: + bad = True + break + if bad: + print_err("ERROR: multiple, differing FLEX/OA configs for one set: MetricSet=\"" + mdapi_metric_set.get('ShortName')) + sys.exit(1) + + return regs + + +# We only have a very small number of IDs, but we aren't assuming they +# start from zero or are contiguous in the MDAPI XML files. 
Python +# doesn't seem to have a built in sparse array type so we just +# loop over the entries we have: +def get_mux_id_group(id_groups, id): + for group in id_groups: + if group['id'] == id: + return group + + new_group = { 'id': id, 'configs': [] } + id_groups.append(new_group) + + return new_group + + + +def process_mux_configs(mdapi_set, oa_format): + allow_missing_id = True + + mux_config_id_groups = [] + + for mdapi_reg_config in mdapi_set.findall("RegConfigStart"): + + mux_regs = [] + for mdapi_reg in mdapi_reg_config.findall("Register"): + address = int(mdapi_reg.get('offset'), 16) + + if address in chipsets[chipset][oa_format]['config_reg_blacklist']: + continue + + reg_type = mdapi_reg.get('type') + + if reg_type not in register_types: + print_err("ERROR: unknown register type=\"" + reg_type + "\": MetricSet=\"" + mdapi_set.get('ShortName')) + sys.exit(1) + + if reg_type != 'NOA' and reg_type != 'PM': + continue + + reg = (address, int(mdapi_reg.get('value'), 16)) + mux_regs.append(reg) + + if len(mux_regs) == 0: + continue + + availability = mdapi_reg_config.get('AvailabilityEquation') + if availability == "": + availability = None + + if mdapi_reg_config.get('ConfigPriority') != None: + reg_config_priority = int(mdapi_reg_config.get('ConfigPriority')) + else: + reg_config_priority = 0 + + if mdapi_reg_config.get('ConfigId') != None: + reg_config_id = int(mdapi_reg_config.get('ConfigId')) + allow_missing_id = False + elif mdapi_reg_config.get('ConfigId') == None and allow_missing_id == True: + reg_config_id = 0 + else: + # It will spell trouble if there's a mixture of explicit and + # implied config IDs... 
+ print_err("ERROR: register configs mixing implied/explicit IDs: MetricSet=\"" + mdapi_set.get('ShortName')) + sys.exit(1) + + mux_config = { 'priority': reg_config_priority, + 'availability': availability, + 'registers': mux_regs } + + mux_config_id_group = get_mux_id_group(mux_config_id_groups, reg_config_id) + mux_config_id_group['configs'].append(mux_config) + + mux_config_id_groups.sort(key=itemgetter('id')) + + # The only special case we currently support for more than one group of NOA + # MUX configs is for the Broadwell ComputeExtended metric set with two Id + # groups and the second just has a single unconditional config that can + # logically be appended to all the conditional configs of the first group + if len(mux_config_id_groups) > 1: + if len(mux_config_id_groups) != 2: + print_err("ERROR: Script doesn't currently allow more than two groups of NOA MUX configs for a single metric set: MetricSet=\"" + mdapi_set.get('ShortName')) + sys.exit(1) + + last_id_group = mux_config_id_groups[-1] + if len(last_id_group['configs']) != 1: + print_err("ERROR: Script currently only allows up to two Ids for NOA MUX configs if second Id only contains a single unconditional config: MetricSet=\"" + mdapi_set.get('ShortName')) + sys.exit(1) + + tail_config = last_id_group['configs'][0] + for mux_config in mux_config_id_groups[0]['configs']: + mux_config['registers'] = mux_config['registers'] + tail_config['registers'] + + mux_config_id_groups = [mux_config_id_groups[0]] + + if len(mux_config_id_groups) == 0 or len(mux_config_id_groups[0]['configs']) == 0: + return () + + mux_configs = mux_config_id_groups[0]['configs'] + assert isinstance(mux_configs, list) + assert len(mux_configs) >= 1 + assert len(mux_configs[0]['registers']) > 1 # > 1 registers + return mux_configs + + +def add_register_config(set, priority, availability, regs, type): + reg_config = et.SubElement(set, 'register_config') + + reg_config.set('type', type) + + if availability != None: + assert type == "NOA" 
+ reg_config.set('priority', str(priority)) + reg_config.set('availability', availability) + + for reg in regs: + elem = et.SubElement(reg_config, 'register') + elem.set('type', type) + elem.set('address', "0x%08X" % reg[0]) + elem.set('value', "0x%08X" % reg[1]) + +def to_text(value): + if value == None: + return "" + return value + +# There are duplicated metric sets with the same symbol name so we +# keep track of the sets we've read so we can skip duplicates... +sets = {} + +guids = {} + +guids_xml = et.parse(args.guids) +for guid in guids_xml.findall(".//guid"): + hashing_key = oa_registry.Registry.chipset_derive_hash(guid.get('chipset'), + guid.get('name'), + guid.get('mdapi_config_hash')) + guids[hashing_key] = guid.get('id') + +for arg in args.xml: + mdapi = et.parse(arg) + + concurrent_group = mdapi.find(".//ConcurrentGroup") + chipset = oa_registry.Registry.chipset_name(concurrent_group.get('SupportedHW')) + + chipset_fullname = chipset + if concurrent_group.get('SupportedGT') != None: + chipset_fullname = chipset_fullname + oa_registry.Registry.gt_name(concurrent_group.get('SupportedGT')) + if chipset not in chipsets: + print_err("WARNING: unsupported chipset {0}, consider updating {1}".format(chipset, __file__)) + continue + + for mdapi_set in mdapi.findall(".//MetricSet"): + + apis = mdapi_set.get('SupportedAPI') + if "OGL" not in apis and "OCL" not in apis and "MEDIA" not in apis and "IO" not in apis: + continue + + oa_format = '256B_GENERIC_NOA16' + if mdapi_set.get('ReportType') in chipsets[chipset]: + oa_format = mdapi_set.get('ReportType') + + set_symbol_name = oa_registry.Registry.sanitize_symbol_name(mdapi_set.get('SymbolName')) + + if set_symbol_name in sets: + print_err("WARNING: duplicate set named \"" + set_symbol_name + "\" (SKIPPING)") + continue + + if args.whitelist: + set_whitelist = args.whitelist.split() + if set_symbol_name not in set_whitelist: + continue + + if args.blacklist: + set_blacklist = args.blacklist.split() + else: + 
set_blacklist = default_set_blacklist + if set_symbol_name in set_blacklist: + continue + + if mdapi_set.get('SnapshotReportSize') != str(chipsets[chipset][oa_format]['oa_report_size']): + print_err("WARNING: skipping metric set '{0}', report size {1} invalid".format(set_symbol_name, mdapi_set.get('SnapshotReportSize'))) + continue + + set = et.SubElement(metrics, 'set') + + set.set('chipset', chipset_fullname) + + set.set('name', mdapi_set.get('ShortName')) + set.set('symbol_name', set_symbol_name) + set.set('underscore_name', underscore(mdapi_set.get('SymbolName'))) + set.set('mdapi_supported_apis', strip_dx_apis(mdapi_set.get('SupportedAPI'))) + set.set('oa_format', oa_format) + + + # Look at the hardware register config before looking at the counters. + # + # The hardware configuration is used as a key to lookup up a GUID which + # is used by applications to lookup the corresponding counter + # normalization equations. + # + # We want to skip over any metric sets that don't yet have a registered + # GUID in guids.xml. + + # There can be multiple NOA MUX configs, since they may have associated + # availability tests to match particular systems. + # + # Unlike the MDAPI XML files we only support tracking one group of + # mutually exclusive MUX configs, whereas the MDAPI XML files + # theoretically allow a single metric set to be associated with ordered + # groups of mutually exclusive configs. So far there is only one + # Broadwell, ComputeExtended metric set which uses this, but that + # particular case can be expressed in less general terms. + # + # Being a bit simpler here should make it easier for downstream tools + # to deal with. 
(At least we got the handling of the Broadwell + # ComputeExtended example wrong and it took several email exchanges and + # a conference call to confirm how to interpret this case) + mux_configs = process_mux_configs(mdapi_set, oa_format) + + # Unlike for MUX registers, we only expect one set of FLEX/OA + # registers per metric set (even though they are sometimes duplicated + # between configs in MDAPI XML files. + # + # This filter function, extracts the register of a certain type but + # also double checks that if they are repeated in separate configs that + # they don't vary. (Notably the current xe_oa Linux driver would + # need some adapting to support multiple OA/FLEX configs with different + # availability expressions) + # + flex_regs = filter_single_config_registers_of_type(mdapi_set, "FLEX", oa_format) + oa_regs = filter_single_config_registers_of_type(mdapi_set, "OA", oa_format) + + + # Note: we ignore Perfmon registers + + for mux_config in mux_configs: + add_register_config(set, mux_config['priority'], mux_config['availability'], mux_config['registers'], "NOA") + if len(oa_regs) > 0: + add_register_config(set, 0, None, oa_regs, "OA") + if len(flex_regs) > 0: + add_register_config(set, 0, None, flex_regs, "FLEX") + + mdapi_hw_config_hash = oa_registry.Registry.mdapi_hw_config_hash(mdapi_set) + guid_hash = oa_registry.Registry.chipset_derive_hash(chipset_fullname.lower(), + set_symbol_name, + mdapi_hw_config_hash) + hw_config_hash = oa_registry.Registry.hw_config_hash(set) + + if guid_hash in guids: + set.set('hw_config_guid', guids[guid_hash]) + else: + print_err("WARNING: No GUID found for metric set " + chipset_fullname + ", " + set_symbol_name + " (SKIPPING)") + print_err("WARNING: If this is a new config add the following to guids.xml:") + print_err("") + metrics.remove(set) + continue + + + sets[set_symbol_name] = set + + counters = {} + normalization_equations = {} + raw_equations = {} + + # Awkwardly we can't assume metrics are in dependency 
order and have to + # sort them manually. We start by associating a list of dependencies with + # each counter... + + mdapi_counters = {} + mdapi_counter_deps = {} + + for mdapi_counter in mdapi_set.findall("Metrics/Metric"): + symbol_name = oa_registry.Registry.sanitize_symbol_name(mdapi_counter.get('SymbolName')) + + if symbol_name in counter_blacklist: + continue; + + # Have seen at least one MetricSet with a duplicate GpuCoreClocks counter... + if symbol_name in mdapi_counters: + print_err("WARNING: Skipping duplicate counter \"" + symbol_name + \ + "\" in " + set.get('name') + " :: " + mdapi_counter.get('ShortName')) + continue; + + deps = [] + equations = fixup_equation(str(mdapi_counter.get('SnapshotReportReadEquation'))) + " " + \ + fixup_equation(str(mdapi_counter.get('SnapshotReportDeltaEquation'))) + " " + \ + fixup_equation(str(mdapi_counter.get('DeltaReportReadEquation'))) + " " + \ + fixup_equation(str(mdapi_counter.get('NormalizationEquation'))) + equations = expand_macros(equations) + equations = equations.replace('$$', "$") + for token in equations.split(): + if token[0] == '$' and not codegen.is_hw_var(token) and token[1:] != "Self": + deps.append(token[1:]) + + mdapi_counters[symbol_name] = mdapi_counter + mdapi_counter_deps[symbol_name] = deps + + add_gpu_core_clocks_if_missing(mdapi_set, mdapi_counters, mdapi_counter_deps) + sorted_mdapi_counters = sort_counters(mdapi_counters, mdapi_counter_deps) + + for mdapi_counter in sorted_mdapi_counters: + + aliases = mdapi_counter.get('Alias') + + skip_counter = False + + # We don't currently support configuring and reading perfmon registers + signal = mdapi_counter.get('SignalName') + if signal and "perfmon" in signal: + continue; + + # A few things to fixup with this common counter... + if mdapi_counter.get('SymbolName') == "AvgGpuCoreFrequencyMHz": + # To avoid requiring a special case in tools, add a max value + # equation for the gpu frequency... 
+ mdapi_counter.set('MaxValueEquation', "$GpuMaxFrequency") + + # Don't include units in the name + mdapi_counter.set('SymbolName', "AvgGpuCoreFrequency") + + # Use canonical, first order of magnitude units specifier + mdapi_counter.set('MetricUnits', 'Hz') + mdapi_counter.set('NormalizationEquation', '$GpuCoreClocks 1000000000 UMUL $GpuTime UDIV') + #mdapi_counter.set('DeltaReportReadEquation', '$GpuCoreClocks $GpuTime UDIV') + + + symbol_name = oa_registry.Registry.sanitize_symbol_name(mdapi_counter.get('SymbolName')) + + counter = et.SubElement(set, 'counter') + counter.set('name', apply_aliases(mdapi_counter.get('ShortName'), aliases)) + counter.set('symbol_name', oa_registry.Registry.sanitize_symbol_name(mdapi_counter.get('SymbolName'))) + counter.set('underscore_name', underscore(mdapi_counter.get('SymbolName'))) + counter.set('description', apply_aliases(mdapi_counter.get('LongName'), aliases)) + counter.set('mdapi_group', apply_aliases(to_text(mdapi_counter.get('Group')), aliases)) + counter.set('mdapi_usage_flags', to_text(mdapi_counter.get('UsageFlags'))) + counter.set('mdapi_supported_apis', strip_dx_apis(mdapi_counter.get('SupportedAPI'))) + low = mdapi_counter.get('LowWatermark') + if low: + counter.set('low_watermark', low) + high = to_text(mdapi_counter.get('HighWatermark')) + if high: + counter.set('high_watermark', high) + counter.set('data_type', mdapi_counter.get('ResultType').lower()) + + max_eq = fixup_equation(mdapi_counter.get('MaxValueEquation')) + if max_eq: + counter.set('max_equation', max_eq) + + # XXX Not sure why EU metrics tend to just be bundled under 'gpu' + counter.set('mdapi_hw_unit_type', mdapi_counter.get('HWUnitType').lower()) + + # Some counters do not have MetricUnits, treat them as number. 
+ if mdapi_counter.get('MetricUnits') == None: + units = "number" + else: + units = mdapi_counter.get('MetricUnits').lower() + + # There are counters representing cycle counts that have a semantic + # type of 'duration' which doesn't seem to make sense... + if units == "cycles": + semantic_type = "event" + else: + semantic_type = mdapi_counter.get('MetricType').lower() + + counter.set('units', units) + counter.set('semantic_type', semantic_type) + + # MDAPI MetricSets have 3 different kinds of counter read equations: + # + # 1) One for reading a raw (unnormalized) value from a hardware report + # + # The line between normalized and raw isn't always clear + # as the raw equation may e.g. read and ADD multiple counters + # + # Not all counters have a raw equation if they are instead + # derived through $CounterName references to other counters + # in a normalized value equation + # + # 2) One for reading an unnormalized value from the accumulated 'delta reports' + # + # Seems to duplicate the raw equation but with delta report + # offsets and referencing 64bit values + # + # The normalized value equations are always based on these + # accumulated delta values + # + # 3) One for reading a normalized value + # + # These may start with a reference to "$Self" which is + # effectively a macro for the above delta report equation + # + # If this is missing the delta report equation is effectively + # the normalized equation too + # + # XXX: Beware that there are some inconsistent counters that + # have a normalization equation with a $Self reference and a + # raw equation but no delta report equation. This seems + # pretty sketchy, but (at least for 'MEDIA' metrics) we will + # substitute the raw equation for $Self in this case along + # with a warning to double check the results. 
+ # + # Currently there doesn't appear to be a clear reason to + # differentiate these equations and the separation seems to + # complicate things for tools wanting to generate code from this + # data. + # + # We instead aim to have one normalized equation per counter that + # always reference accumulated counter values. + + # XXX: As a special case, we override the raw and delta report + # equations for the GpuTime counters, which seem inconsistent + if mdapi_counter.get('SymbolName') == "GpuTime": + mdapi_counter.set('DeltaReportReadEquation', "qw@0x0 1000000000 UMUL $GpuTimestampFrequency UDIV") + if chipset == 'MTL' and oa_format != '256B_GENERIC_NOA16': + mdapi_counter.set('SnapshotReportReadEquation', "qw@0x08 1000000000 UMUL $GpuTimestampFrequency UDIV") + else: + mdapi_counter.set('SnapshotReportReadEquation', "dw@0x04 1000000000 UMUL $GpuTimestampFrequency UDIV") + + availability = fixup_equation(mdapi_counter.get('AvailabilityEquation')) + if availability == "": + availability = None + + # We prefer to only look at the equations that reference the raw + # reports since the mapping of offsets back to A,B,C counters is + # unambiguous, but if necessary we will fall back to mapping + # delta report offsets (accumulated 64bit values that correspond + # to the 32bit or 40bit values from raw reports) + + raw_read_eq = fixup_equation(mdapi_counter.get('SnapshotReportReadEquation')) + if raw_read_eq: + if raw_read_eq == "": + raw_read_eq = None + else: + raw_read_eq = replace_read_tokens_with_rpn_read_ops(chipset, + oa_format, + raw_read_eq, + True) #raw offsets + + delta_read_eq = fixup_equation(mdapi_counter.get('DeltaReportReadEquation')) + if delta_read_eq: + if delta_read_eq == "": + delta_read_eq = None + else: + delta_read_eq = replace_read_tokens_with_rpn_read_ops(chipset, + oa_format, + delta_read_eq, + False) #delta offsets + + if raw_read_eq and not delta_read_eq: + print_err("WARNING: Counter with raw equation but no delta report equation: MetricSet=\"" 
+ \ + mdapi_set.get('ShortName') + "\" Metric=\"" + mdapi_counter.get('SymbolName') + \ + "(" + mdapi_counter.get('ShortName') + ")" + "\"") + # Media metric counters currently have no delta equation even + # though they have normalization equations that reference $Self + if "MEDIA" in apis or "IO" in apis: + print_err("WARNING: -> Treating inconsistent media metric's 'raw' equation as a 'delta report' equation, but results should be double checked!") + delta_read_eq = raw_read_eq + else: + set.remove(counter) + continue + + # Some counters are sourced from register values that are + # not put into the OA reports. This is why some counters + # will have a delta equation but not a raw equation. These + # counters are typically only available in query mode. For + # this reason we put a particular availability value. + if delta_read_eq and not raw_read_eq: + assert availability == None + availability = "true $QueryMode &&" + raw_read_eq = delta_read_eq + + # After replacing read tokens with RPN counter READ ops the raw and + # delta equations are expected to be identical so warn if that's + # not true... 
+ if bool(raw_read_eq) ^ bool(delta_read_eq) or raw_read_eq != delta_read_eq: + print_err(("WARNING: Inconsistent raw and delta report equations for {0} :: {1} ({2}): " + + "raw=\"{3}\" / \"{4}\" delta=\"{5}\" / \"{6}\" (SKIPPING)") + .format(mdapi_set.get('ShortName'), + mdapi_counter.get('SymbolName'), + mdapi_counter.get('ShortName'), + str(raw_read_eq), + mdapi_counter.get('SnapshotReportReadEquation'), + str(delta_read_eq), + mdapi_counter.get('DeltaReportReadEquation'))) + set.remove(counter) + continue + + normalize_eq = fixup_equation(mdapi_counter.get('NormalizationEquation')) + if normalize_eq and normalize_eq == "": + normalize_eq = None + + if normalize_eq: + # Some normalization equations are represented with macros such as + # 'GpuDuration' corresponding to: + # + # "$Self 100 UMUL $GpuCoreClocks FDIV" + # + # We expand macros here so tools don't need to care about them... + # + equation = normalize_eq + equation = expand_macros(equation) + if raw_read_eq: + equation = equation.replace('$Self', raw_read_eq) + else: + equation = delta_read_eq + + if '$Self' in equation: + print_err("WARNING: Counter equation (\"" + equation + "\") with unexpanded $Self token: MetricSet=\"" + \ + mdapi_set.get('ShortName') + "\" Metric=\"" + mdapi_counter.get('SymbolName') + \ + "(" + mdapi_counter.get('ShortName') + ")" + "\" (SKIPPING)") + set.remove(counter) + continue + + # $$CounterName vs $CounterName in an equation is intended to + # differentiate referencing the normalized or raw value of another + # counter. 
+ # + # Since we are only keeping a single (normalized) equation for + # counters we only need one form, but we want to be careful to + # check if any equations really depend on the raw value of another + # counter so we can expand those variables now + # + tmp = equation + for token in tmp.split(): + if token[0] == '$' and token[1] != '$': + if token[1:] in normalization_equations: + raw_eq = raw_equations[token[1:]] + + equation = equation.replace(token, raw_eq) + #if token[1:] not in raw_equations: + # print_err("WARNING: Counter equation (\"" + equation + "\") references un-kept raw equation of another counter : MetricSet=\"" + \ + # mdapi_set.get('ShortName') + "\" Metric=\"" + mdapi_counter.get('ShortName') + "\"") + + elif token[1:] not in raw_equations and not codegen.is_hw_var(token): + print_err("Unknown variable name: \"" + token + "\" in equation \"" + equation + "\"") + + symbol_name = counter.get('symbol_name') + + # Make sure that every variable in the equation is a known sys_var or counter name + equation = equation.replace('$$', "$") + for token in equation.split(): + if token[0] == '$': + if token[1:] not in counters and not codegen.is_hw_var(token): + print_err("WARNING: Counter equation (\"" + equation + "\") with unknown variable " + \ + token + " (maybe skipped counter): MetricSet=\"" + mdapi_set.get('ShortName') + \ + "\" Metric=\"" + mdapi_counter.get('SymbolName') + "(" + mdapi_counter.get('ShortName') + \ + ")" + "\" (SKIPPING)") + set.remove(counter) + skip_counter = True + break + + if skip_counter: + continue + + counter.set('equation', equation.strip()) + + if availability != None: + counter.set('availability', availability) + + counters[symbol_name] = counter; + if normalize_eq: + normalization_equations[symbol_name] = normalize_eq + if raw_read_eq: + raw_equations[symbol_name] = raw_read_eq + + +if args.dry_run: + sys.exit(0) + +# Merge in any custom meta data we have... 
+if args.merge: + merge = et.parse(args.merge) + merge_metrics = merge.getroot() + + for merge_set in merge.findall(".//set"): + pattern = ".//set[@symbol_name=\"" + merge_set.get('symbol_name') + "\"][@chipset=\"" + merge_set.get('chipset') + "\"]" + real_set = metrics.find(pattern) + if real_set is not None: + for set_attr in merge_set.items(): + real_set.set(set_attr[0], set_attr[1]) + + for merge_elem in merge_set: + if merge_elem.tag == "counter": + merge_counter = merge_elem + pattern = "counter[@symbol_name=\"" + merge_counter.get('symbol_name') + "\"]" + real_counter = real_set.find(pattern) + if real_counter is not None: + for counter_attr in merge_counter.items(): + real_counter.set(counter_attr[0], counter_attr[1]) + else: + real_set.append(merge_counter) + real_counter = merge_counter + else: + real_set.append(merge_elem) + + # For consistency + readability print everything manually... + merge_md5 = hashlib.md5(open("merge.xml", 'rb').read()).hexdigest() +else: + merge_md5 = "" + +print ("") +print("") +for set in metrics.findall(".//set"): + print(" ") + for counter in set.findall("counter"): + print(" ") + for config in set.findall("register_config"): + if config.get('availability') != None: + print(" ") + else: + print(" ") + for reg in config.findall("register"): + addr = int(reg.get('address'), 16) + + if 'registers' in chipsets[chipset][oa_format] and addr in chipsets[chipset][oa_format]['registers']: + reg_info = chipsets[chipset][oa_format]['registers'][addr] + comment = ' ' + else: + comment = '' + + print(" " + comment) + print(" ") + print(" \n") +print("") diff --git a/lib/xe/oa-configs/oa-equations-codegen.py b/lib/xe/oa-configs/oa-equations-codegen.py new file mode 100644 index 0000000000..a4a00f46d9 --- /dev/null +++ b/lib/xe/oa-configs/oa-equations-codegen.py @@ -0,0 +1,261 @@ +#!/usr/bin/env python3 +# +# SPDX-License-Identifier: MIT +# +# Copyright © 2024 Intel Corporation +# +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF 
ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +# IN THE SOFTWARE. + +import argparse +import os +import sys +import textwrap + +import codegen + +h = None +c = None + +hashed_funcs = {} + +def data_type_to_ctype(ret_type): + if ret_type == "uint64": + return "uint64_t" + elif ret_type == "float": + return "double" + else: + raise Exception("Unhandled case for mapping \"" + ret_type + "\" to a C type") + + +def output_counter_read(gen, set, counter): + if counter.read_hash in hashed_funcs: + return + + c("\n") + c("/* {0} :: {1} */".format(set.name, counter.get('name'))) + + ret_type = counter.get('data_type') + ret_ctype = data_type_to_ctype(ret_type) + read_eq = counter.get('equation') + + c(ret_ctype) + c(counter.read_sym + "(const struct intel_perf *perf,\n") + c.indent(len(counter.read_sym) + 1) + c("const struct intel_perf_metric_set *metric_set,\n") + c("uint64_t *accumulator)\n") + c.outdent(len(counter.read_sym) + 1) + + c("{") + c.indent(4) + + gen.output_rpn_equation_code(set, counter, read_eq) + + c.outdent(4) + c("}") + + hashed_funcs[counter.read_hash] = counter.read_sym + + +def output_counter_read_definition(gen, set, counter): + if counter.read_hash in hashed_funcs: + h("#define %s \\" % counter.read_sym) + h.indent(4) + h("%s" % hashed_funcs[counter.read_hash]) + h.outdent(4) + else: + ret_type = counter.get('data_type') + ret_ctype = data_type_to_ctype(ret_type) + read_eq = counter.get('equation') + + h(ret_ctype) + h(counter.read_sym + "(const struct intel_perf *perf,\n") + h.indent(len(counter.read_sym) + 1) + h("const struct intel_perf_metric_set *metric_set,\n") + h("uint64_t 
*accumulator);\n") + h.outdent(len(counter.read_sym) + 1) + + hashed_funcs[counter.read_hash] = counter.read_sym + + +def output_counter_max(gen, set, counter): + max_eq = counter.get('max_equation') + + if not max_eq or max_eq == "100": + return + + if counter.max_hash in hashed_funcs: + return + + c("\n") + c("/* {0} :: {1} */".format(set.name, counter.get('name'))) + + ret_type = counter.get('data_type') + ret_ctype = data_type_to_ctype(ret_type) + + c(ret_ctype) + c(counter.max_sym + "(const struct intel_perf *perf,\n") + c.indent(len(counter.max_sym) + 1) + c("const struct intel_perf_metric_set *metric_set,\n") + c("uint64_t *accumulator)\n") + c.outdent(len(counter.max_sym) + 1) + + c("{") + c.indent(4) + + gen.output_rpn_equation_code(set, counter, max_eq) + + c.outdent(4) + c("}") + + hashed_funcs[counter.max_hash] = counter.max_sym + + +def output_counter_max_definition(gen, set, counter): + max_eq = counter.get('max_equation') + + if not max_eq or max_eq == "100": + return + + if counter.max_hash in hashed_funcs: + h("#define %s \\" % counter.max_sym) + h.indent(4) + h("%s" % hashed_funcs[counter.max_hash]) + h.outdent(4) + h("\n") + else: + ret_type = counter.get('data_type') + ret_ctype = data_type_to_ctype(ret_type) + + h(ret_ctype) + + h(counter.max_sym + "(const struct intel_perf *perf,") + h.indent(len(counter.max_sym) + 1) + h("const struct intel_perf_metric_set *metric_set,") + h("uint64_t *accumulator);") + h.outdent(len(counter.max_sym) + 1) + h("\n") + + hashed_funcs[counter.max_hash] = counter.max_sym + + +def generate_equations(args, gens): + global hashed_funcs + + header_file = os.path.basename(args.header) + header_define = header_file.replace('.', '_').upper() + + hashed_funcs = {} + c(textwrap.dedent("""\ + #include + #include + + #include "xe/xe_oa.h" + #include "%s" + + #define MIN(x, y) (((x) < (y)) ? (x) : (y)) + #define MAX(a, b) (((a) > (b)) ? 
(a) : (b)) + + double + percentage_max_callback_float(const struct intel_perf *perf, + const struct intel_perf_metric_set *metric_set, + uint64_t *accumulator) + { + return 100; + } + + uint64_t + percentage_max_callback_uint64(const struct intel_perf *perf, + const struct intel_perf_metric_set *metric_set, + uint64_t *accumulator) + { + return 100; + } + + """ % os.path.basename(args.header))) + + # Print out all equation functions. + for gen in gens: + for set in gen.sets: + for counter in set.counters: + output_counter_read(gen, set, counter) + output_counter_max(gen, set, counter) + + hashed_funcs = {} + h(textwrap.dedent("""\ + #ifndef __%s__ + #define __%s__ + + #include + #include + #include + + struct intel_perf; + struct intel_perf_metric_set; + + double + percentage_max_callback_float(const struct intel_perf *perf, + const struct intel_perf_metric_set *metric_set, + uint64_t *accumulator); + uint64_t + percentage_max_callback_uint64(const struct intel_perf *perf, + const struct intel_perf_metric_set *metric_set, + uint64_t *accumulator); + + """ % (header_define, header_define))) + + # Print out all equation functions. + for gen in gens: + for set in gen.sets: + for counter in set.counters: + output_counter_read_definition(gen, set, counter) + output_counter_max_definition(gen, set, counter) + + h(textwrap.dedent("""\ + + #endif /* __%s__ */ + """ % header_define)) + + +def main(): + global c + global h + + parser = argparse.ArgumentParser() + parser.add_argument("--header", help="Header file to write") + parser.add_argument("--code", help="C file to write") + parser.add_argument("xml_files", nargs='+', help="List of xml metrics files to process") + + args = parser.parse_args() + + # Note: either arg may == None + h = codegen.Codegen(args.header) + c = codegen.Codegen(args.code) + + gens = [] + for xml_file in args.xml_files: + gens.append(codegen.Gen(xml_file, c)) + + copyright = textwrap.dedent("""\ + /* Autogenerated file, DO NOT EDIT manually! 
generated by {} */ + // SPDX-License-Identifier: MIT + /* + * Copyright © 2024 Intel Corporation + */ + + """).format(os.path.basename(__file__)) + + h(copyright) + c(copyright) + + generate_equations(args, gens) + + +if __name__ == '__main__': + main() diff --git a/lib/xe/oa-configs/oa-metricset-codegen.py b/lib/xe/oa-configs/oa-metricset-codegen.py new file mode 100644 index 0000000000..be9483af02 --- /dev/null +++ b/lib/xe/oa-configs/oa-metricset-codegen.py @@ -0,0 +1,257 @@ +#!/usr/bin/env python3 +# +# SPDX-License-Identifier: MIT +# +# Copyright © 2024 Intel Corporation + +import argparse +import os +import sys +import textwrap + +import codegen + +h = None +c = None + +semantic_type_map = { + "duration": "raw", + "ratio": "event" + } + +def output_units(unit): + return unit.replace(' ', '_').upper() + +def availability_func_name(set, counter): + return set.gen.chipset + "_" + set.underscore_name + "_" + counter.get('symbol_name') + "_availability" + +def output_availability_funcs(set, counter): + availability = counter.get('availability') + if availability: + c("static bool " + availability_func_name(set, counter) + "(const struct intel_perf *perf) {") + c.indent(4) + set.gen.output_availability(set, availability, counter.get('name')) + c.indent(4) + c("return true;") + c.outdent(4) + c("}") + c("return false;") + c.outdent(4) + c("}") + +def output_counter_report(set, counter): + data_type = counter.get('data_type') + data_type_uc = data_type.upper() + c_type = data_type + + if "uint" in c_type: + c_type = c_type + "_t" + + semantic_type = counter.get('semantic_type') + if semantic_type in semantic_type_map: + semantic_type = semantic_type_map[semantic_type] + + semantic_type_uc = semantic_type.upper() + + c("\n") + + c("{") + c.indent(4) + c(".name = \"{0}\",\n".format(counter.get('name'))) + c(".symbol_name = \"{0}\",\n".format(counter.get('symbol_name'))) + c(".desc = \"{0}\",\n".format(counter.get('description'))) + c(".type = 
INTEL_PERF_LOGICAL_COUNTER_TYPE_{0},\n".format(semantic_type_uc)) + c(".storage = INTEL_PERF_LOGICAL_COUNTER_STORAGE_{0},\n".format(data_type_uc)) + c(".unit = INTEL_PERF_LOGICAL_COUNTER_UNIT_{0},\n".format(output_units(counter.get('units')))) + c(".read_{0} = {1},\n".format(data_type, set.read_funcs["$" + counter.get('symbol_name')])) + c(".max_{0} = {1},\n".format(data_type, set.max_funcs["$" + counter.get('symbol_name')])) + c(".group = \"{0}\",\n".format(counter.get('mdapi_group'))) + availability = counter.get('availability') + if availability: + c(".availability = {0},\n".format(availability_func_name(set, counter))) + c.outdent(4) + c("},") + + +def generate_metric_sets(args, gen): + c(textwrap.dedent("""\ +#include +#include +#include +#include +#include + + """)) + + c("#include \"{0}\"".format(os.path.basename(args.header))) + c("#include \"{0}\"".format(os.path.basename(args.equations_include))) + c("#include \"{0}\"".format(os.path.basename(args.registers_include))) + + # Print out all set registration functions for each set in each + # generation. 
+ for set in gen.sets: + counters = sorted(set.counters, key=lambda k: k.get('symbol_name')) + + c("\n") + + for counter in counters: + output_availability_funcs(set, counter) + + c("\nstatic void\n") + c(gen.chipset + "_add_" + set.underscore_name + "_metric_set(struct intel_perf *perf)") + c("{\n") + c.indent(4) + + c("struct intel_perf_metric_set *metric_set;\n") + c("struct intel_perf_logical_counter *counter;\n\n") + + c("metric_set = calloc(1, sizeof(*metric_set));\n") + c("metric_set->name = \"" + set.name + "\";\n") + c("metric_set->symbol_name = \"" + set.symbol_name + "\";\n") + c("metric_set->hw_config_guid = \"" + set.hw_config_guid + "\";\n") + c("metric_set->counters = calloc({0}, sizeof(struct intel_perf_logical_counter));\n".format(str(len(counters)))) + c("metric_set->n_counters = 0;\n") + c("metric_set->perf_oa_metrics_set = 0; // determined at runtime\n") + + if gen.chipset.startswith("acm") or gen.chipset.startswith("mtl"): + if set.oa_format == "128B_MPEC8_NOA16": + c(textwrap.dedent("""\ + metric_set->perf_oa_format = I915_OAM_FORMAT_MPEC8u32_B8_C8; + + metric_set->perf_raw_size = 128; + metric_set->gpu_time_offset = 0; + metric_set->gpu_clock_offset = 1; + metric_set->a_offset = 2; + metric_set->b_offset = metric_set->a_offset + 8; + metric_set->c_offset = metric_set->b_offset + 8; + metric_set->perfcnt_offset = metric_set->c_offset + 8; + """)) + else: + c(textwrap.dedent("""\ + metric_set->perf_oa_format = I915_OA_FORMAT_A24u40_A14u32_B8_C8; + + metric_set->perf_raw_size = 256; + metric_set->gpu_time_offset = 0; + metric_set->gpu_clock_offset = 1; + metric_set->a_offset = 2; + metric_set->b_offset = metric_set->a_offset + 38; + metric_set->c_offset = metric_set->b_offset + 8; + metric_set->perfcnt_offset = metric_set->c_offset + 8; + """)) + else: + c(textwrap.dedent("""\ + metric_set->perf_oa_format = I915_OA_FORMAT_A32u40_A4u32_B8_C8; + + metric_set->perf_raw_size = 256; + metric_set->gpu_time_offset = 0; + metric_set->gpu_clock_offset = 
1; + metric_set->a_offset = 2; + metric_set->b_offset = metric_set->a_offset + 36; + metric_set->c_offset = metric_set->b_offset + 8; + metric_set->perfcnt_offset = metric_set->c_offset + 8; + + """)) + + c("%s_%s_add_registers(perf, metric_set);" % (gen.chipset, set.underscore_name)) + + c("intel_perf_add_metric_set(perf, metric_set);"); + c("\n") + + c("{") + c.indent(4) + c("static const struct intel_perf_logical_counter _counters[] = {") + c.indent(4) + + for counter in counters: + output_counter_report(set, counter) + c.outdent(4) + c("};") + c("int i;") + + c("for (i = 0; i < sizeof(_counters) / sizeof(_counters[0]); i++) {") + c.indent(4) + c("if (_counters[i].availability && !_counters[i].availability(perf))") + c.indent(4) + c("continue;") + c.outdent(4) + c("counter = &metric_set->counters[metric_set->n_counters++];") + c("*counter = _counters[i];") + c("counter->metric_set = metric_set;") + c("intel_perf_add_logical_counter(perf, counter, counter->group);") + c.outdent(4) + c("}") + c.outdent(4) + c("}") + c("\nassert(metric_set->n_counters <= {0});\n".format(len(counters))); + + c.outdent(4) + c("}\n") + + c("\nvoid") + c("intel_perf_load_metrics_" + gen.chipset + "(struct intel_perf *perf)") + c("{") + c.indent(4) + + for set in gen.sets: + c("{0}_add_{1}_metric_set(perf);".format(gen.chipset, set.underscore_name)) + + c.outdent(4) + c("}") + + + +def main(): + global c + global h + + parser = argparse.ArgumentParser() + parser.add_argument("--header", help="Header file to write") + parser.add_argument("--code", help="C file to write") + parser.add_argument("--equations-include", help="Equations header file") + parser.add_argument("--registers-include", help="Registers header file") + parser.add_argument("--xml-file", help="Xml file to generate metric sets from") + + args = parser.parse_args() + + # Note: either arg may == None + h = codegen.Codegen(args.header) + c = codegen.Codegen(args.code) + + gen = codegen.Gen(args.xml_file, c) + + copyright = 
textwrap.dedent("""\ + /* Autogenerated file, DO NOT EDIT manually! generated by {} */ + // SPDX-License-Identifier: MIT + /* + * Copyright © 2024 Intel Corporation + */ + + """).format(os.path.basename(__file__)) + + header_file = os.path.basename(args.header) + header_define = header_file.replace('.', '_').upper() + + h(copyright) + h(textwrap.dedent("""\ + #ifndef %s + #define %s + + #include + + #include "xe/xe_oa.h" + + """ % (header_define, header_define))) + + # Print out all set registration functions for each generation. + h("void intel_perf_load_metrics_" + gen.chipset + "(struct intel_perf *perf);\n\n") + + h(textwrap.dedent("""\ + #endif /* %s */ + """ % header_define)) + + c(copyright) + generate_metric_sets(args, gen) + + +if __name__ == '__main__': + main() diff --git a/lib/xe/oa-configs/oa-registers-codegen.py b/lib/xe/oa-configs/oa-registers-codegen.py new file mode 100644 index 0000000000..a4aa134097 --- /dev/null +++ b/lib/xe/oa-configs/oa-registers-codegen.py @@ -0,0 +1,118 @@ +#!/usr/bin/env python3 +# +# SPDX-License-Identifier: MIT +# +# Copyright © 2024 Intel Corporation + +import argparse +import os +import sys +import textwrap + +import codegen + +h = None +c = None + + +def generate_register_configs(set): + register_types = { + 'FLEX': 'flex_regs', + 'NOA': 'mux_regs', + 'OA': 'b_counter_regs', + } + + c("void %s_%s_add_registers(struct intel_perf *perf, struct intel_perf_metric_set *metric_set)" % + (set.gen.chipset, set.underscore_name)) + c("{") + c.indent(4) + + # fill in register/values + register_configs = set.findall('register_config') + for register_config in register_configs: + t = register_types[register_config.get('type')] + + availability = register_config.get('availability') + if availability: + set.gen.output_availability(set, availability, register_config.get('type') + ' register config') + c.indent(4) + + c("{") + c.indent(4) + c("static const struct intel_perf_register_prog _%s[] = {" % t) + c.indent(4) + for register in 
register_config.findall('register'): + c("{ .reg = %s, .val = %s }," % + (register.get('address'), register.get('value'))) + c.outdent(4) + c("};") + c("metric_set->%s = _%s;" % (t, t)) + c("metric_set->n_%s = sizeof(_%s) / sizeof(_%s[0]);" % (t, t, t)) + c.outdent(4) + c("}") + + if availability: + c.outdent(4) + c("}") + c("\n") + + c.outdent(4) + c("}") + + +def main(): + global c + global h + global xml_equations + + parser = argparse.ArgumentParser() + parser.add_argument("--header", help="Header file to write") + parser.add_argument("--code", help="C file to write") + parser.add_argument("--xml-file", help="Xml file to generate register configurations from") + + args = parser.parse_args() + + # Note: either arg may == None + h = codegen.Codegen(args.header) + c = codegen.Codegen(args.code) + + gen = codegen.Gen(args.xml_file, c) + + copyright = textwrap.dedent("""\ + /* Autogenerated file, DO NOT EDIT manually! generated by {} */ + // SPDX-License-Identifier: MIT + /* + * Copyright © 2024 Intel Corporation + */ + + """).format(os.path.basename(__file__)) + + + header_file = os.path.basename(args.header) + header_define = "__%s__" % header_file.replace('.', '_').upper() + + h(copyright) + h("#ifndef %s" % header_define) + h("#define %s" % header_define) + h("\n") + h("struct intel_perf;") + h("struct intel_perf_metric_set;") + h("\n") + for set in gen.sets: + h("void %s_%s_add_registers(struct intel_perf *perf, struct intel_perf_metric_set *metric_set);" % + (gen.chipset, set.underscore_name)) + h("\n") + h("#endif /* %s */" % header_define) + + c(copyright) + c("\n") + c("#include \"%s\"" % header_file) + c("#include \"xe/xe_oa.h\"") + + for set in gen.sets: + c("\n") + generate_register_configs(set) + + +if __name__ == '__main__': + main() diff --git a/lib/xe/oa-configs/oa_guid_registry.py b/lib/xe/oa-configs/oa_guid_registry.py new file mode 100644 index 0000000000..ab14b398f3 --- /dev/null +++ b/lib/xe/oa-configs/oa_guid_registry.py @@ -0,0 +1,117 @@ 
+import copy +import hashlib +import re + +import xml.etree.ElementTree as et + + +class Registry: + + # Tries to avoid fragility from et.tostring() by normalizing into CSV string first + @staticmethod + def hw_config_hash(metric_set): + """Hashes the given metric set's HW register configs. + + Args: + metric_set -- is an ElementTree element for a 'set' + + Note this doesn't accept an MDAPI based metric set description + """ + + registers_str = "" + for config in metric_set.findall(".//register_config"): + if config.get('id') == None: + config_id = '0' + else: + config_id = config.get('id') + if config.get('priority') == None: + config_priority = '0' + else: + config_priority = config.get('priority') + if config.get('availability') == None: + config_availability = "" + else: + config_availability = config.get('availability') + for reg in config.findall("register"): + addr = int(reg.get('address'), 16) + value = int(reg.get('value'), 16) + registers_str = registers_str + config_id + ',' + config_priority + ',' + config_availability + ',' + str(addr) + ',' + str(value) + '\n' + + return hashlib.md5(registers_str.encode('utf-8')).hexdigest() + + + @staticmethod + def mdapi_hw_config_hash(mdapi_metric_set): + """Hashes the HW register configuration of a metric set from VPG's MDAPI XML files. + + Args: + mdapi_metric_set -- is an ElementTree element for a 'MetricSet' + + Note: being a simplistic hash of all RegConfigStart element contents + this will change for minor comment changes in VPG's files. Without + any promises of stability within these files then it can't help to + err on the side of caution here, so we know when to investigate + changes that might affect our usages. + """ + + def reorder_attributes(root): + for el in root.iter(): + attrib = el.attrib + if len(attrib) > 1: + # adjust attribute order, e.g.
by sorting + attribs = sorted(attrib.items()) + attrib.clear() + attrib.update(attribs) + + config = et.Element('config') + for registers in mdapi_metric_set.findall(".//RegConfigStart"): + config.append(copy.deepcopy(registers)) + reorder_attributes(config) + registers_str = et.tostring(config) + + return hashlib.md5(registers_str).hexdigest() + + @staticmethod + def chipset_derive_hash(chipset, set_name, hash): + """Derive a HW config hash for a given chipset & set name. + + This helps us avoid collisions with identical config across + different Gen or GT. + """ + + return "%s-%s-%s" % (chipset, set_name, hash) + + + @staticmethod + def chipset_name(name): + known_chipsets = ( 'HSW', + 'BDW', + 'CHV', + 'SKL', + 'BXT', + 'KBL', + 'GLK', + 'CFL', + 'CNL', + 'ICL', + 'EHL', + 'TGL', + 'RKL', + 'DG1', + 'ACM', + 'PVC', + 'MTL', ) + if name in known_chipsets: + return name + + # Unknown HW + assert 0 + + + @staticmethod + def gt_name(name): + return re.sub(' ', '', name) + + @staticmethod + def sanitize_symbol_name(text): + return text.replace('#', "_") diff --git a/lib/xe/oa-configs/update-guids.py b/lib/xe/oa-configs/update-guids.py new file mode 100755 index 0000000000..0c7d129404 --- /dev/null +++ b/lib/xe/oa-configs/update-guids.py @@ -0,0 +1,222 @@ +#!/usr/bin/env python3 +# coding=utf-8 + +# SPDX-License-Identifier: MIT +# +# Copyright © 2024 Intel Corporation +# +# +# This script can: +# +# - Automatically add template entries for unregistered metric sets discovered +# in new mdapi xml files. +# - Once mdapi-xml-convert.py has been run to output register configs for new +# metric sets then re-running this script can add the config_hash attribute +# to corresponding registry entries.
+# +# The script is designed to allow incremental updates/fixups of the guid +# registry by working in terms of: +# +# 1) load all the existing state +# 2) apply tweaks/modifications +# 3) write everything back out +# +# The script should gracefully handle incomplete guid entries, which is +# important when considering how the mdapi-xml-convert.py script depends on the +# 'mdapi_config_hash' attribute while adding the 'config_hash' attribute +# depends on the configs output by mdapi-xml-convert.py. + + + +import argparse +import os.path +import re +import sys +import time +import uuid + +import xml.etree.ElementTree as et +import xml.sax.saxutils as saxutils + +import oa_guid_registry as oa_registry + + +def print_err(*args): + sys.stderr.write(' '.join(map(str,args)) + '\n') + +def guid_hashing_key(guid_obj): + ret = oa_registry.Registry.chipset_derive_hash(guid_obj['chipset'], + guid_obj['name'], + guid_obj['mdapi_config_hash']) + return ret + +parser = argparse.ArgumentParser() +parser.add_argument("xml", nargs="+", help="XML description of metrics") +parser.add_argument("--guids", required=True, help="Metric set GUID registry") + +args = parser.parse_args() + + +guids = [] +guid_index = {} # guid objects indexed by id +mdapi_config_hash_guid_table = {} # indexed by MDAPI XML register config hash +named_guid_table = {} # indexed by name=_ + + + +# 1) read everything we have currently +# +guids_xml = et.parse(args.guids) +for guid in guids_xml.findall(".//guid"): + guid_obj = {} + + if guid.get('id') != None: + guid_obj['id'] = guid.get('id') + else: + guid_obj['id'] = str(uuid.uuid4()) + + if guid.get('mdapi_config_hash') != None: + guid_obj['mdapi_config_hash'] = guid.get('mdapi_config_hash') + if guid.get('config_hash') != None: + guid_obj['config_hash'] = guid.get('config_hash') + + if guid.get('chipset') != None: + guid_obj['chipset'] = guid.get('chipset') + if guid.get('name') != None: + guid_obj['name'] = guid.get('name') + 
named_guid_table[guid_obj['chipset'] + "_" + guid_obj['name']] = guid_obj + + if 'mdapi_config_hash' in guid_obj: + hashing_key = oa_registry.Registry.chipset_derive_hash(guid_obj['chipset'], + guid_obj['name'], + guid_obj['mdapi_config_hash']) + mdapi_config_hash_guid_table[hashing_key] = guid_obj + + guids.append(guid_obj) + + if guid_obj['id'] in guid_index: + print_err("Duplicate GUID " + guid_obj['id'] + "!") + sys.exit(1) + guid_index[guid_obj['id']] = guid_obj + + +# +# 2) fixup/modify the guid entries... +# + + +for arg in args.xml: + internal = et.parse(arg) + + concurrent_group = internal.find(".//ConcurrentGroup") + chipset = oa_registry.Registry.chipset_name(concurrent_group.get('SupportedHW')).lower() + if concurrent_group.get('SupportedGT') != None: + chipset = chipset + oa_registry.Registry.gt_name(concurrent_group.get('SupportedGT')).lower() + + + for mdapi_set in internal.findall(".//MetricSet"): + + mdapi_config_hash = oa_registry.Registry.mdapi_hw_config_hash(mdapi_set) + + set_name = oa_registry.Registry.sanitize_symbol_name(mdapi_set.get('SymbolName')) + + name = chipset + "_" + set_name; + + hashing_key = oa_registry.Registry.chipset_derive_hash(chipset, set_name, mdapi_config_hash) + if hashing_key in mdapi_config_hash_guid_table: + guid_obj = mdapi_config_hash_guid_table[hashing_key] + + guid_obj['name'] = set_name + guid_obj['chipset'] = chipset + guid_obj['matched_mdapi'] = True + elif name in named_guid_table: + guid_obj = named_guid_table[name] + + guid_obj['matched_mdapi'] = True + guid_obj['mdapi_config_hash'] = mdapi_config_hash + if 'config_hash' in guid_obj: + del guid_obj['config_hash'] + guid_obj['comment'] = "WARNING: MDAPI XML config hash changed! 
If upstream, double check raw counter semantics unchanged" + print_err("WARNING: MDAPI XML config hash changed for \"" + set_name + "\" (" + chipset + ") If upstream, double check raw counter semantics unchanged") + else: + guid_obj = { 'mdapi_config_hash': mdapi_config_hash, + 'id': str(uuid.uuid4()), + 'name': set_name, + 'chipset': chipset, + 'unregistered': True, + 'matched_mdapi': True, + 'comment': "New" + } + guid_index[guid_obj['id']] = guid_obj + mdapi_config_hash_guid_table[guid_hashing_key(guid_obj)] = guid_obj + guids.append(guid_obj) + print_err("New GUID \"" + guid_obj['id'] + "\" for metric set = " + set_name + " (" + chipset + ")") + + named_guid_table[chipset + '_' + set_name] = guid_obj + + + +chipsets = [ 'hsw', + 'bdw', 'chv', + 'sklgt2', 'sklgt3', 'sklgt4', 'kblgt2', 'kblgt3', 'cflgt2', 'cflgt3', + 'bxt', 'glk', + 'cnl', + 'icl', 'ehl', + 'tglgt1', 'tglgt2', 'rkl', 'dg1', 'adl', + 'acmgt1', 'acmgt2', 'acmgt3', + 'mtlgt2', 'mtlgt3', +] + +for chipset in chipsets: + filename = 'oa-' + chipset + '.xml' + if not os.path.isfile(filename): + continue + + public = et.parse(filename) + + for metricset in public.findall(".//set"): + + set_name = metricset.get('symbol_name') + + config_hash = oa_registry.Registry.hw_config_hash(metricset) + + guid_key = chipset + "_" + set_name + if guid_key in named_guid_table: + guid_obj = named_guid_table[guid_key] + guid_obj['config_hash'] = config_hash + + +# +# 3) write all the guids back out... + +print("") +for guid_obj in guids: + comment = None + line = "' + + if comment != None: + print(" ") + + print(" " + line) +print("") -- 2.41.0
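For reference, the register-config hashing that update-guids.py relies on can be tried in isolation. The sketch below mirrors the logic of Registry.hw_config_hash() and Registry.chipset_derive_hash() from oa_guid_registry.py in this patch, rewritten as free functions. The SAMPLE document, its register addresses and values, and the "mtlgt2"/"Example" names are made up for illustration and are not taken from any real oa-*.xml file.

```python
import hashlib
import xml.etree.ElementTree as et

# Hypothetical metric set; addresses and values are illustrative only.
SAMPLE = """
<set symbol_name="Example" chipset="EX">
  <register_config type="NOA">
    <register address="0x00009888" value="0x14150001"/>
  </register_config>
  <register_config type="OA" availability="$SkuRevisionId 2 UGTE">
    <register address="0x00002710" value="0x00000000"/>
  </register_config>
</set>
"""


def hw_config_hash(metric_set):
    """Hash a 'set' element's register configs via a normalized CSV string.

    Same normalization as Registry.hw_config_hash() in the patch: missing
    id/priority default to '0', missing availability to '', and addresses
    and values are parsed as hex and written back as decimal integers.
    """
    rows = []
    for config in metric_set.findall(".//register_config"):
        config_id = config.get('id', '0')
        priority = config.get('priority', '0')
        availability = config.get('availability', '')
        for reg in config.findall("register"):
            addr = int(reg.get('address'), 16)
            value = int(reg.get('value'), 16)
            rows.append("%s,%s,%s,%d,%d\n" %
                        (config_id, priority, availability, addr, value))
    return hashlib.md5("".join(rows).encode('utf-8')).hexdigest()


def chipset_derive_hash(chipset, set_name, config_hash):
    """Build the per-chipset lookup key, as Registry.chipset_derive_hash()."""
    return "%s-%s-%s" % (chipset, set_name, config_hash)


if __name__ == '__main__':
    metric_set = et.fromstring(SAMPLE)
    digest = hw_config_hash(metric_set)
    print(chipset_derive_hash("mtlgt2", "Example", digest))
```

Because the hash is computed over parsed integers rather than over the serialized XML, cosmetic differences such as attribute order or zero padding of addresses should not change the result, while any change to a register value does.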