* [XEN PATCH v3 0/1] xen: rework compat headers generation @ 2023-01-16 18:10 Anthony PERARD 2023-01-16 18:10 ` [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script Anthony PERARD 0 siblings, 1 reply; 12+ messages in thread From: Anthony PERARD @ 2023-01-16 18:10 UTC (permalink / raw) To: xen-devel Cc: Anthony PERARD, George Dunlap, Julien Grall, Andrew Cooper, Wei Liu, Stefano Stabellini, Jan Beulich Patch series available in this git branch: https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git br.build-system-xen-include-rework-v3 v3: - Rewrite script into python instead of perl. (last patch of the series) v2: - new patch [1/4] to fix issue with command line that can be way too long - other small changes, and reorder patches Hi, This patch series is about 2 improvement. First one is to use $(if_changed, ) in "include/Makefile" to make the generation of the compat headers less verbose and to have the command line part of the decision to rebuild the headers. Second one is to replace one slow script by a much faster one, and save time when generating the headers. Thanks. Anthony PERARD (1): build: replace get-fields.sh by a python script xen/include/Makefile | 6 +- xen/tools/compat-xlat-header.py | 468 ++++++++++++++++++++++++++++ xen/tools/get-fields.sh | 528 -------------------------------- 3 files changed, 470 insertions(+), 532 deletions(-) create mode 100644 xen/tools/compat-xlat-header.py delete mode 100644 xen/tools/get-fields.sh -- Anthony PERARD ^ permalink raw reply [flat|nested] 12+ messages in thread
* [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-16 18:10 [XEN PATCH v3 0/1] xen: rework compat headers generation Anthony PERARD @ 2023-01-16 18:10 ` Anthony PERARD 2023-01-17 16:07 ` Luca Fancellu 2023-01-17 17:21 ` Andrew Cooper 0 siblings, 2 replies; 12+ messages in thread From: Anthony PERARD @ 2023-01-16 18:10 UTC (permalink / raw) To: xen-devel Cc: Anthony PERARD, Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu The get-fields.sh which generate all the include/compat/.xlat/*.h headers is quite slow. It takes for example nearly 3 seconds to generate platform.h on a recent machine, or 2.3 seconds for memory.h. Rewriting the mix of shell/sed/python into a single python script make the generation of those file a lot faster. No functional change, the headers generated are identical. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> --- Notes: To test the header generation, I've submit a branch to gitlab ci, where all the headers where generated, and for each one both the shell script and the python script where run and the result of both compared. v3: convert to python script instead of perl - this should allow more developper do be able to review/edit it. - it avoid adding a dependency on perl for the hypervisor build. It can be twice as slow than the perl version, but overall, when doing a build with make, there isn't much difference between the perl and python script. We might be able to speed the python script up by precompiling the many regex, but it's probably not worth it. (python already have a cache of compiled regex, but I think it's small, maybe 10 or so) v2: - Add .pl extension to the perl script - remove "-w" from the shebang as it is duplicate of "use warning;" - Add a note in the commit message that the "headers generated are identical". xen/include/Makefile | 6 +- xen/tools/compat-xlat-header.py | 468 ++++++++++++++++++++++++++++ xen/tools/get-fields.sh | 528 -------------------------------- 3 files changed, 470 insertions(+), 532 deletions(-) create mode 100644 xen/tools/compat-xlat-header.py delete mode 100644 xen/tools/get-fields.sh diff --git a/xen/include/Makefile b/xen/include/Makefile index 65be310eca..b950423efe 100644 --- a/xen/include/Makefile +++ b/xen/include/Makefile @@ -60,9 +60,7 @@ cmd_compat_c = \ quiet_cmd_xlat_headers = GEN $@ cmd_xlat_headers = \ - while read what name; do \ - $(SHELL) $(srctree)/tools/get-fields.sh "$$what" compat_$$name $< || exit $$?; \ - done <$(patsubst $(obj)/compat/%,$(obj)/compat/.xlat/%,$(basename $<)).lst >$@.new; \ + $(PYTHON) $(srctree)/tools/compat-xlat-header.py $< $(patsubst $(obj)/compat/%,$(obj)/compat/.xlat/%,$(basename $<)).lst > $@.new; \ mv -f $@.new $@ targets += $(headers-y) @@ -80,7 +78,7 @@ $(obj)/compat/%.c: $(src)/public/%.h $(srcdir)/xlat.lst $(srctree)/tools/compat- $(call if_changed,compat_c) targets += $(patsubst compat/%, compat/.xlat/%, $(headers-y)) -$(obj)/compat/.xlat/%.h: $(obj)/compat/%.h $(obj)/compat/.xlat/%.lst $(srctree)/tools/get-fields.sh FORCE +$(obj)/compat/.xlat/%.h: $(obj)/compat/%.h $(obj)/compat/.xlat/%.lst $(srctree)/tools/compat-xlat-header.py FORCE $(call if_changed,xlat_headers) quiet_cmd_xlat_lst = GEN $@ diff --git a/xen/tools/compat-xlat-header.py b/xen/tools/compat-xlat-header.py new file mode 100644 index 0000000000..c1b361ac56 --- /dev/null +++ b/xen/tools/compat-xlat-header.py @@ -0,0 +1,468 @@ +#!/usr/bin/env python + +from __future__ import print_function +import re +import sys + +typedefs = [] + +def removeprefix(s, prefix): + if s.startswith(prefix): + return s[len(prefix):] + return s + +def removesuffix(s, suffix): + if s.endswith(suffix): + return s[:-len(suffix)] + return s + +def get_fields(looking_for, header_tokens): + level = 1 + aggr = 0 + fields = [] + name = '' + + for token in header_tokens: + if token in ['struct', 'union']: + if level == 1: + aggr = 1 + fields = [] + name = '' + elif token == '{': + level += 1 + elif token == '}': + level -= 1 + if level == 1 and name == looking_for: + fields.append(token) + return fields + elif re.match(r'^[a-zA-Z_]', token): + if not (aggr == 0 or name != ''): + name = token + + if aggr != 0: + fields.append(token) + + return [] + +def get_typedefs(tokens): + level = 1 + state = 0 + typedefs = [] + for token in tokens: + if token == 'typedef': + if level == 1: + state = 1 + elif re.match(r'^COMPAT_HANDLE\((.*)\)$', token): + if not (level != 1 or state != 1): + state = 2 + elif token in ['[', '{']: + level += 1 + elif token in [']', '}']: + level -= 1 + elif token == ';': + if level == 1: + state = 0 + elif re.match(r'^[a-zA-Z_]', token): + if not (level != 1 or state != 2): + typedefs.append(token) + return typedefs + +def build_enums(name, tokens): + level = 1 + kind = '' + named = '' + fields = [] + members = [] + id = '' + + for token in tokens: + if token in ['struct', 'union']: + if not level != 2: + fields = [''] + kind = "%s;%s" % (token, kind) + elif token == '{': + level += 1 + elif token == '}': + level -= 1 + if level == 1: + subkind = kind + (subkind, _, _) = subkind.partition(';') + if subkind == 'union': + print("\nenum XLAT_%s {" % (name,)) + for m in members: + print(" XLAT_%s_%s," % (name, m)) + print("};") + return + elif level == 2: + named = '?' + elif re.match(r'^[a-zA-Z_]', token): + id = token + k = kind + (_, _, k) = k.partition(';') + if named != '' and k != '': + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + build_enums("%s_%s" % (name, token), fields) + named = '!' + elif token == ',': + if level == 2: + members.append(id) + elif token == ';': + if level == 2: + members.append(id) + if named != '': + (_, _, kind) = kind.partition(';') + named = '' + if len(fields) != 0: + fields.append(token) + +def handle_field(prefix, name, id, type, fields): + if len(fields) == 0: + print(" \\") + if type == '': + print("%s(_d_)->%s = (_s_)->%s;" % (prefix, id, id), end='') + else: + k = id.replace('.', '_') + print("%sXLAT_%s_HNDL_%s(_d_, _s_);" % (prefix, name, k), end='') + elif not re.search(r'[{}]', ' '.join(fields)): + tag = ' '.join(fields) + tag = re.sub(r'\s*(struct|union)\s+(compat_)?(\w+)\s.*', '\\3', tag) + print(" \\") + print("%sXLAT_%s(&(_d_)->%s, &(_s_)->%s);" % (prefix, tag, id, id), end='') + else: + func_id = id + func_tokens = fields + kind = '' + array = "" + level = 1 + arrlvl = 1 + array_type = '' + id = '' + type = '' + fields = [] + for token in func_tokens: + if token in ['struct', 'union']: + if level == 2: + fields = [''] + if level == 1: + kind = token + if kind == 'union': + tmp = func_id.replace('.', '_') + print(" \\") + print("%sswitch (%s) {" % (prefix, tmp), end='') + elif token == '{': + level += 1 + id = '' + elif token == '}': + level -= 1 + id = '' + if level == 1 and kind == 'union': + print(" \\") + print("%s}" % (prefix,), end='') + elif token == '[': + if level != 2 or arrlvl != 1: + pass + elif array == '': + array = ' ' + else: + array = "%s;" % (array,) + arrlvl += 1 + elif token == ']': + arrlvl -= 1 + elif re.match(r'^COMPAT_HANDLE\((.*)\)$', token): + if level == 2 and id == '': + m = re.match(r'^COMPAT_HANDLE\((.*)\)$', token) + type = m.groups()[0] + type = removeprefix(type, 'compat_') + elif token == "compat_domain_handle_t": + if level == 2 and id == '': + array_type = token + elif re.match(r'^[a-zA-Z_]', token): + if id == '' and type == '' and array_type == '': + for id in typedefs: + if id == token: + type = id + if type == '': + id = token + else: + id = '' + else: + id = token + elif token in [',', ';']: + if level == 2 and not re.match(r'^_pad\d*$', id): + if kind == 'union': + tmp = "%s.%s" % (func_id, id) + tmp = tmp.replace('.', '_') + print(" \\") + print("%scase XLAT_%s_%s:" % (prefix, name, tmp), end='') + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + handle_field("%s " % (prefix,), name, "%s.%s" % (func_id, id), type, fields) + elif array == '' and array_type == '': + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + handle_field(prefix, name, "%s.%s" % (func_id, id), type, fields) + elif array == '': + copy_array(" ", "%s.%s" % (func_id, id)) + else: + (_, _, array) = array.partition(';') + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + handle_array(prefix, name, "{func_id}.{id}", array, type, fields) + if token == ';': + fields = [] + id = '' + type = '' + array = '' + if kind == 'union': + print(" \\") + print("%s break;" % (prefix,), end='') + else: + if array != '': + array = "%s %s" % (array, token) + if len(fields) > 0: + fields.append(token) + +def copy_array(prefix, id): + print(" \\") + print("%sif ((_d_)->%s != (_s_)->%s) \\" % (prefix, id, id)) + print("%s memcpy((_d_)->%s, (_s_)->%s, sizeof((_d_)->%s));" % (prefix, id, id, id), end='') + +def handle_array(prefix, name, id, array, type, fields): + i = re.sub(r'[^;]', '', array) + i = "i%s" % (len(i),) + + print(" \\") + print("%s{ \\" % (prefix,)) + print("%s unsigned int %s; \\" % (prefix, i)) + (head, _, tail) = array.partition(';') + head = head.strip() + print("%s for (%s = 0; %s < %s; ++%s) {" % (prefix, i, i, head, i), end='') + if not ';' in array: + handle_field("%s " % (prefix,), name, "%s[%s]" % (id, i), type, fields) + else: + handle_array("%s " % (prefix,) , name, "%s[%s]" % (id, i), tail, type, fields) + print(" \\") + print("%s } \\" % (prefix,)) + print("%s}" % (prefix,), end='') + +def build_body(name, tokens): + level = 1 + id = '' + array = '' + arrlvl = 1 + array_type = '' + type = '' + fields = [] + + print("\n#define XLAT_%s(_d_, _s_) do {" % (name,), end='') + + for token in tokens: + if token in ['struct', 'union']: + if level == 2: + fields = [''] + elif token == '{': + level += 1 + id = '' + elif token == '}': + level -= 1 + id = '' + elif token == '[': + if level != 2 or arrlvl != 1: + pass + elif array == '': + array = ' ' + else: + array = "%s;" % (array,) + arrlvl += 1 + elif token == ']': + arrlvl -= 1 + elif re.match(r'^COMPAT_HANDLE\((.*)\)$', token): + if level == 2 and id == '': + m = re.match(r'^COMPAT_HANDLE\((.*)\)$', token) + type = m.groups()[0] + type = removeprefix(type, 'compat_') + elif token == "compat_domain_handle_t": + if level == 2 and id == '': + array_type = token + elif re.match(r'^[a-zA-Z_]', token): + if array != '': + array = "%s %s" % (array, token) + elif id == '' and type == '' and array_type == '': + for id in typedefs: + if id != token: + type = id + if type == '': + id = token + else: + id = '' + else: + id = token + elif token in [',', ';']: + if level == 2 and not re.match(r'^_pad\d*$', id): + if array == '' and array_type == '': + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + handle_field(" ", name, id, type, fields) + elif array == '': + copy_array(" ", id) + else: + (head, sep, tmp) = array.partition(';') + if sep == '': + tmp = head + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + handle_array(" ", name, id, tmp, type, fields) + if token == ';': + fields = [] + id = '' + type = '' + array = '' + else: + if array != '': + array = "%s %s" % (array, token) + if len(fields) > 0: + fields.append(token) + print(" \\\n} while (0)") + +def check_field(kind, name, field, extrafields): + if not re.search(r'[{}]', ' '.join(extrafields)): + print("; \\") + if len(extrafields) != 0: + for token in extrafields: + if token in ['struct', 'union']: + pass + elif re.match(r'^[a-zA-Z_]', token): + print(" CHECK_%s" % (removeprefix(token, 'xen_'),), end='') + break + else: + raise Exception("Malformed compound declaration: '%s'" % (token,)) + elif not '.' in field: + print(" CHECK_FIELD_(%s, %s, %s)" % (kind, name, field), end='') + else: + n = field.count('.') + field = field.replace('.', ', ') + print(" CHECK_SUBFIELD_%s_(%s, %s, %s)" % (n, kind, name, field), end='') + else: + level = 1 + fields = [] + id = '' + + for token in extrafields: + if token in ['struct', 'union']: + if level == 2: + fields = [''] + elif token == '{': + level += 1 + id = '' + elif token == '}': + level -= 1 + id = '' + elif re.match(r'^compat_.*_t$', token): + if level == 2: + fields = [''] + token = removesuffix(token, '_t') + token = removeprefix(token, 'compat_') + elif re.match(r'^evtchn_.*_compat_t$', token): + if level == 2 and token != "evtchn_port_compat_t": + fields = [''] + token = removesuffix(token, '_compat_t') + elif re.match(r'^[a-zA-Z_]', token): + id = token + elif token in [',', ';']: + if level == 2 and not re.match(r'^_pad\d*$', id): + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + check_field(kind, name, "%s.%s" % (field, id), fields) + if token == ";": + fields = [] + id = '' + if len(fields) > 0: + fields.append(token) + +def build_check(name, tokens): + level = 1 + fields = [] + kind = '' + id = '' + arrlvl = 1 + + print("") + print("#define CHECK_%s \\" % (name,)) + + for token in tokens: + if token in ['struct', 'union']: + if level == 1: + kind = token + print(" CHECK_SIZE_(%s, %s)" % (kind, name), end='') + elif level == 2: + fields = [''] + elif token == '{': + level += 1 + id = '' + elif token == '}': + level -= 1 + id = '' + elif token == '[': + arrlvl += 1 + elif token == ']': + arrlvl -= 1 + elif re.match(r'^compat_.*_t$', token): + if level == 2 and token != "compat_argo_port_t": + fields = [''] + token = removesuffix(token, '_t') + token = removeprefix(token, 'compat_') + elif re.match(r'^[a-zA-Z_]', token): + if not (level != 2 or arrlvl != 1): + id = token + elif token in [',', ';']: + if level == 2 and not re.match(r'^_pad\d*$', id): + if len(fields) > 0 and fields[0] == '': + fields.pop(0) + check_field(kind, name, id, fields) + if token == ";": + fields = [] + id = '' + + if len(fields) > 0: + fields.append(token) + print("") + + +def main(): + header_tokens = [] + + with open(sys.argv[1]) as header: + for line in header: + if re.match(r'^\s*(#|$)', line): + continue + line = re.sub(r'([\]\[,;:{}])', ' \\1 ', line) + line = line.strip() + header_tokens += re.split(r'\s+', line) + + global typedefs + typedefs = get_typedefs(header_tokens) + + with open(sys.argv[2]) as compat_list: + for line in compat_list: + words = re.split(r'\s+', line, maxsplit=1) + what = words[0] + name = words[1] + + name = removeprefix(name, 'xen') + name = name.strip() + + fields = get_fields("compat_%s" % (name,), header_tokens) + if len(fields) == 0: + raise Exception("Fields of 'compat_%s' not found in '%s'" % (name, sys.argv[1])) + + if what == "!": + build_enums(name, fields) + build_body(name, fields) + elif what == "?": + build_check(name, fields) + else: + raise Exception("Invalid translation indicator: '%s'" % (what,)) + +if __name__ == '__main__': + main() diff --git a/xen/tools/get-fields.sh b/xen/tools/get-fields.sh deleted file mode 100644 index 002db2093f..0000000000 --- a/xen/tools/get-fields.sh +++ /dev/null @@ -1,528 +0,0 @@ -test -n "$1" -a -n "$2" -a -n "$3" -set -ef - -SED=sed -if test -x /usr/xpg4/bin/sed; then - SED=/usr/xpg4/bin/sed -fi -if test -z ${PYTHON}; then - PYTHON=`/usr/bin/env python` -fi -if test -z ${PYTHON}; then - echo "Python not found" - exit 1 -fi - -get_fields () -{ - local level=1 aggr=0 name= fields= - for token in $2 - do - case "$token" in - struct|union) - test $level != 1 || aggr=1 fields= name= - ;; - "{") - level=$(expr $level + 1) - ;; - "}") - level=$(expr $level - 1) - if [ $level = 1 -a $name = $1 ] - then - echo "$fields }" - return 0 - fi - ;; - [a-zA-Z_]*) - test $aggr = 0 -o -n "$name" || name="$token" - ;; - esac - test $aggr = 0 || fields="$fields $token" - done -} - -get_typedefs () -{ - local level=1 state= - for token in $1 - do - case "$token" in - typedef) - test $level != 1 || state=1 - ;; - COMPAT_HANDLE\(*\)) - test $level != 1 -o "$state" != 1 || state=2 - ;; - [\{\[]) - level=$(expr $level + 1) - ;; - [\}\]]) - level=$(expr $level - 1) - ;; - ";") - test $level != 1 || state= - ;; - [a-zA-Z_]*) - test $level != 1 -o "$state" != 2 || echo "$token" - ;; - esac - done -} - -build_enums () -{ - local level=1 kind= fields= members= named= id= token - for token in $2 - do - case "$token" in - struct|union) - test $level != 2 || fields=" " - kind="$token;$kind" - ;; - "{") - level=$(expr $level + 1) - ;; - "}") - level=$(expr $level - 1) - if [ $level = 1 ] - then - if [ "${kind%%;*}" = union ] - then - echo - echo "enum XLAT_$1 {" - for m in $members - do - echo " XLAT_${1}_$m," - done - echo "};" - fi - return 0 - elif [ $level = 2 ] - then - named='?' - fi - ;; - [a-zA-Z]*) - id=$token - if [ -n "$named" -a -n "${kind#*;}" ] - then - build_enums ${1}_$token "$fields" - named='!' - fi - ;; - ",") - test $level != 2 || members="$members $id" - ;; - ";") - test $level != 2 || members="$members $id" - test -z "$named" || kind=${kind#*;} - named= - ;; - esac - test -z "$fields" || fields="$fields $token" - done -} - -handle_field () -{ - if [ -z "$5" ] - then - echo " \\" - if [ -z "$4" ] - then - printf %s "$1(_d_)->$3 = (_s_)->$3;" - else - printf %s "$1XLAT_${2}_HNDL_$(echo $3 | $SED 's,\.,_,g')(_d_, _s_);" - fi - elif [ -z "$(echo "$5" | $SED 's,[^{}],,g')" ] - then - local tag=$(echo "$5" | ${PYTHON} -c ' -import re,sys -for line in sys.stdin.readlines(): - sys.stdout.write(re.subn(r"\s*(struct|union)\s+(compat_)?(\w+)\s.*", r"\3", line)[0].rstrip() + "\n") -') - echo " \\" - printf %s "${1}XLAT_$tag(&(_d_)->$3, &(_s_)->$3);" - else - local level=1 kind= fields= id= array= arrlvl=1 array_type= type= token - for token in $5 - do - case "$token" in - struct|union) - test $level != 2 || fields=" " - if [ $level = 1 ] - then - kind=$token - if [ $kind = union ] - then - echo " \\" - printf %s "${1}switch ($(echo $3 | $SED 's,\.,_,g')) {" - fi - fi - ;; - "{") - level=$(expr $level + 1) id= - ;; - "}") - level=$(expr $level - 1) id= - if [ $level = 1 -a $kind = union ] - then - echo " \\" - printf %s "$1}" - fi - ;; - "[") - if [ $level != 2 -o $arrlvl != 1 ] - then - : - elif [ -z "$array" ] - then - array=" " - else - array="$array;" - fi - arrlvl=$(expr $arrlvl + 1) - ;; - "]") - arrlvl=$(expr $arrlvl - 1) - ;; - COMPAT_HANDLE\(*\)) - if [ $level = 2 -a -z "$id" ] - then - type=${token#COMPAT_HANDLE?} - type=${type%?} - type=${type#compat_} - fi - ;; - compat_domain_handle_t) - if [ $level = 2 -a -z "$id" ] - then - array_type=$token - fi - ;; - [a-zA-Z]*) - if [ -z "$id" -a -z "$type" -a -z "$array_type" ] - then - for id in $typedefs - do - test $id != "$token" || type=$id - done - if [ -z "$type" ] - then - id=$token - else - id= - fi - else - id=$token - fi - ;; - [\,\;]) - if [ $level = 2 -a -n "$(echo $id | $SED 's,^_pad[[:digit:]]*,,')" ] - then - if [ $kind = union ] - then - echo " \\" - printf %s "${1}case XLAT_${2}_$(echo $3.$id | $SED 's,\.,_,g'):" - handle_field "$1 " $2 $3.$id "$type" "$fields" - elif [ -z "$array" -a -z "$array_type" ] - then - handle_field "$1" $2 $3.$id "$type" "$fields" - elif [ -z "$array" ] - then - copy_array " " $3.$id - else - handle_array "$1" $2 $3.$id "${array#*;}" "$type" "$fields" - fi - test "$token" != ";" || fields= id= type= - array= - if [ $kind = union ] - then - echo " \\" - printf %s "$1 break;" - fi - fi - ;; - *) - if [ -n "$array" ] - then - array="$array $token" - fi - ;; - esac - test -z "$fields" || fields="$fields $token" - done - fi -} - -copy_array () -{ - echo " \\" - echo "${1}if ((_d_)->$2 != (_s_)->$2) \\" - printf %s "$1 memcpy((_d_)->$2, (_s_)->$2, sizeof((_d_)->$2));" -} - -handle_array () -{ - local i="i$(echo $4 | $SED 's,[^;], ,g' | wc -w | $SED 's,[[:space:]]*,,g')" - echo " \\" - echo "$1{ \\" - echo "$1 unsigned int $i; \\" - printf %s "$1 for ($i = 0; $i < "${4%%;*}"; ++$i) {" - if [ "$4" = "${4#*;}" ] - then - handle_field "$1 " $2 $3[$i] "$5" "$6" - else - handle_array "$1 " $2 $3[$i] "${4#*;}" "$5" "$6" - fi - echo " \\" - echo "$1 } \\" - printf %s "$1}" -} - -build_body () -{ - echo - printf %s "#define XLAT_$1(_d_, _s_) do {" - local level=1 fields= id= array= arrlvl=1 array_type= type= token - for token in $2 - do - case "$token" in - struct|union) - test $level != 2 || fields=" " - ;; - "{") - level=$(expr $level + 1) id= - ;; - "}") - level=$(expr $level - 1) id= - ;; - "[") - if [ $level != 2 -o $arrlvl != 1 ] - then - : - elif [ -z "$array" ] - then - array=" " - else - array="$array;" - fi - arrlvl=$(expr $arrlvl + 1) - ;; - "]") - arrlvl=$(expr $arrlvl - 1) - ;; - COMPAT_HANDLE\(*\)) - if [ $level = 2 -a -z "$id" ] - then - type=${token#COMPAT_HANDLE?} - type=${type%?} - type=${type#compat_} - fi - ;; - compat_domain_handle_t) - if [ $level = 2 -a -z "$id" ] - then - array_type=$token - fi - ;; - [a-zA-Z_]*) - if [ -n "$array" ] - then - array="$array $token" - elif [ -z "$id" -a -z "$type" -a -z "$array_type" ] - then - for id in $typedefs - do - test $id != "$token" || type=$id - done - if [ -z "$type" ] - then - id=$token - else - id= - fi - else - id=$token - fi - ;; - [\,\;]) - if [ $level = 2 -a -n "$(echo $id | $SED 's,^_pad[[:digit:]]*,,')" ] - then - if [ -z "$array" -a -z "$array_type" ] - then - handle_field " " $1 $id "$type" "$fields" - elif [ -z "$array" ] - then - copy_array " " $id - else - handle_array " " $1 $id "${array#*;}" "$type" "$fields" - fi - test "$token" != ";" || fields= id= type= - array= - fi - ;; - *) - if [ -n "$array" ] - then - array="$array $token" - fi - ;; - esac - test -z "$fields" || fields="$fields $token" - done - echo " \\" - echo "} while (0)" -} - -check_field () -{ - if [ -z "$(echo "$4" | $SED 's,[^{}],,g')" ] - then - echo "; \\" - local n=$(echo $3 | $SED 's,[^.], ,g' | wc -w | $SED 's,[[:space:]]*,,g') - if [ -n "$4" ] - then - for n in $4 - do - case $n in - struct|union) - ;; - [a-zA-Z_]*) - printf %s " CHECK_${n#xen_}" - break - ;; - *) - echo "Malformed compound declaration: '$n'" >&2 - exit 1 - ;; - esac - done - elif [ $n = 0 ] - then - printf %s " CHECK_FIELD_($1, $2, $3)" - else - printf %s " CHECK_SUBFIELD_${n}_($1, $2, $(echo $3 | $SED 's!\.!, !g'))" - fi - else - local level=1 fields= id= token - for token in $4 - do - case "$token" in - struct|union) - test $level != 2 || fields=" " - ;; - "{") - level=$(expr $level + 1) id= - ;; - "}") - level=$(expr $level - 1) id= - ;; - compat_*_t) - if [ $level = 2 ] - then - fields=" " - token="${token%_t}" - token="${token#compat_}" - fi - ;; - evtchn_*_compat_t) - if [ $level = 2 -a $token != evtchn_port_compat_t ] - then - fields=" " - token="${token%_compat_t}" - fi - ;; - [a-zA-Z]*) - id=$token - ;; - [\,\;]) - if [ $level = 2 -a -n "$(echo $id | $SED 's,^_pad[[:digit:]]*,,')" ] - then - check_field $1 $2 $3.$id "$fields" - test "$token" != ";" || fields= id= - fi - ;; - esac - test -z "$fields" || fields="$fields $token" - done - fi -} - -build_check () -{ - echo - echo "#define CHECK_$1 \\" - local level=1 fields= kind= id= arrlvl=1 token - for token in $2 - do - case "$token" in - struct|union) - if [ $level = 1 ] - then - kind=$token - printf %s " CHECK_SIZE_($kind, $1)" - elif [ $level = 2 ] - then - fields=" " - fi - ;; - "{") - level=$(expr $level + 1) id= - ;; - "}") - level=$(expr $level - 1) id= - ;; - "[") - arrlvl=$(expr $arrlvl + 1) - ;; - "]") - arrlvl=$(expr $arrlvl - 1) - ;; - compat_*_t) - if [ $level = 2 -a $token != compat_argo_port_t ] - then - fields=" " - token="${token%_t}" - token="${token#compat_}" - fi - ;; - [a-zA-Z_]*) - test $level != 2 -o $arrlvl != 1 || id=$token - ;; - [\,\;]) - if [ $level = 2 -a -n "$(echo $id | $SED 's,^_pad[[:digit:]]*,,')" ] - then - check_field $kind $1 $id "$fields" - test "$token" != ";" || fields= id= - fi - ;; - esac - test -z "$fields" || fields="$fields $token" - done - echo "" -} - -list="$($SED -e 's,^[[:space:]]#.*,,' -e 's!\([]\[,;:{}]\)! \1 !g' $3)" -fields="$(get_fields $(echo $2 | $SED 's,^compat_xen,compat_,') "$list")" -if [ -z "$fields" ] -then - echo "Fields of '$2' not found in '$3'" >&2 - exit 1 -fi -name=${2#compat_} -name=${name#xen} -case "$1" in -"!") - typedefs="$(get_typedefs "$list")" - build_enums $name "$fields" - build_body $name "$fields" - ;; -"?") - build_check $name "$fields" - ;; -*) - echo "Invalid translation indicator: '$1'" >&2 - exit 1 - ;; -esac -- Anthony PERARD ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-16 18:10 ` [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script Anthony PERARD @ 2023-01-17 16:07 ` Luca Fancellu 2023-01-17 16:55 ` Anthony PERARD 2023-01-17 17:21 ` Andrew Cooper 1 sibling, 1 reply; 12+ messages in thread From: Luca Fancellu @ 2023-01-17 16:07 UTC (permalink / raw) To: Anthony PERARD Cc: Xen-devel, Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu > On 16 Jan 2023, at 18:10, Anthony PERARD <anthony.perard@citrix.com> wrote: > > The get-fields.sh which generate all the include/compat/.xlat/*.h > headers is quite slow. It takes for example nearly 3 seconds to > generate platform.h on a recent machine, or 2.3 seconds for memory.h. > > Rewriting the mix of shell/sed/python into a single python script make > the generation of those file a lot faster. > > No functional change, the headers generated are identical. > > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> > --- > > Notes: > To test the header generation, I've submit a branch to gitlab ci, > where all the headers where generated, and for each one both the shell > script and the python script where run and the result of both > compared. > > v3: > convert to python script instead of perl > - this should allow more developper do be able to review/edit it. > - it avoid adding a dependency on perl for the hypervisor build. > > It can be twice as slow than the perl version, but overall, when doing > a build with make, there isn't much difference between the perl and > python script. > We might be able to speed the python script up by precompiling the > many regex, but it's probably not worth it. (python already have a > cache of compiled regex, but I think it's small, maybe 10 or so) > > v2: > - Add .pl extension to the perl script > - remove "-w" from the shebang as it is duplicate of "use warning;" > - Add a note in the commit message that the "headers generated are identical". > > xen/include/Makefile | 6 +- > xen/tools/compat-xlat-header.py | 468 ++++++++++++++++++++++++++++ > xen/tools/get-fields.sh | 528 -------------------------------- > 3 files changed, 470 insertions(+), 532 deletions(-) > create mode 100644 xen/tools/compat-xlat-header.py > delete mode 100644 xen/tools/get-fields.sh > > diff --git a/xen/include/Makefile b/xen/include/Makefile > index 65be310eca..b950423efe 100644 > --- a/xen/include/Makefile > +++ b/xen/include/Makefile > @@ -60,9 +60,7 @@ cmd_compat_c = \ > > quiet_cmd_xlat_headers = GEN $@ > cmd_xlat_headers = \ > - while read what name; do \ > - $(SHELL) $(srctree)/tools/get-fields.sh "$$what" compat_$$name $< || exit $$?; \ > - done <$(patsubst $(obj)/compat/%,$(obj)/compat/.xlat/%,$(basename $<)).lst >$@.new; \ > + $(PYTHON) $(srctree)/tools/compat-xlat-header.py $< $(patsubst $(obj)/compat/%,$(obj)/compat/.xlat/%,$(basename $<)).lst > $@.new; \ > mv -f $@.new $@ > > targets += $(headers-y) > @@ -80,7 +78,7 @@ $(obj)/compat/%.c: $(src)/public/%.h $(srcdir)/xlat.lst $(srctree)/tools/compat- > $(call if_changed,compat_c) > > targets += $(patsubst compat/%, compat/.xlat/%, $(headers-y)) > -$(obj)/compat/.xlat/%.h: $(obj)/compat/%.h $(obj)/compat/.xlat/%.lst $(srctree)/tools/get-fields.sh FORCE > +$(obj)/compat/.xlat/%.h: $(obj)/compat/%.h $(obj)/compat/.xlat/%.lst $(srctree)/tools/compat-xlat-header.py FORCE > $(call if_changed,xlat_headers) > > quiet_cmd_xlat_lst = GEN $@ > diff --git a/xen/tools/compat-xlat-header.py b/xen/tools/compat-xlat-header.py > new file mode 100644 > index 0000000000..c1b361ac56 > --- /dev/null > +++ b/xen/tools/compat-xlat-header.py > @@ -0,0 +1,468 @@ > +#!/usr/bin/env python Would it make sense to start with python3 since it is a new script? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-17 16:07 ` Luca Fancellu @ 2023-01-17 16:55 ` Anthony PERARD 2023-01-17 16:58 ` Luca Fancellu 0 siblings, 1 reply; 12+ messages in thread From: Anthony PERARD @ 2023-01-17 16:55 UTC (permalink / raw) To: Luca Fancellu Cc: Xen-devel, Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu On Tue, Jan 17, 2023 at 04:07:24PM +0000, Luca Fancellu wrote: > > On 16 Jan 2023, at 18:10, Anthony PERARD <anthony.perard@citrix.com> wrote: > > diff --git a/xen/tools/compat-xlat-header.py b/xen/tools/compat-xlat-header.py > > new file mode 100644 > > index 0000000000..c1b361ac56 > > --- /dev/null > > +++ b/xen/tools/compat-xlat-header.py > > @@ -0,0 +1,468 @@ > > +#!/usr/bin/env python > > Would it make sense to start with python3 since it is a new script? That shebang isn't even used as the script doesn't even have the execution bit set. So why do you say that the script isn't python3? Not really asking, just been pedantic :-) Even if it's a new script, it isn't a new project. We can't depend on brand new functionality from our dependencies. We need to be able to build the hypervisor with old build toolchain / distribution. Anyway, I did start by writing a python3 script in all its glory (or at least some of the new part of the language that I know about), but I had to rework it to be able to use it on older distribution. Our centos7 container in our GitLab CI seems to use python2.7. So I had to stop using str.removeprefix() and I introduce some function doing the same thing instead (so that works with older than python 3.9). Then I had to stop using f-strings and use %-formatting instead. Then use "m.groups()[0]" instead of "m[1]" where "m" is a match result from re.match() and other. And use the classing "from __future__ ..." preamble. Cheers, -- Anthony PERARD ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-17 16:55 ` Anthony PERARD @ 2023-01-17 16:58 ` Luca Fancellu 0 siblings, 0 replies; 12+ messages in thread From: Luca Fancellu @ 2023-01-17 16:58 UTC (permalink / raw) To: Anthony PERARD Cc: Xen-devel, Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu > On 17 Jan 2023, at 16:55, Anthony PERARD <anthony.perard@citrix.com> wrote: > > On Tue, Jan 17, 2023 at 04:07:24PM +0000, Luca Fancellu wrote: >>> On 16 Jan 2023, at 18:10, Anthony PERARD <anthony.perard@citrix.com> wrote: >>> diff --git a/xen/tools/compat-xlat-header.py b/xen/tools/compat-xlat-header.py >>> new file mode 100644 >>> index 0000000000..c1b361ac56 >>> --- /dev/null >>> +++ b/xen/tools/compat-xlat-header.py >>> @@ -0,0 +1,468 @@ >>> +#!/usr/bin/env python >> >> Would it make sense to start with python3 since it is a new script? > > That shebang isn't even used as the script doesn't even have the > execution bit set. So why do you say that the script isn't python3? Not > really asking, just been pedantic :-) Yes I didn’t pay attention to that > > Even if it's a new script, it isn't a new project. We can't depend on > brand new functionality from our dependencies. We need to be able to > build the hypervisor with old build toolchain / distribution. > > Anyway, I did start by writing a python3 script in all its glory (or at > least some of the new part of the language that I know about), but I had > to rework it to be able to use it on older distribution. Our centos7 > container in our GitLab CI seems to use python2.7. That makes sense, thanks for the explanation. > > So I had to stop using str.removeprefix() and I introduce some function > doing the same thing instead (so that works with older than python 3.9). > Then I had to stop using f-strings and use %-formatting instead. > Then use "m.groups()[0]" instead of "m[1]" where "m" is a match result > from re.match() and other. > And use the classing "from __future__ ..." preamble. > > Cheers, > > -- > Anthony PERARD ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-16 18:10 ` [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script Anthony PERARD 2023-01-17 16:07 ` Luca Fancellu @ 2023-01-17 17:21 ` Andrew Cooper 2023-01-18 8:06 ` Jan Beulich 2023-01-18 18:21 ` Anthony PERARD 1 sibling, 2 replies; 12+ messages in thread From: Andrew Cooper @ 2023-01-17 17:21 UTC (permalink / raw) To: Anthony Perard, xen-devel@lists.xenproject.org Cc: George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu On 16/01/2023 6:10 pm, Anthony PERARD wrote: > The get-fields.sh which generate all the include/compat/.xlat/*.h > headers is quite slow. It takes for example nearly 3 seconds to > generate platform.h on a recent machine, or 2.3 seconds for memory.h. > > Rewriting the mix of shell/sed/python into a single python script make > the generation of those file a lot faster. > > No functional change, the headers generated are identical. > > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> I'll happily trust the testing you've done, and that the outputs are equivalent. However, I have some general python suggestions. > diff --git a/xen/tools/compat-xlat-header.py b/xen/tools/compat-xlat-header.py > new file mode 100644 > index 0000000000..c1b361ac56 > --- /dev/null > +++ b/xen/tools/compat-xlat-header.py > @@ -0,0 +1,468 @@ > +#!/usr/bin/env python > + > +from __future__ import print_function > +import re > +import sys > + > +typedefs = [] > + > +def removeprefix(s, prefix): > + if s.startswith(prefix): > + return s[len(prefix):] > + return s > + > +def removesuffix(s, suffix): > + if s.endswith(suffix): > + return s[:-len(suffix)] > + return s > + > +def get_fields(looking_for, header_tokens): > + level = 1 > + aggr = 0 > + fields = [] > + name = '' > + > + for token in header_tokens: > + if token in ['struct', 'union']: ('struct', 'union') A tuple here rather than a list is marginally more efficient. > + if level == 1: > + aggr = 1 > + fields = [] > + name = '' > + elif token == '{': > + level += 1 > + elif token == '}': > + level -= 1 > + if level == 1 and name == looking_for: > + fields.append(token) > + return fields > + elif re.match(r'^[a-zA-Z_]', token): Here and many other places, you've got re.{search,match} with repeated regexes. This can end up being quite expensive, and we already had one build system slowdown caused by it. Up near typedefs at the top, you want something like: token_re = re.compile('[a-zA-Z_]') to prepare the regex once at the start of day, and and use 'elif token_re.match(token)' here. Another thing to note, the difference between re.search and re.match is that match has an implicit '^' anchor. You need to be aware of this when compiling one regex to be used with both. All of this said, where is 0-9 in the token regex? Have we just got extremely lucky with having no embedded digits in identifiers thus far? > + if not (aggr == 0 or name != ''): > + name = token > + > + if aggr != 0: > + fields.append(token) > + > + return [] > + > +def get_typedefs(tokens): > + level = 1 > + state = 0 > + typedefs = [] This shadows the global typedefs list, but the result of calling this function is simply assigned to it. Looking at the code, the global list is used in several places It would be better to "global typedefs" here, and make this function void, unless you want to modify handle_field() and build_body() to take it in as a regular parameter. I'm pretty sure typedefs actually wants to be a dict rather than a list (will have better "id in typedefs" performance lower down), but that wants matching with code changes elsewhere, and probably wants doing separately. > + for token in tokens: > + if token == 'typedef': > + if level == 1: > + state = 1 > + elif re.match(r'^COMPAT_HANDLE\((.*)\)$', token): > + if not (level != 1 or state != 1): > + state = 2 > + elif token in ['[', '{']: > + level += 1 > + elif token in [']', '}']: > + level -= 1 > + elif token == ';': > + if level == 1: > + state = 0 > + elif re.match(r'^[a-zA-Z_]', token): > + if not (level != 1 or state != 2): > + typedefs.append(token) > + return typedefs > + > +def build_enums(name, tokens): > + level = 1 > + kind = '' > + named = '' > + fields = [] > + members = [] > + id = '' > + > + for token in tokens: > + if token in ['struct', 'union']: > + if not level != 2: > + fields = [''] > + kind = "%s;%s" % (token, kind) > + elif token == '{': > + level += 1 > + elif token == '}': > + level -= 1 > + if level == 1: > + subkind = kind > + (subkind, _, _) = subkind.partition(';') > + if subkind == 'union': > + print("\nenum XLAT_%s {" % (name,)) > + for m in members: > + print(" XLAT_%s_%s," % (name, m)) > + print("};") > + return > + elif level == 2: > + named = '?' > + elif re.match(r'^[a-zA-Z_]', token): > + id = token > + k = kind > + (_, _, k) = k.partition(';') > + if named != '' and k != '': > + if len(fields) > 0 and fields[0] == '': > + fields.pop(0) > + build_enums("%s_%s" % (name, token), fields) > + named = '!' > + elif token == ',': > + if level == 2: > + members.append(id) > + elif token == ';': > + if level == 2: > + members.append(id) > + if named != '': > + (_, _, kind) = kind.partition(';') > + named = '' > + if len(fields) != 0: > + fields.append(token) > + > +def handle_field(prefix, name, id, type, fields): > + if len(fields) == 0: > + print(" \\") > + if type == '': > + print("%s(_d_)->%s = (_s_)->%s;" % (prefix, id, id), end='') > + else: > + k = id.replace('.', '_') > + print("%sXLAT_%s_HNDL_%s(_d_, _s_);" % (prefix, name, k), end='') > + elif not re.search(r'[{}]', ' '.join(fields)): > + tag = ' '.join(fields) > + tag = re.sub(r'\s*(struct|union)\s+(compat_)?(\w+)\s.*', '\\3', tag) > + print(" \\") > + print("%sXLAT_%s(&(_d_)->%s, &(_s_)->%s);" % (prefix, tag, id, id), end='') > + else: > + func_id = id > + func_tokens = fields > + kind = '' > + array = "" > + level = 1 > + arrlvl = 1 > + array_type = '' > + id = '' > + type = '' > + fields = [] > + for token in func_tokens: > + if token in ['struct', 'union']: > + if level == 2: > + fields = [''] > + if level == 1: > + kind = token > + if kind == 'union': > + tmp = func_id.replace('.', '_') > + print(" \\") > + print("%sswitch (%s) {" % (prefix, tmp), end='') > + elif token == '{': > + level += 1 > + id = '' > + elif token == '}': > + level -= 1 > + id = '' > + if level == 1 and kind == 'union': > + print(" \\") > + print("%s}" % (prefix,), end='') > + elif token == '[': > + if level != 2 or arrlvl != 1: > + pass > + elif array == '': > + array = ' ' > + else: > + array = "%s;" % (array,) > + arrlvl += 1 > + elif token == ']': > + arrlvl -= 1 > + elif re.match(r'^COMPAT_HANDLE\((.*)\)$', token): > + if level == 2 and id == '': > + m = re.match(r'^COMPAT_HANDLE\((.*)\)$', token) > + type = m.groups()[0] > + type = removeprefix(type, 'compat_') So this pattern is what lead to the introduction of the := operator in Python 3.8, but we obviously can't use it yet. At least pre-compiling the regexes will reduce the hit from the double evaluation here. ~Andrew P.S. I probably don't want to know why we have to special case evtchn port, argo port and domain handle. I think it says more about the this bodge of a parser than anything else... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-17 17:21 ` Andrew Cooper @ 2023-01-18 8:06 ` Jan Beulich 2023-01-18 19:38 ` Andrew Cooper 2023-01-18 18:21 ` Anthony PERARD 1 sibling, 1 reply; 12+ messages in thread From: Jan Beulich @ 2023-01-18 8:06 UTC (permalink / raw) To: Andrew Cooper, Anthony Perard, xen-devel@lists.xenproject.org Cc: George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu On 17.01.2023 18:21, Andrew Cooper wrote: > On 16/01/2023 6:10 pm, Anthony PERARD wrote: >> + elif re.match(r'^[a-zA-Z_]', token): > >[...] > > All of this said, where is 0-9 in the token regex? Have we just got > extremely lucky with having no embedded digits in identifiers thus far? That's checking for just the first character, which can't be a digit? > P.S. I probably don't want to know why we have to special case evtchn > port, argo port and domain handle. I think it says more about the this > bodge of a parser than anything else... Iirc something broke without it, but it's been too long and spending a reasonable amount of time trying to re-construct I couldn't come up with anything. I didn't want to go as far as put time into actually trying out what (if anything) breaks with those removed. What I'm puzzled about is that argo and evtchn port types are handled in different places. For the domain handle iirc the exception was attributed to it being the only typedef of an array which is embedded in other structures. Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-18 8:06 ` Jan Beulich @ 2023-01-18 19:38 ` Andrew Cooper 0 siblings, 0 replies; 12+ messages in thread From: Andrew Cooper @ 2023-01-18 19:38 UTC (permalink / raw) To: Jan Beulich, Anthony Perard, xen-devel@lists.xenproject.org Cc: George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu On 18/01/2023 8:06 am, Jan Beulich wrote: > On 17.01.2023 18:21, Andrew Cooper wrote: >> On 16/01/2023 6:10 pm, Anthony PERARD wrote: >>> + elif re.match(r'^[a-zA-Z_]', token): >> [...] >> >> All of this said, where is 0-9 in the token regex? Have we just got >> extremely lucky with having no embedded digits in identifiers thus far? > That's checking for just the first character, which can't be a digit? So it is... But nothing good can possibly come of having a token here which matches on the first char but mismatches later. > >> P.S. I probably don't want to know why we have to special case evtchn >> port, argo port and domain handle. I think it says more about the this >> bodge of a parser than anything else... > Iirc something broke without it, but it's been too long and spending a > reasonable amount of time trying to re-construct I couldn't come up > with anything. I didn't want to go as far as put time into actually > trying out what (if anything) breaks with those removed. What I'm > puzzled about is that argo and evtchn port types are handled in > different places. > > For the domain handle iirc the exception was attributed to it being > the only typedef of an array which is embedded in other structures. I refer back to "bodge of a parser". ~Andrew ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-17 17:21 ` Andrew Cooper 2023-01-18 8:06 ` Jan Beulich @ 2023-01-18 18:21 ` Anthony PERARD 2023-01-18 19:49 ` Andrew Cooper 2023-01-19 10:50 ` Jan Beulich 1 sibling, 2 replies; 12+ messages in thread From: Anthony PERARD @ 2023-01-18 18:21 UTC (permalink / raw) To: Andrew Cooper Cc: xen-devel@lists.xenproject.org, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu On Tue, Jan 17, 2023 at 05:21:32PM +0000, Andrew Cooper wrote: > On 16/01/2023 6:10 pm, Anthony PERARD wrote: > > +def get_typedefs(tokens): > > + level = 1 > > + state = 0 > > + typedefs = [] > > I'm pretty sure typedefs actually wants to be a dict rather than a list > (will have better "id in typedefs" performance lower down), but that > wants matching with code changes elsewhere, and probably wants doing > separately. I'm not sure that going to make a difference to have "id in ()" instead of "id in []". I just found out that `typedefs` is always empty... I don't know what get_typedefs() is supposed to do, or at least if it works as expected, because it always returns "" or an empty list. (even the shell version) So, it would actually be a bit faster to not call get_typedefs(), but I don't know if that's safe. > > + for token in tokens: > > + if token == 'typedef': > > + if level == 1: > > + state = 1 > > + elif re.match(r'^COMPAT_HANDLE\((.*)\)$', token): > > + if not (level != 1 or state != 1): > > + state = 2 > > + elif token in ['[', '{']: > > + level += 1 > > + elif token in [']', '}']: > > + level -= 1 > > + elif token == ';': > > + if level == 1: > > + state = 0 > > + elif re.match(r'^[a-zA-Z_]', token): > > + if not (level != 1 or state != 2): > > + typedefs.append(token) > > + return typedefs -- Anthony PERARD ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-18 18:21 ` Anthony PERARD @ 2023-01-18 19:49 ` Andrew Cooper 2023-01-19 10:50 ` Jan Beulich 1 sibling, 0 replies; 12+ messages in thread From: Andrew Cooper @ 2023-01-18 19:49 UTC (permalink / raw) To: Anthony Perard Cc: xen-devel@lists.xenproject.org, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu On 18/01/2023 6:21 pm, Anthony PERARD wrote: > On Tue, Jan 17, 2023 at 05:21:32PM +0000, Andrew Cooper wrote: >> On 16/01/2023 6:10 pm, Anthony PERARD wrote: >>> +def get_typedefs(tokens): >>> + level = 1 >>> + state = 0 >>> + typedefs = [] >> I'm pretty sure typedefs actually wants to be a dict rather than a list >> (will have better "id in typedefs" performance lower down), but that >> wants matching with code changes elsewhere, and probably wants doing >> separately. > I'm not sure that going to make a difference to have "id in ()" instead > of "id in []". It will in the middle of a tight loop. Less pointer chasing in memory. But it is very marginal. > I just found out that `typedefs` is always empty... > > I don't know what get_typedefs() is supposed to do, or at least if it > works as expected, because it always returns "" or an empty list. (even > the shell version) > > So, it would actually be a bit faster to not call get_typedefs(), but I > don't know if that's safe. If it is dead logic even at the shell level, drop it. Perhaps a prereq patch, because removing the complexity first will make this patch simpler to follow. The conversion is atypical python, and perf will improve (which is the point of this patch anyway). ~Andrew ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-18 18:21 ` Anthony PERARD 2023-01-18 19:49 ` Andrew Cooper @ 2023-01-19 10:50 ` Jan Beulich 2023-01-19 11:27 ` Andrew Cooper 1 sibling, 1 reply; 12+ messages in thread From: Jan Beulich @ 2023-01-19 10:50 UTC (permalink / raw) To: Anthony PERARD Cc: xen-devel@lists.xenproject.org, George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu, Andrew Cooper On 18.01.2023 19:21, Anthony PERARD wrote: > On Tue, Jan 17, 2023 at 05:21:32PM +0000, Andrew Cooper wrote: >> On 16/01/2023 6:10 pm, Anthony PERARD wrote: >>> +def get_typedefs(tokens): >>> + level = 1 >>> + state = 0 >>> + typedefs = [] >> >> I'm pretty sure typedefs actually wants to be a dict rather than a list >> (will have better "id in typedefs" performance lower down), but that >> wants matching with code changes elsewhere, and probably wants doing >> separately. > > I'm not sure that going to make a difference to have "id in ()" instead > of "id in []". I just found out that `typedefs` is always empty... > > I don't know what get_typedefs() is supposed to do, or at least if it > works as expected, because it always returns "" or an empty list. (even > the shell version) > > So, it would actually be a bit faster to not call get_typedefs(), but I > don't know if that's safe. There's exactly one instance that this would take care of: typedef XEN_GUEST_HANDLE(char) tmem_cli_va_t; But tmem.h isn't being processed anymore, and hence right now the list would always be empty. Are we going to be able to guarantee that going forward? Jan ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script 2023-01-19 10:50 ` Jan Beulich @ 2023-01-19 11:27 ` Andrew Cooper 0 siblings, 0 replies; 12+ messages in thread From: Andrew Cooper @ 2023-01-19 11:27 UTC (permalink / raw) To: Jan Beulich, Anthony Perard Cc: xen-devel@lists.xenproject.org, George Dunlap, Julien Grall, Stefano Stabellini, Wei Liu On 19/01/2023 10:50 am, Jan Beulich wrote: > On 18.01.2023 19:21, Anthony PERARD wrote: >> On Tue, Jan 17, 2023 at 05:21:32PM +0000, Andrew Cooper wrote: >>> On 16/01/2023 6:10 pm, Anthony PERARD wrote: >>>> +def get_typedefs(tokens): >>>> + level = 1 >>>> + state = 0 >>>> + typedefs = [] >>> I'm pretty sure typedefs actually wants to be a dict rather than a list >>> (will have better "id in typedefs" performance lower down), but that >>> wants matching with code changes elsewhere, and probably wants doing >>> separately. >> I'm not sure that going to make a difference to have "id in ()" instead >> of "id in []". I just found out that `typedefs` is always empty... >> >> I don't know what get_typedefs() is supposed to do, or at least if it >> works as expected, because it always returns "" or an empty list. (even >> the shell version) >> >> So, it would actually be a bit faster to not call get_typedefs(), but I >> don't know if that's safe. > There's exactly one instance that this would take care of: > > typedef XEN_GUEST_HANDLE(char) tmem_cli_va_t; > > But tmem.h isn't being processed anymore, and hence right now the list > would always be empty. Are we going to be able to guarantee that going > forward? IMO that's a code pattern we wouldn't want to repeat moving forward. There's already too much magic in a guest handle, without hiding it behind a typedef too. ~Andrew ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-01-19 11:28 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-01-16 18:10 [XEN PATCH v3 0/1] xen: rework compat headers generation Anthony PERARD 2023-01-16 18:10 ` [XEN PATCH v3 1/1] build: replace get-fields.sh by a python script Anthony PERARD 2023-01-17 16:07 ` Luca Fancellu 2023-01-17 16:55 ` Anthony PERARD 2023-01-17 16:58 ` Luca Fancellu 2023-01-17 17:21 ` Andrew Cooper 2023-01-18 8:06 ` Jan Beulich 2023-01-18 19:38 ` Andrew Cooper 2023-01-18 18:21 ` Anthony PERARD 2023-01-18 19:49 ` Andrew Cooper 2023-01-19 10:50 ` Jan Beulich 2023-01-19 11:27 ` Andrew Cooper
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.