From: Lucas Meneghel Rodrigues <lmr@redhat.com>
To: Michael Goldish <mgoldish@redhat.com>
Cc: autotest@test.kernel.org, kvm@vger.kernel.org
Subject: Re: [KVM-AUTOTEST PATCH v4] KVM test: A memory efficient kvm_config implementation
Date: Wed, 03 Mar 2010 11:44:36 -0300 [thread overview]
Message-ID: <1267627476.2565.2.camel@localhost.localdomain> (raw)
In-Reply-To: <1267551059-24189-1-git-send-email-mgoldish@redhat.com>
On Tue, 2010-03-02 at 19:30 +0200, Michael Goldish wrote:
> This patch:
>
> - Makes kvm_config use less memory during parsing, by storing config data
> compactly in arrays during parsing, and generating the final dicts only when
> requested.
> On my machine this results in 5-10 times less memory being used (depending on
> the size of the final generated list).
> This allows the test configuration to keep expanding without having the
> parser run out of memory.
>
> - Adds config.fork_and_parse(), a function that parses a config file/string in
> a forked process and then terminates the process. This works around Python's
> policy of keeping allocated memory to itself even after the objects occupying
> the memory have been destroyed. If the process that does the parsing is the
> same one that runs the tests, less memory will be available to the VMs during
> testing.
>
> - Makes parsing 4-5 times faster as a result of the new internal representation.
>
> Overall, kvm_config's memory usage should now be negligible in most cases.
>
> Changes from v3:
> - Use the homemade 'configreader' class instead of regular files in parse()
> and parse_variants() (readline() and/or seek() are very slow).
> - Use a regex cache dict (regex_cache).
> - Use a string cache dict in addition to the list (object_cache_indices).
> - Some changes to fork_and_parse() (disable buffering).
>
> Changes from v2:
> - Merged _get_next_line() and _get_next_line_indent().
> - Made _array_get_name() faster.
>
> Changes from v1:
> - Added config.get_generator() which is similar to get_list() but returns a
> dict generator instead of a list. This should save some more memory and will
> make tests start sooner.
> - Use get_generator() in control.
> - Call waitpid() at the end of fork_and_parse().
Since the generated patch is rather fragmented, making inline comments
awkward, I am going to post a single block of minor comments now that I
have reviewed the code:
Observations:
• When a file is missing, it's more appropriate to raise an IOError than
a bare Exception, so we must change that. It's also important to follow
the coding standards when raising exceptions.
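Just to illustrate what I mean, a rough sketch of how parse_file could
start (not necessarily the exact wording we'll end up with):

```python
import os

def parse_file(filename):
    """Parse file. If it doesn't exist, raise an IOError."""
    # An IOError is more descriptive than a bare Exception here, and the
    # standard raise syntax should be used instead of the old
    # 'raise Exception, msg' form.
    if not os.path.exists(filename):
        raise IOError("File %s not found" % filename)
    # ... actual parsing would follow here ...
```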
• I was wondering whether making fork_and_parse a public interface of
the config object was the right decision; maybe all calls to parse_file
should be done in a fork_and_parse fashion? I understand your point in
making it a public interface separate from parse_file, but isn't that
somewhat confusing for users (I mean, people writing control files for
kvm autotest)?
• About buffering in fork_and_parse: the performance penalty of
disabling buffering varies; with caches dropped it was something like
3-5%, and after 'warming up' it was something like 8-11%, so it's small
stuff. Still, we can favour speed in this case, so the final version
won't disable buffering.
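For the record, the only difference is the bufsize argument to
os.fdopen. A quick sketch of both setups (using binary mode here for
illustration; the patch itself uses text mode):

```python
import os

# Pipe setup as in fork_and_parse. The default (buffered) mode is what
# the final version keeps; passing bufsize=0 disables buffering and was
# measured as roughly 3-11% slower overall.
r_fd, w_fd = os.pipe()
r = os.fdopen(r_fd, "rb")   # buffered (default)
w = os.fdopen(w_fd, "wb")
# The unbuffered alternative would be:
#   r = os.fdopen(r_fd, "rb", 0)
#   w = os.fdopen(w_fd, "wb", 0)
w.write(b"ping")
w.close()
data = r.read()
r.close()
```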
Compliments:
• The configreader class was a very interesting move: simple, clean and
fast. Congrats!
• The output of the config system is good for debugging purposes, so
we'll stick with it.
• Thank you very much for your work. We now have faster parsing that
consumes a lot less memory, so smaller boxes will benefit a *lot* from
this.
What I am going to do:
• I will re-send the version with the tiny changes I made so it gets
recorded on patchwork, and soon afterwards I'll apply it upstream. I
think from this point on we'll only have minor tweaks to make.
>
> Signed-off-by: Michael Goldish <mgoldish@redhat.com>
> ---
> client/tests/kvm/control | 30 +-
> client/tests/kvm/control.parallel | 21 +-
> client/tests/kvm/kvm_config.py | 832 ++++++++++++++++++++++---------------
> 3 files changed, 535 insertions(+), 348 deletions(-)
>
> diff --git a/client/tests/kvm/control b/client/tests/kvm/control
> index 163286e..15c4539 100644
> --- a/client/tests/kvm/control
> +++ b/client/tests/kvm/control
> @@ -30,34 +30,38 @@ import kvm_utils, kvm_config
> # set English environment (command output might be localized, need to be safe)
> os.environ['LANG'] = 'en_US.UTF-8'
>
> -build_cfg_path = os.path.join(kvm_test_dir, "build.cfg")
> -build_cfg = kvm_config.config(build_cfg_path)
> -# Make any desired changes to the build configuration here. For example:
> -#build_cfg.parse_string("""
> +str = """
> +# This string will be parsed after build.cfg. Make any desired changes to the
> +# build configuration here. For example:
> #release_tag = 84
> -#""")
> -if not kvm_utils.run_tests(build_cfg.get_list(), job):
> +"""
> +build_cfg = kvm_config.config()
> +build_cfg_path = os.path.join(kvm_test_dir, "build.cfg")
> +build_cfg.fork_and_parse(build_cfg_path, str)
> +if not kvm_utils.run_tests(build_cfg.get_generator(), job):
> logging.error("KVM build step failed, exiting.")
> sys.exit(1)
>
> -tests_cfg_path = os.path.join(kvm_test_dir, "tests.cfg")
> -tests_cfg = kvm_config.config(tests_cfg_path)
> -# Make any desired changes to the test configuration here. For example:
> -#tests_cfg.parse_string("""
> +str = """
> +# This string will be parsed after tests.cfg. Make any desired changes to the
> +# test configuration here. For example:
> #display = sdl
> #install|setup: timeout_multiplier = 3
> -#""")
> +"""
> +tests_cfg = kvm_config.config()
> +tests_cfg_path = os.path.join(kvm_test_dir, "tests.cfg")
> +tests_cfg.fork_and_parse(tests_cfg_path, str)
>
> pools_cfg_path = os.path.join(kvm_test_dir, "address_pools.cfg")
> tests_cfg.parse_file(pools_cfg_path)
> hostname = os.uname()[1].split(".")[0]
> -if tests_cfg.filter("^" + hostname):
> +if tests_cfg.count("^" + hostname):
> tests_cfg.parse_string("only ^%s" % hostname)
> else:
> tests_cfg.parse_string("only ^default_host")
>
> # Run the tests
> -kvm_utils.run_tests(tests_cfg.get_list(), job)
> +kvm_utils.run_tests(tests_cfg.get_generator(), job)
>
> # Generate a nice HTML report inside the job's results dir
> kvm_utils.create_report(kvm_test_dir, job.resultdir)
> diff --git a/client/tests/kvm/control.parallel b/client/tests/kvm/control.parallel
> index 343f694..07bc6e5 100644
> --- a/client/tests/kvm/control.parallel
> +++ b/client/tests/kvm/control.parallel
> @@ -160,19 +160,22 @@ if not params.get("mode") == "noinstall":
> # ----------------------------------------------------------
> import kvm_config
>
> -filename = os.path.join(pwd, "kvm_tests.cfg")
> -cfg = kvm_config.config(filename)
> -
> -# If desirable, make changes to the test configuration here. For example:
> -# cfg.parse_string("install|setup: timeout_multiplier = 2")
> -# cfg.parse_string("only fc8_quick")
> -# cfg.parse_string("display = sdl")
> +str = """
> +# This string will be parsed after tests.cfg. Make any desired changes to the
> +# test configuration here. For example:
> +#install|setup: timeout_multiplier = 3
> +#only fc8_quick
> +#display = sdl
> +"""
> +cfg = kvm_config.config()
> +filename = os.path.join(pwd, "tests.cfg")
> +cfg.fork_and_parse(filename, str)
>
> -filename = os.path.join(pwd, "kvm_address_pools.cfg")
> +filename = os.path.join(pwd, "address_pools.cfg")
> if os.path.exists(filename):
> cfg.parse_file(filename)
> hostname = os.uname()[1].split(".")[0]
> - if cfg.filter("^" + hostname):
> + if cfg.count("^" + hostname):
> cfg.parse_string("only ^%s" % hostname)
> else:
> cfg.parse_string("only ^default_host")
> diff --git a/client/tests/kvm/kvm_config.py b/client/tests/kvm/kvm_config.py
> index 798ef56..7ff28e4 100755
> --- a/client/tests/kvm/kvm_config.py
> +++ b/client/tests/kvm/kvm_config.py
> @@ -2,10 +2,10 @@
> """
> KVM configuration file utility functions.
>
> -@copyright: Red Hat 2008-2009
> +@copyright: Red Hat 2008-2010
> """
>
> -import logging, re, os, sys, StringIO, optparse
> +import logging, re, os, sys, optparse, array, traceback, cPickle
> import common
> from autotest_lib.client.common_lib import error
> from autotest_lib.client.common_lib import logging_config, logging_manager
> @@ -21,490 +21,670 @@ class config:
> """
> Parse an input file or string that follows the KVM Test Config File format
> and generate a list of dicts that will be later used as configuration
> - parameters by the the KVM tests.
> + parameters by the KVM tests.
>
> @see: http://www.linux-kvm.org/page/KVM-Autotest/Test_Config_File
> """
>
> - def __init__(self, filename=None, debug=False):
> + def __init__(self, filename=None, debug=True):
> """
> - Initialize the list and optionally parse filename.
> + Initialize the list and optionally parse a file.
>
> @param filename: Path of the file that will be taken.
> - @param debug: Whether to turn debugging output.
> + @param debug: Whether to turn on debugging output.
> """
> - self.list = [{"name": "", "shortname": "", "depend": []}]
> - self.debug = debug
> + self.list = [array.array("H", [4, 4, 4, 4])]
> + self.object_cache = []
> + self.object_cache_indices = {}
> + self.regex_cache = {}
> self.filename = filename
> + self.debug = debug
> if filename:
> self.parse_file(filename)
>
>
> def parse_file(self, filename):
> """
> - Parse filename, return the resulting list and store it in .list. If
> - filename does not exist, raise an exception.
> + Parse file. If it doesn't exist, raise an exception.
>
> @param filename: Path of the configuration file.
> """
> if not os.path.exists(filename):
> raise Exception, "File %s not found" % filename
> self.filename = filename
> - file = open(filename, "r")
> - self.list = self.parse(file, self.list)
> - file.close()
> - return self.list
> + str = open(filename).read()
> + self.list = self.parse(configreader(str), self.list)
>
>
> def parse_string(self, str):
> """
> - Parse a string, return the resulting list and store it in .list.
> + Parse a string.
>
> - @param str: String that will be parsed.
> + @param str: String to parse.
> """
> - file = StringIO.StringIO(str)
> - self.list = self.parse(file, self.list)
> - file.close()
> - return self.list
> + self.list = self.parse(configreader(str), self.list)
>
>
> - def get_list(self):
> - """
> - Return the list of dictionaries. This should probably be called after
> - parsing something.
> + def fork_and_parse(self, filename=None, str=None):
> """
> - return self.list
> + Parse a file and/or a string in a separate process to save memory.
>
> + Python likes to keep memory to itself even after the objects occupying
> + it have been destroyed. If during a call to parse_file() or
> + parse_string() a lot of memory is used, it can only be freed by
> + terminating the process. This function works around the problem by
> + doing the parsing in a forked process and then terminating it, freeing
> + any unneeded memory.
>
> - def match(self, filter, dict):
> - """
> - Return True if dict matches filter.
> + Note: if an exception is raised during parsing, its information will be
> + printed, and the resulting list will be empty. The exception will not
> + be raised in the process calling this function.
>
> - @param filter: A regular expression that defines the filter.
> - @param dict: Dictionary that will be inspected.
> + @param filename: Path of file to parse (optional).
> + @param str: String to parse (optional).
> """
> - filter = re.compile(r"(\.|^)(%s)(\.|$)" % filter)
> - return bool(filter.search(dict["name"]))
> -
> -
> - def filter(self, filter, list=None):
> + r, w = os.pipe()
> + r, w = os.fdopen(r, "r", 0), os.fdopen(w, "w", 0)
> + pid = os.fork()
> + if not pid:
> + # Child process
> + r.close()
> + try:
> + if filename:
> + self.parse_file(filename)
> + if str:
> + self.parse_string(str)
> + except:
> + traceback.print_exc()
> + self.list = []
> + # Convert the arrays to strings before pickling because at least
> + # some Python versions can't pickle/unpickle arrays
> + l = [a.tostring() for a in self.list]
> + cPickle.dump((l, self.object_cache), w, -1)
> + w.close()
> + os._exit(0)
> + else:
> + # Parent process
> + w.close()
> + (l, self.object_cache) = cPickle.load(r)
> + r.close()
> + os.waitpid(pid, 0)
> + self.list = []
> + for s in l:
> + a = array.array("H")
> + a.fromstring(s)
> + self.list.append(a)
> +
> +
> + def get_generator(self):
> """
> - Filter a list of dicts.
> + Generate dictionaries from the code parsed so far. This should
> + probably be called after parsing something.
>
> - @param filter: A regular expression that will be used as a filter.
> - @param list: A list of dictionaries that will be filtered.
> + @return: A dict generator.
> """
> - if list is None:
> - list = self.list
> - return [dict for dict in list if self.match(filter, dict)]
> + for a in self.list:
> + name, shortname, depend, content = _array_get_all(a, self.object_cache)
> + dict = {"name": name, "shortname": shortname, "depend": depend}
> + self._apply_content_to_dict(dict, content)
> + yield dict
>
>
> - def split_and_strip(self, str, sep="="):
> + def get_list(self):
> """
> - Split str and strip quotes from the resulting parts.
> + Generate a list of dictionaries from the code parsed so far.
> + This should probably be called after parsing something.
>
> - @param str: String that will be processed
> - @param sep: Separator that will be used to split the string
> + @return: A list of dicts.
> """
> - temp = str.split(sep, 1)
> - for i in range(len(temp)):
> - temp[i] = temp[i].strip()
> - if re.findall("^\".*\"$", temp[i]):
> - temp[i] = temp[i].strip("\"")
> - elif re.findall("^\'.*\'$", temp[i]):
> - temp[i] = temp[i].strip("\'")
> - return temp
> -
> + return list(self.get_generator())
>
> - def get_next_line(self, file):
> - """
> - Get the next non-empty, non-comment line in a file like object.
>
> - @param file: File like object
> - @return: If no line is available, return None.
> + def count(self, filter=".*"):
> """
> - while True:
> - line = file.readline()
> - if line == "": return None
> - stripped_line = line.strip()
> - if len(stripped_line) > 0 \
> - and not stripped_line.startswith('#') \
> - and not stripped_line.startswith('//'):
> - return line
> -
> + Return the number of dictionaries whose names match filter.
>
> - def get_next_line_indent(self, file):
> + @param filter: A regular expression string.
> """
> - Return the indent level of the next non-empty, non-comment line in file.
> -
> - @param file: File like object.
> - @return: If no line is available, return -1.
> - """
> - pos = file.tell()
> - line = self.get_next_line(file)
> - if not line:
> - file.seek(pos)
> - return -1
> - line = line.expandtabs()
> - indent = 0
> - while line[indent] == ' ':
> - indent += 1
> - file.seek(pos)
> - return indent
> -
> -
> - def add_name(self, str, name, append=False):
> - """
> - Add name to str with a separator dot and return the result.
> -
> - @param str: String that will be processed
> - @param name: name that will be appended to the string.
> - @return: If append is True, append name to str.
> - Otherwise, pre-pend name to str.
> - """
> - if str == "":
> - return name
> - # Append?
> - elif append:
> - return str + "." + name
> - # Prepend?
> - else:
> - return name + "." + str
> + exp = self._get_filter_regex(filter)
> + count = 0
> + for a in self.list:
> + name = _array_get_name(a, self.object_cache)
> + if exp.search(name):
> + count += 1
> + return count
>
>
> - def parse_variants(self, file, list, subvariants=False, prev_indent=-1):
> + def parse_variants(self, cr, list, subvariants=False, prev_indent=-1):
> """
> - Read and parse lines from file like object until a line with an indent
> - level lower than or equal to prev_indent is encountered.
> + Read and parse lines from a configreader object until a line with an
> + indent level lower than or equal to prev_indent is encountered.
>
> - @brief: Parse a 'variants' or 'subvariants' block from a file-like
> - object.
> - @param file: File-like object that will be parsed
> - @param list: List of dicts to operate on
> + @brief: Parse a 'variants' or 'subvariants' block from a configreader
> + object.
> + @param cr: configreader object to be parsed.
> + @param list: List of arrays to operate on.
> @param subvariants: If True, parse in 'subvariants' mode;
> - otherwise parse in 'variants' mode
> - @param prev_indent: The indent level of the "parent" block
> - @return: The resulting list of dicts.
> + otherwise parse in 'variants' mode.
> + @param prev_indent: The indent level of the "parent" block.
> + @return: The resulting list of arrays.
> """
> new_list = []
>
> while True:
> - indent = self.get_next_line_indent(file)
> + pos = cr.tell()
> + (indented_line, line, indent) = cr.get_next_line()
> if indent <= prev_indent:
> + cr.seek(pos)
> break
> - indented_line = self.get_next_line(file).rstrip()
> - line = indented_line.strip()
>
> # Get name and dependencies
> - temp = line.strip("- ").split(":")
> - name = temp[0]
> - if len(temp) == 1:
> - dep_list = []
> - else:
> - dep_list = temp[1].split()
> + (name, depend) = map(str.strip, line.lstrip("- ").split(":"))
>
> # See if name should be added to the 'shortname' field
> - add_to_shortname = True
> - if name.startswith("@"):
> - name = name.strip("@")
> - add_to_shortname = False
> -
> - # Make a deep copy of list
> - temp_list = []
> - for dict in list:
> - new_dict = dict.copy()
> - new_dict["depend"] = dict["depend"][:]
> - temp_list.append(new_dict)
> + add_to_shortname = not name.startswith("@")
> + name = name.lstrip("@")
> +
> + # Store name and dependencies in cache and get their indices
> + n = self._store_str(name)
> + d = self._store_str(depend)
> +
> + # Make a copy of list
> + temp_list = [a[:] for a in list]
>
> if subvariants:
> # If we're parsing 'subvariants', first modify the list
> - self.__modify_list_subvariants(temp_list, name, dep_list,
> - add_to_shortname)
> - temp_list = self.parse(file, temp_list,
> - restricted=True, prev_indent=indent)
> + if add_to_shortname:
> + for a in temp_list:
> + _array_append_to_name_shortname_depend(a, n, d)
> + else:
> + for a in temp_list:
> + _array_append_to_name_depend(a, n, d)
> + temp_list = self.parse(cr, temp_list, restricted=True,
> + prev_indent=indent)
> else:
> # If we're parsing 'variants', parse before modifying the list
> if self.debug:
> - self.__debug_print(indented_line,
> - "Entering variant '%s' "
> - "(variant inherits %d dicts)" %
> - (name, len(list)))
> - temp_list = self.parse(file, temp_list,
> - restricted=False, prev_indent=indent)
> - self.__modify_list_variants(temp_list, name, dep_list,
> - add_to_shortname)
> + _debug_print(indented_line,
> + "Entering variant '%s' "
> + "(variant inherits %d dicts)" %
> + (name, len(list)))
> + temp_list = self.parse(cr, temp_list, restricted=False,
> + prev_indent=indent)
> + if add_to_shortname:
> + for a in temp_list:
> + _array_prepend_to_name_shortname_depend(a, n, d)
> + else:
> + for a in temp_list:
> + _array_prepend_to_name_depend(a, n, d)
>
> new_list += temp_list
>
> return new_list
>
>
> - def parse(self, file, list, restricted=False, prev_indent=-1):
> + def parse(self, cr, list, restricted=False, prev_indent=-1):
> """
> - Read and parse lines from file until a line with an indent level lower
> - than or equal to prev_indent is encountered.
> -
> - @brief: Parse a file-like object.
> - @param file: A file-like object
> - @param list: A list of dicts to operate on (list is modified in
> - place and should not be used after the call)
> - @param restricted: if True, operate in restricted mode
> - (prohibit 'variants')
> - @param prev_indent: the indent level of the "parent" block
> - @return: Return the resulting list of dicts.
> + Read and parse lines from a configreader object until a line with an
> + indent level lower than or equal to prev_indent is encountered.
> +
> + @brief: Parse a configreader object.
> + @param cr: A configreader object.
> + @param list: A list of arrays to operate on (list is modified in
> + place and should not be used after the call).
> + @param restricted: If True, operate in restricted mode
> + (prohibit 'variants').
> + @param prev_indent: The indent level of the "parent" block.
> + @return: The resulting list of arrays.
> @note: List is destroyed and should not be used after the call.
> - Only the returned list should be used.
> + Only the returned list should be used.
> """
> + current_block = ""
> +
> while True:
> - indent = self.get_next_line_indent(file)
> + pos = cr.tell()
> + (indented_line, line, indent) = cr.get_next_line()
> if indent <= prev_indent:
> + cr.seek(pos)
> + self._append_content_to_arrays(list, current_block)
> break
> - indented_line = self.get_next_line(file).rstrip()
> - line = indented_line.strip()
> - words = line.split()
>
> len_list = len(list)
>
> - # Look for a known operator in the line
> - operators = ["?+=", "?<=", "?=", "+=", "<=", "="]
> - op_found = None
> - op_pos = len(line)
> - for op in operators:
> - pos = line.find(op)
> - if pos >= 0 and pos < op_pos:
> - op_found = op
> - op_pos = pos
> -
> - # Found an operator?
> - if op_found:
> + # Parse assignment operators (keep lines in temporary buffer)
> + if "=" in line:
> if self.debug and not restricted:
> - self.__debug_print(indented_line,
> - "Parsing operator (%d dicts in current "
> - "context)" % len_list)
> - (left, value) = self.split_and_strip(line, op_found)
> - filters_and_key = self.split_and_strip(left, ":")
> - filters = filters_and_key[:-1]
> - key = filters_and_key[-1]
> - filtered_list = list
> - for filter in filters:
> - filtered_list = self.filter(filter, filtered_list)
> - # Apply the operation to the filtered list
> - if op_found == "=":
> - for dict in filtered_list:
> - dict[key] = value
> - elif op_found == "+=":
> - for dict in filtered_list:
> - dict[key] = dict.get(key, "") + value
> - elif op_found == "<=":
> - for dict in filtered_list:
> - dict[key] = value + dict.get(key, "")
> - elif op_found.startswith("?"):
> - exp = re.compile("^(%s)$" % key)
> - if op_found == "?=":
> - for dict in filtered_list:
> - for key in dict.keys():
> - if exp.match(key):
> - dict[key] = value
> - elif op_found == "?+=":
> - for dict in filtered_list:
> - for key in dict.keys():
> - if exp.match(key):
> - dict[key] = dict.get(key, "") + value
> - elif op_found == "?<=":
> - for dict in filtered_list:
> - for key in dict.keys():
> - if exp.match(key):
> - dict[key] = value + dict.get(key, "")
> + _debug_print(indented_line,
> + "Parsing operator (%d dicts in current "
> + "context)" % len_list)
> + current_block += line + "\n"
> + continue
> +
> + # Flush the temporary buffer
> + self._append_content_to_arrays(list, current_block)
> + current_block = ""
> +
> + words = line.split()
>
> # Parse 'no' and 'only' statements
> - elif words[0] == "no" or words[0] == "only":
> + if words[0] == "no" or words[0] == "only":
> if len(words) <= 1:
> continue
> - filters = words[1:]
> + filters = map(self._get_filter_regex, words[1:])
> filtered_list = []
> if words[0] == "no":
> - for dict in list:
> + for a in list:
> + name = _array_get_name(a, self.object_cache)
> for filter in filters:
> - if self.match(filter, dict):
> + if filter.search(name):
> break
> else:
> - filtered_list.append(dict)
> + filtered_list.append(a)
> if words[0] == "only":
> - for dict in list:
> + for a in list:
> + name = _array_get_name(a, self.object_cache)
> for filter in filters:
> - if self.match(filter, dict):
> - filtered_list.append(dict)
> + if filter.search(name):
> + filtered_list.append(a)
> break
> list = filtered_list
> if self.debug and not restricted:
> - self.__debug_print(indented_line,
> - "Parsing no/only (%d dicts in current "
> - "context, %d remain)" %
> - (len_list, len(list)))
> + _debug_print(indented_line,
> + "Parsing no/only (%d dicts in current "
> + "context, %d remain)" %
> + (len_list, len(list)))
> + continue
>
> # Parse 'variants'
> - elif line == "variants:":
> + if line == "variants:":
> # 'variants' not allowed in restricted mode
> # (inside an exception or inside subvariants)
> if restricted:
> e_msg = "Using variants in this context is not allowed"
> raise error.AutotestError(e_msg)
> if self.debug and not restricted:
> - self.__debug_print(indented_line,
> - "Entering variants block (%d dicts in "
> - "current context)" % len_list)
> - list = self.parse_variants(file, list, subvariants=False,
> + _debug_print(indented_line,
> + "Entering variants block (%d dicts in "
> + "current context)" % len_list)
> + list = self.parse_variants(cr, list, subvariants=False,
> prev_indent=indent)
> + continue
>
> # Parse 'subvariants' (the block is parsed for each dict
> # separately)
> - elif line == "subvariants:":
> + if line == "subvariants:":
> if self.debug and not restricted:
> - self.__debug_print(indented_line,
> - "Entering subvariants block (%d dicts in "
> - "current context)" % len_list)
> + _debug_print(indented_line,
> + "Entering subvariants block (%d dicts in "
> + "current context)" % len_list)
> new_list = []
> - # Remember current file position
> - pos = file.tell()
> + # Remember current position
> + pos = cr.tell()
> # Read the lines in any case
> - self.parse_variants(file, [], subvariants=True,
> + self.parse_variants(cr, [], subvariants=True,
> prev_indent=indent)
> # Iterate over the list...
> - for index in range(len(list)):
> - # Revert to initial file position in this 'subvariants'
> - # block
> - file.seek(pos)
> + for index in xrange(len(list)):
> + # Revert to initial position in this 'subvariants' block
> + cr.seek(pos)
> # Everything inside 'subvariants' should be parsed in
> # restricted mode
> - new_list += self.parse_variants(file, list[index:index+1],
> + new_list += self.parse_variants(cr, list[index:index+1],
> subvariants=True,
> prev_indent=indent)
> list = new_list
> + continue
>
> # Parse 'include' statements
> - elif words[0] == "include":
> + if words[0] == "include":
> if len(words) <= 1:
> continue
> if self.debug and not restricted:
> - self.__debug_print(indented_line,
> - "Entering file %s" % words[1])
> + _debug_print(indented_line, "Entering file %s" % words[1])
> if self.filename:
> filename = os.path.join(os.path.dirname(self.filename),
> words[1])
> if os.path.exists(filename):
> - new_file = open(filename, "r")
> - list = self.parse(new_file, list, restricted)
> - new_file.close()
> + str = open(filename).read()
> + list = self.parse(configreader(str), list, restricted)
> if self.debug and not restricted:
> - self.__debug_print("", "Leaving file %s" % words[1])
> + _debug_print("", "Leaving file %s" % words[1])
> else:
> logging.warning("Cannot include %s -- file not found",
> filename)
> else:
> logging.warning("Cannot include %s because no file is "
> "currently open", words[1])
> + continue
>
> # Parse multi-line exceptions
> # (the block is parsed for each dict separately)
> - elif line.endswith(":"):
> + if line.endswith(":"):
> if self.debug and not restricted:
> - self.__debug_print(indented_line,
> - "Entering multi-line exception block "
> - "(%d dicts in current context outside "
> - "exception)" % len_list)
> - line = line.strip(":")
> + _debug_print(indented_line,
> + "Entering multi-line exception block "
> + "(%d dicts in current context outside "
> + "exception)" % len_list)
> + line = line[:-1]
> new_list = []
> - # Remember current file position
> - pos = file.tell()
> + # Remember current position
> + pos = cr.tell()
> # Read the lines in any case
> - self.parse(file, [], restricted=True, prev_indent=indent)
> + self.parse(cr, [], restricted=True, prev_indent=indent)
> # Iterate over the list...
> - for index in range(len(list)):
> - if self.match(line, list[index]):
> - # Revert to initial file position in this
> - # exception block
> - file.seek(pos)
> + exp = self._get_filter_regex(line)
> + for index in xrange(len(list)):
> + name = _array_get_name(list[index], self.object_cache)
> + if exp.search(name):
> + # Revert to initial position in this exception block
> + cr.seek(pos)
> # Everything inside an exception should be parsed in
> # restricted mode
> - new_list += self.parse(file, list[index:index+1],
> + new_list += self.parse(cr, list[index:index+1],
> restricted=True,
> prev_indent=indent)
> else:
> - new_list += list[index:index+1]
> + new_list.append(list[index])
> list = new_list
> + continue
>
> return list
>
>
> - def __debug_print(self, str1, str2=""):
> + def _get_filter_regex(self, filter):
> """
> - Nicely print two strings and an arrow.
> + Return a regex object corresponding to a given filter string.
>
> - @param str1: First string
> - @param str2: Second string
> + All regular expressions given to the parser are passed through this
> + function first. Its purpose is to make them more specific and better
> + suited to match dictionary names: it forces simple expressions to match
> + only between dots or at the beginning or end of a string. For example,
> + the filter 'foo' will match 'foo.bar' but not 'foobar'.
> """
> - if str2:
> - str = "%-50s ---> %s" % (str1, str2)
> - else:
> - str = str1
> - logging.debug(str)
> -
> -
> - def __modify_list_variants(self, list, name, dep_list, add_to_shortname):
> - """
> - Make some modifications to list, as part of parsing a 'variants' block.
> -
> - @param list: List to be processed
> - @param name: Name to be prepended to the dictionary's 'name' key
> - @param dep_list: List of dependencies to be added to the dictionary's
> - 'depend' key
> - @param add_to_shortname: Boolean indicating whether name should be
> - prepended to the dictionary's 'shortname' key as well
> - """
> - for dict in list:
> - # Prepend name to the dict's 'name' field
> - dict["name"] = self.add_name(dict["name"], name)
> - # Prepend name to the dict's 'shortname' field
> - if add_to_shortname:
> - dict["shortname"] = self.add_name(dict["shortname"], name)
> - # Prepend name to each of the dict's dependencies
> - for i in range(len(dict["depend"])):
> - dict["depend"][i] = self.add_name(dict["depend"][i], name)
> - # Add new dependencies
> - dict["depend"] += dep_list
> -
> -
> - def __modify_list_subvariants(self, list, name, dep_list, add_to_shortname):
> - """
> - Make some modifications to list, as part of parsing a 'subvariants'
> - block.
> -
> - @param list: List to be processed
> - @param name: Name to be appended to the dictionary's 'name' key
> - @param dep_list: List of dependencies to be added to the dictionary's
> - 'depend' key
> - @param add_to_shortname: Boolean indicating whether name should be
> - appended to the dictionary's 'shortname' as well
> - """
> - for dict in list:
> - # Add new dependencies
> - for dep in dep_list:
> - dep_name = self.add_name(dict["name"], dep, append=True)
> - dict["depend"].append(dep_name)
> - # Append name to the dict's 'name' field
> - dict["name"] = self.add_name(dict["name"], name, append=True)
> - # Append name to the dict's 'shortname' field
> - if add_to_shortname:
> - dict["shortname"] = self.add_name(dict["shortname"], name,
> - append=True)
> + try:
> + return self.regex_cache[filter]
> + except KeyError:
> + exp = re.compile(r"(\.|^)(%s)(\.|$)" % filter)
> + self.regex_cache[filter] = exp
> + return exp
> +
> +
> + def _store_str(self, str):
> + """
> + Store str in the internal object cache, if it isn't already there, and
> + return its identifying index.
> +
> + @param str: String to store.
> + @return: The index of str in the object cache.
> + """
> + try:
> + return self.object_cache_indices[str]
> + except KeyError:
> + self.object_cache.append(str)
> + index = len(self.object_cache) - 1
> + self.object_cache_indices[str] = index
> + return index
> +
> +
> + def _append_content_to_arrays(self, list, content):
> + """
> + Append content (config code containing assignment operations) to a list
> + of arrays.
> +
> + @param list: List of arrays to operate on.
> + @param content: String containing assignment operations.
> + """
> + if content:
> + str_index = self._store_str(content)
> + for a in list:
> + _array_append_to_content(a, str_index)
> +
> +
> + def _apply_content_to_dict(self, dict, content):
> + """
> + Apply the operations in content (config code containing assignment
> + operations) to a dict.
> +
> + @param dict: Dictionary to operate on. Must have 'name' key.
> + @param content: String containing assignment operations.
> + """
> + for line in content.splitlines():
> + op_found = None
> + op_pos = len(line)
> + for op in ops:
> + pos = line.find(op)
> + if pos >= 0 and pos < op_pos:
> + op_found = op
> + op_pos = pos
> + if not op_found:
> + continue
> + (left, value) = map(str.strip, line.split(op_found, 1))
> + if value and ((value[0] == '"' and value[-1] == '"') or
> + (value[0] == "'" and value[-1] == "'")):
> + value = value[1:-1]
> + filters_and_key = map(str.strip, left.split(":"))
> + filters = filters_and_key[:-1]
> + key = filters_and_key[-1]
> + for filter in filters:
> + exp = self._get_filter_regex(filter)
> + if not exp.search(dict["name"]):
> + break
> + else:
> + ops[op_found](dict, key, value)
> +
> +
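
The for/else idiom in _apply_content_to_dict() is easy to misread: the
ops[op_found] call runs only when *no* filter failed, and the operator scan
picks the operator that appears earliest in the line (so "a += b" parses as
+=, not =). A minimal standalone sketch of that same scan-and-filter logic
(the dict names and config lines here are illustrative, not from the patch):

```python
import re

# Operators are scanned by position, not dict order: whichever operator
# appears earliest in the line wins, so "a += b" parses as +=, not =.
OPS = {"=": lambda d, k, v: d.__setitem__(k, v),
       "+=": lambda d, k, v: d.__setitem__(k, d.get(k, "") + v)}

def apply_line(d, line):
    # Find the operator with the smallest position in the line.
    op_found, op_pos = None, len(line)
    for op in OPS:
        pos = line.find(op)
        if 0 <= pos < op_pos:
            op_found, op_pos = op, pos
    if not op_found:
        return
    left, value = (s.strip() for s in line.split(op_found, 1))
    # Everything before the last ":" is a name filter; the rest is the key.
    *filters, key = (s.strip() for s in left.split(":"))
    for f in filters:
        if not re.search(r"(\.|^)(%s)(\.|$)" % f, d["name"]):
            break                      # a filter failed: skip this line
    else:
        OPS[op_found](d, key, value)   # runs only when no filter failed

d = {"name": "smp2.Fedora"}
apply_line(d, "Fedora: mem = 512")     # filter matches -> applied
apply_line(d, "RHEL: mem = 1024")      # filter fails -> ignored
apply_line(d, "mem += 0")              # += wins over the embedded =
```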
> +# Assignment operators
> +
> +def _op_set(dict, key, value):
> + dict[key] = value
> +
> +
> +def _op_append(dict, key, value):
> + dict[key] = dict.get(key, "") + value
> +
> +
> +def _op_prepend(dict, key, value):
> + dict[key] = value + dict.get(key, "")
> +
> +
> +def _op_regex_set(dict, exp, value):
> + exp = re.compile("^(%s)$" % exp)
> + for key in dict:
> + if exp.match(key):
> + dict[key] = value
> +
> +
> +def _op_regex_append(dict, exp, value):
> + exp = re.compile("^(%s)$" % exp)
> + for key in dict:
> + if exp.match(key):
> + dict[key] += value
> +
> +
> +def _op_regex_prepend(dict, exp, value):
> + exp = re.compile("^(%s)$" % exp)
> + for key in dict:
> + if exp.match(key):
> + dict[key] = value + dict[key]
> +
> +
> +ops = {
> + "=": _op_set,
> + "+=": _op_append,
> + "<=": _op_prepend,
> + "?=": _op_regex_set,
> + "?+=": _op_regex_append,
> + "?<=": _op_regex_prepend,
> +}
> +
> +
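
For readers less familiar with the config syntax, the operator table above
boils down to this (a self-contained sketch; the key names and values are
made up for illustration — the "?"-prefixed forms treat the key as a regex
matched against *existing* keys only):

```python
import re

def _op_set(d, key, value):
    d[key] = value                         # "="

def _op_append(d, key, value):
    d[key] = d.get(key, "") + value        # "+="

def _op_prepend(d, key, value):
    d[key] = value + d.get(key, "")        # "<="

def _op_regex_set(d, exp, value):
    # "?=" family: the key is a regex applied to keys already in the dict,
    # so a non-matching pattern is a silent no-op.
    exp = re.compile("^(%s)$" % exp)
    for key in d:
        if exp.match(key):
            d[key] = value

d = {}
_op_set(d, "extra_params", "-snapshot")        # =
_op_append(d, "extra_params", " -m 512")       # +=
_op_prepend(d, "extra_params", "qemu ")        # <=
_op_regex_set(d, "extra_.*", "overridden")     # ?=  (matches)
_op_regex_set(d, "missing_.*", "ignored")      # ?=  (no match: no-op)
```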
> +# Misc functions
> +
> +def _debug_print(str1, str2=""):
> + """
> + Log two strings joined by an arrow, or just the first string if the
> + second is empty.
> +
> + @param str1: First string.
> + @param str2: Second string.
> + """
> + if str2:
> + str = "%-50s ---> %s" % (str1, str2)
> + else:
> + str = str1
> + logging.debug(str)
> +
> +
> +# configreader
> +
> +class configreader:
> + """
> + Preprocess an input string and provide file-like services.
> + This is intended as a replacement for the file and StringIO classes,
> + whose readline() and/or seek() methods seem to be slow.
> + """
> +
> + def __init__(self, str):
> + """
> + Initialize the reader.
> +
> + @param str: The string to parse.
> + """
> + self.line_index = 0
> + self.lines = []
> + for line in str.splitlines():
> + line = line.rstrip().expandtabs()
> + stripped_line = line.strip()
> + indent = len(line) - len(stripped_line)
> + if (not stripped_line
> + or stripped_line.startswith("#")
> + or stripped_line.startswith("//")):
> + continue
> + self.lines.append((line, stripped_line, indent))
> +
> +
> + def get_next_line(self):
> + """
> + Get the next non-empty, non-comment line in the string.
> +
> + @return: (line, stripped_line, indent), where indent is the line's
> + indent level or -1 if no line is available.
> + """
> + try:
> + if self.line_index < len(self.lines):
> + return self.lines[self.line_index]
> + else:
> + return (None, None, -1)
> + finally:
> + self.line_index += 1
> +
> +
> + def tell(self):
> + """
> + Return the current line index.
> + """
> + return self.line_index
> +
> +
> + def seek(self, index):
> + """
> + Set the current line index.
> + """
> + self.line_index = index
> +
> +
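
The reader above trades file-like generality for a precomputed list of
(line, stripped_line, indent) tuples: comments and blank lines are dropped
once up front, and tell()/seek() become plain integer bookkeeping, which is
what makes backtracking over variants cheap. A condensed, self-contained
sketch of the same idea (class and sample input are mine, not the patch's):

```python
class ConfigReader(object):
    """Precompute (line, stripped_line, indent) tuples; comments and
    blank lines are dropped up front, so get_next_line() is O(1)."""
    def __init__(self, s):
        self.line_index = 0
        self.lines = []
        for line in s.splitlines():
            line = line.rstrip().expandtabs()
            stripped = line.strip()
            if not stripped or stripped.startswith(("#", "//")):
                continue
            self.lines.append((line, stripped, len(line) - len(stripped)))

    def get_next_line(self):
        try:
            if self.line_index < len(self.lines):
                return self.lines[self.line_index]
            return (None, None, -1)
        finally:
            self.line_index += 1   # advances even past the sentinel

    def tell(self):
        return self.line_index

    def seek(self, index):
        self.line_index = index

r = ConfigReader("variants:\n    # comment\n    - one:\n        key = 1\n")
pos = r.tell()
first = r.get_next_line()   # ("variants:", "variants:", 0)
r.seek(pos)                 # rewinding is just an integer assignment
```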
> +# Array structure:
> +# ----------------
> +# The first 4 elements contain the indices of the 4 segments.
> +# a[0] -- Index of beginning of 'name' segment (always 4).
> +# a[1] -- Index of beginning of 'shortname' segment.
> +# a[2] -- Index of beginning of 'depend' segment.
> +# a[3] -- Index of beginning of 'content' segment.
> +# The next elements in the array comprise the aforementioned segments:
> +# The 'name' segment begins with a[a[0]] and ends with a[a[1]-1].
> +# The 'shortname' segment begins with a[a[1]] and ends with a[a[2]-1].
> +# The 'depend' segment begins with a[a[2]] and ends with a[a[3]-1].
> +# The 'content' segment begins with a[a[3]] and ends at the end of the array.
> +
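
To make the layout concrete: starting from an empty array [4, 4, 4, 4]
(all four segments empty, each beginning at index 4), two appends produce a
dict named "one.two". In the real code the segment elements are indices
into object_cache; plain strings are used here for readability:

```python
def _array_append_to_name_shortname_depend(a, name, depend):
    # Insert at the end of the 'name', 'shortname' and 'depend' segments;
    # each insert shifts everything after it right by one, hence the
    # staggered +1/+2 offsets and the header adjustments afterwards.
    a.insert(a[1], name)
    a.insert(a[2] + 1, name)
    a.insert(a[3] + 2, depend)
    a[1] += 1
    a[2] += 2
    a[3] += 3

# An empty dict: four header slots, all segments start (empty) at index 4.
a = [4, 4, 4, 4]
_array_append_to_name_shortname_depend(a, "one", "dep1")
_array_append_to_name_shortname_depend(a, "two", "")
# a is now [4, 6, 8, 10, "one", "two", "one", "two", "dep1", ""]:
# name = a[4:6], shortname = a[6:8], depend = a[8:10], content = a[10:].
```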
> +# The following functions append/prepend to various segments of an array.
> +
> +def _array_append_to_name_shortname_depend(a, name, depend):
> + a.insert(a[1], name)
> + a.insert(a[2] + 1, name)
> + a.insert(a[3] + 2, depend)
> + a[1] += 1
> + a[2] += 2
> + a[3] += 3
> +
> +
> +def _array_prepend_to_name_shortname_depend(a, name, depend):
> + a[1] += 1
> + a[2] += 2
> + a[3] += 3
> + a.insert(a[0], name)
> + a.insert(a[1], name)
> + a.insert(a[2], depend)
> +
> +
> +def _array_append_to_name_depend(a, name, depend):
> + a.insert(a[1], name)
> + a.insert(a[3] + 1, depend)
> + a[1] += 1
> + a[2] += 1
> + a[3] += 2
> +
> +
> +def _array_prepend_to_name_depend(a, name, depend):
> + a[1] += 1
> + a[2] += 1
> + a[3] += 2
> + a.insert(a[0], name)
> + a.insert(a[2], depend)
> +
> +
> +def _array_append_to_content(a, content):
> + a.append(content)
> +
> +
> +def _array_get_name(a, object_cache):
> + """
> + Return the name of a dictionary represented by a given array.
> +
> + @param a: Array representing a dictionary.
> + @param object_cache: A list of strings referenced by elements in the array.
> + """
> + return ".".join([object_cache[i] for i in a[a[0]:a[1]]])
> +
> +
> +def _array_get_all(a, object_cache):
> + """
> + Return a 4-tuple containing all the data stored in a given array, in a
> + format that is easy to turn into an actual dictionary.
> +
> + @param a: Array representing a dictionary.
> + @param object_cache: A list of strings referenced by elements in the array.
> + @return: A 4-tuple: (name, shortname, depend, content), in which all
> + members are strings except depend which is a list of strings.
> + """
> + name = ".".join([object_cache[i] for i in a[a[0]:a[1]]])
> + shortname = ".".join([object_cache[i] for i in a[a[1]:a[2]]])
> + content = "".join([object_cache[i] for i in a[a[3]:]])
> + depend = []
> + prefix = ""
> + for n, d in zip(a[a[0]:a[1]], a[a[2]:a[3]]):
> + for dep in object_cache[d].split():
> + depend.append(prefix + dep)
> + prefix += object_cache[n] + "."
> + return name, shortname, depend, content
> +
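
The depend reconstruction is the subtle part of _array_get_all(): each
dependency string is qualified by the variant names accumulated so far, so
a dependency declared at some nesting level resolves relative to that point
in the name. A self-contained sketch of the same loop, again with strings
stored directly instead of object_cache indices (the sample data is mine):

```python
def array_get_all(a):
    # Same logic as the patch's _array_get_all(), minus the object cache.
    name = ".".join(a[a[0]:a[1]])
    shortname = ".".join(a[a[1]:a[2]])
    content = "".join(a[a[3]:])
    depend = []
    prefix = ""
    for n, d in zip(a[a[0]:a[1]], a[a[2]:a[3]]):
        for dep in d.split():
            depend.append(prefix + dep)   # qualify by the names seen so far
        prefix += n + "."
    return name, shortname, depend, content

# name segment ["smp2", "Fedora"], depend segment ["", "up"]: the "up"
# dependency belongs to the Fedora level, so it resolves as "smp2.up".
a = [4, 6, 8, 10, "smp2", "Fedora", "smp2", "Fedora", "", "up", "mem = 512\n"]
result = array_get_all(a)
```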
>
>
> if __name__ == "__main__":
> parser = optparse.OptionParser()
> parser.add_option('-f', '--file', dest="filename", action='store',
> help='path to a config file that will be parsed. '
> - 'If not specified, will parse kvm_tests.cfg '
> - 'located inside the kvm test dir.')
> + 'If not specified, will parse tests.cfg located '
> + 'inside the kvm test dir.')
> parser.add_option('--verbose', dest="debug", action='store_true',
> help='include debug messages in console output')
>
> @@ -518,9 +698,9 @@ if __name__ == "__main__":
> # Here we configure the stand alone program to use the autotest
> # logging system.
> logging_manager.configure_logging(KvmLoggingConfig(), verbose=debug)
> - list = config(filename, debug=debug).get_list()
> + dicts = config(filename, debug=debug).get_generator()
> i = 0
> - for dict in list:
> + for dict in dicts:
> logging.info("Dictionary #%d:", i)
> keys = dict.keys()
> keys.sort()
_______________________________________________
Autotest mailing list
Autotest@test.kernel.org
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest
Thread overview: 3+ messages
2010-03-02 17:30 [KVM-AUTOTEST PATCH v4] KVM test: A memory efficient kvm_config implementation Michael Goldish
2010-03-03 14:44 ` Lucas Meneghel Rodrigues [this message]
[not found] <199667921.2707991267633218479.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com>
2010-03-03 16:20 ` Michael Goldish