* [PATCH 1/1] utility-tasks.bbclass: Move distro related tasks to distrodata.bbclass
From: Dongxiao Xu @ 2010-11-22 6:02 UTC
To: poky
Most of the keys in d.keys() during file parsing are variables from
distro_tracking_fields.inc, which are not used in a normal build.
Therefore, remove the inclusion of distro_tracking_fields.inc from
poky.conf, and move the distro-related tasks to distrodata.bbclass,
which includes that tracking-fields file.
This change reduces file parsing time by about 25%.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
---
meta/classes/distrodata.bbclass | 440 +++++++++++++++++++++++++++++++++++
meta/classes/utility-tasks.bbclass | 442 ------------------------------------
meta/conf/distro/poky.conf | 1 -
3 files changed, 440 insertions(+), 443 deletions(-)
diff --git a/meta/classes/distrodata.bbclass b/meta/classes/distrodata.bbclass
index f6642f0..221dfae 100644
--- a/meta/classes/distrodata.bbclass
+++ b/meta/classes/distrodata.bbclass
@@ -211,3 +211,443 @@ do_distrodataall() {
:
}
+addtask checkpkg
+do_checkpkg[nostamp] = "1"
+python do_checkpkg() {
+ import sys
+ import re
+ import tempfile
+
+ """
+ sanity check to ensure same name and type. Match as many patterns as possible
+ such as:
+ gnome-common-2.20.0.tar.gz (most common format)
+ gtk+-2.90.1.tar.gz
+ xf86-intput-synaptics-12.6.9.tar.gz
+ dri2proto-2.3.tar.gz
+ blktool_4.orig.tar.gz
+ libid3tag-0.15.1b.tar.gz
+ unzip552.tar.gz
+ icu4c-3_6-src.tgz
+ genext2fs_1.3.orig.tar.gz
+ gst-fluendo-mp3
+ """
+ prefix1 = "[a-zA-Z][a-zA-Z0-9]*([\-_][a-zA-Z]\w+)*[\-_]" # match most patterns which uses "-" as separator to version digits
+ prefix2 = "[a-zA-Z]+" # a loose pattern such as for unzip552.tar.gz
+ prefix = "(%s|%s)" % (prefix1, prefix2)
+ suffix = "(tar\.gz|tgz|tar\.bz2|zip)"
+ suffixtuple = ("tar.gz", "tgz", "zip", "tar.bz2")
+
+ sinterstr = "(?P<name>%s?)(?P<ver>.*)" % prefix
+ sdirstr = "(?P<name>%s)(?P<ver>.*)\.(?P<type>%s$)" % (prefix, suffix)
+
+ def parse_inter(s):
+ m = re.search(sinterstr, s)
+ if not m:
+ return None
+ else:
+ return (m.group('name'), m.group('ver'), "")
+
+ def parse_dir(s):
+ m = re.search(sdirstr, s)
+ if not m:
+ return None
+ else:
+ return (m.group('name'), m.group('ver'), m.group('type'))
+
+ """
+ Check whether 'new' is newer than 'old' version. We use existing vercmp() for the
+ purpose. PE is cleared in comparison as it's not for build, and PV is cleared too
+ for simplicity as it's somehow difficult to get from various upstream format
+ """
+ def __vercmp(old, new):
+ (on, ov, ot) = old
+ (en, ev, et) = new
+ if on != en or (et and et not in suffixtuple):
+ return 0
+
+ ov = re.search("\d+[^a-zA-Z]+", ov).group()
+ ev = re.search("\d+[^a-zA-Z]+", ev).group()
+ return bb.utils.vercmp(("0", ov, ""), ("0", ev, ""))
+
+ """
+ wrapper for fetch upstream directory info
+ 'url' - upstream link customized by regular expression
+ 'd' - database
+ 'tmpf' - tmpfile for fetcher output
+ We don't want to exit whole build due to one recipe error. So handle all exceptions
+ gracefully w/o leaking to outer.
+ """
+ def internal_fetch_wget(url, d, tmpf):
+ status = "ErrFetchUnknown"
+ try:
+ """
+ Clear internal url cache as it's a temporary check. Not doing so will have
+ bitbake check url multiple times when looping through a single url
+ """
+ fn = bb.data.getVar('FILE', d, 1)
+ bb.fetch.urldata_cache[fn] = {}
+ bb.fetch.init([url], d)
+ except bb.fetch.NoMethodError:
+ status = "ErrFetchNoMethod"
+ except:
+ status = "ErrInitUrlUnknown"
+ else:
+ """
+ To avoid impacting bitbake build engine, this trick is required for reusing bitbake
+ interfaces. bb.fetch.go() is not appliable as it checks downloaded content in ${DL_DIR}
+ while we don't want to pollute that place. So bb.fetch.checkstatus() is borrowed here
+ which is designed for check purpose but we override check command for our own purpose
+ """
+ ld = bb.data.createCopy(d)
+ bb.data.setVar('CHECKCOMMAND_wget', "/usr/bin/env wget -t 1 --passive-ftp -O %s '${URI}'" \
+ % tmpf.name, d)
+ bb.data.update_data(ld)
+
+ try:
+ bb.fetch.checkstatus(ld)
+ except bb.fetch.MissingParameterError:
+ status = "ErrMissParam"
+ except bb.fetch.FetchError:
+ status = "ErrFetch"
+ except bb.fetch.MD5SumError:
+ status = "ErrMD5Sum"
+ except:
+ status = "ErrFetchUnknown"
+ else:
+ status = "SUCC"
+ return status
+
+ """
+ Check on middle version directory such as "2.4/" in "http://xxx/2.4/pkg-2.4.1.tar.gz",
+ 'url' - upstream link customized by regular expression
+ 'd' - database
+ 'curver' - current version
+ Return new version if success, or else error in "Errxxxx" style
+ """
+ def check_new_dir(url, curver, d):
+ pn = bb.data.getVar('PN', d, 1)
+ f = tempfile.NamedTemporaryFile(delete=False, prefix="%s-1-" % pn)
+ status = internal_fetch_wget(url, d, f)
+ fhtml = f.read()
+
+ if status == "SUCC" and len(fhtml):
+ newver = parse_inter(curver)
+
+ """
+ match "*4.1/">*4.1/ where '*' matches chars
+ N.B. add package name, only match for digits
+ """
+ m = re.search("^%s" % prefix, curver)
+ if m:
+ s = "%s[^\d\"]*?(\d+[\.\-_])+\d+/?" % m.group()
+ else:
+ s = "(\d+[\.\-_])+\d+/?"
+
+ searchstr = "[hH][rR][eE][fF]=\"%s\">" % s
+ reg = re.compile(searchstr)
+
+ valid = 0
+ for line in fhtml.split("\n"):
+ if line.find(curver) >= 0:
+ valid = 1
+
+ m = reg.search(line)
+ if m:
+ ver = m.group().split("\"")[1]
+ ver = ver.strip("/")
+ ver = parse_inter(ver)
+ if ver and __vercmp(newver, ver) < 0:
+ newver = ver
+
+ """Expect a match for curver in directory list, or else it indicates unknown format"""
+ if not valid:
+ status = "ErrParseInterDir"
+ else:
+ """rejoin the path name"""
+ status = newver[0] + newver[1]
+ elif not len(fhtml):
+ status = "ErrHostNoDir"
+
+ f.close()
+ if status != "ErrHostNoDir" and re.match("Err", status):
+ logpath = bb.data.getVar('LOG_DIR', d, 1)
+ os.system("cp %s %s/" % (f.name, logpath))
+ os.unlink(f.name)
+ return status
+
+ """
+ Check on the last directory to search '2.4.1' in "http://xxx/2.4/pkg-2.4.1.tar.gz",
+ 'url' - upstream link customized by regular expression
+ 'd' - database
+ 'curname' - current package name
+ Return new version if success, or else error in "Errxxxx" style
+ """
+ def check_new_version(url, curname, d):
+ """possible to have no version in pkg name, such as spectrum-fw"""
+ if not re.search("\d+", curname):
+ return pcurver
+ pn = bb.data.getVar('PN', d, 1)
+ f = tempfile.NamedTemporaryFile(delete=False, prefix="%s-2-" % pn)
+ status = internal_fetch_wget(url, d, f)
+ fhtml = f.read()
+
+ if status == "SUCC" and len(fhtml):
+ newver = parse_dir(curname)
+
+ """match "{PN}-5.21.1.tar.gz">{PN}-5.21.1.tar.gz """
+ pn1 = re.search("^%s" % prefix, curname).group()
+ s = "[^\"]*%s[^\d\"]*?(\d+[\.\-_])+[^\"]*" % pn1
+ searchstr = "[hH][rR][eE][fF]=\"%s\">" % s
+ reg = re.compile(searchstr)
+
+ valid = 0
+ for line in fhtml.split("\n"):
+ m = reg.search(line)
+ if m:
+ valid = 1
+ ver = m.group().split("\"")[1].split("/")[-1]
+ ver = parse_dir(ver)
+ if ver and __vercmp(newver, ver) < 0:
+ newver = ver
+
+ """Expect a match for curver in directory list, or else it indicates unknown format"""
+ if not valid:
+ status = "ErrParseDir"
+ else:
+ """newver still contains a full package name string"""
+ status = re.search("(\d+[.\-_])*\d+", newver[1]).group()
+ elif not len(fhtml):
+ status = "ErrHostNoDir"
+
+ f.close()
+ """if host hasn't directory information, no need to save tmp file"""
+ if status != "ErrHostNoDir" and re.match("Err", status):
+ logpath = bb.data.getVar('LOG_DIR', d, 1)
+ os.system("cp %s %s/" % (f.name, logpath))
+ os.unlink(f.name)
+ return status
+
+ """first check whether a uri is provided"""
+ src_uri = bb.data.getVar('SRC_URI', d, 1)
+ if not src_uri:
+ return
+
+ """initialize log files."""
+ logpath = bb.data.getVar('LOG_DIR', d, 1)
+ bb.utils.mkdirhier(logpath)
+ logfile = os.path.join(logpath, "poky_pkg_info.log.%s" % bb.data.getVar('DATETIME', d, 1))
+ if not os.path.exists(logfile):
+ slogfile = os.path.join(logpath, "poky_pkg_info.log")
+ if os.path.exists(slogfile):
+ os.remove(slogfile)
+ os.system("touch %s" % logfile)
+ os.symlink(logfile, slogfile)
+
+ """generate package information from .bb file"""
+ pname = bb.data.getVar('PN', d, 1)
+ pdesc = bb.data.getVar('DESCRIPTION', d, 1)
+ pgrp = bb.data.getVar('SECTION', d, 1)
+
+ found = 0
+ for uri in src_uri.split():
+ m = re.compile('(?P<type>[^:]*)').match(uri)
+ if not m:
+ raise MalformedUrl(uri)
+ elif m.group('type') in ('http', 'https', 'ftp', 'cvs', 'svn', 'git'):
+ found = 1
+ pproto = m.group('type')
+ break
+ if not found:
+ pproto = "file"
+ pupver = "N/A"
+ pstatus = "ErrUnknown"
+
+ (type, host, path, user, pswd, parm) = bb.decodeurl(uri)
+ if type in ['http', 'https', 'ftp']:
+ pcurver = bb.data.getVar('PV', d, 1)
+ else:
+ pcurver = bb.data.getVar("SRCREV", d, 1)
+
+ if type in ['http', 'https', 'ftp']:
+ newver = pcurver
+ altpath = path
+ dirver = "-"
+ curname = "-"
+
+ """
+ match version number amid the path, such as "5.7" in:
+ http://download.gnome.org/sources/${PN}/5.7/${PN}-${PV}.tar.gz
+ N.B. how about sth. like "../5.7/5.8/..."? Not find such example so far :-P
+ """
+ m = re.search(r"[^/]*(\d+\.)+\d+([\-_]r\d+)*/", path)
+ if m:
+ altpath = path.split(m.group())[0]
+ dirver = m.group().strip("/")
+
+ """use new path and remove param. for wget only param is md5sum"""
+ alturi = bb.encodeurl([type, host, altpath, user, pswd, {}])
+
+ newver = check_new_dir(alturi, dirver, d)
+ altpath = path
+ if not re.match("Err", newver) and dirver != newver:
+ altpath = altpath.replace(dirver, newver, 1)
+
+ """Now try to acquire all remote files in current directory"""
+ if not re.match("Err", newver):
+ curname = altpath.split("/")[-1]
+
+ """get remote name by skipping pacakge name"""
+ m = re.search(r"/.*/", altpath)
+ if not m:
+ altpath = "/"
+ else:
+ altpath = m.group()
+
+ alturi = bb.encodeurl([type, host, altpath, user, pswd, {}])
+ newver = check_new_version(alturi, curname, d)
+ if not re.match("Err", newver):
+ pupver = newver
+ if pupver != pcurver:
+ pstatus = "UPDATE"
+ else:
+ pstatus = "MATCH"
+
+ if re.match("Err", newver):
+ pstatus = newver + ":" + altpath + ":" + dirver + ":" + curname
+ elif type == 'git':
+ if user:
+ gituser = user + '@'
+ else:
+ gituser = ""
+
+ if 'protocol' in parm:
+ gitproto = parm['protocol']
+ else:
+ gitproto = "rsync"
+
+ gitcmd = "git ls-remote %s://%s%s%s HEAD 2>&1" % (gitproto, gituser, host, path)
+ print gitcmd
+ ver = os.popen(gitcmd).read()
+ if ver and re.search("HEAD", ver):
+ pupver = ver.split("\t")[0]
+ if pcurver == pupver:
+ pstatus = "MATCH"
+ else:
+ pstatus = "UPDATE"
+ else:
+ pstatus = "ErrGitAccess"
+ elif type == 'svn':
+ options = []
+ if user:
+ options.append("--username %s" % user)
+ if pswd:
+ options.append("--password %s" % pswd)
+ svnproto = 'svn'
+ if 'proto' in parm:
+ svnproto = parm['proto']
+ if 'rev' in parm:
+ pcurver = parm['rev']
+
+ svncmd = "svn info %s %s://%s%s/%s/ 2>&1" % (" ".join(options), svnproto, host, path, parm["module"])
+ print svncmd
+ svninfo = os.popen(svncmd).read()
+ for line in svninfo.split("\n"):
+ if re.search("^Last Changed Rev:", line):
+ pupver = line.split(" ")[-1]
+ if pcurver == pupver:
+ pstatus = "MATCH"
+ else:
+ pstatus = "UPDATE"
+
+ if re.match("Err", pstatus):
+ pstatus = "ErrSvnAccess"
+ elif type == 'cvs':
+ pupver = "HEAD"
+ pstatus = "UPDATE"
+ elif type == 'file':
+ """local file is always up-to-date"""
+ pupver = pcurver
+ pstatus = "MATCH"
+ else:
+ pstatus = "ErrUnsupportedProto"
+
+ if re.match("Err", pstatus):
+ pstatus += ":%s%s" % (host, path)
+
+ """Read from manual distro tracking fields as alternative"""
+ pmver = bb.data.getVar("RECIPE_LATEST_VERSION", d, 1)
+ if not pmver:
+ pmver = "N/A"
+ pmstatus = "ErrNoRecipeData"
+ else:
+ if pmver == pcurver:
+ pmstatus = "MATCH"
+ else:
+ pmstatus = "UPDATE"
+
+ lf = bb.utils.lockfile(logfile + ".lock")
+ f = open(logfile, "a")
+ f.write("%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n" % \
+ (pname, pgrp, pproto, pcurver, pmver, pupver, pmstatus, pstatus, pdesc))
+ f.close()
+ bb.utils.unlockfile(lf)
+}
+
+addtask checkpkgall after do_checkpkg
+do_checkpkgall[recrdeptask] = "do_checkpkg"
+do_checkpkgall[nostamp] = "1"
+do_checkpkgall() {
+ :
+}
+
+#addhandler check_eventhandler
+python check_eventhandler() {
+ from bb.event import Handled, NotHandled
+ # if bb.event.getName(e) == "TaskStarted":
+
+ if bb.event.getName(e) == "BuildStarted":
+ import oe.distro_check as dc
+ tmpdir = bb.data.getVar('TMPDIR', e.data, 1)
+ distro_check_dir = os.path.join(tmpdir, "distro_check")
+ datetime = bb.data.getVar('DATETIME', e.data, 1)
+ """initialize log files."""
+ logpath = bb.data.getVar('LOG_DIR', e.data, 1)
+ bb.utils.mkdirhier(logpath)
+ logfile = os.path.join(logpath, "distrocheck.%s.csv" % bb.data.getVar('DATETIME', e.data, 1))
+ if not os.path.exists(logfile):
+ slogfile = os.path.join(logpath, "distrocheck.csv")
+ if os.path.exists(slogfile):
+ os.remove(slogfile)
+ os.system("touch %s" % logfile)
+ os.symlink(logfile, slogfile)
+ bb.data.setVar('LOG_FILE', logfile, e.data)
+
+ return NotHandled
+}
+
+addtask distro_check
+do_distro_check[nostamp] = "1"
+python do_distro_check() {
+ """checks if the package is present in other public Linux distros"""
+ import oe.distro_check as dc
+ localdata = bb.data.createCopy(d)
+ bb.data.update_data(localdata)
+ tmpdir = bb.data.getVar('TMPDIR', d, 1)
+ distro_check_dir = os.path.join(tmpdir, "distro_check")
+ datetime = bb.data.getVar('DATETIME', localdata, 1)
+ dc.update_distro_data(distro_check_dir, datetime)
+
+ # do the comparison
+ result = dc.compare_in_distro_packages_list(distro_check_dir, d)
+
+ # save the results
+ dc.save_distro_check_result(result, datetime, d)
+}
+
+addtask distro_checkall after do_distro_check
+do_distro_checkall[recrdeptask] = "do_distro_check"
+do_distro_checkall[nostamp] = "1"
+do_distro_checkall() {
+ :
+}
diff --git a/meta/classes/utility-tasks.bbclass b/meta/classes/utility-tasks.bbclass
index 205a206..3ab37fa 100644
--- a/meta/classes/utility-tasks.bbclass
+++ b/meta/classes/utility-tasks.bbclass
@@ -47,396 +47,6 @@ python do_rebuild() {
# bb.build.exec_func('do_clean', d)
#}
-addtask checkpkg
-do_checkpkg[nostamp] = "1"
-python do_checkpkg() {
- import sys
- import re
- import tempfile
-
- """
- sanity check to ensure same name and type. Match as many patterns as possible
- such as:
- gnome-common-2.20.0.tar.gz (most common format)
- gtk+-2.90.1.tar.gz
- xf86-intput-synaptics-12.6.9.tar.gz
- dri2proto-2.3.tar.gz
- blktool_4.orig.tar.gz
- libid3tag-0.15.1b.tar.gz
- unzip552.tar.gz
- icu4c-3_6-src.tgz
- genext2fs_1.3.orig.tar.gz
- gst-fluendo-mp3
- """
- prefix1 = "[a-zA-Z][a-zA-Z0-9]*([\-_][a-zA-Z]\w+)*[\-_]" # match most patterns which uses "-" as separator to version digits
- prefix2 = "[a-zA-Z]+" # a loose pattern such as for unzip552.tar.gz
- prefix = "(%s|%s)" % (prefix1, prefix2)
- suffix = "(tar\.gz|tgz|tar\.bz2|zip)"
- suffixtuple = ("tar.gz", "tgz", "zip", "tar.bz2")
-
- sinterstr = "(?P<name>%s?)(?P<ver>.*)" % prefix
- sdirstr = "(?P<name>%s)(?P<ver>.*)\.(?P<type>%s$)" % (prefix, suffix)
-
- def parse_inter(s):
- m = re.search(sinterstr, s)
- if not m:
- return None
- else:
- return (m.group('name'), m.group('ver'), "")
-
- def parse_dir(s):
- m = re.search(sdirstr, s)
- if not m:
- return None
- else:
- return (m.group('name'), m.group('ver'), m.group('type'))
-
- """
- Check whether 'new' is newer than 'old' version. We use existing vercmp() for the
- purpose. PE is cleared in comparison as it's not for build, and PV is cleared too
- for simplicity as it's somehow difficult to get from various upstream format
- """
- def __vercmp(old, new):
- (on, ov, ot) = old
- (en, ev, et) = new
- if on != en or (et and et not in suffixtuple):
- return 0
-
- ov = re.search("\d+[^a-zA-Z]+", ov).group()
- ev = re.search("\d+[^a-zA-Z]+", ev).group()
- return bb.utils.vercmp(("0", ov, ""), ("0", ev, ""))
-
- """
- wrapper for fetch upstream directory info
- 'url' - upstream link customized by regular expression
- 'd' - database
- 'tmpf' - tmpfile for fetcher output
- We don't want to exit whole build due to one recipe error. So handle all exceptions
- gracefully w/o leaking to outer.
- """
- def internal_fetch_wget(url, d, tmpf):
- status = "ErrFetchUnknown"
- try:
- """
- Clear internal url cache as it's a temporary check. Not doing so will have
- bitbake check url multiple times when looping through a single url
- """
- fn = bb.data.getVar('FILE', d, 1)
- bb.fetch.urldata_cache[fn] = {}
- bb.fetch.init([url], d)
- except bb.fetch.NoMethodError:
- status = "ErrFetchNoMethod"
- except:
- status = "ErrInitUrlUnknown"
- else:
- """
- To avoid impacting bitbake build engine, this trick is required for reusing bitbake
- interfaces. bb.fetch.go() is not appliable as it checks downloaded content in ${DL_DIR}
- while we don't want to pollute that place. So bb.fetch.checkstatus() is borrowed here
- which is designed for check purpose but we override check command for our own purpose
- """
- ld = bb.data.createCopy(d)
- bb.data.setVar('CHECKCOMMAND_wget', "/usr/bin/env wget -t 1 --passive-ftp -O %s '${URI}'" \
- % tmpf.name, d)
- bb.data.update_data(ld)
-
- try:
- bb.fetch.checkstatus(ld)
- except bb.fetch.MissingParameterError:
- status = "ErrMissParam"
- except bb.fetch.FetchError:
- status = "ErrFetch"
- except bb.fetch.MD5SumError:
- status = "ErrMD5Sum"
- except:
- status = "ErrFetchUnknown"
- else:
- status = "SUCC"
- return status
-
- """
- Check on middle version directory such as "2.4/" in "http://xxx/2.4/pkg-2.4.1.tar.gz",
- 'url' - upstream link customized by regular expression
- 'd' - database
- 'curver' - current version
- Return new version if success, or else error in "Errxxxx" style
- """
- def check_new_dir(url, curver, d):
- pn = bb.data.getVar('PN', d, 1)
- f = tempfile.NamedTemporaryFile(delete=False, prefix="%s-1-" % pn)
- status = internal_fetch_wget(url, d, f)
- fhtml = f.read()
-
- if status == "SUCC" and len(fhtml):
- newver = parse_inter(curver)
-
- """
- match "*4.1/">*4.1/ where '*' matches chars
- N.B. add package name, only match for digits
- """
- m = re.search("^%s" % prefix, curver)
- if m:
- s = "%s[^\d\"]*?(\d+[\.\-_])+\d+/?" % m.group()
- else:
- s = "(\d+[\.\-_])+\d+/?"
-
- searchstr = "[hH][rR][eE][fF]=\"%s\">" % s
- reg = re.compile(searchstr)
-
- valid = 0
- for line in fhtml.split("\n"):
- if line.find(curver) >= 0:
- valid = 1
-
- m = reg.search(line)
- if m:
- ver = m.group().split("\"")[1]
- ver = ver.strip("/")
- ver = parse_inter(ver)
- if ver and __vercmp(newver, ver) < 0:
- newver = ver
-
- """Expect a match for curver in directory list, or else it indicates unknown format"""
- if not valid:
- status = "ErrParseInterDir"
- else:
- """rejoin the path name"""
- status = newver[0] + newver[1]
- elif not len(fhtml):
- status = "ErrHostNoDir"
-
- f.close()
- if status != "ErrHostNoDir" and re.match("Err", status):
- logpath = bb.data.getVar('LOG_DIR', d, 1)
- os.system("cp %s %s/" % (f.name, logpath))
- os.unlink(f.name)
- return status
-
- """
- Check on the last directory to search '2.4.1' in "http://xxx/2.4/pkg-2.4.1.tar.gz",
- 'url' - upstream link customized by regular expression
- 'd' - database
- 'curname' - current package name
- Return new version if success, or else error in "Errxxxx" style
- """
- def check_new_version(url, curname, d):
- """possible to have no version in pkg name, such as spectrum-fw"""
- if not re.search("\d+", curname):
- return pcurver
- pn = bb.data.getVar('PN', d, 1)
- f = tempfile.NamedTemporaryFile(delete=False, prefix="%s-2-" % pn)
- status = internal_fetch_wget(url, d, f)
- fhtml = f.read()
-
- if status == "SUCC" and len(fhtml):
- newver = parse_dir(curname)
-
- """match "{PN}-5.21.1.tar.gz">{PN}-5.21.1.tar.gz """
- pn1 = re.search("^%s" % prefix, curname).group()
- s = "[^\"]*%s[^\d\"]*?(\d+[\.\-_])+[^\"]*" % pn1
- searchstr = "[hH][rR][eE][fF]=\"%s\">" % s
- reg = re.compile(searchstr)
-
- valid = 0
- for line in fhtml.split("\n"):
- m = reg.search(line)
- if m:
- valid = 1
- ver = m.group().split("\"")[1].split("/")[-1]
- ver = parse_dir(ver)
- if ver and __vercmp(newver, ver) < 0:
- newver = ver
-
- """Expect a match for curver in directory list, or else it indicates unknown format"""
- if not valid:
- status = "ErrParseDir"
- else:
- """newver still contains a full package name string"""
- status = re.search("(\d+[.\-_])*\d+", newver[1]).group()
- elif not len(fhtml):
- status = "ErrHostNoDir"
-
- f.close()
- """if host hasn't directory information, no need to save tmp file"""
- if status != "ErrHostNoDir" and re.match("Err", status):
- logpath = bb.data.getVar('LOG_DIR', d, 1)
- os.system("cp %s %s/" % (f.name, logpath))
- os.unlink(f.name)
- return status
-
- """first check whether a uri is provided"""
- src_uri = bb.data.getVar('SRC_URI', d, 1)
- if not src_uri:
- return
-
- """initialize log files."""
- logpath = bb.data.getVar('LOG_DIR', d, 1)
- bb.utils.mkdirhier(logpath)
- logfile = os.path.join(logpath, "poky_pkg_info.log.%s" % bb.data.getVar('DATETIME', d, 1))
- if not os.path.exists(logfile):
- slogfile = os.path.join(logpath, "poky_pkg_info.log")
- if os.path.exists(slogfile):
- os.remove(slogfile)
- os.system("touch %s" % logfile)
- os.symlink(logfile, slogfile)
-
- """generate package information from .bb file"""
- pname = bb.data.getVar('PN', d, 1)
- pdesc = bb.data.getVar('DESCRIPTION', d, 1)
- pgrp = bb.data.getVar('SECTION', d, 1)
-
- found = 0
- for uri in src_uri.split():
- m = re.compile('(?P<type>[^:]*)').match(uri)
- if not m:
- raise MalformedUrl(uri)
- elif m.group('type') in ('http', 'https', 'ftp', 'cvs', 'svn', 'git'):
- found = 1
- pproto = m.group('type')
- break
- if not found:
- pproto = "file"
- pupver = "N/A"
- pstatus = "ErrUnknown"
-
- (type, host, path, user, pswd, parm) = bb.decodeurl(uri)
- if type in ['http', 'https', 'ftp']:
- pcurver = bb.data.getVar('PV', d, 1)
- else:
- pcurver = bb.data.getVar("SRCREV", d, 1)
-
- if type in ['http', 'https', 'ftp']:
- newver = pcurver
- altpath = path
- dirver = "-"
- curname = "-"
-
- """
- match version number amid the path, such as "5.7" in:
- http://download.gnome.org/sources/${PN}/5.7/${PN}-${PV}.tar.gz
- N.B. how about sth. like "../5.7/5.8/..."? Not find such example so far :-P
- """
- m = re.search(r"[^/]*(\d+\.)+\d+([\-_]r\d+)*/", path)
- if m:
- altpath = path.split(m.group())[0]
- dirver = m.group().strip("/")
-
- """use new path and remove param. for wget only param is md5sum"""
- alturi = bb.encodeurl([type, host, altpath, user, pswd, {}])
-
- newver = check_new_dir(alturi, dirver, d)
- altpath = path
- if not re.match("Err", newver) and dirver != newver:
- altpath = altpath.replace(dirver, newver, 1)
-
- """Now try to acquire all remote files in current directory"""
- if not re.match("Err", newver):
- curname = altpath.split("/")[-1]
-
- """get remote name by skipping pacakge name"""
- m = re.search(r"/.*/", altpath)
- if not m:
- altpath = "/"
- else:
- altpath = m.group()
-
- alturi = bb.encodeurl([type, host, altpath, user, pswd, {}])
- newver = check_new_version(alturi, curname, d)
- if not re.match("Err", newver):
- pupver = newver
- if pupver != pcurver:
- pstatus = "UPDATE"
- else:
- pstatus = "MATCH"
-
- if re.match("Err", newver):
- pstatus = newver + ":" + altpath + ":" + dirver + ":" + curname
- elif type == 'git':
- if user:
- gituser = user + '@'
- else:
- gituser = ""
-
- if 'protocol' in parm:
- gitproto = parm['protocol']
- else:
- gitproto = "rsync"
-
- gitcmd = "git ls-remote %s://%s%s%s HEAD 2>&1" % (gitproto, gituser, host, path)
- print gitcmd
- ver = os.popen(gitcmd).read()
- if ver and re.search("HEAD", ver):
- pupver = ver.split("\t")[0]
- if pcurver == pupver:
- pstatus = "MATCH"
- else:
- pstatus = "UPDATE"
- else:
- pstatus = "ErrGitAccess"
- elif type == 'svn':
- options = []
- if user:
- options.append("--username %s" % user)
- if pswd:
- options.append("--password %s" % pswd)
- svnproto = 'svn'
- if 'proto' in parm:
- svnproto = parm['proto']
- if 'rev' in parm:
- pcurver = parm['rev']
-
- svncmd = "svn info %s %s://%s%s/%s/ 2>&1" % (" ".join(options), svnproto, host, path, parm["module"])
- print svncmd
- svninfo = os.popen(svncmd).read()
- for line in svninfo.split("\n"):
- if re.search("^Last Changed Rev:", line):
- pupver = line.split(" ")[-1]
- if pcurver == pupver:
- pstatus = "MATCH"
- else:
- pstatus = "UPDATE"
-
- if re.match("Err", pstatus):
- pstatus = "ErrSvnAccess"
- elif type == 'cvs':
- pupver = "HEAD"
- pstatus = "UPDATE"
- elif type == 'file':
- """local file is always up-to-date"""
- pupver = pcurver
- pstatus = "MATCH"
- else:
- pstatus = "ErrUnsupportedProto"
-
- if re.match("Err", pstatus):
- pstatus += ":%s%s" % (host, path)
-
- """Read from manual distro tracking fields as alternative"""
- pmver = bb.data.getVar("RECIPE_LATEST_VERSION", d, 1)
- if not pmver:
- pmver = "N/A"
- pmstatus = "ErrNoRecipeData"
- else:
- if pmver == pcurver:
- pmstatus = "MATCH"
- else:
- pmstatus = "UPDATE"
-
- lf = bb.utils.lockfile(logfile + ".lock")
- f = open(logfile, "a")
- f.write("%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n" % \
- (pname, pgrp, pproto, pcurver, pmver, pupver, pmstatus, pstatus, pdesc))
- f.close()
- bb.utils.unlockfile(lf)
-}
-
-addtask checkpkgall after do_checkpkg
-do_checkpkgall[recrdeptask] = "do_checkpkg"
-do_checkpkgall[nostamp] = "1"
-do_checkpkgall() {
- :
-}
-
addtask checkuri
do_checkuri[nostamp] = "1"
python do_checkuri() {
@@ -487,55 +97,3 @@ do_buildall[recrdeptask] = "do_build"
do_buildall() {
:
}
-
-#addhandler check_eventhandler
-python check_eventhandler() {
- from bb.event import Handled, NotHandled
- # if bb.event.getName(e) == "TaskStarted":
-
- if bb.event.getName(e) == "BuildStarted":
- import oe.distro_check as dc
- tmpdir = bb.data.getVar('TMPDIR', e.data, 1)
- distro_check_dir = os.path.join(tmpdir, "distro_check")
- datetime = bb.data.getVar('DATETIME', e.data, 1)
- """initialize log files."""
- logpath = bb.data.getVar('LOG_DIR', e.data, 1)
- bb.utils.mkdirhier(logpath)
- logfile = os.path.join(logpath, "distrocheck.%s.csv" % bb.data.getVar('DATETIME', e.data, 1))
- if not os.path.exists(logfile):
- slogfile = os.path.join(logpath, "distrocheck.csv")
- if os.path.exists(slogfile):
- os.remove(slogfile)
- os.system("touch %s" % logfile)
- os.symlink(logfile, slogfile)
- bb.data.setVar('LOG_FILE', logfile, e.data)
-
- return NotHandled
-}
-
-addtask distro_check
-do_distro_check[nostamp] = "1"
-python do_distro_check() {
- """checks if the package is present in other public Linux distros"""
- import oe.distro_check as dc
- localdata = bb.data.createCopy(d)
- bb.data.update_data(localdata)
- tmpdir = bb.data.getVar('TMPDIR', d, 1)
- distro_check_dir = os.path.join(tmpdir, "distro_check")
- datetime = bb.data.getVar('DATETIME', localdata, 1)
- dc.update_distro_data(distro_check_dir, datetime)
-
- # do the comparison
- result = dc.compare_in_distro_packages_list(distro_check_dir, d)
-
- # save the results
- dc.save_distro_check_result(result, datetime, d)
-}
-
-addtask distro_checkall after do_distro_check
-do_distro_checkall[recrdeptask] = "do_distro_check"
-do_distro_checkall[nostamp] = "1"
-do_distro_checkall() {
- :
-}
-
diff --git a/meta/conf/distro/poky.conf b/meta/conf/distro/poky.conf
index b3c9f1a..684215d 100644
--- a/meta/conf/distro/poky.conf
+++ b/meta/conf/distro/poky.conf
@@ -139,7 +139,6 @@ COMMERCIAL_QT ?= ""
require conf/distro/include/world-broken.inc
-require conf/distro/include/distro_tracking_fields.inc
# Setup our hash policy
BB_SIGNATURE_HANDLER = "basic"
--
1.6.3.3
* [PATCH 0/1][RFC] Optimize file parsing speed
From: Dongxiao Xu @ 2010-11-22 6:04 UTC
To: poky
Hi Richard,
I found that when parsing bitbake files, most of the variables in
d.keys() come from distro_tracking_fields.inc, and they are not used
in a normal build.
This pull request moves some distro-related functions from
utility-tasks.bbclass into distrodata.bbclass, and removes the inclusion
of distro_tracking_fields.inc from poky.conf. This saves about 25% of
the parsing time.
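A rough way to check the first observation is to count how many datastore keys belong to the tracking fields. The sketch below is only an illustration: it uses a plain list of names as a stand-in for d.keys(), the "RECIPE_" prefix is an assumption about how the tracking fields are named, and the sample key names are made up.

```python
def tracking_key_share(keys, prefixes=("RECIPE_",)):
    """Return (matching, total, fraction) for keys starting with any prefix."""
    keys = list(keys)
    matching = sum(1 for k in keys if k.startswith(tuple(prefixes)))
    total = len(keys)
    fraction = matching / total if total else 0.0
    return matching, total, fraction

# Stand-in for d.keys(); these key names are invented for illustration.
sample_keys = [
    "PN", "PV", "SRC_URI", "TMPDIR",
    "RECIPE_LATEST_VERSION_pn-foo",
    "RECIPE_LATEST_VERSION_pn-bar",
    "RECIPE_STATUS_pn-foo",
    "RECIPE_STATUS_pn-bar",
]

matching, total, fraction = tracking_key_share(sample_keys)
print("%d of %d keys (%.0f%%) look like tracking fields"
      % (matching, total, fraction * 100))
```

Against a real datastore the same function could be fed the actual key list; the prefix tuple would need adjusting to whatever names the tracking-fields file really defines.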
Please review and pull. Thanks!
Pull URL: git://git.pokylinux.org/poky-contrib.git
Branch: dxu4/perf
Browse: http://git.pokylinux.org/cgit.cgi/poky-contrib/log/?h=dxu4/perf
Thanks,
Dongxiao Xu <dongxiao.xu@intel.com>
---
Dongxiao Xu (1):
utility-tasks.bbclass: Move distro related tasks to
distrodata.bbclass
meta/classes/distrodata.bbclass | 440 +++++++++++++++++++++++++++++++++++
meta/classes/utility-tasks.bbclass | 442 ------------------------------------
meta/conf/distro/poky.conf | 1 -
3 files changed, 440 insertions(+), 443 deletions(-)
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
From: Richard Purdie @ 2010-11-28 14:27 UTC
To: Dongxiao Xu; +Cc: poky
Hi Dongxiao,
On Mon, 2010-11-22 at 14:04 +0800, Dongxiao Xu wrote:
> I found that when parsing bitbake files, most of the variables in
> d.keys() are in distro_tracking_fields.inc, and they are not used in
> normal build.
>
> This pull request moves some distro related functions in
> utility-tasks.bbclass into distrodata.bbclass, and remove the inclusion
> of distro_tracking_fields.inc from poky.conf. This could gain about 25%
> parsing time saving.
I'm going to take the patch but I'd like to be clear where the speed
gains come from with this change. I suspect some are due to a smaller
number of keys but I also suspect the smaller number of tasks involved
helps too!
Also, a lot of those keys are override keys, so perhaps it's speeding up
update_data() calls and some of the gain is from there too?
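To make the override-key guess concrete, here is a toy model. It is not BitBake's actual update_data(); it simply assumes that applying overrides means looking at every key and testing its suffix against the active overrides, so the work grows with the number of keys. The function name, the key names and the "pn-foo" override are all hypothetical.

```python
def apply_overrides(data, overrides):
    """Toy override pass: for each KEY_<override>, copy its value to KEY.

    Returns the number of keys inspected, as a crude cost measure.
    """
    inspected = 0
    for key in list(data):  # snapshot, since we add keys while looping
        inspected += 1
        for o in overrides:
            suffix = "_" + o
            if key.endswith(suffix):
                data[key[:-len(suffix)]] = data[key]
    return inspected

base = {"PV": "1.0", "PV_pn-foo": "2.0"}
# 1000 invented per-package tracking keys, none matching the active override.
tracking = {"RECIPE_LATEST_VERSION_pn-pkg%d" % i: "1.%d" % i
            for i in range(1000)}

small = dict(base)
large = {**base, **tracking}

cost_small = apply_overrides(small, ["pn-foo"])
cost_large = apply_overrides(large, ["pn-foo"])
print(cost_small, cost_large)  # prints "2 1002"
```

In this model the extra keys are never selected, yet each one is still inspected on every pass, which is the kind of cost that would disappear once the tracking fields are no longer loaded.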
Anyhow, it's a good move :)
Cheers,
Richard
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
From: Tian, Kevin @ 2010-11-29 0:32 UTC
To: Richard Purdie, Xu, Dongxiao; +Cc: poky@yoctoproject.org
>From: Richard Purdie
>Sent: Sunday, November 28, 2010 10:28 PM
>
>Hi Dongxiao,
>
>On Mon, 2010-11-22 at 14:04 +0800, Dongxiao Xu wrote:
>> I found that when parsing bitbake files, most of the variables in
>> d.keys() are in distro_tracking_fields.inc, and they are not used in
>> normal build.
>>
>> This pull request moves some distro related functions in
>> utility-tasks.bbclass into distrodata.bbclass, and remove the inclusion
>> of distro_tracking_fields.inc from poky.conf. This could gain about 25%
>> parsing time saving.
>
>I'm going to take the patch but I'd like to be clear where the speed
>gains come from with this change. I suspect some are due to a smaller
>number of keys but I also suspect the smaller number of tasks involved
>helps too!
>
>Also, a lot of those keys are override keys so perhaps its speeding up
>update_data() calls and some of the gain is from there too?
>
>Anyhow, its a good move :)
>
Good to know the secret sauce, and the improvement is indeed obvious!
Thanks
Kevin
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
2010-11-28 14:27 ` [PATCH 0/1][RFC] Optimize file parsing speed Richard Purdie
2010-11-29 0:32 ` Tian, Kevin
@ 2010-11-29 5:45 ` Xu, Dongxiao
2010-11-29 12:41 ` Richard Purdie
1 sibling, 1 reply; 9+ messages in thread
From: Xu, Dongxiao @ 2010-11-29 5:45 UTC (permalink / raw)
To: Richard Purdie; +Cc: poky@yoctoproject.org
Richard Purdie wrote:
> Hi Dongxiao,
>
> On Mon, 2010-11-22 at 14:04 +0800, Dongxiao Xu wrote:
>> I found that when parsing bitbake files, most of the variables in
>> d.keys() are in distro_tracking_fields.inc, and they are not used in
>> normal build.
>>
>> This pull request moves some distro related functions in
>> utility-tasks.bbclass into distrodata.bbclass, and remove the
>> inclusion of distro_tracking_fields.inc from poky.conf. This could
>> gain about 25% parsing time saving.
>
> I'm going to take the patch but I'd like to be clear where the speed
> gains come from with this change. I suspect some are due to a smaller
> number of keys but I also suspect the smaller number of tasks
> involved helps too!
Hi Richard,
I profiled the parsing time with and without the patch.
Here is a piece of the profiling result:
W/ PATCH:
Mon Nov 29 13:23:42 2010 profile.log
32756805 function calls (31268348 primitive calls) in 60.195 CPU seconds
Ordered by: internal time
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        3572737    4.950    0.000   12.849    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
        3917977    4.464    0.000    4.464    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
  873129/431998    3.040    0.000   16.824    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
        1745340    3.007    0.000    4.555    0.000 /usr/lib/python2.6/copy.py:65(copy)
         286785    2.570    0.000   20.051    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
         565710    2.532    0.000    5.015    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
            917    2.394    0.003   27.726    0.030 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
         288439    2.350    0.000   11.607    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:212(setVar)
1428066/1144003    1.882    0.000   16.762    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:246(getVar)
...
           2389    0.004    0.000    2.528    0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
W/O PATCH:
Mon Nov 29 13:28:36 2010 profile.log
49110091 function calls (47618793 primitive calls) in 75.338 CPU seconds
Ordered by: internal time
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        7547331    8.230    0.000   19.467    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
        7908236    8.113    0.000    8.113    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
            917    3.987    0.004   43.045    0.047 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
        5061763    3.812    0.000    5.633    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:301(<genexpr>)
          84656    3.318    0.000   13.475    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:302(<genexpr>)
  897804/450300    2.838    0.000   15.801    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
         294083    2.777    0.000   18.854    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
        1783941    2.740    0.000    4.154    0.000 /usr/lib/python2.6/copy.py:65(copy)
         592115    2.432    0.000    5.179    0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
...
           2389    0.004    0.000    2.656    0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
From the profiling result we can see that the cumulative time of
generate_dependencies() drops from 43 seconds to 27.7 seconds, while
build_dependencies() stays roughly unchanged (18.9 seconds without the
patch, 20.1 seconds with it). Therefore the biggest overhead removed by
the patch should be the two lines of code that iterate over the keys in
generate_dependencies(), presumably the two <genexpr> entries at
data.py:301 and data.py:302 in the W/O PATCH profile.
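As a cross-check of the figures above, here is a minimal sketch in plain Python that recomputes the savings from the two profile summaries. The numbers are copied from the W/ PATCH and W/O PATCH listings; the dictionary keys and the saving() helper are illustrative names only and are not part of BitBake.

```python
# Figures copied from the two profile summaries in this message.
# Times are seconds (cumtime for the named functions), calls are counts.
without_patch = {
    "total CPU time": 75.338,
    "function calls": 49110091,
    "generate_dependencies cumtime": 43.045,
    "build_dependencies cumtime": 18.854,
    "update_data cumtime": 2.656,
}
with_patch = {
    "total CPU time": 60.195,
    "function calls": 32756805,
    "generate_dependencies cumtime": 27.726,
    "build_dependencies cumtime": 20.051,
    "update_data cumtime": 2.528,
}


def saving(before, after):
    """Return (absolute saving, saving as a percentage of `before`)."""
    delta = before - after
    return delta, 100.0 * delta / before


for name, before in without_patch.items():
    after = with_patch[name]
    delta, pct = saving(before, after)
    # A negative percentage means the value went up with the patch.
    print(f"{name:32s}{before:>12} -> {after:>12}  {pct:+.1f}%")
```

Run as is, this shows that the profiled CPU time falls by roughly 20%, and that the drop in generate_dependencies() cumulative time is about the same size as the drop in total time, which is consistent with the conclusion that the key iteration in that function is where the gain comes from.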
>
> Also, a lot of those keys are override keys so perhaps its speeding up
> update_data() calls and some of the gain is from there too?
update_data() gains a little, but not much (2.656 seconds down to 2.528 seconds cumulative); see the profile above.
Thanks,
Dongxiao
>
> Anyhow, its a good move :)
>
> Cheers,
>
> Richard
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
2010-11-29 5:45 ` Xu, Dongxiao
@ 2010-11-29 12:41 ` Richard Purdie
2010-12-01 1:48 ` Xu, Dongxiao
2010-12-08 6:53 ` Xu, Dongxiao
0 siblings, 2 replies; 9+ messages in thread
From: Richard Purdie @ 2010-11-29 12:41 UTC (permalink / raw)
To: Xu, Dongxiao; +Cc: poky@yoctoproject.org
On Mon, 2010-11-29 at 13:45 +0800, Xu, Dongxiao wrote:
> Richard Purdie wrote:
> > I'm going to take the patch but I'd like to be clear where the speed
> > gains come from with this change. I suspect some are due to a smaller
> > number of keys but I also suspect the smaller number of tasks
> > involved helps too!
>
> I did a profiling for the parsing time w/ and w/o the patch.
>
> Here a piece of the profiling result:
>
> W/ PATCH:
>
> Mon Nov 29 13:23:42 2010 profile.log
>
> 32756805 function calls (31268348 primitive calls) in 60.195 CPU seconds
>
> Ordered by: internal time
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 3572737 4.950 0.000 12.849 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
> 3917977 4.464 0.000 4.464 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
> 873129/431998 3.040 0.000 16.824 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
> 1745340 3.007 0.000 4.555 0.000 /usr/lib/python2.6/copy.py:65(copy)
> 286785 2.570 0.000 20.051 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
> 565710 2.532 0.000 5.015 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
> 917 2.394 0.003 27.726 0.030 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
> 288439 2.350 0.000 11.607 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:212(setVar)
> 1428066/1144003 1.882 0.000 16.762 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:246(getVar)
> ...
> 2389 0.004 0.000 2.528 0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
>
>
> W/O PATCH:
>
> Mon Nov 29 13:28:36 2010 profile.log
>
> 49110091 function calls (47618793 primitive calls) in 75.338 CPU seconds
>
> Ordered by: internal time
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 7547331 8.230 0.000 19.467 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
> 7908236 8.113 0.000 8.113 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
> 917 3.987 0.004 43.045 0.047 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
> 5061763 3.812 0.000 5.633 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:301(<genexpr>)
> 84656 3.318 0.000 13.475 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:302(<genexpr>)
> 897804/450300 2.838 0.000 15.801 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
> 294083 2.777 0.000 18.854 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
> 1783941 2.740 0.000 4.154 0.000 /usr/lib/python2.6/copy.py:65(copy)
> 592115 2.432 0.000 5.179 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
> ...
> 2389 0.004 0.000 2.656 0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
>
> From the profiling result we can see that, generate_dependencies()
> time reduces from 43 seconds to 27 seconds, while build_dependencies()
> mostly keeps unchanged (From 20 seconds to 18.8 seconds). Therefore
> the biggest overhead reduced by the patch should be the two lines of
> code to parsing keys in generate_dependencies() function.
>
> >
> > Also, a lot of those keys are override keys so perhaps its speeding up
> > update_data() calls and some of the gain is from there too?
>
> Update_data() has some gains but not much, see the above profile result.
Looks good, thanks.
Interestingly, looking at the profile overall, we dropped from 49 million
function calls to just under 33 million, which is always a good way
to speed things up.
getVarFlag and _findVar dropped by about 4 million calls each (which
comes from the construction of the keys() list).
So a very valid speedup :). I still think we might be able to speed this
area up further, though, for example by directly keeping an index of
exported variables.
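To illustrate why the number of keys matters here, the following is a toy sketch, not BitBake code. It assumes that the exported variables are found by walking every key and asking for its "export" flag, so the number of flag lookups grows with the size of the dictionary. ToyDataStore, exported_by_scan(), build_store() and the TRACKING_FIELD_* names are all hypothetical.

```python
class ToyDataStore:
    """Hypothetical stand-in for a datastore; not the bb.data_smart API."""

    def __init__(self):
        self._vars = {}
        self.flag_lookups = 0  # counts getVarFlag calls, like ncalls in a profile

    def setVar(self, name, value):
        self._vars.setdefault(name, {})["content"] = value

    def setVarFlag(self, name, flag, value):
        self._vars.setdefault(name, {})[flag] = value

    def getVarFlag(self, name, flag):
        self.flag_lookups += 1
        return self._vars.get(name, {}).get(flag)

    def keys(self):
        return list(self._vars)


def exported_by_scan(d):
    """Find exported variables by checking the flag on every single key."""
    return [key for key in d.keys() if d.getVarFlag(key, "export")]


def build_store(extra_keys):
    """Build a store with one exported variable plus `extra_keys` unused ones."""
    d = ToyDataStore()
    d.setVar("PATH", "/usr/bin")
    d.setVarFlag("PATH", "export", "1")
    d.setVar("PN", "demo")
    for i in range(extra_keys):
        d.setVar("TRACKING_FIELD_%d" % i, "unused in a normal build")
    return d


small = build_store(0)
large = build_store(1000)

# Both stores export the same variable, but the scan costs far more lookups
# when the dictionary carries many keys that are never exported.
print(exported_by_scan(small), small.flag_lookups)
print(exported_by_scan(large), large.flag_lookups)
```

In this toy model the result of the scan is identical in both cases; only the work needed to obtain it changes, which is the effect an index of exported variables would avoid.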
Cheers,
Richard
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
2010-11-29 12:41 ` Richard Purdie
@ 2010-12-01 1:48 ` Xu, Dongxiao
2010-12-08 6:53 ` Xu, Dongxiao
1 sibling, 0 replies; 9+ messages in thread
From: Xu, Dongxiao @ 2010-12-01 1:48 UTC (permalink / raw)
To: Richard Purdie; +Cc: poky@yoctoproject.org
Richard Purdie wrote:
> On Mon, 2010-11-29 at 13:45 +0800, Xu, Dongxiao wrote:
>> Richard Purdie wrote:
>>> I'm going to take the patch but I'd like to be clear where the speed
>>> gains come from with this change. I suspect some are due to a
>>> smaller number of keys but I also suspect the smaller number of
>>> tasks involved helps too!
>>
>> I did a profiling for the parsing time w/ and w/o the patch.
>>
>> Here a piece of the profiling result:
>>
>> W/ PATCH:
>>
>> Mon Nov 29 13:23:42 2010 profile.log
>>
>> 32756805 function calls (31268348 primitive calls) in 60.195 CPU seconds
>>
>> Ordered by: internal time
>>
>> ncalls tottime percall cumtime percall filename:lineno(function)
>> 3572737 4.950 0.000 12.849 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
>> 3917977 4.464 0.000 4.464 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
>> 873129/431998 3.040 0.000 16.824 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
>> 1745340 3.007 0.000 4.555 0.000 /usr/lib/python2.6/copy.py:65(copy)
>> 286785 2.570 0.000 20.051 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
>> 565710 2.532 0.000 5.015 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
>> 917 2.394 0.003 27.726 0.030 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
>> 288439 2.350 0.000 11.607 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:212(setVar)
>> 1428066/1144003 1.882 0.000 16.762 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:246(getVar)
>> ...
>> 2389 0.004 0.000 2.528 0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
>>
>>
>> W/O PATCH:
>>
>> Mon Nov 29 13:28:36 2010 profile.log
>>
>> 49110091 function calls (47618793 primitive calls) in 75.338 CPU seconds
>>
>> Ordered by: internal time
>>
>> ncalls tottime percall cumtime percall filename:lineno(function)
>> 7547331 8.230 0.000 19.467 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
>> 7908236 8.113 0.000 8.113 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
>> 917 3.987 0.004 43.045 0.047 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
>> 5061763 3.812 0.000 5.633 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:301(<genexpr>)
>> 84656 3.318 0.000 13.475 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:302(<genexpr>)
>> 897804/450300 2.838 0.000 15.801 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
>> 294083 2.777 0.000 18.854 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
>> 1783941 2.740 0.000 4.154 0.000 /usr/lib/python2.6/copy.py:65(copy)
>> 592115 2.432 0.000 5.179 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
>> ...
>> 2389 0.004 0.000 2.656 0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
>>
>> From the profiling result we can see that, generate_dependencies()
>> time reduces from 43 seconds to 27 seconds, while
>> build_dependencies()
>> mostly keeps unchanged (From 20 seconds to 18.8 seconds). Therefore
>> the biggest overhead reduced by the patch should be the two lines of
>> code to parsing keys in generate_dependencies() function.
>>
>>>
>>> Also, a lot of those keys are override keys so perhaps its speeding
>>> up update_data() calls and some of the gain is from there too?
>>
>> Update_data() has some gains but not much, see the above profile
>> result.
>
> Looks good, thanks.
>
> Interestingly looking at the profile overall, we dropped from 49
> million function calls to 32 million function calls which is always a
> good way to speed things up.
>
> getVarFlag and _findVar each dropped by 3 million calls each (which
> is from the construction of the keys() list).
>
> So a very valid speedup :). I still think we might be able to speed
> this area up further though such as if we directly keep an index of
> exported variables.
Hi Richard,
Like d.keys() as a whole, the set of exported variables also differs
between recipes, since a recipe may export or unset certain variables for
itself. So to implement this, we would need to keep a common list and
then handle the recipe-specific exported/unset variables.
Is my understanding correct?
Thanks,
Dongxiao
>
> Cheers,
>
> Richard
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
2010-11-29 12:41 ` Richard Purdie
2010-12-01 1:48 ` Xu, Dongxiao
@ 2010-12-08 6:53 ` Xu, Dongxiao
2010-12-09 14:40 ` Richard Purdie
1 sibling, 1 reply; 9+ messages in thread
From: Xu, Dongxiao @ 2010-12-08 6:53 UTC (permalink / raw)
To: Richard Purdie; +Cc: poky@yoctoproject.org
Xu, Dongxiao wrote:
> Richard Purdie wrote:
>> On Mon, 2010-11-29 at 13:45 +0800, Xu, Dongxiao wrote:
>>> Richard Purdie wrote:
>>>> I'm going to take the patch but I'd like to be clear where the
>>>> speed gains come from with this change. I suspect some are due to a
>>>> smaller number of keys but I also suspect the smaller number of
>>>> tasks involved helps too!
>>>
>>> I did a profiling for the parsing time w/ and w/o the patch.
>>>
>>> Here a piece of the profiling result:
>>>
>>> W/ PATCH:
>>>
>>> Mon Nov 29 13:23:42 2010 profile.log
>>>
>>> 32756805 function calls (31268348 primitive calls) in 60.195 CPU seconds
>>>
>>> Ordered by: internal time
>>>
>>> ncalls tottime percall cumtime percall filename:lineno(function)
>>> 3572737 4.950 0.000 12.849 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
>>> 3917977 4.464 0.000 4.464 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
>>> 873129/431998 3.040 0.000 16.824 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
>>> 1745340 3.007 0.000 4.555 0.000 /usr/lib/python2.6/copy.py:65(copy)
>>> 286785 2.570 0.000 20.051 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
>>> 565710 2.532 0.000 5.015 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
>>> 917 2.394 0.003 27.726 0.030 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
>>> 288439 2.350 0.000 11.607 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:212(setVar)
>>> 1428066/1144003 1.882 0.000 16.762 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:246(getVar)
>>> ...
>>> 2389 0.004 0.000 2.528 0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
>>>
>>>
>>> W/O PATCH:
>>>
>>> Mon Nov 29 13:28:36 2010 profile.log
>>>
>>> 49110091 function calls (47618793 primitive calls) in 75.338 CPU seconds
>>>
>>> Ordered by: internal time
>>>
>>> ncalls tottime percall cumtime percall filename:lineno(function)
>>> 7547331 8.230 0.000 19.467 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:285(getVarFlag)
>>> 7908236 8.113 0.000 8.113 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:191(_findVar)
>>> 917 3.987 0.004 43.045 0.047 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:299(generate_dependencies)
>>> 5061763 3.812 0.000 5.633 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:301(<genexpr>)
>>> 84656 3.318 0.000 13.475 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:302(<genexpr>)
>>> 897804/450300 2.838 0.000 15.801 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data_smart.py:86(expandWithRefs)
>>> 294083 2.777 0.000 18.854 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:271(build_dependencies)
>>> 1783941 2.740 0.000 4.154 0.000 /usr/lib/python2.6/copy.py:65(copy)
>>> 592115 2.432 0.000 5.179 0.000 /sda1/yocto/scripts/..//bitbake/lib/bb/COW.py:97(__getitem__)
>>> ...
>>> 2389 0.004 0.000 2.656 0.001 /sda1/yocto/scripts/..//bitbake/lib/bb/data.py:267(update_data)
>>>
>>> From the profiling result we can see that, generate_dependencies()
>>> time reduces from 43 seconds to 27 seconds, while
>>> build_dependencies()
>>> mostly keeps unchanged (From 20 seconds to 18.8 seconds). Therefore
>>> the biggest overhead reduced by the patch should be the two lines of
>>> code to parsing keys in generate_dependencies() function.
>>>
>>>>
>>>> Also, a lot of those keys are override keys so perhaps its speeding
>>>> up update_data() calls and some of the gain is from there too?
>>>
>>> Update_data() has some gains but not much, see the above profile
>>> result.
>>
>> Looks good, thanks.
>>
>> Interestingly looking at the profile overall, we dropped from 49
>> million function calls to 32 million function calls which is always
>> a good way to speed things up.
>>
>> getVarFlag and _findVar each dropped by 3 million calls each (which
>> is from the construction of the keys() list).
>>
>> So a very valid speedup :). I still think we might be able to speed
>> this area up further though such as if we directly keep an index of
>> exported variables.
>
> Hi Richard,
>
> Like the whole d.keys(), exported variables are also different among
> recipes, since recipe may export or unset certain variables for
> itself. So to implement this, we need to keep a common list, and then
> handle recipe specific export/unset variables.
>
> Does my understanding correct?
What I mean is that even the exported variables may differ between
recipes, since each recipe may export or unset certain variables for
itself. So is it possible for us to keep a common exported-variable list?
Thanks,
Dongxiao
>
> Thanks,
> Dongxiao
>
>
>
>>
>> Cheers,
>>
>> Richard
* Re: [PATCH 0/1][RFC] Optimize file parsing speed
2010-12-08 6:53 ` Xu, Dongxiao
@ 2010-12-09 14:40 ` Richard Purdie
0 siblings, 0 replies; 9+ messages in thread
From: Richard Purdie @ 2010-12-09 14:40 UTC (permalink / raw)
To: Xu, Dongxiao; +Cc: poky@yoctoproject.org
On Wed, 2010-12-08 at 14:53 +0800, Xu, Dongxiao wrote:
> Xu, Dongxiao wrote:
> > Richard Purdie wrote:
> >> On Mon, 2010-11-29 at 13:45 +0800, Xu, Dongxiao wrote:
> >>> Richard Purdie wrote:
> > Like the whole d.keys(), exported variables are also different among
> > recipes, since recipe may export or unset certain variables for
> > itself. So to implement this, we need to keep a common list, and then
> > handle recipe specific export/unset variables.
> >
> > Does my understanding correct?
>
> Here I mean even the exported variables may be different among
> different recipes, since each recipe may export or unset certain
> variables for itself. So is it possible for us to keep a common
> exported variable list?
The reason these functions are slow is that our dictionary is huge and
d.keys() takes time to construct and then iterate over.
Instead, if we make a cache in setVarFlag of anything with an export
flag set, we then have a pregenerated list.
Yes, the list is different between recipes but that doesn't mean we
can't prebuild the list when parsing.
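A rough sketch of that idea, assuming a toy datastore rather than the real one: the set of exported names is maintained whenever an "export" (or, as a further assumption, "unexport") flag is written, and each recipe works on its own copy, so the prebuilt list can still differ between recipes. IndexedStore and its internals are hypothetical; only the method names setVarFlag and createCopy are borrowed from BitBake for readability.

```python
class IndexedStore:
    """Hypothetical toy datastore that keeps an index of exported variables."""

    _INDEXED_FLAGS = ("export", "unexport")

    def __init__(self):
        self._flags = {}        # variable name -> {flag name: value}
        self._exported = set()  # index maintained on every flag write

    def setVarFlag(self, name, flag, value):
        self._flags.setdefault(name, {})[flag] = value
        if flag in self._INDEXED_FLAGS:
            self._refresh(name)

    def delVarFlag(self, name, flag):
        self._flags.get(name, {}).pop(flag, None)
        if flag in self._INDEXED_FLAGS:
            self._refresh(name)

    def _refresh(self, name):
        """Re-evaluate one variable; no scan over all keys is needed."""
        flags = self._flags.get(name, {})
        if flags.get("export") and not flags.get("unexport"):
            self._exported.add(name)
        else:
            self._exported.discard(name)

    def createCopy(self):
        """Per-recipe copy: later changes do not leak back to the base."""
        other = IndexedStore()
        other._flags = {k: dict(v) for k, v in self._flags.items()}
        other._exported = set(self._exported)
        return other

    def exported(self):
        """Return the pregenerated list without touching the other keys."""
        return sorted(self._exported)


# Common configuration shared by every recipe.
base = IndexedStore()
base.setVarFlag("PATH", "export", "1")
base.setVarFlag("CC", "export", "1")

# Recipe A exports one more variable; recipe B stops exporting CC.
recipe_a = base.createCopy()
recipe_a.setVarFlag("PKG_CONFIG_PATH", "export", "1")
recipe_b = base.createCopy()
recipe_b.setVarFlag("CC", "unexport", "1")

print(base.exported(), recipe_a.exported(), recipe_b.exported())
```

The trade-off in this sketch is a small amount of extra work on each flag write in exchange for never iterating over the whole dictionary when the exported list is needed.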
Cheers,
Richard
end of thread, other threads:[~2010-12-09 14:40 UTC | newest]
Thread overview: 9+ messages
2010-11-22 6:04 [PATCH 0/1][RFC] Optimize file parsing speed Dongxiao Xu
2010-11-22 6:02 ` [PATCH 1/1] utility-tasks.bbclass: Move distro related tasks to distrodata.bbclass Dongxiao Xu
2010-11-28 14:27 ` [PATCH 0/1][RFC] Optimize file parsing speed Richard Purdie
2010-11-29 0:32 ` Tian, Kevin
2010-11-29 5:45 ` Xu, Dongxiao
2010-11-29 12:41 ` Richard Purdie
2010-12-01 1:48 ` Xu, Dongxiao
2010-12-08 6:53 ` Xu, Dongxiao
2010-12-09 14:40 ` Richard Purdie