Date: Wed, 31 Jul 2019 18:41:53 +0100
From: Mike Crowe
To: Mark Hatle, openembedded-core@lists.openembedded.org
Message-ID: <20190731174153.GA7953@mcrowe.com>
References: <20190730110111.5143-1-mac@mcrowe.com> <20190730134917.GA30833@mcrowe.com> <6b194922-8aa4-f2c1-645b-ace8331ae532@windriver.com>
In-Reply-To: <6b194922-8aa4-f2c1-645b-ace8331ae532@windriver.com>
Subject: Re: [PATCH] sstate: Truncate PV in sstate filenames that are too long
List-Id: Patches and discussions about the oe-core layer

On Tuesday 30 July 2019 at 09:14:01 -0500, Mark Hatle wrote:
> On 7/30/19 8:49 AM, Mike Crowe wrote:
> > On Tuesday 30 July 2019 at 08:25:52 -0500, Mark Hatle wrote:
> >> On 7/30/19 6:01 AM, Mike Crowe wrote:
> >>> sstate filenames are generated by concatenating a variety of bits of
> >>> package metadata. Some of these parts could be long, which could cause
> >>> the filename to be longer than the 255 character maximum for ext4.
> >>>
> >>> So, let's try to detect this situation and truncate the PV part of the
> >>> filename so that it will fit. If this happens, an ellipsis is added to
> >>> make it clear that the version number is incomplete.
> >>>
> >>> SSTATE_PKG needs to be consistent for all tasks so that the hash
> >>> remains stable. This means that we need to make an assumption for the
> >>> maximum length of the task name. In this implementation, the task name
> >>> is limited to 27 characters.
> >>>
> >>> This change also results in a sensible error message being emitted if
> >>> the resulting filename is still too long.
> >>>
> >>> Signed-off-by: Mike Crowe
> >>>
> >>> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
> >>> index 3342c5ef50..6313b1c538 100644
> >>> --- a/meta/classes/sstate.bbclass
> >>> +++ b/meta/classes/sstate.bbclass
> >>> @@ -8,6 +8,24 @@ def generate_sstatefn(spec, hash, d):
> >>>          hash = "INVALID"
> >>>      return hash[:2] + "/" + spec + hash
> >>>
> >>> +def sstate_path(taskname, d):
> >>> +    max_filename_len = 245 # leave some room for ".siginfo"
> >>> +    max_addendum_len = 32 # '_' + taskname + '.tgz'
> >>
> >> Since the task name is variable, is there really a 32 character limit here?
> >>
> >> It may make sense to do:
> >>
> >> # '_' + taskname + '.tgz', reserving a minimum of 32 for taskname
> >> max_addendum_len = len(taskname) + 5 if len(taskname) + 5 > 32 else 32
> >>
> >> Always reserve a minimum of 32 for consistency, but if we go over account
> >> for it.
> >
> > I think that would just cause task hash mismatches (see third paragraph of
> > commit message.)
> >
> > It probably does make sense to detect such long task names in this
> > situation and generate errors though.
> >
> >>> +    sstate_prefix = d.getVar('SSTATE_PKG')
> >>> +    excess = len(os.path.basename(sstate_prefix)) - (max_filename_len - max_addendum_len)
> >>> +    if excess > 0:
> >>> +        pv = d.getVar('PV')
> >>> +        if len(pv) >= excess and len(pv) >= 3:
> >>> +            short_pv = d.getVar('PV')[:-excess-3] + '...'
> >>
> >> Is truncating the PV enough? In a discussion on the bitbake list, I suggested
> >> possibly changing the order of the entries in the SSTATE_PKGSPEC to allow us to
> >> prune things prior to the hash w/o affecting the hash. Maybe this is simply not
> >> needed.. but it's a possibility if this proves to not be effective.
> >
> > Truncating PV solves the problem I was having. The other fields don't
> > really tend to be very long, so there's less to be gained by shortening
> > them.

Here's a change that just always truncates PV. It has the benefit of being
simple but it will virtually always truncate too much and risks not
truncating enough when other parts of the path are too long. However, if I
can't get my head around enough of this to come up with an acceptable
solution, we'll probably end up just using this change locally:

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index 3342c5ef50..e7003ea93f 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -8,17 +8,26 @@ def generate_sstatefn(spec, hash, d):
         hash = "INVALID"
     return hash[:2] + "/" + spec + hash
 
+def shortened_pv(d):
+    pv = d.getVar('PV')
+    if len(pv) <= 64:
+        return pv
+    else:
+        bb.note("PV shortened: " + pv)
+        return pv[:61] + '...'
+
+SSTATE_PV = "${@shortened_pv(d)}"
 SSTATE_PKGARCH = "${PACKAGE_ARCH}"
-SSTATE_PKGSPEC = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
-SSTATE_SWSPEC = "sstate:${PN}::${PV}:${PR}::${SSTATE_VERSION}:"
+SSTATE_PKGSPEC = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:${SSTATE_PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
+SSTATE_SWSPEC = "sstate:${PN}::${SSTATE_PV}:${PR}::${SSTATE_VERSION}:"
 SSTATE_PKGNAME = "${SSTATE_EXTRAPATH}${@generate_sstatefn(d.getVar('SSTATE_PKGSPEC'), d.getVar('BB_UNIHASH'), d)}"
 SSTATE_PKG = "${SSTATE_DIR}/${SSTATE_PKGNAME}"
 SSTATE_EXTRAPATH = ""
 SSTATE_EXTRAPATHWILDCARD = ""
 SSTATE_PATHSPEC = "${SSTATE_DIR}/${SSTATE_EXTRAPATHWILDCARD}*/${SSTATE_PKGSPEC}"
 
-# explicitly make PV to depend on evaluated value of PV variable
-PV[vardepvalue] = "${PV}"
+# explicitly make SSTATE_PV to depend on evaluated value of PV variable
+SSTATE_PV[vardepvalue] = "${PV}"
 
 # We don't want the sstate to depend on things like the distro string
 # of the system, we let the sstate paths take care of this.

> >>> +            d2 = d.createCopy()
> >>> +            d2.setVar('PV', short_pv)
> >>> +            sstate_prefix = d2.getVar('SSTATE_PKG')
> >>> +
> >>> +    sstatepkg = sstate_prefix + '_'+ taskname + ".tgz"
> >>> +    if len(os.path.basename(sstatepkg)) > max_filename_len:
> >>> +        bb.error('Failed to shorten sstate filename')
> >>> +    return sstatepkg
> >>> +
> >>>  SSTATE_PKGARCH = "${PACKAGE_ARCH}"
> >>>  SSTATE_PKGSPEC = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
> >>>  SSTATE_SWSPEC = "sstate:${PN}::${PV}:${PR}::${SSTATE_VERSION}:"
> >>
> >> There is something else I noticed.. "SSTATE_PKGNAME" defined as:
> >>
> >> SSTATE_PKGNAME =
> >> "${SSTATE_EXTRAPATH}${@generate_sstatefn(d.getVar('SSTATE_PKGSPEC'),
> >> d.getVar('BB_UNIHASH'), d)}"
> >>
> >> From what I can tell, this really should be using the new sstate_path function
> >> in some way.
> >
> > Hmm, you're right. I can't have covered that in my testing. :(
> >
> >> Would it make more sense to define SSTATE_PKGNAME in such a way that it always
> >> resulted in something "short" enough, and in the right format, that it would
> >> always work?
> >
> > I'd considered doing it that way. If SSTATE_PKGSPEC contained ${SSTATE_PV}
> > and that either had the value of ${PV} or a truncated version of ${PV}
> > then the rest of the file could remain the same. However, truncating PV
> > without access to the rest of the spec would mean just picking some
> > arbitrary maximum PV length which is likely to be more conservative than
> > necessary.
> >
> >> Adjusting or rewriting "generate_sstatefn" could still accomplish the PV change,
> >> but the max length of the string would need further shrinking to accommodate an
> >> unknown task length (which goes back to my previous comment). If the 32 default
> >> is long enough then that shouldn't be a problem -- and may also resolve my
> >> concerns that something outside of sstate class could try to use that variable
> >> and without the new magic function get the wrong results.
> >
> > I wonder whether I can get away with applying per-task PV truncation in
> > generate_sstatefn without causing hash mismatches? That's worth a try.
>
> The only cause of a task hash mismatch (actual hash, not filename) would be if
> SSTATE_PKGNAME is part of the hash itself. I'd contend if it is, then we should
> exclude it.

I would have thought so too, but I'm unable to persuade myself that it is
definitely correct in all situations. If I have a fully built work tree,
and change SSTATE_PKGNAME then would I expect a subsequent build to add the
new filenames to the sstate cache?

Although I don't believe that I yet really understand sstate.bbclass, I've
done some more digging today to learn more. I thought I had a working
solution (including potentially different truncations for different task
names!)
but then I discovered that nothing was ever successfully fetched
from the sstate cache. This turned out to be because I hadn't appreciated
the way that BB_HASHFILENAME was being used to transport variable values
into sstate_checkhashes, so I was ending up with default values for PN, PV
etc. :(

I can't see a way to make this work without transporting each of the
individual components of SSTATE_PKGSPEC and SSTATE_DIR, along with their
unexpanded versions through BB_HASHFILENAME. I can't tunnel the full sstate
filename because I don't know the task name at that point. :(

In case anyone cares, here's the patch that shows what I'm trying to do but
fails to work as described above:

diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
index 3342c5ef50..bd2b76bab1 100644
--- a/meta/classes/sstate.bbclass
+++ b/meta/classes/sstate.bbclass
@@ -3,16 +3,40 @@ SSTATE_VERSION = "3"
 SSTATE_MANIFESTS ?= "${TMPDIR}/sstate-control"
 SSTATE_MANFILEPREFIX = "${SSTATE_MANIFESTS}/manifest-${SSTATE_MANMACH}-${PN}"
 
-def generate_sstatefn(spec, hash, d):
+# Generate an sstate filename that is guaranteed to not be longer than
+# the 255 character path component maximum for ext4. Returns a tuple
+# (sstate leaf filename, sstate full path)
+def generate_sstate_filename(d, taskname, hash = None, extension = '.tgz'):
+    # leave some room for '.siginfo' or '.XXXXXXXX' temporary filename suffix
+    max_filename_len = 246
+    if not hash:
+        hash = d.getVar('BB_UNIHASH')
     if not hash:
         hash = "INVALID"
-    return hash[:2] + "/" + spec + hash
+    sstate_prefix = d.getVar('SSTATE_PKGSPEC')
+    excess = len(sstate_prefix) + len(hash) + 1 + len(taskname) + len(extension) - max_filename_len
+    if excess > 0:
+        pv = d.getVar('PV')
+        if len(pv) >= excess and len(pv) >= 3:
+            short_pv = d.getVar('PV')[:-excess-3] + '...'
+            d2 = d.createCopy()
+            d2.setVar('PV', short_pv)
+            sstate_prefix = d2.getVar('SSTATE_PKGSPEC')
+
+    sstate_filename = sstate_prefix + hash + '_'+ taskname + extension
+    if len(sstate_filename) > max_filename_len:
+        bb.error('Failed to shorten sstate filename')
+
+    return (sstate_filename, d.getVar('SSTATE_DIR') + '/' + hash[:2] + '/' + sstate_filename)
 
 SSTATE_PKGARCH = "${PACKAGE_ARCH}"
 SSTATE_PKGSPEC = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
 SSTATE_SWSPEC = "sstate:${PN}::${PV}:${PR}::${SSTATE_VERSION}:"
-SSTATE_PKGNAME = "${SSTATE_EXTRAPATH}${@generate_sstatefn(d.getVar('SSTATE_PKGSPEC'), d.getVar('BB_UNIHASH'), d)}"
-SSTATE_PKG = "${SSTATE_DIR}/${SSTATE_PKGNAME}"
+
+# Assigned in sstate_installpkg and sstate_package before running
+# tasks. Should not be used beforehand
+SSTATE_PKG = "SSTATE_PKG_NOT_YET_ASSIGNED"
+
 SSTATE_EXTRAPATH = ""
 SSTATE_EXTRAPATHWILDCARD = ""
 SSTATE_PATHSPEC = "${SSTATE_DIR}/${SSTATE_EXTRAPATHWILDCARD}*/${SSTATE_PKGSPEC}"
@@ -322,9 +346,7 @@ def sstate_installpkg(ss, d):
     from oe.gpg_sign import get_signer
 
     sstateinst = d.expand("${WORKDIR}/sstate-install-%s/" % ss['task'])
-    sstatefetch = d.getVar('SSTATE_PKGNAME') + '_' + ss['task'] + ".tgz"
-    sstatepkg = d.getVar('SSTATE_PKG') + '_' + ss['task'] + ".tgz"
-
+    (sstatefetch, sstatepkg) = generate_sstate_filename(d, ss['task'])
     if not os.path.exists(sstatepkg):
         pstaging_fetch(sstatefetch, d)
 
@@ -617,7 +639,7 @@ def sstate_package(ss, d):
     tmpdir = d.getVar('TMPDIR')
 
     sstatebuild = d.expand("${WORKDIR}/sstate-build-%s/" % ss['task'])
-    sstatepkg = d.getVar('SSTATE_PKG') + '_'+ ss['task'] + ".tgz"
+    sstatepkg = generate_sstate_filename(d, ss['task'])[1]
     bb.utils.remove(sstatebuild, recurse=True)
     bb.utils.mkdirhier(sstatebuild)
     bb.utils.mkdirhier(os.path.dirname(sstatepkg))
@@ -813,9 +835,7 @@ def sstate_checkhashes(sq_fn, sq_task, sq_hash, sq_hashfn, d, siginfo=False, *,
     ret = []
     missed = []
 
-    extension = ".tgz"
-    if siginfo:
-        extension = extension + ".siginfo"
+    maybe_siginfo = '.siginfo' if siginfo else ''
 
     def gethash(task):
         if sq_unihash is not None:
@@ -844,7 +864,7 @@ def sstate_checkhashes(sq_fn, sq_task, sq_hash, sq_hashfn, d, siginfo=False, *,
 
         spec, extrapath, tname = getpathcomponents(task, d)
 
-        sstatefile = d.expand("${SSTATE_DIR}/" + extrapath + generate_sstatefn(spec, gethash(task), d) + "_" + tname + extension)
+        sstatefile = generate_sstate_filename(d, tname, gethash(task))[1] + maybe_siginfo
 
         if os.path.exists(sstatefile):
             bb.debug(2, "SState: Found valid sstate file %s" % sstatefile)
@@ -907,7 +927,7 @@ def sstate_checkhashes(sq_fn, sq_task, sq_hash, sq_hashfn, d, siginfo=False, *,
             if task in ret:
                 continue
             spec, extrapath, tname = getpathcomponents(task, d)
-            sstatefile = d.expand(extrapath + generate_sstatefn(spec, gethash(task), d) + "_" + tname + extension)
+            sstatefile = generate_sstate_filename(d, tname, gethash(task))[1] + maybe_siginfo
             tasklist.append((task, sstatefile))
 
         if tasklist:
@@ -937,11 +957,11 @@ def sstate_checkhashes(sq_fn, sq_task, sq_hash, sq_hashfn, d, siginfo=False, *,
         evdata = {'missed': [], 'found': []};
         for task in missed:
             spec, extrapath, tname = getpathcomponents(task, d)
-            sstatefile = d.expand(extrapath + generate_sstatefn(spec, gethash(task), d) + "_" + tname + ".tgz")
+            sstatefile = generate_sstate_filename(d, tname, gethash(task))
             evdata['missed'].append( (sq_fn[task], sq_task[task], gethash(task), sstatefile ) )
         for task in ret:
             spec, extrapath, tname = getpathcomponents(task, d)
-            sstatefile = d.expand(extrapath + generate_sstatefn(spec, gethash(task), d) + "_" + tname + ".tgz")
+            sstatefile = generate_sstate_filename(d, tname, gethash(task))
             evdata['found'].append( (sq_fn[task], sq_task[task], gethash(task), sstatefile ) )
         bb.event.fire(bb.event.MetadataEvent("MissedSstate", evdata), d)
 
@@ -1084,8 +1104,8 @@ python sstate_eventhandler() {
         if taskname in ["fetch", "unpack", "patch", "populate_lic", "preconfigure"] and swspec:
             d.setVar("SSTATE_PKGSPEC", "${SSTATE_SWSPEC}")
             d.setVar("SSTATE_EXTRAPATH", "")
-        sstatepkg = d.getVar('SSTATE_PKG')
-        bb.siggen.dump_this_task(sstatepkg + '_' + taskname + ".tgz" ".siginfo", d)
+        sstateinfo = generate_sstate_filename(d, taskname)[1] + ".siginfo"
+        bb.siggen.dump_this_task(sstateinfo, d)
 }
 
 SSTATE_PRUNE_OBSOLETEWORKDIR ?= "1"

Thanks.

Mike.
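
The length arithmetic in generate_sstate_filename above can be tried outside
BitBake with a standalone sketch along these lines. This is not code from
sstate.bbclass: shorten_pv and build_sstate_leaf are hypothetical names, the
spec is a plain Python format string with a {pv} placeholder rather than
something expanded from the datastore, and the check is tightened (an
assumption, not what the patch does) so that PV is only truncated when it is
at least excess + 3 characters long and the result is therefore guaranteed
to fit.

```python
# Standalone sketch; nothing here touches BitBake or the datastore.
MAX_FILENAME_LEN = 246  # same budget as the patch: leaves room for '.siginfo'
ELLIPSIS = '...'


def shorten_pv(pv, excess):
    """Return pv shortened by exactly `excess` characters, ending in '...'.

    Returns None when pv cannot absorb the excess. Assumption: this is
    stricter than the patch's `len(pv) >= excess and len(pv) >= 3` test.
    """
    if excess <= 0:
        return pv
    keep = len(pv) - excess - len(ELLIPSIS)
    if keep < 0:
        return None
    return pv[:keep] + ELLIPSIS


def build_sstate_leaf(spec_template, pv, unihash, taskname, extension='.tgz'):
    """Build the leaf filename, truncating PV only if the name is too long.

    spec_template is a hypothetical stand-in for SSTATE_PKGSPEC with '{pv}'
    where ${PV} would appear.
    """
    def leaf(version):
        return (spec_template.format(pv=version) + unihash
                + '_' + taskname + extension)

    name = leaf(pv)
    excess = len(name) - MAX_FILENAME_LEN
    if excess > 0:
        short = shorten_pv(pv, excess)
        if short is None:
            raise ValueError('Failed to shorten sstate filename')
        name = leaf(short)
    return name
```

Calling build_sstate_leaf with a spec such as
'sstate:example:arch:{pv}:r0:arch:3:' and a PV of several hundred characters
yields a name of exactly MAX_FILENAME_LEN characters. Because the excess
depends on the task name, two tasks with different name lengths get
different truncations of the same PV, which is the per-task behaviour
discussed above.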