From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Kjellerstedt
To: openembedded-core@lists.openembedded.org
Date: Tue, 28 Mar 2017 14:30:42 +0200
X-Mailer: git-send-email 2.12.0
Subject: [PATCH 0/1] Create symbolic links atomically in the fetcher
List-Id: Patches and discussions about the oe-core layer

We have occasional failures in our autobuilders where a setscene task
fails, causing the original task to be run instead, but bitbake still
fails with an error code in the end, causing unnecessary grief. One such
case has been identified through the following error log:

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_python_func() autogenerated', lineno: 2, function:
     0001:
 *** 0002:do_package_write_rpm_setscene(d)
     0003:
File: '${COREBASE}/meta/classes/package_rpm.bbclass', lineno: 757, function: do_package_write_rpm_setscene
     0753:# but we need to stop the rootfs/solver from running while we do...
     0754:do_package_write_rpm[sstate-lockfile-shared] += "${DEPLOY_DIR_RPM}/rpm.lock"
     0755:
     0756:python do_package_write_rpm_setscene () {
 *** 0757:    sstate_setscene(d)
     0758:}
     0759:addtask do_package_write_rpm_setscene
     0760:
     0761:python do_package_write_rpm () {
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 648, function: sstate_setscene
     0644:        break
     0645:
     0646:def sstate_setscene(d):
     0647:    shared_state = sstate_state_fromvars(d)
 *** 0648:    accelerate = sstate_installpkg(shared_state, d)
     0649:    if not accelerate:
     0650:        raise bb.build.FuncFailed("No suitable staging package found")
     0651:
     0652:python sstate_task_prefunc () {
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 297, function: sstate_installpkg
     0293:    sstatefetch = d.getVar('SSTATE_PKGNAME', True) + '_' + ss['task'] + ".tgz"
     0294:    sstatepkg = d.getVar('SSTATE_PKG', True) + '_' + ss['task'] + ".tgz"
     0295:
     0296:    if not os.path.exists(sstatepkg):
 *** 0297:        pstaging_fetch(sstatefetch, sstatepkg, d)
     0298:
     0299:    if not os.path.isfile(sstatepkg):
     0300:        bb.note("Staging package %s does not exist" % sstatepkg)
     0301:        return False
File: '${COREBASE}/meta/classes/sstate.bbclass', lineno: 635, function: pstaging_fetch
     0631:    for srcuri in uris:
     0632:        localdata.setVar('SRC_URI', srcuri)
     0633:        try:
     0634:            fetcher = bb.fetch2.Fetch([srcuri], localdata, cache=False)
 *** 0635:            fetcher.download()
     0636:
     0637:            # Need to optimise this, if using file:// urls, the fetcher just changes the local path
     0638:            # For now work around by symlinking
     0639:            localpath = bb.data.expand(fetcher.localpath(srcuri), localdata)
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 1572, function: download
     1568:                    localpath = ud.localpath
     1569:                elif m.try_premirror(ud, self.d):
     1570:                    logger.debug(1, "Trying PREMIRRORS")
     1571:                    mirrors = mirror_from_string(self.d.getVar('PREMIRRORS', True))
 *** 1572:                    localpath = try_mirrors(self, self.d, ud, mirrors, False)
     1573:
     1574:                if premirroronly:
     1575:                    self.d.setVar("BB_NO_NETWORK", "1")
     1576:
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 1020, function: try_mirrors
     1016:
     1017:    uris, uds = build_mirroruris(origud, mirrors, ld)
     1018:
     1019:    for index, uri in enumerate(uris):
 *** 1020:        ret = try_mirror_url(fetch, origud, uds[index], ld, check)
     1021:        if ret != False:
     1022:            return ret
     1023:    return None
     1024:
File: '${COREBASE}/poky/bitbake/lib/bb/fetch2/__init__.py', lineno: 978, function: try_mirror_url
     0974:            if os.path.islink(origud.localpath):
     0975:                # Broken symbolic link
     0976:                os.unlink(origud.localpath)
     0977:
 *** 0978:            os.symlink(ud.localpath, origud.localpath)
     0979:        update_stamp(origud, ld)
     0980:        return ud.localpath
     0981:
     0982:    except bb.fetch2.NetworkAccess:

Exception: OSError: [Errno 17] File exists

What happens here is that two tasks simultaneously decide to download
something, and both come to the conclusion that they need to create a
symbolic link.
And even though there is a check for whether the link already exists,
there is a small window of time where both tasks see that the link is
missing and try to create it, with the result that the second task
fails as shown above.

The change provided here makes the link creation atomic, so that even
if two tasks do decide that they need to create the same link, neither
of them will fail.

I do not know if this solves the same problem that is solved by commit
b8b14d975a254444461ba857fc6fb8c725de8874 on the master-next branch in
the bitbake repository. Since I have no way to recreate the failure in
a controlled way, I cannot test whether the change on the master-next
branch also solves it. Its description does not exactly match our
situation (we do not map file:// URLs to http:// URLs in our
SSTATE_MIRRORS), but maybe someone with better knowledge of the code
can tell if either or both changes are needed.

//Peter

The following changes since commit 415b72ffcbd26e5f3664370d8b2a9b8105fb6342:

  dnf: remove systemd units in nativesdk builds (2017-03-28 10:34:37 +0100)

are available in the git repository at:

  git://git.yoctoproject.org/poky-contrib pkj/atomic_symlinks
  http://git.yoctoproject.org/cgit.cgi/poky-contrib/log/?h=pkj/atomic_symlinks

Peter Kjellerstedt (1):
  fetch2: Create/replace symbolic links atomically

 bitbake/lib/bb/fetch2/__init__.py | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.12.0
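
For readers who want to see the race described in this cover letter in
isolation, here is a minimal sketch in plain Python. It is not the code
from the patch. The names symlink_naive and symlink_atomic are
hypothetical, and the technique shown (create the link under a unique
temporary name, then rename it into place) is only one common way to
get atomic behaviour on POSIX systems; the actual change may well do
it differently.

```python
import os
import tempfile


def symlink_naive(target, link_path):
    """Check-then-create, mirroring the pattern in the traceback.

    Two tasks can both pass the exists() check before either creates the
    link; the slower one then gets OSError: [Errno 17] File exists.
    """
    if not os.path.exists(link_path):
        if os.path.islink(link_path):
            # Broken symbolic link
            os.unlink(link_path)
        os.symlink(target, link_path)


def symlink_atomic(target, link_path):
    """Create or replace link_path as a symlink to target (hypothetical helper).

    The link is first created under a unique name in a private temporary
    directory next to the destination, then moved into place with
    os.rename(). On POSIX, rename() replaces an existing destination
    atomically, so concurrent callers never see EEXIST.
    """
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # The temporary directory lives in the destination directory so that
    # the rename never crosses a filesystem boundary.
    tmp_dir = tempfile.mkdtemp(dir=link_dir)
    tmp_link = os.path.join(tmp_dir, "link")
    try:
        os.symlink(target, tmp_link)
        os.rename(tmp_link, link_path)
    finally:
        # Clean up if the rename did not happen, then drop the directory.
        if os.path.islink(tmp_link):
            os.unlink(tmp_link)
        os.rmdir(tmp_dir)
```

With this sketch, any number of tasks can call symlink_atomic() for the
same link at the same time; the last rename wins and none of the calls
raises. A simpler alternative, under the assumption that all tasks want
the same target, would be to call os.symlink() directly and ignore
FileExistsError.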