* BB_SIGNATURE_HANDLER = "basichash" unusable strict?
From: Martin Jansa @ 2011-11-08 14:37 UTC
To: Patches and discussions about the oe-core layer
Today I've started build from scratch and decided to give basichash a try (as it is supposed to become default IIRC):
So after cleaning tmpdir, sstate cache, pseudo I've started clean build..
1) bitbake -k gcc-cross | tee -a log.${MACHINE};
2) bitbake -k virtual/kernel | tee -a log.${MACHINE};
3) bitbake -k core-image-core | tee -a log.${MACHINE};
4) bitbake -k shr-lite-image | tee -a log.${MACHINE};
But then I've noticed that after successful build of gcc-cross in step 1 it started another gcc-* build in step 2..
$ bitbake-diffsigs sstate-cache/sstate-gcc-cross-initial-armv4t-oe-linux-gnueabi-4.6.1+svnr180099-r18-x86_64_armv4t-2-*_populate-lic.tgz.siginfo
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb.do_patch changed from 27ad89a6c7a2909c692d70ef9e5f35f3 to 1226ce1337b5619d6ac4b3e5b4f7ad8d
$ bitbake-diffsigs tmp-eglibc/stamps/armv4t-oe-linux-gnueabi/gcc-cross-initial-4.6.1+svnr180099-r18.do_patch.sigdata.*
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb.do_headerfix changed from b3744b0f4ce6528b83cfd41819f5dfac to 155191eba6adf57f8bb0dbfa3328e86d
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb.do_unpack changed from 916fc502f05393665f6ce2b739f2bbdd to 607514e9b955d0574ca73d68ac1c4909
$ bitbake-diffsigs tmp-eglibc/stamps/armv4t-oe-linux-gnueabi/gcc-cross-initial-4.6.1+svnr180099-r18.do_unpack.sigdata.*
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/gcc/gcc-cross-initial_4.6.bb.do_fetch changed from 0d06e2791086ad47e27f62d4d3a5fb64 to 2d49d81052858f923d2509004fe9b25a
$ bitbake-diffsigs tmp-eglibc/stamps/armv4t-oe-linux-gnueabi/gcc-cross-initial-4.6.1+svnr180099-r18.do_fetch.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-devtools/subversion/subversion_1.7.0.bb.do_populate_sysroot changed from d09614a543e3363f048e7b109fcd018e to 5a8f4950f420ea8c502a088425784a67
$ bitbake-diffsigs sstate-cache/sstate-subversion-native-x86_64-linux-1.7.0-r0-x86_64-2-*populate-sysroot.tgz.siginfo
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-devtools/subversion/subversion_1.7.0.bb.do_install changed from fa7e205110d0d39d245e2e8026353121 to 5bfeb570596bd50c4f0c39c25b7801cd
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/subversion-native-1.7.0-r0.do_install.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-devtools/subversion/subversion_1.7.0.bb.do_compile changed from e99a13758e0b55409de0a6773333a7fc to 562b9e2fd556973a7f3785e1a6b5bd40
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/subversion-native-1.7.0-r0.do_compile.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-devtools/subversion/subversion_1.7.0.bb.do_configure changed from 5f19d874d3c5435aa04e30948a35769c to 4d33c1af00bd7256d169f3558aa3ce64
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/subversion-native-1.7.0-r0.do_configure.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-support/neon/neon_0.29.6.bb.do_populate_sysroot changed from bec25748384aaf2082943c484f3efa67 to 3c63f185b68a70934066deed691e5c86
$ bitbake-diffsigs sstate-cache/sstate-neon-native-x86_64-linux-0.29.6-r0-x86_64-2-*populate-sysroot.tgz.siginfo
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-support/neon/neon_0.29.6.bb.do_install changed from 4da3b0fda3d78e2d9240823c7e4a0b12 to 099cbf5b04054ea6d0180c2ea8d55893
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/neon-native-0.29.6-r0.do_install.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-support/neon/neon_0.29.6.bb.do_compile changed from 3fe8f07d07e3837e5def65d6ad526fad to cf92f049a9358396c57dd3afe89a097e
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/neon-native-0.29.6-r0.do_compile.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-support/neon/neon_0.29.6.bb.do_configure changed from fea13b2a72ceec337b9e99b2e688ce22 to 470409c68f202204094c0fb78ea0bb9c
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/neon-native-0.29.6-r0.do_configure.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_populate_sysroot changed from 85a14f7a73ea96fe85227c5a4bac3f1f to f3bbb2f69cdef3ee60360fbbd6fab311
$ bitbake-diffsigs sstate-cache/sstate-libxml2-native-x86_64-linux-2.7.8-r*populate-sysroot.tgz.siginfo
basehash changed from 306c63118517faab316c218dd29b9bf3 to 8cfccf0b10cde4211693f56e5a96e12e
Variable PR value changed from r3 to r4
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_install changed from 57f3a4db8953b2ce5dbc363757c03b7e to 17fbf32e34250f887781b708b39a3086
Ah.. yes I did 2 small patches to libxml2 and openssl between step 1 and step2:
http://patchwork.openembedded.org/patch/14521/
http://patchwork.openembedded.org/patch/14519/
But do we want to rebuild everything after every small change like this?
Or is it a configuration issue or just a bug in the sstate implementation?
Btw libxml2 isn't first difference.. I can dig more..
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/libxml2-native-2.7.8-r*do_install.sigdata*
basehash changed from 62dc1c0a61f00ddf535653092ac6b990 to 6a0264b149637ff9ed4cc4131dbb8281
Variable PR value changed from r3 to r4
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_compile changed from 6a22c51bd60c2c9cd5d1449049fbc594 to 6fa97d974fbbee1191565d2d1093f61f
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/libxml2-native-2.7.8-r*do_compile.sigdata*
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_configure changed from 15cb4d383c2cdbdc14fa03465ba12994 to 0b0d0799166dd435bd3bfd7f8013e601
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/libxml2-native-2.7.8-r*do_configure.sigdata*
basehash changed from e64cbe7dbb385da9105320b536b413f4 to a5daed8957519be42f83c33d8674a3b1
Variable PR value changed from r3 to r4
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_patch changed from d5143d7e48c26966de6cd61a1c432429 to 13eb7410db1ca3751b200cdebd0634f6
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/python/python-native_2.7.2.bb.do_populate_sysroot changed from ceefb2abf0808fa4224854d17e8f361b to ee687271695d9ac1ac5463e0891bc67c
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/libxml2-native-2.7.8-r*do_patch.sigdata*
basehash changed from d09eeb290e84cc693a2511fd22fc115e to 21637f5f0f972a33166ce82f9c25cd32
Variable PR value changed from r3 to r4
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_unpack changed from 80fbdd88cf3e4bf7faa04b5aa1689bb5 to 2bc535cff2ff219fb1549bbcb3f8d40f
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/libxml2-native-2.7.8-r*do_unpack.sigdata*
basehash changed from f37acfa6dffa81ed3e7339f794e546c1 to 3dfdc9a2be00284ec0a7010a286a470c
Variable PR value changed from r3 to r4
$ bitbake-diffsigs sstate-cache/sstate-python-native-x86_64-linux-2.7.2-r0.0-x86_64-2-*populate-sysroot.tgz.siginfo
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/python/python-native_2.7.2.bb.do_install changed from cc1386028c2c1849e65032e03e4bbc3c to 3cbe787cc4f489db10ab55d62ba7fa70
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/python-native-2.7.2-r0.0.do_install.sigdata.*
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/python/python-native_2.7.2.bb.do_compile changed from 891f9adecc7d0e87f03b953b8bf67b29 to 79e4cebf62bc4a973d27a74984a0f1cc
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/python-native-2.7.2-r0.0.do_compile.sigdata.*
Hash for dependent task /OE/shr-core/openembedded-core/meta/recipes-devtools/python/python-native_2.7.2.bb.do_configure changed from 365565f55cf4df1a90438320ff760897 to 8fad169835037b6d8081481603036be8
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/python-native-2.7.2-r0.0.do_configure.sigdata.*
Hash for dependent task virtual:native:/OE/shr-core/meta-openembedded/meta-oe/recipes-connectivity/openssl/openssl_1.0.0e.bb.do_populate_sysroot changed from aaaf818a6acd42821a765910705fdbe3 to 76f61b8f3b9894caac33589daf4cb05b
$ bitbake-diffsigs sstate-cache/sstate-openssl-native-x86_64-linux-1.0.0e-r14.*populate-sysroot.tgz.siginfo
basehash changed from df1102af82e7c91406e7af4f8a2a588e to 98fbaf83c6220b8e892475967c251c75
Variable PR value changed from ${INC_PR}.3 to ${INC_PR}.4
Hash for dependent task virtual:native:/OE/shr-core/meta-openembedded/meta-oe/recipes-connectivity/openssl/openssl_1.0.0e.bb.do_install changed from fff1adc3d5ffab89798fd2e7c7e3278e to b3eacc0fffeb29d03880ae53eba77307
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/openssl-native-1.0.0e-r14.*do_install.sigdata*
basehash changed from de6aa28ab623161658d03dd8465558e7 to 81ecc9ba2506e7d8f3b7036fafc44fe4
Variable PR value changed from ${INC_PR}.3 to ${INC_PR}.4
Hash for dependent task virtual:native:/OE/shr-core/meta-openembedded/meta-oe/recipes-connectivity/openssl/openssl_1.0.0e.bb.do_compile changed from 81948f726657b1a54df49717006e3eb8 to 3ea129c0a248c274b57d71002150664f
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/openssl-native-1.0.0e-r14.*do_compile.sigdata*
Hash for dependent task virtual:native:/OE/shr-core/meta-openembedded/meta-oe/recipes-connectivity/openssl/openssl_1.0.0e.bb.do_configure changed from c70f8fa60f7680a69b2efa9b3328b363 to f6d20bac2b382c853e73cd1d276e3ed6
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/openssl-native-1.0.0e-r14.*do_configure.sigdata*
Hash for dependent task virtual:native:/OE/shr-core/meta-openembedded/meta-oe/recipes-connectivity/openssl/openssl_1.0.0e.bb.do_patch changed from 2352761d461df7639a4700c346bf944c to 231d6c0c919e6c86961397f2100288a3
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/openssl-native-1.0.0e-r14.*do_patch.sigdata*
basehash changed from 42b5dc2d5bf3214822c1fca56a823b93 to 44c20013d474320713936ab78ddd79d2
Variable PR value changed from ${INC_PR}.3 to ${INC_PR}.4
Hash for dependent task virtual:native:/OE/shr-core/meta-openembedded/meta-oe/recipes-connectivity/openssl/openssl_1.0.0e.bb.do_unpack changed from fb9a2dae9ddc97d158c2f41c67ecffdd to 20325a8541ca8d988c24343d45c0e908
$ bitbake-diffsigs tmp-eglibc/stamps/x86_64-linux/openssl-native-1.0.0e-r14.*do_unpack.sigdata*
basehash changed from 69ae853186255c752cb79bc6747ab5ca to 72f78bacf7e8b5f21b25f9189f2abf40
Variable PR value changed from ${INC_PR}.3 to ${INC_PR}.4
I have a few extra patches in my branch, so for this particular test case you also need e.g.
http://patchwork.openembedded.org/patch/13699/
But it shouldn't be hard to find a similar issue for any other dependency tree (e.g. with git-native instead of subversion-native).
Regards,
--
Martin 'JaMa' Jansa jabber: Martin.Jansa@gmail.com
* Re: BB_SIGNATURE_HANDLER = "basichash" unusable strict?
From: Richard Purdie @ 2011-11-09 10:32 UTC
To: Patches and discussions about the oe-core layer

On Tue, 2011-11-08 at 15:37 +0100, Martin Jansa wrote:
> Today I've started build from scratch and decided to give basichash a try (as it is supposed to become default IIRC):
>
> So after cleaning tmpdir, sstate cache, pseudo I've started clean build..
>
> 1) bitbake -k gcc-cross | tee -a log.${MACHINE};
> 2) bitbake -k virtual/kernel | tee -a log.${MACHINE};
> 3) bitbake -k core-image-core | tee -a log.${MACHINE};
> 4) bitbake -k shr-lite-image | tee -a log.${MACHINE};
>
> But then I've noticed that after successful build of gcc-cross in step 1 it started another gcc-* build in step 2..
[...]
> Ah.. yes I did 2 small patches to libxml2 and openssl between step 1 and step2:
> http://patchwork.openembedded.org/patch/14521/
> http://patchwork.openembedded.org/patch/14519/
>
> But do we want to rebuild everything after every small change like this?

The biggest problem we have here is deciding when to rebuild and when not to. Can you define when this should/shouldn't happen?

> Or is it a configuration issue or just a bug in the sstate implementation?

I think it's behaving as currently configured. Whether that configuration is right/wrong and what it should be is the question. If we can define the configuration, we can then work out how to implement it, which is a separate issue.

> Btw libxml2 isn't first difference.. I can dig more..
[...]
> I have a few extra patches in my branch, so for this particular test case you also need e.g.
> http://patchwork.openembedded.org/patch/13699/
>
> But it shouldn't be hard to find a similar issue for any other dependency tree (e.g. with git-native instead of subversion-native).

The situation is currently configurable through:

BB_HASHTASK_WHITELIST ?= "(.*-cross$|.*-native$|.*-cross-initial$|.*-cross-intermediate$|^virtual:native:.*|^virtual:nativesdk:.*)"

however I have to admit, looking at the bitbake code handling this, it's not that simple.

The code only triggers for recipes which are not matched by the whitelist. For those not matching, it iterates through their dependencies and removes anything that matches the expression.

So effectively it only modifies target recipes, removing dependencies matching the above expressions.

This isn't an easy problem and this is reminding me I wanted to revisit this code. I think we actually need some kind of double expression to match a regexp against like:

<depender>___<depend>

So we could then do:

REGEXP_NONNATIVE = "(.*-cross|.*-native|.*-cross-initial|.*-cross-intermediate|virtual:native:.*|virtual:nativesdk:.*)"

BB_HASHTASK_WHITELIST ?= "^.(?!${REGEXP_NONNATIVE})___${REGEXP_NONNATIVE}$"

which would function as above but move more of the control into the code.

I was then trying to come up with a further example to extend this but it's not scaling. So let's throw away the idea of using regexps and use python. Coding off the top of my head, we could have something like:

def filter_dep(depender, depend):
    # Return True if we should keep the dependency, False to drop it
    def isNative(x):
        return x.startswith("virtual:native:") or x.endswith("-native")
    def isCross(x):
        return x.endswith("-cross") or x.endswith("-cross-initial") or x.endswith("-cross-intermediate")
    def isNativeSDK(x):
        return x.startswith("virtual:nativesdk:")

    if isNative(depender) or isCross(depender) or isNativeSDK(depender):
        return True

    # Only target packages beyond here

    if isNative(depend) or isCross(depend) or isNativeSDK(depend):
        return False

    return True

which would then be easy to extend to, for example, ensure the python dependency on python-native is kept.

The siggen code was designed to be a plugin so changing it to the above form isn't the problem. The real problem is deciding what the policy it implements should be.

So to go back to the original question, out of those changes you made, which ones would you expect to change the hash and which ones would you not expect to see changes for?

Cheers,

Richard
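
For illustration only, here is a runnable sketch of the filter_dep idea above, including the python -> python-native exception mentioned as a possible extension. The KEEP_PAIRS set and the use of plain recipe names (rather than the recipe file names bitbake actually passes around) are assumptions made for this example, not part of the bitbake siggen code.

```python
def is_native(name):
    return name.startswith("virtual:native:") or name.endswith("-native")


def is_cross(name):
    return name.endswith(("-cross", "-cross-initial", "-cross-intermediate"))


def is_nativesdk(name):
    return name.startswith("virtual:nativesdk:")


def is_build_tool(name):
    return is_native(name) or is_cross(name) or is_nativesdk(name)


# Hypothetical exception list: (depender, depend) pairs that always stay in the hash.
KEEP_PAIRS = {("python", "python-native")}


def filter_dep(depender, depend):
    """Return True to keep the dependency in the task hash, False to drop it."""
    if (depender, depend) in KEEP_PAIRS:
        return True
    # Native/cross/nativesdk recipes keep all of their dependencies.
    if is_build_tool(depender):
        return True
    # Target recipes drop dependencies on native/cross/nativesdk recipes.
    return not is_build_tool(depend)


print(filter_dep("neon-native", "libxml2-native"))  # True: native depender
print(filter_dep("busybox", "gcc-cross"))           # False: target on cross
print(filter_dep("python", "python-native"))        # True: explicit exception
```

With a policy like this, the chain reported at the top of the thread (libxml2-native -> neon-native -> subversion-native -> gcc-cross-initial) would still be rebuilt, because every depender in it is itself native or cross.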
* Re: BB_SIGNATURE_HANDLER = "basichash" unusable strict?
From: Martin Jansa @ 2011-11-09 11:51 UTC
To: Patches and discussions about the oe-core layer

On Wed, Nov 09, 2011 at 10:32:18AM +0000, Richard Purdie wrote:
> On Tue, 2011-11-08 at 15:37 +0100, Martin Jansa wrote:
> > Today I've started build from scratch and decided to give basichash a try (as it is supposed to become default IIRC):
> >
> > So after cleaning tmpdir, sstate cache, pseudo I've started clean build..
> >
> > 1) bitbake -k gcc-cross | tee -a log.${MACHINE};
> > 2) bitbake -k virtual/kernel | tee -a log.${MACHINE};
> > 3) bitbake -k core-image-core | tee -a log.${MACHINE};
> > 4) bitbake -k shr-lite-image | tee -a log.${MACHINE};
> >
> > But then I've noticed that after successful build of gcc-cross in step 1 it started another gcc-* build in step 2..
> [...]
> > Ah.. yes I did 2 small patches to libxml2 and openssl between step 1 and step2:
> > http://patchwork.openembedded.org/patch/14521/
> > http://patchwork.openembedded.org/patch/14519/
> >
> > But do we want to rebuild everything after every small change like this?
>
> The biggest problem we have here is deciding when to rebuild and when
> not to. Can you define when this should/shouldn't happen?
>
> > Or is it a configuration issue or just a bug in the sstate implementation?
>
> I think it's behaving as currently configured. Whether that configuration
> is right/wrong and what it should be is the question. If we can define
> the configuration, we can then work out how to implement it which is a
> separate issue.
>
> > Btw libxml2 isn't first difference.. I can dig more..
> [...]
> > I have a few extra patches in my branch, so for this particular test case you also need e.g.
> > http://patchwork.openembedded.org/patch/13699/
> >
> > But it shouldn't be hard to find a similar issue for any other dependency tree (e.g. with git-native instead of subversion-native).
>
> The situation is currently configurable through:
>
> BB_HASHTASK_WHITELIST ?= "(.*-cross$|.*-native$|.*-cross-initial$|.*-cross-intermediate$|^virtual:native:.*|^virtual:nativesdk:.*)"
>
> however I have to admit looking at the bitbake code handling this it's
> not that simple.
>
> The code only triggers for recipes which are not matched by the
> whitelist. For those not matching, it iterates through their
> dependencies and removes anything that matches the expression.
>
> So effectively it only modifies target recipes, removing dependencies
> matching the above expressions.
>
> This isn't an easy problem and this is reminding me I wanted to revisit
> this code. I think we actually need some kind of double expression to
> match a regexp against like:
>
> <depender>___<depend>
>
> So we could then do:
>
> REGEXP_NONNATIVE = "(.*-cross|.*-native|.*-cross-initial|.*-cross-intermediate|virtual:native:.*|virtual:nativesdk:.*)"
>
> BB_HASHTASK_WHITELIST ?= "^.(?!${REGEXP_NONNATIVE})___${REGEXP_NONNATIVE}$"
>
> which would function as above but move more of the control into the code.
>
> I was then trying to come up with a further example to extend this but
> it's not scaling. So let's throw away the idea of using regexps and use
> python. Coding off the top of my head, we could have something like:
>
> def filter_dep(depender, depend):
>     # Return True if we should keep the dependency, False to drop it
>     def isNative(x):
>         return x.startswith("virtual:native:") or x.endswith("-native")
>     def isCross(x):
>         return x.endswith("-cross") or x.endswith("-cross-initial") or x.endswith("-cross-intermediate")
>     def isNativeSDK(x):
>         return x.startswith("virtual:nativesdk:")
>
>     if isNative(depender) or isCross(depender) or isNativeSDK(depender):
>         return True
>
>     # Only target packages beyond here
>
>     if isNative(depend) or isCross(depend) or isNativeSDK(depend):
>         return False
>
>     return True
>
> which would then be easy to extend to for example ensure the python
> dependency on python-native is kept.
>
> The siggen code was designed to be a plugin so changing it to the above
> form isn't the problem. The real problem is deciding what the policy it
> implements should be.
>
> So to go back to the original question, out of those changes you made,
> which ones would you expect to change the hash and which ones would you
> not expect to see changes for?

I talked with kergoth on IRC yesterday and he made a very nice remark:

16:40:50 < kergoth_> JaMa: heh, the biggest weakness of the sstate signature bits, in my opinion, is that it only tracks inputs, not outputs. If task A depends on B, and the metadata input to B changes, then A will be rebuilt, even if the *output* of B didn't change as a result of the change to its metadata.

And with this idea applied to those 2 changes, I think that the PR change in libxml2 should of course invalidate the checksum for sstate-libxml2-native-x86_64-linux-2.7.8-r*populate-sysroot.tgz.siginfo, and it probably won't hurt so much when neon-native is also rebuilt. But then, if the output of the neon build is the same with the new sstate checksum as it was with the older one (I know it's hard to detect, e.g. if some file in the build has a "generation timestamp" inside), we won't continue to rebuild subversion, gcc, ... all just because neon was rebuilt due to a libxml2 PR change which didn't influence the neon output.

The same with the openssl PR change.. which can cause a python-native rebuild, but as long as the python-native build output is "the same" we don't need to rebuild everything which (even transitively) depends on python-native.

Regards,
--
Martin 'JaMa' Jansa jabber: Martin.Jansa@gmail.com
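
The input-only behaviour described in kergoth's remark can be illustrated with a toy model. This is only a sketch under simplifying assumptions: one signature per recipe instead of per task, a made-up metadata string per recipe, and a dependency chain reduced from the bitbake-diffsigs output at the top of the thread. It is not how bitbake's siggen is implemented.

```python
import hashlib

# Simplified recipe-level chain taken from the diffsigs output in this thread.
DEPS = {
    "libxml2-native": [],
    "neon-native": ["libxml2-native"],
    "subversion-native": ["neon-native"],
    "gcc-cross-initial": ["subversion-native"],
}


def signature(recipe, metadata, cache=None):
    """Input-only signature: own metadata plus the signatures of all dependencies."""
    if cache is None:
        cache = {}
    if recipe not in cache:
        h = hashlib.md5(metadata[recipe].encode())
        for dep in sorted(DEPS[recipe]):
            h.update(signature(dep, metadata, cache).encode())
        cache[recipe] = h.hexdigest()
    return cache[recipe]


def changed_recipes(old_meta, new_meta):
    """Recipes whose signature differs between two metadata snapshots."""
    return sorted(
        r for r in DEPS if signature(r, old_meta) != signature(r, new_meta)
    )


old = {r: "PR=r0" for r in DEPS}
old["libxml2-native"] = "PR=r3"
new = dict(old, **{"libxml2-native": "PR=r4"})

# A PR bump at the bottom of the chain changes every signature above it.
print(changed_recipes(old, new))
# ['gcc-cross-initial', 'libxml2-native', 'neon-native', 'subversion-native']
```

Nothing in this model looks at what a recipe produced, which is why a change that leaves the libxml2-native output untouched still invalidates everything downstream.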
* Re: BB_SIGNATURE_HANDLER = "basichash" unusable strict?
From: Richard Purdie @ 2011-11-09 12:06 UTC
To: Patches and discussions about the oe-core layer

On Wed, 2011-11-09 at 12:51 +0100, Martin Jansa wrote:
> I have talked with kergoth on IRC yesterday and he had very nice remark:
>
> 16:40:50 < kergoth_> JaMa: heh, the biggest weakness of the sstate
> signature bits, in my opinion, is that it only tracks inputs, not
> outputs. If task A depends on B, and the metadata input to B changes,
> then A will be rebuilt, even if the *output* of B didn't change as a
> result of the change to its metadata.
>
> And with this idea applied on those 2 changes I think that PR change in
> libxml2 should of course invalidate checksum for
> sstate-libxml2-native-x86_64-linux-2.7.8-r*populate-sysroot.tgz.siginfo
> and probably won't hurt so much when neon-native is also rebuilt, but then
> if the output of neon build is the same with new sstate checksum as it was
> with older one (I know it's hard to detect ie if some file in build has
> "generation timestamp inside"), then we won't continue to rebuild
> subversion, gcc, ... all (just because neon was rebuilt due to libxml2 PR
> change which didn't influence neon output).
>
> The same with openssl PR change.. which can cause python-native rebuild,
> but as long as python-native build output is "the same" we don't need to
> rebuild everything which (even transitively) depends on python-native.

In an ideal world it would be nice to track the output. I've never seen a proposal for how we could make this work in practice though. There are at least two big problems that spring to mind:

a) How do you compare two sets of output and decide whether they're the same? Same list of files? Same contents? How to deal with timestamps?

b) You can't know in advance that the output will or won't match and it's near impossible to calculate any kind of checksum without having the output available to perform that calculation on. This breaks a lot of the way bitbake runs the builds and makes it hard to compare two configurations. Is A compatible with B? You'd have to build them both to find out.

Whilst output tracking sounds nice, I think it's trading one set of problems for another and in the end, I'm not sure it's the perfect solution it might look like from our current position.

Cheers,

Richard
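
Problem (a) can be made concrete with a small experiment. The sketch below uses only the Python standard library; the file names and contents are made up. It builds two in-memory tar archives with identical files but different modification times, then compares a checksum of the raw archive with a checksum computed over member names and contents only.

```python
import hashlib
import io
import tarfile


def make_tar(files, mtime):
    """Build an uncompressed in-memory tar; every member gets the given mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def archive_digest(blob):
    """Checksum of the raw archive bytes (headers, timestamps and all)."""
    return hashlib.md5(blob).hexdigest()


def content_digest(blob):
    """Checksum over member names and file contents only, ignoring mtime."""
    h = hashlib.md5()
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in sorted(tar.getmembers(), key=lambda m: m.name):
            if member.isfile():
                h.update(member.name.encode())
                h.update(b"\0")
                h.update(tar.extractfile(member).read())
    return h.hexdigest()


files = {"usr/lib/libfoo.so": b"binary", "usr/include/foo.h": b"#define FOO 1\n"}
first = make_tar(files, mtime=1000)
second = make_tar(files, mtime=2000)

print(archive_digest(first) == archive_digest(second))  # False
print(content_digest(first) == content_digest(second))  # True
```

This only sidesteps timestamps stored in the archive metadata. A build date embedded inside a generated file would still change the content digest, and problem (b) is untouched: the digest cannot be computed before the output exists.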
* Re: BB_SIGNATURE_HANDLER = "basichash" unusable strict?
From: Martin Jansa @ 2011-11-09 12:45 UTC
To: Patches and discussions about the oe-core layer

On Wed, Nov 09, 2011 at 12:06:23PM +0000, Richard Purdie wrote:
> On Wed, 2011-11-09 at 12:51 +0100, Martin Jansa wrote:
> > I have talked with kergoth on IRC yesterday and he had very nice remark:
> >
> > 16:40:50 < kergoth_> JaMa: heh, the biggest weakness of the sstate
> > signature bits, in my opinion, is that it only tracks inputs, not
> > outputs. If task A depends on B, and the metadata input to B changes,
> > then A will be rebuilt, even if the *output* of B didn't change as a
> > result of the change to its metadata.
> >
> > And with this idea applied on those 2 changes I think that PR change in
> > libxml2 should of course invalidate checksum for
> > sstate-libxml2-native-x86_64-linux-2.7.8-r*populate-sysroot.tgz.siginfo
> > and probably won't hurt so much when neon-native is also rebuilt, but then
> > if the output of neon build is the same with new sstate checksum as it was
> > with older one (I know it's hard to detect ie if some file in build has
> > "generation timestamp inside"), then we won't continue to rebuild
> > subversion, gcc, ... all (just because neon was rebuilt due to libxml2 PR
> > change which didn't influence neon output).
> >
> > The same with openssl PR change.. which can cause python-native rebuild,
> > but as long as python-native build output is "the same" we don't need to
> > rebuild everything which (even transitively) depends on python-native.
>
> In an ideal world it would be nice to track the output. I've never seen
> a proposal for how we could make this work in practice though. There are
> at least two big problems that spring to mind:
>
> a) How do you compare two sets of output and decide whether they're the
> same? Same list of files? Same contents? How to deal with timestamps?
>
> b) You can't know in advance that the output will or won't match and it's
> near impossible to calculate any kind of checksum without having the
> output available to perform that calculation on. This breaks a lot of
> the way bitbake runs the builds and makes it hard to compare two
> configurations. Is A compatible with B? You'd have to build them both to
> find out.
>
> Whilst output tracking sounds nice, I think it's trading one set of
> problems for another and in the end, I'm not sure it's the perfect
> solution it might look like from our current position.

This could be a completely silly idea and I don't have any tmpdir to check it on real sstate data, but what if we extend

sstate-libxml2-native-x86_64-linux-2.7.8-r4-x86_64-2-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.siginfo
to contain checksums for every file included in
sstate-libxml2-native-x86_64-linux-2.7.8-r4-x86_64-2-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz
maybe store them in a new extra file like
sstate-libxml2-native-x86_64-linux-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.files.siginfo
and add only the checksum of this file to the original siginfo file

And then when the neon-native do_configure task is in the runqueue because of:
Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_populate_sysroot
changed from 85a14f7a73ea96fe85227c5a4bac3f1f to f3bbb2f69cdef3ee60360fbbd6fab311

We'll compare
sstate-libxml2-native-x86_64-linux-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.files.siginfo
and
sstate-libxml2-native-x86_64-linux-f3bbb2f69cdef3ee60360fbbd6fab311_populate-sysroot.tgz.files.siginfo
and if they're the same, we can skip neon-native.do_configure and all following tasks pulled into the runqueue just because of the libxml2-native PR change.

I know this still has a lot of false positives, but we can whitelist some files with something like filesdepsexclude (as vardepsexclude) so that files matching some pattern won't be included in files.siginfo because they contain e.g. a build timestamp (in generated files) or they change name without a change of content (like /usr/doc/share/foo-1.0/README could be the same as /usr/doc/share/foo-1.1/README and it's not important for other packages depending on foo).

What I fear is that a change like this will force "rebuild almost from scratch" too often to finish a build before another such change is pushed in some layer (=> cannot do continual builds on current hw anymore)

Or that the auto-PR-bump thing is going to use the same checksum mechanism, so even opkg upgrade will be slower than reflashing the device.

And my last thought yesterday was that it would be nice to be able to disable sstate completely, to save some IO (generating sstate-cache and siginfos) for people who know what they're doing (and can rebuild stuff manually when needed), as with the basic signature handler it doesn't reuse sstate much in multimachine builds (when everything is built according to the basic signature handler, but sstate checksums are already somewhere else)
http://lists.linuxtogo.org/pipermail/openembedded-core/2011-November/012053.html
and when it does reuse an sstate package, it sometimes causes troubles
http://lists.linuxtogo.org/pipermail/openembedded-core/2011-November/012149.html

Cheers,
--
Martin 'JaMa' Jansa jabber: Martin.Jansa@gmail.com
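
A rough sketch of the files.siginfo comparison proposed above, assuming the unpacked sstate contents are available as a mapping from path to bytes. The FILES_DEPS_EXCLUDE patterns stand in for the suggested filesdepsexclude and are hypothetical, as are the helper names and the sample file list; nothing like this exists in bitbake as far as this thread shows.

```python
import fnmatch
import hashlib

# Hypothetical exclusion patterns, playing the role of the proposed "filesdepsexclude".
FILES_DEPS_EXCLUDE = ["*/doc/*", "*.pyc"]


def files_manifest(files, exclude=FILES_DEPS_EXCLUDE):
    """Map each non-excluded path to the md5 of its contents."""
    return {
        path: hashlib.md5(data).hexdigest()
        for path, data in files.items()
        if not any(fnmatch.fnmatch(path, pattern) for pattern in exclude)
    }


def manifest_digest(manifest):
    """Single checksum over the sorted manifest, as would go into the siginfo."""
    h = hashlib.md5()
    for path in sorted(manifest):
        h.update(f"{path} {manifest[path]}\n".encode())
    return h.hexdigest()


def outputs_match(old_files, new_files):
    """True if both outputs yield the same manifest digest."""
    return manifest_digest(files_manifest(old_files)) == manifest_digest(
        files_manifest(new_files)
    )


old_output = {
    "usr/lib/libxml2.so.2.7.8": b"library bytes",
    "usr/share/doc/libxml2-2.7.8-r3/README": b"readme",
}
new_output = {
    "usr/lib/libxml2.so.2.7.8": b"library bytes",
    "usr/share/doc/libxml2-2.7.8-r4/README": b"readme",
}

# Only the excluded doc path differs, so dependent tasks could be skipped.
print(outputs_match(old_output, new_output))  # True
```

The comparison still needs both outputs to exist locally, so it does not help with deciding ahead of time whether a remote sstate package can be reused.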
* Re: BB_SIGNATURE_HANDLER = "basichash" unusable strict?
  2011-11-09 12:45 ` Martin Jansa
@ 2011-11-09 14:13 ` Richard Purdie
  2011-11-09 14:48 ` Martin Jansa
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Purdie @ 2011-11-09 14:13 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

On Wed, 2011-11-09 at 13:45 +0100, Martin Jansa wrote:
> On Wed, Nov 09, 2011 at 12:06:23PM +0000, Richard Purdie wrote:
> > On Wed, 2011-11-09 at 12:51 +0100, Martin Jansa wrote:
> > > I have talked with kergoth on IRC yesterday and he had very nice remark:
> > >
> > > 16:40:50 < kergoth_> JaMa: heh, the biggest weakness of the sstate
> > > signature bits, in my opinion, is that it only tracks inputs, not
> > > outputs. If task A depends on B, and the metadata input to B changes,
> > > then A will be rebuilt, even if the *output* of B didn't change as a
> > > result of the change to its metadata.
> > >
> > > And with this idea applied on those 2 changes I think that PR change in
> > > libxml2 should of course invalidate checksum for
> > > sstate-libxml2-native-x86_64-linux-2.7.8-r*populate-sysroot.tgz.siginfo
> > > and probably won't hurt so much when neon-native is also rebuilt, but then
> > > if the output of neon build is the same with new sstate checksum as it was
> > > with older one (I know it's hard to detect e.g. if some file in build has
> > > "generation timestamp inside"), then we won't continue to rebuild
> > > subversion, gcc, ... all (just because neon was rebuilt due to libxml2 PR
> > > change which didn't influence neon output).
> > >
> > > The same with openssl PR change.. which can cause python-native rebuild,
> > > but as long as python-native build output is "the same" we don't need to
> > > rebuild everything which (even transitively) depends on python-native.
> >
> > In an ideal world it would be nice to track the output. I've never seen
> > a proposal for how we could make this work in practise though. There are
> > at least two big problems that spring to mind:
> >
> > a) How do you compare two sets of output and decide whether they're the
> > same? Same list of files? Same contents? How to deal with timestamps?
> >
> > b) You can't know in advance that the output will or won't match and it's
> > near impossible to calculate any kind of checksum without having the
> > output available to perform that calculation on. This breaks a lot of
> > the way bitbake runs the builds and makes it hard to compare two
> > configurations. Is A compatible with B? You'd have to build them both to
> > find out.
> >
> > Whilst output tracking sounds nice, I think it's trading one set of
> > problems for another and in the end, I'm not sure it's the perfect
> > solution it might look like from our current position.
>
> This could be completely silly idea and I don't have any tmpdir to check
> it on real sstate data, but what if we extend
>
> sstate-libxml2-native-x86_64-linux-2.7.8-r4-x86_64-2-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.siginfo
> to contain checksums for every file included in
> sstate-libxml2-native-x86_64-linux-2.7.8-r4-x86_64-2-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz
> maybe store them in new extra file like
> sstate-libxml2-native-x86_64-linux-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.files.siginfo
> and add only checksum of this file to original siginfo file
>
> And then when neon-native do_configure task is in runqueue because of:
> Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_populate_sysroot
> changed from 85a14f7a73ea96fe85227c5a4bac3f1f to f3bbb2f69cdef3ee60360fbbd6fab311
>
> We'll compare
> sstate-libxml2-native-x86_64-linux-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.files.siginfo
> and
> sstate-libxml2-native-x86_64-linux-f3bbb2f69cdef3ee60360fbbd6fab311_populate-sysroot.tgz.files.siginfo
> and if they're the same, we can skip neon-native.do_configure and all
> following tasks pulled to runqueue just because of libxml2-native PR
> change.

Two problems spring to mind to start with:

a) bitbake could have to checksum the .tgz file each time it runs (yes
we can add caches and so on but we've tried to be clever to avoid
needing to md5sum data we don't already have)

b) I can't calculate in advance what the checksum of a given task should
be without executing the task itself and generating the output files to
checksum. This means remote sstate packages become effectively useless.

> I know this still has a lot of false positives, but we can whitelist
> some files with something like filesdepsexclude (as vardepsexclude) so
> that files matching some pattern won't be included in files.siginfo
> because they contain e.g. build timestamp (in generated files) or they
> change name without change of content (like
> /usr/doc/share/foo-1.0/README could be the same as
> /usr/doc/share/foo-1.1/README and it's not important for other packages
> depending on foo).

I suspect this logic is going to get very difficult to write and
maintain :(.

> What I fear is that change like this will force "rebuild almost from scratch"
> too often to finish build before another such change is pushed in some
> layer (=> cannot do continual builds on current hw anymore)
>
> Or that auto-PR-bump thing is going to use same checksum mechanism,
> so even opkg upgrade will be slower than reflashing the device.
>
> And my last thought yesterday was that it would be nice to be able to
> disable sstate completely, to save some IO (generating sstate-cache and
> siginfos) for people who know what they're doing (and can rebuild stuff
> manually when needed), as with basic signature handler it doesn't reuse
> sstate much in multimachine builds (when everything is built according to
> basic signature handler, but sstate checksums are already somewhere
> else)
> http://lists.linuxtogo.org/pipermail/openembedded-core/2011-November/012053.html
> and when it does reuse sstate package, it sometimes causes troubles
> http://lists.linuxtogo.org/pipermail/openembedded-core/2011-November/012149.html

We can customise the siggen code to do whatever we think is appropriate,
including just permanently generating the same hash value with no
computation, effectively disabling 99.9% of the code/overhead.

I think there are ways to solve the problems and we will find a solution
that works the majority of the time but until people start thinking
about and using the code, it's not going to happen. It's nice to see
people starting to think about this though :)

Cheers,

Richard
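The difference between an input-based signature and a handler that "permanently generates the same hash value" can be pictured with a toy model. This is a sketch only: the two classes and their `task_hash` method are hypothetical and do not reproduce bitbake's actual siggen classes or method signatures, and the PR values are illustrative.

```python
import hashlib


class InputHashSignatures:
    """Hypothetical handler: hashes a task's own inputs plus its dependencies' hashes."""

    def task_hash(self, task_name, task_inputs, dep_hashes):
        h = hashlib.md5()
        h.update(task_name.encode("utf-8"))
        h.update(task_inputs.encode("utf-8"))
        # Dependency hashes feed into the result, so an input change upstream
        # changes every hash downstream of it.
        for dep in sorted(dep_hashes):
            h.update(dep.encode("utf-8"))
        return h.hexdigest()


class ConstantSignatures:
    """Hypothetical handler: the same value for every task, with no computation."""

    CONSTANT = "0" * 32

    def task_hash(self, task_name, task_inputs, dep_hashes):
        return self.CONSTANT
```

With the constant variant no metadata change ever invalidates anything, so deciding what to rebuild is left entirely to the user, which is the trade-off described for "people who know what they're doing".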
* Re: BB_SIGNATURE_HANDLER = "basichash" unusable strict?
  2011-11-09 14:13 ` Richard Purdie
@ 2011-11-09 14:48 ` Martin Jansa
  0 siblings, 0 replies; 7+ messages in thread
From: Martin Jansa @ 2011-11-09 14:48 UTC (permalink / raw)
To: Patches and discussions about the oe-core layer

[-- Attachment #1: Type: text/plain, Size: 8884 bytes --]

On Wed, Nov 09, 2011 at 02:13:06PM +0000, Richard Purdie wrote:
> On Wed, 2011-11-09 at 13:45 +0100, Martin Jansa wrote:
> > On Wed, Nov 09, 2011 at 12:06:23PM +0000, Richard Purdie wrote:
> > > On Wed, 2011-11-09 at 12:51 +0100, Martin Jansa wrote:
> > > > I have talked with kergoth on IRC yesterday and he had very nice remark:
> > > >
> > > > 16:40:50 < kergoth_> JaMa: heh, the biggest weakness of the sstate
> > > > signature bits, in my opinion, is that it only tracks inputs, not
> > > > outputs. If task A depends on B, and the metadata input to B changes,
> > > > then A will be rebuilt, even if the *output* of B didn't change as a
> > > > result of the change to its metadata.
> > > >
> > > > And with this idea applied on those 2 changes I think that PR change in
> > > > libxml2 should of course invalidate checksum for
> > > > sstate-libxml2-native-x86_64-linux-2.7.8-r*populate-sysroot.tgz.siginfo
> > > > and probably won't hurt so much when neon-native is also rebuilt, but then
> > > > if the output of neon build is the same with new sstate checksum as it was
> > > > with older one (I know it's hard to detect e.g. if some file in build has
> > > > "generation timestamp inside"), then we won't continue to rebuild
> > > > subversion, gcc, ... all (just because neon was rebuilt due to libxml2 PR
> > > > change which didn't influence neon output).
> > > >
> > > > The same with openssl PR change.. which can cause python-native rebuild,
> > > > but as long as python-native build output is "the same" we don't need to
> > > > rebuild everything which (even transitively) depends on python-native.
> > >
> > > In an ideal world it would be nice to track the output. I've never seen
> > > a proposal for how we could make this work in practise though. There are
> > > at least two big problems that spring to mind:
> > >
> > > a) How do you compare two sets of output and decide whether they're the
> > > same? Same list of files? Same contents? How to deal with timestamps?
> > >
> > > b) You can't know in advance that the output will or won't match and it's
> > > near impossible to calculate any kind of checksum without having the
> > > output available to perform that calculation on. This breaks a lot of
> > > the way bitbake runs the builds and makes it hard to compare two
> > > configurations. Is A compatible with B? You'd have to build them both to
> > > find out.
> > >
> > > Whilst output tracking sounds nice, I think it's trading one set of
> > > problems for another and in the end, I'm not sure it's the perfect
> > > solution it might look like from our current position.
> >
> > This could be completely silly idea and I don't have any tmpdir to check
> > it on real sstate data, but what if we extend
> >
> > sstate-libxml2-native-x86_64-linux-2.7.8-r4-x86_64-2-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.siginfo
> > to contain checksums for every file included in
> > sstate-libxml2-native-x86_64-linux-2.7.8-r4-x86_64-2-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz
> > maybe store them in new extra file like
> > sstate-libxml2-native-x86_64-linux-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.files.siginfo
> > and add only checksum of this file to original siginfo file
> >
> > And then when neon-native do_configure task is in runqueue because of:
> > Hash for dependent task virtual:native:/OE/shr-core/openembedded-core/meta/recipes-core/libxml/libxml2_2.7.8.bb.do_populate_sysroot
> > changed from 85a14f7a73ea96fe85227c5a4bac3f1f to f3bbb2f69cdef3ee60360fbbd6fab311
> >
> > We'll compare
> > sstate-libxml2-native-x86_64-linux-85a14f7a73ea96fe85227c5a4bac3f1f_populate-sysroot.tgz.files.siginfo
> > and
> > sstate-libxml2-native-x86_64-linux-f3bbb2f69cdef3ee60360fbbd6fab311_populate-sysroot.tgz.files.siginfo
> > and if they're the same, we can skip neon-native.do_configure and all
> > following tasks pulled to runqueue just because of libxml2-native PR
> > change.
>
> Two problems spring to mind to start with:
>
> a) bitbake could have to checksum the .tgz file each time it runs (yes
> we can add caches and so on but we've tried to be clever to avoid
> needing to md5sum data we don't already have)

A checksum for the whole .tgz is easy, but tgz.files.siginfo would hold a
checksum per file (except excluded files), so IMHO it would be easier to
store it when we have all the required metadata (at the time of .tgz
creation) than to compute it each time bitbake runs.

> b) I can't calculate in advance what the checksum of a given task should
> be without executing the task itself and generating the output files to
> checksum. This means remote sstate packages become effectively useless.

That's why I think that we have to build neon-native (after the
libxml2-native change) to see that the libxml2 change was contained in
libxml2 and doesn't influence neon-native output (and then of course
everything after neon-native). But it would build only 1 extra step (maybe
unneeded) and then stop. And the sstate-cache dir will have the neon-native
siginfo and tgz.files.siginfo for a remote builder to find that, even with
a different hash, those 2 neon-native populate-sysroot.tgz are compatible.

> > I know this still has a lot of false positives, but we can whitelist
> > some files with something like filesdepsexclude (as vardepsexclude) so
> > that files matching some pattern won't be included in files.siginfo
> > because they contain e.g. build timestamp (in generated files) or they
> > change name without change of content (like
> > /usr/doc/share/foo-1.0/README could be the same as
> > /usr/doc/share/foo-1.1/README and it's not important for other packages
> > depending on foo).
>
> I suspect this logic is going to get very difficult to write and
> maintain :(.

Yes, it would need more experiments to see how often we have different
sstate tgz with 100% the same content (and this change would solve those
without extra filesdepsexclude) and how often we can add a simple rule for
all recipes (maybe the whole /usr/doc/share/ can be ignored for
populate-sysroot, or vice versa: rebuild everything depending on foo when
the only change is /usr/doc/share/foo-1.[01], because we'll know that this
rebuild spree will end only 1 step after foo in the dependency tree).

> > What I fear is that change like this will force "rebuild almost from scratch"
> > too often to finish build before another such change is pushed in some
> > layer (=> cannot do continual builds on current hw anymore)
> >
> > Or that auto-PR-bump thing is going to use same checksum mechanism,
> > so even opkg upgrade will be slower than reflashing the device.
> >
> > And my last thought yesterday was that it would be nice to be able to
> > disable sstate completely, to save some IO (generating sstate-cache and
> > siginfos) for people who know what they're doing (and can rebuild stuff
> > manually when needed), as with basic signature handler it doesn't reuse
> > sstate much in multimachine builds (when everything is built according to
> > basic signature handler, but sstate checksums are already somewhere
> > else)
> > http://lists.linuxtogo.org/pipermail/openembedded-core/2011-November/012053.html
> > and when it does reuse sstate package, it sometimes causes troubles
> > http://lists.linuxtogo.org/pipermail/openembedded-core/2011-November/012149.html
>
> We can customise the siggen code to do whatever we think is appropriate,
> including just permanently generating the same hash value with no
> computation, effectively disabling 99.9% of the code/overhead.

This is only about disabling it, right? For the python issue, should it be
handled by something like SSTATEPOSTINSTFUNCS, used in dbus lately to
replace all sysroot-specific paths with the right value for the current
machine, or is it better to include MACHINE in vardeps (this time only for
python-native) to make sure that the Makefile has the right sysroot? The
latter won't help users e.g. with a different TMPDIR (like I did by removing
TCLIBCAPPEND = "" and expecting sstate to populate it properly in the new
location).

> I think there are ways to solve the problems and we will find a solution
> that works the majority of the time but until people start thinking
> about and using the code, it's not going to happen. It's nice to see
> people starting to think about this though :)

I'm sorry to be so pessimistic about it, I was just sad when I found out
that it's not a configuration problem on my side and that it does what
it's expected to do (and that's something other than what I expected).

Maybe per-recipe staging and package-based build-time dependencies would
make it easier. I'm glad you're also evaluating such options (as the last
TSC meeting shows).

Cheers,

-- 
Martin 'JaMa' Jansa     jabber: Martin.Jansa@gmail.com

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 205 bytes --]
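The "build only 1 extra step and then stop" behaviour discussed in this thread can be illustrated with a small model. It is a sketch under simplifying assumptions: the dependency graph is reduced to a linear chain, `tasks_to_rerun` is a hypothetical function, and whether a recipe's output really changed is simply given as input rather than detected.

```python
def tasks_to_rerun(chain, changed, output_changed, track_outputs):
    """Return the recipes that get rebuilt after a metadata change.

    chain          -- recipe names, each depending on the one before it
    changed        -- the recipe whose metadata (e.g. PR) changed
    output_changed -- set of recipes whose output differs after a rebuild
    track_outputs  -- False: input-only signatures, everything downstream
                      is rebuilt; True: stop after the first rebuilt recipe
                      whose output turns out to be unchanged
    """
    rerun = []
    for recipe in chain[chain.index(changed):]:
        # This recipe must be rebuilt because something upstream changed.
        rerun.append(recipe)
        if track_outputs and recipe not in output_changed:
            # Its output matches the old one, so nothing further downstream
            # needs to be rebuilt.
            break
    return rerun


chain = ["libxml2-native", "neon-native", "subversion", "gcc"]
input_only = tasks_to_rerun(chain, "libxml2-native", {"libxml2-native"}, False)
with_outputs = tasks_to_rerun(chain, "libxml2-native", {"libxml2-native"}, True)
```

With input-only signatures the whole chain is rebuilt. With output comparison, libxml2-native and neon-native are rebuilt and the cascade stops there, but only after neon-native has actually been built, which is the "can't calculate in advance" cost raised earlier in the thread.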
end of thread, other threads:[~2011-11-09 14:54 UTC | newest]

Thread overview: 7+ messages
2011-11-08 14:37 BB_SIGNATURE_HANDLER = "basichash" unusable strict? Martin Jansa
2011-11-09 10:32 ` Richard Purdie
2011-11-09 11:51   ` Martin Jansa
2011-11-09 12:06     ` Richard Purdie
2011-11-09 12:45       ` Martin Jansa
2011-11-09 14:13         ` Richard Purdie
2011-11-09 14:48           ` Martin Jansa