* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-07 19:03 Sean McKay
@ 2020-04-07 19:11 ` Alexander Kanavin
2020-04-07 20:24 ` Sean McKay
0 siblings, 1 reply; 8+ messages in thread
From: Alexander Kanavin @ 2020-04-07 19:11 UTC (permalink / raw)
To: Sean McKay; +Cc: yocto@lists.yoctoproject.org
[-- Attachment #1: Type: text/plain, Size: 1231 bytes --]
On Tue, 7 Apr 2020 at 21:03, Sean McKay <sean.mckay@hpe.com> wrote:
> If you’re interested, this is quite easy to reproduce – these are my
> repro steps
>
> - Check out a clean copy of zeus (22.0.2)
> - Add kernel-image to core-image-minimal in whatever fashion you
> choose (I just dumped it in the RDEPENDS for packagegroup-core-boot for
> testing)
> - bitbake core-image-minimal
> - bitbake -c clean core-image-minimal linux-yocto (or just wipe your
> whole build dir, since everything should come from sstate now)
> - Delete the sstate object(s) for linux-yocto’s deploy task.
> - bitbake core-image-minimal
> - Compare the BuildID hashes for the kernel in the two locations using
> file (you’ll need to use the kernel’s extract-vmlinux script to get it out
> of the bzImage)
> - file
> tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs/boot/vmlinux-5.2.28-yocto-standard
> - ./tmp/work-shared/qemux86-64/kernel-source/scripts/extract-vmlinux
> tmp/deploy/images/qemux86-64/bzImage > vmlinux-deploy && file vmlinux-deploy
>
The kernel is still re-built from the same source, so why is this causing issues?
Alex
[-- Attachment #2: Type: text/html, Size: 2457 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-07 19:11 ` [yocto] " Alexander Kanavin
@ 2020-04-07 20:24 ` Sean McKay
0 siblings, 0 replies; 8+ messages in thread
From: Sean McKay @ 2020-04-07 20:24 UTC (permalink / raw)
To: Alexander Kanavin; +Cc: yocto@lists.yoctoproject.org
[-- Attachment #1: Type: text/plain, Size: 2272 bytes --]
The kernel doesn’t have reproducible builds by default because of a handful of variables (including timestamps). If you rebuild, you get a different binary every time.
Those can be changed and overridden so that you get consistent binary output (see https://www.kernel.org/doc/html/latest/kbuild/reproducible-builds.html), but that hasn’t been done for the linux-yocto recipe yet.
I assume that it’s planned, because there are efforts going into making things 100% reproducible for poky, but I don’t know what the status of everything is (although I know that the kernel, at least, isn’t reproducible as of 22.0.2).
Since this can theoretically happen to any package that doesn’t have a reproducible build process, it seemed worth asking the question globally too.
-Sean
[-- Attachment #2: Type: text/html, Size: 8154 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: sstate causing stripped kernel vs symbols mismatch
[not found] <16039EE1565C42EA.9427@lists.yoctoproject.org>
@ 2020-04-09 16:42 ` Sean McKay
2020-04-09 17:00 ` [yocto] " Joshua Watt
0 siblings, 1 reply; 8+ messages in thread
From: Sean McKay @ 2020-04-09 16:42 UTC (permalink / raw)
To: McKay, Sean, yocto@lists.yoctoproject.org
[-- Attachment #1: Type: text/plain, Size: 4477 bytes --]
Anyone have any thoughts or guidance on this?
It seems like a pretty major bug to me.
We're willing to put the work in to fix it, and if it's not something the upstream community is interested in, I'll just pick a solution for us and go with it.
But if it's something that we'd like me to upstream, I'd like some feedback on which path I should start walking down before I start taking things apart.
Cheers!
-Sean
From: yocto@lists.yoctoproject.org <yocto@lists.yoctoproject.org> On Behalf Of Sean McKay
Sent: Tuesday, April 7, 2020 12:03 PM
To: yocto@lists.yoctoproject.org
Subject: [yocto] sstate causing stripped kernel vs symbols mismatch
Hi all,
We've discovered that (quite frequently) the kernel that we deploy doesn't match the unstripped one that we're saving for debug symbols. I've traced the issue to a combination of an sstate miss for the kernel do_deploy step combined with an sstate hit for do_package_write_rpm. (side note: we know we have issues with sstate reuse/stamps including things they shouldn't which is why we hit this so much. We're working on that too)
The result is that when our debug rootfs is created (where we added the kernel symbols), it's got the version of the kernel from the sstate cached rpm files, but since do_deploy had an sstate miss, the entire kernel gets rebuilt to satisfy that dependency chain. Since the kernel doesn't have reproducible builds working, the resulting pair of kernels don't match each other for debug purposes.
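A toy model of the sequence described above may make the failure mode easier to see. It is only a sketch: `compile_kernel`, `build_image`, and the cache dictionary are hypothetical stand-ins, not BitBake or sstate APIs, and the non-reproducible compile is simulated with a counter.

```python
import itertools

# Toy model only: these names are hypothetical and are not BitBake APIs.
_build_counter = itertools.count(1)


def compile_kernel():
    """Stand-in for a non-reproducible do_compile: each run yields a new ID."""
    return f"build-{next(_build_counter)}"


def build_image(cache, signature="sig-A"):
    """Resolve both downstream tasks, restoring each from the cache if possible.

    Returns a dict mapping task name to the kernel build that task's output
    was produced from.
    """
    compiled = None  # output of do_compile for this invocation, built lazily
    outputs = {}
    for task in ("do_package_write_rpm", "do_deploy"):
        key = (task, signature)
        if key not in cache:
            # A miss forces the compile chain to run (once per invocation).
            if compiled is None:
                compiled = compile_kernel()
            cache[key] = compiled
        outputs[task] = cache[key]
    return outputs


if __name__ == "__main__":
    cache = {}
    print(build_image(cache))              # both tasks share one compile
    del cache[("do_deploy", "sig-A")]      # simulate losing only the deploy object
    print(build_image(cache))              # rpm is restored, deploy is rebuilt
```

In the second call the rpm output still refers to the first compile while the deploy output comes from a fresh one, which is the mismatch reported here.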
So, I have two questions to start:
1. What is the recommended way to get debug symbols for the kernel, since do_deploy doesn't seem to have a debug counterpart? (That is why we originally set things up to add the rpm to the generated debug rootfs.)
2. Does this seem like a bug that should be fixed? If so, what would be the recommended solution (more thoughts below)?
Even if there's a task somewhere that does what I'm looking for, this seems like a bit of a bug. I generally feel like we want to be able to trust sstate, so the fact that forking dependencies that each generate their own sstate objects can be out of sync is a bit scary.
I've thought of several ways around this, but I can't say I like any of them.
* (extremely gross hack) Create a new task to use instead of do_deploy that depends on do_package_write_rpm. Unpack the restored (or built) RPMs and use those blobs to deploy the kernel and symbols to the image directory.
* (gross hack with painful effects on build time) Disable sstate for do_package_write_rpm and do_deploy. Possibly replace with sstate logic for the kernel's do_install step (side question - why doesn't do_install generate sstate? It seems like it should be able to, since the point is to drop everything into the image directory)
* (possibly better, but sounds hard) Change the sstate logic so that if anything downstream of a do_compile task needs to be rerun, everything downstream of it needs to be rerun and sstate reuse for that recipe is not allowed (basically all or nothing sstate). Maybe with a flag that's allowed in the bitbake file to indicate that a recipe does have reproducible builds and that different pieces are allowed to come from sstate in that case.
* (fix the symptoms but not the problem) Figure out how to get linux-yocto building in a reproducible fashion and pretend the problem doesn't exist.
If you're interested, this is quite easy to reproduce - these are my repro steps
* Check out a clean copy of zeus (22.0.2)
* Add kernel-image to core-image-minimal in whatever fashion you choose (I just dumped it in the RDEPENDS for packagegroup-core-boot for testing)
* bitbake core-image-minimal
* bitbake -c clean core-image-minimal linux-yocto (or just wipe your whole build dir, since everything should come from sstate now)
* Delete the sstate object(s) for linux-yocto's deploy task.
* bitbake core-image-minimal
* Compare the BuildID hashes for the kernel in the two locations using file (you'll need to use the kernel's extract-vmlinux script to get it out of the bzImage):
    * file tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs/boot/vmlinux-5.2.28-yocto-standard
    * ./tmp/work-shared/qemux86-64/kernel-source/scripts/extract-vmlinux tmp/deploy/images/qemux86-64/bzImage > vmlinux-deploy && file vmlinux-deploy
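The final comparison step could be scripted roughly as follows. This is a sketch, assuming the `file` utility is on the PATH and prints a `BuildID[sha1]=...` field for ELF images; the function names are made up for illustration and are not part of any Yocto tooling.

```python
import re
import subprocess

# Matches the BuildID field that `file` prints for ELF binaries.
BUILD_ID_RE = re.compile(r"BuildID\[sha1\]=([0-9a-f]+)")


def extract_build_id(file_output):
    """Return the hex BuildID found in `file` output, or None if absent."""
    match = BUILD_ID_RE.search(file_output)
    return match.group(1) if match else None


def build_id_of(path):
    """Run `file` on an ELF image and return its BuildID (None if absent)."""
    result = subprocess.run(
        ["file", path], capture_output=True, text=True, check=True
    )
    return extract_build_id(result.stdout)


def kernels_match(rootfs_vmlinux, deployed_vmlinux):
    """True only if both images carry a BuildID and the two IDs are equal."""
    rootfs_id = build_id_of(rootfs_vmlinux)
    deployed_id = build_id_of(deployed_vmlinux)
    return rootfs_id is not None and rootfs_id == deployed_id
```

`kernels_match` would be called with the vmlinux from the rootfs and the `vmlinux-deploy` file produced by extract-vmlinux. Treating a missing BuildID as a mismatch is a design choice here, so that a parsing failure is not mistaken for agreement.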
Anyone have thoughts or suggestions?
Cheers!
-Sean McKay
[-- Attachment #2: Type: text/html, Size: 18352 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-09 16:42 ` sstate causing stripped kernel vs symbols mismatch Sean McKay
@ 2020-04-09 17:00 ` Joshua Watt
2020-04-09 17:21 ` Sean McKay
0 siblings, 1 reply; 8+ messages in thread
From: Joshua Watt @ 2020-04-09 17:00 UTC (permalink / raw)
To: Sean McKay, yocto@lists.yoctoproject.org
[-- Attachment #1: Type: text/plain, Size: 5225 bytes --]
On 4/9/20 11:42 AM, Sean McKay wrote:
>
> Anyone have any thoughts or guidance on this?
>
> It seems like a pretty major bug to me.
>
> We’re willing to put the work in to fix it, and if it’s not something
> the upstream community is interested in, I’ll just pick a solution for
> us and go with it.
>
> But if it’s something that we’d like me to upstream, I’d like some
> feedback on which path I should start walking down before I start
> taking things apart.
>
We have had a recent push for reproducible builds (and they are now
enabled by default). Do you have any idea how much effort it would take
to make the kernel build reproducibly? It's something we probably want
anyway, and can add to the automated testing infrastructure to ensure it
doesn't regress.
[-- Attachment #2: Type: text/html, Size: 20913 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-09 17:00 ` [yocto] " Joshua Watt
@ 2020-04-09 17:21 ` Sean McKay
2020-04-09 17:52 ` Bruce Ashfield
0 siblings, 1 reply; 8+ messages in thread
From: Sean McKay @ 2020-04-09 17:21 UTC (permalink / raw)
To: Joshua Watt, yocto@lists.yoctoproject.org
[-- Attachment #1: Type: text/plain, Size: 6192 bytes --]
I don't know offhand, but the kernel documentation seems relatively straightforward.
I can start investigating in that direction and see how complex it looks like it's going to be.
When you say that reproducible builds are turned on by default, is there a flag that turns them off, which I would need to gate these changes behind? Or can the changes be made globally, so that the reproducibility can't be turned off (easily)?
Do we expect to generally be okay with letting this sort of race condition remain in sstate? I concede that it's probably okay, since I think the kernel is the only thing with this kind of forking task tree behavior after do_compile, and if we get 100% reproducible builds working, it's not overly relevant... but it seems like it probably deserves a warning somewhere in the documentation.
I can also bring this question to the next technical meeting (I know I just missed one) if it seems like the sort of thing we need consensus on.
Cheers!
-Sean
[-- Attachment #2: Type: text/html, Size: 22029 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-09 17:21 ` Sean McKay
@ 2020-04-09 17:52 ` Bruce Ashfield
2020-04-09 18:11 ` Sean McKay
2020-04-09 19:01 ` Joshua Watt
0 siblings, 2 replies; 8+ messages in thread
From: Bruce Ashfield @ 2020-04-09 17:52 UTC (permalink / raw)
To: Sean McKay; +Cc: Joshua Watt, yocto@lists.yoctoproject.org
On Thu, Apr 9, 2020 at 1:21 PM Sean McKay <sean.mckay@hpe.com> wrote:
>
> I don’t know offhand, but the kernel documentation seems relatively straightforward.
>
> I can start investigating in that direction and see how complex it looks like it’s going to be.
>
I can tweak linux-yocto in the direction of reproducibility without
much trouble (for the build part). But I'm a bit out of my normal flow
for testing that it really is reproducible. So if anyone can point me
at what they are currently running to test that, I can do the build
part.
Bruce
--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-09 17:52 ` Bruce Ashfield
@ 2020-04-09 18:11 ` Sean McKay
2020-04-09 19:01 ` Joshua Watt
1 sibling, 0 replies; 8+ messages in thread
From: Sean McKay @ 2020-04-09 18:11 UTC (permalink / raw)
To: Bruce Ashfield; +Cc: Joshua Watt, yocto@lists.yoctoproject.org
The simplest thing I've found is comparing the BuildID that the linker embeds in the ELF file after I force it to recompile. E.g.:
$ file tmp/work/qemux86_64-poky-linux/linux-yocto/5.2.28+gitAUTOINC+dd6019025c_992280855e-r0/linux-qemux86_64-standard-build/vmlinux | egrep -o "BuildID\[sha1\]=[0-9a-f]*"
BuildID[sha1]=9b1971fb286e78364246543583ed13600a7f8111
Is that what you were asking for?
Presumably we could also hash the vmlinux file itself for comparisons at the do_compile stage, but I was originally comparing stripped vs. unstripped kernels, so I had to go by the BuildID.
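For the whole-file comparison, something like the following would probably do. It is a minimal sketch using SHA-256 from the Python standard library; the helper names are hypothetical, and it is only meaningful when both inputs are the same kind of artifact (for example two unstripped vmlinux files from two do_compile runs).

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def same_binary(path_a, path_b):
    """True if the two files are byte-for-byte identical (by digest)."""
    return file_digest(path_a) == file_digest(path_b)
```

Unlike the BuildID check, this reports a difference for a stripped and an unstripped copy of the same build, which is why it does not help for the original stripped vs. unstripped comparison.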
Side note: from the kernel documentation, it looks like there are 4 main things that could affect reproducibility:
Timestamps, build directory, user account name, and hostname.
I assume they'd be easiest to tackle sequentially in that order.
This is the documentation I've been referencing:
https://www.kernel.org/doc/html/latest/kbuild/reproducible-builds.html
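As a rough illustration of what pinning those four inputs might look like, here is a sketch that assembles the overrides named in that kernel document (`KBUILD_BUILD_TIMESTAMP`, `KBUILD_BUILD_USER`, `KBUILD_BUILD_HOST`, and `-fdebug-prefix-map` via `KCFLAGS`). The function name, the default user and host strings, and the idea of deriving the timestamp from a fixed epoch value are assumptions for illustration; none of this is taken from the linux-yocto recipe.

```python
from datetime import datetime, timezone


def reproducible_kbuild_env(source_date_epoch, build_dir,
                            user="builder", host="buildhost"):
    """Return environment overrides that pin the four inputs listed above.

    source_date_epoch: fixed build time in seconds since the Unix epoch.
    build_dir: absolute build directory to map away in debug info.
    user/host: placeholder values (hypothetical defaults).
    """
    timestamp = datetime.fromtimestamp(
        source_date_epoch, tz=timezone.utc
    ).strftime("%Y-%m-%d %H:%M:%S UTC")
    return {
        "KBUILD_BUILD_TIMESTAMP": timestamp,   # timestamps
        "KBUILD_BUILD_USER": user,             # user account name
        "KBUILD_BUILD_HOST": host,             # hostname
        # build directory: rewrite absolute paths recorded in debug info
        "KCFLAGS": f"-fdebug-prefix-map={build_dir}=.",
    }
```

The returned dictionary could be merged into the environment of the kernel make invocation. Whether these overrides alone are sufficient for a given kernel version and configuration would still need to be checked with a BuildID or hash comparison.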
-Sean
-----Original Message-----
From: Bruce Ashfield <bruce.ashfield@gmail.com>
Sent: Thursday, April 9, 2020 10:52 AM
To: McKay, Sean <sean.mckay@hpe.com>
Cc: Joshua Watt <jpewhacker@gmail.com>; yocto@lists.yoctoproject.org
Subject: Re: [yocto] sstate causing stripped kernel vs symbols mismatch
On Thu, Apr 9, 2020 at 1:21 PM Sean McKay <sean.mckay@hpe.com> wrote:
>
> I don’t know offhand, but the kernel documentation seems relatively straightforward.
>
> I can start investigating in that direction and see how complex it looks like it’s going to be.
>
I can tweak linux-yocto in the direction of reproducibility without much trouble (for the build part). But I'm a bit out of my normal flow for testing that it really is reproducible. So if anyone can point me at what they are running to currently test that .. I can do the build part.
Bruce
>
>
> When you say that reproducible builds are turned on by default, is there a flag somewhere that can be used to turn that off that I need to gate these changes behind? Or can they be made globally so that the reproducibility can’t be turned off (easily)?
>
>
>
>
>
> Do we expect to generally be okay with letting this sort of race condition remain in sstate? I concede that it’s probably okay, since I think the kernel is the only thing with this kind of forking task tree behavior after do_compile, and if we get 100% reproducible builds working, it’s not overly relevant… but it seems like it probably deserves a warning somewhere in the documentation.
>
>
>
> I can also bring this question to the next technical meeting (I know I just missed one) if it seems the sort of thing we need to get consensus.
>
>
>
> Cheers!
>
> -Sean
>
>
>
>
>
>
>
> From: Joshua Watt <jpewhacker@gmail.com>
> Sent: Thursday, April 9, 2020 10:00 AM
> To: McKay, Sean <sean.mckay@hpe.com>; yocto@lists.yoctoproject.org
> Subject: Re: [yocto] sstate causing stripped kernel vs symbols
> mismatch
>
>
>
>
>
> On 4/9/20 11:42 AM, Sean McKay wrote:
>
> Anyone have any thoughts or guidance on this?
>
> It seems like a pretty major bug to me.
>
>
>
> We’re willing to put the work in to fix it, and if it’s not something the upstream community is interested in, I’ll just pick a solution for us and go with it.
>
> But if it’s something that we’d like me to upstream, I’d like some feedback on which path I should start walking down before I start taking things apart.
>
>
>
> We have had a recent push for reproducible builds (and they are now enabled by default). Do you have any idea how much effort it would take to make the kernel build reproducibly? It's something we probably want anyway, and can add to the automated testing infrastructure to ensure it doesn't regress.
>
> Cheers!
>
> -Sean
>
>
>
> From: yocto@lists.yoctoproject.org <yocto@lists.yoctoproject.org> On
> Behalf Of Sean McKay
> Sent: Tuesday, April 7, 2020 12:03 PM
> To: yocto@lists.yoctoproject.org
> Subject: [yocto] sstate causing stripped kernel vs symbols mismatch
>
>
>
> Hi all,
>
>
>
> We’ve discovered that (quite frequently) the kernel that we deploy
> doesn’t match the unstripped one that we’re saving for debug symbols.
> I’ve traced the issue to a combination of an sstate miss for the
> kernel do_deploy step combined with an sstate hit for
> do_package_write_rpm. (side note: we know we have issues with sstate
> reuse/stamps including things they shouldn’t which is why we hit this
> so much. We’re working on that too)
>
>
>
> The result is that when our debug rootfs is created (where we added the kernel symbols), it’s got the version of the kernel from the sstate cached rpm files, but since do_deploy had an sstate miss, the entire kernel gets rebuilt to satisfy that dependency chain. Since the kernel doesn’t have reproducible builds working, the resulting pair of kernels don’t match each other for debug purposes.
>
>
>
> So, I have two questions to start:
>
> - What is the recommended way to be getting debug symbols for the kernel, since do_deploy doesn’t seem to have a debug counterpart (which is why we originally just set things up to add the rpm to the generated debug rootfs)
> - Does this seem like a bug that should be fixed? If so, what would be the recommended solution (more thoughts below)?
>
>
>
> Even if there’s a task somewhere that does what I’m looking for, this seems like a bit of a bug. I generally feel like we want to be able to trust sstate, so the fact that forking dependencies that each generate their own sstate objects can be out of sync is a bit scary.
>
> I’ve thought of several ways around this, but I can’t say I like any of them.
>
> - (extremely gross hack) Create a new task to use instead of do_deploy that depends on do_package_write_rpm. Unpack the restored (or built) RPMs and use those blobs to deploy the kernel and symbols to the image directory.
> - (gross hack with painful effects on build time) Disable sstate for do_package_write_rpm and do_deploy. Possibly replace with sstate logic for the kernel’s do_install step (side question – why doesn’t do_install generate sstate? It seems like it should be able to, since the point is to drop everything into the image directory)
> - (possibly better, but sounds hard) Change the sstate logic so that if anything downstream of a do_compile task needs to be rerun, everything downstream of it needs to be rerun and sstate reuse for that recipe is not allowed (basically all or nothing sstate). Maybe with a flag that’s allowed in the bitbake file to indicate that a recipe does have reproducible builds and that different pieces are allowed to come from sstate in that case.
> - (fix the symptoms but not the problem) Figure out how to get linux-yocto building in a reproducible fashion and pretend the problem doesn’t exist.
>
>
>
>
>
> If you’re interested, this is quite easy to reproduce – these are my
> repro steps
>
> - Check out a clean copy of zeus (22.0.2)
> - Add kernel-image to core-image-minimal in whatever fashion you choose (I just dumped it in the RDEPENDS for packagegroup-core-boot for testing)
> - bitbake core-image-minimal
> - bitbake -c clean core-image-minimal linux-yocto (or just wipe your whole build dir, since everything should come from sstate now)
> - Delete the sstate object(s) for linux-yocto’s deploy task.
> - bitbake core-image-minimal
> - Compare the BuildID hashes for the kernel in the two locations using file (you’ll need to use the kernel’s extract-vmlinux script to get it out of the bzImage)
>   - file tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs/boot/vmlinux-5.2.28-yocto-standard
>   - ./tmp/work-shared/qemux86-64/kernel-source/scripts/extract-vmlinux tmp/deploy/images/qemux86-64/bzImage > vmlinux-deploy && file vmlinux-deploy
>
>
>
> Anyone have thoughts or suggestions?
>
>
>
> Cheers!
>
> -Sean McKay
>
>
>
>
--
- Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end
- "Use the force Harry" - Gandalf, Star Trek II
* Re: [yocto] sstate causing stripped kernel vs symbols mismatch
2020-04-09 17:52 ` Bruce Ashfield
2020-04-09 18:11 ` Sean McKay
@ 2020-04-09 19:01 ` Joshua Watt
1 sibling, 0 replies; 8+ messages in thread
From: Joshua Watt @ 2020-04-09 19:01 UTC (permalink / raw)
To: Bruce Ashfield; +Cc: Sean McKay, yocto
On Thu, Apr 9, 2020, 12:52 PM Bruce Ashfield <bruce.ashfield@gmail.com>
wrote:
> On Thu, Apr 9, 2020 at 1:21 PM Sean McKay <sean.mckay@hpe.com> wrote:
> >
> > I don’t know offhand, but the kernel documentation seems relatively
> straightforward.
> >
> > I can start investigating in that direction and see how complex it looks
> like it’s going to be.
> >
>
> I can tweak linux-yocto in the direction of reproducibility without
> much trouble (for the build part). But I'm a bit out of my normal flow
> for testing that it really is reproducible. So if anyone can point me
> at what they are running to currently test that .. I can do the build
> part.
>
Reproducible builds are part of the standard OE QA tests. You can run them
with:

    oe-selftest -r reproducible

It currently tests core-image-sato, which I thought would cover the kernel,
so I'm a little surprised it doesn't. Anyway, you can easily modify the
reproducible.py test file to build whatever you want, since doing the full
core-image-sato build can be pretty slow.
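
For a quicker spot check than the full selftest, the BuildID comparison from Sean's repro steps can be scripted. This is only a minimal sketch and not part of oe-selftest: it assumes the `file` utility is on PATH and prints a `BuildID[...]=<hex>` field for ELF binaries, and the helper names (parse_build_id, build_id_of, same_build) are made up for illustration.

```python
import re
import subprocess
from typing import Optional

# Matches the build-id field that `file` prints for ELF binaries,
# e.g. "BuildID[sha1]=0123abcd...". The hash type inside the brackets
# is not assumed to be sha1.
BUILD_ID_RE = re.compile(r"BuildID\[[^\]]+\]=([0-9a-fA-F]+)")


def parse_build_id(file_output: str) -> Optional[str]:
    """Return the hex build-id found in `file` output, or None if absent."""
    match = BUILD_ID_RE.search(file_output)
    return match.group(1).lower() if match else None


def build_id_of(path: str) -> Optional[str]:
    """Run `file` on path and extract its build-id (None if it has none)."""
    result = subprocess.run(
        ["file", path], capture_output=True, text=True, check=True
    )
    return parse_build_id(result.stdout)


def same_build(path_a: str, path_b: str) -> bool:
    """True only when both files carry a build-id and the ids are equal."""
    id_a = build_id_of(path_a)
    id_b = build_id_of(path_b)
    return id_a is not None and id_a == id_b
```

Calling same_build() on the vmlinux from the rootfs and on the vmlinux-deploy file produced by extract-vmlinux returns True only when both files carry a build-id and the two ids match, which is the comparison described in the repro steps.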
> Bruce
>
> >
> >
> > When you say that reproducible builds are turned on by default, is there
> a flag somewhere that can be used to turn that off that I need to gate
> these changes behind? Or can they be made globally so that the
> reproducibility can’t be turned off (easily)?
> >
> >
> >
> >
> >
> > Do we expect to generally be okay with letting this sort of race
> condition remain in sstate? I concede that it’s probably okay, since I
> think the kernel is the only thing with this kind of forking task tree
> behavior after do_compile, and if we get 100% reproducible builds working,
> it’s not overly relevant… but it seems like it probably deserves a warning
> somewhere in the documentation.
> >
> >
> >
> > I can also bring this question to the next technical meeting (I know I
> just missed one) if it seems the sort of thing we need to get consensus.
> >
> >
> >
> > Cheers!
> >
> > -Sean
> >
> >
> >
> >
> >
> >
> >
> > From: Joshua Watt <jpewhacker@gmail.com>
> > Sent: Thursday, April 9, 2020 10:00 AM
> > To: McKay, Sean <sean.mckay@hpe.com>; yocto@lists.yoctoproject.org
> > Subject: Re: [yocto] sstate causing stripped kernel vs symbols mismatch
> >
> >
> >
> >
> >
> > On 4/9/20 11:42 AM, Sean McKay wrote:
> >
> > Anyone have any thoughts or guidance on this?
> >
> > It seems like a pretty major bug to me.
> >
> >
> >
> > We’re willing to put the work in to fix it, and if it’s not something
> the upstream community is interested in, I’ll just pick a solution for us
> and go with it.
> >
> > But if it’s something that we’d like me to upstream, I’d like some
> feedback on which path I should start walking down before I start taking
> things apart.
> >
> >
> >
> > We have had a recent push for reproducible builds (and they are now
> enabled by default). Do you have any idea how much effort it would take to
> make the kernel build reproducibly? It's something we probably want anyway,
> and can add to the automated testing infrastructure to ensure it doesn't
> regress.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > Cheers!
> >
> > -Sean
> >
> >
> >
> > From: yocto@lists.yoctoproject.org <yocto@lists.yoctoproject.org> On
> Behalf Of Sean McKay
> > Sent: Tuesday, April 7, 2020 12:03 PM
> > To: yocto@lists.yoctoproject.org
> > Subject: [yocto] sstate causing stripped kernel vs symbols mismatch
> >
> >
> >
> > Hi all,
> >
> >
> >
> > We’ve discovered that (quite frequently) the kernel that we deploy
> doesn’t match the unstripped one that we’re saving for debug symbols. I’ve
> traced the issue to a combination of an sstate miss for the kernel
> do_deploy step combined with an sstate hit for do_package_write_rpm. (side
> note: we know we have issues with sstate reuse/stamps including things they
> shouldn’t which is why we hit this so much. We’re working on that too)
> >
> >
> >
> > The result is that when our debug rootfs is created (where we added the
> kernel symbols), it’s got the version of the kernel from the sstate cached
> rpm files, but since do_deploy had an sstate miss, the entire kernel gets
> rebuilt to satisfy that dependency chain. Since the kernel doesn’t have
> reproducible builds working, the resulting pair of kernels don’t match each
> other for debug purposes.
> >
> >
> >
> > So, I have two questions to start:
> >
> > What is the recommended way to be getting debug symbols for the kernel,
> since do_deploy doesn’t seem to have a debug counterpart (which is why we
> originally just set things up to add the rpm to the generated debug rootfs)
> > Does this seem like a bug that should be fixed? If so, what would be the
> recommended solution (more thoughts below)?
> >
> >
> >
> > Even if there’s a task somewhere that does what I’m looking for, this
> seems like a bit of a bug. I generally feel like we want to be able to
> trust sstate, so the fact that forking dependencies that each generate
> their own sstate objects can be out of sync is a bit scary.
> >
> > I’ve thought of several ways around this, but I can’t say I like any of
> them.
> >
> > (extremely gross hack) Create a new task to use instead of do_deploy
> that depends on do_package_write_rpm. Unpack the restored (or built)
> RPMs and use those blobs to deploy the kernel and symbols to the image
> directory.
> > (gross hack with painful effects on build time) Disable sstate for
> do_package_write_rpm and do_deploy. Possibly replace with sstate logic for
> the kernel’s do_install step (side question – why doesn’t do_install
> generate sstate? It seems like it should be able to, since the point is to
> drop everything into the image directory)
> > (possibly better, but sounds hard) Change the sstate logic so that if
> anything downstream of a do_compile task needs to be rerun, everything
> downstream of it needs to be rerun and sstate reuse for that recipe is not
> allowed (basically all or nothing sstate). Maybe with a flag that’s allowed
> in the bitbake file to indicate that a recipe does have reproducible builds
> and that different pieces are allowed to come from sstate in that case.
> > (fix the symptoms but not the problem) Figure out how to get linux-yocto
> building in a reproducible fashion and pretend the problem doesn’t exist.
> >
> >
> >
> >
> >
> > If you’re interested, this is quite easy to reproduce – these are my
> repro steps
> >
> > Check out a clean copy of zeus (22.0.2)
> > Add kernel-image to core-image-minimal in whatever fashion you choose (I
> just dumped it in the RDEPENDS for packagegroup-core-boot for testing)
> > bitbake core-image-minimal
> > bitbake -c clean core-image-minimal linux-yocto (or just wipe your whole
> build dir, since everything should come from sstate now)
> > Delete the sstate object(s) for linux-yocto’s deploy task.
> > bitbake core-image-minimal
> > Compare the BuildID hashes for the kernel in the two locations using
> file (you’ll need to use the kernel’s extract-vmlinux script to get it out
> of the bzImage)
> >
> > file
> tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs/boot/vmlinux-5.2.28-yocto-standard
> > ./tmp/work-shared/qemux86-64/kernel-source/scripts/extract-vmlinux
> tmp/deploy/images/qemux86-64/bzImage > vmlinux-deploy && file vmlinux-deploy
> >
> >
> >
> > Anyone have thoughts or suggestions?
> >
> >
> >
> > Cheers!
> >
> > -Sean McKay
> >
> >
> >
> >
>
>
>
> --
> - Thou shalt not follow the NULL pointer, for chaos and madness await
> thee at its end
> - "Use the force Harry" - Gandalf, Star Trek II
>
Thread overview: 8+ messages
[not found] <16039EE1565C42EA.9427@lists.yoctoproject.org>
2020-04-09 16:42 ` sstate causing stripped kernel vs symbols mismatch Sean McKay
2020-04-09 17:00 ` [yocto] " Joshua Watt
2020-04-09 17:21 ` Sean McKay
2020-04-09 17:52 ` Bruce Ashfield
2020-04-09 18:11 ` Sean McKay
2020-04-09 19:01 ` Joshua Watt
2020-04-07 19:03 Sean McKay
2020-04-07 19:11 ` [yocto] " Alexander Kanavin
2020-04-07 20:24 ` Sean McKay