* [PATCH 0/2 RFC] workaround for debugedit segv
@ 2013-03-25 17:19 Mark Hatle
2013-03-25 17:19 ` [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv Mark Hatle
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Mark Hatle @ 2013-03-25 17:19 UTC (permalink / raw)
To: openembedded-core
[ YOCTO #4089 ]
This is an attempt at a workaround for the debugedit segv.
The problem is that during the creation of the build-id hash, the
system iterates over the various ELF sections passing a point to the
loaded data and size of each section to a hash function.
When it gets to the hash for the .plt and .bss section, the size is
non-zero, while the data is point to 0. This immediately causes a
segfault on access of data at address 0.
See the bug and patch comments for more diagnostic information.
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 17:19 [PATCH 0/2 RFC] workaround for debugedit segv Mark Hatle @ 2013-03-25 17:19 ` Mark Hatle 2013-03-25 17:02 ` Phil Blundell 2013-03-25 17:19 ` [PATCH 2/2 RFC] package.bbclass: Trigger a bb.error if split/strip fails Mark Hatle 2013-03-25 17:47 ` [PATCH 0/2 RFC] workaround for debugedit segv Richard Purdie 2 siblings, 1 reply; 11+ messages in thread From: Mark Hatle @ 2013-03-25 17:19 UTC (permalink / raw) To: openembedded-core [ YOCTO #4089 ] On PPC and MIPS, there appears to be a condition that causes debugedit to segfault. The segfault is related to a call into the md5hash algorithm, an address of '0', and a size > 0 is passed causing the access of the address to segv. This workaround may prove to be the final fix, but it's currently unclear what the actual cause of the 0 address is. Signed-off-by: Mark Hatle <mark.hatle@windriver.com> --- meta/recipes-devtools/rpm/rpm/debugedit-segv.patch | 35 ++++++++++++++++++++++ meta/recipes-devtools/rpm/rpm_5.4.9.bb | 3 +- 2 files changed, 37 insertions(+), 1 deletion(-) create mode 100644 meta/recipes-devtools/rpm/rpm/debugedit-segv.patch diff --git a/meta/recipes-devtools/rpm/rpm/debugedit-segv.patch b/meta/recipes-devtools/rpm/rpm/debugedit-segv.patch new file mode 100644 index 0000000..bd91693 --- /dev/null +++ b/meta/recipes-devtools/rpm/rpm/debugedit-segv.patch @@ -0,0 +1,35 @@ +There are cases, especially on PPC and MIPS, where the data address +returned is 0, but the size is not 0. + +It appears to happen when the sections headers are similar to: + + [21] .data PROGBITS 000239c0 0139c0 000010 00 WA 0 0 8 + [22] .got PROGBITS 000239d0 0139d0 000014 04 WAX 0 0 4 + [23] .plt NOBITS 000239e4 0139e4 000234 00 WAX 0 0 4 + [24] .bss NOBITS 00023c18 0139e4 0001c8 00 WA 0 0 8 + [25] .comment PROGBITS 00000000 0139e4 000011 01 MS 0 0 1 + [26] .debug_aranges PROGBITS 00000000 0139f8 000d68 00 0 0 8 + +Sections 23 and 24 (.plt and .bss) which are NOBITS have a loaded data address +of 0, but a size != 0. + +This could be a bug in libelf... + +Upstream-status: Pending + +Signed-off-by: Mark Hatle <mark.hatle@windriver.com> + +Index: rpm-5.4.9/tools/debugedit.c +=================================================================== +--- rpm-5.4.9.orig/tools/debugedit.c ++++ rpm-5.4.9/tools/debugedit.c +@@ -1434,7 +1434,8 @@ handle_build_id (DSO *dso, Elf_Data *bui + auto inline void process (const void *data, size_t size) + { + memchunk chunk = { .data = (void *) data, .size = size }; +- hashFunctionContextUpdateMC (&ctx, &chunk); ++ if (data != NULL && size != 0) ++ hashFunctionContextUpdateMC (&ctx, &chunk); + } + union + { diff --git a/meta/recipes-devtools/rpm/rpm_5.4.9.bb b/meta/recipes-devtools/rpm/rpm_5.4.9.bb index ba24111..e9c8f23 100644 --- a/meta/recipes-devtools/rpm/rpm_5.4.9.bb +++ b/meta/recipes-devtools/rpm/rpm_5.4.9.bb @@ -84,7 +84,8 @@ SRC_URI = "http://www.rpm5.org/files/rpm/rpm-5.4/rpm-5.4.9-0.20120508.src.rpm;ex file://python-rpm-rpmsense.patch \ file://rpm-reloc-macros.patch \ file://rpm-platform2.patch \ - file://rpm-remove-sykcparse-decl.patch \ + file://rpm-remove-sykcparse-decl.patch \ + file://debugedit-segv.patch \ " # Uncomment the following line to enable platform score debugging -- 1.8.1.2.545.g2f19ada ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 17:19 ` [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv Mark Hatle @ 2013-03-25 17:02 ` Phil Blundell 2013-03-25 17:10 ` Mark Hatle 0 siblings, 1 reply; 11+ messages in thread From: Phil Blundell @ 2013-03-25 17:02 UTC (permalink / raw) To: Mark Hatle; +Cc: openembedded-core On Mon, 2013-03-25 at 12:19 -0500, Mark Hatle wrote: > +Sections 23 and 24 (.plt and .bss) which are NOBITS have a loaded data address > +of 0, but a size != 0. That doesn't seem like totally unreasonable behaviour for a NOBITS section. What were you expecting libelf to do in that case? ++ if (data != NULL && size != 0) ++ hashFunctionContextUpdateMC (&ctx, &chunk); I suppose one could argue that allocating a chunk of zero-filled memory of the right size and then hashing that would be a slightly better fix. Whether it matters in practice or not would depend on what exactly is going into this hash and what it's being used for. p. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 17:02 ` Phil Blundell @ 2013-03-25 17:10 ` Mark Hatle 2013-03-25 17:45 ` Phil Blundell 0 siblings, 1 reply; 11+ messages in thread From: Mark Hatle @ 2013-03-25 17:10 UTC (permalink / raw) To: Phil Blundell; +Cc: openembedded-core On 3/25/13 12:02 PM, Phil Blundell wrote: > On Mon, 2013-03-25 at 12:19 -0500, Mark Hatle wrote: >> +Sections 23 and 24 (.plt and .bss) which are NOBITS have a loaded data address >> +of 0, but a size != 0. > > That doesn't seem like totally unreasonable behaviour for a NOBITS > section. What were you expecting libelf to do in that case? > > ++ if (data != NULL && size != 0) > ++ hashFunctionContextUpdateMC (&ctx, &chunk); > > I suppose one could argue that allocating a chunk of zero-filled memory > of the right size and then hashing that would be a slightly better fix. > Whether it matters in practice or not would depend on what exactly is > going into this hash and what it's being used for. It appears in the past it either didn't load the section at all, or the size was set to 0. It's a combination of the data pointer set to NULL and the size != 0 that is causing the segfault. This doesn't appear to happen outside of PPC and MIPS. I'm going to look into identifying if the section is a NOBITS and skipping the whole operation if it is. --Mark ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 17:10 ` Mark Hatle @ 2013-03-25 17:45 ` Phil Blundell 2013-03-25 19:32 ` Mark Hatle 0 siblings, 1 reply; 11+ messages in thread From: Phil Blundell @ 2013-03-25 17:45 UTC (permalink / raw) To: Mark Hatle; +Cc: openembedded-core On Mon, 2013-03-25 at 12:10 -0500, Mark Hatle wrote: > I'm going to look into identifying if the section is a NOBITS and skipping the > whole operation if it is. That would mean that a change in the size of the .bss wouldn't have any impact on the hash. Maybe that's fine for your application though, I dunno. p. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 17:45 ` Phil Blundell @ 2013-03-25 19:32 ` Mark Hatle 2013-03-25 21:47 ` Mark Hatle 0 siblings, 1 reply; 11+ messages in thread From: Mark Hatle @ 2013-03-25 19:32 UTC (permalink / raw) To: Phil Blundell; +Cc: openembedded-core On 3/25/13 12:45 PM, Phil Blundell wrote: > On Mon, 2013-03-25 at 12:10 -0500, Mark Hatle wrote: >> I'm going to look into identifying if the section is a NOBITS and skipping the >> whole operation if it is. > > That would mean that a change in the size of the .bss wouldn't have any > impact on the hash. Maybe that's fine for your application though, I > dunno. I'm not completely familiar with how the buildid is calculated, other then an md5 hash over the sections themselves. I can't think of a case where the contents of the sections wouldn't change along with the .bss and .plt size. (buildid is supposed to be used by gdb to find/verify that the debuginfo matches the binary actually being debugged...) Through my google search, I never found a full spec of what the buildid was supposed to contain, only a high-level description from an old Fedora 8 work item. (Note, as long the buildid is consistent it really doesn't appear to matter how it was generated, just that both the app and debuginfo contain it.) --Mark > p. > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 19:32 ` Mark Hatle @ 2013-03-25 21:47 ` Mark Hatle 2013-03-26 10:38 ` Phil Blundell 0 siblings, 1 reply; 11+ messages in thread From: Mark Hatle @ 2013-03-25 21:47 UTC (permalink / raw) To: openembedded-core On 3/25/13 2:32 PM, Mark Hatle wrote: > On 3/25/13 12:45 PM, Phil Blundell wrote: >> On Mon, 2013-03-25 at 12:10 -0500, Mark Hatle wrote: >>> I'm going to look into identifying if the section is a NOBITS and skipping the >>> whole operation if it is. >> >> That would mean that a change in the size of the .bss wouldn't have any >> impact on the hash. Maybe that's fine for your application though, I >> dunno. > > I'm not completely familiar with how the buildid is calculated, other then an > md5 hash over the sections themselves. I can't think of a case where the > contents of the sections wouldn't change along with the .bss and .plt size. > > (buildid is supposed to be used by gdb to find/verify that the debuginfo matches > the binary actually being debugged...) > > Through my google search, I never found a full spec of what the buildid was > supposed to contain, only a high-level description from an old Fedora 8 work > item. (Note, as long the buildid is consistent it really doesn't appear to > matter how it was generated, just that both the app and debuginfo contain it.) I've looked at the code some more. It does checksum the header itself and then if it's got contents, it also adds the contents to the checksum.. That is where the failure appears to be happening: if (u.shdr.sh_type != SHT_NOBITS) { Elf_Data *d = elf_rawdata (dso->scn[i], NULL); if (d == NULL) goto bad; process (d->d_buf, d->d_size); } So it's specifically checking for SHT_NOBITS, but it's matching so it falls through and d->d_buf == 0, causing the failure. I'll keep investigating, but somehow that value (u.shdr.sh_type) is wrong [or at least unexpected!]. --Mark > --Mark > > >> p. >> >> > > > _______________________________________________ > Openembedded-core mailing list > Openembedded-core@lists.openembedded.org > http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/openembedded-core > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-25 21:47 ` Mark Hatle @ 2013-03-26 10:38 ` Phil Blundell 2013-03-26 12:20 ` Mark Hatle 0 siblings, 1 reply; 11+ messages in thread From: Phil Blundell @ 2013-03-26 10:38 UTC (permalink / raw) To: Mark Hatle; +Cc: openembedded-core On Mon, 2013-03-25 at 16:47 -0500, Mark Hatle wrote: > I've looked at the code some more. It does checksum the header itself and then > if it's got contents, it also adds the contents to the checksum.. That is where > the failure appears to be happening: > > if (u.shdr.sh_type != SHT_NOBITS) > { > Elf_Data *d = elf_rawdata (dso->scn[i], NULL); > if (d == NULL) > goto bad; > process (d->d_buf, d->d_size); > } > > So it's specifically checking for SHT_NOBITS, but it's matching so it falls > through and d->d_buf == 0, causing the failure. I'll keep investigating, but > somehow that value (u.shdr.sh_type) is wrong [or at least unexpected!]. Ah. If a NOBITS section is getting reported as something else then it seems that this must clearly be a bug in libelf and ought to be fixed there rather than working around it in rpm. What do you actually get as u.shdr.sh_type? p. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv 2013-03-26 10:38 ` Phil Blundell @ 2013-03-26 12:20 ` Mark Hatle 0 siblings, 0 replies; 11+ messages in thread From: Mark Hatle @ 2013-03-26 12:20 UTC (permalink / raw) To: Phil Blundell; +Cc: openembedded-core On 3/26/13 5:38 AM, Phil Blundell wrote: > On Mon, 2013-03-25 at 16:47 -0500, Mark Hatle wrote: >> I've looked at the code some more. It does checksum the header itself and then >> if it's got contents, it also adds the contents to the checksum.. That is where >> the failure appears to be happening: >> >> if (u.shdr.sh_type != SHT_NOBITS) >> { >> Elf_Data *d = elf_rawdata (dso->scn[i], NULL); >> if (d == NULL) >> goto bad; >> process (d->d_buf, d->d_size); >> } >> >> So it's specifically checking for SHT_NOBITS, but it's matching so it falls >> through and d->d_buf == 0, causing the failure. I'll keep investigating, but >> somehow that value (u.shdr.sh_type) is wrong [or at least unexpected!]. > > Ah. If a NOBITS section is getting reported as something else then it > seems that this must clearly be a bug in libelf and ought to be fixed > there rather than working around it in rpm. What do you actually get as > u.shdr.sh_type? I posted an updated patch last night. debugedit was re-translating (byte swapping) the header elements during the buildid calculations. This was causing the value of sh_type to be in the ELF binaries native endian during the check. So big endian binaries failed to match the SHT_NOBITS, thus the problem being discovered on PPC and MIPS. (Fix was to not translate the original values, but instead only translate a copy. I verified the produced buildid was expected, as I now have a better understanding of how the buildid is generated.) --Mark > p. > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2 RFC] package.bbclass: Trigger a bb.error if split/strip fails 2013-03-25 17:19 [PATCH 0/2 RFC] workaround for debugedit segv Mark Hatle 2013-03-25 17:19 ` [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv Mark Hatle @ 2013-03-25 17:19 ` Mark Hatle 2013-03-25 17:47 ` [PATCH 0/2 RFC] workaround for debugedit segv Richard Purdie 2 siblings, 0 replies; 11+ messages in thread From: Mark Hatle @ 2013-03-25 17:19 UTC (permalink / raw) To: openembedded-core [ YOCTO #4089 ] If debugedit, objcopy calls fail, capture the error and return a failure to the user. Signed-off-by: Mark Hatle <mark.hatle@windriver.com> --- meta/classes/package.bbclass | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/meta/classes/package.bbclass b/meta/classes/package.bbclass index 0702c92..3752bdc 100644 --- a/meta/classes/package.bbclass +++ b/meta/classes/package.bbclass @@ -254,14 +254,26 @@ def splitdebuginfo(file, debugfile, debugsrcdir, d): # We need to extract the debug src information here... if debugsrcdir: - subprocess.call("'%s' -b '%s' -d '%s' -i -l '%s' '%s'" % (debugedit, workparentdir, debugsrcdir, sourcefile, file), shell=True) + cmd = "'%s' -b '%s' -d '%s' -i -l '%s' '%s'" % (debugedit, workparentdir, debugsrcdir, sourcefile, file) + ret = subprocess.call(cmd, shell=True) + if ret != 0: + bb.error("debugedit failed: %s - %d" % (cmd, ret)) + return ret bb.utils.mkdirhier(os.path.dirname(debugfile)) - subprocess.call("'%s' --only-keep-debug '%s' '%s'" % (objcopy, file, debugfile), shell=True) + cmd = "'%s' --only-keep-debug '%s' '%s'" % (objcopy, file, debugfile) + ret = subprocess.call(cmd, shell=True) + if ret != 0: + bb.error("debugedit failed: %s - %d" % (cmd, ret)) + return ret # Set the debuglink to have the view of the file path on the target - subprocess.call("'%s' --add-gnu-debuglink='%s' '%s'" % (objcopy, debugfile, file), shell=True) + cmd = "'%s' --add-gnu-debuglink='%s' '%s'" % (objcopy, debugfile, file) + ret = subprocess.call(cmd, shell=True) + if ret != 0: + bb.error("debugedit failed: %s - %d" % (cmd, ret)) + return ret if newmode: os.chmod(file, origmode) @@ -806,7 +818,9 @@ python split_and_strip_files () { bb.utils.mkdirhier(os.path.dirname(fpath)) #bb.note("Split %s -> %s" % (file, fpath)) # Only store off the hard link reference if we successfully split! - splitdebuginfo(file, fpath, debugsrcdir, d) + ret = splitdebuginfo(file, fpath, debugsrcdir, d) + if ret != 0: + return ret # Hardlink our debug symbols to the other hardlink copies for file in hardlinks: @@ -878,6 +892,8 @@ python split_and_strip_files () { # # End of strip # + + return 0 } python populate_packages () { -- 1.8.1.2.545.g2f19ada ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 0/2 RFC] workaround for debugedit segv 2013-03-25 17:19 [PATCH 0/2 RFC] workaround for debugedit segv Mark Hatle 2013-03-25 17:19 ` [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv Mark Hatle 2013-03-25 17:19 ` [PATCH 2/2 RFC] package.bbclass: Trigger a bb.error if split/strip fails Mark Hatle @ 2013-03-25 17:47 ` Richard Purdie 2 siblings, 0 replies; 11+ messages in thread From: Richard Purdie @ 2013-03-25 17:47 UTC (permalink / raw) To: Mark Hatle; +Cc: openembedded-core On Mon, 2013-03-25 at 12:19 -0500, Mark Hatle wrote: > [ YOCTO #4089 ] > > This is an attempt at a workaround for the debugedit segv. > > The problem is that during the creation of the build-id hash, the > system iterates over the various ELF sections passing a point to the > loaded data and size of each section to a hash function. > > When it gets to the hash for the .plt and .bss section, the size is > non-zero, while the data is point to 0. This immediately causes a > segfault on access of data at address 0. > > See the bug and patch comments for more diagnostic information. I really want to figure out how widespread issues are so I merged the workaround and my version of the error handling patch so we start seeing where the remaining problems are (if any). I'm hoping to test this out on the autobuilder, see how we stand. If further improvements are found to the workaround we can merge those on top of the existing tree. Cheers, Richard ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2013-03-26 12:37 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-03-25 17:19 [PATCH 0/2 RFC] workaround for debugedit segv Mark Hatle 2013-03-25 17:19 ` [PATCH 1/2 RFC] rpm: Add workaround for debugedit-segv Mark Hatle 2013-03-25 17:02 ` Phil Blundell 2013-03-25 17:10 ` Mark Hatle 2013-03-25 17:45 ` Phil Blundell 2013-03-25 19:32 ` Mark Hatle 2013-03-25 21:47 ` Mark Hatle 2013-03-26 10:38 ` Phil Blundell 2013-03-26 12:20 ` Mark Hatle 2013-03-25 17:19 ` [PATCH 2/2 RFC] package.bbclass: Trigger a bb.error if split/strip fails Mark Hatle 2013-03-25 17:47 ` [PATCH 0/2 RFC] workaround for debugedit segv Richard Purdie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox