From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
	Clark Williams <williams@redhat.com>,
	dwarves@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 12/16] dwarf_loader: Support DW_TAG_imported_unit for same-file partial units
Date: Mon, 22 Jun 2026 17:24:35 -0300
Message-ID: <20260622202441.14799-13-acme@kernel.org>
In-Reply-To: <20260622202441.14799-1-acme@kernel.org>
From: Arnaldo Carvalho de Melo <acme@redhat.com>

Binaries processed by the dwz(1) tool have their DWARF type information
deduplicated into DW_TAG_partial_unit entries that are then referenced
via DW_TAG_imported_unit from each DW_TAG_compile_unit that uses those
types. This is the standard DWARF mechanism for cross-CU type sharing.

On Fedora/RHEL, most debuginfo packages are built with dwz, making this
a common pattern. For instance, bash-debuginfo has 10,486
DW_TAG_partial_unit, 384 DW_TAG_compile_unit, and 8,572
DW_TAG_imported_unit entries, all using same-file references (no .dwz
alternate DWARF file involved).

Before this patch, pahole skipped DW_TAG_partial_unit with a warning:

  WARNING: DW_TAG_partial_unit used, some types will not be considered!
           Probably this was optimized using a tool like 'dwz'
           A future version of pahole will support this.

DW_TAG_imported_unit, in turn, was silently ignored (__die__process_tag()
returned NULL for it), causing pahole to report "file has no dwarf type
information" for binaries like bash and glibc.

The fix adds die__process_imported_unit(), called from die__process_unit()
when encountering DW_TAG_imported_unit. It follows DW_AT_import to the
referenced DW_TAG_partial_unit DIE and processes its children inline into
the importing compile unit's type tables. This works because
dwarf_formref_die() already handles all DWARF reference forms, and each
CU maintains its own independent hash tables, so the same partial unit
can be safely imported by multiple CUs, each getting its own copy of the
types.
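
As a rough illustration of the "each CU gets its own copy" point, here is
a toy model. The toy_* names and the fixed-size array are hypothetical
stand-ins for the per-CU hash tables, not pahole's real data structures:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model, not pahole's real data structures. */
#define TOY_MAX_TYPES 32

struct toy_partial_unit {
	const uint64_t *type_offsets;	/* DIE offsets of the types it defines */
	size_t nr_types;
};

struct toy_cu {
	uint64_t types[TOY_MAX_TYPES];	/* stands in for the per-CU hash tables */
	size_t nr_types;
};

/* Copy every type of the partial unit into the importing CU's own table. */
static int toy_cu__import(struct toy_cu *cu, const struct toy_partial_unit *pu)
{
	for (size_t i = 0; i < pu->nr_types; i++) {
		if (cu->nr_types == TOY_MAX_TYPES)
			return -1;
		cu->types[cu->nr_types++] = pu->type_offsets[i];
	}
	return 0;
}

/* Look a type up in one CU's table only; other CUs are never consulted. */
static bool toy_cu__has_type(const struct toy_cu *cu, uint64_t offset)
{
	for (size_t i = 0; i < cu->nr_types; i++)
		if (cu->types[i] == offset)
			return true;
	return false;
}
```

Importing the same toy_partial_unit into two toy_cu instances leaves both
able to resolve its offsets without sharing any state between them.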
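
Since imported units can themselves contain DW_TAG_imported_unit entries
(nested imports), a depth limit of 64 is enforced to prevent stack
overflow from pathological or corrupted DWARF. A warning is emitted once
if the limit is reached. Each CU also records the offsets of the partial
units it has already imported, so a partial unit reachable through
several import paths is processed only once per CU.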
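
The interplay between the depth limit and the per-CU list of already
imported offsets (dwarf_cu__imported_unit_visited() in the diff) can be
sketched with a toy walker. The toy_* types are hypothetical and model
each unit as having at most one nested import:

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 64	/* same limit the patch uses */
#define TOY_MAX_UNITS 128

/* Hypothetical partial unit: one optional nested import, by index. */
struct toy_unit {
	int import;	/* index of the imported unit, or -1 for none */
};

struct toy_walk {
	bool visited[TOY_MAX_UNITS];	/* stands in for dcu->imported_units */
	int nr_processed;
	bool depth_limit_hit;
};

/* Process unit 'idx' and, recursively, whatever it imports. */
static void toy_walk__import(struct toy_walk *w, const struct toy_unit *units,
			     int idx, int depth)
{
	/* Depth check first, as in die__process_imported_unit(). */
	if (depth >= TOY_MAX_DEPTH) {
		w->depth_limit_hit = true;
		return;
	}
	/* A unit already imported into this CU is not processed again. */
	if (w->visited[idx])
		return;
	w->visited[idx] = true;
	w->nr_processed++;
	if (units[idx].import >= 0)
		toy_walk__import(w, units, units[idx].import, depth + 1);
}
```

In this model a two-unit import cycle terminates through the visited
check, while a long acyclic chain is cut off by the depth limit.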

Some binaries (e.g. chromium-browser on Fedora 44, built with Rust
components) also have DW_TAG_imported_unit entries that reference partial
units in an alternate debug file via DW_FORM_GNU_ref_alt (the
.gnu_debugaltlink mechanism). When elfutils resolves such a reference, it
returns DIEs from the alternate file whose offsets are in a different
address space. Processing these into the main CU's hash tables corrupts
type references and causes a crash during type recoding.
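
One way to picture the offset clash, assuming a table keyed by DIE offset
alone (a simplification, and the toy_* names are hypothetical): two DIEs
from different files can carry the same offset, so a lookup meant for one
finds the other.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy table keyed only by DIE offset. */
struct toy_entry {
	uint64_t offset;
	const char *name;
};

struct toy_table {
	struct toy_entry entries[8];
	size_t nr;
};

static void toy_table__add(struct toy_table *t, uint64_t offset, const char *name)
{
	if (t->nr < 8)
		t->entries[t->nr++] = (struct toy_entry){ offset, name };
}

/* Returns the first entry recorded at 'offset', or NULL if there is none. */
static const char *toy_table__find(const struct toy_table *t, uint64_t offset)
{
	for (size_t i = 0; i < t->nr; i++)
		if (t->entries[i].offset == offset)
			return t->entries[i].name;
	return NULL;
}
```

If a main-file DIE and an alternate-file DIE are both added at offset
0x40, toy_table__find() returns whichever was added first, which is the
kind of mis-resolution the guard below avoids by not loading such DIEs.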

The same DW_FORM_GNU_ref_alt form can also appear on regular type
attributes (DW_AT_type, DW_AT_abstract_origin, DW_AT_specification,
etc.), not just on DW_TAG_imported_unit's DW_AT_import. Guard all paths
via attr_form_is_ref_alt(), which skips the reference and warns once, so
users know why some types are missing rather than getting a crash.
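
The warn-once guard boils down to the following pattern, shown standalone
so it compiles without elfutils. toy_form_is_ref_alt() and the warning
counter are hypothetical; the real helper takes a Dwarf_Attribute and has
no counter:

```c
#include <stdbool.h>
#include <stdio.h>

/* Numeric value of DW_FORM_GNU_ref_alt as defined in dwarf.h. */
#define TOY_DW_FORM_GNU_ref_alt 0x1f20

static int toy_nr_warnings;	/* illustration only, not in the patch */

/* Returns true if the caller must skip this reference. */
static bool toy_form_is_ref_alt(unsigned int form)
{
	static bool warned;

	if (form != TOY_DW_FORM_GNU_ref_alt)
		return false;

	if (!warned) {
		fprintf(stderr, "WARNING: DW_FORM_GNU_ref_alt not yet supported\n");
		warned = true;
		toy_nr_warnings++;
	}
	return true;
}
```

Callers test the form before resolving the reference, so no DIE from the
alternate file is ever handed to the loader, and the message is printed
at most once per run.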

The korg/alt_dwarf branch had a previous attempt at this that also
handled the .dwz alternate DWARF file case (DW_FORM_GNU_ref_alt), but it
was never merged and is now 294 commits behind master. This patch takes a
simpler approach focused on the same-file case first, which covers dwz
output on Fedora/RHEL where all partial units are within the same .debug
file.

Before (bash-5.3.9-3.fc44.x86_64 debuginfo):

  $ pahole -F dwarf /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug
  WARNING: DW_TAG_partial_unit used, some types will not be considered!
  pahole: /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug: file has no dwarf type information

After:

  $ pahole -F dwarf /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug | wc -l
  1605
  $ pahole -F dwarf -C variable /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug
  struct variable {
          char *                     name;                 /*     0     8 */
          char *                     value;                /*     8     8 */
          ...
          /* size: 48, cachelines: 1, members: 7 */
  };

Before (chromium-browser debuginfo, Fedora 44):

  $ pahole /usr/lib/debug/.../chromium-browser-149.0.7827.155-1.fc44.x86_64.debug
  Segmentation fault

After:

  $ pahole /usr/lib/debug/.../chromium-browser-149.0.7827.155-1.fc44.x86_64.debug
  WARNING: DW_FORM_GNU_ref_alt (dwz alternate debug file) not yet supported,
           some types will not be available.

Reported-by: Sashiko:gemini-3-1-pro-preview # Running on a local machine
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 dwarf_loader.c | 153 ++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 131 insertions(+), 22 deletions(-)

diff --git a/dwarf_loader.c b/dwarf_loader.c
index 9f7d2bd23359191b..7091655588cd8b4d 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -136,6 +136,9 @@ struct dwarf_cu {
 	struct dwarf_tag *last_type_lookup;
 	struct cu *cu;
 	struct dwarf_cu *type_unit;
+	Dwarf_Off *imported_units;
+	uint32_t nr_imported_units;
+	uint32_t allocated_imported_units;
 };
 
 static int dwarf_cu__init(struct dwarf_cu *dcu, struct cu *cu)
@@ -161,6 +164,9 @@ static int dwarf_cu__init(struct dwarf_cu *dcu, struct cu *cu)
 		INIT_HLIST_HEAD(&dcu->hash_types[i]);
 	}
 	dcu->type_unit = NULL;
+	dcu->imported_units = NULL;
+	dcu->nr_imported_units = 0;
+	dcu->allocated_imported_units = 0;
 	// To avoid a per-lookup check against NULL in dwarf_cu__find_type_by_ref()
 	dcu->last_type_lookup = &sentinel_dtag;
 	return 0;
@@ -185,6 +191,7 @@ static void dwarf_cu__delete(struct cu *cu)
 
 	struct dwarf_cu *dcu = cu->priv;
 
+	free(dcu->imported_units);
 	// dcu->hash_tags & dcu->hash_types are on cu->obstack
 	cu__free(cu, dcu);
 	cu->priv = NULL;
@@ -446,12 +453,32 @@ static const char *attr_string(Dwarf_Die *die, uint32_t name, struct conf_load *
 	return str;
 }
 
+static bool attr_form_is_ref_alt(Dwarf_Attribute *attr)
+{
+	if (attr->form == DW_FORM_GNU_ref_alt) {
+		static bool warned;
+
+		if (!warned) {
+			fprintf(stderr,
+				"WARNING: DW_FORM_GNU_ref_alt (dwz alternate debug file) not yet supported,\n"
+				" some types will not be available.\n");
+			warned = true;
+		}
+		return true;
+	}
+	return false;
+}
+
 static bool attr_type(Dwarf_Die *die, uint32_t attr_name, Dwarf_Off *offset)
 {
 	Dwarf_Attribute attr;
 	if (dwarf_attr(die, attr_name, &attr) != NULL) {
 		Dwarf_Die type_die;
+		if (attr_form_is_ref_alt(&attr)) {
+			*offset = 0;
+			return 0;
+		}
 		if (dwarf_formref_die(&attr, &type_die) != NULL) {
 			*offset = dwarf_dieoffset(&type_die);
 			return attr.form == DW_FORM_ref_sig8;
@@ -679,7 +706,8 @@ static void type__init(struct type *type, Dwarf_Die *die, struct cu *cu, struct
 	Dwarf_Attribute attr;
 	if (dwarf_attr(die, DW_AT_type, &attr) != NULL) {
 		Dwarf_Die type_die;
-		if (dwarf_formref_die(&attr, &type_die) != NULL) {
+		if (!attr_form_is_ref_alt(&attr) &&
+		    dwarf_formref_die(&attr, &type_die) != NULL) {
 			uint64_t encoding = attr_numeric(&type_die, DW_AT_encoding);
 
 			if (encoding == DW_ATE_signed || encoding == DW_ATE_signed_char)
@@ -993,9 +1021,14 @@ static int add_gnu_annotation_chain(Dwarf_Die *die, int component_idx,
 	Dwarf_Attribute attr;
 	Dwarf_Die annot_die;
 
-	while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL &&
-	       dwarf_formref_die(&attr, &annot_die) != NULL &&
-	       dwarf_tag(&annot_die) == DW_TAG_GNU_annotation) {
+	while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL) {
+		if (attr_form_is_ref_alt(&attr))
+			break;
+		if (dwarf_formref_die(&attr, &annot_die) == NULL)
+			break;
+		if (dwarf_tag(&annot_die) != DW_TAG_GNU_annotation)
+			break;
+
 		int ret = add_tag_annotation(&annot_die, component_idx, conf, head);
 		if (ret)
 			return ret;
@@ -1791,9 +1824,13 @@ check_gnu_attr:
 		goto out;
 
 	/* Handle GCC-style DW_AT_GNU_annotation attribute */
-	while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL &&
-	       dwarf_formref_die(&attr, &annot_die) != NULL &&
-	       dwarf_tag(&annot_die) == DW_TAG_GNU_annotation) {
+	while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL) {
+		if (attr_form_is_ref_alt(&attr))
+			break;
+		if (dwarf_formref_die(&attr, &annot_die) == NULL)
+			break;
+		if (dwarf_tag(&annot_die) != DW_TAG_GNU_annotation)
+			break;
 		name = attr_string(&annot_die, DW_AT_name, conf);
 		if (strcmp(name, "btf_type_tag") != 0)
 			break;
@@ -2614,7 +2651,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
 
 	switch (dwarf_tag(die)) {
 	case DW_TAG_imported_unit:
-		return NULL; // We don't support imported units yet, so to avoid segfaults
+		return &unsupported_tag; // Handled in die__process_unit()
 	case DW_TAG_array_type:
 		tag = die__create_new_array(die, cu); break;
 	case DW_TAG_string_type: // FORTRAN stuff, looks like an array
@@ -2682,9 +2719,90 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
 	return tag;
 }
 
-static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
+#define MAX_IMPORTED_UNIT_DEPTH 64
+
+static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int import_depth);
+
+static bool dwarf_cu__imported_unit_visited(struct dwarf_cu *dcu, Dwarf_Off offset)
+{
+	for (uint32_t i = 0; i < dcu->nr_imported_units; i++)
+		if (dcu->imported_units[i] == offset)
+			return true;
+	return false;
+}
+
+static int dwarf_cu__mark_imported_unit(struct dwarf_cu *dcu, struct cu *cu, Dwarf_Off offset)
+{
+	if (dcu->nr_imported_units == dcu->allocated_imported_units) {
+		uint32_t new_size = dcu->allocated_imported_units ? dcu->allocated_imported_units * 2 : 16;
+		Dwarf_Off *new_array = realloc(dcu->imported_units, new_size * sizeof(Dwarf_Off));
+		if (new_array == NULL)
+			return -ENOMEM;
+		dcu->imported_units = new_array;
+		dcu->allocated_imported_units = new_size;
+	}
+	dcu->imported_units[dcu->nr_imported_units++] = offset;
+	return 0;
+}
+
+static int die__process_imported_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int import_depth)
+{
+	Dwarf_Attribute attr;
+
+	if (dwarf_attr(die, DW_AT_import, &attr) == NULL)
+		return 0;
+
+	if (attr_form_is_ref_alt(&attr))
+		return 0;
+
+	Dwarf_Die imported_die;
+
+	if (dwarf_formref_die(&attr, &imported_die) == NULL)
+		return 0;
+
+	if (dwarf_tag(&imported_die) != DW_TAG_partial_unit)
+		return 0;
+
+	if (import_depth >= MAX_IMPORTED_UNIT_DEPTH) {
+		static bool warned;
+
+		if (!warned) {
+			fprintf(stderr,
+				"WARNING: DW_TAG_imported_unit nesting too deep (>%d), "
+				"some types will not be available.\n",
+				MAX_IMPORTED_UNIT_DEPTH);
+			warned = true;
+		}
+		return 0;
+	}
+
+	Dwarf_Off offset = dwarf_dieoffset(&imported_die);
+	struct dwarf_cu *dcu = cu->priv;
+
+	if (dwarf_cu__imported_unit_visited(dcu, offset))
+		return 0;
+
+	if (dwarf_cu__mark_imported_unit(dcu, cu, offset))
+		return -ENOMEM;
+
+	Dwarf_Die child;
+
+	if (dwarf_child(&imported_die, &child) == 0)
+		return die__process_unit(&child, cu, conf, import_depth + 1);
+
+	return 0;
+}
+
+static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int import_depth)
 {
 	do {
+		if (dwarf_tag(die) == DW_TAG_imported_unit) {
+			int err = die__process_imported_unit(die, cu, conf, import_depth);
+			if (err)
+				return err;
+			continue;
+		}
+
 		struct tag *tag = die__process_tag(die, cu, 1, conf);
 		if (tag == NULL)
 			return -ENOMEM;
@@ -3305,17 +3423,8 @@ static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 		return 0; // so that other units can be processed
 	}
 
-	if (tag == DW_TAG_partial_unit) {
-		static bool warned;
-
-		if (!warned) {
-			fprintf(stderr, "WARNING: DW_TAG_partial_unit used, some types will not be considered!\n"
-					" Probably this was optimized using a tool like 'dwz'\n"
-					" A future version of pahole will support this.\n");
-			warned = true;
-		}
-		return 0; // so that other units can be processed
-	}
+	if (tag == DW_TAG_partial_unit)
+		return 0; // Processed inline when reached via DW_TAG_imported_unit
 
 	if (tag != DW_TAG_compile_unit && tag != DW_TAG_type_unit) {
 		fprintf(stderr, "%s: DW_TAG_compile_unit, DW_TAG_type_unit, DW_TAG_partial_unit or DW_TAG_skeleton_unit expected got %s (0x%x) @ %llx!\n",
@@ -3336,7 +3445,7 @@ static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
 		return DWARF_CB_OK;
 
 	if (dwarf_child(die, &child) == 0) {
-		int err = die__process_unit(&child, cu, conf);
+		int err = die__process_unit(&child, cu, conf, 0);
 		if (err)
 			return err;
 	}
@@ -4099,7 +4208,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
 			filtered = conf->early_cu_filter(&unmerged_cu) == NULL;
 		}
 
-		if (!filtered && die__process_unit(&child, cu, conf) != 0)
+		if (!filtered && die__process_unit(&child, cu, conf, 0) != 0)
 			goto out_abort;
 	}
 
--
2.54.0