From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Clark Williams <williams@redhat.com>,
dwarves@vger.kernel.org,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 12/16] dwarf_loader: Support DW_TAG_imported_unit for same-file partial units
Date: Mon, 22 Jun 2026 17:24:35 -0300
Message-ID: <20260622202441.14799-13-acme@kernel.org>
In-Reply-To: <20260622202441.14799-1-acme@kernel.org>
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Binaries processed by the dwz(1) tool have their DWARF type information
deduplicated into DW_TAG_partial_unit entries that are then referenced
via DW_TAG_imported_unit from each DW_TAG_compile_unit that uses those
types. This is the standard DWARF mechanism for cross-CU type sharing.
On Fedora/RHEL, most debuginfo packages are built with dwz, making this
a common pattern. For instance, bash-debuginfo has 10,486
DW_TAG_partial_unit, 384 DW_TAG_compile_unit, and 8,572
DW_TAG_imported_unit entries — all using same-file references (no .dwz
alternate DWARF file involved).
Before this patch, pahole skipped DW_TAG_partial_unit with a warning:

  WARNING: DW_TAG_partial_unit used, some types will not be considered!
           Probably this was optimized using a tool like 'dwz'
           A future version of pahole will support this.

And DW_TAG_imported_unit was silently ignored (returned NULL), causing
pahole to report "file has no dwarf type information" for binaries like
bash and glibc.
The fix adds die__process_imported_unit(), called from die__process_unit()
when encountering DW_TAG_imported_unit. It follows DW_AT_import to the
referenced DW_TAG_partial_unit DIE and processes its children inline into
the importing compile unit's type tables. This works because
dwarf_formref_die() already handles all DWARF reference forms, and each
CU maintains its own independent hash tables — so the same partial unit
can be safely imported by multiple CUs, each getting its own copy of the
types.
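Since imported units can themselves contain DW_TAG_imported_unit entries
(nested imports), a depth limit of 64 is enforced to prevent stack
overflow from pathological or corrupted DWARF. A warning is emitted once
if the limit is reached. Each CU also records the offsets of the partial
units it has already imported, so a unit reached again through nested
imports is processed only once per CU.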
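
The nested-import handling can be pictured with a toy model. This is only a
sketch: struct toy_unit, struct toy_importer and toy_import() are
hypothetical names that do not exist in pahole, and units are plain array
indexes rather than DWARF DIEs. It mirrors the order of checks in
die__process_imported_unit(): depth limit first, then the per-CU list of
already imported offsets.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_DEPTH 64	/* mirrors MAX_IMPORTED_UNIT_DEPTH in the patch */
#define TOY_MAX_UNITS 128	/* hypothetical bound, for this sketch only */

/* Hypothetical stand-in for a partial unit: some types plus nested imports. */
struct toy_unit {
	int nr_types;		/* types defined directly in this unit */
	int nr_imports;		/* number of valid entries in imports[] */
	int imports[4];		/* indexes of the units this one imports */
};

/* Hypothetical stand-in for the importing CU's state. */
struct toy_importer {
	const struct toy_unit *units;
	bool visited[TOY_MAX_UNITS];	/* "already imported into this CU" set */
	int types_loaded;
	bool depth_limit_hit;
};

/* Import unit 'idx' and, recursively, everything it imports. */
static void toy_import(struct toy_importer *imp, int idx, int depth)
{
	assert(idx >= 0 && idx < TOY_MAX_UNITS);

	if (depth >= TOY_MAX_DEPTH) {	/* bound recursion on hostile input */
		imp->depth_limit_hit = true;
		return;
	}
	if (imp->visited[idx])		/* each unit is loaded once per CU */
		return;
	imp->visited[idx] = true;

	const struct toy_unit *unit = &imp->units[idx];

	imp->types_loaded += unit->nr_types;
	for (int i = 0; i < unit->nr_imports; i++)
		toy_import(imp, unit->imports[i], depth + 1);
}
```

Calling toy_import(&imp, 0, 0) on units that import each other in a cycle
terminates because of the visited check, and a chain longer than
TOY_MAX_DEPTH stops at the limit instead of recursing further.
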
Some binaries (e.g. chromium-browser on Fedora 44, built with Rust
components) also have DW_TAG_imported_unit entries that reference partial
units in an alternate debug file via DW_FORM_GNU_ref_alt (the
.gnu_debugaltlink mechanism). When elfutils resolves such a reference, it
returns DIEs from the alternate file whose offsets are in a different
address space — processing these into the main CU's hash tables corrupts
type references and causes a crash during type recoding.
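
Why offsets from two files cannot share one table can be shown with a
deliberately simplified model. The names below (struct toy_table,
toy_table_add(), toy_table_find(), toy_resolves_to()) are hypothetical and
the table is a linear array, not pahole's hash tables. The only point is
that a lookup keyed by offset alone cannot tell which file the offset came
from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type table keyed only by DIE offset. */
struct toy_entry {
	uint64_t offset;
	const char *name;
};

struct toy_table {
	struct toy_entry entries[8];
	size_t nr;
};

static void toy_table_add(struct toy_table *t, uint64_t offset, const char *name)
{
	assert(t->nr < 8);
	t->entries[t->nr].offset = offset;
	t->entries[t->nr].name = name;
	t->nr++;
}

/* Returns the first entry registered at 'offset', or NULL if there is none. */
static const char *toy_table_find(const struct toy_table *t, uint64_t offset)
{
	for (size_t i = 0; i < t->nr; i++)
		if (t->entries[i].offset == offset)
			return t->entries[i].name;
	return NULL;
}

/* True if a reference to 'offset' resolves to the type called 'name'. */
static bool toy_resolves_to(const struct toy_table *t, uint64_t offset,
			    const char *name)
{
	const char *found = toy_table_find(t, offset);

	return found != NULL && strcmp(found, name) == 0;
}
```

If "int" from the main file and "struct foo" from the alternate file are
both registered at offset 0x2d, a reference intended for the alternate
file's DIE resolves to "int". In this model that is a silent mix-up; the
patch description reports that in pahole it leads to a crash.
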
The same DW_FORM_GNU_ref_alt form can also appear on regular type
attributes (DW_AT_type, DW_AT_abstract_origin, DW_AT_specification,
etc.), not just on DW_TAG_imported_unit's DW_AT_import. Guard all paths
via attr_form_is_ref_alt(), which skips the reference and warns once, so
users know why some types are missing rather than getting a crash.
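
The guard-and-warn-once pattern can be sketched without libdw. The toy_*
names and the counter are hypothetical and exist only so the behaviour can
be checked. The numeric form values are the ones dwarf.h assigns to
DW_FORM_GNU_ref_alt and DW_FORM_ref4. Unlike attr_type() in the patch, this
toy resolver returns whether the reference was resolved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_FORM_REF_ALT 0x1f20	/* value of DW_FORM_GNU_ref_alt */
#define TOY_FORM_REF4    0x13	/* value of DW_FORM_ref4 */

static int toy_warnings_emitted;	/* only here so the sketch can be checked */

/* Returns true for alternate-file references, warning the first time only. */
static bool toy_form_is_ref_alt(unsigned int form)
{
	if (form != TOY_FORM_REF_ALT)
		return false;

	static bool warned;	/* function-local state, survives across calls */

	if (!warned) {
		fprintf(stderr, "WARNING: alternate-file reference skipped\n");
		toy_warnings_emitted++;
		warned = true;
	}
	return true;
}

/* Caller pattern: resolve the reference only when it stays in this file. */
static bool toy_resolve_type(unsigned int form, unsigned long target,
			     unsigned long *offset)
{
	assert(offset != NULL);

	if (toy_form_is_ref_alt(form)) {
		*offset = 0;	/* leave the type unresolved */
		return false;
	}
	*offset = target;
	return true;
}
```

However many alternate-file references are seen, the warning is printed
once, and every caller gets a zero offset instead of a DIE from the other
file.
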
The korg/alt_dwarf branch had a previous attempt at this that also
handled the .dwz alternate DWARF file case (DW_FORM_GNU_ref_alt), but it
was never merged and is now 294 commits behind master. This patch takes a
simpler approach focused on the same-file case first, which covers dwz
output on Fedora/RHEL where all partial units are within the same .debug
file.
Before (bash-5.3.9-3.fc44.x86_64 debuginfo):

  $ pahole -F dwarf /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug
  WARNING: DW_TAG_partial_unit used, some types will not be considered!
  pahole: /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug: file has no dwarf type information

After:

  $ pahole -F dwarf /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug | wc -l
  1605
  $ pahole -F dwarf -C variable /usr/lib/debug/usr/bin/bash-5.3.9-3.fc44.x86_64.debug
  struct variable {
          char * name; /* 0 8 */
          char * value; /* 8 8 */
          ...
          /* size: 48, cachelines: 1, members: 7 */
  };

Before (chromium-browser debuginfo, Fedora 44):

  $ pahole /usr/lib/debug/.../chromium-browser-149.0.7827.155-1.fc44.x86_64.debug
  Segmentation fault

After:

  $ pahole /usr/lib/debug/.../chromium-browser-149.0.7827.155-1.fc44.x86_64.debug
  WARNING: DW_FORM_GNU_ref_alt (dwz alternate debug file) not yet supported,
           some types will not be available.

Reported-by: Sashiko:gemini-3-1-pro-preview # Running on a local machine
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
dwarf_loader.c | 153 ++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 131 insertions(+), 22 deletions(-)
diff --git a/dwarf_loader.c b/dwarf_loader.c
index 9f7d2bd23359191b..7091655588cd8b4d 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -136,6 +136,9 @@ struct dwarf_cu {
struct dwarf_tag *last_type_lookup;
struct cu *cu;
struct dwarf_cu *type_unit;
+ Dwarf_Off *imported_units;
+ uint32_t nr_imported_units;
+ uint32_t allocated_imported_units;
};
static int dwarf_cu__init(struct dwarf_cu *dcu, struct cu *cu)
@@ -161,6 +164,9 @@ static int dwarf_cu__init(struct dwarf_cu *dcu, struct cu *cu)
INIT_HLIST_HEAD(&dcu->hash_types[i]);
}
dcu->type_unit = NULL;
+ dcu->imported_units = NULL;
+ dcu->nr_imported_units = 0;
+ dcu->allocated_imported_units = 0;
// To avoid a per-lookup check against NULL in dwarf_cu__find_type_by_ref()
dcu->last_type_lookup = &sentinel_dtag;
return 0;
@@ -185,6 +191,7 @@ static void dwarf_cu__delete(struct cu *cu)
struct dwarf_cu *dcu = cu->priv;
+ free(dcu->imported_units);
// dcu->hash_tags & dcu->hash_types are on cu->obstack
cu__free(cu, dcu);
cu->priv = NULL;
@@ -446,12 +453,32 @@ static const char *attr_string(Dwarf_Die *die, uint32_t name, struct conf_load *
return str;
}
+static bool attr_form_is_ref_alt(Dwarf_Attribute *attr)
+{
+ if (attr->form == DW_FORM_GNU_ref_alt) {
+ static bool warned;
+
+ if (!warned) {
+ fprintf(stderr,
+ "WARNING: DW_FORM_GNU_ref_alt (dwz alternate debug file) not yet supported,\n"
+ " some types will not be available.\n");
+ warned = true;
+ }
+ return true;
+ }
+ return false;
+}
+
static bool attr_type(Dwarf_Die *die, uint32_t attr_name, Dwarf_Off *offset)
{
Dwarf_Attribute attr;
if (dwarf_attr(die, attr_name, &attr) != NULL) {
Dwarf_Die type_die;
+ if (attr_form_is_ref_alt(&attr)) {
+ *offset = 0;
+ return 0;
+ }
if (dwarf_formref_die(&attr, &type_die) != NULL) {
*offset = dwarf_dieoffset(&type_die);
return attr.form == DW_FORM_ref_sig8;
@@ -679,7 +706,8 @@ static void type__init(struct type *type, Dwarf_Die *die, struct cu *cu, struct
Dwarf_Attribute attr;
if (dwarf_attr(die, DW_AT_type, &attr) != NULL) {
Dwarf_Die type_die;
- if (dwarf_formref_die(&attr, &type_die) != NULL) {
+ if (!attr_form_is_ref_alt(&attr) &&
+ dwarf_formref_die(&attr, &type_die) != NULL) {
uint64_t encoding = attr_numeric(&type_die, DW_AT_encoding);
if (encoding == DW_ATE_signed || encoding == DW_ATE_signed_char)
@@ -993,9 +1021,14 @@ static int add_gnu_annotation_chain(Dwarf_Die *die, int component_idx,
Dwarf_Attribute attr;
Dwarf_Die annot_die;
- while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL &&
- dwarf_formref_die(&attr, &annot_die) != NULL &&
- dwarf_tag(&annot_die) == DW_TAG_GNU_annotation) {
+ while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL) {
+ if (attr_form_is_ref_alt(&attr))
+ break;
+ if (dwarf_formref_die(&attr, &annot_die) == NULL)
+ break;
+ if (dwarf_tag(&annot_die) != DW_TAG_GNU_annotation)
+ break;
+
int ret = add_tag_annotation(&annot_die, component_idx, conf, head);
if (ret)
return ret;
@@ -1791,9 +1824,13 @@ check_gnu_attr:
goto out;
/* Handle GCC-style DW_AT_GNU_annotation attribute */
- while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL &&
- dwarf_formref_die(&attr, &annot_die) != NULL &&
- dwarf_tag(&annot_die) == DW_TAG_GNU_annotation) {
+ while (dwarf_attr(die, DW_AT_GNU_annotation, &attr) != NULL) {
+ if (attr_form_is_ref_alt(&attr))
+ break;
+ if (dwarf_formref_die(&attr, &annot_die) == NULL)
+ break;
+ if (dwarf_tag(&annot_die) != DW_TAG_GNU_annotation)
+ break;
name = attr_string(&annot_die, DW_AT_name, conf);
if (strcmp(name, "btf_type_tag") != 0)
break;
@@ -2614,7 +2651,7 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
switch (dwarf_tag(die)) {
case DW_TAG_imported_unit:
- return NULL; // We don't support imported units yet, so to avoid segfaults
+ return &unsupported_tag; // Handled in die__process_unit()
case DW_TAG_array_type:
tag = die__create_new_array(die, cu); break;
case DW_TAG_string_type: // FORTRAN stuff, looks like an array
@@ -2682,9 +2719,90 @@ static struct tag *__die__process_tag(Dwarf_Die *die, struct cu *cu,
return tag;
}
-static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
+#define MAX_IMPORTED_UNIT_DEPTH 64
+
+static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int import_depth);
+
+static bool dwarf_cu__imported_unit_visited(struct dwarf_cu *dcu, Dwarf_Off offset)
+{
+ for (uint32_t i = 0; i < dcu->nr_imported_units; i++)
+ if (dcu->imported_units[i] == offset)
+ return true;
+ return false;
+}
+
+static int dwarf_cu__mark_imported_unit(struct dwarf_cu *dcu, struct cu *cu, Dwarf_Off offset)
+{
+ if (dcu->nr_imported_units == dcu->allocated_imported_units) {
+ uint32_t new_size = dcu->allocated_imported_units ? dcu->allocated_imported_units * 2 : 16;
+ Dwarf_Off *new_array = realloc(dcu->imported_units, new_size * sizeof(Dwarf_Off));
+ if (new_array == NULL)
+ return -ENOMEM;
+ dcu->imported_units = new_array;
+ dcu->allocated_imported_units = new_size;
+ }
+ dcu->imported_units[dcu->nr_imported_units++] = offset;
+ return 0;
+}
+
+static int die__process_imported_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int import_depth)
+{
+ Dwarf_Attribute attr;
+
+ if (dwarf_attr(die, DW_AT_import, &attr) == NULL)
+ return 0;
+
+ if (attr_form_is_ref_alt(&attr))
+ return 0;
+
+ Dwarf_Die imported_die;
+
+ if (dwarf_formref_die(&attr, &imported_die) == NULL)
+ return 0;
+
+ if (dwarf_tag(&imported_die) != DW_TAG_partial_unit)
+ return 0;
+
+ if (import_depth >= MAX_IMPORTED_UNIT_DEPTH) {
+ static bool warned;
+
+ if (!warned) {
+ fprintf(stderr,
+ "WARNING: DW_TAG_imported_unit nesting too deep (>%d), "
+ "some types will not be available.\n",
+ MAX_IMPORTED_UNIT_DEPTH);
+ warned = true;
+ }
+ return 0;
+ }
+
+ Dwarf_Off offset = dwarf_dieoffset(&imported_die);
+ struct dwarf_cu *dcu = cu->priv;
+
+ if (dwarf_cu__imported_unit_visited(dcu, offset))
+ return 0;
+
+ if (dwarf_cu__mark_imported_unit(dcu, cu, offset))
+ return -ENOMEM;
+
+ Dwarf_Die child;
+
+ if (dwarf_child(&imported_die, &child) == 0)
+ return die__process_unit(&child, cu, conf, import_depth + 1);
+
+ return 0;
+}
+
+static int die__process_unit(Dwarf_Die *die, struct cu *cu, struct conf_load *conf, int import_depth)
{
do {
+ if (dwarf_tag(die) == DW_TAG_imported_unit) {
+ int err = die__process_imported_unit(die, cu, conf, import_depth);
+ if (err)
+ return err;
+ continue;
+ }
+
struct tag *tag = die__process_tag(die, cu, 1, conf);
if (tag == NULL)
return -ENOMEM;
@@ -3305,17 +3423,8 @@ static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
return 0; // so that other units can be processed
}
- if (tag == DW_TAG_partial_unit) {
- static bool warned;
-
- if (!warned) {
- fprintf(stderr, "WARNING: DW_TAG_partial_unit used, some types will not be considered!\n"
- " Probably this was optimized using a tool like 'dwz'\n"
- " A future version of pahole will support this.\n");
- warned = true;
- }
- return 0; // so that other units can be processed
- }
+ if (tag == DW_TAG_partial_unit)
+ return 0; // Processed inline when reached via DW_TAG_imported_unit
if (tag != DW_TAG_compile_unit && tag != DW_TAG_type_unit) {
fprintf(stderr, "%s: DW_TAG_compile_unit, DW_TAG_type_unit, DW_TAG_partial_unit or DW_TAG_skeleton_unit expected got %s (0x%x) @ %llx!\n",
@@ -3336,7 +3445,7 @@ static int die__process(Dwarf_Die *die, struct cu *cu, struct conf_load *conf)
return DWARF_CB_OK;
if (dwarf_child(die, &child) == 0) {
- int err = die__process_unit(&child, cu, conf);
+ int err = die__process_unit(&child, cu, conf, 0);
if (err)
return err;
}
@@ -4099,7 +4208,7 @@ static int cus__merge_and_process_cu(struct cus *cus, struct conf_load *conf,
filtered = conf->early_cu_filter(&unmerged_cu) == NULL;
}
- if (!filtered && die__process_unit(&child, cu, conf) != 0)
+ if (!filtered && die__process_unit(&child, cu, conf, 0) != 0)
goto out_abort;
}
--
2.54.0