From: Domenico Andreoli <domenico.andreoli@linux.com>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Ihor Solodrai <ihor.solodrai@linux.dev>,
dwarves@vger.kernel.org, bpf@vger.kernel.org, acme@kernel.org,
andrii@kernel.org, eddyz87@gmail.com, mykolal@fb.com,
kernel-team@meta.com
Subject: Re: [PATCH dwarves] dwarf_loader: fix termination on BTF encoding error
Date: Tue, 1 Apr 2025 15:43:38 +0200
Message-ID: <Z-vtiuRaolc91Nkc@localhost>
In-Reply-To: <27afc430-face-4013-9b87-4168f38b6b23@oracle.com>

On Tue, Apr 01, 2025 at 01:57:25PM +0100, Alan Maguire wrote:
> On 28/03/2025 17:40, Ihor Solodrai wrote:
> > When BTF encoding thread aborts because of an error, dwarf loader
> > worker threads get stuck in cus_queue__enqdeq_job() at:
> >
> > pthread_cond_wait(&cus_processing_queue.job_added, &cus_processing_queue.mutex);
> >
> > To avoid this, introduce an abort flag into cus_processing_queue, and
> > atomically check for it in the deq loop. The flag is only set in case
> > of a worker thread exiting on error. Make sure to pthread_cond_signal
> > to the waiting threads to let them exit too.
> >
> > In cus__process_file fix the check of an error returned from
> > dwfl_getmodules: it may return a positive number when a
> > callback (cus__process_dwflmod in our case) returns an error.
> >
> > Link: https://lore.kernel.org/dwarves/Z-JzFrXaopQCYd6h@localhost/
> >
> > Reported-by: Domenico Andreoli <domenico.andreoli@linux.com>
> > Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
>
> Thanks for the fix! I've tested this with the problematic module+vmlinux
> BTF and the previously-hanging pahole goes on to fail as expected; also
> run it through the work-in-progress CI, building and testing on x86_64
> and aarch64, no issues found. If anyone else has a chance to ack or test
> it, that would be great. Thanks!

Tested-by: Domenico Andreoli <domenico.andreoli@linux.com>

I rebuilt the Debian package with the patch applied, and it then started
to fail consistently because of the extra C++ symbols.

With the switch --lang_exclude=rust,c++11 it works without errors.

Thank you Alan and Ihor for the fast support!

Dom

>
> Alan
>
> > ---
> >  dwarf_loader.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > index 84122d0..e1ba7bc 100644
> > --- a/dwarf_loader.c
> > +++ b/dwarf_loader.c
> > @@ -3459,6 +3459,7 @@ static struct {
> >  	 */
> >  	uint32_t next_cu_id;
> >  	struct list_head jobs;
> > +	bool abort;
> >  } cus_processing_queue;
> >
> >  enum job_type {
> > @@ -3479,6 +3480,7 @@ static void cus_queue__init(void)
> >  	pthread_cond_init(&cus_processing_queue.job_added, NULL);
> >  	INIT_LIST_HEAD(&cus_processing_queue.jobs);
> >  	cus_processing_queue.next_cu_id = 0;
> > +	cus_processing_queue.abort = false;
> >  }
> >
> >  static void cus_queue__destroy(void)
> > @@ -3535,8 +3537,9 @@ static struct cu_processing_job *cus_queue__enqdeq_job(struct cu_processing_job
> >  		pthread_cond_signal(&cus_processing_queue.job_added);
> >  	}
> >  	for (;;) {
> > +		bool abort = __atomic_load_n(&cus_processing_queue.abort, __ATOMIC_SEQ_CST);
> >  		job = cus_queue__try_dequeue();
> > -		if (job)
> > +		if (job || abort)
> >  			break;
> >  		/* No jobs or only steals out of order */
> >  		pthread_cond_wait(&cus_processing_queue.job_added, &cus_processing_queue.mutex);
> > @@ -3653,6 +3656,9 @@ static void *dwarf_loader__worker_thread(void *arg)
> >
> >  	while (!stop) {
> >  		job = cus_queue__enqdeq_job(job);
> > +		if (!job)
> > +			goto out_abort;
> > +
> >  		switch (job->type) {
> >
> >  		case JOB_DECODE:
> > @@ -3688,6 +3694,8 @@ static void *dwarf_loader__worker_thread(void *arg)
> >
> >  	return (void *)DWARF_CB_OK;
> >  out_abort:
> > +	__atomic_store_n(&cus_processing_queue.abort, true, __ATOMIC_SEQ_CST);
> > +	pthread_cond_signal(&cus_processing_queue.job_added);
> >  	return (void *)DWARF_CB_ABORT;
> >  }
> >
> > @@ -4028,7 +4036,7 @@ static int cus__process_file(struct cus *cus, struct conf_load *conf, int fd,
> >
> >  	/* Process the one or more modules gleaned from this file. */
> >  	int err = dwfl_getmodules(dwfl, cus__process_dwflmod, &parms, 0);
> > -	if (err < 0)
> > +	if (err)
> >  		return -1;
> >
> >  	// We can't call dwfl_end(dwfl) here, as we keep pointers to strings
>
>
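
For anyone following the thread, the idea behind the fix (threads blocked
in pthread_cond_wait() need an abort flag they re-check, plus a wakeup)
can be sketched in isolation. This is only a minimal illustration under
assumptions, not the dwarves code: the queue struct, waiter(),
queue_abort() and run_abort_demo() are hypothetical names, no jobs are
ever queued, and unlike the patch the flag is set under the mutex and the
waiters are woken with pthread_cond_broadcast() rather than an atomic
store followed by pthread_cond_signal().

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_WAITERS 3

/* Hypothetical stand-in for a job queue; no jobs are ever added here. */
static struct {
	pthread_mutex_t mutex;
	pthread_cond_t job_added;
	bool abort;
} queue = {
	.mutex = PTHREAD_MUTEX_INITIALIZER,
	.job_added = PTHREAD_COND_INITIALIZER,
};

/* Wait for work; the only way out in this sketch is the abort flag. */
static void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&queue.mutex);
	while (!queue.abort)
		pthread_cond_wait(&queue.job_added, &queue.mutex);
	pthread_mutex_unlock(&queue.mutex);
	return NULL;
}

/* Set the flag while holding the mutex, then wake every waiter. */
static void queue_abort(void)
{
	pthread_mutex_lock(&queue.mutex);
	queue.abort = true;
	pthread_cond_broadcast(&queue.job_added);
	pthread_mutex_unlock(&queue.mutex);
}

/* Start the waiters, request abort, and return how many were joined. */
static int run_abort_demo(void)
{
	pthread_t tids[NR_WAITERS];
	int created = 0, joined = 0;

	while (created < NR_WAITERS &&
	       pthread_create(&tids[created], NULL, waiter, NULL) == 0)
		created++;

	queue_abort();

	for (int i = 0; i < created; i++)
		if (pthread_join(tids[i], NULL) == 0)
			joined++;
	return joined;
}
```

Calling run_abort_demo() from a main() and building with -pthread should
return once all waiters have exited. Without the flag check in the wait
loop, the joins would block forever, which is the kind of hang the patch
addresses.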
--
rsa4096: 3B10 0CA1 8674 ACBA B4FE FCD2 CE5B CF17 9960 DE13
ed25519: FFB4 0CC3 7F2E 091D F7DA 356E CC79 2832 ED38 CB05