From: Domenico Andreoli <domenico.andreoli@linux.com>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Ihor Solodrai <ihor.solodrai@linux.dev>,
dwarves@vger.kernel.org, bpf@vger.kernel.org, acme@kernel.org,
andrii@kernel.org, eddyz87@gmail.com, mykolal@fb.com,
kernel-team@meta.com
Subject: Re: [PATCH dwarves] dwarf_loader: fix termination on BTF encoding error
Date: Tue, 1 Apr 2025 15:43:38 +0200
Message-ID: <Z-vtiuRaolc91Nkc@localhost>
In-Reply-To: <27afc430-face-4013-9b87-4168f38b6b23@oracle.com>

On Tue, Apr 01, 2025 at 01:57:25PM +0100, Alan Maguire wrote:
> On 28/03/2025 17:40, Ihor Solodrai wrote:
> > When BTF encoding thread aborts because of an error, dwarf loader
> > worker threads get stuck in cus_queue__enqdeq_job() at:
> >
> > pthread_cond_wait(&cus_processing_queue.job_added, &cus_processing_queue.mutex);
> >
> > To avoid this, introduce an abort flag into cus_processing_queue, and
> > atomically check for it in the deq loop. The flag is only set in case
> > of a worker thread exiting on error. Make sure to pthread_cond_signal
> > to the waiting threads to let them exit too.
> >
> > In cus__process_file fix the check of an error returned from
> > dwfl_getmodules: it may return a positive number when a
> > callback (cus__process_dwflmod in our case) returns an error.
> >
> > Link: https://lore.kernel.org/dwarves/Z-JzFrXaopQCYd6h@localhost/
> >
> > Reported-by: Domenico Andreoli <domenico.andreoli@linux.com>
> > Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
>
> Thanks for the fix! I've tested this with the problematic module+vmlinux
> BTF and the previously-hanging pahole goes on to fail as expected; also
> run it through the work-in-progress CI, building and testing on x86_64
> and aarch64, no issues found. If anyone else has a chance to ack or test
> it, that would be great. Thanks!

Tested-by: Domenico Andreoli <domenico.andreoli@linux.com>

I rebuilt the Debian package with the patch applied, and it now fails
consistently because of the extra C++ symbols.

With the switch --lang_exclude=rust,c++11 it works without errors.

Thank you Alan and Ihor for the fast support!

Dom

>
> Alan
>
> > ---
> > dwarf_loader.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > index 84122d0..e1ba7bc 100644
> > --- a/dwarf_loader.c
> > +++ b/dwarf_loader.c
> > @@ -3459,6 +3459,7 @@ static struct {
> > */
> > uint32_t next_cu_id;
> > struct list_head jobs;
> > + bool abort;
> > } cus_processing_queue;
> >
> > enum job_type {
> > @@ -3479,6 +3480,7 @@ static void cus_queue__init(void)
> > pthread_cond_init(&cus_processing_queue.job_added, NULL);
> > INIT_LIST_HEAD(&cus_processing_queue.jobs);
> > cus_processing_queue.next_cu_id = 0;
> > + cus_processing_queue.abort = false;
> > }
> >
> > static void cus_queue__destroy(void)
> > @@ -3535,8 +3537,9 @@ static struct cu_processing_job *cus_queue__enqdeq_job(struct cu_processing_job
> > pthread_cond_signal(&cus_processing_queue.job_added);
> > }
> > for (;;) {
> > + bool abort = __atomic_load_n(&cus_processing_queue.abort, __ATOMIC_SEQ_CST);
> > job = cus_queue__try_dequeue();
> > - if (job)
> > + if (job || abort)
> > break;
> > /* No jobs or only steals out of order */
> > pthread_cond_wait(&cus_processing_queue.job_added, &cus_processing_queue.mutex);
> > @@ -3653,6 +3656,9 @@ static void *dwarf_loader__worker_thread(void *arg)
> >
> > while (!stop) {
> > job = cus_queue__enqdeq_job(job);
> > + if (!job)
> > + goto out_abort;
> > +
> > switch (job->type) {
> >
> > case JOB_DECODE:
> > @@ -3688,6 +3694,8 @@ static void *dwarf_loader__worker_thread(void *arg)
> >
> > return (void *)DWARF_CB_OK;
> > out_abort:
> > + __atomic_store_n(&cus_processing_queue.abort, true, __ATOMIC_SEQ_CST);
> > + pthread_cond_signal(&cus_processing_queue.job_added);
> > return (void *)DWARF_CB_ABORT;
> > }
> >
> > @@ -4028,7 +4036,7 @@ static int cus__process_file(struct cus *cus, struct conf_load *conf, int fd,
> >
> > /* Process the one or more modules gleaned from this file. */
> > int err = dwfl_getmodules(dwfl, cus__process_dwflmod, &parms, 0);
> > - if (err < 0)
> > + if (err)
> > return -1;
> >
> > // We can't call dwfl_end(dwfl) here, as we keep pointers to strings
>
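
For anyone reading along, here is a minimal standalone sketch of the
wake-up pattern the patch relies on. It is not the dwarf_loader.c code:
the queue struct, queue_wait_job(), queue_abort() and run_demo() are
hypothetical names, and the job list is reduced to a counter. Unlike the
patch, the sketch sets the flag while holding the mutex and wakes every
waiter with pthread_cond_broadcast(); that is only my assumption about
the simplest way to keep a small example free of lost wakeups.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for cus_processing_queue: a job counter plus the abort flag. */
static struct {
	pthread_mutex_t mutex;
	pthread_cond_t job_added;
	int pending_jobs;
	bool abort;
} queue = {
	.mutex = PTHREAD_MUTEX_INITIALIZER,
	.job_added = PTHREAD_COND_INITIALIZER,
};

/* Returns true when a job was taken, false when the queue was aborted. */
static bool queue_wait_job(void)
{
	bool got_job = false;

	pthread_mutex_lock(&queue.mutex);
	for (;;) {
		if (queue.pending_jobs > 0) {
			queue.pending_jobs--;
			got_job = true;
			break;
		}
		/* Without this check a waiter would sleep forever after an abort. */
		if (queue.abort)
			break;
		pthread_cond_wait(&queue.job_added, &queue.mutex);
	}
	pthread_mutex_unlock(&queue.mutex);
	return got_job;
}

/* Set the flag under the mutex, then wake every waiter so each can exit. */
static void queue_abort(void)
{
	pthread_mutex_lock(&queue.mutex);
	queue.abort = true;
	pthread_mutex_unlock(&queue.mutex);
	pthread_cond_broadcast(&queue.job_added);
}

static void *worker(void *arg)
{
	(void)arg;
	while (queue_wait_job())
		; /* a real worker would process the job here */
	return (void *)1; /* non-NULL marks "exited because of abort" */
}

/* Starts four workers, aborts the queue, and counts the workers that exited on abort. */
int run_demo(void)
{
	pthread_t threads[4];
	int aborted = 0;

	queue.pending_jobs = 2;
	queue.abort = false;
	for (int i = 0; i < 4; i++)
		pthread_create(&threads[i], NULL, worker, NULL);
	queue_abort();
	for (int i = 0; i < 4; i++) {
		void *ret = NULL;

		pthread_join(threads[i], &ret);
		if (ret)
			aborted++;
	}
	return aborted;
}
```

Built with -pthread and called from a main(), run_demo() returns once
all four workers have been joined. If the abort check is removed from
queue_wait_job(), the joins block, which is the hang described in the
commit message.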
>
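
The last hunk is easier to see with a small model of the iteration
contract. The code below does not call libdwfl: iterate_modules(),
visit(), process_file_old() and process_file_fixed() are hypothetical
names, and the return convention (0 when every module was visited, -1 on
an internal error, a positive value when the callback stops the walk) is
my reading of what the commit message says about dwfl_getmodules.

```c
#include <stddef.h>

/* Hypothetical callback results, modelled on DWARF_CB_OK / DWARF_CB_ABORT. */
enum { CB_OK = 0, CB_ABORT = 1 };

typedef int (*module_cb)(int module, void *arg);

/*
 * Model of the iterator contract: 0 when all modules were visited,
 * -1 on an internal error, a positive value when the callback stopped
 * the iteration early.
 */
static ptrdiff_t iterate_modules(int nr_modules, module_cb cb, void *arg)
{
	if (nr_modules < 0)
		return -1;
	for (int i = 0; i < nr_modules; i++)
		if (cb(i, arg) != CB_OK)
			return i + 1;
	return 0;
}

/* Callback that aborts on the module index passed through arg. */
static int visit(int module, void *arg)
{
	int fail_at = *(int *)arg;

	return module == fail_at ? CB_ABORT : CB_OK;
}

/* Old check: a callback abort yields a positive value and goes unnoticed. */
int process_file_old(int nr_modules, int fail_at)
{
	ptrdiff_t err = iterate_modules(nr_modules, visit, &fail_at);

	if (err < 0)
		return -1;
	return 0;
}

/* Fixed check: any non-zero result is treated as a failure. */
int process_file_fixed(int nr_modules, int fail_at)
{
	ptrdiff_t err = iterate_modules(nr_modules, visit, &fail_at);

	if (err)
		return -1;
	return 0;
}
```

With three modules and a failure injected at index 1, the old check
reports success while the fixed one reports an error; with no failure
injected, both report success.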
--
rsa4096: 3B10 0CA1 8674 ACBA B4FE FCD2 CE5B CF17 9960 DE13
ed25519: FFB4 0CC3 7F2E 091D F7DA 356E CC79 2832 ED38 CB05