From: Domenico Andreoli <domenico.andreoli@linux.com>
To: Ihor Solodrai <ihor.solodrai@linux.dev>
Cc: dwarves@vger.kernel.org
Subject: Re: parallel pahole hangs while building modules from nvidia-open-kernel-dkms
Date: Fri, 28 Mar 2025 10:05:55 +0100
Message-ID: <Z-ZmcwXyMtAQjaoE@localhost>
In-Reply-To: <83315e0bce204f7745448fff550574d44b09b4c1@linux.dev>

On Wed, Mar 26, 2025 at 08:48:51PM +0000, Ihor Solodrai wrote:
> On 3/25/25 2:10 AM, Domenico Andreoli wrote:
> > Hi,
> >
> >   This is a forward of Debian bug report [0] where you can find more
> > details. At [1] and [2] you can get the kernel and module to reproduce.
> > I could reproduce on both amd64 and arm64 using pahole 1.29.
> >
> > This is marked as serious severity because it makes the autobuilder hang
> > as well [3].
> >
> > Could you please help?
> >
> > Regards,
> > Domenico
> 
> Hi Domenico, thanks for the bug report.

Hi Ihor,

> 
> I debugged the hang, and it appears that "abort" handling in case
> of a BTF encoding error was overlooked in recent changes to speed up
> parallel encoding.
> 
> Could you please try the diff below, and check if it resolves the
> hanging?
> 

Yes, I tried it and the hang is gone.

Now both parallel and sequential invocations fail with this error:

  dwarf_expr: unhandled 0x12 DW_OP_ operation
  Unsupported DW_TAG_reference_type(0x10): type: 0x28172
  Error while encoding BTF.
  dwarf_expr: unhandled 0x12 DW_OP_ operation
  dwarf_expr: unhandled 0x12 DW_OP_ operation
  dwarf_expr: unhandled 0x12 DW_OP_ operation
  libbpf: failed to find '.BTF' ELF section in nvidia-modeset.ko
  pahole: nvidia-modeset.ko: Invalid argument

I guess this is a separate issue that was simply masked by the previous bug.
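
For reference, the numeric codes in the log above can be decoded by hand.
The sketch below is only an illustration: dw_op_name() is a hypothetical
helper, not a pahole or elfutils function, and the table covers just the
DWARF stack-manipulation opcodes around 0x12, with values as assigned by
the DWARF specification.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of pahole or elfutils): map a few DWARF
 * stack-manipulation opcodes to their names. The numeric values are the
 * ones assigned by the DWARF specification.
 */
const char *dw_op_name(unsigned int op)
{
	switch (op) {
	case 0x12: return "DW_OP_dup";
	case 0x13: return "DW_OP_drop";
	case 0x14: return "DW_OP_over";
	case 0x15: return "DW_OP_pick";
	case 0x16: return "DW_OP_swap";
	case 0x17: return "DW_OP_rot";
	default:   return NULL; /* not covered by this small table */
	}
}
```

So the "unhandled 0x12" lines refer to DW_OP_dup. The 0x10 in the second
message is simply the tag value of DW_TAG_reference_type.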

> diff --git a/dwarf_loader.c b/dwarf_loader.c
> index 84122d0..e1ba7bc 100644
> --- a/dwarf_loader.c
> +++ b/dwarf_loader.c
> @@ -3459,6 +3459,7 @@ static struct {
>  	 */
>  	uint32_t next_cu_id;
>  	struct list_head jobs;
> +	bool abort;
>  } cus_processing_queue;
>  
>  enum job_type {
> @@ -3479,6 +3480,7 @@ static void cus_queue__init(void)
>  	pthread_cond_init(&cus_processing_queue.job_added, NULL);
>  	INIT_LIST_HEAD(&cus_processing_queue.jobs);
>  	cus_processing_queue.next_cu_id = 0;
> +	cus_processing_queue.abort = false;
>  }
>  
>  static void cus_queue__destroy(void)
> @@ -3535,8 +3537,9 @@ static struct cu_processing_job *cus_queue__enqdeq_job(struct cu_processing_job
>  		pthread_cond_signal(&cus_processing_queue.job_added);
>  	}
>  	for (;;) {
> +		bool abort = __atomic_load_n(&cus_processing_queue.abort, __ATOMIC_SEQ_CST);
>  		job = cus_queue__try_dequeue();
> -		if (job)
> +		if (job || abort)
>  			break;
>  		/* No jobs or only steals out of order */
>  		pthread_cond_wait(&cus_processing_queue.job_added, &cus_processing_queue.mutex);
> @@ -3653,6 +3656,9 @@ static void *dwarf_loader__worker_thread(void *arg)
>  
>  	while (!stop) {
>  		job = cus_queue__enqdeq_job(job);
> +		if (!job)
> +			goto out_abort;
> +
>  		switch (job->type) {
>  
>  		case JOB_DECODE:
> @@ -3688,6 +3694,8 @@ static void *dwarf_loader__worker_thread(void *arg)
>  
>  	return (void *)DWARF_CB_OK;
>  out_abort:
> +	__atomic_store_n(&cus_processing_queue.abort, true, __ATOMIC_SEQ_CST);
> +	pthread_cond_signal(&cus_processing_queue.job_added);
>  	return (void *)DWARF_CB_ABORT;
>  }
>  
> @@ -4028,7 +4036,7 @@ static int cus__process_file(struct cus *cus, struct conf_load *conf, int fd,
>  
>  	/* Process the one or more modules gleaned from this file. */
>  	int err = dwfl_getmodules(dwfl, cus__process_dwflmod, &parms, 0);
> -	if (err < 0)
> +	if (err)
>  		return -1;
>  
>  	// We can't call dwfl_end(dwfl) here, as we keep pointers to strings
> 
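
To check my understanding of the fix, here is a minimal standalone sketch
of the same idea: workers blocked on a condition variable must also be
woken when another worker aborts, otherwise they wait forever. All names
(queue, queue_take, queue_abort, run_demo) are made up for illustration
and are not pahole code. Unlike the diff, which uses __atomic_store_n and
pthread_cond_signal, the sketch sets the flag under the mutex and uses
pthread_cond_broadcast.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_WORKERS 16

enum { JOB_FAILED = 1, WOKEN_BY_ABORT = 2 };

/* Hypothetical miniature queue; not pahole's cus_processing_queue. */
static struct {
	pthread_mutex_t mutex;
	pthread_cond_t job_added;
	int pending;	/* jobs ready to be taken */
	bool abort;	/* set by the first worker that fails */
} queue = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false };

/* Block until a job is available or an abort was requested.
 * Returns true if a job was taken, false on abort. */
static bool queue_take(void)
{
	bool got_job = false;

	pthread_mutex_lock(&queue.mutex);
	while (queue.pending == 0 && !queue.abort)
		pthread_cond_wait(&queue.job_added, &queue.mutex);
	if (!queue.abort) {
		queue.pending--;
		got_job = true;
	}
	pthread_mutex_unlock(&queue.mutex);
	return got_job;
}

/* Record the abort and wake every waiter so none stays blocked. */
static void queue_abort(void)
{
	pthread_mutex_lock(&queue.mutex);
	queue.abort = true;
	pthread_cond_broadcast(&queue.job_added);
	pthread_mutex_unlock(&queue.mutex);
}

static void *worker(void *arg)
{
	(void)arg;
	if (!queue_take())
		return (void *)(intptr_t)WOKEN_BY_ABORT;
	/* Pretend the only job hits an encoding error. */
	queue_abort();
	return (void *)(intptr_t)JOB_FAILED;
}

/* Start nr_workers threads competing for a single failing job.
 * Returns how many threads were joined and stores how many of them
 * left their wait because of the abort flag. */
int run_demo(int nr_workers, int *nr_woken_by_abort)
{
	pthread_t threads[MAX_WORKERS];
	int created = 0, joined = 0, i;

	if (nr_workers > MAX_WORKERS)
		nr_workers = MAX_WORKERS;
	queue.pending = 1;
	queue.abort = false;
	*nr_woken_by_abort = 0;

	for (i = 0; i < nr_workers; i++) {
		if (pthread_create(&threads[created], NULL, worker, NULL))
			break;
		created++;
	}
	for (i = 0; i < created; i++) {
		void *ret;

		pthread_join(threads[i], &ret);
		joined++;
		if ((intptr_t)ret == WOKEN_BY_ABORT)
			(*nr_woken_by_abort)++;
	}
	return joined;
}
```

With four workers and one failing job, one worker reports the failure and
the other three return because of the abort flag. Without the abort check
in the wait loop, those three would stay blocked in pthread_cond_wait(),
which looks like the hang from the bug report.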

Is this patch already final, or would you prefer that I wait for review and merge first?

I would apply it on top of Debian's 1.29 and release a new 1.29-3 package.

Thanks,
dom

-- 
rsa4096: 3B10 0CA1 8674 ACBA B4FE  FCD2 CE5B CF17 9960 DE13
ed25519: FFB4 0CC3 7F2E 091D F7DA  356E CC79 2832 ED38 CB05
