lttng-dev.lists.lttng.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers via lttng-dev <lttng-dev@lists.lttng.org>
To: Minlan Wang <wangminlan@szsandstone.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>, paulmck <paulmck@kernel.org>
Subject: Re: [lttng-dev] ***UNCHECKED*** Re: urcu workqueue thread uses 99% of cpu while	workqueue is empty
Date: Tue, 21 Jun 2022 09:12:38 -0400 (EDT)	[thread overview]
Message-ID: <1843612610.17820.1655817158376.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20220621035206.GA267474@localhost.localdomain>+E7C500F72502E4B5>

----- On Jun 20, 2022, at 11:52 PM, Minlan Wang wangminlan@szsandstone.com wrote:

> Hi, Mathieu,
> On Fri, Jun 17, 2022 at 09:37:23AM -0400, Mathieu Desnoyers wrote:
>> Can you try the following patch on top of v0.12.1 + your debug patch ?
>> 
>> diff --git a/src/workqueue.c b/src/workqueue.c
>> index eed1e21e..19fc89c8 100644
>> --- a/src/workqueue.c
>> +++ b/src/workqueue.c
>> @@ -142,6 +142,7 @@ static void futex_wait(int32_t *futex)
>>  
>>  		return;
>>  	}
>> +wait:
>>  	while ((ret = futex_async(futex, FUTEX_WAIT, -1, NULL, NULL, 0))) {
>>  		err = errno;
>>  		fprintf(stderr, "%lu: wq %p: %s failed, errno %d\n",
>> @@ -161,7 +162,16 @@ static void futex_wait(int32_t *futex)
>>  
>>  	fprintf(stderr, "%lu: wq %p: %s wait return %d\n",
>>  		pthread_self(), caa_container_of(futex, struct urcu_workqueue, futex), __func__,
>>  		ret);
>> -
>> +	/*
>> +	 * Prior queued wakeups can cause futex wait to return even
>> +	 * though the futex value is still -1. If this happens, wait
>> +	 * again.
>> +	 */
>> +	if (uatomic_read(futex) == -1) {
>> +		fprintf(stderr, "%lu: wq %p: %s futex == -1, wait again\n",
>> +			pthread_self(), caa_container_of(futex, struct urcu_workqueue, futex),
>> __func__);
>> +		goto wait;
>> +	}
>>  }
>>  
>>  static void futex_wake_up(int32_t *futex)
>> 
> I tried this patch on top of my debug patch, the issue disappeared.
> The log shows that "futex_wait futex == -1, wait again" did happen several
> times.
> I put the complete log in the attachment.

This means we appear to be on the right track. Good!

> 
>> 
>> We should not rely too much on the ordering of fprintf stderr output between
>> threads,
>> because those can be interleaved in any order (each fprintf to stderr is
>> unbuffered).
>> If we want more precision in the order of events I would recommend using
>> LTTng-UST for
>> fine-grained user-space tracing. This might explain why the futex has an
>> unexpected value
>> at line 2099.
> I tried this patch on top of previous 2 patches to use lttng_ust_tracepoint in
> workqueue.c, but always failed in compile example code with this error:
> ---
> make[5]: Entering directory
> `/home/mlwang/debug/userspace-rcu/doc/examples/rculfqueue'
>  CC       cds_lfq_enqueue.o
>  CCLD     cds_lfq_enqueue
> ../../../src/.libs//liburcu-cds.so: undefined reference to
> `lttng_ust_tracepoint_provider_workqueue'
> ---
> the patch is as following:
[...]
> --- a/src/workqueue.h
> +++ b/src/workqueue.h
> @@ -28,6 +28,7 @@
> #include <pthread.h>
> 
> #include <urcu/wfcqueue.h>
> +#include "workqueue_tp.h"
> 
> #ifdef __cplusplus
> extern "C" {
> diff --git a/src/workqueue_tp.c b/src/workqueue_tp.c
> new file mode 100644
> index 0000000..a5f9586
> --- /dev/null
> +++ b/src/workqueue_tp.c
> @@ -0,0 +1,4 @@
> +#define LTTNG_UST_TRACEPOINT_CREATE_PROBES
> +#define LTTNG_UST_TRACEPOINT_DEFINE
> +
> +#include "workqueue_tp.h"
> diff --git a/src/workqueue_tp.h b/src/workqueue_tp.h
> new file mode 100644
> index 0000000..e2bce31
> --- /dev/null
> +++ b/src/workqueue_tp.h
> @@ -0,0 +1,82 @@
> +#undef	LTTNG_UST_TRACEPOINT_PROVIDER
> +#define	LTTNG_UST_TRACEPOINT_PROVIVER workqueue

If you change "PROVIVER" into "PROVIDER" here, does it fix your issue ?

If that's not it, it should be something relatively simple with the build dependencies. Let me know
if it still does not work.

[...]

> --
> 1.8.3.1
> 
> Is "perf probe" accurate in tracing the futex event here?  Maybe i can try that.

I would prefer to use lttng-ust here to keep the footprint of the tracer minimal, so we do not
change the behavior too much by calling into the kernel.

Then we'll be able to correlate traces from lttng-ust and lttng-modules (kernel tracer) and
figure out what is happening both in user-space and kernel space, especially around futex and
scheduler activity within the kernel.

Thanks,

Mathieu

> 
> B.R
> Minlan

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

  reply	other threads:[~2022-06-21 13:12 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14  3:55 [lttng-dev] urcu workqueue thread uses 99% of cpu while workqueue is empty Minlan Wang via lttng-dev
2022-06-14 13:39 ` Mathieu Desnoyers via lttng-dev
2022-06-14 13:40   ` Mathieu Desnoyers via lttng-dev
2022-06-14 14:19 ` Mathieu Desnoyers via lttng-dev
2022-06-14 15:53 ` Mathieu Desnoyers via lttng-dev
2022-06-15  3:49   ` Minlan Wang via lttng-dev
2022-06-15 13:35     ` Mathieu Desnoyers via lttng-dev
2022-06-15 14:15     ` Mathieu Desnoyers via lttng-dev
2022-06-16  7:09       ` Minlan Wang via lttng-dev
2022-06-16  8:09       ` Minlan Wang via lttng-dev
2022-06-17 13:37         ` Mathieu Desnoyers via lttng-dev
2022-06-21  3:52           ` Minlan Wang via lttng-dev
2022-06-21 13:12             ` Mathieu Desnoyers via lttng-dev [this message]
2022-06-22  7:45               ` [lttng-dev] ***UNCHECKED*** " Minlan Wang via lttng-dev
2022-06-22 13:19                 ` [lttng-dev] ***UNCHECKED*** " Mathieu Desnoyers via lttng-dev
2022-06-22 20:28                   ` Mathieu Desnoyers via lttng-dev
2022-06-22 20:52                     ` Mathieu Desnoyers via lttng-dev
2022-06-23  9:08                       ` Minlan Wang via lttng-dev
     [not found]                   ` <20220623034528.GA271179@localhost.localdomain>
2022-06-23  3:57                     ` Minlan Wang via lttng-dev
2022-06-23 14:09                       ` Mathieu Desnoyers via lttng-dev
2022-06-24  6:21                         ` Minlan Wang via lttng-dev
2022-06-23  8:36                   ` [lttng-dev] [PART 4/4] " Minlan Wang via lttng-dev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1843612610.17820.1655817158376.JavaMail.zimbra@efficios.com \
    --to=lttng-dev@lists.lttng.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=wangminlan@szsandstone.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).