From: Mathieu Desnoyers via lttng-dev <lttng-dev@lists.lttng.org>
To: Minlan Wang <wangminlan@szsandstone.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>, paulmck <paulmck@kernel.org>
Subject: Re: [lttng-dev] ***UNCHECKED*** Re: Re: urcu workqueue thread uses 99% of cpu while workqueue is empty
Date: Thu, 23 Jun 2022 10:09:55 -0400 (EDT) [thread overview]
Message-ID: <1998086009.28888.1655993395580.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20220623035722.GA271206@localhost.localdomain>+CCEFE5C72E6CA40A>
----- On Jun 22, 2022, at 11:57 PM, Minlan Wang wangminlan@szsandstone.com wrote:
> On Wed, Jun 22, 2022 at 11:45:28PM -0400, Minlan Wang wrote:
>> The session output is in attachment: 0623_futex-20220623-112754.tar.bz2
> Hi, Mathieu,
> There are several workqueues in the process i tracked.
> The one used to trigger the issuse is this:
> thread with vtid=21054, and futex = 0x5652A1035C40
By looking at the trace, here is one problematic scenario which end up emitting "workqueue:futex_wait_again":
[23:33:48.060581417] (+0.000006985) localhost.localdomain workqueue:futex_no_wait: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, val = 0 }
[23:33:48.060581747] (+0.000000330) localhost.localdomain workqueue:futex_wake_up_success: { cpu_id = 6 }, { vpid = 19495, vtid = 20649 }, { futex = 0x5652A1035C40, val = -1 }
[23:33:48.060581926] (+0.000000179) localhost.localdomain workqueue:futex_dec_success: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, old = 0, new = -1 }
[23:33:48.060582826] (+0.000000900) localhost.localdomain syscall_entry_futex: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { uaddr = 94912888659008, op = 1, val = 1, utime = 0, uaddr2 = 0, val3 = 0 }
[23:33:48.060582855] (+0.000000029) localhost.localdomain syscall_entry_futex: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { uaddr = 94912888659008, op = 0, val = 4294967295, utime = 0, uaddr2 = 0, val3 = 0 }
[23:33:48.060584722] (+0.000001867) localhost.localdomain sched_stat_runtime: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { comm = "bcache_writebac", tid = 21054, runtime = 16033, vruntime = 96940983054 }
[23:33:48.060585840] (+0.000001118) localhost.localdomain sched_switch: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { prev_comm = "bcache_writebac", prev_tid = 21054, prev_prio = 20, prev_state = 1, next_comm = "swapper/4", next_tid = 0, next_prio = 20 }
[23:33:48.060587680] (+0.000001840) localhost.localdomain sched_stat_sleep: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { comm = "bcache_writebac", tid = 21054, delay = 2815 }
[23:33:48.060588560] (+0.000000880) localhost.localdomain sched_wakeup: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { comm = "bcache_writebac", tid = 21054, prio = 20, success = 1, target_cpu = 4 }
[23:33:48.060590437] (+0.000001877) localhost.localdomain syscall_exit_futex: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { ret = 0, uaddr = 94912888659008, uaddr2 = 0 }
[23:33:48.060591337] (+0.000000900) localhost.localdomain syscall_exit_futex: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { ret = 1, uaddr = 94912888659008, uaddr2 = 0 }
[23:33:48.060591385] (+0.000000048) localhost.localdomain workqueue:futex_wait_return: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, val = -1, ret = 0 }
[23:33:48.060592205] (+0.000000820) localhost.localdomain workqueue:futex_wait_again: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, val = -1 }
Where the wake_up happens right before the dec_success on the waiter, which
leads to the sched_wakeup awakening the waiter when the state is 0.
If we want to dig more into why this scenario can happen, we could also add tracepoints
in urcu_workqueue_queue_work() just before/after cds_wfcq_enqueue(), and in
workqueue_thread() around each call to cds_wfcq_empty(), and around __cds_wfcq_splice_blocking().
I suspect we end up in a situation where:
* waker:
cds_wfcq_enqueue(&workqueue->cbs_head, &workqueue->cbs_tail, &work->next); [a]
uatomic_inc(&workqueue->qlen);
wake_worker_thread(workqueue); [b]
vs
* waiter:
splice_ret = __cds_wfcq_splice_blocking(&cbs_tmp_head, [c]
&cbs_tmp_tail, &workqueue->cbs_head, &workqueue->cbs_tail);
[...]
if (cds_wfcq_empty(&workqueue->cbs_head, [d]
&workqueue->cbs_tail)) {
futex_wait(&workqueue->futex); [e]
uatomic_dec(&workqueue->futex); [f]
happen in an order such that
[a] enqueues an item. Then [c] splices the item out of the queue because it was already awakened
by a prior wakeup. So the queue is observed as empty by [c]. Then [b] sets the futex to 0
(in userspace), which causes [e] to skip waiting on the futex, and [f] brings the value back
to -1. However, in the next loop, the queue is still observed as empty by [d], thus calling
[e] again. This time, the value is -1, so it calls sys_futex FUTEX_WAIT. Unfortunately, we
still have the call to sys_futex FUTEX_WAKE that will eventually be executed, thus awakening
the futex while the futex state is still -1, which is unexpected.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
next prev parent reply other threads:[~2022-06-23 14:10 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-14 3:55 [lttng-dev] urcu workqueue thread uses 99% of cpu while workqueue is empty Minlan Wang via lttng-dev
2022-06-14 13:39 ` Mathieu Desnoyers via lttng-dev
2022-06-14 13:40 ` Mathieu Desnoyers via lttng-dev
2022-06-14 14:19 ` Mathieu Desnoyers via lttng-dev
2022-06-14 15:53 ` Mathieu Desnoyers via lttng-dev
2022-06-15 3:49 ` Minlan Wang via lttng-dev
2022-06-15 13:35 ` Mathieu Desnoyers via lttng-dev
2022-06-15 14:15 ` Mathieu Desnoyers via lttng-dev
2022-06-16 7:09 ` Minlan Wang via lttng-dev
2022-06-16 8:09 ` Minlan Wang via lttng-dev
2022-06-17 13:37 ` Mathieu Desnoyers via lttng-dev
2022-06-21 3:52 ` Minlan Wang via lttng-dev
2022-06-21 13:12 ` [lttng-dev] ***UNCHECKED*** " Mathieu Desnoyers via lttng-dev
2022-06-22 7:45 ` Minlan Wang via lttng-dev
2022-06-22 13:19 ` [lttng-dev] ***UNCHECKED*** " Mathieu Desnoyers via lttng-dev
2022-06-22 20:28 ` Mathieu Desnoyers via lttng-dev
2022-06-22 20:52 ` Mathieu Desnoyers via lttng-dev
2022-06-23 9:08 ` Minlan Wang via lttng-dev
[not found] ` <20220623034528.GA271179@localhost.localdomain>
2022-06-23 3:57 ` Minlan Wang via lttng-dev
2022-06-23 14:09 ` Mathieu Desnoyers via lttng-dev [this message]
2022-06-24 6:21 ` Minlan Wang via lttng-dev
2022-06-23 8:36 ` [lttng-dev] [PART 4/4] " Minlan Wang via lttng-dev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1998086009.28888.1655993395580.JavaMail.zimbra@efficios.com \
--to=lttng-dev@lists.lttng.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=paulmck@kernel.org \
--cc=wangminlan@szsandstone.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.