lttng-dev.lists.lttng.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers via lttng-dev <lttng-dev@lists.lttng.org>
To: Minlan Wang <wangminlan@szsandstone.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>, paulmck <paulmck@kernel.org>
Subject: Re: [lttng-dev] ***UNCHECKED*** Re: Re: urcu workqueue thread uses 99% of cpu while	workqueue is empty
Date: Thu, 23 Jun 2022 10:09:55 -0400 (EDT)	[thread overview]
Message-ID: <1998086009.28888.1655993395580.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20220623035722.GA271206@localhost.localdomain>+CCEFE5C72E6CA40A>

----- On Jun 22, 2022, at 11:57 PM, Minlan Wang wangminlan@szsandstone.com wrote:

> On Wed, Jun 22, 2022 at 11:45:28PM -0400, Minlan Wang wrote:
>> The session output is in attachment: 0623_futex-20220623-112754.tar.bz2
> Hi, Mathieu,
> There are several workqueues in the process i tracked.
> The one used to trigger the issuse is this:
> thread with vtid=21054, and futex = 0x5652A1035C40

By looking at the trace, here is one problematic scenario which end up emitting "workqueue:futex_wait_again":

[23:33:48.060581417] (+0.000006985) localhost.localdomain workqueue:futex_no_wait: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, val = 0 }
[23:33:48.060581747] (+0.000000330) localhost.localdomain workqueue:futex_wake_up_success: { cpu_id = 6 }, { vpid = 19495, vtid = 20649 }, { futex = 0x5652A1035C40, val = -1 }
[23:33:48.060581926] (+0.000000179) localhost.localdomain workqueue:futex_dec_success: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, old = 0, new = -1 }
[23:33:48.060582826] (+0.000000900) localhost.localdomain syscall_entry_futex: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { uaddr = 94912888659008, op = 1, val = 1, utime = 0, uaddr2 = 0, val3 = 0 }
[23:33:48.060582855] (+0.000000029) localhost.localdomain syscall_entry_futex: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { uaddr = 94912888659008, op = 0, val = 4294967295, utime = 0, uaddr2 = 0, val3 = 0 }
[23:33:48.060584722] (+0.000001867) localhost.localdomain sched_stat_runtime: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { comm = "bcache_writebac", tid = 21054, runtime = 16033, vruntime = 96940983054 }
[23:33:48.060585840] (+0.000001118) localhost.localdomain sched_switch: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { prev_comm = "bcache_writebac", prev_tid = 21054, prev_prio = 20, prev_state = 1, next_comm = "swapper/4", next_tid = 0, next_prio = 20 }
[23:33:48.060587680] (+0.000001840) localhost.localdomain sched_stat_sleep: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { comm = "bcache_writebac", tid = 21054, delay = 2815 }
[23:33:48.060588560] (+0.000000880) localhost.localdomain sched_wakeup: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { comm = "bcache_writebac", tid = 21054, prio = 20, success = 1, target_cpu = 4 }
[23:33:48.060590437] (+0.000001877) localhost.localdomain syscall_exit_futex: { cpu_id = 4 }, { vpid = 19495, vtid = 21054, pid = 19495, tid = 21054 }, { ret = 0, uaddr = 94912888659008, uaddr2 = 0 }
[23:33:48.060591337] (+0.000000900) localhost.localdomain syscall_exit_futex: { cpu_id = 6 }, { vpid = 19495, vtid = 20649, pid = 19495, tid = 20649 }, { ret = 1, uaddr = 94912888659008, uaddr2 = 0 }
[23:33:48.060591385] (+0.000000048) localhost.localdomain workqueue:futex_wait_return: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, val = -1, ret = 0 }
[23:33:48.060592205] (+0.000000820) localhost.localdomain workqueue:futex_wait_again: { cpu_id = 4 }, { vpid = 19495, vtid = 21054 }, { futex = 0x5652A1035C40, val = -1 }

Where the wake_up happens right before the dec_success on the waiter, which
leads to the sched_wakeup awakening the waiter when the state is 0.

If we want to dig more into why this scenario can happen, we could also add tracepoints
in urcu_workqueue_queue_work() just before/after cds_wfcq_enqueue(), and in 
workqueue_thread() around each call to cds_wfcq_empty(), and around __cds_wfcq_splice_blocking().

I suspect we end up in a situation where:

* waker:

        cds_wfcq_enqueue(&workqueue->cbs_head, &workqueue->cbs_tail, &work->next); [a]
        uatomic_inc(&workqueue->qlen);
        wake_worker_thread(workqueue);                                             [b]

vs

* waiter:
                splice_ret = __cds_wfcq_splice_blocking(&cbs_tmp_head,             [c]
                        &cbs_tmp_tail, &workqueue->cbs_head, &workqueue->cbs_tail);
[...]
                        if (cds_wfcq_empty(&workqueue->cbs_head,                   [d]
                                        &workqueue->cbs_tail)) {
                                futex_wait(&workqueue->futex);                     [e]
                                uatomic_dec(&workqueue->futex);                    [f]

happen in an order such that 

[a] enqueues an item. Then [c] splices the item out of the queue because it was already awakened
by a prior wakeup. So the queue is observed as empty by [c]. Then [b] sets the futex to 0
(in userspace), which causes [e] to skip waiting on the futex, and [f] brings the value back
to -1. However, in the next loop, the queue is still observed as empty by [d], thus calling
[e] again. This time, the value is -1, so it calls sys_futex FUTEX_WAIT. Unfortunately, we
still have the call to sys_futex FUTEX_WAKE that will eventually be executed, thus awakening
the futex while the futex state is still -1, which is unexpected.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

  reply	other threads:[~2022-06-23 14:10 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14  3:55 [lttng-dev] urcu workqueue thread uses 99% of cpu while workqueue is empty Minlan Wang via lttng-dev
2022-06-14 13:39 ` Mathieu Desnoyers via lttng-dev
2022-06-14 13:40   ` Mathieu Desnoyers via lttng-dev
2022-06-14 14:19 ` Mathieu Desnoyers via lttng-dev
2022-06-14 15:53 ` Mathieu Desnoyers via lttng-dev
2022-06-15  3:49   ` Minlan Wang via lttng-dev
2022-06-15 13:35     ` Mathieu Desnoyers via lttng-dev
2022-06-15 14:15     ` Mathieu Desnoyers via lttng-dev
2022-06-16  7:09       ` Minlan Wang via lttng-dev
2022-06-16  8:09       ` Minlan Wang via lttng-dev
2022-06-17 13:37         ` Mathieu Desnoyers via lttng-dev
2022-06-21  3:52           ` Minlan Wang via lttng-dev
2022-06-21 13:12             ` [lttng-dev] ***UNCHECKED*** " Mathieu Desnoyers via lttng-dev
2022-06-22  7:45               ` Minlan Wang via lttng-dev
2022-06-22 13:19                 ` [lttng-dev] ***UNCHECKED*** " Mathieu Desnoyers via lttng-dev
2022-06-22 20:28                   ` Mathieu Desnoyers via lttng-dev
2022-06-22 20:52                     ` Mathieu Desnoyers via lttng-dev
2022-06-23  9:08                       ` Minlan Wang via lttng-dev
     [not found]                   ` <20220623034528.GA271179@localhost.localdomain>
2022-06-23  3:57                     ` Minlan Wang via lttng-dev
2022-06-23 14:09                       ` Mathieu Desnoyers via lttng-dev [this message]
2022-06-24  6:21                         ` Minlan Wang via lttng-dev
2022-06-23  8:36                   ` [lttng-dev] [PART 4/4] " Minlan Wang via lttng-dev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1998086009.28888.1655993395580.JavaMail.zimbra@efficios.com \
    --to=lttng-dev@lists.lttng.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=wangminlan@szsandstone.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).