From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: "André Almeida" <andrealmeid@igalia.com>,
"Peter Zijlstra" <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
"Paul E . McKenney" <paulmck@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
"H . Peter Anvin" <hpa@zytor.com>, Paul Turner <pjt@google.com>,
linux-api@vger.kernel.org, Christian Brauner <brauner@kernel.org>,
Florian Weimer <fw@deneb.enyo.de>,
David.Laight@ACULAB.COM, carlos@redhat.com,
Peter Oskolkov <posk@posk.io>,
Alexander Mikhalitsyn <alexander@mihalicyn.com>,
Chris Kennelly <ckennelly@google.com>,
Ingo Molnar <mingo@redhat.com>,
Darren Hart <dvhart@infradead.org>,
Davidlohr Bueso <dave@stgolabs.net>,
libc-alpha@sourceware.org, Steven Rostedt <rostedt@goodmis.org>,
Jonathan Corbet <corbet@lwn.net>,
Noah Goldstein <goldstein.w.n@gmail.com>,
Daniel Colascione <dancol@google.com>,
longman@redhat.com, kernel-dev@igalia.com
Subject: Re: [PATCH v2 1/1] futex: Add FUTEX_SPIN operation
Date: Thu, 23 May 2024 16:20:16 -0400 [thread overview]
Message-ID: <62825712-36bc-483c-9bca-db3d9233b0d2@efficios.com> (raw)
In-Reply-To: <20240523200704.281514-2-andrealmeid@igalia.com>
On 2024-05-23 16:07, André Almeida wrote:
> Add a new mode for futex wait, the futex spin.
>
> Given the FUTEX2_SPIN flag, parse the futex value as the TID of the lock
> owner. Then, before going to the normal wait path, spins while the lock
> owner is running in a different CPU, to avoid the whole context switch
> operation and to quickly return to userspace. If the lock owner is not
> running, just sleep as the normal futex wait path.
>
> The user value is masked with FUTEX_TID_MASK, to allow some bits for
> future use.
>
> The check for the owner to be running or not is important to avoid
> spinning for something that won't be released quickly. Userspace is
> responsible on providing the proper TID, the kernel does a basic check.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
[...]
> +
> +static int futex_spin(struct futex_hash_bucket *hb, struct futex_q *q,
> + struct hrtimer_sleeper *timeout, u32 uval)
> +{
> + struct task_struct *p;
> + pid_t pid = uval & FUTEX_TID_MASK;
> +
> + p = find_get_task_by_vpid(pid);
> +
> + /* no task found, maybe it already exited */
> + if (!p) {
> + futex_q_unlock(hb);
> + return -EAGAIN;
> + }
> +
> + /* can't spin in a kernel task */
> + if (unlikely(p->flags & PF_KTHREAD)) {
> + put_task_struct(p);
> + futex_q_unlock(hb);
> + return -EPERM;
> + }
> +
> + futex_queue(q, hb);
> +
> + if (timeout)
> + hrtimer_sleeper_start_expires(timeout, HRTIMER_MODE_ABS);
> +
> + while (1) {
Infinite loops in other kernel/futex/ files appear to use "for (;;) {"
instead.
> + if (likely(!plist_node_empty(&q->list))) {
> + if (timeout && !timeout->task)
> + goto exit;
> +
> + if (task_on_cpu(p)) {
> + /* spin */
You may want to add a "cpu_relax();" here to lessen the
power consumption of this busy-loop.
> + continue;
> + } else {
> + /* task is not running, sleep */
> + break;
> + }
> + } else {
> + goto exit;
> + }
> + }
> +
> + /* spinning didn't work, go to the normal path */
> + set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
I wonder if flipping the order between:
set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
futex_queue(q, hb);
as done in futex_wait_queue() and what is done here matters ?
Does it introduce a race where a wakeup could be missed ?
If it's an issue, then setting the current state could be done
before invoking futex_queue(), and whenever the spin exits,
set it back to TASK_RUNNING.
> +
> + if (likely(!plist_node_empty(&q->list))) {
> + if (!timeout || timeout->task)
> + schedule();
> + }
> +
> + __set_current_state(TASK_RUNNING);
> +
> +exit:
> + put_task_struct(p);
> + return 0;
> +}
> +
> /**
> * futex_unqueue_multiple - Remove various futexes from their hash bucket
> * @v: The list of futexes to unqueue
> @@ -665,8 +732,15 @@ int __futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
> if (ret)
> return ret;
>
> - /* futex_queue and wait for wakeup, timeout, or a signal. */
> - futex_wait_queue(hb, &q, to);
> + if (flags & FLAGS_SPIN) {
> + ret = futex_spin(hb, &q, to, val);
The empty line below could be removed.
Thanks,
Mathieu
> +
> + if (ret)
> + return ret;
> + } else {
> + /* futex_queue and wait for wakeup, timeout, or a signal. */
> + futex_wait_queue(hb, &q, to);
> + }
>
> /* If we were woken (and unqueued), we succeeded, whatever. */
> if (!futex_unqueue(&q))
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
next prev parent reply other threads:[~2024-05-23 20:19 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-23 20:07 [PATCH v2 0/1] Add FUTEX_SPIN operation André Almeida
2024-05-23 20:07 ` [PATCH v2 1/1] futex: " André Almeida
2024-05-23 20:20 ` Mathieu Desnoyers [this message]
2024-05-24 7:52 ` David Laight
2024-05-25 9:36 ` Mateusz Guzik
2024-05-23 20:39 ` [PATCH v2 0/1] " Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=62825712-36bc-483c-9bca-db3d9233b0d2@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=David.Laight@ACULAB.COM \
--cc=alexander@mihalicyn.com \
--cc=andrealmeid@igalia.com \
--cc=boqun.feng@gmail.com \
--cc=brauner@kernel.org \
--cc=carlos@redhat.com \
--cc=ckennelly@google.com \
--cc=corbet@lwn.net \
--cc=dancol@google.com \
--cc=dave@stgolabs.net \
--cc=dvhart@infradead.org \
--cc=fw@deneb.enyo.de \
--cc=goldstein.w.n@gmail.com \
--cc=hpa@zytor.com \
--cc=kernel-dev@igalia.com \
--cc=libc-alpha@sourceware.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@redhat.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=posk@posk.io \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).