From: Elizabeth Figura <zfigura@codeweavers.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Arnd Bergmann" <arnd@arndb.de>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <shuah@kernel.org>,
linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
wine-devel@winehq.org, "André Almeida" <andrealmeid@igalia.com>,
"Wolfram Sang" <wsa@kernel.org>,
"Arkadiusz Hiler" <ahiler@codeweavers.com>,
"Andy Lutomirski" <luto@kernel.org>,
linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
"Randy Dunlap" <rdunlap@infradead.org>,
"Ingo Molnar" <mingo@redhat.com>, "Will Deacon" <will@kernel.org>,
"Waiman Long" <longman@redhat.com>,
"Boqun Feng" <boqun.feng@gmail.com>
Subject: Re: [PATCH v4 00/30] NT synchronization primitive driver
Date: Tue, 16 Apr 2024 16:18:17 -0500
Message-ID: <23472492.6Emhk5qWAg@terabithia>
In-Reply-To: <20240416161917.GD12673@noisy.programming.kicks-ass.net>

On Tuesday, 16 April 2024 11:19:17 CDT Peter Zijlstra wrote:
> On Tue, Apr 16, 2024 at 05:53:45PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 16, 2024 at 05:50:14PM +0200, Peter Zijlstra wrote:
> > > On Tue, Apr 16, 2024 at 10:14:21AM +0200, Peter Zijlstra wrote:
> > > > > > Some aspects of the implementation may deserve particular comment:
> > > > > >
> > > > > > * In the interest of performance, each object is governed only by a
> > > > > >   single spinlock. However, NTSYNC_IOC_WAIT_ALL requires that the
> > > > > >   state of multiple objects be changed as a single atomic operation.
> > > > > >   In order to achieve this, we first take a device-wide lock
> > > > > >   ("wait_all_lock") any time we are going to lock more than one
> > > > > >   object at a time.
> > > > > >
> > > > > >   The maximum number of objects that can be used in a vectored wait,
> > > > > >   and therefore the maximum that can be locked simultaneously, is 64.
> > > > > >   This number is NT's own limit.
> > >
> > > AFAICT:
> > > >   spin_lock(&dev->wait_all_lock);
> > > >     list_for_each_entry(entry, &obj->all_waiters, node)
> > > >       for (i=0; i<count; i++)
> > > >         spin_lock_nest_lock(q->entries[i].obj->lock, &dev->wait_all_lock);
> > >
> > > Where @count <= NTSYNC_MAX_WAIT_COUNT.
> > >
> > > So while this nests at most 65 spinlocks, there is no actual bound on
> > > the amount of nested lock sections in total. That is, all_waiters list
> > > can be grown without limits.
> > >
> > > Can we pretty please make wait_all_lock a mutex ?

That should be fine, at least.

> > Hurmph, it's worse, you do that list walk while holding some obj->lock
> > spinlock too. Still need to figure out how all that works....
>
> So the point of having that other lock around is so that things like:
>
> try_wake_all_obj(dev, sem)
> try_wake_any_sem(sem)
>
> are done under the same lock?

The point of having the other lock around is that try_wake_all() needs to lock
multiple objects at the same time. It's a way of avoiding lock inversion.

Consider: task A does a wait-for-all on objects X, Y, Z. Then task B signals Y,
so we do try_wake_all_obj() on Y, which does try_wake_all() on A's queue
entry; that needs to check X and Z and consume the state of all three objects
atomically. Another task could be trying to signal Z at the same time and
could hit a task waiting on Z, Y, X. With only the per-object locks, the two
would take those locks in opposite orders, and that causes inversion.

The simple and easy way to implement everything is just to have a global lock
on the whole device, but this is kind of known to be a performance bottleneck
(this was NT's BKL, and they ditched it starting with Vista or 7 or
something).

Instead we use a lock per object, and normally in the wait-for-any case we
only ever need to grab one lock at a time, but when we need to do a
wait-for-all we need to lock multiple objects at once, and we grab the outer
lock first to avoid potential lock inversion.

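As a rough illustration of that scheme, here is a userspace sketch. It is not the driver code: it assumes pthread mutexes in place of kernel spinlocks, and the names struct obj, struct dev, try_consume_any() and try_consume_all() are made up for this example. The only point is the lock ordering: a path that touches one object takes just that object's lock, and a path that touches several takes the outer lock first.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a semaphore-like object. */
struct obj {
	pthread_mutex_t lock;	/* per-object lock */
	unsigned int count;	/* object is signaled while count > 0 */
};

/* Hypothetical stand-in for the device; plays the role of wait_all_lock. */
struct dev {
	pthread_mutex_t wait_all_lock;
};

/* Wait-for-any style path: only one object lock is ever held. */
static bool try_consume_any(struct obj *o)
{
	bool ok = false;

	pthread_mutex_lock(&o->lock);
	if (o->count > 0) {
		o->count--;
		ok = true;
	}
	pthread_mutex_unlock(&o->lock);
	return ok;
}

/*
 * Wait-for-all style path: take the outer lock, then every object lock.
 * Any two multi-object lockers are serialized by the outer lock, so the
 * order in which each of them takes the inner locks cannot invert.
 * Assumes the objects in objs[] are distinct.
 */
static bool try_consume_all(struct dev *d, struct obj **objs, size_t n)
{
	bool ok = true;
	size_t i;

	pthread_mutex_lock(&d->wait_all_lock);
	for (i = 0; i < n; i++)
		pthread_mutex_lock(&objs[i]->lock);

	/* Check every object before consuming anything. */
	for (i = 0; i < n; i++)
		if (objs[i]->count == 0)
			ok = false;

	/* All signaled: consume the state of all objects as one step. */
	if (ok)
		for (i = 0; i < n; i++)
			objs[i]->count--;

	for (i = n; i-- > 0; )
		pthread_mutex_unlock(&objs[i]->lock);
	pthread_mutex_unlock(&d->wait_all_lock);
	return ok;
}
```

A single-object locker never waits for a second lock while holding its own, so it cannot take part in a cycle with the multi-object path either.
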
> Where I seem to note that both those functions do that same list
> iteration.

Over different lists. I don't know if there's a better way to name things to
make that clearer.

There's the "any" wait queue, which tasks that do a wait-for-any add
themselves to, and the "all" wait queue, which tasks that do a wait-for-all
add themselves to. Signaling an object could potentially wake up a task on
either one, but checking whether a task is eligible is a different process
for each.
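
To make the two-queue distinction concrete, here is a simplified single-threaded sketch. It is not the driver's data structures: struct waiter, the fixed-size arrays, enqueue() and signal_obj() are all invented for illustration, locking is left out entirely, and woken waiters are simply skipped rather than dequeued. It only shows that the two walks cover different lists and apply different eligibility checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJS	4	/* hypothetical capacities, not checked below */
#define MAX_WAITERS	8

struct obj;

/* One blocked task (hypothetical layout). */
struct waiter {
	struct obj *objs[MAX_OBJS];	/* objects this task waits on */
	size_t count;
	bool woken;
};

/* Semaphore-like object carrying two separate wait queues. */
struct obj {
	unsigned int count;			/* signaled while > 0 */
	struct waiter *any_waiters[MAX_WAITERS];	/* wait-for-any tasks */
	size_t n_any;
	struct waiter *all_waiters[MAX_WAITERS];	/* wait-for-all tasks */
	size_t n_all;
};

/* Add the waiter to the matching queue of every object it waits on. */
static void enqueue(struct waiter *w, bool all)
{
	for (size_t i = 0; i < w->count; i++) {
		struct obj *o = w->objs[i];

		if (all)
			o->all_waiters[o->n_all++] = w;
		else
			o->any_waiters[o->n_any++] = w;
	}
}

/* "any" queue: a waiter is eligible as soon as this one object is signaled. */
static void try_wake_any(struct obj *o)
{
	for (size_t i = 0; i < o->n_any && o->count > 0; i++) {
		struct waiter *w = o->any_waiters[i];

		if (!w->woken) {
			o->count--;
			w->woken = true;
		}
	}
}

/* "all" queue: a waiter is eligible only if every one of its objects is. */
static void try_wake_all(struct obj *o)
{
	for (size_t i = 0; i < o->n_all; i++) {
		struct waiter *w = o->all_waiters[i];
		bool ready = true;

		if (w->woken)
			continue;
		for (size_t j = 0; j < w->count; j++)
			if (w->objs[j]->count == 0)
				ready = false;
		if (!ready)
			continue;
		for (size_t j = 0; j < w->count; j++)
			w->objs[j]->count--;
		w->woken = true;
	}
}

/* Signaling an object walks both queues, "all" first and then "any". */
static void signal_obj(struct obj *o)
{
	o->count++;
	try_wake_all(o);
	try_wake_any(o);
}
```
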