From: Waiman Long <longman@redhat.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [PATCH 11/20] locking/osq: Export osq_(lock|unlock)
Date: Wed, 2 Aug 2023 17:09:13 -0400 [thread overview]
Message-ID: <11d39248-31fc-c625-7c06-341f0146bd67@redhat.com> (raw)
In-Reply-To: <20230802204407.lk5mnj7ua6idddbd@moria.home.lan>
On 8/2/23 16:44, Kent Overstreet wrote:
> On Wed, Aug 02, 2023 at 04:16:12PM -0400, Waiman Long wrote:
>> On 7/12/23 17:11, Kent Overstreet wrote:
>>> These are used by bcachefs's six locks.
>>>
>>> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: Waiman Long <longman@redhat.com>
>>> Cc: Boqun Feng <boqun.feng@gmail.com>
>>> ---
>>> kernel/locking/osq_lock.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>>> index d5610ad52b..b752ec5cc6 100644
>>> --- a/kernel/locking/osq_lock.c
>>> +++ b/kernel/locking/osq_lock.c
>>> @@ -203,6 +203,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>> return false;
>>> }
>>> +EXPORT_SYMBOL_GPL(osq_lock);
>>> void osq_unlock(struct optimistic_spin_queue *lock)
>>> {
>>> @@ -230,3 +231,4 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>>> if (next)
>>> WRITE_ONCE(next->locked, 1);
>>> }
>>> +EXPORT_SYMBOL_GPL(osq_unlock);
>> Have you considered extending the current rw_semaphore to support a SIX lock
>> semantics? There are a number of instances in the kernel that a up_read() is
>> followed by a down_write(). Basically, the code try to upgrade the lock from
>> read to write. I have been thinking about adding a upgrade_read() API to do
>> that. However, the concern that I had was that another writer may come in
>> and make modification before the reader can be upgraded to have exclusive
>> write access and will make the task to repeat what has been done in the read
>> lock part. By adding a read with intent to upgrade to write, we can have
>> that guarantee.
> It's been discussed, Linus had the same thought.
>
> But it'd be a massive change to the rw semaphore code; this "read with
> intent" really is a third lock state which needs all the same
> lock/trylock/unlock paths, and with the way rw semaphore has separate
> entry points for read and write it'd be a _ton_ of new code. It really
> touches everything - waitlist handling included.
Yes, it is a major change, but I had done that before and it is
certainly doable. There are spare bits in the low byte of rwsem->count
that can be used as an intent bit. We also need to add a new
rwsem_wake_type for that for waitlist handling.
>
> And six locks have several other features that bcachefs needs, and other
> users may also end up wanting, that rw semaphores don't have; the two
> main features being a percpu read lock mode and support for an external
> cycle detector (which requires exposing lock waitlists, with some
> guarantees about how those waitlists are used).
Can you provide more information about those features?
>
>> With that said, I would prefer to keep osq_{lock/unlock} for internal use by
>> some higher level locking primitives - mutex, rwsem and rt_mutex.
> Yeah, I'm aware, but it seems like exposing osq_(lock|unlock) is the
> most palatable solution for now. Long term, I'd like to get six locks
> promoted to kernel/locking.
Your SIX overlaps with rwsem in term of features. So we will have to
somehow merge them instead of having 2 APIs with somewhat similar
functionality.
Cheers,
Longman
>
next prev parent reply other threads:[~2023-08-02 21:10 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-12 21:10 [PATCH 00/20] bcachefs prereqs patch series Kent Overstreet
2023-07-12 21:10 ` [PATCH 01/20] sched: Add task_struct->faults_disabled_mapping Kent Overstreet
2023-07-12 21:10 ` [PATCH 02/20] fs: factor out d_mark_tmpfile() Kent Overstreet
2023-07-12 21:10 ` [PATCH 03/20] iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic() Kent Overstreet
2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
2023-07-24 17:31 ` Christoph Hellwig
2023-07-25 3:00 ` Kent Overstreet
2023-07-26 13:20 ` Christoph Hellwig
2023-08-01 18:59 ` Kent Overstreet
2023-07-25 2:59 ` Matthew Wilcox
2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
2023-07-24 17:34 ` Christoph Hellwig
2023-07-25 2:43 ` Kent Overstreet
2023-07-26 13:23 ` Christoph Hellwig
2023-08-01 19:04 ` Kent Overstreet
2023-08-02 11:47 ` Christoph Hellwig
2023-08-02 16:44 ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
2023-07-24 17:35 ` Christoph Hellwig
2023-07-25 2:45 ` Kent Overstreet
2023-07-26 13:21 ` Christoph Hellwig
2023-08-01 19:05 ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 07/20] block: Don't block on s_umount from __invalidate_super() Kent Overstreet
2023-07-12 21:11 ` [PATCH 08/20] stacktrace: Export stack_trace_save_tsk Kent Overstreet
2023-07-12 21:11 ` [PATCH 09/20] lib/string_helpers: string_get_size() now returns characters wrote Kent Overstreet
2023-07-12 21:11 ` [PATCH 10/20] lib: Export errname Kent Overstreet
2023-07-13 7:10 ` Eric Biggers
2023-07-12 21:11 ` [PATCH 11/20] locking/osq: Export osq_(lock|unlock) Kent Overstreet
2023-08-02 20:16 ` Waiman Long
2023-08-02 20:44 ` Kent Overstreet
2023-08-02 21:09 ` Waiman Long [this message]
2023-08-02 21:42 ` Kent Overstreet
2023-10-10 8:09 ` [NAK] " Ingo Molnar
2023-10-18 21:04 ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 12/20] bcache: move closures to lib/ Kent Overstreet
2023-07-13 3:21 ` Randy Dunlap
2023-07-13 3:52 ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 13/20] MAINTAINERS: Add entry for closures Kent Overstreet
2023-07-12 21:11 ` [PATCH 14/20] closures: closure_wait_event() Kent Overstreet
2023-07-12 21:11 ` [PATCH 15/20] closures: closure_nr_remaining() Kent Overstreet
2023-07-12 21:11 ` [PATCH 16/20] closures: Add a missing include Kent Overstreet
2023-07-12 21:11 ` [PATCH 17/20] MAINTAINERS: Add entry for generic-radix-tree Kent Overstreet
2023-07-12 21:11 ` [PATCH 18/20] lib/generic-radix-tree.c: Don't overflow in peek() Kent Overstreet
2023-07-12 21:11 ` [PATCH 19/20] lib/generic-radix-tree.c: Add a missing include Kent Overstreet
2023-07-25 3:04 ` Matthew Wilcox
2023-07-25 3:36 ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 20/20] lib/generic-radix-tree.c: Add peek_prev() Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=11d39248-31fc-c625-7c06-341f0146bd67@redhat.com \
--to=longman@redhat.com \
--cc=boqun.feng@gmail.com \
--cc=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).