From: Jakub Sitnicki <jakub@cloudflare.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	 Eric Dumazet <edumazet@google.com>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	bpf <bpf@vger.kernel.org>,  LKML <linux-kernel@vger.kernel.org>,
	 Hillf Danton <hdanton@sina.com>,
	 "Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: [PATCH] bpf, sockmap: defer sk_psock_free_link() using RCU
Date: Wed, 22 May 2024 13:08:23 +0200
Message-ID: <87v836w1co.fsf@cloudflare.com>
In-Reply-To: <f77290fe-a94e-498b-bbbf-429ba0ce49c2@I-love.SAKURA.ne.jp> (Tetsuo Handa's message of "Wed, 22 May 2024 19:30:58 +0900")

On Wed, May 22, 2024 at 07:30 PM +09, Tetsuo Handa wrote:
> On 2024/05/22 18:50, Jakub Sitnicki wrote:
>> On Wed, May 22, 2024 at 06:59 AM +08, Hillf Danton wrote:
>>> On Tue, 21 May 2024 08:38:52 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com>
>>>> On Sun, May 12, 2024 at 12:22 AM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>>> --- a/net/core/sock_map.c
>>>>> +++ b/net/core/sock_map.c
>>>>> @@ -142,6 +142,7 @@ static void sock_map_del_link(struct sock *sk,
>>>>>         bool strp_stop = false, verdict_stop = false;
>>>>>         struct sk_psock_link *link, *tmp;
>>>>>
>>>>> +       rcu_read_lock();
>>>>>         spin_lock_bh(&psock->link_lock);
>>>>
>>>> I think this is incorrect.
>>>> spin_lock_bh may sleep in RT and it won't be safe to do in rcu cs.
>>>
>>> Could you specify why it won't be safe in rcu cs if you are right?
>>> What does rcu look like in RT if not nothing?
>> 
>> RCU readers can't block, while spinlock RT doesn't disable preemption.
>> 
>> https://docs.kernel.org/RCU/rcu.html
>> https://docs.kernel.org/locking/locktypes.html#spinlock-t-and-preempt-rt
>> 
>
> I didn't catch what you mean.
>
> https://elixir.bootlin.com/linux/latest/source/include/linux/spinlock_rt.h#L43 defines spin_lock() for RT as
>
> static __always_inline void spin_lock(spinlock_t *lock)
> {
> 	rt_spin_lock(lock);
> }
>
> and https://elixir.bootlin.com/linux/v6.9/source/include/linux/spinlock_rt.h#L85 defines spin_lock_bh() for RT as
>
> static __always_inline void spin_lock_bh(spinlock_t *lock)
> {
> 	/* Investigate: Drop bh when blocking ? */
> 	local_bh_disable();
> 	rt_spin_lock(lock);
> }
>
> and https://elixir.bootlin.com/linux/latest/source/kernel/locking/spinlock_rt.c#L54 defines rt_spin_lock() for RT as
>
> void __sched rt_spin_lock(spinlock_t *lock)
> {
> 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> 	__rt_spin_lock(lock);
> }
>
> and https://elixir.bootlin.com/linux/v6.9/source/kernel/locking/spinlock_rt.c#L46 defines __rt_spin_lock() for RT as
>
> static __always_inline void __rt_spin_lock(spinlock_t *lock)
> {
> 	rtlock_might_resched();
> 	rtlock_lock(&lock->lock);
> 	rcu_read_lock();
> 	migrate_disable();
> }
>
> . You can see that calling spin_lock() or spin_lock_bh() automatically starts RCU critical section, can't you?
>
> If spin_lock_bh() for RT might sleep and calling spin_lock_bh() under RCU critical section is not safe,
> how can
>
>   spin_lock(&lock1);
>   spin_lock(&lock2);
>   // do something
>   spin_unlock(&lock2);
>   spin_unlock(&lock1);
>
> or
>
>   spin_lock_bh(&lock1);
>   spin_lock(&lock2);
>   // do something
>   spin_unlock(&lock2);
>   spin_unlock_bh(&lock1);
>
> be possible?
>
> Unless rcu_read_lock() is implemented in a way that is safe to do
>
>   rcu_read_lock();
>   spin_lock(&lock2);
>   // do something
>   spin_unlock(&lock2);
>   rcu_read_unlock();
>
> and
>
>   rcu_read_lock();
>   spin_lock_bh(&lock2);
>   // do something
>   spin_unlock_bh(&lock2);
>   rcu_read_unlock();
>
> , I think RT kernels can't run safely.
>
> Locking primitive ordering is far too complicated and scattered.
> We need documentation using safe/unsafe ordering examples.

You're right. My answer was too hasty. Docs say that RT kernels can
preempt RCU read-side critical sections:

https://docs.kernel.org/RCU/whatisRCU.html?highlight=rcu_read_lock#rcu-read-lock
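
To make the nesting argument concrete, here is a toy userspace model of
the RT lock path quoted above. It is only a sketch, not kernel code:
every model_* name and both structs are hypothetical, plain counters
stand in for per-task state, and the unlock order is my assumption (the
reverse of the quoted __rt_spin_lock()). It shows only that taking an RT
spinlock inside rcu_read_lock() nests a second read-side section; it
says nothing about blocking or preemption.

```c
#include <assert.h>

/* Hypothetical per-task state; counters only, no real scheduling. */
struct model_task {
	int rcu_depth;      /* nesting depth of RCU read-side sections */
	int migrate_depth;  /* nesting depth of migrate_disable() */
};

/* Hypothetical lock; the real rtlock may block here, the model cannot. */
struct model_lock {
	int held;
};

static void model_rcu_read_lock(struct model_task *t)
{
	t->rcu_depth++;
}

static void model_rcu_read_unlock(struct model_task *t)
{
	assert(t->rcu_depth > 0);
	t->rcu_depth--;
}

/* Same order as the quoted __rt_spin_lock(): lock, RCU, migrate. */
static void model_rt_spin_lock(struct model_task *t, struct model_lock *l)
{
	assert(!l->held);
	l->held = 1;
	model_rcu_read_lock(t);
	t->migrate_depth++;
}

/* Assumed unlock order: the reverse of the lock path above. */
static void model_rt_spin_unlock(struct model_task *t, struct model_lock *l)
{
	assert(l->held && t->migrate_depth > 0);
	t->migrate_depth--;
	model_rcu_read_unlock(t);
	l->held = 0;
}

/* rcu_read_lock(); spin_lock(); ...; returns the peak RCU depth seen. */
static int model_nested_peak_depth(void)
{
	struct model_task t = { 0, 0 };
	struct model_lock l = { 0 };
	int peak;

	model_rcu_read_lock(&t);
	model_rt_spin_lock(&t, &l);
	peak = t.rcu_depth;          /* outer section plus the lock's own */
	model_rt_spin_unlock(&t, &l);
	model_rcu_read_unlock(&t);

	assert(t.rcu_depth == 0 && t.migrate_depth == 0);
	return peak;
}
```

Calling model_nested_peak_depth() from a test program returns the
deepest RCU nesting reached, which is the same nesting that
spin_lock(&lock1); spin_lock(&lock2); already produces in the model.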

