netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Jiri Olsa <olsajiri@gmail.com>, tglozar@redhat.com
Cc: linux-kernel@vger.kernel.org, john.fastabend@gmail.com,
	jakub@cloudflare.com, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire
Date: Fri, 28 Jul 2023 09:16:32 -0700	[thread overview]
Message-ID: <2c3bec7a-812c-0a65-f8c1-b9749430adba@linux.dev> (raw)
In-Reply-To: <ZMOrEi3cNWGXp9ZS@krava>



On 7/28/23 4:48 AM, Jiri Olsa wrote:
> On Fri, Jul 28, 2023 at 08:44:11AM +0200, tglozar@redhat.com wrote:
>> From: Tomas Glozar <tglozar@redhat.com>
>>
>> Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
>> allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
>> GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
>> patchset notes for details).
>>
>> This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
>> BUG (sleeping function called from invalid context) on RT kernels.
>>
>> preempt_disable was introduced together with lock_sk and rcu_read_lock
>> in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
>> in parallel"), probably to match disabled migration of BPF programs, and
>> is no longer necessary.
>>
>> Remove preempt_disable to fix BUG in sock_map_update_common on RT.
> 
> FYI, I'm not sure it's related but I started to see following splat recently:
> 
> [  189.360689][  T658] =============================
> [  189.361149][  T658] [ BUG: Invalid wait context ]
> [  189.361588][  T658] 6.5.0-rc2+ #589 Tainted: G           OE
> [  189.362174][  T658] -----------------------------
> [  189.362660][  T658] test_progs/658 is trying to lock:
> [  189.363176][  T658] ffff8881702652b8 (&psock->link_lock){....}-{3:3}, at: sock_map_update_common+0x1c4/0x340
> [  189.364152][  T658] other info that might help us debug this:
> [  189.364689][  T658] context-{5:5}
> [  189.365021][  T658] 3 locks held by test_progs/658:
> [  189.365508][  T658]  #0: ffff888177611a80 (sk_lock-AF_INET){+.+.}-{0:0}, at: sock_map_update_elem_sys+0x82/0x260
> [  189.366503][  T658]  #1: ffffffff835a3180 (rcu_read_lock){....}-{1:3}, at: sock_map_update_elem_sys+0x78/0x260
> [  189.367470][  T658]  #2: ffff88816cf19240 (&stab->lock){+...}-{2:2}, at: sock_map_update_common+0x12a/0x340
> [  189.368420][  T658] stack backtrace:
> [  189.368806][  T658] CPU: 0 PID: 658 Comm: test_progs Tainted: G           OE      6.5.0-rc2+ #589 98af30b3c42d747b51da05f1d0e4899e394be6c9
> [  189.369889][  T658] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
> [  189.370736][  T658] Call Trace:
> [  189.371063][  T658]  <TASK>
> [  189.371365][  T658]  dump_stack_lvl+0xb2/0x120
> [  189.371798][  T658]  __lock_acquire+0x9ad/0x2470
> [  189.372243][  T658]  ? lock_acquire+0x104/0x350
> [  189.372680][  T658]  lock_acquire+0x104/0x350
> [  189.373104][  T658]  ? sock_map_update_common+0x1c4/0x340
> [  189.373615][  T658]  ? find_held_lock+0x32/0x90
> [  189.374074][  T658]  ? sock_map_update_common+0x12a/0x340
> [  189.374587][  T658]  _raw_spin_lock_bh+0x38/0x80
> [  189.375060][  T658]  ? sock_map_update_common+0x1c4/0x340
> [  189.375571][  T658]  sock_map_update_common+0x1c4/0x340
> [  189.376118][  T658]  sock_map_update_elem_sys+0x184/0x260
> [  189.376704][  T658]  __sys_bpf+0x181f/0x2840
> [  189.377147][  T658]  __x64_sys_bpf+0x1a/0x30
> [  189.377556][  T658]  do_syscall_64+0x38/0x90
> [  189.377980][  T658]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [  189.378473][  T658] RIP: 0033:0x7fe52f47ab5d
> 
> the patch did not help with that

I think the above splat is not related to this patch. In function
sock_map_update_common func we have
   raw_spin_lock_bh(&stab->lock);

   sock_map_add_link(psock, link, map, &stab->sks[idx]);
     spin_lock_bh(&psock->link_lock);
     ...
     spin_unlock_bh(&psock->link_lock);

   raw_spin_unlock_bh(&stab->lock);

I think you probably have CONFIG_PROVE_RAW_LOCK_NESTING turned on
in your config.

In the above case, for RT kernel, spin_lock_bh will become
'mutex' and it is sleepable, while raw_spin_lock_bh remains
to be a spin lock. The warning is about potential
locking violation with RT kernel.

To fix the issue, you can convert spin_lock_bh to raw_spin_lock_bh
to silence the warning.

> 
> jirka
> 
>>
>> Signed-off-by: Tomas Glozar <tglozar@redhat.com>
>> ---
>>   net/core/sock_map.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
>> index 19538d628714..08ab108206bf 100644
>> --- a/net/core/sock_map.c
>> +++ b/net/core/sock_map.c
>> @@ -115,7 +115,6 @@ static void sock_map_sk_acquire(struct sock *sk)
>>   	__acquires(&sk->sk_lock.slock)
>>   {
>>   	lock_sock(sk);
>> -	preempt_disable();
>>   	rcu_read_lock();
>>   }
>>   
>> @@ -123,7 +122,6 @@ static void sock_map_sk_release(struct sock *sk)
>>   	__releases(&sk->sk_lock.slock)
>>   {
>>   	rcu_read_unlock();
>> -	preempt_enable();
>>   	release_sock(sk);
>>   }
>>   
>> -- 
>> 2.39.3
>>
>>
> 

  reply	other threads:[~2023-07-28 16:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-28  6:44 [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire tglozar
2023-07-28 11:48 ` Jiri Olsa
2023-07-28 16:16   ` Yonghong Song [this message]
2023-07-31  9:16 ` Jakub Sitnicki
2023-08-01  3:58   ` John Fastabend
2023-08-01  7:44     ` Paolo Abeni
2023-08-01 15:50       ` Alexei Starovoitov
2023-08-01  7:40 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2c3bec7a-812c-0a65-f8c1-b9749430adba@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=bpf@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jakub@cloudflare.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olsajiri@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=tglozar@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).