All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Sitnicki <jakub@cloudflare.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: kafai@fb.com, daniel@iogearbox.net, ast@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [bpf PATCH v2 1/3] bpf, sockmap: RCU splat with redirect and strparser error or TLS
Date: Fri, 26 Jun 2020 13:11:44 +0200	[thread overview]
Message-ID: <87ftaim68f.fsf@cloudflare.com> (raw)
In-Reply-To: <159312677907.18340.11064813152758406626.stgit@john-XPS-13-9370>

On Fri, Jun 26, 2020 at 01:12 AM CEST, John Fastabend wrote:
> There are two paths to generate the below RCU splat the first and
> most obvious is the result of the BPF verdict program issuing a
> redirect on a TLS socket (This is the splat shown below). Unlike
> the non-TLS case the caller of the *strp_read() hooks does not
> wrap the call in a rcu_read_lock/unlock. Then if the BPF program
> issues a redirect action we hit the RCU splat.
>
> However, in the non-TLS socket case the splat appears to be
> relatively rare, because the skmsg caller into the strp_data_ready()
> is wrapped in a rcu_read_lock/unlock. Shown here,
>
>  static void sk_psock_strp_data_ready(struct sock *sk)
>  {
> 	struct sk_psock *psock;
>
> 	rcu_read_lock();
> 	psock = sk_psock(sk);
> 	if (likely(psock)) {
> 		if (tls_sw_has_ctx_rx(sk)) {
> 			psock->parser.saved_data_ready(sk);
> 		} else {
> 			write_lock_bh(&sk->sk_callback_lock);
> 			strp_data_ready(&psock->parser.strp);
> 			write_unlock_bh(&sk->sk_callback_lock);
> 		}
> 	}
> 	rcu_read_unlock();
>  }
>
> If the above was the only way to run the verdict program we
> would be safe. But, there is a case where the strparser may throw an
> ENOMEM error while parsing the skb. This is a result of a failed
> skb_clone, or alloc_skb_for_msg while building a new merged skb when
> the msg length needed spans multiple skbs. This will in turn put the
> skb on the strp_wrk workqueue in the strparser code. The skb will
> later be dequeued and verdict programs run, but now from a
> different context without the rcu_read_lock()/unlock() critical
> section in sk_psock_strp_data_ready() shown above. In practice
> I have not seen this yet, because as far as I know most users of the
> verdict programs are also only working on single skbs. In this case no
> merge happens which could trigger the above ENOMEM errors. In addition
> the system would need to be under memory pressure. For example, we
> can't hit the above case in selftests because we missed having tests
> to merge skbs. (Added in later patch)
>
> To fix the below splat extend the rcu_read_lock/unnlock block to
> include the call to sk_psock_tls_verdict_apply(). This will fix both
> TLS redirect case and non-TLS redirect+error case. Also remove
> psock from the sk_psock_tls_verdict_apply() function signature its
> not used there.
>
> [ 1095.937597] WARNING: suspicious RCU usage
> [ 1095.940964] 5.7.0-rc7-02911-g463bac5f1ca79 #1 Tainted: G        W
> [ 1095.944363] -----------------------------
> [ 1095.947384] include/linux/skmsg.h:284 suspicious rcu_dereference_check() usage!
> [ 1095.950866]
> [ 1095.950866] other info that might help us debug this:
> [ 1095.950866]
> [ 1095.957146]
> [ 1095.957146] rcu_scheduler_active = 2, debug_locks = 1
> [ 1095.961482] 1 lock held by test_sockmap/15970:
> [ 1095.964501]  #0: ffff9ea6b25de660 (sk_lock-AF_INET){+.+.}-{0:0}, at: tls_sw_recvmsg+0x13a/0x840 [tls]
> [ 1095.968568]
> [ 1095.968568] stack backtrace:
> [ 1095.975001] CPU: 1 PID: 15970 Comm: test_sockmap Tainted: G        W         5.7.0-rc7-02911-g463bac5f1ca79 #1
> [ 1095.977883] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 1095.980519] Call Trace:
> [ 1095.982191]  dump_stack+0x8f/0xd0
> [ 1095.984040]  sk_psock_skb_redirect+0xa6/0xf0
> [ 1095.986073]  sk_psock_tls_strp_read+0x1d8/0x250
> [ 1095.988095]  tls_sw_recvmsg+0x714/0x840 [tls]
>
> v2: Improve commit message to identify non-TLS redirect plus error case
>     condition as well as more common TLS case. In the process I decided
>     doing the rcu_read_unlock followed by the lock/unlock inside branches
>     was unnecessarily complex. We can just extend the current rcu block
>     and get the same effeective without the shuffling and branching.
>     Thanks Martin!
>
> Fixes: e91de6afa81c1 ("bpf: Fix running sk_skb program types with ktls")
> Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---

Thanks for the detailed explanation.

Acked-by: Jakub Sitnicki <jakub@cloudflare.com>

[...]

  parent reply	other threads:[~2020-06-26 11:11 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-25 23:12 [bpf PATCH v2 0/3] Sockmap RCU splat fix John Fastabend
2020-06-25 23:12 ` [bpf PATCH v2 1/3] bpf, sockmap: RCU splat with redirect and strparser error or TLS John Fastabend
2020-06-26  6:10   ` Martin KaFai Lau
2020-06-26 11:11   ` Jakub Sitnicki [this message]
2020-06-25 23:13 ` [bpf PATCH v2 2/3] bpf, sockmap: RCU dereferenced psock may be used outside RCU block John Fastabend
2020-06-26  6:13   ` Martin KaFai Lau
2020-06-29  7:42   ` Jakub Sitnicki
2020-06-25 23:13 ` [bpf PATCH v2 3/3] bpf, sockmap: Add ingres skb tests that utilize merge skbs John Fastabend
2020-06-26  6:14   ` Martin KaFai Lau
2020-06-28 15:40 ` [bpf PATCH v2 0/3] Sockmap RCU splat fix Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ftaim68f.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.