From: Simon Horman <horms@kernel.org>
To: John Fastabend <john.fastabend@gmail.com>
Cc: olsajiri@gmail.com, xukuohai@huawei.com, eddyz87@gmail.com,
edumazet@google.com, cong.wang@bytedance.com,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH bpf] bpf: sockmap, fix skb refcnt race after locking changes
Date: Sat, 2 Sep 2023 11:00:32 +0200
Message-ID: <20230902090032.GB2146@kernel.org>
In-Reply-To: <20230901202137.214666-1-john.fastabend@gmail.com>
On Fri, Sep 01, 2023 at 01:21:37PM -0700, John Fastabend wrote:
> There is a race where skbs from the sk_psock_backlog can be referenced
> after the userspace side has already called consume_skb() on the sk_buff
> and its refcnt has dropped to zero, causing a use after free.
>
> The flow is the following,
>
>   while ((skb = skb_peek(&psock->ingress_skb)))
>     sk_psock_handle_skb(psock, skb, ..., ingress)
>     if (!ingress) ...
>     sk_psock_skb_ingress
>        sk_psock_skb_ingress_enqueue(skb)
>           msg->skb = skb
>           sk_psock_queue_msg(psock, msg)
>     skb_dequeue(&psock->ingress_skb)
>
> The sk_psock_queue_msg() puts the msg on the ingress_msg queue. This is
> what the application reads when recvmsg() is called. An application can
> read this anytime after the msg is placed on the queue. The recvmsg
> hook also reads msg->skb and, once user space has read the msg, calls
> consume_skb(skb) on it, effectively freeing it.
>
> The race is in the flow above: the backlog queue still holds a reference
> to the skb and calls skb_dequeue(). If the skb_dequeue() happens after
> the user reads and frees the skb, we have a use after free.
>
> The !ingress case does not suffer from this problem because it uses
> sendmsg_*(sk, msg) which does not pass the sk_buff further down the
> stack.
>
> The following splat was observed with 'test_progs -t sockmap_listen':
>
> [ 1022.710250][ T2556] general protection fault, ...
> ...
> [ 1022.712830][ T2556] Workqueue: events sk_psock_backlog
> [ 1022.713262][ T2556] RIP: 0010:skb_dequeue+0x4c/0x80
> [ 1022.713653][ T2556] Code: ...
> ...
> [ 1022.720699][ T2556] Call Trace:
> [ 1022.720984][ T2556] <TASK>
> [ 1022.721254][ T2556] ? die_addr+0x32/0x80
> [ 1022.721589][ T2556] ? exc_general_protection+0x25a/0x4b0
> [ 1022.722026][ T2556] ? asm_exc_general_protection+0x22/0x30
> [ 1022.722489][ T2556] ? skb_dequeue+0x4c/0x80
> [ 1022.722854][ T2556] sk_psock_backlog+0x27a/0x300
> [ 1022.723243][ T2556] process_one_work+0x2a7/0x5b0
> [ 1022.723633][ T2556] worker_thread+0x4f/0x3a0
> [ 1022.723998][ T2556] ? __pfx_worker_thread+0x10/0x10
> [ 1022.724386][ T2556] kthread+0xfd/0x130
> [ 1022.724709][ T2556] ? __pfx_kthread+0x10/0x10
> [ 1022.725066][ T2556] ret_from_fork+0x2d/0x50
> [ 1022.725409][ T2556] ? __pfx_kthread+0x10/0x10
> [ 1022.725799][ T2556] ret_from_fork_asm+0x1b/0x30
> [ 1022.726201][ T2556] </TASK>
>
> To fix this we add an skb_get() before passing the skb to be enqueued in
> the ingress queue. This bumps the skb->users refcnt so that consume_skb()
> and kfree_skb() will not immediately free the sk_buff. With this we can
> be sure the skb is still around when we do the dequeue. Then we just
> need to decrement the refcnt or free the skb in the backlog case, which
> we do by calling kfree_skb() in the ingress case as well as the sendmsg
> case.
>
> Before the locking change named in the Fixes tag we had the sock locked,
> so we couldn't race with the user and there was no issue here.
>
> Fixes: 799aa7f98d53e (skmsg: Avoid lock_sock() in sk_psock_backlog())
Hi John,
A minor nit from my side.
I think the usual format for a Fixes tag is as follows:
Fixes: 799aa7f98d53e ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
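For illustration only, the difference between the two forms can be checked mechanically. The sketch below is a userspace helper, not part of the kernel tree or of checkpatch; fixes_tag_ok() is a hypothetical name, and the 12-character minimum for the abbreviated hash is an assumption based on the kernel's patch submission guidelines.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if 'line' looks like
 *   Fixes: <at least 12 hex digits> ("<subject>")
 * The 12-digit minimum is an assumption, see the note above.
 */
static bool fixes_tag_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	const size_t plen = sizeof(prefix) - 1;
	size_t hex = 0, len;
	const char *p;

	if (strncmp(line, prefix, plen) != 0)
		return false;
	p = line + plen;

	/* Count the abbreviated commit hash. */
	while (isxdigit((unsigned char)p[hex]))
		hex++;
	if (hex < 12)
		return false;
	p += hex;

	/* The subject must open with: space, paren, double quote. */
	if (strncmp(p, " (\"", 3) != 0)
		return false;
	p += 3;

	/* The line must end with a double quote and paren after a non-empty subject. */
	len = strlen(p);
	return len > 2 && strcmp(p + len - 2, "\")") == 0;
}
```

With this helper the tag as posted is rejected because the subject is not quoted, while the form suggested above is accepted.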
> Reported-by: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
...
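As a rough aid to reading the quoted description, the refcount reasoning can be modelled in plain userspace C. This is a minimal sketch, not kernel code: struct model_skb, model_skb_get(), model_skb_put() and alive_at_dequeue() are hypothetical stand-ins for struct sk_buff, skb_get(), consume_skb()/kfree_skb() and the backlog loop, and the model is single-threaded, replaying only the worst-case ordering in which the reader frees the skb before the backlog dequeues it.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for struct sk_buff: just a refcount and a "released" flag. */
struct model_skb {
	int users;   /* mirrors skb->users */
	bool freed;  /* set when the last reference is dropped */
};

/* Like skb_get(): take an additional reference. */
static struct model_skb *model_skb_get(struct model_skb *skb)
{
	skb->users++;
	return skb;
}

/* Like consume_skb()/kfree_skb(): drop a reference, release on the last one. */
static void model_skb_put(struct model_skb *skb)
{
	assert(skb->users > 0);
	if (--skb->users == 0)
		skb->freed = true;
}

/*
 * Replays the worst-case ordering from the commit message and reports
 * whether the skb is still alive when the backlog reaches skb_dequeue().
 */
static bool alive_at_dequeue(bool take_extra_ref)
{
	struct model_skb skb = { .users = 1, .freed = false };
	bool alive;

	if (take_extra_ref)
		model_skb_get(&skb);    /* the fix: backlog holds its own ref */

	/* msg->skb = skb; sk_psock_queue_msg(): the reader may run now. */
	model_skb_put(&skb);            /* recvmsg path: consume_skb(skb) */

	alive = !skb.freed;             /* backlog would touch the skb here */

	if (take_extra_ref)
		model_skb_put(&skb);    /* backlog drops its ref: kfree_skb() */

	return alive;
}
```

Without the extra reference the model reports the skb as already freed at the dequeue point; with it, the skb survives until the backlog drops its own reference.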