netdev.vger.kernel.org archive mirror
* [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
@ 2018-05-25 17:37 John Fastabend
  2018-05-26  6:07 ` Song Liu
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: John Fastabend @ 2018-05-25 17:37 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev

syzbot reported two related splats, a use-after-free and a NULL
pointer dereference, which can occur when a TCP socket is closed
while the map it belongs to is also being removed.

The psock keeps a reference to all map slots that hold a reference
to the sock so that when the sock is closed we can clean up any
outstanding sock{map|hash} entries. This avoids pinning a sock
forever if the map owner fails to do proper cleanup. However, the
result is that we have two paths that can free an entry in the map.
Even the comment in the sock{map|hash} teardown function,
sock_hash_free(), notes this:

 At this point no update, lookup or delete operations can happen.
 However, be aware we can still get socket state event updates
 and data-ready callbacks that reference the psock from sk_user_data.

Both removal paths omitted taking the hash bucket lock, resulting
in cases where two references to the same entry are in the process
of being freed.

Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/sockmap.c |   33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 52a91d8..b508141f 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -225,6 +225,16 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
 	kfree_rcu(l, rcu);
 }
 
+static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
+{
+	return &htab->buckets[hash & (htab->n_buckets - 1)];
+}
+
+static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
+{
+	return &__select_bucket(htab, hash)->head;
+}
+
 static void bpf_tcp_close(struct sock *sk, long timeout)
 {
 	void (*close_fun)(struct sock *sk, long timeout);
@@ -268,9 +278,15 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 				smap_release_sock(psock, sk);
 			}
 		} else {
+			u32 hash = e->hash_link->hash;
+			struct bucket *b;
+
+			b = __select_bucket(e->htab, hash);
+			raw_spin_lock_bh(&b->lock);
 			hlist_del_rcu(&e->hash_link->hash_node);
 			smap_release_sock(psock, e->hash_link->sk);
 			free_htab_elem(e->htab, e->hash_link);
+			raw_spin_unlock_bh(&b->lock);
 		}
 	}
 	write_unlock_bh(&sk->sk_callback_lock);
@@ -2043,16 +2059,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
 	return ERR_PTR(err);
 }
 
-static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
-{
-	return &htab->buckets[hash & (htab->n_buckets - 1)];
-}
-
-static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
-{
-	return &__select_bucket(htab, hash)->head;
-}
-
 static void sock_hash_free(struct bpf_map *map)
 {
 	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
@@ -2069,10 +2075,12 @@ static void sock_hash_free(struct bpf_map *map)
 	 */
 	rcu_read_lock();
 	for (i = 0; i < htab->n_buckets; i++) {
-		struct hlist_head *head = select_bucket(htab, i);
+		struct bucket *b = __select_bucket(htab, i);
+		struct hlist_head *head = &b->head;
 		struct hlist_node *n;
 		struct htab_elem *l;
 
+		raw_spin_lock_bh(&b->lock);
 		hlist_for_each_entry_safe(l, n, head, hash_node) {
 			struct sock *sock = l->sk;
 			struct smap_psock *psock;
@@ -2090,8 +2098,9 @@ static void sock_hash_free(struct bpf_map *map)
 				smap_release_sock(psock, sock);
 			}
 			write_unlock_bh(&sock->sk_callback_lock);
-			kfree(l);
+			free_htab_elem(htab, l);
 		}
+		raw_spin_unlock_bh(&b->lock);
 	}
 	rcu_read_unlock();
 	bpf_map_area_free(htab->buckets);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
  2018-05-25 17:37 [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete John Fastabend
@ 2018-05-26  6:07 ` Song Liu
  2018-05-26  8:30 ` Daniel Borkmann
  2018-05-27 22:36 ` Daniel Borkmann
  2 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2018-05-26  6:07 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, netdev

On Fri, May 25, 2018 at 10:37 AM, John Fastabend
<john.fastabend@gmail.com> wrote:
> syzbot reported two related splats, a use after free and null
> pointer dereference, when a TCP socket is closed while the map is
> also being removed.
>
> The psock keeps a reference to all map slots that have a reference
> to the sock so that when the sock is closed we can clean up any
> outstanding sock{map|hash} entries. This avoids pinning a sock
> forever if the map owner fails to do proper cleanup. However, the
> result is we have two paths that can free an entry in the map. Even
> the comment in the sock{map|hash} tear down function, sock_hash_free()
> notes this:
>
>  At this point no update, lookup or delete operations can happen.
>  However, be aware we can still get a socket state event updates,
>  and data ready callbacks that reference the psock from sk_user_data.
>
> Both removal paths omitted taking the hash bucket lock resulting
> in the case where we have two references that are in the process
> of being free'd.
>
> Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  kernel/bpf/sockmap.c |   33 +++++++++++++++++++++------------
>  1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index 52a91d8..b508141f 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -225,6 +225,16 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
>         kfree_rcu(l, rcu);
>  }
>
> +static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
> +{
> +       return &htab->buckets[hash & (htab->n_buckets - 1)];
> +}
> +
> +static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
> +{
> +       return &__select_bucket(htab, hash)->head;
> +}
> +
>  static void bpf_tcp_close(struct sock *sk, long timeout)
>  {
>         void (*close_fun)(struct sock *sk, long timeout);
> @@ -268,9 +278,15 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
>                                 smap_release_sock(psock, sk);
>                         }
>                 } else {
> +                       u32 hash = e->hash_link->hash;
> +                       struct bucket *b;
> +
> +                       b = __select_bucket(e->htab, hash);
> +                       raw_spin_lock_bh(&b->lock);
>                         hlist_del_rcu(&e->hash_link->hash_node);
>                         smap_release_sock(psock, e->hash_link->sk);
>                         free_htab_elem(e->htab, e->hash_link);
> +                       raw_spin_unlock_bh(&b->lock);
>                 }
>         }
>         write_unlock_bh(&sk->sk_callback_lock);
> @@ -2043,16 +2059,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
>         return ERR_PTR(err);
>  }
>
> -static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
> -{
> -       return &htab->buckets[hash & (htab->n_buckets - 1)];
> -}
> -
> -static inline struct hlist_head *select_bucket(struct bpf_htab *htab, u32 hash)
> -{
> -       return &__select_bucket(htab, hash)->head;
> -}
> -
>  static void sock_hash_free(struct bpf_map *map)
>  {
>         struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
> @@ -2069,10 +2075,12 @@ static void sock_hash_free(struct bpf_map *map)
>          */
>         rcu_read_lock();
>         for (i = 0; i < htab->n_buckets; i++) {
> -               struct hlist_head *head = select_bucket(htab, i);
> +               struct bucket *b = __select_bucket(htab, i);
> +               struct hlist_head *head = &b->head;
>                 struct hlist_node *n;
>                 struct htab_elem *l;
>
> +               raw_spin_lock_bh(&b->lock);
>                 hlist_for_each_entry_safe(l, n, head, hash_node) {
>                         struct sock *sock = l->sk;
>                         struct smap_psock *psock;
> @@ -2090,8 +2098,9 @@ static void sock_hash_free(struct bpf_map *map)
>                                 smap_release_sock(psock, sock);
>                         }
>                         write_unlock_bh(&sock->sk_callback_lock);
> -                       kfree(l);
> +                       free_htab_elem(htab, l);
>                 }
> +               raw_spin_unlock_bh(&b->lock);
>         }
>         rcu_read_unlock();
>         bpf_map_area_free(htab->buckets);
>


* Re: [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
  2018-05-25 17:37 [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete John Fastabend
  2018-05-26  6:07 ` Song Liu
@ 2018-05-26  8:30 ` Daniel Borkmann
  2018-05-27  4:36   ` John Fastabend
  2018-05-27 22:36 ` Daniel Borkmann
  2 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2018-05-26  8:30 UTC (permalink / raw)
  To: John Fastabend, ast; +Cc: netdev

Hi John,

On 05/25/2018 07:37 PM, John Fastabend wrote:
> syzbot reported two related splats, a use after free and null
> pointer dereference, when a TCP socket is closed while the map is
> also being removed.
> 
> The psock keeps a reference to all map slots that have a reference
> to the sock so that when the sock is closed we can clean up any
> outstanding sock{map|hash} entries. This avoids pinning a sock
> forever if the map owner fails to do proper cleanup. However, the
> result is we have two paths that can free an entry in the map. Even
> the comment in the sock{map|hash} tear down function, sock_hash_free()
> notes this:
> 
>  At this point no update, lookup or delete operations can happen.
>  However, be aware we can still get a socket state event updates,
>  and data ready callbacks that reference the psock from sk_user_data.
> 
> Both removal paths omitted taking the hash bucket lock resulting
> in the case where we have two references that are in the process
> of being free'd.
> 
> Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Could you also shortly reply with a Fixes: tag so we can track all
fixes for the original commit.

Thanks,
Daniel

P.s.: still waiting on net-next to get fast-forwarded, then I'll
fast-forward bpf-next and process the queue.


* Re: [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
  2018-05-26  8:30 ` Daniel Borkmann
@ 2018-05-27  4:36   ` John Fastabend
  0 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2018-05-27  4:36 UTC (permalink / raw)
  To: Daniel Borkmann, ast; +Cc: netdev

On 05/26/2018 01:30 AM, Daniel Borkmann wrote:
> Hi John,
> 
> On 05/25/2018 07:37 PM, John Fastabend wrote:
>> syzbot reported two related splats, a use after free and null
>> pointer dereference, when a TCP socket is closed while the map is
>> also being removed.
>>
>> The psock keeps a reference to all map slots that have a reference
>> to the sock so that when the sock is closed we can clean up any
>> outstanding sock{map|hash} entries. This avoids pinning a sock
>> forever if the map owner fails to do proper cleanup. However, the
>> result is we have two paths that can free an entry in the map. Even
>> the comment in the sock{map|hash} tear down function, sock_hash_free()
>> notes this:
>>
>>  At this point no update, lookup or delete operations can happen.
>>  However, be aware we can still get a socket state event updates,
>>  and data ready callbacks that reference the psock from sk_user_data.
>>
>> Both removal paths omitted taking the hash bucket lock resulting
>> in the case where we have two references that are in the process
>> of being free'd.
>>
>> Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> 

Fixes: 81110384441a ("bpf: sockmap, add hash map support")

 


* Re: [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
  2018-05-25 17:37 [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete John Fastabend
  2018-05-26  6:07 ` Song Liu
  2018-05-26  8:30 ` Daniel Borkmann
@ 2018-05-27 22:36 ` Daniel Borkmann
  2018-05-28 15:13   ` John Fastabend
  2 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2018-05-27 22:36 UTC (permalink / raw)
  To: John Fastabend, ast; +Cc: netdev

On 05/25/2018 07:37 PM, John Fastabend wrote:
> syzbot reported two related splats, a use after free and null
> pointer dereference, when a TCP socket is closed while the map is
> also being removed.
> 
> The psock keeps a reference to all map slots that have a reference
> to the sock so that when the sock is closed we can clean up any
> outstanding sock{map|hash} entries. This avoids pinning a sock
> forever if the map owner fails to do proper cleanup. However, the
> result is we have two paths that can free an entry in the map. Even
> the comment in the sock{map|hash} tear down function, sock_hash_free()
> notes this:
> 
>  At this point no update, lookup or delete operations can happen.
>  However, be aware we can still get a socket state event updates,
>  and data ready callbacks that reference the psock from sk_user_data.
> 
> Both removal paths omitted taking the hash bucket lock resulting
> in the case where we have two references that are in the process
> of being free'd.
> 
> Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Applied to bpf-next, thanks John!


* Re: [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
  2018-05-27 22:36 ` Daniel Borkmann
@ 2018-05-28 15:13   ` John Fastabend
  2018-05-28 15:48     ` Daniel Borkmann
  0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2018-05-28 15:13 UTC (permalink / raw)
  To: Daniel Borkmann, ast; +Cc: netdev

On 05/27/2018 03:36 PM, Daniel Borkmann wrote:
> On 05/25/2018 07:37 PM, John Fastabend wrote:
>> syzbot reported two related splats, a use after free and null
>> pointer dereference, when a TCP socket is closed while the map is
>> also being removed.
>>
>> The psock keeps a reference to all map slots that have a reference
>> to the sock so that when the sock is closed we can clean up any
>> outstanding sock{map|hash} entries. This avoids pinning a sock
>> forever if the map owner fails to do proper cleanup. However, the
>> result is we have two paths that can free an entry in the map. Even
>> the comment in the sock{map|hash} tear down function, sock_hash_free()
>> notes this:
>>
>>  At this point no update, lookup or delete operations can happen.
>>  However, be aware we can still get a socket state event updates,
>>  and data ready callbacks that reference the psock from sk_user_data.
>>
>> Both removal paths omitted taking the hash bucket lock resulting
>> in the case where we have two references that are in the process
>> of being free'd.
>>
>> Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> 
> Applied to bpf-next, thanks John!
> 

This needs a v2; it introduces a slightly different/related error.
I'll have a v2 shortly.


* Re: [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete
  2018-05-28 15:13   ` John Fastabend
@ 2018-05-28 15:48     ` Daniel Borkmann
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2018-05-28 15:48 UTC (permalink / raw)
  To: John Fastabend, ast; +Cc: netdev

On 05/28/2018 05:13 PM, John Fastabend wrote:
> On 05/27/2018 03:36 PM, Daniel Borkmann wrote:
>> On 05/25/2018 07:37 PM, John Fastabend wrote:
>>> syzbot reported two related splats, a use after free and null
>>> pointer dereference, when a TCP socket is closed while the map is
>>> also being removed.
>>>
>>> The psock keeps a reference to all map slots that have a reference
>>> to the sock so that when the sock is closed we can clean up any
>>> outstanding sock{map|hash} entries. This avoids pinning a sock
>>> forever if the map owner fails to do proper cleanup. However, the
>>> result is we have two paths that can free an entry in the map. Even
>>> the comment in the sock{map|hash} tear down function, sock_hash_free()
>>> notes this:
>>>
>>>  At this point no update, lookup or delete operations can happen.
>>>  However, be aware we can still get a socket state event updates,
>>>  and data ready callbacks that reference the psock from sk_user_data.
>>>
>>> Both removal paths omitted taking the hash bucket lock resulting
>>> in the case where we have two references that are in the process
>>> of being free'd.
>>>
>>> Reported-by: syzbot+a761b81c211794fa1072@syzkaller.appspotmail.com
>>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>
>> Applied to bpf-next, thanks John!
> 
> This needs a v2 it introduces a slightly different/related error. I'll
> have a v2 shortly.

Ok, I've just dropped it from the tree in favor of a v2. Thanks John!


end of thread, other threads:[~2018-05-28 15:48 UTC | newest]

Thread overview: 7+ messages
2018-05-25 17:37 [bpf-next PATCH] bpf: sockhash fix race with bpf_tcp_close and map delete John Fastabend
2018-05-26  6:07 ` Song Liu
2018-05-26  8:30 ` Daniel Borkmann
2018-05-27  4:36   ` John Fastabend
2018-05-27 22:36 ` Daniel Borkmann
2018-05-28 15:13   ` John Fastabend
2018-05-28 15:48     ` Daniel Borkmann
