* [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap
From: Ma Ke @ 2023-09-18 9:36 UTC
To: john.fastabend, jakub, davem, edumazet, kuba, pabeni; +Cc: netdev, bpf, Ma Ke
Elements in sockhash are rarely deleted explicitly
by users or by eBPF programs, so the deletion path
has received little attention. Unlike the generic
hash maps, sockhash protects its buckets only with
spin_lock_bh(). This can self-deadlock if the lock
is taken again from interrupt context on the same
CPU, as CVE-2023-0160 points out.
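As an illustration (hypothetical call chain, not taken
from an actual report), the self-deadlock looks like:

  CPU0
  ----
  sock_hash_delete_elem()
    spin_lock_bh(&bucket->lock)
    <hardirq>
      ... -> BPF program -> bpf_map_delete_elem()
        sock_hash_delete_elem()
          spin_lock_bh(&bucket->lock)  <- deadlock, lock
                                          already held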
Signed-off-by: Ma Ke <make_ruc2021@163.com>
---
net/core/sock_map.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index cb11750b1df5..1302d484e769 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -928,11 +928,12 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
struct bpf_shtab_bucket *bucket;
struct bpf_shtab_elem *elem;
int ret = -ENOENT;
+ unsigned long flags;
hash = sock_hash_bucket_hash(key, key_size);
bucket = sock_hash_select_bucket(htab, hash);
- spin_lock_bh(&bucket->lock);
+ spin_lock_irqsave(&bucket->lock, flags);
elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
if (elem) {
hlist_del_rcu(&elem->node);
@@ -940,7 +941,7 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
sock_hash_free_elem(htab, elem);
ret = 0;
}
- spin_unlock_bh(&bucket->lock);
+ spin_unlock_irqrestore(&bucket->lock, flags);
return ret;
}
--
2.37.2
* Re: [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap
From: Kui-Feng Lee @ 2023-09-18 18:49 UTC
To: Ma Ke, john.fastabend, jakub, davem, edumazet, kuba, pabeni; +Cc: netdev, bpf
On 9/18/23 02:36, Ma Ke wrote:
> Elements in sockhash are rarely deleted explicitly
> by users or by eBPF programs, so the deletion path
> has received little attention. Unlike the generic
> hash maps, sockhash protects its buckets only with
> spin_lock_bh(). This can self-deadlock if the lock
> is taken again from interrupt context on the same
> CPU, as CVE-2023-0160 points out.
>
> Signed-off-by: Ma Ke <make_ruc2021@163.com>
> ---
> net/core/sock_map.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index cb11750b1df5..1302d484e769 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -928,11 +928,12 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
> struct bpf_shtab_bucket *bucket;
> struct bpf_shtab_elem *elem;
> int ret = -ENOENT;
> + unsigned long flags;
Keep reverse xmas tree ordering?
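For reference, reverse xmas tree ordering (longest
declaration line first) over the declarations shown
in this hunk would be:

	struct bpf_shtab_bucket *bucket;
	struct bpf_shtab_elem *elem;
	unsigned long flags;
	int ret = -ENOENT;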
>
> hash = sock_hash_bucket_hash(key, key_size);
> bucket = sock_hash_select_bucket(htab, hash);
>
> - spin_lock_bh(&bucket->lock);
> + spin_lock_irqsave(&bucket->lock, flags);
> elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
> if (elem) {
> hlist_del_rcu(&elem->node);
> @@ -940,7 +941,7 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
> sock_hash_free_elem(htab, elem);
> ret = 0;
> }
> - spin_unlock_bh(&bucket->lock);
> + spin_unlock_irqrestore(&bucket->lock, flags);
> return ret;
> }
>
* Re: [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap
From: John Fastabend @ 2023-09-20 18:07 UTC
To: Kui-Feng Lee, Ma Ke, john.fastabend, jakub, davem, edumazet, kuba,
pabeni
Cc: netdev, bpf
Kui-Feng Lee wrote:
>
>
> On 9/18/23 02:36, Ma Ke wrote:
> > Elements in sockhash are rarely deleted explicitly
> > by users or by eBPF programs, so the deletion path
We never delete them in our usage. I think soon we will have
support to run BPF programs without a map at all, removing these
concerns for many use cases.
> > has received little attention. Unlike the generic
> > hash maps, sockhash protects its buckets only with
> > spin_lock_bh(). This can self-deadlock if the lock
> > is taken again from interrupt context on the same
> > CPU, as CVE-2023-0160 points out.
The CVE is a bit exaggerated in my opinion. I'm not sure why
anyone would delete an element from interrupt context. But OK,
if someone wrote such a thing we shouldn't lock up.
> >
> > Signed-off-by: Ma Ke <make_ruc2021@163.com>
> > ---
> > net/core/sock_map.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index cb11750b1df5..1302d484e769 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -928,11 +928,12 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
> > struct bpf_shtab_bucket *bucket;
> > struct bpf_shtab_elem *elem;
> > int ret = -ENOENT;
> > + unsigned long flags;
>
> Keep reverse xmas tree ordering?
>
> >
> > hash = sock_hash_bucket_hash(key, key_size);
> > bucket = sock_hash_select_bucket(htab, hash);
> >
> > - spin_lock_bh(&bucket->lock);
> > + spin_lock_irqsave(&bucket->lock, flags);
The hashtab code's htab_lock_bucket() also does a preempt_disable()
followed by raw_spin_lock_irqsave(). Do we need that here as well
to handle the CONFIG_PREEMPT cases?

I'll also take a look, but figured I would post the question given
I won't likely get time to check until tonight/tomorrow.

Also, an earlier conversion to irqsave ran into a syzbot crash;
won't this change do the same?
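For reference, htab_lock_bucket() in kernel/bpf/hashtab.c looks
roughly like this (simplified sketch from memory; details vary
by kernel version):

	preempt_disable();
	/* per-cpu map_locked counter detects re-entry on this CPU */
	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
		__this_cpu_dec(*(htab->map_locked[hash]));
		preempt_enable();
		return -EBUSY;	/* bucket already locked on this CPU */
	}
	raw_spin_lock_irqsave(&b->raw_lock, flags);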
> > elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
> > if (elem) {
> > hlist_del_rcu(&elem->node);
> > @@ -940,7 +941,7 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
> > sock_hash_free_elem(htab, elem);
> > ret = 0;
> > }
> > - spin_unlock_bh(&bucket->lock);
> > + spin_unlock_irqrestore(&bucket->lock, flags);
> > return ret;
> > }
> >
* Re: [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap
From: Martin KaFai Lau @ 2023-09-21 1:31 UTC
To: John Fastabend
Cc: netdev, bpf, Kui-Feng Lee, Ma Ke, jakub, davem, edumazet, kuba,
pabeni
On 9/20/23 11:07 AM, John Fastabend wrote:
>>> has received little attention. Unlike the generic
>>> hash maps, sockhash protects its buckets only with
>>> spin_lock_bh(). This can self-deadlock if the lock
>>> is taken again from interrupt context on the same
>>> CPU, as CVE-2023-0160 points out.
>
> The CVE is a bit exaggerated in my opinion. I'm not sure why
> anyone would delete an element from interrupt context. But OK,
> if someone wrote such a thing we shouldn't lock up.
This should only happen in a tracing program? Not sure if it
would be too drastic to disallow tracing programs from using
bpf_map_delete_elem at load time now.

A followup question: if sockmap can be accessed from a tracing
program, does it need an in_nmi() check?
>>> hash = sock_hash_bucket_hash(key, key_size);
>>> bucket = sock_hash_select_bucket(htab, hash);
>>>
>>> - spin_lock_bh(&bucket->lock);
>>> + spin_lock_irqsave(&bucket->lock, flags);
>
> The hashtab code's htab_lock_bucket() also does a preempt_disable()
> followed by raw_spin_lock_irqsave(). Do we need that here as well
> to handle the CONFIG_PREEMPT cases?
iirc, the preempt_disable in htab is for CONFIG_PREEMPT, but it
protects the __this_cpu_inc_return to avoid unnecessary lock failures
due to preemption, so it is probably not needed here. See commit
2775da216287 ("bpf: Disable preemption when increasing per-cpu
map_locked").

If map_delete can be called from any tracing context, the
raw_spin_lock_xxx version is probably needed though. Otherwise, a
splat (e.g. from PROVE_RAW_LOCK_NESTING) could be triggered.
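A raw-lock variant of the delete path would look roughly like
this (sketch only; bucket->lock would have to be converted from
spinlock_t to raw_spinlock_t for this to be valid):

	raw_spin_lock_irqsave(&bucket->lock, flags);
	elem = sock_hash_lookup_elem_raw(&bucket->head, hash, key, key_size);
	...
	raw_spin_unlock_irqrestore(&bucket->lock, flags);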
* Re: [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap
From: John Fastabend @ 2023-09-21 4:52 UTC
To: Martin KaFai Lau, John Fastabend
Cc: netdev, bpf, Kui-Feng Lee, Ma Ke, jakub, davem, edumazet, kuba,
pabeni
Martin KaFai Lau wrote:
> On 9/20/23 11:07 AM, John Fastabend wrote:
> >>> has received little attention. Unlike the generic
> >>> hash maps, sockhash protects its buckets only with
> >>> spin_lock_bh(). This can self-deadlock if the lock
> >>> is taken again from interrupt context on the same
> >>> CPU, as CVE-2023-0160 points out.
> >
> > The CVE is a bit exaggerated in my opinion. I'm not sure why
> > anyone would delete an element from interrupt context. But OK,
> > if someone wrote such a thing we shouldn't lock up.
>
> This should only happen in a tracing program? Not sure if it
> would be too drastic to disallow tracing programs from using
> bpf_map_delete_elem at load time now.
I don't think we have any users deleting elements from tracing
programs, but there might be something out there?
>
> A followup question: if sockmap can be accessed from a tracing
> program, does it need an in_nmi() check?
I think we could just check in_nmi() and return -EOPNOTSUPP.
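A minimal sketch of that guard at the top of the map op:

	if (in_nmi())
		return -EOPNOTSUPP;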
>
> >>> hash = sock_hash_bucket_hash(key, key_size);
> >>> bucket = sock_hash_select_bucket(htab, hash);
> >>>
> >>> - spin_lock_bh(&bucket->lock);
> >>> + spin_lock_irqsave(&bucket->lock, flags);
> >
> > The hashtab code's htab_lock_bucket() also does a preempt_disable()
> > followed by raw_spin_lock_irqsave(). Do we need that here as well
> > to handle the CONFIG_PREEMPT cases?
>
> iirc, the preempt_disable in htab is for CONFIG_PREEMPT, but it
> protects the __this_cpu_inc_return to avoid unnecessary lock failures
> due to preemption, so it is probably not needed here. See commit
> 2775da216287 ("bpf: Disable preemption when increasing per-cpu
> map_locked").
>
> If map_delete can be called from any tracing context, the
> raw_spin_lock_xxx version is probably needed though. Otherwise, a
> splat (e.g. from PROVE_RAW_LOCK_NESTING) could be triggered.
Yep. I'll look at it I guess. We should probably either block
access from tracing programs or add some tests.