All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Martin KaFai Lau <martin.lau@linux.dev>,
	 John Fastabend <john.fastabend@gmail.com>
Cc: netdev@vger.kernel.org,  bpf@vger.kernel.org,
	 Kui-Feng Lee <sinquersw@gmail.com>,
	 Ma Ke <make_ruc2021@163.com>,
	 jakub@cloudflare.com,  davem@davemloft.net,
	 edumazet@google.com,  kuba@kernel.org,  pabeni@redhat.com
Subject: Re: [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap
Date: Wed, 20 Sep 2023 21:52:32 -0700	[thread overview]
Message-ID: <650bcc10d2735_7d31e208e7@john.notmuch> (raw)
In-Reply-To: <d05b61ca-0575-de1e-8638-9815ad67f597@linux.dev>

Martin KaFai Lau wrote:
> On 9/20/23 11:07 AM, John Fastabend wrote:
> >>> pay much attention to their deletion. Compared with hash
> >>> maps, sockhash only provides spin_lock_bh protection.
> >>> This causes it to appear to have self-locking behavior
> >>> in the interrupt context, as CVE-2023-0160 points out.
> > 
> > CVE is a bit exagerrated in my opinion. I'm not sure why
> > anyone would delete an element from interrupt context. But,
> > OK if someone wrote such a thing we shouldn't lock up.
> 
> This should only happen in tracing program?
> not sure if it will be too drastic to disallow tracing program to use 
> bpf_map_delete_elem during load time now.

I don't think we have any users from tracing programs, but
might be something out there?

> 
> A followup question, if sockmap can be accessed from tracing program, does it 
> need an in_nmi() check?

I think we could just do 'in_nmi(); return EOPNOTSUPP;'

> 
> >>>    	hash = sock_hash_bucket_hash(key, key_size);
> >>>    	bucket = sock_hash_select_bucket(htab, hash);
> >>>    
> >>> -	spin_lock_bh(&bucket->lock);
> >>> +	spin_lock_irqsave(&bucket->lock, flags);
> > 
> > The hashtab code htab_lock_bucket also does a preempt_disable()
> > followed by raw_spin_lock_irqsave(). Do we need this as well
> > to handle the PREEMPT_CONFIG cases.
> 
> iirc, preempt_disable in htab is for the CONFIG_PREEMPT but it is for the 
> __this_cpu_inc_return to avoid unnecessary lock failure due to preemption, so 
> probably it is not needed here. The commit 2775da216287 ("bpf: Disable 
> preemption when increasing per-cpu map_locked")
> 
> If map_delete can be called from any tracing context, the raw_spin_lock_xxx 
> version is probably needed though. Otherwise, splat (e.g. 
> PROVE_RAW_LOCK_NESTING) could be triggered.

Yep. I'll look at it I guess. We should probably either block
access from tracing programs or add some tests.

      reply	other threads:[~2023-09-21 18:41 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-18  9:36 [PATCH] bpf, sockmap: fix deadlocks in the sockhash and sockmap Ma Ke
2023-09-18 18:49 ` Kui-Feng Lee
2023-09-20 18:07   ` John Fastabend
2023-09-21  1:31     ` Martin KaFai Lau
2023-09-21  4:52       ` John Fastabend [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=650bcc10d2735_7d31e208e7@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jakub@cloudflare.com \
    --cc=kuba@kernel.org \
    --cc=make_ruc2021@163.com \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sinquersw@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.