public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Eduard Zingerman <eddyz87@gmail.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	bpf@vger.kernel.org,  linux-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra	 <peterz@infradead.org>,
	Will Deacon <will@kernel.org>, Waiman Long	 <llong@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko	 <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau	 <martin.lau@kernel.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Tejun Heo	 <tj@kernel.org>, Barret Rhoden <brho@google.com>,
	Josh Don <joshdon@google.com>,  Dohyun Kim <dohyunkim@google.com>,
	linux-arm-kernel@lists.infradead.org, kernel-team@meta.com
Subject: Re: [PATCH bpf-next v2 24/26] bpf: Implement verifier support for rqspinlock
Date: Tue, 11 Feb 2025 16:08:09 -0800	[thread overview]
Message-ID: <a10a2865242dc217e71de54f75fa4289aefb2fa9.camel@gmail.com> (raw)
In-Reply-To: <20250206105435.2159977-25-memxor@gmail.com>

On Thu, 2025-02-06 at 02:54 -0800, Kumar Kartikeya Dwivedi wrote:
> Introduce verifier-side support for rqspinlock kfuncs. The first step is
> allowing bpf_res_spin_lock type to be defined in map values and
> allocated objects, so BTF-side is updated with a new BPF_RES_SPIN_LOCK
> field to recognize and validate.
> 
> Any object cannot have both bpf_spin_lock and bpf_res_spin_lock, only
> one of them (and at most one of them per-object, like before) must be
> present. The bpf_res_spin_lock can also be used to protect objects that
> require lock protection for their kfuncs, like BPF rbtree and linked
> list.
> 
> The verifier plumbing to simulate success and failure cases when calling
> the kfuncs is done by pushing a new verifier state to the verifier state
> stack which will verify the failure case upon calling the kfunc. The
> path where success is indicated creates all lock reference state and IRQ
> state (if necessary for irqsave variants). In the case of failure, the
> state clears the registers r0-r5, sets the return value, and skips kfunc
> processing, proceeding to the next instruction.
> 
> When marking the return value for success case, the value is marked as
> 0, and for the failure case as [-MAX_ERRNO, -1]. Then, in the program,
> whenever user checks the return value as 'if (ret)' or 'if (ret < 0)'
> the verifier never traverses such branches for success cases, and would
> be aware that the lock is not held in such cases.
> 
> We push the kfunc state in check_kfunc_call whenever rqspinlock kfuncs
> are invoked. We introduce a kfunc_class state to avoid mixing lock
> irqrestore kfuncs with IRQ state created by bpf_local_irq_save.
> 
> With all this infrastructure, these kfuncs become usable in programs
> while satisfying all safety properties required by the kernel.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---

Apart from a few nits, I think this patch looks good.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 32c23f2a3086..ed444e44f524 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -115,6 +115,15 @@ struct bpf_reg_state {
>  			int depth:30;
>  		} iter;
>  
> +		/* For irq stack slots */
> +		struct {
> +			enum {
> +				IRQ_KFUNC_IGNORE,

Is this state ever used?
mark_stack_slot_irq_flag() is always called with either NATIVE or LOCK.

> +				IRQ_NATIVE_KFUNC,
> +				IRQ_LOCK_KFUNC,
> +			} kfunc_class;
> +		} irq;
> +
>  		/* Max size from any of the above. */
>  		struct {
>  			unsigned long raw1;

[...]

> @@ -8038,36 +8059,53 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
>  	}
>  
>  	rec = reg_btf_record(reg);
> -	if (!btf_record_has_field(rec, BPF_SPIN_LOCK)) {
> -		verbose(env, "%s '%s' has no valid bpf_spin_lock\n", map ? "map" : "local",
> -			map ? map->name : "kptr");
> +	if (!btf_record_has_field(rec, is_res_lock ? BPF_RES_SPIN_LOCK : BPF_SPIN_LOCK)) {
> +		verbose(env, "%s '%s' has no valid %s_lock\n", map ? "map" : "local",
> +			map ? map->name : "kptr", lock_str);
>  		return -EINVAL;
>  	}
> -	if (rec->spin_lock_off != val + reg->off) {
> -		verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock' that is at %d\n",
> -			val + reg->off, rec->spin_lock_off);
> +	spin_lock_off = is_res_lock ? rec->res_spin_lock_off : rec->spin_lock_off;
> +	if (spin_lock_off != val + reg->off) {
> +		verbose(env, "off %lld doesn't point to 'struct %s_lock' that is at %d\n",
> +			val + reg->off, lock_str, spin_lock_off);
>  		return -EINVAL;
>  	}
>  	if (is_lock) {
>  		void *ptr;
> +		int type;
>  
>  		if (map)
>  			ptr = map;
>  		else
>  			ptr = btf;
>  
> -		if (cur->active_locks) {
> -			verbose(env,
> -				"Locking two bpf_spin_locks are not allowed\n");
> -			return -EINVAL;
> +		if (!is_res_lock && cur->active_locks) {

Nit: having '&& cur->active_locks' in this branch but not the one for
     'is_res_lock' is a bit confusing. As far as I understand this is
     just an optimization, and active_locks check could be done (or dropped)
     in both cases.

> +			if (find_lock_state(env->cur_state, REF_TYPE_LOCK, 0, NULL)) {
> +				verbose(env,
> +					"Locking two bpf_spin_locks are not allowed\n");
> +				return -EINVAL;
> +			}
> +		} else if (is_res_lock) {
> +			if (find_lock_state(env->cur_state, REF_TYPE_RES_LOCK, reg->id, ptr)) {
> +				verbose(env, "Acquiring the same lock again, AA deadlock detected\n");
> +				return -EINVAL;
> +			}
>  		}

Nit: there is no branch for find_lock_state(... REF_TYPE_RES_LOCK_IRQ ...),
     this is not a problem, as other checks catch the imbalance in
     number of unlocks or unlock of the same lock, but verifier won't
     report the above "AA deadlock" message for bpf_res_spin_lock_irqsave().

The above two checks make it legal to take resilient lock while
holding regular lock and vice versa. This is probably ok, can't figure
out an example when this causes trouble.

> -		err = acquire_lock_state(env, env->insn_idx, REF_TYPE_LOCK, reg->id, ptr);
> +
> +		if (is_res_lock && is_irq)
> +			type = REF_TYPE_RES_LOCK_IRQ;
> +		else if (is_res_lock)
> +			type = REF_TYPE_RES_LOCK;
> +		else
> +			type = REF_TYPE_LOCK;
> +		err = acquire_lock_state(env, env->insn_idx, type, reg->id, ptr);
>  		if (err < 0) {
>  			verbose(env, "Failed to acquire lock state\n");
>  			return err;
>  		}
>  	} else {
>  		void *ptr;
> +		int type;
>  
>  		if (map)
>  			ptr = map;

[...]


  reply	other threads:[~2025-02-12  0:08 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-06 10:54 [PATCH bpf-next v2 00/26] Resilient Queued Spin Lock Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 01/26] locking: Move MCS struct definition to public header Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 02/26] locking: Move common qspinlock helpers to a private header Kumar Kartikeya Dwivedi
2025-02-07 23:21   ` kernel test robot
2025-02-06 10:54 ` [PATCH bpf-next v2 03/26] locking: Allow obtaining result of arch_mcs_spin_lock_contended Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 04/26] locking: Copy out qspinlock.c to rqspinlock.c Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 05/26] rqspinlock: Add rqspinlock.h header Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 06/26] rqspinlock: Drop PV and virtualization support Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 07/26] rqspinlock: Add support for timeouts Kumar Kartikeya Dwivedi
2025-02-10  9:56   ` Peter Zijlstra
2025-02-11  4:55     ` Alexei Starovoitov
2025-02-11 10:11       ` Peter Zijlstra
2025-02-11 18:00         ` Alexei Starovoitov
2025-02-06 10:54 ` [PATCH bpf-next v2 08/26] rqspinlock: Protect pending bit owners from stalls Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 09/26] rqspinlock: Protect waiters in queue " Kumar Kartikeya Dwivedi
2025-02-10 10:17   ` Peter Zijlstra
2025-02-13  6:20     ` Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 10/26] rqspinlock: Protect waiters in trylock fallback " Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 11/26] rqspinlock: Add deadlock detection and recovery Kumar Kartikeya Dwivedi
2025-02-08  1:53   ` Alexei Starovoitov
2025-02-08  3:03     ` Kumar Kartikeya Dwivedi
2025-02-10 10:21   ` Peter Zijlstra
2025-02-13  6:11     ` Kumar Kartikeya Dwivedi
2025-02-10 10:36   ` Peter Zijlstra
2025-02-06 10:54 ` [PATCH bpf-next v2 12/26] rqspinlock: Add a test-and-set fallback Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 13/26] rqspinlock: Add basic support for CONFIG_PARAVIRT Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 14/26] rqspinlock: Add helper to print a splat on timeout or deadlock Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 15/26] rqspinlock: Add macros for rqspinlock usage Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 16/26] rqspinlock: Add locktorture support Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 17/26] rqspinlock: Hardcode cond_acquire loops to asm-generic implementation Kumar Kartikeya Dwivedi
2025-02-08  1:58   ` Alexei Starovoitov
2025-02-08  3:04     ` Kumar Kartikeya Dwivedi
2025-02-10  9:53   ` Peter Zijlstra
2025-02-10 10:03     ` Peter Zijlstra
2025-02-13  6:15       ` Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 18/26] rqspinlock: Add entry to Makefile, MAINTAINERS Kumar Kartikeya Dwivedi
2025-02-07 14:14   ` kernel test robot
2025-02-07 14:45   ` kernel test robot
2025-02-08  0:43   ` kernel test robot
2025-02-06 10:54 ` [PATCH bpf-next v2 19/26] bpf: Convert hashtab.c to rqspinlock Kumar Kartikeya Dwivedi
2025-02-08  2:01   ` Alexei Starovoitov
2025-02-08  3:06     ` Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 20/26] bpf: Convert percpu_freelist.c " Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 21/26] bpf: Convert lpm_trie.c " Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 22/26] bpf: Introduce rqspinlock kfuncs Kumar Kartikeya Dwivedi
2025-02-07 13:43   ` kernel test robot
2025-02-06 10:54 ` [PATCH bpf-next v2 23/26] bpf: Handle allocation failure in acquire_lock_state Kumar Kartikeya Dwivedi
2025-02-08  2:04   ` Alexei Starovoitov
2025-02-06 10:54 ` [PATCH bpf-next v2 24/26] bpf: Implement verifier support for rqspinlock Kumar Kartikeya Dwivedi
2025-02-12  0:08   ` Eduard Zingerman [this message]
2025-02-13  6:41     ` Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 25/26] bpf: Maintain FIFO property for rqspinlock unlock Kumar Kartikeya Dwivedi
2025-02-06 10:54 ` [PATCH bpf-next v2 26/26] selftests/bpf: Add tests for rqspinlock Kumar Kartikeya Dwivedi
2025-02-12  0:14   ` Eduard Zingerman
2025-02-13  6:25     ` Kumar Kartikeya Dwivedi
2025-02-10  9:31 ` [PATCH bpf-next v2 00/26] Resilient Queued Spin Lock Peter Zijlstra
2025-02-10  9:38 ` Peter Zijlstra
2025-02-10 10:49   ` Peter Zijlstra
2025-02-11  4:37     ` Alexei Starovoitov
2025-02-11 10:43       ` Peter Zijlstra
2025-02-11 18:33         ` Alexei Starovoitov
2025-02-13  9:59           ` Peter Zijlstra
2025-02-14  2:37             ` Alexei Starovoitov
2025-03-04 10:46               ` Peter Zijlstra
2025-03-05  3:26                 ` Alexei Starovoitov
2025-02-10  9:49 ` Peter Zijlstra
2025-02-10 19:16   ` Ankur Arora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a10a2865242dc217e71de54f75fa4289aefb2fa9.camel@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brho@google.com \
    --cc=daniel@iogearbox.net \
    --cc=dohyunkim@google.com \
    --cc=joshdon@google.com \
    --cc=kernel-team@meta.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llong@redhat.com \
    --cc=martin.lau@kernel.org \
    --cc=memxor@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox