From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-173.mta1.migadu.com (out-173.mta1.migadu.com [95.215.58.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F3B934DCC7 for ; Sat, 9 May 2026 12:53:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.173 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778331184; cv=none; b=qPp+0azdN2Aj9pxzBwjPo1M2boEAyd99bzG5CWgxYh/H8PSodjim/h3C5Oib/M9ulLm5zKTFQ3Bx4i2Cx6478MghRRYf/R7pwnyeinPJStgle/lzywIRSCspE/4H1NdjhrVRXtGEgqh3lvDghtLlgmApqHD/ErvOIAHi09z5UZs= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778331184; c=relaxed/simple; bh=zEHcF8UGh6iBzjoccOZzsct1oj4aSpJ1aER21JPaAig=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=hKBEwA0hkDSFyNLCQm2zFHa4+HRKNRc/zJeVXKxHgwyTEWIZ/3wiANOAHPSig1zG4syyrTDzfCFz3l75+jjs53PEP0rx0QvUaRM+uyxv32abaBOyBxqDICVTQPnQY3wqAgWuZ761HG0O14rHpqQs+DkPnOuOYbWDN/bSH2LjkiE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=Rz2TBa2A; arc=none smtp.client-ip=95.215.58.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="Rz2TBa2A" Message-ID: <0b36de5b-63a3-4694-9981-431b4e703e09@linux.dev> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1778331179; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tr3OIEvULxXbPBC1UI6czzrg66lNBZvRF8A6TGS9wtM=; b=Rz2TBa2AQIcQwi/Abftsq5TlAjeYhezSTJF2LjDW03OsLeI9VoCbFSy5q0APV+QF42QXuP j+wKFvmmlX9ECxt1/6PzVjqppQCbnrvZaK/9kHYWYF9iqQrR9YH6QYVAiFqudLRstC7ka7 49hUZLFuzQC6XPZlaWo+2KxRc7RUkBk= Date: Sat, 9 May 2026 13:52:51 +0100 Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [PATCH bpf-next v2 05/23] bpf: Support stack arguments for bpf functions To: bot+bpf-ci@kernel.org, bpf@vger.kernel.org Cc: ast@kernel.org, andrii@kernel.org, daniel@iogearbox.net, jose.marchesi@oracle.com, kernel-team@fb.com, martin.lau@kernel.org, eddyz87@gmail.com, clm@meta.com, ihor.solodrai@linux.dev References: <20260507213008.1127429-1-yonghong.song@linux.dev> <11c51daec78a68837f719172ae1c21db8b3e98e0a76aa5bcd59c8b089760c40d@mail.kernel.org> Content-Language: en-GB X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yonghong Song In-Reply-To: <11c51daec78a68837f719172ae1c21db8b3e98e0a76aa5bcd59c8b089760c40d@mail.kernel.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT On 5/7/26 12:26 AM, bot+bpf-ci@kernel.org wrote: >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h >> index 321b9d69cf9c..1fba16fced28 100644 >> --- a/include/linux/bpf_verifier.h >> +++ b/include/linux/bpf_verifier.h >> @@ -427,6 +427,10 @@ struct bpf_func_state { >> * `stack`. allocated_stack is always a multiple of BPF_REG_SIZE. >> */ >> int allocated_stack; >> + >> + bool no_stack_arg_load; >> + u16 out_stack_arg_cnt; /* Number of outgoing on-stack argument slots */ >> + struct bpf_reg_state *stack_arg_regs; /* Outgoing on-stack arguments */ >> }; > The new field no_stack_arg_load is placed after allocated_stack, but > copy_stack_state() never assigns it. > > The comment at line 417 says 'The following fields should be last. See > copy_func_state()', and copy_func_state() does this: > > memcpy(dst, src, offsetof(struct bpf_func_state, stack)); > > so any field after stack is not memcpy'd. copy_stack_state() copies > stack, allocated_stack, out_stack_arg_cnt, and stack_arg_regs, but > not no_stack_arg_load. > > Consequence: whenever a state is cloned (e.g., via push_stack() at > conditional jumps), the pushed state inherits no_stack_arg_load=0 > regardless of the source state's value. > > This breaks the invariant the commit message describes: 'all stack arg > reads must be ahead of any stack arg write' and 'all stack arg reads must > be before any bpf func, kfunc and helpers'. > > A BPF program can write to an outgoing stack arg (which sets > cur->no_stack_arg_load=true in check_stack_arg_write), then execute a > conditional jump which calls push_stack(). The pushed state's > no_stack_arg_load is reset to 0, so a subsequent stack arg read would be > accepted by check_stack_arg_read() when it should be rejected. > > The same issue applies to state pruning: func_states_equal/stack_arg_safe > does not compare no_stack_arg_load either, so a state with > no_stack_arg_load=true can be pruned to an old state with > no_stack_arg_load=false incorrectly. > > Should no_stack_arg_load be placed before stack (to be memcpy'd) or should > copy_stack_state() explicitly assign it? This is my fault. In this version, I tried to make code better to understand, e.g. putting related fields together. But in this particular case, I should not do this as I forgot this is for state copying. Will fix in the next revision. > > [ ... ] > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index 3bafb7ad2ba7..61d6663498f7 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -1362,6 +1362,18 @@ static int copy_stack_state(struct bpf_func_state *dst, const struct bpf_func_st >> return -ENOMEM; >> >> dst->allocated_stack = src->allocated_stack; >> + >> + /* copy stack args state */ >> + n = src->out_stack_arg_cnt; >> + if (n) { >> + dst->stack_arg_regs = copy_array(dst->stack_arg_regs, src->stack_arg_regs, n, >> + sizeof(struct bpf_reg_state), >> + GFP_KERNEL_ACCOUNT); >> + if (!dst->stack_arg_regs) >> + return -ENOMEM; >> + } >> + >> + dst->out_stack_arg_cnt = src->out_stack_arg_cnt; >> return 0; >> } > copy_stack_state() copies stack, allocated_stack, stack_arg_regs, and > out_stack_arg_cnt, but no_stack_arg_load is missing. > > The field is declared in struct bpf_func_state after allocated_stack, so > it lives in the region that copy_func_state's memcpy excludes: > > static int copy_func_state(struct bpf_func_state *dst, > const struct bpf_func_state *src) > { > memcpy(dst, src, offsetof(struct bpf_func_state, stack)); > return copy_stack_state(dst, src); > } > > The memcpy stops at stack, so no_stack_arg_load is never copied. Every > state copy path through bpf_copy_verifier_state(), push_stack(), and > pop_stack() silently resets the flag. > > The commit enforces the invariant 'all stack arg reads must precede any > stack arg write or any call within a frame' by setting > state->no_stack_arg_load = true in check_stack_arg_write() and in > do_check_insn()'s call branch, and checking the flag in > check_stack_arg_read(). > > When the verifier encounters a branch, push_stack() calls > bpf_copy_verifier_state() which loses no_stack_arg_load. A BPF program can > perform a write or call, conditionally branch, and after the branch perform > a stack arg read that should be rejected. The read passes verification > because the flag was reset to false on the restored state. > > The commit message says this invariant is required because 'JIT may emit > mov insns for read/write with the same register and bpf function, kfunc and > helper will invalidate all arguments immediately after the call'. Bypassing > the flag allows JIT-generated code that reads stale or clobbered stack > argument slots at runtime. > > Should copy_stack_state() add 'dst->no_stack_arg_load = > src->no_stack_arg_load;'? Again, will fix no_stack_arg_load issue in the next revision. > > [ ... ] > >> @@ -4050,6 +4080,105 @@ static int check_stack_write(struct bpf_verifier_env *env, >> return err; >> } >> >> +/* >> + * Write a value to the outgoing stack arg area. >> + * off is a negative offset from r11 (e.g. -8 for arg6, -16 for arg7). >> + */ >> +static int check_stack_arg_write(struct bpf_verifier_env *env, struct bpf_func_state *state, >> + int off, struct bpf_reg_state *value_reg) >> +{ >> + int max_stack_arg_regs = MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS; >> + struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno]; >> + int spi = -off / BPF_REG_SIZE - 1; >> + struct bpf_func_state *cur; >> + struct bpf_reg_state *arg; >> + int err; >> + >> + if (spi >= max_stack_arg_regs) { >> + verbose(env, "stack arg write offset %d exceeds max %d stack args\n", >> + off, max_stack_arg_regs); >> + return -EINVAL; >> + } >> + >> + err = grow_stack_arg_slots(env, state, spi + 1); >> + if (err) >> + return err; >> + >> + /* Track the max outgoing stack arg slot count. */ >> + if (spi + 1 > subprog->max_out_stack_arg_cnt) >> + subprog->max_out_stack_arg_cnt = spi + 1; >> + >> + cur = env->cur_state->frame[env->cur_state->curframe]; > This isn't a bug, but the local variable cur is assigned but never read. > Both call sites already pass state = env->cur_state->frame[curframe], so > cur is always identical to the state parameter. Should this line be > removed? Right, 'cur = ...' should be removed. > > [ ... ] > >> @@ -16961,6 +17135,14 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) > The register validation in check_and_resolve_insns() allows r11/BPF_REG_PARAMS > through for all instruction classes, but only BPF_LDX, BPF_STX, and BPF_ST > have explicit handling. > > In v4 of this series (April 11), bot+bpf-ci@kernel.org raised a concern: > > If a BPF program uses r11 as src_reg or dst_reg in an ALU or JMP > instruction, it would cause an out-of-bounds access to regs[11] (array > is only regs[0-10]). > > Would it make sense to restrict the r11 exemption here to only the > instruction classes that actually handle it (BPF_LDX, BPF_STX, BPF_ST), > rather than allowing it globally? > > The author acknowledged this was a bug and agreed to restrict the r11 > exemption to only BPF_LDX, BPF_STX, BPF_ST. However, the current code does > not have any exemption for r11. > > Looking at check_and_resolve_insns(), it checks 'insn->dst_reg >= > MAX_BPF_REG' without exception. Since BPF_REG_PARAMS == MAX_BPF_REG == 11, > this will reject all uses of r11 including legitimate LDX/STX/ST operations > on stack arguments. > > Should check_and_resolve_insns() allow r11 for the specific instruction > classes that handle stack arguments (BPF_LDX, BPF_STX, BPF_ST)? This actually handled in later patch "bpf: Enable r11 based insns", so we should be fine for this one. > > Reference: > https://lore.kernel.org/bpf/bcd1b7033de7f597d73b3991f8fc4d8b502fce451fc288d909ae72a717578a0d@mail.kernel.org/ > > > --- > AI reviewed your patch. Please fix the bug or email reply why it's not a bug. > See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md > > CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25523765323