From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from plesk.hostmyservers.fr (plesk.hostmyservers.fr [45.145.164.37]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C82F22F76F; Sat, 25 Oct 2025 19:29:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.145.164.37 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761420550; cv=none; b=p1L1XjwVM/7sMZnIrV4DvUMvlgvuU7TNfTXrfjxa9miZ1nYcifG3whLp+WFLHxAqVif1RqsDVBZRIc7h2PQRAw/bgSuUbUBIVJcZ4qn122tcniwXRXyDF6C1CHdn6Sx5ozf26QZ137jt6/JHVkdq/Y6847RqCY9xKUL3qQvOunI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761420550; c=relaxed/simple; bh=oEW51IBjajiRamfCDh8N/0EC3k3vJUif37TpfUO9fCs=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=iadRSvM4ONdK+jpsmoZguYN0Tmi4U4p6i59uGW7HNdOoInHYkrVOGk6XRY2QQ1jAAbVzLl3+U4Eh1IbHh/062I9ZizPraFEDp9kzzCsxLYJtWdDdiGvIuQZbiXpK2HzM4jTzumzD9vTG8hk6tYl56+07KaONKAJ1fpI8kNc6o2g= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=arnaud-lcm.com; spf=pass smtp.mailfrom=arnaud-lcm.com; arc=none smtp.client-ip=45.145.164.37 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=arnaud-lcm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arnaud-lcm.com Received: from dev-dsk-arnaudlc-1a-b66eeb5f.eu-west-1.amazon.com (54-240-197-230.amazon.com [54.240.197.230]) by plesk.hostmyservers.fr (Postfix) with ESMTPSA id A565440B21; Sat, 25 Oct 2025 19:29:05 +0000 (UTC) Authentication-Results: Plesk; spf=pass (sender IP is 54.240.197.230) smtp.mailfrom=contact@arnaud-lcm.com smtp.helo=dev-dsk-arnaudlc-1a-b66eeb5f.eu-west-1.amazon.com Received-SPF: pass (Plesk: connection is authenticated) From: Arnaud Lecomte To: alexei.starovoitov@gmail.com, andrii.nakryiko@gmail.com, andrii@kernel.org Cc: ast@kernel.org, bpf@vger.kernel.org, daniel@iogearbox.net, eddyz87@gmail.com, haoluo@google.com, john.fastabend@gmail.com, jolsa@kernel.org, kpsingh@kernel.org, linux-kernel@vger.kernel.org, martin.lau@linux.dev, sdf@fomichev.me, song@kernel.org, syzbot+c9b724fbb41cf2538b7b@syzkaller.appspotmail.com, syzkaller-bugs@googlegroups.com, yonghong.song@linux.dev, Arnaud Lecomte Subject: [PATCH V10 RESEND 1/2] bpf: refactor stack map trace depth calculation into helper function Date: Sat, 25 Oct 2025 19:28:58 +0000 Message-ID: <20251025192858.31424-1-contact@arnaud-lcm.com> X-Mailer: git-send-email 2.47.3 Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-PPP-Message-ID: <176142054658.23855.11269455129227822570@Plesk> X-PPP-Vhost: arnaud-lcm.com Extract the duplicated maximum allowed depth computation for stack traces stored in BPF stacks from bpf_get_stackid() and __bpf_get_stack() into a dedicated stack_map_calculate_max_depth() helper function. This unifies the logic for: - The max depth computation - Enforcing the sysctl_perf_event_max_stack limit No functional changes for existing code paths. Acked-by: Yonghong Song Acked-by: Song Liu Signed-off-by: Arnaud Lecomte --- Changes in v2: - Removed the checking 'map_size % map_elem_size' from stack_map_calculate_max_depth - Changed stack_map_calculate_max_depth params name to be more generic Changes in v3: - Changed map size param to size in max depth helper Changes in v4: - Fixed indentation in max depth helper for args Changes in v5: - Bound back trace_nr to num_elem in __bpf_get_stack - Make a copy of sysctl_perf_event_max_stack in stack_map_calculate_max_depth Changes in v6: - Restrained max_depth computation only when required - Additional cleanup from Song in __bpf_get_stack Changes in v7: - Removed additional cleanup from v6 Changes in v9: - Fixed incorrect removal of num_elem in get stack Changes in v10: - Squashed 2 previous patch 1 and 2 Link to v9: https://lore.kernel.org/all/20250912233409.74900-1-contact@arnaud-lcm.com/ --- --- kernel/bpf/stackmap.c | 47 +++++++++++++++++++++++++++++-------------- 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 4d53cdd1374c..5e9ad050333c 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -42,6 +42,28 @@ static inline int stack_map_data_size(struct bpf_map *map) sizeof(struct bpf_stack_build_id) : sizeof(u64); } +/** + * stack_map_calculate_max_depth - Calculate maximum allowed stack trace depth + * @size: Size of the buffer/map value in bytes + * @elem_size: Size of each stack trace element + * @flags: BPF stack trace flags (BPF_F_USER_STACK, BPF_F_USER_BUILD_ID, ...) + * + * Return: Maximum number of stack trace entries that can be safely stored + */ +static u32 stack_map_calculate_max_depth(u32 size, u32 elem_size, u64 flags) +{ + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 max_depth; + u32 curr_sysctl_max_stack = READ_ONCE(sysctl_perf_event_max_stack); + + max_depth = size / elem_size; + max_depth += skip; + if (max_depth > curr_sysctl_max_stack) + return curr_sysctl_max_stack; + + return max_depth; +} + static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { u64 elem_size = sizeof(struct stack_map_bucket) + @@ -300,20 +322,17 @@ static long __bpf_get_stackid(struct bpf_map *map, BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map, u64, flags) { - u32 max_depth = map->value_size / stack_map_data_size(map); - u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + u32 elem_size = stack_map_data_size(map); bool user = flags & BPF_F_USER_STACK; struct perf_callchain_entry *trace; bool kernel = !user; + u32 max_depth; if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID))) return -EINVAL; - max_depth += skip; - if (max_depth > sysctl_perf_event_max_stack) - max_depth = sysctl_perf_event_max_stack; - + max_depth = stack_map_calculate_max_depth(map->value_size, elem_size, flags); trace = get_perf_callchain(regs, kernel, user, max_depth, false, false); @@ -406,7 +425,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, struct perf_callchain_entry *trace_in, void *buf, u32 size, u64 flags, bool may_fault) { - u32 trace_nr, copy_len, elem_size, num_elem, max_depth; + u32 trace_nr, copy_len, elem_size, max_depth; bool user_build_id = flags & BPF_F_USER_BUILD_ID; bool crosstask = task && task != current; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; @@ -438,21 +457,20 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, goto clear; } - num_elem = size / elem_size; - max_depth = num_elem + skip; - if (sysctl_perf_event_max_stack < max_depth) - max_depth = sysctl_perf_event_max_stack; + max_depth = stack_map_calculate_max_depth(size, elem_size, flags); if (may_fault) rcu_read_lock(); /* need RCU for perf's callchain below */ - if (trace_in) + if (trace_in) { trace = trace_in; - else if (kernel && task) + trace->nr = min_t(u32, trace->nr, max_depth); + } else if (kernel && task) { trace = get_callchain_entry_for_task(task, max_depth); - else + } else { trace = get_perf_callchain(regs, kernel, user, max_depth, crosstask, false); + } if (unlikely(!trace) || trace->nr < skip) { if (may_fault) @@ -461,7 +479,6 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, } trace_nr = trace->nr - skip; - trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem; copy_len = trace_nr * elem_size; ips = trace->ip + skip; -- 2.47.3