Date: Tue, 19 Dec 2023 08:38:51 -0500
From: Steven Rostedt
To: Shung-Hsi Yu
Cc: Andrii Nakryiko, Philo Lu, bpf@vger.kernel.org, song@kernel.org,
 andrii@kernel.org, ast@kernel.org, Daniel Borkmann,
 xuanzhuo@linux.alibaba.com, dust.li@linux.alibaba.com,
 guwen@linux.alibaba.com, alibuda@linux.alibaba.com,
 hengqi@linux.alibaba.com, Nathan Slingerland, "rihams@meta.com",
 Alan Maguire, Masami Hiramatsu
Subject: Re: Question about bpf perfbuf/ringbuf: pinned in backend with overwriting
Message-ID: <20231219083851.0ec83349@gandalf.local.home>
References: <3dd9114c-599f-46b2-84b9-abcfd2dcbe33@linux.alibaba.com>
 <23691bb5-9688-4e93-a98c-1024e8a8fc62@linux.alibaba.com>

On Tue, 19 Dec 2023 14:23:59 +0800
Shung-Hsi Yu wrote:

> Curious whether it is possible to reuse ftrace's trace buffer instead
> (or its underlying ring buffer implementation at
> kernel/trace/ring_buffer.c). AFAICT it satisfies both requirements that
> Philo stated: (1) no need for user process as the buffer is accessible
> through tracefs, and (2) has an overwrite mode.

Yes, the ftrace ring buffer was in fact designed for the above use case.

>
> Furthermore, a natural feature request that would come after
> overwriting support would be snapshotting, and that has already been
> covered in ftrace.

Yes, it has that too.

>
> Note: technically BPF program could already write to ftrace's trace
> buffer with the bpf_trace_vprintk() helper, but that goes through string
> formatting and only allows writing into the global buffer.

When eBPF was first being developed, Alexei told me he tried the ftrace
ring buffer, and he said the filtering was too slow. That's because it
would always write the event into the ring buffer and then try to
discard it after the fact, which required a few cmpxchg operations to
synchronize. He decided that the perf ring buffer was a better fit for
this.

That was solved with this commit:

  0fc1b09ff1ff4 ("tracing: Use temp buffer when filtering events")

This makes the filtering similar to perf, as perf always copies events
to a temporary buffer first. It still falls back to writing directly
into the ring buffer if the temp buffer is currently being used by
another event on the same CPU.

Note that the perf ring buffer was designed for profiling (taking
intermediate traces) and is tightly coupled to having a reader, whereas
the ftrace ring buffer was designed for high-speed constant tracing,
with or without a reader.

-- Steve
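
To make the filtering change described above concrete, here is a minimal
toy sketch in Python. It is not the kernel implementation: the names
OverwriteRing, FilteredWriter, and temp_busy are hypothetical, the model
is single-threaded, and the cmpxchg synchronization is not modeled at
all. It only shows the control flow: stage the event in a temp buffer
and filter it before touching the ring, and fall back to
write-then-discard when the temp buffer is already in use.

```python
from collections import deque


class OverwriteRing:
    """Toy ring buffer in overwrite mode (hypothetical, not kernel code).

    When the ring is full, appending a new event drops the oldest one,
    so a writer never has to wait for a reader.
    """

    def __init__(self, capacity):
        self.events = deque(maxlen=capacity)
        self.discards = 0  # events written to the ring, then thrown away

    def commit(self, event):
        self.events.append(event)

    def discard_last(self):
        # Models the costly "write first, discard afterwards" path.
        self.events.pop()
        self.discards += 1


class FilteredWriter:
    """Filters events before they reach the ring whenever possible."""

    def __init__(self, ring, predicate):
        self.ring = ring
        self.predicate = predicate
        # True while another event on this (simulated) CPU is using the
        # temp buffer, e.g. an event nested inside an interrupt.
        self.temp_busy = False

    def write(self, event):
        """Return True if the event was kept in the ring."""
        if not self.temp_busy:
            self.temp_busy = True
            try:
                staged = dict(event)  # copy into the temp buffer
                if not self.predicate(staged):
                    return False  # filtered out, ring never touched
                self.ring.commit(staged)
                return True
            finally:
                self.temp_busy = False

        # Fallback: temp buffer in use, so write directly into the ring
        # and discard afterwards if the filter rejects the event.
        self.ring.commit(dict(event))
        if not self.predicate(event):
            self.ring.discard_last()
            return False
        return True
```

In this sketch a rejected event only costs a discard on the fallback
path; on the common path the ring is never written at all, which is the
property the commit message is after.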