From: Petr Tesarik <ptesarik@suse.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Clark Williams <clrkwllms@kernel.org>,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
linux-rt-devel@lists.linux.dev
Subject: Re: [PATCH] ring-buffer: Use a housekeeping CPU to wake up waiters
Date: Fri, 9 Jan 2026 09:57:56 +0100 [thread overview]
Message-ID: <20260109095756.13deb429@mordecai> (raw)
In-Reply-To: <20260108115800.7a7fc8a7@gandalf.local.home>
On Thu, 8 Jan 2026 11:58:00 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Thu, 8 Jan 2026 09:39:32 +0100
> Petr Tesarik <ptesarik@suse.com> wrote:
>
> > > > Or we simply change it to:
> > > >
> > > > static inline void
> > >
> > > Actually, the above should be noinline, as it's in a slower path, and
> > > should not be adding logic into the cache of the fast path.
> >
> > However, to be honest, I'm surprized this is considered slow path. My
> > use case is to record a few selected trace events with "trace-cmd
> > record", which spends most time polling trace_pipe_raw. Consequently,
> > there is almost always a pending waiter that requires a wakeup.
> >
> > In short, irq_work_queue() is the hot path for me.
> >
> > OTOH I don't mind making it noinline, because on recent Intel and AMD
> > systems, a function call (noinline) is often cheaper than an increase
> > in L1 cache footprint (caused by inlining). But I'm confused. I have
> > always thought most people use tracing same way as I do.
>
> The call to rb_wakeups() is a fast path, but the wakeup itself is a slow
> path. This is the case even when you have user space in a loop that is just
> waiting on data.
>
> User space tool:
>
> ring_buffer_wait() {
> wake_event_interruptible(.., rb_wait_cond(..));
> }
>
> Writer:
>
> rb_wakeups() {
> if (!full_hit())
> return;
> }
>
> The full_hit() is the watermark check. If you look at the tracefs
> directory, you'll see a "buffer_percent" file, which is default set to 50.
> That means that full_hit() will not return true until the ring buffer is
> around 50 percent full. This function is called thousands of times before
> the first wakeup happens.
>
> Let's look at even a waiter that isn't using the buffer percent. This means
> it will be woken up on any event in the buffer.
>
> rb_wakeups() {
> if (buffer->irq_work.waiters_pending) {
> buffer->irq_work.waiters_pending = false;
> /* irq_work_queue() supplies it's own memory barriers */
> irq_work_queue(&buffer->irq_work.work);
>
>
> So it clears the waiters_pending flag and wakes up the waiter. Now the
> waiter wakes up and starts reading the ring buffer. While the ring buffer
> has content, it will continue to read and doesn't block again until the
> ring buffer is empty. This means that thousands of events are being
> recorded with no waiters to wake up.
>
> See why this is a slow path?
Thank you for the detailed explanation. So, yeah, most people use it
differently from me, generating trace events fast enough that the
reader does not consume the previous event before the next one arrives.
I have removed both "inline" and "noinline" in v2, leaving it at the
discretion of the compiler. If you believe it deserves a "noinline",
feel free to add it. FWIW on x86-64, I didn't observe any measurable
diference either in latency or instruction cache footprint.
Petr T
next prev parent reply other threads:[~2026-01-09 8:58 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-06 9:10 [PATCH] ring-buffer: Use a housekeeping CPU to wake up waiters Petr Tesarik
2026-01-06 22:04 ` Steven Rostedt
2026-01-07 7:50 ` Petr Tesarik
2026-01-07 9:51 ` Petr Tesarik
2026-01-07 16:17 ` Steven Rostedt
2026-01-07 16:19 ` Steven Rostedt
2026-01-08 8:39 ` Petr Tesarik
2026-01-08 10:46 ` Petr Tesarik
2026-01-08 16:58 ` Steven Rostedt
2026-01-09 8:57 ` Petr Tesarik [this message]
2026-01-09 16:15 ` Steven Rostedt
2026-01-09 16:54 ` Petr Tesarik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260109095756.13deb429@mordecai \
--to=ptesarik@suse.com \
--cc=bigeasy@linutronix.de \
--cc=clrkwllms@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-devel@lists.linux.dev \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox