linux-trace-kernel.vger.kernel.org archive mirror
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH v2 0/2] ring-buffer: Fix poll wakeup logic
Date: Tue, 12 Mar 2024 09:19:19 -0400
Message-ID: <20240312131919.314231457@goodmis.org>


After making a slight change to wakeups in ring_buffer_wait(),
the system would hang. After spending several hours on a wild goose
chase, I found that the change only triggered the bug because it
changed the timings. The bug was there before the update but was
never triggered.

The poll code has:

  rbwork->full_waiters_pending = true;
  if (!cpu_buffer->shortest_full ||
      cpu_buffer->shortest_full > full)
         cpu_buffer->shortest_full = full;

The writer will see full_waiters_pending and check if the ring buffer is
filled over the percentage of the shortest_full value. If it is, it calls
an irq_work to wake up all the waiters.
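
As a rough illustration of the "filled over the percentage" check, here is
a minimal userspace sketch. The name watermark_exceeded() and its
parameters are hypothetical and not taken from ring_buffer.c. The strict
'>' comparison simply mirrors the diagram in this message; the kernel's
own helper may round or compare differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true when dirty_pages out of nr_pages is more
 * than `full` percent of the buffer.
 */
static bool watermark_exceeded(size_t dirty_pages, size_t nr_pages,
			       unsigned int full)
{
	assert(full <= 100);

	if (nr_pages == 0)
		return false;	/* nothing to measure in this model */

	/* Cross-multiply so integer division cannot round the result. */
	return dirty_pages * 100 > (size_t)full * nr_pages;
}
```

With a 10 page buffer and full = 50, six dirty pages exceed the watermark
and five do not. With full = 0, any dirty page exceeds it, which is why a
shortest_full of 0 lets the writer's check pass so easily.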

But the code could get into an endless loop:

        CPU 0                                   CPU 1
        -----                                   -----
 [ Poll ]
   [ shortest_full = 0 ]
   rbwork->full_waiters_pending = true;
                                          if (rbwork->full_waiters_pending &&
                                              [ buffer percent ] > shortest_full) {
                                                 rbwork->wakeup_full = true;
                                                 [ queue_irqwork ]

   cpu_buffer->shortest_full = full;

                                          [ IRQ work ]
                                          if (rbwork->wakeup_full) {
                                                cpu_buffer->shortest_full = 0;
                                                wakeup poll waiters;
  [woken]
   if ([ buffer percent ] > full)
      break;
   rbwork->full_waiters_pending = true;
                                          if (rbwork->full_waiters_pending &&
                                              [ buffer percent ] > shortest_full) {
                                                 rbwork->wakeup_full = true;
                                                 [ queue_irqwork ]

   cpu_buffer->shortest_full = full;

                                          [ IRQ work ]
                                          if (rbwork->wakeup_full) {
                                                cpu_buffer->shortest_full = 0;
                                                wakeup poll waiters;
  [woken]

 [ Wash, rinse, repeat! ]
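
The interleaving above can be replayed deterministically in a small
single-threaded userspace model. This is only a sketch: struct rb_model,
count_early_wakeups() and the watermark_first switch are names made up
for illustration, and the clearing of full_waiters_pending in the irq
work is an assumption of the model. The "watermark first" ordering is
shown only for contrast; it is not claimed to be what patch 1 does.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the state shared by the poller and the writer. */
struct rb_model {
	bool full_waiters_pending;
	bool wakeup_full;
	int shortest_full;	/* 0 means no waiter has set a watermark */
	int buffer_percent;	/* how full the buffer is right now */
};

/* Writer (CPU 1): returns true if it would queue the irq_work. */
static bool writer_check(struct rb_model *rb)
{
	if (rb->full_waiters_pending &&
	    rb->buffer_percent > rb->shortest_full) {
		rb->wakeup_full = true;
		return true;
	}
	return false;
}

/* IRQ work (CPU 1): returns true if it woke the poll waiters. */
static bool irq_work_run(struct rb_model *rb)
{
	if (!rb->wakeup_full)
		return false;
	/* Assumption of this model: the wakeup consumes both flags. */
	rb->wakeup_full = false;
	rb->full_waiters_pending = false;
	rb->shortest_full = 0;
	return true;
}

static void set_shortest_full(struct rb_model *rb, int full)
{
	if (!rb->shortest_full || rb->shortest_full > full)
		rb->shortest_full = full;
}

/*
 * Run `rounds` poll iterations with the writer's check landing right
 * after full_waiters_pending is set. Returns how many wakeups happened
 * while the buffer was still at or below the poller's watermark.
 */
static int count_early_wakeups(int rounds, int full, int buffer_percent,
			       bool watermark_first)
{
	struct rb_model rb = { .buffer_percent = buffer_percent };
	int early = 0;

	assert(full > 0 && full <= 100);

	for (int i = 0; i < rounds; i++) {
		bool queued;

		if (watermark_first)
			set_shortest_full(&rb, full);	/* contrast order */
		rb.full_waiters_pending = true;		/* CPU 0 */
		queued = writer_check(&rb);		/* CPU 1 */
		if (!watermark_first)
			set_shortest_full(&rb, full);	/* CPU 0, too late */

		if (queued && irq_work_run(&rb) && rb.buffer_percent <= full)
			early++;
	}
	return early;
}
```

With full = 50 and a buffer that is 10% full,
count_early_wakeups(5, 50, 10, false) counts a premature wakeup on every
round, while the same call with watermark_first = true counts none.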

The race was triggered when running:

  trace-cmd record -p function -m 5000

This enables function tracing and creates two files to write into, each
2500K in size. The -m is a "max file size". When trace-cmd has written
2500K to one file, it switches to the other, erasing the old data. To do
this, trace-cmd alternates between poll and the reader, using both
methods of wakeup. The change to the reader wakeup altered the way the
poll was woken, which triggered this bug.

The second patch is a cleanup that also consolidates the shortest_full
logic. The read wakeup uses rb_watermark_hit() for both full wakeups and
!full wakeups. Since poll uses the same logic for full wakeups, it can
just call that function with full set.
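
The idea of one shared watermark helper can be sketched in userspace as
follows. Everything here is a hypothetical model: struct rb_model,
model_watermark_hit() and the two model_*_ready() callers are invented
names, the percent comparison is an assumption, and the locking that the
kernel function does around shortest_full is left out because the model
is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

struct rb_model {
	int shortest_full;	/* smallest watermark any waiter asked for */
	int buffer_percent;	/* current fill level, 0 when empty */
};

/*
 * Hypothetical stand-in for a shared watermark helper. In the kernel
 * the shortest_full update is done under a spinlock; this model has
 * no locking.
 */
static bool model_watermark_hit(struct rb_model *rb, int full)
{
	bool hit;

	assert(full >= 0 && full <= 100);

	if (rb->buffer_percent == 0)
		return false;		/* nothing to read yet */
	if (!full)
		return true;		/* !full waiters want any data */

	hit = rb->buffer_percent > full;
	if (!hit && (!rb->shortest_full || rb->shortest_full > full))
		rb->shortest_full = full;	/* remember what to wait for */
	return hit;
}

/* Both entry points defer to the same helper. */
static bool model_read_ready(struct rb_model *rb, int full)
{
	return model_watermark_hit(rb, full);
}

static bool model_poll_ready(struct rb_model *rb, int full)
{
	return model_watermark_hit(rb, full);
}
```

Because both callers go through one function, the rule for lowering
shortest_full lives in a single place instead of being open coded in the
poll path as well.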

Changes since v1: https://lore.kernel.org/all/20240312115455.666920175@goodmis.org/

- Removed unused 'flags' in ring_buffer_poll_wait() as the spin_lock
  is now in rb_watermark_hit().


Steven Rostedt (Google) (2):
      ring-buffer: Fix full_waiters_pending in poll
      ring-buffer: Reuse rb_watermark_hit() for the poll logic

----
 kernel/trace/ring_buffer.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)
