linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Kent Overstreet <kent.overstreet@linux.dev>
Subject: Re: [PATCH v3] ring-buffer: Fix writing to the buffer with max_data_size
Date: Wed, 13 Dec 2023 08:45:58 +0900	[thread overview]
Message-ID: <20231213084558.90dc9a9064c48cebb3d0ab8f@kernel.org> (raw)
In-Reply-To: <20231212111617.39e02849@gandalf.local.home>

On Tue, 12 Dec 2023 11:16:17 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> The maximum ring buffer data size is the maximum size of data that can be
> recorded on the ring buffer. Events must be smaller than the sub buffer
> data size minus any meta data. This size is checked before trying to
> allocate from the ring buffer because the allocation assumes that the size
> will fit on the sub buffer.
> 
> The maximum size was calculated as the size of a sub buffer page (which is
> currently PAGE_SIZE minus the sub buffer header) minus the size of the
> meta data of an individual event. But it missed the possible adding of a
> time stamp for events that are added long enough apart that the event meta
> data can't hold the time delta.
> 
> When an event is added that is greater than the current BUF_MAX_DATA_SIZE
> minus the size of a time stamp, but still less than or equal to
> BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking
> for a page that can hold the event. Luckily, there's a check for this loop
> and after 1000 iterations and a warning is emitted and the ring buffer is
> disabled. But this should never happen.
> 
> This can happen when a large event is added first, or after a long period
> where an absolute timestamp is prefixed to the event, increasing its size
> by 8 bytes. This passes the check and then goes into the algorithm that
> causes the infinite loop.
> 
> For events that are the first event on the sub-buffer, it does not need to
> add a timestamp, because the sub-buffer itself contains an absolute
> timestamp, and adding one is redundant.
> 
> The fix is to check if the event is to be the first event on the
> sub-buffer, and if it is, then do not add a timestamp.
> 
> This also fixes 32 bit adding a timestamp when a read of before_stamp or
> write_stamp is interrupted. There's still no need to add that timestamp if
> the event is going to be the first event on the sub buffer.
> 
> Also, if the buffer has "time_stamp_abs" set, then also check if the
> length plus the timestamp is greater than the BUF_MAX_DATA_SIZE.
> 
> Link: https://lore.kernel.org/all/20231212104549.58863438@gandalf.local.home/
> Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6c13@gandalf.local.home
> 
> Cc: stable@vger.kernel.org
> Fixes: a4543a2fa9ef3 ("ring-buffer: Get timestamp after event is allocated")
> Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some branches")
> Reported-by: Kent Overstreet <kent.overstreet@linux.dev> # (on IRC)
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

This looks good to me :)

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thank you!

> ---
> Changes since v2: https://lore.kernel.org/linux-trace-kernel/20231212065922.05f28041@gandalf.local.home
> 
> - Just test 'w' first, and then do the rest of the checks.
> 
>  kernel/trace/ring_buffer.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 8d2a4f00eca9..b8986f82eccf 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3579,7 +3579,10 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  		 * absolute timestamp.
>  		 * Don't bother if this is the start of a new page (w == 0).
>  		 */
> -		if (unlikely(!a_ok || !b_ok || (info->before != info->after && w))) {
> +		if (!w) {
> +			/* Use the sub-buffer timestamp */
> +			info->delta = 0;
> +		} else if (unlikely(!a_ok || !b_ok || info->before != info->after)) {
>  			info->add_timestamp |= RB_ADD_STAMP_FORCE | RB_ADD_STAMP_EXTEND;
>  			info->length += RB_LEN_TIME_EXTEND;
>  		} else {
> @@ -3737,6 +3740,8 @@ rb_reserve_next_event(struct trace_buffer *buffer,
>  	if (ring_buffer_time_stamp_abs(cpu_buffer->buffer)) {
>  		add_ts_default = RB_ADD_STAMP_ABSOLUTE;
>  		info.length += RB_LEN_TIME_EXTEND;
> +		if (info.length > BUF_MAX_DATA_SIZE)
> +			goto out_fail;
>  	} else {
>  		add_ts_default = RB_ADD_STAMP_NONE;
>  	}
> -- 
> 2.42.0
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

      reply	other threads:[~2023-12-12 23:46 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-12 16:16 [PATCH v3] ring-buffer: Fix writing to the buffer with max_data_size Steven Rostedt
2023-12-12 23:45 ` Masami Hiramatsu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231213084558.90dc9a9064c48cebb3d0ab8f@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).