* [PATCH v4] ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus
@ 2023-04-12 11:23 Tze-nan Wu
2023-04-24 20:11 ` Steven Rostedt
0 siblings, 1 reply; 2+ messages in thread
From: Tze-nan Wu @ 2023-04-12 11:23 UTC (permalink / raw)
To: rostedt, mhiramat
Cc: bobule.chang, wsd_upstream, Tze-nan.Wu, cheng-jui.wang, npiggin,
stable, AngeloGioacchino Del Regno, Paul E. McKenney,
linux-kernel, linux-trace-kernel, linux-arm-kernel,
linux-mediatek
In ring_buffer_reset_online_cpus, the buffer_size_kb write operation
may permanently fail if the cpu_online_mask changes between two
for_each_online_buffer_cpu loops. The number of increases and decreases
on both cpu_buffer->resize_disabled and cpu_buffer->record_disabled may be
inconsistent, causing some CPUs to have non-zero values for these atomic
variables after the function returns.
This issue can be reproduced by "echo 0 > trace" while hotplugging cpu.
After reproducing success, we can find out buffer_size_kb will not be
functional anymore.
To prevent leaving 'resize_disabled' and 'record_disabled' non-zero after
ring_buffer_reset_online_cpus returns, we ensure that each atomic variable
has been set up before atomic_sub() to it.
Cc: stable@vger.kernel.org
Cc: npiggin@gmail.com
Fixes: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
Reviewed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
---
Changes from v1 to v3: https://lore.kernel.org/all/20230408052226.25268-1-Tze-nan.Wu@mediatek.com/
- Declare the cpumask variable statically rather than dynamically.
Changes from v2 to v3: https://lore.kernel.org/all/20230409024616.31099-1-Tze-nan.Wu@mediatek.com/
- Considering holding cpu_hotplug_lock too long because of the
synchronize_rcu(), maybe it's better to prevent the issue by copying
cpu_online_mask at the entry of the function as V1 does, instead of
using cpus_read_lock().
Changes from v3 to v4: https://lore.kernel.org/all/20230410073512.13362-1-Tze-nan.Wu@mediatek.com/
- Considering that the size of cpumask may not be too big on some machines
We no longer adopt the approach of copying cpumask at the beginning of
the function. Instead, we ensure that atomic variables have been set up
before atomic_sub() is called.
- Change the title of the patch.
---
kernel/trace/ring_buffer.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 76a2d91eecad..8c647d8b5bb4 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -5361,20 +5361,28 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
for_each_online_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
- atomic_inc(&cpu_buffer->resize_disabled);
+#define RESET_BIT (1 << 30)
+ atomic_add(RESET_BIT, &cpu_buffer->resize_disabled);
atomic_inc(&cpu_buffer->record_disabled);
}
/* Make sure all commits have finished */
synchronize_rcu();
- for_each_online_buffer_cpu(buffer, cpu) {
+ for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
+ /*
+ * If a CPU came online during the synchronize_rcu(), then
+ * ignore it.
+ */
+ if (!(atomic_read(&cpu_buffer->resize_disabled) & RESET_BIT))
+ continue;
+
reset_disabled_cpu_buffer(cpu_buffer);
atomic_dec(&cpu_buffer->record_disabled);
- atomic_dec(&cpu_buffer->resize_disabled);
+ atomic_sub(RESET_BIT, &cpu_buffer->resize_disabled);
}
mutex_unlock(&buffer->mutex);
--
2.18.0
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [PATCH v4] ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus
2023-04-12 11:23 [PATCH v4] ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus Tze-nan Wu
@ 2023-04-24 20:11 ` Steven Rostedt
0 siblings, 0 replies; 2+ messages in thread
From: Steven Rostedt @ 2023-04-24 20:11 UTC (permalink / raw)
To: Tze-nan Wu
Cc: mhiramat, bobule.chang, wsd_upstream, cheng-jui.wang, npiggin,
stable, AngeloGioacchino Del Regno, Paul E. McKenney,
linux-kernel, linux-trace-kernel, linux-arm-kernel,
linux-mediatek
On Wed, 12 Apr 2023 19:23:56 +0800
Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
> In ring_buffer_reset_online_cpus, the buffer_size_kb write operation
> may permanently fail if the cpu_online_mask changes between two
> for_each_online_buffer_cpu loops. The number of increases and decreases
> on both cpu_buffer->resize_disabled and cpu_buffer->record_disabled may be
> inconsistent, causing some CPUs to have non-zero values for these atomic
> variables after the function returns.
>
> This issue can be reproduced by "echo 0 > trace" while hotplugging cpu.
> After reproducing success, we can find out buffer_size_kb will not be
> functional anymore.
>
> To prevent leaving 'resize_disabled' and 'record_disabled' non-zero after
> ring_buffer_reset_online_cpus returns, we ensure that each atomic variable
> has been set up before atomic_sub() to it.
>
> Cc: stable@vger.kernel.org
> Cc: npiggin@gmail.com
> Fixes: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
> Reviewed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> ---
> Changes from v1 to v3: https://lore.kernel.org/all/20230408052226.25268-1-Tze-nan.Wu@mediatek.com/
> - Declare the cpumask variable statically rather than dynamically.
>
> Changes from v2 to v3: https://lore.kernel.org/all/20230409024616.31099-1-Tze-nan.Wu@mediatek.com/
> - Considering holding cpu_hotplug_lock too long because of the
> synchronize_rcu(), maybe it's better to prevent the issue by copying
> cpu_online_mask at the entry of the function as V1 does, instead of
> using cpus_read_lock().
>
> Changes from v3 to v4: https://lore.kernel.org/all/20230410073512.13362-1-Tze-nan.Wu@mediatek.com/
> - Considering that the size of cpumask may not be too big on some machines
> We no longer adopt the approach of copying cpumask at the beginning of
> the function. Instead, we ensure that atomic variables have been set up
> before atomic_sub() is called.
> - Change the title of the patch.
> ---
> kernel/trace/ring_buffer.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 76a2d91eecad..8c647d8b5bb4 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -5361,20 +5361,28 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
> for_each_online_buffer_cpu(buffer, cpu) {
> cpu_buffer = buffer->buffers[cpu];
>
> - atomic_inc(&cpu_buffer->resize_disabled);
> +#define RESET_BIT (1 << 30)
Nit, please add the define outside the function. You could do it right
before the function, but defines like this make the code somewhat ugly.
> + atomic_add(RESET_BIT, &cpu_buffer->resize_disabled);
> atomic_inc(&cpu_buffer->record_disabled);
> }
>
> /* Make sure all commits have finished */
> synchronize_rcu();
>
> - for_each_online_buffer_cpu(buffer, cpu) {
> + for_each_buffer_cpu(buffer, cpu) {
> cpu_buffer = buffer->buffers[cpu];
>
> + /*
> + * If a CPU came online during the synchronize_rcu(), then
> + * ignore it.
> + */
> + if (!(atomic_read(&cpu_buffer->resize_disabled) & RESET_BIT))
> + continue;
> +
> reset_disabled_cpu_buffer(cpu_buffer);
>
> atomic_dec(&cpu_buffer->record_disabled);
> - atomic_dec(&cpu_buffer->resize_disabled);
> + atomic_sub(RESET_BIT, &cpu_buffer->resize_disabled);
> }
>
> mutex_unlock(&buffer->mutex);
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2023-04-24 20:11 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-12 11:23 [PATCH v4] ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus Tze-nan Wu
2023-04-24 20:11 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).