public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rostedt <rostedt@goodmis.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	Yordan Karadzhov <y.karadz@gmail.com>,
	Tzvetomir Stoyanov <tz.stoyanov@gmail.com>,
	Tom Zanussi <zanussi@kernel.org>,
	Jason Behmer <jbehmer@google.com>,
	Julia Lawall <julia.lawall@inria.fr>,
	Clark Williams <williams@redhat.com>,
	bristot <bristot@redhat.com>, Daniel Wagner <wagi@monom.org>,
	Darren Hart <dvhart@vmware.com>, Jonathan Corbet <corbet@lwn.net>,
	"Suresh E. Warrier" <warrier@linux.vnet.ibm.com>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>
Subject: Re: [RFC][PATCH] ring-buffer: Have nested events still record running time stamp
Date: Mon, 29 Jun 2020 23:13:32 -0400 (EDT)	[thread overview]
Message-ID: <650783164.16607.1593486812734.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20200629202146.29520925@oasis.local.home>

----- On Jun 29, 2020, at 8:21 PM, rostedt rostedt@goodmis.org wrote:

> On Thu, 25 Jun 2020 15:35:02 -0400 (EDT)
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
>> >> So the reservation is not "just" an add instruction, it's actually an
>> >> xadd on x86. Is that really faster than a cmpxchg ?
>> > 
>> > I believe the answer is still yes. But I can run some benchmarks to
>> > make sure.
>> 
>> This would be interesting to see, because if xadd and cmpxchg have
>> similar overhead, then going for a cmpxchg-loop for the space
>> reservation could vastly decrease the overall complexity of this
>> timestamp+space reservation algorithm.
> 
> So I tested this on three different machines to see how it compared. I
> applied this patch:
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index c2c0d25ea004..8a7d62375eb3 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3161,7 +3161,19 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
> 
>  /*B*/	rb_time_set(&cpu_buffer->before_stamp, info->ts);
> 
> +#if 0
>  /*C*/	write = local_add_return(info->length, &tail_page->write);
> +#else
> +	{
> +		unsigned long expect, result;
> +
> +		do {
> +			expect = local_read(&tail_page->write);
> +			write = expect + info->length;
> +			result = local_cmpxchg(&tail_page->write, expect, write);
> +		} while (expect != result);
> +	}
> +#endif
> 
> 	/* set write to only the index of the write */
> 	write &= RB_WRITE_MASK;
> 
> 
> And tested before and after the results.

[...]
> 
> Not that big of a difference, but there is a difference. And the
> standard deviation appears to be better, which would make sense as an
> local_add_return() needs no repeat on collision.

I tried running your recommended test on my machine many times with the same
configuration. It appears that the avg and std dev results vary a lot between
runs of the exact same configuration:

Model name:          Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

* With xadd (existing code):

 event_benchmark-1828  [016]    71.309475: benchmark_event:      last=190 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309476: benchmark_event:      last=192 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309476: benchmark_event:      last=195 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309477: benchmark_event:      last=193 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309478: benchmark_event:      last=190 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309479: benchmark_event:      last=190 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309479: benchmark_event:      last=192 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309480: benchmark_event:      last=192 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309481: benchmark_event:      last=195 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437
 event_benchmark-1828  [016]    71.309481: benchmark_event:      last=193 first=4200 max=234226 min=185 avg=256 std=333 std^2=111437


 event_benchmark-1865  [003]   162.955546: benchmark_event:      last=489 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955546: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955547: benchmark_event:      last=204 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955548: benchmark_event:      last=204 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955549: benchmark_event:      last=203 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955549: benchmark_event:      last=206 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955550: benchmark_event:      last=203 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955551: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955552: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057
 event_benchmark-1865  [003]   162.955552: benchmark_event:      last=205 first=4386 max=29000 min=185 avg=243 std=161 std^2=26057

 event_benchmark-1904  [003]   202.847955: benchmark_event:      last=205 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847956: benchmark_event:      last=500 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847957: benchmark_event:      last=355 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847958: benchmark_event:      last=340 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847959: benchmark_event:      last=462 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847960: benchmark_event:      last=278 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847961: benchmark_event:      last=243 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847962: benchmark_event:      last=205 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847963: benchmark_event:      last=470 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605
 event_benchmark-1904  [003]   202.847964: benchmark_event:      last=364 first=5230 max=27647 min=185 avg=239 std=150 std^2=22605

 event_benchmark-1941  [003]   234.577797: benchmark_event:      last=204 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577798: benchmark_event:      last=445 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577799: benchmark_event:      last=693 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577800: benchmark_event:      last=293 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577801: benchmark_event:      last=205 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577802: benchmark_event:      last=202 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577802: benchmark_event:      last=203 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577803: benchmark_event:      last=204 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577804: benchmark_event:      last=204 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989
 event_benchmark-1941  [003]   234.577805: benchmark_event:      last=205 first=5117 max=17387 min=185 avg=233 std=141 std^2=19989


 event_benchmark-1978  [003]   279.276351: benchmark_event:      last=205 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276352: benchmark_event:      last=203 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276353: benchmark_event:      last=207 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276354: benchmark_event:      last=207 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276355: benchmark_event:      last=205 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276355: benchmark_event:      last=205 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276356: benchmark_event:      last=204 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276357: benchmark_event:      last=206 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276358: benchmark_event:      last=286 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885
 event_benchmark-1978  [003]   279.276359: benchmark_event:      last=202 first=4583 max=17638 min=185 avg=241 std=163 std^2=26885


 event_benchmark-2015  [003]   309.974432: benchmark_event:      last=202 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974433: benchmark_event:      last=204 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974434: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974435: benchmark_event:      last=206 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974435: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974436: benchmark_event:      last=203 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974437: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974438: benchmark_event:      last=275 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974439: benchmark_event:      last=205 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323
 event_benchmark-2015  [003]   309.974440: benchmark_event:      last=600 first=5090 max=22623 min=185 avg=241 std=152 std^2=23323

-> This configuration's avg varies between 233 and 256 ns based on your benchmark methodology. The std. dev. varies between
141 and 333 ns.


* With your cmpxchg modification:

 event_benchmark-1256  [010]    56.943734: benchmark_event:      last=210 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943736: benchmark_event:      last=208 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943737: benchmark_event:      last=1223 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943738: benchmark_event:      last=676 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943739: benchmark_event:      last=337 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943740: benchmark_event:      last=455 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943741: benchmark_event:      last=280 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943742: benchmark_event:      last=246 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943743: benchmark_event:      last=205 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524
 event_benchmark-1256  [010]    56.943743: benchmark_event:      last=207 first=5210 max=83243 min=187 avg=265 std=201 std^2=40524

 event_benchmark-1293  [025]    84.352468: benchmark_event:      last=198 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352469: benchmark_event:      last=198 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352470: benchmark_event:      last=338 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352471: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352471: benchmark_event:      last=196 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352472: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352473: benchmark_event:      last=198 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352474: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352474: benchmark_event:      last=197 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469
 event_benchmark-1293  [025]    84.352475: benchmark_event:      last=196 first=5446 max=25969 min=188 avg=241 std=168 std^2=28469


 event_benchmark-1330  [014]   111.729937: benchmark_event:      last=473 first=3724 max=31525 min=187 avg=245 std=160 std^2=25626
 event_benchmark-1330  [014]   111.729939: benchmark_event:      last=1085 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729940: benchmark_event:      last=602 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729942: benchmark_event:      last=602 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729943: benchmark_event:      last=393 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729944: benchmark_event:      last=558 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729946: benchmark_event:      last=410 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729947: benchmark_event:      last=408 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729949: benchmark_event:      last=428 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627
 event_benchmark-1330  [014]   111.729950: benchmark_event:      last=717 first=3724 max=31525 min=187 avg=245 std=160 std^2=25627

 event_benchmark-1367  [025]   143.251162: benchmark_event:      last=209 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251163: benchmark_event:      last=420 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251164: benchmark_event:      last=413 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251165: benchmark_event:      last=209 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251165: benchmark_event:      last=205 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251166: benchmark_event:      last=209 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251167: benchmark_event:      last=208 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251168: benchmark_event:      last=207 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251169: benchmark_event:      last=208 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567
 event_benchmark-1367  [025]   143.251169: benchmark_event:      last=210 first=3706 max=19397 min=187 avg=244 std=146 std^2=21567

 event_benchmark-1404  [025]   182.750612: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750612: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750613: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750614: benchmark_event:      last=210 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750615: benchmark_event:      last=209 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750616: benchmark_event:      last=208 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750616: benchmark_event:      last=206 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750617: benchmark_event:      last=207 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750618: benchmark_event:      last=206 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319
 event_benchmark-1404  [025]   182.750619: benchmark_event:      last=209 first=3727 max=18550 min=187 avg=239 std=138 std^2=19319

-> This configuration's avg varies between 239 and 256 ns based on your benchmark methodology. The std. dev. varies between
138 and 201 ns.

In the benchmark results you collected, config #1 appeared to have little delta between xadd and cmpxchg, whereas
config #2 and #3 had some avg difference. Did you try running this test many times on each machine to see if the
result was indeed stable ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2020-06-30  3:13 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-25 13:44 [RFC][PATCH] ring-buffer: Have nested events still record running time stamp Steven Rostedt
2020-06-25 13:53 ` Mathieu Desnoyers
2020-06-25 14:37   ` Steven Rostedt
2020-06-25 16:42     ` Korben Rusek
2020-06-25 18:12       ` Steven Rostedt
2020-06-25 17:55 ` Mathieu Desnoyers
2020-06-25 18:35   ` Steven Rostedt
2020-06-25 19:35     ` Mathieu Desnoyers
2020-06-25 19:58       ` Steven Rostedt
2020-06-26  2:36       ` Steven Rostedt
2020-06-26  3:35         ` Steven Rostedt
2020-06-26 13:58           ` Steven Rostedt
2020-06-26 18:13             ` Mathieu Desnoyers
2020-06-26 18:58               ` Steven Rostedt
2020-06-26 19:39                 ` Steven Rostedt
2020-06-30  0:21       ` Steven Rostedt
2020-06-30  3:13         ` Mathieu Desnoyers [this message]
2020-06-30  3:26           ` Steven Rostedt
2020-06-25 19:04   ` Steven Rostedt
2020-06-25 19:58     ` Mathieu Desnoyers
2020-06-25 20:42       ` Steven Rostedt
2020-06-25 19:09   ` Steven Rostedt
2020-06-25 20:03     ` Mathieu Desnoyers
2020-06-25 18:09 ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=650783164.16607.1593486812734.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=acme@redhat.com \
    --cc=bristot@redhat.com \
    --cc=corbet@lwn.net \
    --cc=dvhart@vmware.com \
    --cc=jbehmer@google.com \
    --cc=joel@joelfernandes.org \
    --cc=jolsa@redhat.com \
    --cc=julia.lawall@inria.fr \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tz.stoyanov@gmail.com \
    --cc=wagi@monom.org \
    --cc=warrier@linux.vnet.ibm.com \
    --cc=williams@redhat.com \
    --cc=y.karadz@gmail.com \
    --cc=zanussi@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox