All of lore.kernel.org
 help / color / mirror / Atom feed
* Question on updated Figure 5.1 "Atomic Increment Scalability on Kaby Lake"
@ 2018-09-02  8:54 Akira Yokosawa
       [not found] ` <CAFMy-Rk76bc+oOXq7ijGT4wzYkFbs1F7bkLWQ-VwE_O2Q-Z2VQ@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Akira Yokosawa @ 2018-09-02  8:54 UTC (permalink / raw)
  To: Palik, Imre; +Cc: Paul E. McKenney, perfbook, Akira Yokosawa

Hello Palik,

I'm looking at Figure 5.1 of perfbook which was updated by commit
3b62f67a76e1 ("Regenerating the atomic counter graph on a more
modern CPU"), and wondering the time to increment at x = 1 looks
too small.

As is described in the Answer to Quick Quiz 5.8, atomic increment
should take a bit longer than the ideal non-atomic increment even
when the number of updaters is 1.

Or on Kaby Lake, some special optimization results in almost no
cost in this case? If that is the case, the Quick Quiz needs update...

Also, the horizontal dashed line should be closer to the x-axis,
I suppose. The time of the ideal non-atomic increment is embedded on
line 44 of CodeSamples/count/plots.sh as "8.81772".
You might need to use a proper value for Kaby Lake (should be near
zero) instead.

If I understand the circumstances right, raw data for the previous
version of the graph reside in
CodeSamples/count/data/count_atomic:u.2009.05.03a.dat.

Can you share the corresponding raw data on Kaby Lake?

Also, in the above mentioned commit, atomic125.{eps|png} were
overwritten accidentally. They were graphs of 125-thread case,
and should not have been updated IMO. They are not used in perfbook
at the moment, but I'd like to have the original ones in the repository.

If you could share the raw data for Kaby Lake, I'd be happy to fix
these issues for you. 

I'm not blaming any of you. I could blame the ad-hoc design of plots.sh,
but it was written by Paul for his own use. Such scripts tend to be
ad-hoc. ;-)  I might be able to improve the script.

        Thanks, Akira


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Question on updated Figure 5.1 "Atomic Increment Scalability on Kaby Lake"
       [not found] ` <CAFMy-Rk76bc+oOXq7ijGT4wzYkFbs1F7bkLWQ-VwE_O2Q-Z2VQ@mail.gmail.com>
@ 2018-09-12 10:51   ` Akira Yokosawa
  2018-09-12 10:53     ` [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c Akira Yokosawa
  0 siblings, 1 reply; 4+ messages in thread
From: Akira Yokosawa @ 2018-09-12 10:51 UTC (permalink / raw)
  To: Imre Palik; +Cc: Paul E. McKenney, perfbook, Akira Yokosawa

(Added CC to Paul and perfbook list.)

On 2018/09/12 07:54:06 +0200, Imre Palik wrote:
> Hi Akira,
> 
> sorry for the confusion.  I should have taken better care.  I attached
> the row data from a measurement.

Thank you!

I just skimmed the output of count_nonatomic.

---
count_nonatomic 1 uperf 2
n_reads: 0  n_updates: 778797305000  nreaders: 0  nupdaters: 1 duration: 240
ns/read: -nan  ns/update: 0.000308167
count_nonatomic 1 uperf 2
n_reads: 0  n_updates: 779548753000  nreaders: 0  nupdaters: 1 duration: 240
ns/read: -nan  ns/update: 0.00030787
count_nonatomic 1 uperf 2
n_reads: 0  n_updates: 801753128000  nreaders: 0  nupdaters: 1 duration: 240
ns/read: -nan  ns/update: 0.000299344
count_nonatomic 1 uperf 2
n_reads: 0  n_updates: 783816136000  nreaders: 0  nupdaters: 1 duration: 240
ns/read: -nan  ns/update: 0.000306194
count_nonatomic 1 uperf 2
n_reads: 0  n_updates: 786531305000  nreaders: 0  nupdaters: 1 duration: 240
ns/read: -nan  ns/update: 0.000305137
count_nonatomic 2 uperf 2
!!! Count mismatch: 1566325493000 counted vs. 776288032000 final value
n_reads: 0  n_updates: 1566325493000  nreaders: 0  nupdaters: 2 duration: 240
ns/read: -nan  ns/update: 0.00030645
count_nonatomic 2 uperf 2
!!! Count mismatch: 1637607003000 counted vs. 825312711000 final value
n_reads: 0  n_updates: 1637607003000  nreaders: 0  nupdaters: 2 duration: 240
ns/read: -nan  ns/update: 0.000293111
count_nonatomic 2 uperf 2
!!! Count mismatch: 1525890163000 counted vs. 768873049000 final value
n_reads: 0  n_updates: 1525890163000  nreaders: 0  nupdaters: 2 duration: 240
ns/read: -nan  ns/update: 0.00031457
count_nonatomic 2 uperf 2
!!! Count mismatch: 1487184107000 counted vs. 717784175000 final value
n_reads: 0  n_updates: 1487184107000  nreaders: 0  nupdaters: 2 duration: 240
ns/read: -nan  ns/update: 0.000322758
count_nonatomic 2 uperf 2
!!! Count mismatch: 1512515450000 counted vs. 751090603000 final value
n_reads: 0  n_updates: 1512515450000  nreaders: 0  nupdaters: 2 duration: 240
ns/read: -nan  ns/update: 0.000317352
--

These results look too fast. I checked CodeSamples/count/count_nonatoimc.c
and found that it has no READ_ONCE()/WRITE_ONCE() to access count.

I'm submitting a patch to fix it soon.
Can you share the raw data after applying it?

> 
> Thanks,
> 
> Imre

And apologies for my confusion of your first name.

Thanks, Akira

> On Sun, Sep 2, 2018 at 10:55 AM Akira Yokosawa <akiyks@gmail.com> wrote:
>>
>> Hello Palik,
>>
>> I'm looking at Figure 5.1 of perfbook which was updated by commit
>> 3b62f67a76e1 ("Regenerating the atomic counter graph on a more
>> modern CPU"), and wondering the time to increment at x = 1 looks
>> too small.
>>
>> As is described in the Answer to Quick Quiz 5.8, atomic increment
>> should take a bit longer than the ideal non-atomic increment even
>> when the number of updaters is 1.
>>
>> Or on Kaby Lake, some special optimization results in almost no
>> cost in this case? If that is the case, the Quick Quiz needs update...
>>
>> Also, the horizontal dashed line should be closer to the x-axis,
>> I suppose. The time of the ideal non-atomic increment is embedded on
>> line 44 of CodeSamples/count/plots.sh as "8.81772".
>> You might need to use a proper value for Kaby Lake (should be near
>> zero) instead.
>>
>> If I understand the circumstances right, raw data for the previous
>> version of the graph reside in
>> CodeSamples/count/data/count_atomic:u.2009.05.03a.dat.
>>
>> Can you share the corresponding raw data on Kaby Lake?
>>
>> Also, in the above mentioned commit, atomic125.{eps|png} were
>> overwritten accidentally. They were graphs of 125-thread case,
>> and should not have been updated IMO. They are not used in perfbook
>> at the moment, but I'd like to have the original ones in the repository.
>>
>> If you could share the raw data for Kaby Lake, I'd be happy to fix
>> these issues for you.
>>
>> I'm not blaming any of you. I could blame the ad-hoc design of plots.sh,
>> but it was written by Paul for his own use. Such scripts tend to be
>> ad-hoc. ;-)  I might be able to improve the script.
>>
>>         Thanks, Akira


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c
  2018-09-12 10:51   ` Akira Yokosawa
@ 2018-09-12 10:53     ` Akira Yokosawa
  2018-09-12 13:18       ` Paul E. McKenney
  0 siblings, 1 reply; 4+ messages in thread
From: Akira Yokosawa @ 2018-09-12 10:53 UTC (permalink / raw)
  To: Imre Palik, Paul E. McKenney; +Cc: perfbook

From 13fa1b9b162483df5d88b7aefe408a4e17e83338 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@gmail.com>
Date: Wed, 12 Sep 2018 19:27:41 +0900
Subject: [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c

Without them, inlined accesses can be optimized away by "-O3".

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
---
 CodeSamples/count/count_nonatomic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CodeSamples/count/count_nonatomic.c b/CodeSamples/count/count_nonatomic.c
index 90979c5..50df475 100644
--- a/CodeSamples/count/count_nonatomic.c
+++ b/CodeSamples/count/count_nonatomic.c
@@ -25,12 +25,12 @@ unsigned long counter = 0;

 __inline__ void inc_count(void)
 {
-	counter++;
+	WRITE_ONCE(counter, READ_ONCE(counter) + 1);
 }

 __inline__ unsigned long read_count(void)
 {
-	return counter;
+	return READ_ONCE(counter);
 }

 __inline__ void count_init(void)
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c
  2018-09-12 10:53     ` [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c Akira Yokosawa
@ 2018-09-12 13:18       ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2018-09-12 13:18 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: Imre Palik, perfbook

On Wed, Sep 12, 2018 at 07:53:24PM +0900, Akira Yokosawa wrote:
> >From 13fa1b9b162483df5d88b7aefe408a4e17e83338 Mon Sep 17 00:00:00 2001
> From: Akira Yokosawa <akiyks@gmail.com>
> Date: Wed, 12 Sep 2018 19:27:41 +0900
> Subject: [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c
> 
> Without them, inlined accesses can be optimized away by "-O3".

That code was written in a far more innocent time when compilers were
far less aggressive, wasn't it?

Great catch, both on the data and on the code, thank you!  Applied and
pushed.

							Thanx, Paul

> Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
> ---
>  CodeSamples/count/count_nonatomic.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/CodeSamples/count/count_nonatomic.c b/CodeSamples/count/count_nonatomic.c
> index 90979c5..50df475 100644
> --- a/CodeSamples/count/count_nonatomic.c
> +++ b/CodeSamples/count/count_nonatomic.c
> @@ -25,12 +25,12 @@ unsigned long counter = 0;
> 
>  __inline__ void inc_count(void)
>  {
> -	counter++;
> +	WRITE_ONCE(counter, READ_ONCE(counter) + 1);
>  }
> 
>  __inline__ unsigned long read_count(void)
>  {
> -	return counter;
> +	return READ_ONCE(counter);
>  }
> 
>  __inline__ void count_init(void)
> -- 
> 2.7.4
> 
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-09-12 18:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-02  8:54 Question on updated Figure 5.1 "Atomic Increment Scalability on Kaby Lake" Akira Yokosawa
     [not found] ` <CAFMy-Rk76bc+oOXq7ijGT4wzYkFbs1F7bkLWQ-VwE_O2Q-Z2VQ@mail.gmail.com>
2018-09-12 10:51   ` Akira Yokosawa
2018-09-12 10:53     ` [PATCH] CodeSamples/count: Use READ_ONCE/WRITE_ONCE in count_nonatomic.c Akira Yokosawa
2018-09-12 13:18       ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.