Discussions of the Parallel Programming book
* Memory barrier pairing question
From: Artem Polyakov @ 2018-07-23  1:42 UTC (permalink / raw)
  To: perfbook

Hello,
I have a question about the following scenario (considering POWER arch):

initial { x = 0; y = 0; }

thread0 {
    x = 1;
    lwsync;
    y = 1;
}

thread1 {
    a = y;
    isync;
    b = x;
}

Because "isync" is not a memory barrier, this example doesn't have
read/write barrier pairing. However, if I understand correctly, lwsync
ensures that "x = 1" becomes visible to thread1 before lwsync is done
and before "y = 1" becomes visible. So "isync" here can act as a sort of
control dependency, as it ensures that "a = y" is performed before
"b = x", and according to the POWER9 spec it even flushes the pipeline.

Can someone comment on this scenario and tell me whether I am right, or
where I am wrong?

-- 
Best regards, Artem Y. Polyakov

* Re: Memory barrier pairing question
From: Paul E. McKenney @ 2018-07-23  2:37 UTC (permalink / raw)
  To: Artem Polyakov; +Cc: perfbook

On Sun, Jul 22, 2018 at 06:42:45PM -0700, Artem Polyakov wrote:
> Hello,
> I have a question about the following scenario (considering POWER arch):
> 
> initial { x = 0; y = 0; }
> 
> thread0 {
>     x = 1;
>     lwsync;
>     y = 1;
> }
> 
> thread1 {
>     a = y;
>     isync;
>     b = x;
> }
> 
> Because "isync" is not a memory barrier, this example doesn't have
> read/write barrier pairing. However, if I understand correctly, lwsync
> ensures that "x = 1" becomes visible to thread1 before lwsync is done
> and before "y = 1" becomes visible. So "isync" here can act as a sort of
> control dependency, as it ensures that "a = y" is performed before
> "b = x", and according to the POWER9 spec it even flushes the pipeline.
> 
> Can someone comment on this scenario and tell me whether I am right, or
> where I am wrong?

I am not a Power hardware architect, but here is my understanding.

The isync waits until all the prior instructions "execute", but for
a limited definition of "execute".  One way to think of isync is as
an instruction that does not allow subsequent instructions to start
until it can be proven that execution really will reach the isync
instruction.

So given a load, how could it be that execution would be prevented
from reaching the isync instruction?  One possibility is a SEGV.
But once address translation completes successfully, SEGV cannot
happen.  So isync doesn't need to wait for the load to return an
actual value, but instead only for its execution to reach a point
where the hardware knows that a value will eventually be returned.
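
A toy model may make this concrete. The sketch below is only an
illustration under stated assumptions, not the POWER memory model: it
assumes thread0's two stores reach thread1 in program order (the lwsync),
and that thread1's two loads are satisfied either in program order or,
when nothing orders them, possibly swapped. The names reachable_outcomes()
and outcome_reachable() are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Count the set bits in a small mask. */
static int count_bits(unsigned v)
{
    int n = 0;
    for (; v != 0; v >>= 1)
        n += (int)(v & 1u);
    return n;
}

/*
 * Toy model (an assumption for illustration only):
 *   thread0: x = 1; y = 1;   stores become visible in this order
 *   thread1: a = y; b = x;   loads satisfied in order, or swapped when
 *                            reader_ordered is false
 * Bit (2*a + b) of the result is set if outcome (a, b) is reachable.
 */
unsigned reachable_outcomes(bool reader_ordered)
{
    unsigned mask = 0;
    int max_swap = reader_ordered ? 0 : 1;

    for (int swap = 0; swap <= max_swap; swap++) {
        /* Pick which 2 of the 4 time slots hold thread0's stores. */
        for (unsigned slots = 0; slots < 16; slots++) {
            if (count_bits(slots) != 2)
                continue;
            int x = 0, y = 0, a = 0, b = 0;
            int stores_done = 0, loads_done = 0;
            for (int s = 0; s < 4; s++) {
                if (slots & (1u << s)) {
                    /* thread0: first store is x = 1, second is y = 1 */
                    if (stores_done++ == 0)
                        x = 1;
                    else
                        y = 1;
                } else {
                    /* thread1: load of y goes first unless swapped */
                    bool load_y = (loads_done == 0) != (swap == 1);
                    if (load_y)
                        a = y;
                    else
                        b = x;
                    loads_done++;
                }
            }
            mask |= 1u << (2 * a + b);
        }
    }
    return mask;
}

/* True if the outcome (a, b) is reachable in the toy model. */
bool outcome_reachable(bool reader_ordered, int a, int b)
{
    assert((a == 0 || a == 1) && (b == 0 || b == 1));
    return (reachable_outcomes(reader_ordered) >> (2 * a + b)) & 1u;
}
```

In this model, outcome_reachable(false, 1, 0) is true and
outcome_reachable(true, 1, 0) is false: the outcome a == 1 && b == 0
only appears when the load of x may be satisfied before the load of y.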

And that is why you need a compare and conditional branch before the
isync to ensure that the load has completed.  In that case, the
hardware cannot prove that execution will actually reach the isync
until the load completes, the compare sets the condition code,
and the branch condition is evaluated.  Therefore, anything after
the compare-branch-isync series cannot start executing until after
the load returns its value.
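
For comparison, here is a hedged C11 sketch of the same message-passing
pattern. It assumes the commonly documented C11-to-POWER mapping, in which
a release store is implemented as lwsync followed by the store, and an
acquire load as the load followed by a compare, a conditional branch, and
isync. A particular compiler may choose a different sequence, so treat the
mapping as an assumption and check the generated code. The function name
count_forbidden_outcomes() is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;   /* shared variables, reset before each trial */
static int a, b;          /* thread1's results, read only after join */

static void *thread0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    /* Release store: assumed to map to "lwsync; store" on POWER. */
    atomic_store_explicit(&y, 1, memory_order_release);
    return NULL;
}

static void *thread1(void *arg)
{
    (void)arg;
    /* Acquire load: assumed to map to "load; cmp; bc; isync" on POWER. */
    a = atomic_load_explicit(&y, memory_order_acquire);
    b = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Run the test repeatedly; return how often a == 1 && b == 0 was seen. */
int count_forbidden_outcomes(int trials)
{
    int forbidden = 0;

    for (int i = 0; i < trials; i++) {
        pthread_t t0, t1;
        int rc;

        atomic_store(&x, 0);
        atomic_store(&y, 0);
        rc = pthread_create(&t0, NULL, thread0, NULL);
        assert(rc == 0);
        rc = pthread_create(&t1, NULL, thread1, NULL);
        assert(rc == 0);
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);
        if (a == 1 && b == 0)
            forbidden++;
    }
    return forbidden;
}
```

The C11 release/acquire pairing forbids a == 1 && b == 0, so
count_forbidden_outcomes() should return 0 for any number of trials.
A run that returns 0 is of course not a proof, since a stress test can
only fail to observe an outcome.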

Make sense, or am I missing the point of your question?

							Thanx, Paul


* Re: Memory barrier pairing question
From: Artem Polyakov @ 2018-07-23  4:52 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook

Thank you, Paul!

I was wondering why the following tutorial (section 4.4):
https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
puts "if (r1 == r1)" before "isync", and your explanation makes it
less confusing.

What was (and somewhat still is) confusing in the Power ISA (v2.07) is
the isync description:
"Executing an isync instruction ensures that all instructions preceding the
isync instruction have completed before the isync instruction completes,
and that no subsequent instructions are initiated until after the isync
instruction completes."
I guess I understood the term "completes" as "committed", but
according to your explanation, it only means that the instruction is deep
enough in the pipeline to ensure its successful commit in the future. The
same goes for the phrase "no subsequent instructions are initiated until
after the isync instruction completes". The term "completion" seems obscure
to me.

But you understood my question precisely, and thank you once again!

* Re: Memory barrier pairing question
From: Paul E. McKenney @ 2018-07-23 12:03 UTC (permalink / raw)
  To: Artem Polyakov; +Cc: perfbook

On Sun, Jul 22, 2018 at 09:52:27PM -0700, Artem Polyakov wrote:
> Thank you, Paul!
> 
> I was wondering why the following tutorial (section 4.4):
> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
> puts "if (r1 == r1)" before "isync", and your explanation makes it
> less confusing.
> 
> What was (and somewhat still is) confusing in the Power ISA (v2.07) is
> the isync description:
> "Executing an isync instruction ensures that all instructions preceding the
> isync instruction have completed before the isync instruction completes,
> and that no subsequent instructions are initiated until after the isync
> instruction completes."
> I guess I understood the term "completes" as "committed", but
> according to your explanation, it only means that the instruction is deep
> enough in the pipeline to ensure its successful commit in the future. The
> same goes for the phrase "no subsequent instructions are initiated until
> after the isync instruction completes". The term "completion" seems obscure
> to me.

I wasn't around for the early PowerPC days, but I could easily imagine
that the word "committed" was a much better match for the much simpler
CPUs of that day.  Things are a bit more complicated these days.  ;-)

> But you understood my question precisely, and thank you once again!

Glad that it helped!

							Thanx, Paul
