* Memory barrier pairing question
@ 2018-07-23 1:42 Artem Polyakov
2018-07-23 2:37 ` Paul E. McKenney
0 siblings, 1 reply; 4+ messages in thread
From: Artem Polyakov @ 2018-07-23 1:42 UTC (permalink / raw)
To: perfbook
Hello,
I have a question about the following scenario (on the POWER architecture):
initial { x = 0; y = 0; }
thread0 {
    x = 1;
    lwsync;
    y = 1;
}
thread1 {
    a = y;
    isync;
    b = x;
}
Because "isync" is not a memory barrier, this example has no
read/write barrier pairing. However, if I understand correctly, lwsync
ensures that "x = 1" becomes visible to thread1 before the lwsync is done
and before "y = 1" becomes visible. So "isync" here could act as a sort of
control dependency, since it ensures that "a = y" is performed before
"b = x", and it even flushes the pipeline according to the POWER9 spec.
Can someone comment on this scenario and tell me whether I am right, or
where I am wrong?
--
Best regards, Artem Y. Polyakov
* Re: Memory barrier pairing question
2018-07-23 1:42 Memory barrier pairing question Artem Polyakov
@ 2018-07-23 2:37 ` Paul E. McKenney
2018-07-23 4:52 ` Artem Polyakov
0 siblings, 1 reply; 4+ messages in thread
From: Paul E. McKenney @ 2018-07-23 2:37 UTC (permalink / raw)
To: Artem Polyakov; +Cc: perfbook
On Sun, Jul 22, 2018 at 06:42:45PM -0700, Artem Polyakov wrote:
> Hello,
> I have a question about the following scenario (considering POWER arch):
>
> initial { x = 0; y = 0; }
>
> thread0 {
> x = 1;
> lwsync;
> y = 1;
> }
>
> thread1 {
> a = y;
> isync;
> b = x;
> }
>
> Because "isync" is not a memory barrier this example doesn't have
> read/write barrier pairing. However, if I understand correctly, lwsync will
> ensure that "x = 1" will become visible to thread1 before lwsync is done
> and before "y = 1" will become visible. So "isync" here can be sort of
> control dependency as it ensures that "a = y" will be performed before "b =
> x" and even will flush the pipeline according to POWER9 spec.
>
> Can someone comment on this scenario and tell if I am right or where I am
> wrong.

I am not a Power hardware architect, but here is my understanding.

The isync waits until all the prior instructions "execute", but for
a limited definition of "execute". One way to think of isync is as
an instruction that does not allow subsequent instructions to start
until it can be proven that execution really will reach the isync
instruction.

So given a load, how could it be that execution would be prevented
from reaching the isync instruction? One possibility is a SEGV.
But once address translation completes successfully, SEGV cannot
happen. So isync doesn't need to wait for the load to return an
actual value, but instead only for its execution to reach a point
where the hardware knows that a value will eventually be returned.

And that is why you need a compare and conditional branch before the
isync to ensure that the load has completed. In that case, the
hardware cannot prove that execution will actually reach the isync
until the load completes, the compare sets the condition code,
and the branch condition is evaluated. Therefore, anything after
the compare-branch-isync series cannot start executing until after
the load returns its value.

Make sense, or am I missing the point of your question?

Thanx, Paul
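For readers who want to try this pairing from C, here is a minimal sketch using C11 atomics rather than raw POWER instructions. It assumes the usual compiler mapping on POWER, in which a release store is emitted as lwsync followed by the store, and an acquire load is emitted as the load followed by a compare, a conditional branch, and isync, which is the compare-branch-isync series described above. The function names thread0_publish and thread1_observe are hypothetical, chosen only to mirror the litmus test.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared variables from the litmus test: initial { x = 0; y = 0; } */
static atomic_int x = 0;
static atomic_int y = 0;

/* thread0 { x = 1; lwsync; y = 1; }
 * The release store orders the earlier store to x before the store to y.
 * (Assumption: on POWER this is typically compiled as lwsync; store.) */
void thread0_publish(void)
{
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_store_explicit(&y, 1, memory_order_release);
}

/* thread1 { a = y; <ordering>; b = x; }
 * The acquire load orders the load of y before the later load of x.
 * (Assumption: on POWER this is typically compiled as
 * load; compare; conditional branch; isync.)
 * Returns a and writes b through the out parameter. */
int thread1_observe(int *b)
{
    int a = atomic_load_explicit(&y, memory_order_acquire);

    *b = atomic_load_explicit(&x, memory_order_relaxed);

    /* With release/acquire pairing, seeing y == 1 implies seeing x == 1. */
    assert(!(a == 1 && *b == 0));
    return a;
}
```

In a real test, thread0_publish() and thread1_observe() would run concurrently on different threads. The assertion states the outcome that the pairing is meant to forbid, namely a == 1 with b == 0.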
* Re: Memory barrier pairing question
2018-07-23 2:37 ` Paul E. McKenney
@ 2018-07-23 4:52 ` Artem Polyakov
2018-07-23 12:03 ` Paul E. McKenney
0 siblings, 1 reply; 4+ messages in thread
From: Artem Polyakov @ 2018-07-23 4:52 UTC (permalink / raw)
To: paulmck; +Cc: perfbook
Thank you, Paul!

I was wondering why, in the following tutorial (section 4.4):
https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
they put "if ( r1 == r1)" before "isync", and your explanation
makes it less confusing.

What was (and still is, somewhat) confusing in the Power ISA (v2.07) is the
isync description:
"Executing an isync instruction ensures that all instructions preceding the
isync instruction have completed before the isync instruction completes,
and that no subsequent instructions are initiated until after the isync
instruction completes."

I had understood the term "completes" to mean "committed", but according to
your explanation it only means that the instruction is deep enough in the
pipeline to ensure its successful commit in the future. The same goes for
the phrase "no subsequent instructions are initiated until after the isync
instruction completes". The term "completion" seems obscure to me.

But you understood my question perfectly, and thank you once again!
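The "if ( r1 == r1)" idiom from the tutorial, followed by isync, can be sketched in C roughly as follows. This is an untested sketch, not code from the tutorial or from the Power ISA: the helper name load_ctrl_isync is hypothetical, the POWER branch uses GCC-style inline assembly that is only assembled on PowerPC targets, and the fallback to a C11 acquire fence on other architectures is an assumption made so that the example builds everywhere.

```c
#include <stdatomic.h>

/* Load *p, then keep later instructions from starting until the loaded
 * value has actually been returned (hypothetical helper). */
int load_ctrl_isync(atomic_int *p)
{
    /* Plain load with no ordering of its own: "a = y" in the litmus test. */
    int r1 = atomic_load_explicit(p, memory_order_relaxed);

#if defined(__powerpc__) || defined(__powerpc64__)
    /* Compare r1 with itself and branch on the result. The branch always
     * falls through to label 1, but the hardware cannot resolve it until
     * the load has returned its value. The isync then holds back all
     * subsequent instructions until the branch is resolved. */
    __asm__ __volatile__(
        "cmpw 0,%0,%0\n\t"
        "bne- 1f\n"
        "1:\n\t"
        "isync"
        :
        : "r"(r1)
        : "cr0", "memory");
#else
    /* Assumption: on non-POWER targets an acquire fence stands in. */
    atomic_thread_fence(memory_order_acquire);
#endif
    return r1;
}
```

Used in thread1 of the litmus test, this would read "a = load_ctrl_isync(&y);" followed by a plain load of x. Without the compare and branch, the isync alone would only wait for the load to reach the point where it is known not to fault.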
* Re: Memory barrier pairing question
2018-07-23 4:52 ` Artem Polyakov
@ 2018-07-23 12:03 ` Paul E. McKenney
0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2018-07-23 12:03 UTC (permalink / raw)
To: Artem Polyakov; +Cc: perfbook
On Sun, Jul 22, 2018 at 09:52:27PM -0700, Artem Polyakov wrote:
> Thank you, Paul!
>
> I was wondering why in the following tutorial (section 4.4):
> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
> They were putting "if ( r1 == r1)" before "isync" and your explanation
> makes it less confusing.
>
> What was (and somewhat still) confusing in Power ISA (v2.07) is the
> isync description:
> "Executing an isync instruction ensures that all instructions preceding the
> isync instruction have completed before the isync instruction completes,
> and that no subsequent instructions are initiated until after the isync
> instruction completes."
> The term "completes" I guess was understood by me as "committed", but
> according to your explanation, it only means that the instruction is deep
> enough in the pipeline to ensure it's successful commit in the future. And
> the same is with the phrase "no subsequent instructions are initiated until
> after the isync instruction completes". The term "completion" seems obscure
> to me.

I wasn't around for the early PowerPC days, but I could easily imagine
that the word "committed" was a much better match for the much simpler
CPUs of that day. Things are a bit more complicated these days. ;-)

> But you did get my question perfectly precise and thank you once again!

Glad that it helped!

Thanx, Paul