* Other-multicopy atomicity
From: Akira Yokosawa @ 2017-09-02 4:09 UTC
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa
Hi Paul,
I have a comment on the term "other-multicopy atomicity".
It took me a while to realize that "other-" stands for "other than the CPU itself".
At first, it sounded like "some other type of multicopy atomicity", which seemed
quite vague.
Commit 43236beadb1 ("memorder: Expand on cumulativity and {other,} multicopy
atomicity") helped me understand your intention. May I suggest adding a footnote
on the use of "other-"?
Also, the tabs in the listing added by the above-mentioned commit have not been
replaced with spaces.
Thanks, Akira
* Re: Other-multicopy atomicity
From: Paul E. McKenney @ 2017-09-03 0:57 UTC
To: Akira Yokosawa; +Cc: perfbook

On Sat, Sep 02, 2017 at 01:09:37PM +0900, Akira Yokosawa wrote:
> Hi Paul,
>
> I have a comment on the term "other-multicopy atomicity".
>
> It took a while for me to realize that the "other-" stands for "other than self CPU".
> At first, it sounded like "other type of multicopy atomicity", which looked
> quite vague.
>
> Commit 43236beadb1 ("memorder: Expand on cumulativity and {other,} multicopy
> atomicity") helped me to realize your intention. May I suggest to add a footnote
> on the use of "other-"?

I am trying to do a bit too much with that paragraph, aren't I?

How about the patch below?

> Also, you failed to replace tabs to white spaces in listing added in the
> above mentioned commit.

Good eyes, fixed!  (Not yet pushed, will get there.)

                                                        Thanx, Paul

------------------------------------------------------------------------

commit 87b29716cee78c5505039ba933c2f991ed3b1dec
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Sat Sep 2 17:48:39 2017 -0700

    memorder: Clarify other-multicopy atomicity

    Reported-by: Akira Yokosawa <akiyks@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 62544ae8ed52..90e2b5e2f294 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -1703,32 +1703,32 @@ and other counterintuitive behavior, as discussed in the next section.
 
 Threads running on a \emph{multicopy atomic}~\cite{Stone:1995:SP:623262.623912}
 platform are guaranteed
-to agree on the order of writes, even to different variables.
+to agree on the order of stores, even to different variables.
 A useful mental model of such a system is the single-bus architecture
 shown in
 Figure~\ref{fig:memorder:Global System Bus And Multi-Copy Atomicity}.
-If each write resulted in a message on the bus, and if the bus could
-accommodate only one write at a time, then any pair of CPUs would
-agree on the order of all writes that they observed.
+If each store resulted in a message on the bus, and if the bus could
+accommodate only one store at a time, then any pair of CPUs would
+agree on the order of all stores that they observed.
 Unfortunately, building a computer system as shown in the figure,
 without store buffers or even caches, would result in glacial computation.
-CPU vendors have therefore taken one of three approaches:
-(1)~Provide store buffers, caches, and the rest and abandon
-multicopy atomicity (weakly ordered platforms),
-(2)~Provide all those hardware optimizations, and invest many transistors
-into preserving multicopy atomicity (TSO platforms), or
-(3)~Define a slightly weaker \emph{other-multicopy atomicity} that allows
-a given CPU's stores to become visible to that CPU before they become visible
-to other CPUs, but in which each of those stores becomes visible to all
-the other CPUs simultaneously~\cite{ARMv8A:2017}.
-Perhaps there will come a day when all platforms provide some flavor
-of multi-copy atomicity, but
-in the meantime, non-multicopy-atomic platforms do exist, and so software
-does need to deal with them.
+CPU vendors interested in providing multicopy atomicity have therefore
+instead provided the slightly weaker
+\emph{other-multicopy atomicity}~\cite{ARMv8A:2017},
+which excludes the CPU doing a given store from the requirement that all
+CPUs agree on the order of all stores.
+This means that if only a subset of CPUs are doing stores, the
+other CPUs will agree on the order of stores, hence the ``other''
+in ``other-multicopy atomicity''.
+Unlike multicopy-atomic platforms, within other-multicopy-atomic platforms,
+the CPU doing the store is permitted to observe its
+store early, which allows its later loads to obtain the newly stored
+value directly from the store buffer.
+This in turn improves performance.
 
 \QuickQuiz{}
 	Can you give a specific example showing different behavior for
-	multicopy atomic on the one hand and other multicopy atomic
+	multicopy atomic on the one hand and other-multicopy atomic
 	on the other?
 \QuickQuizAnswer{
 	\begin{listing}[tbp]
@@ -1790,6 +1790,12 @@ exists (1:r1=1 /\ 1:r2=0)
 	which in turn allows the \co{exists} clause to trigger.
 } \QuickQuizEnd
 
+
+Perhaps there will come a day when all platforms provide some flavor
+of multi-copy atomicity, but
+in the meantime, non-multicopy-atomic platforms do exist, and so software
+does need to deal with them.
+
 \begin{listing}[tbp]
 { \scriptsize
 \begin{verbbox}[\LstLineNo]

--
To unsubscribe from this list: send the line "unsubscribe perfbook" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Other-multicopy atomicity
From: Akira Yokosawa @ 2017-09-03 2:02 UTC
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

On 2017/09/02 17:57:44 -0700, Paul E. McKenney wrote:
> I am trying to do a bit too much with that paragraph, aren't I?
>
> How about the patch below?

Please see the comments below.

[...]

> +CPU vendors interested in providing multicopy atomicity have therefore
> +instead provided the slightly weaker
> +\emph{other-multicopy atomicity}~\cite{ARMv8A:2017},

On the ARMv8 multicopy atomicity, I found a paper "Simplifying ARM Concurrency:
Multicopy-atomic Axiomatic and Operational Models for ARMv8" at
http://www.cl.cam.ac.uk/~pes20/armv8-mca/armv8-mca-draft.pdf (Draft, July 12, 2017)
by Christopher Pulte, et al. It is a draft, but could also be cited here.
As you know, the "ARM ARM" is quite a large document. If you specified where to look
in the manual, it would be even better.

> +which excludes the CPU doing a given store from the requirement that all
> +CPUs agree on the order of all stores.
> +This means that if only a subset of CPUs are doing stores, the
> +other CPUs will agree on the order of stores, hence the ``other''
> +in ``other-multicopy atomicity''.

Yes, now the meaning of "other-" is clear enough.

        Thanks, Akira

[...]
* Re: Other-multicopy atomicity
From: Paul E. McKenney @ 2017-09-03 3:06 UTC
To: Akira Yokosawa; +Cc: perfbook

On Sun, Sep 03, 2017 at 11:02:55AM +0900, Akira Yokosawa wrote:
> On 2017/09/02 17:57:44 -0700, Paul E. McKenney wrote:

[...]

> > +CPU vendors interested in providing multicopy atomicity have therefore
> > +instead provided the slightly weaker
> > +\emph{other-multicopy atomicity}~\cite{ARMv8A:2017},
>
> On the ARMv8 multicopy atomicity, I found a paper "Simplifying ARM Concurrency:
> Multicopy-atomic Axiomatic and Operational Models for ARMv8" at
> http://www.cl.cam.ac.uk/~pes20/armv8-mca/armv8-mca-draft.pdf (Draft, July 12, 2017)
> by Christopher Pulte, et al. It is a draft, but could also be cited here.
> As you know, the "ARM ARM" is quite a large document. If you specified where to look
> in the manual, it would be even better.

Section B2.3, which I have now included in the citation.  Please see
below for updated patch.

> > +which excludes the CPU doing a given store from the requirement that all
> > +CPUs agree on the order of all stores.
> > +This means that if only a subset of CPUs are doing stores, the
> > +other CPUs will agree on the order of stores, hence the ``other''
> > +in ``other-multicopy atomicity''.
>
> Yes, now the meaning of "other-" is clear enough.

Glad it helped!

                                                        Thanx, Paul

------------------------------------------------------------------------

commit 8223c00857dca7eef47015744b77c126d0c8626e
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Sat Sep 2 17:48:39 2017 -0700

    memorder: Clarify other-multicopy atomicity

    Reported-by: Akira Yokosawa <akiyks@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 62544ae8ed52..1d4256d76e7a 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -1703,32 +1703,32 @@ and other counterintuitive behavior, as discussed in the next section.
 
 Threads running on a \emph{multicopy atomic}~\cite{Stone:1995:SP:623262.623912}
 platform are guaranteed
-to agree on the order of writes, even to different variables.
+to agree on the order of stores, even to different variables.
 A useful mental model of such a system is the single-bus architecture
 shown in
 Figure~\ref{fig:memorder:Global System Bus And Multi-Copy Atomicity}.
-If each write resulted in a message on the bus, and if the bus could
-accommodate only one write at a time, then any pair of CPUs would
-agree on the order of all writes that they observed.
+If each store resulted in a message on the bus, and if the bus could
+accommodate only one store at a time, then any pair of CPUs would
+agree on the order of all stores that they observed.
 Unfortunately, building a computer system as shown in the figure,
 without store buffers or even caches, would result in glacial computation.
-CPU vendors have therefore taken one of three approaches:
-(1)~Provide store buffers, caches, and the rest and abandon
-multicopy atomicity (weakly ordered platforms),
-(2)~Provide all those hardware optimizations, and invest many transistors
-into preserving multicopy atomicity (TSO platforms), or
-(3)~Define a slightly weaker \emph{other-multicopy atomicity} that allows
-a given CPU's stores to become visible to that CPU before they become visible
-to other CPUs, but in which each of those stores becomes visible to all
-the other CPUs simultaneously~\cite{ARMv8A:2017}.
-Perhaps there will come a day when all platforms provide some flavor
-of multi-copy atomicity, but
-in the meantime, non-multicopy-atomic platforms do exist, and so software
-does need to deal with them.
+CPU vendors interested in providing multicopy atomicity have therefore
+instead provided the slightly weaker
+\emph{other-multicopy atomicity}~\cite[Section B2.3]{ARMv8A:2017},
+which excludes the CPU doing a given store from the requirement that all
+CPUs agree on the order of all stores.
+This means that if only a subset of CPUs are doing stores, the
+other CPUs will agree on the order of stores, hence the ``other''
+in ``other-multicopy atomicity''.
+Unlike multicopy-atomic platforms, within other-multicopy-atomic platforms,
+the CPU doing the store is permitted to observe its
+store early, which allows its later loads to obtain the newly stored
+value directly from the store buffer.
+This in turn improves performance.
 
 \QuickQuiz{}
 	Can you give a specific example showing different behavior for
-	multicopy atomic on the one hand and other multicopy atomic
+	multicopy atomic on the one hand and other-multicopy atomic
 	on the other?
 \QuickQuizAnswer{
 	\begin{listing}[tbp]
@@ -1790,6 +1790,12 @@ exists (1:r1=1 /\ 1:r2=0)
 	which in turn allows the \co{exists} clause to trigger.
 } \QuickQuizEnd
 
+
+Perhaps there will come a day when all platforms provide some flavor
+of multi-copy atomicity, but
+in the meantime, non-multicopy-atomic platforms do exist, and so software
+does need to deal with them.
+
 \begin{listing}[tbp]
 { \scriptsize
 \begin{verbbox}[\LstLineNo]
* Re: Other-multicopy atomicity
From: Akira Yokosawa @ 2017-09-03 5:40 UTC
To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

On 2017/09/02 20:06:18 -0700, Paul E. McKenney wrote:
> Section B2.3, which I have now included in the citation.  Please see
> below for updated patch.

Looks good!

        Thanks, Akira

[...]
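
Illustration (not part of the original messages): the patch text in this
thread says that on an other-multicopy-atomic platform the storing CPU may
observe its own store early, out of its store buffer, while the other CPUs
agree on when that store becomes visible. The following is a minimal sketch
of that idea as a toy model, assuming a single shared memory plus one FIFO
store buffer per CPU. The names Cpu, store, load, drain_one, and demo are
hypothetical and come from neither perfbook nor any real tool, and the model
ignores caches, barriers, and everything else a real memory model specifies.

```python
class Cpu:
    """Toy CPU: a private FIFO store buffer in front of one shared memory.

    Hypothetical model for illustration only; it is not a description of
    any real hardware or of the ARMv8 memory model.
    """

    def __init__(self, memory):
        self.memory = memory      # dict shared by all CPUs (the single copy)
        self.store_buffer = []    # pending (variable, value) pairs, oldest first

    def store(self, var, value):
        # The store is only buffered here; other CPUs cannot see it yet.
        self.store_buffer.append((var, value))

    def load(self, var):
        # Store-to-load forwarding: the newest buffered store to var wins,
        # so this CPU observes its own store early.
        for buffered_var, value in reversed(self.store_buffer):
            if buffered_var == var:
                return value
        return self.memory.get(var, 0)

    def drain_one(self):
        # Commit the oldest buffered store to shared memory, at which point
        # every other CPU can see it at the same time.
        if self.store_buffer:
            var, value = self.store_buffer.pop(0)
            self.memory[var] = value


def demo():
    memory = {}
    cpu0, cpu1, cpu2 = Cpu(memory), Cpu(memory), Cpu(memory)

    cpu0.store("x", 1)
    # Before the drain: only the storing CPU sees the new value.
    early = (cpu0.load("x"), cpu1.load("x"), cpu2.load("x"))

    cpu0.drain_one()
    # After the drain: all CPUs see the new value.
    late = (cpu0.load("x"), cpu1.load("x"), cpu2.load("x"))
    return early, late


if __name__ == "__main__":
    early, late = demo()
    print("before drain:", early)
    print("after drain: ", late)
```

In this toy model the "other" CPUs (cpu1 and cpu2) can never disagree about
x, because they both read the same shared memory. Only cpu0, the CPU doing
the store, is exempt, which is the sense of "other-" discussed above.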