From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:47614 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
        by vger.kernel.org with ESMTP id S1751664AbdGEPk1 (ORCPT );
        Wed, 5 Jul 2017 11:40:27 -0400
Received: from pps.filterd (m0098416.ppops.net [127.0.0.1])
        by mx0b-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v65Fd5pR113982
        for ; Wed, 5 Jul 2017 11:40:26 -0400
Received: from e11.ny.us.ibm.com (e11.ny.us.ibm.com [129.33.205.201])
        by mx0b-001b2d01.pphosted.com with ESMTP id 2bh0y4616w-1
        (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)
        for ; Wed, 05 Jul 2017 11:40:26 -0400
Received: from localhost
        by e11.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
        for from ; Wed, 5 Jul 2017 11:40:25 -0400
Date: Wed, 5 Jul 2017 08:40:24 -0700
From: "Paul E. McKenney"
Subject: Re: [PATCH] advsync: Fix store-buffering sequence table
Reply-To: paulmck@linux.vnet.ibm.com
References: <1e3fe2af-cce3-7327-488d-fb27ec7d9fc8@gmail.com>
        <20170704222138.GR2393@linux.vnet.ibm.com>
        <73696d24-1e3d-68cf-5bc2-f15390883bdc@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <73696d24-1e3d-68cf-5bc2-f15390883bdc@gmail.com>
Message-Id: <20170705154024.GU2393@linux.vnet.ibm.com>
Sender: perfbook-owner@vger.kernel.org
List-ID:
To: Akira Yokosawa
Cc: perfbook@vger.kernel.org

On Wed, Jul 05, 2017 at 11:22:52PM +0900, Akira Yokosawa wrote:
> On 2017/07/04 15:21:38 -0700, Paul E. McKenney wrote:
> > On Wed, Jul 05, 2017 at 12:23:09AM +0900, Akira Yokosawa wrote:
> >> >From 2845eb208a6e63493997de47293a47ef774a9d49 Mon Sep 17 00:00:00 2001
> >> From: Akira Yokosawa
> >> Date: Tue, 4 Jul 2017 23:18:30 +0900
> >> Subject: [PATCH] advsync: Fix store-buffering sequence table
> >>
> >> Row 6 of the table added in commit 2d5bf8d25a71 ("advsync: Add
> >> memory-barriered store-buffering example") needs some context
> >> adjustment.
> >>
> >> Also tweak horizontal spacing of wide tables for one-column layout.
> >> Also add a few words to the footnote giving definition of
> >> __atomic_thread_fence().
> >>
> >> Signed-off-by: Akira Yokosawa
> >
> > Good catches! Queued and pushed. I reworded the footnote a bit, so
> > please let me know if I overdid it.
>
> After your changes in commit 036372ac2573 ("advsync: Use gcc's C11-like
> intrinsics to avoid data races"), this footnote seems verbose, doesn't it?
>
> But, I'm not so much a fan of the changes of your commit.
> It becomes hard to see the relation of lines in litmus tests and rows
> in the tables. Also, those intrinsics have fairly large overheads.

They certainly are ugly, no two ways about that! ;-)

> How about using "volatile" in thread arg declaration such as the following?
>
> ---
> C C-SB+o-o+o-o
> {
> }
>
> P0(volatile int *x0, volatile int *x1)
> {
>         int r2;
>
>         *x0 = 2;
>         r2 = *x1;
> }
>
>
> P1(volatile int *x0, volatile int *x1)
> {
>         int r2;
>
>         *x1 = 2;
>         r2 = *x0;
> }
>
> exists (1:r2=0 /\ 0:r2=0)
> ---
>
> If all you need is to prevent memory accesses from being optimized away,
> they should suffice. But they might be unpopular among kernel community.

To say nothing of their unpopularity among the C11 and C++11 communities!

> I checked the generated C code in cross-compiling mode of litmus7, and
> the volatile-ness is reflected there.

And it also works just fine without the volatile -- the litmus7 tool
does the translation so as to avoid destructive compiler optimizations.
I am checking with the litmus7 people to see if there is any way to map
identifiers. Some of the other tools support a "-macros" command-line
argument, which would allow mapping from "smp_mb()" to
"__atomic_thread_fence(__ATOMIC_SEQ_CST)", but not litmus7.

So I cannot go with "volatile", but let's see if I can do something
better than the gcc intrinsics.

                                                        Thanx, Paul

> Thoughts?
>
>         Thanks, Akira
> >
> >                                                       Thanx, Paul
> >
> >> ---
> >>  advsync/memorybarriers.tex | 10 ++++++----
> >>  1 file changed, 6 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
> >> index 4ae3ca8..f26a7c5 100644
> >> --- a/advsync/memorybarriers.tex
> >> +++ b/advsync/memorybarriers.tex
> >> @@ -174,7 +174,7 @@ Figure~\ref{fig:advsync:Memory Misordering: Store-Buffering Litmus Test}.
> >>
> >>  \begin{table*}
> >>  \small
> >> -\centering
> >> +\centering\OneColumnHSpace{-.1in}
> >>  \begin{tabular}{r||l|l|l||l|l|l}
> >>      & \multicolumn{3}{c||}{CPU 0} & \multicolumn{3}{c}{CPU 1} \\
> >>      \cline{2-7}
> >> @@ -318,6 +318,8 @@ ordering and memory barriers work, read on!
> >>  The first stop is
> >>  Figure~\ref{fig:advsync:Memory Ordering: Store-Buffering Litmus Test},
> >>  which has \co{__atomic_thread_fence()} directives\footnote{
> >> +    One of GCC's atomic intrinsics briefly introduced in
> >> +    Section~\ref{sec:toolsoftrade:Atomic Operations (C11)}.
> >>      Similar to the Linux kernel's \co{smp_mb()} full memory barrier.}
> >>  placed between
> >>  the store and load in both \co{P0()} and \co{P1()}, but is otherwise
> >> @@ -339,7 +341,7 @@ Figure~\ref{fig:advsync:Memory Misordering: Store-Buffering Litmus Test}.
> >>
> >>  \begin{table*}
> >>  \small
> >> -\centering
> >> +\centering\OneColumnHSpace{-0.75in}
> >>  \begin{tabular}{r||l|l|l||l|l|l}
> >>      & \multicolumn{3}{c||}{CPU 0} & \multicolumn{3}{c}{CPU 1} \\
> >>      \cline{2-7}
> >> @@ -362,8 +364,8 @@ Figure~\ref{fig:advsync:Memory Misordering: Store-Buffering Litmus Test}.
> >>      5 & (Finish store) & & \tco{x0==2} &
> >>          (Finish store) & & \tco{x1==2} \\
> >>      \hline
> >> -    6 & \tco{r2 = *x1;} (2) & \tco{x0==2} & \tco{x1==0} &
> >> -        \tco{r2 = *x0;} (2) & \tco{x1==2} & \tco{x0==0} \\
> >> +    6 & \tco{r2 = *x1;} (2) & & \tco{x1==2} &
> >> +        \tco{r2 = *x0;} (2) & & \tco{x0==2} \\
> >>  \end{tabular}
> >>  \caption{Memory Ordering: Store-Buffering Sequence of Events}
> >>  \label{tab:advsync:Memory Ordering: Store-Buffering Sequence of Events}
> >> --
> >> 2.7.4
> >>
> >
> >
>
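
For readers following the thread, the store-buffering pattern with a full
fence between each store and load can be sketched in plain C with pthreads
roughly as follows. This is only an illustration under stated assumptions:
it is not litmus7 output and not code from perfbook, and the names p0(),
p1(), sb_trial(), and sb_count_forbidden() are hypothetical. It uses GCC's
__atomic_store_n(), __atomic_load_n(), and
__atomic_thread_fence(__ATOMIC_SEQ_CST) intrinsics, the last being the one
the thread compares to smp_mb().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Shared variables and per-thread results, mirroring the litmus test. */
static int x0, x1;
static int r2_p0, r2_p1;

/* Hypothetical P0: store to x0, full fence, then load from x1. */
static void *p0(void *arg)
{
	(void)arg;
	__atomic_store_n(&x0, 2, __ATOMIC_RELAXED);
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
	r2_p0 = __atomic_load_n(&x1, __ATOMIC_RELAXED);
	return NULL;
}

/* Hypothetical P1: store to x1, full fence, then load from x0. */
static void *p1(void *arg)
{
	(void)arg;
	__atomic_store_n(&x1, 2, __ATOMIC_RELAXED);
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
	r2_p1 = __atomic_load_n(&x0, __ATOMIC_RELAXED);
	return NULL;
}

/* Run one trial; return 1 if both threads read 0 (the "exists" clause). */
static int sb_trial(void)
{
	pthread_t t0, t1;
	int rc;

	x0 = 0;
	x1 = 0;
	rc = pthread_create(&t0, NULL, p0, NULL);
	assert(rc == 0);
	rc = pthread_create(&t1, NULL, p1, NULL);
	assert(rc == 0);
	rc = pthread_join(t0, NULL);
	assert(rc == 0);
	rc = pthread_join(t1, NULL);
	assert(rc == 0);
	return r2_p0 == 0 && r2_p1 == 0;
}

/* Count how many of ntrials trials produced the both-zero outcome. */
static int sb_count_forbidden(int ntrials)
{
	int i, count = 0;

	for (i = 0; i < ntrials; i++)
		count += sb_trial();
	return count;
}
```

Calling sb_count_forbidden(1000) from a main() built with "gcc -pthread"
should return zero, because the two sequentially consistent fences forbid
both loads from returning 0. A test harness this crude exercises far fewer
interleavings than litmus7 does, so a zero count here is a sanity check
rather than evidence about the hardware.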